Add rustc flag to disable mutable no-aliasing optimizations?

If missing inlining in debug mode is a serious concern, then there are less invasive and more effective changes to Rust that could address the problem.

I don’t think that “performance differences in debug mode” is ever a valid argument for introducing dedicated syntax for something. These two things should have absolutely nothing to do with each other. All that “performance differences in debug mode” could ever warrant is the introduction of (potentially unstable/internal) features that allow debug performance to be more like release-mode performance in some particular aspect.

This means that, provided these missing inlines in debug mode for UnsafeCell methods are even a practical problem at all, they could be a reason to, e.g., introduce inlining in debug mode via special inlining annotations or something like that. Furthermore, (unlike “placement new”,) UnsafeCell is stable, so there’s nothing to be gained from new “dedicated syntax” anyway.

12 Likes

Well, if you're a C++ compiler, implementing a __builtin_move to replace std::move with, in order to reduce the number of (literally do-nothing) function calls in debug builds, seems at least somewhat reasonable. Or maybe that was an April Fools' Day post? Or maybe not?

An interesting difference between rustc and, say, GCC, is that rustc currently is Rust's specification. If rustc stabilizes something (barring some very specific legacy things), it's part of Rust, the language. GCC has a bit more leeway to support __builtin_* functions, since the split between core language support and vendor extensions is clearer. (For rustc, that split is #![feature], and no vendor extensions are considered stable.)

In any case, #[inline(always)] applies in debug mode too. We can (and probably should) have a policy of using #[inline(always)] on core items where the body of the function is "as cheap as a function call." (Defining that is the hard part.)
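For illustration, a sketch of what that policy looks like on a made-up trivial accessor (plain #[inline] is only a hint and is generally ignored in unoptimized builds):

```rust
// Hypothetical example: a method whose body is "as cheap as a
// function call", so it is forced inline even in debug builds.
pub struct Flag(bool);

impl Flag {
    #[inline(always)]
    pub fn get(&self) -> bool {
        self.0 // a single field load; cheaper than the call would be
    }
}
```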

An alternative is to run some subset of the MIR inlining passes even in debug compilation, to get the same effect of always inlining those "as cheap as a function call" calls. Doing so might even speed up debug compilation, as we'd be giving LLVM less IR to chew through.

But ultimately, if you're having actual issues with debug mode adding too much overhead from unnecessary function calls, that's what -O1 is for. It does a really good job of maintaining useful debug information while stripping out those "just a formality" bits that using library building blocks rather than built-in ones introduces.
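For reference, that's a one-line change in a Cargo project (a minimal sketch; debug assertions and debug info stay enabled by default in the dev profile):

```toml
# Cargo.toml: compile the dev (debug) profile with light optimizations.
[profile.dev]
opt-level = 1
```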

2 Likes

"This kind of thing" was me alluding to this being just one instance of a general category of issues. You can swap "performance in debug mode" with "performance in optimized mode in certain situations" and the same issues crop up. In general optimizers are not good at consistently making the the right decision in all contexts. Their job is too hard and they are insanely complicated, so they inevitably have edge cases they handle poorly.

You could declare, without introducing any new syntax, that certain instances of function call syntax are never actually function calls. The argument against that, and the reason I would consider syntax to be a related question at all, is that syntax implies code in the compiler to handle the syntax, which (weakly) implies the compiler handling the operation directly, which in turn increases my confidence, when writing code that uses that syntax, that I'm not going to find undesired calls in my assembly output.

Unfortunately, when I tried it, -O1 still didn't inline UnsafeCell::get in my program, despite it being a no-op function. In my experience you can depend on compilers to do the right thing on average, but they inevitably have edge cases they screw up. Built-in compiler support for something like this is more likely to work consistently: it's far less likely for LLVM to add a function call that was never in the IR than it is for LLVM to fail to eliminate a call that is in the IR.
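For context, the method in question is nothing but a pointer cast. Simplified from the standard library sources (the real version also handles unsized types):

```rust
// Simplified sketch of the standard library's UnsafeCell: the entire
// body of get() is a pointer cast, i.e. a no-op at runtime.
#[repr(transparent)]
pub struct UnsafeCell<T> {
    value: T,
}

impl<T> UnsafeCell<T> {
    pub const fn get(&self) -> *mut T {
        // Valid because #[repr(transparent)] gives UnsafeCell<T>
        // the same layout as the T it wraps.
        self as *const UnsafeCell<T> as *const T as *mut T
    }
}
```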

Dedicated syntax does not imply that you can have confidence that the compiler does what you want. If you are more confident that it does just because of the syntax, you're mistaken, and you need to stop thinking like that. The compiler can poorly implement features with dedicated syntax, or it can behave well with preexisting syntax. The syntax is not a reliable way to measure the things you want to be confident in.

The compiler handles all code you pass to it, whether it has dedicated syntax or not. The syntax used implies nothing about how the compiler handles it, because it could easily farm off even the most dedicated syntax to a poorly written module or an ineffective optimizer.

9 Likes

Sure, and I admit I avoided the version where the discussion is about “performance in optimized mode”, because the argument might be less trivial in that case than when we’re only talking about debug mode. My opinion, however, is that even performance in optimized mode should have nothing to do with introducing special syntax for something that, functionally, doesn’t need it. If the runtime behavior of using an ordinary function is correct in all regards except performance, then the only changes needed are in the optimizer, nowhere else. It still doesn’t have anything to do with user-facing syntax.

While this is true, it is also true that special syntax is not an adequate way to handle all cases “in general”, including “edge cases”. Or in other words: if your code’s performance is fixed by the special-casing that comes with special syntax, then it’s equally well fixed by the same kind of special-casing without the extra syntax. But I guess you are kind of already on the same page on this point...

So the only remaining argument is that a programmer might feel more confident, when using special syntax instead of a function call, that no “call”s are generated in the assembly. IMO this sounds like it is based on a rather outdated mindset that “writing a function in code corresponds to a sort-of C-style-calling-convention ‘call’ at runtime”. My opinion is that function syntax in high-level code has, a priori, no direct correspondence with compiler output. We should move past the “it’s just a portable assembler, and recently they’ve added some optimizations, but optimizers are complicated and unreliable, so let’s just pretend they don’t exist” view of a systems programming language. The whole point of “zero-cost abstraction” is that one should stop worrying about the abstraction. If an abstraction isn’t zero-cost enough already, isn’t that more of an argument for fixing the abstraction’s cost instead of giving up on the abstraction entirely?

7 Likes

I've filed Use `#[inline(always)]` on trivial UnsafeCell methods by joshtriplett · Pull Request #83858 · rust-lang/rust · GitHub to help ensure that UnsafeCell remains zero-cost even at lower optimization levels or in debug mode.

This kind of thing is also what the MIR inliner may help with.

12 Likes

but optimizers are complicated and unreliable

Yes, yes they are. The fact that UnsafeCell can end up not being inlined is proof of that. Think of it this way: maybe you can get the optimizer to inline this stuff every time, but then it's an extra step that wouldn't be necessary if it were just normal syntax in the first place. If it makes it harder for a naive implementation of the language's compiler to do it right, that's a bad thing. Rust lately seems to be adding syntax for things that aren't important for a systems language and should have been methods (like the atrocity of .await; what were they smoking?), while not adding syntax for things that would help a systems language (goto, placement new, interior mutability, stabilized inline asm).

Optimizers aren't perfect, and abstractions assumed to be zero-cost frequently turn out not to be. C++'s std::unique_ptr not being inlined correctly comes to mind. Rust already performs much worse in debug builds than C does. Why would you defend the mentality that has led us here? Some things should be syntax. Now, I'm not "syntax happy"; for one, I'm against box syntax and in favor of a DerefMove instead. But core, critical functionality shouldn't be a possibly costly function tagged with an unstable lang item; it should be syntax.

I should add: I think part of the problem is that you're thinking in terms of rustc being the only serious implementation forever. If Rust succeeds, that will not be true; the Rust developers will lose control, and eventually these shortcomings will become a permanent fixture of the language and an argument against it: "Cons: abhorrent performance in debug builds with most compilers."

Rust is a young language, and it can still fix such details now in a backwards-compatible way, but it won't be that way forever.

Sure, let's suppose that some weird sigil, say, ~~, was used for UnsafeCell. Then what? What exactly was achieved besides making code that uses UnsafeCell harder to read?

await is not merely a matter of method-call sugaring. It provides a means of writing code that could not have been written otherwise (without heavily abusing unsafe operations, that is). Now, I may have my own opinions about the particular syntax they chose, but much of the code that uses await wouldn't have been feasible to write without it.

4 Likes

The reason for having .await is that it cannot be a method - the compiler desugars async fns to state machines, which get 'suspended' at each .await.
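A rough conceptual sketch of that desugaring (names and structure are illustrative; the real lowering goes through the compiler's generator machinery):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// What the programmer writes:
//     async fn double(fut: impl Future<Output = u32>) -> u32 {
//         fut.await * 2
//     }
//
// Roughly the state machine the compiler generates for it:
enum Double<F: Future<Output = u32>> {
    Awaiting(F), // suspended at the .await, waiting on the inner future
    Done,        // finished; polling again is a logic error
}

impl<F: Future<Output = u32> + Unpin> Future for Double<F> {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut();
        match this {
            Double::Awaiting(fut) => match Pin::new(fut).poll(cx) {
                Poll::Ready(x) => {
                    // Resume the body after the .await point.
                    *this = Double::Done;
                    Poll::Ready(x * 2)
                }
                // Stay suspended; the executor will poll us again.
                Poll::Pending => Poll::Pending,
            },
            Double::Done => panic!("future polled after completion"),
        }
    }
}
```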

3 Likes

Do you have any concrete benchmarks showing that lang items like UnsafeCell are responsible for this? My guess would be that iterators would usually be responsible for the decreased debug performance.

3 Likes

And what would that built-in syntax lower to? Almost certainly a call to a built-in function, which would then need to be inlined.

If you define a systems language as C, then yes, async/await isn't in C, and C programs are still written. C++ is unquestionably a systems programming language, though, and they felt it necessary to add coroutines/co_await.

Rust's .await can't be a method, either, because async/.await is a full state machine transform. Not something that is just a function call, like, say, reading a memory location is.

Interior mutability is a type system concept, so we have UnsafeCell, a type system concept to address it. What would be materially different if UnsafeCell<T> were spelled interior_mut T?

It'd be more of a pain to use as a normal type, because it'd be a new kind of type (the way Box<T> currently still is in rustc), but as part of the language, not a detail that could eventually be removed. It'd be a pain to explain the difference between &mut T, which parses as (&mut)(T), and &interior_mut T, which parses as (&)(interior_mut T).

And what's the benefit? Instead of UnsafeCell::get you have a whole bunch of awkward-to-specify unsafe-gated coercions from interior_mut T to T. How is this better?
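For comparison, here's how the existing type already composes as an ordinary generic type - a toy version of std::cell::Cell, using only stable APIs:

```rust
use std::cell::UnsafeCell;

// Interior mutability through &self, built from UnsafeCell
// like any other library type.
pub struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    pub fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    pub fn set(&self, value: T) {
        // SAFETY: UnsafeCell makes MyCell !Sync, so no other thread can
        // race with this write, and we never hand out references into it.
        unsafe { *self.value.get() = value }
    }

    pub fn get(&self) -> T {
        // SAFETY: same reasoning; T: Copy, so we just copy the value out.
        unsafe { *self.value.get() }
    }
}

fn main() {
    let c = MyCell::new(1);
    c.set(2); // mutation through a shared reference
    assert_eq!(c.get(), 2);
}
```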

We literally just got asm!, an inline asm feature with an actual path to stabilization. It's moving along perfectly nicely.

It's not like C or C++ have standard ways to write inline assembly, either. Compilers have compiler-specific language extensions to add inline assembly, with subtly incompatible behavior. I.e., exactly the situation with #![feature(asm)]: a compiler-specific, opt-in extension to the language to support a specific use case.

I think this is the point of contention here: there is no such thing as a naive compiler anymore.

This is actually due to ABI. As a class type with a non-trivial destructor, std::unique_ptr<T> is always passed in memory, unlike T*, which can be passed in a register. If compilers were willing to break ABI here, it would be truly zero cost.

A fairer comparison here would be C++. C just doesn't have the tower of zero-cost abstraction niceties that Rust has and relies on the optimizer to peel away.

Also, C really cheats here in most common configurations. With C, your dependencies (if you even use any libraries beyond libc) are almost certainly compiled with optimizations. With Rust, the libraries you use aren't optimized in a debug build (unless you tell rustc to optimize them).

Beyond C and C++, what other popular languages have multiple competing, mainstream compilers? What's so special about systems languages that they have to have multiple compilers?

  • Java - javac
  • Python - cpython, pypy
  • C# - Roslyn, which is replacing Mono
  • Visual Basic - (I don't know)
  • JavaScript - V8 (Chrome), SpiderMonkey (Firefox)
  • PHP - (I don't know)
  • SQL - different dialects are considered different languages due to vastly different feature support
  • Ruby - (I don't know, the standard is useless anyway)
  • Go - gc, gccgo
  • Swift - just Apple's compiler

Honestly: what benefit is there to having multiple frontends to a modern language? What benefit comes from redoing work in competition rather than cooperating on a shared implementation?

4 Likes

  • I think goto would either have an extremely complex interaction with the rest of the language (how does it affect drop order? What does it do in an async fn? What happens during unwinding?) or would need to be unsafe. Do you have a concrete use-case that's not handled by an existing language feature?
  • I think there have been some discussions about 'placement new' recently, but I'm not certain.
  • Interior mutability already exists - that's what UnsafeCell is for
  • The implementation of asm! still has some work being done on it - though people are already using it on nightly. A major feature like asm! isn't going to be stabilized before it's ready.

3 Likes

I feel kind of bad joining in on the dogpile, but…

The cost of implementing inlining isn’t particularly higher than the cost of implementing any other calling convention (which is what inlining basically is), even for a non-optimising compiler. Adding syntax to direct the compiler to always inline a function is pretty trivial.

It wouldn’t have been possible for await to be realised as a call instruction even if it re-used method call syntax, which is what people mean when they misleadingly say that ‘it cannot be a method’. So under your philosophy of ‘every built-in must have a special syntax, and every function call is a call instruction’, the choice is between having special syntax for await and not having it at all. If it’s the latter that you’re arguing for, why aren’t you saying so?

(I find the ‘systems language’ epithet rather meaningless, but that’s for another day.)

When people judge the performance of C code, they look at GCC, Clang, MSVC and Intel CC, all of them state-of-the-art optimising compilers that perform inlining easily. Nobody seems to judge C mostly by the behaviour of TinyCC. I don’t see a reason why the same wouldn’t be true of Rust.

1 Like

@Subsentient, please tone down your rhetoric if you want to keep participating here, and please contact the moderation team rather than posting in-thread if you have comments about moderation.

5 Likes

This makes me wonder if it'd be a good idea to have "different rustc flags for the current crate vs. its dependencies". Sure, this would not cover code from dependencies that only gets monomorphized in the current crate, but it could still help. You'd still have fast rebuild times (assuming you don't change dependencies), but get more optimized code.

3 Likes

Doesn't cargo already have that?

3 Likes

Ah, I guess one could use [profile.dev.package."*"] to set different flags for all non-workspace packages. I did not know that was possible. It certainly is not widely advertised as a way to combat "slow debug builds".
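For anyone searching for it later, a minimal sketch of that override (the "*" spec matches every non-workspace dependency):

```toml
# Cargo.toml: optimize all dependencies even in dev builds, while the
# workspace's own crates stay unoptimized for fast rebuilds.
[profile.dev.package."*"]
opt-level = 3
```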

1 Like

I've seen it. Adding typestate back to the language would be cool, in a way, but almost certainly not justified just to extend the existing concept of initializedness tracking to custom pointer types.