Add rustc flag to disable mutable no-aliasing optimizations?

Note: Box magic wrt. moving deref is a lot more complicated than simple DerefMove.

For core, I would strongly caution against allowing its implementation in stable Rust. core, and many of its components, necessarily interact with internal implementation details of the Rust compiler. Exposing those details stably would fix them in place and leave implementations no room to vary them. Control over the precise assembly would require control over these implementation details (which comes at a cost of portability, including portability to different versions of the same implementation).

And, as an implementor trying to (eventually) reach parity with stable rustc, I very much thank them for that limitation. I'd prefer that crater runs, once I have a compiler capable of them, not fail because crates are opting out of stability on the one compiler I should be able to rely upon implementing the core language, and only making the core language available.

I can list 5 more features in the C++ Standard Library that are language critical, and at least a dozen that are magic and cannot be implemented in user code. For example, good luck implementing std::bit_cast correctly without some semblance of knowledge about the compiler (hint: it cannot be done in the general case). While C is much less attached to its standard library, <stdarg.h> and <setjmp.h> would like to have a word with you.
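Rust has the same property: its bit-cast primitive, `mem::transmute`, is a compiler intrinsic that user code cannot reimplement for arbitrary types. A minimal sketch of what it does (`f32_bits` is an invented helper; `f32::to_bits` is the safe, stable wrapper over the same operation):

```rust
use core::mem;

// Bit-cast an f32 to a u32. `transmute` is a compiler intrinsic; a
// user-level library cannot implement it in the general case without
// compiler support.
fn f32_bits(x: f32) -> u32 {
    // SAFETY: f32 and u32 have the same size, and every f32 bit
    // pattern is a valid u32.
    unsafe { mem::transmute::<f32, u32>(x) }
}

fn main() {
    assert_eq!(f32_bits(1.0), 0x3F80_0000);
    assert_eq!(f32_bits(1.0), 1.0f32.to_bits());
}
```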

Not to sound like a broken record, but I consider that a good thing. Stable Rust is also portable Rust. Not only would opting into the unstable lang items and builtins be unstable, it is inherently non-portable. In my (very WIP) implementation, lccc, some things that are lang items in rustc are not (Box, ManuallyDrop, and UnsafeCell are examples that come to mind), and there are additional lang items, such as ones for an unstable DerefMove and DerefPlace (which together are used to implement Box's moving deref). I would like to eventually have the ability to test lccc against stable rustc, and that is fundamentally incompatible with stable rustc allowing access to unstable implementation details to any crate that wants them.

As someone who builds C++ software using trunk clang (like, at most a week old off main, trunk clang), I absolutely disagree with the statement. Also, for the reasons I set forth above, I think it's a good thing that stable rustc be stable rust only.

11 Likes

You may want to look at ?Uninit types [exist today]. Also let’s talk about DerefMove

As a C and C++ programmer who doesn't like to use nightly, I can tell you that this isn't a big deal. At this point, Rust is feature-rich and being developed actively enough that whatever unstable things nightly offers in terms of "going low-level" are basically limited to micro-optimizations. The big features that I sometimes (rarely) miss are related to the type system, but even those are being worked on and can be worked around.

I certainly wouldn't go as far as saying that it's a "psychotic" experience.

1 Like

I remember having this discussion before, about offset_of (whether it should be a keyword or an intrinsic macro) and about await (if it should be a keyword or a method with an exotic calling convention). The thing is… it doesn’t matter much. Either way the decision is made, you’re still accessing the same compiler intrinsic that behaves the exact same way and has the exact same limitations and overheads. The only thing that changes is syntax. That’s not to say syntax doesn’t matter at all – sometimes it matters more than in other situations. But let’s not pretend this issue is something more profound than that.
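As an illustration that the operation is separable from its syntax: field offsets can already be computed in stable Rust without any dedicated `offset_of` construct. A sketch of the raw-pointer technique (essentially what the memoffset crate does; `Header` is a made-up example type, and real code must avoid creating references to uninitialized fields):

```rust
use core::mem::MaybeUninit;
use core::ptr::addr_of;

#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

// Compute the offset of `Header::len` without an offset_of intrinsic.
fn offset_of_len() -> usize {
    let uninit = MaybeUninit::<Header>::uninit();
    let base = uninit.as_ptr();
    // SAFETY: addr_of! computes the field address without creating a
    // reference, so no uninitialized data is read.
    let field = unsafe { addr_of!((*base).len) };
    field as usize - base as usize
}

fn main() {
    // With #[repr(C)], `len` is aligned to 4 bytes, so its offset is 4.
    assert_eq!(offset_of_len(), 4);
}
```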

And by the way,

This isn’t true in C either. You cannot write your own offsetof macro without invoking UB. You cannot write your own stdarg.h. You cannot portably compute INT_MIN or define uint32_t without #ifdefs checking for each particular target and compiler. You cannot write your own typedef for the type returned by the sizeof operator without size_t. And it doesn’t make sense to give up those things any more than it does to give up the volatile keyword.

I can agree that accessing fundamental compiler intrinsics ought not to pull in additional runtime cost, but I think the core versus std split accomplishes that already quite well. I see no good reason to ever forgo core: it doesn’t save you any resources, and it doesn’t grant you any extra degrees of freedom over just using the already-implemented version. Even if you defined your own UnsafeCell, you’d still have to use it the same way as the one defined in core, because the only possible implementation is using the #[lang] attribute to say ‘yes, compiler, this is that thing’. It would be at best an exercise in reinventing the wheel.
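To illustrate: even a hypothetical hand-rolled cell type has to bottom out in core's UnsafeCell, because that is the one type the compiler exempts from its no-mutation-through-`&T` rules. A minimal sketch (`MyCell` is an invented name):

```rust
use core::cell::UnsafeCell;

// A minimal Cell-like type. The interior mutability itself can only
// come from UnsafeCell; everything else is a wrapper around it.
struct MyCell<T> {
    inner: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { inner: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: we never hand out references to the interior, so
        // this read cannot alias a live mutable borrow.
        unsafe { *self.inner.get() }
    }

    fn set(&self, value: T) {
        // SAFETY: as above; &self suffices only because of UnsafeCell.
        unsafe { *self.inner.get() = value }
    }
}

fn main() {
    let c = MyCell::new(1);
    c.set(2);
    assert_eq!(c.get(), 2);
}
```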

11 Likes

Sorry if this is somewhat off-topic but I just wanted to point out how the premise of using Rust without core reminded me of this (quite enjoyable) post that I once read

6 Likes

Even just at the assembly level, in debug builds it's not zero cost. UnsafeCell::get doesn't get inlined. Presumably a built-in language feature would not have this issue. This kind of thing has led me to wonder if other features are better off with dedicated syntax, e.g. placement new.

If missing inlining in debug mode is a serious concern, then there are less invasive and more effective changes to Rust that could address the problem.

I don’t think that “performance differences in debug mode” is ever a valid argument for introducing dedicated syntax for something. These two things should have absolutely nothing to do with each other. All that “performance differences in debug mode” could ever warrant is the introduction of (potentially unstable/internal) features that allow debug performance to be more like release-mode performance in some particular aspect.

This means that, provided that these missing inlines in debug mode for UnsafeCell methods are even a practical problem at all, they could be a reason to e.g. introduce certain inlining for debug mode with special inlining annotations or something like that. Furthermore, (unlike “placement new”,) UnsafeCell is stable, so there’s nothing to be gained from new “dedicated syntax” anyways.

12 Likes

Well, if you're a C++ compiler, implementing a __builtin_move to replace std::move with in order to reduce the amount of (literally do nothing) function calls in debug builds seems at least somewhat reasonable. Or maybe that was an april fool's day post? Or maybe not?

An interesting difference between rustc and, say, gcc, is that rustc is Rust's specification currently. If rustc stabilizes something (barring some very specific legacy things), it's part of Rust, the language. GCC has a bit more leeway to support __builtin_function, since the split between core language support and vendor extensions is clearer. (For rustc, that split is #![feature], and no vendor extensions are considered stable.)

In any case, #[inline(always)] includes debug mode. We can (and probably should) have a policy of using #[inline(always)] on core items where the body of the function is "as cheap as a function call." (Defining that is the hard part.)
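As a sketch of what such a policy would look like in practice (`Wrapper` is an invented type; the attribute, unlike plain #[inline], applies even in unoptimized builds):

```rust
// A getter whose body is a single field load: exactly the kind of
// "as cheap as a function call" item the policy would cover.
struct Wrapper(u64);

impl Wrapper {
    #[inline(always)]
    fn value(&self) -> u64 {
        self.0
    }
}

fn main() {
    let w = Wrapper(7);
    assert_eq!(w.value(), 7);
}
```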

An alternative is to run some subset of MIR inlining passes, even in debug compilation, to obtain the same level of always-inlining for "as cheap as a function call" function calls. Doing so might even speed up debug compilation, as we'd be giving LLVM less IR to chew through.

But ultimately, if you're having actual issues with debug mode adding too much overhead of unnecessary function calls, that's what -O1 is for. It does a really good job of maintaining the useful debug information while stripping out those "just a formality" bits that using library rather than built-in building blocks introduces.
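For completeness, a sketch of how a project can opt into this today from its manifest (profile overrides and opt-level are stable Cargo features; the exact levels here are just an example):

```toml
# Lightly optimize the dev profile so trivial calls get inlined,
# while keeping debug info.
[profile.dev]
opt-level = 1

# Alternatively, optimize only dependencies, keeping the local
# crate at -O0 for the best debugging experience.
[profile.dev.package."*"]
opt-level = 2
```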

2 Likes

"This kind of thing" was me alluding to this being just one instance of a general category of issues. You can swap "performance in debug mode" with "performance in optimized mode in certain situations" and the same issues crop up. In general, optimizers are not good at consistently making the right decision in all contexts. Their job is too hard and they are insanely complicated, so they inevitably have edge cases they handle poorly.

You could say, without introducing any new syntax, that certain instances of function call syntax are never actually function calls. The argument against that, and the reason I would consider syntax to be a related question at all, is that syntax implies code in the compiler to handle it, which (weakly) implies the compiler directly handling the operation, which in turn increases my confidence, when writing code that uses that syntax, that I'm not going to find undesired calls in my assembly output.

Unfortunately, when I tried it, -O1 still didn't inline UnsafeCell::get in my program, despite it being a no-op function. In my experience you can depend on compilers to do the right thing on average, but they inevitably have edge cases they screw up. Built-in compiler support for something like this is more likely to work consistently: it's far less likely for LLVM to add a function call that was never in the IR than it is for LLVM to fail to eliminate a call that is in the IR.

Dedicated syntax does not imply that you can have confidence that the compiler does what you want. If you are more confident that it does just because of the syntax, you're mistaken, and you need to stop thinking like that. The compiler can poorly implement features with dedicated syntax, or it can behave well with preexisting syntax. The syntax is not a reliable way to measure the things you want to be confident in.

The compiler handles all code you pass to it, whether it has dedicated syntax or not. The syntax used implies nothing about how the compiler handles the operation, because it could easily farm off even the most dedicated syntax to a poorly written module or an ineffective optimizer.

9 Likes

Sure, and I admit I avoided the version where the discussion is about “performance in optimized mode”, because the argument might become less trivial in that case compared to when we’re only talking about debug mode. My opinion on this, however, is that even performance in optimized mode should have nothing to do with introducing special syntax for something that, functionally, doesn’t need it. If the runtime behavior of using an ordinary function is correct in all regards except performance, then the only changes needed are in the optimizer, nowhere else. It still doesn’t have anything to do with user-facing syntax.

While this is true, it is also true that special syntax is not an adequate way to handle all cases “in general”, including “edge cases”. Or in other words: if your code's performance is fixed by the special-casing that comes with special syntax, then it’s equally well fixed by the same kind of special-casing without the extra syntax. But I guess you kind of already are on the same page on this point...

So the only argument is that a programmer might feel more confident when using special syntax instead of a function call that no “calls” in assembly are generated. IMO this sounds like it is based on a quite outdated mindset that “writing a function in code corresponds to a sort-of C-style-calling-convention ‘call’ at runtime”. My opinion is that function syntax in high-level code a priori doesn’t have any direct correspondence with compiler output. We should move past the “it’s just a portable assembler, and recently they’ve added some optimization, but optimizers are complicated and unreliable so let’s just pretend they don’t exist” view of a systems programming language. The whole point of “zero cost abstraction” is that one should stop worrying about the abstraction. If an abstraction isn’t zero-cost enough already isn’t that more of an argument for fixing the abstraction cost instead of giving up on the abstraction entirely?

7 Likes

I've filed Use `#[inline(always)]` on trivial UnsafeCell methods by joshtriplett · Pull Request #83858 · rust-lang/rust · GitHub to help ensure that UnsafeCell remains zero-cost even in lower optimizations levels or debug mode.

This kind of thing is also what the MIR inliner may help with.

12 Likes

but optimizers are complicated and unreliable

Yes, yes they are. The fact that UnsafeCell can end up not being inlined is proof of that fact. Think of it this way. Maybe you can get the optimizer to inline this stuff every time. But then it's an extra step that wouldn't be necessary if it was just normal syntax in the first place. If it makes it harder for a naive implementation of the language's compiler to do it right, that's a bad thing. Rust lately seems to be adding syntax for things that aren't important for a systems language and should have been methods (like the atrocity of .await; What were they smoking?), and not adding syntax for stuff that is helpful for a systems language, (gotos, placement new, interior mutability, asm stabilization).

Optimizers aren't perfect, and frequently it's found that assumed zero cost abstractions actually aren't. C++'s std::unique_ptr not being inlined correctly comes to mind. Rust already performs much worse in debug builds than C does. Why would you defend the mentality that has led us here? Some things should be syntax. Now I'm not "syntax happy", for one I'm against box syntax and in favor of a DerefMove instead, but core, critical functionality shouldn't be a possibly costly function tagged with an unstable lang item, it should be syntax.

I should add, I think part of the problem is that you're thinking in terms of rustc being the only serious implementation forever. If Rust succeeds, that will not be true, the Rust developers will lose control, and eventually these shortcomings will become a permanent fixture of the language and an argument against it. "Cons: abhorrent performance in debug builds with most compilers"

Rust is a young language, and it can be forgiven for fixing such details now in a backwards compatible way, but it won't be that way forever.

Sure, let's suppose that some weird sigil, say, ~~, was used for UnsafeCell. Then what? What exactly was achieved besides making code that uses UnsafeCell harder to read?

await is not merely a matter of method call sugaring. It allows for a means of writing code that could not have been written otherwise (without heavily abusing unsafe operations, that is). Now, I may have particular opinions against the particular syntax they chose, but much of the code that uses await wouldn't have been feasible to write without it.

4 Likes

The reason for having .await is that it cannot be a method: the compiler desugars async fns into state machines, which get 'suspended' at each .await.
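Sketched by hand (heavily simplified; `TwoSteps` and `noop_waker` are invented names, and the real compiler-generated machine also preserves borrows across suspension points, which is why pinning is involved), the desugaring looks roughly like:

```rust
use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hand-written sketch of the state machine an async fn with one
// suspension point desugars into: an enum recording where execution
// stopped, advanced one step per poll.
enum TwoSteps {
    Start,
    Middle,
    Done,
}

impl Future for TwoSteps {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        match *self {
            TwoSteps::Start => {
                // Corresponds to hitting the ".await": record our
                // state and yield control back to the executor.
                *self = TwoSteps::Middle;
                Poll::Pending
            }
            TwoSteps::Middle => {
                *self = TwoSteps::Done;
                Poll::Ready(42)
            }
            TwoSteps::Done => panic!("polled after completion"),
        }
    }
}

// A waker that does nothing, just so we can call poll() directly.
fn noop_waker() -> Waker {
    fn noop(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(core::ptr::null(), &VTABLE)
    }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(core::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = TwoSteps::Start;
    // First poll suspends; second poll completes.
    assert!(matches!(Pin::new(&mut fut).poll(&mut cx), Poll::Pending));
    assert!(matches!(Pin::new(&mut fut).poll(&mut cx), Poll::Ready(42)));
}
```

No plain method call can do this: the enum's states correspond to the function's suspension points, which only the compiler knows.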

3 Likes

Do you have any concrete benchmarks showing that lang items like UnsafeCell are responsible for this? My guess would be that iterators would usually be responsible for the decreased debug performance.

3 Likes

And what would that built-in syntax lower as? Almost certainly a call to a built in function, which would then need to be inlined.

If you define a systems language as C, then yes, async/await isn't in C, and C programs are still written. C++ is unquestionably a systems programming language, though, and they felt it necessary to add coroutines/co_await.

Rust's .await can't be a method, either, because async/.await is a full state machine transform. Not something that is just a function call, like, say, reading a memory location is.

Interior mutability is a type system concept, so we have UnsafeCell, a type system concept to address it. What would be materially different if UnsafeCell<T> were spelled interior_mut T?

It'd be more of a pain to use as a normal type, because it'd be a new kind of type (the way Box<T> currently still is in rustc), but as part of the language, not a detail that could eventually be removed. It'd be a pain to explain the difference between &mut T, being <&mut>::<T>, and &interior_mut T, being <&>::<interior_mut T>.

And what's the benefit? Instead of UnsafeCell::get you have a whole bunch of awkward-to-specify unsafe-gated coercions from interior_mut T to T. How is this better?

We literally just got asm!, an inline asm functionality with an actual track to stabilization. It's moving along perfectly nicely.

It's not like C or C++ have standard ways to write inline assembly, either. Compilers have compiler-specific language extensions to add inline assembly, with subtly incompatible behavior. I.e. exactly the situation with #[feature(asm)]: a compiler specific, opt-in extension to the language to support a specific use case.

I think this is a point of issue here: there is no such thing as a naive compiler anymore.

This is actually due to ABI. As a class type, std::unique_ptr<T> is always passed by stack, unlike T*, which can be passed in a register. If compilers were willing to break ABI here, it would be truly zero cost.

A fairer comparison here would be C++. C just doesn't support the tower of zero cost abstraction niceties that Rust does, and relies on the optimizer to peel away.

Also, C really cheats here in most common configurations. With C, your dependencies (if you even use any libraries beyond libc) are almost certainly compiled with optimizations. With Rust, any libraries you use aren't optimized (unless you tell rustc to optimize them).

Beyond C and C++, what other popular languages have multiple competing, mainstream compilers? What's so special about systems languages that they have to have multiple compilers?

  • Java - javac
  • Python - cpython, pypy
  • C# - Roslyn, which is replacing Mono
  • Visual Basic - (I don't know)
  • JavaScript - V8 (Chrome), SpiderMonkey (Firefox)
  • PHP - (I don't know)
  • SQL - different dialects are considered different languages due to vastly different feature support
  • Ruby - (I don't know, the standard is useless anyway)
  • Go - gc, gccgo
  • Swift - just Apple's compiler

Honestly: what benefit is there to having multiple frontends to a modern language? What benefit comes from redoing work in competition rather than cooperating on a shared implementation?

4 Likes
  • I think goto would either have an extremely complex interaction with the rest of the language (how does it affect drop order? what does it do in an async fn? what happens during unwinding?), or would need to be unsafe. Do you have a concrete use-case that's not handled by an existing language feature?
  • I think there have been some discussions about 'placement new' recently, but I'm not certain.
  • Interior mutability already exists - that's what UnsafeCell is for
  • The implementation of asm! still has some work being done on it - though people are already using it on nightly. A major feature like asm! isn't going to be stabilized before it's ready.
3 Likes