Add rustc flag to disable mutable no-aliasing optimizations?

In C and C++, we have -fno-strict-aliasing, to disable all such aliasing optimizations that break type-punning. A similar though not identical issue exists in Rust, where unsafe code cannot alias &mut references without invoking undefined behavior.

The situation is worse than it first appears. Even if you use a *mut to a struct to allow aliasing, you cannot ever create a &mut to any of its member data at any point. This opens up a whole can of worms of its own.

More often than not, when I write C++, I compile with -fno-strict-aliasing. I find the freedom it gives me is worth more than the small performance improvement.

There are some workflows where the inability to alias &mut becomes a handicap. E.g. recursive mutexes, which are useful in some cases. You could use a Cell, but that can complicate things significantly and disallows multithreading, especially if the lock is behind the pointer instead of in front of it, as may be the case in e.g. FFI purposes.

I understand that one could argue that this option would allow bad practice, but if you're using unsafe, it should be assumed that you already know what you're doing, and I personally find the ideology of a language forcing you to write code in a particular way to be highly offensive. The programmer should beat the language into shape, not the other way around.

Thanks for your time.

Whether or not rustc makes use of the codegen backend's support for optimizing &mut using noalias, by specification and definition &mut is still an exclusive reference. Various parts of the language and library depend on that exclusivity for soundness. &mut is not the mechanism you're looking for. There is a way to tell the language exactly what you want: you can use unsafe, raw pointers, UnsafeCell, and abstractions built atop them.

As an example, an abstraction could hand you a &mut SharedThing<T>, and provide methods to let you write to the underlying T, without ever giving you a &mut T. Such abstractions need not have any overhead, and should be able to compile down to exactly the machine operations you'd expect. Such abstractions could even project to fields (e.g. accessing a field by going from &mut SharedThing<T> to &mut SharedThing<TypeOfSomeFieldOfT>); there's support in the works now that would make that feasible.
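
To make that concrete, here is a minimal sketch of such an abstraction. The name SharedThing comes from the paragraph above, but the methods (new, write, read) and the helper bump are made up for illustration, and the sketch works through &SharedThing<T> rather than &mut SharedThing<T> to keep it short. It is essentially a stripped-down core::cell::Cell, restricted to Copy types so that no destructor can run in the middle of a write.

```rust
use core::cell::UnsafeCell;

/// Hypothetical wrapper: lets several handles write to one `T`
/// without a `&mut T` ever being created.
pub struct SharedThing<T> {
    inner: UnsafeCell<T>,
}

impl<T: Copy> SharedThing<T> {
    pub fn new(value: T) -> Self {
        Self { inner: UnsafeCell::new(value) }
    }

    /// Overwrites the stored value through a shared reference.
    pub fn write(&self, value: T) {
        // SAFETY: `SharedThing` is `!Sync` (because `UnsafeCell` is), and it
        // never hands out references to the inner value, so no reference to
        // it can be live while this write happens.
        unsafe { *self.inner.get() = value }
    }

    /// Returns a copy of the stored value.
    pub fn read(&self) -> T {
        // SAFETY: same reasoning as in `write`.
        unsafe { *self.inner.get() }
    }
}

/// `a` and `b` are allowed to alias; the compiler must assume they might.
pub fn bump(a: &SharedThing<u32>, b: &SharedThing<u32>) -> u32 {
    a.write(a.read() + 1);
    b.write(b.read() + 10);
    a.read()
}
```

Calling bump(&s, &s) on a SharedThing holding 0 returns 11, because both writes land in the same location. With two &mut u32 parameters the same call would not even compile.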

That's not correct, is it? From my understanding of Stacked Borrows, you just have to not dereference the pointer between uses of the borrow.

(I don't recall if Stacked Borrows was ever formally adopted as Rust's aliasing model, but I think a lot of existing code, even in std, would be unsound if what you said is true.)
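
A small sketch of the pattern I mean, as I understand the model (the Pair struct and both functions are hypothetical, and running it under Miri would be the way to check it): a short-lived &mut to a field is created from the raw pointer, used, and dropped before the raw pointer is used again.

```rust
pub struct Pair {
    pub a: u32,
    pub b: u32,
}

/// Creates a temporary `&mut` to one field through the raw pointer.
///
/// # Safety
/// `p` must point to a valid `Pair`, and no other reference to it may be
/// in use for the duration of the call.
pub unsafe fn bump_a(p: *mut Pair) {
    let a: &mut u32 = unsafe { &mut (*p).a };
    *a += 1;
    // `a` is never used after this point, so later uses of `p` do not
    // overlap with it.
}

pub fn demo() -> u32 {
    let mut pair = Pair { a: 0, b: 0 };
    let p: *mut Pair = &mut pair;
    // From here on, the struct is only accessed through `p`.
    unsafe {
        bump_a(p);
        (*p).b += 10; // raw-pointer access between the two `&mut` borrows
        bump_a(p);
        (*p).a + (*p).b
    }
}
```

What would be a problem is holding on to the &mut u32, touching the field through p, and then using the &mut u32 again.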

Hmm. Perhaps one thing that would be acceptable is a stabilized version of the lang item attribute for UnsafeCell. This would allow creating custom interior mutability patterns without the need for the core library. Honestly I think core/std related lang items should be stabilized anyways.

What's wrong with using core and UnsafeCell or some of the higher-level types like Cell or RefCell? They have been implemented and reviewed carefully, and they are way harder to use incorrectly or inefficiently than whatever someone else were to build themselves. core is possible to use even in constrained, OS-less environments, so there's no need to add your own lang items. You cannot circumvent the rules of the core language anyway, not even with custom types.
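
As one illustration of how far the existing types go, here is a small sketch using only core. The function name add_first_to_all is made up; Cell::from_mut and as_slice_of_cells are existing core APIs. It reads one element of a slice while mutating every element, which plain &mut borrows would reject, and it needs no unsafe.

```rust
use core::cell::Cell;

/// Adds the (current) first element to every element of the slice,
/// including the first element itself.
pub fn add_first_to_all(values: &mut [i32]) {
    // View the exclusively borrowed slice as a slice of cells. This costs
    // nothing at runtime; it only changes what the type system permits.
    let cells: &[Cell<i32>] = Cell::from_mut(values).as_slice_of_cells();
    if let Some(first) = cells.first() {
        for cell in cells {
            // `first` and `cell` may refer to the same element.
            cell.set(cell.get() + first.get());
        }
    }
}
```

Because the first element is updated in the first iteration, the slice [1, 2, 3] becomes [2, 4, 5].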

What's wrong with using core and UnsafeCell or some of the higher-level types like Cell or RefCell?

That touches on something that worries me overall on Rust. Make no mistake, I enjoy Rust as a language, but traditionally a systems language is more able to be divorced from its standard library.

C can be divorced entirely, even from the freestanding headers. C++ can be divorced nearly completely, with some exceptions like std::initializer_list if you want to actually use those features.

But, Rust not only uses the core library for basic language ability, but the ability to forgo it and replace its functionality is badly crippled by the instability of lang items, and what's worse, the inability to voluntarily enable unstable features in stable Rust.

So, your choices are to use a nightly compiler, which would be absolutely psychotic to any C or C++ programmer, or stick to stable features only. I strongly believe that feature flags should be usable from stable without RUSTC_BOOTSTRAP. If stable rustc malfunctions while trying to use unstable features, that's a price I'd gladly pay. They are called unstable features for a reason, after all, and serves me right for trying to use unstable features. I just want the choice. There was previously a workaround for that, you could set RUSTC_BOOTSTRAP=1 in build.rs, but that's now been patched with a completely artificial limitation to prevent you from doing that. Again circles back to the language developers trying to force you to use the language a certain way, which seems very much against FOSS philosophy, in spirit if not in letter.

I must admit to liking C++ as well for being the complete opposite of Rust in that philosophy, but the problem with C++ is that memory safety is a sick joke there, especially when it comes to any kind of multithreading. Rust is among the only languages in existence that provides memory safety without a garbage collector runtime penalty, and Rust is by far the best supported. If the Rust devs felt so strongly about RUSTC_BOOTSTRAP, they should have banned crates using it from crates.io, but not prevented its use in cargo based crates themselves.

It's not clear to me why libcore is problematic in your eyes. It's clear you'd like to use Rust without using libcore, but so far I haven't seen the motivation for why that's worth pursuing.

For C and C++ it's more understandable: the C and C++ standard libraries interact with the operating system in certain ways. But Rust's libcore doesn't really have that problem.

It's not clear to me why this is such a big deal. Stable Rust inherently excludes unstable features. Enabling unstable features is the opposite of using stable Rust.

The one slightly compelling argument I've heard for something like this is in the Mechanism for beta testing unstable features thread. But even there I think the focus is on limiting what unstable features can be used, not necessarily whether stable/beta/nightly channels are used.

As a C/C++/etc. programmer, I strongly disagree. Nightly Rust isn't some #YOLO wild west. It's quite stable (in terms of predictable and correct behavior).

The problem here is that you need to abandon all stability guarantees to use any unstable functionality. It's objectively better to have the guarantee that stable features will function correctly than to have no guarantee of stability whatsoever simply because you need one or two unstable features. I've had nightly break on me before; it's not perfectly stable and is not suitable for production.

Could you please say why you want to use Rust without libcore?

Because I'd like to potentially replace it with an even more minimalist standard library at some point with different semantics and features.

OK, but to be clear, I didn't say std. I said core. core is smaller than std. Putting aside that, here is my advice:

  1. Go build the more minimal core library with the semantics you desire. Use whatever unstable features are necessary.
  2. Test your use case against it. Does it actually work with the rest of the language?
  3. Open a Pre-RFC and seek guidance from the lang team on how to make your use case work with stable Rust, if such a thing were even possible at all.

This will depend heavily on what use cases are unlocked by this. The Rust teams have limited bandwidth to make changes.

Arguably, libcore is simply part of the language, and inherently should not even need to be replaced. This is why people aren't understanding what you want.

Consider UnsafeCell as an example- as a lang item, it's a fundamental primitive that the compiler has to know about. It could have been provided as special syntax, like C++'s mutable. Defining it as a struct in a Rust source file is more of a convenience than anything (i.e. it can get away with reusing 99% of the syntax and semantics of a normal struct, it just needs to tweak some compiler analysis), but even then there's not much flexibility in how you define it.

You've invoked a relatively vague idea that "systems languages" can be divorced from their standard library, but IMO this is thinking about this the wrong way. Rust already can be divorced from its standard library. What remains is just a bunch of stuff like UnsafeCell that you don't really benefit from removing, because it's not platform-dependent and it doesn't require any runtime support. Things that are essentially language features but don't warrant being fully hard-coded into the compiler.

So if you really want to work without libcore, you should be able to point to some other benefit beyond that.

Personally, I too would like to use Rust with a minimal standard library. Basically, in order to minimize binary size and maximize control, I want to be responsible for every line of code that gets translated to assembly.

But I agree that it's fine for core::cell::UnsafeCell to be the mechanism by which the language feature of interior mutability is exposed. This is because it's a truly zero-overhead abstraction compared to just applying a lang item attribute to my own struct – in the sense that it does not generate any additional assembly or trace in the binary whatsoever. (Except for debug info, but whatever; Rust's debugger story has bigger problems than that.)
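
For the layout half of that claim, a quick sketch (the helper name same_layout is made up): the UnsafeCell documentation says it has the same in-memory representation as its inner type, so size and alignment should match for any T. This says nothing about generated code, only about data layout.

```rust
use core::cell::UnsafeCell;
use core::mem::{align_of, size_of};

/// Returns true when wrapping `T` in `UnsafeCell` leaves its size and
/// alignment unchanged.
pub fn same_layout<T>() -> bool {
    size_of::<UnsafeCell<T>>() == size_of::<T>()
        && align_of::<UnsafeCell<T>>() == align_of::<T>()
}
```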

Similarly, the current practice of wrapping extern "rust-intrinsic" functions in trivial wrapper functions is fine, since they do not generate any extra assembly compared to calling the intrinsic directly, at least at any nonzero optimization level.

However, I see two areas for improvement:

  1. Eliminate all cases where non-zero-overhead library abstractions are mandatory in order to use Rust or certain Rust language features. That includes, among others:

    • Box, because you can move out of it; likely to be fixed with DerefMove
    • Panics, when produced by mandatory features, like integer overflow or indexing a slice out of bounds. Last I checked, it was possible but hard to even eliminate the built-in formatting machinery. I would like to not only do that, but also replace it with my own formatting machinery, while still getting line numbers etc. for panics.
    • Waker is required for async and is very prescriptive; could be fixed with some existing proposals for generalized coroutine support.
  2. Add a mechanism, possibly based on Cargo feature flags + build-std, to disable all the non-zero-overhead parts of libcore. This way I can ensure that I (or someone else working on the code) don't use those parts by accident.

    A lint could also work, but any external lint would have a hard time keeping track of which APIs are zero-overhead; there really needs to be some kind of marking inside libcore itself.
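
For the panic item in the list above, the closest thing available today is to avoid the panicking operations by hand. A minimal sketch, with a made-up function name: slice::get and checked_add return Option instead of branching to a panic. This does not replace the formatting machinery or keep line numbers, so it only covers part of what I am asking for.

```rust
/// Adds two elements of a slice without any panicking operation:
/// out-of-bounds indices and arithmetic overflow both yield `None`.
pub fn sum_pair(values: &[u32], i: usize, j: usize) -> Option<u32> {
    let a = *values.get(i)?; // instead of values[i]
    let b = *values.get(j)?; // instead of values[j]
    a.checked_add(b) // instead of a + b
}
```

Whether the panic machinery actually disappears from the binary still depends on everything else the crate calls, which is exactly why a marking inside libcore would help.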

There was previously a workaround for that, you could set RUSTC_BOOTSTRAP=1 in build.rs, but that's now been patched with a completely artificial limitation to prevent you from doing that.

If the Rust devs felt so strongly about RUSTC_BOOTSTRAP, they should have banned crates using it from crates.io, but not prevented its use in cargo based crates themselves.

Note that cargo only prohibits setting RUSTC_BOOTSTRAP from build.rs. In other words, your users can still opt-in to breakage, but you cannot opt-in for them without their knowledge.

Note: Box magic wrt. moving deref is a lot more complicated than simple DerefMove.

For core, I would strongly caution against allowing its implementation in stable rust. core, and many of its components, necessarily interact with internal implementation details of the rust compiler. By exposing these stably, they are fixed, and do not provide variation in those implementation details. Control over the precise assembly would require control over these implementation details (which comes at a cost of portability, including portability to different versions of the same implementation).

And, as an implementor, trying to (eventually) have parity with rustc stable, I very much thank them for that limitation. I'd prefer that crater runs, when I have a compiler that can do those, not fail because crates are opting out of stability on the version of the compiler that I should be able to rely upon implementing the core language, and only making the core language available.

I can list 5 more features in the C++ Standard Library that are language critical, and at least a dozen that are magic and cannot be implemented in user code. For example, good luck implementing std::bit_cast correctly without some semblance of knowledge about the compiler (Hint: It cannot be done in the general case). While C is much less attached to the standard library, <stdarg.h> and <setjmp.h> would like to have a word with you.

Not to sound like a broken record, but I consider that a good thing. Stable Rust is also portable rust. Not only would opting into the unstable lang items and builtins be unstable, they are inherently non-portable. In my (very WIP) implementation, lccc, some things that are lang items in rustc are not (Box, ManuallyDrop, and UnsafeCell are examples that come to mind), and there are additional lang items, such as one for an unstable DerefMove and DerefPlace (which together are used to implement Box moving deref). I would like to eventually have the ability to test lccc against stable rustc, and this is fundamentally incompatible with stable rustc allowing access to unstable implementation details to any crate that wants them.

As someone who builds C++ software using trunk clang (like, at most a week old off main, trunk clang), I absolutely disagree with the statement. Also, for the reasons I set forth above, I think it's a good thing that stable rustc be stable rust only.

You may want to look at ?Uninit types [exist today]. Also let’s talk about DerefMove

As a C and C++ programmer who doesn't like to use nightly, I can tell you that this isn't a big deal. At this point, Rust is feature-rich and being developed actively enough that whatever unstable things nightly offers in terms of "going low-level" are basically limited to micro-optimizations. The big features that I sometimes (rarely) miss are related to the type system, but even those are being worked on and can be worked around.

I certainly wouldn't go as far as saying that it's a "psychotic" experience.

I remember having this discussion before, about offset_of (whether it should be a keyword or an intrinsic macro) and about await (if it should be a keyword or a method with an exotic calling convention). The thing is… it doesn’t matter much. Either way the decision is made, you’re still accessing the same compiler intrinsic that behaves the exact same way and has the exact same limitations and overheads. The only thing that changes is syntax. That’s not to say syntax doesn’t matter at all – sometimes it matters more than in other situations. But let’s not pretend this issue is something more profound than that.
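
For reference, a sketch of what the macro form looks like in use, assuming a toolchain recent enough to have core::mem::offset_of! (stable since Rust 1.77). The Header struct and the two functions are made up for the example.

```rust
use core::mem::offset_of;

/// Hypothetical C-layout struct; `repr(C)` fixes the field order.
#[repr(C)]
pub struct Header {
    pub tag: u8,
    pub len: u32,
}

/// Byte offset of `tag`: the first field of a `repr(C)` struct sits at 0.
pub const fn tag_offset() -> usize {
    offset_of!(Header, tag)
}

/// Byte offset of `len`: one byte of `tag`, rounded up to the alignment
/// of `u32`.
pub const fn len_offset() -> usize {
    offset_of!(Header, len)
}
```

Written as a keyword or as a macro, the compiler does the same computation, so the choice really is only about syntax.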

And by the way,

This isn’t true in C either. You cannot write your own offsetof macro without invoking UB. You cannot write your own stdarg.h. You cannot portably compute INT_MIN or define uint32_t without #ifdefs checking for each particular target and compiler. You cannot write your own typedef for the type returned by the sizeof operator without size_t. And it doesn’t make sense to give up those things any more than it does to give up the volatile keyword.

I can agree that accessing fundamental compiler intrinsics ought not to pull in additional runtime cost, but I think the core versus std split accomplishes that already quite well. I see no good reason to ever forgo core: it doesn’t save you any resources, and it doesn’t grant you any extra degrees of freedom over just using the already-implemented version. Even if you defined your own UnsafeCell, you’d still have to use it the same way as the one defined in core, because the only possible implementation is using the #[lang] attribute to say ‘yes, compiler, this is that thing’. It would be at best an exercise in reinventing the wheel.

Sorry if this is somewhat off-topic but I just wanted to point out how the premise of using Rust without core reminded me of this (quite enjoyable) post that I once read

Even just at the assembly level, in debug builds it's not zero cost. UnsafeCell::get doesn't get inlined. Presumably a built-in language feature would not have this issue. This kind of thing has led me to wonder if other features are better off with dedicated syntax, e.g. placement new.