I understand the distinction between defaults and escape hatches / the availability of a feature for the edge case. This discussion is not just about priorities in the latter though, but also about priorities in the former (in which the examples I enumerated are relevant, less so in the latter, I agree). If your language commonly skews defaults away from performance, then "used so widely" is bound to happen. Fortunately, I don't think that is true for Rust. In my view, we've struck a good balance where many idioms are at once convenient, sound, and performant.
("Don't have to use [the run-time]" is not the same as "I can make use of all parts of my hardware". The former is primarily about the lack of something whereas the latter is primarily about additional features.)
As I mention above, the tradeoffs @matklad is referring to apply equally to defaults, not just to access to uncommonly used parts of hardware.
The disagreement is indeed about the level of control here. I'm not saying that we shouldn't accept some niche features. I do however disagree with "no room for a lower level language" being used as a slogan to say that we must accept every such feature (I'm not saying that is your view). To me it's perfectly legitimate to do a cost/benefit analysis and decide that a "C parity" feature wasn't worth it, or perhaps that it needs to be exposed in a more general way. This actually feels in line with what Chandler et al. are saying:
At this stage, the primary prioritization should be based on the relative cost-benefit ratio of the feature. The cost is a function of the effort required to specify and implement the feature. The benefit is the number of impacted users and the magnitude of that impact. We don't expect to have concrete numbers for these, but we expect prioritization decisions between features to be expressed using this framework.
Secondarily, priority should be given based on effort: both effort already invested and effort ready to commit to the feature. This should not overwhelm the primary metric, but given two equally impactful features we should focus on the one that is moving fastest.
It's probably never worth it to expose a C feature or a hardware feature exactly as it is, but that's not what anyone is suggesting here. A goal to "leave no room for a lower level language" pushes us to find a more Rust-appropriate way to give users the control they need.
Of course everything in running a language requires weighing the options, there are limits to our time, number and knowledge of contributors, etc. But that doesn't mean it's a bad goal! Having strong goals like that is exactly how you prioritize those limited resources.
I don't really get the idea of leaving no room for a lower level language - surely a lower level language will always exist, and it is called assembly. But simultaneously I do agree that Rust is not meant to be a good choice for "application programming" - that is to say, if you don't usually need to care about fine-grained control of memory layout in a given program, Rust is not meant to be for you. Rust requires you to care about memory layout.
It seems really silly to me to try to say abstractly whether we prioritize performance over maintainability or vice versa. We prioritize both! We recognize that a curved surface exists between all of these positive attributes and we are trying to reach the maximal point - for systems programming, at least. This is sort of the important point: we know what our domain is.
I find that line from Perl very appealing - that easy things should be easy and hard things should be possible. And I think we should put work into making the hard things easier, too. I think if we aim at having no floor to our use cases, we can accidentally dip our ceiling too low. Surely we also don't want people saying they use C++ not because it is faster than Rust, but because it is easier.
To me this is connected to what I see as the deepest conflict within Rust's design: between an ideology of prohibition - no one should be able to write bad code - and an ideology of empowerment - everyone should be able to write good code. I'm squarely on the side of empowerment. We should always be striving to enable people to do it right, keeping them from doing it wrong is not enough.
So I think we understand each other and are making different value judgements and priorities for those limited resources.
As for whether Rust is meant for "application programming" (whatever that means?), I have to disagree. While I have to care more about memory management than in a garbage-collected language like, say, Java or Haskell, the compiler is sufficiently friendly through error messages and elision, and provides sufficient abstractions, that memory management and resource control isn't an issue most of the time.
Particularly in a large-scale application (let's take rustc as an example), there are a few people who write the high-perf abstractions, providing them as a library or "DSL", and then there are folks who use those. People in the latter category usually don't need to think about memory layout and management in day-to-day programming (which is the case if you're working on, e.g., rustc's parser or diagnostics in general).
I find that this makes Rust a particularly empowering language. I can write the ordinary, non-perf-critical code that makes up the bulk of a program without needing to fear undefined behavior at all, but when I need to (and when I want to learn those aspects), I can also tweak those hot loops, or improve that library abstraction for everyone else.
I believe that this is a domain where features could be used to great effect.
Whether unsafe is used as a means to achieve greater performance, or as a means to provide more features, a library could feature-gate those uses, in possibly fine-grained ways:
Users who do not need absolute performance from this library -- they do not use it in their hot-spots -- can simply leave the features off, and the library is therefore "no-unsafe" for them.
Users who do need absolute performance can enable the toggles, possibly piecemeal to limit themselves to code they have audited, or are willing to "bet" on.
It does require some extra work from both the library author and their most performance-demanding users -- such is the price of attempting to cater to the wants of a larger user base -- however it certainly seems possible even today, and there may be ways to make it easier in the future.
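A rough sketch of what such a toggle could look like today, using a Cargo feature to select between a safe and an unchecked implementation (the feature name `fast-unsafe` and the function are purely hypothetical illustrations):

```rust
// Hypothetical library sketch: an unchecked hot-path implementation gated
// behind an (invented) Cargo feature `fast-unsafe`, with a fully safe
// fallback when the feature is off.

/// Sums the first `n` elements of `data`. Caller contract: `n <= data.len()`.
#[cfg(feature = "fast-unsafe")]
pub fn sum_prefix(data: &[u64], n: usize) -> u64 {
    debug_assert!(n <= data.len());
    let mut total = 0;
    for i in 0..n {
        // SAFETY: the caller contract (checked in debug builds) guarantees
        // that i < data.len().
        total += unsafe { *data.get_unchecked(i) };
    }
    total
}

#[cfg(not(feature = "fast-unsafe"))]
pub fn sum_prefix(data: &[u64], n: usize) -> u64 {
    // Safe version: bounds checks (and a panic on bad input) instead of UB.
    data[..n].iter().sum()
}

fn main() {
    assert_eq!(sum_prefix(&[1, 2, 3, 4], 3), 6);
}
```

Users who never enable the feature get a "no-unsafe" build of the library; users who opt in accept the audited unsafe in exchange for skipping the checks.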
I happen to care much about software performance. I now work in HFT, where the most performance-critical services must reply in single-digit microseconds.
My experience is that there is often a trade-off to be made between correctness (& accuracy) vs performance (latency & throughput), and I believe Rust the language is well-suited to enabling users to make the trade-off they wish.
This is the exact kind of "unsafe fear" @kornel warns against (correct me if I'm wrong).
A library using unsafe may be harder to trust to be memory safe than a fully safe, machine-checked one, yes. And, given the same performance, the safe method should always be preferred. But the two, from the outside, should appear identical.
Maintaining two implementations is the worst of both worlds, though, and potentially even worse than that! The moment you write the same code twice, you have to maintain both paths and make sure they stay equivalent; memory unsafety isn't the only kind of bug that code can have. Additionally, one of the code paths is going to be much less used and tested than the other.
If the unsafe is just to skip checking some invariants, then it's fairly simple to just check the invariants. But most justified unsafe cannot be replaced without restructuring the code entirely (think graph-of-ptrs vs graph-of-indices kind of thing), which would mean any such nounsafe branch would diverge further from the optimal path.
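For the "just check the invariants" case, the standard library itself shows the pattern: std provides both a checked and an unchecked UTF-8 conversion, so a no-unsafe configuration can simply pay for the validation.

```rust
// The trivially replaceable kind of unsafe: skipping a runtime invariant
// check that a safe API performs for you.

fn main() {
    let bytes = b"hello";

    // Checked: validates the invariant (valid UTF-8) at runtime.
    let s1 = std::str::from_utf8(bytes).expect("valid UTF-8");

    // Unchecked: skips validation. Sound here only because we know the
    // bytes are ASCII; wrong input would be undefined behavior.
    let s2 = unsafe { std::str::from_utf8_unchecked(bytes) };

    assert_eq!(s1, s2);
}
```

The structural kind of unsafe (graph-of-ptrs vs graph-of-indices) has no such one-line safe twin, which is exactly the divergence problem described above.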
With Rust, it shouldn't ever be a trade-off between correctness and runtime efficiency. Correctness is always required. And maintaining two implementations is making correctness harder, not easier in any way.
unsafe is not merely a performance boost. Framing it as such is too narrow. It is legitimately required for mutable iterators, for example. It's needed for FFI.
The biggest issue is that it's often entirely impractical to have unsafe as a toggle, because there are situations where whether you use it or not has to leak to the public interface in a very big way. For example if you can't use unsafe to implement a mutable iterator that gives out &mut T, you may have to change the whole data representation to Arc<Mutex<T>> and this is a major change for both implementation and external users.
Even for purely performance cases, changing zero-copy to cloning may change algorithms from O(1) to O(n), and therefore change uses of them from O(n) to O(n^2). That is, it's not a choice between fast and a bit slower; it's a choice between "it works" and "it grinds to a halt".
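To make the interface-leakage point concrete, here is a minimal contrast (a sketch, not anyone's proposed API): `iter_mut` hands out `&mut T` in O(1) per element with zero copies, while a design forbidden from using unsafe internally might fall back to shared ownership with dynamic borrow checking, which changes the public type for every user.

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Zero-copy: &mut access, no allocation, no runtime borrow tracking.
    // Implementing an iterator like this requires unsafe inside std.
    let mut v = vec![1, 2, 3];
    for x in v.iter_mut() {
        *x *= 10;
    }
    assert_eq!(v, vec![10, 20, 30]);

    // The "no unsafe anywhere" shape: shared ownership plus dynamic borrow
    // checks, leaking into every signature that touches the data.
    let w: Vec<Rc<RefCell<i32>>> = vec![1, 2, 3]
        .into_iter()
        .map(|n| Rc::new(RefCell::new(n)))
        .collect();
    for cell in &w {
        *cell.borrow_mut() *= 10;
    }
    assert_eq!(*w[0].borrow(), 10);
}
```

The second shape is not just slower; it is a different public interface, which is why unsafe often cannot be a mere toggle.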
To be on topic: "safety first, speed second" should be left to other languages. There are plenty slightly slower languages that can be sandboxed well. Even if you'd still prefer to use Rust, you can target WASM+WASI to make unsafe safe at the cost of speed.
I do think that "leave no room for a lower level language" is not a helpful slogan at this point. It made a lot of sense back when C and C++ were new and needed to be sold on their merits, but for me, when talking about Rust in 2020, it sounds redundant at best.
It's already uncontroversial that there is "no room" between Rust and assembly, at least at the level of core language design. In fact, from what I've seen it's significantly more controversial whether there's still "no room" between C/C++ and assembly (most bluntly: "C is not a low-level language"). It also seems pretty unhelpful as a guiding principle for evaluating feature requests; I don't think anyone is arguing for inline assembly on the basis that it's "lower level" than "outline" assembly.
More abstractly, but IMO even more importantly, I think a huge part of the value of Rust is showing us how much the "low-level" vs "high-level" dichotomy is not a fundamental law of programming language design, but a historical accident. We've all seen plenty of examples where using the "high-level" feature optimizes much better in practice, especially when composing several abstractions in a larger project. So I'd much rather dispense with all this talk of levels in our broadest value statements. All the posts in this thread disagreeing over what the slogan appears to mean at first glance seem to me to prove that the levels metaphor has simply become counterproductive.
Of course, the intent behind slogans like "leave no room for a lower-level language" is usually accurate and well-meant, even if I'd strongly prefer not to phrase it that way. For example, I think it is uncontroversial to say that coming to a community consensus on what, if any, form of inline assembly Rust should have is a higher priority than coming to a community consensus on what, if any, form of dependent typing Rust should have (and not just because const generics is already accepted). Similarly, I believe figuring out const generics, GATs, placement new, etc are significantly higher priorities than figuring out delegation, named/optional parameters, fields-in-traits, etc (incidentally, this is why I haven't been talking about delegation despite it being the closest thing I have to a "pet feature").
I don't currently have an alternative catchy slogan to encapsulate why I think that's the case. I'm also convinced by the arguments above that a priority ranking like "1. safety 2. performance" would be similarly unhelpful, inaccurate and misleading. The cases where they conflict tend to be far more complex design decisions than "N safety < 2N performance, therefore perf wins".
If I had to take a stab at defining what Rust is all about in slogan form, I'd probably go with something like "raising the bar for all aspects of systems programming." But for marketing purposes, I think the actual marketing slogans we've been using historically like "fearless concurrency" and "memory safety without garbage collection" and "fast, reliable, productive -- pick three" are already very good, perhaps even close to optimal.
On the subject of proposals to opt-out of unsafe and other forms of "unsafe fear" (if we choose to call it that), everything I have to say is in this painfully thorough post:
and AFAIK nothing has happened since then to change any of the reasoning summarized and synthesized there. So I'd like to avoid us retracing that entire design space yet again. If @matthieum or anyone else has a genuinely new idea in this space that doesn't succumb to familiar objections, that would easily deserve a whole new thread.
Stepping all the way back to the original premise of this thread, my honest reaction to the goals articulated in the paper is that Rust already embraces all of them in a broad sense, albeit with very different wording and with many small details changed (e.g. the priority ranking might be helpful and accurate in a C++ context, but for Rust I think it'd be a distraction at best).
This may sound incorrect for "Backwards or forwards compatibility" as a non-goal, but they're clearly talking about a much more draconian sort of "compatibility", and loosening that in favor of enabling more "language evolution" would essentially be bringing C++ much closer to what we might call Rust's "stability without stagnation" policies.
I never said that unsafe was only about a performance boost. I said:
Which identifies a subset of use cases where both a safe and an unsafe alternative could be provided.
I agree that it's not always possible. Whenever it is, though, extra work can be done in exchange for peace of mind for users -- extra work which may very well be provided by the interested users in the first place.
This slogan is very much at the heart of the controversy going on in the C++ community at this point, and reflects the disparate opinions on a number of axes.
Over the last couple of months there's been an increasing amount of frustration voiced by the more performance-minded part of the C++ community, which is discontented with a number of technical decisions in the language and standard library -- either past decisions that cannot be challenged in the name of backward compatibility, or new decisions.
Examples of backward compatibility:
De facto ABI stability is preventing a lot of changes. There was (finally) a discussion in C++20... and the committee essentially punted on the question, leaving people arguing both for and against disgruntled. Examples of the resulting issues are the slow std::regex, std::unique_ptr being passed on the stack rather than in a register, etc...
Move semantics looked good compared to Copy semantics, but Rust came and proved that they left quite a bit of performance on the table -- unfortunately by now they're sanctified in the C++ standard, so it's not clear how to improve on them.
Example of new decision:
Stackless coroutines allocate. For some reason, the C++ committee decided to innovate: rather than follow the typical pattern of lowering a coroutine to a variant (state machine), as C# or Rust does, they took a new approach. Gor Nishanov made a couple of demonstrations with "negative abstraction overhead", showing off how the optimizer could rip through the layers of abstraction and remove the allocation. As the feature was implemented in more compilers and more code started using it, it quickly became clear that -- as usual -- optimizers sometimes fail to optimize. Cue disgruntled users.
This is the context in which this paper is written -- at Google scale every little bit of performance matters, so Googlers obviously favor performance over pretty much anything else, and are not so satisfied about all the C++ features that sacrifice performance.
I do not think that Rust suffers from the same issues. The Rust language team is very much performance conscious, and the various features of Rust have clearly been designed with performance in mind.
Of course, Rust is younger, and as the years pass it may start showing its age, or may switch tack. I certainly hope not -- and perhaps articulating the language goals would help keep it pointed in the current direction.
This is a bit off-topic, but I have to mention it because this is a big misunderstanding.
Compiling to Wasm does not protect you at all against unsafe: Wasm has access to a large contiguous chunk of memory which is used for both the stack and heap in Rust.
This is basically the same as the virtual memory provided by the OS, and Wasm can write whatever bytes it wants into the memory, at any location. So you can still get buffer overflows, undefined behavior, corrupted memory, stack smashing, null pointers, dangling pointers, segfaults, etc.
The Wasm sandbox is not like the JVM sandbox; it's far simpler and more lightweight. Its purpose is not to protect your Wasm code, its purpose is to protect the host OS from your Wasm code.
It does that by preventing the Wasm code from accessing outside of its large memory chunk (similar to page faults), and also preventing the Wasm code from accessing host OS APIs (unless the OS chooses to give those APIs to Wasm). That's it. You don't really get any more protection than that.
In particular, "memory safe" just means Wasm can't muck around with the host OS's memory, it does not mean that Wasm's internal memory is safe. Wasm is basically just a lightweight process, nothing more.
You should treat Rust code compiled to Wasm the same way you would treat any other compiler target: the behavior will be the same (including unsafe), and the performance will be very similar.
The reason why Wasm is 10-50% slower than native code is generally because of a lack of optimizations (which will get fixed). The sandbox has a small performance cost, but it's generally quite small (the memory overflow checks are usually transformed by the Wasm compiler into page fault checks, so they're fast).
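To make this concrete, here is a small (deliberately well-defined) snippet: the raw-pointer writes below bypass any safe API over the buffer, and they succeed identically whether compiled natively or to Wasm, because they stay inside the program's own memory, which is all the sandbox guards the boundary of.

```rust
// Inside the process's (or Wasm module's) own memory, `unsafe` can
// scribble over neighboring data and no sandbox objects. These writes
// stay in bounds, so this program is well-defined, but an out-of-bounds
// version that stayed inside linear memory would be equally invisible
// to the Wasm sandbox (and would be undefined behavior).

fn main() {
    let mut buf = [0u8; 8];
    let p = buf.as_mut_ptr();
    unsafe {
        // Overwrite bytes 4..8 through raw pointer arithmetic,
        // bypassing any safe view of `buf`.
        for i in 4..8 {
            *p.add(i) = 0xFF;
        }
    }
    assert_eq!(buf, [0, 0, 0, 0, 0xFF, 0xFF, 0xFF, 0xFF]);
}
```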
I know, but if you insist on distinction between protecting the host OS vs protecting integrity of the program itself, then the problem changes from a well-defined security boundary to a more vague problem of program correctness and bugs. Buggy or malicious libraries can screw their users in lots of ways without any unsafe (e.g. a regex library that lies about strings it finds, or a parser that injects attacker's data).
@kornel That seems like a really strange argument to me. Rust programs are run in OS processes. The OS gives the Rust process its own memory, so it cannot corrupt other processes' memory (or the OS memory). Yet it can still use unsafe to corrupt its own memory (with disastrous effects).
That is the same situation that Wasm is in. So I don't see how you could argue that Wasm protects against unsafe, unless you're willing to claim that OS processes also make unsafe safe (and therefore all uses of unsafe are safe).
We are not talking about general program bugs. You very specifically said that Wasm makes unsafe safe. I want to be very clear here: Wasm does not make unsafe safe. That is the only thing I have argued in my post.
Because unsafe is potentially dangerous, I do not want to encourage people to think, "I can use unsafe without worrying about soundness, because the Wasm sandbox magically makes unsafe safe!" That is simply incorrect and will lead to undefined behavior.
Even with WASM, if your Rust code has UB the resulting program can have any possible behavior (that the sandbox permits). Seemingly unrelated changes anywhere in the program may have catastrophic effect on any other part of the program. It would even be legal for rustc to emit ill-formed wasm.
That is a huge difference to UB-free Rust code that doesn't behave as documented.
Part of the promise of safety is a separation argument: it's possible to perform local proofs/code reviews and rely on their correctness even when adding arbitrary other safe (or at least sound unsafe) code. You can leverage this in any program and environment by verifying (to whatever formal or informal degree) some restrictive interfaces, and then being able to trust that, even in the presence of logic bugs in other parts, these restrictions hold. This is not possible with arbitrary unsafe code. You thought all of your network communication through a particular socket was encrypted? Well, this other piece of code in a separate module with no reference to your secret keys actually wandered around in memory and zeroed them. It should be obvious that this example applies to wasm equally.
I'm not sure how much you know about wasm but I was assuming kornel meant that you could sandbox the Rust by having it in its own wasm module, which cannot read/write memory or execute code that is not explicitly exposed to it. Ill-formed wasm is just rejected by the validator and not executed. wasm's sandboxing between modules executing in a single process is quite strong.
Obviously writing sound unsafe code is still a requirement but wasm mitigates many of the security issues that could arise from programs containing UB.
Fair point, the concerns I raised apply within a single wasm module.
How does Rust code usually get compiled to wasm, does it produce one big module or does it somehow automatically "modularize" the Rust code? That seems really hard, which is why I assumed there would be no wasm sandboxing within the Rust code, and in that case my statement stands I think. Or am I missing something?
Okay that was an extreme case, but it could also be arbitrary but well-formed wasm code that has nothing to do with what the programmer wrote (transformed beyond recognition by optimizations that went wild because UB allowed them to).
THIS! I don't know about others, but I spend most of my time maintaining code, rather than writing new code. Given that I'm a researcher, which means that my code is experimental & unsupported, I do less maintenance than anyone in production, but I still need a language that supports maintenance well. Rust's strong typing makes it fairly easy for me to ensure that I use everything correctly (at the cost of type explosion), and ensures that when I make a change, I propagate it throughout.
Note that this does not diminish anything that @matklad said; I agree with him 100% that a good high-level document would be useful. I'm just stating what I think one of the high-level goals should be.
Today you're correct. My understanding is that in the long term this is one of the key advantages of the WASM Interface Types proposal (see the section on "share nothing linking"). Dynamically linking arbitrary Rust libraries is still a hard problem because of monomorphization, unstable ABIs, etc, but the direction of this work is toward a future in which at least some sorts of untrusted code can be easily sandboxed without process isolation.