Fleshing out libstd "scenarios"

Forgive me for being cranky (perhaps its too early in the morning), but I think we need less strawman syntax and more mathematics. I see some syntax, but little clarified semantics since the last post.

Additionally, I'm thinking that #[cfg] is entirely orthogonal to #[scenario]. We can see here that the methods are tagged with both, but you could imagine #[scenario] implying #[cfg] perhaps eventually.

Uh, sounds like they are not orthogonal? What is the meaning of cfg without scenario, or scenario without cfg?

Here we see a crate that activates the unix and windows scenarios`

This gets confusing when we normally think of #![foo] as applying #[foo] to all items. I believe this currently holds for all items. I think we can restore this convention, but more on that in a bit.

I would also imagine that there's sort of a global namespace for scenarios.

Woah, this is very dangerous! To preserve modularity if any crate can define features I think we need to namespace by crate. Otherwise we end up with a situation where nobody but the standard library can confidently take a name from the global namespace.


Ok, here are my design sketch. First, an aside: I don't see how there won't be really awkward overlap with scenarios and features and + built-in CFG tokens. The former is are nice per-crate black box variables, and the latter is nicely connected to target definitions. Scenarios clearly will need some elements of both, so hopefully we can flesh out the division of labor once we figure out what scenarios mean.

Second, the problem. When we say code is portable, we mean one of two things. The first, and arguably original, type of portability might be deemed parametric. This means the code "simply doesn't do anything non-portable", or more formally, it doesn't use any cfg'd/scenario'd item or language construct, and is thus trivially guaranteed to work on all systems. The second would be non-parametric, where the code attempts to maintain compatibility by defining piecewise its implementation--i.e. a bunch of cfg'd versions of internal methods.

The second case usually isn't as good, because the cfg assumptions are rarely pred and not(pred), but rather a bunch of different predicates with the hope that they are disjoint and at least one is true. This hope invariably is not true forever as new systems are introduced which play with basic assumptions in various ways. At the same time, the second case is inevitable when writing software. The problem is we want to verify the portability of code without brute force compiling on different platforms, because that is intractable.

Without trying to sound academic, I firmly believe the essence of the scenarios problem is satisfiability modulo theories. I mean less the associations of off-the-shelf solvers, than the core mathematical concept. Plain boolean satisfiability is "does this formula hold true under any interpretation", SMT is "does this formula hold true under any interpretation consistent with the theory"--i.e. "does this formula hold true under any of these interpretations I actually care about". What is a formula? For our purposes it is an boolean expression (no quantifiers)....like a cfg formula. What are the variables? Primitive CFG tokens, attached by the compiler to targets, library-defined features, and perhaps something else if we need it (I think we won't). What are the theories/sets-of-interpretations? Basic axioms like and(windows, unix) = false [though @eternaleye will shut down that example beautifully]. What does SMT become when you negate your formula? "does this formula hold true under all of these interpretations I actually care about". What do you get putting it all together? "does this cfg hold true under all of the possible truthinesses of cfg-primitives/features I care about". Boom! That's portability.

Wait!, you ask. My code has many cfgs, not just one! How can that be portability?! Here's how. Recall how name resolution works: ignoring things like globs and shadowing (cause what twisted soul would use those in conjunction with #[cfg(..)]), we need all identifiers in an item to resolve to exactly one item. This means we need cfg of each item to imply that exactly one of the cfgs of everything it could possbility resolve to is true:

cfg-of-item => one-of(cfg-of-helper-v1, cfg-of-helper-v2, ...) 

Now => is straight up propositional logic, and one-of can be elaborated as propositional logic, so via this means we can derive a portability formula for each item in a crate. The standard interpretation of #![foo] actually works in that we are simply asking all items in the module to be portability up to this cfg-theory (but yes, we may want to remove the module instead of making it empty, I don't know).

Ok, I got to go to work so no time to proof-read or talk about anything else. Later I will dive into what theories Cargo's current resolution algorithm implies, and what to think about target specs / builtin configs (@nagisa's RFC!).

2 Likes