Blog post: Contexts and capabilities in Rust

How does the API of HashMap::get indicate that the context must be the same on the two calls; or conversely if that is the default, how does some other API indicate that it doesn’t care whether the contexts are the same on two separate calls?

1 Like

Hm, not sure about this: the K: Hash bound is on insert, not on the HashMap itself, so nothing “escapes” the with clause on the type level. Between with clauses, it’s just HashMap<K, V> and you can even ask for its len.

3 Likes

You can't write Hash to take another parameter today, though, because it's not allowed by the trait. If with allows adding arbitrary additional parameters to anything, that's no longer the case.

(Though yes, there are ways using interior mutability to make a bad Hash too. But I consider that different since there are so many ways for interior mutability to cause weird behaviour.)

1 Like

Ah, you're right. That's interesting.

I wonder how often we could catch this sort of thing with a lint. Probably only in basic cases.

Another approach is to make it so you can write the where clause like

impl<K, V, S> HashMap<K, V, S>
where in('self) // <-- this is new
    K: Eq + Hash,
    S: BuildHasher,

which would tie the lifetimes together in the way I described. This is backwards compatible with the current form of where clauses that we see everywhere, in('static). We've wanted something that represents "the lifetime of Self" for other reasons before.

3 Likes

Static variables, thread locals, interaction with OS, current time, reading garbage from a register %)

I think what you actually want to spell out is that Hasher impls have to be pure: depend on nothing but arguments explicitly passed in. Could perhaps const provide a way to express that? Or some annotation? This seems a bigger pursuit than just disallowing with parameters.
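
To make the purity concern concrete, here is a minimal sketch of a Hash impl that reads ambient state. The names Key, SALT and hash_with_salt are made up for illustration, and the thread local merely stands in for any hidden input (a static, the clock, or a hypothetical with parameter). The same key hashes differently once the ambient value changes, so a map that stored the key under the first hash would usually look in the wrong place afterwards.

```rust
use std::cell::Cell;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

thread_local! {
    // Hypothetical ambient "context" that the Hash impl peeks at.
    static SALT: Cell<u64> = Cell::new(0);
}

#[derive(PartialEq, Eq)]
struct Key(u32);

impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Impure: the result depends on state that was not passed in.
        SALT.with(|s| s.get()).hash(state);
        self.0.hash(state);
    }
}

/// Hashes `key` with a fixed-key hasher so that only SALT can vary.
fn hash_of(key: &Key) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Sets the ambient salt, then hashes `key`.
fn hash_with_salt(key: &Key, salt: u64) -> u64 {
    SALT.with(|s| s.set(salt));
    hash_of(key)
}

fn main() {
    let key = Key(7);
    let first = hash_with_salt(&key, 1);
    let other = hash_with_salt(&key, 2);
    let again = hash_with_salt(&key, 1);
    println!("same salt agrees: {}", first == again);
    println!("different salt agrees: {}", first == other);
}
```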

Making effects explicit (this code can do IO) could be seen as bringing Rust closer to being able to express purity.

2 Likes

Woah, I almost wrote a similar blog post! Some thoughts, mostly coming from the difference of what I was going to write.

Please don't call it capability. This proposal is for a data sharing mechanism, whereas capabilities are unforgeable access-granting objects. This data sharing mechanism would be useful for sharing capabilities, but it's also useful for sharing other data, and capabilities could be shared with this or in other ways. People learn words through examples, and I really don't want to try to explain to people "no, Rust doesn't have capabilities by default, unless you use the cap-std crate, oh you're talking about the things literally called capability... no those are unrelated". In my not-yet-written blog post, I was going to call it shared. Some other possibilities: implicit, context.

I was picturing a different implementation: each context would be shared in a global (static) variable. It would be uninitialized memory to start, and set by the with expression. But I've been trying to translate your example to this style and failing; there's no place for the lifetime 'a of the arena to go. So I don't think this works, and the implementation needs to be passing a function argument down instead (which I think is what the post is proposing, though I'm not entirely sure?). It seems strange to me, though: there can only be one basic_arena at once, right? Shouldn't it be possible to store it in a single global location?

For making it more economical to use contexts all over the place, I imagined being able to annotate a module with a with clause, which would effectively put a with clause on every top-level function it defined. This would be useful in a big codebase where 90% of the code wants to share a s

If you write context alloc: Allocator;, where Allocator is a trait, is there going to be a v-table somewhere? If so, I would think the Rusty thing to do would be to be explicit, and simply state the type of the actual value being passed around: context alloc = Box<dyn Allocator> or whatever.

Anyhow, I never actually wrote that post because I hadn't figured out the details, or a concrete proposal that would work. So take this all with a grain of salt. Hopefully at least useful in a brainstorming sort of way.

9 Likes

...the word "effect" is another candidate

thread locals would work

When the class implementing an effect/context is statically known, all functions in the chain of invocation could be monomorphised so that the right methods are invoked w/o a v-table. In this case the effect/context can become an extra parameter at the implementation level, as an alternative to thread locals.

I'd love an answer to my earlier question on how this could work with dyn traits w/o thread locals..

1 Like

I think the concepts are related, but I see the distinction you're making. I was considering context as well and I think I'm leaning that way now.

I think you can model capabilities using this feature, but it needs more thought around the details. jam1's post covers how these can interact with non-Copy values, and sunfishcode's post explicitly talks about how you can use this for capabilities.

Not necessarily. You could have an outer arena for a larger computation and create a smaller one for a sub-computation.

That said, it's still possible to use thread locals in some cases. Overriding would mean "pushing onto the stack" and then popping off again later. Care has to be taken around the design of when a context is "captured" in a value (caller does not know about it) versus "required" (caller is providing the context): how should overriding behave in this case? I've been thinking overriding should only apply in the latter case, but I'm not sure.
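
The push/pop behaviour can be emulated today with a thread-local stack. This is only a sketch under assumptions: LOG_PREFIX, with_prefix and current_prefix are hypothetical names, and the context here has to be 'static data, which is exactly the limitation raised above for an arena with a lifetime 'a.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical context stack; the innermost override is the last element.
    static LOG_PREFIX: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Pops the stack when dropped, so an override also ends on unwind.
struct PopGuard;

impl Drop for PopGuard {
    fn drop(&mut self) {
        LOG_PREFIX.with(|stack| {
            stack.borrow_mut().pop();
        });
    }
}

/// Runs `body` with `prefix` as the current context, then restores the old one.
fn with_prefix<R>(prefix: &str, body: impl FnOnce() -> R) -> R {
    LOG_PREFIX.with(|stack| stack.borrow_mut().push(prefix.to_string()));
    let _guard = PopGuard;
    body()
}

/// Returns the innermost context, if any is set.
fn current_prefix() -> Option<String> {
    LOG_PREFIX.with(|stack| stack.borrow().last().cloned())
}

fn main() {
    with_prefix("outer", || {
        println!("{:?}", current_prefix());
        with_prefix("inner", || println!("{:?}", current_prefix()));
        println!("{:?}", current_prefix());
    });
    println!("{:?}", current_prefix());
}
```

Note that this sketch only models the "required" case: a value that wants to "capture" the context would have to copy it out of the stack at construction time.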

+1. I'd like to see module-level generics one day, too. (But I don't really want to derail the thread talking about this either :slight_smile:)

Not necessarily. Consider it a hidden generic argument on any code that declares a with context with unspecified type. Even if you capture the context in a dyn value, the vtable that's already created by dyn should point to functions that already statically know the types of your context.

You could absolutely write context alloc = Box<dyn Allocator> if you wanted to.
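
Written out by hand in today's Rust, the two options look roughly like this. Alloc and Bump are toy stand-ins invented for the example (not the real allocator API), and the context is passed as an ordinary argument because the with syntax does not exist.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for an allocator trait.
trait Alloc {
    /// Reserves `bytes` and returns the offset where the reservation starts.
    fn alloc(&self, bytes: usize) -> usize;
}

/// Toy bump allocator that only tracks how many bytes were handed out.
struct Bump {
    used: Cell<usize>,
}

impl Alloc for Bump {
    fn alloc(&self, bytes: usize) -> usize {
        let start = self.used.get();
        self.used.set(start + bytes);
        start
    }
}

// "Hidden generic argument": monomorphised per context type, no vtable involved.
fn two_slots_static<A: Alloc>(alloc: &A) -> (usize, usize) {
    (alloc.alloc(8), alloc.alloc(8))
}

// Explicitly type-erased context: every call goes through the vtable of `dyn Alloc`.
fn two_slots_dyn(alloc: &dyn Alloc) -> (usize, usize) {
    (alloc.alloc(8), alloc.alloc(8))
}

fn main() {
    let bump = Bump { used: Cell::new(0) };
    println!("{:?}", two_slots_static(&bump));

    let boxed: Box<dyn Alloc> = Box::new(bump);
    println!("{:?}", two_slots_dyn(&*boxed));
}
```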

Thanks to you and to everyone else who has taken the time to pick this feature apart and ask questions. I haven't had time to respond to every post myself, but the other commenters have done a good job fleshing things out.

2 Likes

I plan to do a blog post covering this at some point, but in short, there are a few approaches you could take:

  • Create a new type (T, Context) where T: Trait and box that instead. If you already have a Box, this means you need to create another.
  • The "extra-wide pointer" you mention could potentially work. That would probably be pointing to a lookup table somewhere so we don't need another pointer for each piece of context.
  • Use thread locals and access the aforementioned lookup table that way.
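
A rough sketch of the first approach, written by hand in current Rust. Everything here is an assumption made for illustration: Prefix plays the role of the context, DescribeWith is what a trait with a with clause might desugar to, and Describe is the context-free trait seen through the trait object.

```rust
/// Hypothetical context type.
struct Prefix(String);

/// Assumed desugaring of a trait with a `with` clause: the context is an explicit argument.
trait DescribeWith {
    fn describe_with(&self, prefix: &Prefix) -> String;
}

/// The context-free trait that users of the trait object see.
trait Describe {
    fn describe(&self) -> String;
}

struct Point {
    x: i32,
    y: i32,
}

impl DescribeWith for Point {
    fn describe_with(&self, prefix: &Prefix) -> String {
        format!("{}({}, {})", prefix.0, self.x, self.y)
    }
}

// The pair (T, Context) stores the captured context next to the value.
impl<T: DescribeWith> Describe for (T, Prefix) {
    fn describe(&self) -> String {
        self.0.describe_with(&self.1)
    }
}

/// Boxes the pair; this is where the extra allocation happens.
fn capture<T: DescribeWith + 'static>(value: T, prefix: Prefix) -> Box<dyn Describe> {
    Box::new((value, prefix))
}

fn main() {
    let obj = capture(Point { x: 1, y: 2 }, Prefix("pt".to_string()));
    println!("{}", obj.describe());
}
```
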
1 Like

Can somebody check my logic here pls?

If we're dealing with a trait object we don't have T, we have one of &T, &mut T or Box<T>. We can't always create a new (T, Context) box because T could for example be Pin<...>.

What we create though is one of (&T, ..), (&mut T, ..) or (Box<T>, ..).
(&T, ..) or (&mut T, ..) can be created on stack.
A new allocation is necessary to hold (Box<T>, ..).

To create (&T, ..) or (&mut T, ..) means to re-borrow T. This is fine.
To put Box<T> inside the new tuple is a move. It's fine too (?) because the source code already had a move for that Box.
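
Here is how I would sketch the two cases by hand, to check the reasoning. Volume, SpeakWith and Speak are invented names; SpeakWith takes the context explicitly and Speak is the trait object type after the context has been captured.

```rust
/// Hypothetical context type.
struct Volume(u8);

/// Trait whose methods need the context passed in.
trait SpeakWith {
    fn speak_with(&self, volume: &Volume) -> String;
}

/// Trait as seen once the context has been captured.
trait Speak {
    fn speak(&self) -> String;
}

struct Dog;

impl SpeakWith for Dog {
    fn speak_with(&self, volume: &Volume) -> String {
        format!("woof at volume {}", volume.0)
    }
}

// Borrowed case: a pair of references, no allocation needed.
impl<'a> Speak for (&'a dyn SpeakWith, &'a Volume) {
    fn speak(&self) -> String {
        self.0.speak_with(self.1)
    }
}

// Owned case: the pair owns the original Box and the context.
impl Speak for (Box<dyn SpeakWith>, Volume) {
    fn speak(&self) -> String {
        self.0.speak_with(&self.1)
    }
}

fn speak_borrowed(obj: &dyn SpeakWith, volume: &Volume) -> String {
    let pair = (obj, volume); // lives on the stack, only re-borrows
    let captured: &dyn Speak = &pair;
    captured.speak()
}

fn capture_owned(obj: Box<dyn SpeakWith>, volume: Volume) -> Box<dyn Speak> {
    Box::new((obj, volume)) // second heap allocation; the first Box is moved into it
}

fn main() {
    println!("{}", speak_borrowed(&Dog, &Volume(3)));
    println!("{}", capture_owned(Box::new(Dog), Volume(9)).speak());
}
```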

It would be bad for the new allocation necessary for a Box<T> to be invisible. So this "coercion" from Box<dyn T> with context:Context to Box<dyn T> can be an explicit call to an intrinsic fn. In fact there could be two of them: for Context : 'static and for borrowed Context - in the latter case the type of Box would change adding this lifetime.

Have I messed up or is it indeed fine?

P.S. can some unsafe code assuming the pointer is to actual data break as a result?..
P.P.S. there're also Rc<dyn T>/Arc<dyn T>, but they're okay to put into (... , ...) as well, right?
Are there any others?..

We can mitigate this by distinguishing free functions that have a with context from methods in impls: with-bound impls always capture the context (in an anonymous type), so that we ensure the same context is used for all method invocations (I guess most users will assume this, but it is something we don't otherwise guarantee); with-bound functions just capture contexts in an ad-hoc manner.

This means that contexts captured by impls are final by default; one could change this by declaring something like with default capab = ... .

Given

v : &dyn Trait with f : Foo

isn't it by design I can invoke methods on v supplying a different f on each call?

If it wasn't I'd expect to see a factory object instead

v : fn(f : Foo) -> Box<dyn Trait>

And if I can, should I not be able to cast v to &dyn Trait twice, supplying different f-s?

While it is possible, I believe that both should be supported: the hashing example demonstrated this.

For a context we capture on a per-method basis, we just shadow the impl's one with the local bound.

If a user is okay with factory object then let it be, there is no need in implicit context at all.

In theory, yes, but will it make sense? Like, context of an entire operation chain is likely meant to be singular, not a thing which will change from one method to another.

For example, async runtime, logger, etc.

Also, we have a verification problem there: while it's clear in which with-ctx a method of an impl was called, how do we know this in the dyn use case? What about Vec?

How to express that the method must be called only within the same context?

I feel the hashmap example has derailed the discussion by asking for the unnecessary and the impossible:

  1. Impossible

    // it is not possible to prevent HashMap problem
    // because this is exactly the code you want people to be able to write
    let map : &dyn Map with hasher : Hasher = ...;
    with hasher = HasherOne::new() { map.put( .. ) }
    with hasher = HasherTwo::new() { map.put( .. ) }
    {
        // imaginary syntax to bake in a with parameter
        let bakedIn : &dyn Map = map with hasher = HasherThree::new();
        bakedIn.put( ... )
    }
    with hasher = HasherFour::new() { map.put( .. ) }
    
  2. Unnecessary

    If we ever wanted to only use one hasher we wouldn't use with parameters. We would use a factory object instead. No new language features necessary.
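
A minimal sketch of what I mean by a factory object, assuming a toy Bucketer trait and a made-up seeded mix function (not a real hasher). The seed is chosen once in the factory and cannot change between calls on the returned object.

```rust
/// Hypothetical trait for something that assigns keys to buckets.
trait Bucketer {
    fn bucket(&self, key: &str) -> u64;
}

/// The seed is fixed at construction time and baked into the object.
struct Seeded {
    seed: u64,
}

impl Bucketer for Seeded {
    fn bucket(&self, key: &str) -> u64 {
        // Toy FNV-style mix that starts from the seed; for illustration only.
        key.bytes()
            .fold(self.seed, |h, b| (h ^ b as u64).wrapping_mul(0x100000001b3))
    }
}

/// The factory: callers pick the "context" once and get a plain trait object back.
fn make_bucketer(seed: u64) -> Box<dyn Bucketer> {
    Box::new(Seeded { seed })
}

fn main() {
    let one = make_bucketer(1);
    let two = make_bucketer(2);
    println!("stable within one object: {}", one.bucket("k") == one.bucket("k"));
    println!("differs across seeds: {}", one.bucket("k") != two.bucket("k"));
}
```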

Is factory object not sufficient then? Why do people want with parameters at all?

Imagine you have a single boxed impl Future. It can be run multiple times on different threads. Each time you want to pass in a different async context.
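
Future::poll already works this way with an explicit parameter, which may help as a mental model. In the sketch below (single-threaded for brevity; CountingWaker, YieldOnce and poll_with are names invented for the example) the same boxed future is polled twice, each time with a different task context.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that only counts how often it was woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Future that asks to be woken once, then completes on the next poll.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.yielded {
            Poll::Ready("done")
        } else {
            self.yielded = true;
            // Uses whichever context the caller supplied for this poll.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

type BoxedFuture = Pin<Box<dyn Future<Output = &'static str>>>;

/// Polls `fut` once, passing in a context built from `waker`.
fn poll_with(fut: &mut BoxedFuture, waker: &Arc<CountingWaker>) -> Poll<&'static str> {
    let waker = Waker::from(waker.clone());
    let mut cx = Context::from_waker(&waker);
    fut.as_mut().poll(&mut cx)
}

fn main() {
    let first = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let second = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let mut fut: BoxedFuture = Box::pin(YieldOnce { yielded: false });

    println!("{:?}", poll_with(&mut fut, &first));
    println!("{:?}", poll_with(&mut fut, &second));
    println!("first waker woken {} time(s)", first.0.load(Ordering::SeqCst));
}
```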

Imagine you have impl Datasource with logger : Logger talking to database in a web app. When you invoke it you pass in a special logger which prepends the name of the currently logged in user to whatever messages datasource logs on its own. Each time you pass in a new logger.
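
Spelled out with an explicit parameter, which is roughly what I would expect the with clause to desugar to. Logger, UserLogger and Datasource::query are hypothetical and return the log line as a String only to keep the sketch easy to check.

```rust
/// Hypothetical logger interface; returns the formatted line instead of printing it.
trait Logger {
    fn log(&self, msg: &str) -> String;
}

/// Logger that prepends the name of the currently logged-in user.
struct UserLogger {
    user: String,
}

impl Logger for UserLogger {
    fn log(&self, msg: &str) -> String {
        format!("[{}] {}", self.user, msg)
    }
}

struct Datasource;

impl Datasource {
    // Hand-desugared form of a method with `with logger: Logger`:
    // the caller can pass a different logger on every call.
    fn query(&self, sql: &str, logger: &dyn Logger) -> String {
        logger.log(&format!("running {}", sql))
    }
}

fn main() {
    let ds = Datasource;
    let alice = UserLogger { user: "alice".to_string() };
    let bob = UserLogger { user: "bob".to_string() };
    println!("{}", ds.query("SELECT 1", &alice));
    println!("{}", ds.query("SELECT 1", &bob));
}
```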

I believe these are the motivating examples for with parameters. If you just want something static that never changes you don't need this new language feature at all.

It was my impression that &dyn Trait with h : H and &dyn Trait would be different types. Some sort of coercion/conversion from one to the other will be necessary explicit or implicit. I was thinking that with-ctx active at the time of this coercion/conversion would be used.

As stated above, I view this as an anti-goal. We never want to prove any such thing nor can we.

3 Likes

Well, this is also a way.

With all above, with-bound impls may just serve two needs:

  • as a sugar for all methods taking same kind of context;
  • signifying a temporary implementation of a trait.

This means that we don't have guards against things like allocating in one allocator and deallocating in another.

Captured with clauses, as I imagine them, sit in between "allocator reference is passed to a constructor and stored, referent doesn't change" and "allocator reference is passed ad hoc, in every method, referent can change". I wanted this to be the best of both worlds.

If we don't care about coherence here, then I am wondering if allowing the following would be a good idea?

impl your_crate::Trait for their_crate::Type 
with
  my_context: crate::MyContext

This is effectively Idea: Named sets of impls

2 Likes

Suppose allocator wasn't passed as a with, suppose it was passed explicitly.
Are the typesystems in our languages strong enough to express the following requirement?

you shall only ever free() an object with
the same allocator you malloc()-ed it from?

It's not about the allocator type anymore, it's about the instance.
It would be nice to have typesystem check this for us.

But they can't, can they? In my mind, as a consequence, we won't be able
to express it for with parameters either.

To be honest I don't see why we should be jumping from

  • methods taking a hidden parameter
  • essentially from glorified thread-locals

to tearing down coherence.

Regardless of how they are actually implemented aren't the with parameters semantically thread-locals? With the additional nicety that you don't have to wrap them into an Option and that they can have lifetimes in them?

Yes, this can be done with lifetimes and borrowing.

struct WithAlloc<'a, T, A: Allocator + 'a> {
   data: *mut T,
   // The fn pointer makes 'a invariant, so it can be neither weakened nor strengthened;
   // A is listed only so that the type parameter counts as used.
   _pd: PhantomData<(fn(&'a ()) -> &'a (), A)>,
}

Here, the WithAlloc struct doesn't borrow an allocator, and yet an instance of the type may not be used with a reference to the wrong allocator.

If it is a type produced by desugaring, then the following example is correct (Edit: it is not - I didn't understand how ghost-cell works):

let alloc1 = ...;
let alloc2 = ...;

let box_in_1;
with alloc: Allocator = &alloc1 {
   box_in_1 = Box::new(...); // here, `box_in_1` gets a branded reference to `alloc1`
}

with alloc = &alloc1 {
   drop(box_in_1) //OK, since both references have the same lifetime.
}

with alloc = &alloc2 {
   drop(box_in_1) //Error, because lifetime of `&alloc2` is not the same as of `&alloc1`
}

So we can prove whether we had allocated with some specific allocator or not.
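
For reference, the ghost-cell style of branding does not rely on the lifetimes of ordinary references, which the compiler is free to unify. It uses a closure that must work for every possible lifetime, so each call gets a fresh brand. Below is a minimal sketch of that idea; Brand, Arena, Handle and with_arena are invented names, the "arena" only counts live handles, and I believe the commented-out line is rejected by the borrow checker, though I have not shown that here.

```rust
use std::marker::PhantomData;

/// Invariant lifetime used purely as a compile-time tag.
#[derive(Clone, Copy)]
struct Brand<'id>(PhantomData<fn(&'id ()) -> &'id ()>);

/// Toy arena that only counts how many handles are live.
struct Arena<'id> {
    live: usize,
    brand: Brand<'id>,
}

/// Handle that can only be freed by the arena carrying the same brand.
struct Handle<'id> {
    value: u32,
    _brand: Brand<'id>,
}

impl<'id> Arena<'id> {
    fn alloc(&mut self, value: u32) -> Handle<'id> {
        self.live += 1;
        Handle { value, _brand: self.brand }
    }

    fn free(&mut self, handle: Handle<'id>) -> u32 {
        self.live -= 1;
        handle.value
    }
}

/// The closure must accept any 'id, so every call site gets a fresh brand.
fn with_arena<R>(body: impl for<'id> FnOnce(Arena<'id>) -> R) -> R {
    body(Arena { live: 0, brand: Brand(PhantomData) })
}

fn main() {
    let value = with_arena(|mut outer| {
        let h = outer.alloc(5);
        with_arena(|mut inner| {
            let g = inner.alloc(6);
            // inner.free(h); // expected to be rejected: `h` carries the brand of `outer`
            inner.free(g)
        });
        outer.free(h)
    });
    println!("{}", value);
}
```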

1 Like

This is an interesting technique. Could you possibly check this? Somehow I can't make it error out.. But maybe I'm doing something wrong?..

I know this mechanism is intended to be more general than Go's context package, but it would be good to take that use case into account, in addition to the three listed in the original post. IIUC, it will be possible to write this kind of thing in a more concise way than Go allows, but it might be good to make it an explicit goal.