Matching on struct patterns with private (read-only?) fields


#1

The string-cache library has a macro (generated by build script with a phf set) like this:

#[macro_export]
macro_rules! atom {
("index") => { $crate::Atom { data: 0x2 } };
("partialdiff") => { $crate::Atom { data: 0x100000002 } };
("additive") => { $crate::Atom { data: 0x200000002 } };
// ...
("style") => { $crate::Atom { data: 0x4dc00000002 } };
}

What’s nice about this macro is that it can be used both as an expression and as a pattern. For example, in html5ever (simplified a bit):

            match name {
                atom!("select") => if html_elem_named(ancestor, atom!("template")) { return InSelect }
                atom!("td") | atom!("th") => if !last { return InCell; },
                atom!("tr") => return InRow,
                // ...

Each atom represents a string. Strings that are not in the static set (and more than 7 bytes) are represented as a pointer to a reference-counted String in a global hash map. This means that letting users set the data field to an arbitrary value is unsound. In some cases, safe methods of Atom interpret the bits of data as a raw pointer and dereference it.

The typical way to deal with this is to make the field private, and have constructors that either always set data to a valid value or are unsafe. The atom! macro could expand to e.g. unsafe { $crate::Atom::from_data_unchecked(0x4dc00000002) }. But that macro couldn’t be used as a pattern since neither function calls (even const fn) or unsafe blocks are supported in patterns.

However, the only reason to make data private would be to prevent arbitrary writes. Reading it is fine, it’s not secret. (Item privacy is no good for secrets anyway: if an attacker is allowed to write Rust code they can transmute anything and it’s game over.) Would it make sense to add read-only field to the language? Code outside the module could not set that field (like private fields) but they could read it, and more importantly match on it in a pattern.

Or maybe some other language change would help. Unsafe struct fields with unsafe { pat } blocks in patterns to match them? Having the expansion of macro_rules macros follow the privacy rules of the macro definition site rather than that of the expansion site? Supporting const fn calls in patterns?

(Additionally, we would like to use NonZero in Atom, but since its field is private we have the same problem with patterns.)

CC @nox


#2

I’d been thinking that it could make sense to have separate privacy for introduction and elimination forms (i.e. setting and getting), somewhat similarly to C#'s public get; private set; for computed properties. Which is more or less the same thing you wrote. In total there’s 4 combinations – (private + public) * (get + set) – but we already have two of those covered. So we only have to add the case where we only want to make it half-public, instead of fully. So the syntax could be something like:

struct Example<T> {
    private_get_private_set: T,
    pub(get) public_get_private_set: T,
    pub(set) private_get_public_set: T,
    pub public_get_public_set: T
}

(Not necessarily the best syntax, just a good guess. Might conflict with pub(restricted), but that’s not stable yet, so we could still give special treatment to get, set, or whatever words we want to use if they’re not already keywords.)

pub(get) fields would (outside the owning module) disallow mentioning the field in struct literals, functional record updates, LHS of assignments, or taking an &mut or ref mut reference to the field. (Overwiting the whole struct is okay. The guarantee that’s being preserved is that any value that’s in the field can only have come from the owning module. If client modules can’t construct their own structs with their own value in the field, which they can’t, then they can’t violate this property through whole-struct assignment either.) pub(set) fields would, outside the owning module, disallow mentioning the field except in struct literals, functional record updates, or LHS of assignments.

I’m not sure what’s a good use case for pub(set).

This could’ve worked analogously for enum variants as well if enum variants had had privacy.

I believe this is planned for macros 2.0. (cc @nrc)


#3

Any reason you can’t create a const item for each one, to use in patterns?
That should work even with NonZero, AFAICT.


#4

That doesn’t work, at least in rustc 1.10.0-nightly (0667ae93f 2016-05-17).

#![feature(const_fn)]

mod a {
    pub struct B(u64);
    impl B {
        pub const fn new(c: u64) -> Self {
            B(c)
        }
    }
}

const D: a::B = a::B::new(4);

fn e(f: a::B) -> &'static str {
    match f {
        D => "four",
        _ => "not four"
    }
}
error: constant evaluation error: unimplemented constant expression: tuple struct constructors [E0471]
 --> a.rs:7:13
7 |>             B(c)
  |>             ^^^^
note: in pattern here
  --> a.rs:16:9
16 |>         D => "four",
   |>         ^

error: unimplemented constant expression: tuple struct constructors
 --> a.rs:7:13
7 |>             B(c)
  |>             ^^^^

Edited: I got my earlier code example wrong.


#5

I certainly plan to make privacy hygiene a thing for macros, not sure when that will happen though


#6

I handles a const fn call but not creating a tuple struct?! Sounds like a bug to me.


#7

We can’t create aggregates in const eval. :wink:

While struct-expressions work, they don’t work in const fn until we get aggregates:

#![feature(const_fn)]

mod a {
    pub struct B {
        pub b: u64,
    }
    impl B {
        pub const fn new(c: u64) -> Self {
            B {
                b: c,
            }
        }
    }
}

const D: a::B = a::B::new(4);

fn main() {
    let x: [i8; D.b as usize] = unimplemented!();
}
error: array length constant evaluation error: non-constant path in constant expression [--explain E0250]
  --> <anon>:10:20
10 |>                 b: c,
   |>                    ^

See https://github.com/rust-lang/rust/issues/29928 for details


#8

The error message was confusing me, I suppose atm the only thing that works is ExprStruct, not an ADT-constructing ExprCall.