Semantics of AsRef

Dear all,

I recently had some discussions on the semantics of AsRef and I would like to raise some questions on that matter. As shown later in this post, implementations in std show some behavior that could be considered inconsistent or unwieldy. Of course, we cannot change std, but I would like to explore all aspects of this issue anyway for the sake of

  • a potential future "Rust 2.0",
  • future existence of mechanisms to change std without breaking backwards compatibility,
  • improving documentation given the current situation,
  • considering entanglement with future features like specialization.

Some questions are:

  • When/how should a type implement AsRef?
  • When/how should an API be generic over AsRef?
  • When should .as_ref() be used and when &, .borrow(), or dereference (e.g. &*)?

I do not want to challenge the importance of backwards compatibility here, but I would like to explore “what if” scenarios for the sake of better understanding semantics and limitations of AsRef as it is defined now and as it could be defined.

That said, I would like to start off with some example code:

use std::borrow::Cow;
use std::ffi::OsStr;
use std::path::Path;

fn foo(_: impl AsRef<Path>) {}
fn bar(_: impl AsRef<i32>) {}

fn main() {
    foo("Hi!");
    foo("How are you?".to_string());
    foo(&&&&&&("I'm fine.".to_string()));
    foo(&&&&&&&&&&&&&&&&&&&"Still doing well here.");
    //bar(5); // ouch!
    //bar(&5); // hmmm!?
    bar(Cow::Borrowed(&5)); // phew!
    foo(Cow::Borrowed(OsStr::new("Okay, let me help!")));
    foo(&&&&&&&&Cow::Borrowed(OsStr::new("This rocks!!")));
    //foo(&&&&&&&&Cow::Borrowed("BANG!!"));
}

The commented-out lines fail to compile. (Playground)

We can see that AsRef<Path> is implemented for:

  • str
  • String
  • &String, &&String, &&&String, and so on
  • &&&&&&&&Cow<OsStr> (I'll get back to that later)

But we see that AsRef<Path> is not implemented for:

  • Cow<str> (or &Cow<str>, etc.)

Moreover, AsRef<i32> is not implemented for:

  • i32
  • &i32 (or &&i32, etc.)

But AsRef<i32> is implemented for:

  • Cow<i32> (e.g. Cow::Borrowed(&5))

This feels somewhat inconsistent or at least incomplete.

But vague feelings aside, we should try to carve out what's wrong here, or whether something is wrong here at all.

Let's try to look at conversions in std and the language itself. We have:

Currently, the documentation in std::convert states the following:

  • Implement the AsRef trait for cheap reference-to-reference conversions
  • Implement the AsMut trait for cheap mutable-to-mutable conversions
  • Implement the From trait for consuming value-to-value conversions
  • Implement the Into trait for consuming value-to-value conversions to types outside the current crate
  • […]

Generic Implementations

  • AsRef and AsMut auto-dereference if the inner type is a reference
  • […]
  • From and Into are reflexive, which means that all types can into themselves and from themselves

Moreover, documentation of AsRef says:

Used to do a cheap reference-to-reference conversion. […] If you need to do a costly conversion it is better to implement From with type &T or write a custom function.

AsRef has the same signature as Borrow, but Borrow is different in a few aspects:

Unlike AsRef, Borrow has a blanket impl for any T, and can be used to accept either a reference or a value. Borrow also requires that Hash, Eq and Ord for a borrowed value are equivalent to those of the owned value. For this reason, if you want to borrow only a single field of a struct you can implement AsRef, but not Borrow.

Note: This trait must not fail. […]

Generic Implementations

AsRef auto-dereferences if the inner type is a reference or a mutable reference (e.g.: foo.as_ref() will work the same if foo has type &mut Foo or &&mut Foo)

The key points regarding AsRef here:

  • reference-to-reference conversion
  • conversion should be cheap
  • "auto-dereferences if the inner type is a reference or mutable reference" (or in other words "As lifts over &", see below)
  • converted type is not required to act equivalently regarding Hash, Eq, and Ord (for that, Borrow::borrow can be used)
  • unlike From and Into, AsRef and AsMut are not reflexive, which is why let _: &i32 = 0i32.as_ref() fails
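
The key points above can be illustrated with a small sketch. The `User` type and the `shout` function are hypothetical names introduced only for this illustration; they do not appear in std.

```rust
// Hypothetical type for illustration: only its `name` field is exposed via `AsRef<str>`.
struct User {
    id: u32,
    name: String,
}

// Cheap reference-to-reference conversion: a plain field borrow, no allocation.
// `Borrow<str>` would not be appropriate here, because a `User` and its `name`
// are not expected to agree on `Hash`, `Eq`, and `Ord`.
impl AsRef<str> for User {
    fn as_ref(&self) -> &str {
        &self.name
    }
}

// Hypothetical generic consumer.
fn shout<S: AsRef<str>>(s: S) -> String {
    s.as_ref().to_uppercase()
}

fn main() {
    let user = User { id: 1, name: "alice".to_string() };
    // "As lifts over &": `&User` and `&&User` implement `AsRef<str>` as well.
    println!("{}", shout(&user));
    println!("{}", shout(&&user));
    let id = user.id;
    println!("user {} is {}", id, shout(user));

    // No reflexivity: there is no `AsRef<i32>` for `i32`, so a plain borrow is used.
    let i: i32 = 0;
    // let r: &i32 = i.as_ref(); // does not compile
    let r: &i32 = &i;
    println!("{}", r);
}
```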

The lack of reflexivity isn't stated explicitly but follows from this implementation due to restrictions regarding overlapping implementations:

// As lifts over &
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_const_unstable(feature = "const_convert", issue = "88674")]
impl<T: ?Sized, U: ?Sized> const AsRef<U> for &T
where
    T: ~const AsRef<U>,
{
    #[inline]
    fn as_ref(&self) -> &U {
        <T as AsRef<U>>::as_ref(*self)
    }
}

(See also Playground)

This implementation ensures that if T implements AsRef<U>, &T will also implement AsRef<U>. Why is that needed? One advantage is that it allows us to write a function like this:

use std::path::Path;

// unsightly syntax:
fn takes_path_unsightly<P: ?Sized + AsRef<Path>>(_path: &P) {}

// easier syntax:
fn takes_path<P: AsRef<Path>>(_path: P) {}

fn main() {
    let s: &str = "ABC";
    let string: String = s.to_owned();
    takes_path_unsightly(s);
    takes_path_unsightly(&string); // we need to borrow here, but that's not unusual
    takes_path(s);
    takes_path(string);
}

(Playground)

See also PR #23316 and this post on URLO by @quinedot.

However, not being automatically reflexive may come with disadvantages. Given a set of types (or even just two types) between which cheap reference-to-reference conversion is desired, we end up with mostly redundant implementations like:

#[stable(feature = "rust1", since = "1.0.0")]
impl AsRef<str> for str {
    #[inline]
    fn as_ref(&self) -> &str {
        self
    }
}

(source)

Trivial implementations like that need to be written for other custom types as well.
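
For a user-defined type, this could look roughly as follows. This is only a sketch: `Email` and `domain` are hypothetical names made up for the illustration.

```rust
// Hypothetical newtype used only for this illustration.
struct Email(String);

// Trivial (reflexive) implementation; std does not provide it automatically.
impl AsRef<Email> for Email {
    fn as_ref(&self) -> &Email {
        self
    }
}

// Cheap reference-to-reference conversion to the underlying string slice.
impl AsRef<str> for Email {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

// Accepts `Email`, `&Email`, `&&Email`, ... thanks to the trivial impl
// combined with std's impl for references.
fn domain<E: AsRef<Email>>(email: E) -> String {
    let email: &Email = email.as_ref();
    // `rsplit` always yields at least one item, so the fallback is never used.
    email.0.rsplit('@').next().unwrap_or("").to_string()
}

fn main() {
    let email = Email("jane@example.org".to_string());
    println!("{}", domain(&email));
    // With two `AsRef` impls on `Email`, the target type has to be spelled out.
    let text: &str = email.as_ref();
    println!("{}", text);
    println!("{}", domain(email));
}
```

Without the trivial `AsRef<Email> for Email` implementation, the `domain` function could not be called with an `Email` or a `&Email` at all.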

While references implement AsRef according to the implementation of the pointed-to type (as said above, source), many generic smart pointers do not, such as Cow (source), Rc (source), Arc (source), or even Box (source). Regarding Box, see also Playground. This seems to result in more manual/specific implementations such as:

#[stable(feature = "cow_os_str_as_ref_path", since = "1.8.0")]
impl AsRef<Path> for Cow<'_, OsStr> {
    #[inline]
    fn as_ref(&self) -> &Path {
        Path::new(self)
    }
}

(source)

But the list of such implementations is potentially endless. Note that std does not implement AsRef<Path> for Cow<'_, str>, for example:

use std::borrow::Cow;
use std::ffi::OsStr;
use std::path::Path;

fn takes_path<P: AsRef<Path>>(_: P) {}

fn main() {
    takes_path(&OsStr::new("ABC"));
    takes_path(Cow::Borrowed(OsStr::new("ABC")));
    takes_path("ABC");
    // fails:
    takes_path(Cow::Borrowed("ABC"));
}

(Playground)

This was also one of the things demonstrated in the very first example of this post.

I would like to get back to some questions from the beginning of this post:

  • When/how should a type implement AsRef?
  • […]
  • When should .as_ref() be used and when &, .borrow(), or dereference (e.g. &*)?

I would like to make a few hypotheses in that matter:

  1. Any type T: ?Sized may always implement AsRef<T> with a trivial implementation (but it's not provided automatically).
  2. A type T: ?Sized should implement AsRef<T> with a trivial implementation if it's part of a set of types that can be converted into each other using cheap reference-to-reference conversion (through AsRef).
  3. A generic smart pointer T should ideally not implement AsRef<<T as Deref>::Target> but instead implement AsRef<U> where <T as Deref>::Target: AsRef<U>. Note that this differs from what std currently does.
  4. Going from a reference or generic smart pointer to the pointee should not (and sometimes cannot) be done through AsRef::as_ref, but through &* or .borrow(), where the latter guarantees equivalent Hash/Eq/Ord implementations but might sometimes require type annotations to aid type inference.

I believe that this would (hypothetically speaking!) fix several inconsistencies that currently exist, but I lack the overview to really be sure and thus would like to start a discussion on that matter.
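
Hypotheses 3 and 4 can be sketched with a user-defined smart pointer. `MyBox` and `path_len` are hypothetical names used only for this illustration; std's `Box` deliberately behaves differently, as described above.

```rust
use std::ops::Deref;
use std::path::Path;

// Hypothetical generic smart pointer, used only for this sketch.
struct MyBox<T: ?Sized>(Box<T>);

impl<T: ?Sized> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

// Hypothesis 3: forward `AsRef<U>` from the pointee instead of
// implementing `AsRef<T>` for `MyBox<T>`.
impl<T: ?Sized, U: ?Sized> AsRef<U> for MyBox<T>
where
    T: AsRef<U>,
{
    fn as_ref(&self) -> &U {
        <T as AsRef<U>>::as_ref(&**self)
    }
}

// Hypothetical consumer that is generic over `AsRef<Path>`.
fn path_len<P: AsRef<Path>>(path: P) -> usize {
    path.as_ref().as_os_str().len()
}

fn main() {
    let boxed = MyBox(Box::new(String::from("abc")));
    // `MyBox<String>: AsRef<Path>` because `String: AsRef<Path>`.
    println!("{}", path_len(&boxed));
    // Hypothesis 4: reach the pointee through `&*`, not through `.as_ref()`.
    let pointee: &String = &*boxed;
    println!("{}", pointee);
    // Unsized pointees work as well.
    let boxed_str: MyBox<str> = MyBox("abcd".into());
    println!("{}", path_len(boxed_str));
}
```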

The issue seems to come up repeatedly, and I also stumbled upon it. See also the following issues and PRs on GitHub:

Of course, we cannot simply change implementations in std and even adding some concrete implementations (such as suggested in #73390) can break code. So given the current situation, perhaps different recommendations may be better than those stated above (as hypotheses). I would like to discuss both the ideal scenario as well as the "Rust 1.x" case. In either case, there should be more clarity on what AsRef does and how it's used, and ideally the documentation could be extended in that matter.


I’d go even further and say that non-generic smart pointer types should arguably do the same (and that isn’t what std does either)! So:

impl<U> AsRef<U> for String where str: AsRef<U>

and

impl<T, U> AsRef<U> for Vec<T> where [T]: AsRef<U>

would probably have been nicer, and spared a bunch of manual impls on String. And I don’t see how AsRef<Vec<T>> for Vec<T> is all that useful, even though it currently exists. (On the other hand, and slightly related: maybe AsMut<Vec<T>> for Vec<T> is useful though, because other types could also contain a Vec<T> and you want to mutate it by pushing elements or the like.)

Of course, that’s not what the implementations are, and it seems impossible to change, so the discussion is fairly theoretical and can only serve as a recommendation for new external crates.
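
In an external crate, the first of these two impls could be sketched as follows. `MyString` is a hypothetical stand-in for `String` (whose impls in std cannot be changed), and `byte_len` is an illustrative helper.

```rust
use std::ffi::OsStr;
use std::path::Path;

// Hypothetical owned string type standing in for `String`.
struct MyString(String);

// One generic impl instead of one manual impl per target type:
// every `AsRef<U>` that `str` provides is forwarded.
impl<U: ?Sized> AsRef<U> for MyString
where
    str: AsRef<U>,
{
    fn as_ref(&self) -> &U {
        <str as AsRef<U>>::as_ref(self.0.as_str())
    }
}

// Illustrative consumer.
fn byte_len<B: AsRef<[u8]>>(bytes: B) -> usize {
    bytes.as_ref().len()
}

fn main() {
    let s = MyString("abc".to_string());
    // All of these come from the single impl above.
    let as_str: &str = s.as_ref();
    let as_path: &Path = s.as_ref();
    let as_os_str: &OsStr = s.as_ref();
    println!("{} {} {:?}", as_str, as_path.display(), as_os_str);
    println!("{}", byte_len(&s));
}
```

Note that this sketch provides no `AsRef<MyString> for MyString`, in line with the doubt expressed above about `AsRef<Vec<T>> for Vec<T>`.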


I assume that is what the FIXME in core/src/convert/mod.rs would cover (in a generic fashion):


Regarding recommendations for external crates, I'm very unsure. std seems to be pretty consistent in providing

  • an impl AsRef<<D as Deref>::Target> for many pointer-like types D (like Rc, Arc, Box, as well as String and Vec, which you brought up)

instead of

  • an impl<U: ?Sized> AsRef<U> where <D as Deref>::Target: AsRef<U> for such types D.

While the latter might be ideal, the first is what std does. So what to do in external crates?

  • Use std's practice regarding smart-pointers?
  • Be consistent with std's implementation of AsRef for references?

Maybe this question can only be answered individually for each use case? Or perhaps there is no satisfactory answer at all.

I had to make such a decision when implementing AsRef for deref_owned::Owned. I finally decided in favor of the latter, because there are cases where only references and the Owned wrapper are used, thus leading to some cases where everything is consistent (which fails as soon as Cow is involved). This doesn't apply to other use cases though.

I think that in either case, AsRef's documentation should be extended with a note that there is no blanket impl<T: ?Sized> AsRef<T> for T (at least I didn't find an explicit note, only the note on From and Into being reflexive) and perhaps also include an explanation that it can (or when it should) be provided manually.


Even if the existing implementations cannot be changed, there is still the question of when to use these existing implementations: Is it semantically correct (or is it idiomatic, or should it be idiomatic) to use .as_ref() to go from &Rc<T> to &T, for example? Or should &* or .borrow() be used for that? (Hypothesis 4 in my OP)

I created a Playground example to demonstrate what that could look like (without specialization or any Nightly features, just using stable Rust):

// Same as `Deref` but duplicated to allow demonstration here
trait MyDeref {
    /* … */
}

/* … */

// Same as `AsRef` but with different blanket implementation
trait MyAsRef<T: ?Sized> {
    fn my_as_ref(&self) -> &T;
}
impl<T, U> MyAsRef<U> for T
where
    T: ?Sized + MyDeref,
    U: ?Sized,
    <T as MyDeref>::Target: MyAsRef<U>,
{
    fn my_as_ref(&self) -> &U {
        self.my_deref().my_as_ref()
    }
}

// Manual (trivial) implementation needed
// (would conflict with `MyDeref` if also implemented for `str`)
impl MyAsRef<str> for str {
    fn my_as_ref(&self) -> &str {
        self
    }
}

// Manual (trivial) implementation needed
// (would conflict with `MyDeref` if also implemented for `Path`)
impl MyAsRef<Path> for Path {
    fn my_as_ref(&self) -> &Path {
        self
    }
}

// Cheap conversion from `&str` into `&Path`
impl MyAsRef<Path> for str {
    fn my_as_ref(&self) -> &Path {
        Path::new(self)
    }
}

fn takes_path(_: impl MyAsRef<Path>) {}

fn main() {
    takes_path("Hello".to_string());
    takes_path(&"Hello".to_string() as &String);
    takes_path(&&&&&"Hello".to_string() as &&&&&String);
    takes_path("Hello");
    takes_path(&&"Hello");
    takes_path(Path::new("Hello"));
    takes_path(&&&Path::new("Hello"));
    takes_path(Cow::Borrowed(&"Hello"));
    takes_path(&&&&Cow::Borrowed(&&&&&&&&&"Hello"));
    takes_path(&&&&Cow::Borrowed(&&&&&&&&&Path::new("Hello")));
    takes_path(Box::new("Hello".to_string()));
    takes_path("Hello".to_string().into_boxed_str());
    takes_path(Box::new(Cow::Borrowed(&&&"Hello")));
}

(Playground)

Note that the main() function compiles properly here, unlike the failing examples in my OP.

(edit: moved implementation of MyDeref for String up and removed wrong comment)

That's

Unlike AsRef, Borrow has a blanket impl for any T, and can be used to accept either a reference or a value.

Removing the reference to Borrow, we get that

AsRef does not have a blanket impl for any T

thus T: AsRef<T> does not hold in general.

If you have x: &Rc<T>, the best way to get &T is just x, and letting deref coercion happen. If deref coercion does not kick in (you're using a (likely over-) generic API), then apply it manually as &**x.
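
A minimal sketch of both cases, with the hypothetical helper functions `double` (non-generic, so deref coercion applies) and `duplicate` (generic, so it does not):

```rust
use std::rc::Rc;

// Non-generic parameter: deref coercion turns `&Rc<i32>` into `&i32`.
fn double(n: &i32) -> i32 {
    *n * 2
}

// Generic parameter: `T` is inferred from the argument, so no coercion happens.
fn duplicate<T: Copy>(value: &T) -> (T, T) {
    (*value, *value)
}

fn main() {
    let rc = Rc::new(21);
    let x: &Rc<i32> = &rc;

    // Just pass `x` and let deref coercion happen.
    println!("{}", double(x));

    // `duplicate(x)` would infer `T = Rc<i32>`, which is not `Copy`,
    // so the dereference is applied manually.
    let (a, b) = duplicate(&**x);
    println!("{} {}", a, b);
}
```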

Coherence-impacting negative bounds are not implemented, but have a strong potential to be implemented in the future. Interestingly, AIUI that would allow the trivial implementations of MyAsRef above to be provided as two blanket impls:

impl<T: ?Sized> MyAsRef<T> for T
where
    T: !MyDeref,
;

impl<T: ?Sized> MyAsRef<T> for T
where
    T: MyDeref,
    <T as MyDeref>::Target: !MyAsRef<T>,
;

// or perhaps they can be combined into this convoluted thing:
impl<T: ?Sized> MyAsRef<T> for T
where
    T: for<U: MyAsRef<T>> !MyDeref<Target=U>,
;

Then instead of impl MyAsRef<str> for str you'd write impl !MyDeref for str to promise that str is not and will never be MyDeref. (Perhaps "is semantically not." We probably need some restricted guideline for when a negative impl guarantee should be provided before allowing them.) This has an interesting overlap with #[fundamental] types as well, where #[fundamental] is very roughly speaking a promise that the type will not gain any trait implementations for existing traits. (This has impacts on when an impl is considered local.) Similarly, you could imagine marking a trait as #[fundamental], saying that either a type implements it from semver day 0, or it's known to not implement the trait and coherence can rely on the lack of that impl (what impl !Trait gives us explicitly).


I was missing that part, even though I cited it in my OP. So the documentation does state that AsRef isn't reflexive (not in the overview in std::convert though, and it does so in the context of Borrow only).

Maybe it could be said more explicitly? Something like:

(example of how it could be phrased)

Trait std::convert::AsRef

Reflexivity

Ideally, AsRef would be reflexive, that is there is an impl<T: ?Sized> AsRef<T> for T, with as_ref simply returning &self. Such a blanket implementation is currently not provided due to technical restrictions of Rust's type system (it would be overlapping with another existing blanket implementation for &T where T: AsRef<U> which allows AsRef to auto-dereference, see "Generic Implementations" above).

A trivial implementation of AsRef<T> for T must be added explicitly for a particular type T where needed or desired. Note, however, that not all types from std contain such an implementation, and those cannot be added by crates due to orphan rules.

Therefore, writing .as_ref() cannot be used generally to go from a type to its reference, e.g. the following code does not compile:

let i: i32 = 0;
let r: &i32 = i.as_ref(); // let r: &i32 = &i; must be used instead

Maybe that's a bit easier to understand for newcomers. Particularly, it would explain that AsRef isn't lacking reflexivity for semantic but only for technical reasons (P.S.: and for backwards-compatibility, as I try to show further below).

Explaining this might help to reduce confusion on this subject for non-newcomers as well.

There are several methods in std which take a P: AsRef<Path> (which might or might not be considered to be over-generic). I would say it makes it easy to invoke them, but I'm not sure whether that has been a good design decision.

Getting back to some of the breaking examples in my OP, it should be said that circumventing these problems in practice isn't so difficult. We can do:

use std::borrow::Cow;
use std::ffi::OsStr;
use std::fs::File;

fn main() {
    let _: Result<File, _> = File::open("nonexistent");
    let _: Result<File, _> = File::open(&Cow::Borrowed(OsStr::new("nonexistent")));
    // This fails to compile:
    // let _: Result<File, _> = File::open(&Cow::Borrowed("nonexistent"));
    // If we have a reference to a `Cow`, we must use `&**`:
    let _: Result<File, _> = File::open(&**(&Cow::Borrowed("nonexistent")));
}

(Playground)

Just like you said, using &** (or &* if the Cow is owned and may be consumed). Not nice, but doable.

But these practical issues aside, I wonder if the use of .as_ref() in the following is (or rather: should be) considered idiomatic:

use std::borrow::Cow;
use std::fs::File;

fn main() {
    let cow: Cow<_> = Cow::Borrowed("nonexistent");
    let ref_to_cow: &Cow<_> = &cow;
    let _: Result<File, _> = File::open(&**ref_to_cow);
    let _: Result<File, _> = File::open(ref_to_cow.as_ref());
}

(Playground)

This works because of:

#[stable(feature = "rust1", since = "1.0.0")]
impl<T: ?Sized + ToOwned> AsRef<T> for Cow<'_, T> {
    fn as_ref(&self) -> &T {
        self
    }
}

(source)

I would like to state the hypothesis that:

  • while this implementation exists (and cannot be removed without breaking code), it should not have been added in the first place,
  • using ref_to_cow.as_ref() in the above Playground should be avoided in idiomatic code (hypothesis 4).

Moreover, I believe that existing code which uses .as_ref() in that way makes it impossible to solve the outlined problems by solely adding two blanket implementations (AsRef<T> for T and AsRef<U> for T where T::Target: AsRef<U>, e.g. with negative bounds) in the future.

This can be demonstrated by using my_as_ref as defined in my above post:

fn takes_path(_: impl MyAsRef<Path>) {}

fn main() {
    let cow: Cow<_> = Cow::Borrowed("nonexistent");
    let ref_to_cow: &Cow<_> = &cow;
    let _ = takes_path(&**ref_to_cow);
    let _ = takes_path(ref_to_cow.my_as_ref());
}

(Note that the first (reflexive) blanket implementation cannot be provided for technical reasons in this demonstration; only a concrete implementation for str is provided.)

(Playground)

Here type inference will fail due to multiple implementations of MyAsRef<_> for str:

   Compiling playground v0.0.1 (/playground)
error[E0282]: type annotations needed
  --> src/main.rs:78:35
   |
78 |     let _ = takes_path(ref_to_cow.my_as_ref());
   |                        -----------^^^^^^^^^--
   |                        |          |
   |                        |          cannot infer type for type parameter `T` declared on the trait `MyAsRef`
   |                        this method call resolves to `&T`

error[E0283]: type annotations needed
  --> src/main.rs:78:35
   |
78 |     let _ = takes_path(ref_to_cow.my_as_ref());
   |                        -----------^^^^^^^^^--
   |                        |          |
   |                        |          cannot infer type for type parameter `U`
   |                        this method call resolves to `&T`
   |
note: multiple `impl`s satisfying `str: MyAsRef<_>` found
  --> src/main.rs:51:1
   |
51 | impl MyAsRef<str> for str {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^
...
66 | impl MyAsRef<Path> for str {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^
note: required because of the requirements on the impl of `MyAsRef<_>` for `Cow<'_, str>`
  --> src/main.rs:38:12
   |
38 | impl<T, U> MyAsRef<U> for T
   |            ^^^^^^^^^^     ^
help: use the fully qualified path for the potential candidates
   |
78 |     let _ = takes_path(<str as MyAsRef<Path>>::my_as_ref(&ref_to_cow));
   |                        +++++++++++++++++++++++++++++++++++          ~
78 |     let _ = takes_path(<str as MyAsRef<str>>::my_as_ref(&ref_to_cow));
   |                        ++++++++++++++++++++++++++++++++++          ~

Some errors have detailed explanations: E0282, E0283.
For more information about an error, try `rustc --explain E0282`.
error: could not compile `playground` due to 2 previous errors

Ironically just writing takes_path(ref_to_cow) would work! (Playground)

While these considerations are all very hypothetical and ignore that std doesn't do all these things, it might still be worth noting, because

  • being able to provide two blanket implementations for AsRef in the future would not automatically solve all the outlined problems but providing such a second blanket implementation for AsRef might instead induce more problems with existing code,
  • third party crates may go a different way than std.

Regarding the first point, I believe that the problem really cannot be solved without starting over with an entirely new AsRef (likely not possible at this point, but I still hope that somehow the Edition mechanism could be used in future to overcome bad design decisions of the past).

The latter point brings me back to the question:

Moreover, should .as_ref() be avoided in teaching material in cases where it's used to dereference (edit: generic(?)) smart-pointers?

If I understand it right, then a blanket implementation like in this other FIXME (note that the DerefMut bound is missing there) would (in stable Rust) be incompatible with an implementation of AsMut<P> for P for some type P that is also DerefMut (which includes AsMut<Vec<T>> for Vec<T>).

Thus, while the first FIXME (for AsRef and Deref) could (theoretically, i.e. disregarding backward-compatibility) be implemented in stable Rust to create a consistent AsRef implementation, doing that for AsMut and DerefMut would not work unless useful implementations like AsMut<Vec<T>> for Vec<T> would be removed.


Of course, it would still be possible to manually add such "auto-dereferencing" implementations like in the FIXME for generic smart pointers such as Rc, Arc, etc. while providing a trivial implementation that just returns &mut self for non-generic [1] smart pointers such as String, Vec<T>, etc.


I created a Playground to illustrate that:

/* … */

// This blanket implementation works fine
impl<T, U> MyAsRef<U> for T
where
    T: ?Sized + MyDeref,
    U: ?Sized,
    <T as MyDeref>::Target: MyAsRef<U>,
{
    fn my_as_ref(&self) -> &U {
        self.my_deref().my_as_ref()
    }
}

// This blanket implementation conflicts with the following 6 implementations
/*
impl<T, U> MyAsMut<U> for T
where
    T: ?Sized + MyDerefMut,
    U: ?Sized,
    <T as MyDeref>::Target: MyAsMut<U>,
{
    fn my_as_mut(&mut self) -> &mut U {
        self.my_deref_mut().my_as_mut()
    }
}
*/

// Would be covered by blanket implementation
impl<'a, T, U> MyAsMut<U> for &'a mut T
where
    T: ?Sized + MyAsMut<U>,
    U: ?Sized,
{
    fn my_as_mut(&mut self) -> &mut U {
        self.my_deref_mut().my_as_mut()
    }
}

// Would be covered by blanket implementation
impl<T, U> MyAsMut<U> for Box<T>
where
    T: ?Sized + MyAsMut<U>,
    U: ?Sized,
{
    fn my_as_mut(&mut self) -> &mut U {
        self.my_deref_mut().my_as_mut()
    }
}

// Would NOT be covered by blanket implementation
// because `<String as Deref>::Target` is `str` and not `String`
impl MyAsMut<String> for String {
    fn my_as_mut(&mut self) -> &mut String {
        self
    }
}

// Would be covered by blanket implementation
// because `<String as Deref>::Target` is `str`
// and `str` implements `AsMut<str>`
impl MyAsMut<str> for String {
    fn my_as_mut(&mut self) -> &mut str {
        self
    }
}

// Would NOT be covered by blanket implementation
// because `<Vec<T> as Deref>::Target` is `[T]` and not `Vec<T>`
impl<T> MyAsMut<Vec<T>> for Vec<T> {
    fn my_as_mut(&mut self) -> &mut Vec<T> {
        self
    }
}

// Would be covered by blanket implementation
// because `<Vec<T> as Deref>::Target` is `[T]`
// and `[T]` implements `AsMut<[T]>`
impl<T> MyAsMut<[T]> for Vec<T> {
    fn my_as_mut(&mut self) -> &mut [T] {
        self
    }
}

/* … */

(Playground)

This is just to experiment with these traits a bit to see what's possible to do without features like coherence-impacting negative bounds or specialization with the goal to get a better understanding for AsRef and AsMut, and to be able to deduce whether these features could help in future. Due to the conflicting implementations (that are all desirable), I assume that coherence-impacting negative bounds are not sufficient here, but that specialization is needed to provide such an "auto-dereferencing" blanket implementation for AsMut (edit: under the premise that AsMut<String> for String, AsMut<Vec<T>> for Vec<T>, etc. shall be implemented).

I do not want to imply that any of this could currently (or even with specialization) be implemented, as it would break existing code anyway (if I understand it right, especially where .as_ref() is used to dereference generic smart pointers as demonstrated in my previous post).


I tested whether the "auto-dereferencing" blanket implementation for AsMut in combination with impl AsMut<String> for String and impl<T> AsMut<Vec<T>> for Vec<T> compiles with specialization, but it doesn't (Playground).


  1. Vec<T> in this context isn't entirely generic because, even though it is generic over T, it always points to a slice, i.e. Deref::Target = [T] and not Deref::Target = T.

While trying to work on a proposal (all half-baked yet) for some improvements regarding AsRef's documentation, I stumbled upon the phrase "inner type" here:

Generic Implementations

AsRef auto-dereferences if the inner type is a reference or a mutable reference (e.g.: foo.as_ref() will work the same if foo has type &mut Foo or &&mut Foo)

Why does that section say "inner type" there? Isn't that rather

  • an "outer type", or
  • a "receiver" of as_ref, or
  • the "implementing type"?

Given &String, I would say the inner type is String and not a reference. Or am I misunderstanding something?


I guess &self is the receiver of AsRef::as_ref and the "inner type" refers to self here. I.e. for an impl<'a> AsRef<U> for &'a T, the outer type (receiver) is &&T and the "inner type" is &T. Then it makes sense. But it's still quite confusing, as I initially thought T would be the "inner type". It's phrased a bit ambiguously.

It's worth noting that AsRef is pretty ancient in terms of Rust traits. I didn't actually do my research here, but I suspect "As lifts over &" is referring to the implemented trait As. It's possible that AsRef is descended from sugar/generalization over &self as &T? (This syntax is another way of getting deref coercions, and an oft overlooked use of the as operator to ascribe a reference type.)

I want to provide a hypothesis here: .as_ref() shouldn't be used. Similarly, .borrow() also shouldn't be used. This is the case with either std or jbe versions of these traits.

Or perhaps a little more generously: like .into(), these methods should not be called unless the target type is unambiguously known. Unlike .collect(), the target type cannot be specified with a turbofish, and instead has to be known from constrained usage as a non-generic function parameter.
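
A short sketch of what "unambiguously known" can mean in practice, using the hypothetical helpers `extension_of` and `byte_count`. The fully qualified call in `byte_count` is shown as one further way to name the target type; it is not something the post above proposes.

```rust
use std::path::Path;

// The target type is fixed by the annotation on `path`.
fn extension_of(name: &str) -> Option<&str> {
    // `let path = name.as_ref();` alone is rejected, because `str` implements
    // `AsRef` for several target types and none of them is singled out.
    let path: &Path = name.as_ref();
    path.extension().and_then(|ext| ext.to_str())
}

// The target type is named on the trait, since the method itself
// has no type parameter that a turbofish could fill in.
fn byte_count(name: &str) -> usize {
    AsRef::<[u8]>::as_ref(name).len()
}

fn main() {
    println!("{:?}", extension_of("some/file.txt"));
    println!("{}", byte_count("abc"));
}
```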

These traits should solely be used to define generic APIs, and within the definition, should ideally be immediately invoked and passed to a #[momo]-style non-generic tail.
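
Written out by hand (that is, without the `#[momo]` attribute), this pattern could look like the following sketch. `has_extension` is a hypothetical function chosen for the illustration.

```rust
use std::ffi::OsStr;
use std::path::Path;

// Generic shell: only converts the argument and delegates.
fn has_extension<P: AsRef<Path>>(path: P, ext: &str) -> bool {
    // Non-generic tail: compiled once, regardless of how many
    // different `P` the outer function is instantiated with.
    fn inner(path: &Path, ext: &str) -> bool {
        path.extension() == Some(OsStr::new(ext))
    }
    // `.as_ref()` is called exactly once, and the target type is
    // fixed by the signature of `inner`.
    inner(path.as_ref(), ext)
}

fn main() {
    println!("{}", has_extension("notes/todo.txt", "txt"));
    println!("{}", has_extension(String::from("src/main.rs"), "txt"));
}
```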

current concept of a perfect definition

Just the shared versions; extends equivalently to mut versions.

/// Generalization of method syntax autoref & autoderef.
///
/// Default implementations lift autoref and autoderef.
trait AsRef<T: ?Sized> {
    fn as_ref(&self) -> &T;
}

// autoderef; "lifts over &"
impl<T: ?Sized, U: ?Sized> AsRef<U> for T
where
    T: Deref,
    T::Target: AsRef<U>,
{
    default fn as_ref(&self) -> &U { (**self).as_ref() }
}

// autoref; *overrides autoderef*
impl<T: ?Sized> AsRef<T> for T {
    fn as_ref(&self) -> &T { self }
}

/// Generalization of `&` / refinement of `AsRef`.
///
/// `&T` exposes a strict subset of `&Self`'s behavior.
trait Borrow<T: ?Sized>: AsRef<T> {
    fn borrow(&self) -> &T;
}

impl<T: ?Sized> Borrow<T> for T {
    fn borrow(&self) -> &T { self }
}

// undecided -- does Deref imply Borrow<Self::Target>? autoderef syntax says yes.
impl<T: ?Sized, U: ?Sized> Borrow<U> for T
where
    T: Deref,
    T::Target: Borrow<U>,
{
    default fn borrow(&self) -> &U { (**self).borrow() }
}

std could perhaps incrementally move towards this by introducing a lint against calling .as_ref()/.borrow() on concrete types (challenge: only when the result is not uniquely constrained by a separate constraint than available impls). Then, new impls causing inference breakage only can be provided with pre-warned breakage (alternatively, use edition-dependent method lookup to avoid the breakage).


I created a draft on how documentation could be improved, which just reflects the current situation. Since I'm not familiar with the workflow of contributing (and also because I'm not very experienced with terminology), I didn't create a PR yet, but would like to ask you here for some feedback or comments first (if there are any).


I'm not so sure about this part:

Note that due to historic reasons, the above does not hold generally for all dereferenceable types, e.g. foo.as_ref() will not work the same as Box::new(foo).as_ref().

That is because "historic reasons" aren't the only reason why Box implements AsRef as it does. There is also the reason that AsRef isn't always reflexive (which means you could otherwise not use .as_ref() to retrieve a &T from a Box<T> where T: !AsRef<T>).

Edit: Though in these cases where T: !AsRef<T>, there will be no AsRef<T> implementation for &T either, so the generic use of AsRef<T> for any type T: !AsRef<T> is limited anyway. :man_shrugging: I would therefore still claim that implementations would ideally be different (even with stable Rust, and without negative bounds or specialization). Thus claiming that the current behavior is like it is (primarily) due to "historic reasons" seems to be reasonable to me.

So maybe it would better be phrased:

Note that due to historic and technical reasons (see section on “Reflexivity” below), the above does not hold generally for all dereferenceable types, e.g. foo.as_ref() will not work the same as Box::new(foo).as_ref(). Instead, many smart pointers provide an as_ref implementation which simply returns a reference to the pointed-to value (but do not perform a cheap reference-to-reference conversion for that value).

(Changes including fixup on GitHub)

These changes would make the documentation a bit wordy, but I believe explaining this in the docs can aid giving users a better understanding of AsRef and AsMut. Maybe some of the weird behavior isn't really that weird but just a consequence of technical limitations of Rust's type system (in the past and present).

What do you think?

For the sake of analyzing where exactly the current state of std is inconsistent, or where mistakes have been made, I would like to share some more theoretical considerations. Note that I'm doing these considerations not to conclude what to do or what to change in Rust in the short-term right now, but to better understand what's wrong as a first step (I'll get back to what to do later in this post). For these considerations, I would like to take the limitations of Rust's type system for granted, i.e. I assume that we have no negative bounds or specialization.

We first can see that the blanket implementation of AsRef<U> for &T conflicts with a blanket implementation of AsRef<T> for T (reflexivity). Moreover, any blanket implementation of the form AsRef<U> for SomeTypeConstructor<T> would conflict with a blanket implementation for reflexivity. But blanket implementations of the form AsRef<T> for SomeTypeConstructor<T> do not. (Playground)
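
The coherence argument can be tried out with a stand-in trait, since we cannot add blanket implementations to std's AsRef ourselves. This is only a sketch: MyAsRef, my_as_ref and Wrapper are hypothetical names that do not exist in std.

```rust
// Hypothetical stand-in for std::convert::AsRef.
trait MyAsRef<T: ?Sized> {
    fn my_as_ref(&self) -> &T;
}

// Reflexivity through a blanket implementation.
impl<T: ?Sized> MyAsRef<T> for T {
    fn my_as_ref(&self) -> &T {
        self
    }
}

// Hypothetical type constructor.
struct Wrapper<T>(T);

// Accepted: an implementation of the form MyAsRef<T> for Wrapper<T>
// cannot overlap with the reflexive one.
impl<T> MyAsRef<T> for Wrapper<T> {
    fn my_as_ref(&self) -> &T {
        &self.0
    }
}

// Expected to be rejected as a conflicting implementation if uncommented,
// because &T: MyAsRef<&T> would be covered twice:
//
// impl<T: ?Sized, U: ?Sized> MyAsRef<U> for &T
// where
//     T: MyAsRef<U>,
// {
//     fn my_as_ref(&self) -> &U {
//         (**self).my_as_ref()
//     }
// }

fn main() {
    let x: i32 = 5;
    let w = Wrapper(7i32);
    assert_eq!(*MyAsRef::<i32>::my_as_ref(&x), 5); // reflexive impl
    assert_eq!(*MyAsRef::<i32>::my_as_ref(&w), 7); // Wrapper impl
}
```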

Thus, given the constraints of Rust's type system, we have two choices:

  1. Make AsRef reflexive through a blanket implementation.
  2. Allow references, generic wrappers, smart pointers, etc. to act transparent in regard to implementing AsRef in the same way as the wrapped or targeted type does.

Let's for a moment assume that we decide to make AsRef reflexive through a blanket implementation. That certainly saves us some explicit implementations for concrete types. However, the benefit is limited, because:

  • We could also manually implement AsRef<T> for T for all T which support cheap reference-to-reference conversions from and to other types (e.g. str and Path).
  • In the generic case it is of no use, because wrappers, smart pointers, etc. will not pass that implementation through, e.g. Cow::<str> would not implement AsRef<Path> unless there existed a concrete implementation for that. (Playground)

For that reason, I believe (given the constraints of Rust's type system) it was a good choice to refrain from providing a blanket implementation AsRef<T> for T. This makes it possible to implement AsRef<U> for &T as well as AsRef<U> for SomeTypeConstructor<T> in general.

But do we really want AsRef<U> for &T or AsRef<U> for SomeSmartPointer<T>?

We have three choices here:

  1. Provide a blanket implementation of AsRef<T> where T is the inner type (or target) for all of these (references as well as smart pointers).
  2. Provide a blanket implementation of AsRef<T> only for smart pointers (with target T) but implement AsRef<U> for &T where T: AsRef<U>.
  3. In all cases (references as well as smart pointers), provide a blanket implementation of AsRef<U> where the inner type T implements AsRef<U>.

The first and second options seem nice because they allow us to "unwrap" any type from a Cow<'_, T> or Box<T> or Rc<T> even if the inner type T: !AsRef<T>. But as shown in the previous Playground, they make it impossible to generally pass &SomeSmartPointer<U> where we expect an &impl AsRef<U>.

Only the third choice seems to come with a true and consistent benefit for generic APIs as shown in my post #4 in this thread: Playground.
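
As a rough sketch of what the third choice looks like for a user-defined smart pointer (MyBox and path_len are hypothetical names; this is not how Box behaves in std):

```rust
use std::path::Path;

// Hypothetical smart pointer, used only to illustrate the third choice.
struct MyBox<T: ?Sized>(Box<T>);

// "Transparent" implementation: MyBox<T> implements AsRef<U>
// whenever the inner type T implements AsRef<U>.
impl<T, U> AsRef<U> for MyBox<T>
where
    T: ?Sized + AsRef<U>,
    U: ?Sized,
{
    fn as_ref(&self) -> &U {
        let inner: &T = &self.0; // deref coercion from &Box<T> to &T
        inner.as_ref()
    }
}

// Hypothetical generic API that accepts a reference to anything path-like.
fn path_len(p: &impl AsRef<Path>) -> usize {
    p.as_ref().as_os_str().len()
}

fn main() {
    let owned = String::from("a/b");
    let wrapped = MyBox(Box::new(String::from("a/b")));
    assert_eq!(path_len(&owned), 3);
    assert_eq!(path_len(&wrapped), 3); // passes String's impl through
    assert_eq!(path_len(&&wrapped), 3); // std's impl for &T still applies
}
```

The price is the one discussed above: with this implementation, MyBox<i32> would not implement AsRef<i32>, because i32 does not implement AsRef<i32>.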

I would therefore conclude that the following implementations of AsRef are indeed "mistakes":

But honestly, the whole subject is pretty complex and confusing, and I'm not sure if I made a mistake in my reasoning.

What to do?

First of all, I would appreciate some feedback in regard to whether my above reasoning seems to be correct or whether I made a mistake somewhere.

Assuming the analysis is correct, I think it's best to document these inconsistencies. Regarding my proposal to update the docs (see previous post), maybe it's even wise to give specific advice to not make the same mistake in third party crates (but I'm not sure about that).

I really would like to see if these mistakes could be solved in future (or if it could be shown that these are no mistakes). As I said here, simply solving this with coherence-impacting negative bounds or (an improved variant of) specialization doesn't seem to be possible as outlined here:

Maybe this would be a feasible solution to improve things:

Either way, it's not going to be easy, I assume.

I would agree here. Interestingly, the docs on AsMut give a somewhat awkward example though:

Examples

Using AsMut as trait bound for a generic function we can accept all mutable references that can be converted to type &mut T. Because Box<T> implements AsMut<T> we can write a function add_one that takes all arguments that can be converted to &mut u64. Because Box<T> implements AsMut<T>, add_one accepts arguments of type &mut Box<u64> as well:

fn add_one<T: AsMut<u64>>(num: &mut T) {
    *num.as_mut() += 1;
}

let mut boxed_num = Box::new(0);
add_one(&mut boxed_num);
assert_eq!(*boxed_num, 1);

Wouldn't this be better written as follows?

fn add_one(num: &mut u64) {
    *num += 1;
}

fn main() {
    let mut boxed_num = Box::new(0);
    add_one(&mut boxed_num);
    assert_eq!(*boxed_num, 1);
}

(Playground)

The .as_mut() call is entirely superfluous in this example. Not sure about an API that takes an impl AsMut<u64> though. Is there any real use-case for it? What are the real use-cases of AsMut anyway?

Note that here the same oddities arise that exist with AsRef:

fn add_one<T: AsMut<u64>>(num: &mut T) {
    *num.as_mut() += 1;
}

fn main() {
    let mut boxed_num = Box::new(0);
    add_one(&mut boxed_num);
    assert_eq!(*boxed_num, 1);
    let mut num = 0;
    //add_one(&mut num); // fails!
    let mut referenced_num = &mut num;
    //add_one(&mut referenced_num); // fails too!
}

(Playground)

Compare with:

-fn add_one<T: AsMut<u64>>(num: &mut T) {
-    *num.as_mut() += 1;
+fn add_one(num: &mut u64) {
+    *num += 1;
 }
 …
-    //add_one(&mut num); // fails!
+    add_one(&mut num); // works!
 …
-    //add_one(&mut referenced_num); // fails too!
+    add_one(&mut referenced_num); // works too!

where the failing lines compile properly. (Playground)

I'm asking because I have been thinking about how to improve the documentation further:

This is what I came up with, so far.

Generic Implementations

AsRef auto-dereferences if the inner type is a reference or a mutable reference (e.g.: foo.as_ref() will work the same if foo has type &mut Foo or &&mut Foo).

Note that due to historic reasons, the above currently does not hold generally for all dereferenceable types, e.g. foo.as_ref() will not work the same as Box::new(foo).as_ref(). Instead, many smart pointers provide an as_ref implementation which simply returns a reference to the pointed-to value (but do not perform a cheap reference-to-reference conversion for that value). However, AsRef::as_ref should not be used for the sole purpose of dereferencing; instead, “Deref coercion” can be used:

let x = Box::new(5i32);
// Avoid this:
// let y: &i32 = x.as_ref();
// Better just write:
let y: &i32 = &x;

Types which implement Deref should consider implementing AsRef as follows:

impl<T> AsRef<T> for SomeType
where
    T: ?Sized,
    <SomeType as Deref>::Target: AsRef<T>,
{
    fn as_ref(&self) -> &T {
        self.deref().as_ref()
    }
}

The idea here is to discourage using superfluous .as_ref() and .as_mut() calls where actually deref-coercion is what should be used instead. But the existing example in AsMut's docs would conflict with that recommendation.

So my question is: What's the actual use of AsMut? (beside dereferencing Box'es, for which it maybe shouldn't be used?)


Update:

PR #28811 added the non-transitive dereferencing AsMut (and AsRef) implementation(s) to Box (and Rc and Arc), saying that:

These common traits were left off originally by accident from these smart pointers, […]

[…]

These trait implementations are "the right impls to add" to these smart pointers and would enable various generalizations such as those in #27197.

If I understand correctly, the referenced use case is being generic over AsRef<[u8]>. Aside from that being the non-mutable case, an implementation like

impl<T, U> AsMut<U> for Box<T>
where
    T: ?Sized + AsMut<U>,
    U: ?Sized,
{
    fn as_mut(&mut self) -> &mut U {
        self.deref_mut().as_mut()
    }
}

would do just fine (or even better, Playground) for that particular use case, because AsRef and AsMut are reflexive for slices.

So maybe the example in the AsMut docs should be replaced to use an AsMut<[u8]> instead of AsMut<u64>. Generally, using AsRef<T> or AsMut<T> in cases where reflexivity is not implemented for a particular type T leads to odd behavior (because you cannot pass T, &T, or &mut T, as shown in this Playground above).
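
For illustration, a replacement example based on AsMut<[u8]> could look roughly like the following. This is only a sketch with a made-up helper name (increment_all), and not necessarily what the linked Playgrounds contain.

```rust
// Hypothetical helper: adds one to every byte (wrapping on overflow).
fn increment_all<T: AsMut<[u8]> + ?Sized>(buf: &mut T) {
    for byte in buf.as_mut().iter_mut() {
        *byte = byte.wrapping_add(1);
    }
}

fn main() {
    let mut vec = vec![1u8, 2, 3];
    increment_all(&mut vec); // Vec<u8>: AsMut<[u8]>
    assert_eq!(vec, vec![2u8, 3, 4]);

    let mut array = [10u8, 20];
    increment_all(&mut array); // [u8; 2]: AsMut<[u8]>
    assert_eq!(array, [11u8, 21]);

    let mut boxed: Box<[u8]> = vec![255u8].into_boxed_slice();
    increment_all(&mut boxed); // Box<[u8]>: AsMut<[u8]>
    assert_eq!(boxed[0], 0);

    let mut backing = [7u8, 8];
    let slice: &mut [u8] = &mut backing;
    increment_all(slice); // [u8]: AsMut<[u8]> (reflexive for slices)
    assert_eq!(backing, [8u8, 9]);
}
```

Unlike the AsMut<u64> variant, the plain slice can be passed here as well, because AsMut is reflexive for slices.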


There aren't really many implementations of AsMut in std, so perhaps the following could be a suitable example for the docs: Playground. (Along with some descriptive text.)

I created a pull request to incorporate the above ideas into the documentation:

As this is my first PR, I hope I did everything correctly. If there are any reasons against this PR, please comment accordingly.

My personal view is this:

Deref generalizes over the * operator. To convert between references, it should only be used when the code can be desugared to use &*. DerefMut should only be used when the dereference target can be moved from, and, to convert between references, when the code can be desugared to use &mut * or &*.

Borrow generalizes over the & operator. To convert between references, it should only be used when the reference being converted from has just been created. Ideally, only the expressions <T as std::borrow::Borrow>::borrow(&someinstance) and someinstance.borrow() should be allowed. Code should never use <T as std::borrow::Borrow>::borrow(somereference) or somereference.borrow(). Some other restrictions on the resulting borrow should also apply (like the Hash condition).

A similar condition should apply for BorrowMut, where Borrow is replaced by BorrowMut, & by &mut, and borrow by borrow_mut.

There should never be a case where one wishes to consume and convert a U into a &T, unless U is &S for some type S. The only exception is if U is &mut T. Similarly, there should never be a case where one wishes to consume and convert a U into a &mut T, unless U is &mut S for some type S.

Hence Into<&T> should ideally not be used, and so &T should not implement From<U> for any type U. This should apply to any type that can be written as a reference. The exception is From<V> for V, which is also implemented if V can be written as &T for some T. The same goes for Into<&mut T>.

If we want to be able to convert U to &T, this now means that

a) U should be &S for some S

b) We should not implement From<U> for &T.

This is now the use case for AsRef. We should implement AsRef<T> for S if we want to convert from &S to &T, in the same manner we would use a hypothetical Into<&T> implemented for &S. The same goes for AsMut.

I generally agree with jbe's opinion on when AsRef should be used.

I still don't fully understand the actual reason(s) to use

  • AsRef<U> for T

instead of

  • for<'a> Into<&'a U> for &'a T

?

Though I feel like one of the main reasons might actually be the desirable "auto-dereferencing" mechanism realized through AsRef's blanket implementation for references ("As lifts over &") and how it conflicts with reflexivity.
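
To compare the two formulations side by side, here is a small sketch. Name, len_via_as_ref and len_via_into are hypothetical names invented for this comparison.

```rust
// Hypothetical newtype used for the comparison.
struct Name(String);

// Formulation 1: AsRef<U> for T
impl AsRef<str> for Name {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

// Formulation 2: From<&'a T> for &'a U, which yields Into<&'a U> for &'a T
impl<'a> From<&'a Name> for &'a str {
    fn from(name: &'a Name) -> &'a str {
        &name.0
    }
}

fn len_via_as_ref<T: AsRef<str>>(x: &T) -> usize {
    x.as_ref().len()
}

fn len_via_into<'a, T>(x: &'a T) -> usize
where
    &'a T: Into<&'a str>,
{
    let s: &'a str = x.into();
    s.len()
}

fn main() {
    let name = Name(String::from("alice"));
    assert_eq!(len_via_as_ref(&name), 5);
    assert_eq!(len_via_into(&name), 5);

    // Extra reference layers are handled by AsRef's impl for &T ...
    assert_eq!(len_via_as_ref(&&name), 5);
    // ... but the following is expected not to compile, since there is
    // no From<&&Name> for &str:
    // len_via_into(&&name);
}
```

Both versions work for a plain &Name, but only the AsRef version gets the "auto-dereferencing" for free, and its bound is also noticeably easier to write.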

Other reasons might be:

I have been writing several comments to issue #45742, which I believe is the central issue in regard to the technical problems behind AsRef and AsMut. It is currently tagged as

  • A-specialization
  • C-cleanup
  • S-blocked

While specialization may be needed to be able to add new implementations, I believe it is neither sufficient to solve the problem nor does it block working towards a solution.

Such a lint could be added without specialization. So my question is, would it be reasonable to add a label "A-lint" (and maybe remove "S-blocked"?)

Or should a lint be a separate issue?

Moreover, should #45742 be labeled "T-libs-api"? Or doesn't it make sense until some time later when/if it would perhaps become more feasible to do anything that goes beyond a lint?

P.S.: Unfortunately, the links to the FIXMEs are broken in the OP of that issue #45742, and there seems to be a confusing mistake too.

P.P.S.: And are there reasons against such a lint?

(Maybe it's best to wait first anyway until the PR on the documentation fix has been successfully accepted or been rejected, or been replaced with something different, so there's more clarity on the issue.)

I just stumbled upon a potential use case for &impl AsRef<T>: it allows you to specify a lifetime requirement.

fn takes_static_bytes(bytes: &'static [u8]) {
    println!("got: {bytes:?}");
}

fn foo<'a, T>(bytes: &'a impl AsRef<[T]>) -> (usize, &'a [T]) {
    let bytes = bytes.as_ref();
    (bytes.len(), bytes)
}

fn main() {
    let (len, bytes) = foo(&"Hello");
    println!("len={len}");
    takes_static_bytes(bytes);
}

(Playground)

Compare with the following, which won't / can't compile:

-fn foo<'a, T>(bytes: &'a impl AsRef<[T]>) -> (usize, &'a [T]) {
+// won't compile:
+fn foo<'a, T>(bytes: impl 'a + AsRef<[T]>) -> (usize, &'a [T]) {

(Playground)

Not sure if that really happens in practice often. I believe that the output of as_ref is commonly only used within the current block or function.

Since my last post on this, neither the PR (99460) nor the original issue (45742) made any progress. As I have too little experience and expertise with Rust (at least for now), I don't think I could implement this idea myself (or even provide a more formalized proposal on it):

Right now, this feels like the only feasible option to repair the AsRef mess somehow. Maybe not everyone feels as bad about the current state of AsRef as I do, but I feel like it's a really bad flaw (introduced way back in PR #28811) in one of the most basic traits in Rust, and I find it sad that we have to live with this.

Maybe there's someone else who also thinks that the AsRef situation is unfortunate (e.g. see first Playground), and who would like to help fix it in the long run, e.g. by helping to review PR 99460, or by adding critique and/or proposing alternatives to the presented approaches in #99460 or #45742.

Or do you think this is unfixable? I still think at least the current semantics could/should be better documented.
