Improving usability of having many nearly-identical methods

Rust libraries, especially the standard library, often end up having multiple slightly different variants of methods:

  • try_, _checked/_unchecked
  • _ref, _mut, and occasionally _ptr, _mut_ptr, _drop
  • split and search ×
    • forward and reverse
    • exclusive and _inclusive
    • unlimited, limited to n, and _once
  • _start/_end, _prefix/_suffix
  • sort ×
    • stable/_unstable
    • _by closure, _by_key, _by_cached_key, and _floats
  • collections ×
    • _in allocator
    • _uninit, _zeroed
    • fallible try_
    • _with_capacity (_and_hasher)
    • _within_capacity
    • _entry/_key_value
  • _array or _slice (sometimes _exact)
  • [u8] vs _str, _os_str, _ascii, _wide or _with_nul
  • immediate value vs _with/_then/_else
  • wrapping_, saturating_, overflowing_, strict_, unbounded_, algebraic_, carrying_ and _signed, _euclid.
  • le_/ne_/be_ bytes
  • make_/_in_place vs copy to_/into_
  • rayon has _init/_with/_any/_first/_last, _context, in_place_ and _fifo
  • SIMD has type × width × low/high/single/vector/horizontal/unaligned/masked
  • Async has _blocking, and sometimes _send/_local or _on (handle).

So far Rust survives doing nothing about it, but that's not ideal. This isn't a one-time accident; this is how Rust de facto works. Some duplication (like _mut vs _ref vs owned) is unavoidable. The standard library won't ever have fewer methods, and it's likely it will only grow.

Adding a degree of flexibility (fallible allocation, non-UTF8 strings, float size, closure callback, unchecked) can easily double the API size, and this compounds exponentially when the API supports multiple cross-cutting aspects like these.

The fear of multiplying API surface to a problematic size causes pushback against adding more flavors of methods, at the cost of losing more specific methods that could have better performance, error handling, usability, or additional features.

I think it's worth considering what can be done in the language, libraries, tooling, design patterns, etc. to either reduce the need to multiply methods, or to improve working with bunches of similar methods to make it a non-issue.

18 Likes

Something like combinators or builders could work. For string splitting for example, I've proposed a combinator API to get all possible combinations of (reverse, once, terminated, etc): ACP
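
To give a feel for the builder shape, here is a rough sketch on stable Rust. It is not the API from the ACP; Splitter and its reverse/limit/run methods are names made up for illustration, and it only wraps the existing split, rsplit, splitn and rsplitn methods.

```rust
/// Hypothetical builder (not a std API): one entry point instead of
/// split / rsplit / splitn / rsplitn.
#[derive(Clone, Copy)]
struct Splitter<'a> {
    haystack: &'a str,
    sep: char,
    reverse: bool,
    limit: Option<usize>,
}

impl<'a> Splitter<'a> {
    fn new(haystack: &'a str, sep: char) -> Self {
        Splitter { haystack, sep, reverse: false, limit: None }
    }

    /// Search from the end of the string instead of the start.
    fn reverse(mut self) -> Self {
        self.reverse = true;
        self
    }

    /// Return at most `n` pieces; the last piece holds the remainder.
    fn limit(mut self, n: usize) -> Self {
        self.limit = Some(n);
        self
    }

    /// Dispatch to the matching std method and collect the pieces.
    fn run(self) -> Vec<&'a str> {
        match (self.reverse, self.limit) {
            (false, None) => self.haystack.split(self.sep).collect(),
            (true, None) => self.haystack.rsplit(self.sep).collect(),
            (false, Some(n)) => self.haystack.splitn(n, self.sep).collect(),
            (true, Some(n)) => self.haystack.rsplitn(n, self.sep).collect(),
        }
    }
}
```

With this, Splitter::new("a,b,c", ',').reverse().limit(2).run() does what "a,b,c".rsplitn(2, ',') does today. Each new axis becomes one more method on the builder rather than a doubling of the method list.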

5 Likes

Keyword generics potentially would have a path to cover try/mut/ref/const.

Some of these are logical differences. I suppose the APIs for to_bytes_XX could have been written to_bytes(ByteOrder::XX), and e.g. atomics already work that way a_var.load(Ordering::SomeOrder) if you want to add that as a counter example.
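
As a small illustration of the enum-parameter style, here is a sketch. ByteOrder and the free function to_bytes are hypothetical names, not std items; the atomics call is the real std API and is included only for comparison.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical enum; std has no such type for integer conversions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ByteOrder {
    Little,
    Big,
    Native,
}

/// One function taking the byte order as a value, instead of three
/// separately named methods.
fn to_bytes(value: u32, order: ByteOrder) -> [u8; 4] {
    match order {
        ByteOrder::Little => value.to_le_bytes(),
        ByteOrder::Big => value.to_be_bytes(),
        ByteOrder::Native => value.to_ne_bytes(),
    }
}

/// Atomics already take their "flavor" as an enum argument.
fn load_relaxed(counter: &AtomicU32) -> u32 {
    counter.load(Ordering::Relaxed)
}
```

The call site becomes to_bytes(x, ByteOrder::Big), which is longer than x.to_be_bytes() and needs the enum in scope.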

Some of these could potentially be covered by some default argument proposal, e.g. fn new(capacity: usize default 0).

Named arguments (with defaults) would be super helpful. Even named arguments without defaults would be a little helpful, in cases where the default is very obvious e.g. new(foo=None).

4 Likes

Some improvements could be done without changing the language, e.g.

  • rustc could be taught that methods are related, and that some errors can be fixed by using a more appropriate method (e.g. instead of complaining about passing a closure where a primitive type was expected, say it should be calling a _by/_with method instead).

  • rustdoc could automatically cross-reference between different flavors of methods (point out when there's a try_ version of a panicking method). Perhaps it could even group similar methods by their "stem".

  • There could be a rustdoc attribute for hiding/de-emphasizing rustdoc documentation of the most niche flavors of methods, so that adding a _during_a_full_moon flavor to everything doesn't clutter the documentation.

  • Maybe IDEs/rust-analyzer could have specialized auto-complete/auto-fixes for choosing the right method. If I type v.push(x)?, make it v.try_push(x)?. Turn x.get().mutate() into x.get_mut().mutate().

7 Likes

Unfortunately, the design pattern of to_bytes(ByteOrder::XX) is a hassle in Rust, since it requires importing the right enum type. Swift has it much easier with to_bytes(.XX), but this died in bikeshedding between copying the un-Rust-like .-prefix vs the ugly to_bytes(_::XX).

Optional named arguments would be great IMHO, especially since Rust could copy the ObjC/Swift approach of making it syntax sugar for longer method names, so v.sort(by_key: k)[1] could call v.sort_by_key(k).


  1. or something like it if : doesn't parse well ↩︎

6 Likes

I think the most important solution will be "mini-builders" via https://github.com/rust-lang/rfcs/pull/3681 (plus a way to say "well, this can be extended with more fields in the future, but I promise they'll always have default values, so you can construct it so long as you use , ..").

That will let you have all the _in, _with_capacity, etc as fields on the options struct, so you can omit them if you want the defaults (global alloc, empty capacity, etc) but we can add more things without the combinatorial explosion of APIs.
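
For comparison, this is roughly what the options-struct pattern looks like on stable Rust today, where the caller has to spell out ..Default::default(). VecOptions, its fields, and new_with are all made-up names for the sketch, not anything std provides.

```rust
/// Hypothetical options struct; every field has a default, so new
/// fields can be added without new constructor names.
#[derive(Debug, Default, Clone, Copy)]
struct VecOptions {
    /// Minimum capacity to reserve up front (default 0).
    capacity: usize,
    /// Number of zero bytes to pre-fill (default 0).
    zeroed_len: usize,
}

/// One constructor covering what would otherwise be separate
/// with_capacity / zeroed variants.
fn new_with(opts: VecOptions) -> Vec<u8> {
    let mut v = Vec::with_capacity(opts.capacity.max(opts.zeroed_len));
    v.resize(opts.zeroed_len, 0);
    v
}
```

A call looks like new_with(VecOptions { capacity: 16, ..Default::default() }). The point of the RFC, as I read it, is to shrink that trailing part down to just `..`.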

And of course the pie in the sky vision for things like try_ is that effects make it "just work".

1 Like

I was trying to form some identical function groupings. A couple of these stuck out as things that would be fine as an enum (especially an enum with a default!) but don't use an enum now. And it's basically because using an enum can be several times longer at the call site than just copy/pasting a function a couple times with minor modifications. But in some cool future world with defaults, keywords, automatic polymorphism we could have, um...

fn sort(
    by: CmpFn = std::cmp::identity_cmp,
    by_key: KeyFn = std::convert::identity,
    order: Ordering = _::Asc,
    stability: Stability = _::Stable
) where ...

No idea on how sort_by_cached_key fits in that scheme exactly as it's doing a bit more than something like sort(by_key = cached(my_key_fn)) would allow. Unfortunate if bikeshedding on _::Thing or whatever was all that killed it because if that existed in any reasonable form, I would use it.
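
For what it's worth, part of that signature can be approximated on stable Rust with enums that have a default variant plus an options struct. This is only a sketch: Order, Stability, SortOptions and sort_with are invented names, and it leaves out the by/by_key closures.

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
enum Order {
    #[default]
    Asc,
    Desc,
}

#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
enum Stability {
    #[default]
    Stable,
    Unstable,
}

/// Hypothetical options for a single sort entry point.
#[derive(Debug, Default, Clone, Copy)]
struct SortOptions {
    order: Order,
    stability: Stability,
}

fn sort_with<T: Ord>(items: &mut [T], opts: SortOptions) {
    // Flip the comparison for descending order rather than reversing
    // afterwards, so a stable sort stays stable.
    let cmp = |a: &T, b: &T| match opts.order {
        Order::Asc => a.cmp(b),
        Order::Desc => b.cmp(a),
    };
    match opts.stability {
        Stability::Stable => items.sort_by(cmp),
        Stability::Unstable => items.sort_unstable_by(cmp),
    }
}
```

The call site, sort_with(&mut v, SortOptions { order: Order::Desc, ..Default::default() }), shows the length problem: it is several times longer than v.sort_unstable().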

5 Likes

A large chunk of those nearly-same methods differ in effects, and so could be squashed by an effect system. Alas, that ship has long sailed past us. There was the "keyword generics" proposal a few years ago, which is like half of an effect system, but I haven't heard a peep about it in a long while.

Because of a lack of interest/devtime, complexity, backwards incompatibility with other features or due to other reasons?

It does seem useful and could extend to new functionality (+ potential security/reliability improvements).

This seems like a worthwhile diagnostics attribute. Perhaps you could list related methods, and rustc would check if any of them would type check and if so suggest switching to it? (I don't know nearly enough about the internals to determine if that's even feasible.)

4 Likes

Are you sure? I was really excited about effects and was under the impression that it was still a real possibility (I'm thinking about effects in the way described in this blogpost by Yoshua Wuyts). Based on:

We're currently in the process of formalizing the effect generic work via the A-Mir-Formality. MIR Formality is an in-progress formal model of Rust's type system. Because effect generics are relatively straight forward but have far-reaching consequences for the type system, it is an ideal candidate to test as part of the formal model.

and:

Once both the formal modeling and compiler refactorings conclude, we'll begin drafting an RFC for effect-generic trait definitions. We expect this to happen sometime in 2024.

(both at the end of the blogpost)

I was under the impression this was just behind schedule, and that there was still work being done in the area (such as a-mir-formality!194 and const traits as a whole).

Controversial take certainly, but C#-style function overloading and optional/default/named arguments could help.

fn new() { ... }

fn new(capacity: usize) { ... } (as opposed to with_capacity)

or

fn new(capacity: usize = 0) { ... }

Rust already has some weaker and more boilerplate-heavy forms of overloading via turbofish, so I don't think it's as much of a stretch as might be believed.
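
One boilerplate-heavy form that works today is trait-based dispatch on the argument type. The sketch below is just an illustration of that idea, not necessarily what was meant above by turbofish; Capacity and new_buffer are hypothetical names.

```rust
/// Hypothetical wrapper type used only to drive the "overload" choice.
struct Capacity(usize);

/// "Overload" 1: no argument means the default capacity.
impl From<()> for Capacity {
    fn from(_: ()) -> Self {
        Capacity(0)
    }
}

/// "Overload" 2: an explicit capacity.
impl From<usize> for Capacity {
    fn from(n: usize) -> Self {
        Capacity(n)
    }
}

/// A single function name accepting either argument shape.
fn new_buffer(cap: impl Into<Capacity>) -> Vec<u8> {
    Vec::with_capacity(cap.into().0)
}
```

Here new_buffer(()) and new_buffer(16usize) both compile, but every "overload" costs a trait impl and the no-argument case still needs a placeholder value.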

4 Likes

My guess is that currently other things take precedence, like Polonius, const time type evaluation, and so forth.

3 Likes

I think we should start by making structs easier to use.

For example, if you have Vec::new_with(.{ capacity: 123, .. }) or similar it's pretty close to having named arguments with defaults, but avoids all the questions like "how do named arguments work with Fn" since it's just a struct getting passed to a function.

6 Likes

I read this as the anonymous struct being a new struct VecOptions, where only logical constructor fields are available.

For the Sorting example, you'd specify which fields you want to override from the defaults, and I suppose even those fields could be anonymous structs too? (continuing the anonymous struct syntax to those fields)

3 Likes

To reduce boilerplate we have invented more boilerplate /s

But to actually say something productive? We also have the issue of having many identical methods.

str for example has as_bytes and AsRef<[u8]> which do the same thing, and this is super common. String has as_str and AsRef<str>, Borrow<str>, Deref<Target = str> and unsize &String -> &str

It also has as_bytes, AsRef<[u8]> and all the methods from Deref<Target = str>

If I want to convert &String to &str or &String to &[u8] there are like 5 ways to do both that are identical.

Most of those you should never call if you know your type is &String and only exist for the sake of generic code, documentation, and the fact that type inference is really bad when it comes to traits like AsRef and Into, so maybe we could do something about that.
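
For reference, these are the equivalent spellings on stable Rust for the &String to &str case. The function five_ways exists only for this illustration, and the last form is what the compiler calls a deref coercion.

```rust
use std::borrow::Borrow;
use std::ops::Deref;

/// Five ways to get a &str out of a &String; all return the same slice.
fn five_ways(s: &String) -> [&str; 5] {
    let a: &str = s.as_str(); // inherent method
    let b: &str = AsRef::<str>::as_ref(s); // trait, target spelled out
    let c: &str = Borrow::<str>::borrow(s); // trait, target spelled out
    let d: &str = Deref::deref(s); // explicit deref
    let e: &str = s; // deref coercion via the type annotation
    // By contrast, `s.as_ref().len()` does not compile: String implements
    // AsRef for several targets, so the compiler cannot pick one.
    [a, b, c, d, e]
}
```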

#[inherent]
impl AsRef<str> for String {
    #[alias(as_str)]
    fn as_ref(&self) -> &str { &**self }
}

inherent would simply mean this trait implementation is always in scope for String, any methods on String must not conflict with methods from the inherent trait, and all inherent methods show up in doc similar to normal methods.

The alias attribute would allow us to name the specific AsRef::<str>::as_ref function which would help with type inference.

Personally I almost never use AsRef or Into on my own types simply because the type inference fails almost 100% of the time. The trait implementation also takes longer to write than just slapping an as_str method into an impl block.

2 Likes

I don't think this is a good solution in most cases. Yes, it is possible without changes to Rust, but it also has a significant number of downsides compared to the easy to read/understand sort(asc: true, stable: Stability::xxx):

  • Users are forced to know about and import Sorting,
  • then need to figure out which method chain they actually need.
  • When debugging you likely need to look up which fields are set to what by which functions (which could in theory even do arbitrary other things).
  • The function definition site requires a lot more boilerplate.
  • It's likely harder to read/understand/intuit on both declaration and usage side.

Personally, I really like the foo.sort(key: value, key2: value) or foo.sort(key=value, key2=value) syntax, which feels a lot more familiar than the builder pattern (even though I have used both). And it doesn't add another layer with function names.

This syntax feels unintuitive/unfamiliar, even though it does make sense.

Any reason not to do the same with a key=value or key: value syntax (i.e. without .{ and })? [1] Both require new syntax and as far as I know the function ABI isn't even stable yet. Couldn't Vec::new_with(false, capacity: 123) be syntax sugar for Vec::new_with(false, VecNewWithArgs{capacity: 123}), too?

As long as VecNewWithArgs is not nameable/constructable [2] it shouldn't even prevent changing that behavior in the future and let us make it into individual arguments instead of one struct (if desired), then it would be up to the ABI whether it's passed as a struct or individual arguments and thus potentially be changeable in the future.

One downside is of course that it isn't an explicit opt-in to default values for all other named arguments, but that can also be a big advantage as it allows adding new optional arguments without needing a breaking change, so perhaps this should even be the default (insert discussions about non-exhaustive here). [3]

#[non_exhaustive] // Maybe
#[derive(Default)]
pub struct VecNewWithArgs {
    capacity: usize
}
impl Vec {
    // Explicit opt-in (bikesheddable, not sure if this one would work)
    pub fn new_with(bar: bool, args: ..VecNewWithArgs) -> Self {}
    // Could always allow it on the last argument (if it derives Default).
    // Not sure if that'd be a good idea though.
    pub fn new_with(bar: bool, args: VecNewWithArgs) -> Self {}
}

fn foo() {
    Vec::new_with(false, capacity: 123);
}

Though your suggestion works better if there is a need for 2 argument structs that are forwarded to other calls, but it feels less intuitive and the only thing you really save is writing+importing the name of the struct.

This would also be familiar for those coming from e.g. Python, where iirc the kwargs are also effectively just a single object (below the syntax that hides it): PEP 692 – Using TypedDict for more precise **kwargs typing | peps.python.org

All of that is mainly relevant for different inputs of course, not for varying the output or having an additional generic.


  1. Though you'd likely need some kind of opt-in on the declaration side, even if it behaves like passing a struct. ↩︎

  2. There are likely advantages to having it nameable or even be your own type, as that allows simply forwarding arguments. So perhaps a public struct with private fields. ↩︎

  3. I guess there is also the option to take the non_exhaustive attribute of the struct into account and force the call to end with .., but I don't really see how that would be better for arguments that are explicitly "optional"/named. ↩︎

1 Like

Because it means you can use it in other places too, like say

x.position = .{ x: 10, y: 20};

where the name of that type isn't particularly important.

Or if you prefer,

x.position = _ { x: 10, y: 20};

since it's just the classic "_ to infer the type" that way.

(I've been liking the Swift-style version, though, which is why I've been writing these with the . syntax)

6 Likes

I've tried to solve the problem of several similar functions in my code with const generics, but enum-typed const parameters are not allowed (yet?). E.g.,

enum MyFnMode {
    Loud,
    Normal,
}

fn my_fun<const MODE: MyFnMode>(actual_arg: i8) {
    match MODE {
        MyFnMode::Loud => println!("{actual_arg}!!!"),
        MyFnMode::Normal => println!("{actual_arg}")
    }
}

fn main() {
    use MyFnMode::*;
    my_fun::<Loud>(3);
}
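
For comparison, a reduced version of the same idea does compile on stable Rust, because const parameters may be integers, bool, or char. The sketch below swaps the enum for a bool parameter (LOUD is an invented name) and returns the string instead of printing it. As far as I know, the enum version needs the unstable adt_const_params feature.

```rust
/// Stable-Rust variant of the example: the mode is a const bool
/// rather than a const enum value.
fn my_fun<const LOUD: bool>(actual_arg: i8) -> String {
    if LOUD {
        format!("{actual_arg}!!!")
    } else {
        format!("{actual_arg}")
    }
}
```

The call site is my_fun::<true>(3), which is obviously less readable than my_fun::<Loud>(3) and doesn't scale past two modes without switching to an integer.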

This is my least favorite specific vision of default args I've seen on here, but even then if it existed I would use it (assuming some sugar). Default args would be great.

As a point of meta discussion, it would be nice if the stuff about the various ways to implement default args were moved to a different thread. I can't actually mandate that this happen though.