Deprecate vec! macro

Is there a reason vec! is a macro rather than two regular functions?

vec! takes two forms:

  1. vec![a, b, c]
  2. vec![v; n]

It would be preferable if these were regular functions rather than macros. One reason is that the type signatures would be clear in the documentation (a, b, c, v need to be of type T, v needs to be Clone, n needs to be usize).

Form 1 already exists as a function: Vec::from([a, b, c]).

Form 2 could be a new function, for example Vec::repeat(v, n). It can already be written as std::iter::repeat(v).take(n).collect(), but that's verbose.
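
As a rough sketch of how the two forms look as plain function calls: `repeat_vec` below is a hypothetical free function standing in for the proposed Vec::repeat, which does not exist in the standard library.

```rust
// `repeat_vec` is a hypothetical stand-in for the proposed `Vec::repeat`;
// it is not a standard library function.
fn repeat_vec<T: Clone>(v: T, n: usize) -> Vec<T> {
    std::iter::repeat(v).take(n).collect()
}

fn main() {
    // Form 1: the list form already has a function equivalent.
    let a = vec![1, 2, 3];
    let b = Vec::from([1, 2, 3]);
    assert_eq!(a, b);

    // Form 2: the repeat form, written as an ordinary function call.
    let c = vec![String::from("x"); 3];
    let d = repeat_vec(String::from("x"), 3);
    assert_eq!(c, d);
}
```

The signature of `repeat_vec` spells out the requirements listed above: the element must be Clone and the count is a usize.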

Whether function equivalents of these invocations make sense or not, the macro is far too widely used to even consider deprecating.

How about this:

  1. Add Vec::repeat(v, n).
  2. Do not mark vec! deprecated.
  3. In the documentation of vec! simply say that the macro is equivalent to the two functions and those can be used instead.

Saying that vec![v; n] is equivalent to Vec::repeat(v, n) would immediately make it obvious that the expression v gets evaluated once since that follows from regular function call rules.

There's a function that's technically already usable on stable without features or workarounds:

fn main() {
    let v = std::vec::from_elem(0, 4);
    dbg!(v);
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c453dbedbd55d2baa2320233e7edb1df

Huh, I can't find that one in the documentation. Apparently it is hidden. But not deprecated.

Is it just there to support the macro? Presumably not a stable API though?

Any other interesting hidden functions like that?

The vec! macro is extremely useful for teaching the language to people coming from a language like Python. They often get caught up trying to use array literals the way they would use a heap-allocated array in a dynamic language, and telling them to just use vec! if they want something closer to what they're used to is usually the easiest option.

I was certainly expecting it to be marked unstable, with the magic attribute on vec! to let it use the unstable thing.

But apparently not!

I don't understand how saying "just use vec!" is any better than saying "just call Vec::from".

Generally if you can define something as a function, it's better to have it as a function than as a macro.

There is no string! macro for building Strings.

I think the actual reason vec! exists is that Vec::from([T; N]) is relatively new, and it was impossible in its full generality before const generics.

If there's an enduring reason for it, it would be that the macro allows a different order of evaluation. The macro can allocate first, then evaluate the parameters one by one, storing them into the allocation. Something taking an array needs to, at least hypothetically, evaluate the whole argument array before doing the allocation.

But there's absolutely no reason for vec![x; n], since that's purely better done as Vec::repeat(x, n) and has always been possible as a function (indeed, the macro just calls the function).

Oh, but surely there is: the fact that it mirrors the array repeat constructor syntax. There's a pleasant symmetry in

[1, 1, 1] : [1; 3] :: vec![1, 1, 1] : vec![1; 3]

although the slight semantic differences are unfortunate.
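
To make the symmetry and the differences concrete, here is a small sketch using only stable features. The commented-out line is an example of what the array form rejects.

```rust
fn main() {
    // For `Copy` types the two repeat forms look and behave alike.
    let arr = [1; 3];
    let v = vec![1; 3];
    assert_eq!(arr.as_slice(), v.as_slice());

    // `vec![e; n]` only needs `Clone`, and `n` can be a runtime value.
    let n = 2 + 1;
    let strings = vec![String::from("a"); n];
    assert_eq!(strings.len(), 3);

    // `[e; N]` needs a constant length and, for a non-constant element
    // expression like this one, a `Copy` type, so this would not compile:
    // let bad = [String::from("a"); 3];

    // One existing function-style way to build such an array:
    let from_fn: [String; 3] = std::array::from_fn(|_| String::from("a"));
    assert_eq!(from_fn.as_slice(), strings.as_slice());
}
```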

Of course it's debatable whether [e; N] should exist either or whether it should be eventually deprecated in favor of array::repeat.

Well, a less awkward syntax (whether a macro or something else) for creating Strings in Rust is probably one of the most often requested features. I'm sure most people much prefer writing vec![1, 2, 3] over Vec::from([1, 2, 3]), with the ([ combination especially being incredibly awkward to type on many keyboard layouts.

I don't see how that's an argument for a macro vs a function. If you want a shorter notation for string construction, you could just have a shorter function name: string("abc") is both shorter and more understandable than string!("abc").

I don't see how the character combination of ![ is any less awkward to type than ([.

f([...]) or f(&[...]) is standard notation for passing arrays or slices to many other functions. If we wanted a different notation for this, it should apply to all such functions.

Functions have cleaner argument and return type checking and have cleaner docs with standard function signatures.


It's macros that are far more awkward than functions. The syntax isn't clear. The types of arguments aren't clear. What happens with the arguments isn't clear. It all depends on the macro implementation internals. You have to read the docs in prose rather than look at types.

For example, I (and others too) have been confused before by how println! silently takes its arguments by reference.
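
A minimal illustration of that surprise, where `consume` is just a made-up function for contrast:

```rust
// An ordinary function: the signature says the argument is moved in.
fn consume(s: String) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hello");

    // `println!` expands to code that borrows `s`, so `s` stays usable
    // even though the call site looks like pass-by-value.
    println!("{}", s);
    println!("{}", s);

    // With a function, passing `s` by value moves it.
    let len = consume(s);
    assert_eq!(len, 5);
    // println!("{}", s); // would not compile: `s` was moved above
}
```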

Even with vec![x; n] it's not immediately obvious that the argument x is evaluated only once and then cloned, rather than evaluated n times; you have to read the docs to be sure.
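
One way to check this without reading the docs is to count evaluations and clones. The `Tracked` type and `count_repeat` function are illustrative names only, and the exact number of clones is an implementation detail, so the sketch asserts only what the documented behavior implies.

```rust
use std::cell::Cell;

// Element type that counts how often it is cloned.
struct Tracked<'a> {
    clones: &'a Cell<usize>,
}

impl Clone for Tracked<'_> {
    fn clone(&self) -> Self {
        self.clones.set(self.clones.get() + 1);
        Tracked { clones: self.clones }
    }
}

// Returns (evaluations of the element expression, clones made, resulting length).
fn count_repeat(n: usize) -> (usize, usize, usize) {
    let evals = Cell::new(0);
    let clones = Cell::new(0);
    let v = vec![
        {
            // This block is the element expression `x` in `vec![x; n]`.
            evals.set(evals.get() + 1);
            Tracked { clones: &clones }
        };
        n
    ];
    (evals.get(), clones.get(), v.len())
}

fn main() {
    let (evals, clones, len) = count_repeat(3);
    assert_eq!(evals, 1); // the expression ran once
    assert_eq!(len, 3);
    assert!(clones >= 2); // the remaining elements came from `clone`
    println!("evaluations: {evals}, clones: {clones}, len: {len}");
}
```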

Isn't this a moot point given:

Until we have some sort of proper guaranteed placement construction, the macro seems like the best option to me. Rust is altogether too happy to copy things at constructions, and relying on LLVM optimisations is brittle to say the least. That (and lack of any form of specialisation, but that is off topic) is one of my largest gripes with Rust currently.

Does the macro guarantee it? The docs don't say anything about that.

The original reason for the macro was that the const generics feature came relatively late, and without it you can't write a Vec::from that works for a fixed array of any size (you could with slices - but then you'd get references to the values), so a macro was the only way to do it.
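
A sketch of that difference, using hypothetical helper names rather than the real standard library impls:

```rust
// With const generics, one function covers arrays of every length and
// moves the elements, so `T` needs no extra bounds.
fn vec_from_array<T, const N: usize>(arr: [T; N]) -> Vec<T> {
    let mut v = Vec::with_capacity(N);
    for item in arr {
        v.push(item);
    }
    v
}

// Without them, a slice is the general option, but it only lends the
// elements, so producing owned values requires `Clone`.
fn vec_from_slice<T: Clone>(items: &[T]) -> Vec<T> {
    items.to_vec()
}

fn main() {
    let owned = vec_from_array([String::from("a"), String::from("b")]);
    let cloned = vec_from_slice(&[String::from("a"), String::from("b")]);
    assert_eq!(owned, cloned);
    assert_eq!(owned, Vec::from([String::from("a"), String::from("b")]));
}
```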

All the other reasons for why it is good to have it as a macro are kind of irrelevant - if Rust had const generics from version 1.0.0 it wouldn't have been a macro. And the only valid reason for keeping it a macro is:

There's no guarantee of emplacement. But what it does do is utilize #[rustc_box] to construct the array directly on the heap, which means the (potentially unwinding) allocation happens before constructing the elements (which may have drop glue), making it significantly easier for LLVM to optimize it into direct initialization. Additionally, while every element is created before any are moved to the heap, they remain as separate objects and the slice itself is assembled

That's all implementation details of the current stable compiler, though. We do try to avoid regressing move efficiency, and vec![] is more tested than other less ubiquitous constructions, but it's all best-effort for the medium term at least. The current vec! implementation is also measurably better for compile time than using From<[T; N]>, for essentially the same reasons it's easier to optimize.

Optimized MIR for the following two functions:
pub fn vec_macro(f: fn() -> String) -> Vec<String> {
    vec![f(), f(), f()]
}

pub fn vec_from(f: fn() -> String) -> Vec<String> {
    Vec::from([f(), f(), f()])
}
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn vec_macro(_1: fn() -> String) -> Vec<String> {
    debug f => _1;
    let mut _0: std::vec::Vec<std::string::String>;
    let mut _2: std::boxed::Box<[std::string::String]>;
    let mut _3: std::boxed::Box<[std::string::String; 3]>;
    let mut _4: *mut u8;
    let mut _5: std::boxed::Box<[std::string::String; 3]>;
    let mut _6: std::string::String;
    let mut _7: std::string::String;
    let mut _8: std::string::String;
    let mut _9: &mut std::boxed::Box<[std::string::String; 3]>;
    let mut _10: ();
    let mut _11: *const [std::string::String; 3];
    scope 1 (inlined slice::<impl [String]>::into_vec::<std::alloc::Global>) {
    }

    bb0: {
        StorageLive(_2);
        StorageLive(_3);
        _4 = alloc::alloc::exchange_malloc(const 72_usize, const 8_usize) -> [return: bb1, unwind continue];
    }

    bb1: {
        StorageLive(_5);
        _5 = ShallowInitBox(move _4, [std::string::String; 3]);
        StorageLive(_6);
        _6 = _1() -> [return: bb2, unwind: bb8];
    }

    bb2: {
        StorageLive(_7);
        _7 = _1() -> [return: bb3, unwind: bb6];
    }

    bb3: {
        StorageLive(_8);
        _8 = _1() -> [return: bb4, unwind: bb5];
    }

    bb4: {
        _11 = (((_5.0: std::ptr::Unique<[std::string::String; 3]>).0: std::ptr::NonNull<[std::string::String; 3]>).0: *const [std::string::String; 3]);
        (*_11) = [move _6, move _7, move _8];
        StorageDead(_8);
        StorageDead(_7);
        StorageDead(_6);
        _3 = move _5;
        _2 = move _3 as std::boxed::Box<[std::string::String]> (PointerCoercion(Unsize));
        StorageDead(_5);
        StorageDead(_3);
        _0 = slice::hack::into_vec::<String, std::alloc::Global>(move _2) -> [return: bb9, unwind continue];
    }

    bb5 (cleanup): {
        drop(_7) -> [return: bb6, unwind terminate(cleanup)];
    }

    bb6 (cleanup): {
        drop(_6) -> [return: bb8, unwind terminate(cleanup)];
    }

    bb7 (cleanup): {
        resume;
    }

    bb8 (cleanup): {
        _9 = &mut _5;
        _10 = <Box<[String; 3]> as Drop>::drop(move _9) -> [return: bb7, unwind terminate(cleanup)];
    }

    bb9: {
        StorageDead(_2);
        return;
    }
}

fn vec_from(_1: fn() -> String) -> Vec<String> {
    debug f => _1;
    let mut _0: std::vec::Vec<std::string::String>;
    let mut _2: [std::string::String; 3];
    let mut _3: std::string::String;
    let mut _4: std::string::String;
    let mut _5: std::string::String;
    scope 1 (inlined <Vec<String> as From<[String; 3]>>::from) {
        let mut _6: std::boxed::Box<[std::string::String]>;
        let mut _7: std::boxed::Box<[std::string::String; 3]>;
        scope 2 (inlined slice::<impl [String]>::into_vec::<std::alloc::Global>) {
        }
    }

    bb0: {
        StorageLive(_2);
        StorageLive(_3);
        _3 = _1() -> [return: bb1, unwind continue];
    }

    bb1: {
        StorageLive(_4);
        _4 = _1() -> [return: bb2, unwind: bb5];
    }

    bb2: {
        StorageLive(_5);
        _5 = _1() -> [return: bb3, unwind: bb4];
    }

    bb3: {
        _2 = [move _3, move _4, move _5];
        StorageDead(_5);
        StorageDead(_4);
        StorageDead(_3);
        StorageLive(_6);
        StorageLive(_7);
        _7 = Box::<[String; 3]>::new(move _2) -> [return: bb7, unwind continue];
    }

    bb4 (cleanup): {
        drop(_4) -> [return: bb5, unwind terminate(cleanup)];
    }

    bb5 (cleanup): {
        drop(_3) -> [return: bb6, unwind terminate(cleanup)];
    }

    bb6 (cleanup): {
        resume;
    }

    bb7: {
        _6 = move _7 as std::boxed::Box<[std::string::String]> (PointerCoercion(Unsize));
        StorageDead(_7);
        _0 = slice::hack::into_vec::<String, std::alloc::Global>(move _6) -> [return: bb8, unwind continue];
    }

    bb8: {
        StorageDead(_6);
        StorageDead(_2);
        return;
    }
}

Did you intend to write the exact same code twice (with only the function name differing and both using vec!)?

Oops, no, I must've copied the wrong version. Fixed; the second was supposed to be using Vec::from to show the difference. Although whether Box::new not getting MIR inlined makes the difference more or less clear, I'm not sure.

Here's a version that forces more MIR inlining to make it clearer: https://rust.godbolt.org/z/Kf5TEad7Y.

Essential parts, after trimming:

fn vec_macro(_1: fn() -> String) -> Vec<String> {
    bb1: {
        _3 = alloc::alloc::__rust_alloc(const 72_usize, const 8_usize) -> [return: bb2, unwind unreachable];
    }

    bb2: {
        _4 = _3 as usize (Transmute);
        switchInt(move _4) -> [0: bb3, otherwise: bb4];
    }

    bb3: {
        _5 = handle_alloc_error(const Layout {{ size: 72_usize, align: std::ptr::Alignment(std::ptr::alignment::AlignmentEnum::_Align1Shl3) }}) -> unwind continue;
    }

    bb4: {
        _6 = *mut [u8] from (_3, const 72_usize);
        _7 = _6 as *const [u8] (PtrToPtr);
        _8 = NonNull::<[u8]> { pointer: _7 };
        _9 = Result::<NonNull<[u8]>, std::alloc::AllocError>::Ok(move _8);
        _10 = ((_9 as Ok).0: std::ptr::NonNull<[u8]>);
        _11 = (_10.0: *const [u8]);
        _12 = _11 as *mut u8 (PtrToPtr);
        _13 = ShallowInitBox(move _12, [std::string::String; 3]);
        _14 = _1() -> [return: bb5, unwind: bb10];
    }

    bb5: {
        _15 = _1() -> [return: bb6, unwind: bb9];
    }

    bb6: {
        _16 = _1() -> [return: bb7, unwind: bb8];
    }

    bb7: {
        _17 = (((_13.0: std::ptr::Unique<[std::string::String; 3]>).0: std::ptr::NonNull<[std::string::String; 3]>).0: *const [std::string::String; 3]);
        (*_17) = [move _14, move _15, move _16];

vs

fn vec_from(_1: fn() -> String) -> Vec<String> {
    bb0: {
        _2 = _1() -> [return: bb1, unwind continue];
    }

    bb1: {
        _3 = _1() -> [return: bb2, unwind: bb11];
    }

    bb2: {
        _4 = _1() -> [return: bb3, unwind: bb10];
    }

    bb3: {
        _5 = [move _2, move _3, move _4];
        _7 = alloc::alloc::__rust_alloc(const 72_usize, const 8_usize) -> [return: bb5, unwind unreachable];
    }

So you can see the allocate-before vs allocate-after.

The version with allocation first can also be done with a function rather than a macro:

Vec::new_with(|| [f(), f(), f()])

or even:

Vec::new_with([|| f(), || f(), || f()])

Of course for most uses this delayed initialization optimization doesn't matter so those can just use Vec::from.
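
For illustration, a minimal sketch of what the array-of-initializers variant could look like as a free function. `vec_with` is a hypothetical name (no such API exists in the standard library), and this sketch makes no claim about how well it optimizes compared to vec!.

```rust
// Hypothetical helper (not a standard library API): reserve the heap
// buffer first, then run each initializer and store its result.
fn vec_with<T, F, const N: usize>(inits: [F; N]) -> Vec<T>
where
    F: FnOnce() -> T,
{
    let mut v = Vec::with_capacity(N); // reserve space before building any element
    for init in inits {
        v.push(init()); // each element is produced after the reservation
    }
    v
}

fn make() -> String {
    String::from("item")
}

fn main() {
    // All array elements must share one type; repeating the same fn item works.
    let v = vec_with([make, make, make]);
    assert_eq!(v, vec![make(), make(), make()]);
}
```

Since every closure has its own distinct type, an array of different closures needs a common element type (function pointers, for example), which is one practical wrinkle of this form.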