Idiomatic generating function for arrays

rambip · July 25, 2021, 1:52pm

The core question I want to ask is well presented by this stackoverflow issue:

From what I understand, they are really 2 safe ways to create an array (or [T; N]) in rust.

Either you create it with an element of a type that implement the copy trait
Or you list explicitely what will be in the array, like a tuple.

Let's say I want to initialize a big array (let's say 50000 elements) with even values in boxes.

The standard way would be:

const N: usize = 50000
evens = {
    let mut temp = [Box::Default(); N];
    for i in 0..N {
    temp[i] = Box::new(i*2);
    }
    temp
}

This is a pretty readable code, but I think you can do better:

First, you have to initialize the array, and in this case you have to allocate N boxes on the heap to replace them just after that. The initialization step is useless !
Second, I used mutability, but you don't have to: what you really want to do is generate the array from a function FnMut(usize) -> T on N indices.

So here is I think the most elegant solution currently:

impl<T, F, const N: usize> From<F> for [T; N] where F: FnMut(usize) -> T {
    fn from(f: F) -> [T; N] {
        unsafe {
            let mut result: [T; N] = std::mem::MaybeUninit::uninit().assume_init();
            for i in 0..N {
                result[i] = f(i);
            }
            result
        }
    }
}

fn main() {
    let array_of_evens : [usize, 50000]
                       = (|x| Box::new(x*2)).into()
}

Could that be implemented in the standard libs ? I think the compiler could achieve quite good optimizations around this way of generating arrays.

hyeonu · July 25, 2021, 2:01pm

This is UB. Based on the reference, producing invalid value itself is UB and uninitialized memory outside of the union and struct paddings are invalid. And worse, this code drops the uninitialized value on assignment.

Easiest workaround currently possible is to utilize arrayvec crate which provides array-based container which can be made with .collect() and be converted into the inner array if full.

rambip · July 25, 2021, 2:03pm

Since the values are initialized directly after, without needing to read values in the array, why would that be an issue ?

My point is that this could make sense if the compiler initialized the array for me from the function I gave to it.

mbrubeck · July 25, 2021, 2:07pm

The array-init crate provides a version of this function, and some variations. It has 27 reverse dependencies and over 100,000 recent downloads, which I think is decent evidence that this feature is widely needed (especially since this is just one of a few similar crates).

I think it would be good to add this to the standard library for discoverability, especially considering the pitfalls of people trying to implement it from scratch (as demonstrated in this thread!).

Assigning to the value will cause the destructor to read the previous (uninitialized) value. The MaybeUnit docs show the correct way to do this, using [MaybeUnit<T>; N].

rambip · July 25, 2021, 2:14pm

Thanks !

The zip and map functions where already quite good abstractions around arrays, but I feel like it is the missing piece of the puzzle (for what I wanted to do anyway)

SkiFire13 · July 25, 2021, 2:26pm

Also note that [T; N]::map will be stable in rust 1.55, so you'll be able to do [(); N].map(|_| /* code here */) to initialize an array. Even though this is a weird hack, it should be enough for infallible array creation.

rambip · July 25, 2021, 2:43pm

I still have to manually manage the indices: an array filled with () does not have information about the indices, it's not like a range.

SkiFire13 · July 25, 2021, 2:48pm

That's why I said it's a weird hack. However it's much better than the two ways you linked above: it avoids both having to allocate twice and having to use unsafe (with the linked UB potential)

kornel · July 25, 2021, 3:01pm

If T has a destructor then result[i] = will run it on uninitialized data, which is very crashy.

Apart from that, arrays and slices are defined to only have initialized data. Invalid assume_init is UB by definition. It feeds self-contradictory data to the optimizer. The optimizer can then make invalid assumptions and compile bogus code (e.g. even delete the whole function, because UB is defined to never happen, so if it happens, then it's allowed to assume this code never runs).

rambip · July 25, 2021, 3:04pm

A slightly more complicated function that someone could need is map but with the index:

With an iterator you could achieve that easily with iterator.enumarate().map(|(i, v)| /*...*/).

With the array_init you could do it (result = array_init(|i| /* something with i and old_array[i])*/))

But it will not work if old_array is a non-copy array ...

I'm trying to find a small reproducible exemple. Is there a way to convert an array to an array of references ?

mbrubeck · July 25, 2021, 3:35pm

The only way to get an index when using [T; N]::map is to manage it "by hand":

let mut i = 0;
let array = [(); N].map(|_| {
    let val = do_something_with(i);
    i += 1;
    val 
});

rambip · July 25, 2021, 4:59pm

Since map and zip on arrays are still unstable, how can I implement them with MaybeUninit ?

kornel · July 25, 2021, 5:44pm

Use the arrayvec crate.

use arrayvec::ArrayVec;
use std::convert::TryInto;

const N: usize = 10;

fn main() {
    let arrvec = (0..N).map(|x| x * 2).collect::<ArrayVec<usize, N>>();
    let arr: [usize; N] = arrvec[..].try_into().unwrap();
}

usbalbin · July 29, 2021, 10:31pm

Since map and zip on arrays are still unstable, how can I implement them with MaybeUninit ?

This was once how unstable [T; N]::map was implemented(has be refactored a bit since). However that is far from as safe and simple as using something like array-init.

A slightly more complicated function that someone could need is map but with the index [...] I'm trying to find a small reproducible exemple. Is there a way to convert an array to an array of references ?

If you want to use some of the Iterator methods but on arrays, then there is iter_fixed. The limitation there is that main type IteratorFixed only implements the subset of Iterator's methods that affect the length in a const-predictable way.

struct NotCopy(i32);

let values = [NotCopy(0), NotCopy(1), NotCopy(2)];
let res: [(usize, &NotCopy); 3] = (&values)
    .into_iter_fixed()
    .enumerate()
    .map(|(i, v)| (2 * i, v))
    .collect();
assert_eq!(res, [(0, &NotCopy(0)), (2, &NotCopy(1)), (4, &NotCopy(2))]);

system · October 27, 2021, 10:32pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Rust needs a safe abstraction over uninitialized memory language design	80	3843	March 21, 2025
Impl From array for array: `impl From<[T; N]> for [U; N]` language design	4	533	January 2, 2024
`<[T]>::as_uninit_mut` libs	6	1023	February 28, 2022
Add Runtime Construction Functions for [T;N] after Const Generics language design	10	1139	February 24, 2021
Collecting iterators into arrays	20	13462	September 12, 2019

Idiomatic generating function for arrays

Related topics