struct Worker {
    person: Person,
    job: Job,
}
struct Person {
    first_name: String,
    last_name: String,
    dob: DateTime,
    address_line_1: String,
    address_line_2: String,
    address_line_3: String,
    // lots more fields here
}
struct Job {
    // similarly to person, lots of fields
}
You might think that the size of Worker is small, but when you drill down it's actually very big. Having this information might change design considerations, like when to Box. Could rustdoc give (or estimate) the size of any data structures it documents?
In this case it would be fairly easy to drill down and see, but sometimes there are many levels of complex types, some of them generic, and it becomes a bit harder to tell.
The main problem I see is that size information is usually private, in the semver sense. I think the only thing we should expose, if any, is "this struct fits in a cache line on whatever architectures", since that's approximately the main consideration for boxing.
As for the generics consideration, I think that "minimum size of T to fit in a cache line" is probably what you want?
As I understand it, semver relates to how the machine sees our code and reasons about compatibility. Exposing a size hint in rustdoc seems like a very different thing with probably different tradeoffs.
I do see how differences in padding and type sizes across targets might make this harder, but overall I think it would be useful for rustdoc to maybe float this information somehow (probably more than a boolean to reflect whether it fits in a cache line, but not quite an exact number of bytes?).
It would only ever be a best estimate. It would be different on different architectures, and between different versions of the compiler. There would be no promises made about the accuracy - it's just a hint.
What it does do is act as a hint on where to profile - if a struct is big, profiling with and without boxing is probably worth doing. If it's smaller than a pointer, probably not.
You can use std::mem::size_of to find the size of a struct in bytes, but as others have noted, don't rely on this staying the same across different compilers and architectures.
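To make that concrete, here is a minimal sketch of checking sizes by hand. The field lists are cut-down stand-ins for the types in the original post (the `dob: DateTime` field is left out because `DateTime` is not a std type), and `BoxedWorker` is a hypothetical variant added only for comparison. The printed numbers depend on target and compiler version, so treat them as a hint rather than a guarantee.

```rust
use std::mem::size_of;

// Cut-down, hypothetical stand-ins for the types in the original post.
#[allow(dead_code)]
struct Person {
    first_name: String,
    last_name: String,
    address_line_1: String,
    address_line_2: String,
    address_line_3: String,
}

#[allow(dead_code)]
struct Job {
    title: String,
    employer: String,
}

// Stores both parts inline, so its size is at least the sum of its fields.
#[allow(dead_code)]
struct Worker {
    person: Person,
    job: Job,
}

// Hypothetical alternative: the same data behind two pointers.
#[allow(dead_code)]
struct BoxedWorker {
    person: Box<Person>,
    job: Box<Job>,
}

fn main() {
    // Sizes are in bytes and vary by target and compiler version.
    println!("Person:      {} bytes", size_of::<Person>());
    println!("Job:         {} bytes", size_of::<Job>());
    println!("Worker:      {} bytes", size_of::<Worker>());
    println!("BoxedWorker: {} bytes", size_of::<BoxedWorker>());
}
```

Whether the boxed layout is actually faster is a separate question that only profiling can answer; the sizes just tell you where it might be worth looking.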
If the size of a type matters, then you should probably check the size of your types anyway. How will generic types be handled? In general you can't know the size of a generic type before monomorphization.
There's already a similar lint in Clippy; I argue that if anything, this should be a Clippy lint which is maybe allow-by-default or warn-by-default. So instead of printing the size of every single data structure, there could exist a warning for extremely large structures which might need to be broken up into parts, boxed, etc.
Exactly, and in addition, if something is exposed, people will rely on it, no matter how many flashing red warnings saying "THIS IS ONLY AN ESTIMATE AND ALWAYS SUBJECT TO CHANGE" there might be.
The (Nightly) compiler can already do this, via the -Zprint-type-sizes option. It's very effective; I've used it myself on multiple occasions. See this blog post for details.
(That doesn't involve rustdoc, which means the sizes don't appear in documentation, so I'm not sure if it meets your requirements. If not, at least the machinery is already in place within the compiler, and presumably could be hooked up to rustdoc with some effort...)
Please don't "throw ideas up". The language has more ideas than it can deal with, and there is a very real cost to every addition, and even to the evaluation of ideas. If there isn't a big real need for something, forget it.
I disagree. This is not an RFC or pre-RFC; it's just a thread in the internals forum, which has plenty of capacity for ideas. It's one thing if someone, say, spams the forum with a dozen half-baked ideas over the course of a month, but this OP hasn't done that.
It's not a bad idea, either... especially since it's not proposing a core language feature or anything that would be subject to stability guarantees, just an implementation feature, and one that would be relatively easy to implement.
The docs currently allow trivially looking at the implementation of any type, which is the strongest statement that can be made in terms of stability. Stating the size of a type on a given architecture couldn't possibly suggest a higher guarantee in terms of things not changing compared to viewing the source.
I have a slightly different concern: could it be misleading? Specifically:
struct Indirect(Box<SomethingReallyHuge>);
This would show a small number, so one could go and create a Vec<Indirect> with a lot of elements and be surprised at how these 8-byte structures ate all the RAM.
So maybe having a (collapsed by default) size analysis that would say 8 bytes inline, but some arbitrary amount on the heap?
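A rough sketch of that inline-versus-heap split, under some loud assumptions: `SomethingReallyHuge` is given a made-up 4096-byte payload, and `heap_bytes_per_indirect` and `total_bytes` are hypothetical helpers that ignore allocator overhead. It only works here because the boxed type has a fixed size; it is not something rustdoc does today.

```rust
use std::mem::size_of;

// Hypothetical payload size, chosen only for illustration.
#[allow(dead_code)]
struct SomethingReallyHuge {
    payload: [u8; 4096],
}

// Inline, this is just one pointer.
#[allow(dead_code)]
struct Indirect(Box<SomethingReallyHuge>);

// Heap bytes owned by one `Indirect`, ignoring allocator overhead.
fn heap_bytes_per_indirect() -> usize {
    size_of::<SomethingReallyHuge>()
}

// (inline bytes, heap bytes) for a `Vec<Indirect>` holding `len` elements.
// Spare capacity in the Vec is not counted.
fn total_bytes(len: usize) -> (usize, usize) {
    (len * size_of::<Indirect>(), len * heap_bytes_per_indirect())
}

fn main() {
    let (inline, heap) = total_bytes(1_000_000);
    println!("inline: {} bytes, heap: {} bytes", inline, heap);
}
```

For a fixed-size boxed payload the heap part is at least computable. For something like a Vec or String field it depends entirely on runtime contents, so any number shown would be a guess.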
If you start counting indirect memory usage, it gets hairy. For Vec usage there is an extremely odd distribution: most are empty, some are huge. With a type like Vec<Vec<u8>> you don't know if it typically costs nothing or takes 90% of your RAM.
That's actually what I was trying to say. If I were rustdoc, I wouldn't dare to claim this type is small. Its stack representation is small, but that is misleading, as that's only half of the message. The best/most accurate answer I could give would be something like:
It's worth noting that Clippy already has a warning about surprising enum size (e.g. if you have enum (u32, [u8; 1000])).
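For anyone who hasn't run into it, a small sketch of the enum case. The type and variant names are made up; as far as I know the Clippy lint in question is `large_enum_variant`, and boxing the large variant is the usual fix it suggests.

```rust
use std::mem::size_of;

// Every value of this enum is as big as its largest variant,
// even when it only holds the u32.
#[allow(dead_code)]
enum Unboxed {
    Small(u32),
    Large([u8; 1000]),
}

// Boxing the large variant keeps the enum itself small;
// the 1000 bytes move to the heap and only exist for `Large` values.
#[allow(dead_code)]
enum Boxed {
    Small(u32),
    Large(Box<[u8; 1000]>),
}

fn main() {
    // Exact values depend on target and compiler version.
    println!("Unboxed: {} bytes", size_of::<Unboxed>());
    println!("Boxed:   {} bytes", size_of::<Boxed>());
}
```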
If the goal is to warn about excessive stack usage, or maybe too much copying for return types, such things can be added to Clippy. That'd work better than checking docs manually type by type.