Can you have must_use on an integer in the return, or would it need to be wrapped?
A bug made it to production in our Rust code at my work, caused by using read
instead of read_exact
. It was especially tricky because the method was called on a BufReader
, and it worked fine except when the read
happened to cross the end of the BufReader
’s internal buffer. So the bug was triggered very rarely, and wasn’t caught in our automated or manual testing.
(The code originally called read
directly on a File
. Since it was reading only two bytes into a [u8; 2]
buffer, it might even have been a semi-reasonable assumption that it would never result in a partial read. But when we later wrapped the File
in a BufReader
to improve performance, it violated this assumption which was buried elsewhere in the code.)
I was also thinking recently that read
and write
should have less-innocuous names. And we should use read_exact
and write_all
in docs wherever possible.
When sample code has to use read
or write
, it should always show how to use the return value. I fixed some documentation a while ago, but it looks like I missed some, like this one.
Currently there's no way to add must_use to the usize
it returns. #[must_use]
on a function applies to the return type as a whole (so here the Result
), and #[must_use]
on usize
itself would wreck havoc.
However, somehow adding a must-use-like warning in the compiler about the incomplete length read/written would be helpful here.
Minor correction: The Read
method is named read_exact
, not read_all
.
My experience is exactly the opposite; I find that sockets, pipes, stdio, and types like BufReader are much more likely to return partial reads. For example, if you are trying to read a four-byte value from a socket, but its bytes have been split across a packet boundary.
In contrast, at least on the platforms and filesystems that I target, I've never seen File::read
or File::write
fail to process the maximum number of bytes. So I'd say this problem is mostly about non-File types.
I agree that it's a footgun. It would be nice to have a clippy lint that detects cases where the count returned by read
/write
is discarded. Renaming and deprecating an std
method is not fast, and adding a clippy lint would be immediately benefitial. I'm not sure how easy it is, though: in addition to simple cases like io.write(slice)?;
, you can also write if let Err(err) = io.write(slice)
or match against Ok(_)
, etc.
I think the "correct" thing to do here would be to add more #[must_use]
(and the required plumbing to make it work), so it'd be something like
fn read(&mut self, but: &mut [u8]) -> Result<#[must_use] usize>;
Or, without adding a new location for #[must_use]
, make
#[must_use]
fn read(&mut self, but: &mut [u8]) -> Result<usize>;
work such that the "must use" is applied "through" the Result
as it's already #[must_use]
.
Of course, clippy can probably add this lint easier than rustc proper.
Right. I was a bit confused, and was thinking of read_to_end
. My point was just that there's no such thing as reading to the end of many types, but I didn't understand it was about reading to the end of the buffer.
I think clippy already has this lint: https://rust-lang.github.io/rust-clippy/master/#unused_io_amount
It's great that this lint exists, but I somehow managed to make this mistake and haven't seen this lint (I'm probably not running Clippy often enough).
Could this be helped further? e.g. can this lint be added as a warning in rustc
itself?
Is changing cargo to run clippy by default, at least for newly created projects, out of the question?
If there's a desire to run clippy by default, we might as well just uplift specific lints into the compiler proper (which I think we might want to do for some of the uncontroversial lints).
A compounding problem is that the docs don't help either. For example, when std::net::TcpStream
implements std::io::Read
, it does so without documentation. So when rustdoc renders the docs, it will use the generic documentation from the Read
trait (which obviously don't include any specific info on TcpStream).
(Also, in current rustdoc methods like read()
and write()
are tucked away under a collapsed "hidden undocumented" section, which is how rustdoc signals "you should already know the method names in this trait" and what they do)
I know that your main comment was on the method names, but do you think that mistakes could be reduced if we had some better documentation? (I think we might also need rustdoc enhancements to allow trait implements to augment the method docs, instead of simply replacing them, which is their only option today)
If it was literally just running Clippy, then I wouldn't like that, because there are Clippy lints that are purely about idiomatic style, or approximate checks that may have false positives, so it would make a worse signal/noise ratio of the compiler warnings.
The way I understand it is that rustc
warnings are about serious issues and warnings that are problems with high probability. Warnings about issues that are too subjective or too imprecise are delegated to Clippy.
In this case misused read()
/write()
call is not merely non-idiomatic code, but most certainly a bug, so it could be graduated from being a lint to a warning.
There also seems to be a small parallel with the recent #[must_bind]
thread. In that thread, it was observed that not binding a guard is often wrong in subtle ways (and #[must_use]
doesn't really help). In this thread, the #[must_use]
on the return from Read::read()
also doesn't really help, because it's often wrong to ignore the usize within the Ok
variant.
Another thing that read
and write
don't do that other methods do is handle ErrorKind::Interrupted
, which you'll probably never encounter until you do.
I'm pretty sure naming comes from libc's read/write so I don't really see problem there. Unless you're concerned about lazy users who doesn't read docs I do not see anything poor with this naming. Renaming methods just for the sake of it is pretty much pointless effort.
io traits should provide comprehensive documentation for user to understand each part of interface
To be completely fair to the users here: many uses of the Read
/Write
traits are done from concrete types, which on their documentation page just show that they implement Read::read
/Write::write
without any documentation. (Because the documentation is provided only on the trait page and not the concrete type's page.)
I agree that it would be quite beneficial for any "root" Read
/Write
type to provide specific documentation on these methods for whether and when they can provide partial reads/writes. I don't think a method rename is needed (or really all that desirable) here, and this can be handled by targeted docs on the impls.
In my case it's not laziness or not reading the docs. I'm fully aware of these methods and the pitfall, and yet from time to time I forget and make this mistake, because nice and simple read
/write
names sound like what I want to do. They sound correct. They don't stand out in the code as doing something else than "simply" read/write.
I've mentioned earlier that I suspect it's a homage to libc::read
/write
, but I've never used them. For networking I always try to use anything higher-level (exactly because there are so many sharp edges), and for files there's fread
/fwrite
(not fread_exact
/fwrite_all
).
I wanted to say that other languages don't have that footgun, but Python actually does. It even has examples in the documentation that don't check if the data was written! OTOH in Node.js write()
always writes everything.
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.