TcpStream always terminates connections successfully (even on panic)

jbe · August 15, 2021, 12:02pm

That is only possible if unwinding is enabled, right? Otherwise the program gets aborted on panic, and there is no drop (if I understand it right).
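
A minimal sketch of that distinction, assuming the default `panic = "unwind"` setting: a hypothetical `ConnGuard` (a stand-in for a connection handle, not a std type) records in its `Drop` impl whether it was dropped normally or during unwinding, using `std::thread::panicking()`. With `panic = "abort"`, the `Drop` impl would not run at all.

```rust
use std::panic;
use std::sync::atomic::{AtomicU8, Ordering};
use std::thread;

// 0 = not dropped, 1 = dropped normally, 2 = dropped while unwinding from a panic.
static LAST_DROP: AtomicU8 = AtomicU8::new(0);

/// Hypothetical stand-in for a connection handle; not a std type.
struct ConnGuard;

impl Drop for ConnGuard {
    fn drop(&mut self) {
        // `thread::panicking()` is true while this thread unwinds from a panic.
        let how = if thread::panicking() { 2 } else { 1 };
        LAST_DROP.store(how, Ordering::SeqCst);
    }
}

/// Runs `f` with a live guard and reports how the guard was dropped.
fn drop_kind(f: fn()) -> u8 {
    LAST_DROP.store(0, Ordering::SeqCst);
    let _ = panic::catch_unwind(|| {
        let _guard = ConnGuard;
        f();
    });
    LAST_DROP.load(Ordering::SeqCst)
}

fn main() {
    println!("normal return: {}", drop_kind(|| {}));
    println!("panic: {}", drop_kind(|| {
        panic!("simulated failure");
    }));
}
```

A wrapper around a socket could use the same check to pick between an abort and a graceful close, but only when unwinding is enabled.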

That's why I suggested setting the linger timeout to 0 right after creating the socket (which is what I'd do in C programs) and disabling SO_LINGER again right before a successful close. However, if Rust's standard library did this automatically, it could cause confusion when other libraries work on the underlying sockets, or make things more difficult when manual control over socket options is desired.

In this context, also keep in mind that there are other ways a program could be terminated (e.g. through std::process::exit or a SIGKILL on POSIX systems).

I like the behavior, and I think it's a good thing to do.

For the sake of completeness, however, I want to point out that there might be use-cases where it's better to have the operating system flush out all data properly, even on a panic, e.g. some important data reporting where it's better to risk retrieving partial data (with a wrong EOF) than risking that some data might not be sent out due to a (later) panic that causes a connection abort. That is because flushing may be "undone" by a later abort (e.g. if the RST packet arrives before the flushed data arrives).

ShadowJonathan · August 15, 2021, 12:05pm

Right! Thanks for pointing that out, I forgot that bit.

pdolezal · August 16, 2021, 8:05am

I like Variant 2 more, but it might be just because I don't see breaking the existing behavior as a big deal (considering that the current behavior does not seem completely right to me). Perhaps, to mitigate the behavior change, there might be some methods to tune the options of the connection, including how it should behave on drop (or panic), while the defaults would ensure the current behavior.

Anyway, I definitely support a kind of close method for explicit, graceful, and possibly fallible termination of a resource. In cases like this, it would make it possible to confirm that the connection was closed gracefully and without error. I found a use case for such a method in a slightly different context.
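
As a rough sketch of what such a method could look like using only what std offers today: `close_gracefully` is a hypothetical free function (not a std API) that flushes, half-closes the write side with `shutdown(Shutdown::Write)`, and then reads until the peer's EOF, returning any error along the way. Reaching EOF only shows that the peer closed its side, not that the peer's application processed the data, and the read blocks without a timeout here.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Hypothetical helper (not part of std): flush, send FIN, wait for the peer's EOF.
/// Returns the number of bytes the peer still sent after our shutdown.
fn close_gracefully(mut stream: TcpStream) -> io::Result<usize> {
    stream.flush()?;
    // Half-close: signals the end of our data while keeping the read side open.
    stream.shutdown(Shutdown::Write)?;
    // Drain until the peer closes its side; an error (e.g. a reset) is reported.
    let mut rest = Vec::new();
    stream.read_to_end(&mut rest)
}

/// Loopback demo: the peer reads to EOF, then answers with "ok" and closes.
fn demo() -> io::Result<(Vec<u8>, usize)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let peer = thread::spawn(move || -> io::Result<Vec<u8>> {
        let (mut conn, _) = listener.accept()?;
        let mut received = Vec::new();
        conn.read_to_end(&mut received)?; // returns once our FIN arrives
        conn.write_all(b"ok")?;
        Ok(received)
    });
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(b"report")?;
    let trailing = close_gracefully(stream)?;
    let received = peer.join().expect("peer thread panicked")?;
    Ok((received, trailing))
}

fn main() -> io::Result<()> {
    let (received, trailing) = demo()?;
    println!("peer got {} bytes, sent {} bytes back", received.len(), trailing);
    Ok(())
}
```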

jbe · August 16, 2021, 10:18am

Maybe you are right, and it's a bad idea to make a (worse) choice here just to preserve backward compatibility. I feel like Variant 2 is the cleanest and most straightforward in a way.

However, the other issue with Variant 2 is that if a programmer forgets to properly close a connection, this could introduce subtle bugs to a program: A RST would be sent out, which many other programs might ignore (or just log an error message about a connection reset). In most cases, all data might have been processed already by the peer, but in some cases, the RST could invalidate previously sent data (depending on timing).

This problem stems from the fact that the difference between "abort" (RST) and "close" (FIN) is not just about sending a status, but also about how the operating system handles flushing or discarding the network buffers.

I'm not sure if that's a huge issue (or if you can expect a programmer to never forget the close). Before I was aware of the details of how TCP shuts down sessions, it caused me a huge headache when I once had a program that accidentally terminated connections with a TCP RST. It was hard to debug because the data appeared to arrive properly in some cases (and socat or netcat don't tell you how the connection got terminated).

I think such a method is important anyway, due to what I wrote here:

However, I believe needing a graceful shutdown on panic is more an exceptional case. In most cases, you'd want a panic to cause a connection abort. Maybe it should be configurable what happens on panic (or process termination), or drop.

During a transition period, defaults could be to match current behavior of the standard lib (i.e. not touching socket options at all on the OS level). (That is similar to Variant 3, but with some sugar to allow automatically (re-)setting a previously stored SO_LINGER value on non-unwinding dropping, if needed.)

Later, the default could be changed to cause a connection abort on panic, but a graceful close on drop. (According to Variant 1, but with the option to enable graceful shutdown even on panic, if needed.)

Even further in the future (if desired), the default behavior for drop could also be changed to abort the connection – but I'm not sure if that's wise, as explained above. (This is like Variant 2 but also with the option to configure behavior, if needed.)

What do you think?

jbe · August 18, 2021, 12:34pm

After working some more with TcpStream (and UnixStream), I came across more problems in real-life usage of these interfaces:

Using threads and std::io, it is not easily possible to make threads abort after a certain timeout (which is required for network applications). In order to do this, I had to implement my own wrapper, which calculates a timeout for each read or write operation.
It is not easily possible to split up the stream into a reading and a writing half that can be owned (Tokio provides a method called into_split for that, but I don't see anything like that in the standard library for TcpStream or UnixStream) other than cloning the socket using try_clone, which imposes overhead and may fail.
The standard library doesn't provide any abstraction layer for what TcpStream and UnixStream (or any future stream types) have in common. Thus TcpStream and UnixStream each have their own set_read_timeout methods. I cannot even write a function that accepts some sort of "stream" on which I can set timeouts for single read and write operations, as there is no corresponding trait (like "ReadWithTimeout" or "ReadWithDeadline" or similar) that would allow me to write a generic function.
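
To illustrate the first and third points together, here is a sketch under stated assumptions: `SetReadTimeout` is a hypothetical trait (nothing like it exists in std) that forwards to the existing `set_read_timeout` methods, and `read_exact_by` is a hypothetical generic function that recomputes the remaining time before every read so the whole operation respects one deadline. It is not the wrapper mentioned above, only one possible shape for it.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::time::{Duration, Instant};

/// Hypothetical trait (not in std) abstracting over streams with a read timeout.
trait SetReadTimeout {
    fn set_read_timeout(&self, dur: Option<Duration>) -> io::Result<()>;
}

impl SetReadTimeout for TcpStream {
    fn set_read_timeout(&self, dur: Option<Duration>) -> io::Result<()> {
        TcpStream::set_read_timeout(self, dur) // inherent std method
    }
}

#[cfg(unix)]
impl SetReadTimeout for std::os::unix::net::UnixStream {
    fn set_read_timeout(&self, dur: Option<Duration>) -> io::Result<()> {
        std::os::unix::net::UnixStream::set_read_timeout(self, dur)
    }
}

/// Fills `buf` completely or fails with `TimedOut` once `deadline` has passed.
fn read_exact_by<S: Read + SetReadTimeout>(
    stream: &mut S,
    buf: &mut [u8],
    deadline: Instant,
) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        // Recompute the remaining budget before every read call.
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            return Err(io::ErrorKind::TimedOut.into());
        }
        stream.set_read_timeout(Some(remaining))?;
        match stream.read(&mut buf[filled..]) {
            Ok(0) => return Err(io::ErrorKind::UnexpectedEof.into()),
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => {}
            // Timeouts surface as WouldBlock on Unix and TimedOut on Windows.
            Err(e) if matches!(e.kind(), io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut) => {
                return Err(io::ErrorKind::TimedOut.into());
            }
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

/// Loopback demo: the first read succeeds, the second one runs into its deadline.
fn demo() -> io::Result<(Vec<u8>, io::ErrorKind)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut client = TcpStream::connect(listener.local_addr()?)?;
    let (mut server, _) = listener.accept()?;
    server.write_all(b"ping")?;

    let mut buf = [0u8; 4];
    read_exact_by(&mut client, &mut buf, Instant::now() + Duration::from_secs(2))?;
    // Nothing more is sent while `server` stays open, so this read times out.
    let mut more = [0u8; 1];
    let err = read_exact_by(&mut client, &mut more, Instant::now() + Duration::from_millis(100))
        .expect_err("no further data was sent");
    Ok((buf.to_vec(), err.kind()))
}

fn main() -> io::Result<()> {
    let (data, kind) = demo()?;
    println!("read {:?}, then failed with {:?}", data, kind);
    Ok(())
}
```

A matching write-side trait would follow the same pattern with `set_write_timeout`.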

In conclusion, I have to say that I can work with Rust's standard library to write a network application – but it is a hassle.

I know these issues reach beyond the original topic of this post, but if we consider improving TcpStream, then these other issues should be kept in mind as well. Maybe std::io needs to be overhauled in general?

bjorn3 · August 18, 2021, 7:36pm

Read and Write are implemented for &TcpStream so you could wrap TcpStream in an Arc to send to two threads and then call read or write on &*tcp_stream.
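
A small loopback sketch of this suggestion (the echo peer and the `demo` function are made up for illustration): one `Arc<TcpStream>` is shared by a writer thread and a reader thread, and both go through the `Read`/`Write` impls for `&TcpStream`.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::Arc;
use std::thread;

/// Echo round trip over loopback with one shared socket and two threads.
fn demo() -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    // Peer: echoes exactly five bytes back.
    let echo = thread::spawn(move || -> io::Result<()> {
        let (mut conn, _) = listener.accept()?;
        let mut buf = [0u8; 5];
        conn.read_exact(&mut buf)?;
        conn.write_all(&buf)
    });

    let stream = Arc::new(TcpStream::connect(addr)?);
    let for_writer = Arc::clone(&stream);
    let writer = thread::spawn(move || -> io::Result<()> {
        // `&TcpStream` implements `Write`, so no `&mut TcpStream` is needed.
        (&*for_writer).write_all(b"hello")
    });
    let reader = thread::spawn(move || -> io::Result<Vec<u8>> {
        let mut buf = [0u8; 5];
        // `&TcpStream` implements `Read` as well.
        (&*stream).read_exact(&mut buf)?;
        Ok(buf.to_vec())
    });

    writer.join().expect("writer thread panicked")?;
    echo.join().expect("echo thread panicked")?;
    reader.join().expect("reader thread panicked")
}

fn main() -> io::Result<()> {
    let echoed = demo()?;
    println!("echoed: {}", String::from_utf8_lossy(&echoed));
    Ok(())
}
```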

jbe · August 18, 2021, 11:19pm

Yes, I know. Sorry for not being more specific here; my issue was related to the other two points I made.

The problem here is that the Arc won't implement Read or Write directly, which is why I cannot use it in a generic struct over <R: Read, W: Write> that stores types R and W.
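
The boilerplate in question might look roughly like this. `SharedStream` is a hypothetical newtype that forwards to the `&TcpStream` impls, and `Session` is only a stand-in for the generic struct described above, not the actual code.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::Arc;

/// Hypothetical newtype: an owned, cloneable handle that forwards to `&TcpStream`.
#[derive(Clone)]
struct SharedStream(Arc<TcpStream>);

impl Read for SharedStream {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        (&*self.0).read(buf)
    }
}

impl Write for SharedStream {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        (&*self.0).write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        (&*self.0).flush()
    }
}

/// Stand-in for a generic struct that stores an owned reader and writer.
struct Session<R: Read, W: Write> {
    reader: R,
    writer: W,
}

/// Loopback demo: both halves of the session use the same underlying socket.
fn demo() -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let client = TcpStream::connect(listener.local_addr()?)?;
    let (mut server, _) = listener.accept()?;

    let shared = SharedStream(Arc::new(client));
    let mut session = Session { reader: shared.clone(), writer: shared };

    session.writer.write_all(b"ping")?;
    let mut buf = [0u8; 4];
    server.read_exact(&mut buf)?;
    server.write_all(b"pong")?;
    session.reader.read_exact(&mut buf)?;
    Ok(buf.to_vec())
}

fn main() -> io::Result<()> {
    let reply = demo()?;
    println!("reply: {}", String::from_utf8_lossy(&reply));
    Ok(())
}
```

The same forwarding impls would have to be repeated for UnixStream, which is the loss of generality mentioned above.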

See the following two threads:

I found a solution for all my issues, like I said:

…but the solution isn't beautiful (either loss of abstraction/generalization or needing boilerplate code).

