TcpStream always terminates connections successfully (even on panic)

This is (to me at least) a novel use of RST.

If I understand it correctly, you want the peer to see an ECONNRESET error rather than EPIPE if the socket was discarded as part of an unwind.

How would the peer distinguish this from other cases of ECONNRESET? How useful would that be in practice? I think it would be, at best, a hint of a possible malfunction, but because it is unreliable, application-layer protocols would still need another mechanism to decide whether a particular request was responded to. If such a mechanism exists, however, any information conveyed by how the connection is terminated seems redundant.

I'd love to better understand the larger picture/background behind this idea, or who else is doing this.

I think the network behavior on panic should be the same as if you killed the process using the most abrupt process killing interface (i.e. SIGKILL on Linux, TerminateProcess on Windows). I'm not sure what that behavior is though.

On Linux (POSIX) it closes the file descriptor as if close(2) had been called. Read the specification:

  • All of the file descriptors, directory streams, conversion descriptors, and message catalog descriptors open in the calling process shall be closed.

It is important that the consequences of process termination as described occur regardless of whether the process called _exit() (perhaps indirectly through exit()) or instead was terminated due to a signal or for some other reason.

I do not believe this is correct. An RST segment is generated only if closing the socket immediately would result in a TCP data loss event. Whether this is the case depends on how quickly data has been sent over the network and acknowledged by the peer.

As far as I know, there is no reliable way in the BSD sockets API to generate an RST segment for an established connection.

I checked some old RFC (RFC 793), and in chapter 3 (Functional Specification), section 3.8 (Interfaces), it suggests in subsection "User/TCP Interface" two different ways to terminate a connection:

"Close: This command causes the connection specified to be closed. […] Closing connections is intended to be a graceful operation in the sense that outstanding SENDs will be transmitted (and retransmitted), as flow control permits, until all have been serviced. […]"

"Abort: This command causes all pending SENDs and RECEIVES to be aborted, the TCB to be removed, and a special RESET message to be sent to the TCP on the other side of the connection. Depending on the implementation, users may receive abort indications for each outstanding SEND or RECEIVE, or may simply receive an ABORT-acknowledgment."

Thus (at least considering this old RFC), it was intended to provide users of the TCP stack the ability to abort a connection and send a "special reset message" to the other side of the connection.

An EPIPE would only be returned when sending to a peer which reset the connection. More important to me is that a reader does not receive an EOF but an ECONNRESET (or, depending on the particular OS implementation, any other error, as long as it is an error and not an EOF).

It cannot be distinguished.

You instantly know when a response was interrupted due to an error. The operating system won't try to flush out any fragments of an already broken message. The remote peer can distinguish successful EOFs (e.g. due to half-close or full-close) from unhandled errors (e.g. panics in Rust).

Operating systems won't report a normal EOF when receiving a TCP RST. Thus receiving a TCP RST is a clear sign that something went wrong. However, you are right about the opposite case: receiving a TCP FIN isn't a clear sign that everything went okay (which is why I think it's bad practice to "close" a connection on a panic rather than "abort" it, using RFC 793's phrasing).
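To make the receiver-side distinction concrete, here is a minimal Rust sketch using only std. `StreamEnd` and `read_until_end` are made-up names for illustration, and the sketch assumes the OS surfaces a received RST as `ErrorKind::ConnectionReset`, which may not hold on every platform.

```rust
use std::io::{self, ErrorKind, Read};

/// How the peer's side of the stream ended (illustrative names).
#[derive(Debug, PartialEq)]
enum StreamEnd {
    /// `read` returned `Ok(0)`: the peer sent FIN ("close" in RFC 793 terms).
    Eof,
    /// `read` failed with `ConnectionReset`: the peer sent RST ("abort").
    Reset,
}

/// Reads until the stream ends and reports how it ended, together with
/// everything that was received before that point.
fn read_until_end<R: Read>(mut src: R) -> io::Result<(Vec<u8>, StreamEnd)> {
    let mut data = Vec::new();
    let mut buf = [0u8; 4096];
    loop {
        match src.read(&mut buf) {
            Ok(0) => return Ok((data, StreamEnd::Eof)),
            Ok(n) => data.extend_from_slice(&buf[..n]),
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) if e.kind() == ErrorKind::ConnectionReset => {
                return Ok((data, StreamEnd::Reset))
            }
            Err(e) => return Err(e),
        }
    }
}
```

A caller would treat `StreamEnd::Reset` as "the response is definitely incomplete", while `StreamEnd::Eof` alone proves nothing unless the peer is known to abort on failure.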

You are right about the redundancy.

If (some) programs do "close" instead of an "abort" on error, then we need this redundant information and cannot rely on having received a "successful" EOF on the TCP layer.

Edit to clarify: What I meant is, if there are programs out there, which "close" a connection even on error, then we need additional mechanisms to validate that a response is complete. Then, the "reset information" is redundant. Yet it can make sense to abort the connection instead of closing it (as it seems semantically more correct and can avoid unnecessary data processing, as explained in the next paragraph).

There are also many other cases in which this information is available redundantly (e.g. through a CRC). But I don't think that's a good reason to keep things as they are, because in some application contexts the message might not contain a CRC or a "successful termination" string. To name just one example, consider HTTP connections with the Connection header set to close and responses that carry neither length information nor a transfer encoding that frames the body. Also, detecting an error early may avoid unnecessary data processing.

I agree, it would be nice to see how other high-level interfaces handle this. Though I don't think that should be the only consideration when deciding what's best to do.

I just tested it on FreeBSD. There is indeed no difference between closing the socket and killing the process. In both cases, though, SO_LINGER is honored.

I have used SO_LINGER in the past, and it always resulted in a TCP RST when I enabled it with a timeout of zero.

I rechecked on Linux (5.4.0-80-generic #90-Ubuntu SMP Fri Jul 9 22:49:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux) and FreeBSD (12.2) with the following program:

#include <sys/socket.h>
#include <netdb.h>
#include <stdlib.h>
#include <netinet/in.h>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>

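/* timeout < 0 leaves SO_LINGER disabled (normal close);
   timeout == 0 enables it with a zero linger time, so close() aborts with RST. */
static int set_linger(int fd, int timeout) {
  struct linger lingerval = { 0, };
  if (timeout >= 0) {
    lingerval.l_onoff = 1;
    lingerval.l_linger = timeout;
  }
  return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lingerval, sizeof(lingerval));
}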

int main(int argc, char **argv) {
  const char *host = "::";
  const char *port = "1234";
  struct addrinfo hints = { 0, };
  struct addrinfo *ai;
  int sock;
  FILE *f;
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_protocol = IPPROTO_TCP;
  hints.ai_flags = AI_ADDRCONFIG;
  if (getaddrinfo(host, port, &hints, &ai)) abort();
  sock = socket(ai->ai_family, ai->ai_socktype | SOCK_CLOEXEC, ai->ai_protocol);
  if (sock < 0) abort();
  if (set_linger(sock, 0)) abort();
  if (connect(sock, ai->ai_addr, ai->ai_addrlen)) abort();
  freeaddrinfo(ai);
  f = fdopen(sock, "r+");
  if (!f) abort();
  fprintf(f, "Hello!\n");
  fflush(f);
  sleep(2);
  /* Uncomment to terminate via SIGKILL instead of a normal exit. */
  //kill(getpid(), SIGKILL);
  return 0;
}

lingerval.l_onoff = 1; lingerval.l_linger = 0; reliably causes a TCP RST to be sent out, both on Linux and on FreeBSD (even on a half-closed connection, I double-checked on Linux and FreeBSD using "socat -t 60 STDIO TCP6-LISTEN:1234 < /dev/null" on the other end of the connection).
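For comparison, a rough Rust counterpart of the same experiment, as a Linux-only sketch using just std plus a hand-written FFI declaration. The names `linux_abort`, `abort_on_close`, and `demo` are hypothetical, and the constants are assumptions taken from the Linux headers on x86_64/aarch64 (other platforms use different values, so a real program would rather use the `libc` or `socket2` crate).

```rust
#[cfg(target_os = "linux")]
mod linux_abort {
    use std::ffi::c_void;
    use std::io::{self, Read};
    use std::mem::size_of;
    use std::net::{TcpListener, TcpStream};
    use std::os::unix::io::AsRawFd;

    /// Mirrors `struct linger` from <sys/socket.h>.
    #[repr(C)]
    struct Linger {
        l_onoff: i32,
        l_linger: i32,
    }

    // Assumed values: Linux on x86_64/aarch64. They differ elsewhere.
    const SOL_SOCKET: i32 = 1;
    const SO_LINGER: i32 = 13;

    extern "C" {
        fn setsockopt(fd: i32, level: i32, name: i32, value: *const c_void, len: u32) -> i32;
    }

    /// Enables SO_LINGER with a zero timeout, so that closing (or dropping)
    /// the stream aborts the connection instead of closing it gracefully.
    pub fn abort_on_close(stream: &TcpStream) -> io::Result<()> {
        let value = Linger { l_onoff: 1, l_linger: 0 };
        let rc = unsafe {
            setsockopt(
                stream.as_raw_fd(),
                SOL_SOCKET,
                SO_LINGER,
                &value as *const Linger as *const c_void,
                size_of::<Linger>() as u32,
            )
        };
        if rc == 0 { Ok(()) } else { Err(io::Error::last_os_error()) }
    }

    /// Connects over loopback, aborts the client side, and returns the
    /// error kind that the accepting side sees on its next read.
    pub fn demo() -> io::Result<io::ErrorKind> {
        let listener = TcpListener::bind("127.0.0.1:0")?;
        let client = TcpStream::connect(listener.local_addr()?)?;
        let (mut server, _) = listener.accept()?;
        abort_on_close(&client)?;
        drop(client); // close(2) with zero linger time
        let mut buf = [0u8; 16];
        match server.read(&mut buf) {
            Ok(n) => Err(io::Error::new(
                io::ErrorKind::Other,
                format!("expected an error, but read {} bytes", n),
            )),
            Err(e) => Ok(e.kind()),
        }
    }
}
```

The only point of the sketch is that one `setsockopt` call before the drop changes what the peer observes; nothing else about `TcpStream` is touched.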

The function __tcp_close in Linux first checks for the data loss event and, if one would occur, sends an RST. Right after that, though, it checks whether SO_LINGER is set with a linger time of 0:

	} else if (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime) {
		/* Check zero linger _after_ checking for unread data. */
		sk->sk_prot->disconnect(sk, 0);
		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);

This should call tcp_disconnect, which, in turn, sends a RST if the connection is in a state where it needs that, which is checked via tcp_need_reset.

The source code credits antirez with this.

Summarizing:

  • According to Internet Standard STD 7 (aka RFC 793), TCP connections can be closed by applications in two ways:
    • close (sending FIN)
    • abort (sending RST)
  • Peer applications can distinguish whether a connection was successfully closed (they receive an EOF) or was aborted (they receive an error).
  • Aborting a connection may cause data that has already been sent to be lost (which also avoids trying to flush out data that has not been confirmed by the peer yet).
  • Libc under Linux and FreeBSD provides a way to abort connections (using setsockopt with SO_LINGER).
  • The current implementation in Rust's standard library in combination with libc behavior on at least Linux and FreeBSD never aborts a connection (not even on panic) but always uses "close" (as defined in STD 7). Moreover, it is not possible to change this behavior without manually changing socket options using other libraries or C functions.
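The current drop behavior from the last point can be observed with a small std-only sketch over loopback. `drop_is_graceful` is a made-up helper name; the sketch assumes no socket options have been changed and that the dropped side has no unread incoming data.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};

/// Sends `msg` over a loopback connection, drops the sending `TcpStream`
/// without any explicit shutdown, and returns what the peer reads up to EOF.
fn drop_is_graceful(msg: &[u8]) -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut client = TcpStream::connect(listener.local_addr()?)?;
    let (mut server, _) = listener.accept()?;

    client.write_all(msg)?;
    drop(client); // plain drop: the OS closes the descriptor

    let mut received = Vec::new();
    // `read_to_end` returns `Ok` only after it has seen EOF (a 0-byte read);
    // a reset would surface as `Err` instead.
    server.read_to_end(&mut received)?;
    Ok(received)
}
```

If the function returns `Ok`, the peer saw the full message followed by a regular EOF, i.e. a "close" in STD 7 terms.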

Thus my question is: should this behavior be changed? And if yes, how?

And: does anyone know how other high-level interfaces or applications typically handle this?

My vote would be yes. It’d at least make errors noticeably clearer on the other end: instead of confusing internal state errors like “connection closed unexpectedly”, it’d be “peer reset”.

Does anyone know how node.js (when an exception bubbles up to the top level) and golang (on panic) handle this?

I just tested how Python 3.7 handles it in the case of

  • a forced process termination (SIGKILL) and
  • dropping the value through del and waiting for garbage collection.

In both cases, I witnessed a FIN, aka graceful close (the same as what Rust does).

It also seems ugly to manually cause a connection abort, as this thread on Stack Overflow suggests:

You have to be careful to set the SO_LINGER option on the right sockets API level and to use the right encoding for the option value (it's a struct).

[…]

con.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))

Here, the application programmer has to pack binary data to achieve the desired behavior.

However, I still believe it can (and should) be done differently in Rust.

Variant 1

How about the following idea:

  • panics cause an "abort" (according to RFC 793),
  • calling TcpStream::shutdown with Shutdown::Write or Shutdown::Close causes a "close" (according to RFC 793),
  • a new method will be provided to allow an explicit abort, such as TcpStream::abort,
  • dropping a TcpStream causes a "close" by default (ensures backward compatibility and also avoids data loss such as when error messages are sent to the peer before dropping the stream),
  • it's possible to switch the default drop behavior to "abort" by calling something like TcpStream::abort_on_drop() (useful for error handling with the "?" operator).

Implementation might be tricky. It could be done by setting SO_LINGER to a timeout of 0 after creating the socket and (unless abort_on_drop has been called previously) disabling SO_LINGER on drop or shutdown. But that would mean default behavior could interfere with other libraries (such as the libc crate) which may also perform setsockopt operations on the underlying socket.

So I'm not sure if this is the best idea.

Variant 2

  • panics cause an "abort" (according to RFC 793),
  • dropping the TcpStream causes an "abort" as well,
  • in order to achieve a graceful "close" (send an EOF to the peer and flush data out), the connection must be closed explicitly by calling a new method, something like TcpStream::close, possibly with an optional timeout parameter that indicates how long the operating system should attempt to flush data out.

This could keep interference with socket operations by other libraries to a minimum, as the only change would be to set SO_LINGER to a timeout of 0 after creating sockets. Other libraries may do whatever they want with the socket. Only calling TcpStream::close would set SO_LINGER (which would be documented, of course).

The downside of this is that it would break existing code and might be prone to cause errors in applications that forget to properly close a connection (in most cases, it would work, but with bad timing, information might be lost… a horrible scenario).

Variant 3

Alternatively, there could just be one added method that allows modifying SO_LINGER manually (at least to set it to "close" or "abort" by disabling it or setting a timeout of 0, respectively). That way, an application programmer who wants to care about properly closing or aborting the stream has the ability to do it, while other programmers aren't bothered.

However, this still bears the disadvantage of panics causing a graceful close by default, which seems semantically wrong, and could cause "confusing internal state errors" as @ShadowJonathan pointed out in the previous reply. (Edit: Maybe it isn't that bad and applications should/could always expect an EOF to be, in fact, a crashed peer. But it still feels semantically wrong when there is the possibility to properly report errors instead.)

Summarizing, I dislike all variants :weary:. (But variant 3 would at least be an improvement to the status quo.)

Some applications manage the entire lifecycle of such connections (custom protocols etc.). Closing a connection would then be a “conscious”, intended act on the other side’s part, while an abort would be a good catch-all for all kinds of unrecoverable errors (network reset, timeout, or in this case, critical application failure). With that distinction, applications could handle aborts separately, as exceptions, errors, and failed states, which is at least better than trying to detect after a close whether a failure occurred, which is what I was getting at. An abort is definitively “abnormal”, while with a close it depends on the application.

I like variant 3 the best as it doesn’t cause backwards incompatibility, but it would possibly not be seen and used by many developers. Still, is there a way for a struct to detect that it is being dropped as part of an unwind? Or to detect inside a drop that the current thread is panicking?

Changing default behaviour should be discussed with the lang team, as it’s technically a backwards-incompatible change. I’d argue for variant 3 right now, and then for switching to something like variant 1 (with default behaviour abort) together with the lang team.

Personally, I also think that dropping a connection is “bad manners”. With a “live connection”, it’d be like dropping an unfinished ice cream in the trash: if you’re gonna throw it away, at least finish it and get all of the remains. The same applies here. If a connection still has queued data (for whatever reason), the other side might expect “this side” to have read that data, so confusion could arise when bugs appear later because the data was never properly received while the connection was being dropped. Making abort explicit on dropping an (unclosed) connection would disincentivize developers from dropping connections implicitly. If the concern about making sure developers actually read the remaining buffer is real, then maybe close() could return the remaining buffer, though I don’t see that happening anytime soon, because the API is stable now.

(Though maybe this could be implemented as a new function; finalise(), which closes and reads the remaining buffer until the other side also has sent its FIN-ACK after the remaining buffer)
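A rough sketch of what such a `finalise` could look like on top of today's std API. It is written as a free function here because it is only an illustration; `finalise` and `demo_finalise` are hypothetical names, and a real version would probably want a timeout so that a silent peer cannot block it forever.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Hypothetical `finalise`: half-closes the write side (sends FIN), then
/// reads until the peer closes too, returning the data that was still unread.
fn finalise(mut stream: TcpStream) -> io::Result<Vec<u8>> {
    stream.shutdown(Shutdown::Write)?;
    let mut rest = Vec::new();
    stream.read_to_end(&mut rest)?; // blocks until the peer's EOF (or an error)
    Ok(rest)
}

/// Loopback demo: the peer sends some late data, waits for our EOF,
/// and only then closes its side.
fn demo_finalise() -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let client = TcpStream::connect(listener.local_addr()?)?;
    let (mut server, _) = listener.accept()?;
    let peer = thread::spawn(move || -> io::Result<()> {
        server.write_all(b"late data")?;
        let mut sink = Vec::new();
        server.read_to_end(&mut sink)?; // wait for the client's FIN
        Ok(()) // `server` is dropped here, which closes the peer's side
    });
    let rest = finalise(client)?;
    peer.join().expect("peer thread panicked")?;
    Ok(rest)
}
```

Taking the stream by value mirrors the idea that nothing should be done with the connection afterwards.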

I’m not exactly sure what the library team’s opinion of “closing with unread data” is, but personally, I feel that such situations let data fall “through the cracks”, and so could be classified as a subtle, uncommon footgun, similar to the abort/close behaviour this thread is addressing.

Yes, that is surprisingly simple: std::thread::panicking
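A minimal sketch of how a `Drop` impl can use it. `Conn`, `CLOSED`, `ABORTED`, and `outcome_of` are made up for illustration; the stand-in type only records which termination it would pick instead of touching a real socket.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;
use std::thread;

const CLOSED: u8 = 1;
const ABORTED: u8 = 2;

/// Stand-in for a connection: it only records which termination it chose.
/// A real implementation would perform the close or abort on the socket here.
struct Conn {
    outcome: Arc<AtomicU8>,
}

impl Drop for Conn {
    fn drop(&mut self) {
        // `thread::panicking()` is true while the current thread is unwinding.
        let how = if thread::panicking() { ABORTED } else { CLOSED };
        self.outcome.store(how, Ordering::SeqCst);
    }
}

/// Drops a `Conn` either normally or as part of an unwind and reports
/// which branch its `Drop` impl took.
fn outcome_of(panic_while_alive: bool) -> u8 {
    let outcome = Arc::new(AtomicU8::new(0));
    let conn = Conn { outcome: Arc::clone(&outcome) };
    // Run on a separate thread so that the panic unwinds there, not here.
    let _ = thread::spawn(move || {
        let _conn = conn;
        if panic_while_alive {
            panic!("simulated failure");
        }
    })
    .join();
    outcome.load(Ordering::SeqCst)
}
```

With the default unwinding panic strategy, `outcome_of(false)` yields `CLOSED` and `outcome_of(true)` yields `ABORTED`. With `panic = "abort"` no destructor runs at all, so this mechanism cannot help there.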

Nice!

Then I don’t think introducing a variant of variant 1, where it’ll switch to abort on drop, would be a big deal. I’d argue that (beyond the backend implementation) the consensus here seems to be that aborting on critical failure is okay-ish (please disagree with me if that’s not the case, though).

That is only possible if unwinding is enabled, right? Otherwise the program gets aborted on panic, and there is no drop (if I understand it right).

That's why I suggested setting the linger timeout to 0 right after creating the socket (which is what I'd do in C programs) and disabling SO_LINGER again right before a successful close. However, if Rust's standard library does this automatically, this could cause confusion when using other libraries to work on the underlying sockets or make things more difficult when manual control over socket options is desired.

In this context, also keep in mind that there are other ways a program could be terminated (e.g. through std::process::exit or a SIGKILL on POSIX systems).

I like the behavior, and I think it's a good thing to do.

For the sake of completeness, however, I want to point out that there might be use cases where it's better to have the operating system flush out all data properly, even on a panic, e.g. some important data reporting where it's better to risk receiving partial data (with a wrong EOF) than to risk that some data might not be sent out due to a (later) panic that causes a connection abort. That is because flushing may be "undone" by a later abort (e.g. if the RST packet arrives before the flushed data arrives).

Right! Thanks for pointing that out, I forgot that bit.

I like Variant 2 more, but it might be just because I don't see breaking the existing behavior as a big deal (considering that the current behavior does not seem completely right to me). Perhaps, to mitigate the behavior change, there might be some methods to tune the options of the connection, including how it should behave on drop (or panic), while the defaults would ensure the current behavior.

Anyway, I definitely support some kind of close method for explicit, graceful, and possibly fallible termination of a resource. In cases like this, it would make it possible to confirm that the connection was closed gracefully and without error. In a slightly different context, I found a use case for such a method.

Maybe you are right, and it's a bad idea to make a (worse) choice here just to keep backward compatibility. I feel like Variant 2 is the cleanest and most straightforward in a way.

However, the other issue with Variant 2 is that if a programmer forgets to properly close a connection, this could introduce subtle bugs to a program: A RST would be sent out, which many other programs might ignore (or just log an error message about a connection reset). In most cases, all data might have been processed already by the peer, but in some cases, the RST could invalidate previously sent data (depending on timing).

This problem is due to the fact that the difference between "abort" (RST) and "close" (FIN) is not just about sending a status, but also about how the operating system handles flushing or discarding the network buffers.

I'm not sure if that's a huge issue (or if you can expect a programmer to never forget the close). Before I was aware of the details of how TCP shuts down sessions, it caused me a huge headache when I once had a program that accidentally terminated connections with a TCP RST. It was hard to debug because the data appeared to arrive properly in some cases (and socat or netcat don't tell you how the connection got terminated).

I think such a method is important anyway, due to what I wrote here:

However, I believe needing a graceful shutdown on panic is more an exceptional case. In most cases, you'd want a panic to cause a connection abort. Maybe it should be configurable what happens on

  • panic (or process termination),
  • or drop.

During a transition period, defaults could be to match current behavior of the standard lib (i.e. not touching socket options at all on the OS level). (That is similar to Variant 3, but with some sugar to allow automatically (re-)setting a previously stored SO_LINGER value on non-unwinding dropping, if needed.)

Later, the default could be changed to cause a connection abort on panic, but a graceful close on drop. (According to Variant 1, but with the option to enable graceful shutdown even on panic, if needed.)

Even further in the future (if desired), the default behavior for drop could also be changed to abort the connection – but I'm not sure if that's wise, as explained above. (This is like Variant 2 but also with the option to configure behavior, if needed.)

What do you think?

After working some more with TcpStream (and UnixStream), I came across more problems in real-life usage of these interfaces:

  • Using threads and std::io, it is not easily possible to make threads abort after a certain timeout (which is required for network applications). In order to do this, I had to implement my own wrapper, which calculates a timeout for each read or write operation.
  • It is not easily possible to split up the stream into a reading and a writing half that can be owned (Tokio provides a method called into_split for that, but I don't see anything like that in the standard library for TcpStream or UnixStream) other than cloning the socket using try_clone, which imposes overhead and may fail.
  • The standard library doesn't provide any abstraction layer for what TcpStream and UnixStream (or any other streams added in the future) have in common. Thus TcpStream and UnixStream each have their own set_read_timeout methods. I cannot even write a function that accepts some sort of "stream" where I can work with timeouts for single reading and writing operations, as there is no respective trait (like "ReadWithTimeout" or "ReadWithDeadline" or similar) which would allow me to write a generic function.
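To illustrate the first and third points, here is a minimal sketch of such a user-defined trait plus a generic read with an overall deadline. `ReadWithTimeout` and `read_exact_by` are hypothetical and not part of std; the sketch only forwards to the existing inherent `set_read_timeout` methods.

```rust
use std::io::{self, ErrorKind, Read};
use std::net::TcpStream;
use std::time::{Duration, Instant};

/// Hypothetical trait (not in std): a stream whose blocking reads can be
/// bounded by a timeout.
trait ReadWithTimeout: Read {
    fn set_read_timeout(&self, timeout: Option<Duration>) -> io::Result<()>;
}

impl ReadWithTimeout for TcpStream {
    fn set_read_timeout(&self, timeout: Option<Duration>) -> io::Result<()> {
        TcpStream::set_read_timeout(self, timeout) // the inherent method
    }
}

#[cfg(unix)]
impl ReadWithTimeout for std::os::unix::net::UnixStream {
    fn set_read_timeout(&self, timeout: Option<Duration>) -> io::Result<()> {
        std::os::unix::net::UnixStream::set_read_timeout(self, timeout)
    }
}

/// Fills `buf` completely or fails once `deadline` has passed, recomputing
/// the remaining time before every single read.
fn read_exact_by<S: ReadWithTimeout>(
    stream: &mut S,
    buf: &mut [u8],
    deadline: Instant,
) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        let left = deadline.saturating_duration_since(Instant::now());
        if left.is_zero() {
            return Err(io::Error::new(ErrorKind::TimedOut, "deadline exceeded"));
        }
        stream.set_read_timeout(Some(left))?;
        match stream.read(&mut buf[filled..]) {
            Ok(0) => return Err(ErrorKind::UnexpectedEof.into()),
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            // An expired OS timeout shows up as WouldBlock or TimedOut,
            // depending on the platform.
            Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => {
                return Err(io::Error::new(ErrorKind::TimedOut, "deadline exceeded"));
            }
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

This is the kind of boilerplate every application currently has to write for itself; a trait in std would let libraries agree on one definition.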

Concluding, I have to say that I can work with Rust's standard library to write a network application – but it is a hassle.

I know these issues reach beyond the original topic of this post, but if we consider improving TcpStream, then these other issues should be kept in mind as well. Maybe std::io needs to be overhauled in general?

Read and Write are implemented for &TcpStream so you could wrap TcpStream in an Arc to send to two threads and then call read or write on &*tcp_stream.
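A small sketch of that pattern, assuming a loopback echo peer for demonstration (`shared_round_trip` is a made-up name): one thread writes while the calling thread reads, both through the same `Arc<TcpStream>`.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::sync::Arc;
use std::thread;

/// Sends `msg` from a writer thread and reads the echo on the calling
/// thread, sharing a single `TcpStream` through an `Arc`.
fn shared_round_trip(msg: &'static [u8]) -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let stream = Arc::new(TcpStream::connect(listener.local_addr()?)?);
    let (server, _) = listener.accept()?;

    // Echo peer: copies everything back until it sees EOF, then closes.
    let echo = thread::spawn(move || -> io::Result<()> {
        io::copy(&mut &server, &mut &server)?;
        Ok(())
    });

    let writer = {
        let stream = Arc::clone(&stream);
        thread::spawn(move || -> io::Result<()> {
            (&*stream).write_all(msg)?; // `Write` is implemented for `&TcpStream`
            stream.shutdown(Shutdown::Write) // let the echo peer see EOF
        })
    };

    let mut echoed = Vec::new();
    (&*stream).read_to_end(&mut echoed)?; // `Read` is implemented for `&TcpStream`
    writer.join().expect("writer thread panicked")?;
    echo.join().expect("echo thread panicked")?;
    Ok(echoed)
}
```

No `try_clone` is involved, so there is no extra file descriptor and no additional failure point.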

Yes, I know. Sorry to not have been more specific here; my issue was related to the other two points I made.

The problem here is that the Arc won't implement Read or Write directly, which is why I cannot use it in a generic struct over <R: Read, W: Write> that stores types R and W.
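The boilerplate in question looks roughly like this: a newtype that owns the `Arc` and forwards to the `&TcpStream` impls. `SharedStream`, `Session`, and `split` are hypothetical names for illustration, not an existing API.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;
use std::sync::Arc;

/// Hypothetical newtype: an owned, cloneable handle that forwards to the
/// `Read`/`Write` impls of `&TcpStream`.
#[derive(Clone)]
struct SharedStream(Arc<TcpStream>);

impl Read for SharedStream {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        (&*self.0).read(buf)
    }
}

impl Write for SharedStream {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        (&*self.0).write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        (&*self.0).flush()
    }
}

/// A generic struct that stores an owned reader and an owned writer.
struct Session<R: Read, W: Write> {
    reader: R,
    writer: W,
}

/// Turns one stream into a `Session` whose halves share the same socket.
fn split(stream: TcpStream) -> Session<SharedStream, SharedStream> {
    let shared = SharedStream(Arc::new(stream));
    Session { reader: shared.clone(), writer: shared }
}
```

It works, but every project that needs owned halves has to repeat it (or give up the generic `R`/`W` abstraction).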

See the following two threads:

I found a solution for all my issues, like I said:

…but the solution isn't beautiful (either loss of abstraction/generalization or needing boilerplate code).