Add a `SocketBuilder` in `std::net`?

[ I suspect the following may have been discussed a long time ago; I'm sorry if I'm repeating something everyone already knows. If there is an existing conclusion about this, please let me know, thanks. ]

There have been many questions about the future of std::net, and there have been real efforts to replace it (or amend it?), such as the net2 crate (deprecated) and the socket2 crate, not to mention crates like nix that also cover sockets, though only on Unix platforms.

After using std::net, socket2 and nix on macOS, Linux and Windows, it seems to me that a couple of weak spots of std::net are:

  1. Its API does not support custom operations (e.g. setsockopt) before calling bind().
  2. It does not expose a Socket struct or trait, because that name is already taken by a platform-dependent type.

Clearly these two points are related. Inspired by an old RFC issue, I was wondering whether there has been any effort to add a SocketBuilder to std::net?

Similar to what is mentioned in the linked old issue, SocketBuilder would:

  • Wrap around a platform-dependent Socket to present a platform-independent socket interface as close to a traditional socket as possible.
  • Allow custom operations before calling bind() on the inner socket.

In addition, the existing structs could be updated to support SocketBuilder as well. For example, UdpSocket could gain a new method like bind_builder(builder: SocketBuilder) -> io::Result<UdpSocket>. This would allow apps to continue using UdpSocket as the main API.

Is there a plan for an RFC / pre-RFC about enhancing std::net to allow custom operations before calling bind(), or to otherwise expose a socket-based API?

Thanks!

If this is about building up things to do between making the socket and bind, maybe BindBuilder makes more sense? SocketBuilder sounds, to me, like we're building the socket itself, not performing operations on something we're going to build in the future. If it is the latter, how are error messages going to be represented? If I get EINVAL, how do I know which operation responded that way?

I preferred SocketBuilder as the name because my motivation is to have a struct representing the socket. The Builder part indicates that it supports / uses the Builder Pattern. The bind() support is the last step to obtain a familiar UdpSocket. For a TCP client, the last step would be connect().

Regarding EINVAL or other C-based errors, I have not thought about it too much, but one way is to follow the existing std::net approach and convert them to std::io::Error.

The problem I'm thinking about isn't the conversion of the error, but knowing the source of the error. If I have 5 setsockopt calls pending and get back one of those variants, how am I to know which call failed? Or is each method going to apply its effect immediately (which kind of goes against how I view most builder patterns, where nothing happens until some final call)?

That's a good point. I meant that each method is going to apply its effect immediately.

I guess what I have in mind is probably not strictly a Builder Pattern. It is more accurately a "fluent interface" or "method chaining", i.e. something like:

let builder = SocketBuilder::new_udp()?
                        .set_reuseaddr(true)?
                        .set_bind(&my_addr, port);
let udp_sock = UdpSocket::bind_builder(builder)?;

or 

let udp_sock = SocketBuilder::new_udp()?
                        .set_reuseaddr(true)?
                        .set_bind(&my_addr, port)
                        .bind_udp()?;

The above method signatures are placeholders. The idea is that the new_???() methods create an inner socket, other methods then modify this socket, and finally the bind_???() or connect methods are called to obtain the familiar existing networking types.
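To make the "apply immediately" behaviour concrete, here is a minimal sketch that compiles with today's standard library. It is only a mock: the `applied` list stands in for a real OS socket, and `set_unsupported` is a hypothetical option invented purely to show that a failing step surfaces at its own `?` in the chain, which is what makes the source of an error identifiable.

```rust
use std::io;

/// Hypothetical stand-in for the proposed `SocketBuilder`. The `applied`
/// list replaces a real OS socket so the example runs without new std APIs.
#[derive(Debug)]
pub struct SocketBuilder {
    applied: Vec<&'static str>,
}

impl SocketBuilder {
    /// Pretends to create the inner socket right away.
    pub fn new_udp() -> io::Result<Self> {
        Ok(Self { applied: vec!["socket(SOCK_DGRAM)"] })
    }

    /// Applies its effect immediately; a failure would be returned here,
    /// at this call site, not at some later finalizer.
    pub fn set_reuseaddr(mut self, _enable: bool) -> io::Result<Self> {
        self.applied.push("SO_REUSEADDR");
        Ok(self)
    }

    /// Hypothetical option that always fails, to show error attribution.
    pub fn set_unsupported(self) -> io::Result<Self> {
        Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "EINVAL from set_unsupported",
        ))
    }

    /// Lists the steps that have taken effect so far.
    pub fn applied(&self) -> &[&'static str] {
        &self.applied
    }
}

/// Every step succeeds.
pub fn configure() -> io::Result<SocketBuilder> {
    SocketBuilder::new_udp()?.set_reuseaddr(true)
}

/// The last step fails, and the error comes back from that exact call.
pub fn configure_badly() -> io::Result<SocketBuilder> {
    SocketBuilder::new_udp()?.set_reuseaddr(true)?.set_unsupported()
}
```

Because each method consumes `self` and returns `io::Result<Self>`, a failed step also drops the half-configured socket, so there is nothing left to misuse afterwards.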

That approach seems reasonable to me. The SocketBuilder could also implement AsFd.

Thanks. I hoped this would be the case :)

Is every socket option going to have its own method? If so, how fine-grained is this going to get? I ask because these things differ between platforms pretty often (even between versions of them). Is there going to be a catch-all set_option() method that just exposes the raw syscall? If not, AsFd makes sense, so that you can make the call yourself without having to wait for stabilization of a new method.

setsockopt is not safe to expose a raw interface to, except as an unsafe function. Adding safe wrappers for known socket options seems like a good idea.

Thanks! AsFd seems to be platform-dependent, i.e. Unix only. Is it common practice to implement a platform-dependent trait on a platform-independent type (like SocketBuilder here)? (I've never done that before, and don't know if it's actually doable.)

On Windows you'd implement AsSocket instead. It's reasonable to implement both platform-dependent traits on a platform-independent type. (AsFd is implemented on File, too.)
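A small sketch of how that can look, using only traits that exist in the standard library today (`AsFd` on Unix, `AsSocket` on Windows). The `PortableSocket` type and its `bind_any` constructor are hypothetical stand-ins for the proposed builder; it wraps a `UdpSocket` only so the example compiles without new std APIs.

```rust
use std::io;
use std::net::UdpSocket;

/// Hypothetical platform-independent wrapper, standing in for the
/// proposed `SocketBuilder`.
pub struct PortableSocket {
    inner: UdpSocket,
}

impl PortableSocket {
    /// Binds to an OS-chosen port on the unspecified IPv4 address.
    pub fn bind_any() -> io::Result<Self> {
        Ok(Self { inner: UdpSocket::bind("0.0.0.0:0")? })
    }
}

// On Unix-like targets the wrapper lends out its file descriptor.
#[cfg(unix)]
impl std::os::fd::AsFd for PortableSocket {
    fn as_fd(&self) -> std::os::fd::BorrowedFd<'_> {
        std::os::fd::AsFd::as_fd(&self.inner)
    }
}

// On Windows the same type lends out its SOCKET handle instead.
#[cfg(windows)]
impl std::os::windows::io::AsSocket for PortableSocket {
    fn as_socket(&self) -> std::os::windows::io::BorrowedSocket<'_> {
        std::os::windows::io::AsSocket::as_socket(&self.inner)
    }
}
```

Each impl is compiled only on its own platform, so the type itself stays portable while callers that need the raw handle opt in with their own `cfg`.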

One other reason I had for using set_reuseaddr instead of a general setsockopt is that I wanted to be consistent with the current std::net API methods in terms of overall style and level.

Thanks again for the helpful discussion so far. Should I move ahead and try to write an RFC for this? Or is there something else I should do before an RFC?

This proposal seems reasonable to me, and it sounds like it addresses the primary things that make people switch to socket2 rather than std sockets.

Normally, unstable library APIs don't need an RFC, but I think some semblance of an RFC would help here in order to make sure that current users of socket2 find this sufficient to be able to use the standard library sockets. This doesn't necessarily have to be a full RFC; the main thing it needs is a detailed list of all the proposed API signatures and brief explanations of what they do.

(If libs had MCPs I'd suggest an MCP.)

As a start, can you post that list of API signatures and explanations here?

Thanks for the guidance! Here is the first list of API signatures and some comments.

A new file, library/std/src/net/socket_builder.rs:

/// A platform-independent wrapper for a socket that allows:
///
/// - configuring the socket before binding to an address.
/// - converting it into a `UdpSocket`, `TcpStream` or `TcpListener`.
pub struct SocketBuilder {
    inner: net_imp::SocketBuilder,
}

impl SocketBuilder {
    /// Here we use separate methods for UDP and TCP, instead of
    /// a single `new()` method with more parameters, mainly to be 
    /// more consistent with the overall style of existing `std::net`. 
    ///
    /// For the new `SocketAddrFamily`, please see below.
    pub fn new_udp(addr_family: SocketAddrFamily) -> io::Result<Self>;
    pub fn new_tcp(addr_family: SocketAddrFamily) -> io::Result<Self>;

    /// Enable or disable SO_REUSEADDR on the socket. This is an example
    /// of possible configurations supported before binding.
    pub fn set_reuseaddr(self, enable: bool) -> io::Result<Self>;

    /// The following methods convert into existing types in std::net.

    pub fn bind_udp(self, addr: &SocketAddr) -> io::Result<UdpSocket>;
    pub fn connect_tcp(self, addr: &SocketAddr) -> io::Result<TcpStream>;
    pub fn listen_tcp(self, addr: &SocketAddr) -> io::Result<TcpListener>;
}

In library/std/src/sys_common/net.rs:

/// This is the `net_imp::SocketBuilder` used earlier.
/// 
/// This struct provides a bridge between platform-independent `SocketBuilder` 
/// and platform-dependent implementations.
pub struct SocketBuilder {
    /// This socket is platform-dependent.
    inner: Socket,
}

impl SocketBuilder {
    /// The actual implementation of methods.

    pub fn new_udp(addr_family: SocketAddrFamily) -> io::Result<Self>;
    pub fn new_tcp(addr_family: SocketAddrFamily) -> io::Result<Self>;
    pub fn set_reuseaddr(&self, enable: bool) -> io::Result<()>;
    pub fn bind_udp(self, addr: &SocketAddr) -> io::Result<UdpSocket>;
    pub fn connect_tcp(self, addr: &SocketAddr) -> io::Result<TcpStream>;
    pub fn listen_tcp(self, addr: &SocketAddr) -> io::Result<TcpListener>;
}

In library/std/src/net/addr.rs:

/// Address family values for sockets. This is useful to create a socket without a concrete address.
///
pub enum SocketAddrFamily {
    /// Address family of IPv4.
    InetV4,
    /// Address family of IPv6.
    InetV6,
}

impl SocketAddrFamily {
    pub fn from_addr(addr: &SocketAddr) -> SocketAddrFamily;
}
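For reference, `from_addr` could be implemented as a plain match on the address variant. This is only a sketch of one way to fill in the signature above; the derives are my own addition so that the values can be compared and printed.

```rust
use std::net::SocketAddr;

/// Sketch of the proposed enum; the derives are an assumption, not part
/// of the listed signatures.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SocketAddrFamily {
    /// Address family of IPv4.
    InetV4,
    /// Address family of IPv6.
    InetV6,
}

impl SocketAddrFamily {
    /// Derives the family from a concrete address.
    pub fn from_addr(addr: &SocketAddr) -> SocketAddrFamily {
        match addr {
            SocketAddr::V4(_) => SocketAddrFamily::InetV4,
            SocketAddr::V6(_) => SocketAddrFamily::InetV6,
        }
    }
}
```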

In library/std/src/sys/unix/net.rs, we change the signature of the internal Socket::new to take the new SocketAddrFamily, so that we can create a socket without a concrete address:

impl Socket {
    pub fn new(addr_family: SocketAddrFamily, ty: c_int) -> io::Result<Socket>;
}

All existing callers would use SocketAddrFamily::from_addr() to convert from an address.

With the above changes and miscellaneous bookkeeping, I can run ./x.py check without errors, and it provides the minimal functionality I was looking for.

Thanks!

This should be #[non_exhaustive] I would think.

It'd be nice to have new_udp not be able to lead to connect_tcp, but I guess that would just lead to an explosion of methods as new protocols get support (is such extension ever planned?).

Maybe there should be a way to query what protocol a builder is working on as well? This would be another #[non_exhaustive] enum I suspect. Maybe then there could just be a single new() method that takes a protocol and address variant?

I will learn to use #[non_exhaustive]. Thanks for pointing that out.

Yes, we should be able to implement a method to query the socket type (i.e. UDP or TCP) using getsockopt with SO_TYPE. That method could also be used internally, for example to make sure connect_tcp is not called on a UDP socket.

Regarding a single new() method, I am not against it. The main reason I didn't do it so far is to stay consistent with the std::net style, which one can argue is limited. But we could change to a single new() and pass in the socket type (and even the protocol) if we think we might support many different socket types in the future.

This should also document the availability of AsFd/AsSocket.

Also, I'm not sure if it's entirely safe to have methods for different kinds of sockets on the same builder. Not just because you can call new_udp and then listen_tcp, but also because sockopt may not be a safe interface to call if you don't know what kind of socket you have.

Should the SocketBuilder record all actions and then apply them all at once? This is what OpenOptions and Command do too. The advantage of this is that it isn't observable if the OS supports piecewise initialization or requires it to be done all at once. In addition it allows errors to be reported by the .listen_tcp() and other finalizer methods rather than every single builder method.
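A rough sketch of the record-then-apply shape, in the spirit of OpenOptions, might look like the following. Everything named here (`DeferredUdpOptions`, `tag`) is hypothetical. Note the big caveat: with today's std the options can only be applied after `UdpSocket::bind`, so this shows just the recording and error-reporting pattern, not true pre-bind configuration. The finalizer labels each error with the step that produced it.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Hypothetical deferred builder: setters only record values.
#[derive(Default)]
pub struct DeferredUdpOptions {
    ttl: Option<u32>,
    broadcast: Option<bool>,
}

impl DeferredUdpOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records the TTL; nothing touches the OS yet.
    pub fn ttl(mut self, ttl: u32) -> Self {
        self.ttl = Some(ttl);
        self
    }

    /// Records the broadcast flag; nothing touches the OS yet.
    pub fn broadcast(mut self, on: bool) -> Self {
        self.broadcast = Some(on);
        self
    }

    /// Finalizer: creates the socket and applies everything recorded.
    /// Assumption: std can only apply these after binding.
    pub fn bind(self, addr: &SocketAddr) -> io::Result<UdpSocket> {
        let sock = UdpSocket::bind(addr).map_err(|e| tag("bind", e))?;
        if let Some(ttl) = self.ttl {
            sock.set_ttl(ttl).map_err(|e| tag("set_ttl", e))?;
        }
        if let Some(on) = self.broadcast {
            sock.set_broadcast(on).map_err(|e| tag("set_broadcast", e))?;
        }
        Ok(sock)
    }
}

/// Prefixes an error with the name of the step that failed, keeping its kind.
fn tag(step: &str, e: io::Error) -> io::Error {
    io::Error::new(e.kind(), format!("{step} failed: {e}"))
}
```

With this shape only the finalizer returns a `Result`, and the step label answers the earlier question of which call an EINVAL came from.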

The advantage of applying each method incrementally would be the ability to do some setup and then get the file descriptor to do further setup, or set unsafe sockopts, and interleave all of those.

That does suggest that the name "Builder" may be a misnomer though.

Maybe use UnboundSocket in that case? Or UnboundTcpSocket and UnboundUdpSocket?
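As a sketch of the split-type idea, separate `UnboundUdpSocket` and `UnboundTcpSocket` types would make calling a TCP finalizer on a UDP socket a compile error rather than a runtime check. The method names below are placeholders of my own, and since std cannot create an unbound socket today, these types hold no OS socket and simply forward to the existing constructors.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener, TcpStream, UdpSocket};

/// Hypothetical unbound UDP socket: only UDP finalizers exist on it.
pub struct UnboundUdpSocket;

/// Hypothetical unbound TCP socket: only TCP finalizers exist on it.
pub struct UnboundTcpSocket;

impl UnboundUdpSocket {
    pub fn new() -> Self {
        UnboundUdpSocket
    }

    /// Assumption: forwards to `UdpSocket::bind`, since std has no
    /// pre-bind socket to hand over.
    pub fn bind(self, addr: &SocketAddr) -> io::Result<UdpSocket> {
        UdpSocket::bind(addr)
    }
}

impl UnboundTcpSocket {
    pub fn new() -> Self {
        UnboundTcpSocket
    }

    /// Forwards to `TcpListener::bind`.
    pub fn listen(self, addr: &SocketAddr) -> io::Result<TcpListener> {
        TcpListener::bind(addr)
    }

    /// Forwards to `TcpStream::connect`.
    pub fn connect(self, addr: &SocketAddr) -> io::Result<TcpStream> {
        TcpStream::connect(addr)
    }
}

// `UnboundUdpSocket::new().listen(&addr)` does not compile: there is no
// such method on the UDP type.
```

The cost is the one raised earlier in the thread: every new protocol needs its own type, or a generic marker parameter, instead of one shared builder.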
