Routing and extraction in Tide: a first sketch

This post continues the series on Tide, sketching a possible design for routing and extraction that combines some of the best ideas from frameworks like Rocket, Actix, and Gotham.

  • Routing is how the framework maps from an HTTP request to an endpoint, i.e. a piece of code intended to handle the request.

  • Extraction is how an endpoint accesses data from the HTTP request.

The two concerns are usually somewhat coupled, because the extraction strategy shapes the signature of the endpoints that are being routed to. As we’ll see in this post, however, the coupling can be extremely loose.

Nothing in this post is set in stone! Rather, this is a sketch of one possible API direction, to kick off discussion and collaboration. Please leave your thoughts here!

Echoing some comments I posted on Twitter:

I loved Warp’s type-safe parameter extraction and the way it matches that up with the types of functions at compile time, and I don’t see any way to do that in the proposed Tide model. I don’t think you can get compile-time type safety from strings like “/hello/{}”. And I’d like to do better than that.

“Drive endpoint selection solely by URL and HTTP method” seems problematic; for instance, I’d like to be able to drive endpoint selection by logged-in status: "does this method take a User, or an Option<User>". Note that that doesn’t mean I want fallback matching from route to route, just that I’d like to match on criteria like “logged-in user” or “has this HTTP header” or other criteria extractable from the request.

I loved Warp’s type-safe parameter extraction and the way it matches that up with the types of functions at compile time, and I don’t see any way to do that in the proposed Tide model. I don’t think you can get compile-time type safety from strings like “/hello/{}”. And I’d like to do better than that.

Strong agreement here -- I think this is where Rust can really push forward compared to other options, especially if we can get compile-time guarantees that you've also not provided "/hello" twice, i.e., you have an "extra" unreachable match arm. Maybe we could do something like format_args!/std::fmt::Arguments but provide a constructor? Something like a const fn constructor is what I'm envisioning here...

To clarify: the intended setup here is that if you create an ambiguous route (e.g. the extra "/hello") or if you mismatch the number of {} and Path<_> entries, you get an immediate panic at the point of server construction. That is, both of these classes of bugs are trivially caught while the server is being constructed. I don't believe it's worth the (substantial) added complexity to track this information at compile-time instead.
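
A minimal sketch of that construction-time check, assuming a toy `Router` type whose `at` method is told how many `Path<_>` extractors the endpoint declares. The type, the method, and the way the extractor count is passed in are all hypothetical stand-ins, not Tide's actual API:

```rust
use std::collections::HashMap;

/// Count the `{}` wildcard segments in a route pattern.
fn wildcard_count(pattern: &str) -> usize {
    pattern.split('/').filter(|seg| *seg == "{}").count()
}

/// Hypothetical router: names and shape are illustrative, not Tide's API.
#[derive(Default)]
struct Router {
    // Maps a (method, pattern) pair to the name of its endpoint.
    routes: HashMap<(String, String), String>,
}

impl Router {
    /// Register an endpoint that declares `path_params` `Path<_>` extractors.
    /// Panics immediately on a duplicate route or a wildcard/extractor mismatch.
    fn at(&mut self, method: &str, pattern: &str, path_params: usize, endpoint: &str) {
        let wildcards = wildcard_count(pattern);
        assert_eq!(
            wildcards, path_params,
            "route {:?} has a different number of `{{}}` segments than Path extractors",
            pattern
        );
        let key = (method.to_string(), pattern.to_string());
        let previous = self.routes.insert(key, endpoint.to_string());
        assert!(
            previous.is_none(),
            "ambiguous route: {} {:?} registered twice",
            method,
            pattern
        );
    }
}

fn main() {
    let mut router = Router::default();
    router.at("GET", "/hello", 0, "hello");
    router.at("GET", "/hello/{}", 1, "hello_name");
    // Uncommenting either line panics before the server could accept connections:
    // router.at("GET", "/hello", 0, "hello_again");
    // router.at("GET", "/users/{}/posts/{}", 1, "post");
    println!("registered {} routes", router.routes.len());
}
```

Both bug classes surface on the first run of the binary, at the line that registers the bad route.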

Yep, this is definitely something that needs ergonomic support. I was trying to emphasize the role of the URL+method as determining the intended endpoint, and I think it's a mistake to treat precondition failures through the same mechanism that determines which endpoint is intended. This is what happens when these preconditions are treated as "guards" for the routing match and result in the router "retrying" with the next patterns.

I'd prefer to handle this use case by applying middleware at some point in the URL hierarchy to check log-in, and configuring within that middleware what to do when the preconditions are not met.
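
As a rough illustration of that separation, here is a sketch in which a `require_login` function plays the role of the middleware. The `Request` and `Response` types, the function signature, and the redirect behaviour are assumptions made for the example; the actual middleware design is not specified in this thread:

```rust
/// Hypothetical, minimal request and response types for illustration only.
struct Request {
    user: Option<String>,
}

#[derive(Debug, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

/// Middleware applied to a subtree of the URL hierarchy. It checks the
/// log-in precondition and decides what happens when the check fails;
/// the router itself never retries another route.
fn require_login(
    login_url: &str,
    req: &Request,
    next: impl Fn(&Request) -> Response,
) -> Response {
    if req.user.is_some() {
        next(req)
    } else {
        Response { status: 302, body: format!("redirect to {}", login_url) }
    }
}

/// Endpoint selected purely by URL and method.
fn dashboard(req: &Request) -> Response {
    let user = req.user.as_deref().unwrap_or("unknown");
    Response { status: 200, body: format!("dashboard for {}", user) }
}

fn main() {
    let anonymous = Request { user: None };
    let alice = Request { user: Some("alice".to_string()) };
    println!("{:?}", require_login("/login", &anonymous, dashboard));
    println!("{:?}", require_login("/login", &alice, dashboard));
}
```

The endpoint is chosen before the precondition runs, so a failed check produces a response instead of a second routing attempt.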

Also, I realize now that the post was not sufficiently clear: the router design here has no notion of fallback and the order in which routes are added has no effect on the matching process. Those choices are tied to the use of a separate mechanism for preconditions.

EDIT: I've now updated the post to make some of these points more clearly.

I hope I didn’t miss anything in the post but I wanted to share an idea about middleware functions. As it stands, I am not sure how this could fit into something that implements the Endpoint trait. It may be good to explore offering an API similar to Go’s http.Handler interface. This way you could chain them together to share common features like authentication or rate limiting.

I’m not sure how this could fit into the current API or how one could seemingly terminate a request early in the service.

I've got a middleware design in mind, but it was too much to put into this one post. Actix-web in particular has already explored this space, and something similar works fine with this API sketch. If I get some time I can sketch it out here; ultimately though it wants a post of its own!

That's what I'd like to avoid; I love Rust's static typing, and Warp's ability to catch routing issues at compile time is extraordinary.

As far as I can tell, the only downside of that approach is that if you don't want to use macros, you'll have to use either a builder pattern or a cleverly overloaded / operator to write paths. But I'm completely OK with that.

Are there any other downsides, apart from the possibility of verbosity in route construction (which I think we can easily address)?
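
To make the "overloaded / operator" option concrete, here is a small sketch. Every name in it (`Path`, `Param`, `param`, `root`, `call`) is invented for the example and is not Warp's or Tide's API, and it only supports a single typed parameter to stay short:

```rust
use std::marker::PhantomData;
use std::ops::Div;
use std::str::FromStr;

/// One step of a path: a literal to compare, or a slot to parse.
enum Segment {
    Literal(&'static str),
    Slot,
}

/// A path whose type parameter records the parameter types it captures.
struct Path<Args> {
    segments: Vec<Segment>,
    _args: PhantomData<Args>,
}

/// Marker for a typed parameter, e.g. `param::<u32>()`.
struct Param<T>(PhantomData<T>);

fn param<T>() -> Param<T> {
    Param(PhantomData)
}

fn root() -> Path<()> {
    Path { segments: Vec::new(), _args: PhantomData }
}

// `path / "literal"` keeps the captured types unchanged.
impl<Args> Div<&'static str> for Path<Args> {
    type Output = Path<Args>;
    fn div(mut self, lit: &'static str) -> Path<Args> {
        self.segments.push(Segment::Literal(lit));
        self
    }
}

// `path / param::<T>()` turns a parameterless path into one capturing `T`.
impl<T> Div<Param<T>> for Path<()> {
    type Output = Path<(T,)>;
    fn div(mut self, _: Param<T>) -> Path<(T,)> {
        self.segments.push(Segment::Slot);
        Path { segments: self.segments, _args: PhantomData }
    }
}

impl<T: FromStr> Path<(T,)> {
    /// Run `handler` if `url` matches; the handler's argument type must be `T`.
    fn call<R>(&self, url: &str, handler: impl Fn(T) -> R) -> Option<R> {
        let parts: Vec<&str> = url.trim_matches('/').split('/').collect();
        if parts.len() != self.segments.len() {
            return None;
        }
        let mut captured = None;
        for (segment, part) in self.segments.iter().zip(&parts) {
            match segment {
                Segment::Literal(lit) if lit == part => {}
                Segment::Literal(_) => return None,
                Segment::Slot => captured = Some(part.parse::<T>().ok()?),
            }
        }
        captured.map(handler)
    }
}

fn main() {
    let profile = root() / "users" / param::<u32>() / "profile";
    // The closure must accept a `u32`; `|name: String| ...` would not compile.
    let reply = profile.call("/users/42/profile", |id: u32| format!("profile of user {}", id));
    println!("{:?}", reply);
}
```

The handler's argument type is checked against the path at compile time, at the cost of a route that reads less like a URL.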

Yes. I think having compile-time checked routes (like we have compile-time checked format!) would be a great thing that would really set the Rust web experience apart from what else is out there. I'd love to not have to wait for the server to start up to find out my routes have a typo.

I think having the route verification at runtime is fine, as long as it can be done fairly quickly. I usually want things done at compile time, but I really dislike Rust macros and the way they hide really complex things behind what seems so trivial. It effectively means learning a new language. While I think Rust should have a compile-time version of this, warp already does it, and the goals of this project focus on generality and teaching. Having complex compile-time computation hampers that when the alternative is simply running the code and seeing if it panics.

I like what I see enough to want to help out, so point me at a repo with issues or something. :)

There's not a big difference between a server that doesn't compile and a server that can't kick up. Either way you're looking at a failed deploy. This mindset seems like a misapplication of typing to me.

There’s a big difference for me between errors caught at compile time and errors caught at runtime, no matter how early or how consistently.

Very much so.

However, as a counterpoint: is Tide targeting a use case where routing configuration can be entirely statically declared at compile time, or can there be some runtime configuration (from a config file, based on other logic or events, etc.)?

I’d love to see static config done really strongly, like the embedded hal uses type traits for pin and resource allocation - but that precludes any kind of dynamic config. It’s worth questioning whether that’s a requirement / objective.

In general, I think the proposal looks very interesting, and would definitely be something that I would consider using for a service. I have two comments that I would like to make:

  1. Alternative routes based on the success of extractors, as in Rocket, is something I really like and feel is a good way to clearly separate different code paths. I understand that it is a matter of taste, and there are pros and cons to both approaches. I would urge that the nice use cases of guards be considered, and APIs for similar cases be investigated. In particular, the use of guards as in https://rocket.rs/guide/requests/#forwarding-guards makes it very clear in the code exactly what case is handled, and no explicit if, which is easy to forget, is needed in every request handler. I’m not saying that request guards are the only way to handle such cases or the way Tide should handle it; what I would like to see is a discussion of how a similar case should be written idiomatically in Tide.

  2. A small, but I believe still significant, issue is the use of .0 to get ownership of the contents of the extractor. While short and understandable for a seasoned Rustacean, I would not want to show a colleague who is not that interested in Rust code that is peppered with .0 everywhere. A method call such as .to_owned() would be much clearer IMHO.
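
For comparison, here is how the alternatives to `.0` might look on a stand-in `Path<T>` newtype. The `into_inner` method is an assumption for this sketch (it follows the naming used by standard-library wrappers such as `Cell::into_inner`) and is not something the post defines:

```rust
/// Hypothetical stand-in for the post's `Path<T>` extractor.
struct Path<T>(pub T);

impl<T> Path<T> {
    /// Named alternative to `.0` for taking ownership of the extracted value.
    fn into_inner(self) -> T {
        self.0
    }
}

// Three equivalent ways for an endpoint to get at the value.
fn with_field(id: Path<u32>) -> u32 {
    id.0
}

fn with_method(id: Path<u32>) -> u32 {
    id.into_inner()
}

fn with_pattern(Path(id): Path<u32>) -> u32 {
    id
}

fn main() {
    println!(
        "{} {} {}",
        with_field(Path(1)),
        with_method(Path(2)),
        with_pattern(Path(3))
    );
}
```

Destructuring in the parameter list avoids both `.0` and a method call, which may read better to people less familiar with tuple structs.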

The proposal generally looks good to me.

One thing that was important to me in earlier web frameworks I worked on (mostly in Python) was modularity of the routing, where a top-level table of contents delegates some part of the URL space to a lower-level table of contents. This makes it possible to write reusable modules that can be “plugged” into applications in a simple way.

I think this would be supported by the current proposal, but just wanted to throw this out there and check that this can work.
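
A sketch of what that kind of delegation could look like, assuming a toy `Router` with a hypothetical `nest` method. It matches on exact paths only and is not the proposal's API:

```rust
use std::collections::HashMap;

type Endpoint = fn() -> &'static str;

/// Hypothetical router, keyed by full path only to keep the sketch short.
#[derive(Default)]
struct Router {
    routes: HashMap<String, Endpoint>,
}

impl Router {
    fn at(&mut self, path: &str, endpoint: Endpoint) {
        let previous = self.routes.insert(path.to_string(), endpoint);
        assert!(previous.is_none(), "route {:?} registered twice", path);
    }

    /// Delegate the URL space under `prefix` to a self-contained sub-router.
    fn nest(&mut self, prefix: &str, sub: Router) {
        for (path, endpoint) in sub.routes {
            self.at(&format!("{}{}", prefix, path), endpoint);
        }
    }

    fn handle(&self, path: &str) -> Option<&'static str> {
        self.routes.get(path).map(|endpoint| endpoint())
    }
}

fn home() -> &'static str {
    "home"
}

fn list_posts() -> &'static str {
    "all posts"
}

/// A reusable module: it knows nothing about where it will be mounted.
fn blog() -> Router {
    let mut router = Router::default();
    router.at("/posts", list_posts);
    router
}

fn main() {
    let mut app = Router::default();
    app.at("/", home);
    app.nest("/blog", blog());
    println!("{:?}", app.handle("/blog/posts"));
}
```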

The syntax for routes is very simple: URLs with zero or more {} segments, possibly ending in a * segment (for matching an arbitrary “rest” of the URL).

Only supporting multi-segment matching at the end can be limiting for some use cases. http://git.nemo157.com/grarr/blob/master/README.md and http://git.nemo157.com/forks/maud/blob/master/README.md are both currently handled via a /*repo/blob/:ref/*path route matcher (* for a multi-segment match and : for a single-segment match). The route recognizer (a fork of conduit's) still guarantees that a unique handler is matched, based on prioritization of the matchers (literal segment > single segment > multi-segment).
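
For readers unfamiliar with this style of matcher, here is a toy sketch of the idea. It is not the conduit-derived recognizer itself: the backtracking match and the tie-breaking rule (compare per-segment ranks left to right, literal before single before multi) are assumptions chosen to keep the example short:

```rust
/// Split a pattern or URL path into its non-empty segments.
fn segments(s: &str) -> Vec<&str> {
    s.split('/').filter(|seg| !seg.is_empty()).collect()
}

/// Backtracking match: `*name` takes one or more segments, `:name` exactly one.
fn matches(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if seg.starts_with('*') => {
            (1..=path.len()).any(|n| matches(rest, &path[n..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((head, tail)) => (seg.starts_with(':') || seg == head) && matches(rest, tail),
            None => false,
        },
    }
}

/// Rank each segment: literal (0) beats single (1) beats multi (2).
fn rank(pattern: &str) -> Vec<u8> {
    segments(pattern)
        .iter()
        .map(|seg| {
            if seg.starts_with('*') {
                2
            } else if seg.starts_with(':') {
                1
            } else {
                0
            }
        })
        .collect()
}

/// Pick the matching pattern with the lowest rank, independent of order added.
fn best_match<'a>(patterns: &[&'a str], path: &str) -> Option<&'a str> {
    let path = segments(path);
    patterns
        .iter()
        .copied()
        .filter(|pattern| matches(&segments(pattern), &path))
        .min_by_key(|pattern| rank(pattern))
}

fn main() {
    let routes = ["/*repo/blob/:ref/*path", "/about", "/:page"];
    for path in ["/grarr/blob/master/README.md", "/forks/maud/blob/master/README.md", "/about"] {
        println!("{} -> {:?}", path, best_match(&routes, path));
    }
}
```

Both README URLs resolve to the same pattern even though the repository part spans one segment in the first and two in the second.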

If there’s a mismatch between the number of {} or * segments and the corresponding Path and Glob extractors in an endpoint, the resource builder API will panic on endpoint registration.

How does this work? If I were to re-implement grarr on Tide I think I would want to implement a custom Repository extractor that takes the first path segment then finds and initializes the libgit2 repository based on the path. How can the resource builder know that this custom extractor will consume a dynamic path segment? (or are custom extractors not a thing?)
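
One conceivable answer, sketched here purely as an assumption, is for the extractor trait to carry an associated constant saying how many dynamic segments it consumes, so that built-in and custom extractors are counted the same way at registration. The trait shape, the `PATH_SEGMENTS` name, and `check_route` are all hypothetical:

```rust
use std::str::FromStr;

/// Hypothetical extractor trait (not Tide's actual API): each extractor
/// declares how many dynamic path segments it consumes.
trait Extractor: Sized {
    const PATH_SEGMENTS: usize = 0;
    fn extract(segments: &[&str]) -> Option<Self>;
}

/// Built-in style extractor for one `{}` segment.
struct Path<T>(T);

impl<T: FromStr> Extractor for Path<T> {
    const PATH_SEGMENTS: usize = 1;
    fn extract(segments: &[&str]) -> Option<Self> {
        segments.first()?.parse::<T>().ok().map(Path)
    }
}

/// Custom extractor in the spirit of the `Repository` idea above.
struct Repository {
    name: String,
}

impl Extractor for Repository {
    const PATH_SEGMENTS: usize = 1;
    fn extract(segments: &[&str]) -> Option<Self> {
        // A real implementation would locate and open the repository here.
        let name = segments.first()?;
        Some(Repository { name: name.to_string() })
    }
}

/// Registration-time check: the pattern's `{}` count must equal what the
/// endpoint's extractors say they consume.
fn check_route<A: Extractor, B: Extractor>(pattern: &str) {
    let wildcards = pattern.split('/').filter(|seg| *seg == "{}").count();
    let consumed = A::PATH_SEGMENTS + B::PATH_SEGMENTS;
    assert_eq!(wildcards, consumed, "route {:?} does not match its extractors", pattern);
}

fn main() {
    check_route::<Repository, Path<u32>>("/{}/commits/{}");
    let repo = Repository::extract(&["grarr"]).expect("one segment supplied");
    let Path(page) = Path::<u32>::extract(&["7"]).expect("numeric segment");
    println!("repo {} page {}", repo.name, page);
}
```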

Can you explain the impactful difference in this specific scenario?

This is not the kind of dogmatism around typing that Rust has applied: our library APIs regularly can panic because they dynamically check something that we have decided isn't worth the trade-off to encode in the type system. The difference here is that the location of the panic (before the server can accept connections) makes it even more similar to a compile failure in its impact.

I wonder, does having a run-time panicking API preclude the ability to layer a compile-time checking API on top of that? If not, I’d say that would be the best of both worlds. No? @withoutboats @josh

Slightly crazier idea: once const fn support gets to the point that compile-time string manipulation is a thing, is there any reason the router logic that leads to this runtime panic wouldn’t become 100% const and therefore “fail at compile time” automagically?
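
A small sketch of that idea, assuming a toolchain where `assert!` is permitted in const contexts. The `placeholder_count` helper and the `HELLO` constant are made up for the example, and this covers only the placeholder count, not a full router:

```rust
/// Count `{}` placeholders in a route pattern; usable in const contexts.
const fn placeholder_count(pattern: &str) -> usize {
    let bytes = pattern.as_bytes();
    let mut count = 0;
    let mut i = 0;
    while i + 1 < bytes.len() {
        if bytes[i] == b'{' && bytes[i + 1] == b'}' {
            count += 1;
            i += 2;
        } else {
            i += 1;
        }
    }
    count
}

const HELLO: &str = "/hello/{}";

// Evaluated by the compiler: if the pattern and the expected number of
// path parameters disagree, the build fails instead of the server panicking.
const _: () = assert!(placeholder_count(HELLO) == 1);

fn main() {
    println!("{} takes {} path parameter(s)", HELLO, placeholder_count(HELLO));
}
```

The same function can still be called at runtime, so one implementation could back both the panicking API and a compile-time layer on top of it.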

This is really exciting - I really like the design so far.

In particular I love the focus on “plain Rust”. I think this could be really valuable for those of us who want to build something more complicated than “hello world” examples and want to be able to customise/offload behaviour into the framework, but don’t have PhDs in type theory or AST manipulation.

I’m a fan of the simple path+method routing. I think it makes it super easy to see what routes exist in the application, and which handler each maps to. The nested routers seem like a nice touch too.

I may be an outlier here, but I also like the lack of fallbacks - in my opinion it results in far fewer surprises, especially if many people are working on an application. I don’t mind having to use Option<Auth> and an extra if statement if it reduces the amount of magic in the application.

I’m not against the idea of type safety on path params in principle, but I would be nervous that trying to make it 100% compile-time safe could end up overcomplicating the design. There’s also a risk of losing the route in the noise, which I think is one of the flaws in the Warp approach.

That said, perhaps being able to name the path params might help describe a route’s intent, even if the name isn’t used for anything else, e.g. /users/{id}/profile

I think the combination of Extractor and IntoResponse could end up being really powerful. I’m imagining you could, for example, create a RequiresAuth<User> extractor which could check if the request has a valid auth token and then either populate the User and continue, or automatically redirect to the login page. Or you could have an IntoResponse which does content-type negotiation to decide whether to return, say, json or xml.
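
Here is roughly how that could look, as a sketch under assumptions: the `Extractor` trait below returns `Result<Self, Response>` so that a failed extraction can short-circuit with a redirect, and the request type, token check, and `run` helper are all invented for the example. The `Option<User>` variant from a few paragraphs up is included for contrast:

```rust
/// Hypothetical request/response types, for illustration only.
struct Request {
    auth_token: Option<String>,
}

#[derive(Debug, PartialEq)]
enum Response {
    Ok(String),
    Redirect(&'static str),
}

/// Hypothetical extractor trait: extraction either yields a value or
/// short-circuits with a response.
trait Extractor: Sized {
    fn extract(req: &Request) -> Result<Self, Response>;
}

struct User {
    name: String,
}

/// Toy token check; a real one would consult a session store.
fn lookup_user(req: &Request) -> Option<User> {
    match req.auth_token.as_deref() {
        Some("secret") => Some(User { name: "alice".to_string() }),
        _ => None,
    }
}

/// Fails extraction with a redirect when nobody is logged in.
struct RequiresAuth<T>(T);

impl Extractor for RequiresAuth<User> {
    fn extract(req: &Request) -> Result<Self, Response> {
        lookup_user(req).map(RequiresAuth).ok_or(Response::Redirect("/login"))
    }
}

/// Never fails: the endpoint handles the logged-out case itself.
impl Extractor for Option<User> {
    fn extract(req: &Request) -> Result<Self, Response> {
        Ok(lookup_user(req))
    }
}

/// Run an endpoint, turning a failed extraction into the response.
fn run<E: Extractor>(req: &Request, endpoint: impl Fn(E) -> Response) -> Response {
    match E::extract(req) {
        Ok(value) => endpoint(value),
        Err(response) => response,
    }
}

fn account(RequiresAuth(user): RequiresAuth<User>) -> Response {
    Response::Ok(format!("account of {}", user.name))
}

fn home(user: Option<User>) -> Response {
    match user {
        Some(user) => Response::Ok(format!("welcome back, {}", user.name)),
        None => Response::Ok("welcome, guest".to_string()),
    }
}

fn main() {
    let anonymous = Request { auth_token: None };
    println!("{:?}", run(&anonymous, account));
    println!("{:?}", run(&anonymous, home));
}
```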

All in all, I’m really looking forward to having a play with this and finding out more about the middleware/extensibility story.

The difference is hitting a “compile” button or typing :make or similar and getting a compile error pointing you right to the problem (or even just having a background compile highlight the issue), versus deploying to a staging environment, getting a panic that hopefully has enough error location information, and then going back to the editor.

I’m not asking to cover every possible scenario with static typing no matter how obscure. I’m saying there’s already a design out there that covers this issue, and I think we can easily make that just as convenient to use.
