Feedback requested: book on API type patterns

willcrichton · February 3, 2021, 6:41pm

Over the last year, I've written many individual essays about typestate, type maps, type-level programming, heterogeneous lists, and session types. I'm excited by the possibilities that Rust provides to design better APIs that enforce invariants at compile-time. I've also noticed that many new frameworks like warp and bevy are making heavy use of type-driven design patterns.

I wanted to collect my thoughts into a single semi-cohesive book that can serve as a reference for advanced API design patterns. Specifically, my goals for the book are:

Provide concrete examples for individual design patterns like guards and typestate, along with a discussion of the pros/cons of each approach.
Teach a broader mindset of how to use Rust's type system to improve API design.

Here's the link to my working draft:

https://willcrichton.net/rust-api-type-patterns/

Note: all content/structure is preliminary and not intended for wide public distribution. I'm sharing it here for early-stage feedback. Please don't put this on reddit/HN/etc

I'd like to get feedback on a few points:

Topics: do you have any more topics or design patterns that you think would fit for this book?
Structure: do you think the book's approach to teaching design patterns is a reasonable balance between usefulness and brevity?
Content: any specific feedback on the individual chapters? Do you think the more abstract discussion about representational principles ("The Main Idea") is useful?

H2CO3 · February 3, 2021, 8:52pm

I've presented my favorite approach to wrapping HTTP APIs in a type-safe and boilerplate-free manner a couple of times over at URLO. I haven't read all of the posts above, but I think it would have been relevant in one of the first three, so I believe you haven't mentioned this pattern yet. I think it's a piece of valuable practical advice that you might like to write up in more detail than I did.

willcrichton · February 3, 2021, 9:32pm

That's a neat architecture, I'll have to think about distilling out the individual mechanisms.

Also I completely forgot URLO existed... I should probably cross-post this there.

EdmundsEcho · February 4, 2021, 1:06am

@willcrichton I like the unifying principle of the book - use the type system to express the design. To maximize how the contribution might complement "prior art", if you have not already done so, definitely take a look at Rust Design Patterns. There is lots of room for unique contributions and "angles of attack"; this may be fodder to further contrast/complement what you are doing.

Having read through the draft here's what came to mind:

There seems to be a reasonable flow and build of ideas. Here's where it "got rough":
- The Hlist example seemed overly complex. warp uses Hlists but...
  - I'm unclear as to why the Rust tuple doesn't solve the problem as presented (much more flexible than what Haskell has to deal with for instance),
  - similarly, nor why the zip function doesn't do most of what you need (include an assertion that fails at compile time if the lists somehow end-up with different lengths)
  - finally, when I learned about Hlists it was to demonstrate how type families (available in Rust) can be an elegant way to treat different types in some related manner
- The sections 4, 5 and 6 seem a little disjointed; I lost the building thread.
If the focus is going to be on the use of the type system, I wonder if there is a way to go from how to limit/constrain input using the type system (as you did in the Color example), to how to unify your types. After all, this is a unique challenge that belongs only to strongly typed languages: how do I leverage a computation/process that is relevant for several related, but now distinct, types? It's difficult "to go deep" with types if it subsequently prevents code re-use. The easiest "out" is the use of enum + match, then generic types... traits, trait objects. Perhaps it could include a widely used pattern in Rust: a "go-between" struct that has methods or implements a trait, but more importantly in this vein, a "go-between" that disparate types can unify upon.
I'm not sure I saw an explicit demonstration of how to use traits for "type-level programming"; the use of PhantomData was good.
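
As a rough illustration of what "traits for type-level programming" can mean, here is a minimal sketch in which a trait with an associated constant computes the length of a heterogeneous list entirely from its type. The names `HNil`, `HCons`, `HList`, and `len_of` are assumptions made for this sketch and are not taken from the book draft or from warp.

```rust
/// The empty heterogeneous list.
struct HNil;

/// A list cell holding a head of type `H` and a tail list of type `T`.
struct HCons<H, T>(H, T);

/// Trait used as a type-level function: each impl is one "equation".
trait HList {
    /// Number of elements, known at compile time.
    const LEN: usize;
}

// Base case: the empty list has length 0.
impl HList for HNil {
    const LEN: usize = 0;
}

// Recursive case: one more element than the tail.
impl<H, T: HList> HList for HCons<H, T> {
    const LEN: usize = 1 + T::LEN;
}

/// Reads the length from the type; the value is only used for inference.
fn len_of<L: HList>(_list: &L) -> usize {
    L::LEN
}
```

Calling `len_of(&HCons(1u8, HCons("two", HNil)))` would evaluate to 2, and the trait bound `T: HList` is what lets the compiler recurse through the type.
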
Dependency injection: I was "raised to believe" that the simplest form of the pattern is first-class treatment of functions as values. With Rust we can pass around both functions and closures. At minimum, if it aligns with how you think about it, I might ground the discussion with this idea.
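
A minimal sketch of dependency injection in this "functions as values" sense might look like the following. The function `total_cost` and its `price_of` parameter are hypothetical names invented for the example; the point is only that the caller decides where prices come from.

```rust
/// Business logic that needs "some way to look up a price" but does not
/// care whether that is a database, a cache, or a test stub.
fn total_cost(items: &[&str], price_of: impl Fn(&str) -> u32) -> u32 {
    // The injected closure is called once per item.
    items.iter().map(|&item| price_of(item)).sum()
}
```

In a test, the dependency can be a closure with fixed answers, for example `total_cost(&["apple", "pear"], |name| if name == "apple" { 3 } else { 5 })`.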

carols10cents · February 4, 2021, 2:39am

I know some of what you have here touches on these topics, but exploring more uses of generics, traits, associated types, trait bounds, blanket impls, extension traits. Another good place to mine for ideas would be object-oriented design books, especially in dynamically-typed languages, to see how those patterns might translate into statically-typed guarantees. I think it would be super compelling for, say, experienced Python or JavaScript or Ruby developers to learn Rust in relation to design patterns and APIs they understand.

I think so. I think it'd be a useful exercise for you, and useful information for readers, to define who the ideal reader of this is that you have in mind: the level of experience in Rust and other programming languages they have, the kind of tasks they're trying to accomplish, and the knowledge they're looking to gain. In other words, I'm not sure if I'm your ideal reader or not, so I'm not sure if you've hit the right balance between usefulness and brevity.

I liked "The Main Idea" chapter: I didn't find it super abstract like I find some writing on type theory topics to be. The concrete code examples in that chapter, with clear, real-world use-cases, really helped keep it from getting too abstract. I think it'll be fine as long as you stay away from Greek letters, LaTeX formulas, and terms I often see tossed around in Rust language design like "existential" and "soundness" that might be appropriate but only after thorough and concrete definitions.

It also sounds like in that chapter you're discussing connascence but I don't see that word-- not sure if that was deliberate or not! This article by Greg Brown, which links to talks by Sandi Metz and Jim Weirich, might also be relevant to the kinds of ideas you're exploring.

willcrichton · February 4, 2021, 3:27am

Great point, I'll take a look at some other design books for inspiration.

So nominally this is intended for intermediate-to-advanced Rust users who are designing libraries for other Rust users. But I do wonder if this might also be useful for people using type-driven APIs. I've talked to people who find frameworks like bevy and warp confusing because they have (to them) unusual design choices.

I had not heard of this! Thanks for the reference, I'll read into it. It sounds vaguely like the concept of viscosity in the cognitive dimensions of notation, but more specific and nuanced to API design.

josh · February 4, 2021, 7:28am

You might consider discussing the actor pattern: data owned by a task or thread, accessed by sending and receiving messages.
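
For readers unfamiliar with the pattern, a minimal sketch using only the standard library could look like this. The `Msg` enum and the `spawn_counter` and `read_total` helpers are assumptions made for illustration; real actor frameworks differ in many details.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Messages understood by the counter actor.
enum Msg {
    /// Add a value to the running total.
    Add(u64),
    /// Ask for the current total; the answer is sent on the enclosed channel.
    Get(Sender<u64>),
}

/// Spawns a thread that exclusively owns `total` and returns its mailbox.
fn spawn_counter() -> Sender<Msg> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut total = 0u64; // never shared, so no lock is needed
        // The loop ends when every sender has been dropped.
        for msg in rx {
            match msg {
                Msg::Add(n) => total += n,
                Msg::Get(reply) => {
                    // Ignore the error if the requester went away.
                    let _ = reply.send(total);
                }
            }
        }
    });
    tx
}

/// Convenience wrapper for the request/reply round trip.
fn read_total(counter: &Sender<Msg>) -> u64 {
    let (reply_tx, reply_rx) = mpsc::channel();
    counter.send(Msg::Get(reply_tx)).expect("actor has stopped");
    reply_rx.recv().expect("actor dropped the reply channel")
}
```

Because messages from a single sender arrive in order, a `Get` sent after two `Add` messages observes both additions.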

djc · February 4, 2021, 8:22am

I quickly looked over the book's contents and I think there's a bunch of good stuff there. Some half-baked thoughts:

I am unlikely to use the HList stuff because the ergonomics look unattractive to me, what with all the macros and highly abstract resulting types. Also "Parallel lists" don't seem like a use case that warrants a solution this complex.
I really liked the admin panel example, hadn't thought about it that way before.

Some examples from my own projects that might be interesting:

The Approval type (and ApprovalIter) from bb8 allow me to do access control in one place and consume it in another
The imap-proto CommandBuilder is more or less a classic typestate variant of the builder pattern
The instant-distance PointId is an example of just what you can do with an integer-wrapping newtype.
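
To make "typestate variant of the builder pattern" concrete, here is a small sketch. It is not the imap-proto API: `Request`, `NeedsTag`, and `Tagged` are hypothetical names, and the wire format is invented. The idea is that `command` only exists once a tag has been supplied, so forgetting the tag is a compile error rather than a runtime check.

```rust
use std::marker::PhantomData;

/// State marker: no tag has been written yet.
struct NeedsTag;
/// State marker: a tag has been written, a command may follow.
struct Tagged;

/// Builder whose available methods depend on the `State` type parameter.
struct Request<State> {
    line: String,
    _state: PhantomData<State>,
}

impl Request<NeedsTag> {
    fn new() -> Self {
        Request { line: String::new(), _state: PhantomData }
    }

    /// Consumes the untagged builder and returns one in the `Tagged` state.
    fn tag(mut self, tag: &str) -> Request<Tagged> {
        self.line.push_str(tag);
        Request { line: self.line, _state: PhantomData }
    }
}

impl Request<Tagged> {
    /// Only callable after `tag`; finishes the line.
    fn command(mut self, name: &str) -> String {
        self.line.push(' ');
        self.line.push_str(name);
        self.line
    }
}
```

With this shape, `Request::<NeedsTag>::new().tag("a1").command("NOOP")` builds `"a1 NOOP"`, while calling `command` directly on a fresh builder does not type-check.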

2e71828 · February 4, 2021, 12:06pm

I'm touching on some of these same points in my MS thesis. The framing that I came up with was of moving preconditions out of the documentation and into something that the compiler can verify. That puts these ideas in terms that most programmers will be familiar with. It also helps explain why safe functions can contain unsafe blocks: Instead of programmer discipline, the conditions necessary to execute the code safely are being enforced by the compiler.
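
A small sketch of that framing, under the assumption of a hypothetical `NonEmpty` wrapper: the precondition "the vector is not empty" moves from documentation into the constructor, which is what justifies the `unsafe` block inside an otherwise safe method.

```rust
/// A vector that is guaranteed to hold at least one element.
pub struct NonEmpty<T>(Vec<T>);

impl<T> NonEmpty<T> {
    /// The only way to build a `NonEmpty`; the check happens once, here.
    pub fn new(items: Vec<T>) -> Option<Self> {
        if items.is_empty() {
            None
        } else {
            Some(NonEmpty(items))
        }
    }

    /// Safe to call: the invariant is enforced by the type, not the caller.
    pub fn first(&self) -> &T {
        // SAFETY: `new` rejects empty vectors and no method removes elements,
        // so index 0 is always in bounds.
        unsafe { self.0.get_unchecked(0) }
    }
}
```
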

I don't know if there are any generalizable design patterns in my project that you aren't already covering, but I'd be happy to talk about anything in there if you think it'd be useful. The main part of my project has been modeling relational algebra within Rust's type system. To do this, each column domain is represented by a newtype. Record and table headers are an HList of these, and the query planning is done via type-level programming. To keep the proliferation of type bounds in check, I implemented a lisp interpreter inside the type system (similar in power to tyrade).

The net effect of all this is a semantic-level query interface for collections, instead of structural: The program logic doesn't need to change when the storage layout changes. Because all of the query planning is inside the type system, the emitted machine code for each query should be comparable to what you'd write by hand.

carols10cents · February 4, 2021, 1:51pm

I think that point about understanding how to use type-driven APIs is a good one. I'd encourage a definition of "libraries for other Rust users" that's a bit more broad: there are a TON of Rust programmers who are, ultimately, working on applications and won't ever publish a general-purpose library to crates.io. However, within those applications are functions, structs, modules, workspace crates, etc that are abstractions used within the application where these type system ideas can absolutely be relevant. Working on a large application with a team might be what you meant by "designing libraries for other Rust users", but I'd make it explicit that this doesn't necessarily mean a crate on crates.io, and your internal application's API is an API that is worthy of design and can benefit from type constraints too. I just don't want potential readers to disqualify themselves because they aren't OSS library developers.

atagunov · February 4, 2021, 11:42pm

my first introduction into type-based programming and HList - very useful/readable; dependency-injection exactly what a Java guy would expect; nice food for thought

EdmundsEcho · February 5, 2021, 1:18am

I have this horrible habit of perseverating on ideas when they overlap with concepts I'm working to internalize/intuit more myself. I also have a strong bias that iterating over the core/simple concepts, is an efficient path to tackling the "bigger stuff". It has truth, because the bigger is never more than some combination or projection/inference of the simpler.

I've gone ahead and articulated several ideas for how to expand on your introduction. I've expanded on expressing the value of the approach and a concrete method that might be a useful routine for how to apply what you present, elsewhere.

While I think what I'm saying is useful, I may have over-simplified to a point where it needs to be "called-out"; so a clear take is my trying to be precise, but is done so in the spirit of a churning of ideas.

Target audience

Per your target audience, I imagine the reader knows what they are doing. The person codes, a lot. How much they know Rust versus say C and a strongly typed language other than Rust, is less critical. The proficient Rust user, is fluent with "just doing it" and might benefit from something more than the often dominant, implied translation C++ -> Rust.

The insights are likely less from "I've never seen this before" but rather, "I never thought of it that way". Insights from simple examples that both broaden, and facilitate a complete thought process, that in doing so, articulate a useful "so what" that can be missed (or forgotten) when "just doing it" (even at peak effectiveness). Quick aside: there is this place, where I can know something really well, so well I know how to implement whatever I'm thinking in the moment. That code can often be difficult to maintain. The target audience is always open to a replay of how the simple, often occurring patterns, can apply more broadly than what we remembered.

Use the simplicity of the `fn color_to_rgb` to present a broader view

The introduction using A prototypical example: enums over strings is a good example because it quickly gets the reader into the right mindset for what to expect. The expectation being that you are now going to take what we know to be a good thing (per the example used), to a new level of understanding and ability to apply.

Here are some additional thoughts on how to convey the value of the approach.

One way to articulate what the type system allows for in a way that frames the subsequent demonstrations might be:

The type system enables a precise alignment between what the compiler will admit as valid input to a function, with values the function is equipped to process.

Safety and expressiveness

The Rust safety features and "fearless concurrency" are clearly touted and realized benefits of this capacity. However I suspect, "expressiveness" and "fearless refactoring" have not received their due, if only because the "top-line" buffer is already full (not to mention, the bandwidth required to help navigate the borrow checker).

Given this, the introduction might talk more about the expressiveness enabled by the type system. Concretely, the ability to align what a function commits to doing, with what it actually does, depends on the finesse, the precision of the type expression.

every function inherently specifies the set of valid "something it can process" inputs by definition of the implementation (implementation => valid inputs); not the other way around.

This fixed set of inputs is often referred to as a function's domain (related but not the same as "domain" in "domain-driven design"). The "so what" here is that in many ways this is all you need to describe and ensure "safe code": align the function's domain with what the compiler admits as input. Done.

`Option` might be a flag for needing to do more with the type system

What the use of Option and the like (Result) allows us to gloss over, is that the reason for using it has more to do with a failure to align what the compiler admits as input with what the function is inherently equipped to process. For instance, using i32 to describe valid input for the denominator of a function that computes the value of a fraction. The "immutable truth" is that the domain for the function 1/denominator is not the set of integers in i32 but of course, a type NonZero that can only be instantiated accordingly.

The gap between a function's domain and the set of inputs permitted by the design, is an opportunity for the type system to "do the work for you" of aligning inputs with the function's domain.

The key idea, and likely a useful way to exploit the power of the type system, is to use it to precisely align the inherent truth of what the function is meant to process with the inputs it admits. From this perspective, as useful as Option is, it must be seen for what it is: the equivalent of unsafe code in that it is infinitely useful, but sometimes "a shortcut". No matter, it should be isolated, and tagged as precisely as possible. Admittedly, and to the point, even when using NonZero the application might still require the use of Option, now just further "up-stream" when trying to instantiate NonZero from raw input.
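
The denominator example can be sketched with the standard library's `std::num::NonZeroI32`. The function names `reciprocal` and `parse_denominator` are assumptions for illustration; note how the `Option` moves "up-stream" to the single place where raw input is turned into the narrower type.

```rust
use std::num::NonZeroI32;

/// The domain of 1/d is exactly the non-zero integers, and the signature
/// says so: no `Option` or `Result` is needed in the return type.
fn reciprocal(denominator: NonZeroI32) -> f64 {
    1.0 / f64::from(denominator.get())
}

/// The boundary with untyped input: this is where failure is possible.
fn parse_denominator(raw: &str) -> Option<NonZeroI32> {
    let n: i32 = raw.trim().parse().ok()?;
    // `NonZeroI32::new` returns `None` for zero.
    NonZeroI32::new(n)
}
```

Composing the two, `parse_denominator(raw).map(reciprocal)` yields an `Option<f64>` whose `None` case can only come from the parsing step.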

Where to find a missing type

In both versions of the color_to_rgb function, the Rust type system enforces alignment of the function's domain with the set of permitted inputs. The key is in the how. In the first (_bad), the trick was to expand the domain of the function compared to the other (_good). This was accomplished with an implementation that could process

junk part of the domain -> None

The second (_good) approach instead used the type system to more narrowly specify the set of admissible inputs; a specification that permitted a more precise expression of the intent. In other words, the type system closed the gap between admitted inputs and the domain of the more precise function.

What follows may be too granular, but it works for the example. How do you decide which version of a function is "_bad" vs "_good" in a way that reveals opportunities more generally to empower the type system "just that much more"?

The process can become slippery (circular) because "how can we not" think about the inputs the function needs to process whilst designing the body of our functions? Isn't building to an interface a well established best practice; something the function type signature ultimately expresses?

The source of immutable truth is not related to an "interface"

To better understand the trade-offs in how to close the gap, without getting too circular anyway, I have found the following to be useful. First, where can I find "a hard truth", "a single truth" about the task. As mentioned, the body of the function dictates the function's domain, while they are inseparable, one defines the other; the implementation implies the function's domain.

How does this square with "designing to an interface"? I might argue that the "_bad" is good if I had to design to an interface that accepts as input &str or worse, some untyped "goodness only knows" input coming from the wild-wild "untyped" world out there.

Information is like matter; it doesn't come from "thin air"

More peeling of the onion can give us a means to better understand if options exist that both meet the spec of the interface, whilst benefiting from a design inspired by inherent qualities of the "core computation". Where the latter involves creating types that align with the "natural" offshoots of the computation (an immutable truth of sorts).

On the other hand, to build to an interface, think composition of functions. To figure out which functions, start with the ones that generate the most "new information" by what is hosted in the body of the function, the most computationally-driven functions; only those functions hold some inherent truth about the task. Framed as such, likely more so than the interface being built to.

`_bad` is relatively so because it built to an interface (too early anyway)

The design trap the "_bad" function fell into, was to build to an interface that accepts &str as input. The implementation of the function was "interface driven". The details of the implementation clearly show that there are two things going on: input -> (valid | invalid); valid -> (u8,u8,u8).

The "aha" moment for how the type system can help, is to better understand how "_good" encapsulates valid -> (u8,u8,u8) using the type system to align admitted input with the "inherent" domain of the function. It resonates as a design, because it describes where the computation generates new information: encoding the name of a primary color into its corresponding rgb format.

_good is relatively so because it is more precise despite being more verbose

With this "source of truth" now articulated, the "build" to an interface design is likely that much better for it.

// we learned the value of PrimaryColor by understanding the inherent truth
// that comes from this computation
fn color_to_rgb(color: PrimaryColor) -> (u8, u8, u8) { .. }

// "fill in" the gaps to meet the need of the interface
// ...must accept &str as input
fn try_from(raw_input: &str) -> Option<PrimaryColor> { .. }

// compose to fulfill the commitment to the interface
fn try_color_to_rgb(raw_input: &str) -> Option<(u8, u8, u8)> {
    let color = try_from(raw_input)?;
    Some(color_to_rgb(color))
}

It's worth noting that this version is by no means more succinct; likely the opposite. However, where our computations add the most value likely overlaps with where being more precise with the type definitions makes sense. This is especially true if we consider the power of type inference that by definition of "proof by construction" (see witness later) gives us a capacity to "refactor with confidence".
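
For completeness, a self-contained version of the same composition might look like the sketch below. The three variants, the lowercase string names, and the helper name `parse_color` are assumptions made here; the book draft's actual definitions may differ.

```rust
/// The narrow input type discovered from the "core computation".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PrimaryColor {
    Red,
    Green,
    Blue,
}

/// Total function: every admitted input has a result.
fn color_to_rgb(color: PrimaryColor) -> (u8, u8, u8) {
    match color {
        PrimaryColor::Red => (255, 0, 0),
        PrimaryColor::Green => (0, 255, 0),
        PrimaryColor::Blue => (0, 0, 255),
    }
}

/// The only place where invalid input has to be handled.
fn parse_color(raw_input: &str) -> Option<PrimaryColor> {
    match raw_input {
        "red" => Some(PrimaryColor::Red),
        "green" => Some(PrimaryColor::Green),
        "blue" => Some(PrimaryColor::Blue),
        _ => None,
    }
}

/// Composition that satisfies an interface requiring `&str` input.
fn try_color_to_rgb(raw_input: &str) -> Option<(u8, u8, u8)> {
    parse_color(raw_input).map(color_to_rgb)
}
```
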

Hotel California

Finally, I think this more complete version of the example tees-up how a strictly typed universe provides a richer set of tools for interacting with the wild-wild untyped world. However, in exchange for safe, runtime code, we have what is likely a more burdensome and inescapable task of needing to deal with fn (wild: &[_]) -> Result<T, E> in our design. Like unsafe code, and the example above, the best we can do is isolate/encapsulate precisely where that logic needs to be. A big net win, and one "reading this book" will make that much more so. If "you can never leave", I'd rather face my truth with types :))

H2CO3 · February 5, 2021, 7:45am

I love these ideas.

djc · February 6, 2021, 10:12pm

I was also reminded of Parse, don't validate.

scottmcm · February 6, 2021, 10:25pm

Or something similar, but not in Haskell: Aiming for correctness with types

(I've tried sharing "Parse don't validate" with people but often they seem to get the impression it wouldn't work in whatever language they're using.)

EdmundsEcho · February 7, 2021, 12:26am

Perhaps key to the responses provided was "whatever language they're using". The only experience I have with strongly typed languages, is with statically typed languages (Rust and Haskell in my case). Having "drunk the Kool-Aid" I have tried to apply the concepts in Python and JS/React. Without type inference and compile-time checks, I concur with the quoted sentiment. This all said, type annotations to inform the Python and JS (TypeScript or FlowType) linting/transpiling, are surprisingly not bad. Notwithstanding, meaningfully handicapped without it being a built-in feature.

I would be curious to get your impression on whether you think the most limiting factor to using a strongly-typed language is creating the different types, or code-reuse once those types are in-place? (or perhaps "other")

PS: @djc The link was a good read; attributing "parsing" to "validating" is a useful way to think about it. It also reminded me of how often I find myself having to "assert" far too much throughout my Python codebase. I probably could assert :)) to do more of what the article suggests.

H2CO3 · February 7, 2021, 7:26am

This! Incidentally, it's exactly what Serde does.

matklad · February 7, 2021, 11:38am

I have two bits of advice:

First, it's good to mention cognitive and compile time costs. For example, the witness type is essentially free, typestate requires some head-scratching & compiling, and HLists invoke fond memories of C++ template instantiation errors. warp might be an interesting example here, as, I expect, at least some people will agree that routing is not the bit where you get a lot of leverage out of the type system.

Second, it would be nice to have a couple of examples of pedestrian, enum-based state machines. First, because it just seems a low-cost, high-impact mechanism. Second, because I actually have an issue with coding them up in Rust.

For many SMs, the natural signature is to accept &mut Self. The typical pattern is to accumulate events in some kind of main working state, until you transition into a final state. It is also common to hold the SM as a field in some bigger context struct. And here, you hit the problem. If the transition function uses &mut self, it becomes harder to move data from one state to another. If it takes self by value, the common case of no transition becomes awkward, and the owning context has to go into a transiently invalid state to actually call the transition function:

struct Context {
  state: Option<State>,
  // many more fields
}
impl Context {
  fn on_event(&mut self, event: Event) {
    let old_state = self.state.take().unwrap();
    let new_state = old_state.transition(event);
    self.state = Some(new_state);
  }
}

ckaran · February 8, 2021, 2:13pm

Note to self: @matklad has a very different definition of the word 'fond' than I'm familiar with... (PTSD from Boost template metaprogramming errors...)

scottmcm · February 8, 2021, 5:11pm

There is one way to spin this that it can be considered an advantage: the match mem::replace(&mut self.state, State::Faulted) { pattern means that an accidental panic! in the implementation leaves it in an error state, which one could argue is better than it being only half-updated.
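
A minimal sketch of that `mem::replace` variant, assuming a hypothetical `State` enum with an explicit `Faulted` variant (the `Collecting`, `Done`, and `Event` definitions are invented for the example):

```rust
use std::mem;

#[derive(Debug, PartialEq)]
enum State {
    /// Main working state: accumulates items.
    Collecting(Vec<u32>),
    /// Final state: holds the sum of everything collected.
    Done(u32),
    /// Placeholder left behind while a transition is in progress.
    Faulted,
}

enum Event {
    Item(u32),
    Finish,
}

impl State {
    /// Takes the state by value so data can move between variants.
    fn transition(self, event: Event) -> State {
        match (self, event) {
            (State::Collecting(mut items), Event::Item(n)) => {
                items.push(n);
                State::Collecting(items)
            }
            (State::Collecting(items), Event::Finish) => State::Done(items.iter().sum()),
            // Any other combination leaves the state unchanged.
            (other, _) => other,
        }
    }
}

struct Context {
    state: State,
    // many more fields
}

impl Context {
    fn on_event(&mut self, event: Event) {
        // If `transition` panics, `self.state` stays `Faulted`
        // instead of being half-updated.
        let old_state = mem::replace(&mut self.state, State::Faulted);
        self.state = old_state.transition(event);
    }
}
```

Compared with the `Option<State>` version, the field type no longer needs unwrapping on every access, at the cost of one extra variant that every `match` on `State` has to consider.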
