[Pre-RFC] Cargo Templates

It's getting late and I'm a bit too fuzzy to respond to everything, but I'll respond to what I can now and the rest later.

That doesn't seem accurate to me. A template's Cargo.toml is probably going to include dependencies as literal values... maybe set up some features... maybe tune a few other things too.

Here's the actual Cargo.toml from the template where the justfile's use of {{...}} led me to discover the problem with the original templating implementation:

[package]
name = {{toml-escape name}}
version = "0.1.0"
authors = [{{toml-escape author}}]

[dependencies]
clap = "2"

[dependencies.error-chain]
version = "0.9"
default-features = false  # disable pulling in backtrace

[profile.release]
lto = true

# Uncomment to sacrifice Drop-on-panic cleanup for 20K space saving
#panic = 'abort'

# We need to specify this explicitly so sed can swap the 3 for "z" in
# release.sh when running with --nightly
opt-level = 3

[features]
nightly = []

I doubt that's what you meant by "nothing except for some {{ interpolation }} keys".
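As an aside on the `{{toml-escape ...}}` helper used in the manifest above, here is a minimal sketch of what such a helper has to do. The name `toml_escape` and its exact behaviour are assumptions made for illustration, not the implementation used by the templating tool under discussion. It renders a value as a quoted TOML basic string.

```rust
/// Render `value` as a quoted TOML basic string.
/// Hypothetical helper: not taken from any existing templating tool.
pub fn toml_escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for c in value.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters are written as \uXXXX escapes.
            c if (c as u32) < 0x20 || c == '\u{7f}' => {
                out.push_str(&format!("\\u{:04X}", c as u32));
            }
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

With a helper like this, `name = {{toml-escape name}}` would expand to `name = "my-crate"` for a project called my-crate, and an author name containing quotes would still produce a valid manifest.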

I think this is one of the details where we need more voices.

Given my use cases, I see single-template repos as a common use case because they allow me to comfortably present templates such as the following examples as independent, top-level offerings, each with its own README.md and GitHub project listing:

  • My current "CLI utility boilerplate" template
  • A "rust-cpython library boilerplate" template

I also see single-template repos as potentially being a friendlier, more readily understandable on-ramp for users who have never created a template before.

That said, none of that is directly against a design that enables multi-template repos... just in favour of being careful about a design's effects on the single-template case.

My concerns are that:

  1. I can still see that causing a non-zero amount of unnoticed mangling if, for example, someone includes the text of their Cargo.toml in a code block inside something not intended to be templated, such as an INSTRUCTIONS.md that is meant to be read after generating a new project or any kind of documentation containing code-snippets in a meta-template repo. (As I remember, Jekyll (and, thus, GitHub Pages) only templates files that begin with a front matter block for just this reason.)
  2. Rust generally aims for an "explicit, but not onerous" design, but interpolate_patterns = ["*"] feels a bit too much like the un-Rusty aspects of the Ruby on Rails "convention over configuration" philosophy to me.
  3. We're aiming for comfortable defaults. Do you really want to force anyone who's writing a web application or using just for build automation to have to override the default? It doesn't feel like the right balance to strike.
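To make the opt-in alternative concrete, here is a rough sketch of a matcher under which only files named by the template author are interpolated and everything else passes through untouched. The function names and the tiny pattern language (at most one `*` wildcard) are assumptions made for illustration, not a proposed Cargo API.

```rust
/// Match `path` against a pattern containing at most one `*` wildcard.
/// Deliberately tiny pattern language, for illustration only.
fn matches(pattern: &str, path: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard: require an exact match.
        None => pattern == path,
        // One wildcard: the path must start with the prefix and end with
        // the suffix, and the two must not overlap.
        Some((prefix, suffix)) => {
            path.len() >= prefix.len() + suffix.len()
                && path.starts_with(prefix)
                && path.ends_with(suffix)
        }
    }
}

/// A file is interpolated only if some listed pattern matches it.
pub fn should_interpolate(path: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|p| matches(p, path))
}
```

With `["Cargo.toml", "src/*.rs"]`, a justfile or an INSTRUCTIONS.md is never touched, whereas `["*"]` opts every file in, including ones that merely quote `{{...}}` syntax.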

OK, here's the stuff I didn't feel like thinking about last night:

I agree. Let's get something simple and useful first, and then the design can be extended later.

The reason I'd prefer to see it all in one URL is that we already have a lot of experience recognizing, storing, and processing URLs, so:

  1. From a "passing a value around" standpoint, a single string, usable as an opaque token, provides a more robust way to reference a specific template.
  2. I worry that --template-name makes it more difficult to teach. I'd be in favor of http://url/to/repo#template as the easiest to teach and perfectly in line with what the relevant specs say about fragment identifiers. (Treating them like query strings is just an artifact of "AJAX in the days before history.replaceState and history.pushState")
  3. There's no concern about miscommunicating how to reference a specific template within a repo. It's just a URL like any other.
  4. I can imagine situations where calling cargo inside another script/tool could be fragile in the face of requiring both a URL and an option like --template-name while having a single URL for the entire reference is hard to mess up.
  5. We know how to typeset URLs. Typesetting "URL + random option value" pairs is an unknown.
  6. When working with humans, it's easy to imagine someone noting down or copying only the URL, but not the template name.
  7. The whole point of a URI (ie. URL or URN) is to act as an identifier to reference a specific resource. It runs against the spirit of the thing if you make it only a part of the identifier. (ie. As a successor to "two-part" identifiers like "FTP to 1.2.3.4 and cd to /srv/algol/project_template")
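As a small illustration of the "single opaque string" argument, a tool could pass the whole reference around untouched and split it only at the point of use. The function name `split_reference` is hypothetical, and treating an empty fragment as absent is a choice made for this sketch.

```rust
/// Split a template reference into the repository URL and an optional fragment.
/// An empty fragment (`repo#`) is treated the same as no fragment at all.
pub fn split_reference(reference: &str) -> (&str, Option<&str>) {
    match reference.split_once('#') {
        Some((repo, fragment)) if !fragment.is_empty() => (repo, Some(fragment)),
        Some((repo, _)) => (repo, None),
        None => (reference, None),
    }
}
```

The same function handles HTTPS URLs and the scp-style SSH syntax alike, since neither is inspected beyond the first `#`.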

I haven't really thought about this much, but my current stance is to prefer Template.toml for the following reasons:

  1. Requiring a separate Template.toml could be a good way to make "you must use a subfolder, even for a single-template repo" an intuitive requirement, which, in itself, would make for a more regular, intuitive behaviour for template repos:
       • Stuff inside the template folder passes through, whether or not it's interpolated, and stuff outside the template folder (like Template.toml) does not. No need for special stuff to handle giving the template and generated projects different README.md, LICENSE, and .travis.yml files.
       • If you ask for something to be interpolated, it will be. Otherwise, it'll be passed through literally. No exceptions.
  2. Requiring a separate Template.toml would make for more intuitive interpolate_patterns behaviour, since there wouldn't be any doubt about whether stripping template-defining keys in Cargo.toml when generating a new project is a special-cased behaviour.
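The "no exceptions" rule described above can be sketched as a three-way decision per file. The `Action` enum, the `classify` function, and the folder name `template` used in the usage note are all assumptions made for illustration.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum Action {
    /// Outside the template folder: never reaches the generated project.
    Skip,
    /// Inside the template folder and explicitly listed: run through the templater.
    Interpolate,
    /// Inside the template folder but not listed: copied byte-for-byte.
    CopyVerbatim,
}

/// Decide what happens to `repo_path` (relative to the repository root).
/// `interpolated` holds paths relative to the template folder.
pub fn classify(repo_path: &Path, template_dir: &Path, interpolated: &[&Path]) -> Action {
    match repo_path.strip_prefix(template_dir) {
        Err(_) => Action::Skip,
        Ok(relative) if interpolated.contains(&relative) => Action::Interpolate,
        Ok(_) => Action::CopyVerbatim,
    }
}
```

Under this sketch, a top-level README.md or Template.toml describes the template itself and is skipped, while `template/README.md` ships with every generated project.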

That said, I'm not sold on interpolate_patterns. It's not a very intuitive key name and I think it needs more brainstorming.


That's commonly used (well, npm uses it, can't think of any other tool off the top of my head) as a way to specify which revision of the repository to use. Being able to specify a revision/reference seems important to me for versioned templates, e.g. https://github.com/serde-rs/serde#v0.9.15, but then you will also want to encode both a revision and a template name in the same url.

https://github.com/serde-rs/serde#v0.9.15/library could work since / is invalid for git references, but then if there's only #foo how do you determine if that means git reference foo + default template or git reference HEAD + template foo?

This looks nice and should be easy to teach, but since the URI specification says that

The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource.

as far as I understand it, unless http://url/to/repo dereferences to something other than HTML, using http://url/to/repo#template to mean something other than what the relevant HTML specification wants it to mean is technically outside the URI spec.

But that's the thing... http://url/to/repo does dereference to something other than HTML.

Displaying a useless autogenerated directory listing (self-hosted repos with apache+mod_autoindex) or redirecting to the HTML URL if the requesting user agent isn't a git client (GitHub) are both optional extras which have no bearing on what the primary resource for that URL is.

Furthermore, given that the HTTPS and SSH URLs (git@github.com:ssokolow/rust-cli-boilerplate.git is an alternative syntax for ssh://...) point to the same resource via different protocols, it makes little sense to argue "but web browsers!" when deciding what to do with the SSH clone URLs, which should remain consistent with the HTTPS URLs for intuitiveness, teachability, and maintainability.

Hmm. Good point. My first impulse would be to make the / mandatory in the presence of # so #foo is disallowed. Then, you'd have these four possibilities to represent the same template:

  • http://url/to/repo (Useful in the "HEAD of a single-template repo" case)
  • http://url/to/repo#HEAD/
  • http://url/to/repo#/default
  • http://url/to/repo#HEAD/default
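A minimal sketch of the "mandatory `/`" rule follows. The names `Selection` and `parse_fragment` are hypothetical, as is the template name `default`, and the sketch assumes the git reference itself contains no `/`, which git does not guarantee.

```rust
/// A parsed `#reference/template` fragment.
#[derive(Debug, PartialEq)]
pub struct Selection {
    pub git_ref: String,
    pub template: String,
}

/// Use `default` when `value` is empty.
fn or_default(value: &str, default: &str) -> String {
    if value.is_empty() { default.to_string() } else { value.to_string() }
}

/// Parse the part after `#`, or `None` when the URL has no fragment.
pub fn parse_fragment(fragment: Option<&str>) -> Result<Selection, String> {
    let fragment = match fragment {
        // No fragment at all: HEAD of the default template.
        None => return Ok(Selection { git_ref: "HEAD".into(), template: "default".into() }),
        Some(f) => f,
    };
    // Split at the first `/`. A bare `#foo` is rejected as ambiguous.
    let (git_ref, template) = fragment.split_once('/').ok_or_else(|| {
        format!("ambiguous fragment `{}`: expected `reference/template`", fragment)
    })?;
    Ok(Selection {
        git_ref: or_default(git_ref, "HEAD"),
        template: or_default(template, "default"),
    })
}
```

All four spellings listed above resolve to the same `Selection`, while `#foo` produces an error instead of a guess.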

I don't know why I thought / was invalid for git references when I wrote this; git definitely allows / inside references, including tags. It's not commonly used so we might be ok using it here, but maybe there's some other way both reference + template name could be encoded nicely.

This sounds like a good time for research. For example,

  • What characters are invalid in git references? Maybe one of those might make a reasonable separator.
  • Has anyone solved the "include a git reference in the URL" problem in a way which doesn't use the fragment identifier, but is still easy to implement without out-of-band information? (ie. Not something like https:/path/to/repo/reference)

I slept poorly, so I'm not really feeling motivated today, but, if nobody else beats me to it, I'll take a look once I'm better rested.

EDIT: Another possibility to consider would be reversing the order. We have control over the list of disallowed characters in template names, so, if Template.toml requires a character like @ to be excluded from template names, then #template@tag_or_revision would be valid and have a reasonably intuitive meaning based on its literal reading: "basic_microservice at v1.2"
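The reversed form could be parsed along these lines. The function name `parse_template_at` is hypothetical, and rejecting an empty template name or an empty revision is a choice made for this sketch, not something settled in the discussion.

```rust
/// Parse `template@revision`, splitting at the first `@`.
/// Assumes template names may not contain `@`; the revision part may.
pub fn parse_template_at(fragment: &str) -> Result<(&str, Option<&str>), String> {
    let (template, revision) = match fragment.split_once('@') {
        Some((t, r)) => (t, Some(r)),
        None => (fragment, None),
    };
    if template.is_empty() {
        return Err(format!("missing template name in `{}`", fragment));
    }
    if revision == Some("") {
        return Err(format!("missing revision after `@` in `{}`", fragment));
    }
    Ok((template, revision))
}
```

Because the split happens at the first `@` and template names exclude that character, the revision can contain any character git permits, including `/`.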


It doesn't allow colon (:)

Not something that lends itself especially well to intuitively obvious syntax, but a viable possibility.

I may be really arguing technicalities here, but I just reacted to the "perfectly in line" phrasing. Of course in practice bending or breaking specifications is a long honored tradition and yet somehow the world fails to end because of it. And using the fragment identifier for template naming certainly looks nice and clean.

I believe that using an HTTP client performing retrievals according to the HTTP protocol specification is the natural way of understanding what representations "might result from a retrieval action on the primary resource" identified by a URI using the http or https scheme.

And eg. GitHub's HTTP servers certainly do answer with a 200 status code and an HTML document (or do they do some content negotiation?) when asked for eg. https://github.com/rust-lang/rust. Another URL for the same rust-lang/rust repository does result in a 301 redirect, but I would argue (but not for very long :slight_smile:) that semantically this changes nothing (a 303 redirect would).

It's rather the git client that works outside (or rather beside?) the RFCs. As far as I understand it, it doesn't really access the resource identified by the URI it is given, but uses it as a sort of "template", if you will, for constructing the actual requests for communicating with the server.

I was just shouting "but standards!" - not the same as "but web browsers!" at all.

And the ssh URI scheme is not really specified, so git is doing almost 100% its own thing here anyway (personally I like that the alternative syntax doesn't pretend to be a URI).

[quote="mpol, post:50, topic:5056"] I may be really arguing technicalities here, but I just reacted to the "perfectly in line" phrasing. Of course in practice bending or breaking specifications is a long honored tradition and yet somehow the world fails to end because of it. And using the fragment identifier for template naming certainly looks nice and clean.[/quote]

Fair enough. When I sleep poorly, I do tend to get incautious with my choice of phrasing. I suspect we probably have fairly similar views on how the trade-offs play out in practice.

I think of Git more in terms of WebDAV than bare HTTP (in fact, as I remember, it supports WebDAV as a less efficient alternative to its own "smart" HTTPS server support).

WebDAV and its Delta-V extension for revision tracking are also standardized via the RFC process and, in standardizing a machine-listable folder/directory and its history (ie. the abstract definition of the root of a Git repository) as a compound "resource", they provide prior art for this conception.

I don't remember where I heard this, but my understanding was that they use a 301 redirect for technical reasons. (ie. Some software would consider a 303 as meaning "the git repo has moved" rather than "you are being redirected due to content negotiation" and then, for whatever reason, git requests would wind up going to the page meant for web browsers.)

Again, looking to WebDAV makes the gap between git and "standard behaviour" much narrower. Many of the differences between the Git and HTTP definitions of a resource fade away with nothing more than allowing a "resource" to be a machine-readable directory with children and there are various WebDAV clients which perform similar "expect a certain layout" tricks.

Beyond that, while it's discouraged, there is prior art for HTTP user agents formulating requests for resources based on assumptions about the structure of some higher-level scope. Most notably, /robots.txt and /favicon.ico.

Furthermore, there is actually a provision for doing it in a non-discouraged fashion via the /.well-known/ namespace defined in RFC 5785 and subject to this registry (Used by things like WebFinger, Let's Encrypt and DNT policies, and Apple's iOS universal links, the last of which are not listed in the registry).


P.S. Does anyone know how to turn off the "helpful tips" in the posting UI?

They don't seem to align well with the discourse here and it's getting irritating to play whac-a-mole with popups that cover up the preview, just to tell me things like "That link was posted already" (I'm following the Wiki-style approach of periodically re-hyperlinking terms like "just" which not everyone may be familiar with in order to limit the need to scroll) or "You've made 30%+ of the posts. Give someone else a turn."

I've been thinking about this lately, and I have a proposal that I'd like some feedback on. Essentially, we can think of creating a project from a template as a special case of code generation, right? The rust community already has a way to do code generation at build time in the form of build.rs files, so why don't we adapt that style to the template issue?

Essentially, I would like (and at least for the first thing, am working on a prototype) two things:

  1. a rust library with an API that lets me programmatically specify a template. A "template" would essentially just be a bin crate that is executed. I have an example at the bottom of this post, but essentially it will let you build up a series of operations using rust code, and when those operations have been specified, the library would know how to translate those operations to the actual file system contents that make up a template. String interpolation could be built in, but it could also be up to the template creator to add some kind of text templating library (handlebars, askama, etc) if they need it. Or the template library could provide a default text templating feature, but allow it to be replaced.
  2. The other thing would be a runner for these "template crates." Much like cargo runs the executable that is produced when a build.rs file is compiled, the executables produced by compiling the "template crates" could have a single "runner" that gives the user of cargo a consistent way to run the "template crate" executables. This is especially important for any input that a template crate might need in order to produce a functional project. The runner would know how to get the author name, crate name, etc, that a template crate could access to generate, for example, a Cargo.toml. (this runner could very simply be cargo, but at least for this post I'm leaving it open to discussion)

Here is an example. This code block is an example of the main.rs of a "template crate." This would be compiled into an executable that the runner would run in order to produce a project.

extern crate template;
extern crate serde;
#[macro_use]
extern crate serde_derive;

use template::prelude::*;

#[derive(Deserialize)]
struct TemplateArgs {
    some_param: String,
}

fn main() {
    // The library could have standard ways to get input. The runner might take 
    // some inputs and translate them to environment variables, much like how
    // build.rs files do input & output
    let args = Template::args_from_env::<TemplateArgs>()
                        .expect("need to include TEMPLATE_some_param");

    // A vector of "file operations," that describe the transformation from
    // template->project. This shows essentially 3 different "operations."
    let files = vec![
        // first, the library knows how to generate a Cargo.toml, so the template
        // author doesn't have to include a templated Cargo.toml in their project
        // unless they really need more flexibility than the API provides
        CargoToml::builder()
                  .name_from_env() // might expect an environment variable 
                                   // "TEMPLATE_CRATE_NAME" or something, that
                                   // would be set by the runner
                  .author_from_env() // might expect an environment variable
                                     // "TEMPLATE_AUTHOR_NAME" or something, that
                                     // would be set by the runner
                  .license(License::Mit_Apache2) // API would also have a
                                                 // `license_from_env()` method
                  // API would also have similar `.dev_dependencies()` and
                  // `.build_dependencies()` methods
                  .dependencies(vec![("tokio", "0.1")])
                  .build(),

        // next, this would take a string, which happens to come from a file in
        // the template crate but could come from anywhere, fills in the templated
        // parameters using some sort of built-in text templating mechanism, and
        // copies it to a location in the output directory
        File::from_str(include_str!("src/templated/main.rs"))
             .to("src/main.rs")
             .args(&args),

        // This shows a plain file copy, taking some bytes from one file in the
        // template and copying them to a location in the generated project. No
        // interpolation is done
        Blob::from_bytes(include_bytes!("src/some-binary-blob")).to("src/assets/some-binary-blob"),
    ];

    // lastly, we call `create` to take the list of operations and translate them to
    // an actual, on-disk project. it would probably create the project in the $PWD
    // by default, though here I show it specifying an output directory
    if let Err(e) = Template::create(&files).at("./result") {
        eprintln!("{:?}", e);
        ::std::process::exit(1);
    }
}

For the runner, you would need to be able to: specify some way to identify the template crate, pass required input parameters to the template crate, and specify an output location. Since templates would just be crates, retrieving them from crates.io is simple, though I agree with @ssokolow that there should be a simple and unambiguous way to specify that the input is coming from a location on the filesystem. Input to the runner could be through environment variables, or maybe key-value command-line flags that come after a -- like is done for the test runner. I would expect specifying the output directory to work just like cargo new.
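One way the runner's input handling could look is sketched below: `key=value` pairs after a `--` become `TEMPLATE_`-prefixed environment variables for the template executable. The function name `template_env` is hypothetical, and the prefix simply follows the `TEMPLATE_some_param` naming used in the example above. None of this is an existing cargo feature.

```rust
use std::collections::BTreeMap;

/// Turn `key=value` arguments found after `--` into `TEMPLATE_<key>` variables.
pub fn template_env(args: &[&str]) -> Result<BTreeMap<String, String>, String> {
    let mut env = BTreeMap::new();
    // Everything up to and including the first `--` belongs to the runner itself.
    let inputs = args.iter().skip_while(|a| **a != "--").skip(1);
    for arg in inputs {
        let (key, value) = arg
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got `{}`", arg))?;
        env.insert(format!("TEMPLATE_{}", key), value.to_string());
    }
    Ok(env)
}
```

The resulting map could then be handed to `std::process::Command::envs` when launching the compiled template crate.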

I'd be glad to hear any feedback, though I'm more interested about what people think of the concept in general than any specifics about the API, as I'm still very early in the process of developing a prototype for this and am still working out exactly what kinds of APIs are useful.

Thanks for your feedback!


@pwoolcoc it would be nice to see progress here!

I was thinking about how I would expect it to work. I was thinking that ideally Cargo should define some trait which defines methods to (a) manipulate the default manifest contents and (b) write files into paths relative to the crate root. Then, maybe there could be a convention of library crates that have implementations of that trait, such that cargo new --template=stdweb goes to download and compile the stdweb-template crate and finds the exported CargoTemplate impl of the trait?
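A rough sketch of what such a trait might look like is given below. Everything here is hypothetical: Cargo defines no `CargoTemplate` trait, and the `Project` type, the method names, and the unpinned `stdweb` dependency line exist only to illustrate points (a) and (b).

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for a project being generated (illustrative only).
#[derive(Default)]
pub struct Project {
    /// Manifest text, starting from Cargo's default contents.
    pub manifest: String,
    /// Files to write, keyed by path relative to the crate root.
    pub files: BTreeMap<String, String>,
}

/// Hypothetical trait that a `*-template` crate would implement and export.
pub trait CargoTemplate {
    /// (a) Manipulate the default manifest contents.
    fn edit_manifest(&self, manifest: &mut String);
    /// (b) Write files into paths relative to the crate root.
    fn write_files(&self, files: &mut BTreeMap<String, String>);
}

/// Apply a template to a project that starts from the default manifest.
pub fn generate(template: &dyn CargoTemplate, default_manifest: &str) -> Project {
    let mut project = Project {
        manifest: default_manifest.to_string(),
        ..Project::default()
    };
    template.edit_manifest(&mut project.manifest);
    template.write_files(&mut project.files);
    project
}

/// Example implementation, as a `stdweb-template` crate might export it.
pub struct StdwebTemplate;

impl CargoTemplate for StdwebTemplate {
    fn edit_manifest(&self, manifest: &mut String) {
        // Version left unpinned in this sketch.
        manifest.push_str("stdweb = \"*\"\n");
    }
    fn write_files(&self, files: &mut BTreeMap<String, String>) {
        files.insert("src/main.rs".to_string(), "fn main() {}\n".to_string());
    }
}
```

Keeping the trait limited to these two operations means Cargo stays in charge of where files land, and a template cannot write outside the crate root.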

I guess there might be similarities with the cargo meta build system RFC.

I made a very very rough example of what would be needed for a template system for me: https://github.com/Keats/kickstart with 2 examples of templates in https://github.com/Keats/kickstart/tree/master/examples (see README for running). It's essentially https://github.com/audreyr/cookiecutter but in Rust and using TOML for the template config.
I didn't add generating a template from a remote URL yet but that should work the same as for local templates. I don't think it will load a template from crates.io though. It is a good idea in theory but you need to start adding the versions to it and it is more annoying to develop (just creating the folders/files vs include_bytes/include_str).

In short:

  • language independent because we are just dealing with files
  • ask questions to the user before generating the project and use that in the templates.
  • we can use those variables for filenames/directories as well.
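The last two points can be illustrated with a single substitution function applied to both file contents and paths. This is a minimal sketch and not kickstart's actual implementation; the function name `render` and the rule that unknown placeholders are left untouched are assumptions.

```rust
use std::collections::HashMap;

/// Replace every `{{ name }}` placeholder with its value from `vars`.
/// Unknown placeholders are left untouched so mistakes stay visible.
pub fn render(input: &str, vars: &HashMap<&str, &str>) -> String {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("{{") {
        // Find the closing braces that follow the opening ones.
        let end = match rest[start + 2..].find("}}") {
            Some(offset) => start + 2 + offset,
            None => break, // unterminated placeholder: emit the remainder as-is
        };
        let key = rest[start + 2..end].trim();
        out.push_str(&rest[..start]);
        match vars.get(key) {
            Some(value) => out.push_str(value),
            None => out.push_str(&rest[start..end + 2]),
        }
        rest = &rest[end + 2..];
    }
    out.push_str(rest);
    out
}
```

Calling `render` on a path such as `{{project_name}}/src/main.rs` and again on the file's contents lets the answers to the user's questions drive both directory names and file bodies.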
