Support URI/Java-form crate names

I'd like to hear what you think about organization-scoped package names, written in either URI form or Java form.

// URI-like
use "www.qux.org/foo"::*;
use "www.john.me/services/whatever"::*;

(In fact in NPM both '@qux/foo' and 'org.qux.foo' work as package names.)


Note: namespacing has been suggested before, but this proposal uses a different form.

If the string literal is an issue, I have another suggestion: Java-style package identifiers, but using :: as the delimiter:

use org::qux::foo::*;

You might want to check out the most recent Namespacing RFC:

There is a prototype which uses the URL-like / separator for namespaces here:


I see, thanks. It seems they still use underscore in user code, though.

What's the purpose of having a url-like crate name? Does this provide any advantage over org_qux_foo_lib or hosting your own registry to allow foo-lib = { registry = "qux.org" }?

The URI makes it more self-explanatory that what precedes .org is an organization. It indicates an org domain (it could just as well be .me!), and the dot keeps it well separated from the crate name. But it'd be practically the same thing as the Java form...

With the URI idea there are two kinds of separator: the dot inside the domain, and one or more slashes after it. With the Java idea there is only the dot separator, used one or more times.
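To make the relationship between the two forms concrete, here is a rough sketch of how a URI-like name could be split into its parts and rewritten in the ::-delimited Java-style form. Everything in it (the ScopedName type, parse_uri_form, to_java_form, and the choice to drop a leading www) is hypothetical and only for illustration; nothing like this exists in rustc or Cargo.

```rust
/// Hypothetical representation of an organization-scoped crate name.
#[derive(Debug, PartialEq)]
struct ScopedName {
    /// Domain labels as written, e.g. ["www", "qux", "org"].
    domain: Vec<String>,
    /// Slash-separated segments after the domain, e.g. ["foo"].
    path: Vec<String>,
}

/// Parse the URI-like form: dots separate domain labels,
/// slashes separate the segments that follow the domain.
/// Returns None if there is no path segment or any piece is empty.
fn parse_uri_form(name: &str) -> Option<ScopedName> {
    let mut parts = name.split('/');
    let domain: Vec<String> = parts.next()?.split('.').map(str::to_string).collect();
    let path: Vec<String> = parts.map(str::to_string).collect();
    if path.is_empty() || domain.iter().chain(path.iter()).any(|s| s.is_empty()) {
        return None;
    }
    Some(ScopedName { domain, path })
}

/// Render the Java-style form: drop a leading "www" (an assumption of this
/// sketch), reverse the remaining domain labels, append the path segments,
/// and join everything with `::`.
fn to_java_form(name: &ScopedName) -> String {
    let labels = match name.domain.split_first() {
        Some((first, rest)) if first.as_str() == "www" => rest,
        _ => &name.domain[..],
    };
    labels
        .iter()
        .rev()
        .chain(name.path.iter())
        .map(String::as_str)
        .collect::<Vec<_>>()
        .join("::")
}
```

Under these assumptions, "www.qux.org/foo" maps to org::qux::foo, so the two spellings carry the same information and differ only in separators and ordering.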

I also just thought of adding a www. prefix for more legibility. (This is up to the user, optional convention.)

Also, this URI idea was inspired by XML namespaces. The URI doesn't correspond to an HTTP transaction.

I updated the OP with a few more examples of the URL idea.

The Java scheme can be quite good but it is certainly not perfect. The downside to this global URL (and yes, the proposed scheme is not a URI) is that it can easily break when projects change ownership. Consider a project such as Qt, which has changed hands many times since its creation. Having the original company name as part of all the identifiers would be a major issue for the latest owner.

That's perhaps the subtle distinction between URL and URI - A URL describes a location (the current company that owns the code, a transient property) whereas a URI describes an immutable identification. For example we have such schemes to identify books.

The above-mentioned RFC addresses the underscore issue only within the project, which is correct. The additional information (company name) you want to add to crate names would be better expressed as associated metadata.

IMO, a company should have a private registry for its internal code, as suggested above. Alternatively, for publishing OSS, crates.io ought to be slightly extended to allow cargo to optionally reference projects by user/account. This shouldn't leak into the code though, as the association belongs in the Cargo.toml manifests only. Something like:

[dependencies]
SomeOpenSourceProject = { version = "1.0", account = "MyCompany" }
SomePrivateProject = { version = "1.0", registry = "MyCompany.com" }

I reckon this would help grow the ecosystem and encourage the big enterprises to publish Rust open source libraries & projects. The current policies of crates.io are way too dogmatic for my taste at the moment - e.g. the current restriction that published crates must not depend on external registries could be relaxed within such accounts.


Makes sense, but NPM uses the kind of namespacing proposed anyway... and it'd be an advantage to know who owns the crate in the user code.

If these URIs are supposed to be the source the crate is fetched from (like Golang or Deno do), then it's a security risk when the domain expires (you can't expect someone to hold on to a domain forever, especially if it's a non-commercial open-source project). The new domain owner could inject arbitrary code into existing projects. crates.io offers longevity and immutability of the sources.

And if these are supposed to be names of crates on crates.io, then it's just confusing to use different URLs for them. Also, crates.io would have to verify domain ownership, because otherwise anyone could squat any domain and gain even worse false legitimacy.

Adding dependencies with use would make it harder to analyze a project's dependencies. Cargo.toml has the advantage of being an easy-to-parse central place.


I meant URIs as identification strings only (not as HTTP transactions). You're right, maybe Cargo would have to verify the domain part of the URI.

About the dependency, it'd still have to be added to Cargo.toml, that is, something like:

[dependencies]
"www.feathersui.com/aeon" = "1.0.0"

In my opinion, one of the things Rust does better than Java w.r.t. ergonomics is the omission of URLs in e.g. import statements. I am actively happy with the status quo.

The main reason is that import URLs are unreasonably unergonomic to type, akin to a git dependency. The difference between the two is that use statements are written much more often than dependency entries in Cargo.toml. In effect, I consider URLs in import statements a usability hazard. It might not bother those using IDEs that do imports automatically as much, but that is definitely not close to 100% of the community.

In addition, the rest of the URL other than the crate name is just noise. That is to say, it conveys no information that is useful at the level of the program, either to me as the author/reader of the code or to rustc. For example, at the level of code I don't care where a crate comes from, just that it is unambiguously resolved when I import it. And that is one thing the status quo is exceedingly good at.

And if it did convey useful information, conceptually speaking I think its place would be where all metadata for a crate lives: in its Cargo.toml.

The third reason is the use of strings. Stringly typed language constructs always make me uneasy; they feel way too unsafe. It's the same issue as in any random program in a statically typed language, except magnified because it's part of the core language. Note that it is also a usability hazard in this way: strings don't get e.g. syntax coloring or internal error checking other than a basic "does this match a reasonable regex for a URL?"
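As a rough illustration of how shallow such a check is, here is a hand-rolled shape test of the kind a string-literal crate name might get. The function name looks_like_scoped_name and the exact character rules are assumptions made up for this sketch, not anything rustc or Cargo does.

```rust
/// Hypothetical shape check for a string-literal crate name such as
/// "www.qux.org/foo": at least two dot-separated domain labels, a slash,
/// then one or more non-empty path segments.
fn looks_like_scoped_name(s: &str) -> bool {
    // Split once at the first slash: domain on the left, path on the right.
    let (domain, path) = match s.split_once('/') {
        Some(pair) => pair,
        None => return false,
    };
    // Assumed character rules: labels allow ASCII alphanumerics and '-',
    // path segments additionally allow '_'.
    let label_ok = |l: &str| {
        !l.is_empty() && l.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
    };
    let segment_ok = |p: &str| {
        !p.is_empty()
            && p.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
    };
    domain.split('.').count() >= 2
        && domain.split('.').all(label_ok)
        && path.split('/').all(segment_ok)
}
```

A check like this accepts "www.qux.org/foo" and just as happily accepts the typo "www.qux.ogr/foo", because the shape is all it can see. Whether the name refers to anything real is only discovered later, at resolution time.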


Almost, but not quite. A URL (uniform resource locator) is indeed a location. A URN (uniform resource name) is an immutable identification. A URI (uniform resource identifier) is either a URL or a URN.


I go by the WHATWG spec:

Standardize on the term URL. URI and IRI are just confusing.


Also wrong. URNs and URLs are URIs, but there are URIs that are neither. (Apart from that, I agree with @kornel and the WHATWG spec: the battle for distinguishing these terms is long lost; too many people already use them interchangeably.)

