Relative paths in Rust 2018


#101

I’d be in favor of a variant like leading :: as well. I mean, personally I’d still prefer crate:: and extern:: prefixes, even as someone who imports things with individual uses per imported namespace.

In my mind, function should always win against form. Some might consider :: ugly, but others (like me) could argue the beauty is in its clarity. And in this case it even has a big picture elegance to it.

Aesthetics are also quite subjective, and might not be the same forever. The clarity provided by knowing if any path is absolute or relative however is evident and unchanging. No matter what that added value would remain.


#102

/me waves a hand

I feel almost exactly the same way about absolute paths as I do about required-leading-::. If I’m in a situation where I need to use it, it feels like something has gone wrong and I’m working around it. (The same goes for ./ and self::.)

The path analogy works much better for me when you include the $PATH environment variable. I never write absolute paths just to use “dependency” commands in a terminal, so why should I do so in source code?

(Notably this is also how Python’s absolute paths are resolved, and IIRC several other scripting languages.)


#103

I personally want 3 things from the module system:

  1. Being able to refer to things in external crates unambiguously
  2. Being able to refer to things in the current crate unambiguously
  3. Never being forced to write self:: (which I have really not enjoyed so far).

Not being forced to write a leading :: is also a nice bonus but not a deal breaker for me. I find @rpjohnst’s reasoning to be spot on. Writing use proptest::strategy::Strategy; and having it do what I mean is very appealing. For those who prefer absolute paths for readability, you can have an IDE automatically transform extern paths to use ::. All in all, I like @aturon’s proposal a lot.

I think you overstate the ease. To type :: I have to hold down Shift and then : is on the opposite side of the keyboard. This interrupts my writing flow to some degree. But mostly, I would forget to write :: all the time until I reach the end of the import, which would interrupt my flow even more.


#104

I would like to cite a paragraph from the Google C++ Style Guide [emphasis mine]:

Optimize for the reader, not the writer
Our codebase (and most individual components submitted to it) is expected to continue for quite some time. As a result, more time will be spent reading most of our code than writing it. We explicitly choose to optimize for the experience of our average software engineer reading, maintaining, and debugging code in our codebase rather than ease when writing said code. “Leave a trace for the reader” is a particularly common sub-point of this principle: When something surprising or unusual is happening in a snippet of code (for example, transfer of pointer ownership), leaving textual hints for the reader at the point of use is valuable (std::unique_ptr demonstrates the ownership transfer unambiguously at the call site).

How use foo::bar is going to be resolved might be obvious to the person writing said line, but it might not be obvious for the next person reading it.

Hence, I am very much in favor of an unambiguous path syntax.


#105

Summary of ideas at the bottom.

Leading ::

I’ll confess that I come from a C++ background, so my biases are that way. In C++, namespaces follow lexical scoping, with a leading :: making an absolute path that starts at the global namespace, so that, say, std always means ::std unless it is shadowed. This means that you can just write std::string and it will mean exactly what you think it means. To me, :: is a disambiguator that you apply only when you really need to, and it’s not that I think it is inherently ugly (it’s not), but rather that it goes against my sensibilities of how paths “ought” to work. Why force people to disambiguate with a leading :: when there cannot possibly be a second interpretation? The C++ model wouldn’t work in Rust, because of the way modules are scoped, but this is where I come from.

I do think that requiring crate:: everywhere would be ugly, but to me it feels a lot like std:: except that crate is a magic word like self. In fact, I think that the current crate name should be declared as an alias for it, rather than requiring the use of crate, to resolve the asymmetry.
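For readers who want the path forms under discussion side by side, here is a minimal sketch. The module and function names (shapes, units, area, label, describe, describe_square) are made up for illustration and do not come from any proposal in this thread; the sketch only shows which prefixes are relative and which are absolute.

```rust
// Hypothetical modules, chosen only to illustrate the path prefixes.
pub mod shapes {
    pub fn area(w: u32, h: u32) -> u32 {
        w * h
    }

    pub mod units {
        pub fn label() -> &'static str {
            "cm^2"
        }

        pub fn describe(w: u32, h: u32) -> String {
            // `super::` is relative: it starts from the parent module.
            let a = super::area(w, h);
            // `self::` is relative: it starts from the current module.
            let unit = self::label();
            // A leading `::` is absolute: here it names the external crate `std`.
            let longest = ::std::cmp::max(w, h);
            format!("{} {} (longest side {})", a, unit, longest)
        }
    }
}

pub fn describe_square(side: u32) -> String {
    // `crate::` is absolute: it starts from the root of the current crate.
    crate::shapes::units::describe(side, side)
}
```

Calling describe_square(3) walks through all four forms. Whether the unprefixed form should also be allowed to name a crate is exactly what the thread is arguing about, so it is left out here.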

Searching names

I also don’t think that there’s any difficulty understanding where a name comes from with the originally-proposed solution. I do understand the value of this, as I also code in Go, where source files can share scopes, and it’s extremely useful there. But in Rust, because of the lack of shared scopes, we don’t have to look far beyond the file. I do not expect that users will often run into trouble because of not knowing which packages are imported, or the contents of the prelude.

The only things that can muck this up are macros and glob imports, but your proposal doesn’t address them, and if we address them, then we have the following easy algorithm:

  1. Ctrl+F the file. If you see an in-scope definition of the name, then you have found it.
  2. Check the list of crates and the prelude. If the name is found anywhere there, then you have found it (in case of ambiguity, the crate name takes precedence over the prelude).
  3. If you still haven’t found it, then it must be imported by a glob or declared by a macro.

Step 2 is not hard as the list of possible identifiers is not that big, most developers will have internalized the contents of the prelude, and I don’t expect developers to regularly lose track of which projects a crate uses (it will very slightly increase the overhead of reading unfamiliar code, but as the developer seeing an unfamiliar crate name would probably go check out Cargo.toml to look at the package anyway, I don’t think it’s a big cost).
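The three steps can be sketched as a small lookup function. This is only an illustration of the reading procedure described above, not of how the compiler resolves names; the Origin enum and the locate function are hypothetical, and the three lists stand in for "names defined in the file", "crates in Cargo.toml" and "the prelude".

```rust
/// Where a leading path segment was found (hypothetical categories
/// mirroring the three manual steps).
#[derive(Debug, PartialEq)]
pub enum Origin {
    Local,
    Crate,
    Prelude,
    GlobOrMacro,
}

/// Follows the manual search order: the file first, then the crate list,
/// then the prelude. Crate names take precedence over prelude names.
pub fn locate(name: &str, local: &[&str], crates: &[&str], prelude: &[&str]) -> Origin {
    if local.contains(&name) {
        Origin::Local
    } else if crates.contains(&name) {
        Origin::Crate
    } else if prelude.contains(&name) {
        Origin::Prelude
    } else {
        // Nothing visible in the surface syntax declares it.
        Origin::GlobOrMacro
    }
}
```

For example, locate("serde", &["helper"], &["serde"], &["Vec"]) gives Origin::Crate, and a name present in none of the lists falls through to Origin::GlobOrMacro.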

Ambiguity

After thinking about my ambiguity-resolution proposal more, I realized there is an issue with making an ambiguous import an error: it makes any change that introduces a new ambiguity a breaking one. This could arise the following ways:

  • Declaring a new name locally. This is fine, since it’s a change in the code in question.

  • A macro declares a new name locally. This is not fine, since it could be a change upstream breaking a downstream client. This is already a breaking change, however, because this new name can already shadow something imported by glob or in a higher scope, or conflict with another locally-declared name.

    The fact that this is a breaking change for a macro author may need to be clarified, however.

  • Another crate adds a new name, which is glob-imported. This is not fine, since it can cause this name to shadow another name, or to conflict with a name in another glob import. The only new case here is that a glob import could shadow a crate name (it’s already possible for it to interfere or shadow a local/prelude name).

    This does cause me concern. As it is, glob imports can be used without worry about backwards-compatibility if a) crates are careful not to introduce anything with a name used in the prelude b) a module is careful to use a glob import only once, and only at module scope. This would mean that either glob-imports impose more burden on crate authors (they cannot introduce new names and be backwards-compatible) or on glob-importers (they cannot assume any sort of backwards compatibility if they use relative paths any more).

    However, there’s an easy fix: we can simply ban names imported by cross-crate glob as leading path segments, and require users to import them explicitly. This could be applied only in use, but I think the ergonomics are better if it’s applied uniformly. This rule only needs to apply when the import and declaration are in different crates; if they are in the same crate, then there are no issues because it’s all under one author’s control.

  • Adding a new crate. This is fine, since it’s a change in the code in question.

  • Adding a name to the prelude. This is a Big Problem if this introduces an ambiguity error: the change is not backwards-compatible. I believe that, today, adding new names to the prelude is non-breaking since they can always be safely shadowed, and we should probably preserve that.
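To make the last point concrete, here is a small sketch of a local item shadowing a prelude name. The module name shadowing and the function demo are made up for the example; drop is a real prelude function, and the local definition simply reuses its name.

```rust
pub mod shadowing {
    /// A local item that deliberately reuses the name of the prelude
    /// function `drop`.
    pub fn drop(x: i32) -> i32 {
        x + 1
    }

    pub fn demo() -> i32 {
        // Resolves to the local `drop` above, not to the prelude's `drop`,
        // because items declared in the module shadow prelude names.
        drop(41)
    }
}
```

Because the local name wins silently, code like this would keep compiling if the prelude gained a new item with the same name. An ambiguity error for the same situation would break it.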

Bearing the above in mind, there are at least two kinds of circularity problems that we obviously wish to fix. I thought that things would not be so bad initially, but now that I have considered them in more detail, the implications are very thorny.

Macros

Macros can introduce ambiguity in the invocations of other macros and in other use declarations. @nikomatsakis gave this example, which I don’t see addressed in any proposal allowing crates to be referred to with non-absolute paths. In this example, foo is a crate.

use foo::bar;
baz!(); // declares "mod foo"

How do we resolve this import properly? We would have to wait until the expansion of baz to understand it. I propose two variants to point out worse nastiness:

use foo::bar;
bar!(); // declares "mod foo"

This one is worse, because we can’t look up bar if we hold off on resolving the import until after the macro is expanded. In this next one, foo is not a crate:

use quux::foo;
fn local() {
    use foo::bar;
    baz!(); // declares "mod foo"
}

This one demonstrates that “locally-declared” as a separate category in handling ambiguity is not enough to actually manage things: any case where we can shadow is enough to cause it (glob imports are another one). And of course we can also mess with macro names themselves:

use foo::{bar, baz};
fn local() {
    bar!(); // expands to "use quux::baz;"
    baz!(); // expands to "use quux::bar;"
}

This ambiguity is extremely broad in scope, and hard to solve. Moreover, it gets even worse if we ever make it possible for macros to perform name lookup, which is required if we want the inert attributes used by proc macro derives to be scoped properly. @aturon implied in the first post that the proposal there would resolve issues like this, but it’s not clear to me how broad a problem he was trying to fix, so I’m not sure if these were in scope.
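The basic shape of the first example can be reproduced in miniature. In this sketch declare_foo is a hypothetical stand-in for baz!, call_bar is a made-up helper, and there is assumed to be no external crate named foo, so the import has only one candidate. The troublesome case from the thread, where a crate called foo also exists, is deliberately not reproduced.

```rust
// Hypothetical stand-in for `baz!`: nothing at the call site reveals
// that it declares a module named `foo`.
macro_rules! declare_foo {
    () => {
        pub mod foo {
            pub fn bar() -> &'static str {
                "bar from the macro-declared module"
            }
        }
    };
}

// This import can only be understood after the macro below is expanded.
use foo::bar;
declare_foo!();

pub fn call_bar() -> &'static str {
    bar()
}
```

The point is only that the meaning of the use line depends on an expansion that appears after it. Add a dependency with the same name and the reader can no longer tell from the surface syntax which foo is meant.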

Circular use

mod foo {
    use bar::*;
}
mod bar {
    mod bar {}
}

As before, I don’t actually think that @aturon’s proposal is enough to resolve this, because the local shadowing behaviour of glob imports causes issues that go deeper than simply shadowing crates. Globs have something in common with macros, in that they can declare things which aren’t immediately apparent from surface syntax. But in fact, we don’t need that to create ambiguity:

mod bar {
    pub(crate) mod baz {}
}
mod baz {
    pub(crate) mod bar {}
}
mod foo {
    use bar::baz;
    use baz::bar;
}

Now which bar and baz does foo get?
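For comparison, here is a sketch of the same layout with every import written as an explicit crate:: path, which is one way to leave only a single possible reading. The name functions and the names helper are additions for illustration so that the result can be observed, and the items are made pub rather than pub(crate) purely to keep the sketch easy to call.

```rust
pub mod bar {
    pub mod baz {
        pub fn name() -> &'static str {
            "bar::baz"
        }
    }
}

pub mod baz {
    pub mod bar {
        pub fn name() -> &'static str {
            "baz::bar"
        }
    }
}

pub mod foo {
    // Both paths start at the crate root, so neither import can be
    // reinterpreted in terms of the other.
    use crate::bar::baz;
    use crate::baz::bar;

    pub fn names() -> (&'static str, &'static str) {
        (baz::name(), bar::name())
    }
}
```

Here foo gets the baz nested in the top-level bar and the bar nested in the top-level baz, so foo::names() returns ("bar::baz", "baz::bar").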

I think we would have to enforce the following rule, or something stricter that the compiler can reasonably calculate: the use declarations and macro invocations in a crate form a DAG, when ordered by whether one can change the interpretation of the other. I’d have to think more about whether or not we can do this in a way that lets us still add things to the prelude (or crate authors add things to crates without breaking glob-importers), but honestly, I feel like we’re getting too far into the weeds for a change this late in the edition cycle.
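As a rough sketch of what checking such a rule could look like, the function below tests whether a dependency graph is acyclic using Kahn's algorithm. Everything here is an assumption for illustration: nodes are hypothetical indices standing for use declarations or macro invocations, and an edge (a, b) means "a can change the interpretation of b". How the compiler would compute those edges is the hard part and is not addressed.

```rust
use std::collections::VecDeque;

/// Returns true if the directed graph over nodes 0..n has no cycle.
/// `edges` holds (from, to) pairs; every index must be less than `n`.
pub fn is_dag(n: usize, edges: &[(usize, usize)]) -> bool {
    let mut indegree = vec![0usize; n];
    let mut adjacent = vec![Vec::new(); n];
    for &(from, to) in edges {
        adjacent[from].push(to);
        indegree[to] += 1;
    }

    // Start from the nodes that nothing else can influence.
    let mut ready: VecDeque<usize> = (0..n).filter(|&i| indegree[i] == 0).collect();
    let mut visited = 0;
    while let Some(node) = ready.pop_front() {
        visited += 1;
        for &next in &adjacent[node] {
            indegree[next] -= 1;
            if indegree[next] == 0 {
                ready.push_back(next);
            }
        }
    }

    // Nodes on a cycle never reach indegree zero, so they are never visited.
    visited == n
}
```

The two imports in the last example would form the edge pair (0, 1) and (1, 0), which this check rejects.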

(Personally, I’m starting to feel like the edition should be postponed.)

Side Thought: Tooling

From a tooling point of view, I am not sure that letting an IDE have instant ideas when you write a name is going to be a super useful idea. There is one thing that I would love to have, however, and that is goimports. For those unfamiliar, because imported names are always qualified in Go, there is a tool which looks through your source for undeclared package names and adds them as imports. It’s not perfect but it definitely beats having to go back to the top of your source file manually all the time. Something similar for Rust would be wonderful: if I write some_crate::name and some_crate is undeclared, then I can run a tool to automatically check if some_crate exists in Cargo’s index and, if so, add a dependency at its current version. This would only work if we can reliably determine that some_crate is indeed intended to be a crate. We could do it heuristically, or only apply it to absolute paths with ::some_crate (I, for one, am the sort of lazy programmer who would absolutely write the absolute path, run the tool, and then delete the leading ::). I think, having written this down, that it is mostly orthogonal to the other questions in this thread, though, so I think future discussion of this should branch off rather than continue here.
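A very rough sketch of the "only apply it to absolute paths" variant: scan the source text for leading :: and collect the first segment as a candidate crate name. The function absolute_path_roots is hypothetical, it works on raw text rather than a parsed syntax tree (so strings and comments would fool it), and it does not consult Cargo's index at all.

```rust
use std::collections::BTreeSet;

/// Collects the first segment of every path written with a leading `::`.
/// A `::` counts as leading when it does not follow an identifier
/// character, another `:`, or a closing `>`.
pub fn absolute_path_roots(source: &str) -> BTreeSet<String> {
    let chars: Vec<char> = source.chars().collect();
    let is_ident = |c: char| c.is_alphanumeric() || c == '_';
    let mut roots = BTreeSet::new();
    let mut i = 0;
    while i + 1 < chars.len() {
        let is_sep = chars[i] == ':' && chars[i + 1] == ':';
        let continues_path = i > 0
            && (is_ident(chars[i - 1]) || chars[i - 1] == ':' || chars[i - 1] == '>');
        if is_sep && !continues_path {
            // Read the identifier that follows the leading `::`.
            let start = i + 2;
            let mut end = start;
            while end < chars.len() && is_ident(chars[end]) {
                end += 1;
            }
            if end > start {
                roots.insert(chars[start..end].iter().collect());
            }
            i = end;
        } else {
            i += 1;
        }
    }
    roots
}
```

Given "use ::serde::Serialize; std::mem::drop(x);" this yields just serde, since std is not written with a leading ::. A real tool would then compare the result against the dependencies already declared.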


#106

I think part of the problem is that every explanation of how things will work without explicit absolute paths is paragraphs of:

Type of caveats and comments. Not to pick on anything you’ve said in particular (or anyone else), but I just cannot understand why I would not want absolute paths to always be explicit. I just don’t understand why it is a benefit to ever have any sort of “heuristic” with respect to this. I know the decision has effectively already been made, and I’m not arguing to change it; I just think things would be better if absolute paths were absotively, posilutely, 100% corn-fed, spot-on, explicit. If I have to look beyond the use statement or the line where it is being used to know whether or not it is an absolute path, then, to use a certain ethnic phrase, “something ain’t kosher my friend”.


#107

I agree that absolute paths should definitely be explicit, and it should be completely obvious when you are using one.

I do not agree that absolute paths should be the only way to name crates.


#108

I’ve just now realized that a large part of my concerns above are invalid because they forget that modules don’t follow lexical scope.

I will be in the shame cube until the edition ships.


#109

I’m a Unix user; as such, I like leading path separators for absolute paths.

Windows does it in a very wrong way IMO. (but if you really like it, you should have a way to disambiguate self and cratename, just like you need a colon in C:\. (module::name vs crate:::name, anyone?))