Expand targets acceptable to rustc

Currently, rustc accepts only a subset of target tuples. While some of this is related to support, there are some targets that are semantically equivalent, but only one of them is accepted. For example, rustc accepts x86_64-unknown-linux-gnu, but does not accept x86_64-pc-linux-gnu (despite the latter being the canonicalization of a typical distro host tuple, x86_64-linux-gnu, and, in particular, the output of config.guess on my system). One issue this presents is that it makes using rustc in an autotools project (either indirectly, by invoking cargo, or directly when cargo may be unavailable) difficult: in particular, it requires using host_alias when set, rather than the canonical value of host.

My proposal is thus:

  • rustc should accept, for all i?86 or x86_64 targets, the vendors pc and unknown interchangeably, and possibly also accept an omitted vendor (such as in x86_64-linux-gnu)
  • rustc should likewise accept any i?86 for targets where previously only one is accepted (Linux targets only accept i686-unknown-linux-gnu; Windows is i586). This is slightly less annoying, but I typically use i386 when targeting 32-bit x86.
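To make the first point concrete, here is a minimal sketch of the kind of normalization being proposed. The names normalize_target and is_x86 are hypothetical, not anything in rustc. It is deliberately restricted to Linux tuples, on the assumption that targets such as x86_64-pc-windows-msvc, where pc is already the accepted vendor, must be left alone. It also leaves the architecture untouched, so it does nothing for the i386 case.

```rust
/// Hypothetical check for the architectures named in the proposal:
/// x86_64, or i?86 where ? is a single digit.
fn is_x86(arch: &str) -> bool {
    let b = arch.as_bytes();
    arch == "x86_64"
        || (b.len() == 4 && b[0] == b'i' && b[1].is_ascii_digit() && arch.ends_with("86"))
}

/// Hypothetical helper, not a rustc API: rewrites a Linux x86 tuple to the
/// spelling with an explicit `unknown` vendor. Anything else is returned as-is.
fn normalize_target(target: &str) -> String {
    let mut parts: Vec<&str> = target.split('-').collect();
    if !is_x86(parts[0]) {
        return target.to_string();
    }
    // Omitted vendor: arch-linux-env becomes arch-unknown-linux-env.
    if parts.len() == 3 && parts[1] == "linux" {
        parts.insert(1, "unknown");
    }
    // Vendor `pc` is treated as `unknown`, but only for Linux (assumption).
    if parts.len() == 4 && parts[1] == "pc" && parts[2] == "linux" {
        parts[1] = "unknown";
    }
    parts.join("-")
}
```

With this sketch, both x86_64-pc-linux-gnu and x86_64-linux-gnu come out as x86_64-unknown-linux-gnu, while x86_64-pc-windows-msvc passes through unchanged.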

One possibility: this could be supported via the target-tuples crate, written by myself (shameless plug), which can parse and provide a correct semantic representation of a subset of targets accepted by config.sub (and, in particular, it accepts a superset of the canonical (output by config.sub) targets which llvm itself accepts).

That's not true, there are i686 and i586 targets for both Linux and Windows, which use a base CPU of "pentium4" and "pentium" respectively. The difference is there mostly to enable SSE and SSE2, even though there were earlier i686 CPUs that didn't have that.

Ah, yeah. I just checked. I had originally only seen one apiece (i686-linux and i586-windows). The rest of my point stands, though. i386-unknown-linux-gnu is not accepted (just confirmed: error: Error loading target specification: Could not find specification for target "i386-unknown-linux-gnu". Run rustc --print target-list for a list of built-in targets).

It's also possible these targets could be treated as equivalent, since most people still running 32-bit will have a 686 or better. That would make library maintenance easier for sure.

I think we shouldn't accept i386-unknown-linux-gnu unless we're actually targeting a 386.

I do completely agree that we should handle pc as equivalent to unknown, though. Both should match target_vendor="unknown".

BTW, as a workaround for anyone else who (foolishly) wishes to work with Rust via autotools, I wrote a macro that handles the first point (pc and unknown being treated differently), via some horrible shell scripting: lc-login/lcrust_prog_rustc.m4 at main · LightningCreations/lc-login · GitHub (and I cannot wait to implement LCRUST_PROG_RUSTC_FOR_BUILD as well because proc-macros).

I'd like to proceed with an implementation of this. It would need at least a compiler-team MCP, possibly an RFC, correct (since it is a user-facing change)?

Not all user-facing changes need RFCs. I’m not on the compiler team, but I think the part about making pc synonymous with unknown is a very small, uncontroversial change with not a lot of design space. Filing an MCP with the motivation, and then following up with a PR after it is accepted, would be enough.

I think this kind of needs an RFC at least, as it would break code that uses cfg(target_vendor). I also don't think it's particularly uncontroversial to allow target tuples to no longer uniquely identify targets in Rust.

I’m not sure I understand why you think it would require an RFC. AFAICT, this would just be adding pc as an alias for unknown on Linux; how would this be breaking? Not only is this already a potential value for this field, but targets aren’t covered by the same stability guarantees as the language, and I don’t see how this would be more breaking than adding new targets, which don’t go through RFCs.

We’ve made bigger breaking changes to targets without RFCs and MCPs because it was more accurate. Just as an example, wasm32-unknown-unknown’s entire ABI was changed to match clang, and that only went through a design meeting. New ABI: "wasm" · Issue #90 · rust-lang/lang-team · GitHub

If pc is just an alias for unknown, that's fine. My concern is if cfg(target_vendor = "unknown") would now fail to apply for foo-pc-bar-baz targets when it previously applied to foo-unknown-bar-baz.
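A small sketch of the kind of code this concern is about. The function vendor_label is hypothetical; the cfg! macro and the target_vendor key are existing Rust features. Today, on x86_64-unknown-linux-gnu, the first branch is the one taken. The open question is which branch a build with --target=x86_64-pc-linux-gnu would take.

```rust
/// Hypothetical function showing code that depends on `target_vendor`.
/// Each `cfg!` is evaluated at compile time for the selected target.
fn vendor_label() -> &'static str {
    if cfg!(target_vendor = "unknown") {
        // Taken today on foo-unknown-bar-baz targets.
        "unknown"
    } else if cfg!(target_vendor = "pc") {
        // Taken today on targets such as x86_64-pc-windows-msvc.
        "pc"
    } else if cfg!(target_vendor = "apple") {
        "apple"
    } else {
        "other"
    }
}
```

If pc were a pure alias, a foo-pc-bar-baz build of a Linux target would keep taking the first branch, and existing code would behave as before.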

Also, there are interactions with existing RFCs such as https://github.com/rust-lang/rfcs/pull/2991, which hasn't merged yet (but has disposition-merge, and is likely to merge after some updates). Would a build that used --target=foo-pc-bar-baz apply for cfg(target = "foo-unknown-bar-baz")?

I feel like there are enough open questions here, with arguments on either side of them, that an RFC is probably warranted.

What do other compilers do here, in particular gcc and clang?

What specifically do you mean? If you are referring to the feature in general, then both accept both targets (and, relevant to the proposal, I would have to assume that this will continue to be the case for the gcc-rs frontend: i.e., x86_64-pc-linux-gnu-gccrs will configure, build, and function properly, for some reasonable definition of "properly"). The only difference between the two is toolchain delegation: gcc (and clang, at least when linking through gcc) invokes tools using the exact target name (the exact string passed to --target, at toolchain configure time for gcc, and at build time for clang). However, this is also the case when passing non-canonical targets (in particular, x86_64-linux-gnu-gcc invokes x86_64-linux-gnu-as, not x86_64-pc-linux-gnu-as).

If you are referring to how it reports targets to source code, currently there is no way in gcc/clang C/C++ to inspect the vendor of the target. For other components (in particular, architecture, which can be inspected), it effectively chooses the defines based on the canonical target, but defines the same macro for adjacent architectures (so, regardless of whether the target is i386, i686, or i786, it will define __i386__. Note that it also defines that on x86_64). lccc will operate the same for the C and C++ frontends. For the Rust frontend, without resolution on some of these, it will likely use the canonical name of each component (architecture, vendor, system, env+binary-format) to derive the values set in target_arch, target_vendor, target_os, and target_env. target_os_family and cfg(unix)/cfg(windows) will both be set from queries of the target properties.
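As a rough illustration of deriving cfg values from the components of a canonical tuple, here is a sketch. CfgValues and cfg_values are hypothetical names, and this is not how rustc works: as far as I know, rustc takes these values from its target specifications rather than by splitting the name. The one mapping shown, i?86 to "x86", mirrors the fact that rustc reports target_arch = "x86" for its 32-bit x86 targets, much like the shared __i386__ define.

```rust
/// Hypothetical container for the four cfg values discussed above.
#[derive(Debug, PartialEq)]
struct CfgValues {
    target_arch: String,
    target_vendor: String,
    target_os: String,
    target_env: String,
}

/// Hypothetical derivation from a canonical four-component tuple
/// (arch-vendor-os-env). Returns None for any other shape.
fn cfg_values(canonical: &str) -> Option<CfgValues> {
    let parts: Vec<&str> = canonical.split('-').collect();
    if parts.len() != 4 {
        return None;
    }
    // Adjacent 32-bit x86 architectures share one value.
    let arch = match parts[0] {
        "i386" | "i486" | "i586" | "i686" => "x86",
        other => other,
    };
    Some(CfgValues {
        target_arch: arch.to_string(),
        // Taken verbatim: `pc` stays `pc` under this policy.
        target_vendor: parts[1].to_string(),
        target_os: parts[2].to_string(),
        target_env: parts[3].to_string(),
    })
}
```

Under this policy, x86_64-pc-linux-gnu would report target_vendor = "pc", which is exactly the behavior being questioned in this thread.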

My expectation would be that it matches the exact target name, i.e., the non-canonicalized target, since a reason to actually match the target name would include accessing toolchain files, which should almost always be based on the exact target. Thus even --target=x86_64-bar-baz would not match cfg(target="x86_64-pc-bar-baz"). However, I would agree that could be argued at an RFC level.

A vendor of pc should be treated exactly the same as if it were a vendor of unknown; target_vendor = "unknown" should match, and target_vendor = "pc" shouldn't.

Perhaps, though that would conflict with my current intended implementation, which is to just extract Target::canonical_vendor_name (though that is fixable). I wonder if it may also be a good idea to get the gcc-rs developers' opinions. This should definitely go to an RFC at this point, though.

This is a "definition of the Rust language" question, not an implementation question.

Wouldn't that be all the more reason for it to go to RFC?

If we were looking to establish a fully general target alias mechanism or target guessing mechanism, that seems like a good topic for lang and compiler to discuss. (Not completely clear which team it would fall under, but in any case those two teams should coordinate on it. Also not clear whether we'd want such a mechanism.)

However, adding a couple of targeted aliases that are widely used and that are just a simple translation (map the alias to the canonical name and act in every way as if the canonical name were passed in the first place) seems like something that just needs a PR and a clear explanation, and a compiler/language ack.
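A minimal sketch of what "a couple of targeted aliases" could look like, as opposed to a general mechanism. The table TARGET_ALIASES and the function resolve_target are hypothetical, and the entries are only the examples raised in this thread. The idea is that the alias is resolved once, up front, and everything afterwards sees only the canonical name.

```rust
/// Hypothetical alias table: (alias, name rustc already accepts).
const TARGET_ALIASES: &[(&str, &str)] = &[
    ("x86_64-pc-linux-gnu", "x86_64-unknown-linux-gnu"),
    ("x86_64-linux-gnu", "x86_64-unknown-linux-gnu"),
    ("i686-pc-linux-gnu", "i686-unknown-linux-gnu"),
    ("i686-linux-gnu", "i686-unknown-linux-gnu"),
];

/// Hypothetical lookup: returns the canonical name for a known alias,
/// and the input unchanged for everything else.
fn resolve_target(name: &str) -> &str {
    for (alias, canonical) in TARGET_ALIASES {
        if *alias == name {
            return *canonical;
        }
    }
    name
}
```

Because the rest of the compiler would only ever see the canonical name, cfg(target_vendor = "unknown") would keep matching and target_vendor = "pc" would not.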

If you're asking what they accept for --target, they accept an enormous pile of different things. The parser is violently backwards-compatible in a way that I'm not sure rustc should be? Both accept the incredibly vague elf32 to mean i386-unknown-none-elf, and the way you do detection is, as another poster mentioned, hilariously ad-hoc.

It's probably worth studying the concrete need here over a "GCC does it this way, clearly we should too". Target triples are a bit of an inexact science and mostly not super meaningful beyond an atomic "this is the ABI". The OS and machine architecture aren't quite as separable as people think, and frankly the "vendor" component was probably a mistake (there will never be a non-Apple Darwin).

(Incidentally, I think the lack of x86 bare-metal targets is a bit of a bug, but that's neither here nor there.)

Yes. Since autotools seem to be part of the argument, and autotools was presumably written to match gcc/clang here, I figured it would make sense for Rust to follow what those compilers do. Certainly if they would differentiate between "pc" and "unknown", I would consider it a mistake for rustc to treat them as equivalent.

Now you say that full compatibility with gcc/clang is likely not desirable. That's fair, but the entire thread here (as I understand it) is about better compatibility with "other tools using target triples", so -- how compatible with gcc/clang do we need to be to make things work well?