Precise semantics of `no_mangle`?


#1

tl;dr: #[no_mangle] does more than control mangling. The extra effects don’t appear to be documented, and reliance on them is spreading, cargo-cult style, through the Rust embedded community. It would be nice to get this defined or fixed before too many people are relying on it.

Documentation of #[no_mangle]

The book says,

The no_mangle attribute turns off Rust’s name mangling, so that it is easier to link to.

The reference says,

on any item, do not apply the standard name mangling. Set the symbol for this item to its identifier.

This seems totally reasonable.

Behavior of #[no_mangle] in practice

And yet #[no_mangle] seems to additionally imply external linkage. Here’s a pub static:

pub static ISR_VECTORS : VectorTable = ...;

(In embedded applications, the interrupt vector table is typically the true program entry point, and the proper root for the linker reachability pass. So it is not uncommon for such a static to exist without being referenced from Rust code.)

When linked into a staticlib without #[no_mangle]:

$ nm target/thumbv7em-none-eabi/debug/libemb1.a | grep VECTORS
00000000 r _ZN4emb111ISR_VECTORS17he9e8248937e58715E

The lowercase “r” means the symbol is local and read-only, even though the static is pub.

Attempting to name this symbol as the reachability root in the linker script (using ENTRY in the case of GNU ld) fails, because the symbol is local.

Adding #[no_mangle]:

$ nm target/thumbv7em-none-eabi/debug/libemb1.a | grep VECTORS
00000000 R ISR_VECTORS

Two things have changed:

  1. The name is not mangled (expected).
  2. The symbol is now global (not expected).
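For concreteness, the two declarations compared above can be sketched roughly as follows. This is only an illustration: the `VectorTable` layout is a hypothetical stand-in (the real initializer is elided above), and `ISR_VECTORS_MANGLED` is a made-up name so that both variants can sit in one file. The attribute is spelled `#[unsafe(no_mangle)]` here, which recent toolchains accept; before the 2024 edition it is written as plain `#[no_mangle]`.

```rust
// Hypothetical stand-in for the vector table type; the real layout is not
// shown in this post.
#[repr(C)]
pub struct VectorTable(pub [usize; 4]);

// Plain `pub` static: the symbol name is mangled, and in the staticlib build
// described above nm listed it as a local symbol ("r").
pub static ISR_VECTORS_MANGLED: VectorTable = VectorTable([0x2000_0000, 0, 0, 0]);

// Same data with the attribute applied: in the build described above nm
// listed the unmangled name as a global symbol ("R").
#[unsafe(no_mangle)]
pub static ISR_VECTORS: VectorTable = VectorTable([0x2000_0000, 0, 0, 0]);
```

From the Rust side the two statics behave identically; the difference only shows up in the symbol table.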

#[no_mangle] is becoming #[export] #[used]

At the moment, #[no_mangle] seems to be the most reliable way to get a symbol exported from a staticlib, or to get a symbol to live long enough in a binary that the linker script can act upon it. I’ve tried various tricks using #[export_name] and #[linkage(_)] without success.

It also seems to be the only way to preserve a symbol that is not otherwise referenced from Rust code in a binary, long enough that the linker can act on it. (Correct me if I’ve overlooked it, but it seems that Rust does not have a “used” attribute at this time.)

People are noticing this, copying it into their code, and suggesting it as a solution when symbols are linker-GC’d too early. (Here’s one from March; Zinc, too, uses #[no_mangle] pervasively for this purpose.)

I think this abuses the attribute a bit, at least given the way it’s currently defined in the reference.

If there’s been discussion on this and #[no_mangle] is intended to have permanent, reliable effects on linking, we should document that. I’d personally advise against this, because symbol naming and linkage are orthogonal, and I occasionally need unmangled local symbols for interaction with things like kernel-aware debuggers. However, there’s evidence in the compiler that the current behavior is deliberate; this lint specifically triggers on the case I just mentioned, and #[no_mangle] is explicitly punned as export here.

On the other hand, if people are relying on a curious behavior of the current compiler implementation, we may want to separate the effect of #[no_mangle] into multiple attributes and introduce a “bridge” lint to encourage people to apply them all if needed.

Finally: is there a different, reliable way of getting a symbol exported, even if rustc can’t see it being used?

Thanks!


#2

And yet #[no_mangle] seems to additionally imply external linkage.

Ugh, linkage attributes are set on a best-effort basis. They can depend on 1) the item being pub and reachable from other crates, 2) the item being used in inline functions, 3) #[no_mangle], 4) extern "ABI", 5) the item being related to lang items, and 6) the crate type (lib or exe). If the combination of all these factors makes an item likely to be used from the outside and linked to, let’s put external linkage on it! I don’t think any of this is specified in any way. Amusingly, this “nothing is specified” was used as a motivation for rejecting the RFC about #[used]. Someone needs to write one more RFC specifying what is guaranteed about link-time symbols and what is not, and proposing #[used] again.
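For readers unfamiliar with the idea: a `#[used]` attribute along these lines did eventually land in later stable Rust releases. A minimal sketch, assuming such a toolchain, with a hypothetical `VectorTable` type standing in for the real one. Note that `#[used]` only asks the compiler to keep the static in the emitted object file; it does not change the symbol's name or visibility, and the linker may still discard it unless the linker script keeps the section.

```rust
// Hypothetical stand-in for the vector table type.
#[repr(C)]
pub struct VectorTable(pub [usize; 4]);

// `#[used]` asks the compiler to emit this static into the object file even
// though no Rust code refers to it. It applies only to `static` items.
#[used]
static ISR_VECTORS: VectorTable = VectorTable([0x2000_0000, 0, 0, 0]);
```

This separates "keep the data" from "export the name", which is the split the first post asks about.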

Behavior of #[no_mangle] in practice

#[no_mangle] items are also required to have unique names within the crate, regardless of which module they live in.
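A small sketch of what that uniqueness requirement means in practice. The module and function names are hypothetical; the attribute is spelled `#[unsafe(no_mangle)]`, which recent toolchains accept (plain `#[no_mangle]` before the 2024 edition).

```rust
pub mod uart {
    // Exported under the bare symbol name `uart_init`; the module path is
    // not part of the symbol.
    #[unsafe(no_mangle)]
    pub extern "C" fn uart_init() -> u32 {
        1
    }
}

pub mod spi {
    // This needs a name distinct from every other unmangled item in the
    // crate. If both functions were called `init`, both would claim the
    // symbol `init` and the build would be rejected, even though
    // `uart::init` and `spi::init` are different Rust paths.
    #[unsafe(no_mangle)]
    pub extern "C" fn spi_init() -> u32 {
        2
    }
}
```

Without the attribute, the module path is folded into the mangled name, so two functions with the same identifier in different modules coexist without trouble.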


#3

It kind of felt that way. This makes things somewhat hard to reason about as a programmer, of course.

I suspect that most folks who care about symbol visibility and linkage are doing FFI, and FFI with C-style languages tends to involve #[no_mangle], so this probably felt reasonable. (I bet this has also held true for dynamic libraries.)

In the Rust-embedded case we’re having to override the compiler/linker’s notion of reachability and entry points without any FFI, in which case #[no_mangle] technically works but feels like an abuse.


#4

My mental model is that symbols are owned by rustc by default (e.g., if the symbol is private, rustc can emit a differently-typed “arg-promoted” symbol instead of the expected one, as long as it handles it correctly), and #[no_mangle] transfers ownership of the symbol to the programmer.

Now, because ownership is transferred to the programmer, rustc’s unspecified symbol mangling scheme can’t be used, so the symbol is left unmangled.

Having an unmangled rustc-owned symbol makes pretty much no sense (you can’t actually use it because the compiler owns it), so #[no_mangle] implies #[linker_owned]. There is no standalone #[linker_owned] because nobody has implemented it.