💡 Proposal: Add `include_c_str!` Macro to Rust

The Rust language provides excellent build-time embedding tools with the existing include_str! and include_bytes! macros, allowing us to embed static file content directly into the final binary. These are incredibly useful for configuration, static assets, or templates.

However, when working with Foreign Function Interfaces (FFI), especially those leveraging C libraries, we often encounter a common friction point: safely and efficiently embedding null-terminated C-style strings.

The Current Situation

  1. include_str!: Returns &'static str (a UTF-8 slice). Converting this to a C string at runtime requires an allocation (via CString::new) and a check for interior null bytes; that check returns a Result, which must be handled or unwrapped (and unwrapping can panic). If the included file is known to contain valid C-string data, this conversion overhead is unnecessary.

  2. include_bytes!: Returns &'static [u8]. While this gives us the raw bytes, we still need to manually verify it is null-terminated and safely create a &'static core::ffi::CStr from it, often requiring unsafe code and careful handling of the final null byte.
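To make the two routes above concrete, here is a minimal sketch. The constants TEXT and BYTES are hypothetical stand-ins for what include_str! and include_bytes! would return for a real file; the function names are made up for illustration.

```rust
use std::ffi::{CStr, CString};

// Hypothetical stand-ins for the output of include_str!/include_bytes!.
const TEXT: &str = "hello";
const BYTES: &[u8] = b"hello\0";

/// Route 1: runtime conversion from `&str`. Allocates and scans for NULs.
pub fn via_cstring() -> CString {
    CString::new(TEXT).expect("embedded text contains an interior NUL")
}

/// Route 2: no allocation, but correctness rests entirely on the author.
pub fn via_unchecked() -> &'static CStr {
    // SAFETY: BYTES ends with exactly one NUL and has no interior NULs.
    unsafe { CStr::from_bytes_with_nul_unchecked(BYTES) }
}
```

Both functions yield the same C string; the first pays at runtime, the second pays in unsafe code that nothing re-checks if the file changes.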

The Case for include_c_str!

I propose adding a new macro, include_c_str!, that would work as follows:

  • Signature: It would take a file path literal, similar to the existing macros.
    // Example usage:
    let c_string: &'static core::ffi::CStr = include_c_str!("path/to/my_c_string.txt");
    
  • Return Type: It would return a &'static core::ffi::CStr.
  • Compile-Time Guarantee: The macro would perform compile-time validation on the file's contents to ensure:
    1. The file contains no internal null bytes (\0) other than the optional, final terminator.
    2. The file is properly null-terminated. If the file itself is not null-terminated, the macro should potentially append a null byte during embedding.
    3. If any of these conditions are violated, the compilation should fail with a descriptive error.
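The validation rules listed above could be sketched as an ordinary const fn. This is only an illustration of the checks, not how a built-in macro would be implemented; the names Check and check_c_bytes are hypothetical.

```rust
/// Outcome of checking a byte slice against the proposed rules.
#[derive(Debug, PartialEq, Eq)]
pub enum Check {
    /// Exactly one NUL, and it is the final byte.
    Terminated,
    /// No NUL at all; a terminator would have to be appended.
    NeedsTerminator,
    /// A NUL was found before the final byte, at this offset.
    InteriorNul(usize),
}

pub const fn check_c_bytes(bytes: &[u8]) -> Check {
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == 0 {
            if i + 1 == bytes.len() {
                return Check::Terminated;
            }
            return Check::InteriorNul(i);
        }
        i += 1;
    }
    Check::NeedsTerminator
}

// Evaluated at compile time: a failed assertion here is a build error.
const _: () = assert!(matches!(check_c_bytes(b"hello\0"), Check::Terminated));
```

Because the function is const, the same check can back a compile-time assertion and, since it records the offset, a more descriptive error.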

Benefits

  • Safety & Ergonomics: Eliminates the need for manual unsafe code to convert raw bytes or runtime CString allocation/error handling when dealing with embedded C-style strings.
  • Performance: Provides a zero-cost abstraction for passing static C strings to FFI functions.
  • Clarity: Clearly expresses the intent of embedding a file specifically as a static C string.

This addition would provide a complete set of static file inclusion tools for the most common embedded data types in Rust:

Macro            Return Type                Use Case
include_bytes!   &'static [u8]              Arbitrary binary data.
include_str!     &'static str               UTF-8 text.
include_c_str!   &'static core::ffi::CStr   Null-terminated C-style strings.

What are your thoughts on this proposal? Would this improve your FFI workflow?

I think the nightly concat_bytes! macro is sufficient for this: something like

const { CStr::from_bytes_with_nul(
            concat_bytes!(include_bytes!("file.txt"), b"\0")) }

This would error out if there are any NUL bytes within the file itself (including at the end), but it would be easy to create (in an external crate or just for internal use) a const fn that converts a byte string to a C string if it ends with either one or two NUL bytes (thus supporting the "NUL at the end of the file" case).

As such, I don't think there needs to be special-case support for this – the tools Rust already provides (or in the case of concat_bytes! will provide in the future) seem to be enough for this.
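As a rough sketch of that idea on stable Rust (so without concat_bytes!), the NUL can be appended by a const fn and the "one or two trailing NULs" case handled by a second one. The names append_nul, cstr_lenient and cstr_from_bytes! are hypothetical, and a byte literal stands in for include_bytes!.

```rust
use core::ffi::CStr;

/// Copies `src` into an array one byte longer; the last byte stays 0.
pub const fn append_nul<const N: usize>(src: &[u8]) -> [u8; N] {
    assert!(src.len() + 1 == N);
    let mut out = [0u8; N];
    let mut i = 0;
    while i < src.len() {
        out[i] = src[i];
        i += 1;
    }
    out
}

/// Accepts input ending in one or two NULs; any other NUL is a panic,
/// which becomes a compile error when evaluated in a const context.
pub const fn cstr_lenient(bytes: &'static [u8]) -> &'static CStr {
    let n = bytes.len();
    // If the file already ended with a NUL, the appended one is redundant.
    let trimmed = if n >= 2 && bytes[n - 1] == 0 && bytes[n - 2] == 0 {
        bytes.split_at(n - 1).0
    } else {
        bytes
    };
    match CStr::from_bytes_with_nul(trimmed) {
        Ok(c) => c,
        Err(_) => panic!("input contains an interior NUL byte"),
    }
}

macro_rules! cstr_from_bytes {
    ($bytes:expr) => {{
        const SRC: &[u8] = $bytes;
        const BUF: &[u8; SRC.len() + 1] = &append_nul(SRC);
        const OUT: &::core::ffi::CStr = cstr_lenient(BUF);
        OUT
    }};
}

// In real use the argument would be include_bytes!("file.txt").
pub const PLAIN: &CStr = cstr_from_bytes!(b"hello");
pub const ALREADY_TERMINATED: &CStr = cstr_from_bytes!(b"hello\0");
```

Both constants end up as the same C string, so a file with or without a trailing NUL is accepted, while a NUL anywhere else fails the build.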

CStr::from_bytes_with_nul returns a Result instead of a CStr.

Although CStr::from_bytes_with_nul_unchecked can be used, it is unsafe.

Sorry, I meant to add a match to extract the answer, and a panic if the match fails, but forgot – a panic in a const block gives you a compile error, so it's an easy way to get the compile-time validation you're looking for.

The from_bytes_until_nul method accepts a suffix of arbitrarily many NULs, which is probably good enough.
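A small sketch of that behaviour, using byte literals in place of included files (the constant names are made up). One caveat worth keeping in mind: the method stops at the first NUL, so anything after it is ignored rather than rejected.

```rust
use core::ffi::CStr;

// Several trailing NULs are accepted; the C string ends at the first one.
pub const PADDED: &CStr = match CStr::from_bytes_until_nul(b"hello\0\0\0") {
    Ok(c) => c,
    Err(_) => panic!("no NUL terminator found"),
};

// Caveat: an interior NUL is not an error here, the data is just cut short.
pub const TRUNCATED: &CStr = match CStr::from_bytes_until_nul(b"hel\0lo\0") {
    Ok(c) => c,
    Err(_) => panic!("no NUL terminator found"),
};
```

Whether silent truncation is acceptable depends on the use case; for strict validation, from_bytes_with_nul is the stricter choice.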

To be even more precise:

#![feature(concat_bytes)] // <- eventually not needed

use ::core::ffi::CStr;

macro_rules! include_cstr {(
    $path:expr $(,)?
) => (const {
    match ::core::ffi::CStr::from_bytes_with_nul(::core::concat_bytes!(
        ::core::include_bytes!($path),
        b"\0",
    )) {
        ::core::result::Result::Ok(it) => it,
        ::core::result::Result::Err(_) => ::core::panic!("{}", ::core::concat!(
            "Encountered null byte in `", $path, "`",
        )),
    }
})}

const C_STRING: &'static CStr = include_cstr!(file!());

pub fn main() {
    dbg!(C_STRING);
}

I see two small issues with this approach, though:

  • Upon error, it does not say where (at which offset) the problematic null byte was found (note that the aforementioned third-party lib does give a nicer error message).

  • This approach yields a const _: &'static rather than a static. The latter would be more likely to guarantee that the bytes are not duplicated, and would allow relying on address stability.

    We'd just need a helper type to represent a by-value CStr. This is generally not possible, except in this kind of situation, where the compile-time info is giving us a compile-time strlen!

    use ::core::num::NonZero;
    
    #[repr(u8)]
    enum NullByte { _0 = b'\0' }
    
    #[repr(C)]
    pub
    struct ByValueCStr<const STRLEN: usize> {
        pub buf: [NonZero<u8>; STRLEN],
        pub _terminator: NullByte,
    }
    
    // TODO: impl Deref<Target = CStr> etc.
    
    // TODO: macro to populate this.
    

    Then you'd have a macro produce a static C_STR: ByValueCStr<_> = …;, and &C_STR would still Deref-coërce to &'static CStr.
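One way the two TODOs might be filled in is sketched below. It is an assumption-laden illustration rather than a finished design: the constructor name new, the GREETING static and the greeting function are hypothetical, and a const fn taking a byte array stands in for the macro.

```rust
use core::ffi::CStr;
use core::num::NonZero;
use core::ops::Deref;

#[repr(u8)]
#[derive(Clone, Copy)]
pub enum NullByte {
    _0 = b'\0',
}

#[repr(C)]
pub struct ByValueCStr<const STRLEN: usize> {
    pub buf: [NonZero<u8>; STRLEN],
    pub _terminator: NullByte,
}

impl<const STRLEN: usize> ByValueCStr<STRLEN> {
    /// Hypothetical constructor: a NUL in the input panics, which is a
    /// compile error when this runs in a const or static initializer.
    pub const fn new(bytes: &[u8; STRLEN]) -> Self {
        let mut buf = [NonZero::<u8>::MIN; STRLEN];
        let mut i = 0;
        while i < STRLEN {
            buf[i] = match NonZero::new(bytes[i]) {
                Some(b) => b,
                None => panic!("input contains a NUL byte"),
            };
            i += 1;
        }
        Self { buf, _terminator: NullByte::_0 }
    }
}

impl<const STRLEN: usize> Deref for ByValueCStr<STRLEN> {
    type Target = CStr;

    fn deref(&self) -> &CStr {
        // SAFETY: with repr(C) and byte-sized, align-1 fields there is no
        // padding, so `self` is STRLEN non-zero bytes followed by one zero.
        unsafe {
            let ptr = self as *const Self as *const u8;
            let bytes = core::slice::from_raw_parts(ptr, STRLEN + 1);
            CStr::from_bytes_with_nul_unchecked(bytes)
        }
    }
}

pub static GREETING: ByValueCStr<5> = ByValueCStr::new(b"hello");

pub fn greeting() -> &'static CStr {
    &GREETING // deref coercion from &'static ByValueCStr<5>
}
```

The type system carries the "no interior NUL" invariant through NonZero<u8>, so the only unsafe code is the layout argument inside deref.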
