Setting our vision for the 2017 cycle

Rust has great potential for those who want their code to be directly callable from the widest set of languages.

I focus on high-performance, on-demand image asset processing servers and libraries; the next generation of these is Imageflow, an (originally) C library. I recently ported the extern API to a Rust cdylib, ported some of the Ruby demo server to Rust, and wrote a demo command-line app in Rust as well. I’m not happy with the current state of the Imageflow Rust codebase, though; I’m still focused on regaining behavioral parity with the original C.

As a library author, I find Rust compelling because

  • Control flow is obvious and explicit, and it encourages correctness. It is less magical than C++, easier to learn, and has fewer footguns. Lure: fewer bugs.
  • It enforces many kinds of safety, preventing the most common categories of vulnerabilities in libraries. Lure: lower likelihood of catastrophic security vulnerabilities.
  • It has no GC, and is theoretically* capable of exposing a C-compatible API that can be used from nearly every popular programming language on earth. Lure: re-use parity with C

Achieving parity with C

Robustness in low-memory conditions

Graceful handling of malloc failure is expected of C libraries and most C servers. See common justifications for abort() on malloc failure - debunked. In my previous-gen software (the server version), malloc failure simply caused that HTTP request to return a 500. There are better ways to handle this, of course, but aborting the process is not one of them.

There is no clear path to achieving this in Rust.

  • There is no reliable way to know whether an API allocates. If you have advice/ideas on creating a compiler plugin to help with this determination, let me know. Knowing which APIs and trait implementations to avoid would permit carefully coding around them, and writing replacements for stdlib collections.

  • Allocation failure nearly always causes an abort instead of a panic.

  • set_oom_handler was created to provide better abort messages, not to allow swapping out abort for panic!. stdlib may not be robust against malloc failures that result in recoverable panics (although this should be easy to verify through automated tests). Others have requested that panic-on-oom at least be possible for collections.

Remember that malloc failure != OOM. Although you should avoid allocations during unwinding (does this happen IRL?), a malloc failure for 192MB (an uncompressed smartphone pic) is unlikely to be followed by a malloc failure for 1KB.

Coming from C, Rust’s current *alloc failure behavior seems a glaring regression in correctness, especially given its otherwise wonderful focus on robust and correct behavior. The status quo feels very contrary to Rust’s philosophy.

There are many use cases where graceful handling of malloc failure makes sense:

  • You have a work queue. Failed tasks can go to the back of the queue with a decremented retry value.
  • A web server. If a request fails, you toss that state and return a 500.
  • Any kind of scheduler or server with heuristics or resource consumption planning. If you can’t tolerate a malloc failure, you can’t have a feedback loop in your algorithm.
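
The work-queue case above could look roughly like the sketch below. It assumes a much newer toolchain than the one this post was written against: it relies on `Vec::try_reserve_exact`, which was only stabilized later (Rust 1.57). `Task`, `run_task`, and `drain` are hypothetical names made up for illustration, not part of Imageflow.

```rust
use std::collections::{TryReserveError, VecDeque};

/// Hypothetical task: needs a scratch buffer of `bytes` bytes.
pub struct Task {
    pub id: u32,
    pub bytes: usize,
    pub retries_left: u8,
}

/// Tries to allocate the scratch buffer. An allocation failure comes back
/// as an `Err` instead of aborting the process.
pub fn run_task(task: &Task) -> Result<usize, TryReserveError> {
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve_exact(task.bytes)?;
    // Capacity is already reserved, so this does not reallocate.
    buf.resize(task.bytes, 0);
    Ok(buf.len())
}

/// Runs every task. A failed task goes to the back of the queue with a
/// decremented retry value. Returns (completed ids, abandoned ids).
/// Caveat: the bookkeeping `Vec`s and the queue itself still use the
/// ordinary, aborting allocation path.
pub fn drain(mut queue: VecDeque<Task>) -> (Vec<u32>, Vec<u32>) {
    let mut done = Vec::new();
    let mut abandoned = Vec::new();
    while let Some(mut task) = queue.pop_front() {
        match run_task(&task) {
            Ok(_) => done.push(task.id),
            Err(_) if task.retries_left > 0 => {
                task.retries_left -= 1;
                queue.push_back(task);
            }
            Err(_) => abandoned.push(task.id),
        }
    }
    (done, abandoned)
}
```

A real scheduler would presumably wait or shed load before retrying; the sketch only shows that a failed large allocation can be treated as an ordinary, recoverable error.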

I began porting my C library over to Rust when catch_unwind (formerly catch_panic) reached stable. I did this under the assumption that out-of-memory caused a panic, which turned out to be a common misconception on Reddit.
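
For context, this is roughly what I was counting on. It is a minimal sketch using `std::panic::catch_unwind`, the stable API for catching panics; the status codes and the `demo_scale` and `scale_dimension` functions are hypothetical and not Imageflow’s actual API.

```rust
use std::panic;

/// Hypothetical status codes for a C-compatible API.
pub const STATUS_OK: i32 = 0;
pub const STATUS_PANICKED: i32 = -1;

/// Hypothetical worker that panics on bad input.
fn scale_dimension(pixels: u32, divisor: u32) -> u32 {
    if divisor == 0 {
        panic!("divisor must be non-zero");
    }
    pixels / divisor
}

/// Hypothetical extern entry point. A panic inside the closure is turned
/// into an error code instead of unwinding into the C caller.
pub extern "C" fn demo_scale(pixels: u32, divisor: u32, out: *mut u32) -> i32 {
    match panic::catch_unwind(|| scale_dimension(pixels, divisor)) {
        Ok(value) => {
            if !out.is_null() {
                // The caller promises `out` points to a writable u32.
                unsafe { *out = value };
            }
            STATUS_OK
        }
        // Only panics land here. An allocation failure that aborts the
        // process never reaches this arm.
        Err(_) => STATUS_PANICKED,
    }
}
```

The pattern works for panics, but it does nothing for an abort, which is exactly the gap described above.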

Also +1 for Vulkano’s TROUBLES.md; Imageflow will hopefully use Vulkano sometime next year.

Error reporting in production

There’s a common pattern in C libraries.

  1. Create a context. The context allocates space for an error message and stacktrace.
  2. Pass the context to every function call thereafter.
  3. Every function call returns NULL or false upon failure, but first populates the error ID, error message (which can include key variable state), and the first row of the backtrace with line number, function name, and file name. As the failure cascades, each function explicitly adds itself to the backtrace, usually via a small macro.
  4. The end user can call an API to copy the error message and backtrace to any buffer they provide.
  5. This means that rare and troublesome production bugs are not nearly so painful to debug. You always have a useful backtrace and clear error message, if, at least, you’ve been consistent in following the simple conventions required.
  6. Users of your library get very clear error messages, with backtraces, and error codes for dealing with recoverable situations.
  7. The app can log those errors and backtraces and continue to run.

Can one accomplish the same result in Rust? What about in a Rust sandwich (Rust code which both exposes an FFI and consumes another, and both expose this kind of error reporting)?

If possible, but simply undocumented, I’d be happy to help.
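
To make the question concrete, here is a rough sketch of how steps 1 to 4 might be mirrored in Rust. Everything in it (`Context`, `raise`, `add_frame`, `write_error`, the `here!` macro, and the two decode functions) is a hypothetical name invented for illustration. Unlike the C version, it allocates when recording an error, so it does not address the malloc concerns above.

```rust
/// Hypothetical error context, mirroring the C convention (step 1).
#[derive(Default)]
pub struct Context {
    error_code: i32,
    message: String,
    backtrace: Vec<String>,
}

impl Context {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records the error ID, message, and first backtrace row (step 3).
    pub fn raise(&mut self, code: i32, message: String, frame: String) {
        self.error_code = code;
        self.message = message;
        self.backtrace.push(frame);
    }

    /// Adds one more row as the failure cascades upward (step 3).
    pub fn add_frame(&mut self, frame: String) {
        self.backtrace.push(frame);
    }

    pub fn has_error(&self) -> bool {
        self.error_code != 0
    }

    /// Copies message and backtrace into a caller-provided buffer,
    /// truncating if needed; returns the bytes written (step 4).
    pub fn write_error(&self, buf: &mut [u8]) -> usize {
        let text = format!(
            "error {}: {}\n{}",
            self.error_code,
            self.message,
            self.backtrace.join("\n")
        );
        let n = text.len().min(buf.len());
        buf[..n].copy_from_slice(&text.as_bytes()[..n]);
        n
    }
}

/// Small macro that builds a backtrace row for the call site.
macro_rules! here {
    ($func:expr) => {
        format!("  at {} ({}:{})", $func, file!(), line!())
    };
}

/// Every call takes the context (step 2) and returns false on failure.
fn decode_header(ctx: &mut Context, data: &[u8]) -> bool {
    if data.len() < 4 {
        let msg = format!("header too short: {} bytes", data.len());
        ctx.raise(10, msg, here!("decode_header"));
        return false;
    }
    true
}

fn decode_image(ctx: &mut Context, data: &[u8]) -> bool {
    if !decode_header(ctx, data) {
        ctx.add_frame(here!("decode_image"));
        return false;
    }
    true
}
```

What this does not answer is the sandwich case: how such a context would carry errors from a C dependency through Rust and back out through another FFI layer.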

Vectorization & Performance

It’s unlikely that Imageflow will become 100% Rust within the next. Not necessarily for lack of effort or desire, but rather due to performance constraints (assuming other blockers disappear).

Imageflow currently depends on libjpeg-turbo and libpng, both of which use large quantities of assembler to achieve the required performance.

For Imageflow itself, GCC offers enough hints that we can get excellent vectorization from carefully tuned, vectorizable C and a few SIMD intrinsics. We stop short of actually using assembler.

Clang/LLVM doesn’t fare as well as GCC, and produces significantly slower runtime code for Imageflow. Should I expect more from Rust?

Come to think of it, image codecs are a great way to evaluate runtime performance.

Image codecs are the perfect storm of complexity

  • Extreme focus on performance (or they’re not usable - they’re a common bottleneck)
  • Highly complex state machines
  • Malicious input
  • Massive matrices.
  • Graceful malloc failure handling.
  • Often used on embedded and low-ram devices.
  • Flexible I/O interfaces
  • Async (resumable I/O) support. For real, in C, with setjmp.
  • Defective specifications which enable attacks.
  • Defective files you have to support … because Photoshop
  • High numbers of data dependencies
  • Complex metadata formats; a single image may contain IPTC, EXIF, and XMP (xml) metadata.
  • Aforementioned metadata includes rotation data AND color profile information, both required for processing.
  • Color profile conversions may require runtime code generation.
  • Algorithms tend not to be well optimized by compilers; assembler proliferates.
  • Branching too common for porting more than certain passes to the GPU.
  • APIs are used from all languages, and must be flexible. (There aren’t many codecs written in languages other than C/C++).
  • Often use or permit custom allocators per context.

Are image codecs something Rust intends to be good at in 2017?

Would a Rust image codec be sufficiently interesting as a real-world benchmark that its performance issues would inspire compiler improvements?

It’s 1am, so I’ll conclude with two hopes:

  • I would love to see an official document tracking Rust’s parity status for various patterns in C. Maybe this can tie into the C/C++ to Rust guide mentioned earlier.
  • I would love to see Rust improve its parity with C for the things above, particularly around malloc failures.