Proposing to move from internal libserialize to serde, bincode and json crates
libserialize is used by the Rust compiler to serialize and deserialize internal data structures. This encoding/decoding is used in various libraries, mostly for incremental compilation. Internal compiler features such as RustcEncodable and RustcDecodable also use libserialize when implementing the extension. In short, the Rust compiler is deeply tied to libserialize (553 results in 96 files for RustcEncodable).
libserialize is based on the rustc-serialize crate, which has since been deprecated in favor of the serde crate. This means that updates to libserialize are mostly made internally by the compiler team and might not benefit from the latest improvements and bug fixes in serde. There is this PR that highlights some of the reasons why the switch has not been made before.
In this post, I would like to explore the idea of replacing libserialize with crates such as serde, bincode and serde_json, along with its pros and cons, so I'm very interested in any comments, feedback or ideas.
I believe this was explored in the past but couldn't be made to work since rustc heavily uses specialization to speed up (de)serialization, and also the metadata format has some properties that made serde a bad fit at the time (not sure if this has been resolved since).
I believe this was explored in the past but couldn't be made to work
I was almost sure that this subject had been investigated, so before opening this discussion I checked the open and closed PRs and issues regarding rustc-serialize, RustcEncodable and RustcDecodable. All I found were discussions that introduced the idea but postponed it because Rust needed to be stabilized first.
since rustc heavily uses specialization to speed up (de)serialization, and also the metadata format has some properties that made serde a bad fit at the time (not sure if this has been resolved since).
I couldn't find any information about this attempt, but if you have any links or pointers to share, I'm willing to look into whether it has been resolved or whether I can specialize serde to fit the Rust compiler's needs.
Dumb question here, simply because I don't know anything about the compiler internals...
Are there any shared references within the compiler? Serde's documentation on Arc and Rc explicitly states that when you deserialize shared pointers, each one points to a separate object. If you rely on shared objects actually being shared, then this could be problematic.
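To make the concern concrete, here is a toy sketch using only the standard library. It is not serde or libserialize code: `encode`, `decode` and `sharing_survives_round_trip` are hypothetical names made up for illustration. It only shows what "each points to a separate object" means in practice when an encoder writes the pointed-to value rather than the pointer identity.

```rust
use std::sync::Arc;

/// Toy "encoder" (hypothetical): writes the pointed-to value,
/// not the identity of the allocation.
fn encode(value: &Arc<String>, out: &mut Vec<String>) {
    out.push((**value).clone());
}

/// Toy "decoder" (hypothetical): allocates a fresh Arc for every value read,
/// so nothing is shared between the results.
fn decode(input: &[String]) -> Vec<Arc<String>> {
    input.iter().map(|s| Arc::new(s.clone())).collect()
}

/// Round-trips two handles to the same allocation and reports whether
/// they still point to the same allocation afterwards.
fn sharing_survives_round_trip() -> bool {
    let original = Arc::new(String::from("shared"));
    let alias = Arc::clone(&original); // same allocation as `original`

    let mut buf = Vec::new();
    encode(&original, &mut buf);
    encode(&alias, &mut buf);

    let decoded = decode(&buf);
    Arc::ptr_eq(&decoded[0], &decoded[1])
}
```

With this naive scheme `sharing_survives_round_trip()` returns `false`: the two decoded values are equal, but they live in two separate allocations.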
Yes, the compiler interns all types for example. This is one of the reasons why serde wasn't a good fit – it didn't provide a good way to pass the interning context along.
I believe that rustc would need stateful serialization as well, but I couldn't find any SerializeSeed function. I think we would need to write custom serialization for the specific types that need to save their context, and that specialization could be added in the compiler. But I'm not sure whether that is all rustc would need, or whether serde is missing other features.
The libserialize implementation for Arc creates a new object each time it deserializes an Arc, without sharing it with other shared pointers. The rc feature of serde seems to do the same thing, but I'm not sure whether there are other aspects of libserialize that I've missed here.
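As a rough illustration of what I mean by "stateful", here is a minimal std-only sketch of a decoder that carries an interning context along. Everything in it (`Interner`, `Decoder`, `read_interned`, `decode_all`) is a hypothetical name invented for this post. It is neither rustc's actual interner nor a serde API, and strings merely stand in for the compiler's interned types.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical interning context: one shared allocation per distinct string.
#[derive(Default)]
struct Interner {
    table: HashMap<String, Rc<str>>,
}

impl Interner {
    /// Returns the existing allocation for `s`, or creates and records one.
    fn intern(&mut self, s: &str) -> Rc<str> {
        if let Some(existing) = self.table.get(s) {
            return Rc::clone(existing);
        }
        let fresh: Rc<str> = Rc::from(s);
        self.table.insert(s.to_owned(), Rc::clone(&fresh));
        fresh
    }
}

/// Hypothetical stateful decoder: it borrows the context for the whole
/// decoding pass instead of building every value in isolation.
struct Decoder<'a> {
    input: std::slice::Iter<'a, String>,
    interner: &'a mut Interner,
}

impl<'a> Decoder<'a> {
    fn new(input: &'a [String], interner: &'a mut Interner) -> Self {
        Decoder { input: input.iter(), interner }
    }

    /// Reads the next value and routes it through the interner, so equal
    /// values end up pointing to the same allocation.
    fn read_interned(&mut self) -> Option<Rc<str>> {
        let raw = self.input.next()?;
        Some(self.interner.intern(raw))
    }
}

/// Decodes every value in `input` using the given context.
fn decode_all(input: &[String], interner: &mut Interner) -> Vec<Rc<str>> {
    let mut decoder = Decoder::new(input, interner);
    let mut out = Vec::new();
    while let Some(value) = decoder.read_interned() {
        out.push(value);
    }
    out
}
```

The point of the design is that the context is an explicit field of the decoder, so any type that needs it during decoding can reach it. Whether something equivalent can be expressed cleanly on serde's serialization side is exactly the part I'm unsure about.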