Anybody interested in a Languages-hosted-in-Rust WG?


#1

Hello all!

I’d like to propose starting a working group that would bring together people who have an interest in implementing programming languages in Rust!

While many language runtimes are implemented in C, from Python to GHC, there is plenty of room for Rust to host its own share of runtimes with all the benefits that Rust brings while also widening Rust’s reach.

A possible vision

  • a language with a runtime that is to Rust as CPython is to C
    • familiarity, broad appeal
  • that doesn’t compromise the safety of Rust the host language
  • widens the reach of Rust, creates a new gateway to Rust
  • and is focused on close compatibility or integration with the outcomes of other WGs and the ecosystem
    • with most definitely an eye on tokio/futures
  • libraries and documentation to make Rust a great choice for implementing other languages

Initial steps

  • create a space to talk and coordinate
  • bring together any existing documentation on runtimes in Rust
    • e.g. garbage collection blog posts; the work done by pnkfelix and manishearth on integrating GCs
    • survey implementations of already-existing runtimes written in Rust, identifying patterns of abstraction, what is ergonomic, what can be improved, memory management choices
  • identify missing Rust features needed to ensure safety
    • handling drop-unsafety in garbage collection, for example

Obviously the core teams have established priorities for this year so we wouldn’t expect significant involvement from them for now. :slight_smile:

What do you all think?

Would anybody consider participating in such a WG?

EDIT: until a WG with gh repo and chat channel etc is established, feel free to collaborate on this Dropbox Paper doc


#2

Would this WG be more concerned with EDSLs or writing runtimes for other languages such as Haskell in Rust?

It’s a pipe-dream (due to a lack of time) of mine to RIIR GHC’s runtime :slight_smile:


#3

Runtimes for other languages. Agreed, GHC-in-Rust would be awesome! Have you seen https://github.com/gluon-lang/gluon ?


#4

Only at a glance, but it seemed very interesting :+1:


#5

Just dropping here the awesome work that @nikomatsakis &co are doing on lalrpop, the parser generator


#6

Sounds interesting for sure.


#7

I think bridging with existing runtimes (JVM, CLR, v8, cpython, …) is important too. Not sure if that’s related or not.


#8

In this context Holyjit might be an interesting common tool for these runtimes.

@nbp recently adapted Holyjit to avoid the use of a compiler plugin, which makes it easier to integrate in other projects.


#9

I’d be interested to join. I don’t have a “serious” language project, but i like to implement small runtimes and compiler components for fun. Would be great to exchange ideas.


#10

Count me in… definitely interested


#11

I’m also currently working in this space and would be interested in chatting and collaborating!

Some of my projects:

Pikelet is a dependently typed language with a goal of application programming in mind. As noted in the GitHub description, however, it’s still not ready for wider circulation. Hopefully it will compile to LLVM and/or Cretonne one day. It will have an efficient non-uniform representation for data structures, linear types, strict evaluation, and will draw inspiration from 1ML and modular implicits. It uses LALRPOP for parsing. Not sure about GC yet.

Codespan is intended to make it easier to track source code locations, and eventually to make it easier to implement pretty diagnostic formatting for languages implemented in Rust, and to implement language servers. As of yet only the initial data structures are laid out, but I think this is super important for folks to build high-quality EDSLs and languages.

Nameless is a library I built to help out with the ickiness of name binding. Still work to be done on it though, and I’m still unsure if it’s the best approach. Could definitely be handy for prototyping though.

I’m also currently working on a type-directed external DSL for binary data parsing, with Rust as an initial target. There are some interesting issues there wrt. code generation and interactions with the crates.io ecosystem.


#12

Definitely interested in this. Have used Rust a bit in this space, but it is not very ergonomic for the exploratory programming I’m currently doing.

Nevertheless, I’d love to be chatting about this with like-minded people.


#13

I was wondering why your name seemed familiar… hython of course! :grinning:


#14

I wrote up a better summary of what I think a WG would do. If you have a Dropbox account, feel free to add comments to the document, fill in obvious gaps, or disagree strongly and continue conversation in this thread!


#15

I would be interested in helping out with this. I don’t have a Dropbox account so I’ll share some comments here:

gitter? irc?

IRC probably makes the most sense since it’s (I think) a bit more common than Gitter. We could probably have a dedicated channel on the Mozilla IRC server.

github repo? new organization or rust-lang-nursery?

As much as I would prefer GitLab over GitHub (after all I work for GitLab) I think a repository in the rust-lang-nursery group makes the most sense. The WASM group also uses this and it’s better to keep things in one place. Since we’ll mostly be dealing with text documents and issues we won’t need any fancy CI features anyway.


#16

The problem i see with irc is that most people don’t have a relay set up and only see conversation when they are online. In a small group that can make the difference between a slow, but ongoing conversation and dying out because “nothing is happening when i’m looking”. It works a lot better for immediate questions… (all imo/afaik, everyone seems to have a completely different setup in their communication channels)

Another alternative i’d like to mention is Discord. Imo it’s a lot nicer than gitter. But i understand that it not being open source could be a blocker for some.


#17

A couple points in favor of gitter:


#18

I think the idea of creating such a workgroup is very welcome, especially since sooner or later people will start creating languages on top of Rust.

For example I am currently working (since last November) on a Scheme interpreter in Rust, with the purpose of replacing Bash in my “operational” scripts:

However before I started implementing it I made a survey of other Rust-based Scheme / Lisp implementations and found quite a lot of them (18 to be more precise):

Moreover, due to the nature of Rust’s type system and ecosystem, I think it makes a perfect candidate for implementing such languages.


However, there are quite a few issues to overcome when using Rust to implement dynamically typed languages:

  • garbage collection is a huge issue; currently I rely on Rc<T> exclusively in my interpreter, which means cycles lead to memory leaks; therefore an important issue to tackle by such a workgroup is how to enable the implementation of a garbage-collection system that is as easy to use as a simple Rc<T>;

  • dynamic dispatch; which pattern should one use to dispatch functions based on the “object” and “signature” of the arguments? for example a large portion of my code is dedicated to dispatching “primitive” procedures, and especially the ones that accept a variable number of arguments;

  • efficiently crossing the barrier between the dynamic typing of the interpreted language and the static typing required by Rust; for example all my primitive functions must accept the generic Value type, from which I must then “cast” the actual Rust type one expects (like an array, or string, etc.); currently I have a set of macros that try to match the arms of the enum Value, but I have a feeling this is quite inefficient…

  • efficient implementation of “expression” structures and how to “interpret” them – i.e. writing VMs in a Rust-compliant manner; many C-based Scheme implementations out there rely on stack / byte-code based VMs, which makes a lot of sense in C, however one can’t reuse that pattern in Rust without resorting to unsafe mem::transmute all over the place;

  • handling mutable types (used in the interpreted language) in the Rust-based code; this was one of the toughest issues I had to tackle, because I now had to move around Ref<T> values; moreover when one is faced with two variants of the same type, say immutable and mutable strings, one has to juggle an enum that holds either a reference to the internals of the immutable string, or a Ref to the mutable one; (similar to Cow but not quite;)

  • interacting with “opaque” data types when crossing the bridge from the interpreted language to Rust; for example for the “core” data types in my language I’ve used enum arms, but I also introduced a Box<Any> variant for the “other” less frequent datatypes;

  • “exception” handling; Rust uses Result<T, E> as an explicit return to handle errors; many programming languages like to “throw exceptions”; therefore one’s interpreter code is full of Result<T, E> and try! (call_some_function (...));

  • related to error handling there is the issue of “backtraces”; how should one implement an efficient and useful backtrace mechanism that would allow one to mix Rust-layers and interpreted layers; currently I have two mechanisms for backtraces: one based on the backtrace library to handle Rust backtraces, and a trivial “dumb” one to handle tracing in the interpreted code;

  • parsers, especially of the PEG-style; I currently use peg as the language parser, but it lacks the ability to parse “expression-by-expression” from a stream, which means I can’t use it to implement a function that “parses the next expression” from a file; and so far I have not found a parser (in Rust) that is easy to use and able to do this;

  • having the tests of the interpreted language “integrated” within the cargo test framework; here thanks to the macro system and include_str! I was able to “somehow” integrate them, but the result isn’t the nicest one…

  • green threads, continuations, and other “concurrency” constructs; (I haven’t even started thinking about these, but after a first look they seem far from trivial to implement…)
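
As a minimal sketch of the first item in the list (Rc<T> cycles leaking), and not code taken from the interpreter discussed here: the names Node, make_cycle and make_chain are hypothetical. It builds two nodes that point at each other, and then a plain chain for comparison, and uses a Weak handle to observe whether the memory was released.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical interpreter object: a node that may point at another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

// Builds a two-node cycle and returns a weak handle to the first node,
// so the caller can check whether it was ever freed.
fn make_cycle() -> Weak<Node> {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // When `a` and `b` go out of scope, each node is still kept alive by the
    // strong reference held inside the other one, so neither is dropped.
    Rc::downgrade(&a)
}

// Same two nodes, but only `b` points at `a`: no cycle, so both are freed.
fn make_chain() -> Weak<Node> {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let _b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    Rc::downgrade(&a)
}

fn main() {
    println!("cycle still alive: {}", make_cycle().upgrade().is_some());
    println!("chain still alive: {}", make_chain().upgrade().is_some());
}
```

Breaking such cycles by hand with Weak works when the object graph has an obvious owner, but an interpreted program can build arbitrary graphs, which is why a tracing collector with Rc-like ergonomics would be valuable.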

And these are only a few items that came to mind while writing this. I’m sure there are other issues out there similar to these.
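
To make the items about the generic Value type, variadic primitives and Result-based “exceptions” a bit more concrete, here is one possible shape for them. This is a sketch under assumptions, not the code of the interpreter described above: Value, EvalError, as_int, prim_add and prim_length are hypothetical names. It uses plain match and the ? operator (the current replacement for try!) instead of macros.

```rust
use std::rc::Rc;

// Hypothetical value type for a small dynamically typed interpreter.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Str(Rc<str>),
    List(Rc<Vec<Value>>),
}

// Hypothetical error type standing in for the interpreted language's exceptions.
#[derive(Debug, PartialEq)]
enum EvalError {
    TypeMismatch { expected: &'static str, found: &'static str },
    Arity { expected: usize, found: usize },
}

impl Value {
    fn type_name(&self) -> &'static str {
        match self {
            Value::Int(_) => "int",
            Value::Str(_) => "string",
            Value::List(_) => "list",
        }
    }

    // "Cast" a dynamic value to a Rust integer, or report a type error.
    fn as_int(&self) -> Result<i64, EvalError> {
        match self {
            Value::Int(n) => Ok(*n),
            other => Err(EvalError::TypeMismatch {
                expected: "int",
                found: other.type_name(),
            }),
        }
    }
}

// A variadic primitive: accepts any number of integer arguments.
fn prim_add(args: &[Value]) -> Result<Value, EvalError> {
    let mut sum = 0i64;
    for arg in args {
        // `?` returns early with the error, much like a thrown exception.
        sum += arg.as_int()?;
    }
    Ok(Value::Int(sum))
}

// A fixed-arity primitive: slice patterns check arity and types in one match.
fn prim_length(args: &[Value]) -> Result<Value, EvalError> {
    match args {
        [Value::List(items)] => Ok(Value::Int(items.len() as i64)),
        [Value::Str(s)] => Ok(Value::Int(s.chars().count() as i64)),
        [other] => Err(EvalError::TypeMismatch {
            expected: "list or string",
            found: other.type_name(),
        }),
        _ => Err(EvalError::Arity { expected: 1, found: args.len() }),
    }
}

fn main() {
    let numbers = [Value::Int(1), Value::Int(2), Value::Int(3)];
    println!("{:?}", prim_add(&numbers));

    let mixed = [Value::Int(1), Value::Str(Rc::from("x"))];
    println!("{:?}", prim_add(&mixed));

    let list = [Value::List(Rc::new(numbers.to_vec()))];
    println!("{:?}", prim_length(&list));
}
```

A match on an enum compiles down to a branch on the discriminant, so the “cast” itself is likely cheap; whether it is a bottleneck in a given interpreter is something only profiling can tell.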


Now about the collaboration in such a workgroup, I think the best solution would be a mailing list (or equivalent).

I’ve seen IRC mentioned, and that would be a deal breaker for me (and perhaps many others).

Github issues (as is done for Rust RFC) might be a solution, but if one looks closer it actually mimics a mailing list. It might be best suited for discussing code, but for general discussions I think it’s a poor replacement for a good mailing list.

Therefore my vote (if it comes to it) is – in order of preference – a mailing list, GitHub issues, any other solution that allows notifications to be alerted via email.


I hope such a workgroup becomes reality, and makes life easier for language implementers out there.


#19

You may want to ask @Manishearth and @mystor about that, I believe they worked on a similar topic in the past: https://github.com/Manishearth/rust-gc .


#20

@pliniker

I’d say let’s just go with Gitter + GitHub issues in that case, otherwise we may get stuck in a bikeshed :slight_smile:

@ciprian.craciun

garbage collection is a huge issue; currently I rely on Rc<T> exclusively in my interpreter, which means cycles lead to memory leaks; therefore an important issue to tackle by such a workgroup is how to enable the implementation of a garbage-collection system that is as easy to use as a simple Rc<T>;

Not sure if it helps (and I don’t want to get too offtopic), but this is how Inko does it: the allocator just returns a custom pointer (which is just a light wrapper around a *mut T). It doesn’t use any reference counting, nor does it provide a way of keeping objects marked as “live” when there are references outside of the managed stack/heap (e.g. some random Rust structure). This works for Inko because I simply decided not to support storing pointers to heap memory in arbitrary places, instead when the GC runs everything is in a place it is aware of. I realise this may be a bit vague so I’m happy to discuss more (probably in a separate discussion) if wanted.
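
As a rough illustration of the general idea only (this is not Inko’s actual code, and ObjectPtr, Heap, allocate and collect are hypothetical names): the allocator hands out a thin, copyable wrapper around a *mut T with no reference count, and the collector is told about every root explicitly. The sketch assumes objects do not reference each other, so there is no tracing step.

```rust
// A thin, copyable handle around a raw pointer. It carries no reference count
// and no lifetime, so nothing stops it from dangling after a collection.
struct ObjectPtr<T> {
    raw: *mut T,
}

// Implemented by hand so the handle is Copy even when T is not.
impl<T> Clone for ObjectPtr<T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T> Copy for ObjectPtr<T> {}

impl<T> ObjectPtr<T> {
    // Safety: the object must not have been freed by `collect` or by
    // dropping the heap that allocated it.
    unsafe fn get(&self) -> &T {
        unsafe { &*self.raw }
    }
}

// The heap owns every allocation and is the only thing that frees them.
struct Heap<T> {
    objects: Vec<*mut T>,
}

impl<T> Heap<T> {
    fn new() -> Self {
        Heap { objects: Vec::new() }
    }

    fn allocate(&mut self, value: T) -> ObjectPtr<T> {
        let raw = Box::into_raw(Box::new(value));
        self.objects.push(raw);
        ObjectPtr { raw }
    }

    fn live_objects(&self) -> usize {
        self.objects.len()
    }

    // Frees every object that is not listed in `roots`. Handles stored
    // anywhere the caller forgot to list become dangling.
    fn collect(&mut self, roots: &[ObjectPtr<T>]) {
        self.objects.retain(|&raw| {
            let live = roots.iter().any(|root| root.raw == raw);
            if !live {
                // Safety: `raw` came from Box::into_raw in `allocate` and is
                // removed from `objects` right here, so it is freed once.
                unsafe { drop(Box::from_raw(raw)) };
            }
            live
        });
    }
}

impl<T> Drop for Heap<T> {
    fn drop(&mut self) {
        for raw in self.objects.drain(..) {
            // Safety: every pointer still in `objects` is a live allocation.
            unsafe { drop(Box::from_raw(raw)) };
        }
    }
}

fn main() {
    let mut heap = Heap::new();
    let kept = heap.allocate(String::from("kept"));
    let _garbage = heap.allocate(String::from("garbage"));

    heap.collect(&[kept]);
    println!("live objects: {}", heap.live_objects());
    println!("kept = {}", unsafe { kept.get() });
}
```

The trade-off matches the description above: handles are as cheap as raw pointers, but safety depends entirely on the runtime never keeping a handle somewhere the collector does not look.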