RFC: macro functions

That's not the reason people abuse macros in C. Macros, even simple ones, are emphatically not easier to write than corresponding functions. But C doesn't (or historically didn't, and plenty of projects are stuck on C89/C99) have the notion of first-class constants, constant functions or compile-time programming. Old compilers also couldn't properly do function inlining, and many higher-level constructs, like iterators or objects, can't be expressed in C. For these reasons people resort to macros.

Rust doesn't have any of those problems, and already has 2 (!) built-in macro systems. Whatever macro abuse people can do, they would already be doing it. The point of macro functions is exactly to rein in the macros, and to give an alternative that is simpler to both read and write in the common cases.

8 Likes

I think I like this idea. The case where I see this being most useful is making the code generated by regular macros, when implementing things for lots of specific types, much more maintainable. For example, there are plenty of private macros that look something like this:

macro_rules! foo_impl {
    ($($t:ty)*) => {$(
        impl Foo for $t {
            fn foo(&self, other: &Self) -> Self {
                (self + other).some_inherent_method()
            }
        }
    )*}
}
foo_impl!{ Bar Baz Qux }

It's a bunch of repetitive trait impls for types, with some non-trivial logic inside the function. Rust's regular tooling is rightfully almost entirely useless when macro_rules is involved, so debugging and maintaining the implementations are extremely difficult, and even keeping things formatted is basically impossible. In that example, since inherent methods are being called, you can't delegate the impl to a generic function without making another trait with a whole bunch more repetitive impls. Macro functions fit that niche perfectly, so we can delegate the implementation to a function like this:

macro fn foo_impl_inner<T>(lhs: &T, rhs: &T) -> T 
where 
    &T: Add<&T, Output = T>
{
    (lhs + rhs).some_inherent_method()
}
macro_rules! foo_impl {
    ($($t:ty)*) => {$(
        impl Foo for $t {
            fn foo(&self, other: &Self) -> Self {
                foo_impl_inner(self, other)
            }
        }
    )*}
}
foo_impl!{ Bar Baz Qux }

Now the implementation is something that can be automatically formatted, can be fairly easily reasoned about by tools, and is a lot easier to debug and maintain.

5 Likes

This is exactly the problem. But yeah, maybe C is a bad example; I've seen this stuff in not-too-old C++ code too. What could've been a simple template function was a macro (min and max are typical examples).

Yes, because the alternative is even worse. And if they were even easier I believe there would be even more of them.

1 Like

That's not how the human brain works. The human brain naturally looks for the easiest short-term solution without thinking much about the long term. Yes, you can override that with willpower, but willpower is limited anyway. Removing the need for willpower is the practical solution.

2 Likes

I really like the macro type T (bikeshedding aside) syntax instead of writing only T or _. To me it points out which types are inferred from the calling context better than the alternatives do. This option also allows a generic parameter with plain type T (instead of macro type T) to behave like in a normal function, giving better control over the function signature.

8 Likes

Needless to say, I think this is an absolutely horrible idea.

We just managed to get rid of the endless confusion and impossible-to-test template libraries. Why re-introduce all of this headache?

I find the motivation extremely weak. Whatever you can't express as either parametric polymorphism or a reasonably simple macro is probably something you shouldn't be writing at all, anyway. "Macros are hard to write" is… not exactly true, either. At best, it's subjective. But if you spent a few hours writing them, you should be pretty proficient. Maybe the dollar signs look weird at first to some people, but there's really nothing fundamentally wrong or nonsensical about the way Rust's declarative macros currently work.

Most of the macros I need to write are small utilities that map compile-time constructs to runtime ones. E.g. I wrote a macro to generate a bunch of distinct unit structs representing keywords, plus an array of string literals collecting all the keywords as strings (to be recognized by a parser). This macro was easy to write, it's easy to read, it's not possible to express purely as a parametrically-polymorphic function, and it wouldn't work as a macro fn either.
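For illustration, a minimal sketch of that kind of macro (the keyword set and all the names are invented for the example):

macro_rules! keywords {
    ($($kw:ident => $text:literal),* $(,)?) => {
        // One distinct unit struct per keyword...
        $( pub struct $kw; )*
        // ...plus an array collecting every keyword as a string.
        pub const ALL_KEYWORDS: &[&str] = &[$($text),*];
    };
}

keywords! {
    Let => "let",
    Fn => "fn",
    If => "if",
}

No single parametric signature could produce three distinct struct items plus a const, which is the point.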

Furthermore, I find the "add" example absolutely unconvincing. I wouldn't ever want to write, let alone have to read code like that. It really should just be a regular generic function. If you are dissatisfied with the current assortment of numeric traits in std or num-traits, then that's another question, and it does not warrant adding a language feature that has been proven to be actively harmful based on 40 years of industry experience.

Please, don't make us write C++ in Rust again.

11 Likes

I think this is a viable idea, I certainly can relate to the problems @CAD97 mentions. I remember once writing a function that wanted to process both integers and floats, where I needed to add, sub, mul, div them. Figuring out the traits took me IIRC more than 2 hours. For a private utility function to avoid code duplication. So saying:

I find the motivation extremely weak. Whatever you can't express as either parametric polymorphism or a reasonably simple macro is probably something you shouldn't be writing at all, anyway.

is a little condescending. Sure, there are people who could have written those Ops trait bounds in 5 minutes, but that's not everyone. It felt needlessly hard to do. There were several points where I seriously considered just not doing it and duplicating the code.
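To illustrate the kind of ceremony involved (the function below is a made-up stand-in for that private helper), even a one-line body needs bounds like these, and a plain literal such as 2 would require yet more machinery (From<u8> or num-traits):

use std::ops::{Add, Div, Mul, Sub};

fn scale<T>(value: T, num: T, den: T, offset: T) -> T
where
    T: Copy + Add<Output = T> + Sub<Output = T> + Mul<Output = T> + Div<Output = T>,
{
    (value - offset) * num / den + offset
}

fn main() {
    assert_eq!(scale(10i32, 3, 2, 4), 13);
    assert_eq!(scale(10.0f64, 3.0, 2.0, 4.0), 13.0);
}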

So when should you use macro functions, then? The primary use case is private helper functions,

I would recommend that macro functions cannot be pub. Maybe pub(crate), but not part of a public API, because that's where they would cause the most issues and generally degrade library code quality. But for private helper functions I see them as a nice way to simplify and speed up writing code.

2 Likes

For what it's worth, I have a project that could benefit from proper view types but not from this feature, because the function that would take a view is an associated method of a trait.

While the design per se is interesting, I am kind of sceptical about this proposal.

I personally feel like a lot of thought went into the design of Rust's polymorphic structures compared to C++: in C++ you have classes, templates and C-style macros, and Rust adjusts each of these:

  1. Rust's trait objects are an equivalent to base classes, using composition rather than inheritance. This removes most of the implementation complexity and makes them more flexible.
  2. Rust's generics are a more restrictive version of templates that use the trait concept for early nominative type checking. This improves compilation times and error messages, harmonizes generics with trait-object polymorphism, and avoids the problems commonly associated with structural duck typing.
  3. Rust's (declarative) macros cover the more exotic cases in a more straightforward manner: they are fully explicit and their function can be understood on a syntactic level.

The problem I see with this proposal is that it doesn't fit in well with the language as designed:

  1. Generic functions already cover most of the potential applications of macro functions, in a more type-safe manner.
  2. C++ templates and their structural typing rely strongly on other features like function overloading and SFINAE magic. These do not exist in Rust.
  3. Macro functions would use yet another syntax. From a didactic standpoint, I do not see much benefit in this over teaching people how to write simple declarative macros.
  4. C++ templates are defined in header files, so users can inspect the header to reason about potential instantiations. Reintroducing this into Rust seems like a step backwards.

The only real benefits I could see here are that a) writing a macro fn could be faster for crate-internal use cases than defining and implementing an ad hoc trait, and b) macro fns could make porting or interfacing with C++ code easier.

8 Likes

I haven't seen any view type proposal which would allow using them in traits. Without being able to define fields in traits, that doesn't make much sense, since a trait can't talk about the fields of an implementing struct, while view types are all about specifying touched fields. I guess something like abstract "views" which specify borrow disjointness without talking about a specific implementation could work, but I don't recall any proposals which discuss it in detail.

Note that, if we allow macro functions in traits (which imho should be allowed), then you could recover at least part of what you want. However, macro methods obviously can't be available on trait objects (though view types likely can't as well), and there are likely issues with using them in generic code (it may require the calling function to also be a macro fn in general).

Those are not the only reasons to want this feature; I spoke about some of that above. In short, macros are too powerful in some respects, annoyingly lacking in others (lack of interaction with visibility, lack of unsafe/async hygiene, can't be used in postfix position or use type-based dispatch), and way too hard to analyze, both for tools and for users. There are also always some things which can't be expressed in a strongly typed system, or which must be expressed in a Turing-complete sublanguage in types. This means that we would either have to expand the type system with even more complex additions, or use loosely-typed Turing-complete systems anyway (proc macros, type-level programming). I'd say those alternatives are worse.

2 Likes

Basically, I want to write an iterator that's analogous to slice::IterMut, except that the iterator owns the underlying buffer and deallocates it when dropped.

This requires some sort of LendingIterator, as the elements returned from the iterator borrow from the iterator itself. But LendingIterator on its own imposes an annoying limitation: it prevents the lifetimes of the iterator elements from overlapping with one another, when the only restriction that's needed to avoid UB is that the elements not outlive the iterator.
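For context, the commonly proposed shape of that trait (not in std; shown here only to make the limitation concrete) ties each item's lifetime to a mutable borrow of the whole iterator, so two items can never be alive at once:

trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    // The returned item borrows from this `&mut self`, so the previous
    // item must be dropped before `next` can be called again.
    fn next(&mut self) -> Option<Self::Item<'_>>;
}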

With Iterator:

Lifetime of iterator:  ----------
Lifetime of element 1: -----
Lifetime of element 2:     -----
Lifetime of element 3:        ---------

With LendingIterator:

Lifetime of iterator:  ----------
Lifetime of element 1: ---
Lifetime of element 2:    ---
Lifetime of element 3:       ---

What I want:

Lifetime of iterator:  ----------
Lifetime of element 1: -----
Lifetime of element 2:    -----
Lifetime of element 3:      -----

To express this as a trait, I would need to be able to write a method signature fn next(&'a mut ViewOfIter) -> Item<'a>, where ViewOfIter is a view type of the iterator, and Item borrows immutably from a different and disjoint view type of the iterator.

The pushback is expected, but imho it's driven more by the scars of dealing with C++ templates than by the real problems of this proposal.

Rust already has multiple ways to introduce compile-time dynamic typing. Macros (in 2 forms) are one. Type-level programming is another (technically you get a specific type, practically you don't know what it normalizes to unless you fully evaluate the corresponding term, which is a Turing-complete operation). It would be nice if we could reduce the power and pervasiveness of those already existing techniques.

There is very little difference in practice between a 10-page error from a template instantiation and a 10-page type which couldn't be evaluated to the required form. It's quite easy to get such errors once you start using more complex types and traits. Not that a 10-line trait bound with complex crate-local traits is much better.

The real issue with C++ templates isn't their duck typing; it's that there is no better solution for most common cases, nor is there support for proper error generation (concepts may cover that, but it took 40 years to get them in, and they are still less powerful than Rust's traits). There is also zero language-level support for many core programming primitives, like tuples and variants, coupled with overly-generic implementations. Stuff like SFINAE is just the cherry on top.

So actively harmful that most of the current std and popular libraries are template-based. Overuse of TMP is certainly a big issue, and coupled with all the other sins of C++ it's a volatile mix, but "actively harmful" is still a huge overstatement. If anything, C++ sans templates would have been dead long ago, because it would be just a worse Java.

Unlike C++, we can rely on a strong type system, pervasive powerful static analysis, feature gates (which let us avoid shipping half-baked features) and a much more cooperative and flexible language evolution process to resolve issues with the added language flexibility. We're not stuck forever carrying old bugs, bad interfaces and a lack of proper compiler errors.

So you're saying that pin-project-lite, serde_derive, async-trait, tokio::select!, pest_generator should have never been written?

Even for something as reasonably simple as "handle all integer types", both declarative macros and generics are too complicated. The former can result in huge functions which are hard to write and to debug. The latter results in huge unwieldy trait bounds, even if you depend on num-traits (which is itself quite large and complex, and still doesn't support all operations on integers). Macro functions would fill this use case perfectly: the functions are just as easy to write as normal ones, potential misuse can be covered by a comment saying "intended to be used only with built-in uN and iN types", and you can also throw in some trait bounds if you depend on e.g. your private traits.
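A sketch in the RFC's proposed syntax (hypothetical; not valid in today's Rust) of what that could look like:

// Intended to be used only with the built-in uN and iN types.
macro fn midpoint<T>(lo: T, hi: T) -> T {
    lo + (hi - lo) / 2
}

Each call site would instantiate and type-check the body against the concrete type, so midpoint(0u8, 200u8) and midpoint(-10i64, 10i64) would both just work, without a single trait bound.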

10 Likes

That would be better served by an entirely different feature: &move T owning references. You don't need to borrow from the iterator if the iterator itself borrows the slice and the items can't overlap (borrowing from the iterator is needed for something like windows_mut, but not for simple by-ref mut slice iteration).

I have a proposal in the works, but I don't know when I will find the time to fully flesh it out.

1 Like

In this example I personally do not see why you cannot just use a macro instead. The idea of a function is (at least for me) that a user should not be forced to inspect the full implementation to be able to use the function correctly. Your example relies on "inside knowledge" to optimize for lifetimes, which violates this principle. You could argue that the macro function gives you a signature, which a macro would not, but given that you need to inspect the implementation anyway, and that the signature does not tell you the full truth, I am not sure that is really a benefit.

In short, macros are too powerful in some respects

I personally don't see this as an issue. You do not have to use macros to their full extent. "Too powerful" only becomes a problem when it means compromising on some other desirable feature.

, annoyingly lacking in others (lack of interaction with visibility, lack of unsafe/async hygiene, can't be used in postfix position or use type-based dispatch), and way too hard to analyze, both for tools and for users. There are also always some things which can't be expressed in a strongly typed system, or which must be expressed in a Turing-complete sublanguage in types. This means that we would either have to expand the type system with even more complex additions, or use loosely-typed Turing-complete systems anyway (proc macros, type-level programming). I'd say those alternatives are worse.

Declarative macros are IMO not that hard to analyse, at least not necessarily harder than macro functions. The hygiene/visibility issues should be fixable; I would also say that they could interact fairly well with visibility, but yes, there is quite a lot of legacy burden in the actual implementation.

I agree that macros cannot be used in postfix position (although I don't know how useful this really is) and that they do not offer any reflection. (However, I also don't know how this would be implemented in macro functions, and in principle it could be added to macros.)

I agree with the Turing completeness part. But here the question is, again, how macro functions would introduce Turing completeness (SFINAE?) and whether this is even desirable. In addition, we also have const functions for this.

Declarative macros are Turing-complete. There is no way to know what a macro expands to without running it. At least, unlike proc macros, we're guaranteed not to have any side effects. A macro function directly gives you the final code, which is much easier to understand.
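A trivial illustration: even this small recursive macro has no meaningful "signature", and the only way to know what count!(a b c) expands to is to actually run the expansion:

macro_rules! count {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count!($($tail)*) };
}

fn main() {
    assert_eq!(count!(a b c), 3); // expands to 1usize + 1usize + 1usize + 0usize
}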

Even disregarding analysis hitting the halting problem in principle, a macro can be quite hard to understand. While such examples often involve parsing Rust syntax, even with simple macro-based function generation the results can be unreadable. Just imagine several macros-used-as-functions calling each other.

There is also the significant compile time cost of parsing and expansion at every call site.

I don't expect macro functions to introduce Turing completeness beyond what is already available for generic functions. I would certainly not want to see SFINAE added to Rust, we can do better. Even C++ is slowly moving away from it.

Something like Zig's comptime could be a nice addition to Rust, but I have no idea how to fit it in the language.

3 Likes

Whether or not this ends up being considered a good fit for Rust, I thoroughly enjoy the idea, and the identification of an appealing point in design-space "hidden in plain sight" as it were. Bravo.

Also, not being able to ergonomically factor out nested &mut accesses (plain fns upset the borrow checker; plain macros are more bother than they're worth, both at the definition and use sites) is in fact very annoying, and whether or not this is the "right" solution to that problem, I would appreciate having some solution.
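A minimal example of the annoyance (all names invented): inlining the field access compiles, but factoring it into a plain fn makes the borrow checker treat the result as borrowing all of self:

struct Inner { x: u32 }
struct Outer { inner: Inner, log: Vec<u32> }

impl Outer {
    fn x_mut(&mut self) -> &mut u32 { &mut self.inner.x }
}

fn demo(o: &mut Outer) {
    // Inlined disjoint borrows: accepted.
    let x = &mut o.inner.x;
    o.log.push(*x);

    // Via the helper: rejected, because the signature says the returned
    // reference may borrow any part of `o`, including `o.log`.
    // let x = o.x_mut();
    // o.log.push(*x); // error: cannot borrow `o.log` as mutable
}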

8 Likes

I'm going to go through and respond to feedback soon :tm: but I wanted to real quick respond to

This is a good idea, and I think a reasonable amount of overhead to bring attention to doing these usage-typed tricks. The OP syntax does support this by fn f<T: _>, but I agree that this is likely less visible than desirable. Using macro type T in the generic list is very clear about what is occurring. (I'd use macro type T to match the logical extension of macro const N.)

3 Likes

Let me offer yet a different perspective on this RFC, with some prior art.

I prefer to talk about this feature as "transparent functions" rather than "macro functions", even though my preferred syntax is macro fn (mostly because macro is an already reserved keyword, the semantics are vaguely related, and I don't know of any better word). Calling them "macro functions" is a bit confusing, because they are much closer in their semantics to functions than to macros. For example, if the signature isn't generic, then a transparent function is basically the same as an ordinary function.

I call them "transparent functions" because the primary difference from ordinary functions is their signature transparency. Normally in Rust a function's signature is a hard boundary for any kind of compiler analysis (unfortunate exception: impl Trait in return position and its interaction with auto traits). This gives powerful stability guarantees and is particularly invaluable for public APIs. However, it also brings complications, because local type inference is significantly more powerful than global inference, and some things are impossible to imitate at the function level (e.g. disjoint borrows).

Transparent functions opt out of opaque signatures, allowing type inference, the borrow checker and possibly other analyses to look into the body of the function. This leads to a result which is reminiscent of duck typing or C++ templates, but it is neither. Instead the closest analogue is global type inference in languages such as OCaml and Haskell. In those languages there is no difference between a global function and a local closure, unlike Rust, where those have very different syntax and capabilities. You still need to declare the (parametric) signatures of all functions exported from a module, but you don't need to declare the signatures within a module.
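Today's Rust already shows this asymmetry in miniature: a closure's signature may be inferred from its uses, while a free function's never is. A small sketch:

fn main() {
    // A local closure: parameter and return types are inferred from use.
    let add = |a, b| a + b;
    let sum: u32 = add(1, 2);
    assert_eq!(sum, 3);

    // A free function with the same body must spell everything out;
    // `fn add(a, b) { a + b }` is rejected, since parameter types in
    // function signatures are mandatory.
}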

I'll stress it again: it's the same type inference, the same syntax and the same functions. We just allow the compiler to deduce more details about our code, and if there is ambiguity, compilation will fail, as usual. This is in stark contrast with C++ templates, where the compiler will happily pick wrong instantiations out of a very hard to control list of functions, and where template resolution is (mostly by design) a Turing-complete process. Hindley-Milner type inference is guaranteed to result in a single valid type in finite (but possibly exponential) time, or to fail to typecheck.

Now, it's true that global type inference can result in brittle code at large scale, where a change in the body of some function causes a cascade of inference changes, breaking something unexpected in a different place. For this reason explicitly documenting type signatures is still considered good practice in those languages. However, it works great at small scale (e.g. within a single, even large, module), and one can always supply a more specific type signature if a globally inferred one causes problems.

While Rust's type system is way more complicated than pure HM, and it's quite different from both OCaml and Haskell, I believe the same principles apply. The defaults of Rust are correct: for ecosystem stability, all types must be explicit and type inference should be only local. But I believe that relaxing that requirement and enabling global inference in specific explicitly annotated cases would be beneficial. It would facilitate simple code reuse, particularly in the cases which run afoul of the borrow checker, and reduce boilerplate in cases where explicit trait bounds do more harm than good.

12 Likes

That's not really the mental model of this I had, since I was expecting the duck typing approach -- a huge part of the win for this, in my mind, was that it could be called twice with different types but still call inherent methods on those types despite there not being a trait abstracting them.

If there is a trait, then the signature might be messy, but it can be written. So my instinct there would just be to polish the "here's the bound you meant to write" suggestions so that you do get a perfectly normal function.

But maybe it's that duck typing that makes the macro-ness important, and there's also a useful "no adhoc extension points, and no weird control flow, but checked at instantiation time not definition time" non-macro feature?

4 Likes

Imho it doesn't neatly fit either the duck typing or the global inference approach, in part because the feature may evolve in a direction which is incompatible with the simpler, older mental models.

For example, the inference of borrow regions (for e.g. disjoint mutable getters macro fn foo(&mut self) -> &mut Foo) cannot be expressed either as duck typing or as complex trait bounds. Something like generalization over integers could possibly be based on a trait, but I don't think you would expect the compiler to write a num-traits crate from scratch for you, like some GPT AI model. Features like explicit trait bounds on macro type parameters also don't really square with duck-typing.
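For concreteness, a sketch of such getters in the proposal's hypothetical syntax (not valid in today's Rust): because the compiler may look into the bodies, it can see that the two calls borrow disjoint fields, which no ordinary fn signature can express:

struct State { a: u32, b: u32 }

impl State {
    // Hypothetical: a caller could hold both returned references at once,
    // since the bodies visibly borrow disjoint fields.
    macro fn a_mut(&mut self) -> &mut u32 { &mut self.a }
    macro fn b_mut(&mut self) -> &mut u32 { &mut self.b }
}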

Ad-hoc polymorphism w.r.t. the presence of some methods with a given signature could also be expressed via structural interfaces, like in Go. If structural interfaces were added to Rust, then suddenly all functions relying on duck typing of methods would become statically typed.

It also doesn't fit precisely into duck typing because the code is still statically typed. You don't ask an object for a method with a given name and roll with whatever it gives you. You instantiate the function at call site, and then run the standard type inference algorithm, with normal trait & method resolution.

So it may not be the case that the type system is capable of giving the most generic type to a given function, but each instantiation is statically typed in the usual Rust sense. This makes the feature distinct from every other similar feature in other languages. I personally prefer the "compiler can look into the called function's body" as the mental model, which seems closest to the "global inference" analogy.

Edit: given that many Rust devs have C++ experience and were burnt by template metaprogramming, the comparison of this proposal with C++ templates and the associated reflexive aversion are quite natural. However, it's just an analogy, and not a precise one. I am offering a different analogy, with other languages and language features which have a much better track record. I hope that looking at this feature from a more positive angle may help to evaluate it on its own merits (which are imho substantial), rather than being automatically dismissed as "C++ templates".

2 Likes