Testing and mocking based on name conversion

cathaysia · December 17, 2024, 8:23am

Currently, when writing mocks/tests we need to write some boilerplate code:

#[cfg(not(test))]
mod color;

#[cfg(test)]
mod color_mock;

#[cfg(test)]
mod color {
    pub use super::color_mock::*;
}

Or in this simpler form: Testing and mocking based on name conversion - #3 by zackw

Mock and test name conversion

Here, I propose to add support for mock/test based on name conversion:

mod color;
mod color_mock;
mod color_test;

fn op() {}
fn mock_op() {}
fn test_op() {}

Comparison with existing solutions:

#[cfg_attr(test, path = "color_mock.rs")]
mod color;

#[cfg(not(test))]
fn op() {}

#[cfg(test)]
fn op() {}

#[cfg(test)]
mod test {
    #[test]
    fn test_op() {}
}

As you can see, the first form looks simpler and easier to remember.

Existing Implementation

Some languages already support this, which suggests that name conversion is feasible. For example:

Advantages and Disadvantages

Advantages:

This reduces the amount of boilerplate code we write, which in turn reduces the occurrence of some bugs. For example: Testing and mocking based on name conversion - #9 by cathaysia
Reduced mental burden. Newbies can learn to write tests and mock modules in less than a minute.

Disadvantage:

People who don't understand the naming convention will be confused. This requires programmers to at least read "How to Write Unit Tests". Testing and mocking based on name conversion - #2 by jjpe

Others' suggestions

Testing and mocking based on name conversion - #13 by CAD97

jjpe · December 17, 2024, 10:37am

This is an example of magic naming, something that has bitten me personally in the behind not too long ago when using graphviz with the dot language.

In dot, you basically define a graph, with nodes and edges. Each node can have its own name, except there is a magic node called node that can be used to define properties for all nodes at once. Similar story for edges (i.e. edge) and graphs (i.e. graph).

Dot also has the notion of subgraphs. But the behavior is different depending on whether or not the name of the subgraph is prefixed with cluster_.

Quite useful, except when it isn't, and then it becomes a real chore to deal with.

zackw · December 17, 2024, 1:35pm

OP's specific example can be compacted using cfg_attr:

#[cfg_attr(test, path = "color_mock.rs")]
mod color;

So maybe some form of else block for cfg attributes would help the general case? For cfg_attr that's an easy extension to the existing syntax:

#[cfg_attr_ifelse(
  test;
    path = "color_mock.rs";
    path = "color_real.rs"
)]
mod color;

(semicolons separate the arms so you can still write multiple comma-separated attributes in each arm)

It's harder for plain cfg applied to an item. The least bad idea I have is to allow if cfg!(...) { ... } else { ... } at file scope, but people would immediately want to generalize that to arbitrary (const) controlling expressions and beyond. Maybe we don't want to let that genie out of its bottle...
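For comparison, here is a minimal sketch of what cfg!(...) already allows in expression position on stable Rust. The function name color_source and the returned strings are made up for illustration. The macro expands to a plain true or false at compile time, so, unlike #[cfg], both branches still have to type-check.

```rust
/// Hypothetical helper: reports which color backend this build would use.
pub fn color_source() -> &'static str {
    // `cfg!(test)` becomes a boolean literal at compile time.
    // Both branches are compiled and type-checked; only one is ever taken.
    if cfg!(test) {
        "color_mock"
    } else {
        "color_real"
    }
}
```

This works inside function bodies only; it cannot choose between two item definitions, which is the part that would need new syntax.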

cathaysia · December 17, 2024, 2:14pm

I don't think this is a magic name. There are many similar examples in reality. For example, nextjs assumes that the layout file is named layout.ts and the page file is page.tsx¹. And vitest treats files ending with _test as test files².

Names like xxx_mock or xxx_test should be simple and clear. This is the so-called "convention over configuration"³.

cathaysia · December 17, 2024, 2:15pm

Yes, you are right. But the purpose of this post is to reduce the writing of such boilerplate code, isn't it?

jjpe · December 17, 2024, 5:31pm

This is the term that came to mind as well for me. The thing is, specifically in a systems language (as opposed to a language more tied to certain domains) I do not think that it is desirable to introduce such ideas.

jdahlstrom · December 17, 2024, 7:02pm

And Rust does not subscribe to that design philosophy. On the contrary it prefers explicitness. There’s nothing else in the language that uses such magic concatenated names. For example, test modules and functions are not found by name but by attributes. The only hardcoded filenames are main.rs, lib.rs and mod.rs.
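A small sketch of the point about attributes, with made-up names (double, anything_goes, doubling_works): neither the module nor the function follows a test naming pattern, and the test is still picked up because of #[test].

```rust
pub fn double(x: i32) -> i32 {
    x * 2
}

// The module name is arbitrary. `#[cfg(test)]` is what keeps it out of
// normal builds, not the name.
#[cfg(test)]
mod anything_goes {
    use super::double;

    // No `test_` prefix: the `#[test]` attribute alone registers this
    // function with the test harness.
    #[test]
    fn doubling_works() {
        assert_eq!(double(21), 42);
    }
}
```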

jrose · December 17, 2024, 11:05pm

I don’t think that’s quite true. Module foo can be found in foo.rs or foo/mod.rs. Files found in a benches directory are assumed to be benchmark executables by default. Yes, directories and extensions are a special kind of concatenation, but in cases where convention provides enough value we do in fact support it alongside configuration.

cathaysia · December 18, 2024, 2:18am

As an example, I want to share with you a real example that happened to me. I started working with Rust last April. I learned that the boilerplate code for unit testing was like this:

#[cfg(test)]
mod test {
    #[test]
    fn test_xxx() {
    }
}

I was never sure if mod test was a magic name. Finally I found that to write a test I just needed to do:

#[test]
fn test_xxx() {
}

Yes, I omitted the #[cfg(test)] mod {} here. I found it works fine without it.

This soon became a problem: when I built the code, the symbols in the test conflicted with the symbols in the production code. It was only then that I realized the role of #[cfg(test)]. Later, I wrote more platform-specific code and learned more about cfg.
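A sketch of the situation described above, using hypothetical names (parse_level, padded). A bare #[test] function is only compiled for test builds, but a test helper is an ordinary item, so it needs #[cfg(test)] (here via the enclosing module) to stay out of the production build.

```rust
pub fn parse_level(input: &str) -> Option<u8> {
    input.trim().parse().ok()
}

// A bare `#[test]` function is left out of normal builds automatically.
#[test]
fn parses_plain_number() {
    assert_eq!(parse_level("7"), Some(7));
}

// Helpers are different: without `#[cfg(test)]`, `padded` would be a
// normal item in the production build, where it could clash with other
// names or be reported as dead code.
#[cfg(test)]
mod tests {
    use super::parse_level;

    fn padded(n: u8) -> String {
        format!("  {n}  ")
    }

    #[test]
    fn parses_padded_number() {
        assert_eq!(parse_level(&padded(7)), Some(7));
    }
}
```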

Situations like this may happen to other people. We can certainly write some boilerplate code to implement this functionality, and with careful design it can be simple. But it can never be as intuitive and simple as name conversion, which is easier to understand and does not require memorizing so much boilerplate, because the names themselves are intuitive.

Compared to the implicit control flow caused by inheritance, name conversion actually does not add much mental burden. You can learn and understand it in less than a minute. Everything is very natural.

cathaysia · December 18, 2024, 2:56am

And Rust does not subscribe to that design philosophy.

Can you explain Rust's design philosophy in detail? When I first came into contact with Rust, I heard that it seemed to be "performance and safety". Rust never seemed to stick to a certain principle like Go. Rust seems to be pragmatic in my opinion. It borrows a lot from other languages.

On the contrary it prefers explicitness.

Name conversion and explicitness don't seem to conflict. In terms of explicitness, Drop and Deref actually break this rule. If Rust favored explicitness, would it prefer defer over Drop?

zirconium-n · December 18, 2024, 3:07am

Side rant: TBF this proposal is obviously (to me) something that won't happen. It's a language change for saving a couple lines, and can be easily emulated with a macro. Whether you think it's a good change or not is not really that relevant. At most you can debate whether Rust should have done this in the beginning, but in reality it just won't happen, and using a macro is not end of the world anyway.
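As a rough illustration of the macro route, here is one possible macro_rules! sketch. The macro name mockable and its real { ... } mock { ... } syntax are invented for this example and are not an existing crate or std API. It expands to the same pair of #[cfg(not(test))] / #[cfg(test)] modules one would otherwise write by hand.

```rust
/// Hypothetical macro: one module name, a real body and a mock body.
macro_rules! mockable {
    (mod $name:ident { real { $($real:item)* } mock { $($mock:item)* } }) => {
        // Normal builds see only the real items.
        #[cfg(not(test))]
        mod $name { $($real)* }

        // Test builds see only the mock items, under the same module name.
        #[cfg(test)]
        mod $name { $($mock)* }
    };
}

mockable! {
    mod color {
        real {
            pub fn name() -> &'static str { "red (real)" }
        }
        mock {
            pub fn name() -> &'static str { "red (mock)" }
        }
    }
}
```

Callers just use color::name() and get whichever body matches the build, so the boilerplate lives in one place without any change to the language.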

cathaysia · December 18, 2024, 3:22am

The best time to change something was ten years ago, and the second best time is now. It is easiest to make a proposal like this while the language is being designed, and it is undoubtedly very difficult to implement it now. But the difficulty of implementation does not mean that we cannot come up with ideas, right? Maybe one day the community will reach a consensus, or a strong leader will promote this idea?

CAD97 · December 18, 2024, 6:02am

A version that might actually fit Rust's design could be

mod color;

#[mock]
#[path = "color_mock.rs"]
mod color;

fn op() { … }

#[mock]
fn op() { … }

#[test]
fn test_op() { … }

with the semantics that an item tagged #[mock] is #[cfg(test)] and is allowed to shadow a non-#[mock] item.

However, Rust doesn't really want to encourage this kind of unconditional mocking for #[cfg(test)]. This just means that you're testing your mocks instead of the code you're actually going to be using at runtime. Instead, Rust much prefers you to write your code in a "sans IO" style, or make it generic over the service provider.
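A minimal sketch of the "generic over the service provider" style, assuming a made-up ColorSource trait and describe function. The production code path is the same in tests and at runtime; only the provider passed in differs.

```rust
/// Hypothetical service abstraction (not from any real crate).
pub trait ColorSource {
    fn lookup(&self, key: &str) -> Option<String>;
}

/// The logic under test. It never knows which provider it was given.
pub fn describe<S: ColorSource>(source: &S, key: &str) -> String {
    match source.lookup(key) {
        Some(color) => format!("{key} is {color}"),
        None => format!("{key} is unknown"),
    }
}

/// A fixed in-memory provider, standing in for a file- or network-backed one.
pub struct FixedColors;

impl ColorSource for FixedColors {
    fn lookup(&self, key: &str) -> Option<String> {
        // Only one key is known; everything else is a miss.
        (key == "sky").then(|| "blue".to_string())
    }
}
```

A test then calls describe(&FixedColors, "sky") directly, and no item needs to be swapped out under #[cfg(test)].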

cathaysia · December 18, 2024, 12:19pm

#[mock]
#[path = "color_mock.rs"]
mod color;

Maybe this is better? It looks weird to combine two separate statements into one function.

#[mock, path="color_mock.rs"]
mod color;

CAD97 · December 18, 2024, 6:31pm

It's separate because the attributes are separate, and #[path] is just doing what #[path] does. Using multiple attributes on a single item is quite common.

Additionally, you could write something like

mod color; // ./color/mod.rs

#[mock]
mod color {
    mod mock; // ./color/mock.rs
    pub use self::mock::*;
}

instead, if you wanted to avoid using #[path]. Or mod { include!() } if you want to make it more explicit.

cathaysia · December 19, 2024, 1:54am

This attribute doesn't seem to be a departure from the existing #[cfg(test)] approach. It just adds some aliases. If we want to add name conversion, wouldn't it be better to switch to name conversion completely (rather than just adding some aliases to the current attribute)?

CAD97 · December 19, 2024, 3:52am

The difference from #[cfg] is that you only need to mark the item mock, and it will shadow the non-mock item with the same name.

Magic symbol name semantics where the code nominally says it does one thing (defines multiple items with different names) but actually does a different thing (makes one item do different things in tests than in the normal build) aren't going to be added to Rust at this point, so I was exploring a different way of addressing not the feature you're proposing but the underlying desire which the feature is servicing.

jjpe · December 19, 2024, 10:50am

It'll compile fine yes. But if you add a lot of tests it'll blow up the compile time of your project. That's one reason all tests generally live in a module marked with #[cfg(test)] - that will only compile the tests when cargo test is run.

No offense, but that is pretty subjective. It isn't natural to me at all, for example. Quite the contrary, it's just one more highly undesirable (from my POV) special case to remember.

Aside from that, an appeal to naturalness, as a form of argument, is a bit of a fallacy. There are plenty of natural but harmful things (cancer and other diseases, the manchineel tree, hurricanes, cobras, I could go on). Conversely, any kind of technology is by definition anything but natural. Yet it clearly is a good thing that e.g. Rust exists.

There is more to it, but the part of Rust's philosophy that's relevant here is "explicitness over implicitness". In other words, what's happening should always be explicitly written with as few a priori assumptions as possible. This directly conflicts with the idea of magic naming / conventions.

So you see, initially they might seem not to conflict, but they actually do.

Rust has another point in its philosophy: backwards compatibility, so that all rust code ever written should (at most with minor modifications / cargo fix) keep compiling on arbitrary future Rust versions.

That does mean though that arguing defer vs Drop is a bit of a moot point. Drop is stable, so it won't be going anywhere, ever.

cathaysia · December 19, 2024, 11:36am

explicitness over implicitness

Can you point out where this sentence appears in the official Rust documentation?

I checked the official Rust website again and found that it says: "Performance, Reliability, Productivity"

I checked the zig website and I saw it says: "No hidden control flow. No hidden memory allocations. No preprocessor, no macros."

I guess this statement probably applies more to zig than to rust.

You think that Rust's choice of drop instead of defer is a historical issue. In fact, Rust uses RAII for resource management, which is obviously contrary to what you said.

I always think that Rust is a pragmatic language. I guess the reason why many people think "explicit is better than implicit" is because of the influence of composition over inheritance.

We can see that whether it is drop, deref, MutexGuard, macro, async, the compiler actually does a lot of work behind the scenes, secretly inserting and converting a lot of code. This is not "explicit" at all.

If a file is called xxx_test.rs what are you going to use it for? Will you use it for something other than testing?

To put it more extreme, have you ever been surprised by the magic names src, doc, bench, and test that already exist in Rust?

There is an idiom in China called "Gu Ming Shi Yi", which means that when I see its name, I know its meaning. I am curious whether xxx_test.rs meets this standard? If it does, is it still a magic name?

jjpe · December 20, 2024, 7:27am

As might be gathered from the terseness of that statement, that's a slogan, not a philosophy.

Same for the zig website.

Writing down a philosophy, especially in a coherent way, is something that both will and should take more than 3 words and some punctuation.

Please don't put words in my mouth. Even though it happens to be true that Rust uses RAII, I claimed exactly none of what you quoted.

I said Drop is stable and won't be going anywhere, ever, as long as Rust is around.

It's not (primarily) that. The explicit over implicitness tenet has been reinforced over the years by blog posts, reddit, as well as on URLO and here on IRLO.

So you see, it's not just some folks thinking that. It's factually true, and has been an active tenet for almost a full decade now.

There are a few exceptions, yes. But note that 1. those are exactly that (exceptions to the rule), made because the alternatives were less palatable or just straight up inferior solutions at the technical level, 2. exactly none of that requires a user to remember magic incantations off the top of their head for fear of invoking unwanted side effects if they don't, and 3. drop and async (at least in how they're used) are merely syntactic sugar for things you could also do otherwise, e.g. manually inserting drop calls (which is sometimes actually useful) or just writing fn foo() -> impl Future<...> rather than async fn foo() -> ....

It's true that there's more going on for async (i.e. converting it to a state machine), but there's more going on with async anywhere and everywhere it exists. It's a consequence of the execution model, not of the language making some choice about explicitness.
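To make the point about manual drop calls concrete, here is a small sketch with a made-up Guard type that records when it is dropped. One guard is released early with an explicit drop(...), and the other is released implicitly at the end of its scope.

```rust
use std::cell::RefCell;

/// Hypothetical guard that logs its name when dropped.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which things happened.
pub fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let first = Guard { name: "first", log: &log };
        let _second = Guard { name: "second", log: &log };

        // Explicit: `first` is released right here, before the scope ends.
        drop(first);
        log.borrow_mut().push("end of scope");
    } // Implicit: `_second` is dropped here by the compiler.
    log.into_inner()
}
```

Calling drop_order() yields "first", "end of scope", "second", so the explicit call and the implicit scope-end drop go through the same Drop implementation and differ only in timing.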

I might use it for other purposes, yes, depending on the situation.

I have also used functions and macros in unusual-but-perfectly-valid ways, which was made possible by people not thinking they know what's best for other developers, and instead just leaving such options completely open for each individual user to decide.

That's irrelevant, as those things are stable and thus will not change regardless of what anyone thinks, says or does.

The relevant part is that your proposal actively requires work from the user to remember magic names, which I consider undesirable as a paradigm in something as fundamental as a systems programming language. It fits better in languages like Python, or like I said in DSLs.

As I indicated a couple of paragraphs above, clearly it doesn't.
