Case insensitive match pattern


#1

I’m wondering whether it’s feasible to add a special pattern that can be used with match in a case-insensitive way without allocating. One horrible way to do it outside the compiler would be to generate every case variant of each string at compile time, but a string with n letters has 2^n variants, which makes this infeasible.
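For reference, something close to this can already be written with match guards and `str::eq_ignore_ascii_case`, which compares in place and is available in `core`. This is only a minimal sketch: the `Suffix` enum and `classify` function are made-up names for illustration, not part of psl, and it handles ASCII case only.

```rust
/// Hypothetical result type, used only for this illustration.
#[derive(Debug, PartialEq)]
pub enum Suffix {
    Com,
    CoUk,
    Unknown,
}

/// Classifies `label` while ignoring ASCII case, without allocating.
/// Non-ASCII characters are compared exactly.
pub fn classify(label: &str) -> Suffix {
    match label {
        // Each guard compares byte by byte, folding only A-Z / a-z.
        s if s.eq_ignore_ascii_case("com") => Suffix::Com,
        s if s.eq_ignore_ascii_case("co.uk") => Suffix::CoUk,
        _ => Suffix::Unknown,
    }
}
```

The guards are tried in order, so this reads like a match but behaves like a chain of comparisons. Whether that is fast enough for a large suffix list would need measuring.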

I got this idea while working on optimising the psl crate’s performance. Assuming the input is already in lowercase, List::domain can process 10 million domains in less than 3 seconds. However, if I lowercase the input first to do a case-insensitive match, it takes about 13 seconds.
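If much of that extra time comes from allocating a new String per input (an assumption; it has not been profiled here), one workaround is to lowercase into a fixed-size stack buffer and match on that. The sketch below is hypothetical: `MAX_LEN`, `with_ascii_lowercase` and `is_known_tld` are illustrative names, the 253-byte limit is an assumed upper bound for a textual domain name, and only ASCII letters are folded.

```rust
/// Longest input this sketch accepts (assumed limit for a domain name).
const MAX_LEN: usize = 253;

/// Copies `input` into a stack buffer, lowercases its ASCII letters,
/// and passes the result to `f`. Returns `None` if `input` is too long.
pub fn with_ascii_lowercase<R>(input: &str, f: impl FnOnce(&str) -> R) -> Option<R> {
    let bytes = input.as_bytes();
    let len = bytes.len();
    if len > MAX_LEN {
        return None;
    }
    let mut buf = [0u8; MAX_LEN];
    buf[..len].copy_from_slice(bytes);
    buf[..len].make_ascii_lowercase();
    // Only bytes in A..=Z were changed, so the buffer is still valid UTF-8.
    let lowered = core::str::from_utf8(&buf[..len]).ok()?;
    Some(f(lowered))
}

/// Example caller: an ordinary `match` on the lowercased copy.
pub fn is_known_tld(input: &str) -> bool {
    with_ascii_lowercase(input, |s| matches!(s, "com" | "org" | "net")).unwrap_or(false)
}
```

This uses nothing outside `core`, so it should also fit a no_std crate. It still copies every input once, so it removes the allocation but not the extra pass over the bytes.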


#2

Servo uses unicase for ASCII-case-insensitive comparison, and caseless for comparison with Unicode case folding and normalization. Both work without allocating new strings.

Update: I see that unicase now supports Unicode case folding too. (It used to support only ASCII case.)
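To illustrate the general idea of comparing without allocating, here is a rough std-only sketch of a wrapper type that compares and hashes while ignoring ASCII case. `AsciiCaseless` is a made-up name and this is not how unicase or caseless are implemented; it also does no Unicode case folding.

```rust
use core::hash::{Hash, Hasher};

/// Hypothetical borrowed wrapper whose equality ignores ASCII case.
#[derive(Debug, Clone, Copy)]
pub struct AsciiCaseless<'a>(pub &'a str);

impl PartialEq for AsciiCaseless<'_> {
    fn eq(&self, other: &Self) -> bool {
        // Compares in place; no lowercased copy is created.
        self.0.eq_ignore_ascii_case(other.0)
    }
}

impl Eq for AsciiCaseless<'_> {}

impl Hash for AsciiCaseless<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed lowercased bytes to the hasher so that values which
        // compare equal also hash equally.
        for b in self.0.bytes() {
            state.write_u8(b.to_ascii_lowercase());
        }
        // Terminator byte that never occurs in UTF-8.
        state.write_u8(0xff);
    }
}
```

Because `Eq` and `Hash` agree, the wrapper can be used as a key in a hash map or set, for example to look up "COM" in a set that stores "com".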


#3

Thanks. Can caseless be compiled with no_std? Currently psl is no_std.


#4

@rushmorem I just browsed the code: no, it does not support no_std. Its dependency unicode-normalization uses std::collections::VecDeque somewhere.


#5

@MajorBreakfast Thanks. I was planning to check the source code later. You have saved me a bit of time 🙂


#6

I submitted a PR to allow caseless to work with no_std, minus the functions that require unicode_normalization:


#7

Cool, thanks!