I’m with the nay-sayers. Every real string processing algorithm I’ve seen can easily be amended to use UTF-8 with byte offsets and/or appropriate iterators. Usually it also becomes more efficient, more general (e.g. comes to support code points outside the BMP with no extra effort), or both. See also: UTF-8 Everywhere. I’ll pick on your examples to demonstrate:
I’m not sure I understand where those requirements come from. Is the code public? What do you need the char offset for, or in other words, why can’t you use an iterator? Alternatively, if you have the byte offset, you can get the char at that offset, and the byte offset of the next char, with char_range_at. If you somehow need a char offset, you can keep track of that in a separate counter, or with enumerate() if you use the chars() iterator.
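To make that concrete, here is a minimal sketch of both approaches. It assumes a current stable toolchain, where (as far as I know) char_range_at is no longer available, so it uses str::get, chars(), char_indices() and char::len_utf8 instead. The function names char_and_next and locate are made up for illustration.

```rust
/// Returns the char starting at `byte_offset` and the byte offset of the
/// next char, or `None` if the offset is out of range, at the end of the
/// string, or not on a char boundary.
fn char_and_next(s: &str, byte_offset: usize) -> Option<(char, usize)> {
    // `get` refuses offsets that would split a multi-byte sequence.
    let c = s.get(byte_offset..)?.chars().next()?;
    Some((c, byte_offset + c.len_utf8()))
}

/// Finds the first occurrence of `needle` and reports both its char offset
/// and its byte offset, using `enumerate()` as the separate char counter.
fn locate(s: &str, needle: char) -> Option<(usize, usize)> {
    s.char_indices()
        .enumerate()
        .find(|&(_, (_, c))| c == needle)
        .map(|(char_idx, (byte_idx, _))| (char_idx, byte_idx))
}
```

The byte offset is what you keep around for slicing; the char offset only exists for as long as something (an error message, say) actually needs it.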
Come to think of it, I also wonder why a base85 converter works with strings in the first place. Base85 is binary <-> ASCII. To encode a &str, I’d just encode its bytes (&[u8]). To decode a &str, decode to UTF-8 bytes and convert them to a &str with one of the from_* constructors.
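A rough sketch of that shape, assuming the plain Ascii85 alphabet ('!' through 'u') without the `z` shortcut or the `<~ ~>` delimiters. This is not the converter under discussion, just an illustration that neither direction needs char offsets: encode takes bytes, decode returns bytes, and the caller decides whether those bytes are text.

```rust
/// Encodes arbitrary bytes as Ascii85-style text.
fn encode(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(4) {
        // Zero-pad a short final chunk to a full 32-bit group.
        let mut buf = [0u8; 4];
        buf[..chunk.len()].copy_from_slice(chunk);
        let mut n = u32::from_be_bytes(buf);
        // Five base-85 digits, most significant first.
        let mut digits = [0u8; 5];
        for d in digits.iter_mut().rev() {
            *d = (n % 85) as u8 + b'!';
            n /= 85;
        }
        // A chunk of k bytes is written as k + 1 characters.
        for &d in &digits[..chunk.len() + 1] {
            out.push(d as char);
        }
    }
    out
}

/// Decodes text produced by `encode`; returns `None` on malformed input.
fn decode(text: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for chunk in text.as_bytes().chunks(5) {
        if chunk.len() == 1 {
            return None; // a lone trailing character carries no full byte
        }
        let mut n: u64 = 0;
        for i in 0..5 {
            // Pad a short final group with the highest digit, 'u'.
            let b = *chunk.get(i).unwrap_or(&b'u');
            if !(b'!'..=b'u').contains(&b) {
                return None;
            }
            n = n * 85 + u64::from(b - b'!');
        }
        if n > u64::from(u32::MAX) {
            return None; // group does not fit in 32 bits
        }
        // A group of k characters yields k - 1 bytes.
        out.extend_from_slice(&(n as u32).to_be_bytes()[..chunk.len() - 1]);
    }
    Some(out)
}
```

Encoding a &str is then encode(s.as_bytes()), and getting text back is String::from_utf8(decode(&encoded)?), with the UTF-8 check happening once, on the decoded bytes.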
fn strip<'a>(s: &'a str, front: &str, back: &str) -> &'a str {
    assert!(s.starts_with(front));
    assert!(s.ends_with(back));
    // The byte lengths of a matched prefix and suffix are valid slice boundaries.
    &s[front.len() .. s.len() - back.len()]
}
I haven’t thought too hard about lines and words, but (1) those seem to be even rarer, in my experience, and (2) a Vec<&str> constructed from the right iterator (lines/words) allows an implementation very similar to the above.
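For lines, a minimal sketch of what I have in mind (the name strip_lines and the count-based signature are just for illustration; the word version would collect split_whitespace() instead of lines()):

```rust
/// Drops `front` lines from the start and `back` lines from the end,
/// returning the remaining lines as slices borrowed from `s`.
fn strip_lines<'a>(s: &'a str, front: usize, back: usize) -> Vec<&'a str> {
    let lines: Vec<&str> = s.lines().collect();
    assert!(front + back <= lines.len());
    lines[front..lines.len() - back].to_vec()
}
```

The indexing happens on the Vec, where a line index is the natural unit, and no copying of string data is involved.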
I know that situation, and I have sympathy. But that alone should not dictate API design, especially not for something as fundamental as string manipulation. Doubly so when it makes the Right Way™ more annoying to follow. In addition, while Rust does fairly well for small programs, I don’t think it is, or should be, actively optimized for quick and dirty scripts.
Finally, you (or anyone else who supports this proposal) are free to implement it in a library, put it on crates.io, and see how well it turns out in practice. You won’t even need to define a different string type: If your crate defines Byte, Word, etc. then impl Index<Range<Word>> for str passes the orphan check just fine.
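A sketch of what such a crate could look like. One assumption to flag: as far as I can tell, on current compilers Range<Word> does not itself count as a local type for the orphan check, so this version uses a hypothetical local range type Words(start, end) as the index instead. Words are taken to be whatever split_whitespace() yields.

```rust
use std::ops::Index;

/// Hypothetical index type: a half-open range counted in
/// whitespace-separated words.
pub struct Words(pub usize, pub usize);

impl Index<Words> for str {
    type Output = str;

    fn index(&self, Words(start, end): Words) -> &str {
        assert!(start <= end, "word range start is past its end");
        if start == end {
            return "";
        }
        // Recover each word's byte span from its position inside `self`.
        let base = self.as_ptr() as usize;
        let mut spans = self.split_whitespace().map(|w| {
            let lo = w.as_ptr() as usize - base;
            (lo, lo + w.len())
        });
        let (lo, first_hi) = spans.nth(start).expect("word range out of bounds");
        let hi = if end - start == 1 {
            first_hi
        } else {
            // `nth` already consumed words 0..=start; skip ahead to word end - 1.
            spans.nth(end - start - 2).expect("word range out of bounds").1
        };
        &self[lo..hi]
    }
}
```

With that in scope, &"the quick brown fox"[Words(1, 3)] evaluates to "quick brown", and every call pays for a scan from the start of the string, which is exactly the cost the indexing syntax would hide.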