Read my message again.
First of all, ‘typical’ is doing a lot of work here, especially since half-width versus full-width rendering depends on the current locale. Second, if you’re weighting grapheme clusters by their expected column width, you’re no longer merely counting them or indexing by them. And third, counting grapheme clusters, even weighted, is not enough. Take this program:
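import wcwidth  # third-party package (pip install wcwidth)

def highlight(t0, t1, t2=''):
    # print the full line, then a row of carets under the middle fragment
    print(t0 + t1 + t2)
    print(' ' * wcwidth.wcswidth(t0) + '^' * wcwidth.wcswidth(t1))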
# noooo, you can’t just count grapheme clusters to measure the width of text on a particular display device, it will fail to handle bidirectional text correctly, nooooo
highlight("haha, ", "Latin-based assumptions", " go brrrr")
highlight("something \"😀😃😄😁\" ", "something")
highlight("I have read ‘吾輩は猫", "である", "’ recently.")
# keyword arguments added to alleviate direction confusion in the syntax-highlighted source
highlight(t0="The inscription read ‘מנא מנא תקל ", t1="ופרסין", t2="’; I didn’t understand what it meant.")
When I run the above example under a VTE-based terminal, the first three samples display reasonably, but in the last one the wrong fragment is highlighted:
Under xterm (and uxterm) the Hebrew text is rendered backwards:
It’s not enough to count grapheme clusters, or even to weight them by expected column width; to underline text correctly you need to know whether your output device can correctly render bidirectional text supplied in logical order, and basically implement the Unicode Bidirectional Algorithm on your own (to convert text into left-to-right order and/or to compute proper column spans). At this point you’re far enough removed from grapheme cluster indexes that using those doesn’t really afford you any advantages.
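To make the failure mode concrete, here is a minimal sketch of a guard that at least detects when left-to-right column arithmetic cannot be trusted. It does not implement the Bidirectional Algorithm; it only checks bidi character classes via the standard library’s unicodedata module. The name needs_bidi_layout and the choice of which classes to flag are my own assumptions for illustration, not anything rustc or wcwidth does.

```python
import unicodedata

# Bidi classes of strongly right-to-left characters (Hebrew, Arabic, ...).
RTL_CLASSES = {"R", "AL"}
# Explicit directional formatting characters: embeddings, overrides, isolates.
EXPLICIT_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}
FLAGGED = RTL_CLASSES | EXPLICIT_CLASSES


def needs_bidi_layout(text):
    """Return True if text contains characters that can reorder the display.

    Hypothetical helper: a True result means that summing column widths
    left to right may place the underline under the wrong fragment.
    """
    return any(unicodedata.bidirectional(ch) in FLAGGED for ch in text)


if __name__ == "__main__":
    for sample in ("Latin-based assumptions", "である", "ופרסין"):
        print(repr(sample), needs_bidi_layout(sample))
```

A caller could fall back to not underlining at all when this returns True, which is admittedly a cop-out, but it is at least an honest one.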
(rustc seems to use basically the same wrong algorithm I posted here. Also note that I haven’t even mentioned tabulation in this post, which is also relevant to this problem.)
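As for tabulation: the width of a tab is not a property of the character at all, it depends on the column where the tab happens to start. A rough sketch, assuming fixed tab stops every 8 columns (terminals can be configured otherwise) and pretending every other character takes exactly one column; advance is a hypothetical helper name:

```python
def advance(column, text, tabstop=8):
    """Return the column reached after printing text starting at column.

    Simplification: every character other than a tab counts as one column.
    """
    for ch in text:
        if ch == "\t":
            # jump to the next multiple of tabstop
            column += tabstop - column % tabstop
        else:
            column += 1
    return column


if __name__ == "__main__":
    # The same tab character advances by a different amount each time.
    print(advance(0, "\t"), advance(3, "\t"), advance(7, "\t"))
```

So the ‘width’ of a string containing tabs cannot be computed without knowing where on the line it begins.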
Well, the caveats we keep pointing out do cast some doubt on the usefulness of certain string APIs, and given that such APIs’ presence would encourage devising half-, ahem, -hearted solutions to common problems, they provide some argument for those APIs to be considered harmful™ and therefore excluded.