Tidy versus URLs in doc comments


#1

One of my library-tinkering PRs failed on CI with tidy errors:

/checkout/src/libstd/ascii.rs:468: line longer than 100 chars
/checkout/src/libstd/ascii.rs:469: line longer than 100 chars

The offending lines are URLs in a doc comment (the last two lines of this quote):

    /// Rust uses the WhatWG Infra Standard's [definition of ASCII
    /// whitespace][infra-aw].  There are several other definitions in
    /// wide use.  For instance, [the POSIX locale][posix-ctype]
    /// includes U+000B VERTICAL TAB as well as all the above
    /// characters, but—from the very same specification—[the default
    /// rule for "field splitting" in the Bourne shell][field-splitting]
    /// considers *only* SPACE, HORIZONTAL TAB, and LINE FEED as whitespace.
    ///
    /// If you are writing a program that will process an existing
    /// file format, check what that format's definition of whitespace is
    /// before using this function.
    ///
    /// [infra-aw]: https://infra.spec.whatwg.org/#ascii-whitespace
    /// [posix-ctype]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03_01
    /// [field-splitting]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05

As far as I know, Markdown does not allow lines like these to be split in any way. It appears to be possible to disable the line-length check for the entire file with a magic comment, but that seems like much too large a hammer. Has this come up before? What would people think of adding a heuristic to tidy that would allow this kind of overlength line? Perhaps

if line ~ m!^\s*/// \[[^\]]+\]:\shttps?://.+$! { dont_complain(); }

(please forgive the macaronic pseudocode :wink:)
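
For concreteness, here is a minimal Rust sketch of the same heuristic using only the standard library. The names `is_doc_link_definition`, `line_too_long`, and `MAX_LINE_LEN` are made up for illustration and are not taken from tidy's source. Counting `char`s rather than bytes is also an assumption, based only on the wording of the error message.

```rust
/// Limit taken from the error message ("line longer than 100 chars").
const MAX_LINE_LEN: usize = 100;

/// Hypothetical helper, not part of tidy: returns true when `line` looks like
/// a doc-comment link reference definition, e.g. `/// [label]: https://...`.
/// Mirrors the pseudocode pattern `^\s*/// \[[^\]]+\]:\shttps?://.+$`.
fn is_doc_link_definition(line: &str) -> bool {
    // Skip indentation, then require the literal `/// [` opener.
    let rest = match line.trim_start().strip_prefix("/// [") {
        Some(rest) => rest,
        None => return false,
    };
    // Everything up to the first `]` is the link label; it must be non-empty.
    let (label, after) = match rest.split_once(']') {
        Some(parts) => parts,
        None => return false,
    };
    if label.is_empty() {
        return false;
    }
    // The label must be followed by `:` and exactly one whitespace character.
    let mut chars = match after.strip_prefix(':') {
        Some(tail) => tail.chars(),
        None => return false,
    };
    if !chars.next().map_or(false, char::is_whitespace) {
        return false;
    }
    // The target must be an http(s) URL with something after the scheme.
    let target = chars.as_str();
    ["http://", "https://"]
        .iter()
        .any(|scheme| target.strip_prefix(*scheme).map_or(false, |url| !url.is_empty()))
}

/// Hypothetical line-length check: complain about overlong lines unless the
/// line is a doc-comment link definition that cannot be wrapped.
fn line_too_long(line: &str) -> bool {
    line.chars().count() > MAX_LINE_LEN && !is_doc_link_definition(line)
}
```

With this, `line_too_long` returns `false` for the `[posix-ctype]` line quoted above even though it is over the limit, while an ordinary overlong line is still reported. Like the pseudocode, it accepts anything after the scheme, so a stricter version might also reject whitespace inside the URL.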


#2

Yeah, this sounds like something that could be improved. Linkchecker is very basic…


#3