Rustdoc - use Stork search index?

If I recall correctly, the Rust docs currently have a custom JavaScript-based search implementation.

Might be interesting to evaluate it against Stork, a WebAssembly/Rust-based search indexer for static sites: https://github.com/jameslittle230/stork/

cc @GuillaumeGomez, whom I remember talking to about this at RustFest Barcelona last year.

It requires JS to build the index, so it's not usable for rustdoc: it would require extra dependencies. But thanks for the info! :slight_smile:

:wave: Stork creator (who admittedly knows nothing about the Rustdoc build system) here. Totally fine if you don't want to keep pulling this thread, but I wanted to clarify that building the index is a completely native, non-JS process; it's all contained within the executable.

Searching from the browser, as you might expect, does involve a JavaScript frontend.

The frontend search being in JS isn't an issue.

I was misled by the README when I read "brew install". On re-reading it, I see it's not required. Strange to require brew to download rust. :wink:

I should go ahead and note one other issue with using wasm:

At least on Windows, you cannot run wasm from a webpage loaded by opening the HTML file directly from the filesystem in your browser. (That currently works for rustdoc, and it is what cargo doc --open does.) You need to run an actual server and connect to that instead. The reason is that wasm must be loaded with the correct MIME type, application/wasm, and the filesystem does not provide one.

This may be a browser thing, but I'm not sure who's at fault here or if this is by design, tbh. Browsers could skip the MIME type check for file:// URIs if they wanted to.

It's a very important detail, thanks a lot @CAD97! Yes, it needs to be able to run without server.

That's a really dumb problem to have. =/

It's been, what, two years since wasm MVP shipped?

It's not just a Windows thing. Trying to load a wasm-bindgened, webpacked site from disk in Safari gives

[Error] Cross origin requests are only supported for HTTP.
[Error] Fetch API cannot load file:///Users/nemo157/sources/cbor.nemo157.com/dist/048579358ee0369cf25b.module.wasm due to access control checks.

and in Firefox

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at file:///Users/nemo157/sources/cbor.nemo157.com/dist/048579358ee0369cf25b.module.wasm. (Reason: CORS request not http).

It looks like WebAssembly/ES Module Integration and HTML: WebAssembly Javascript Module Integration will eventually allow loading WASM modules in much the same way as JS modules. Unfortunately, JS modules seem to have similar limitations, which means they won't work on file:// URLs either:

To get modules to work correctly in a browser, you need to make sure that your server is serving them with a Content-Type header that contains a JavaScript MIME type such as text/javascript. If you don't, you'll get a strict MIME type checking error along the lines of "The server responded with a non-JavaScript MIME type" and the browser won't run your JavaScript.

You need to pay attention to local testing — if you try to load the HTML file locally (i.e. with a file:// URL), you'll run into CORS errors due to JavaScript module security requirements. You need to do your testing through a server.

JavaScript modules - JavaScript | MDN

So, unless something changes about the file:// security context, it doesn't look like serverless (no, not that serverless) WASM is going to be a thing.

Since the limitations are about network/filesystem access, if you really needed to, you could embed the wasm binary in your page's JavaScript and use that ArrayBuffer instead of fetching the bytes over the network or from the filesystem.
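
For what it's worth, here is a minimal sketch of that embedding approach. It is not Stork's actual loader: the base64 string is a tiny hand-assembled module that exports a hypothetical add function, and the names WASM_BASE64, decodeBase64, and instantiateEmbedded are made up for illustration. Because the bytes come from a string inside the script itself, nothing is fetched, so no MIME type or cross-origin check should be involved.

```javascript
// A 41-byte wasm module exporting add(a, b) = a + b, base64-encoded.
// This is an illustrative stand-in for a real module; a real one would be
// far larger.
const WASM_BASE64 =
  "AGFzbQEAAAABBwFgAn9/AX8DAgEABwcBA2FkZAAACgkBBwAgACABags=";

// Decode a base64 string into the raw bytes of the module.
function decodeBase64(b64) {
  const binary = atob(b64);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

// Compile and instantiate straight from the in-memory bytes.
// WebAssembly.instantiate accepts a typed array, so no fetch() is needed.
function instantiateEmbedded() {
  return WebAssembly.instantiate(decodeBase64(WASM_BASE64));
}

instantiateEmbedded().then(({ instance }) => {
  console.log(instance.exports.add(2, 3)); // 5
});
```

The same code should work from a page opened via file://, since the only thing loaded from disk is the script carrying the string.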

I guess it would work, but that's a really inefficient way of storing and delivering wasm. It'd be nice to have a more idiomatic alternative.

@GuillaumeGomez,

It does run without a server: the JavaScript library and the WASM bundle are both loaded from a CDN (and that CDN takes care of the MIME type, etc.). To clarify usage, you'd just have to include the script tag[0], point the library at your index file (which will successfully load from the filesystem), and everything else would "just work," even on Windows (iirc, haven't tested it in a bit).

Strange to require brew to download rust.

Ha! When I have some time, I'll put it in Cargo. My intended audience (at least at the start of the project) was web developers over Rustaceans, though it might be time for that to change :slight_smile:

[0]:

<script src="https://files.stork-search.net/stork.js"></script>

@jameslittle230 the more important thing is that locally generated rustdoc documentation can be used offline. (I personally would be OK with having cargo doc --open spawn a little web server to get around file:// issues, but my opinion doesn't matter much.)

(EDIT: gah, why can't I retarget which post this is a reply to)

I see, if that's the case then Stork probably isn't a good fit for now.

Thanks for considering and chatting!

Where should I go to find out why these restrictions apply to file:// URIs?

I suppose it'd be a big security issue if you could, from an https:// URI, read an arbitrary file:// URI, especially if you're getting raw bytes (like via WASM loading APIs) rather than "just" attempting to interpret the file as HTML, CSS, or JS. But I would think that file:// to file:// requests could (should?) be considered same-origin for CORS.
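
To make the same-origin question concrete, here is a small sketch using Node's URL class. The paths are hypothetical, and browsers differ in how they report and treat file:// origins, so take this as an illustration of the "opaque origin" idea rather than a description of any particular browser.

```javascript
// Hypothetical paths, used only to compare origins.
const page = new URL("file:///home/user/target/doc/index.html");
const wasm = new URL("file:///home/user/target/doc/search.wasm");
const site = new URL("https://example.com/doc/index.html");

console.log(site.origin); // "https://example.com"
console.log(page.origin); // "null" in Node: file: URLs get an opaque origin

// An opaque origin serializes as "null" and matches nothing,
// not even another opaque origin.
function sameOrigin(a, b) {
  return a.origin !== "null" && a.origin === b.origin;
}

console.log(sameOrigin(page, wasm)); // false, even though both are file://
```

So two files sitting side by side on disk are not treated as sharing an origin here, which is the behavior the question above is asking about.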

Similarly, it might make sense to turn off strict MIME checking for file:// URIs, since the fs doesn't (can't?) set a MIME type. Or maybe have the browser set the MIME type for files via extension, the same way a file serving web server would.

I understand that "load this through py -m http.server or similar" isn't that big of a requirement for developers... except when it is, like for cargo doc --open/rustup doc.
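
For reference, the "assign a MIME type by extension" idea is roughly what a small static file server does. Below is a sketch using only Node's standard library; MIME_TYPES, mimeTypeFor, and createDocServer are names invented for this example, the table is deliberately incomplete, and nothing here reflects how cargo or rustdoc actually behave.

```javascript
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical extension-to-MIME table; a real server would cover more types.
const MIME_TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".css": "text/css",
  ".wasm": "application/wasm",
};

// Pick a Content-Type from the file extension, as a file server would.
function mimeTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || "application/octet-stream";
}

// Build (but do not start) a server that serves files under rootDir.
function createDocServer(rootDir) {
  const root = path.resolve(rootDir);
  return http.createServer((req, res) => {
    let urlPath;
    try {
      urlPath = decodeURIComponent(new URL(req.url, "http://localhost").pathname);
    } catch {
      res.writeHead(400);
      res.end("Bad request");
      return;
    }
    const filePath = path.join(root, urlPath);
    // Refuse any path that escapes the root directory.
    if (filePath !== root && !filePath.startsWith(root + path.sep)) {
      res.writeHead(403);
      res.end("Forbidden");
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      res.writeHead(200, { "Content-Type": mimeTypeFor(filePath) });
      res.end(data);
    });
  });
}
```

Something like createDocServer("target/doc").listen(8000) would then serve the docs over http://localhost:8000/, with .wasm files getting application/wasm.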

file:// has just been on the way out for a long time; it's been hampered in various browsers for various reasons for years. It's also not consistent between browsers what restrictions exist. It's not a heavily used feature and so it's often neglected, or fixes come in that (IMHO) are a bit too heavy handed, because it doesn't really matter (or so it seems to me, anyway).

The issue then is, if I can get you to download and open an HTML file, I can make it send me any file on your computer. (Or if it's restricted to files within the same directory, I can still read any file in your Downloads folder, or your Desktop, or wherever you save it.)

"Don't open HTML files from untrusted sources" is good advice, of course, but browser makers still consider this too big a risk to leave up to caveat user. (I believe there are already risks of this sort, but the goal is to reduce these over time rather than increase them.)

CVE-2019-11730 makes mention of this being exploited in a "popular Android messaging app".

That doesn't seem unreasonable to me as a workaround. (Though I also think it's worth having a conversation with browser vendors about MIME types and WebAssembly.)

Well, except for the fact that stork.wasm is 222kB currently, so that'd be 300kB of base64 in your JS.
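
The arithmetic behind that estimate, as a quick sketch: base64 emits 4 output characters for every 3 input bytes, padding the final group. The helper name base64Length is made up here, and 1 kB is assumed to mean 1000 bytes.

```javascript
// Length of the padded base64 encoding of byteCount bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

const wasmBytes = 222_000; // 222 kB, assuming 1 kB = 1000 bytes
console.log(base64Length(wasmBytes)); // 296000, i.e. roughly 300 kB
```

That is a one-third size increase before any compression the server or browser might apply.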

At the very least, you'd want a "file:// hack" mode and a "real server" mode.

So I think the path forward would be to get a WASM API that works for file:// URIs. That probably means letting <script type="module"> work on file:// URIs, and having the planned import * as wasm from 'file.wasm'; for modules work on file:// URIs too. And that "probably" is "just" having browsers assign MIME types to file:// URIs based on file type.

And hopefully CORS doesn't get in the way of module imports for file:// URIs, though I will admit that allowing reading arbitrary files is almost definitely a no-no.
