Better support translations for The Book?


#21

Yes, that decision is kind of blocked, but my proposal makes it possible to solve problems like:

… Some sections (pages) are tens of screens long, and to provide smooth transition from one version to another we should track smaller units than entire files (web pages)…

A more complex way to do this is to add some kind of identifier to each file (something like UUID). If the identifiers of the files are identical, we can cross-link them…

The approach with MITI meta-data solves the problem of synchronizing translations independently of the question of “how the multi-lang book will be organized in terms of repo structure”. It also definitely solves keeping the UI/HTML views in sync when switching between languages, because the MITI data can be rendered into HTML anchors (which are not available in markdown).
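To illustrate the anchor idea, here is a minimal sketch. The marker format (`<!-- miti id=… crc=… -->`), the `miti-` anchor prefix, and the function name are assumptions made up for this example; nothing in the thread fixes them.

```python
import re

# Hypothetical marker format: an HTML comment carrying the unit's id and CRC.
MITI_MARKER = re.compile(r"<!--\s*miti\s+id=([0-9a-f-]+)\s+crc=([0-9a-f]{8})\s*-->")


def markers_to_anchors(markdown: str) -> str:
    """Replace each MITI marker comment with an HTML anchor carrying the same id."""
    return MITI_MARKER.sub(lambda m: f'<a id="miti-{m.group(1)}"></a>', markdown)
```

Since the same id would appear in every translation, a language switcher could link to the same `#miti-<id>` fragment on the translated page instead of only to the top of the page.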

I’ve read the issue and I see it has other related but different aspects:

  • suitable folder/layout structure for multi-lang support
  • possibility to switch (or not) if no translation is provided in the other language
  • necessary updates in mdbook

#22

As we already know, @steveklabnik is leaving Mozilla, but we hope someone will replace you. So I will leave my proposal here for later, or I can move it into a GitHub issue if it looks OK and reasonable.

2. How we can update the writers’ flow and how that can be achieved

I wouldn’t like to disturb writers with all that technical stuff, so it would be good to have the MITI meta-info added to the markdown automatically. I thought about a ‘git-hook’ (preferably server-side) or some similar mechanism (suggestions are welcome).

So the logic could be as follows:

  • Any completely new MITI added to the text will not have meta-info; it will be added by the git-hook (or another mechanism). Whether a MITI is new or changed is up to the writer.

  • If a MITI already exists with a computed HASH/CRC, the CRC will be recalculated and updated by the git-hook (it will either stay the same as before or get a new value)

  • If the writer moves a MITI to another page/file AND the MITI content has not changed, the writer leaves the meta-info unchanged (the git-hook will recompute the CRC to the same value, see #2)

  • If the writer moves a MITI to another page/file and changes the MITI/paragraph by more than XX% (say 50%), the writer can REMOVE the meta-info, so a new UUID+CRC will be generated by the git-hook

  • When a translation into another language is branched (or merged with the original), all the MITI meta-info stays ‘as is’ inside the translated text/repo/branch. A MITI can be translated OR not yet. The MITI meta-info will later help to find and identify MITIs more quickly between the original and translated repos. If a translation is present and the CRC is equal in both the original and the translation, then it’s OK; otherwise the MITI needs an update in the translation

  • Removed MITI content is removed together with its MITI meta-info

  • When documentation is moved to a new book repo/branch (say, for making a new version of the document), all previous content is preserved and later updated by the logic above.

With such an approach, any MITI can be found/identified by UUID anywhere in the original documentation (even if moved to another MD file) AND it can be tracked/checked inside all other translation(s).
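The hook logic above could be sketched roughly like this. Everything concrete here is an assumption for illustration only: the marker format (`<!-- miti id=… crc=… -->` on the line directly above a unit), treating each blank-line-separated block as one MITI, using CRC32, and the names `crc_of` and `stamp`.

```python
import re
import uuid
import zlib

# Hypothetical marker line placed directly above each unit.
MARKER = re.compile(r"^<!-- miti id=(?P<id>[0-9a-f-]+) crc=(?P<crc>[0-9a-f]{8}) -->$")


def crc_of(text: str) -> str:
    """CRC32 of the unit's text, as 8 lowercase hex digits."""
    return format(zlib.crc32(text.strip().encode("utf-8")) & 0xFFFFFFFF, "08x")


def stamp(markdown: str) -> str:
    """Add a marker to unmarked blocks and refresh the CRC of marked ones."""
    out = []
    # Assumption: one blank-line-separated block is one MITI.
    for block in re.split(r"\n\s*\n", markdown.strip()):
        lines = block.split("\n")
        match = MARKER.match(lines[0])
        if match:
            # Existing unit: keep its id, recompute the CRC from the current text.
            unit_id, body = match.group("id"), "\n".join(lines[1:])
        else:
            # New unit, or the writer removed the marker: generate a fresh id.
            unit_id, body = str(uuid.uuid4()), block
        out.append(f"<!-- miti id={unit_id} crc={crc_of(body)} -->\n{body}")
    return "\n\n".join(out) + "\n"
```

Running `stamp` twice gives the same result, a moved unit keeps its id as long as the marker travels with it, and deleting the marker is how a writer would ask for a new UUID+CRC. Splitting on blank lines is a simplification: it would wrongly split fenced code blocks that contain blank lines, so a real hook would need a proper markdown parser.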


#23

3. Translators approach

The translation contribution team works by its own internal flow/process. The team reaches internal agreement/consensus on which MITI(s) are completely translated and what can be merged into the parent repo via PRs. All MITIs are identified and traceable.

Probably a task could be configured in the CI/CD build script for tracking ‘known/new/updated/deleted’ MITI IDs+CRCs in the original repo/branch/folder? I’m not completely sure here. By running such a task periodically (e.g. manually), we could store info about the parsed UUID/CRC values and put the differences into translation reports (or translation tasks later) when something is new/updated/removed in the original MITI/text and differs from its translation.
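A rough sketch of what such a task might compute, assuming the UUID→CRC pairs have already been parsed out of both repos into plain dictionaries, and assuming the translation keeps the CRC of the original text it was translated from. The function name and the report shape are made up for this example.

```python
from typing import Dict, List


def translation_report(original: Dict[str, str], translation: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify MITI ids by comparing id -> CRC maps from the original and a translation."""
    return {
        # Present in the original, not yet in the translation.
        "new": sorted(i for i in original if i not in translation),
        # Present in both, but the original text has changed since it was translated.
        "updated": sorted(
            i for i in original if i in translation and original[i] != translation[i]
        ),
        # Still in the translation, but removed from the original.
        "deleted": sorted(i for i in translation if i not in original),
        # Present in both with matching CRC: nothing to do.
        "known": sorted(i for i in original if translation.get(i) == original[i]),
    }
```

The "new", "updated" and "deleted" lists are what would go into a translation report or be turned into translation tasks.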


#24

Zola (formerly Gutenberg) by @Keats has finished implementing multi-language support and will have it in its next release.

I don’t think Zola is far away from feature parity with mdBook. Someone invested in multilang could easily set up a mock migration to put that to the test.


#25

We could have a GitHub bot that, for every commit made to the English version of the book, creates a PR for each of the translations, translating English to the appropriate language using, e.g., the Google Translate API.

The maintainers of the translation could then adapt the PR appropriately.

One way to start a new “translation” could be to run Google Translate on the master version of the English book, and then add the translation to the bot so that it is kept up to date (at least via PRs).


#26

I don’t feel like automatic translation would add much there. Following all changes would suffice.

Wouldn’t it help to have the language book track the version it translates in some form? Then, changes can be worked off in order from there.