I wrote the original index generator. Its primary purpose at the time was to reduce the number of S3 API calls, to keep the cost of index generation down. It appears that the cost savings ended up not being justified, especially given that generating the index at all requires listing every single file in the bucket. Those files are very numerous, and the pace at which they are added only increases!
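In the end, the problematic, serial part of index generation is making one API call per 1000 files in the bucket. Each call has to wait for the previous one to finish, because the next page of the listing is requested with a continuation token from the previous response. Eventually that adds up to a lot of time spent doing mostly nothing, and the problem will only get worse over time.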
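To give a rough sense of how this scales, here is a minimal sketch of the arithmetic. The function names, the object count, and the per-call latency are all made up for illustration; only the 1000-keys-per-response page size comes from the S3 listing API.

```python
import math

# S3 list responses contain at most 1000 keys each.
PAGE_SIZE = 1000


def sequential_list_calls(object_count: int, page_size: int = PAGE_SIZE) -> int:
    """Number of list requests needed to enumerate `object_count` keys.

    Even an empty bucket costs one request, hence the max(1, ...).
    """
    return max(1, math.ceil(object_count / page_size))


def estimated_listing_seconds(object_count: int, seconds_per_call: float) -> float:
    """Wall-clock estimate, assuming pages are fetched strictly one after another."""
    return sequential_list_calls(object_count) * seconds_per_call


if __name__ == "__main__":
    # Hypothetical numbers: 2.5 million objects, 0.2 s per request.
    print(sequential_list_calls(2_500_000))
    print(estimated_listing_seconds(2_500_000, 0.2))
```

Since the number of calls grows linearly with the number of files and the calls cannot overlap, the total time grows linearly as well.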
Now, for dated directories, generating a listing is not much of a problem, because the S3 API supports filtering the listing by a prefix (so e.g. filtering by dist/2019-01-15 will only return files in said directory). This is why listing the dated directories works. Alas, the same is not possible for the dist/ directory itself, because it does not have a unique prefix that could be used to filter for just the files there: everything in the dated subdirectories starts with dist/ as well!
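The difference can be sketched with an in-memory stand-in for the bucket. This does not call S3; `list_with_prefix` is a hypothetical helper that only mimics prefix filtering, and the key names are invented for illustration.

```python
def list_with_prefix(keys, prefix):
    """Mimic prefix-only filtering: keep every key that starts with `prefix`."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical bucket contents; the real file names will differ.
KEYS = [
    "dist/current-a.tar.gz",
    "dist/current-b.tar.gz",
    "dist/2019-01-14/old-a.tar.gz",
    "dist/2019-01-15/old-a.tar.gz",
    "dist/2019-01-15/old-b.tar.gz",
]

if __name__ == "__main__":
    # A dated prefix selects exactly the files of that one directory.
    print(list_with_prefix(KEYS, "dist/2019-01-15"))
    # The dist/ prefix matches every key, dated subdirectories included.
    print(len(list_with_prefix(KEYS, "dist/")), "of", len(KEYS))
```

With a dated prefix the result is small and bounded by that day's files, while the dist/ prefix walks the whole tree.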
I believe that changing the directory structure somewhat, to make sure that the "current" dist/ does not end up containing the dated subdirectories, would resolve essentially all of the blockers here, but it would also end up breaking all the tooling that assumes the current structure...