I just made fresh clones of llvm-git-prototype and of rust with its current llvm submodules.
The disk usage of llvm-git-prototype is 1383M total – 601M in .git and 782M for the checkout.
$ du -BM -d1 | sort -h
1M ./debuginfo-tests
1M ./libunwind
1M ./parallel-libs
4M ./libclc
7M ./libcxxabi
7M ./openmp
11M ./clang-tools-extra
15M ./lld
29M ./llgo
32M ./polly
42M ./compiler-rt
43M ./libcxx
99M ./lldb
136M ./clang
363M ./llvm
601M ./.git
1383M .
For the rust submodules, remember that the git metadata is stored separately under .git/modules, so the checkouts and the metadata have to be measured separately.
$ du -BM -c -s src/{llvm{,-emscripten},tools/{clang,lld,lldb}} | sort -h
14M src/tools/lld
99M src/tools/lldb
135M src/tools/clang
227M src/llvm-emscripten
355M src/llvm
828M total
$ du -BM -c -s .git/modules/src/{llvm{,-emscripten},tools/{clang,lld,lldb}} | sort -h
31M .git/modules/src/tools/lld
162M .git/modules/src/tools/lldb
432M .git/modules/src/tools/clang
881M .git/modules/src/llvm-emscripten
882M .git/modules/src/llvm
2386M total
Again, note that those in src/tools/ are optional for general rust developers, as is llvm-emscripten. You really only need src/llvm, and not even that if you use an external LLVM (but then you shouldn’t need the monorepo either).
But note that the monorepo’s .git is already much better packed, despite having more content! So the monorepo’s full 1383M can be directly compared to:
$ du -BM -c -s src/llvm .git/modules/src/llvm
355M src/llvm
882M .git/modules/src/llvm
1237M total
So the monorepo actually reduces network use for the git data (601M versus 882M), and is only a little bit bigger in total disk usage (1383M versus 1237M) once the checked out data is included.
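To make the arithmetic behind that comparison explicit, here is a small sketch in shell. It only re-uses the sizes from the du -BM output above; the variable names are my own, and the results are only as precise as du's rounding to whole megabytes.

```shell
#!/bin/sh
# Sizes in M, copied from the du -BM output above (not measured by this script).
mono_git=601        # llvm-git-prototype/.git
mono_checkout=782   # llvm-git-prototype checkout
split_git=882       # rust: .git/modules/src/llvm
split_checkout=355  # rust: src/llvm

mono_total=$((mono_git + mono_checkout))
split_total=$((split_git + split_checkout))

# Git data is roughly what has to be fetched over the network.
git_saved=$((split_git - mono_git))
# Extra disk used by the monorepo once everything is checked out.
extra_total=$((mono_total - split_total))

echo "monorepo total:         ${mono_total}M"
echo "src/llvm submodule:     ${split_total}M"
echo "git data saved:         ${git_saved}M"
echo "extra total disk usage: ${extra_total}M"
```

The extra disk usage comes entirely from the larger checkout, since the monorepo checks out all the subprojects and not just llvm.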
EDIT: I usually have an alias du='du -h', but I think some of the rounding from M to G is confusing here. I’ve updated the numbers using du -BM for a more consistent comparison.