Indent/pretty print compilation steps as per dependency graph?

I've been thinking it might be "nice" if there was a way for "cargo build" to show some context about the underlying dependency graph, by way of showing it in its build order.

For example, I've been packaging a lot of crates for Gentoo, and one of the nice things this lets me do is pretty-print the install order as a dependency tree.

For example, when installing the dependencies required to test syn, the output looks like this (edited to remove our toolchain-specific stuff):

 syn-1.0.5 
  insta-0.9.0 
   ci_info-0.3.1 
   console-0.7.7 
    atty-0.2.13 
    clicolors-control-1.0.1 
     lazy_static-1.4.0 
    parking_lot-0.9.0 
     lock_api-0.3.1 
      scopeguard-1.0.0 
     parking_lot_core-0.6.2 
      cfg-if-0.1.10 
      smallvec-0.6.10 
      rustc_version-0.2.3 
       semver-0.9.0 
        semver-parser-0.7.0 
    regex-1.3.1 
     regex-syntax-0.6.12 
     aho-corasick-0.7.6 
      memchr-2.2.1 
     thread_local-0.3.6 
    unicode-width-0.1.6 
    termios-0.3.1 
   difference-2.0.0 
   failure-0.1.6 
    backtrace-0.3.38 
     rustc-demangle-0.1.16 
     backtrace-sys-0.1.31 
      cc-1.0.46 
    failure_derive-0.1.6 
     syn-1.0.5 
      rayon-1.2.0 
       crossbeam-deque-0.7.1 
        crossbeam-epoch-0.7.2 
         arrayvec-0.4.12 
          nodrop-0.1.14 
         crossbeam-utils-0.6.6 
         memoffset-0.5.1 
       either-1.5.3 
       rayon-core-1.6.0 
        crossbeam-queue-0.1.2 
        num_cpus-1.10.1 
      ref-cast-0.2.7 
       ref-cast-impl-0.2.7 
      termcolor-1.0.5 
      walkdir-2.2.9 
       same-file-1.0.5 
     synstructure-0.12.1 
   pest-2.1.2 
    ucd-trie-0.1.2 
   pest_derive-2.1.0 
    pest_generator-2.1.1 
     pest_meta-2.1.2 
      maplit-1.0.2 
      sha_1-0.8.1 
       block-buffer-0.7.3 
        block-padding-0.1.4 
         byte-tools-0.3.1 
        byteorder-1.3.2 
        generic-array-0.12.3 
         typenum-1.11.2 
       digest-0.8.1 
       fake-simd-0.1.2 
       opaque-debug-0.2.3 
   ron-0.4.1 
    base64-0.10.1 
    bitflags-1.2.1 
   serde_json-1.0.41 
    itoa-0.4.4 
    ryu-1.0.1 
   serde_yaml-0.8.11 
    dtoa-0.4.4 
    linked-hash-map-0.5.2 
    yaml-rust-0.4.3

There's no need to indicate circular dependencies or, for that matter, to show when a given item occurs as a child of multiple dependencies. Simply showing its place in the graph the first time it's needed is enough to add some level of "nice".
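As a rough illustration of the "first occurrence only" rule, here is a minimal sketch. It assumes the dependency graph is available as a plain map from crate name to direct dependencies; the `Graph` alias and the `render_tree` function are hypothetical names for this example and have nothing to do with Cargo's internals.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical representation: crate name -> names of its direct dependencies.
type Graph = HashMap<&'static str, Vec<&'static str>>;

/// Render `root` and everything below it as an indented tree, one crate per
/// line. A crate is printed only the first time the walk reaches it, so
/// shared dependencies and cycles never produce a second entry.
fn render_tree(graph: &Graph, root: &'static str) -> String {
    fn visit(
        graph: &Graph,
        node: &'static str,
        depth: usize,
        seen: &mut HashSet<&'static str>,
        out: &mut String,
    ) {
        // `insert` returns false when the crate has already been printed.
        if !seen.insert(node) {
            return;
        }
        out.push_str(&" ".repeat(depth + 1));
        out.push_str(node);
        out.push('\n');
        if let Some(deps) = graph.get(node) {
            for &dep in deps {
                visit(graph, dep, depth + 1, seen, out);
            }
        }
    }

    let mut seen = HashSet::new();
    let mut out = String::new();
    visit(graph, root, 0, &mut seen, &mut out);
    out
}
```

Calling `render_tree(&graph, "syn-1.0.5")` on a map built from the listing above would produce the same one-space-per-level indentation, with every repeated crate (including the cycle back to syn through failure_derive) simply left out.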

Currently, by contrast, you get a somewhat depth-first flattened traversal of some kind, which is substantially less "nice" when it comes to giving context as to the why of each step.

   Compiling semver-parser v0.7.0
   Compiling proc-macro2 v1.0.6
   Compiling unicode-xid v0.2.0
   Compiling autocfg v0.1.7
   Compiling memchr v2.2.1
   Compiling cc v1.0.46
   Compiling arrayvec v0.4.12
   Compiling libc v0.2.65
   Compiling syn v1.0.5
   Compiling ucd-trie v0.1.2
   Compiling byteorder v1.3.2
   Compiling cfg-if v0.1.10
   Compiling rand_core v0.4.2
   Compiling nodrop v0.1.14
   Compiling lazy_static v1.4.0
   Compiling scopeguard v1.0.0
   Compiling serde v1.0.101
   Compiling bitflags v1.2.1
   Compiling failure_derive v0.1.6
   Compiling ryu v1.0.2
   Compiling smallvec v0.6.10
   Compiling maplit v1.0.2
   Compiling rayon-core v1.6.0
   Compiling linked-hash-map v0.5.2
   Compiling rustc-demangle v0.1.16
   Compiling regex-syntax v0.6.12
   Compiling syn v1.0.5 (/home/kent/.cpanm/work/1571591331.16473/syn-1.0.5)
   Compiling itoa v0.4.4
   Compiling unicode-width v0.1.6
   Compiling dtoa v0.4.4
   Compiling ci_info v0.3.1
   Compiling either v1.5.3
   Compiling difference v2.0.0
   Compiling same-file v1.0.5
   Compiling termcolor v1.0.5
   Compiling semver v0.9.0
   Compiling rand_chacha v0.1.1
   Compiling rand_pcg v0.1.2
   Compiling num-traits v0.2.8
   Compiling num-integer v0.1.41
   Compiling rand v0.6.5
   Compiling pest v2.1.2
   Compiling rand_core v0.3.1
   Compiling rand_jitter v0.1.4
   Compiling crossbeam-utils v0.6.6
   Compiling thread_local v0.3.6
   Compiling lock_api v0.3.1
   Compiling yaml-rust v0.4.3
   Compiling walkdir v2.2.9
   Compiling rustc_version v0.2.3
   Compiling backtrace-sys v0.1.32
   Compiling rand_isaac v0.1.1
   Compiling rand_xorshift v0.1.1
   Compiling rand_hc v0.1.0
   Compiling crossbeam-queue v0.1.2
   Compiling pest_meta v2.1.2
   Compiling memoffset v0.5.1
   Compiling parking_lot_core v0.6.2
   Compiling parking_lot v0.9.0
   Compiling quote v1.0.2
   Compiling aho-corasick v0.7.6
   Compiling rand_os v0.1.3
   Compiling termios v0.3.1
   Compiling atty v0.2.13
   Compiling clicolors-control v1.0.1
   Compiling num_cpus v1.10.1
   Compiling time v0.1.42
   Compiling base64 v0.10.1
   Compiling regex v1.3.1
   Compiling synstructure v0.12.1
   Compiling pest_generator v2.1.1
   Compiling backtrace v0.3.40
   Compiling crossbeam-epoch v0.7.2
   Compiling serde_derive v1.0.101
   Compiling ref-cast-impl v0.2.7
   Compiling pest_derive v2.1.0
   Compiling crossbeam-deque v0.7.1
   Compiling console v0.7.7
   Compiling ref-cast v0.2.7
   Compiling failure v0.1.6
   Compiling rayon v1.2.0
   Compiling serde_yaml v0.8.11
   Compiling ron v0.4.2
   Compiling chrono v0.4.9
   Compiling uuid v0.7.4
   Compiling serde_json v1.0.41
   Compiling insta v0.9.0
    Finished release [optimized] target(s) in 11m 49s

Surely this can be improved upon? :slight_smile:

How should this interact with parallel compilation? Currently, cargo could compile insta and serde_yaml simultaneously, which would complicate the pretty-printing. I think we would need to use some terminal escape magic to extend the pretty-printed list in-place if we wanted to get a sane result with parallel compilation.

A second complication is that any dependency can have multiple "parents" (e.g. many crates use cc).

Best option would be to support precompiled dependencies from crates.io, so that you wouldn't even have time to think about this list :slight_smile:

Like I said, for the scope of this proposal, that doesn't matter. The only interest is that the first parent that pulls a dependency into being compiled show some kind of contextual information about the why.

Cargo and friends already solve this problem, because they're already doing a depth-first traversal.

It doesn't need to show "all parents", just "a parent".

Cargo still needs to do some depth-first traversal when resolving dependencies, and parallel compilation shouldn't be an issue.

All one needs to do is inspect the stack of nodes at the given point in compilation, look upwards towards the tree root, and then use that data to indicate hierarchy.

The only useful change I would imagine is that Cargo should print the hierarchical level it's at before descending into its dependencies.

Currently it only shows "compiling foo" when the dependencies of foo are already satisfied for compilation.

It might require some clever thinking about how to represent this nicely in conjunction with parallel processing. For instance, prefixing each current target with the name of the first parent that caused it to be considered would be better than nothing.

serde_yaml-0.8.11 > yaml-rust-0.4.3
serde_yaml-0.8.11 > linked-hash-map-0.5.2
insta-0.9.0 > ci_info-0.3.1
serde_yaml-0.8.11 > dtoa-0.4.4
syn-1.0.5 > serde_yaml-0.8.11
syn-1.0.5 > insta-0.9.0
syn-1.0.5

Something like the above, for example, would be better than nothing.
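To make the "first parent" prefix concrete, here is a small sketch. It assumes the graph is a simple name-to-dependencies map and that the build order is handed in as a list; `first_parents` and `label_build_order` are made-up helper names, not anything Cargo exposes.

```rust
use std::collections::HashMap;

// Hypothetical representation: crate name -> names of its direct dependencies.
type Graph = HashMap<&'static str, Vec<&'static str>>;
// Crate name -> the first parent that pulled it in.
type Parents = HashMap<&'static str, &'static str>;

/// Walk depth-first from `root` and remember, for each crate, the first
/// parent through which the walk reached it.
fn first_parents(graph: &Graph, root: &'static str) -> Parents {
    fn walk(graph: &Graph, node: &'static str, root: &'static str, parents: &mut Parents) {
        if let Some(deps) = graph.get(node) {
            for &dep in deps {
                // Skip the root and anything that already has a parent.
                if dep != root && !parents.contains_key(dep) {
                    parents.insert(dep, node);
                    walk(graph, dep, root, parents);
                }
            }
        }
    }

    let mut parents = Parents::new();
    walk(graph, root, root, &mut parents);
    parents
}

/// Prefix each crate in `build_order` with its recorded parent, if any.
fn label_build_order(build_order: &[&'static str], parents: &Parents) -> Vec<String> {
    build_order
        .iter()
        .map(|&krate| match parents.get(krate) {
            Some(parent) => format!("{} > {}", parent, krate),
            None => krate.to_string(),
        })
        .collect()
}
```

The parent map is computed once, up front, so the labels do not depend on which job happens to finish first. That is the part that keeps this compatible with parallel compilation: each line carries its own context and nothing needs to be redrawn.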

If however cargo can determine that all dependencies of a given target are an isolated set, its opportunities for pre-printing it nicely might be better.

Do you really need it while building? If not, you can get the dependency tree before building from cargo-tree (sfackler/cargo-tree on GitHub).

Or after building, how about a filterable SVG of what was built when?

That's nice, and I do appreciate that.

However, the goal is to improve the standard output to help make developers more aware of their dependency costs.

A nice, textual, minimal representation can be achieved which improves the status quo slightly.

As someone who just wrote a cargo-tree-alike as well as a fun console screensaver-sort-of-thing, I have some ideas :wink:

The basic idea would be dynamically printing and updating/redrawing the tree based on what's compiling, complete with "throbbers" showing which packages are presently compiling. I've been using crossterm for this, for what it's worth.

A second complication is that any dependency can have multiple "parents" (e.g. many crates use cc).

cargo-tree (and by extension cargo-audit) solves this by only printing the full dependency subtree for each visited dependency once. This naturally maps to the fact that dependencies only need to be compiled once. So, when you encounter an "already compiled" dependency, perhaps print it for context (or debatably don't), but at that point you're done and don't need to re-display the previously displayed dependency subtree.

I like this idea. You could also imagine progress bars for each crate.

This is why I think it's not right to choose any/first parent for a dependency. Some heavy deps will be blamed on the first crate that started compiling, and not on all the others. This is problematic, because:

  • the developer may waste time working to remove the parent dependency that got the blame, only to discover that the build is just as slow, and the problematic dependency is still there, now attributed to something else.

  • slow leaf dependencies may be costly, but if they're used by multiple crates, in a way they're "amortized" between the crates (you could think of it as 1/nth of slowness for each of the n crates that use it). For example, syn/quote may be expensive, but I want them for serde, so I may just as well use them everywhere else.

  • compilation is parallel, so you don't see actual compilation speed of crates by observing how fast they scroll by. To show a proper tree that starts at the root, Cargo will have to buffer/reorder output lines, because it compiles starting from leaf crates. That's actually a lot of buffering, so you'll probably see almost no output during compilation, and then a whole tree dump at the end.
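The buffering point in the last bullet can be seen in a toy model. The sketch below is a sequential simplification (real Cargo schedules jobs in parallel) and assumes an acyclic graph stored as a name-to-dependencies map; `build_order` is a hypothetical helper for this example.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical representation: crate name -> names of its direct dependencies.
type Graph = HashMap<&'static str, Vec<&'static str>>;

/// Return one valid sequential build order: a post-order walk, so every
/// crate appears after all of its dependencies (assuming no cycles).
fn build_order(graph: &Graph, root: &'static str) -> Vec<&'static str> {
    fn visit(
        graph: &Graph,
        node: &'static str,
        seen: &mut HashSet<&'static str>,
        order: &mut Vec<&'static str>,
    ) {
        if !seen.insert(node) {
            return;
        }
        if let Some(deps) = graph.get(node) {
            for &dep in deps {
                visit(graph, dep, seen, order);
            }
        }
        // A crate is emitted only once everything below it has been emitted.
        order.push(node);
    }

    let mut seen = HashSet::new();
    let mut order = Vec::new();
    visit(graph, root, &mut seen, &mut order);
    order
}
```

In this model the root is always the last element, which is the crux of the objection: a tree printed root-first cannot be emitted line by line as crates finish unless the lines are held back or the tree is drawn ahead of the build.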

So this output is really really rough and can be misleading in multiple ways if you don't take care to read between the lines.

There are better ways!

  • The -Z timings flag gives a very precise cost of building. That's much better than "eyeballing" deps that go by the progress bar.

  • There's cargo tree, which can show the whole tree precisely. It can also show an inverted tree to investigate where a dependency comes from, and it can show duplicate deps.

Time is not my concern here with deps; it's complexity. Particularly when working as a vendor, you might be constrained in that you can't fetch any dependency "automatically", and you may have to ensure any dependency that enters your graph is vendored before you can use it. This naturally ends up pushing every dependency to occur at the deepest point in the graph it can occur.

cargo-tree is not presently fit for purpose here, at least for me. I'd likely have to request that a few features be added to it before I could use it as I described above (because what I'm after is inherently a traversal of build order).

After some thinking I've found an alternative suggestion that somewhat improves graph awareness in the build, but without needing an actual graph representation, nor requiring any curses magic.

Take this output from quickcheck 0.4.1:

   Compiling libc v0.2.65
   Compiling winapi-build v0.1.1
   Compiling winapi v0.2.8
   Compiling log v0.4.8
   Compiling cfg-if v0.1.10
   Compiling utf8-ranges v0.1.3
   Compiling regex-syntax v0.3.9
   Compiling kernel32-sys v0.2.2
   Compiling log v0.3.9
   Compiling memchr v0.1.11
   Compiling thread-id v2.0.0
   Compiling rand v0.4.6
   Compiling thread_local v0.2.7
   Compiling aho-corasick v0.5.3
   Compiling regex v0.1.80
   Compiling rand v0.3.23
   Compiling env_logger v0.3.5
   Compiling quickcheck v0.4.1 (/home/kent/.cpanm/work/1573572353.504/quickcheck-0.4.1)

With a little tweaking based on the data I got from cargo tree, here's a hand-crafted example of how this could look:

   thread-id v2.0.0 > Compiling libc v0.2.65
   thread-id v2.0.0 > Compiling winapi-build v0.1.1
   thread-id v2.0.0 > Compiling winapi v0.2.8
         log v0.3.9 > Compiling log v0.4.8
         log v0.4.8 > Compiling cfg-if v0.1.10
      regex v0.1.80 > Compiling utf8-ranges v0.1.3
      regex v0.1.80 > Compiling regex-syntax v0.3.9
   thread-id v2.0.0 > Compiling kernel32-sys v0.2.2
  env_logger v0.3.5 > Compiling log v0.3.9
aho-corasick v0.5.3 > Compiling memchr v0.1.11
thread_local v0.2.7 > Compiling thread-id v2.0.0
       rand v0.3.23 > Compiling rand v0.4.6
      regex v0.1.80 > Compiling thread_local v0.2.7
      regex v0.1.80 > Compiling aho-corasick v0.5.3
  env_logger v0.3.5 > Compiling regex v0.1.80
  quickcheck v0.4.1 > Compiling rand v0.3.23
  quickcheck v0.4.1 > Compiling env_logger v0.3.5
   Compiling quickcheck v0.4.1 (/home/kent/.cpanm/work/1573572353.504/quickcheck-0.4.1)

What if multiple crates depend on the same crate?

This was already asked, and I already provided a solution in the second half of this post (i.e. do what cargo-tree does)

Well, given cargo has to solve that problem in order to get rust building things anyway, it makes sense that if you're trying to produce something that adds more in-band context, you'd only have to prefix the "current target" with the name of the immediately preceding parent.

In the mocked up example I gave, there were several instances of duplicates in the graph:

quickcheck v0.4.1 (/home/kent/.cpanm/work/1573642952.5167/quickcheck-0.4.1)
├── env_logger v0.3.5
│   ├── log v0.3.9
│   │   └── log v0.4.8
│   │       └── cfg-if v0.1.10
│   └── regex v0.1.80
│       ├── aho-corasick v0.5.3
│       │   └── memchr v0.1.11
│       │       └── libc v0.2.65
│       ├── memchr v0.1.11 (*)
│       ├── regex-syntax v0.3.9
│       ├── thread_local v0.2.7
│       │   └── thread-id v2.0.0
│       │       ├── kernel32-sys v0.2.2
│       │       │   └── winapi v0.2.8
│       │       │   [build-dependencies]
│       │       │   └── winapi-build v0.1.1
│       │       └── libc v0.2.65 (*)
│       └── utf8-ranges v0.1.3
├── log v0.3.9 (*)
└── rand v0.3.23
    ├── libc v0.2.65 (*)
    └── rand v0.4.6
        └── libc v0.2.65 (*)

As you'll note, cargo already has to provide some kind of linearised output.

So I opted to use the parent that was nearest to being compiled itself.

So I parented libc under thread-id, but I think I should have parented it under memchr (given that target starts compiling before thread-id does).

And I parented memchr under aho-corasick, as its consumers were aho-corasick and regex. Since regex requires aho-corasick, memchr naturally has to complete before aho-corasick does (as you'll see in the vanilla cargo build output).
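The "nearest parent" rule described here can be written down fairly directly. This is only a sketch under the assumption that the build order and a reverse-dependency map are already known; `Dependents`, `nearest_parent` and `annotate` are names invented for the example.

```rust
use std::collections::HashMap;

// Hypothetical representation: crate -> crates that depend on it directly.
type Dependents = HashMap<&'static str, Vec<&'static str>>;

/// Among the crates that depend on `krate`, pick the one that starts
/// compiling earliest according to `build_order`.
fn nearest_parent(
    build_order: &[&'static str],
    dependents: &Dependents,
    krate: &str,
) -> Option<&'static str> {
    dependents
        .get(krate)?
        .iter()
        .copied()
        // Pair each dependent with its position in the build order.
        .filter_map(|p| build_order.iter().position(|&c| c == p).map(|i| (i, p)))
        .min_by_key(|&(i, _)| i)
        .map(|(_, p)| p)
}

/// Produce "parent > Compiling crate" lines with a right-aligned prefix.
fn annotate(build_order: &[&'static str], dependents: &Dependents) -> Vec<String> {
    let parents: Vec<Option<&'static str>> = build_order
        .iter()
        .map(|&krate| nearest_parent(build_order, dependents, krate))
        .collect();
    // Right-align the prefix column to the longest parent name.
    let width = parents.iter().flatten().map(|p| p.len()).max().unwrap_or(0);
    build_order
        .iter()
        .zip(&parents)
        .map(|(krate, parent)| match parent {
            Some(p) => format!("{:>width$} > Compiling {}", p, krate, width = width),
            None => format!("   Compiling {}", krate),
        })
        .collect()
}
```

With the quickcheck build order from earlier in the thread, this rule picks memchr for libc (memchr starts before thread-id and both rand versions) and aho-corasick for memchr, which matches the correction described above.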

At this point I think the fully fledged cargo tree-alike model wouldn't ever get adopted, mostly because it grossly complicates output processing.

At this point, all I'm aiming for is to improve the standard output a little, as the current standard output is nearly useless for human consumption.

If comprehensive detailing of the "why" for each target is required, I could also be happy with this format:

   Compiling libc v0.2.65 [ rand v0.4.6, rand v0.3.23, memchr v0.1.11, thread-id v2.0.0 ]
   Compiling winapi-build v0.1.1 [ kernel32-sys v0.2.2 (build) ]
   Compiling winapi v0.2.8 [ kernel32-sys v0.2.2 ]
   Compiling log v0.4.8 [ log v0.3.9 ]
   Compiling cfg-if v0.1.10 [ log v0.4.8 ]
   Compiling utf8-ranges v0.1.3 [ regex v0.1.80 ]
   Compiling regex-syntax v0.3.9 [ regex v0.1.80 ]
   Compiling kernel32-sys v0.2.2 [ thread-id v2.0.0 ]
   Compiling log v0.3.9 [ env_logger v0.3.5, quickcheck v0.4.1 ]
   Compiling memchr v0.1.11 [ aho-corasick v0.5.3, regex v0.1.80 ]
   Compiling thread-id v2.0.0 [ thread_local v0.2.7 ]
   Compiling rand v0.4.6 [ rand v0.3.23 ]
   Compiling thread_local v0.2.7 [ regex v0.1.80 ]
   Compiling aho-corasick v0.5.3 [ regex v0.1.80 ]
   Compiling regex v0.1.80 [ env_logger v0.3.5 ]
   Compiling rand v0.3.23 [ quickcheck v0.4.1 ]
   Compiling env_logger v0.3.5 [ quickcheck v0.4.1 ]
   Compiling quickcheck v0.4.1  (/home/kent/.cpanm/work/1573572353.504/quickcheck-0.4.1)
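
For what it's worth, the bracketed format above mostly comes down to inverting the dependency map. The sketch below assumes a plain name-to-dependencies map; `dependents_of` and `compiling_line` are hypothetical helpers, and the "(build)" marker for build-dependencies is left out to keep it short.

```rust
use std::collections::BTreeMap;

// Hypothetical representation: crate name -> names of its direct dependencies.
type Graph = BTreeMap<&'static str, Vec<&'static str>>;

/// Invert a "crate -> dependencies" map into "crate -> dependents".
fn dependents_of(graph: &Graph) -> Graph {
    let mut rev: Graph = BTreeMap::new();
    for (&parent, deps) in graph {
        for &dep in deps {
            rev.entry(dep).or_default().push(parent);
        }
    }
    rev
}

/// Format one status line, listing every direct dependent in brackets.
fn compiling_line(krate: &str, rev: &Graph) -> String {
    match rev.get(krate) {
        Some(parents) if !parents.is_empty() => {
            format!("   Compiling {} [ {} ]", krate, parents.join(", "))
        }
        // Nothing depends on this crate (e.g. the root): plain line.
        _ => format!("   Compiling {}", krate),
    }
}
```

A `BTreeMap` is used so the dependents come out in a stable, sorted order; that differs from the hand-written ordering in the mock-up, which is fine for a sketch.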

But I've intentionally shied away from reporting the full matrix of "all dependents requiring X" in the output, and from duplicating entries when dependencies appear in the graph twice, because both over-complicate the output and probably over-complicate the pre-processing cargo would need to glue it together.