Outboard toolchains #641
Comments
@cormacrelf Thanks for this extensive issue ❤️
In the LRE setup this is a desirable behavior. We're quite interested in effective scaling to zero, which means that we want to reduce cold-start times for worker deployments as much as possible. Embedding the toolchain into the container rather than fetching it on-demand makes a big difference in wall-clock time, as it's easier to control container pulls/pushes than fetches from external repos (e.g. rust nightly takes wayyy too long to fetch).
We're planning an important twist to the "standard" RBE container setup though: the NativeLink scheduler can dispatch execution jobs based on platform properties. This means that with a single remote worker endpoint we can make an arbitrary number of different toolchains and targets available, and we can create configurations where those different workers all still operate against the same cache.
Personally, I think that slow updates/modifications to toolchain containers are by far the biggest hindrance with existing RBE setups. Also, the fact that 99% of RBE toolchains use generic paths like
We plan to expand the LRE setup into language-specific containers. For instance, one container for Clang/C++, one container for CUDA (which would effectively be a superset of a clang container), a container for Python, one for Java, etc.
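The platform-property dispatch mentioned above can be pictured with a small sketch. This is not NativeLink's actual scheduler code: the `Worker` struct, `pick_worker`, and `props` are hypothetical names, and the exact-match rule is an assumption made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical worker description: a name plus the platform properties it advertises.
struct Worker {
    name: String,
    properties: HashMap<String, String>,
}

/// Returns the first worker whose properties satisfy every property the action requires.
/// Exact string equality is an assumption; a real scheduler may support richer matching.
fn pick_worker<'a>(
    workers: &'a [Worker],
    required: &HashMap<String, String>,
) -> Option<&'a Worker> {
    workers.iter().find(|w| {
        required
            .iter()
            .all(|(key, value)| w.properties.get(key) == Some(value))
    })
}

/// Small helper to build a property map from string pairs.
fn props(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

With one worker advertising `lang=cpp` and another advertising `lang=cpp, gpu=yes`, an action that requires `gpu=yes` would land on the second worker, while both could still read and write the same cache.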
This sounds very similar to what we intend to do. As of now, I think the missing piece is a more flexible (or I guess a more "specialized") variant of the
IMO instead of fetching toolchains with bazel, users could use nix as the default environment and have Bazel "inherit" from that environment. This approach also has the benefit that it's much easier to create containers ( I'm not familiar enough with Buck2 toolchains yet to tell whether the same is true for Buck2. We do intend to support Buck2 as first-class citizen at some point though, so this is certainly an area where we're super interested in any information that might be relevant.
I haven't played around with that idea. Initially I'm slightly sceptical whether this would interop nicely with e.g. container runtimes that mount certain paths, like the nvidia-container-toolkit CDI which mounts certain locations that could be unresolvable if it mounts symlinks. I haven't investigated this approach enough yet to be able to make a good assessment here.
As an additional datapoint, mirroring the nix-shell and the remote toolchains allows using the LRE toolchains without an actual remote. For instance a nix flake combined with, with a trusted remote cache setup allows sharing each other's build artifacts between local builds on different (but architecturally similar) machines. No remote execution involved at all. This was super useful for a smaller team without the bandwidth to manage a larger scale cloud infra. For instance, perfect cache sharing between Gentoo, WSL2, NixOS, Ubuntu and nix-based containers is doable with this setup.
Thanks for your detailed response!
Too right. The equivalent of this task literally takes a couple of hours for our current setup, which is just a CI pipeline. For that entire time, my poor computer uses 100% CPU and/or thrashes the disk so hard I can't do anything else. I automated large swaths of it and I've still been putting another run off for two weeks.
Worker cold-start times
I figured this might be addressed by the outboard toolchains being cached by Native Link itself, not every worker independently. Both of my ideas involved exploiting the cache-busting properties of the Nix store, where the name of the store path incorporates the content. On a worker cold start, the symlink one would look like (in this file):
// in `fn download_to_directory`
// ambient
let mut outboard_cache = HashMap::new();

#[cfg(target_family = "unix")]
for symlink_node in directory.symlinks {
    // eg "/nix/store/hcil3fgcjav0y458ff4m98zgcqky00gk-coreutils-9.3"
    if symlink_node.target.starts_with("/nix/store") {
        get_outboard(&symlink_node.target);
    }
}

fn get_outboard(target: &str) {
    let fake_outboard_action = Action::new().command(
        // colons because that's an invalid path in any league
        ["::nativelink::outboard", target]
    );
    let digest = fake_outboard_action.digest();
    // Happy path = no action required, worker already has the action result in the hashmap
    let action_result = outboard_cache.entry(digest).or_insert_with(|| {
        if let Some(result) = action_cache.GetActionResult(digest) {
            // most worker cold starts = just fetch from the cache
            result
        } else {
            // this branch only taken when you add a dependency, basically
            let resolved = run_configured_resolver(target, fake_outboard_action);
            action_cache.UpdateActionResult(resolved.clone());
            resolved
        }
    });
    // use action_result.output_files & output_directories, but since they will be relative paths,
    // treat them relative to some root that we will bind-mount to /nix/store (or wherever else
    // the user wants to mount it).
    for dir in action_result.output_directories { // and paths, i guess
        download_to_directory(..., dir, "/tmp/some-dir-to-bind-mount-to-nix-store");
    }
}
As you can see, it would piggy-back on almost everything, including the fact that individual components of toolchains that are rarely used (or have stopped being used) get evicted from caches like anything else. And since workers can download only the parts of the toolchains they actually need, worker cold starts have a much, much faster lower bound than downloading a big blob of Docker container with the kitchen sink. Since toolchains often comprise radically different sized files to the rest of your action outputs, like a few hundred MB of LLVM stuff or "we use a 46GB VDSO file and launch it as a test runner", I would think a separate action cache / store configuration might be necessary. Other than that it would fit right in.
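For readers who want something that compiles, the lookup order in the pseudocode above (worker-local memo, then shared cache, then resolver) can be sketched in a self-contained form. Everything here is a stand-in: `SharedCache`, `WorkerState`, and the string cache key are hypothetical simplifications, and the real thing would go through the RBE action cache and a digest of an Action proto.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a shared action cache keyed by a digest string.
#[derive(Default)]
struct SharedCache {
    entries: HashMap<String, Vec<String>>,
}

/// Hypothetical per-worker state: a local memo in front of the shared cache.
#[derive(Default)]
struct WorkerState {
    local: HashMap<String, Vec<String>>,
}

/// Builds the cache key for the fake "outboard" action.
/// The marker string mirrors the pseudocode; hashing an Action proto is skipped here.
fn outboard_key(target: &str) -> String {
    format!("::nativelink::outboard {}", target)
}

/// Returns the output paths for `target`, consulting the worker-local memo
/// first, then the shared cache, and only running `resolver` on a full miss.
fn get_outboard<F>(
    worker: &mut WorkerState,
    shared: &mut SharedCache,
    target: &str,
    resolver: F,
) -> Vec<String>
where
    F: FnOnce(&str) -> Vec<String>,
{
    let key = outboard_key(target);
    if let Some(hit) = worker.local.get(&key) {
        return hit.clone();
    }
    let cached = shared.entries.get(&key).cloned();
    let outputs = match cached {
        // Most worker cold starts: the shared cache already has the result.
        Some(hit) => hit,
        // Only reached when a new store path shows up for the first time.
        None => {
            let resolved = resolver(target);
            shared.entries.insert(key.clone(), resolved.clone());
            resolved
        }
    };
    worker.local.insert(key, outputs.clone());
    outputs
}
```

In this sketch, two workers asking for the same store path trigger the resolver once; the second worker is served from the shared cache.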
This strengthens my belief that we really need to make sure that "ease of bumping dependencies" should be a major focus for whatever we end up implementing. Ultimately, it might even make sense to empirically test different approaches (docker vs lazy loading etc).
FYI We briefly considered a However, something that is essentially a nix API wrapper could also be considered a potential feature drift. Personally, I'm quite open to the idea - I literally tried to merge the codebases for attic and NativeLink at some point lol. @MarcusSorealheis @adam-singer @allada What are your opinions on this? I think a
I believe @allada might have some valuable insights on this. WDYT to what degree should toolchains be in container-land vs Starlark-land? My current line of thought is:
@cormacrelf Regarding the code example, I feel like I'm not grasping things in their entirety. Could you elaborate what the benefit of this is as opposed to using a
Yes. A framework that easily allows creating custom codegen pipelines is what I had in mind. PoC-wise just a translation from nix derivations to RBE toolchain generators. FYI I've also played with the idea to draw inspiration from LLVM/MLIR similar to how Mitosis does it to create some sort of "NativeLink IR" for toolchains. @allada and I talked about this concept a bunch of times but the scope is daunting. Imagine a framework that allows creating RBE-compatible toolchains from some external source via some sort of pluggable to-proto and from-proto generators:
CDI mounts specific executables. Not the entire
@cormacrelf can you email me please? I do not intend to dox you or sell you.
✔️ although "one container per build" => maybe you mean "one container image per repo", which also describes the default for most projects, to my mind.
It is meant to support a rules_nixpkgs approach in starlark land, which does not work on its own. It's not an alternative to rules_nixpkgs, rather it is an alternative to pre-loading nix dependencies in a container image. I think I know what you mean -- my pseudocode looks like it could be done in starlark! Just have a build rule fetch a nix package... right? But it cannot work on a remote, because of all of the above about absolute paths, but again for clarity:
You can make rules_nixpkgs work on a remote by pre-building a container image that has every conceivable nix path already there, so that when you upload a symlink (made by a local-only
What does the rules_nixpkgs approach give you in the output directory + on the remote?
For example, zlib. When you run the local-only rules_nixpkgs rules, in your output folder it dumps a bunch of symlinks, roughly like this, which is just the symlink output of
Obviously this goes beyond toolchain rules, there is no such thing as a "zlib toolchain". But rules_nixpkgs does exactly the same thing when you ask it to get clang, and then depend on that from your toolchain rules. In which case, you will have:
And your toolchain will make executables that look like
Currently, that execute step fails when your container image does not have Nix clang and the symlink deref fails. Or if you just bumped clang to 16.0.0, and the container image you had built is still on 15.0.1.
My proposal is that Native Link just runs a script that fetches the symlink target from a nix remote and caches its results, so your container images do not need to contain any Nix stuff in them, not even an installation of Nix. (You would need to install Nix on the workers themselves.)
If it helps, this is an outdated and fairly dodgy version of my Nix Clang toolchain rule for Buck2, which basically does what Nix's clang wrapper package does, but using clang-unwrapped, and relying entirely on buck's cxx rules instead of NIX_ environment variables. https://gist.github.com/cormacrelf/c28f400b87eb5a285435f94459fae48a
How specific to nix does the feature have to be?
To some extent, the resolver + cache approach would have to be designed to support nix, as each of those symlinks will usually depend on a bunch of other nix paths that aren't named in the one symlink. Most nix packages depend in some way on glibc, for instance, and you don't want 400 copies of glibc in the cache, or indeed 12 copies of the entirety of LLVM under other downstream deps' cache keys. Maybe you can always execute the resolver, which prints a list of paths (the transitive deps) to possibly fetch from cache. You can cache that step as well. But I don't think you have to take it much further than that. It is plausible that building it in a generic way will allow people to come up with other creative ideas, like having the resolver directly mount some readonly network drive if the build tries to reference it, and emit no paths for NL to fetch.
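The "resolver prints the transitive deps, cache each path once" idea can be sketched roughly as below. This assumes the resolver's output has already been parsed into a map from each path to its direct references; `closure`, `missing`, and that map shape are hypothetical, not anything Nix or NativeLink defines.

```rust
use std::collections::{BTreeSet, HashMap};

/// Collects `root` plus everything it transitively references, each path once.
/// `references` is a hypothetical map from a store path to its direct
/// references, e.g. parsed from what a resolver script prints.
fn closure(root: &str, references: &HashMap<String, Vec<String>>) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root.to_string()];
    while let Some(path) = stack.pop() {
        if !seen.insert(path.clone()) {
            continue; // already visited, so shared deps like glibc are listed once
        }
        if let Some(deps) = references.get(&path) {
            stack.extend(deps.iter().cloned());
        }
    }
    seen
}

/// Returns the paths from `wanted` that are not yet in `cached`, i.e. the only
/// ones a worker would still need to fetch.
fn missing(wanted: &BTreeSet<String>, cached: &BTreeSet<String>) -> Vec<String> {
    wanted.difference(cached).cloned().collect()
}
```

Because each store path is its own cache entry, a dependency like glibc that many packages share is stored and fetched once rather than once per downstream package.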
The problem
rules_nixpkgs exploit this to great effect.
(4) makes things very impractical for Nix-based toolchains, and the only viable solution right now is not to build toolchains on-demand at all. Instead, you must describe every nix dependency you need in advance, package it all up into a Docker image (example), and re-upload the image every time you want to add a tool. When you have a few GB of dependencies, my experience with a similar pattern is that while you can do it, this slows down delivering tooling improvements to a snail's pace. (It is always feasible if you have a "developer platforms team" or similar at a big company. But the Nix project is your "developer platforms team" at a small company.)
Some links:
rules_ll, in a long, long thread of other people with this exact problem: Support remote execution with rules_nixpkgs tweag/rules_nixpkgs#180 (comment)
Why should Native Link help?
I believe this is a pretty significant barrier to adopting remote builds. Anyone can use bazel and buck without RBE, it's child's play. But getting a remote-compatible toolchain together is pretty hard, and takes a lot of effort. At my employer, this has almost singlehandedly blocked the adoption of Buck -- the benefits of RBE would take too much time investment to obtain, so the whole thing is not worth it. Anything you can offer to make this work would be a big differentiator.
How can Native Link help?
You're enterprising folks, and you seem to be using Nix yourselves. Do you have any ideas to make on-demand Nix toolchains work? I have two:
1. Absolute output_paths
Is there perhaps a way to bring /nix/store back into the cache? I can imagine a possible extension to the RBE protocol, to support absolute paths in action cache output paths, moderated by a whitelist of /nix/store for example. Tricky because all Nix actions would write into the same directory. If every Nix "store path" were a separate action in the bazel/buck build graph, and you had to provide the actual path in e.g. nix_package(name = ..., expr = "pkgs.coreutils", path = "/nix/store/hcil3fgcjav0y458ff4m98zgcqky00gk-coreutils-9.3"), then this would be doable, especially if you could autogenerate rules like this.
2. Resolver for symlinks to things in /nix/store
Alternatively, there could be special treatment of symlinks to things in /nix/store or any other configurable whitelisted path. That would not require modifying RBE clients for support, because they can all send symlinks already. It wouldn't even require changing rules_nixpkgs, because it already uses symlinks to /nix/store, placed in the bazel output. (I have still not released my version of rules_nixpkgs for buck, but it too works this way.)
The idea involves a symlink resolver, which would attempt to pre-resolve symlinks of some given pattern by just hitting a Nix remote cache. The resolver would run before actions that depend on that symlink execute. This could be configured like so:
{ "/nix/store": ["./nix_resolver.sh"] }
nix-store --realize /nix/store/hcil3fgcjav0y458ff4m98zgcqky00gk-coreutils-9.3
for each command line argument. Obviously you can configure your own nix remote cache to hit, etc, in the shell script. The resolver can output a list of paths to be hardlinked alongside the rest of the action inputs. (Hardlink? Not sure. Although if Nix is going to be running on the worker nodes in order to resolve these things, might want to let Nix manage the /nix directory itself, copy resolved paths to the NL cache, and then expose those cached paths through some other means, like bind mounts. I'm thinking this would be handled by an environment variable provided to the execution script, and so people can choose to use docker volume mounts to make the paths available.)
sh in it, when sh could be stored in the regular cache hierarchy and be treated like any other file.
This would play really nicely with local builds, because your local machine's rules_nixpkgs or equivalent will just be realizing nix paths from a nix cache, symlinking those paths into the bazel/buck output directory, and adding those symlinks as Nix GC roots so that nix store gc doesn't delete them. This will end up having the exact same effect when it comes to remote execution, just that the symlinks will be made to work a different way.
Footnotes
Say you've got a repo with a frontend and a backend. A backend developer might never trigger the toolchain rules that download Node.js in order to build a JS bundle for the frontend.
Installed nix packages live in /nix/store, and contain many absolute path references to other things in /nix/store. This is just a slightly more upfront incarnation of a problem that exists elsewhere; Nix just fails very very fast when it's in the wrong place. All dynamically linked libraries are referenced via absolute paths in ELF files, for example.