
Caching behavior with --ci for large libraries #10403

Open
MattTheCuber opened this issue Jan 8, 2025 · 4 comments
Labels
cache Caching of packages and metadata performance Potential performance improvement

Comments

@MattTheCuber

According to the docs, uv cache prune --ci "removes all pre-built wheels and unzipped source distributions from the cache".

Our project uses a few large libraries that take a long time to download from PyPI. For example, torch and its dependencies (several nvidia packages) make up 2 GB of files that need to be downloaded (~5 minutes to download and install with uv). Would it be possible to make the --ci flag not clear very large downloaded packages? Alternatively, could an argument be added to uv cache prune that skips packages over a given size?
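uv has no size-based prune flag today, so a size cutoff would have to be approximated by hand. A rough sketch of the idea, assuming cached wheel archives are laid out as `*.whl` files under the cache directory (both the layout and the 50 MB threshold are illustrative assumptions, not uv's documented behavior):

```shell
# Hypothetical size-aware prune: delete cached wheel archives smaller
# than 50 MB so the large ones (torch, nvidia-*) survive between CI runs.
# Assumes wheels sit as *.whl files under .uv_cache, which is an
# illustrative assumption about the cache layout.
find .uv_cache -name '*.whl' -size -50M -delete
```

This keeps the expensive-to-download artifacts cached while still shedding the bulk of small files that `uv cache prune --ci` would otherwise also keep.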

@MattTheCuber
Author

Example:

```
$ uv venv
Using CPython 3.9.21 interpreter at: /usr/bin/python
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
$ source .venv/bin/activate
$ uv pip install torch --cache-dir .uv_cache
Resolved 22 packages in 528ms
Prepared 22 packages in 4m 58s
Installed 22 packages in 196ms
...
$ uv cache prune --ci --cache-dir .uv_cache
Pruning cache at: .uv_cache
Removed 13871 files (4.9GiB)
$ deactivate
$ rm -rf .venv
$ uv venv
Using CPython 3.9.21 interpreter at: /usr/bin/python
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
$ source .venv/bin/activate
$ uv pip install torch --cache-dir .uv_cache
Resolved 22 packages in 487ms
Prepared 22 packages in 5m 02s
Installed 22 packages in 202ms
...
```

@Gankra Gankra added performance Potential performance improvement cache Caching of packages and metadata labels Jan 8, 2025
@Gankra
Contributor

Gankra commented Jan 8, 2025

If we do anything here, the CLI flag for a size limit is probably a good idea either way, as an override for any behaviour we pick.

Out of curiosity, did you measure how long it takes to upload and download everything from your cache (e.g. if you don't use prune --ci)? That information is the crux of whether this change would actually improve performance.

@MattTheCuber
Author

Yes, I am currently using that method. It cuts off several minutes, although I didn't measure the exact time difference.

@MattTheCuber
Author

My current fastest solution is to build a custom image that simply pip installs torch, so that uv doesn't have to download or install it.
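A minimal sketch of that workaround, assuming a Debian-based Python base image (the tag is illustrative, and whether you want the CPU-only or CUDA build of torch depends on the CI runners):

```dockerfile
# Bake torch into the CI image so uv never has to fetch the ~2 GB of
# torch + nvidia wheels at job time. Image tag is an assumption.
FROM python:3.9-slim
RUN pip install --no-cache-dir torch
```

CI jobs based on this image then only need to resolve and install the project's remaining, smaller dependencies.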
