Enable chunked uploads #150

isinyaaa · 2024-08-19T21:42:34Z

TL;DR: Rebases #137. Also fixes the out-of-date _state_ parameter on session_url, which caused a 404 when resuming or completing uploads.

We wanted to use oras for larger uploads (say, ML model files) at containers/omlmd, but I wasn't able to make them work with standard uploads. I noticed #137, rebased it, and addressed some of your comments (I'm not sure how to address all of them, though). But I still couldn't make it work with https://hub.docker.com/_/registry. So I started debugging and found that there's a _state_ parameter being passed around, which I assume must be updated on the https://distribution.github.io/distribution/spec/api/#completed-upload PUT request to reflect the last state reported by the server. I tested this with 20GB-ish files.

I wonder if this could help with bringing back chunked file support by default, although I'm not really sure in which aspect that kind of support is "flaky" or (as I experienced) poorly documented.
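For readers following along, here is a minimal sketch of the flow described above: each chunk PATCH returns a new upload location, and the final PUT has to use the most recent one. This is not the oras-py implementation. The `patch` and `put` callables are hypothetical stand-ins for the HTTP calls, and the `Content-Range` format (`<start>-<end>`, inclusive) follows my reading of the distribution API spec.

```python
from typing import Callable, Dict, Iterator, Tuple


def chunk_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets, one pair per chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def upload_in_chunks(
    data: bytes,
    session_url: str,
    chunk_size: int,
    digest: str,
    patch: Callable[[str, bytes, Dict[str, str]], str],  # hypothetical: returns the new Location
    put: Callable[[str], None],  # hypothetical: sends the closing PUT
) -> str:
    """Send data chunk by chunk, carrying the latest session URL forward."""
    for start, end in chunk_ranges(len(data), chunk_size):
        headers = {
            "Content-Range": f"{start}-{end}",
            "Content-Length": str(end - start + 1),
            "Content-Type": "application/octet-stream",
        }
        # Replace the session URL with whatever the server reported for this
        # chunk; reusing the original URL is what led to the 404 on completion.
        session_url = patch(session_url, data[start : end + 1], headers)

    # The closing PUT goes to the last reported location, plus the digest.
    separator = "&" if "?" in session_url else "?"
    final_url = f"{session_url}{separator}digest={digest}"
    put(final_url)
    return final_url
```

In a real client the digest would normally be passed as a query parameter through the HTTP library so that it is encoded properly; the string concatenation here only keeps the sketch short.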

vsoch

This looks good for testing - could you please add / re-enable a test for chunked? You'll also need to figure out how to sign the DCO, which is a requirement for CNCF projects.

oras/provider.py

Signed-off-by: Brian Cook <[email protected]> Signed-off-by: Isabella do Amaral <[email protected]>

vsoch · 2024-08-20T17:20:56Z

@isinyaaa please see my previous review comment - we need explicit tests for the chunked upload.

Signed-off-by: Isabella do Amaral <[email protected]>

isinyaaa · 2024-08-21T13:41:44Z

> @isinyaaa please see my previous review comment - we need explicit tests for the chunked upload.

@vsoch sorry I missed that. Updated now, wdyt?

vsoch · 2024-08-21T21:57:38Z

Nice! Let's run these tests now.

isinyaaa · 2024-08-22T12:11:47Z

Oops, I apologize for the dumb mistake, updated now @vsoch .

isinyaaa · 2024-08-26T15:59:52Z

@vsoch can you reapprove tests? I ran the linter locally to make sure it's working, sorry for the trouble

isinyaaa · 2024-08-27T19:30:54Z

@vsoch not sure what went wrong with those tests... maybe some issue with the generated file size? I can't spot any problems in the raw logs, nor when running locally.

vsoch · 2024-08-28T00:10:22Z

Here is what I see:


/bin/bash scripts/test.sh
ORAS_PORT: 5000
ORAS_HOST: localhost
ORAS_REGISTRY: localhost:5000
ORAS_AUTH: 
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/runner/work/oras-py/oras-py
configfile: pyproject.toml
collected 22 items
oras/tests/test_oci.py .
oras/tests/test_oras.py .sSuccessfully pushed localhost:5000/dinosaur/artifact:v1
Successfully pushed localhost:5000/dinosaur/artifact:v1
.Successfully pushed localhost:5000/dinosaur/artifact:v1
.Successfully pushed localhost:5000/dinosaur/artifact:v1
..Successfully pushed localhost:5000/dinosaur/directory:v1
.s
oras/tests/test_provider.py Successfully pushed localhost:5000/dinosaur/artifact:v1
Successfully pushed localhost:5000/dinosaur/artifact:v1
0+0 records in
0+0 records out
0 bytes copied, 3.8763e-05 s, 0.0 kB/s

I would try to reproduce locally.

Signed-off-by: Isabella do Amaral <[email protected]>

isinyaaa · 2024-08-30T13:02:28Z

@vsoch as expected, while testing on my fork I found that the problem comes from working with those very large files on GHA (see the workflow run on my modified main). It worked when I reduced the test file size to a couple of times the default chunk size, wdyt?

vsoch · 2024-08-30T15:59:18Z

Should the chunk size perhaps be smaller then?

isinyaaa · 2024-08-30T18:06:51Z

I don't really think that's a problem, but it's up to you. The issue was creating a 15GB test file in GitHub Actions; I think the worker didn't have that much space to spare. I reduced the test file to 4x the default chunk size (4 x 16MB = 64MB), that's all.
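As an illustration of the sizing above, a small sketch that creates a test file of a given number of chunks and computes how many chunks an upload of that size would take. The 16MB default comes from this thread; the constant and function names are hypothetical, and `truncate` is swapped in here as a cheap way to get a file of the right size rather than whatever the test suite actually uses.

```python
import math
import os
import tempfile

# 16MB default chunk size, as mentioned in the thread (the name is hypothetical).
DEFAULT_CHUNK_SIZE = 16 * 1024 * 1024


def make_test_file(path: str, chunks: int = 4, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Create a zero-filled file of chunks * chunk_size bytes and return its size."""
    size = chunks * chunk_size
    with open(path, "wb") as fd:
        # Extending via truncate avoids writing every byte explicitly.
        fd.truncate(size)
    return size


def expected_chunks(size: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Number of chunk requests needed for a blob of the given size."""
    return math.ceil(size / chunk_size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "blob.bin")
        size = make_test_file(path)
        print(size, expected_chunks(size))
```

A file that is an exact multiple of the chunk size exercises only full chunks; adding a few extra bytes would also cover a short final chunk.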

vsoch · 2024-08-30T18:53:59Z

Ah ok. Please keep the tests in GitHub the same as what you are doing (and what is working) locally, and let's try making more space on the builder. The first three lines before the tests, which clean up and add space, should be sufficient:

https://github.com/converged-computing/fluxnetes/blob/0d577aa3155e68aff457f390f8c926e9f57a13d3/.github/workflows/e2e-test.yaml#L129-L133

But you can add more as needed.

Signed-off-by: Isabella do Amaral <[email protected]>

isinyaaa · 2024-09-02T15:03:56Z

@vsoch updated! thanks for the pointer, I had no idea the default images could be so huge! Though the first three lines didn't suffice as we actually need 15GB for the file + 15GB for the upload.
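The space requirement above (one copy for the generated file, one for the uploaded blob) could be checked up front with something like the sketch below. It is an assumption-laden illustration, not part of the PR: the function name and the `copies` parameter are hypothetical, and only `shutil.disk_usage` is a real standard-library call.

```python
import shutil

GB = 1024 ** 3


def has_room_for_upload(path: str, file_size: int, copies: int = 2, usage=shutil.disk_usage) -> bool:
    """Return True if the filesystem holding path has room for `copies` copies of the file.

    copies=2 mirrors the thread: the test file itself plus the blob the
    local registry stores once it has been uploaded.
    """
    free = usage(path).free
    return free >= file_size * copies


if __name__ == "__main__":
    # For a 15GB test file this asks whether roughly 30GB are free.
    print(has_room_for_upload(".", 15 * GB))
```

The `usage` argument exists only so the check can be exercised without depending on the real disk.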

isinyaaa · 2024-09-04T15:12:45Z

Hey, @vsoch, can we merge this yet?

vsoch · 2024-09-04T17:44:55Z

Yes - we are close! Can you please bump the version in oras/version.py and make a note about the change in CHANGELOG.md? That should be the final bit we need.

Signed-off-by: Isabella do Amaral <[email protected]>

isinyaaa · 2024-09-04T18:20:56Z

updated, wdyt?

isinyaaa requested review from vsoch and SteveLasker as code owners August 19, 2024 21:42

vsoch reviewed Aug 19, 2024

oras/provider.py (resolved)

isinyaaa force-pushed the tuneable-chunk-sizing branch 2 times, most recently from adb4c55 to 123e39d Compare August 20, 2024 15:17

add arg to enable blob chunking and allow custom chunk sizes

2eb94e6

Signed-off-by: Brian Cook <[email protected]> Signed-off-by: Isabella do Amaral <[email protected]>

isinyaaa force-pushed the tuneable-chunk-sizing branch from 123e39d to 9ab5537 Compare August 20, 2024 16:22

isinyaaa requested a review from vsoch August 20, 2024 16:24

test.sh: expose ORAS_PORT

dfdab0f

Signed-off-by: Isabella do Amaral <[email protected]>

isinyaaa force-pushed the tuneable-chunk-sizing branch from 9ab5537 to 804ccb1 Compare August 21, 2024 13:40

isinyaaa force-pushed the tuneable-chunk-sizing branch from 804ccb1 to bb59445 Compare August 22, 2024 12:11

isinyaaa force-pushed the tuneable-chunk-sizing branch from bb59445 to 038a509 Compare August 26, 2024 14:08

isinyaaa force-pushed the tuneable-chunk-sizing branch from 038a509 to ce6e216 Compare August 30, 2024 12:53

provider: update location on every chunk PATCH

0956242

Signed-off-by: Isabella do Amaral <[email protected]>

isinyaaa force-pushed the tuneable-chunk-sizing branch from ce6e216 to 0956242 Compare August 30, 2024 13:01

GHA: make space for large files

371072c

Signed-off-by: Isabella do Amaral <[email protected]>

bump version

599b4ac

Signed-off-by: Isabella do Amaral <[email protected]>

vsoch approved these changes Sep 4, 2024

vsoch merged commit dfc2415 into oras-project:main Sep 4, 2024
5 checks passed

isinyaaa deleted the tuneable-chunk-sizing branch September 4, 2024 20:11

isinyaaa mentioned this pull request Sep 4, 2024

enable large file upload containers/omlmd#13

Merged
