uio copy to s3 during network issues #2

Open
cflaming opened this issue Sep 6, 2017 · 0 comments
cflaming commented Sep 6, 2017

When uploading very large files to S3, a temporary network issue causes the S3 retry code to throw the exception shown below. I doubt we want to handle retry problems like this in uio itself, but since the underlying library already retries in a network-efficient way, maybe we could take advantage of it?

We could wrap (from from-url) in a BufferedInputStream of a specified size and communicate that size via request.getRequestClientOptions().setReadLimit(int), so that the TransferManager can reset the stream far enough back to retry. This avoids forcing the user to re-upload the file from the beginning.
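To make the idea concrete, here is a minimal sketch of the rewind behaviour using only java.io. It does not call the AWS SDK; the class name, the `rewindable` helper and the byte counts are all made up for illustration, and an in-memory stream stands in for whatever (from from-url) returns. In the real change, the same `readLimit` value would presumably also be passed to request.getRequestClientOptions().setReadLimit(int).

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class RewindableStreamSketch {

    // Hypothetical helper: wrap the source so that up to readLimit bytes
    // stay buffered and can be replayed after a failed attempt.
    static BufferedInputStream rewindable(InputStream source, int readLimit) {
        return new BufferedInputStream(source, readLimit);
    }

    // Read exactly n bytes, one at a time, or fail if the stream ends early.
    static byte[] readBytes(InputStream in, int n) throws IOException {
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            int b = in.read();
            if (b < 0) {
                throw new IOException("unexpected end of stream");
            }
            out[i] = (byte) b;
        }
        return out;
    }

    // Simulate one failed attempt followed by a retry of the same part:
    // mark, read the part, reset, read it again, and compare both reads.
    // Returns false if the stream could not be rewound.
    static boolean retryReplaysSameBytes(int totalBytes, int partSize, int readLimit) {
        byte[] data = new byte[totalBytes];
        for (int i = 0; i < totalBytes; i++) {
            data[i] = (byte) i;
        }
        try (BufferedInputStream in = rewindable(new ByteArrayInputStream(data), readLimit)) {
            in.mark(readLimit);                       // remember the start of the part
            byte[] first = readBytes(in, partSize);   // first attempt, assumed to fail mid-flight
            in.reset();                               // rewind, as a retry would
            byte[] second = readBytes(in, partSize);  // second attempt sees the same bytes
            return Arrays.equals(first, second);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(retryReplaysSameBytes(64, 16, 16));
    }
}
```

The trade-off is memory: the buffer has to hold as many bytes as a retry may need to replay, so the read limit would likely need to be at least the part size.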

There is an AWS ticket that seems to recap the problem, aws/aws-sdk-java#427, but the advice there is essentially "don't give TransferManager an InputStream, give it a File". That isn't helpful for uio copy, but I think the ticket does mention the solution I'm proposing; if not, the stack trace suggests it.

The only real difficulty I see is validating the fix.
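One part of this can be checked locally without provoking a real network fault: whether a stream can still be reset after a given number of bytes has been consumed. The sketch below uses plain java.io only, with hypothetical names and sizes. It exercises just the stream rewind, not the SDK's retry path, so it is a partial check at best.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

class ResetLimitCheck {

    // Marks a buffered stream with the given read limit, consumes `consumed`
    // bytes, then tries to reset. Returns null when the reset succeeds,
    // otherwise the message of the IOException that was thrown.
    static String resetFailure(int consumed, int readLimit) {
        byte[] data = new byte[64]; // arbitrary in-memory payload
        try (BufferedInputStream in =
                 new BufferedInputStream(new ByteArrayInputStream(data), readLimit)) {
            in.mark(readLimit);
            for (int i = 0; i < consumed; i++) {
                if (in.read() < 0) {
                    return "unexpected end of stream";
                }
            }
            in.reset(); // throws once the mark has been invalidated
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("within limit: " + resetFailure(8, 8));
        System.out.println("past limit:   " + resetFailure(9, 8));
    }
}
```

Reading past the limit produces the same "Resetting to invalid mark" message that appears as the root cause in the stack trace below.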

```
com.amazonaws.ResetException: Failed to reset the request input stream;  If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.resetRequestInputStream(AmazonHttpClient.java:1305)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1129)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:747)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:721)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:704)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:672)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:654)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:518)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4185)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4132)
	at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3172)
	at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3157)
	at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadPartsInSeries(UploadCallable.java:257)
	at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInParts(UploadCallable.java:191)
	at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Resetting to invalid mark
	at java.io.BufferedInputStream.reset(BufferedInputStream.java:448)
	at com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106)
	at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:102)
	at com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168)
	at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:102)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.resetRequestInputStream(AmazonHttpClient.java:1303)
	... 21 more
```