Comments (7)
We are looking into this.
After initial investigation it appears there may be an issue in BufferedInputStream, which attempts to create a negative-sized array while reading. The reason the app appears to "hang" is that the exception is thrown on the main thread of your app, which subsequently dies and is no longer writing data to the blob input stream. I have a consistent repro using the code you supplied. If you remove BufferedInputStream and read directly from the FileInputStream, this works correctly. Also note, this stream is only read on the main thread.
Exception in thread "main" java.lang.NegativeArraySizeException
at java.io.BufferedInputStream.fill(BufferedInputStream.java:205)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at com.microsoft.windowsazure.services.core.storage.utils.Utility.writeToOutputStream(Utility.java:971)
at com.microsoft.windowsazure.services.blob.client.BlobOutputStream.write(BlobOutputStream.java:546)
at com.microsoft.windowsazure.services.blob.client.CloudBlockBlob.upload(CloudBlockBlob.java:447)
at Blobs.main(Blobs.java:78)
I will continue investigating this issue, but for now I would recommend not using the BufferedInputStream.
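The workaround suggested above can be sketched as follows: read from the FileInputStream directly into an explicit byte buffer instead of wrapping it in BufferedInputStream. This is a minimal self-contained illustration (the class and temp-file setup are mine, not from the thread), not the SDK's actual upload path.

```java
import java.io.*;
import java.nio.file.*;

public class DirectCopy {
    // Copy from a plain InputStream to an OutputStream without a
    // BufferedInputStream wrapper, using an explicit byte buffer.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a 1 MB test file so the example is self-contained.
        Path src = Files.createTempFile("blob-src", ".bin");
        byte[] data = new byte[1 << 20];
        new java.util.Random(42).nextBytes(data);
        Files.write(src, data);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new FileInputStream(src.toFile())) {
            copy(in, out);
        }
        // Verify the copy is byte-for-byte identical.
        System.out.println(java.util.Arrays.equals(data, out.toByteArray()));
    }
}
```

Because no mark/reset buffering is involved, the negative-array-size path in BufferedInputStream.fill is never reached.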
from azure-sdk-for-java.
Actually, the main issue for me was not the buffering. It was the undocumented (OK, it's documented, but only in the C# SDK) 90-second timeout. Combined with an auto-retry of 3, this was problematic with a large file.
We were able to repro this without using the library at all, simply by opening a file and copying it to another local file. It seems that in some versions of the JRE, when mark is used on a BufferedInputStream, it can calculate the array size incorrectly.
The timeout is applied in two places: the HttpURLConnection's readTimeout and as a URL parameter to the service. The service uses the timeout value (converted to seconds) on the server side. To be clear, this does not mean that the entire blob has to be uploaded/downloaded within the given timeout, only that either the server could not process the request in the given amount of time OR the client could not read data from the server in the given amount of time.
A retry executes the same operation again, depending on the result of the previous attempt. This means that any subsequent retry gets its own timeout and is not affected by prior attempts.
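The array-size miscalculation described above comes down to int overflow: BufferedInputStream.fill grows its internal buffer by doubling while a mark with a large readlimit is active, and once the buffer reaches 2^30 bytes (1 GiB), the doubling overflows. This small sketch shows only the arithmetic, not the JDK internals:

```java
public class MarkOverflowDemo {
    public static void main(String[] args) {
        // BufferedInputStream.fill() grows its buffer roughly by doubling
        // while a mark with a huge readlimit is in effect. At 2^30 bytes
        // (the 1 GB threshold mentioned later in this thread), doubling
        // overflows int, so the requested array size goes negative and
        // "new byte[nsz]" throws NegativeArraySizeException.
        int bufferSize = 1 << 30;      // 1073741824
        int doubled = bufferSize * 2;  // int overflow
        System.out.println(doubled);
        System.out.println(doubled < 0);
    }
}
```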
Let me try to be clear. The BufferedInputStream, while interesting and hopefully helpful, was NOT what I was always calling with. More often I was calling with either a ByteArrayInputStream (for items < 50 MB) or a MarkableFileInputStream (my own implementation of FileInputStream using RandomAccessFile to allow marking). So the example I showed was just ONE INSTANCE of the initial failure.
I was running my tests on a wireless connection and a slow ADSL one (256k upstream). I started with a big file (150 MB) and found it never seemed to finish. I then narrowed the issue down to files exceeding about 3500 KB. Finally I resolved the issue, kludgily, by realizing it was the 90-second timeout and setting it higher.
This issue was even worse with the 150 MB file, because I had set concurrent requests to 10. So 10 different threads were doing 4 MB chunks, and each was failing and retrying, which is why it seemed to go on forever (e.g. 1.5 hours, when 1 hour should have been sufficient for a "normal" upload).
So for me, it was purely timeout. Once I raised this to 30 minutes (yes, I know, but I had to force it all the way up to 64 MB before the 4 MB partitioning occurred), it worked with files of 4, 10, 60, 80, 100, and 150 MB.
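The arithmetic behind this is worth making explicit. Assuming "256k upstream" means 256,000 bit/s, a single 4 MiB block takes well over the 90-second default timeout to upload, so every chunk times out and retries indefinitely:

```java
public class ChunkTimeoutMath {
    public static void main(String[] args) {
        // One 4 MiB upload block, expressed in bits.
        long chunkBits = 4L * 1024 * 1024 * 8;
        // Assumed ADSL upstream rate: 256 kbit/s = 256,000 bit/s.
        long upstreamBps = 256_000;
        double seconds = (double) chunkBits / upstreamBps;
        System.out.printf("upload time per 4 MiB chunk: %.0f s%n", seconds);
        // With a 90 s timeout, every chunk fails and is retried.
        System.out.println("exceeds 90 s timeout: " + (seconds > 90));
    }
}
```

At roughly 131 seconds per chunk against a 90-second timeout, no chunk can ever complete, which matches the "never seemed to finish" behavior described above.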
So the BufferedInputStream issue: I'm not denying it may exist, and I certainly appreciate you looking into it, but for me, I had done tests with and without it and was having the same wacky issues. In fact my initial issue was with my MarkableFileInputStream. I then made it a ByteArrayInputStream, and simply for CLARITY of showing the repro, provided you with code that showed the FileInputStream and then the BufferedInputStream.
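The commenter's MarkableFileInputStream is not shown in the thread, but the idea can be sketched: back the stream with a RandomAccessFile so that mark() records the file offset and reset() seeks back to it, making mark support independent of any in-memory buffer (and of BufferedInputStream's buffer-growth bug). This is a hypothetical minimal version, not the commenter's actual class:

```java
import java.io.*;
import java.nio.file.*;

// Sketch: a mark-capable InputStream backed by RandomAccessFile.
// mark() records the current file offset; reset() seeks back to it,
// so the readlimit argument never forces buffer growth.
class MarkableFileInputStream extends InputStream {
    private final RandomAccessFile file;
    private long markPos = 0;

    MarkableFileInputStream(File f) throws IOException {
        this.file = new RandomAccessFile(f, "r");
    }

    @Override public int read() throws IOException { return file.read(); }

    @Override public int read(byte[] b, int off, int len) throws IOException {
        return file.read(b, off, len);
    }

    @Override public boolean markSupported() { return true; }

    @Override public synchronized void mark(int readlimit) {
        try {
            markPos = file.getFilePointer(); // readlimit is irrelevant: we seek
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public synchronized void reset() throws IOException {
        file.seek(markPos);
    }

    @Override public void close() throws IOException { file.close(); }
}

public class MarkableDemo {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("markable", ".txt");
        Files.write(p, "hello world".getBytes("US-ASCII"));
        try (MarkableFileInputStream in = new MarkableFileInputStream(p.toFile())) {
            byte[] buf = new byte[5];
            in.mark(Integer.MAX_VALUE); // a huge readlimit is harmless here
            in.read(buf);
            in.reset();                 // seek back to the mark
            in.read(buf);               // re-read the same bytes
            System.out.println(new String(buf, "US-ASCII"));
        }
    }
}
```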
Yes, on a slow connection you are correct that you may need a higher timeout value.
However, as part of this investigation we have identified and reported a bug in BufferedInputStream: using mark values over 1 GB (1073741824 bytes) causes a NegativeArraySizeException to be thrown.
We will be updating the library to use a smaller mark size to avoid this bug altogether.
FYI, another group of folks ran into that. You might get a useful workaround from them:
https://bitbucket.org/jmurty/jets3t/issue/99/repeatablerequestentity-blindly-marks
This issue has been resolved in the most recent pull request.