Azure Blob Storage: missing Content-MD5 when a blob is uploaded using the cloudBlockBlob.uploadBlock Java API - azure-storage

I'm uploading files to Azure Blob Storage using the azure-storage Java SDK, version 8.6.5. If I upload a file from the web console, I see a Content-MD5 value.
But I do not see a Content-MD5 value when I upload using the following sample code:
BlobRequestOptions blobRequestOptions = new BlobRequestOptions();
blobRequestOptions.setStoreBlobContentMD5(true);
cloudBlockBlob.uploadBlock(blockId, inputstream, length, null, blobRequestOptions, null);
The file is split into multiple chunks that are uploaded in parallel threads, and finally the block list is committed as follows. The upload itself works fine.
cloudBlockBlob.commitBlockList(blockIds, null, blobRequestOptions, null);
Any pointers would be greatly appreciated, thanks!
Also, what is the best way to check file integrity programmatically and ensure the file was uploaded correctly if Content-MD5 is not available? Does Azure Blob Storage support anything for content verification?

If you want to get the Content-MD5 value after you have uploaded a file successfully, try the code below:
cloudBlockBlob.getProperties().getContentMD5()
If the Content-MD5 value is still missing, this link could be helpful.
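One possible workaround, as a minimal sketch (assuming the azure-storage 8.x API and the variables from the question: filePath, cloudBlockBlob, blockIds, blobRequestOptions): because the blocks are uploaded separately, the client library never sees the whole stream, so it cannot compute a blob-level MD5 for you. You can compute the whole-file MD5 yourself, attach it to the blob at commit time, and compare it against the stored property later:
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.util.Base64;

MessageDigest digest = MessageDigest.getInstance("MD5");
try (InputStream in = new DigestInputStream(Files.newInputStream(Paths.get(filePath)), digest)) {
    byte[] buffer = new byte[8192];
    while (in.read(buffer) != -1) {
        // the DigestInputStream updates the MD5 as the file is read
    }
}
String localMd5 = Base64.getEncoder().encodeToString(digest.digest());

// Attach the blob-level Content-MD5; the Put Block List request can carry the blob properties.
cloudBlockBlob.getProperties().setContentMD5(localMd5);
cloudBlockBlob.commitBlockList(blockIds, null, blobRequestOptions, null);
// If the value does not stick on commit, uploadProperties() persists it afterwards:
// cloudBlockBlob.uploadProperties();

// Integrity check: re-read the service-side properties and compare.
cloudBlockBlob.downloadAttributes();
boolean intact = localMd5.equals(cloudBlockBlob.getProperties().getContentMD5());
For on-the-wire verification of each block, the SDK also offers a per-request option (blobRequestOptions.setUseTransactionalContentMD5(true)), which adds a transactional MD5 to each uploadBlock call.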

Related

Streaming mPDF Output for Download

This is more of a conceptual question, but I'm wondering if it's possible to stream the output of mPDF directly to the user as a download (i.e. without saving it to a temp folder on the server or loading it into the user's browser).
I'm already successfully streaming a zip file of S3 photos using ZipStream and the AWS PHP S3 Stream Wrapper, which works very well, so I'd like to employ a similar method for my PDF generation.
I use the mPDF library on Heroku to generate reports that include S3 images. The mPDF documentation shows four output options, including inline and download; inline loads the PDF right into the user's browser, and download forces the download prompt (the desired behavior).
I've enabled the S3 Stream Wrapper and embedded images in the PDF per the mPDF Image() documentation, like this:
$mpdf->imageVars['myvariable'] = '';
while (!feof($streamRead)) {
// Read 1,024 bytes from the stream
$mpdf->imageVars['myvariable'] .= fread($streamRead, 1024);
}
fclose($streamRead);
$imageHTML = '<img src="var:myvariable" class="report-img" />';
$mpdf->WriteHTML($imageHTML);
I've also added header('X-Accel-Buffering: no');, which was required to get ZipStream working in the Heroku environment, but the script always times out if there are more than a couple of images.
Is it possible to immediately prompt the download and just have the data stream directly to the user? I'm hoping this method can be used for more than just zip downloads but haven't had luck with this particular application yet.

How to set an object in a Play framework session, or how to retrieve the current size transferred in AWS?

There is an upload object returned by AWS on file upload, which contains the bytes transferred so far.
How do I put an object into the Play framework session so that it can be retrieved in the next AJAX call to get the status of the file upload?
Is there a way to get the bytes transferred from the AWS API by passing the file's access key or unique key in the next AJAX call after the file upload?
Thanks.
1) Play's session doesn't work this way: it's based on cookies, and there is no server-side storage out of the box (everything you set in a user's session ends up in a cookie), so you need to handle that yourself.
I would set a random UUID as the session ID and use a backend store that keeps the upload state keyed by that ID.
2) Sure, but you need to handle that yourself. AWS's API is asynchronous, so you get an ID when the upload starts and you use it later to check the status.
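One way to wire this up on the Java side (a minimal sketch, assuming the AWS SDK for Java v1 TransferManager; the class UploadTracker and its names are made up for illustration): keep the Upload objects in a server-side map and put only the generated ID into Play's cookie-based session.
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;
import java.io.File;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class UploadTracker {
    private static final TransferManager transferManager = TransferManagerBuilder.standard().build();
    private static final ConcurrentHashMap<String, Upload> uploads = new ConcurrentHashMap<>();

    // Start the S3 upload and return an ID; store only this ID in the Play session cookie.
    public static String startUpload(String bucket, String key, File file) {
        String uploadId = UUID.randomUUID().toString();
        uploads.put(uploadId, transferManager.upload(bucket, key, file));
        return uploadId;
    }

    // Called from the status AJAX endpoint with the ID read back from the session.
    public static long bytesTransferred(String uploadId) {
        Upload upload = uploads.get(uploadId);
        return upload == null ? -1L : upload.getProgress().getBytesTransferred();
    }
}
The map lives in application memory, so on a multi-dyno/multi-node deployment you would replace it with a shared store (e.g. a cache or database) keyed by the same ID.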

Is there any other way to insert data into BigQuery via the API apart from streaming data?

Is there any other way to insert data into BigQuery via the API apart from streaming data, i.e. Table.insertAll:
InsertAllResponse response = bigquery.insertAll(InsertAllRequest.newBuilder(tableId)
    .addRow("rowId", rowContent)
    .build());
As you can see in the docs, you also have two other possibilities:
Loading from Google Cloud Storage, Bigtable, or Datastore
Just run the jobs.insert method of the jobs resource and set the configuration.load.sourceUri field in the job metadata.
In the Python client, this is done by the LoadTableFromStorageJob class.
You can therefore send your files to GCS first, for instance, and then make an API call to load them into BigQuery.
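Since the question uses the Java client, here is a minimal sketch of the equivalent load-from-GCS job with google-cloud-bigquery (the dataset, table, and gs:// URI are placeholders):
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.LoadJobConfiguration;
import com.google.cloud.bigquery.TableId;

BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
TableId tableId = TableId.of("my_dataset", "my_table");                      // placeholder dataset/table
LoadJobConfiguration loadConfig =
        LoadJobConfiguration.newBuilder(tableId, "gs://my-bucket/data.csv")  // placeholder GCS URI
                .setFormatOptions(FormatOptions.csv())
                .build();
// waitFor() throws InterruptedException, so call this from a method that declares it.
Job loadJob = bigquery.create(JobInfo.of(loadConfig)).waitFor();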
Media Upload
This is also a load job, but this time the HTTP request also carries the bytes of a file on your machine. So you can send pretty much any file you have on disk with this request (as long as the format is accepted by BigQuery).
In Python, this is done with Table.upload_from_file.
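In Java, a minimal sketch of the equivalent media upload streams a local file through a TableDataWriteChannel (reusing bigquery and tableId from the sketch above; the local path is a placeholder):
import com.google.cloud.bigquery.TableDataWriteChannel;
import com.google.cloud.bigquery.WriteChannelConfiguration;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.file.Files;
import java.nio.file.Paths;

WriteChannelConfiguration writeConfig =
        WriteChannelConfiguration.newBuilder(tableId)
                .setFormatOptions(FormatOptions.csv())
                .build();
TableDataWriteChannel writer = bigquery.writer(writeConfig);
// Files.copy throws IOException; call this from a method that declares it.
try (OutputStream stream = Channels.newOutputStream(writer)) {
    Files.copy(Paths.get("/path/to/local/data.csv"), stream);  // placeholder local path
}
// The load job becomes available once the channel is closed.
Job mediaLoadJob = writer.getJob().waitFor();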

How to check whether Azure Blob Storage upload was successful?

I'm using an Azure SAS URL to upload a file to a blob storage:
var blockBlob = new Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob(new System.Uri(sasUrl));
blockBlob.UploadFromFile(filePath);
The file exists on my disk, and the URL should be correct since it is automatically retrieved from the Windows Store Ingestion API (and, if I slightly change one character in the URL's signature part, the upload fails with HTTP 403).
However, when checking
var blobs = blockBlob.Container.ListBlobs();
the result is Count = 0, so I'm wondering whether the upload was successful. Unfortunately, the UploadFromFile method (like the UploadFromStream method) returns void, so I'm not sure how to retrieve the upload's result.
If I try to connect to the SAS URL using Azure Storage Explorer, listing blob containers fails with the error "Authentication Error. Signature fields not well formed". I tried URL-escaping the URL's signature part, since that seems to be the reason for that error in some similar cases, but that doesn't solve the problem.
Is there any way to check the status of a blob upload? Has anybody an idea why an auto-generated URL (delivered by one of Microsoft's official APIs) can not be connected to using Azure Explorer?
Please examine the sp field of your SAS. It shows the permissions you are granted on the blob. For example, sp=rw means you can read the blob and write content to it using this SAS; sp=w means you can only write content to the blob using this SAS.
If you have the read permission, you can paste the SAS URL into the browser address bar; the browser will download or display the blob content for you.
Is there any way to check the status of a blob upload?
If no exception is thrown from your code, the blob has been uploaded successfully; otherwise, an exception will be thrown.
try
{
    blockBlob.UploadFromFile(filePath);
}
catch (Exception ex)
{
    // upload failed
}
You can also confirm it using any web debugging proxy (e.g. Fiddler) to capture the response from the storage server. A 201 Created status code is returned if the blob has been uploaded successfully.
Has anybody an idea why an auto-generated URL (delivered by one of Microsoft's official APIs) can not be connected to using Azure Explorer?
Azure Storage Explorer only allows us to connect to a storage account using a SAS, or to attach a storage service (blob container, queue, or table) using a SAS. It doesn't allow us to connect to an individual blob using a SAS.
In the case of a synchronous upload, we can use the exception-based approach and also cross-check blockBlob.Properties.Length. Before the file is uploaded it is -1, and after the upload completes it becomes the size of the uploaded file.
So we can add a check on the blob length, which tells us the state of the upload.
try
{
    blockBlob.UploadFromFile(filePath);
    if (blockBlob.Properties.Length >= 0)
    {
        // File uploaded successfully.
        // You can take any action here.
    }
}
catch (Exception ex)
{
    // upload failed
}

Access EXIF data from file uploaded with Paperclip in tests

Can I safely use self.image.staged_path to access a file that is uploaded to Amazon S3 using Paperclip? I noticed that I can use self.image.url (which returns https...s3....file) to read EXIF from the file on S3 in the production or development environments. I can't use the same approach in tests, though.
I found the staged_path method, which allows me to read EXIF from the file in all environments (it returns something like /var/folders/dv/zpc...-6331-fq3gju).
I couldn't find more information about this method, so the question is: does anyone have experience with this and could advise on the reliability of this approach? I'm reading the EXIF data in a before_post_process callback:
before_post_process :load_date_from_exif
def load_date_from_exif
  ...
  EXIFR::JPEG.new(self.image.staged_path).date_time
  ...
end