Azure blob storage: Azure Source blob size is set to Zero when I download the blob using DownloadRange API - azure-storage

I am trying to download an Azure blob using the Azure Data Movement library.
I am hitting an issue where the source blob's size is reported as zero when I download the blob using the downloadRange API.
The destination file is downloaded correctly and its size is correct. Am I missing anything?
I am using the azure-storage Java SDK version 8.6.5. Here is the sample code:
CloudStorageAccount storageAccount = CloudStorageAccount.parse("myConnectionString");
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
CloudBlobContainer container = blobClient.getContainerReference("myContainer");
CloudBlockBlob cloudBlockBlob = container.getBlockBlobReference("abc.txt");
And this is the loop where I am reading in chunks. The destination file size is correct; I just cannot figure out why the source blob size is reported as zero. This does not happen when the entire blob is downloaded with the download API.
try (OutputStream out = new ByteBufferBackedOutputStream(buffer)) {
    cloudBlockBlob.downloadRange(position, (long) buffer.capacity(), out);
}
...
Thanks in Advance!!

If you just want to read a blob in chunks with cloudBlockBlob.downloadRange, try the code below:
import java.io.ByteArrayOutputStream;

import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.CloudBlobClient;
import com.microsoft.azure.storage.blob.CloudBlobContainer;
import com.microsoft.azure.storage.blob.CloudBlockBlob;

public class LegacyStorageSdk {

    public static void main(String[] args) throws Exception {
        CloudStorageAccount storageAccount = CloudStorageAccount.parse(
                "<storage account connection string>");
        CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
        CloudBlobContainer container = blobClient.getContainerReference("<container name>");
        CloudBlockBlob cloudBlockBlob = container.getBlockBlobReference("<.txt file name>");

        // Fetch the blob's properties first so getLength() returns the real size.
        cloudBlockBlob.downloadAttributes();
        long totalSize = cloudBlockBlob.getProperties().getLength();
        long readSize = 2;
        for (long i = 0; i < totalSize; i += readSize) {
            ByteArrayOutputStream downloadStream = new ByteArrayOutputStream();
            // Clamp the last chunk so we never read past the end of the blob.
            cloudBlockBlob.downloadRange(i, Math.min(readSize, totalSize - i), downloadStream);
            System.out.println("===>" + downloadStream);
        }
    }
}
(Screenshots of the .txt file contents and the resulting console output omitted.)
Let me know if you have any questions.

Related

How to update an existing Blob in Azure Storage in .NET 6 or in ASP.NET Core

I have prepared some C# code to create a container in Azure Storage and then upload a file into that container. The code is below:
var connectionString = _settings.appConfig.StorageConnectionString;
BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
BlobContainerClient blobContainer = blobServiceClient.GetBlobContainerClient("nasir-container");
await blobContainer.CreateIfNotExistsAsync(); // Create the container if it does not exist.
string fileName = "D:/Workspace/Adappt/MyWordFile.docx";
BlobClient blobClient = blobContainer.GetBlobClient(fileName); // Creating the blob
FileStream uploadFileStream = System.IO.File.OpenRead(fileName);
blobClient.Upload(uploadFileStream);
uploadFileStream.Close();
Now I have updated MyWordFile.docx with more content and I would like to upload this updated file to the same blob storage. How can I do this? I also want versioning, so that I can get the file content based on the version.
Now I have updated my MyWordFile.docx with more content. Now I would like to upload this updated file to the same blob storage. How can I do this?
To update a blob, you simply upload the same file again, using essentially the same code you wrote for the original upload; the new content replaces the existing blob. One caveat with the v12 SDK shown above: Upload throws if the blob already exists unless you explicitly allow overwriting.
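A minimal sketch of the re-upload, reusing the variable names from the question (overwrite: true is what permits replacing the existing blob):
// Re-upload the updated local file over the existing blob.
string fileName = "D:/Workspace/Adappt/MyWordFile.docx";
BlobClient blobClient = blobContainer.GetBlobClient(fileName);
using (FileStream uploadFileStream = System.IO.File.OpenRead(fileName))
{
    // overwrite: true replaces the current content; with versioning
    // enabled, the service keeps the old content as a previous version.
    await blobClient.UploadAsync(uploadFileStream, overwrite: true);
}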
I also want to create versioning too so that I can get the file content based on the version.
There are two ways you can implement versioning for blobs:
Automatic versioning: If you want the Azure Blob Storage service to maintain versions of your blobs, all you need to do is enable versioning on the storage account. Once you enable it, any time a blob is modified a new version of the blob is created automatically for you by the service. Please see this link to learn more about blob versioning: https://learn.microsoft.com/en-us/azure/storage/blobs/versioning-overview.
Manual versioning: While automatic versioning is great, there are many reasons you might opt for manual versioning instead (e.g. you only want to version a few blobs rather than all of them, or you are not using a V2 storage account). In that case, you can create a version of a blob by taking a snapshot of it before you update it. A snapshot is a read-only copy of the blob at the point in time the snapshot was taken. Please see this link to learn more about blob snapshots: https://learn.microsoft.com/en-us/azure/storage/blobs/snapshots-overview.
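A minimal sketch of the snapshot approach with the v12 .NET SDK (blobClient here is assumed to be a BlobClient pointing at your existing blob):
// Take a read-only, point-in-time snapshot before overwriting the blob.
Response<BlobSnapshotInfo> snapshotResponse = await blobClient.CreateSnapshotAsync();

// A snapshot is addressed by its timestamp; use WithSnapshot to read
// that version of the content back later.
BlobClient snapshotClient = blobClient.WithSnapshot(snapshotResponse.Value.Snapshot);
BlobDownloadInfo oldContent = await snapshotClient.DownloadAsync();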
First you need to enable versioning on the storage account, through the portal.
Click on Disabled; it takes you to a different page where you can select Enabled and click Save.
After that, once a blob is uploaded, any update to it automatically triggers the creation of a new version.
public static async Task UpdateVersionedBlobMetadata(BlobContainerClient blobContainerClient,
                                                     string blobName)
{
    try
    {
        // Create the container.
        await blobContainerClient.CreateIfNotExistsAsync();

        // Upload a block blob.
        BlockBlobClient blockBlobClient = blobContainerClient.GetBlockBlobClient(blobName);
        string blobContents = string.Format("Block blob created at {0}.", DateTime.Now);
        byte[] byteArray = Encoding.ASCII.GetBytes(blobContents);

        string initialVersionId;
        using (MemoryStream stream = new MemoryStream(byteArray))
        {
            Response<BlobContentInfo> uploadResponse =
                await blockBlobClient.UploadAsync(stream, null, default);

            // Get the version ID for the current version.
            initialVersionId = uploadResponse.Value.VersionId;
        }

        // Update the blob's metadata to trigger the creation of a new version.
        Dictionary<string, string> metadata = new Dictionary<string, string>
        {
            { "key", "value" },
            { "key1", "value1" }
        };

        Response<BlobInfo> metadataResponse =
            await blockBlobClient.SetMetadataAsync(metadata);

        // Get the version ID for the new current version.
        string newVersionId = metadataResponse.Value.VersionId;

        // Request metadata on the previous version.
        BlockBlobClient initialVersionBlob = blockBlobClient.WithVersion(initialVersionId);
        Response<BlobProperties> propertiesResponse = await initialVersionBlob.GetPropertiesAsync();
        PrintMetadata(propertiesResponse);

        // Request metadata on the current version.
        BlockBlobClient newVersionBlob = blockBlobClient.WithVersion(newVersionId);
        Response<BlobProperties> newPropertiesResponse = await newVersionBlob.GetPropertiesAsync();
        PrintMetadata(newPropertiesResponse);
    }
    catch (RequestFailedException e)
    {
        Console.WriteLine(e.Message);
        Console.ReadLine();
        throw;
    }
}
static void PrintMetadata(Response<BlobProperties> propertiesResponse)
{
    if (propertiesResponse.Value.Metadata.Count > 0)
    {
        Console.WriteLine("Metadata values for version {0}:", propertiesResponse.Value.VersionId);
        foreach (var item in propertiesResponse.Value.Metadata)
        {
            Console.WriteLine("Key:{0} Value:{1}", item.Key, item.Value);
        }
    }
    else
    {
        Console.WriteLine("Version {0} has no metadata.", propertiesResponse.Value.VersionId);
    }
}
The above code is from the Microsoft documentation on blob versioning.
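For reference, a hypothetical call site for the method above (the connection string and names are placeholders):
BlobServiceClient blobServiceClient = new BlobServiceClient("<storage account connection string>");
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient("<container name>");
await UpdateVersionedBlobMetadata(containerClient, "MyWordFile.docx");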

How do I set ContentType before upload to azure blob container [duplicate]

This question already has answers here:
How upload blob in Azure Blob Storage with specified ContentType with .NET v12 SDK? (2 answers)
I'm using an ASP.NET Core Web API to upload images to Azure storage.
I was able to successfully upload an image blob (following the quickstart). However, its content-type property in Azure is set to application/octet-stream, and with that content type the public URL will not load in a browser. Since I plan to eventually consume this URL/image on my website, is there any way to specify the content type as image/jpeg? I tried the code below but received the error 404 (The specified blob does not exist.) during the SetHttpHeaders call (the currently commented-out UploadBlob call does work, but leaves the octet-stream content type).
BlobClient blobClient = containerClient.GetBlobClient(guids[index]);
using (var content = file.OpenReadStream())
{
    blobClient.Upload(content);
    blobClient.SetHttpHeaders(new BlobHttpHeaders() { ContentType = "image/jpeg" });
    //containerClient.UploadBlob(guids[index], content);
}
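Another option, rather than calling SetHttpHeaders after the upload, is to set the content type as part of the upload itself via BlobUploadOptions (v12 SDK); a sketch reusing the names from the snippet above:
BlobClient blobClient = containerClient.GetBlobClient(guids[index]);
using (var content = file.OpenReadStream())
{
    // Setting the header during upload avoids the separate
    // SetHttpHeaders round trip that was returning 404.
    blobClient.Upload(content, new BlobUploadOptions
    {
        HttpHeaders = new BlobHttpHeaders { ContentType = "image/jpeg" }
    });
}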
Even without setting a specific ContentType, the download works fine as long as the blob name includes the file extension (filename.ext). When you create the BlobClient reference, include the extension. Example download:
var fileName = $"{guids[index]}.jpg";
var pathStorage = Path.Combine(path, fileName);
BlobClient blobClient = containerClient.GetBlobClient(pathStorage);
BlobDownloadInfo download = await blobClient.DownloadAsync();
byte[] bytesContent;
using (var ms = new MemoryStream())
{
    await download.Content.CopyToAsync(ms);
    bytesContent = ms.ToArray();
}
return bytesContent;
Example upload:
var fileName = $"{guids[index]}.jpg";
var pathStorage = Path.Combine(path, fileName);
BlobClient blobClient = containerClient.GetBlobClient(pathStorage);
var stream = new MemoryStream(bytesContent);
var uploadInfo = await blobClient.UploadAsync(stream);

PDF header signature not found error?

I am working on an ASP.NET MVC application with Azure. When I upload a PDF document to Azure blob storage, it uploads perfectly using the code below.
var filename = Document.FileName;
var contenttype = Document.ContentType;
int pdfocument = Request.ContentLength;
//uploading document in to azure blob
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
// duplicate declaration, commented out so the snippet compiles:
//var storageAccount = CloudStorageAccount.DevelopmentStorageAccount(FromConfigurationSetting("Connection"));
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("containername");
container.CreateIfNotExists();
var permissions = container.GetPermissions();
permissions.PublicAccess = BlobContainerPublicAccessType.Blob;
container.SetPermissions(permissions);
string uniqueBlobName = filename;
CloudBlockBlob blob = container.GetBlockBlobReference(uniqueBlobName);
blob.Properties.ContentType = ;
blob.UploadFromStream(Request.InputStream);
After uploading the document to the blob, trying to read the PDF document produces the error "PDF header signature not found." The code that throws is:
byte[] pdf = new byte[pdfocument];
HttpContext.Request.InputStream.Read(pdf, 0, pdfocument);
PdfReader pdfReader = new PdfReader(pdf); //error getting here
One more thing I forgot: if we comment out the above code (uploading the document to the Azure blob), then I do not get that error.
In your combined use case you try to read Request.InputStream twice: once during upload and once later when reading it into your byte[] pdf. The first read consumes the stream to its end, so the second read most likely gets no data at all.
As you anyway intend to read the PDF into memory (the aforementioned byte[] pdf), you could, in your combined use case,
first read the data into that array:
int pdfocument = Request.ContentLength;
byte[] pdf = new byte[pdfocument];
HttpContext.Request.InputStream.Read(pdf, 0, pdfocument);
then upload that array using CloudBlob.UploadByteArray
var storageAccount = CloudStorageAccount.DevelopmentStorageAccount(FromConfigurationSetting("Connection"));
[...]
CloudBlockBlob blob = container.GetBlockBlobReference(uniqueBlobName);
blob.Properties.ContentType = ; // <--- something missing in your code...
blob.UploadByteArray(pdf); // <--- upload byte[] instead of stream
and then feed your PDF reader
PdfReader pdfReader = new PdfReader(pdf);
This way you read the stream only once, and a byte[] should be re-usable...
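An alternative, if the request stream happens to be seekable (classic ASP.NET buffers the request body, so it usually is), would be to rewind it between the two reads instead of buffering it yourself; a hedged sketch:
// First read: upload straight from the request stream.
blob.UploadFromStream(Request.InputStream);

// Rewind before the second read; only valid when the stream supports seeking.
if (Request.InputStream.CanSeek)
{
    Request.InputStream.Position = 0;
}
byte[] pdf = new byte[pdfocument];
HttpContext.Request.InputStream.Read(pdf, 0, pdfocument);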

Windows Azure UploadFromStream No Longer Works After Porting to MVC4 - Pointers?

Updated my MVC3/.NET 4.5/Azure solution to MVC4.
My code for uploading an image to blob storage fails every time in the upgraded MVC4 solution, yet the same code works fine when I run the MVC3 solution. The code that does the uploading, in a DLL, has not changed.
I've uploaded the same image file in both the MVC3 and MVC4 solutions and inspected the stream, which appears to be fine. In both cases I am running the code locally on my machine and my connection points to blob storage in the cloud.
Any pointers for debugging? Any known issues I may not be aware of when upgrading to MVC4?
Here is my upload code:
public string AddImage(string pathName, string fileName, Stream image)
{
    var client = _storageAccount.CreateCloudBlobClient();
    client.RetryPolicy = RetryPolicies.Retry(3, TimeSpan.FromSeconds(5));
    var container = client.GetContainerReference(AzureStorageNames.ImagesBlobContainerName);
    image.Seek(0, SeekOrigin.Begin);
    var blob = container.GetBlobReference(Path.Combine(pathName, fileName));
    blob.Properties.ContentType = "image/jpeg";
    blob.UploadFromStream(image);
    return blob.Uri.ToString();
}
I managed to fix it. For some reason, reading the stream directly from the HttpPostedFileBase wasn't working; simply copying it into a new MemoryStream solved it.
My code
public string StoreImage(string album, HttpPostedFileBase image)
{
    var blobStorage = storageAccount.CreateCloudBlobClient();
    var container = blobStorage.GetContainerReference("containerName");
    if (container.CreateIfNotExist())
    {
        // Configure container for public access.
        var permissions = container.GetPermissions();
        permissions.PublicAccess = BlobContainerPublicAccessType.Container;
        container.SetPermissions(permissions);
    }
    string uniqueBlobName = string.Format("{0}{1}", Guid.NewGuid().ToString(), Path.GetExtension(image.FileName)).ToLowerInvariant();
    CloudBlockBlob blob = container.GetBlockBlobReference(uniqueBlobName);
    blob.Properties.ContentType = image.ContentType;
    image.InputStream.Position = 0;
    using (var imageStream = new MemoryStream())
    {
        image.InputStream.CopyTo(imageStream);
        imageStream.Position = 0;
        blob.UploadFromStream(imageStream);
    }
    return blob.Uri.ToString();
}

How to write a string to Amazon S3 bucket?

How can I add a string as a file on Amazon S3? From what I've searched, I gather that we can upload a file to S3. What is the best way to upload data without creating a file?
There is an overload of the AmazonS3.putObject method that accepts the bucket string, a key string, and a string of text content. I hadn't seen it mentioned on Stack Overflow, so I'm putting it here. It's similar to @Jonik's answer, but without the additional dependency.
AmazonS3 s3client = AmazonS3ClientBuilder.standard().withRegion(Regions.US_EAST_1).build();
s3client.putObject(bucket, key, contents);
Doesn't look as nice, but here is how you can do it using Amazon's Java client, which is probably what JetS3t does behind the scenes anyway.
private boolean putArtistPage(AmazonS3 s3, String bucketName, String key, String webpage)
{
    try
    {
        byte[] contentAsBytes = webpage.getBytes("UTF-8");
        ByteArrayInputStream contentsAsStream = new ByteArrayInputStream(contentAsBytes);
        ObjectMetadata md = new ObjectMetadata();
        // Set the length explicitly so the client does not have to buffer the stream.
        md.setContentLength(contentAsBytes.length);
        s3.putObject(new PutObjectRequest(bucketName, key, contentsAsStream, md));
        return true;
    }
    catch (AmazonServiceException e)
    {
        log.log(Level.SEVERE, e.getMessage(), e);
        return false;
    }
    catch (Exception ex)
    {
        log.log(Level.SEVERE, ex.getMessage(), ex);
        return false;
    }
}
What is the best way to upload data without creating file?
If you meant without creating a file on S3, well, you can't really do that. On Amazon S3, the only way to store data is as files, or, using more accurate terminology, objects. An object can contain from zero bytes to 5 terabytes of data, and is stored in a bucket. Amazon's S3 homepage lays out the basic facts quite clearly. (For other data storage options on AWS, you might want to read e.g. about SimpleDB.)
If you meant without creating a local temporary file, then the answer depends on what library/tool you are using. (As RickMeasham suggested, please add more details!) With the s3cmd tool, for example, you can't skip creating a temp file, while with the JetS3t Java library uploading a String directly is easy:
// (First init s3Service and testBucket)
S3Object stringObject = new S3Object("HelloWorld.txt", "Hello World!");
s3Service.putObject(testBucket, stringObject);
There is a simple way to do it with PHP: send the string as the body of the object, specifying the name of the new file in the key:
$s3->putObject(array(
    'Bucket'      => [Bucket name],
    'Key'         => [path/to/file.ext],
    'Body'        => [Your string goes here],
    'ContentType' => [specify mimetype if you want],
));
This will create a new file under the specified key, whose content is the supplied string.
If you're using Java, check out https://ivan-site.com/2015/11/interact-with-s3-without-temp-files/
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.*;
import java.nio.charset.StandardCharsets;

class S3StreamJacksonTest {
    private static final String S3_BUCKET_NAME = "bucket";
    private static final String S3_KEY_NAME = "key";
    private static final String CONTENT_TYPE = "application/json";
    private static final AmazonS3 AMAZON_S3 = new AmazonS3Client();
    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
    private static final TestObject TEST_OBJECT = new TestObject("test", 123, 456L);

    public void testUploadWithStream() throws JsonProcessingException {
        String fileContentString = OBJECT_MAPPER.writeValueAsString(TEST_OBJECT);
        byte[] fileContentBytes = fileContentString.getBytes(StandardCharsets.UTF_8);
        InputStream fileInputStream = new ByteArrayInputStream(fileContentBytes);

        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentType(CONTENT_TYPE);
        metadata.setContentLength(fileContentBytes.length);

        PutObjectRequest putObjectRequest = new PutObjectRequest(
                S3_BUCKET_NAME, S3_KEY_NAME, fileInputStream, metadata);
        AMAZON_S3.putObject(putObjectRequest);
    }
}
This works for me:
public static PutObjectResult writeString(String bucket, String key, String stringToWrite, AmazonS3Client s3Client) {
    byte[] contentBytes = stringToWrite.getBytes(StandardCharsets.UTF_8);
    ObjectMetadata meta = new ObjectMetadata();
    meta.setContentMD5(new String(com.amazonaws.util.Base64.encode(DigestUtils.md5(contentBytes))));
    // Use the byte length, not the character count, or non-ASCII content will be truncated.
    meta.setContentLength(contentBytes.length);
    InputStream stream = new ByteArrayInputStream(contentBytes);
    return s3Client.putObject(bucket, key, stream, meta);
}
The sample code at https://docs.aws.amazon.com/AmazonS3/latest/dev/UploadObjSingleOpJava.html works for me.
s3Client.putObject(bucketName, stringObjKeyName, "Uploaded String Object");
Looks like this overload was added around version 1.11.20, so make sure you are using that or a newer version of the SDK.
https://javadoc.io/doc/com.amazonaws/aws-java-sdk-s3/1.11.20/com/amazonaws/services/s3/AmazonS3.html#putObject-java.lang.String-java.lang.String-java.lang.String-