What is a default storage account in HDInsight?

For a given HDInsight cluster I have seen that there is a 'Default Storage Account' and a 'Linked Storage Account'. What does that mean? What is special about an account being the default storage account for a given HDInsight cluster, and how is it different from any arbitrary storage account with respect to that cluster? Is it that whenever we try to access that storage account from the cluster it won't ask for keys?
And how is that different from a 'Linked Storage Account' for a given HDInsight cluster? I have seen that there is generally one default storage account for an HDInsight cluster but several linked storage accounts.

The default storage account is like the system drive: log files are stored in it, and each cluster must have one. Sharing a default storage account between two clusters is not supported, and there are also known issues with reusing a default storage account many times.
You can have many linked storage accounts. People usually store business data in linked storage accounts. In the past, you could only link a storage account at cluster creation time; now you can use Ambari to add linked storage accounts to a live Linux-based cluster.
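For illustration, data in the cluster's default container (on the default storage account) can be addressed with a relative wasb URI, while data in a linked storage account must be fully qualified; the account, container, and path names below are hypothetical:
wasb:///example/data/file.csv
wasb://mycontainer@mylinkedaccount.blob.core.windows.net/example/data/file.csv
In both cases the cluster already holds the account keys in its configuration, so no extra credentials are needed at access time.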
See https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-use-blob-storage/ for details.

Related

Replication between two storage accounts in different regions, must be read/writeable and zone redundant

We are setting up an active/active configuration using either Front Door or Traffic Manager as our front end. Our services are located in the Central US and East US 2 paired regions. There is an AKS cluster in each region. The AKS clusters will write data to a storage account located in their region. However, the files in the storage accounts must be the same in each region. The storage accounts must be zone redundant and read/writable in each region at all times, thus none of the Microsoft replication strategies work. This replication must be automatic; we can't have any manual process doing the copy. I looked at Data Factory, but it seems to be regional, so I don't think that would work, but it's a possibility...maybe. Does anyone have any suggestions on the best way to accomplish this task?
I have tested this in my environment.
Replication between two storage accounts can be implemented using a Logic App.
In the Logic App, we can create two workflows: one for replicating data from storage account 1 to storage account 2, and the other for replicating data from storage account 2 to storage account 1.
I have tried to replicate blob data between storage accounts in different regions.
The workflow is: when a blob is added or modified in storage account 1, the blob will be copied to storage account 2.
Trigger: When a blob is added or modified (properties only) (V2) (use the connection settings of storage account 1)
Action: Copy blob (V2) (use the connection settings of storage account 2)
In a similar way, we can create another workflow for replicating data from storage account 2 to storage account 1.
Now, the data will be replicated between the two storage accounts.
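For illustration, the Copy blob action performs essentially a server-side blob copy; a minimal sketch of the same operation with the classic Windows Azure Storage Client Library (the connection strings, container, and blob names are placeholders) might look like this:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

CloudStorageAccount sourceAccount = CloudStorageAccount.Parse("<storage account 1 connection string>");
CloudStorageAccount targetAccount = CloudStorageAccount.Parse("<storage account 2 connection string>");

CloudBlockBlob sourceBlob = sourceAccount.CreateCloudBlobClient()
    .GetContainerReference("data")
    .GetBlockBlobReference("example.txt");
CloudBlockBlob targetBlob = targetAccount.CreateCloudBlobClient()
    .GetContainerReference("data")
    .GetBlockBlobReference("example.txt");

// Start a server-side, asynchronous copy. For a cross-account copy the
// source blob must be readable by the target service (e.g. via a SAS).
targetBlob.StartCopy(sourceBlob);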

S3 redundancy, 3+ availability zone guarantee

According to the S3 FAQ:
"Amazon S3 Standard, S3 Standard-Infrequent Access, and S3 Glacier
storage classes replicate data across a minimum of three AZs to
protect against the loss of one entire AZ. This remains true in
Regions where fewer than three AZs are publicly available."
I'm not clear on what this means. Suppose you store your data in a region with fewer than three AZs that are "publicly available." Does that mean that Amazon will store your data in an AZ within that region that is not publicly available, if necessary? Or that it will store your data in an AZ in another region to make up the difference?
S3 will store your data in an AZ that is not publicly available. The same is also true for DynamoDB, and possibly other services as well.
Source:
I want to say I heard it at a re:Invent session. I’ll try to find a link to some documentation.
This says that even in Regions where fewer than three AZs are publicly available, Amazon S3 still replicates your data across a total of at least three AZs (including public and non-public ones).

Many 4-character storage containers being created in my storage account

I have an Azure storage account.
For a while now, something has been creating 4-character empty containers; there are hundreds of them.
This storage account is used by:
Function Apps
Document Db (Cosmos)
Terraform State
Container Registry for Docker images
It's not a big deal, but I don't want millions of empty containers being created by an unknown process.
Note 1: I have looked for any way to find more statistics / history of these containers, but I can't find any.
Note 2: We don't have any custom code that creates storage containers in our release pipelines (i.e., PowerShell or CLI).
Thanks,
Russ
It seems the containers are used to store logs for Azure Functions. I have a storage account used just for an Azure Function and a web app, and via Storage Explorer I can see that it has containers like yours.
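If you want to watch when they appear, a small sketch with the classic Windows Azure Storage Client Library (the connection string is a placeholder) can list each container with its last-modified time:

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

CloudStorageAccount account = CloudStorageAccount.Parse("<connection string>");
CloudBlobClient blobClient = account.CreateCloudBlobClient();

// Print every container name with its last-modified timestamp, to see
// when the mystery containers are created or touched.
foreach (CloudBlobContainer container in blobClient.ListContainers())
{
    Console.WriteLine("{0}\t{1:u}", container.Name, container.Properties.LastModified);
}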

Windows Azure File storage performance

Is there a way to get the following information about an Azure File storage account using the Windows Azure Storage Client Library:
Azure Storage Account Capacity
Azure Storage Free and Used Space
Azure Storage Account State (Active, Disabled, Enabled, …)
Client File Transfers (MB, GB, …) per month, day, …
Azure Storage Account Performance
...
Thanks
As far as I know, a standard Azure storage account contains multiple services: Blob, Table, Queue, and File.
If you want information about the File service, you can use the Windows Azure Storage Client Library. If you want information about your storage account itself, I suggest you use the Azure management library.
Azure Storage Account Capacity
As far as I know, the capacity of an Azure storage account is 500 TB.
The max size of a file share is 5 TB.
The max size of a file is 1 TB.
We can create multiple file shares in one storage account; the only limit is the 500 TB storage account capacity.
For more details, you can refer to this article.
Azure Storage Free and Used Space
As far as I know, we can only get the quota and usage of a file share by using the Windows Azure Storage Client Library.
We can use the CloudFileShare.Properties.Quota property to get the quota of the file share and the CloudFileShare.GetStats method to get its usage.
For more details, refer to the code below:
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.File;

CloudStorageAccount storageAccount = CloudStorageAccount.Parse("connectionstring");
CloudFileClient fileClient = storageAccount.CreateCloudFileClient();
CloudFileShare share = fileClient.GetShareReference("fileshare");

// Populate the share's properties, including its quota.
share.FetchAttributes();

// Quota of the file share, in GB (null if no quota has been set).
int? quota = share.Properties.Quota;

// Current usage of the file share, in GB.
var stats = share.GetStats();

Console.WriteLine(quota);
Console.WriteLine(stats.Usage);
Azure Storage Account State (Active, Disabled, Enabled, …)
As far as I know, we can't get the storage account state by using the storage SDK. If you want this value, I suggest you use the Azure management library; you can install it from its NuGet package. You can get StorageAccount.Properties.Status from the StorageAccounts class.
For more details about how to use the Azure management library to access storage accounts, you can refer to this article.
Client File Transfers (MB, GB, …) per month, day, …
As far as I know, the Windows Azure Storage Client Library doesn't contain a method to get client file transfers (MB, GB, …) per month or day.
As a workaround, you can have your application count the transferred files and store that number in Azure table storage each day (when uploading a file to Azure File storage, first get the number from the table, add one, then write the number back to table storage).
If you want to read back the number of transferred files, you can use the Azure Table storage SDK to get the result; a sketch of this workaround follows.
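This is a minimal sketch of that workaround, assuming a hypothetical table name ("transferstats"), partition/row keys, and entity type:

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical entity holding one transfer counter per day.
public class TransferCounter : TableEntity
{
    public int Count { get; set; }
}

// Increment today's counter each time a file is uploaded.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse("connectionstring");
CloudTable table = storageAccount.CreateCloudTableClient().GetTableReference("transferstats");
table.CreateIfNotExists();

string today = DateTime.UtcNow.ToString("yyyy-MM-dd");
TableResult result = table.Execute(TableOperation.Retrieve<TransferCounter>("transfers", today));
TransferCounter counter = (TransferCounter)result.Result
    ?? new TransferCounter { PartitionKey = "transfers", RowKey = today };
counter.Count++;
table.Execute(TableOperation.InsertOrReplace(counter));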
Azure Storage Account Performance
As far as I know, if we want to check our Azure storage account performance, we should first enable diagnostics to log how the storage service works. Then we can check the storage performance by using the service's metrics.
For more details about how to access metrics data by using the Windows Azure Storage Client Library, I suggest you refer to this article.
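For illustration, diagnostics can also be enabled from code. This sketch turns on hourly metrics for the Blob service with the classic SDK; the seven-day retention period is an arbitrary choice:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Shared.Protocol;

CloudStorageAccount storageAccount = CloudStorageAccount.Parse("connectionstring");
var blobClient = storageAccount.CreateCloudBlobClient();

// Read the current service properties, enable hourly metrics
// (recorded in the $MetricsHourPrimaryTransactionsBlob table), and save them.
ServiceProperties properties = blobClient.GetServiceProperties();
properties.HourMetrics.MetricsLevel = MetricsLevel.ServiceAndApi;
properties.HourMetrics.RetentionDays = 7;
properties.HourMetrics.Version = "1.0";
blobClient.SetServiceProperties(properties);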

Azure - Storage Account/ARM Issue

I'm pretty new to Azure and struggling with creating a VM from an existing VHD. I get the following error when executing New-AzureQuickVM -ImageName MyVirtualHD.vhd -Windows -ServiceName test:
CurrentStorageAccountName is not accessible. Ensure that current storage account is accessible and the same location or affinity group as your cloud service.
Select-AzureRmSubscription does not return anything for the CurrentStorageAccount property. Get-AzureRmStorageAccount does list my storage account.
Azure has two deployment models: "Classic" and "Resource Manager" (ARM). You're not seeing your ARM-created storage accounts because you're using classic-mode PowerShell commands to list storage accounts, while your storage accounts were created with the (newer) Resource Manager API, and the classic API will only list storage accounts created with the classic management API.
Your example shows you mixing the two types. (Also, don't worry about resource groups in this context; they're unrelated to this issue.)
Once you select your subscription (via Select-AzureRmSubscription) and then run Get-AzureRmStorageAccount, you should see all of your newly created storage accounts.
Also: Set-AzureSubscription does something different; it's for altering subscription properties. You want Select-... for selecting the default subscription to work with.