Are there any good tools to take a snapshot of my Azure tables and blob containers and copy them into local development storage?
Developers sometimes need to work in an isolated environment but would like a copy of some "real" application data. Right now we have data creation scripts that we can run to populate local storage, but it would be helpful to be able to grab a snapshot and move it into development storage.
I generally use Cloud Storage Studio for all my Azure Storage handling. With it you can easily download from your live blob storage and then upload to your local storage.
You can also use the Azure Storage Synctool to upload from local storage to a live blob container on Azure, or to download in the other direction.
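If you would rather script the download-then-upload step than use a GUI tool, the loop itself is small. The following is only a minimal sketch: `copy_container` and its three callables are hypothetical names, and the actual storage access (for example through an Azure SDK client for the live account and another for development storage) is deliberately left behind those callables.

```python
from typing import Callable, Iterable


def copy_container(
    list_blob_names: Callable[[], Iterable[str]],
    download_blob: Callable[[str], bytes],
    upload_blob: Callable[[str, bytes], None],
) -> int:
    """Copy every blob from a source container to a destination container.

    The three callables are hypothetical adapters: the first lists blob
    names in the live container, the second downloads one blob's bytes,
    and the third uploads bytes to the local development container.
    Returns the number of blobs copied.
    """
    copied = 0
    for name in list_blob_names():
        # Download from the source and upload to the destination
        # under the same blob name.
        upload_blob(name, download_blob(name))
        copied += 1
    return copied
```

Each blob is held fully in memory while it is copied, so this shape suits small development data sets rather than very large blobs.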
What is the best method to sync medical images between my client PCs and my Azure Blob storage through a cloud-based web application? I tried to use MS Azure Blob SDK v18, but it is not that fast. I'm looking for something like Dropbox: fast, resumable, and efficient at parallel uploads.
Solution 1:
AzCopy is a command-line tool for copying data to or from Azure Blob storage, Azure Files, and Azure Table storage by using simple commands that are designed for optimal performance. (Note that Table storage is handled only by older AzCopy releases such as v7.3; AzCopy v10 covers Blob storage and Azure Files.) With AzCopy you can copy data between a file system and a storage account, or between storage accounts, so it can be used to upload local (on-premises) data to a storage account.
You can also create a scheduled task or cron job that runs an AzCopy command script. The script identifies new on-premises data and uploads it to cloud storage at a specific time interval.
For more details, refer to this document
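As a rough illustration of such a script, here is a minimal Python sketch that a scheduled task or cron job could invoke. It assumes AzCopy v10 is installed and on the PATH; the function names, the local folder, and the container URL (including its SAS token) are hypothetical placeholders, not values from the question.

```python
import subprocess
from typing import List


def build_azcopy_sync_command(local_dir: str, container_url: str) -> List[str]:
    """Build an `azcopy sync` command that uploads new or changed local files.

    `local_dir` and `container_url` are placeholders supplied by the caller;
    the URL is assumed to carry whatever authorization (e.g. a SAS token)
    the storage account requires.
    """
    return ["azcopy", "sync", local_dir, container_url, "--recursive"]


def run_sync(local_dir: str, container_url: str) -> int:
    """Run the sync and return AzCopy's exit code (0 means success)."""
    result = subprocess.run(
        build_azcopy_sync_command(local_dir, container_url), check=False
    )
    return result.returncode
```

A scheduler would then call something like `run_sync("/data/images", "https://<account>.blob.core.windows.net/<container>?<SAS>")` at the chosen interval.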
Solution 2:
Azure Data Factory is a fully managed, cloud-based, data-integration ETL service that automates the movement and transformation of data.
By using Azure Data Factory, you can create data-driven workflows to move data between on-premises and cloud data stores, and you can process and transform data with Data Flows. ADF also supports external compute engines for hand-coded transformations by using compute services such as Azure HDInsight, Azure Databricks, and the SQL Server Integration Services (SSIS) integration runtime.
In your case, you could create an Azure Data Factory pipeline to transfer files between an on-premises machine and Azure Blob Storage.
For more details, refer to this thread
I have a bunch of VHDs mounted locally.
They contain a huge number of relatively small files (tens of millions of files per VHD).
Is there a way to transfer the VHDs to Azure and "mount" them inside a blob storage container so that I can access those files as blobs?
I can convert VHDs to ISO files if that would help.
Trying to save time and money with this method.
Later edit: Azure File Shares would be fine, too.
I don't think there is a way to do so. Instead, you can either:
Upload the VHD to Azure as a page blob, and mount it as a data disk of an Azure VM. However, it is just one page blob when you access the VHD through the Azure Storage API, and you can only browse the files inside the VHD from within the VM.
Mount the VHD locally, and upload the files in the VHD one by one to an Azure File Share or an Azure Blob container (you may consider using AzCopy for the best upload performance). After that, you'd be able to access those files directly with the Azure Storage REST API or the Azure Storage Client Libraries, but the VHD itself is not on Azure.
In short, there is no way to both mount the VHD in Azure Storage and access the files within the VHD through the Azure Storage API; you need to choose one or the other.
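For the second option, the main bookkeeping is mapping each file on the mounted VHD to a blob name. Below is a minimal sketch of that mapping only; `iter_upload_pairs` is a hypothetical helper, and the actual transfer is left to AzCopy or a storage client library.

```python
import os
from pathlib import Path
from typing import Iterator, Tuple


def iter_upload_pairs(mount_root: str) -> Iterator[Tuple[Path, str]]:
    """Yield (local file path, blob name) for every file under a mount point.

    The blob name is the file's path relative to the mount root, written
    with forward slashes so the folder structure survives as virtual
    directories in the container.
    """
    root = Path(mount_root)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            file_path = Path(dirpath) / filename
            blob_name = file_path.relative_to(root).as_posix()
            yield file_path, blob_name
```

Because it is a generator, it does not build a list of tens of millions of paths in memory before the first upload starts.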
I'm investigating whether a feature exists to copy multiple folders (exports from Collections) from an Azure file share to an on-premises Accelerate file share (a Windows share).
Azure Files is indeed supported by the Import/Export service:
"Azure Import/Export service is used to securely import large amounts of data to Azure Blob storage and Azure Files by shipping disk drives to an Azure datacenter"
You can read more about the feature and when it's best used here
Current story:
We are moving our overall BI solution fully to Azure cloud services, building a new Azure DW and loading its data from an Azure DB. Currently, Azure DW doesn't support linked servers or elastic query (these are supported only in Azure DB). Because of price, we cannot use Data Factory or an SSIS instance. We also can't use bcp, as we don't have a local directory to hold the file between loads.
Is it possible to use Azure PowerShell with sqlcmd to write results of a query directly to Azure Storage, without having to write to a file on a local directory in between?
Are there other options that aren't mentioned above?
Thank you for any input.
The current Azure Storage PowerShell cmdlet (Set-AzureStorageBlobContent) only supports uploading a blob from a local file.
The Azure Storage Client Library (https://github.com/Azure/azure-storage-net) supports uploading a blob from a stream. Could you try developing your own application with the Azure Storage Client Library?
If your data is large, you can also try https://github.com/Azure/azure-storage-net-data-movement/, which performs better when uploading big blobs.
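To show the idea of uploading from a stream rather than from a local file, here is a minimal sketch in Python (not the .NET library linked above). `upload_rows_as_csv` and the `upload_stream` callable are hypothetical; in practice the callable could wrap a client method that accepts a stream, such as `BlobClient.upload_blob` in the `azure-storage-blob` Python package. The query rows are assumed to come from whatever database driver you use.

```python
import csv
import io
from typing import BinaryIO, Callable, Iterable, Sequence


def upload_rows_as_csv(
    header: Sequence[str],
    rows: Iterable[Sequence[object]],
    upload_stream: Callable[[BinaryIO], None],
) -> int:
    """Serialize query results to CSV in memory and hand them to an uploader.

    `upload_stream` is a hypothetical callable that receives a binary
    stream; no file is written to a local directory.
    Returns the number of bytes handed over.
    """
    text = io.StringIO(newline="")
    writer = csv.writer(text)
    writer.writerow(header)
    writer.writerows(rows)
    payload = text.getvalue().encode("utf-8")
    # Wrap the bytes in a stream so the uploader can read them like a file.
    upload_stream(io.BytesIO(payload))
    return len(payload)
```

This buffers the whole result set in memory, so it fits moderate extracts; for very large loads a chunked or block-based upload would be the better fit.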
Hi, I'm playing around with HDInsight. I'm putting log files into Azure storage and then using Hive external tables to map onto them. I believe Microsoft recommends Azure storage over HDFS so that you can delete and recreate clusters without losing data. How does its scalability compare with HDFS? My understanding is that HDFS is spread over multiple nodes to allow parallel processing; how does Azure storage compare?
On HDInsight, HDFS storage is based on disks in the physical hosts of the VMs (PaaS VMs called worker roles in Windows Azure).
Windows Azure storage has its own scalability mechanisms. The scalability targets are documented here: http://msdn.microsoft.com/en-us/library/windowsazure/dn249410.aspx
To give you an idea, Windows Azure storage is where an OS disk lives for Windows Azure IaaS VMs.