How to read a file present in Azure Data Lake Store through Azure PowerShell commands? - azure-powershell

Is there any way to read the data of a file present in Azure Data Lake Store with Azure PowerShell cmdlets?

Here is the step-by-step guide for the PowerShell setup itself: https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-get-started-powershell
You can use Get-AzureRmDataLakeStoreItemContent thereafter.

Try Get-AzureRmDataLakeStoreItemContent, which gets the contents of a file in Azure Data Lake Store. An example:
Login-AzureRmAccount
Get-AzureRmDataLakeStoreItemContent -AccountName "yourADLSAccountName" -Path "/input/someFile.txt"
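If you only need to preview part of a large file, the cmdlet also exposes -Head, -Tail, -Offset and -Length parameters (check Get-Help Get-AzureRmDataLakeStoreItemContent for the exact set in your module version). A minimal sketch, reusing the hypothetical account and path from above:
# Preview just the first 10 rows instead of pulling the whole file
Get-AzureRmDataLakeStoreItemContent -AccountName "yourADLSAccountName" -Path "/input/someFile.txt" -Head 10
# In the newer Az module the equivalent cmdlet is Get-AzDataLakeStoreItemContent
Get-AzDataLakeStoreItemContent -AccountName "yourADLSAccountName" -Path "/input/someFile.txt" -Head 10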

Related

Hi, can we store a big XML file into Azure SQL DB using Power Automate?

I need to store an XML file from Blob Storage in Azure SQL DB. Can I put it in a single column in my DB?
To insert XML file data from Blob Storage into a SQL database, follow the steps below:
Step 1: Upload the XML file to Blob Storage.
Step 2: Create a table in the SQL database (a sketch follows these steps).
Step 3: Create a copy activity in Azure Data Factory.
Step 4: Add Blob Storage as the source.
Step 5: Add the SQL database as the sink.
Step 6: Do the mappings.
Step 7: Now you can run the copy activity and get the required output.
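Since the question asks about putting the whole file into a single column, here is a minimal sketch of Step 2 from PowerShell, assuming the SqlServer module is installed and using hypothetical names (dbo.XmlStaging, XmlData) and placeholder connection details:
# Create a staging table with one XML column to hold the whole file
$createTable = @"
CREATE TABLE dbo.XmlStaging (
    Id      INT IDENTITY(1,1) PRIMARY KEY,
    XmlData XML NOT NULL   -- the entire XML document lands in this single column
);
"@
Invoke-Sqlcmd -ServerInstance "yourserver.database.windows.net" `
              -Database "yourDatabase" `
              -Username "yourUser" -Password "yourPassword" `
              -Query $createTable
In Step 6 you would then map the file content to the XmlData column.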

Is it possible to export a single table from an Azure SQL database and save it directly into Azure Blob Storage, without downloading it to a local disk?

I've tried to use SSMS, but it requires a local temporary location for the BACPAC file. I don't want to download it locally; I would like to export a single table directly to Azure Blob Storage.
In my experience, we can import table data from a CSV file stored in Blob Storage, but I haven't found a way to export table data to Blob Storage as a CSV file directly.
You could think about using Data Factory. It can achieve that; please see the tutorials below:
Copy and transform data in Azure SQL Database by using Azure Data Factory
Copy and transform data in Azure Blob storage by using Azure Data Factory
Use Azure SQL Database as the source and choose the table as the source dataset.
Use Blob Storage as the sink and choose DelimitedText as the sink format.
(Screenshots of the source and sink dataset settings omitted.)
Run the pipeline and you will get the CSV file in Blob Storage.
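If you would rather trigger the published pipeline from PowerShell instead of the portal, here is a minimal sketch with the Az.DataFactory cmdlets, using placeholder resource group, factory and pipeline names:
# Start the pipeline run and capture the run ID
$runId = Invoke-AzDataFactoryV2Pipeline -ResourceGroupName "myResourceGroup" `
                                        -DataFactoryName "myDataFactory" `
                                        -PipelineName "ExportTableToBlob"
# Check the status of that run
Get-AzDataFactoryV2PipelineRun -ResourceGroupName "myResourceGroup" `
                               -DataFactoryName "myDataFactory" `
                               -PipelineRunId $runId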
Thanks also to @Hemant Halwai for the tutorials.

Loading a 50 GB CSV file from Azure Blob to Azure SQL DB in less time - performance

I am loading a 50 GB CSV file from Azure Blob to Azure SQL DB using OPENROWSET.
It takes 7 hours to load this file.
Can you please help me with possible ways to reduce this time?
The easiest option IMHO is to just use BULK INSERT. Move the CSV file into Blob Storage and then import it directly using BULK INSERT from Azure SQL. Make sure the Azure Blob Storage account and Azure SQL are in the same Azure region.
To make it as fast as possible:
split the CSV into more than one file (for example using something like a CSV splitter; https://www.erdconcepts.com/dbtoolbox.html looks nice. Never tried it, it just came up in a simple search, but it looks good);
run more than one BULK INSERT in parallel using the TABLOCK option (https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-2017#arguments). If the target table is empty, this will allow multiple concurrent bulk operations in parallel; see the sketch after this list;
make sure you are using a higher SKU for the duration of the operation. Depending on the SLO (Service Level Objective) you're using (S4? P1? vCore?) you will get a different amount of log throughput, up to close to 100 MB/sec. That's the maximum speed you can actually achieve. (https://learn.microsoft.com/en-us/azure/sql-database/sql-database-resource-limits-database-server)
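A minimal PowerShell sketch of the parallel idea, assuming the file has already been split into chunks, a BLOB_STORAGE external data source (here called CsvBlobStore) and its database-scoped credential already exist in the database, and the SqlServer module is installed; all server, table and file names are placeholders:
# Run one BULK INSERT per CSV chunk as a background job
$files = "part01.csv", "part02.csv", "part03.csv", "part04.csv"
$jobs = foreach ($f in $files) {
    Start-Job -ScriptBlock {
        param($file)
        $query = @"
BULK INSERT dbo.TargetTable
FROM '$file'
WITH (DATA_SOURCE = 'CsvBlobStore', FORMAT = 'CSV', FIRSTROW = 2, TABLOCK);
"@
        Invoke-Sqlcmd -ServerInstance "yourserver.database.windows.net" `
                      -Database "yourDatabase" `
                      -Username "yourUser" -Password "yourPassword" `
                      -Query $query -QueryTimeout 65535
    } -ArgumentList $f
}
# Wait for all chunks to finish and surface any errors
$jobs | Wait-Job | Receive-Job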
Please try using Azure Data Factory.
First create the destination table on Azure SQL Database; let's call it USDJPY. After that, upload the CSV to an Azure Storage account. Now create your Azure Data Factory instance and choose Copy Data.
Next, choose "Run once now" to copy your CSV files.
Choose "Azure Blob Storage" as your "source data store", specify your Azure Storage which you stored CSV files.
Provide information about Azure Storage account.
Choose your CSV files from your Azure Storage.
Choose "Comma" as your CSV files delimiter and input "Skip line count" number if your CSV file has headers
Choose "Azure SQL Database" as your "destination data store".
Type your Azure SQL Database information.
Select your table from your SQL Database instance.
Verify the data mapping.
Execute the data copy from the CSV files to the SQL database by confirming the remaining wizard steps.
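For the very first step (pre-creating the USDJPY destination table so the mapping works), a minimal sketch from PowerShell, assuming the SqlServer module and purely hypothetical column names and types; adjust the columns to match your actual CSV layout:
# Create the USDJPY destination table (columns are placeholders)
$createUsdJpy = @"
CREATE TABLE dbo.USDJPY (
    QuoteDate DATETIME2     NOT NULL,
    Rate      DECIMAL(18,6) NOT NULL
);
"@
Invoke-Sqlcmd -ServerInstance "yourserver.database.windows.net" `
              -Database "yourDatabase" `
              -Username "yourUser" -Password "yourPassword" `
              -Query $createUsdJpy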

Getting files and folders in the data lake while reading from Data Factory

While reading Azure SQL table data (which actually consists of directory paths) from Azure Data Factory, how can I use those paths to dynamically get the files from the data lake?
Can anyone tell me what I should give in the dataset?
You could use a Lookup activity to read the path data from Azure SQL, and follow it with a ForEach activity that iterates over the Lookup output. Inside the ForEach, pass @item().<your path column> to your dataset parameter k1.

Azure Databricks storage or data lake

I'm creating a Structured Streaming job that stores its data in a Databricks Delta database. I'm confronted with the option of storing the checkpoint location and the data from the Delta database in either ...
1. a normal dbfs location like "/delta/mycheckpointlocation" and "delta/mydatabase"
2. a mounted directory from a data lake like "/mnt/mydatalake/delta/mycheckpointlocation" and "/mnt/mydatalake/delta/mydatabase"
If I understand correctly, the data in option 1 will be persisted in blob storage, while the data in option 2 will be stored in the data lake (assuming it's mounted on /mnt/mydatalake).
What considerations are there when deciding whether to store things like the checkpoint location and the Delta database in option 1 or option 2?
The DBFS location is part of your workspace, so if you drop the workspace you lose it.
The lake is shared so many things can connect to it, including other Databricks workspaces, or other services (like ADF).
There is no right or wrong to this - pure preference.