Data Factory childItems modified or created date

I have a Data Factory V2 pipeline consisting of Get Metadata and ForEach activities that reads a list of files on an on-premises file share and logs them in a database table. Currently I'm only able to read the file name, but I would also like to retrieve the date modified and/or date created property of each file. Any help, please?
Thank you

According to the MS documentation, both the File system and SFTP connectors support the lastModified property, but we can only get the lastModified of one file or folder at a time.
I'm using the File system connector to do the test. The process is basically the same as in the previous post: we need to add a Get Metadata activity inside the ForEach activity.
These are my local files.
First, I created a table for logging.
create table Copy_Logs (
    Copy_File_Name varchar(max),
    Last_modified datetime
)
In ADF, I select Child Items in the field list of the Get Metadata1 activity to get the file list of the folder.
Then I add the dynamic content @activity('Get Metadata1').output.childItems to the Items setting of the ForEach1 activity.
Inside the ForEach1 activity, I select Last modified in the field list of the Get Metadata2 activity.
In the dataset of the Get Metadata2 activity, I key in @item().name as follows.
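The screenshots aren't reproduced here, so as a stand-in, this is a minimal sketch of what such a parameterized File system dataset could look like in JSON; the dataset, linked service, and folder names are my assumptions, not taken from the original post:

{
    "name": "OnPremFile",
    "properties": {
        "linkedServiceName": {
            "referenceName": "FileServerLinkedService",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "fileName": { "type": "String" }
        },
        "type": "Binary",
        "typeProperties": {
            "location": {
                "type": "FileServerLocation",
                "folderPath": "myfolder",
                "fileName": {
                    "value": "@dataset().fileName",
                    "type": "Expression"
                }
            }
        }
    }
}

The Get Metadata2 activity then fills fileName with @item().name for each file in the loop.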
I use a CopyFiles_To_Azure activity to copy the local files to Azure Data Lake Storage Gen2.
I key in @item().name at the source dataset of the CopyFiles_To_Azure activity.
At the Create_Logs activity, I'm using the following SQL to get the info we need.
select '@{item().name}' as Copy_File_Name, '@{activity('Get Metadata2').output.lastModified}' as Last_modified
In the end, I sink to the SQL table we created previously. The result is as follows.

One way I can think of is to add a new Get Metadata activity inside the ForEach loop, use a parameterized dataset, and pass the file name as the parameter. The below animation should help; I did test the same.
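The animation itself isn't reproduced here; as a rough stand-in, a sketch of what the inner Get Metadata activity could look like in the pipeline JSON (the activity, dataset, and parameter names are assumptions):

{
    "name": "Get Metadata2",
    "type": "GetMetadata",
    "typeProperties": {
        "fieldList": [ "lastModified" ],
        "dataset": {
            "referenceName": "OnPremFile",
            "type": "DatasetReference",
            "parameters": {
                "fileName": {
                    "value": "@item().name",
                    "type": "Expression"
                }
            }
        }
    }
}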
HTH.

Bulk copy multiple csv files from Blob Container to Azure SQL Database

Environment:
MS Azure:
Blob Container, multiple CSV files saved in a folder. This is my source.
Azure SQL Database. This is my target.
Goal:
Use Azure Data Factory and build a pipeline to "copy" all files from the container and store them in their respective tables in the Azure SQL database by automatically creating those tables.
How do I do that? I tried following this, but I just end up with tables incorrectly created in the database, where each table is created with a single column having the same name as the table. I believe I followed the instructions from that link pretty much as they are.
My CSV file is as follows; one column contains the table name.
The previous steps will not be repeated; they are the same as in the link.
At Step 3, inside the ForEach activity, we should add a Lookup activity to query the table name from the source dataset.
We can declare a String-type variable tableName beforehand, then set its value via the expression @activity('Lookup1').output.firstRow.tableName.
At the sink setting of the Copy activity, we can key in @variables('tableName').
ADF will auto-create the table for us.
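As a hedged sketch of that step (activity and variable names are assumptions beyond what the post states), the Set Variable activity inside the ForEach could look like this; the Copy activity's sink table name is then the expression @variables('tableName'):

{
    "name": "Set tableName",
    "type": "SetVariable",
    "typeProperties": {
        "variableName": "tableName",
        "value": {
            "value": "@activity('Lookup1').output.firstRow.tableName",
            "type": "Expression"
        }
    }
}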
The debug result is as follows:

Dynamic filename in Data Factory dataflow source

I'm working with a pipeline that dynamically loads table data from on-premises SQL to data lake CSV files, sinking a .csv file for each table that I've registered for loading in a versionControl table in Azure SQL, using a ForEach.
So, after loading the data, I want to update the versionControl table with the lastUpdate date, based on the MAX(lastUpdate) field of each .csv file loaded. To accomplish that, I know I need to add a data flow after the copy activity so I can use the aggregate transformation, but I don't know how to dynamically pass the file name to the source of the data flow as a parameter.
Thanks!
2 options:
Parameterized dataset. Use a source dataset in the data flow that has a parameter for the file name. You can then pass in that file name as a pipeline parameter.
Parameterized source wildcard. You can also use a source dataset in the data flow that points just to a folder in your container. You can then parameterize the wildcard property in the Source and send in the file name there as a pipeline parameter.
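For option 1, a minimal sketch of what the parameterized source dataset could look like (the names and the ADLS Gen2 location are assumptions); in the Execute Data Flow activity's settings you would then assign the dataset's fileName parameter from a pipeline parameter such as @pipeline().parameters.fileName:

{
    "name": "LakeCsv",
    "properties": {
        "linkedServiceName": {
            "referenceName": "DataLakeLinkedService",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "fileName": { "type": "String" }
        },
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "datalake",
                "folderPath": "exports",
                "fileName": {
                    "value": "@dataset().fileName",
                    "type": "Expression"
                }
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        }
    }
}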

Azure Data Factory 2: How to split a file into multiple output files

I'm using Azure Data Factory and am looking for the complement to the "Lookup" activity. Basically I want to be able to write a single line to a file.
Here's the setup:
Read from a CSV file in blob store using a Lookup activity
Connect the output of that to a For Each
Within the For Each, take each record (a line from the file read by the Lookup activity) and write it to a distinct file, named dynamically.
Any clues on how to accomplish that?
Use a Data Flow: use the Derived Column transformation to create a filename column, then use that filename column in the sink. Details on how to implement dynamic file names in ADF are described here: https://kromerbigdata.com/2019/04/05/dynamic-file-names-in-adf-with-mapping-data-flows/
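A minimal sketch of that derived-column step in data flow script, assuming a hypothetical id column to build the name from; in the sink settings you then set "File name option" to "As data in column" and pick the fileName column:

source1 derive(fileName = concat('row_', toString(id), '.csv')) ~> AddFileName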
Data Flow would probably be better for this, but as a quick hack, you can do the following to read the text file line by line in a pipeline:
Define your source dataset to output a line as a single column. Normally I would use "NoDelimiter" for this, but that isn't supported by Lookup. As a workaround, define it with an incorrect Column Delimiter (like | or \t for a CSV file). You should also go to the Schema tab, and CLEAR the schema. This will generate a column in the output named "Prop_0".
In the ForEach activity, set the Items to the Lookup's "output.value" and check "Sequential".
Inside the ForEach, you can use @item().Prop_0 to grab the text of the line:
To the best of my understanding, creating a blob isn't directly supported by pipelines [hence my suggestion above to look into Data Flow]. It is, however, very simple to do in Logic Apps. If I were tackling this problem, I would create a Logic App with an HTTP Request Received trigger, then call it from ADF with a Web activity and send the text line and dynamic file name in the payload.
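For illustration only, a sketch of such a Web activity call; the Logic App URL is a placeholder, the payload shape is whatever your Logic App's Request trigger expects, and guid() is just one way to make the file name unique:

{
    "name": "Write line via Logic App",
    "type": "WebActivity",
    "typeProperties": {
        "url": "<logic-app-http-trigger-url>",
        "method": "POST",
        "headers": { "Content-Type": "application/json" },
        "body": {
            "value": "{ \"fileName\": \"line_@{guid()}.txt\", \"content\": \"@{item().Prop_0}\" }",
            "type": "Expression"
        }
    }
}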

Archive the file with timestamp using Copy activity in ADF

My requirement is to copy a specific file, based on a wildcard, from a container/folder in the data lake to an Azure database using a Copy activity, and then copy the file into a different folder with a timestamp at the end of the file name.
I used Get Metadata and Filter activities to get the specific file name from the data lake/blob folder to be loaded, but the copy activity to the database and the file movement with timestamp are failing.
Please find the attachment for the steps that were followed.
Can you please help?
Thanks
Found a solution for this. After the Filter activity, I used a ForEach activity, and inside it a Set Variable activity. Using this Set Variable, I was able to archive the source file with a timestamp.
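A sketch of the kind of expression that Set Variable can hold, assuming .csv files and @item().name coming from the Filter output (the variable then feeds the archive Copy activity's sink file name):

@concat(replace(item().name, '.csv', ''), '_', formatDateTime(utcnow(), 'yyyyMMddHHmmss'), '.csv')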

Getting files and folders in the data lake while reading from Data Factory

While reading Azure SQL table data (which actually consists of paths of directories) from Azure Data Factory, how can I use those paths to dynamically get the files from the data lake?
Can anyone tell me what I should give in the dataset?
Screenshot
You could use a Lookup activity to read the data from Azure SQL, then follow it with a ForEach activity. Inside the loop, pass @item().<columnName> (the column that holds the directory path) to your dataset parameter k1.
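A minimal sketch of such a parameterized data lake dataset; everything except the parameter name k1 is an assumption:

{
    "name": "LakeFolder",
    "properties": {
        "linkedServiceName": {
            "referenceName": "DataLakeLinkedService",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "k1": { "type": "String" }
        },
        "type": "Binary",
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "datalake",
                "folderPath": {
                    "value": "@dataset().k1",
                    "type": "Expression"
                }
            }
        }
    }
}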