Adding a dynamic column in copy activity of azure data factory - azure-data-factory-2

I am using a Data Flow in my Azure Data Factory pipeline to copy data from one Cosmos DB collection to another. I am using the Cosmos DB SQL API as the source and sink datasets.
The problem is that, when copying the documents from one collection to the other, I would like to add an additional column whose value is the same as one of the existing keys in the JSON. I am trying the Additional columns option in the source settings, but I cannot figure out how to assign an existing column's value there. Can anyone help with this?

In the case of the copy activity, you can assign an existing column's value to a new column under Additional columns by specifying $$COLUMN as the value and selecting the column whose value should be duplicated.
If you are adding the new column in a data flow instead, you can achieve this with a Derived Column transformation.
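For reference, a rough sketch of what the copy activity's source definition could look like in the pipeline JSON; the source type and the two column names below are assumptions/placeholders, so adjust them to your own dataset and keys:

    "source": {
        "type": "CosmosDbSqlApiSource",
        "additionalColumns": [
            {
                "name": "newColumnName",
                "value": "$$COLUMN:existingKey"
            }
        ]
    }

In the data flow case, the Derived Column expression can simply reference the existing column, for example newColumnName = existingKey.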

Related

SSIS Mapping and Transformation

I'm new to building SSIS packages; in fact, this is my first package. I need to pull data from a database view on an Azure managed instance to an on-premises SQL Server. I have built out the data flow and all. I'm moving data from a database view into another database table, but the destination table has a column that the source doesn't have, hence my destination mapping view looks like this (see attached image). How do I fix this, or what are my options?
If this column needs to stay empty and you don't have it in the source, your best and only option is to leave the mapping as it is. SSIS will simply ignore the column, so no information will be fed into it. That will work.
If you need the column to hold something like the current date, you can add a Derived Column component between your source and destination in the data flow, where you can add the current date or additional columns that come from variables, for example.
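As a rough illustration (the column names and the variable are made up), the entries in the Derived Column editor could look like this:

    Derived Column Name    Expression
    LoadDate               GETDATE()
    SourceSystem           @[User::SourceSystem]

GETDATE() fills the new column with the current date/time, and @[User::SourceSystem] copies the value of a package variable into every row.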
It is self-explanatory that ignore (optional) means the mapping for those columns can be ignored, and if you want the columns to be mapped to a calculated value, you can do it using the Derived Column SSIS component (Reference).
As per your use case, try to use the OLE DB component instead of the ADO.NET component to optimize performance for a relatively large data set.

How to split column value using azure data factory

I have a source CSV file in which one column holds multiple values (data separated by commas). I want to extract that particular column using Data Factory and store those multiple values in a table (in a database) under different column names.
Could you please suggest how I should design that Azure Data Factory pipeline?
You can use the split() function in the data flow Derived Column transformation to split the column into multiple columns and load them into the sink database, as below.
Source transformation:
Derived Column transformation:
Using the split() function, the column is split on the delimiter, which returns an array (see the expression sketch after these steps).
Derived Column data preview:
Here, two new columns are added in the Derived Column transformation, which store the split data from the source column (name).
Select transformation (optional):
In the Select transformation, we can remove columns that are not used in the sink and select only the required columns.
Sink:
Connect the sink to the database and map the columns to load the data.
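A rough sketch of the derived column expressions, assuming the source column is called name and holds two comma-separated values (the new column names are placeholders, and note that array indexes in the data flow expression language are 1-based):

    Column_1 : split(name, ',')[1]
    Column_2 : split(name, ',')[2]

Each expression takes one element of the array returned by split() and writes it to its own new column.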

How to validate Column Names and Column Order in Azure Data Factory

I want to read the column names from a file stored in Azure Files and then validate the column names and their sequence, e.g. "First_Column" = "First_Column", "Second_Column" = "Second_Column", etc.; the order should match as well. Please suggest a way to do this in Azure Data Factory.
Update:
Alternatively, we can use a Lookup activity to read the headers, but the condition will be a little complex.
At If Condition1 we can use the expression: @and(and(equals(activity('Lookup1').output.firstRow.Prop_0,'First_Column'),equals(activity('Lookup1').output.firstRow.Prop_1,'Second_Column')),equals(activity('Lookup1').output.firstRow.Prop_2,'Third_Column'))
We can validate the column names and sequence in a data flow via column patterns in a Derived Column transformation.
For example:
The source data csv file is like this:
The dataflow is like this:
I don't select First row as header, so the headers of the CSV file are read into the data flow as data.
Then I use SurrogateKey1 to add a row_no to the data.
The data preview is like this:
At the ConditionalSplit1 activity, I use row_no == 1 to filter out the header row.
At the DerivedColumn1 activity, I use several column patterns to validate the column names and sequence.
The result is as follows:
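For illustration, one of those column patterns might look roughly like this; the header values are taken from the example above, while the output column name and expression are only an assumption of how the check could be written ($0 is the matched column's name and $$ is its value):

    Matching condition : name == 'Prop_0'
    Output column      : $0 + '_check'
    Expression         : iif($$ == 'First_Column', 'match', 'mismatch')

A similar pattern would be added for Prop_1/'Second_Column', Prop_2/'Third_Column', and so on.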

Create table schema and load data in bigquery table using source google drive

I am creating a table using Google Drive as the source and Google Sheets as the format.
I have selected "Drive" as the value for Create table from. For File format, I selected Google Sheet.
I also selected Auto detect schema and the input parameters.
It creates the table, but the first row of the sheet is loaded as data instead of as the table fields.
Please tell me what I need to do to get the first row of the sheet used as the table column names, not as data.
It would have been helpful if you could include a screenshot of the top few rows of the file you're trying to upload, if only to see the data types you have in there. BigQuery, at least as of when this response was composed, cannot differentiate between column names and data rows if both have similar data types while schema auto-detection is used. For instance, if your data looks like this:
headerA, headerB
row1a, row1b
row2a, row2b
row3a, row3b
BigQuery would not be able to detect the column names (at least automatically using the UI options alone) since all the headers and row data are Strings. The "Header rows to skip" option would not help with this.
Schema auto detection should be able to detect and differentiate column names from data rows when you have different data types for different columns though.
You have an option to skip the header row under Advanced options. Simply put 1 as the number of header rows to skip (your first row is where your header is). It will skip the first row as data and use its values for your header.
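If you prefer to do this outside the UI, the same options can be expressed with a DDL statement along these lines; the dataset, table name, and sheet URL are placeholders, and the schema is left to auto-detection:

    CREATE EXTERNAL TABLE mydataset.my_sheet_table
    OPTIONS (
      format = 'GOOGLE_SHEETS',
      uris = ['https://docs.google.com/spreadsheets/d/your-sheet-id'],
      skip_leading_rows = 1  -- skip the header row when reading data
    );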

copy a field from one data source to other in tableau

I wanted to get rid of one data source in Tableau; that is why, instead of using two different data sources for one dashboard, I wanted to copy all relevant fields from one data source to the other. Is there any way in Tableau to copy and paste those fields from one data source to the other?
In the attached screenshot, I wanted to copy the advisor sales field in data source biadvisorSalesmonth24 to bitransactionPartnerDay365:
You cannot make schema or structure changes to a table / datasource from within Tableau. If advisor sales is not in the bitransactionPartnerDay365 data source, then you will have to keep both data sources in the workbook and join them together.
Now, if you are familiar with the datasets and know the necessary table layout, you could write a custom SQL command and use that SQL command to retrieve the desired data as a single data source.
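For example, a custom SQL data source along these lines could bring both tables in as a single source; the table names and the join key below are placeholders, since the real schema isn't shown:

    SELECT t.*, a.advisor_sales          -- pull advisor sales into the transaction rows
    FROM bitransactionPartnerDay365 t
    JOIN biadvisorSalesmonth24 a
      ON t.partner_id = a.partner_id     -- hypothetical join key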