Update after a Copy Data Activity in Azure Data Factory

I've got a question about Azure Data Factory. My pipeline has a Copy Data activity, and after loading the data into the table I need to update a field in that destination table based on a parameter. It is a simple update, but since we do not have an Execute SQL task (as in SSIS), I do not know what to use. Creating a stored procedure for this does not seem to be the most appropriate solution; besides, modifying the database is complicated. I thought the "Use Query" option in the Lookup activity could be a solution, but it does not let me write a SQL query with a parameter the way a Source does.
What could be a possible workaround?

You are on the right track with the Lookup; that is definitely the way to go. Its query field allows dynamic SQL just like the copy activity does; you just need to reference the variable/parameter properly.
Also, a Lookup always expects something to be returned. You don't have to do anything with that returned value, just ignore it, but the Lookup will not work without returning something. So the query field would contain something like:
UPDATE dbo.MyTable SET IsComplete = 1 WHERE RunId = @{pipeline().parameters.runId};
SELECT 0 AS DummyValue; -- Necessary for Lookup to work
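If the parameter were a string rather than a number, the interpolated value would also need quotes inside the SQL text. A minimal sketch, assuming a hypothetical string parameter runName:
UPDATE dbo.MyTable SET IsComplete = 1 WHERE RunName = '@{pipeline().parameters.runName}';
SELECT 0 AS DummyValue; -- Still needed for the Lookup to work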

Related

ADO SQL query create column if it doesn't exist

I have a query for a report based on an MS Access database (which is the program's project file). The tables in this database get updated with new fields periodically as new features are added.
We need to be able to support old and new versions of the file in our report, so I need to know if there is a way to insert a field into the SQL SELECT query if it does not already exist. (Note: I do not want to use ALTER TABLE statements, as the field only needs to be added to the result set, not to the table permanently.)
I know you can do something like "" AS [FieldName], but that only applies when you know the field doesn't exist and need to create a blank spot for it (such as when the other table in a UNION does have that field). In this case, the table might have the field, so I want to use it if it does; if it doesn't, I want it to still exist in the query results with a default value.
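For reference, the padding trick mentioned above looks like this in a UNION, with hypothetical table names (the blank literal stands in for the column the older table lacks):
SELECT Field1, [FieldName] FROM NewTable
UNION ALL
SELECT Field1, "" AS [FieldName] FROM OldTable;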
Any help would be appreciated. (I also know you can force the user to update the file, but that option was stated as "only a last resort".)
Thanks,
Chris

Use SQL Field in SSIS Variable

Is it possible to reference a SQL field in your SSIS variable?
For instance, I would like to use the field from the "table" below
Select '999999' AS Physician_Profile_ID
as a dynamic variable (named "CMSPhysProID" in our example) here
I plan on concatenating multiple IDs into an IN statement.
Possible by using an Execute SQL Task. In the Execute SQL Task's General tab:
1. Set the result set to "Single row".
2. Set the connection type to OLE DB.
3. Set the connection and enter the SQL statement, as you mentioned: Select '999999' AS Physician_Profile_ID.
4. Go to the Result Set page in the left pane.
5. Add the variable where you want to store '999999'.
6. Click OK.
If you are looking to store the value within a variable to be used later, you can simply use an Execute SQL Task with a single-row result set. More details in the following article:
SSIS Basics: Using the Execute SQL Task to Generate Result Sets
If you are looking to add a computed column while importing data, you must use a Derived Column Transformation within the data flow task to add a column based on another one. You can refer to the following article for more details about this component:
SSIS Derived Columns with Multiple Expressions vs Multiple Transformations
What are you trying to accomplish by concatenating the IDs into an IN statement? If the idea is to use the IDs to limit the results, as a dynamic WHERE clause, you may have better luck with a lookup against either a table you maintain with the desired IDs or a static list generated in the package with a script task. (If you can use the lookup-table method, it will be much easier to maintain, as you only have to update a table, not your source code.)
Alternatively, you may even be able to accomplish the goal with a join: create a temp table from the profile IDs you want to keep and join to it (or, again, use it as a lookup component), as in the sketch below. Dynamically building a WHERE clause with IN will run a lot slower and be cumbersome to maintain.
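A minimal T-SQL sketch of that temp-table join, with hypothetical table and column names:
-- Keep-list of profile IDs, maintained as data rather than source code
CREATE TABLE #KeepIDs (Physician_Profile_ID VARCHAR(20) PRIMARY KEY);
INSERT INTO #KeepIDs (Physician_Profile_ID) VALUES ('999999'), ('888888');
-- Filter the source by joining instead of building a dynamic IN list
SELECT s.*
FROM dbo.SourceTable AS s
INNER JOIN #KeepIDs AS k
    ON k.Physician_Profile_ID = s.Physician_Profile_ID;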

How can I schedule a script in BigQuery?

At last BigQuery supports using ; in queries, so I can write more than one query in one "block" if I separate them with semicolons.
If I run the code manually, it works. But I cannot schedule it.
When I want to schedule, I have two choices:
(New) Web UI: I must provide a destination table. If I don't, I cannot save the scheduled query. But all my queries are updates and inserts with different "destination tables", like these:
UPDATE project.exampledataset.a
SET date = current_date()
WHERE TRUE
;
INSERT INTO project.otherdataset.b
SELECT c,d
FROM project.otherdataset.c
So I cannot even set up the schedule in the Web UI.
Classic UI: I tried this because the official documentation states that I should leave the "destination table" blank, and the Classic UI allows it. I can set up the schedule, but it doesn't run when it should. I get this error message by email: "Error status: Dataset specified in the query ('') is not consistent with Destination dataset 'exampledataset'."
AFAIK scripting (and using semicolons) is a very new feature in BigQuery, but I hope someone can help me.
Yes, I know I could schedule every query one by one, but I would like to solve this with one big script.
It looks like the scheduled query was defined earlier with a destination dataset and an APPEND/TRUNCATE write disposition. When updating that scheduled query to a DML query, the GUI doesn't reset the dataset and table fields to NULL, so the error comes from the previously set destination dataset and table name still attached to the scheduled query.
Hence the fix is to delete the scheduled query and recreate it from scratch with the DML query option. It worked for me.
Scripting is now supported in scheduled queries. However, a scripted query, when scheduled, doesn't support setting a destination table for now. You still need to use DDL/DML to make changes to an existing table.
E.g.:
CREATE OR REPLACE TABLE destinationTable AS
SELECT *
FROM sourceTable
WHERE date >= maxDate
As of 2022, the BQ Console UI will let you create a new scheduled query without a destination dataset, but it won't let you update a prior SELECT-based schedule to use DDL/DML block syntax. However, you can use the BigQuery Data Transfer API to update the destinationDatasetId field, via transferconfigs/patch. Use transferconfigs/list to get the configId for a given scheduled query.
Note that you can either use the in-browser API Explorer, if you have the appropriate credentials, or write a programmatic solution. This also seems useful for setting or updating any other fields, including renaming scheduled queries.

MS Access SQL (Quickbooks) Update Query

I'm trying to create an Update Query in MS Access (2013) against a QuickBooks database using QODBC.
I need to update the table PriceLevelPerItem. I am trying to update the field in that table called PriceLevelPerItemCustomPrice with a value from another table, QueryThreeTable, from its column titled UpdatedPrice.
I need to update PriceLevelPerItem where PriceLevelPerItemItemRefListID matches the ItemID from QueryThreeTable and ListID matches QueryThreeTable.ItemListID (yes, I know these are named the wrong way around...).
So far this process has been a very annoying trial of many queries, and any help would be greatly appreciated.
This is what I've been working with:
UPDATE
PriceLevelPerItem
SET
(PriceLevelPerItemCustomPrice = QueryThreeTable.UpdatedPrice)
FROM
QueryThreeTable, PriceLevelPerItem
WHERE
QueryThreeTable.ItemID = PriceLevelPerItem.PriceLevelPerItemItemRefListID
AND
QueryThreeTable.ItemListID = PriceLevelPerItem.ListID;
I think the problem is that you're trying to use a DAO query inside a QODBC query. I think the two use different Data Access engines.
You're going to need to look up your UpdatedPrice in QueryThreeTable using DLookup. Or maybe you need to create a DAO loop over QueryThreeTable that updates the values in your QODBC table from there.
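For illustration only, the DLookup idea might look roughly like this in Access SQL. This is a sketch using the asker's column names: rows with no match in QueryThreeTable would be set to Null, and whether the domain function works against the QODBC-linked table is exactly the engine-mixing caveat above.
UPDATE PriceLevelPerItem
SET PriceLevelPerItemCustomPrice =
    DLookUp("UpdatedPrice", "QueryThreeTable",
        "ItemID = '" & [PriceLevelPerItemItemRefListID] & "' AND ItemListID = '" & [ListID] & "'");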
Make your QODBC query work without QueryThreeTable and without any joins, then come up with a way to create that query dynamically. The resulting SQL should look something like this:
UPDATE
PriceLevelPerItem
SET
PriceLevelPerItemCustomPrice = 150.16
WHERE
PriceLevelPerItem.ListID = '310000-1146238368';

SSIS - fill unmapped columns in table in OLE DB Destination

As you can see in the image below, I have a table in SQL Server that I am filling via a flat file source. There are two columns in the destination table that I want to update based on the logic listed below:
SessionID - all rows from the first CSV import will have a value of 1; the second import will have a value of 2, and so on.
TimeCreated - datetime value of when the CSV imports happened.
I don't need help with how to write the TSQL code to get this done. Instead, I would like someone to suggest a method to implement this as a Data Flow task within SSIS.
Thank you in advance for your thoughts.
Edit 11/29/2012
Since all answers so far suggested taking care of this on the SQL Server side, I wanted to show what I had initially tried (see image below), but it did not work: the trigger did not fire in SQL Server after SSIS inserted the data into the destination table.
If any of you can explain why the trigger did not fire, that would be great.
If you are able to modify the destination table, you could make default values for SessionID and TimeCreated do all the work for you: SessionID would be an auto-incrementing integer, while the default value for TimeCreated would be GETDATE() or SYSDATETIME(), depending on the data type.
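A minimal T-SQL sketch of the default-value idea for TimeCreated (hypothetical table and constraint names):
ALTER TABLE dbo.ImportTarget
    ADD CONSTRAINT DF_ImportTarget_TimeCreated DEFAULT (GETDATE()) FOR TimeCreated;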
Now, if you truly need the values to be created as part of your workflow, you can use variables for each.
SessionID would be a package variable which is set by an Execute SQL Task. Just reference the variable in your result set and have your SQL determine the next number to use. There are potential concurrency issues with this, though.
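The query behind that Execute SQL Task could be as simple as this sketch (hypothetical table name; map the single-row result to your SessionID package variable):
SELECT ISNULL(MAX(SessionID), 0) + 1 AS NextSessionID
FROM dbo.ImportTarget;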
TimeCreated is easily done by creating a Derived Column in your data flow based on the system variable StartTime.
You can use a Derived Column to fill the TimeCreated column. If you want the time at which each data flow ran, use the date and time functions to get the current datetime; if you want a common timestamp for the whole package (all files), use the system variable @[System::StartTime].
For looping over the CSVs (I guess), use a Foreach Loop container and map an iterative value to a user variable, which you then use in the Derived Column for SessionID as mentioned above.
First, I'd say it's better to do it on the SQL Server side :)
But if you don't want to, or cannot, do it on the server side, you can use this approach:
Obviously you need to store the SessionID somewhere; you can create a txt file for that or, better, a settings table in SQL Server, though other approaches exist as well.
To add the SessionID and TimeCreated columns to the OLE DB Destination, you can use Derived Columns.