I'm looking to implement Type 2 Slowly Changing Dimension behavior in my dimension table using PL/SQL's MERGE statement. It works just fine for updating existing values and inserting new ones. I'd like to extend this so that, rather than only updating existing values, it creates a new row with the updated values while preserving the row with the "outdated" values.
In short, is it possible to do this?
MERGE INTO A
USING B
ON (A.ID = B.ID)
WHEN MATCHED THEN
UPDATE END_DATE ON THE EXISTING ROW;
INSERT UPDATED VALUES IN A NEW ROW;
WHEN NOT MATCHED THEN
INSERT A NEW ROW WITH NEW VALUES;
Thanks in advance, guys.
If you are truly doing a Type 2 dimension and want to use a MERGE, yes, it can be done, but it's not terribly straightforward. Effectively, you need to compare your data using an inline view in the USING clause that includes a column indicating whether each row is an insert or an update. That column is joined to the slowly changing dimension table and drives whether an insert or an update occurs.
This blog post describes the technique in great detail and has worked for us, albeit we used a hash to determine uniqueness as opposed to a column-by-column compare.
Load Slowly Changing Dimension Type 2 using Oracle Merge Statement
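The hash comparison mentioned above can be sketched roughly as follows. This is only an illustration in Python, not the blog post's code: the function names `row_hash` and `classify`, the choice of SHA-256, and the separator are all assumptions made for the example.

```python
import hashlib

def row_hash(row, columns):
    """Collapse the tracked attribute values of one row into a single digest."""
    # A separator keeps ("ab", "c") and ("a", "bc") from hashing the same;
    # NULLs get an explicit marker so they differ from empty strings.
    parts = ["\x00NULL" if row[c] is None else str(row[c]) for c in columns]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()

def classify(source_row, current_dim_row, columns):
    """Return 'insert', 'update' or 'unchanged' for one source row."""
    if current_dim_row is None:
        return "insert"  # no current dimension row for this key
    if row_hash(source_row, columns) != row_hash(current_dim_row, columns):
        return "update"  # attributes changed: end-date old row, add new one
    return "unchanged"

tracked = ["name", "city"]
action = classify({"name": "Ann", "city": "Oslo"},
                  {"name": "Ann", "city": "Bergen"}, tracked)
```

Comparing one digest per row avoids writing a long chain of column comparisons, at the cost of computing the hash on both sides.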
You need to run two statements instead of 1 here.
(in your pseudo-code)
UPDATE END_DATE IN A WHERE A.ID = something_from_B
INSERT new VALUES IN A
The insert needs to happen no matter what, so there is no need to check whether the row exists (and to update only if it does). Just end-date all records that match B; this end-dates anywhere from 0 to n rows. Then insert everything from B in one go, without worrying about whether a given record has already been end-dated.
Put this inside a BEGIN...END; if you want transactional atomicity.
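A minimal sketch of the two-statement approach, using Python's built-in `sqlite3` as a stand-in for Oracle. The column names (`val`, `start_date`, `end_date`) are assumptions, and so is the `end_date IS NULL` filter, which I added so that rows already closed in an earlier load keep their original end date.

```python
import sqlite3

def apply_scd2(conn, load_date):
    """End-date the current rows in A that have a match in B, then insert all of B."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "UPDATE a SET end_date = ? "
            "WHERE end_date IS NULL AND id IN (SELECT id FROM b)",
            (load_date,),
        )
        conn.execute(
            "INSERT INTO a (id, val, start_date, end_date) "
            "SELECT id, val, ?, NULL FROM b",
            (load_date,),
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT, start_date TEXT, end_date TEXT);
    CREATE TABLE b (id INTEGER, val TEXT);
    INSERT INTO a VALUES (1, 'old', '2020-01-01', NULL);
    INSERT INTO b VALUES (1, 'new'), (2, 'brand new');
""")
apply_scd2(conn, "2021-01-01")
rows = conn.execute(
    "SELECT id, val, start_date, end_date FROM a ORDER BY id, start_date"
).fetchall()
```

After the call, id 1 has a closed 'old' row and an open 'new' row, and id 2 has a single open row.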
In short - no. A MERGE statement provides the option to INSERT if matching data is not found, and to UPDATE if matching data is found. It does NOT give the option to "update-and-insert" if matching data is found.
Best of luck.
As I understand it, you wrote pseudocode, so I can only suggest an idea, also in pseudocode:
MERGE INTO A
USING (select * from B1 union all
select * from B2) B
ON (A.ID = B.ID)
WHEN MATCHED THEN
UPDATE END_DATE ON THE EXISTING ROW FROM B1;
WHEN NOT MATCHED THEN
INSERT A NEW ROW WITH NEW VALUES FROM B2;
If you want to insert updated rows as new ones, you can build a subquery based on B that contains the "updated rows", but those rows have to be treated as new by the SQL engine (it is impossible to say how to do that without details about your tables). In my query, B1 is "rows to update" and B2 is "updated rows which have to look new". If you do this, the updated rows will be inserted.
Also, there is no guarantee that this can be implemented in your case.
Related
So I need to figure out how to insert into a table, from another table, with a where clause that requires me to access the table that I am inserting into. I tried an alias from the table I am inserting into, but I quickly found out that you cannot do that. Basically, what I want to check is that the values that I am inserting into the table match a particular field within the table that I am inserting into. Here is what I've tried:
INSERT INTO "USER"."TABLE1" AS A1
SELECT *
FROM "USER"."TABLE2" AS A2
WHERE A2."HIERARCHYLEVEL" = 2
AND A2."PARENT" = A1."INSTANCE"
Obviously, this was to no avail. I've tried a couple of other queries, but they didn't get me anywhere, either. Any help would be much appreciated.
EDIT:
I would like to add rows to this table, not add columns to the table. The two tables have exactly the same structure -- in fact, I extracted the data already in table1 from table2. What I have in table1 currently is a bunch of records that have NO PARENT, but an instance. What I want to add is all the records in table2 whose parent is equal to an instance in table1.
Currently there is no way to join on a table when inserting. The solution with the subselect, where you select from the table, is the correct one.
Aliasing the table you want to change is only possible with UPDATE, UPSERT and MERGE. For these operations it makes sense, as you need to match a column and then decide whether to update it or insert something instead. In your example the row from table1 that you match is not relevant, as you don't want to change it, so from the statement's point of view it does not matter that the table you use in your subselect is the same as the one you insert into.
As an alternative, I can suggest the following solution, which is equivalent to yours:
INSERT INTO "user"."table1"
SELECT
A1."ROOT",
A1."INSTANCE",
A1."PARENT",
A1."HIERARCHYLEVEL"
FROM "user"."table2" AS A1
WHERE A1."INSTANCE" in (select "PARENT" from "user"."table1")
AND A1."HIERARCHYLEVEL" = 2
This gave me the answer I was looking for, although I am sure there is an easier -- or more efficient -- way to do it.
INSERT INTO "user"."table1"
SELECT
A1."ROOT",
A1."INSTANCE",
A1."PARENT",
A1."HIERARCHYLEVEL"
FROM "user"."table2" AS A1,
"user"."table1" AS A2
WHERE A1."INSTANCE" = A2."PARENT"
AND A2."HIERARCHYLEVEL" = 2
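One difference between the comma join and the subselect may matter in practice: a join returns one row per matching pair, so a table2 row is selected (and inserted) once for every table1 row it matches, whereas `IN` selects it at most once. A small sketch in Python's `sqlite3`, using made-up data and unquoted lowercase table names as assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (root TEXT, instance TEXT, parent TEXT, hierarchylevel INTEGER);
    CREATE TABLE table2 (root TEXT, instance TEXT, parent TEXT, hierarchylevel INTEGER);
    -- two level-2 rows in table1 share the same parent 'i1'
    INSERT INTO table1 VALUES ('r', 'c1', 'i1', 2), ('r', 'c2', 'i1', 2);
    INSERT INTO table2 VALUES ('r', 'i1', NULL, 1), ('r', 'i9', NULL, 1);
""")

# Join form: 'i1' matches both table1 rows, so it comes back twice.
join_rows = conn.execute("""
    SELECT a1.root, a1.instance, a1.parent, a1.hierarchylevel
    FROM table2 AS a1
    JOIN table1 AS a2 ON a1.instance = a2.parent
    WHERE a2.hierarchylevel = 2
""").fetchall()

# Subselect form: 'i1' qualifies once, however many table1 rows point at it.
in_rows = conn.execute("""
    SELECT a1.root, a1.instance, a1.parent, a1.hierarchylevel
    FROM table2 AS a1
    WHERE a1.instance IN (SELECT parent FROM table1 WHERE hierarchylevel = 2)
""").fetchall()
```

If duplicates are not wanted, the subselect (or a `SELECT DISTINCT` on the join) is probably the safer shape for the INSERT.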
I have a stored procedure which is doing the following.
The populated target table data is checked against several similar source tables for a match (based on name and address data). If a match is found in the first table then it updates the target with a flag identifying which source table the match was from. However if it doesn't find a match I need it to look in the next source table and the next until either a match is found or not as the case may be.
Is there an easy way for the UPDATE statement to provide some kind of return value I can query to say whether it updated the target table? I would like to use this return value so that I can skip checking subsequent source tables unnecessarily.
Otherwise will I have to perform the conditional UPDATE then do a separate query to determine if the UPDATE actually updated the flag?
Probably the safest approach is to use the OUTPUT clause. This will return the modified rows into a new table.
You can check the table to see if any rows have been updated.
One advantage of the OUTPUT clause is that you can update multiple rows at the same time.
I like Gordon's solution, but I do not think you actually need it.
Simply run the updates in order:
UPDATE BASE_TABLE
SET FLAG = 'first_table'
WHERE FLAG IS NULL AND
EXISTS (SELECT 1 FROM first_table f1 WHERE f1.ID = BASE_TABLE.ID);
UPDATE BASE_TABLE
SET FLAG = 'second_table'
WHERE FLAG IS NULL AND
EXISTS (SELECT 1 FROM second_table f2 WHERE f2.ID = BASE_TABLE.ID);
...
And so on.
You don't need to check every row conditionally; that would be very slow.
You can put your update in a try/catch and insert your result into another table.
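The ordered updates can be sketched like this in Python's `sqlite3`, which also shows one way to get a "did it update anything" value: the cursor's `rowcount`. The `id` join column follows the example above; the table contents are invented, and in SQL Server the comparable value would be `@@ROWCOUNT` rather than anything shown here.

```python
import sqlite3

SOURCES = ("first_table", "second_table")  # checked in priority order

def flag_matches(conn):
    """Run one set-based UPDATE per source table; return rows updated per source."""
    counts = {}
    with conn:
        for source in SOURCES:
            # Table names cannot be bound as parameters, so they come from
            # the fixed tuple above and never from user input.
            cur = conn.execute(
                f"UPDATE base_table SET flag = ? "
                f"WHERE flag IS NULL AND EXISTS "
                f"(SELECT 1 FROM {source} AS s WHERE s.id = base_table.id)",
                (source,),
            )
            counts[source] = cur.rowcount  # rows this UPDATE changed
    return counts

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_table (id INTEGER PRIMARY KEY, flag TEXT);
    CREATE TABLE first_table (id INTEGER);
    CREATE TABLE second_table (id INTEGER);
    INSERT INTO base_table (id) VALUES (1), (2), (3);
    INSERT INTO first_table VALUES (1);
    INSERT INTO second_table VALUES (1), (2);
""")
counts = flag_matches(conn)
flags = conn.execute("SELECT id, flag FROM base_table ORDER BY id").fetchall()
```

Because each UPDATE filters on `flag IS NULL`, rows matched by an earlier source are skipped automatically, so no per-row branching is needed.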
I need to INSERT a row in table_A depending on the information in a row in table_B.
Is it possible to do this in an isolated way where the SELECT retrieval of the row from table B is locked until either the new row is INSERTed into table_A or the INSERT is skipped due to the information in table_B's row?
It's really not clear what you are asking. I think your problem can be solved by using a trigger.
See this site to learn more about triggers:
http://www.codeproject.com/Articles/25600/Triggers-SQL-Server
You can do this:
INSERT INTO A (columns) SELECT columns FROM B WHERE condition;
The columns retrieved by the query must match the columns defined in table A.
PostgreSQL supports MVCC; custom locking can be done, but it is not recommended.
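A sketch of the single-statement form, again with Python's `sqlite3` standing in for PostgreSQL. The idea is that the read of table_b and the write to table_a happen in one statement, so there is no gap between a separate SELECT and INSERT to protect. The columns `payload` and `status` and the `'ready'` condition are assumptions for illustration; explicit row locking is not shown.

```python
import sqlite3

def insert_if_eligible(conn, b_id):
    """Insert into table_a from one table_b row, but only if that row qualifies."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO table_a (b_id, payload) "
            "SELECT id, payload FROM table_b WHERE id = ? AND status = 'ready'",
            (b_id,),
        )
    return cur.rowcount == 1  # True if a row was inserted, False if skipped

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (b_id INTEGER, payload TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, payload TEXT, status TEXT);
    INSERT INTO table_b VALUES (1, 'x', 'ready'), (2, 'y', 'hold');
""")
inserted_1 = insert_if_eligible(conn, 1)
inserted_2 = insert_if_eligible(conn, 2)
a_rows = conn.execute("SELECT b_id, payload FROM table_a").fetchall()
```

The return value tells the caller whether the insert happened or was skipped because of the table_b row's contents.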
I am trying to execute a query within a SQL trigger.
I have 4 tables A, B, C, D. Table A is a lookup list and contains roughly 1400 rows of data. Table B are values being input through an HMI with a timestamp. Table C is the table where my values are intended to go. Table D is a list of multipliers to use to multiply values from table A to table B (I am only using one multiplier from table D at the moment).
When a user inputs data into table B, that should trigger the procedure to get the values that were inserted (including the itemnumber) and relate the itemnumber to table A and use table D to multiply a few things together to send values to Table C. If I only input 3 rows of data in table B for example, I should only get three rows of data in table C. I am merely using table A to match the item number and get some data. But for some reason I am inserting way more records than intended, over 1600 rows.
Table D multipliers have a timestamp that does not match or have any correlation with any other table. So I am using a timestamp and selecting the multipliers that are closest to the timestamp from table B (some multipliers will change throughout time and I need a historical multiplier to correctly multiply the right things together)
Your help is most appreciated. Thank you.
Insert into TableC( ItemNumber, Cases, [Description], [Type], Wic, Elc, TotalElc, LbsPerCase, TotalLbs, PeopleRequired, ScheduleHours, Rated, Capacity, [TimeStamp])
Select
b.ItemNumber, b.CaseCount, a.ItemDescription, a.DivisionCode, a.workcenter,
a.LaborPercase as ELC, b.CaseCount * a.LaborPerCase * d.IpCg,
a.LbsPerCase, a.LaborPerCase * b.CaseCount as TotalLbs,
a.PersonReqd, b.Schedulehours, a.PoundRating,
b.ScheduleHours * a.PoundRating as Capactity, b.shift, GETDATE()
from
TableA a, TableB b, TableD
Where
a.itemnumber = b.itemnumber
and d.IpCG < b.TimeStamp
and b.CasesCount > 0
You do not reference the inserted or deleted tables, which are available only inside the trigger, so of course your query returns more records than you need.
When first writing a trigger, what I do is create a temp table called #inserted (and/or #deleted) and populate it with several records. It should match the design of the table that the trigger will be on. It is important that your temp table has several input records that meet the various criteria that affect your query (so in your case you want some where the case count is 0 and some where it is not, for instance) and that are typical of the data inserted into or updated in the table. SQL Server triggers operate on sets of data, so this also ensures that your trigger can properly handle multi-record inserts or updates. A properly written trigger has test cases you need to run to make sure everything happens correctly, and your #inserted table should include records that cover all of those test cases.
Then write the query in a transaction (and roll it back while you are testing) joining to #inserted. If you are doing an insert with a select, only write the select part until you get that right, then add the insert. For testing, write a select from the table you are inserting to in order to see the data you inserted before you rollback.
Once you get everything working, change the #inserted references to inserted, remove any testing code and of course the rollback (possibly the whole transaction, depending on what you are doing), and add the drop and create trigger part of the code. Now you can test your trigger as a trigger, but you are in good shape because you know from your earlier testing that it is likely to work.
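To illustrate why driving the query from the newly inserted rows fixes the row count, here is a small sketch in Python's `sqlite3`. It is only an analogy: SQLite triggers fire once per row and expose that row as `NEW`, whereas a SQL Server trigger fires once per statement and would join to the `inserted` table instead. The reduced column set and the trigger name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (itemnumber INTEGER PRIMARY KEY, laborpercase REAL);
    CREATE TABLE table_b (itemnumber INTEGER, casecount INTEGER);
    CREATE TABLE table_c (itemnumber INTEGER, cases INTEGER, totalelc REAL);

    -- The SELECT is driven by NEW (the row just inserted into table_b),
    -- not by the whole of table_b, so each insert yields at most one row.
    CREATE TRIGGER trg_b_insert AFTER INSERT ON table_b
    WHEN NEW.casecount > 0
    BEGIN
        INSERT INTO table_c (itemnumber, cases, totalelc)
        SELECT NEW.itemnumber, NEW.casecount, NEW.casecount * a.laborpercase
        FROM table_a AS a
        WHERE a.itemnumber = NEW.itemnumber;
    END;

    INSERT INTO table_a VALUES (1, 0.5), (2, 1.5), (3, 2.0), (4, 9.9);
    INSERT INTO table_b VALUES (1, 10), (2, 4), (3, 0);
""")
c_rows = conn.execute(
    "SELECT itemnumber, cases, totalelc FROM table_c ORDER BY itemnumber"
).fetchall()
```

Three rows go into table_b and only the two with a positive case count reach table_c, regardless of how many rows the lookup table holds.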
Working with DB2 but guess this applies to SQL in general.
I have two tables; Table1 contains data where the Fix column could be same in multiple rows. Table2 has unique rows and has data I want to add to columns in the first table if one or more matches between the Fix column in Table1 and the Title column in Table2 are found.
I'm getting an issue in that the SQL is returning an error saying: "The same row of target table "xxxxxxx" was identified more than once for an update, delete or insert operation of the MERGE statement.."
Now that is expected, i.e. I know there are multiple rows in the target table that match the criteria and need to have the data from the source table applied to them.
I'm using MERGE but is that just not going to be possible? Been looking at GROUP BY too but can't get it to work.
If I was doing this in another language I'd go through the source table, build a collection of matching records from the source then iterate through the collection updating that with the source data that needed adding. Thinking there is a more efficient way in SQL though?
This code is completely wrong (and returns the error above) but adding it here to help any good person who wants to lend a hand :-)
CREATE PROCEDURE Update_RawData_With_xKey_Data ()
P1: BEGIN
MERGE INTO table1 AS T
USING table2 AS S
ON (T.FIX = S.TITLE)
WHEN MATCHED THEN
UPDATE SET
T.Rating = S.Rating,
T.GSDS = S.GSDS_Date,
ELSE IGNORE ;
END P1
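One thing that may be worth checking before reworking the MERGE: as far as I know, this error is raised when a single target row matches more than one source row, not when one source row matches many target rows. That would mean Title in Table2 is not as unique as assumed. A quick duplicate check, sketched with Python's `sqlite3` and invented sample data (the GROUP BY query itself should carry over to DB2):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (title TEXT, rating INTEGER, gsds_date TEXT);
    INSERT INTO table2 VALUES
        ('A', 1, '2020-01-01'),
        ('B', 2, '2020-02-01'),
        ('B', 3, '2020-03-01');  -- 'B' appears twice: an ambiguous source
""")

# Any title listed here would match the same target row more than once.
duplicates = conn.execute(
    "SELECT title, COUNT(*) AS n FROM table2 "
    "GROUP BY title HAVING COUNT(*) > 1"
).fetchall()
```

If this returns rows, the source needs to be reduced to one row per title (for example by picking the latest date) before it is used in the USING clause.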