MERGE with WHERE clause - sql

Consider data existing in a table:
Customers
| CustomerID | Name | Status |
|------------|-----------------|--------------------|
| 1 | Ian Boyd | Killed |
| 2 | Shelby Hawthorn | Booked |
And rows I would like to MERGE into the Customers table:
| CustomerID | Name | Status |
|------------|-----------------|--------------------|
| 1 | Ian Boyde | Waiting | name has 'e' on the end
| 2 | Shelby Blanken | Waiting | different last name
| 3 | Jessica Bogden | Waiting | totally new row
So I can come up with an approximate pseudocode MERGE statement:
MERGE Customers USING (
SELECT CustomerID, Name, 'Waiting' FROM Staging) foo
ON Customers.CustomerID = foo.CustomerID
WHEN MATCHED THEN
UPDATE SET Name = foo.Name, Status = foo.Status
WHEN NOT MATCHED BY TARGET THEN
INSERT (Name, Status)
VALUES (Name, Status);
And that would MERGE them:
| CustomerID | Name | Status |
|------------|-----------------|--------------------|
| 1 | Ian Boyde | Waiting | Last name spelling updated
| 2 | Shelby Blanken | Waiting | Last name changed
| 3 | Jessica Bogden | Waiting | New row added
But only UPDATE some rows
The caveat is that I don't want to update any existing rows for customers who are Booked. In other words, I want the final results to be:
| CustomerID | Name | Status |
|------------|-----------------|--------------------|
| 1 | Ian Boyde | Waiting | updated existing row spelling
| 2 | Shelby Hawthorn | Booked | not updated because they're booked
| 3 | Jessica Bogden | Waiting | inserted new row
My first guess would be for the UPDATE to have a WHERE clause:
MERGE Customers USING (
SELECT CustomerID, Name, 'Waiting' FROM Staging) foo
ON Customers.CustomerID = foo.CustomerID
WHEN MATCHED THEN
UPDATE SET Name = foo.Name, Status = foo.Status
WHERE Status <> 'Booked' -- <--------- it's the matching row; but don't update it
WHEN NOT MATCHED BY TARGET THEN
INSERT (Name, Status)
VALUES (Name, Status);
But that's not a valid syntax.
My second guess would be to add the criteria to the ON clause:
MERGE Customers USING (
SELECT CustomerID, Name, 'Waiting' FROM Staging) foo
ON Customers.CustomerID = foo.CustomerID
AND Customers.Status <> 'Booked'
WHEN MATCHED THEN
UPDATE SET Name = foo.Name, Status = foo.Status
WHEN NOT MATCHED BY TARGET THEN
INSERT (Name, Status)
VALUES (Name, Status);
But now the booked row would not match, and it would get inserted under the NOT MATCHED BY TARGET rule:
| CustomerID | Name | Status |
|------------|-----------------|--------------------|
| 1 | Ian Boyde | Waiting | updated existing row
| 2 | Shelby Hawthorn | Booked | not matched because booked
| 3 | Jessica Bogden | Waiting | inserted new row
| 4 | Shelby Blanken | Waiting | Mistakenly inserted because not matched by target
What's the way out of the conundrum?

The key is that you want to make sure that the record falls into the MATCHED logic, otherwise it will generate a new row via the NOT MATCHED logic.
To do this, using your code, we add your criteria to the MATCHED logic:
MERGE Customers USING (
SELECT CustomerID, Name, 'Waiting' AS Status FROM Staging) foo -- alias the literal so foo.Status exists
ON Customers.CustomerID = foo.CustomerID
WHEN MATCHED AND Customers.Status <> 'Booked' THEN -- booked rows still match, but are not updated
UPDATE SET Name = foo.Name, Status = foo.Status
WHEN NOT MATCHED BY TARGET THEN
INSERT (Name, Status)
VALUES (foo.Name, foo.Status);
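This tells the MERGE to match everything on CustomerID. When it finds a match, the update runs only if the target row's Status <> 'Booked'. A booked row still counts as matched, so it never falls through to the NOT MATCHED branch and is not inserted a second time.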
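To check the expected outcome without a SQL Server instance, here is a minimal sketch using Python's sqlite3 module. SQLite has no MERGE statement, so the two branches are emulated with a separate UPDATE and INSERT; that substitution, the helper name merge_skip_booked, and inserting CustomerID explicitly (rather than relying on an identity column) are assumptions of the sketch, not part of the answer above.

```python
import sqlite3


def merge_skip_booked(conn):
    """Emulate MERGE ... WHEN MATCHED AND Status <> 'Booked' with two statements."""
    # Matched branch: update target rows that have a staging row and are not booked.
    conn.execute("""
        UPDATE Customers
        SET Name = (SELECT s.Name FROM Staging s
                    WHERE s.CustomerID = Customers.CustomerID),
            Status = 'Waiting'
        WHERE Status <> 'Booked'
          AND CustomerID IN (SELECT CustomerID FROM Staging)
    """)
    # Not-matched branch: matching is on CustomerID only, so a booked
    # customer counts as matched and is never inserted again.
    conn.execute("""
        INSERT INTO Customers (CustomerID, Name, Status)
        SELECT s.CustomerID, s.Name, 'Waiting'
        FROM Staging s
        WHERE NOT EXISTS (SELECT 1 FROM Customers c
                          WHERE c.CustomerID = s.CustomerID)
    """)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT, Status TEXT);
    CREATE TABLE Staging (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO Customers VALUES (1, 'Ian Boyd', 'Killed'), (2, 'Shelby Hawthorn', 'Booked');
    INSERT INTO Staging VALUES (1, 'Ian Boyde'), (2, 'Shelby Blanken'), (3, 'Jessica Bogden');
""")
merge_skip_booked(conn)
rows = conn.execute(
    "SELECT CustomerID, Name, Status FROM Customers ORDER BY CustomerID"
).fetchall()
print(rows)
```

The key point carried over from the MERGE is that the Booked test lives in the update condition, not in the matching condition.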

Merge statements are driven by the USING clause.
Rows in the USING that do match existing rows cause existing rows to update.
Rows in the USING that do not match existing rows cause new rows to be created
Rows that are not in the USING clause cannot affect rows in the db (unless the statement also has a WHEN NOT MATCHED BY SOURCE branch).
If you do not want an existing row to be updated, ensure its matching row never makes it into the result set presented by the statement in the USING. This may mean doing a join in the USING, which is fine.
Example:
MERGE Customers USING (
SELECT
s.CustomerID,
s.Name,
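'Waiting' AS Status
FROM
Staging s
LEFT JOIN Customers e ON s.CustomerID = e.CustomerID -- LEFT JOIN keeps staging rows with no existing customer
WHERE
e.CustomerID IS NULL -- brand-new customers must still reach the MERGE
OR e.Status <> 'Booked' --ignore all existing booked rows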
) foo
...
This join inside the USING clause ensures that the staging row that relates to the existing "Booked" row never makes it into the result set produced by the USING. Hence it cannot cause either an update or an insert.
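The effect of filtering the source can be checked by running only the inner SELECT. The sketch below does that on SQLite through Python's sqlite3 module, using the sample data from the question. It assumes the existing table is Customers and uses a LEFT JOIN, because an INNER JOIN would also drop customer 3, who has no existing row and still needs to be inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT, Status TEXT);
    CREATE TABLE Staging (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO Customers VALUES (1, 'Ian Boyd', 'Killed'), (2, 'Shelby Hawthorn', 'Booked');
    INSERT INTO Staging VALUES (1, 'Ian Boyde'), (2, 'Shelby Blanken'), (3, 'Jessica Bogden');
""")

# Source query as it would appear inside USING ( ... ):
# keep staging rows that are new, or whose existing row is not booked.
LEFT_JOIN_SOURCE = """
    SELECT s.CustomerID, s.Name, 'Waiting' AS Status
    FROM Staging s
    LEFT JOIN Customers e ON s.CustomerID = e.CustomerID
    WHERE e.CustomerID IS NULL OR e.Status <> 'Booked'
    ORDER BY s.CustomerID
"""

# Same filter with an INNER JOIN, for comparison only.
INNER_JOIN_SOURCE = """
    SELECT s.CustomerID, s.Name, 'Waiting' AS Status
    FROM Staging s
    INNER JOIN Customers e ON s.CustomerID = e.CustomerID
    WHERE e.Status <> 'Booked'
    ORDER BY s.CustomerID
"""

left_rows = conn.execute(LEFT_JOIN_SOURCE).fetchall()
inner_rows = conn.execute(INNER_JOIN_SOURCE).fetchall()
print("LEFT JOIN source: ", left_rows)
print("INNER JOIN source:", inner_rows)
```

With the LEFT JOIN, customers 1 and 3 reach the MERGE and customer 2 does not; with the INNER JOIN only customer 1 does.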

Related

SQL question, query is not updating account_id's fields: income, customerid, customergroup?

I am executing this query through a Databricks notebook to join data from a stage table to a target table based on the shared join keys account_id and stmt_end_dt. The stage table has 2 billion rows of data and the target table has 3 billion rows of data.
Here is the main query:
"UPDATE TARGET_TBL SET INCOME = S.INCOME, CUSTOMERGROUPID = S.CUSTOMERGROUPID, CUSTOMERID = S.CUSTOMERID
FROM STAGE_TBL AS S
WHERE CAST(S.ACCT_ID AS NUMBER(18,0)) = TARGET_TBL.ACCT_ID
AND CAST(S.STMT_END_DT AS DATE) = TARGET_TBL.STMT_END_DT"
What I want to do is add "income", "customerid", and "customergroup" data from the stage table to the rows of the target table that match on "account_id" and "stmt_end_dt". When I go into the target table I see that there are now fields for "income", "customerid", and "customergroup" (this is fine because they were created by an earlier query). After my query has run and I click into the target table, I see that account_id is blank while "income", "customerid" and "customergroup" all have data. When I run SELECT * FROM TARGET_TBL WHERE INCOME IS NOT NULL; I get back 80000 rows (which seems low considering the stage table has 2 billion rows). In that result "income", "customerid" and "customergroup" are again populated, but account_id is full of NULLs. It is as if this data is just being appended or tacked on, rather than updating each account_id's fields with the matching data. This is how I imagine it should look:
account_id | income | customerid | customergroupid
4321 | 60000 | 6345 | 3
5432 | 55000 | 4345 | 5
But instead it looks like this:
account_id | income | customerid | customergroupid
| 60000 | 6345 | 3
| 55000 | 4345 | 5
Or when I run: SELECT * WHERE INCOME IS NOT NULL:
account_id | income | customerid | customergroupid
NULL | 60000 | 6345 | 3
NULL | 55000 | 4345 | 5
And if I simply open the target table it looks like this:
account_id | income | customerid | customergroupid
4321 | | |
5432 | | |
After that query runs, it is also NULL for all other fields outside of the last 3 shown.
Perhaps the data types coming from the stage table aren't compatible with the target table?
What could be causing this strange behavior?
You can't compare values with NULL: if a field is NULL there is nothing to compare, so an equality test on it never matches. I believe this is your problem.
If you have NULL fields and you want to compare them, you can usually use IS NULL or NVL. Look up the syntax of these; it is very helpful.
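The NULL behaviour described above can be seen in a small sketch run on SQLite through Python's sqlite3 module. The table names target and stage, their columns, and the -1 sentinel are hypothetical, and COALESCE is used as the standard counterpart of NVL; the engine in the question may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (acct_id INTEGER, income INTEGER);
    CREATE TABLE stage (acct_id INTEGER, income INTEGER);
    INSERT INTO target VALUES (4321, NULL), (NULL, NULL);
    INSERT INTO stage VALUES (4321, 60000), (NULL, 55000);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# NULL = NULL is not true; it evaluates to NULL (None in Python).
null_equals_null = scalar("SELECT NULL = NULL")

# An equality join therefore never pairs the two NULL keys.
equality_matches = scalar(
    "SELECT COUNT(*) FROM target t JOIN stage s ON t.acct_id = s.acct_id"
)

# IS NULL is the way to test for a missing key.
stage_null_keys = scalar("SELECT COUNT(*) FROM stage WHERE acct_id IS NULL")

# COALESCE (NVL in some dialects) maps NULL to a sentinel so the rows compare equal.
# The sentinel -1 is an assumption: it must never occur as a real key.
null_safe_matches = scalar("""
    SELECT COUNT(*) FROM target t
    JOIN stage s ON COALESCE(t.acct_id, -1) = COALESCE(s.acct_id, -1)
""")

print(null_equals_null, equality_matches, stage_null_keys, null_safe_matches)
```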

SQL: How do I find a descript value from one table that's linked to another?

I have two tables (TableA; TableB) and I need to pull info from TableB as it relates to an entry in TableA.
TableA contains the following:
Unique_ID (Unique ID applied to every record in this database)
STATE_CODE (Abbreviation of state that is used as information in the record; a record can only have one state selected but the same state could be applied to multiple records)
TableB contains the following:
STATE_CODE
STATE_NAME
So let's assume it looks like this:
TableA:
| Unique_ID | STATE_CODE |
+-----------+------------+
| 1 | TX |
| 2 | GA |
| 3 | TX |
| 4 | MS |
TableB would look like:
| STATE_CODE | STATE_NAME |
+------------+-------------+
| TX | Texas |
| GA | Georgia |
| MS | Mississippi |
I need my select statement to pull the STATE_NAME (e.g. "Texas") from TableB based on the STATE_CODE that's applied to the Unique_ID in TableA.
I know I need to relate the STATE_CODE to the Unique_ID and then the STATE_NAME to that specific STATE_CODE but I'm having trouble with my statement. It pulls the State Name but it repeats it over and over in the results, aka: Texas;Texas;Texas. I think it's repeating it once for every time that individual record (Unique_ID) was edited/updated. How do I make it only appear once?
My SQL statement has to be fairly uncomplicated because I'm writing and running this through a database GUI, not something like SSMS or MySQL. So a bunch of JOIN and WHERE clauses don't work through the GUI.
SELECT STATE_NAME
from dbo.TableB b
, dbo.TableA a
where b.STATE_CODE = a.STATE_CODE
and a.Unique_ID = {Unique_ID}
I expect the output to just say 'Texas' and instead it says
'Texas;Texas;Texas'
I'm from Texas and even I don't want to say it over and over again.
This will solve the problem:
SELECT STATE_NAME
from TableB b
, TableA a
where b.STATE_CODE = a.STATE_CODE
and a.Unique_ID = 1
GROUP BY STATE_NAME
Here is the DEMO
But please do note:
In your question you have given us two tables and some data. When we test your query with that data it returns "Texas" only one time, because in TableA you have given us two different values for column Unique_ID for TX. In my first DEMO that you can see above I have changed that for the test.
If it is not OK (because of GROUP BY and your GUI), then try this:
SELECT MAX(STATE_NAME)
from TableB b
, TableA a
where b.STATE_CODE = a.STATE_CODE
and a.Unique_ID = 1
Or maybe this:
SELECT STATE_NAME
from TableB b
, TableA a
where b.STATE_CODE = a.STATE_CODE
and a.Unique_ID = 1
limit 1
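As a rough check of the GROUP BY variant, the sketch below runs it on SQLite through Python's sqlite3 module. To reproduce the repeated output it assumes, as the question suspects, that TableA holds several rows for the same Unique_ID; that duplicated data is an assumption and not the sample data shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Unique_ID INTEGER, STATE_CODE TEXT);
    CREATE TABLE TableB (STATE_CODE TEXT, STATE_NAME TEXT);
    -- Assumed: Unique_ID 1 appears three times (e.g. once per edit).
    INSERT INTO TableA VALUES (1, 'TX'), (1, 'TX'), (1, 'TX'), (2, 'GA');
    INSERT INTO TableB VALUES ('TX', 'Texas'), ('GA', 'Georgia'), ('MS', 'Mississippi');
""")

BASE_QUERY = """
    SELECT STATE_NAME
    FROM TableB b, TableA a
    WHERE b.STATE_CODE = a.STATE_CODE
      AND a.Unique_ID = ?
"""

# Without grouping, one output row is produced per matching TableA row.
repeated = [row[0] for row in conn.execute(BASE_QUERY, (1,))]

# GROUP BY collapses identical state names into a single row.
grouped = [row[0] for row in conn.execute(BASE_QUERY + " GROUP BY STATE_NAME", (1,))]

print(repeated)
print(grouped)
```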

update column value based on inserted value in another table

I have two tables
Travel Table
tid | tname | countryid | status
1 | a | 1 |
PassengerTravel
tid | passengerid
1 | 1
1 | 2
I want to update the status column in the Travel table when a row is inserted into the PassengerTravel table,
and update the status column again when the travel application is printed for them.
Can you do this in your code?
The first one can be handled with a trigger.
For the second one, you print from your application, right?
Why not let that code update the status?
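A minimal sketch of both suggestions, using SQLite through Python's sqlite3 module since the question does not name a database. The trigger name, the status values 'has_passengers' and 'printed', and the helper mark_printed are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Travel (tid INTEGER PRIMARY KEY, tname TEXT, countryid INTEGER, status TEXT);
    CREATE TABLE PassengerTravel (tid INTEGER, passengerid INTEGER);

    -- First case: the database reacts to the insert by itself.
    CREATE TRIGGER trg_passenger_added AFTER INSERT ON PassengerTravel
    BEGIN
        UPDATE Travel SET status = 'has_passengers' WHERE tid = NEW.tid;
    END;

    INSERT INTO Travel (tid, tname, countryid) VALUES (1, 'a', 1);
""")


def mark_printed(connection, tid):
    """Second case: the application that prints also records the new status."""
    connection.execute("UPDATE Travel SET status = 'printed' WHERE tid = ?", (tid,))


def travel_status(connection, tid):
    """Return the current status of one Travel row."""
    return connection.execute(
        "SELECT status FROM Travel WHERE tid = ?", (tid,)
    ).fetchone()[0]


conn.execute("INSERT INTO PassengerTravel VALUES (1, 1)")
status_after_insert = travel_status(conn, 1)

mark_printed(conn, 1)
status_after_print = travel_status(conn, 1)

print(status_after_insert, status_after_print)
```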

MS Access 2013 - How to perform a SQL Update with id based on MAX date from another table?

First time posting, please forgive my newbieness!
In MS Access 2013, how can I update the Actions table's 'LatestRequest' column with the 'id' from the IncomingRequests table, using only the latest RequestDate for each Scan Result ID?
Note that there can be duplicate Scan Result IDs in IncomingRequests, but the date will always be different.
Actions Table:
id (primary key) | Scan Result ID | LatestRequest | other-misc-columns...
1 | 123456 | (blank)
2 | 666666 | (blank)
3 | 789789 | (blank)
4 | 888888 | (blank) (this record won't change)
5 | 999222 | 987 (this record won't change)
IncomingRequests Table:
id (primary key) | RequestDate | Scan Result ID | other-misc-columns...
201 | 5/9/2016 | 123456
202 | 4/12/2016 | 123456
203 | 5/7/2016 | 666666
204 | 5/8/2016 | 666666
205 | 5/9/2016 | 789789
What I want to see:
Action Table:
id (primary key) | Scan Result ID | LatestRequest | other-misc-columns...
1 | 123456 | 201
2 | 666666 | 204
3 | 789789 | 205
4 | 888888 | (blank)
5 | 999222 | 987
I've tried creating a subquery for the max date, and updating the Actions table, but run into "Operation must use an updateable query".
UPDATE Actions INNER JOIN (SELECT t1.*
FROM
IncomingRequests t1
INNER JOIN
(
SELECT [Scan Result ID], MAX([DateFromIT]) AS MaxDate
FROM IncomingRequests
GROUP BY [Scan Result ID]
) t2
ON t1.[Scan Result ID]=t2.[Scan Result ID]
AND t1.[RequestDate]=t2.MaxDate
) AS ij ON Actions.[Scan Result ID] = ij.[Scan Result ID]
SET LatestRequest = ij.id
The alternate version I have (below) checks using the Request table primary key id, and this works, except that I really need it by latest date, not highest id.
UPDATE Actions
INNER JOIN IncomingRequests ON Actions.[Scan Result ID] = IncomingRequests.[Scan Result ID]
SET Actions.latestrequest = IncomingRequests.id
WHERE IncomingRequests.id=
(SELECT MAX(IncomingRequests.id)
FROM IncomingRequests
WHERE Actions.[Scan Result ID] = IncomingRequests.[Scan Result ID]
GROUP BY IncomingRequests.[Scan Result ID] );
I've run into many dead ends trying to follow other answers from this site, or errors in MS Access that others didn't seem to get. Any assistance appreciated.
Thanks so much! =)
Try this and let me know if this solved your problem
update Actions set Actions.LatestRequest = t2.id
from Actions
inner join (select id,ScanResultID from IncomingRequests where RequestDate
in (select MAX(RequestDate) from IncomingRequests group by ScanResultID))
t2 on Actions.ScanResultID = t2.ScanResultID
I eventually resolved this by creating a Sub in VBA to do the following:
DROP a temp table.
Perform a SELECT with the MAX date into a temp table.
Perform an UPDATE with an INNER JOIN to the temp table.
DROP the temp table.
Ugly, but it's the only way I could get it to work.
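The same four steps can be sketched outside Access. The version below uses SQLite through Python's sqlite3 module as a stand-in, so it is not the VBA that was actually written: the temp table name LatestTmp, the column name ScanResultID (without spaces), the ISO date strings, and the correlated subquery in place of an UPDATE with INNER JOIN are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Actions (id INTEGER PRIMARY KEY, ScanResultID INTEGER, LatestRequest INTEGER);
    CREATE TABLE IncomingRequests (id INTEGER PRIMARY KEY, RequestDate TEXT, ScanResultID INTEGER);
    INSERT INTO Actions VALUES
        (1, 123456, NULL), (2, 666666, NULL), (3, 789789, NULL),
        (4, 888888, NULL), (5, 999222, 987);
    INSERT INTO IncomingRequests VALUES
        (201, '2016-05-09', 123456), (202, '2016-04-12', 123456),
        (203, '2016-05-07', 666666), (204, '2016-05-08', 666666),
        (205, '2016-05-09', 789789);
""")


def refresh_latest_request(connection):
    # Step 1: drop any temp table left over from an earlier run.
    connection.execute("DROP TABLE IF EXISTS LatestTmp")
    # Step 2: select the request with the MAX date per ScanResultID into a temp table.
    connection.execute("""
        CREATE TEMP TABLE LatestTmp AS
        SELECT r.ScanResultID, r.id
        FROM IncomingRequests r
        WHERE r.RequestDate = (SELECT MAX(r2.RequestDate)
                               FROM IncomingRequests r2
                               WHERE r2.ScanResultID = r.ScanResultID)
    """)
    # Step 3: update Actions from the temp table; rows without a request are untouched.
    connection.execute("""
        UPDATE Actions
        SET LatestRequest = (SELECT t.id FROM LatestTmp t
                             WHERE t.ScanResultID = Actions.ScanResultID)
        WHERE ScanResultID IN (SELECT ScanResultID FROM LatestTmp)
    """)
    # Step 4: drop the temp table again.
    connection.execute("DROP TABLE LatestTmp")


refresh_latest_request(conn)
actions = conn.execute("SELECT id, LatestRequest FROM Actions ORDER BY id").fetchall()
print(actions)
```

Note that the MAX date is computed per ScanResultID inside the subquery, so a date that happens to be the maximum for a different ID cannot produce a false match.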

How to iterate a collection of records and "re-associate" dates whenever a record is added or changed

I have two Date columns, EffDate and TermDate. When a record is inserted for the very first time, TermDate is NULL. When additional records are added, the first record's TermDate should now be assigned the value of the second record's EffDate, assuming #2's EffDate is later than the first. If the second record's EffDate is before #1's, then #2 should have its TermDate set to #1's EffDate. Basically, any time a record is inserted or updated, the database needs to re-evaluate all of the records and link them so that there is only the smallest possible span of time between all of the dates.
I found this SO post: Creating new date field dynamically from next row
But the selected answer doesn't quite work for me. This is the query I'm using:
UPDATE T1 SET
T1.[TermDate] = #aEffDate
FROM [Information] T1
WHERE T1.[InformationID] IN (SELECT ISNULL(T1.InformationID, 0)
FROM [Information] T1 INNER JOIN [Information] T2 on T1.InformationID = T2.InformationID - 1
WHERE T1.InformationID <> #aInformationID
AND T1.[DeletedBy] IS NULL
AND T1.[DeletedOn] IS NULL
AND T1.Code = #aCode
AND T1.TermDate IS NULL
AND T1.EffDate < #aEffDate)
Then when I load up the software and insert a new record, it does not seem to go back and "update" all the previous records as I was hoping. Here is the output after inserting a few records with random EffDates into the database:
Output & Analysis
| # | InformationID | EffDate | TermDate | IsGood | ShouldBe |
|---|---------------|----------|----------|--------|----------|
| 1 | 5756 | 07/19/15 | 09/19/15 | N | 07/25/15 |
| 2 | 5757 | 06/30/15 | 07/10/15 | Y | N/A |
| 3 | 5758 | 08/01/15 | 09/19/15 | Y | N/A |
| 4 | 5759 | 07/25/15 | 09/19/15 | N | 08/01/15 |
| 5 | 5760 | 09/19/15 | NULL | Y | N/A |
| 6 | 5761 | 07/10/15 | NULL | N | 07/19/15 |
I need it so that whenever a new record is added, it'll re-evaluate all the records associated with a given Code and make sure each date points to the next "closest" date, so that there are no overlapping EffDate and TermDate ranges.
I realize that, at the very least, I need to remove AND T1.TermDate IS NULL because that's preventing it from re-evaluating two records when a third record with an EffDate between the other two's range is inserted. But I'm not sure how I should evaluate TermDate so that records with existing dates will be appropriately re-evaluated and assigned a date if necessary.
update tb
set tb.termdate=tb2.effdate
from InfoTable tb
left outer join InfoTable tb2 on tb2.effdate>tb.effdate
left outer join InfoTable tb3 on tb3.effdate<tb2.effdate and tb3.effdate>tb.effdate
where tb3.effdate is null
This will: (A) select all rows in your table tb; (B) join all rows from the same table tb2 with EffDate greater than tb.EffDate; and (C) again join all rows from the table tb3 with EffDate > tb.EffDate AND EffDate<tb2.EffDate
The final condition tb3.EffDate is null will make sure that there is no other record in the table with EffDate between tb.EffDate and tb2.EffDate
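The double self-join can be tried on the six rows from the question. The sketch below runs it as a SELECT on SQLite through Python's sqlite3 module and then writes the results back with a plain UPDATE; splitting it into two steps, storing dates as ISO strings, and leaving out the Code and Deleted filters are simplifying assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InfoTable (InformationID INTEGER PRIMARY KEY, EffDate TEXT, TermDate TEXT);
    INSERT INTO InfoTable (InformationID, EffDate) VALUES
        (5756, '2015-07-19'), (5757, '2015-06-30'), (5758, '2015-08-01'),
        (5759, '2015-07-25'), (5760, '2015-09-19'), (5761, '2015-07-10');
""")

# tb2 is any later row; tb3 is any row strictly between tb and tb2.
# Keeping only rows where no tb3 exists leaves the nearest later EffDate.
NEXT_DATE_QUERY = """
    SELECT tb.InformationID, tb2.EffDate AS NextEffDate
    FROM InfoTable tb
    LEFT OUTER JOIN InfoTable tb2 ON tb2.EffDate > tb.EffDate
    LEFT OUTER JOIN InfoTable tb3 ON tb3.EffDate < tb2.EffDate
                                 AND tb3.EffDate > tb.EffDate
    WHERE tb3.EffDate IS NULL
"""

next_dates = dict(conn.execute(NEXT_DATE_QUERY).fetchall())

# Write each row's next EffDate into its TermDate; the latest row stays NULL.
conn.executemany(
    "UPDATE InfoTable SET TermDate = ? WHERE InformationID = ?",
    [(next_date, info_id) for info_id, next_date in next_dates.items()],
)

chain = conn.execute(
    "SELECT InformationID, EffDate, TermDate FROM InfoTable ORDER BY EffDate"
).fetchall()
for row in chain:
    print(row)
```

On this data the TermDate values agree with the ShouldBe column in the question, and the row with the latest EffDate keeps a NULL TermDate.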