Oracle SQL - identify and update primary and secondary rows in same table

I have a scenario where I need to find groups of duplicate records, mark which record in each group is primary and which is secondary, and then fill an additional key column that ties each secondary to its primary. This will help me update a child table that has referential integrity. Here's an example:
Table Member
ID Name Birthdt MID
1 SK 09/1988 312
2 SK 09/1988 999
3 SL 06/1990 315
4 GK 08/1990 316
5 GK 08/1990 999
From the table above, my logic for identifying duplicates is: I group by Name and Birthdt, and within a group I treat the row whose MID is 999 as the duplicate.
I created another temp table to capture the dups.
Table member_dups
ID NAME BIRTHDT MID M_flg M_Key
1 SK 09/1988 P 1
2 SK 09/1988 S 1
4 GK 08/1990 P 4
5 GK 08/1990 S 4
I'm able to load the duplicate records into member_dups and update the flag. However, I'm finding it difficult to get the right M_Key for the records marked as secondary. If I can achieve that, I can take that record and update the other table accordingly.
Thanks for any help offered.

If I understand your logic right, the records with MID = 999 are the secondaries in member_dups, and each one should get the ID of the primary row (the one whose MID is not 999) in the same Name/Birthdt group.
If so, you should be able to use a simple update with a join:
update md
set m_key = m.id
from member_dups md
inner join Member m on m.name = md.name and m.birthdt = md.birthdt
where md.m_flg = 'S' and m.mid <> 999
This example uses MSSQL syntax and thus isn't valid Oracle syntax, but you should get the idea, and hopefully you know Oracle better than I do. I'll try to work out the correct Oracle syntax and update the answer soon.
EDIT: I think this is the working Oracle syntax, but I haven't been able to test it:
MERGE INTO member_dups md
USING
(
SELECT id,
name,
birthdt
FROM Member
WHERE mid <> 999
) m ON (m.name = md.name and m.birthdt = md.birthdt and md.m_flg = 'S')
WHEN MATCHED THEN UPDATE
SET md.m_key = m.id
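As a minimal sketch of the same idea, the Python script below runs a correlated-subquery update against an in-memory SQLite database, used here only as a stand-in because it needs no server. The function name fill_secondary_keys and the MID values copied into member_dups are assumptions for illustration. The subquery form should carry over to Oracle largely unchanged, but that has not been tested here.

```python
import sqlite3


def fill_secondary_keys(conn):
    """Set m_key on secondary rows to the id of the primary row in the same group."""
    conn.execute(
        """
        UPDATE member_dups
           SET m_key = (SELECT m.id
                          FROM member m
                         WHERE m.name = member_dups.name
                           AND m.birthdt = member_dups.birthdt
                           AND m.mid <> 999)      -- primary = the non-999 row
         WHERE m_flg = 'S'
        """
    )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT,
                             birthdt TEXT, mid INTEGER);
        INSERT INTO member VALUES (1, 'SK', '09/1988', 312),
                                  (2, 'SK', '09/1988', 999),
                                  (3, 'SL', '06/1990', 315),
                                  (4, 'GK', '08/1990', 316),
                                  (5, 'GK', '08/1990', 999);
        -- MID values here are assumed to be copied from member.
        CREATE TABLE member_dups (id INTEGER PRIMARY KEY, name TEXT, birthdt TEXT,
                                  mid INTEGER, m_flg TEXT, m_key INTEGER);
        INSERT INTO member_dups VALUES (1, 'SK', '09/1988', 312, 'P', 1),
                                       (2, 'SK', '09/1988', 999, 'S', NULL),
                                       (4, 'GK', '08/1990', 316, 'P', 4),
                                       (5, 'GK', '08/1990', 999, 'S', NULL);
        """
    )
    fill_secondary_keys(conn)
    return conn.execute(
        "SELECT id, m_flg, m_key FROM member_dups ORDER BY id"
    ).fetchall()
```

The sketch assumes exactly one non-999 row per Name/Birthdt group. With more than one, Oracle would reject the scalar subquery as returning multiple rows, while SQLite would silently pick one of them.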

Related

Is it optimal to use multiple joins in update query?

My update query checks whether the column “Houses” is null in any of the rows in my source table by joining on the id between the target and source tables (see query one). A null Houses value indicates that the row has expired, so I need to expire that row id in my target table and set the expired date. The query works fine, but I was wondering if it can be improved. I'm new to SQL, so I don't know if using two joins is the best way to get the result I want. The update query will later be run against millions of rows. No columns have been indexed yet.
Query:
(Query one)
Update t
set valid_date = GETDATE()
From Target T
JOIN SOURCE S ON S.ID = T.ID
LEFT JOIN SOURCE S2 ON S2.Houses = t.Houses
WHERE S2.Houses is null
Target:
ID namn middlename Houses date
1 demo hello 2 null
2 demo2 test 4 null
3 demo3 test1 5 null
Source:
ID namn middlename Houses
1 demo hello null
3 demo world null
Expected output after running update query:
ID namn middlename Houses date
1 demo hello 2 2022-12-06
2 demo2 test 4 null
3 demo3 test1 5 2022-12-06
I would recommend exists:
update t
set valid_date = getdate()
from target t
where exists (select 1 from source s where s.id = t.id and s.houses is null)
Note that your original query does not exactly do what you want. It cannot distinguish source rows that do not exist from source rows that exist and whose houses column is null. In your example, it would update row 2, which is not what you seem to want. You would need an INNER JOIN instead of the LEFT JOIN.
With EXISTS, you want an index on source(id, houses) so the subquery can execute efficiently against target rows. This index is probably worthwhile for the JOIN as well.
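To make the EXISTS approach concrete, here is a small sketch that replays the question's sample rows in an in-memory SQLite database (a stand-in, not SQL Server). The date is passed in as a parameter instead of calling GETDATE(), and the names expire_rows and ix_source_id_houses are made up for this example.

```python
import sqlite3


def expire_rows(conn, today):
    """Stamp valid_date on target rows whose source row has houses IS NULL."""
    cur = conn.execute(
        """
        UPDATE target
           SET valid_date = ?
         WHERE EXISTS (SELECT 1
                         FROM source s
                        WHERE s.id = target.id
                          AND s.houses IS NULL)
        """,
        (today,),
    )
    return cur.rowcount  # number of rows the UPDATE changed


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE target (id INTEGER PRIMARY KEY, namn TEXT, middlename TEXT,
                             houses INTEGER, valid_date TEXT);
        INSERT INTO target VALUES (1, 'demo', 'hello', 2, NULL),
                                  (2, 'demo2', 'test', 4, NULL),
                                  (3, 'demo3', 'test1', 5, NULL);
        CREATE TABLE source (id INTEGER PRIMARY KEY, namn TEXT, middlename TEXT,
                             houses INTEGER);
        INSERT INTO source VALUES (1, 'demo', 'hello', NULL),
                                  (3, 'demo', 'world', NULL);
        -- Composite index so the EXISTS probe can be answered from the index alone.
        CREATE INDEX ix_source_id_houses ON source (id, houses);
        """
    )
    changed = expire_rows(conn, "2022-12-06")
    rows = conn.execute("SELECT id, valid_date FROM target ORDER BY id").fetchall()
    return changed, rows
```

With the sample data, only targets 1 and 3 are stamped. Target 2 has no source row at all, so the EXISTS test fails for it and it stays untouched.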
I don't see why you'd need to join on the column houses at all.
Find all rows in source that have value NULL in the column houses.
Then update all rows in target that have the IDs of the source rows.
I prefer to write this kind of complex update using CTEs. It looks more readable to me.
WITH CTE AS
(
    SELECT
        Target.ID
        ,Target.Date
    FROM Source
    INNER JOIN Target ON Target.ID = Source.ID
    WHERE Source.Houses IS NULL
)
UPDATE CTE
SET Date = GETDATE();
To efficiently find rows in source that have value NULL in the column houses you should create an index, something like this:
CREATE INDEX IX_Houses ON Source
(
Houses
);
I assume that ID is a primary key with a clustered unique index, so ID would be included in the IX_Houses index implicitly.

How to improve an Update query in Oracle

I'm trying to update two columns in an archaic Oracle database, but the query simply doesn't finish and nothing is updated. Any ideas to improve the query or something else that can be done? I don't have DBA skills/knowledge and am unsure whether indexing would help, so I would appreciate comments in that area, too.
PERSON table: This table has 200 million distinct person_ids. There are no duplicates. The person_id is numeric. I am trying to update the favorite_color and color_confidence columns, which are varchar2 and currently NULL.
person table
person_id favorite_color color_confidence many_other_columns
222
333
444
TEMP_COLOR_CONFIDENCE table: I'm trying to get the favorite_color and color_confidence from this table and copy them to the PERSON table. This table has 150 million distinct persons, again with no duplicates.
temp_color_confidence
person_id favorite_color color_confidence
222 R H
333 Y L
444 G M
This is my update query, which I realize only updates those found in both tables. Eventually I'll need to update the remaining 50 million with "U" -- unknown. Solving that in one shot would be ideal too, but currently just concerned that I'm not able to get this query to complete.
UPDATE person p
SET (favorite_color, color_confidence) =
(SELECT t.favorite_color, t.color_confidence
FROM temp_color_confidence t
WHERE p.person_id = t.person_id)
WHERE EXISTS (
SELECT 1
FROM temp_color_confidence t
WHERE p.person_id = t.person_id );
Here's where my ignorance shines... would indexing on person_id help, considering they are all distinct anyway? Would indexing on favorite_color help? There are fewer than 10 colors and only 3 confidence values.
For every person, it has to find the corresponding row in temp_color_confidence. The way to do that with the least I/O is to scan each table once and crunch them together in a single hash join, ideally all in memory. Indexes are unlikely to help with that, unless maybe temp_color_confidence is very wide and verbose and has an index on (person_id, favorite_color, color_confidence) which the optimiser can treat as a skinny table.
Using merge might be more efficient as it can avoid the second scan of temp_color_confidence:
merge into person p
using temp_color_confidence t
on (p.person_id = t.person_id)
when matched then update
set p.favorite_color = t.favorite_color, p.color_confidence = t.color_confidence;
If you are going to update every row in the table, though, you might consider instead creating a new table containing all the values you need:
create table person2
( person_id, favorite_color, color_confidence )
pctfree 0 compress
as
select p.person_id, nvl(t.favorite_color,'U'), nvl(t.color_confidence,0)
from person p
left join temp_color_confidence t
on t.person_id = p.person_id;
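The rebuild-instead-of-update idea can be tried on a tiny scale with the sketch below, which uses an in-memory SQLite database as a stand-in for Oracle. COALESCE plays the role of nvl, and the storage options (pctfree, compress) have no SQLite equivalent and are left out. Person 555 is a hypothetical unmatched row added for illustration, and defaulting color_confidence to 'U' is an assumption based on the question's wording.

```python
import sqlite3


def build_person2(conn):
    """Create person2 in one pass, defaulting unmatched persons to 'U'."""
    conn.execute(
        """
        CREATE TABLE person2 AS
        SELECT p.person_id,
               COALESCE(t.favorite_color, 'U')   AS favorite_color,
               COALESCE(t.color_confidence, 'U') AS color_confidence
          FROM person p
          LEFT JOIN temp_color_confidence t
            ON t.person_id = p.person_id
        """
    )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (person_id INTEGER PRIMARY KEY,
                             favorite_color TEXT, color_confidence TEXT);
        -- 555 is a made-up person with no row in temp_color_confidence.
        INSERT INTO person (person_id) VALUES (222), (333), (444), (555);
        CREATE TABLE temp_color_confidence (person_id INTEGER PRIMARY KEY,
                                            favorite_color TEXT,
                                            color_confidence TEXT);
        INSERT INTO temp_color_confidence VALUES (222, 'R', 'H'),
                                                 (333, 'Y', 'L'),
                                                 (444, 'G', 'M');
        """
    )
    build_person2(conn)
    return conn.execute("SELECT * FROM person2 ORDER BY person_id").fetchall()
```

Because the left join keeps every person, the matched and unmatched cases are handled in a single scan, which is what the question hoped to solve "in one shot".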

update a column in a sql table with a value from another table based on a relationship

I am working on a SQL query to update a column's values with an Id from a different table.
Example
Organization Table
Id Name
1 AA
2 BB
Events Table
Id Name OrgId
1 AA NULL
2 AA NULL
3 BB NULL
Now, I would like to update OrgId of the Events table with the respective Id from the Organization table.
I did try the query below, but I had to do it explicitly for each organization:
UPDATE Event SET OrgId=
(SELECT DISTINCT O.ID FROM Organization O WHERE O.Name='AA') WHERE Name='AA'
May I know a better way to do it automatically?
Use a join:
update e
set OrgId = o.id
from event e join
organization o
on o.name = e.name;
You can perform a Merge using
MERGE Event AS e
USING Organization AS o
ON (e.Name= o.name)
WHEN MATCHED THEN
UPDATE SET e.OrgId = o.id
OUTPUT $action, inserted.*;
The OUTPUT clause is optional; here it returns the action taken ($action) and the rows of the Event table as they look after the update.
The Merge is quite powerful as it has other clauses that can be used for cases when data is only in one table and not the other. Here's a nice post which explains things clearly. https://www.simple-talk.com/sql/learn-sql-server/the-merge-statement-in-sql-server-2008/
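For readers who want to try the name-based lookup without a SQL Server instance, the sketch below does the equivalent update with a correlated subquery in an in-memory SQLite database. It is a stand-in and does not use the T-SQL UPDATE ... FROM or MERGE syntax. The function name link_events and the extra 'CC' event are hypothetical additions for illustration.

```python
import sqlite3


def link_events(conn):
    """Fill event.orgid from the organization with the same name."""
    cur = conn.execute(
        """
        UPDATE event
           SET orgid = (SELECT o.id
                          FROM organization o
                         WHERE o.name = event.name)
         WHERE EXISTS (SELECT 1
                         FROM organization o
                        WHERE o.name = event.name)
        """
    )
    return cur.rowcount  # events that found a matching organization


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        INSERT INTO organization VALUES (1, 'AA'), (2, 'BB');
        CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT, orgid INTEGER);
        -- Event 4 ('CC') is made up to show a row with no matching organization.
        INSERT INTO event VALUES (1, 'AA', NULL), (2, 'AA', NULL),
                                 (3, 'BB', NULL), (4, 'CC', NULL);
        """
    )
    changed = link_events(conn)
    rows = conn.execute("SELECT id, orgid FROM event ORDER BY id").fetchall()
    return changed, rows
```

The sketch assumes organization names are unique (declared here with a UNIQUE constraint). If two organizations shared a name, every variant in this thread would have to decide which Id wins.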

SQL code for updating a column where the WHERE condition is from another table

In SQL Server 2012, I have two tables: WINE_TYPE and WINE.
WINE_TYPE is the parent table, and wtCode is the primary key (foreign key in WINE).
I'm trying to execute code that will update the price of the wine (winePrice) by 50 cents, or 0.50. The catch is... my WHERE condition is from the parent table (I only need the wines that are made by the lead Amanda McPhee (wtLeadFirst='Amanda', wtLeadLast='McPhee') to be increased by 50 cents).
Inner join doesn't seem to work here. Help?
You can use UPDATE ... FROM syntax to update table with JOIN condition:
UPDATE WINE
SET WINE.WinePrice = WINE.WinePrice + 0.5
FROM
WINE
INNER JOIN
WINE_TYPE ON WINE.wtCode = WINE_TYPE.wtCode
WHERE
WINE_TYPE.wtLeadFirst='Amanda' AND WINE_TYPE.wtLeadLast='McPhee'
Update + Exists
Use the EXISTS operator to check for wtLeadFirst='Amanda' and wtLeadLast='McPhee' in the parent table:
update W
Set WinePrice = WinePrice + 0.5
from Wine W
where exists (select 1
from wine_type WT
where W.wtCode = WT.wtCode
and WT.wtLeadFirst='Amanda'
and WT.wtLeadLast='McPhee' )
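The EXISTS variant can be checked on sample data with the sketch below, which runs against an in-memory SQLite database rather than SQL Server 2012. The wineId column, the wine type codes, and every sample row are invented for illustration; only the column names wtCode, winePrice, wtLeadFirst and wtLeadLast come from the question.

```python
import sqlite3


def raise_prices(conn, first, last, amount):
    """Add `amount` to the price of every wine whose type is led by first/last."""
    cur = conn.execute(
        """
        UPDATE wine
           SET winePrice = winePrice + ?
         WHERE EXISTS (SELECT 1
                         FROM wine_type wt
                        WHERE wt.wtCode = wine.wtCode
                          AND wt.wtLeadFirst = ?
                          AND wt.wtLeadLast = ?)
        """,
        (amount, first, last),
    )
    return cur.rowcount  # number of wines repriced


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wine_type (wtCode TEXT PRIMARY KEY,
                                wtLeadFirst TEXT, wtLeadLast TEXT);
        INSERT INTO wine_type VALUES ('RD', 'Amanda', 'McPhee'),
                                     ('WH', 'John', 'Smith');
        CREATE TABLE wine (wineId INTEGER PRIMARY KEY,
                           wtCode TEXT REFERENCES wine_type (wtCode),
                           winePrice REAL);
        INSERT INTO wine VALUES (1, 'RD', 10.0), (2, 'RD', 12.25), (3, 'WH', 8.0);
        """
    )
    changed = raise_prices(conn, "Amanda", "McPhee", 0.5)
    rows = conn.execute("SELECT wineId, winePrice FROM wine ORDER BY wineId").fetchall()
    return changed, rows
```

Only the two wines whose type is led by Amanda McPhee change; the third keeps its price because the subquery finds no matching parent row.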

Updating field from 3 different tables

I have a question using below 3 tables:
1. Company (table)
CONAME
COPOINTS
2. Group_Member (table)
CONAME
NAME
3. Member (table)
NAME
MPOINTS
I would like to have a correct query with the following condition:
Update Member
Set MPOINTS=MPOINTS+5
Where Company.CONAME=Group_Member.CONAME
And
Group_Member.NAME=Member.NAME
Can you please correct above query?
In T-SQL (MS-SQL Server dialect) this would be
Update Member
set MPOINTS = MPOINTS + 5
from Group_Member
join Company on Company.CONAME = Group_Member.CONAME
where Group_Member.NAME = Member.NAME
I would like to remind you that names are not a good choice for either primary or foreign key relations. You will not be able to change a name without violating the key constraints that reference it. Use a (numeric, autoincrement) ID column instead.
try this:
Update M
Set MPOINTS=MPOINTS+5
from Member M
join Group_Member GM
on GM.NAME=M.NAME
join Company C
on C.CONAME=GM.CONAME
SQL Server 2005+
UPDATE x
SET x.MPOINTS = x.MPOINTS + 5
FROM (SELECT m.MPOINTS
FROM Company c
JOIN Group_Member g ON c.CONAME = g.CONAME
JOIN Member m ON g.NAME = m.NAME) x
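One detail worth checking with any of these variants is what happens to a member who belongs to several groups. The sketch below expresses the same condition with EXISTS in an in-memory SQLite database (a stand-in for SQL Server), so each qualifying member is updated exactly once. All sample rows and the function name award_points are hypothetical.

```python
import sqlite3


def award_points(conn, points):
    """Add points to members that belong to a group of an existing company."""
    cur = conn.execute(
        """
        UPDATE member
           SET mpoints = mpoints + ?
         WHERE EXISTS (SELECT 1
                         FROM group_member g
                         JOIN company c ON c.coname = g.coname
                        WHERE g.name = member.name)
        """,
        (points,),
    )
    return cur.rowcount  # number of members updated


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE company (coname TEXT PRIMARY KEY, copoints INTEGER);
        INSERT INTO company VALUES ('Acme', 0), ('Beta', 0);
        CREATE TABLE group_member (coname TEXT, name TEXT);
        -- Ann is in two groups; Cid's company 'Ghost' is not in company.
        INSERT INTO group_member VALUES ('Acme', 'Ann'), ('Beta', 'Ann'),
                                        ('Acme', 'Bob'), ('Ghost', 'Cid');
        CREATE TABLE member (name TEXT PRIMARY KEY, mpoints INTEGER);
        INSERT INTO member VALUES ('Ann', 10), ('Bob', 0), ('Cid', 3), ('Dee', 7);
        """
    )
    changed = award_points(conn, 5)
    rows = conn.execute("SELECT name, mpoints FROM member ORDER BY name").fetchall()
    return changed, rows
```

In this sample Ann gains 5 points once even though she matches two group rows, while Cid and Dee are left alone because neither is linked to a company that exists.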