Find and update specific duplicates in MS SQL - sql

given below table:
+----+---------+-----------+-------------+-------+
| ID | NAME | LAST NAME | PHONE | STATE |
+----+---------+-----------+-------------+-------+
| 1 | James | Vangohg | 04333989878 | NULL |
| 2 | Ashly | Baboon | 09898788909 | NULL |
| 3 | James | Vangohg | 04333989878 | NULL |
| 4 | Ashly | Baboon | 09898788909 | NULL |
| 5 | Michael | Foo | 02933889990 | NULL |
| 6 | James | Vangohg | 04333989878 | NULL |
+----+---------+-----------+-------------+-------+
I want to use MS SQL to find and update duplicate (based on name, last name and number) but only the earlier one(s). So desired result for above table is:
+----+---------+-----------+-------------+-------+
| ID | NAME | LAST NAME | PHONE | STATE |
+----+---------+-----------+-------------+-------+
| 1 | James | Vangohg | 04333989878 | DUPE |
| 2 | Ashly | Baboon | 09898788909 | DUPE |
| 3 | James | Vangohg | 04333989878 | DUPE |
| 4 | Ashly | Baboon | 09898788909 | NULL |
| 5 | Michael | Foo | 02933889990 | NULL |
| 6 | James | Vangohg | 04333989878 | NULL |
+----+---------+-----------+-------------+-------+

This query uses a CTE to apply a row number, where any number > 1 is a dupe of the row with the highest ID.
;WITH x AS
(
SELECT ID,NAME,[LAST NAME],PHONE,STATE,
ROW_NUMBER() OVER (PARTITION BY NAME,[LAST NAME],PHONE ORDER BY ID DESC)
FROM dbo.YourTable
)
UPDATE x SET STATE = CASE rn WHEN 1 THEN NULL ELSE 'DUPE' END;
Of course, I see no reason to actually update the table with this information; every time the table is touched, this data is stale and the query must be re-applied. Since you can derive this information at run-time, this should be part of a query, not constantly updated in the table. IMHO.

Try this statement.
LAST UPDATE:
update t1
set
t1.STATE = 'DUPE'
from
TableName t1
join
(
select name, last_name, phone, max(id) as id, count(id) as cnt
from
TableName
group by name, last_name, phone
having count(id) > 1
) t2 on ( t1.name = t2.name and t1.last_name = t2.last_name and t1.phone = t2.phone and t1.id < t2.id)

If my understanding of your requirements is correct, you want to update all of the STATE values to DUPE when there exists another row with a higher ID value that has the same NAME and LAST NAME. If so, use this:
update t set STATE = (case when sorted.RowNbr = 1 then null else 'DUPE' end)
from yourtable t
join (select
ID,
row_number() over
(partition by name, [last name], phone order by id desc) as RowNbr from yourtable)
sorted on sorted.ID = t.ID

Related

Is it possible for foxpro in sql statement to fill the winners_name column base on condition maximum score and id with different names

Is it possible for foxpro in sql statement to add and fill a winners_name column base on condition maximum score and id with different names.
I have created a sql statement but it was not supported by foxpro, is there other alternative to do this rather than using a loop (I like sql statement for faster result even in 50k row lines)
SELECT * , ;
(SELECT TOP 1 doc_name FROM Table1 as b1 WHERE ALLTRIM(b1.id) = a.id ORDER BY b1.score DESC, b1.id) as WINNERS_NAME ;
FROM Table1 as a
I have only 1 table, with columns [ name, id, score ]
A sample table would be like this
NAME | ID | SCORE |
BEN | 101 | 5 |
KEN | 101 | 2 |
ZEN | 101 | 3 |
JEN | 103 | 4 |
REN | 103 | 3 |
LEN | 102 | 5 |
PEN | 102 | 4 |
ZEN | 102 | 3 |
The result would be like this (winners_name is tag on ID)
NAME | ID | SCORE | WINNERS_NAME
BEN | 101 | 5 | BEN
KEN | 101 | 2 | BEN
ZEN | 101 | 3 | BEN
JEN | 103 | 4 | PEN
REN | 103 | 3 | PEN
LEN | 102 | 5 | LEN
PEN | 103 | 5 | PEN
ZEN | 102 | 3 | LEN
Try this approach:
SELECT
a.NAME,
a.ID,
a.SCORE,
b.WINNERS_NAME
FROM Table1 a
INNER JOIN
(
SELECT t1.ID, t1.NAME AS WINNERS_NAME
FROM
(
SELECT ID, SCORE, MIN(NAME) AS NAME
FROM Table1
GROUP BY ID, SCORE
) t1
INNER JOIN
(
SELECT ID, MAX(SCORE) AS MAX_SCORE
FROM Table1
GROUP BY ID
) t2
ON t1.ID = t2.ID AND
t1.SCORE = t2.MAX_SCORE
) b
ON a.ID = b.ID
ORDER BY
a.ID;
Follow the link below for a demo running in MySQL (though the syntax should still work on FoxPro):
Demo

Each rows to column values

I'm trying to create a view that shows first table's columns plus second table's first 3 records sorted by date in 1 row.
I tried to select specific rows using offset from sub table and join to main table, but when joining query result is ordered by date, without
WHERE tblMain_id = ..
clause in joining SQL it returns wrong record.
Here is sqlfiddle example: sqlfiddle demo
tblMain
| id | fname | lname | salary |
+----+-------+-------+--------+
| 1 | John | Doe | 1000 |
| 2 | Bob | Ross | 5000 |
| 3 | Carl | Sagan | 2000 |
| 4 | Daryl | Dixon | 3000 |
tblSub
| id | email | emaildate | tblmain_id |
+----+-----------------+------------+------------+
| 1 | John#Doe1.com | 2019-01-01 | 1 |
| 2 | John#Doe2.com | 2019-01-02 | 1 |
| 3 | John#Doe3.com | 2019-01-03 | 1 |
| 4 | Bob#Ross1.com | 2019-02-01 | 2 |
| 5 | Bob#Ross2.com | 2018-12-01 | 2 |
| 6 | Carl#Sagan.com | 2019-10-01 | 3 |
| 7 | Daryl#Dixon.com | 2019-11-01 | 4 |
View I am trying to achieve:
| id | fname | lname | salary | email_1 | emaildate_1 | email_2 | emaildate_2 | email_3 | emaildate_3 |
+----+-------+-------+--------+---------------+-------------+---------------+-------------+---------------+-------------+
| 1 | John | Doe | 1000 | John#Doe1.com | 2019-01-01 | John#Doe2.com | 2019-01-02 | John#Doe3.com | 2019-01-03 |
View I have created
| id | fname | lname | salary | email_1 | emaildate_1 | email_2 | emaildate_2 | email_3 | emaildate_3 |
+----+-------+-------+--------+---------+-------------+---------------+-------------+---------------+-------------+
| 1 | John | Doe | 1000 | (null) | (null) | John#Doe1.com | 2019-01-01 | John#Doe2.com | 2019-01-02 |
You can use conditional aggregation:
select m.id, m.fname, m.lname, m.salary,
max(s.email) filter (where seqnum = 1) as email_1,
max(s.emailDate) filter (where seqnum = 1) as emailDate_1,
max(s.email) filter (where seqnum = 2) as email_2,
max(s.emailDate) filter (where seqnum = 3) as emailDate_2,
max(s.email) filter (where seqnum = 3) as email_3,
max(s.emailDate) filter (where seqnum = 3) as emailDate_3
from tblMain m left join
(select s.*,
row_number() over (partition by tblMain_id order by emailDate desc) as seqnum
from tblsub s
) s
on s.tblMain_id = m.id
where m.id = 1
group by m.id, m.fname, m.lname, m.salary;
Here is a SQL Fiddle.
Here is a solution that should get you what you expect.
This works by first ranking records within each table and joining them together. Then, the outer query uses aggregation to generate the expected output.
This solution will work even if the first record in the main table does not have id 1. Also filtering takes occurs within the JOINs, so this should be quite efficient.
SELECT
m.id,
m.fname,
m.lname,
m.salary,
MAX(CASE WHEN s.rn = 1 THEN s.email END) email_1,
MAX(CASE WHEN s.rn = 1 THEN s.emaildate END) email_date1,
MAX(CASE WHEN s.rn = 2 THEN s.email END) email_2,
MAX(CASE WHEN s.rn = 2 THEN s.emaildate END) email_date2,
MAX(CASE WHEN s.rn = 3 THEN s.email END) email_3,
MAX(CASE WHEN s.rn = 3 THEN s.emaildate END) email_date3
FROM
(
SELECT m.*, ROW_NUMBER() OVER(ORDER BY id) rn
FROM tblMain
) m
INNER JOIN (
SELECT
email,
emaildate,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY emaildate) rn
FROM tblSub
) s
ON m.id = s.tblmain_id
AND m.rn = 1
AND s.rn <= 3
GROUP BY
m.id,
m.fname,
m.lname,
m.salary

How can I subtract two row's values within same column using sql query in access?

(query access)
This is the table structure:
+-----+--------+--------+
| id | name | sub1 |
+-----+--------+--------+
| 1 | ABC | 6.27% |
| 2 | ABC | 7.47% |
| 3 | PQR | 3.39% |
| 4 | PQR | 2.21% |
+-----+--------+--------+
I want to subtract Sub1
Output should be:
+-----+--------+---------+------------------------------------+
| id | name | sub1 | |
+-----+--------+---------+------------------------------------+
| 1 | ABC | 6.27% | 0 First Rec no need Subtract |
| 2 | ABC | 7.47% | 1.2% <=(7.47-6.27) |
| 3 | PQR | 3.39% | 0 First Rec no need Subtract |
| 4 | PQR | 2.21% | -1.18% <=(2.21-3.39) |
+-----+--------+---------+------------------------------------+
Thank you so much.
If you can guarantee consecutive id values, then the following presents an alternative:
select t.*, nz(t.sub1-u.sub1,0) as sub2
from YourTable t left join YourTable u on t.name = u.name and t.id = u.id+1
Change YourTable to the name of your table.
This is painful, but you can do:
select t.*,
(select top 1 t2.sub1
from t as t2
where t2.name = t.name and t2.id < t.id
order by t2.id desc
) as prev_sub1
from t;
This gives the previous value or NULL for the first row. You can just use - for the subtraction.
An index on (name, id) would help a bit with performance. However, if you can upgrade to a better database, you can then just use lag().

Merging multiple rows according to an order

Suppose there are the following rows
| Id | MachineName | WorkerName | MachineState |
|----------------------------------------------|
| 1 | Alpha | Young | RUNNING |
| 1 | Beta | | STOPPED |
| 1 | Gamma | Foo | READY |
| 1 | Zeta | Zatta | |
| 2 | Guu | Niim | RUNNING |
| 2 | Yuu | Jaam | STOPPED |
| 2 | Nuu | | READY |
| 2 | Faah | Siim | |
| 3 | Iem | | RUNNING |
| 3 | Nyt | Fish | READY |
| 3 | Qwe | Siim | |
We want to merge these rows according to following priority :
STOPPED > RUNNING > READY > (null or empty)
If a row has a value for greatest priority, then value from that row should be used (only if it is not null). If it is null, a value from any other row should be used. The rows should be grouped by id
The correct output for the above input is :
| Id | MachineName | WorkerName | MachineState |
|----------------------------------------------|
| 1 | Beta | Foo | STOPPED |
| 2 | Yuu | Jaam | STOPPED |
| 3 | Iem | Fish | RUNNING |
What would be a good sql query to accomplish this? I tried using joins, but it did not work out.
You can view this as a case of the group-wise maximum problem, provided you can obtain a suitable ordering over your MachineState column—e.g. by using a CASE expression:
SELECT a.Id,
COALESCE(a.MachineName, t.MachineName) MachineName,
COALESCE(a.WorkerName , t.WorkerName ) WorkerName,
a.MachineState
FROM myTable a JOIN (
SELECT Id,
MIN(MachineName) AS MachineName,
MIN(WorkerName ) AS WorkerName,
MAX(CASE MachineState
WHEN 'READY' THEN 1
WHEN 'RUNNING' THEN 2
WHEN 'STOPPED' THEN 3
END) AS MachineState
FROM myTable
GROUP BY Id
) t ON t.Id = a.Id AND t.MachineState = CASE a.MachineState
WHEN 'READY' THEN 1
WHEN 'RUNNING' THEN 2
WHEN 'STOPPED' THEN 3
END
See it on sqlfiddle:
| id | machinename | workername | machinestate |
|----|-------------|------------|--------------|
| 1 | Beta | Foo | STOPPED |
| 2 | Yuu | Jaam | STOPPED |
| 3 | Iem | Fish | RUNNING |
You could save yourself the pain of using CASE if MachineState was an ENUM type column (defined in the appropriate order). It so happens in this case that a simple lexicographic ordering over the string value will yield the same result, but that's a coincidence on which you really shouldn't rely as it's bound to slip under the radar when someone tries to maintain this code in the future.
This is a prioritization query. One method uses variables. Another uses union all . . . this works if the states are not repeated for a given id:
select t.*
from table t
where machinestate = 'STOPPED'
union all
select t.*
from table t
where machinestate = 'RUNNING' and
not exists (select 1 from table t2 where t2.id = t.id and t2.machinestate in ('STOPPED'))
union all
select t.*
from table t
where machinestate = 'READY' and
not exists (select 1 from table t2 where t2.id = t.id and t2.machinestate in ('STOPPED', 'RUNNING'));
change MachineState as enum:
`MachineState` enum('READY','RUNNING','STOPPED') DEFAULT NULL
and sql is simple:
select t.id,state.machinename,state.workername,t.mstate from state,(select id,max(MachineState) mstate from state group by Id) t where t.mstate=state.machinestate and t.id=state.id;

Do I need a recursive CTE to update a table that relies on itself?

I need to apologize for the title. I put a lot of thought into it but didn't get too far.
I have a table that looks like this:
+--------------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------+
| accountid | pricexxxxxid | accountid | pricelevelid | counts |
+--------------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------+
| 36B077D4-E765-4C70-BE18-2ECA871420D3 | 00000000-0000-0000-0000-000000000000 | 36B077D4-E765-4C70-BE18-2ECA871420D3 | F43C47CE-28C6-42E2-8399-92C58ED4BA9D | 1 |
| EBC18CBC-2D2E-44CB-B36A-0ADE9E2BDE9F | 00000000-0000-0000-0000-000000000000 | EBC18CBC-2D2E-44CB-B36A-0ADE9E2BDE9F | 3BEEA9D3-F26B-47E4-88FA-A2AA366980ED | 1 |
| 8DC8D0FC-3138-425A-A922-2F0CAC57E887 | 00000000-0000-0000-0000-000000000000 | 8DC8D0FC-3138-425A-A922-2F0CAC57E887 | F1B8AD5D-B008-4C3F-94A0-AD3F90C777D7 | 1 |
| 8F908A92-1327-4655-BAE4-C890D971A554 | 00000000-0000-0000-0000-000000000000 | 8F908A92-1327-4655-BAE4-C890D971A554 | 2E0EC67E-5F8F-4305-932E-BBF8DF83DBEC | 1 |
| 37221AAC-B885-4002-B7D9-591F8C14D019 | 00000000-0000-0000-0000-000000000000 | 37221AAC-B885-4002-B7D9-591F8C14D019 | F4A2A0CA-FDFF-4C21-AE92-D4583DC18DED | 1 |
| 66F406B4-0D9B-40B8-9A23-119EE74B00B7 | 00000000-0000-0000-0000-000000000000 | 66F406B4-0D9B-40B8-9A23-119EE74B00B7 | 204B8570-CEBA-4C72-9B72-8B9B14AF625E | 2 |
| D0168CE3-479E-439E-967C-4FF0D701291A | 00000000-0000-0000-0000-000000000000 | D0168CE3-479E-439E-967C-4FF0D701291A | 204B8570-CEBA-4C72-9B72-8B9B14AF625E | 2 |
| 57E5F6E5-0A8A-4E54-B793-2F6493DC1EA3 | 00000000-0000-0000-0000-000000000000 | 57E5F6E5-0A8A-4E54-B793-2F6493DC1EA3 | 893F9FD2-43C9-4355-AEFC-08A62BF2B066 | 3 |
+--------------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------+
It is sorted by ascending counts.
I would like to update the pricexxxxids that are all 00000000-0000-0000-0000-000000000000 with their corresponding pricelevelid.
For example for accountid = 36B077D4-E765-4C70-BE18-2ECA871420D3 I would like the pricexxxxid to be F43C47CE-28C6-42E2-8399-92C58ED4BA9D.
After that is done, I would like all the records FOLLOWING this one where accountid = 36B077D4-E765-4C70-BE18-2ECA871420D3 to be deleted.
Another words in result I will end up with a distinct list of accountids with pricexxxxid to be assigned with the corresponding value from pricelevelid.
Thank you so much for your guidance.
for your first case do !
update table
set pricexxxxids=pricelevelid.
if i understand your second case correctly :(delete duplicates/select distinct)?
delete from
(
select *,rn=row_number()over(partition by accountid order by accountid) from table
)x
where rn>1
--select distinct * from table
edited
select * from
(
select *,rn=row_number()over(partition by accountid order by accountid) from table
)x
where x.rn=1
updated
SELECT accountid,pricelevelid FROM
(
(SELECT *,
Row_number() OVER ( partition BY accountid ORDER BY counts, pricelevelid ) AS Recency
FROM table
)x
WHERE x.Recency = 1