Selecting duplicates from SQL server DB based on complex(?) criteria


fileid  custid  dept1  dept2  date1    date2    date3
123     456     2      4      1/1/04   1/1/05   1/1/06
777     456     2      4      NULL     5/30/05  1/1/07
111     456     2      4      12/2/06  NULL     3/3/07
200     456     2      6      1/1/04   2/1/04   3/1/04
444     456     2      8      2/1/07   4/1/07   6/1/07
500     456     2      8      3/1/07   3/15/07  4/2/07
I am trying to write some SQL that would pull the first 3 records above and display them as a 'set', based on the fact that custid, dept1, and dept2 are the same and also that the dates 'overlap', i.e., any of the dates in fileid 123 are earlier than the earliest date in fileid 777 and fileid 111. It wouldn't pull the 4th record because dept2 is different. And it would pull records 5 and 6 and display them as a separate set, because custid, dept1, and dept2 match and fileid 500's dates are 'inside' fileid 444's dates. I have been pounding my head against a wall with this one. Can anyone help?
Here is an example of multiple rows with matching custid, dept1 and dept2 that are not all in the same set:
fileid  custid  dept1  dept2  date1    date2    date3
123     456     2      4      1/1/04   1/1/05   1/1/06
777     456     2      4      NULL     5/30/05  1/1/07
111     456     2      4      12/2/06  NULL     3/3/07
666     456     2      4      1/1/08   3/1/08   5/1/08
fileid 666 is not in the set because its dates don't overlap with any of the others.

I think this query gives you what you want, though I have to say it is not the cleanest answer. I reach the result in 3 queries; it can be done in one, but that would increase the complexity. The first query finds the duplicates, the second lists them, and the third applies the rules you want.
-- 1) Find the custID/dept1/dept2 combinations that occur more than once.
SELECT * INTO #temp FROM (
    SELECT custID, dept1, dept2
    FROM #table
    GROUP BY custID, dept1, dept2
    HAVING COUNT(custID) > 1
) AS p
-- 2) List the rows for those combinations, numbered within each combination.
SELECT * INTO #temp2 FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY custID, dept1, dept2 ORDER BY custID) AS RN, *
    FROM #table
    WHERE custID IN (SELECT custID FROM #temp)
      AND dept1 IN (SELECT dept1 FROM #temp)
      AND dept2 IN (SELECT dept2 FROM #temp)
) AS x
-- 3) Keep both rows of every consecutive pair (RN, RN + 1) whose date ranges overlap.
SELECT * FROM #table
WHERE fileID IN (
    -- the earlier row of each overlapping pair
    SELECT t1.fileID FROM #temp2 t1
    INNER JOIN #temp2 t2 ON t1.RN = t2.RN - 1 AND (
        COALESCE(t2.date1, t2.date2) BETWEEN COALESCE(t1.date1, t1.date2) AND COALESCE(t1.date3, t1.date2)
        OR
        COALESCE(t2.date3, t2.date2) BETWEEN COALESCE(t1.date1, t1.date2) AND COALESCE(t1.date3, t1.date2)
    )
    AND t2.custID = t1.custID AND t2.dept1 = t1.dept1 AND t2.dept2 = t1.dept2)
OR fileID IN (
    -- the later row of each overlapping pair
    SELECT t2.fileID FROM #temp2 t1
    INNER JOIN #temp2 t2 ON t1.RN = t2.RN - 1 AND (
        COALESCE(t2.date1, t2.date2) BETWEEN COALESCE(t1.date1, t1.date2) AND COALESCE(t1.date3, t1.date2)
        OR
        COALESCE(t2.date3, t2.date2) BETWEEN COALESCE(t1.date1, t1.date2) AND COALESCE(t1.date3, t1.date2)
    )
    AND t2.custID = t1.custID AND t2.dept1 = t1.dept1 AND t2.dept2 = t1.dept2)
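For comparison, here is a minimal sketch of the grouping rule outside the database, in Python. It is not the method used in the answer above. It assumes each row's date range runs from its earliest to its latest non-NULL date, and that rows belong to one set when their ranges chain together (123 overlaps 777, and 777 overlaps 111). The function name `overlapping_sets` is made up for this illustration.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (fileid, custid, dept1, dept2, date1, date2, date3).
ROWS = [
    (123, 456, 2, 4, date(2004, 1, 1), date(2005, 1, 1), date(2006, 1, 1)),
    (777, 456, 2, 4, None, date(2005, 5, 30), date(2007, 1, 1)),
    (111, 456, 2, 4, date(2006, 12, 2), None, date(2007, 3, 3)),
    (200, 456, 2, 6, date(2004, 1, 1), date(2004, 2, 1), date(2004, 3, 1)),
    (444, 456, 2, 8, date(2007, 2, 1), date(2007, 4, 1), date(2007, 6, 1)),
    (500, 456, 2, 8, date(2007, 3, 1), date(2007, 3, 15), date(2007, 4, 2)),
    (666, 456, 2, 4, date(2008, 1, 1), date(2008, 3, 1), date(2008, 5, 1)),
]


def overlapping_sets(rows):
    """Return lists of fileids whose keys match and whose date ranges chain together."""
    by_key = defaultdict(list)
    for fileid, custid, dept1, dept2, *dates in rows:
        known = [d for d in dates if d is not None]
        if known:  # assumption: a row with no dates at all cannot overlap anything
            by_key[(custid, dept1, dept2)].append((min(known), max(known), fileid))

    result = []
    for spans in by_key.values():
        spans.sort()  # earliest start first
        current, current_end = [], None
        for start, end, fileid in spans:
            if current and start <= current_end:
                # this range starts before the running set ends: same set
                current.append(fileid)
                current_end = max(current_end, end)
            else:
                if len(current) > 1:  # only report sets with at least two rows
                    result.append(current)
                current, current_end = [fileid], end
        if len(current) > 1:
            result.append(current)
    return result


print(overlapping_sets(ROWS))  # [[123, 777, 111], [444, 500]]
```

With the sample data, fileid 200 is left out because its dept2 differs, and fileid 666 is left out because its range starts after the first set ends.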

Related

Sql - Getting Sum on the join table

I just want to get the rows whose reference does not tally with the sum in Table 2. When I run my query below, it gives me a wrong sum, which gets doubled even if I group by CusID, RefNo.
Table 1
RefNo  CusID  TotalAmount
1      1001   50
2      1001   30
3      1002   40
Table 2
RefNo  CusID  Particular  Amount
1      1001   Paper       30
1      1001   Pencil      30
2      1001   Ball        15
2      1001   Rubber      20
3      1002   Laptop      50
select * from Table1 a
INNER JOIN (Select CusID, RefNo, SUM(Amount) as CorrectTotal
            from Table2
            group by CusID, RefNo
) b
ON b.CusID = a.CusID AND b.RefNo = a.RefNo
where a.TotalAmount != b.CorrectTotal
Expected Result

If you do it with FULL JOIN and with GROUP BY, you will also get rows where there is no record in the other table.
SELECT COALESCE(a.RefNo, b.RefNo) AS RefNo
, COALESCE(a.CusID, b.CusID) AS CusID
, a.TotalAmount
, SUM(b.Amount) AS CorrectTotal
FROM table1 a
FULL JOIN table2 b ON a.RefNo = b.RefNo
AND a.CusID = b.CusID
GROUP BY COALESCE(a.RefNo, b.RefNo)
, COALESCE(a.CusID, b.CusID)
, a.TotalAmount
ORDER BY 1, 2
Output
RefNo  CusID  TotalAmount  CorrectTotal
1      1001   50           60
2      1001   30           35
3      1002   40           50
8      888    88           (null)
9      999    (null)       99

The other answer will work, but if you don't want to mess with a GROUP BY on the whole query you can also use an APPLY to do this:
SELECT a.*, c.CorrectAmount
FROM Table1 a
OUTER APPLY (
SELECT SUM(Amount) AS CorrectAmount
FROM Table2 b
WHERE b.CusID = a.CusID AND b.RefNo = a.RefNo
) c
WHERE a.TotalAmount <> c.CorrectAmount
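To check the idea against the sample data from the question, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (it has no OUTER APPLY), so the sketch sums Table2 per RefNo/CusID in a derived table first and then joins the result to Table1, which keeps each Table1 row from being multiplied by its detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (RefNo INTEGER, CusID INTEGER, TotalAmount INTEGER);
    CREATE TABLE Table2 (RefNo INTEGER, CusID INTEGER, Particular TEXT, Amount INTEGER);
    INSERT INTO Table1 VALUES (1, 1001, 50), (2, 1001, 30), (3, 1002, 40);
    INSERT INTO Table2 VALUES
        (1, 1001, 'Paper', 30), (1, 1001, 'Pencil', 30),
        (2, 1001, 'Ball', 15), (2, 1001, 'Rubber', 20),
        (3, 1002, 'Laptop', 50);
""")

# Aggregate the detail rows first, then compare one total per reference.
mismatches = conn.execute("""
    SELECT a.RefNo, a.CusID, a.TotalAmount, b.CorrectTotal
    FROM Table1 AS a
    JOIN (
        SELECT RefNo, CusID, SUM(Amount) AS CorrectTotal
        FROM Table2
        GROUP BY RefNo, CusID
    ) AS b ON b.RefNo = a.RefNo AND b.CusID = a.CusID
    WHERE a.TotalAmount <> b.CorrectTotal
    ORDER BY a.RefNo
""").fetchall()

print(mismatches)
# [(1, 1001, 50, 60), (2, 1001, 30, 35), (3, 1002, 40, 50)]
```

With this sample data every reference is out of tally, so all three rows come back.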

Fetching latest records of individual by joining 2 tables

I have to fetch the latest record of each student, which is derived by joining 2 tables:
table 1:        table 2:
id  name        id  marks  EXAM attended time    status
--------        ----------------------------------------------
1   ABC         1   90     2019-04-05 06:00:00   PASS
2   DEF         1   25     2018-06-05 08:00:00   FAIL
                2   45     2019-03-05 06:00:00   FAIL
                2   22     2019-01-05 09:00:00   FAIL
On joining both tables I got this:
# name marks EXAM ATTENDED TIME status
------------------------------------------------------
1 ABC 90 2019-04-05 06:00:00 PASS
2 ABC 25 2018-06-05 08:00:00 FAIL
3 DEF 45 2019-03-05 06:00:00 FAIL
4 DEF 22 2019-01-05 09:00:00 FAIL
5 DEF 55 2019-04-05 09:00:00 PASS
6 DEF 66 2019-05-05 09:00:00 PASS
7 DEF 99 2018-05-05 09:00:00 PASS
I want to fetch the latest result on datetime and name.
The output I need is:
id name marks EXAM ATTENDED TIME status
------------------------------------------------------
1 ABC 90 2019-04-05 06:00:00 PASS
6 DEF 66 2019-05-05 09:00:00 PASS

You can try the following, using a correlated subquery:
select * from table1 a1
inner join table2 a on a1.id=a.id
where exam_attended_time in (select max(exam_attended_time) from table2 b where a.id=b.id)
Or you can use row_number(), if your database supports it:
select * from
(
select a1.name,a.*,row_number() over(partition by a.id order by exam_attended_time desc) rn from table1 a1
inner join table2 a on a1.id=a.id
)X where rn=1

You could use a window function (ROW_NUMBER).
SELECT
x.id
, x.NAME
, x.marks
, x.ExamAttendTime
, x.status
FROM
(
SELECT
t1.id
, t1.NAME
, t2.marks
, t2.ExamAttendTime
, t2.status
, ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t2.ExamAttendTime DESC) AS ROWNUMBER
FROM
dbo.Table1 t1
JOIN dbo.Table2 t2 ON t2.id = t1.id
) x
WHERE
x.ROWNUMBER = 1

I don't know how you fetched records such as marks '99' and '66' and EXAM attended time '2019-05-05 09:00:00', which are not in the tables themselves.
Still, this might help you get the correct data.
select a.id,a.name,b.marks,b.[EXAM attended time],b.[status] from table1 a
join table2 b on a.id=b.id where b.[EXAM attended time] =
(select max(c.[EXAM attended time]) from table2 c where c.id=b.id)

If you are using SQL Server, you can use TOP as below to fetch the latest records:
SELECT A.id,
A.name,
B.marks,
B.EXAM_attended_time,
B.Status
FROM table1 A
OUTER APPLY (SELECT TOP 1 *
FROM table2 B WHERE B.id = A.id
ORDER BY B.EXAM_attended_time DESC) B
WHERE B.ID = A.id
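The correlated-subquery approach can be tried quickly with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for the real database, the column name `exam_attended_time` is an assumption, and the data is limited to the rows shown in the two source tables (so DEF's latest row here is the 45 from 2019-03-05, not the 66 that appears only in the joined listing).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, marks INTEGER, exam_attended_time TEXT, status TEXT);
    INSERT INTO table1 VALUES (1, 'ABC'), (2, 'DEF');
    INSERT INTO table2 VALUES
        (1, 90, '2019-04-05 06:00:00', 'PASS'),
        (1, 25, '2018-06-05 08:00:00', 'FAIL'),
        (2, 45, '2019-03-05 06:00:00', 'FAIL'),
        (2, 22, '2019-01-05 09:00:00', 'FAIL');
""")

# For each exam row, keep it only if its time is the maximum for that student.
# ISO-formatted timestamps compare correctly as text.
latest = conn.execute("""
    SELECT t1.id, t1.name, t2.marks, t2.exam_attended_time, t2.status
    FROM table1 AS t1
    JOIN table2 AS t2 ON t2.id = t1.id
    WHERE t2.exam_attended_time = (
        SELECT MAX(x.exam_attended_time) FROM table2 AS x WHERE x.id = t2.id
    )
    ORDER BY t1.id
""").fetchall()

for row in latest:
    print(row)
```

If two exams for the same student share the same latest timestamp, this form returns both rows, whereas the ROW_NUMBER and TOP 1 versions return just one of them.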

Oracle SQL nested or inner join when you need to compare the same table but different rows with unique ID values

I'm having trouble writing a query in Oracle. I have a table that contains values, for example:
ID quantity partID
123 50 10
100 20 10
100 30 11
123 null 8
456 null 100
789 25 123
456 50 9
I want to get all rows that have the same ID and whose quantities are 50 and null (exactly the pair of 50 and null, nothing else). For the given example I would like to get:
ID quantity partID
123 50 10
123 null 8
456 50 9
456 null 100
I tried inner join but it doesn't provide the exact output as expected.

You may try :
select ID, quantity, partID
from tab
where ID in
(
select ID
from tab
where nvl(quantity,50)=50
group by ID
having count(distinct nvl(quantity,0) )>1
);
ID QUANTITY PARTID
123 50 10
123 (null) 8
456 (null) 100
456 50 9
P.S. You may get the same results with the HAVING clause commented out, but in that case an ID can be returned even though one of 50 or null does not exist among its quantity values.

You can use exists:
select t.*
from t
where (t.quantity = 50 and
exists (select 1 from t t2 where t2.id = t.id and t2.quantity is null)
) or
(t.quantity is null and
exists (select 1 from t t2 where t2.id = t.id and t2.quantity = 50)
) ;
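Another way to express the rule is conditional counting per ID. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, with the table name `tab` taken from the first answer. It assumes the strict reading of the question: an ID qualifies only if it has exactly two rows, one with quantity 50 and one with a null quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (id INTEGER, quantity INTEGER, partid INTEGER);
    INSERT INTO tab VALUES
        (123, 50, 10), (100, 20, 10), (100, 30, 11), (123, NULL, 8),
        (456, NULL, 100), (789, 25, 123), (456, 50, 9);
""")

# An ID qualifies when it has two rows in total: one with 50 and one with NULL.
pairs = conn.execute("""
    SELECT id, quantity, partid
    FROM tab
    WHERE id IN (
        SELECT id
        FROM tab
        GROUP BY id
        HAVING COUNT(*) = 2
           AND SUM(CASE WHEN quantity = 50 THEN 1 ELSE 0 END) = 1
           AND SUM(CASE WHEN quantity IS NULL THEN 1 ELSE 0 END) = 1
    )
    ORDER BY id, partid
""").fetchall()

for row in pairs:
    print(row)
```

Dropping the `COUNT(*) = 2` condition relaxes the rule so that an ID with additional rows (say 50, null and 20) would still qualify.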

Get most recent records where one field is not null

I'm looking to narrow down my database to have only the most recent records. The most recent records need to have a value in a specific field.
ID  Account_nbr  Date       Name
1   622          7/10/2018  Stu
2   622          7/24/2018
3   151          7/18/2018  Taylor
4   151          7/24/2018  Taylor
This is an example of the database.
I want the code to do this:
ID  Account_nbr  Date       Name
1   622          7/10/2018  Stu
4   151          7/24/2018  Taylor
I have tried the following code:
Select m.*
FROM [table] m
INNER JOIN
(
SELECT last(Date) as LatestDate
,account_nbr
FROM [table]
WHERE Name IS NOT NULL
GROUP BY account_nbr
) b
ON m.Date = b.LatestDate
AND m.account_nbr = b.account_nbr
The output only included the most recent date and did not take into account records that were null in the name field.

I would do:
select t.*
from [table] as t
where t.name is not null and
t.date = (select max(t1.date)
from [table] as t1
where t1.account_nbr = t.account_nbr and
t1.name is not null
);

Try this:
Select
m.*
From
[table] As m
Where
m.[Date] In
(Select Max([Date])
From [table] As T
Where T.[Name] Is Not Null
And T.account_nbr = m.account_nbr)
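The placement of the Name filter is what decides the result, which the following sketch tries to show with Python's built-in sqlite3 module. SQLite is a stand-in for the real database; the table name `records` and column name `rec_date` are made up for the illustration, and the dates are stored in ISO format so that MAX compares them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (id INTEGER, account_nbr INTEGER, rec_date TEXT, name TEXT);
    INSERT INTO records VALUES
        (1, 622, '2018-07-10', 'Stu'),
        (2, 622, '2018-07-24', NULL),
        (3, 151, '2018-07-18', 'Taylor'),
        (4, 151, '2018-07-24', 'Taylor');
""")

# Filter inside the subquery: latest date among rows that have a name.
filtered_inside = conn.execute("""
    SELECT m.id, m.account_nbr, m.rec_date, m.name
    FROM records AS m
    WHERE m.name IS NOT NULL
      AND m.rec_date = (
          SELECT MAX(t.rec_date) FROM records AS t
          WHERE t.name IS NOT NULL AND t.account_nbr = m.account_nbr
      )
    ORDER BY m.id
""").fetchall()

# Filter only outside: the latest date overall must itself have a name.
filtered_outside = conn.execute("""
    SELECT m.id, m.account_nbr, m.rec_date, m.name
    FROM records AS m
    WHERE m.name IS NOT NULL
      AND m.rec_date = (
          SELECT MAX(t.rec_date) FROM records AS t
          WHERE t.account_nbr = m.account_nbr
      )
    ORDER BY m.id
""").fetchall()

print(filtered_inside)   # rows 1 and 4
print(filtered_outside)  # row 4 only
```

Account 622 disappears in the second query because its newest row has no name, which matches the behaviour described in the question.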

Update multiple rows with different values in Oracle

I am trying to update multiple rows using an inner view in oracle.
The select statement for updating this view is:
select count(distinct a.numCount) as numCount, a.accNum as accNum ,
s.unitNum as unitNum
from tableA a,tableS s where a.accNum is not null and s.fk_id=
(select id from tableD where sid=a.accNum )
group by a.accNum ,s.unitNum ;
Update statement that I am trying is below:
update
(select count(distinct a.numCount) as numCount, a.accNum as accNum ,
s.unitNum as unitNum
from tableA a,tableS s where a.accNum is not null and s.fk_id=
(select id from tableD where sid=a.accNum )
group by a.accNum ,s.unitNum ) k
set k.unitNum=k.numCount;
I am trying to update unitNum with the value of numCount.
The above query does not work when used as a view.
Is there another way to update this in Oracle?
Please suggest.
Structure of the tables are as below:
TableA
accNum numCount
-----------------------
111 1
222 5
333 2
111 1
111 1
222 5
222 2
TableS
fk_id unitNum
-----------------------
123 0
768 0
734 0
TableD
ID sid
-----------------------
123 222
768 111
734 333
Output should be as below:
TableS
fk_id unitNum
-----------------------
123 3
768 3
734 1
Please suggest

update tableS s
set unitNum=
(select count(distinct a.numCount) as numCount
from tableA a, tableD d
where s.fk_id=d.id and d.sid=a.accNum
);
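Here is a quick check of the correlated UPDATE against the sample tables, sketched with Python's built-in sqlite3 module as a stand-in for Oracle. The helper `run_update` is made up for this illustration. It runs the update twice, once with COUNT(DISTINCT a.numCount) as in the statement above and once with COUNT(*), because with the sample data only the plain row count reproduces the output listed in the question (123 -> 3, 768 -> 3, 734 -> 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (accNum INTEGER, numCount INTEGER);
    CREATE TABLE tableS (fk_id INTEGER, unitNum INTEGER);
    CREATE TABLE tableD (id INTEGER, sid INTEGER);
    INSERT INTO tableA VALUES (111, 1), (222, 5), (333, 2), (111, 1), (111, 1), (222, 5), (222, 2);
    INSERT INTO tableS VALUES (123, 0), (768, 0), (734, 0);
    INSERT INTO tableD VALUES (123, 222), (768, 111), (734, 333);
""")


def run_update(aggregate):
    """Reset unitNum, apply the correlated UPDATE with the given aggregate, return tableS."""
    conn.execute("UPDATE tableS SET unitNum = 0")
    conn.execute(f"""
        UPDATE tableS
        SET unitNum = (
            SELECT {aggregate}
            FROM tableA AS a
            JOIN tableD AS d ON d.sid = a.accNum
            WHERE d.id = tableS.fk_id
        )
    """)
    return conn.execute("SELECT fk_id, unitNum FROM tableS ORDER BY fk_id").fetchall()


distinct_counts = run_update("COUNT(DISTINCT a.numCount)")
row_counts = run_update("COUNT(*)")

print(distinct_counts)  # [(123, 2), (734, 1), (768, 1)]
print(row_counts)       # [(123, 3), (734, 1), (768, 3)]
```

Which aggregate is right depends on what unitNum is meant to hold; the sketch only shows how the two differ on the posted data.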
