SQL to remove specific rows from select

I've got a table with these columns:
UserA UserB UserBB UserAA
For example:
1 2 2 1
1 3 3 1
2 1 1 2
2 4 4 2
2 5 5 2
5 2 2 5
I want to remove the mirrored rows (duplicates) so that only the following rows remain:
1 2 2 1
1 3 3 1
2 4 4 2
2 5 5 2
2 1 1 2 -> deleted because there is already 1 2 2 1
5 2 2 5 -> deleted because there is already 2 5 5 2
How do I write such a query?
Thanks for your help.

-- Find Duplicate Rows
SELECT MAX(ID) as ID, CustName, Pincode FROM #Customers
GROUP BY CustName, Pincode
HAVING COUNT(*) > 1
-- Delete Duplicate Rows
DELETE FROM #Customers
WHERE ID IN
( SELECT MAX(ID) FROM #Customers
GROUP BY CustName, Pincode
HAVING COUNT(*) > 1)
Taken from MSDN:
http://archive.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=DuplicateRows
Let me know if you are unable to figure it out from that code.
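As a rough sketch of the same GROUP BY idea, the snippet below runs against an in-memory SQLite database from Python. The table name customers, the sample rows, and the choice to keep the lowest ID per group are assumptions made for illustration and are not part of the MSDN example. Deleting only MAX(ID) removes one copy per group per run, whereas keeping MIN(ID) removes every extra copy in one pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (ID INTEGER PRIMARY KEY, CustName TEXT, Pincode TEXT)"
)
# Hypothetical sample data: three copies of one customer, one unique customer.
conn.executemany(
    "INSERT INTO customers (ID, CustName, Pincode) VALUES (?, ?, ?)",
    [(1, "Ann", "100"), (2, "Ann", "100"), (3, "Ann", "100"), (4, "Bob", "200")],
)

# Keep the lowest ID in each (CustName, Pincode) group and delete the rest.
conn.execute(
    """
    DELETE FROM customers
    WHERE ID NOT IN (
        SELECT MIN(ID) FROM customers GROUP BY CustName, Pincode
    )
    """
)

deduped = conn.execute(
    "SELECT ID, CustName, Pincode FROM customers ORDER BY ID"
).fetchall()
print(deduped)
```

As with any DELETE, it is worth running the inner SELECT on its own first to confirm which rows would survive.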
This may be a little closer to your needs:
DELETE FROM TABLE
WHERE USERA IN ( SELECT MAX(USERA) FROM TABLE
GROUP BY USERA, USERB, USERBB, USERAA HAVING COUNT(*) > 1)

The query below also covers cases where UserA and UserB match between the two rows but UserAA and UserBB are swapped, and vice versa. Your question is a bit unclear about what exactly constitutes a duplicate, but hopefully this at least points you in the right direction.
I would turn this into a SELECT statement first, though, and make sure that it returns the rows that you think should be deleted, and only those rows.
DELETE T1
FROM
My_Table T1
INNER JOIN My_Table T2 ON
(
T2.UserA = T1.UserA AND
T2.UserB = T1.UserB AND
T2.UserAA = T1.UserBB AND
T2.UserBB = T1.UserAA AND
T2.UserAA < T2.UserBB
) OR
(
T2.UserA = T1.UserB AND
T2.UserB = T1.UserA AND
T2.UserAA = T1.UserAA AND
T2.UserBB = T1.UserBB AND
T2.UserA < T2.UserB
) OR
(
T2.UserA = T1.UserB AND
T2.UserB = T1.UserA AND
T2.UserAA = T1.UserBB AND
T2.UserBB = T1.UserAA AND
T2.UserA < T2.UserB
)

It was enough just to add:
WHERE UserA < UserB
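For anyone who wants to try this on the sample data, here is a minimal sketch using Python's sqlite3 module. The table name pairs is hypothetical, and the EXISTS check is an extra safeguard I am assuming is wanted: a row is deleted only when UserA > UserB and its mirrored row is actually present, so a pair that appears in just one orientation is never lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (UserA INT, UserB INT, UserBB INT, UserAA INT)")
# Sample rows from the question.
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?, ?)",
    [(1, 2, 2, 1), (1, 3, 3, 1), (2, 1, 1, 2),
     (2, 4, 4, 2), (2, 5, 5, 2), (5, 2, 2, 5)],
)

# Delete a row only if it is the "larger first" orientation of a pair
# and the mirrored row (UserA and UserB swapped) exists.
conn.execute(
    """
    DELETE FROM pairs
    WHERE UserA > UserB
      AND EXISTS (
          SELECT 1 FROM pairs AS m
          WHERE m.UserA = pairs.UserB AND m.UserB = pairs.UserA
      )
    """
)

remaining = conn.execute(
    "SELECT UserA, UserB, UserBB, UserAA FROM pairs ORDER BY UserA, UserB"
).fetchall()
print(remaining)
```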

Related

How to check the count of each values repeating in a row

I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
... ...
(and so on)
Those are my two tables.
I want to count how often each user has each status_id.
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate counts, but without indicating which status_id each count represents.
This is called a pivot, and it works in two steps:
1. Extract the data for each specific field value using a CASE expression.
2. Aggregate the data per user, so that every field value lands on the same record for each user.
SELECT Username,
       -- ELSE 0 makes a user with no matching rows show 0 instead of NULL
       SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
       SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
       SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
FROM t2
INNER JOIN t1
        ON t2.user_id = t1.ID
GROUP BY Username
ORDER BY Username
Note: This solution assumes that there are exactly 3 status_id values. If you need to generalize to an arbitrary number of status IDs, you would need a dynamic query. In any case, it's better to avoid dynamic queries if you can.
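To see the conditional-aggregation pivot run on the rows listed in the question, here is a small sketch using Python's sqlite3 module. The table names t1 and t2 follow the answer; using a LEFT JOIN from t1 and ELSE 0 in each CASE are my assumptions, chosen so that users with no status rows would still appear with zero counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (ID INTEGER PRIMARY KEY, Username TEXT)")
conn.execute("CREATE TABLE t2 (user_id INTEGER, status_id INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, "Dan"), (2, "Eli"), (3, "Sean"), (4, "John")],
)
# Only the status rows that are explicitly listed in the question.
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?)",
    [(1, 2), (1, 3), (4, 1), (3, 2), (2, 3), (1, 1), (3, 3), (3, 3), (3, 3)],
)

# One SUM(CASE ...) column per status_id; GROUP BY collapses to one row per user.
pivot = conn.execute(
    """
    SELECT t1.Username,
           SUM(CASE WHEN t2.status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
           SUM(CASE WHEN t2.status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
           SUM(CASE WHEN t2.status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
    FROM t1
    LEFT JOIN t2 ON t2.user_id = t1.ID
    GROUP BY t1.ID, t1.Username
    ORDER BY t1.ID
    """
).fetchall()

for row in pivot:
    print(row)
```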

Recursive query with CTE

I need some help with a query.
I already have a CTE with the following data:
ApplicationID  CandidateId  JobId  Row
1              1            1      1
2              1            2      2
3              1            3      3
4              2            1      1
5              2            2      2
6              2            5      3
7              3            2      1
8              3            6      2
9              3            3      3
I need to find one job per candidate such that no job is selected more than once in the result.
I expect the following data from the query (for each candidate, select the first available JobId that has not been taken by a previous candidate):
ApplicationID  CandidateId  JobId  Row
1              1            1      1
5              2            2      2
8              3            6      2
I have never worked with recursive CTEs. Having read about them, I honestly don't fully understand how they can be applied in my case, so I would appreciate some help.
The following query returns the expected result.
WITH CTE AS
(
SELECT TOP 1 *,ROW_NUMBER() OVER(ORDER BY ApplicationID) N,
CONVERT(varchar(max), CONCAT(',',JobId,',')) Jobs
FROM ApplicationCandidateCTE
ORDER BY ApplicationID
UNION ALL
SELECT a.*,ROW_NUMBER() OVER(ORDER BY a.ApplicationID),
CONCAT(Jobs,a.JobId,',') Jobs
FROM ApplicationCandidateCTE a JOIN CTE b
ON a.ApplicationID > b.ApplicationID AND
a.CandidateId > b.CandidateId AND
CHARINDEX(CONCAT(',',a.JobId,','), b.Jobs)=0 AND
b.N = 1
)
SELECT * FROM CTE WHERE N = 1;
However, I have the following concerns:
The recursive CTE may extract too many rows.
The concatenated JobId may exceed varchar(max).
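If it helps to sanity-check whatever query you end up with, the selection rule from the question (walk the candidates in order and take each one's first application whose JobId is still free) can be sketched procedurally in Python. This is a plain loop for cross-checking, not a recursive CTE, and the function name pick_distinct_jobs is hypothetical.

```python
from itertools import groupby

# (ApplicationID, CandidateId, JobId, Row) rows from the question's CTE.
applications = [
    (1, 1, 1, 1), (2, 1, 2, 2), (3, 1, 3, 3),
    (4, 2, 1, 1), (5, 2, 2, 2), (6, 2, 5, 3),
    (7, 3, 2, 1), (8, 3, 6, 2), (9, 3, 3, 3),
]


def pick_distinct_jobs(rows):
    """For each candidate in ascending CandidateId order, keep the
    lowest-Row application whose JobId has not been taken yet."""
    taken = set()
    picked = []
    ordered = sorted(rows, key=lambda r: (r[1], r[3]))
    for _candidate, group in groupby(ordered, key=lambda r: r[1]):
        for row in group:
            if row[2] not in taken:
                taken.add(row[2])
                picked.append(row)
                break  # one job per candidate
    return picked


result = pick_distinct_jobs(applications)
print(result)
```

A candidate whose jobs are all taken simply gets no row here; whether that is the desired behavior is something the question does not say.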

Get max record for each group of records, link multiple tables

I want to find the row with the maximum timestamp (ob.create_ts) for each market (ob.marketid), joining tables obe (ob.orderbookid = obe.orderbookid) and market (ob.marketid = m.marketid). There are a number of posted solutions like this for a single table, but when I join multiple tables I get redundant results. Sample tables and desired results are below:
table: ob
orderbookid  marketid  create_ts
1            1         1664635255298
2            1         1664635255299
3            1         1664635255300
4            2         1664635255301
5            2         1664635255302
6            2         1664635255303
table: obe
orderbookentryid  orderbookid  entryname
1                 1            'entry-1'
2                 1            'entry-2'
3                 1            'entry-3'
4                 2            'entry-4'
5                 2            'entry-5'
6                 3            'entry-6'
7                 3            'entry-7'
8                 4            'entry-8'
9                 5            'entry-9'
10                6            'entry-10'
table: m
marketid  marketname
1         'market-1'
2         'market-2'
desired results
ob.orderbookid  ob.marketid  obe.orderbookentryid  obe.entryname  m.marketname
3               1            6                     'entry-6'      'market-1'
3               1            7                     'entry-7'      'market-1'
6               2            10                    'entry-10'     'market-2'
Use ROW_NUMBER() to get a properly filtered ob table. Then JOIN the other tables onto that!
WITH
  ob_filtered AS (
    SELECT
      orderbookid,
      marketid
    FROM
    (
      SELECT
        *,
        ROW_NUMBER() OVER (
          PARTITION BY
            marketid
          ORDER BY
            create_ts DESC
        ) AS create_ts_rownumber
      FROM
        ob
    ) ob_with_rownumber
    WHERE
      create_ts_rownumber = 1
  )
SELECT
  ob_filtered.orderbookid,
  ob_filtered.marketid,
  obe.orderbookentryid,
  obe.entryname,
  m.marketname
FROM
  ob_filtered
JOIN m
  ON m.marketid = ob_filtered.marketid
JOIN obe
  ON ob_filtered.orderbookid = obe.orderbookid
;
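The filter-then-join approach can be checked against the sample tables with a short Python sqlite3 sketch. It assumes a SQLite build with window-function support (3.25 or later); the alias ranked, the column name rn, and the final ORDER BY are my additions for a stable, readable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ob  (orderbookid INTEGER, marketid INTEGER, create_ts INTEGER);
    CREATE TABLE obe (orderbookentryid INTEGER, orderbookid INTEGER, entryname TEXT);
    CREATE TABLE m   (marketid INTEGER, marketname TEXT);
    """
)
conn.executemany("INSERT INTO ob VALUES (?, ?, ?)", [
    (1, 1, 1664635255298), (2, 1, 1664635255299), (3, 1, 1664635255300),
    (4, 2, 1664635255301), (5, 2, 1664635255302), (6, 2, 1664635255303),
])
conn.executemany("INSERT INTO obe VALUES (?, ?, ?)", [
    (1, 1, "entry-1"), (2, 1, "entry-2"), (3, 1, "entry-3"), (4, 2, "entry-4"),
    (5, 2, "entry-5"), (6, 3, "entry-6"), (7, 3, "entry-7"), (8, 4, "entry-8"),
    (9, 5, "entry-9"), (10, 6, "entry-10"),
])
conn.executemany("INSERT INTO m VALUES (?, ?)", [(1, "market-1"), (2, "market-2")])

# Step 1: keep only the newest ob row per market. Step 2: join the rest onto it.
latest = conn.execute(
    """
    WITH ob_filtered AS (
        SELECT orderbookid, marketid
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY marketid ORDER BY create_ts DESC
                   ) AS rn
            FROM ob
        ) AS ranked
        WHERE rn = 1
    )
    SELECT f.orderbookid, f.marketid, obe.orderbookentryid, obe.entryname, m.marketname
    FROM ob_filtered AS f
    JOIN m   ON m.marketid = f.marketid
    JOIN obe ON obe.orderbookid = f.orderbookid
    ORDER BY obe.orderbookentryid
    """
).fetchall()

for row in latest:
    print(row)
```

Filtering ob before joining is what avoids the redundant rows: the one-to-many join to obe then only multiplies the single newest order book per market.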

SQL Search for missing record, then insert value

Below is a very oversimplified problem I am trying to solve
I have the following tables:
**quiz**
id title
--------------
1 first
2 second
3 third
4 fourth
5 fifth
**quiz_status**
id status user_id quiz_id
-------------------------------
1 0 1 1
2 0 1 2
3 0 1 3
If I run the following:
select *
from quiz as q
left join quiz_status as qs
ON q.id = qs.quiz_id
and qs.user_id = 1
I'd get:
id title id status user_id quiz_id
-------------------------------------------
1 first 1 0 1 1
2 second 2 0 1 2
3 third 3 0 1 3
4 fourth null null null null
5 fifth null null null null
I would like to insert rows into the quiz_status table wherever they are missing (null in the result above),
so the final outcome would be:
id title id status user_id quiz_id
-------------------------------------------
1 first 1 0 1 1
2 second 2 0 1 2
3 third 3 0 1 3
4 fourth 4 0 1 4
5 fifth 5 0 1 5
What would be the insert statement for that?
Consider the insert ... select syntax:
insert into quiz_status(status, user_id, quiz_id)
select 0, u.user_id, q.id
from (select distinct user_id from quiz_status) u
cross join quiz q
left join quiz_status qz on q.id = qz.quiz_id and u.user_id = qz.user_id
where qz.quiz_id is null
This works by generating all combinations of users and quizzes, and then left joining the status table to keep only the missing records. In real life, you would likely have a users table that you can use in place of the select distinct subquery.
If you need just one user, it's simpler:
insert into quiz_status(status, user_id, quiz_id)
select 0, 1, q.id
from quiz q
left join quiz_status qz on q.id = qz.quiz_id and qz.user_id = 1
where qz.quiz_id is null
Note: presumably, id is a serial (auto-generated) column, so I left it out of the inserts.
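Here is a minimal sketch of the multi-user insert ... select run against the question's data with Python's sqlite3 module. Declaring id as INTEGER PRIMARY KEY (so SQLite assigns it automatically) is an assumption standing in for a serial column. Running the statement twice is just to illustrate that the second run finds nothing left to insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quiz (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE quiz_status ("
    "id INTEGER PRIMARY KEY, status INTEGER, user_id INTEGER, quiz_id INTEGER)"
)
conn.executemany(
    "INSERT INTO quiz (id, title) VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third"), (4, "fourth"), (5, "fifth")],
)
conn.executemany(
    "INSERT INTO quiz_status (status, user_id, quiz_id) VALUES (?, ?, ?)",
    [(0, 1, 1), (0, 1, 2), (0, 1, 3)],
)

# Every (user, quiz) combination that has no status row yet gets one.
INSERT_MISSING = """
INSERT INTO quiz_status (status, user_id, quiz_id)
SELECT 0, u.user_id, q.id
FROM (SELECT DISTINCT user_id FROM quiz_status) AS u
CROSS JOIN quiz AS q
LEFT JOIN quiz_status AS qz
       ON q.id = qz.quiz_id AND u.user_id = qz.user_id
WHERE qz.quiz_id IS NULL
"""

conn.execute(INSERT_MISSING)
count_after_first = conn.execute("SELECT COUNT(*) FROM quiz_status").fetchone()[0]
conn.execute(INSERT_MISSING)  # nothing is missing any more, so no new rows
count_after_second = conn.execute("SELECT COUNT(*) FROM quiz_status").fetchone()[0]

status_rows = conn.execute(
    "SELECT status, user_id, quiz_id FROM quiz_status ORDER BY quiz_id"
).fetchall()
print(status_rows)
```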

How to select id when same id has multiple rows but I am looking for id which are missing a particular value

I have this table my_table_c with the values below:
SELECT * FROM my_table_c
ID GROUP_ID GROUP_VALUE
1 2 1
3 3 2
3 4 1
5 4 1
5 2 1
2 2 2
2 3 2
2 4 1
I am looking for output that contains only the IDs that have no row with group_id 2. An ID that has group_id 2 should be excluded even if it also has other group IDs.
If group_id 2 is absent for an ID, that's my target ID.
So with the values shown in the table above, I expect only ID = 3 to be returned, as the other IDs (1, 5 and 2) each have a row where group_id = 2.
Can anyone please help with a query to fetch this result?
You could get all the IDs that have group_id = 2 and exclude them with NOT IN:
select *
from my_table_c
where id not in (select id from my_table_c where group_id = 2)
Another way, using NOT EXISTS:
select *
from my_table_c mtcA
where not exists (select *
from my_table_c mtcB
where mtcA.id = mtcB.id and mtcB.group_ID = 2)
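Both variants can be tried on the sample rows with a short Python sqlite3 sketch. Selecting DISTINCT id instead of * is my assumption, based on the question asking for the ID only; the column group_value is loaded but not used by either query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table_c (id INTEGER, group_id INTEGER, group_value INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO my_table_c VALUES (?, ?, ?)",
    [(1, 2, 1), (3, 3, 2), (3, 4, 1), (5, 4, 1),
     (5, 2, 1), (2, 2, 2), (2, 3, 2), (2, 4, 1)],
)

# Variant 1: exclude every id that appears with group_id = 2.
not_in_ids = [row[0] for row in conn.execute(
    """
    SELECT DISTINCT id
    FROM my_table_c
    WHERE id NOT IN (SELECT id FROM my_table_c WHERE group_id = 2)
    ORDER BY id
    """
)]

# Variant 2: keep a row only if no row with the same id has group_id = 2.
not_exists_ids = [row[0] for row in conn.execute(
    """
    SELECT DISTINCT a.id
    FROM my_table_c AS a
    WHERE NOT EXISTS (
        SELECT 1 FROM my_table_c AS b
        WHERE b.id = a.id AND b.group_id = 2
    )
    ORDER BY a.id
    """
)]

print(not_in_ids, not_exists_ids)
```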