How to get duplicates when I group rows? - sql

I have this table:
MyTable(ID, FK, ...)
I am using this query:
select ID from Mytable where FK <> 1
group by ID, FK
order by ID
This gives me the result that I want:
255
255
267
268
790
...
The 255 is duplicated because it has two different FKs. The rest of the IDs each have a single FK value. I would like to get the IDs which have more than one FK with different values.
If an ID has two rows with FK = 2 and FK = 3 then get this ID, but if the ID has FK = 2, FK = 2, FK = 2 I don't want this ID because it has the same FK.
How could I get these IDs?
Thank you so much.

You should count distinct FKs
select ID from Mytable where FK <> 1
group by ID
having count(distinct FK) > 1
order by ID
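A minimal sketch to try this out, assuming Python's built-in sqlite3 module and a few hypothetical sample rows (only ID 255 comes from the question; the FK values are made up for illustration):

```python
import sqlite3

# Hypothetical sample rows (ID, FK):
# 255 has two different FKs, 267 repeats the same FK three times,
# 790 has FK = 1 and FK = 2, but FK = 1 is filtered out by the WHERE clause.
rows = [(255, 2), (255, 3), (267, 2), (267, 2), (267, 2), (268, 4), (790, 1), (790, 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mytable (ID INTEGER, FK INTEGER)")
conn.executemany("INSERT INTO Mytable VALUES (?, ?)", rows)

query = """
    SELECT ID FROM Mytable WHERE FK <> 1
    GROUP BY ID
    HAVING COUNT(DISTINCT FK) > 1
    ORDER BY ID
"""
# Keep only the IDs that are left with more than one distinct FK.
ids_with_mixed_fks = [row[0] for row in conn.execute(query)]
print(ids_with_mixed_fks)  # [255]
```

Note that COUNT(*) > 1 would also return 267 here, because it counts rows rather than distinct FK values.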

Try this:
SELECT
ID, COUNT(*)
FROM
Mytable
WHERE
FK <> 1
GROUP BY
ID
HAVING
COUNT(*) > 1
ORDER BY ID

Use HAVING to find only the IDs that exist more than once:
select DISTINCT ID
from Mytable
where FK <> 1
group by ID, FK
having count(*) >= 2
order by ID

You can use ROW_NUMBER window function.
SELECT ID FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) RN, ID
from Mytable WHERE FK <> 1
) TMP
WHERE RN = 1

Related

SQL remove duplicate row depend on certain value

I spent a day hoping to figure out how to solve this query.
I have the following table:
ID Name Pregnancy Gender
1 Raghad Yes Female
1 Raghad No Female
2 Ohoud no Male
What I need is to remove the duplicate (in this case 1, 1) and keep the row that has a pregnancy status of yes.
To clarify, I can't use delete since it's a restricted database. I can only retrieve data.
Using an exists clause:
DELETE
FROM yourTable t1
WHERE
pregnancy = 'no' AND
EXISTS (SELECT 1 FROM yourTable t2 WHERE t2.ID = t1.ID AND t2.pregnancy = 'yes');
There are other ways to go about doing this, e.g. using ROW_NUMBER, but as you did not tag your database, I offer the above solution which should work on basically any database.
If you want to just view your data with the "duplicates" removed, then use:
SELECT *
FROM yourTable t1
WHERE
pregnancy = 'yes' OR
NOT EXISTS (SELECT 1 FROM yourTable t2 WHERE t2.ID = t1.ID AND t2.pregnancy = 'yes');
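As a rough check of the read-only variant, here is a sketch using Python's built-in sqlite3 module and the sample rows from the question. It assumes the Pregnancy values are stored with consistent casing ('Yes' / 'No'), since string comparison is case-sensitive in SQLite by default:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER, Name TEXT, Pregnancy TEXT, Gender TEXT)"
)
# Sample rows from the question, with the casing of Pregnancy normalized (assumption).
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (1, "Raghad", "Yes", "Female"),
        (1, "Raghad", "No", "Female"),
        (2, "Ohoud", "No", "Male"),
    ],
)

query = """
    SELECT ID, Name, Pregnancy, Gender
    FROM yourTable t1
    WHERE Pregnancy = 'Yes'
       OR NOT EXISTS (SELECT 1 FROM yourTable t2
                      WHERE t2.ID = t1.ID AND t2.Pregnancy = 'Yes')
    ORDER BY ID
"""
# A 'No' row survives only when its ID has no 'Yes' row at all.
deduplicated = conn.execute(query).fetchall()
for row in deduplicated:
    print(row)
```

Nothing is deleted; the query only hides the 'No' row for IDs that also have a 'Yes' row.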
If the Pregnancy column has just two values, "Yes" and "No", you can also use ROW_NUMBER() to get the results.
;WITH CTE
AS (
SELECT *,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Pregnancy DESC) RN
FROM TABLE_NAME
)
SELECT *
FROM CTE
WHERE RN = 1
If there are more than two values and you want to give the highest priority to "Yes", you can write your query like this:
;WITH CTE
AS (
SELECT *,ROW_NUMBER() OVER
(PARTITION BY id ORDER BY CASE WHEN Pregnancy = 'Yes' then 0 else 1 end) RN
FROM TABLE_NAME
)
SELECT *
FROM CTE
WHERE RN= 1
For this sample data you can group by ID, Name, Gender and return the maximum value of the column Pregnancy for each group since Yes is greater compared to No:
SELECT ID, Name, MAX(Pregnancy) Pregnancy, Gender
FROM tablename
GROUP BY ID, Name, Gender
See the demo.
Results:
> ID | Name | Pregnancy | Gender
> -: | :----- | :-------- | :-----
> 1 | Raghad | Yes | Female
> 2 | Ohoud | No | Male
Here is how you could do it in MySQL 8.
Similar Common Table Expressions exist in SQL Server and Oracle.
There you may need to add a comma after the closing parenthesis that
ends the CTE (with) definition.
with dups as (
Select id from test
group by id
Having count(1) > 1
)
select * from test
where id in (select id from dups)
and Pregnancy = 'Yes'
union all
select * from test where id not in (select id from dups);
You can see it in action, by running it here
Note this does it without deleting the original.
But it gives you a result set to work with that has what you want.
If you wanted to delete, then you could use this instead, after the dups CTE definition:
delete from test
where id in (select id from dups) and Pregnancy = 'No'
Or distill this into:
delete from test
where id in (Select id from test
group by id
Having count(1) > 1) and Pregnancy = 'No'
1) First of all, update the design of your table. ID should be the primary key; this would automatically prevent duplicate rows with the same ID.
2) You can use GROUP BY and a HAVING clause to remove the duplicates:
delete from tablename where pregnancy = 'no' and id in (SELECT
id
FROM tablename
GROUP BY id
HAVING count(id) > 1)

SQL query to return rows where only one record is present in a given status

I have a table with data similar to below. I am trying to get a list of results that will display all rows where only one unique SourceID exists in status 10. If I were querying this table, I would expect ID's 3 and 4 to be returned.
Table Example
Select *
From table
Where Status = 10 and SourceID in
(
Select SourceID
From Table
Group by SourceID
Having Count(*) = 1
)
You can use NOT EXISTS :
SELECT t.*
FROM table t
WHERE NOT EXISTS (SELECT 1 FROM table t1 WHERE t1.SourceID = t.SourceID AND t1.Status <> t.Status);
Maybe that would work?
SELECT ID FROM Mytable
WHERE [Status] = 10
GROUP BY ID
HAVING COUNT(SourceID) = 1
First, find all the SourceIDs that occur only once:
SELECT
SourceID
FROM
Data
GROUP BY
SourceID
HAVING
COUNT(SourceID) = 1
Then use this query as a subquery to get all the rows that have a unique SourceID:
SELECT
*
FROM
Data
WHERE
SourceID IN (
SELECT
SourceID
FROM
Data
GROUP BY
SourceID
HAVING
COUNT(SourceID) = 1
)
Use a sub-query to check that there is exactly one row with that SourceID:
SELECT t.* FROM YourTable t WHERE t.status = 10
AND
(SELECT COUNT(0) x From YourTable t2
where t2.sourceid = t.sourceid) = 1
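The table from the question was not preserved, so the following sketch uses hypothetical rows chosen so that IDs 3 and 4 are the expected result. It assumes Python's built-in sqlite3 module and a table named Data, and combines the Status filter with the "SourceID occurs exactly once" subquery:

```python
import sqlite3

# Hypothetical rows (ID, SourceID, Status); not the data from the question.
rows = [
    (1, 100, 10),  # SourceID 100 appears twice, so it is excluded
    (2, 100, 20),
    (3, 200, 10),  # only row for SourceID 200, status 10
    (4, 300, 10),  # only row for SourceID 300, status 10
    (5, 400, 20),  # unique SourceID, but not in status 10
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (ID INTEGER, SourceID INTEGER, Status INTEGER)")
conn.executemany("INSERT INTO Data VALUES (?, ?, ?)", rows)

query = """
    SELECT ID
    FROM Data
    WHERE Status = 10
      AND SourceID IN (SELECT SourceID
                       FROM Data
                       GROUP BY SourceID
                       HAVING COUNT(*) = 1)
    ORDER BY ID
"""
single_source_ids = [row[0] for row in conn.execute(query)]
print(single_source_ids)  # [3, 4]
```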

Delete ALL rows that have a duplicate ID

There are plenty of posts on SO where a solution is given to take out rows that are in one way or form duplicate to other rows, leaving only 1.
What I am looking for is how I can delete all rows from my temp-table that do not have a unique ID:
ID other_values
-----------------------------
1 foo bar
2 bar baz
2 null
2 something
3 else
I don't care about the other values; once the ID is not unique, I want all rows out, the result being:
ID other_values
-----------------------------
1 foo bar
3 else
How can I do this?
Try this:
--delete all rows from my temp-table that do not have a unique ID
DELETE from MYTABLE
WHERE ID IN (SELECT ID FROM MYTABLE GROUP BY ID HAVING COUNT(*) > 1)
I would use a DELETE command in conjunction with a subquery to detect duplicates
DELETE
FROM mytable
WHERE ID IN (SELECT ID FROM mytable GROUP BY ID HAVING COUNT(*) > 1)
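A small sketch of this DELETE against the sample data from the question, assuming Python's built-in sqlite3 module and an in-memory table named mytable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, other_values TEXT)")
# Sample rows from the question; ID 2 occurs three times.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "foo bar"), (2, "bar baz"), (2, None), (2, "something"), (3, "else")],
)

# Remove every row whose ID occurs more than once, keeping none of them.
cursor = conn.execute(
    """
    DELETE FROM mytable
    WHERE ID IN (SELECT ID FROM mytable GROUP BY ID HAVING COUNT(*) > 1)
    """
)
deleted = cursor.rowcount
remaining = conn.execute("SELECT ID, other_values FROM mytable ORDER BY ID").fetchall()
print(deleted, remaining)
```

Some databases (MySQL, for example) do not allow the target table to be referenced directly in the subquery of a DELETE, so the subquery may need to be wrapped in a derived table there.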
Use a CTE to delete rows. Counting the rows per id with a window function lets you remove every row whose id is not unique:
WITH cte
AS (
SELECT id
,Other_values
,COUNT(*) OVER (
PARTITION BY id
) cnt
FROM mytable
)
DELETE
FROM cte
WHERE cnt > 1

Find out particular id

I have a table in sql like this:
id billpay
-------------------------
1024 0
1024 0
1024 1
1025 1
1025 1
I want to retrieve only those ids that have billpay 1.
Please help me with this.
Try this:
select distinct id from yourtable where billpay = 1
It should be like this:
SELECT id FROM tabel WHERE billpay = 1;
This will retrieve those ids in ascending order which have at least one record in the table with billpay = 1.
The DISTINCT keyword will ensure you don't receive back multiple records with the same id.
SELECT DISTINCT id
FROM [TableName]
WHERE billpay = 1
ORDER BY id ASC
If you want to exclude those ids which also have records with billpay = 0, then use this:
SELECT DISTINCT id
FROM [TableName]
WHERE billpay = 1
AND id NOT IN (SELECT id FROM [TableName] WHERE billpay = 0)
ORDER BY id ASC
select ID
from MyData
Where billpay = 1
Group By ID
The group by will list unique IDs
select ID
from MyData A
Where not exists (select 'X' from MyData B where B.billpay <> 1 and B.ID = A.ID)
Group By ID
This will only list IDs where billpay is only 1
Try this:
SELECT id
FROM mytable
GROUP BY id
HAVING COUNT(CASE WHEN COALESCE(billpay, 0) <> 1 THEN 1 END) = 0
The above will select only those ids associated to billpay=1 and nothing but billpay=1.
The following query selects the ids from group of ids where the number of records with billpay = 1 is the same as the number of records in the group
select id
from bills
group by id
having sum(billpay) = count(id)
Use NOT EXISTS to find rows whose id has no billpay value other than 1, and use DISTINCT to return each id only once.
select distinct id
from tablename t1
where not exists (select 1 from tablename t2
where t1.id = t2.id
and t2.billpay <> 1)
Try using GROUP BY with MIN to exclude ids that have a row with billpay = 0:
SELECT id
FROM yourtable
GROUP BY id
HAVING MIN(billpay)=1
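The answers above read the question in two ways: ids with at least one billpay = 1 row, and ids with nothing but billpay = 1. A short sketch comparing both on the data from the question, assuming Python's built-in sqlite3 module and a table named bills:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bills (id INTEGER, billpay INTEGER)")
# Rows from the question.
conn.executemany(
    "INSERT INTO bills VALUES (?, ?)",
    [(1024, 0), (1024, 0), (1024, 1), (1025, 1), (1025, 1)],
)

# Reading 1: ids with at least one row where billpay = 1.
any_paid = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT id FROM bills WHERE billpay = 1 ORDER BY id"
    )
]

# Reading 2: ids where every row has billpay = 1 (the smallest value is 1).
only_paid = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM bills GROUP BY id HAVING MIN(billpay) = 1 ORDER BY id"
    )
]

print(any_paid)   # [1024, 1025]
print(only_paid)  # [1025]
```

The MIN approach relies on billpay being 0 or 1 and never NULL; the COALESCE variant above is the safer choice if NULLs can occur.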

Compare one field between two rows in the same table

I have a small table which contains group memberships to which I am struggling to find a query.
uid groupid userid
1 2 5
2 2 6
3 1 2
4 3 8
5 4 7
I was wondering if it is possible to return TRUE if two given user IDs were in the same group?
The following gets all groups that have two given members:
select groupid
from table t
where userid in ($userid1, $userid2)
group by groupid
having count(distinct userid) = 2;
You can turn this into a boolean if you like:
select (case when count(*) > 0 then true else false end)
from (select groupid
from table t
where userid in ($userid1, $userid2)
group by groupid
having count(distinct userid) = 2
) g;
SELECT groupid, CASE WHEN COUNT(distinct userid) > 1 THEN 'TRUE' ELSE 'FALSE' END
FROM my_table
WHERE userid IN ('x', 'y')
GROUP BY groupid
Note the x and y should be replaced with the given userids
TRUE:
SELECT groupid, CASE WHEN COUNT(distinct userid) > 1 THEN 'TRUE' ELSE 'FALSE' END
FROM my_table
WHERE userid IN (5,6)
GROUP BY groupid
FALSE:
SELECT groupid, CASE WHEN COUNT(distinct userid) > 1 THEN 'TRUE' ELSE 'FALSE' END
FROM my_table
WHERE userid IN (5,2)
GROUP BY groupid
http://sqlfiddle.com/#!15/3f156/1
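To get a single boolean back rather than one row per group, the grouped query can be wrapped in a helper. This is only a sketch, assuming Python's built-in sqlite3 module; the table name memberships and the function in_same_group are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memberships (uid INTEGER, groupid INTEGER, userid INTEGER)")
# Rows from the question.
conn.executemany(
    "INSERT INTO memberships VALUES (?, ?, ?)",
    [(1, 2, 5), (2, 2, 6), (3, 1, 2), (4, 3, 8), (5, 4, 7)],
)


def in_same_group(connection, userid1, userid2):
    """Return True if some group contains both of the given users."""
    query = """
        SELECT COUNT(*) > 0
        FROM (SELECT groupid
              FROM memberships
              WHERE userid IN (?, ?)
              GROUP BY groupid
              HAVING COUNT(DISTINCT userid) = 2)
    """
    (found,) = connection.execute(query, (userid1, userid2)).fetchone()
    return bool(found)


print(in_same_group(conn, 5, 6))  # True
print(in_same_group(conn, 5, 2))  # False
```

Passing the ids as bound parameters avoids building the IN list by string concatenation.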
UNIQUE userid
Each user can only have at most one entry (like the question seems to ask).
SELECT (SELECT groupid FROM tbl WHERE userid = 5)
= (SELECT groupid FROM tbl WHERE userid = 6);
Assuming userid is UNIQUE.
Returns TRUE or FALSE exactly like requested.
- or NULL if a userid is not found or groupid is NULL.
UNIQUE (userid, groupid)
Each user can have multiple entries (as clarified in the comment):
Share all groups?
SELECT NOT EXISTS (
SELECT 1
FROM (SELECT groupid FROM tbl2 WHERE userid = 5) a
FULL JOIN (SELECT groupid FROM tbl2 WHERE userid = 6) b USING (groupid)
WHERE a.groupid IS NULL OR
b.groupid IS NULL
) AS share_all;
Share at least one group?
SELECT EXISTS (
SELECT groupid FROM tbl2 WHERE userid = 8
INTERSECT
SELECT groupid FROM tbl2 WHERE userid = 9
) AS share_min_one;
Or
SELECT EXISTS (
SELECT 1
FROM (SELECT groupid FROM tbl2 WHERE userid = 5) a
JOIN (SELECT groupid FROM tbl2 WHERE userid = 6) b USING (groupid)
) AS share_min_one;
Share exactly one group?
SELECT count(*) = 1 AS share_exactly_one
FROM (SELECT groupid FROM tbl2 WHERE userid = 5) a
JOIN (SELECT groupid FROM tbl2 WHERE userid = 6) b USING (groupid);
SQL Fiddle with better test data.
All of these queries are fast with an index on userid. Faster with a multicolumn index on (userid, groupid) in Postgres 9.2+.
Ultimately this is a case of relational-division. Here's an arsenal of query techniques:
How to filter SQL results in a has-many-through relation
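The "share at least one group" and "share exactly one group" checks can be tried out with a sketch like the following. It assumes Python's built-in sqlite3 module rather than Postgres, and the rows in tbl2 are hypothetical test data, not taken from the fiddle:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl2 (userid INTEGER, groupid INTEGER, UNIQUE (userid, groupid))")
# Hypothetical memberships: users 5 and 6 overlap in group 2 only,
# users 8 and 9 have no group in common.
conn.executemany(
    "INSERT INTO tbl2 VALUES (?, ?)",
    [(5, 1), (5, 2), (6, 2), (6, 3), (8, 4), (9, 5)],
)


def share_min_one(connection, userid1, userid2):
    """True if the two users have at least one group in common."""
    query = """
        SELECT EXISTS (
            SELECT groupid FROM tbl2 WHERE userid = ?
            INTERSECT
            SELECT groupid FROM tbl2 WHERE userid = ?
        )
    """
    return bool(connection.execute(query, (userid1, userid2)).fetchone()[0])


def share_exactly_one(connection, userid1, userid2):
    """True if the two users have exactly one group in common."""
    query = """
        SELECT COUNT(*) = 1
        FROM (SELECT groupid FROM tbl2 WHERE userid = ?) a
        JOIN (SELECT groupid FROM tbl2 WHERE userid = ?) b USING (groupid)
    """
    return bool(connection.execute(query, (userid1, userid2)).fetchone()[0])


print(share_min_one(conn, 5, 6))      # True
print(share_min_one(conn, 8, 9))      # False
print(share_exactly_one(conn, 5, 6))  # True
```

The FULL JOIN variant for "share all groups" is left out here because older SQLite versions do not support FULL JOIN.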