sql duplicates showing all data - sql

Given this data
id Name group
1 Jhon 001
2 Paul 002
3 Mary 001
How can I get the duplicate values while showing all the fields? The duplication is only on group; id and name won't be duplicated.
The result should look like one of these (either would be valid):
:::::::::::::::::::::::::::::::::::::::::::::::
group count values
001 2 1,3
:::::::::::::::::::::::::::::::::::::::::::::::
id name group
1 Jhon 001
3 Mary 001
I tried with
SELECT
group, COUNT(*)
FROM
people
GROUP BY
group
HAVING
COUNT(*) > 1
But if I try to add id and name to the GROUP BY, it won't find any duplicates.
Thanks in advance.

Try this.
SELECT Id, Name, [Group]
FROM people
WHERE [Group] IN(
SELECT [Group]
FROM people
GROUP BY [Group]
HAVING COUNT(*) > 1)

I would do an inner query to find the groups with more than one member, and then use that inner query to bring back a list of the names.
For example:
-- group is a reserved word, so it is quoted here; the quoting syntax varies by DBMS
SELECT Id, Name, "group"
FROM people
WHERE "group" IN
(SELECT "group"
FROM people
GROUP BY "group"
HAVING count(*) > 1);
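For anyone who wants to try the subquery approach without a database server, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The in-memory table mirrors the question's sample data. The second query produces the one-row-per-group layout from the question using SQLite's group_concat, which is an assumption about the DBMS (other systems offer STRING_AGG or LISTAGG for the same purpose).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "group" is a reserved word, so it is quoted everywhere it appears.
conn.execute('CREATE TABLE people (id INTEGER, name TEXT, "group" TEXT)')
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Jhon", "001"), (2, "Paul", "002"), (3, "Mary", "001")],
)

# Every row whose group value occurs more than once.
duplicate_rows = conn.execute(
    '''
    SELECT id, name, "group"
    FROM people
    WHERE "group" IN (
        SELECT "group" FROM people GROUP BY "group" HAVING COUNT(*) > 1
    )
    ORDER BY id
    '''
).fetchall()
print(duplicate_rows)

# One row per duplicated group, with the member ids collected into a string.
summary = conn.execute(
    '''
    SELECT "group", COUNT(*), group_concat(id)
    FROM people
    GROUP BY "group"
    HAVING COUNT(*) > 1
    '''
).fetchall()
print(summary)
```

The order of the ids inside the concatenated string is not guaranteed, so sort it on the client side if the order matters.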

Avoid using Group because it is a reserved keyword in SQL :
SELECT *
FROM MyTable
WHERE groups IN(
SELECT groups
FROM MyTable
GROUP BY groups
HAVING COUNT(*) > 1)

Just use exists:
select p.*
from people p
where exists (select 1
from people p2
where p2.group = p.group and
p2.id <> p.id
);
This should be the most performant solution. With an index on people(group, id), it should have very good performance.
Note: All the advice to avoid using group as a column name is good advice. You should change the name.
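A quick way to check the EXISTS version is the sketch below, again using Python's sqlite3 module with an in-memory copy of the sample data. It follows the renaming advice, so the column is called grp here; that name and the index name are purely illustrative. It only checks correctness and says nothing about performance on a real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# grp is a hypothetical replacement name for the reserved word "group".
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, grp TEXT)")
# Index matching the suggestion above: people(group, id).
conn.execute("CREATE INDEX idx_people_grp_id ON people (grp, id)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Jhon", "001"), (2, "Paul", "002"), (3, "Mary", "001")],
)

# A row is kept when another row shares its grp but has a different id.
exists_rows = conn.execute(
    """
    SELECT p.id, p.name, p.grp
    FROM people p
    WHERE EXISTS (
        SELECT 1
        FROM people p2
        WHERE p2.grp = p.grp AND p2.id <> p.id
    )
    ORDER BY p.id
    """
).fetchall()
print(exists_rows)
```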

Related

Display column values next to each other SQL

I have to select which wards have patients with the same last name. I need to display the name of those patients and the IDs that come with them.
My query below works to give me the proper result:-
SELECT pt.NAME, pt.PATIENTID, pt.PATIENTID
FROM PATIENT pt
WHERE pt.NAME IN
(SELECT pt.NAME
FROM PATIENT pt
GROUP BY pt.NAME
HAVING COUNT(DISTINCT pt.WARDNO) < 2)
AND NAME IN
(SELECT NAME
FROM PATIENT
GROUP BY NAME
HAVING COUNT(*) > 1)
It gives me the following result in the table:
Name | PatientID
Jones | p10
Jones | p29
However, I want to display my results in the table like this and am not sure how.
Name | PatientID | PatientID
Jones | p10 | p29
Assuming you expect no more than two patient ID values, then a simple aggregation can work here:
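WITH cte AS (
SELECT NAME
FROM PATIENT
GROUP BY NAME
HAVING COUNT(*) > 1 AND COUNT(DISTINCT WARDNO) < 2
)
SELECT
p.NAME,
MIN(p.PATIENTID) AS PatientID1,
MAX(p.PATIENTID) AS PatientID2
FROM PATIENT p
INNER JOIN cte ON cte.NAME = p.NAME -- the CTE only exposes NAME, so join back for PATIENTID
GROUP BY
p.NAME;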
Thank you Tim. That aggregation did work and I did a little tweak to make it a lot cleaner. It works to display just the two PatientIDs which is what I need for now.
My new query is as follows
SELECT pt.NAME, MIN(pt.PATIENTID) AS PAT, MAX(pt.PATIENTID) AS PAT
FROM PATIENT pt
GROUP BY pt.NAME
HAVING COUNT(*) > 1 AND COUNT(DISTINCT pt.WARDNO) < 2
However, if I do have multiple PatientIDs I am curious what I would do.
I have to select which wards have patients with the same last name.
Assuming patients are not duplicated, you can get the patient/ward combinations as:
select p.wardno, p.name, count(*)
from patient p
group by p.wardno, p.name
having count(*) >= 2;
If you only want the wards, then you have one of the rare situations where using select distinct and group by together is appropriate:
select distinct p.wardno
from patient p
group by p.wardno, p.name
having count(*) >= 2;
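To see why DISTINCT matters in that last query, here is a small sketch using Python's sqlite3 module. The patient rows are made up for illustration (only Jones, p10 and p29 come from the question); ward w1 contains two different pairs of same-named patients, so it would be listed twice without DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patientid TEXT, name TEXT, wardno TEXT)")
# Hypothetical sample rows: w1 has two same-name pairs, w2 has one.
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [
        ("p10", "Jones", "w1"),
        ("p29", "Jones", "w1"),
        ("p33", "Lee", "w1"),
        ("p34", "Lee", "w1"),
        ("p40", "Smith", "w1"),
        ("p41", "Smith", "w2"),
        ("p50", "Brown", "w2"),
        ("p51", "Brown", "w2"),
    ],
)

grouped = """
    FROM patient p
    GROUP BY p.wardno, p.name
    HAVING COUNT(*) >= 2
"""

# Without DISTINCT there is one row per (ward, name) pair.
wards_repeated = sorted(r[0] for r in conn.execute("SELECT p.wardno" + grouped))
# With DISTINCT each ward appears once.
wards = sorted(r[0] for r in conn.execute("SELECT DISTINCT p.wardno" + grouped))

print(wards_repeated)
print(wards)
```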

IN SQL count after group by

I want to count after the GROUP BY, not the total number of rows.
I just want the count per category.
After the GROUP BY my result looks like this:
course lecturer
comp1111 Jim
comp1100 Jim
comp1100 Jim
infs2321 Jess
infs2321 Jess
econ1222 Helen
my result after count should be
lecturer count
Jim 3
Jess 2
Helen 1
I don't see why you would want a second GROUP BY after you have already grouped. You get your desired result with just one GROUP BY:
CREATE TABLE Table1
(`course` varchar(8), `lecturer` varchar(5))
;
INSERT INTO Table1
(`course`, `lecturer`)
VALUES
('comp1111', 'Jim'),
('comp1100', 'Jim'),
('comp1100', 'Jim'),
('infs2321', 'Jess'),
('infs2321', 'Jess'),
('econ1222', 'Helen')
;
select
lecturer, count(*)
from
Table1
group by lecturer desc;
| LECTURER | COUNT(*) |
|----------|----------|
| Jim      | 3        |
| Jess     | 2        |
| Helen    | 1        |
EDIT:
You don't need an extra table. To get the row with the largest count you can simply do
select
lecturer, count(*)
from
Table1
group by lecturer
order by count(*) desc
limit 1;
for MySQL or
select top 1
lecturer, count(*)
from
Table1
group by lecturer
order by count(*) desc;
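for MS SQL Server. In my first answer I had GROUP BY lecturer DESC. In older MySQL versions GROUP BY implied an ORDER BY on the grouped column, so that query sorted by lecturer name in descending order; it matches the order by count only by coincidence for this sample data. To sort by count, use GROUP BY lecturer ORDER BY COUNT(*) DESC as shown above. MySQL 8.0 removed both the implicit sorting and the GROUP BY ... DESC syntax.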
If this is not what you want, be careful with using MAX() function. When you simply do for example
select
lecturer, max(whatever)
from
Table1
group by lecturer;
you don't necessarily get the row holding the max of whatever.
You can also do
select
lecturer, max(whatever), min(whatever)
from
Table1
group by lecturer;
See? You just get the value returned by the function, not the row belonging to it. For examples of how to solve this, please refer to this manual entry.
I hope I didn't confuse you; this is probably more than you wanted to know, because the above applies specifically to groups. I think what you really want is simply to order the table the way you want and then pick just one row, as mentioned above.
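As a sanity check of the single GROUP BY with an explicit ORDER BY, here is a minimal sketch using Python's sqlite3 module and the sample rows from the question. SQLite is only a stand-in here; it supports the same LIMIT syntax as the MySQL version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (course TEXT, lecturer TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [
        ("comp1111", "Jim"),
        ("comp1100", "Jim"),
        ("comp1100", "Jim"),
        ("infs2321", "Jess"),
        ("infs2321", "Jess"),
        ("econ1222", "Helen"),
    ],
)

# One GROUP BY is enough; the sort order is stated explicitly.
counts = conn.execute(
    """
    SELECT lecturer, COUNT(*)
    FROM Table1
    GROUP BY lecturer
    ORDER BY COUNT(*) DESC
    """
).fetchall()
print(counts)

# Only the lecturer with the largest count.
top = conn.execute(
    """
    SELECT lecturer, COUNT(*)
    FROM Table1
    GROUP BY lecturer
    ORDER BY COUNT(*) DESC
    LIMIT 1
    """
).fetchone()
print(top)
```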
Try this. It might work
SELECT LECTURER, COUNT(*)
FROM
(SELECT LECTURER, COURSE
FROM your_table -- placeholder: substitute the real table name
GROUP BY LECTURER, COURSE) t -- the derived table needs an alias in most DBMSs
GROUP BY LECTURER;
Try this command in MySQL
============================
select
lecturer, count(*)
from
Course_detail
group by lecturer desc;

Count single occurrences of a row item

I would like to count the number of times an item in a column has appeared only once. For example if in my table I had...
Name
----------
Fred
Barney
Wilma
Fred
Betty
Barney
Fred
...it would return me a count of 2 because only Wilma and Betty have appeared once.
Below is the Query which you can try:
select count(*) from
(select Name
from Table1
group by Name
having count(*) = 1) T
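The derived-table query can be checked against the names from the question with a short sketch like the one below, which uses Python's sqlite3 module and an in-memory table. Table1 and the alias T are taken from the answer; everything else is just scaffolding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Name TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)",
    [(n,) for n in ["Fred", "Barney", "Wilma", "Fred", "Betty", "Barney", "Fred"]],
)

# Inner query: names that occur exactly once. Outer query: how many there are.
single_count = conn.execute(
    """
    SELECT COUNT(*) FROM
        (SELECT Name
         FROM Table1
         GROUP BY Name
         HAVING COUNT(*) = 1) T
    """
).fetchone()[0]
print(single_count)
```

Only Wilma and Betty appear once, so the count comes out as 2.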
The above answers your original question.
The following is for the modified question:
In Oracle you can try the query below:
select sum(count(rownum))
from Table1
group by "Name"
having count(*) = 1
OR
In SQL Server you can try below query:
SELECT COUNT(*)
FROM Table1 a
LEFT JOIN Table1 b
ON a.Name=b.Name
AND a.%%physloc%% <> b.%%physloc%%
WHERE b.Name IS NULL
OR
In Sybase you can try below query:
select count(count(name))
from table
group by name
having count(name) = 1
as per @user2617962's answer.
Thank you
select count(*) from
(select count(*) from Table1
group by Name
having count(*) =1) s
Since you just need the count of column values appearing once without the actual value of the column, the query should be:
select count(count(name)) from table group by name having count(name) = 1
Try following.
select name from (select name, count(name) as num from tblUsers group by name) tblTemp where tblTemp.num=1
Mark it if this works..:)

Find duplicate records in a table using SQL Server

I am validating a table that holds transaction-level data from an eCommerce site, and I need to find the exact errors.
I need help finding duplicate records in a 50-column table on SQL Server.
Suppose my data is:
OrderNo shoppername amountpayed city Item
1 Sam 10 A Iphone
1 Sam 10 A Iphone--->>Duplication to be detected
1 Sam 5 A Ipod
2 John 20 B Macbook
3 John 25 B Macbookair
4 Jack 5 A Ipod
Suppose I use the below query:
Select shoppername,count(*) as cnt
from dbo.sales
group by shoppername
having count(*) > 1
will return me
Sam 2
John 2
But I don't want to find duplicate just over 1 or 2 columns. I want to find the duplicate over all the columns together in my data. I want the result as:
1 Sam 10 A Iphone
with x as (select *,rn = row_number()
over(PARTITION BY OrderNo,item order by OrderNo)
from #temp1)
select * from x
where rn > 1
You can remove the duplicates by replacing the select statement with
delete x where rn > 1
SELECT OrderNo, shoppername, amountPayed, city, item, count(*) as cnt
FROM dbo.sales
GROUP BY OrderNo, shoppername, amountPayed, city, item
HAVING COUNT(*) > 1
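Here is a minimal sketch of the group-by-every-column approach, run through Python's sqlite3 module against the six sample rows from the question. SQLite stands in for SQL Server, so the dbo. schema prefix is dropped; the query is otherwise the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales "
    "(OrderNo INTEGER, shoppername TEXT, amountPayed INTEGER, city TEXT, item TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Sam", 10, "A", "Iphone"),
        (1, "Sam", 10, "A", "Iphone"),  # the duplicate to detect
        (1, "Sam", 5, "A", "Ipod"),
        (2, "John", 20, "B", "Macbook"),
        (3, "John", 25, "B", "Macbookair"),
        (4, "Jack", 5, "A", "Ipod"),
    ],
)

# Rows are duplicates only when every listed column matches.
full_duplicates = conn.execute(
    """
    SELECT OrderNo, shoppername, amountPayed, city, item, COUNT(*) AS cnt
    FROM sales
    GROUP BY OrderNo, shoppername, amountPayed, city, item
    HAVING COUNT(*) > 1
    """
).fetchall()
print(full_duplicates)
```

With a 50-column table the column list gets long, but it has to appear in both the SELECT and the GROUP BY.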
SQL> SELECT JOB,COUNT(JOB) FROM EMP GROUP BY JOB;
JOB COUNT(JOB)
--------- ----------
ANALYST 2
CLERK 4
MANAGER 3
PRESIDENT 1
SALESMAN 4
Just add all fields to the query and remember to add them to Group By as well.
Select shoppername, a, b, amountpayed, item, count(*) as cnt
from dbo.sales
group by shoppername, a, b, amountpayed, item
having count(*) > 1
To get the list of multiple records use following command
select field1,field2,field3, count(*)
from table_name
group by field1,field2,field3
having count(*) > 1
Try this instead
SELECT MAX(shoppername), COUNT(*) AS cnt
FROM dbo.sales
GROUP BY CHECKSUM(*)
HAVING COUNT(*) > 1
Read about the CHECKSUM function first: different rows can produce the same checksum, so this can report rows that are not true duplicates.
Try this
with T1 AS
(
SELECT LASTNAME, COUNT(1) AS 'COUNT' FROM Employees GROUP BY LastName HAVING COUNT(1) > 1
)
SELECT E.*,T1.[COUNT] FROM Employees E INNER JOIN T1 ON T1.LastName = E.LastName
with x as (
select shoppername,count(shoppername) as cnt
from sales
group by shoppername
having count(shoppername)>1)
select t.* from x,win_gp_pin1510 t
where x.shoppername=t.shoppername
order by t.shoppername
First of all, I suspect the result shown is not accurate: there appear to be three 'Sam' rows in the original table. But that is not critical to the question.
Now for the question itself. Based on your table, the best way to show duplicate values is to use count(*) with a GROUP BY clause. The query would look like this:
SELECT OrderNo, shoppername, amountPayed, city, item, count(*) as RepeatTimes FROM dbo.sales GROUP BY OrderNo, shoppername, amountPayed, city, item HAVING COUNT(*) > 1
The reason is that all the columns together uniquely identify each record, so records count as duplicates only when every column value is exactly the same. You also want to show all fields for the duplicate records, and since you can only select columns that participate in the GROUP BY clause, the GROUP BY must list every column.
Now I would like to give you an example of WITH...ROW_NUMBER() OVER(...), which uses a common table expression together with the ROW_NUMBER function.
Suppose you have nearly the same table but with one extra column called Shipping Date, whose value may differ even when the rest are the same. Here it is:
OrderNo shoppername amountpayed city Item Shipping Date
1 Sam 10 A Iphone 2016-01-01
1 Sam 10 A Iphone 2016-02-02
1 Sam 5 A Ipod 2016-03-03
2 John 20 B Macbook 2016-04-04
3 John 25 B Macbookair 2016-05-05
4 Jack 5 A Ipod 2016-06-06
Notice that row 2 is not a duplicate if you still compare all columns as a unit. But what if you want to treat it as a duplicate in this case as well? Then you should use WITH...ROW_NUMBER() OVER(...), and the query would look like this:
WITH TABLEEXPRESSION
AS
(SELECT *, ROW_NUMBER() OVER (PARTITION BY OrderNo, shoppername, amountPayed, city, item ORDER BY [Shipping Date]) AS Identifier --if you consider the one with the later shipping date as the duplicate
FROM dbo.sales)
SELECT * FROM TABLEEXPRESSION
WHERE Identifier !=1 --or use '>1'
The above query will give result together with Shipping Date, for example:
OrderNo shoppername amountpayed city Item Shipping Date Identifier
1 Sam 10 A Iphone 2016-02-02 2
Note that this row is different from the one dated 2016-01-01. The 2016-02-02 row is flagged because of PARTITION BY OrderNo, shoppername, amountPayed, city, item ORDER BY [Shipping Date]: Shipping Date is not one of the columns used to decide whether records are duplicates, so the two rows fall into the same partition and the later one gets Identifier 2. This means the 2016-02-02 row is still a valid result for your question.
To summarize: using count(*) with a GROUP BY clause is the best choice when the columns in the GROUP BY clause are all you need in the result; any column that does not participate in the GROUP BY cannot be shown.
WITH...ROW_NUMBER() OVER(...), on the other hand, works in every scenario where you want to find duplicate records, but the query is a little more complicated to write and is over-engineered for the simple case.
If your purpose is to delete duplicate records from the table, you have to use the latter approach: WITH...ROW_NUMBER() OVER(...)...DELETE FROM...WHERE.
Hope this helps!
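The ROW_NUMBER approach can be tried out with the sketch below, which uses Python's sqlite3 module and the Shipping Date sample table from this answer. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions), and it renames the column to shipping_date and the CTE to numbered for convenience; both names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (OrderNo INTEGER, shoppername TEXT, amountPayed INTEGER, "
    "city TEXT, item TEXT, shipping_date TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Sam", 10, "A", "Iphone", "2016-01-01"),
        (1, "Sam", 10, "A", "Iphone", "2016-02-02"),
        (1, "Sam", 5, "A", "Ipod", "2016-03-03"),
        (2, "John", 20, "B", "Macbook", "2016-04-04"),
        (3, "John", 25, "B", "Macbookair", "2016-05-05"),
        (4, "Jack", 5, "A", "Ipod", "2016-06-06"),
    ],
)

# shipping_date is left out of PARTITION BY, so rows that differ only in
# that column share a partition; the earliest one is numbered 1.
later_duplicates = conn.execute(
    """
    WITH numbered AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY OrderNo, shoppername, amountPayed, city, item
                   ORDER BY shipping_date
               ) AS Identifier
        FROM sales
    )
    SELECT * FROM numbered
    WHERE Identifier > 1
    """
).fetchall()
print(later_duplicates)
```

The dates are stored as ISO-formatted text here, which sorts correctly as strings; a real table would normally use a date type.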
You can use below methods to find the output
with Ctec AS
(
select *,Row_number() over(partition by name order by Name)Rnk
from Table_A
)
select Name from ctec
where rnk>1
select name from Table_A
group by name
having count(*)>1
Select shoppername, count(Item) as cnt
from dbo.sales
group by shoppername
having(count(Item) > 1)
Select EventID,count(*) as cnt
from dbo.EventInstances
group by EventID
having count(*) > 1
The following is running code:
SELECT abnno, COUNT(abnno)
FROM tbl_Name
GROUP BY abnno
HAVING ( COUNT(abnno) > 1 )

SQL - WHERE AGGREGATE>1

Imagine I have a db table of Customers containing {id,username,firstname,lastname}
If I want to find how many instances there are of different firstnames I can do:
select firstname,count(*) from Customers group by 2 order by 1;
username | count(*)
===================
bob | 1
jeff | 2
adam | 5
How do I write the same query to only return firstnames that occur more than once? i.e. in the above example only return the rows for jeff and adam.
You want the having clause, like so:
select
firstname,
count(*)
from Customers
group by firstname
having count(*) > 1
order by 1
group by 2 order by 1 is terrible, I should say. Use proper column names if that's supported: this will drastically improve readability.
With that in mind,
select firstname, count(*) c
from Customers
group by firstname
having count(*) > 1 -- Kudos to Shannon
order by c;
That's what the HAVING clause does. I'm not sure if this will work in informix, but give it a shot:
select firstname, count(*)
from Customers
group by firstname
HAVING COUNT(*) > 1
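To illustrate the difference between WHERE and HAVING, here is a small sketch using Python's sqlite3 module. The Customers rows are invented to reproduce the counts from the question (bob once, jeff twice, adam five times); the usernames and last names are placeholders. HAVING filters after the groups are formed, which is why it can test COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (id INTEGER, username TEXT, firstname TEXT, lastname TEXT)"
)
# Hypothetical rows giving bob 1, jeff 2 and adam 5 occurrences.
firstnames = ["bob"] + ["jeff"] * 2 + ["adam"] * 5
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [(i, f"user{i}", name, "x") for i, name in enumerate(firstnames, start=1)],
)

# Group first, then keep only the groups with more than one row.
repeated = conn.execute(
    """
    SELECT firstname, COUNT(*) AS c
    FROM Customers
    GROUP BY firstname
    HAVING COUNT(*) > 1
    ORDER BY c
    """
).fetchall()
print(repeated)
```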