Group by count of distinct values within a table - sql

This is in oracle. Table EmployeeName:
EmployeeNameID|EmployeeID|FirstName|LastName
1|1|ABC|DEF
2|1|ABC|EFG
3|1|ABC|DEF
4|2|XYZ|PQR
5|2|DEF|RST
6|3|XYQ|BRQ
I want to find out how many employee records have more than one name. The result should be: First column is the EmployeeId and the 2nd column is the distinct number of names they have. For the first result the ABC|DEF repeats so I just want to count it once.
1|2
2|2
3|1
I tried GROUP BY, but I'm not sure how to handle the distinct-names requirement.

Try this (Oracle's CONCAT accepts only two arguments, so use || to join the three values):
SELECT EmployeeID, COUNT(DISTINCT FirstName || '-' || LastName)
FROM EmployeeName
GROUP BY EmployeeID;

I think you just want count(distinct):
select employeeid, count(distinct firstname || ' ' || lastname)
from EmployeeName
group by employeeid;
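
To check the COUNT(DISTINCT ...) approach against the sample data from the question, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite stands in for Oracle here (an assumption made only so the example is runnable); the || concatenation operator behaves the same way in both. The separator between the two columns matters: without one, the names ('AB', 'C') and ('A', 'BC') would both become 'ABC' and be counted once.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeName ("
    "EmployeeNameID INTEGER, EmployeeID INTEGER, FirstName TEXT, LastName TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO EmployeeName VALUES (?, ?, ?, ?)",
    [
        (1, 1, "ABC", "DEF"),
        (2, 1, "ABC", "EFG"),
        (3, 1, "ABC", "DEF"),
        (4, 2, "XYZ", "PQR"),
        (5, 2, "DEF", "RST"),
        (6, 3, "XYQ", "BRQ"),
    ],
)

# Count each distinct first/last name combination once per employee.
rows = conn.execute(
    """
    SELECT EmployeeID, COUNT(DISTINCT FirstName || '-' || LastName) AS name_count
    FROM EmployeeName
    GROUP BY EmployeeID
    ORDER BY EmployeeID
    """
).fetchall()

print(rows)  # [(1, 2), (2, 2), (3, 1)]
```

The repeated ABC|DEF row for employee 1 is counted once, which matches the result asked for in the question.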

Related

SQL Select column which is not used in select section of subquery which find duplicates

I am trying to find records in my database that have duplicated fields such as name, surname and type.
Example:
SELECT name, surname, type, COUNT(*)
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1
Query results:
Robb|Stark|1|2
Tyrion|Lannister|1|3
So the customer "Robb Stark" appears 2 times and "Tyrion Lannister" appears 3 times.
Now, I want to know the id of these records.
I found a similar problem described here:
Finding duplicate values in a SQL table
There is an answer, but no example.
Use COUNT as an analytic function:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, surname) cnt
FROM customers
)
SELECT * -- return all columns
FROM cte
WHERE cnt > 1
ORDER BY name, surname;
The simplest way is to use EXISTS, as follows:
SELECT t.*
FROM customers t
where exists
(select 1 from customers tt
where tt.name = t.name
and tt.surname = t.surname
and tt.id <> t.id)
Or use your original query in an IN clause, as follows:
select * from customers where (name, surname) in
(SELECT name, surname
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1)
If you want one row per group of duplicates, with the list of ids in a comma-separated string, you can use string aggregation with your existing query:
SELECT name, surname, COUNT(*) as cnt,
STRING_AGG(id, ',') WITHIN GROUP (ORDER BY id) as all_ids
FROM customers
GROUP BY name, surname
HAVING COUNT(*) > 1
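
As a rough illustration of the analytic COUNT approach above, the sketch below runs it on a small, made-up customers table in SQLite through Python's sqlite3 module. The rows, the id values and the choice of SQLite are assumptions for the sake of a runnable example; window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, surname TEXT, type INTEGER)"
)
# Hypothetical data: two "Robb Stark" rows, three "Tyrion Lannister" rows,
# and one customer that is not duplicated.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Robb", "Stark", 1),
        (2, "Tyrion", "Lannister", 1),
        (3, "Robb", "Stark", 1),
        (4, "Jon", "Snow", 2),
        (5, "Tyrion", "Lannister", 1),
        (6, "Tyrion", "Lannister", 1),
    ],
)

# Attach the size of each (name, surname) group to every row,
# then keep only the rows whose group has more than one member.
dupes = conn.execute(
    """
    WITH cte AS (
        SELECT id, name, surname,
               COUNT(*) OVER (PARTITION BY name, surname) AS cnt
        FROM customers
    )
    SELECT id, name, surname, cnt
    FROM cte
    WHERE cnt > 1
    ORDER BY name, surname, id
    """
).fetchall()

for row in dupes:
    print(row)
```

Every duplicated row comes back with its own id, and the customer that appears only once is filtered out.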

SQL - Display when count > 1

I have the following query that I am sure is incorrect. I have a table called KantechImages that contains about 8500 rows. I want to display Name, ESRno, Dept and JobTitle wherever there is more than one occurrence of someone's Name. My query is:
SELECT Name, COUNT(*) AS count, ESRno, JobTitle, Dept
FROM dbo.KantechImages
GROUP BY Name, ESRno, JobTitle, Dept
HAVING (COUNT(*) > 1)
But it is only displaying 268 rows, which I know is incorrect. If I edit it to just SELECT Name & Count, it brings back over 500 rows.
You can do what you want with window functions:
select name, cnt, ESRno, JobTitle, Dept
from (select ki.*, count(*) over (partition by name) as cnt
from dbo.KantechImages ki
) ki
where cnt > 1;
Because you want the original rows, a group by in the outer select is not appropriate.
Your query looks OK. I think the problem is with your data. Probably some values that you think are the same are not, for example because of extra spaces in the name, or because of bad matches between names and ESRno.
Try something like the following.
This should return the same 500+ names, and because it is ordered by Name you can compare whether some names have extra spaces and only appear to be duplicated.
SELECT Name, count(Name)
FROM dbo.KantechImages
GROUP BY Name
HAVING count(Name) > 1
ORDER BY Name
This should also return the same 500+ rows, because I assume each Name has a single ESRno, unless two people have the same name. In that case you should get even more rows in your result.
SELECT Name, ESRno, Count(ESRno)
FROM dbo.KantechImages
GROUP BY Name, ESRno
HAVING count(ESRno) > 1
ORDER BY Name, ESRno
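
To see why the row counts in the question differ, the sketch below compares the three shapes of query on a tiny, made-up KantechImages table. SQLite (via Python's sqlite3 module) stands in for SQL Server, and all names and values in the data are hypothetical. Grouping by all four columns only finds rows that are identical in every column, while grouping or partitioning by Name alone finds every repeated name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE KantechImages (Name TEXT, ESRno TEXT, JobTitle TEXT, Dept TEXT)"
)
# Hypothetical rows: one exact duplicate, one name shared by two ESRno values,
# and one name that occurs only once.
conn.executemany(
    "INSERT INTO KantechImages VALUES (?, ?, ?, ?)",
    [
        ("Ann Lee", "E1", "Nurse", "A&E"),
        ("Ann Lee", "E1", "Nurse", "A&E"),
        ("Bob Ray", "E2", "Porter", "Estates"),
        ("Bob Ray", "E3", "Porter", "Estates"),
        ("Cy Dow", "E4", "Clerk", "HR"),
    ],
)

# The question's query: groups only rows that match in all four columns.
full_group = conn.execute(
    """
    SELECT Name, COUNT(*) AS cnt
    FROM KantechImages
    GROUP BY Name, ESRno, JobTitle, Dept
    HAVING COUNT(*) > 1
    ORDER BY Name
    """
).fetchall()

# Grouping by Name alone: every repeated name, one row per name.
by_name = conn.execute(
    """
    SELECT Name, COUNT(*) AS cnt
    FROM KantechImages
    GROUP BY Name
    HAVING COUNT(*) > 1
    ORDER BY Name
    """
).fetchall()

# Window function: every original row whose Name occurs more than once.
window_rows = conn.execute(
    """
    SELECT Name, cnt, ESRno, JobTitle, Dept
    FROM (SELECT ki.*, COUNT(*) OVER (PARTITION BY Name) AS cnt
          FROM KantechImages ki) AS ki
    WHERE cnt > 1
    ORDER BY Name, ESRno
    """
).fetchall()

print(full_group)        # only the exact duplicate
print(by_name)           # both repeated names
print(len(window_rows))  # all rows belonging to a repeated name
```

In this data "Bob Ray" is missed by the four-column GROUP BY because his two rows differ in ESRno, which is the same effect that makes the question's query return fewer rows than expected.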

What is the easiest way to getting distinct count in SQL Server

I have a DISTINCT query on three columns, shown below. I need a count of the distinct combinations of these three columns.
select distinct empid, empname, salary from employee
This is the query I normally use to get the row count of the table, but I need the distinct count. How can I write that query?
select count(empid) from employee
select count(*) from
(
select distinct empid, empname, salary from employee
) x
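If you don't want to use a subselect, you can use GROUP BY instead. Note that this returns one row per distinct combination (each with its own row count), so the distinct count is the number of rows returned rather than a single value: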
Select Count(*) FROM employee GROUP BY empid, empname, salary
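
The difference between the two queries above is easy to see on a small table. The sketch below is a minimal illustration using Python's sqlite3 module, with SQLite standing in for SQL Server and made-up employee rows (both are assumptions). The subquery form returns a single number, while the GROUP BY form returns one row per distinct combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER, empname TEXT, salary INTEGER)")
# Hypothetical rows: five rows, four distinct (empid, empname, salary) combinations.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "Ann", 100),
        (1, "Ann", 100),
        (2, "Bob", 200),
        (3, "Cy", 300),
        (3, "Cy", 350),
    ],
)

# Subquery form: a single value, the number of distinct combinations.
distinct_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT empid, empname, salary FROM employee) AS x
    """
).fetchone()[0]

# GROUP BY form: one row per combination, each holding that group's row count.
per_group = conn.execute(
    "SELECT COUNT(*) FROM employee GROUP BY empid, empname, salary"
).fetchall()

print(distinct_count)  # 4
print(len(per_group))  # 4 rows, one per distinct combination
```

With the GROUP BY form, the calling code has to count the returned rows to get the same answer the subquery gives directly.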

How to get all the columns, but have to group by only 2 columns in sql query

I have a table Employees, which has Fields as below:
Employee_name,Employee_id,Employee_status,Employee_loc,last_update_time.
This table does not have any constraint.
I have tried the below query.
select Employee_name, count(1)
from Employees
where Employee_status = 'ACTIVE'
Group by Employee_name,Employee_loc
having count(Employee_name) > 1
order by count(Employee_name) desc
In the select, I need to get Employee_id too. Can anyone help with how to get that?
You can just add Employee_id to the query, and also add it to the group by clause. (Adding it to the grouping won't make any difference to the query results, assuming each combination of employee name and location has exactly one employee ID.)
If the grouping does make a difference, that implies that some combinations of employee name and location have more than one ID associated with them. Your query would therefore need to decide which ID to return, possibly by using an aggregate function.
SELECT EMPLOYEE_NAME, EMPLOYEE_ID, COUNT(1)
FROM
EMPLOYEES
WHERE
EMPLOYEE_NAME IN
(
SELECT EMPLOYEE_NAME
FROM EMPLOYEES
WHERE Employee_status = 'ACTIVE'
GROUP BY Employee_name,Employee_loc
HAVING COUNT(*) > 1
)
GROUP BY EMPLOYEE_NAME, EMPLOYEE_ID
You can also use partition by clause and select whichever columns you want to see irrespective of the columns you are using for aggregation.
A very short and simple explanation here - Oracle "Partition By" Keyword
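
A minimal sketch of the PARTITION BY idea applied to this question, run in SQLite through Python's sqlite3 module. SQLite stands in for Oracle, and the rows, ids and dates are made up for illustration. The count is computed per (Employee_name, Employee_loc) among ACTIVE rows, but each row keeps its own Employee_id. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "Employee_name TEXT, Employee_id INTEGER, Employee_status TEXT, "
    "Employee_loc TEXT, last_update_time TEXT)"
)
# Hypothetical rows; the table has no constraints, as in the question.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        ("Ann", 10, "ACTIVE", "NY", "2020-01-01"),
        ("Ann", 11, "ACTIVE", "NY", "2020-01-02"),
        ("Ann", 12, "ACTIVE", "LA", "2020-01-03"),
        ("Bob", 20, "ACTIVE", "NY", "2020-01-04"),
        ("Bob", 21, "INACTIVE", "NY", "2020-01-05"),
    ],
)

# Count rows per (name, location) without collapsing them,
# so Employee_id stays available in the outer select.
rows = conn.execute(
    """
    SELECT Employee_name, Employee_id, Employee_loc, cnt
    FROM (SELECT e.*,
                 COUNT(*) OVER (PARTITION BY Employee_name, Employee_loc) AS cnt
          FROM Employees e
          WHERE Employee_status = 'ACTIVE') AS counted
    WHERE cnt > 1
    ORDER BY cnt DESC, Employee_id
    """
).fetchall()

print(rows)  # [('Ann', 10, 'NY', 2), ('Ann', 11, 'NY', 2)]
```

Bob is excluded because only one of his NY rows is ACTIVE, so his group has a count of 1 after the WHERE filter.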

MySQL, return only rows where there are duplicates among two columns

I have a table in MySQL of contact information: first name, last name, address, etc.
I would like to run a query on this table that will return only rows with first and last name combinations which appear in the table more than once.
I do not want to group the "duplicates" (which may only be duplicates of the first and last name, but not other information like address or birthdate) -
I want to return all the "duplicate" rows so I can look over the results and determine if they are dupes or not. This seemed like it would be a simple thing to do, but it has not been.
Every solution I can find either groups the dupes and gives me a count only (which is not useful for what I need to do with the results) or doesn't work at all.
Is this kind of logic even possible in a query ? Should I try and do this in Python or something?
You should be able to do this with the GROUP BY approach in a subquery.
SELECT t.first_name, t.last_name, t.address
FROM your_table t
JOIN ( SELECT first_name, last_name
FROM your_table
GROUP BY first_name, last_name
HAVING COUNT(*) > 1
) t2
ON ( t.first_name = t2.first_name AND t.last_name = t2.last_name )
The sub-query returns all names (first_name and last_name) that exist more than once, and the JOIN returns all records that match these names.
You could do it with GROUP BY / HAVING in a subselect. Something like this (YourTable is a placeholder, since TABLE itself is a reserved word):
SELECT t.*
FROM YourTable t INNER JOIN
(
SELECT FirstName, LastName
FROM YourTable
GROUP BY FirstName, LastName
HAVING COUNT(*) > 1
) Dups ON t.FirstName = Dups.FirstName
AND t.LastName = Dups.LastName
select * from people
join (select firstName, lastName
from people
group by firstName, lastName
having count(*) > 1
) dupe
using (firstName, lastName)