DISTINCT and COUNT(*) = 1 not working in SQL

I need to show the ID (which is unique in every case) and the name, which is sometimes different. In my code I only want to show the names IF they are unique.
I tried both DISTINCT and COUNT(*) = 1, but neither solves my problem.
SELECT DISTINCT id, name
FROM person
GROUP BY id, name
HAVING count(name) = 1;
The result still shows the names multiple times.

By "unique", I assume you mean names that only appear once. That is not what "distinct" means in SQL; the use of distinct is to remove duplicates (either for counting or in a result set).
If so:
SELECT MAX(id), name
FROM person
GROUP BY name
HAVING COUNT(*) = 1;
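To see why the original query filters nothing, here is a minimal sketch run through Python's built-in sqlite3 module (SQLite stands in for whatever DBMS the asker uses, and the four sample rows are invented for illustration). Because id is unique, grouping by id, name puts every row in its own group, so every group passes HAVING COUNT(name) = 1. Grouping by name alone is what keeps only the names that occur once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical sample data: "Ann" appears twice, the other names once.
conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Cid")],
)

# Original attempt: each (id, name) group holds exactly one row,
# so the HAVING clause lets every row through.
original = conn.execute(
    "SELECT DISTINCT id, name FROM person "
    "GROUP BY id, name HAVING COUNT(name) = 1 ORDER BY id"
).fetchall()

# Grouping by name alone keeps only names that occur once.
fixed = conn.execute(
    "SELECT MAX(id), name FROM person "
    "GROUP BY name HAVING COUNT(*) = 1 ORDER BY name"
).fetchall()

print(original)  # all four rows, "Ann" included twice
print(fixed)     # only the rows for "Bob" and "Cid"
```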

If your DBMS supports it, you can use a window function:
SELECT id, name
FROM (
SELECT id, name, COUNT(*) OVER(PARTITION BY name) AS NameCount -- get count of each name
FROM person
) src
WHERE NameCount = 1
If not, you can do:
SELECT id, name
FROM person
WHERE name IN (
SELECT name
FROM person
GROUP BY name
HAVING COUNT(*) = 1 -- Only get names that occur once
)
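As a quick check that the two variants above agree, here is a sketch using Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical sample data: "Ann" is the only repeated name.
conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Cid")],
)

# Variant 1: count rows per name with a window function, then filter.
window_rows = conn.execute("""
    SELECT id, name
    FROM (
        SELECT id, name, COUNT(*) OVER (PARTITION BY name) AS NameCount
        FROM person
    ) src
    WHERE NameCount = 1
    ORDER BY id
""").fetchall()

# Variant 2: keep rows whose name is in the set of names that occur once.
subquery_rows = conn.execute("""
    SELECT id, name
    FROM person
    WHERE name IN (
        SELECT name FROM person GROUP BY name HAVING COUNT(*) = 1
    )
    ORDER BY id
""").fetchall()

print(window_rows)
print(subquery_rows)
```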

Related

Getting MAX of a column and adding one more

I'm trying to make an SQL query that returns the greatest number from a column and its respective id.
For more information I have two columns ID and NUMBER. Both of them have 2 entries and I want to get the highest number with the ID next to it. This is what I tried, but it didn't succeed.
SELECT ID, MAX(NUMBER) AS MAXNUMB
FROM TABLE1
GROUP BY ID, MAXNUMB;
The problem I'm experiencing is that it just shows ALL the entries and if I add a "where" expression it just shows the same (all entries [ids+numbers]).
P.S.: Yes, I got what I wanted, but only with one column (NUMBER); if I add another column (ID) to the select, it breaks.
Try:
SELECT
ID,
A_NUMBER
FROM TABLE1
WHERE A_NUMBER = (
SELECT MAX(A_NUMBER)
FROM TABLE1);
This presumes you want the IDs* of the rows with the highest number (and not, instead, the highest number for each ID, which would matter if IDs were not unique in your table, for example).
* More than one ID may be returned if two or more IDs share the same maximum number.
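The tie behaviour mentioned in the footnote can be checked with a small sketch in Python's sqlite3 module. The table contents are invented, and SQLite is only a stand-in for the asker's DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, a_number INTEGER)")
# Hypothetical data: ids 2 and 3 share the maximum value 30.
conn.executemany(
    "INSERT INTO table1 (id, a_number) VALUES (?, ?)",
    [(1, 10), (2, 30), (3, 30)],
)

# The scalar subquery finds the overall maximum; the outer query
# returns every row that carries it, so ties are all reported.
top_rows = conn.execute("""
    SELECT id, a_number
    FROM table1
    WHERE a_number = (SELECT MAX(a_number) FROM table1)
    ORDER BY id
""").fetchall()

print(top_rows)
```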
You can try this:
Select ID,maxNumber
From
(
SELECT
ID,
(Select Max(NUMBER) from Tmp where Id = t.Id) maxNumber
FROM
Tmp t
)T1
Group By ID,maxNumber
The query you posted has an illegal column name (number) and groups by the alias for the max value, which is illegal and also doesn't make sense; and you can't include the unaliased max() within the group-by either. So it's likely you're actually doing something like:
select id, max(numb) as maxnumb
from table1
group by id;
which will give one row per ID, with the maximum numb (which is the new name I've made up for your numeric column) for each ID. Or as you said you get "ALL the entries" you might have group by id, numb, which would show all rows from the table (unless there are duplicate combinations).
To get the maximum numb and the corresponding id you could group by id only, order by descending maxnumb, and then return the first row only:
select id, max(numb) as maxnumb
from table1
group by id
order by maxnumb desc
fetch first 1 row only
If there are two IDs with the same maxnumb then you would only get one of them - and which one is indeterminate unless you modify the order by - but in that case you might prefer to use first 1 row with ties to see them all.
You could achieve the same thing with a subquery and an analytic function to generate a ranking, and have the outer query return the highest-ranking row(s):
select id, numb as maxnumb
from (
select id, numb, dense_rank() over (order by numb desc) as rnk
from table1
)
where rnk = 1
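A minimal sketch of the ranking approach, run in SQLite via Python's sqlite3 module (an assumption: the answer's fetch first and keep syntax suggest Oracle, but dense_rank() behaves the same way here; SQLite 3.25 or later is needed for window functions). The rows are invented so that two IDs tie for the top value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, numb INTEGER)")
# Hypothetical data: ids 1 and 2 both reach the maximum of 40.
conn.executemany(
    "INSERT INTO table1 (id, numb) VALUES (?, ?)",
    [(1, 10), (1, 40), (2, 40), (3, 20)],
)

# dense_rank() gives rank 1 to every row holding the highest numb,
# so the outer filter returns all tied rows rather than just one.
ranked = conn.execute("""
    SELECT id, numb AS maxnumb
    FROM (
        SELECT id, numb, DENSE_RANK() OVER (ORDER BY numb DESC) AS rnk
        FROM table1
    ) AS r
    WHERE rnk = 1
    ORDER BY id
""").fetchall()

print(ranked)
```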
You could also use keep to get the same result as first 1 row only:
select max(id) keep (dense_rank last order by numb) as id, max(numb) as maxnumb
from table1

Filter by number of occurrences in a SQL Table

Given the following table where the Name value might be repeated in multiple rows:
How can we determine how many times a Name value exists in the table, and can we filter on names that have a specific number of occurrences?
For instance, how can I filter this table to show only names that appear twice?
You can use group by and having to show the names that appear twice in the table:
select name, count(*) cnt
from mytable
group by name
having count(*) = 2
Then if you want the overall count of names that appear twice, you can add another level of aggregation:
select count(*) cnt
from (
select name
from mytable
group by name
having count(*) = 2
) t
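Both queries can be tried on a toy table with Python's sqlite3 module. The single-letter names are placeholder data, not taken from the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
# Hypothetical data: a x2, b x2, c x1, d x3.
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)",
    [(n,) for n in "aabbcddd"],
)

# Names that appear exactly twice.
twice = conn.execute("""
    SELECT name, COUNT(*) AS cnt
    FROM mytable
    GROUP BY name
    HAVING COUNT(*) = 2
    ORDER BY name
""").fetchall()

# How many such names there are (a second level of aggregation).
how_many = conn.execute("""
    SELECT COUNT(*) AS cnt
    FROM (
        SELECT name FROM mytable GROUP BY name HAVING COUNT(*) = 2
    ) t
""").fetchone()[0]

print(twice)
print(how_many)
```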
It sounds like you're looking for a histogram of the frequency of name counts. Something like this
with counts_cte(name, cnt) as (
select name, count(*)
from mytable
group by name)
select cnt, count(*) num_names
from counts_cte
group by cnt
order by 2 desc;
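Here is the histogram query on the same kind of toy data, as a sketch in Python's sqlite3 module. One assumption beyond the answer: a secondary sort on cnt is added so that frequencies with the same number of names come back in a predictable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
# Hypothetical data: a x2, b x2, c x1, d x3.
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)",
    [(n,) for n in "aabbcddd"],
)

# Inner step: occurrences per name. Outer step: how many names
# share each occurrence count.
histogram = conn.execute("""
    WITH counts_cte(name, cnt) AS (
        SELECT name, COUNT(*)
        FROM mytable
        GROUP BY name
    )
    SELECT cnt, COUNT(*) AS num_names
    FROM counts_cte
    GROUP BY cnt
    ORDER BY num_names DESC, cnt
""").fetchall()

# Each tuple is (occurrence count, number of names with that count).
print(histogram)
```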
Use a GROUP BY clause to count how many times each name is repeated:
select name, count(*) AS Repeated
from Your_Table_Name
group by name;
If you want to show only the names that are repeated more than once, use the query below:
select name, count(*) AS Repeated
from Your_Table_Name
group by name having count(*) > 1;

SQL SELECT Full Row with Duplicated Data in One Column

I am using Microsoft SQL Server 2014.
I am able to list emails which are duplicated.
But I am unable to list the entire row, which contain other fields such as EmployeeId, Username, FirstName, LastName, etc.
SELECT Email,
COUNT(Email) AS NumOccurrences
FROM EmployeeProfile
GROUP BY Email
HAVING ( COUNT(Email) > 1 )
May I know how can I list all field in the rows that contains Email appearing more than once in the table?
Thank you.
Try this:
WITH DataSource AS
(
SELECT *
,COUNT(*) OVER (PARTITION BY email) count_calc
FROM EmployeeProfile
)
SELECT *
FROM DataSource
WHERE count_calc > 1
select distinct * from EmployeeProfile where email in (SELECT
Email
FROM EmployeeProfile
GROUP BY Email
HAVING COUNT(*) > 1 )
with cte as (
select *
, count(1) over (partition by email) noDuplicates
from Demo
)
select *
from cte
where noDuplicates > 1
order by Email, EmployeeId
Explanation:
I've used a common table expression (cte) here; but you could equally use a subquery; it makes no difference.
This cte/subquery fetches every row, and includes a new field called noDuplicates which says how many records have that same email address (including the record itself; so noDuplicates=1 actually means there are no duplicates; whilst noDuplicates=2 means the record itself and 1 duplicate, or 2 records with this email address). This field is calculated using an aggregate function over a window. You can read up on window functions here: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-2017
In our outer query we're then selecting only those records with noDuplicates greater than 1; i.e. where there are multiple records with the same email address.
Finally I've sorted by Email and EmployeeId, so that duplicates are listed alongside one another and are presented in the sequence in which they were (presumably) created; this just makes life easier for whoever then deals with these results.
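The explanation above can be sketched as follows in Python's sqlite3 module. SQLite (3.25 or later) stands in for SQL Server here, the table is cut down to two columns, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeProfile (EmployeeId INTEGER PRIMARY KEY, Email TEXT)"
)
# Hypothetical data: employees 1 and 3 share an email address.
conn.executemany(
    "INSERT INTO EmployeeProfile (EmployeeId, Email) VALUES (?, ?)",
    [(1, "ann@example.com"), (2, "bob@example.com"), (3, "ann@example.com")],
)

# The windowed count attaches to every row the number of rows
# sharing its email; the outer query keeps rows where that is > 1.
duplicates = conn.execute("""
    WITH cte AS (
        SELECT EmployeeId, Email,
               COUNT(*) OVER (PARTITION BY Email) AS noDuplicates
        FROM EmployeeProfile
    )
    SELECT EmployeeId, Email, noDuplicates
    FROM cte
    WHERE noDuplicates > 1
    ORDER BY Email, EmployeeId
""").fetchall()

print(duplicates)
```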
If EmployeeId is unique, then you can use EXISTS:
SELECT ep.*
FROM EmployeeProfile ep
WHERE EXISTS (SELECT 1
FROM EmployeeProfile ep1
WHERE ep1.Email = ep.Email AND ep1.EmployeeId <> ep.EmployeeId
);
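A small sketch of the EXISTS variant using Python's sqlite3 module, with made-up rows and a reduced column set. It relies on EmployeeId being unique, as the answer states: a row counts as a duplicate when some other row (a different EmployeeId) has the same Email.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeProfile (EmployeeId INTEGER PRIMARY KEY, Email TEXT)"
)
# Hypothetical data: employees 1 and 3 share an email address.
conn.executemany(
    "INSERT INTO EmployeeProfile (EmployeeId, Email) VALUES (?, ?)",
    [(1, "ann@example.com"), (2, "bob@example.com"), (3, "ann@example.com")],
)

# Keep a row only if another row with the same email exists.
shared = conn.execute("""
    SELECT ep.EmployeeId, ep.Email
    FROM EmployeeProfile ep
    WHERE EXISTS (
        SELECT 1
        FROM EmployeeProfile ep1
        WHERE ep1.Email = ep.Email
          AND ep1.EmployeeId <> ep.EmployeeId
    )
    ORDER BY ep.EmployeeId
""").fetchall()

print(shared)
```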

SQL Server : select only one row based on a field when there are several rows

I have a table with 3 columns: Name, Surname, Email. Data in those columns is not unique.
I need to get result that matches following criteria:
Select all three columns
Email records should be unique
There should be only one record per Email
That means SELECT DISTINCT isn't applicable because it could retrieve multiple email records.
Any ideas?
You didn't specify your DBMS, but most systems support "Windowed Aggregate Functions":
with cte as
( select Email, Name, Surname,
row_number() over (partition by Email order by Name) as rn
from tab
)
select Email, Name, Surname
from cte
where rn = 1
This numbers the rows within each email (ordered by Name) and returns only the first row for each email.
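For illustration, here is the row_number() approach on a few invented rows, run in SQLite (3.25 or later) through Python's sqlite3 module. Note that the outer query reads from the CTE, since that is where the rn column lives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (Name TEXT, Surname TEXT, Email TEXT)")
# Hypothetical data: two people share z@example.com.
conn.executemany(
    "INSERT INTO tab (Name, Surname, Email) VALUES (?, ?, ?)",
    [
        ("Zed", "Young", "z@example.com"),
        ("Amy", "Adams", "z@example.com"),
        ("Bob", "Brown", "b@example.com"),
    ],
)

# Number the rows inside each email group by Name, keep row 1 only.
one_per_email = conn.execute("""
    WITH cte AS (
        SELECT Email, Name, Surname,
               ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Name) AS rn
        FROM tab
    )
    SELECT Email, Name, Surname
    FROM cte
    WHERE rn = 1
    ORDER BY Email
""").fetchall()

print(one_per_email)
```

Changing the ORDER BY inside the OVER clause changes which row survives for each email.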
If you want to show all unique names associated with each email with one row per email you can use string aggregation.
If using MySQL (you didn't specify the database):
select group_concat(distinct name order by name separator ', ') as names,
group_concat(distinct surname order by surname separator ', ') as surnames,
email
from tbl
group by email
If using PostgreSQL, string_agg is the equivalent. If using Oracle, listagg.
If you just arbitrarily want any name associated with the email, and you don't care which name, just as long as it's only one, you can use the previous answers.
However, if your database doesn't support the with clause or window functions (e.g. MySQL before version 8.0), you can use the query below to arbitrarily show only one name and surname per email:
select x.*, y.surname
from (select email, max(name) as name from tbl group by email) x
join tbl y
on x.name = y.name
and x.email = y.email
This will show the correct surname for the given name because it picks the max(name) first and then gets the surname for that name and email.
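A sketch of this join-back approach in Python's sqlite3 module, using invented rows. The derived table x picks one name per email, and the join fetches the surname stored alongside that name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (name TEXT, surname TEXT, email TEXT)")
# Hypothetical data: two people share z@example.com.
conn.executemany(
    "INSERT INTO tbl (name, surname, email) VALUES (?, ?, ?)",
    [
        ("Zed", "Young", "z@example.com"),
        ("Amy", "Adams", "z@example.com"),
        ("Bob", "Brown", "b@example.com"),
    ],
)

# x holds max(name) per email; joining back on (name, email)
# recovers the surname belonging to that name.
picked = conn.execute("""
    SELECT x.email, x.name, y.surname
    FROM (SELECT email, MAX(name) AS name FROM tbl GROUP BY email) x
    JOIN tbl y
      ON x.name = y.name
     AND x.email = y.email
    ORDER BY x.email
""").fetchall()

print(picked)
```

One caveat: if several rows share the same email and the same name but have different surnames, the join matches all of them, so that email would appear more than once.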

How to find the highest populated instance in a column in SQL

So I have a table (person) that contains columns such as the person's name, age, eye color, and favorite movie.
How do I find the most popular eye color(s), returning just the eye color (not the count), using SQL (Microsoft Access), without using top, as there might be multiple colors with the same count?
Thank you
SELECT
EyeColor
FROM
Person
GROUP BY
EyeColor
HAVING
COUNT(*) = (
SELECT MAX(i.EyeColorCount) FROM (
SELECT COUNT(*) AS EyeColorCount FROM Person GROUP BY EyeColor
) AS i
)
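The query above can be tried on a toy table as follows. This sketch runs it in SQLite through Python's sqlite3 module rather than in Access (an assumption made for testability), and the rows are invented so that two colors tie for first place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Name TEXT, EyeColor TEXT)")
# Hypothetical data: blue and brown both occur twice, green once.
conn.executemany(
    "INSERT INTO Person (Name, EyeColor) VALUES (?, ?)",
    [
        ("Ann", "blue"),
        ("Bob", "blue"),
        ("Cid", "brown"),
        ("Dee", "brown"),
        ("Eve", "green"),
    ],
)

# Inner subqueries find the largest per-color count; HAVING keeps
# every color whose count equals it, so ties are all returned.
most_common = [
    row[0]
    for row in conn.execute("""
        SELECT EyeColor
        FROM Person
        GROUP BY EyeColor
        HAVING COUNT(*) = (
            SELECT MAX(i.EyeColorCount) FROM (
                SELECT COUNT(*) AS EyeColorCount FROM Person GROUP BY EyeColor
            ) AS i
        )
        ORDER BY EyeColor
    """)
]

print(most_common)
```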
In Access, I think you need something along the lines of:
SELECT First(t.Eyecolor) AS FirstOfEyeColor
FROM (SELECT p.EyeColor, Count(p.EyeColor) AS C
FROM Person p
GROUP BY p.EyeColor
ORDER BY Count(p.EyeColor) DESC) AS t;