Select distinct lines by a field - sql

I have a select that returns a table like this:
Name surname
Jhon a
Jhon b
Jhon c
Joe a
Joe b
Joe c
But what I need is just one occurrence of Jhon and one of Joe, each with one of its surnames.
I cannot simply use order by, because I need to select both name and surname, and if I use distinct I still get all the Jhons and Joes.
Can you help me?

You can just use aggregation:
select name, max(surname) as surname
from table t
group by name;
You can also do something similar with analytic functions:
select t.name, t.surname
from (select t.*, row_number() over (partition by name order by surname) as seqnum
from table t
) t
where seqnum = 1;
This is particularly useful if you want to get more than one column from the same row.
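To try the row_number() approach without a database server, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. The table name people and the helper one_row_per_name are made up for illustration, and ordering by surname inside each partition is an assumption so that the chosen row is predictable. Window functions need SQLite 3.25 or later.

```python
import sqlite3

def one_row_per_name(rows):
    """Return one (name, surname) pair per name, keeping the smallest surname."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table name; the question does not name its table.
    conn.execute("CREATE TABLE people (name TEXT, surname TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    query = """
        SELECT name, surname
        FROM (SELECT p.*,
                     ROW_NUMBER() OVER (PARTITION BY name ORDER BY surname) AS seqnum
              FROM people p) AS ranked
        WHERE seqnum = 1
        ORDER BY name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

rows = [("Jhon", "a"), ("Jhon", "b"), ("Jhon", "c"),
        ("Joe", "a"), ("Joe", "b"), ("Joe", "c")]
print(one_row_per_name(rows))  # [('Jhon', 'a'), ('Joe', 'a')]
```

Because the whole row is kept, any further columns in the table would come from the same row as the chosen surname, which is what the aggregation version cannot guarantee.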

Related

SQL Server : finding duplicates based on first few characters on column

I want to find duplicates based on the first three characters of the surname. Is there a way to do that in SQL? I can compare the whole name, but how do we compare only the first few characters?
Below is my table:
custid forename surname dateofbirth
----------------------------------------
1 David John 16-09-1985
2 David Jon 16-09-1985
3 Sarah Smith 10-08-2015
4 Peter Proca 11-06-2011
5 Peter Proka 11-06-2011
This is the query I am currently running to compare:
SELECT
y.id, y.forename, y.surname
FROM
customers y
INNER JOIN
(SELECT
forename, surname, COUNT(*) AS CountOf
FROM customers
GROUP BY forename, surname
HAVING COUNT(*) > 1) dt ON y.forename = dt.forename
You can use left():
select c.*
from (select c.*, count(*) over (partition by left(surname, 3)) as cnt
from customers c
) c
where cnt > 1
order by surname;
You can include the forename as well in the partition by if you mean forename and first three letters of surname.
You can use exists as follows:
select t.* from t
Where exists
(select 1 from t tt
Where left(t.surname, 3) = left(tt.surname, 3) and t.custid <> tt.custid
)
order by t.surname;
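Here is a rough sketch of the exists approach that can be run locally with Python's sqlite3 module. SQLite has no left() function, so substr(surname, 1, n) stands in for it. The helper name prefix_duplicates and the prefix_len parameter are assumptions for illustration, and the dateofbirth column is left out to keep it short.

```python
import sqlite3

def prefix_duplicates(customers, prefix_len=3):
    """Return customers whose surname prefix is shared with another customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (custid INTEGER, forename TEXT, surname TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", customers)
    query = """
        SELECT c.custid, c.forename, c.surname
        FROM customers c
        WHERE EXISTS (SELECT 1
                      FROM customers o
                      WHERE substr(o.surname, 1, :n) = substr(c.surname, 1, :n)
                        AND o.custid <> c.custid)
        ORDER BY c.custid
    """
    result = conn.execute(query, {"n": prefix_len}).fetchall()
    conn.close()
    return result

customers = [(1, "David", "John"), (2, "David", "Jon"), (3, "Sarah", "Smith"),
             (4, "Peter", "Proca"), (5, "Peter", "Proka")]
print(prefix_duplicates(customers))     # [(4, 'Peter', 'Proca'), (5, 'Peter', 'Proka')]
print(prefix_duplicates(customers, 2))  # also returns John and Jon
```

Note that with the sample data, John and Jon differ in their third character ("Joh" vs "Jon"), so a three-character prefix only pairs Proca with Proka. A two-character prefix picks up both pairs.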

Distinct on specific columns in SQL

I know similar questions have already been asked here. However, most of them still want to return the first or last row when multiple rows share the same attributes. In my case, I simply want to discard every row whose specific attributes are shared with another row.
For example, I have a toy dataset like this:
gender age name
f 20 zoe
f 20 natalia
m 39 tom
f 20 erika
m 37 eric
m 37 shane
f 22 jenn
I want to compare rows on gender and age only, and discard every row whose gender/age combination appears more than once, which should return:
gender age name
m 39 tom
f 22 jenn
You could use the window (analytic) variant of count to find the rows that have just one occurrence of the gender/age combination:
SELECT gender, age, name
FROM (SELECT gender, age, name, COUNT(*) OVER (PARTITION BY gender, age) AS cnt
FROM mytable) t
WHERE cnt = 1
Use the HAVING clause in a CTE.
;WITH DistinctGenderAges AS
(
SELECT gender
,age
FROM YourTable
GROUP BY gender
,age
HAVING COUNT(*) = 1
)
SELECT yt.gender, yt.age, yt.name
FROM DistinctGenderAges dga
INNER JOIN YourTable yt ON dga.gender = yt.gender AND dga.age = yt.age
No matter what, you have to tell the database which value to pick for name. If you don't care, an easy solution is to group:
SELECT gender, age, MIN(name) as name FROM mytable GROUP BY gender, age HAVING COUNT(*)=1
You can use any valid aggregate for name, but you have to pick something.
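As a quick check of the group by / having count(*) = 1 idea against the toy dataset, here is a small sketch using Python's sqlite3 module. The helper name unique_combinations is an assumption, and the order by age is added only to make the output order predictable.

```python
import sqlite3

def unique_combinations(people):
    """Return rows whose (gender, age) pair occurs exactly once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (gender TEXT, age INTEGER, name TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", people)
    query = """
        SELECT gender, age, MIN(name) AS name
        FROM mytable
        GROUP BY gender, age
        HAVING COUNT(*) = 1
        ORDER BY age
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

people = [("f", 20, "zoe"), ("f", 20, "natalia"), ("m", 39, "tom"),
          ("f", 20, "erika"), ("m", 37, "eric"), ("m", 37, "shane"),
          ("f", 22, "jenn")]
print(unique_combinations(people))  # [('f', 22, 'jenn'), ('m', 39, 'tom')]
```

Since every surviving group holds exactly one row, MIN(name) simply returns that row's name. MAX(name) would give the same result here.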

Find out columns values associated to more than one columns values in Oracle

I have a table in Oracle with non-unique column values. The combinations are also non-unique, but the association in a particular order has to be unique. I have tried many solutions. This question is the closest, but I need a solution in Oracle SQL. The following is my table:
--------------------------------------------------
Teacher subject class_id
--------------------------------------------------
Paul English 001
Paul English 002
Allen English 003
Sia Maths 134
John Computer 913
Jack Physics 341
Arlene Maths 001
-------------------------------------------------
The query should return only the following:
English, Maths
i.e. subjects that are associated with more than one teacher.
Maybe you want something like this:
select listagg (subject, ', ') within group (order by subject) subjects
from (
select subject from classes
group by subject
having count(distinct teacher)>1
);
SUBJECTS
-------------------------
English, Maths
or you can achieve the same result with analytic functions:
select listagg (subject, ', ') within group (order by subject) subjects
from (
select subject,
count(distinct teacher) over (partition by subject) teachers,
row_number() over (partition by subject order by class_id) rn
from classes
)
where teachers>1 and rn=1
;
If I understand correctly, you want subjects that have more than one teacher. This is a simple aggregation query with a having clause:
select subject
from t
group by subject
having min(teacher) <> max(teacher);
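One thing worth checking is whether the having condition counts rows or teachers, since Paul appears twice under English. The sketch below runs three having conditions in SQLite via Python's sqlite3 module. The helper subjects_where is made up for illustration, and the last row of CLASSES is a hypothetical addition (not in the question's table) that gives John a second Computer class, so that the difference becomes visible.

```python
import sqlite3

# Rows from the question, plus one hypothetical extra class for John.
CLASSES = [
    ("Paul", "English", "001"),
    ("Paul", "English", "002"),
    ("Allen", "English", "003"),
    ("Sia", "Maths", "134"),
    ("John", "Computer", "913"),
    ("Jack", "Physics", "341"),
    ("Arlene", "Maths", "001"),
    ("John", "Computer", "914"),  # hypothetical row, for illustration only
]

def subjects_where(having, rows=CLASSES):
    """Return subjects whose group satisfies the given HAVING condition.

    The condition is pasted into the SQL text, so pass only trusted strings.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE classes (teacher TEXT, subject TEXT, class_id TEXT)")
    conn.executemany("INSERT INTO classes VALUES (?, ?, ?)", rows)
    query = (f"SELECT subject FROM classes GROUP BY subject "
             f"HAVING {having} ORDER BY subject")
    result = [subject for (subject,) in conn.execute(query)]
    conn.close()
    return result

print(subjects_where("COUNT(teacher) > 1"))            # counts rows
print(subjects_where("COUNT(DISTINCT teacher) > 1"))   # counts teachers
print(subjects_where("MIN(teacher) <> MAX(teacher)"))  # at least two different teachers
```

With the extra row, the row-counting condition also reports Computer even though only John teaches it, while the other two conditions return just English and Maths.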

sql command to display highest repeated field in a column

How can I display the most frequently repeated value in a column in SQL?
For example, if a column contains:
jack
jack
john
john
john
how can I display the most repeated value (i.e. john) from the above column?
select chairman
from mytable
group by chairman
HAVING COUNT(*) = (
select TOP 1 COUNT(*)
from mytable
group by chairman
ORDER BY COUNT(*) DESC);
select name from persons
group by name
having count(*) = (
select count(*) from persons
group by name
order by count(*) desc
limit 1)
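The limit 1 variant can be tried as-is in SQLite. Below is a minimal sketch using Python's sqlite3 module; the helper name most_repeated is an assumption, and the outer order by is added only so that ties come back in a predictable order.

```python
import sqlite3

def most_repeated(names):
    """Return the value(s) that occur most often; ties are all returned."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE persons (name TEXT)")
    conn.executemany("INSERT INTO persons VALUES (?)", [(n,) for n in names])
    query = """
        SELECT name
        FROM persons
        GROUP BY name
        HAVING COUNT(*) = (SELECT COUNT(*)
                           FROM persons
                           GROUP BY name
                           ORDER BY COUNT(*) DESC
                           LIMIT 1)
        ORDER BY name
    """
    result = [name for (name,) in conn.execute(query)]
    conn.close()
    return result

print(most_repeated(["jack", "jack", "john", "john", "john"]))  # ['john']
print(most_repeated(["jack", "jack", "john", "john"]))          # ['jack', 'john']
```

Comparing against the highest count, rather than just taking the first row of a sorted result, is what lets the query return every value when several are tied for first place.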

Need a hand with a simple query

I need help with a query. I think it is not so difficult.
I need to do a select with distinct and, at the same time, a count(*) of how many rows share each distinct value.
One example:
Table names:
Id Name
1 john
2 john
3 mary
I need a query that returns:
Name Total
john 2
mary 1
select name, count(*) from names group by name;
SELECT name, COUNT(*) FROM names GROUP BY name
SELECT name, count(*) as occurrences FROM names GROUP BY name