Select all unique values of all attributes in one query - sql

I have a table and I want to select all unique values of all attributes in one query.
For example table Person with 3 columns name, age, city.
Example:
Name  age  city
Alex  34   New York
Leo   34   London
Roy   20   London
Alex  28   Moscow
Mike  36   London
And I want to have a result with the unique values of every attribute:
Name  age  city
Alex  20   New York
Leo   28   London
Roy   34   Moscow
      36
Is it possible to do this query?
I tried some queries with DISTINCT and UNION, but the result was always a multiplication of rows.

This is not how relational databases work, but sometimes you got to do what you got to do.
You can do:
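-- dense_rank() gives equal values the same number, so DISTINCT can collapse them;
-- row_number() would number every row separately and defeat the DISTINCT
select a.name, b.age, c.city
from (select distinct name, dense_rank() over (order by name) as rn from t) a
full join (select distinct age, dense_rank() over (order by age) as rn from t) b on b.rn = a.rn
full join (select distinct city, dense_rank() over (order by city) as rn from t) c
on c.rn = coalesce(a.rn, b.rn)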

One option is to aggregate each column into an array, then unnest those arrays side by side:
select x.*
from (
select array_agg(distinct name) as names,
array_agg(distinct age) as ages,
array_agg(distinct city) as cities
from the_table
) d
cross join lateral unnest(d.names, d.ages, d.cities) with ordinality as x(name, age, city);
I would expect this to be quite slow if you really have many distinct values ("millions"), but if you only expect a few distinct values ("hundreds" or "thousands"), then this might be OK.
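To see the positional pairing of distinct values in action, here is a minimal sketch that runs the idea against the sample data with Python's built-in sqlite3 module. SQLite is my assumption, not the asker's database, and it needs version 3.25 or later for window functions. Because older SQLite builds have no FULL JOIN, the sketch left-joins each distinct list to a "spine" of row numbers instead. The names `person`, `n`, `a`, `c` and `spine` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (name TEXT, age INTEGER, city TEXT);
INSERT INTO person VALUES
  ('Alex', 34, 'New York'), ('Leo', 34, 'London'), ('Roy', 20, 'London'),
  ('Alex', 28, 'Moscow'), ('Mike', 36, 'London');
""")

query = """
WITH n AS (SELECT DISTINCT name, DENSE_RANK() OVER (ORDER BY name) AS rn FROM person),
     a AS (SELECT DISTINCT age,  DENSE_RANK() OVER (ORDER BY age)  AS rn FROM person),
     c AS (SELECT DISTINCT city, DENSE_RANK() OVER (ORDER BY city) AS rn FROM person),
     -- every row number that occurs in any of the three distinct lists
     spine AS (SELECT rn FROM n UNION SELECT rn FROM a UNION SELECT rn FROM c)
SELECT n.name, a.age, c.city
FROM spine s
LEFT JOIN n ON n.rn = s.rn
LEFT JOIN a ON a.rn = s.rn
LEFT JOIN c ON c.rn = s.rn
ORDER BY s.rn
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each column is sorted independently, so a row of the result is not a real person; the shorter lists are simply padded with NULL (None in Python).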

Related

How to get values of one column without the aggregate column?

I have this table:
first_name  last_name  age  country
John        Doe        31   USA
Robert      Luna       22   USA
David       Robinson   22   UK
John        Reinhardt  25   UK
Betty       Doe        28   UAE
How can I get only the names of the oldest per country?
When I do this query
SELECT first_name,last_name, MAX(age)
FROM Customers
GROUP BY country
I get this result:
first_name  last_name  MAX(age)
Betty       Doe        31
John        Reinhardt  22
John        Doe        31
But I want to get only first name and last name without the aggregate function.
If window functions are an option, you can use ROW_NUMBER for this task.
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY country ORDER BY age DESC) AS rn
FROM tab
)
SELECT first_name, last_name, age, country
FROM cte
WHERE rn = 1
It sounds like you want to get the oldest age per country first:
SELECT Country, MAX(age) AS MAX_AGE_IN_COUNTRY
FROM Customers
GROUP BY Country
With that, you want to match that back to the original table (aka a join) to see which names they match up to.
So, something like this perhaps:
SELECT Customers.*
FROM Customers
INNER JOIN
(
SELECT Country, MAX(age) AS MAX_AGE_IN_COUNTRY
FROM Customers
GROUP BY Country
) AS max_per_country_query
ON Customers.Country = max_per_country_query.Country
AND Customers.Age = max_per_country_query.MAX_AGE_IN_COUNTRY
If your database supports it, I prefer using the CTE style of handling these subqueries because it's easier to read and debug.
WITH cte_max_per_country AS (
SELECT Country, MAX(age) AS MAX_AGE_IN_COUNTRY
FROM Customers
GROUP BY Country
)
SELECT C.*
FROM Customers C
INNER JOIN cte_max_per_country
ON C.Country = cte_max_per_country.Country
AND C.Age = cte_max_per_country.MAX_AGE_IN_COUNTRY
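As a quick check, the sketch below runs both approaches (join back to the per-country maximum, and ROW_NUMBER) on the question's sample rows and selects only the two name columns. It uses Python's sqlite3 module purely as a convenient test bed, which is an assumption on my part; the aliases `c`, `m` and `max_age` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (first_name TEXT, last_name TEXT, age INTEGER, country TEXT);
INSERT INTO Customers VALUES
  ('John', 'Doe', 31, 'USA'), ('Robert', 'Luna', 22, 'USA'),
  ('David', 'Robinson', 22, 'UK'), ('John', 'Reinhardt', 25, 'UK'),
  ('Betty', 'Doe', 28, 'UAE');
""")

# Approach 1: compute MAX(age) per country, then join back to find the names.
join_back = conn.execute("""
    SELECT c.first_name, c.last_name
    FROM Customers c
    INNER JOIN (SELECT country, MAX(age) AS max_age
                FROM Customers
                GROUP BY country) m
      ON c.country = m.country AND c.age = m.max_age
    ORDER BY c.country
""").fetchall()

# Approach 2: number the rows per country by descending age and keep the first.
ranked = conn.execute("""
    SELECT first_name, last_name
    FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY country ORDER BY age DESC) AS rn
          FROM Customers)
    WHERE rn = 1
    ORDER BY country
""").fetchall()

print(join_back)
print(ranked)
```

On this data both give the same three names. They differ only when two people in a country share the maximum age: the join returns all of them, while ROW_NUMBER keeps exactly one.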

SQL Server : finding duplicates based on first few characters on column

I want to find duplicates based on the first three characters of the surname. Is there a way to do that in SQL? I can compare the whole name, but how do we compare only the first few characters?
Below is my table:
custid forename surname dateofbirth
----------------------------------------
1 David John 16-09-1985
2 David Jon 16-09-1985
3 Sarah Smith 10-08-2015
4 Peter Proca 11-06-2011
5 Peter Proka 11-06-2011
This is my query that I am currently running to compare
SELECT
y.id, y.forename, y.surname
FROM
customers y
INNER JOIN
(SELECT
forename, surname, COUNT(*) AS CountOf
FROM customers
GROUP BY forename, surname
HAVING COUNT(*) > 1) dt ON y.forename = dt.forename
You can use left():
select c.*
from (select c.*, count(*) over (partition by left(surname, 3)) as cnt
from customers c
) c
where cnt > 1 -- keep only rows whose three-letter prefix occurs more than once
order by surname;
You can include the forename as well in the partition by if you mean forename and first three letters of surname.
You can use exists as follows:
select t.* from t
Where exists
(select 1 from t tt
Where left(t.surname, 3) = left(tt.surname, 3) and t.custid <> tt.custid
)
order by t.surname;
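It can help to try the prefix comparison on the sample rows before relying on it. The sketch below is only an illustration: it uses Python's sqlite3 module, and since SQLite has no LEFT() function it substitutes `substr(surname, 1, n)`. The helper name `duplicate_ids` and the parameter `:n` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (custid INTEGER, forename TEXT, surname TEXT, dateofbirth TEXT);
INSERT INTO customers VALUES
  (1, 'David', 'John',  '16-09-1985'), (2, 'David', 'Jon',   '16-09-1985'),
  (3, 'Sarah', 'Smith', '10-08-2015'), (4, 'Peter', 'Proca', '11-06-2011'),
  (5, 'Peter', 'Proka', '11-06-2011');
""")

def duplicate_ids(prefix_len):
    """Return custids whose surname prefix of prefix_len characters also appears on another row."""
    rows = conn.execute("""
        SELECT c.custid
        FROM customers c
        WHERE EXISTS (SELECT 1
                      FROM customers o
                      WHERE substr(o.surname, 1, :n) = substr(c.surname, 1, :n)
                        AND o.custid <> c.custid)
        ORDER BY c.custid
    """, {"n": prefix_len}).fetchall()
    return [r[0] for r in rows]

print(duplicate_ids(3))
print(duplicate_ids(2))
```

With three characters only Proca and Proka match, because "Joh" and "Jon" differ; a two-character prefix also pairs John with Jon. Which length is right depends on how fuzzy the match is meant to be.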

Distinct on specific columns in SQL

I know others here have already asked similar questions. However, most of them still want to return the first or last row when multiple rows share the same attributes. In my case, I simply want to discard the rows that share the same specific attributes.
For example, I have a toy dataset like this:
gender age name
f 20 zoe
f 20 natalia
m 39 tom
f 20 erika
m 37 eric
m 37 shane
f 22 jenn
I only want to look at gender and age, and discard every row whose (gender, age) combination occurs more than once, which returns:
gender age name
m 39 tom
f 22 jenn
You could use the window (analytic) variant of count to find the rows that have just one occurrence of the gender/age combination:
SELECT gender, age, name
FROM (SELECT gender, age, name, COUNT(*) OVER (PARTITION BY gender, age) AS cnt
FROM mytable) t
WHERE cnt = 1
Use the HAVING clause in a CTE.
;WITH DistinctGenderAges AS
(
SELECT gender
,age
FROM YourTable
GROUP BY gender
,age
HAVING COUNT(*) = 1
)
SELECT yt.gender, yt.age, yt.name
FROM DistinctGenderAges dga
INNER JOIN YourTable yt ON dga.gender = yt.gender AND dga.age = yt.age
No matter what, you have to tell the database which value to pick for name. If you don't care, an easy solution is to group:
SELECT gender, age, MIN(name) as name FROM mytable GROUP BY gender, age HAVING COUNT(*)=1
You can use any valid aggregate for name, but you have to pick something.
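For what it's worth, here is a small sketch that runs the grouped query and the windowed query on the toy dataset and confirms they keep the same two rows. It uses Python's sqlite3 module as a stand-in database, which is my assumption; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (gender TEXT, age INTEGER, name TEXT);
INSERT INTO mytable VALUES
  ('f', 20, 'zoe'), ('f', 20, 'natalia'), ('m', 39, 'tom'), ('f', 20, 'erika'),
  ('m', 37, 'eric'), ('m', 37, 'shane'), ('f', 22, 'jenn');
""")

# GROUP BY variant: groups of size one survive, MIN(name) returns their only name.
grouped = conn.execute("""
    SELECT gender, age, MIN(name) AS name
    FROM mytable
    GROUP BY gender, age
    HAVING COUNT(*) = 1
    ORDER BY age DESC
""").fetchall()

# Window variant: count the rows per (gender, age) and keep those with a count of one.
windowed = conn.execute("""
    SELECT gender, age, name
    FROM (SELECT gender, age, name,
                 COUNT(*) OVER (PARTITION BY gender, age) AS cnt
          FROM mytable)
    WHERE cnt = 1
    ORDER BY age DESC
""").fetchall()

print(grouped)
print(windowed)
```

Because HAVING COUNT(*) = 1 leaves only single-row groups, the aggregate on name has exactly one value to choose from, so MIN and MAX give the same answer here.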

Name of Teacher with Highest Wage - recursive CTE

I am trying to get the max salary of each dept and display that teacher's first name as a separate column. So dept 1 may have 4 rows, but only one name shows for the max salary. I'm using SQL Server.
WITH TeacherList AS (
    SELECT Teachers.FirstName, Teachers.LastName, Teachers.FacultyID,
           Teachers.TeacherID, 1 AS LVL, Teachers.PrincipalTeacherID AS ManagerID
    FROM dbo.Teachers
    WHERE Teachers.PrincipalTeacherID IS NULL
    UNION ALL
    SELECT Teachers.FirstName, Teachers.LastName, Teachers.FacultyID,
           Teachers.TeacherID, TeacherList.LVL + 1, Teachers.PrincipalTeacherID
    FROM dbo.Teachers
    INNER JOIN TeacherList ON Teachers.PrincipalTeacherID = TeacherList.TeacherID
    WHERE Teachers.PrincipalTeacherID IS NOT NULL
)
SELECT * FROM TeacherList;
SAMPLE OUTPUT :
Teacher First Name | Teacher Last Name | Faculty | Highest Paid In Faculty
Eric               | Smith             | 1       | Eric
Alex               | John              | 1       | Eric
Jessica            | Sewel             | 1       | Eric
Aaron              | Gaye              | 2       | Aaron
Bob                | Turf              | 2       | Aaron
I'm not sure from your description, but this will return all teachers, and the last column is the name of the teacher with the highest pay in the faculty.
select tr.FirstName,
tr.LastName,
tr.FacultyID,
th.FirstName
from Teachers tr
join (
select FacultyID, max(pay) highest_pay
from Teachers
group by FacultyID
) t on tr.FacultyID = t.FacultyID
join Teachers th on th.FacultyID = t.FacultyID and
th.pay = t.highest_pay
This will produce an unexpected result (duplicate rows) if more than one person has the highest salary in the faculty. In that case you can use window functions as follows:
select tr.FirstName,
tr.LastName,
tr.FacultyID,
t.FirstName
from Teachers tr
join
(
select t.FirstName,
t.FacultyID
from
(
select t.*,
row_number() over (partition by FacultyID order by pay desc) rn
from Teachers t
) t
where t.rn = 1
) t on tr.FacultyID = t.FacultyID
This will display just one arbitrarily chosen teacher among those with the highest salary in each faculty.
You can do this with a CROSS APPLY.
SELECT FirstName, LastName, FacultyID, HighestPaid
FROM Teachers t
CROSS APPLY (SELECT TOP 1 FirstName AS HighestPaid
FROM Teachers
WHERE FacultyID = t.FacultyID
ORDER BY Salary DESC) ca
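Another way to get the same shape of output, not taken from the answers above, is the FIRST_VALUE window function, which avoids both the self-join and the APPLY. The sketch below tries it on the sample teachers using Python's sqlite3 module as a stand-in for SQL Server. The `pay` figures are invented, since the question gives none, and the `TeacherID` tie-breaker is my own addition to make the pick deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Teachers (TeacherID INTEGER, FirstName TEXT, LastName TEXT,
                       FacultyID INTEGER, pay INTEGER);
-- pay values are invented for illustration only
INSERT INTO Teachers VALUES
  (1, 'Eric', 'Smith', 1, 5000), (2, 'Alex', 'John', 1, 4000),
  (3, 'Jessica', 'Sewel', 1, 4500),
  (4, 'Aaron', 'Gaye', 2, 6000), (5, 'Bob', 'Turf', 2, 3000);
""")

rows = conn.execute("""
    SELECT FirstName, LastName, FacultyID,
           -- first row of each faculty when sorted by pay, highest first
           FIRST_VALUE(FirstName) OVER (
               PARTITION BY FacultyID
               ORDER BY pay DESC, TeacherID
           ) AS HighestPaid
    FROM Teachers
    ORDER BY TeacherID
""").fetchall()

for row in rows:
    print(row)
```

Every teacher keeps their own row, and the extra column repeats the first name of the top earner in that faculty. If two teachers tie on pay, the one with the lower TeacherID is shown.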

Select distinct lines by a field

I am making a select that returns a table like this:
Name surname
Jhon a
Jhon b
Jhon c
Joe a
Joe b
Joe c
But what I need is just one occurrence of Jhon and one of Joe, each with one of the surnames.
I can only have one Jhon with one surname and one Joe with one surname.
I cannot use an order by because I need to select both Name and surname. Also, if I use distinct I still get all the Jhons and Joes.
Can you help me?
You can just use aggregation:
select name, max(surname) as surname
from table t
group by name;
You can also do something similar with analytic functions:
select t.name, t.surname
from (select t.*, row_number() over (partition by name order by name) as seqnum
from table t
) t
where seqnum = 1;
This is particularly useful if you want to get more than one column from the same row.
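The point about taking several columns from the same row is easiest to see with a third column. In the sketch below the `city` column, its values and the table name `people` are all hypothetical, added only for illustration, and Python's sqlite3 module stands in for the real database. It orders by surname descending inside each partition so the chosen row is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (name TEXT, surname TEXT, city TEXT);
-- city is a made-up extra column
INSERT INTO people VALUES
  ('Jhon', 'a', 'Zurich'), ('Jhon', 'b', 'Berlin'), ('Jhon', 'c', 'Athens'),
  ('Joe',  'a', 'Oslo'),   ('Joe',  'b', 'Rome'),   ('Joe',  'c', 'Lima');
""")
source_rows = set(conn.execute("SELECT name, surname, city FROM people"))

# Aggregation picks each column's maximum independently of the other columns.
aggregated = conn.execute("""
    SELECT name, MAX(surname), MAX(city)
    FROM people
    GROUP BY name
    ORDER BY name
""").fetchall()

# row_number() picks one whole row per name.
picked = conn.execute("""
    SELECT name, surname, city
    FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY name ORDER BY surname DESC) AS seqnum
          FROM people)
    WHERE seqnum = 1
    ORDER BY name
""").fetchall()

print(aggregated)
print(picked)
```

The aggregated rows combine the largest surname with the largest city, a pairing that never occurs in the table, whereas each row chosen by row_number() is an actual row.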