SQL distinct groups of rows - sql

In SQL, I want to get the distinct sets of rows: each identical group of Characteristic and Value pairs should appear only once.
The number of Characteristic rows per Name can range from one to ten.
Table:
Name     Characteristic  Value
Mary     eyes            Blu
Mary     hair            blonde
Mary     Sex             Female
Jhon     eyes            Black
Jhon     Hair            Black
Jhon     Sex             Male
Jhon     Nation          Franch
Bill     eyes            Blu
Bill     Hair            Blond
Bill     Sex             Male
Will     eyes            Green
Will     Hair            Blond
Will     Sex             Male
Will     Nation          Spain
Lilly    eyes            Blu
Lilly    Hair            Blonde
Lilly    Sex             Female
mark     eyes            Black
mark     Hair            Black
mark     Sex             Male
mark     Nation          Franch
Anna     eyes            Blu
Anna     Hair            Blonde
Anna     Sex             Female
Antonio  eyes            Black
Antonio  Hair            Black
Antonio  Sex             Male
Antonio  Nation          Franch
The result that I want to achieve:
Group  Characteristic  Value
1      eyes            Blu
1      Hair            Blonde
1      Sex             Female
2      eyes            Black
2      Hair            Black
2      Sex             Male
2      Nation          Franch
3      eyes            Blu
3      Hair            Blond
3      Sex             Male
4      eyes            Green
4      Hair            Blond
4      Sex             Male
4      Nation          Spain
And finally, if it's possible:
Name     Characteristic  Value   Group
Mary     eyes            Blu     1
Mary     Hair            Blonde  1
Mary     Sex             Female  1
Jhon     eyes            Black   2
Jhon     Hair            Black   2
Jhon     Sex             Male    2
Jhon     Nation          Franch  2
Bill     eyes            Blu     3
Bill     Hair            Blond   3
Bill     Sex             Male    3
Will     eyes            Green   4
Will     Hair            Blond   4
Will     Sex             Male    4
Will     Nation          Spain   4
Lilly    eyes            Blu     1
Lilly    Hair            Blonde  1
Lilly    Sex             Female  1
mark     eyes            Black   2
mark     Hair            Black   2
mark     Sex             Male    2
mark     Nation          Franch  2
Anna     eyes            Blu     1
Anna     Hair            Blonde  1
Anna     Sex             Female  1
Antonio  eyes            Black   2
Antonio  Hair            Black   2
Antonio  Sex             Male    2
Antonio  Nation          Franch  2

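You can use STRING_AGG to concatenate each name's characteristics into a single string that defines its group, then use DENSE_RANK to number the distinct groups and ROW_NUMBER to pick one representative name per group. Then you join that back to the base table.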
For your first query, you can do it like this.
SELECT
    Groups.GroupId,
    t.Characteristic,
    t.Value
FROM YourTable t
JOIN (
    SELECT
        t.Name,
        t.GroupDefinition,
        GroupId = DENSE_RANK() OVER (ORDER BY t.GroupDefinition),
        RowId = ROW_NUMBER() OVER (PARTITION BY t.GroupDefinition ORDER BY t.Name)
    FROM (
        SELECT
            t.Name,
            GroupDefinition = STRING_AGG(Characteristic + ':' + Value, '|')
                WITHIN GROUP (ORDER BY t.Characteristic)
        FROM YourTable t
        GROUP BY
            t.Name
    ) t
) Groups ON Groups.Name = t.Name
WHERE Groups.RowId = 1;
The second query is as follows.
SELECT
    Groups.GroupId,
    t.*
FROM YourTable t
JOIN (
    SELECT
        t.Name,
        t.GroupDefinition,
        GroupId = DENSE_RANK() OVER (ORDER BY t.GroupDefinition),
        RowId = ROW_NUMBER() OVER (PARTITION BY t.GroupDefinition ORDER BY t.Name)
    FROM (
        SELECT
            t.Name,
            GroupDefinition = STRING_AGG(Characteristic + ':' + Value, '|')
                WITHIN GROUP (ORDER BY t.Characteristic)
        FROM YourTable t
        GROUP BY
            t.Name
    ) t
) Groups ON Groups.Name = t.Name;
db<>fiddle
Another option would be to aggregate it into a JSON or XML format, then shred it back out without re-joining the base table.
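To see the grouping idea in action without a SQL Server instance, here is a minimal sketch that runs the same technique in SQLite through Python's sqlite3 module. It is an adaptation, not the answer's exact query: SQLite has no STRING_AGG ... WITHIN GROUP, so the sketch swaps in group_concat as a window function (this assumes a SQLite build with window function support, 3.25 or later). The sample rows are a reduced, case-normalised subset of the question's data.

```python
import sqlite3

# Reduced sample: Mary and Anna share one set of characteristics, Bill has another.
rows = [
    ("Mary", "eyes", "Blu"), ("Mary", "Hair", "Blonde"), ("Mary", "Sex", "Female"),
    ("Bill", "eyes", "Blu"), ("Bill", "Hair", "Blond"), ("Bill", "Sex", "Male"),
    ("Anna", "eyes", "Blu"), ("Anna", "Hair", "Blonde"), ("Anna", "Sex", "Female"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Name TEXT, Characteristic TEXT, Value TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)

query = """
WITH Definitions AS (
    -- One ordered 'Characteristic:Value|...' string per name.
    -- The full-partition frame gives every row of a name the same string,
    -- and DISTINCT collapses them to one row per name.
    SELECT DISTINCT
        Name,
        group_concat(Characteristic || ':' || Value, '|') OVER (
            PARTITION BY Name
            ORDER BY Characteristic
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS GroupDefinition
    FROM YourTable
)
-- Names with identical definition strings receive the same group number.
SELECT Name, DENSE_RANK() OVER (ORDER BY GroupDefinition) AS GroupId
FROM Definitions
ORDER BY Name
"""
group_ids = dict(conn.execute(query).fetchall())
print(group_ids)
```

Group numbers follow the sort order of the definition strings, so they will not necessarily match the numbering in the question's expected output. Values that contain the ':' or '|' separators could also make two different sets look identical, so choose separators that cannot occur in the data.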

Related

Gather the number of customers by street

I have two tables:
Customer:
id  name     address_id
1   John     4
2   Kate     5
3   Bob      2
4   Michael  2
5   Adriana  3
6   Ann      1
Address:
id  detail_str_name                city   district     street_name
1   France,Paris,str.2,N5          Paris  Paris        str.2
2   France,Parise,str.2 ,N3        Paris  Paris        str.2
3   France, Lille ,str.3,N4        Lille  Lille        str.3
4   France,Paris,str.4,N3          Paris  Paris        str.4
5   France, Paris, Batignolles,N4  Paris  Batignolles  Batignolles
I want a table like this:
name     detail_str_name                city   district     street_name  sum(cu.num_cust)
John     France,Paris,str.4,N3          Paris  Paris        str.4        1
Kate     France, Paris, Batignolles,N4  Paris  Batignolles  Batignolles  1
Bob      France,Parise,str.2 ,N3        Paris  Paris        str.2        3
Michael  France,Parise,str.2 ,N3        Paris  Paris        str.2        3
Adriana  France, Lille ,str.3,N4        Lille  Lille        str.3        1
Ann      France,Paris,str.2,N5          Paris  Paris        str.2        3
I want to count customers grouped by city, district and street_name, not by detail_str_name.
I tried:
select cu.name, ad.detail_str_name, ad.city, ad.district, ad.street_name, sum(cu.num_cust)
from
    (select address_id, name, count(id) as num_cust
     from customer
     group by address_id, name) cu
left join address ad on cu.address_id = ad.id
group by cu.name, ad.detail_str_name, ad.city, ad.district, ad.street_name
But this code groups by detail_str_name, which does not suit me.
What can I change?
I haven't been able to check this so it might not be totally correct but I think the query below should get the data you require.
This SQLTutorial article on the partition by clause might be useful.
SELECT cu.name,
       ad.detail_str_name,
       ad.city,
       ad.district,
       ad.street_name,
       COUNT(cu.name) OVER (PARTITION BY ad.city, ad.district, ad.street_name) AS num_cust
FROM customer cu
JOIN address ad ON ad.id = cu.address_id
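Since the answer was posted unchecked, here is a small sketch that runs the same windowed count against the question's sample data. It uses SQLite through Python's sqlite3 module as a stand-in for the asker's database (an assumption, since the DBMS is not stated) and needs a SQLite build with window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER, name TEXT, address_id INTEGER);
CREATE TABLE address (id INTEGER, detail_str_name TEXT, city TEXT,
                      district TEXT, street_name TEXT);
INSERT INTO customer VALUES
    (1, 'John', 4), (2, 'Kate', 5), (3, 'Bob', 2),
    (4, 'Michael', 2), (5, 'Adriana', 3), (6, 'Ann', 1);
INSERT INTO address VALUES
    (1, 'France,Paris,str.2,N5', 'Paris', 'Paris', 'str.2'),
    (2, 'France,Parise,str.2 ,N3', 'Paris', 'Paris', 'str.2'),
    (3, 'France, Lille ,str.3,N4', 'Lille', 'Lille', 'str.3'),
    (4, 'France,Paris,str.4,N3', 'Paris', 'Paris', 'str.4'),
    (5, 'France, Paris, Batignolles,N4', 'Paris', 'Batignolles', 'Batignolles');
""")

# The window counts customers per (city, district, street_name) while
# keeping one output row per customer; detail_str_name plays no part.
query = """
SELECT cu.name,
       COUNT(*) OVER (PARTITION BY ad.city, ad.district, ad.street_name) AS num_cust
FROM customer cu
JOIN address ad ON ad.id = cu.address_id
"""
num_cust = dict(conn.execute(query).fetchall())
print(num_cust)
```

Bob, Michael and Ann live at two different address rows that share the same city, district and street, so each of them gets a count of 3, which matches the table the asker wants.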

How to select data in SQL based on a filter which changes if there is no data in a specific table column?

I have tables similar to the three below. I need to join the first two tables based on id, and then join the third table based on second name. However the last table needs a filter where the city should be equal to London unless age is empty in which case the city should equal Manchester.
I tried the code below using a CASE statement, but it is not working. I am new to SQL, so I was not sure how to combine a WHERE clause with conditional logic where the filter changes depending on whether there is data in a different column from the one being filtered. I am using Toad for Oracle.
FIRST.NAME.TABLE
ID FIRST_NAME ENTRY_DATE
1 JOHN 09/09/2019
2 NICOLA 09/09/2019
3 PATRICK 05/09/2019
4 JOAN 01/09/2019
5 JAKE 09/09/2019
6 AMELIA 01/09/2019
7 CAMERON 09/09/2019
SECOND.NAME.TABLE
ID SECOND_NAME ENTRY_DATE
1 BROWN 09/09/2019
2 SMITH 09/09/2019
3 COLE 05/09/2019
4 HOUSTON 01/09/2019
5 FARRIS 09/09/2019
6 HATHAWAY 01/09/2019
7 JONES 09/09/2019
CITY.AGE.TABLE
CITY SECOND_NAME AGE
LONDON BROWN 24.00
LONDON SMITH
MANCHESTER COLE 30.00
MANCHESTER HOUSTON 66.00
LONDON FARRIS
LONDON HATHAWAY 32.00
GLASGOW JONES 28.00
MANCHESTER SMITH 32.00
LONDON FARRIS 62.00
SELECT FN.ID,
       FN.FIRST_NAME,
       SN.SECOND_NAME,
       AC.CITY,
       AC.AGE
FROM FIRST.NAME.TABLE AS FN
INNER JOIN SECOND.NAME.TABLE SN
    ON FN.ID=SN.ID
INNER JOIN CITY.AGE.TABLE AS AC
    ON SN.SECOND_NAME=AC.SECOND_NAME
WHERE FN.ENTRY_DATE='09-SEP-19'
AND SN.ENTRY_DATE='09-SEP-19'
AND (CASE WHEN AC.CITY='LONDON' AND AC.AGE IS NOT NULL
          THEN AC.CITY='LONDON'
          ELSE AC.CITY='MANCHESTER' END)
You can express this as boolean logic:
WHERE FN.ENTRY_DATE = DATE '2019-09-09' AND
      SN.ENTRY_DATE = DATE '2019-09-09' AND
      (AC.AGE IS NOT NULL AND AC.CITY = 'LONDON' OR
       AC.AGE IS NULL AND AC.CITY = 'MANCHESTER'
      )
This answers your question about how to implement the logic using SQL. However, I'm not sure that is the logic that you really want. I speculate that you really want a LEFT JOIN to the age table.
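A quick way to see what this filter actually returns is to run it on the sample data. The sketch below does that in SQLite via Python's sqlite3 module rather than Oracle, so several things are assumptions for illustration: the table names first_names, second_names and city_age are hypothetical stand-ins for the dotted names in the question, and dates are stored as ISO text instead of Oracle DATE values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_names  (id INTEGER, first_name TEXT, entry_date TEXT);
CREATE TABLE second_names (id INTEGER, second_name TEXT, entry_date TEXT);
CREATE TABLE city_age     (city TEXT, second_name TEXT, age REAL);
INSERT INTO first_names VALUES
    (1, 'JOHN', '2019-09-09'), (2, 'NICOLA', '2019-09-09'),
    (3, 'PATRICK', '2019-09-05'), (4, 'JOAN', '2019-09-01'),
    (5, 'JAKE', '2019-09-09'), (6, 'AMELIA', '2019-09-01'),
    (7, 'CAMERON', '2019-09-09');
INSERT INTO second_names VALUES
    (1, 'BROWN', '2019-09-09'), (2, 'SMITH', '2019-09-09'),
    (3, 'COLE', '2019-09-05'), (4, 'HOUSTON', '2019-09-01'),
    (5, 'FARRIS', '2019-09-09'), (6, 'HATHAWAY', '2019-09-01'),
    (7, 'JONES', '2019-09-09');
INSERT INTO city_age VALUES
    ('LONDON', 'BROWN', 24.0), ('LONDON', 'SMITH', NULL),
    ('MANCHESTER', 'COLE', 30.0), ('MANCHESTER', 'HOUSTON', 66.0),
    ('LONDON', 'FARRIS', NULL), ('LONDON', 'HATHAWAY', 32.0),
    ('GLASGOW', 'JONES', 28.0), ('MANCHESTER', 'SMITH', 32.0),
    ('LONDON', 'FARRIS', 62.0);
""")

# Same boolean logic as the answer, with explicit parentheses around
# each AND branch so the precedence is visible.
query = """
SELECT fn.id, fn.first_name, sn.second_name, ac.city, ac.age
FROM first_names fn
JOIN second_names sn ON fn.id = sn.id
JOIN city_age ac ON sn.second_name = ac.second_name
WHERE fn.entry_date = '2019-09-09'
  AND sn.entry_date = '2019-09-09'
  AND ((ac.age IS NOT NULL AND ac.city = 'LONDON')
       OR (ac.age IS NULL AND ac.city = 'MANCHESTER'))
ORDER BY fn.id
"""
result = conn.execute(query).fetchall()
print(result)
```

On this data only JOHN BROWN and JAKE FARRIS come back. SMITH disappears entirely, because the London row has no age and the Manchester row has one, so neither branch of the condition matches. That row-by-row behaviour may be why the answer suspects a different join is really wanted.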

Obtain percentage of values in a column

I have a table called Director, which looks like
DirectorID FirstName FamilyName FullName DoB Gender
1 Steven Spielberg Steven Spielberg 1946-12-18 Male
2 Joel Coen Joel Coen 1954-11-29 Male
3 Ethan Coen Ethan Coen 1957-09-21 Male
4 George Lucas George Lucas 1944-05-14 Male
5 Ang Lee Ang Lee 1954-10-23 Male
6 Martin Scorsese Martin Scorsese 1942-11-17 Male
7 Mimi Leder Mimi Leder 1952-01-26 Female
I am trying to work out the percentage of female directors among all directors.
I can work out the number of male and female directors using:
SELECT count(*) as myMale from Director
where Gender = 'Male'
SELECT count(*) as myFemale from Director
where Gender = 'Female'
But I am having trouble combining them to obtain a percentage of Female Directors.
I am looking for the result of 14.3%, which is calculated using:
Total Female Directors / (Total Male Directors + Total Female Directors)
or
1/(6+1)
How would I do this with SQL?
A simple method uses conditional aggregation. Assuming every director is recorded as either male or female, this suffices:
select avg(case when Gender = 'Female' then 1.0 else 0 end) as ratio_female
from Director;
This returns a ratio (about 0.143 for the sample data); multiply it by 100 to get a percentage. If other values are possible and you want to limit the calculation to male and female directors, then include where Gender in ('Female', 'Male').
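As a quick check of the conditional-average approach, the following sketch loads the seven directors from the question into an in-memory SQLite database (a stand-in chosen for illustration, since the asker's DBMS is not stated) and scales the ratio to a percentage. The ROUND call and the reduced column list are additions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Director (DirectorID INTEGER, FullName TEXT, Gender TEXT)")
conn.executemany(
    "INSERT INTO Director VALUES (?, ?, ?)",
    [
        (1, "Steven Spielberg", "Male"), (2, "Joel Coen", "Male"),
        (3, "Ethan Coen", "Male"), (4, "George Lucas", "Male"),
        (5, "Ang Lee", "Male"), (6, "Martin Scorsese", "Male"),
        (7, "Mimi Leder", "Female"),
    ],
)

# AVG over a 1.0/0 flag is the share of rows where the flag is 1;
# multiplying by 100.0 turns that share into a percentage.
(pct_female,) = conn.execute("""
    SELECT ROUND(100.0 * AVG(CASE WHEN Gender = 'Female' THEN 1.0 ELSE 0 END), 1)
    FROM Director
""").fetchone()
print(pct_female)
```

Using 1.0 rather than 1 in the CASE expression matters in engines that do integer averaging, where an all-integer flag could otherwise be truncated to 0.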

List the Id who appeared once only in Relational Algebra

Let's say there's a table called Winner, with 3 attributes: Name, Gender and Id.
Name Gender Id
Kevin Male 8
Kevin Male 8
Benny Male 31
Jenny Female 7
Louie Male 4
Peter Male 11
Kevin Male 2
Jenny Female 7
Jenny Female 7
Chris Male 23
Louie Female 14
Apart from people who are actually two different persons with the same name, and people who have the same name but a different gender, the Id is the unique value that identifies each person. If I want to list all the Ids that appear only once in the list, I am thinking of doing something like this:
Am I expressing it correctly?
I don't know what your formula is trying to say, but in SQL you can achieve the result you want with a GROUP BY query:
SELECT Id, COUNT(Id) AS idCount
FROM Winner
GROUP BY Id
HAVING COUNT(Id) = 1
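The GROUP BY ... HAVING query can be tried on the Winner table from the question with a short sketch like this one, which uses an in-memory SQLite database through Python's sqlite3 module (chosen only for illustration). The ORDER BY is an addition to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Winner (Name TEXT, Gender TEXT, Id INTEGER)")
conn.executemany(
    "INSERT INTO Winner VALUES (?, ?, ?)",
    [
        ("Kevin", "Male", 8), ("Kevin", "Male", 8), ("Benny", "Male", 31),
        ("Jenny", "Female", 7), ("Louie", "Male", 4), ("Peter", "Male", 11),
        ("Kevin", "Male", 2), ("Jenny", "Female", 7), ("Jenny", "Female", 7),
        ("Chris", "Male", 23), ("Louie", "Female", 14),
    ],
)

# Group the rows by Id and keep only the groups that contain a single row.
single_ids = [
    row[0]
    for row in conn.execute("""
        SELECT Id
        FROM Winner
        GROUP BY Id
        HAVING COUNT(Id) = 1
        ORDER BY Id
    """)
]
print(single_ids)
```

Ids 8 and 7 are dropped because they occur two and three times respectively; every other Id occurs exactly once.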

Returning only records that have matching fields in other records

I have a query that returns a list of customers and their addresses.
ID FName LName Address1 City Postcode
--------------------------------------------------------
1 James Smith 1 Bank Street London W1C 1AA
2 Sarah Jones 45 Moor Ave London SW1 1YH
3 Mary Smith 1 Bank Street London W1C 1AA
4 Sean Baker 17 White Blvd London SE3 7TH
5 Bob Patel 58B Canal St London NW2 2TT
6 Seeta Patel 58B Canal St London NW2 2TT
7 David Hound 4 Main St London E11 8AB
I'm trying to produce another query from this data that selects a list of customers who are related or living together. The criteria for this would be the same Address1 and Postcode fields.
My question is how I can produce a query that only selects records that have at least one other record with matching [Address1] and [Postcode], i.e., in the above example return only records 1, 3, 5 and 6.
SELECT c.*
FROM Customers c
JOIN (
    SELECT Address1, PostCode
    FROM Customers
    GROUP BY Address1, PostCode
    HAVING COUNT(1) > 1
) c2
    ON c.Address1 = c2.Address1 AND c.PostCode = c2.PostCode
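Here is a minimal sketch of the same join-against-grouped-subquery idea, run on the question's sample rows in an in-memory SQLite database via Python's sqlite3 module. The table name Customers is an assumption, since the question only shows the result of a query and never names the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customers (ID INTEGER, FName TEXT, LName TEXT,
                            Address1 TEXT, City TEXT, Postcode TEXT)
""")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "James", "Smith", "1 Bank Street", "London", "W1C 1AA"),
        (2, "Sarah", "Jones", "45 Moor Ave", "London", "SW1 1YH"),
        (3, "Mary", "Smith", "1 Bank Street", "London", "W1C 1AA"),
        (4, "Sean", "Baker", "17 White Blvd", "London", "SE3 7TH"),
        (5, "Bob", "Patel", "58B Canal St", "London", "NW2 2TT"),
        (6, "Seeta", "Patel", "58B Canal St", "London", "NW2 2TT"),
        (7, "David", "Hound", "4 Main St", "London", "E11 8AB"),
    ],
)

# The subquery finds (Address1, Postcode) pairs used by more than one
# customer; the join keeps only customers living at one of those pairs.
shared_ids = [
    row[0]
    for row in conn.execute("""
        SELECT c.ID
        FROM Customers c
        JOIN (
            SELECT Address1, Postcode
            FROM Customers
            GROUP BY Address1, Postcode
            HAVING COUNT(*) > 1
        ) c2 ON c.Address1 = c2.Address1 AND c.Postcode = c2.Postcode
        ORDER BY c.ID
    """)
]
print(shared_ids)
```

The result contains customers 1, 3, 5 and 6, the records the question asks for.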