Summarize Null Values in Table with Group By - sql

I have two tables:
Person(ID, Name)
Sports(person_ID, Sport)
The problem: Sport can contain NULL values. If any of a person's rows has a NULL sport, then the sport should be NULL when I group by ID.
SELECT p.ID, p.Name, s.Sport
FROM Person p
INNER JOIN Sports s ON p.ID=s.person_id
GROUP BY p.ID
Without the Group By the table looks like this:
p.ID p.Name s.Sport
1 tom soccer
1 tom NULL
2 lisa golf
2 lisa soccer
3 tim golf
3 tim NULL
What I want now:
1 tom NULL
2 lisa golf
3 tim NULL
But what I get:
1 tom soccer
2 lisa golf
3 tim golf
I've tried subselects and ifs but I couldn't get anything to work. Thanks in advance!

Here is a query which should generate your expected result set, though as @jarlh has pointed out, it isn't clear why Lisa should play golf over soccer.
SELECT
p.ID,
p.Name,
CASE WHEN COUNT(CASE WHEN s.Sport IS NULL THEN 1 END) > 0
THEN NULL ELSE MIN(s.Sport) END AS Sport
FROM Person p
INNER JOIN Sports s
ON p.ID = s.person_id
GROUP BY
p.ID,
p.name;
Note that I group by both the ID and name, which would be required on many databases (though perhaps not SQLite).
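As a quick check, the query above can be run against the sample rows from the question. This is only a sketch using Python's built-in sqlite3 module and an in-memory database; the table contents are copied from the question, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Sports (person_ID INTEGER, Sport TEXT);
    INSERT INTO Person VALUES (1, 'tom'), (2, 'lisa'), (3, 'tim');
    INSERT INTO Sports VALUES
        (1, 'soccer'), (1, NULL),
        (2, 'golf'),   (2, 'soccer'),
        (3, 'golf'),   (3, NULL);
""")

# COUNT(CASE ...) counts only the rows whose Sport is NULL;
# if there is at least one, the whole group reports NULL.
query = """
    SELECT p.ID, p.Name,
           CASE WHEN COUNT(CASE WHEN s.Sport IS NULL THEN 1 END) > 0
                THEN NULL ELSE MIN(s.Sport) END AS Sport
    FROM Person p
    INNER JOIN Sports s ON p.ID = s.person_ID
    GROUP BY p.ID, p.Name
    ORDER BY p.ID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

For Lisa, who has no NULL row, MIN() picks 'golf' simply because it sorts before 'soccer'.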

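Aggregate functions such as MIN() skip NULL values, so MIN() alone cannot return NULL here.
You could try replacing NULL with an empty string first, which then sorts lowest (note that the result is '' rather than NULL):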
SELECT p.ID, p.Name, min(ifnull(s.Sport,''))
FROM Person p
INNER JOIN Sports s ON p.ID=s.person_id
GROUP BY p.ID, p.name
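If an actual NULL is wanted instead of an empty string, one possible variation is to wrap the aggregate in NULLIF(). The sketch below is not part of the original answer; it assumes that no sport is literally stored as an empty string, and it uses Python's sqlite3 module with the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Sports (person_ID INTEGER, Sport TEXT);
    INSERT INTO Person VALUES (1, 'tom'), (2, 'lisa'), (3, 'tim');
    INSERT INTO Sports VALUES
        (1, 'soccer'), (1, NULL),
        (2, 'golf'),   (2, 'soccer'),
        (3, 'golf'),   (3, NULL);
""")

# IFNULL maps NULL to '', which sorts below every real sport name,
# so MIN() picks it; NULLIF then turns that '' back into NULL.
# Assumption: '' never occurs as a genuine Sport value.
query = """
    SELECT p.ID, p.Name,
           NULLIF(MIN(IFNULL(s.Sport, '')), '') AS Sport
    FROM Person p
    INNER JOIN Sports s ON p.ID = s.person_ID
    GROUP BY p.ID, p.Name
    ORDER BY p.ID
"""
rows = conn.execute(query).fetchall()
print(rows)
```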

Assuming the version of SQLite you are using supports ROW_NUMBER() (3.25 or later), you can number the rows of each person ordered by s.Sport ASC and keep only the first row per person. SQLite sorts NULL before any other value in ascending order, so a NULL sport, if present, ends up in the first row. You don't need GROUP BY:
;with cte as (
select p.ID, p.Name, s.Sport,
ROW_NUMBER() OVER (PARTITION BY p.ID ORDER BY s.Sport ASC) AS rn
FROM Person p INNER JOIN Sports s ON p.ID=s.person_id
)
select *
from cte
where rn=1

You can do this with a correlated subquery, avoiding the join in the outer query:
select p.*,
(select s.sport
from sports s
where s.person_id = p.id
order by (s.sport is null) desc, s.sport asc
limit 1
) as min_sport
from person p;
This may prove useful under some circumstances. With an index on sports(person_id, sport), it might be faster than the group by, depending on the data (lots of people, few sports per person).
Also, this is slightly different from your query because it returns all people, even those with no sports.
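The difference in the returned rows can be seen with a small sketch. It uses Python's sqlite3 module, the sample data from the question, and one extra person, 'ann', who is hypothetical and has no rows in Sports. The column name person_ID follows the question's schema, and LIMIT 1 keeps the subquery scalar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Sports (person_ID INTEGER, Sport TEXT);
    -- 'ann' is a made-up person with no sports at all.
    INSERT INTO Person VALUES (1, 'tom'), (2, 'lisa'), (3, 'tim'), (4, 'ann');
    INSERT INTO Sports VALUES
        (1, 'soccer'), (1, NULL),
        (2, 'golf'),   (2, 'soccer'),
        (3, 'golf'),   (3, NULL);
""")

# (s.Sport IS NULL) evaluates to 1 for NULL rows, so DESC puts them first;
# otherwise the alphabetically smallest sport wins.
query = """
    SELECT p.ID, p.Name,
           (SELECT s.Sport
            FROM Sports s
            WHERE s.person_ID = p.ID
            ORDER BY (s.Sport IS NULL) DESC, s.Sport ASC
            LIMIT 1) AS min_sport
    FROM Person p
    ORDER BY p.ID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Here 'ann' appears with a NULL sport because the subquery finds no rows for her, whereas the inner join versions would drop her entirely.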

Related

Sql query for the lowest score per country

I have a database with 3 tables. The country table has id and name columns. The sport table also has id and name columns. Finally, the match table has id, player1 and player2 (the ids of the two countries that play against each other), winner_id (the id of the country that won the match) and sport_id (the sport that was played). "Least wins" means that I only need the sport in which a country had the fewest wins, no matter how many matches were played.
I want to show the sport per country with the least wins. It should look like this:
Country  Sport       Wins
France   Basketball  2
How can I construct this query? I'm using SQL Server.
The data in the tables looks like this.
Table countries:
country_id  name
1           France
2           England
Table sport:
sport_id  name
1         Footbal
2         Basketball
Table match:
match_id  player1  player2  winner_id  sport_id
1         3        1        3          1
2         6        4        4          2
Note that the wording "least wins" is ambiguous. In my solution, it means the sport with the fewest wins and, among ties, the most matches played.
To get this ranking, we need to know how many matches a country has played in each sport and how many of those have been won.
SELECT
country.name AS country,
sport.name AS sport,
sport_wins.wins
FROM
country
OUTER APPLY (
SELECT TOP 1
t.match_count,
COALESCE(t.wins, 0) AS wins,
t.sport_id
FROM (
SELECT
COUNT(*) AS match_count,
m_c.sport_id,
t.wins
FROM match m_c
OUTER APPLY (
SELECT
COUNT(*) AS wins,
match.sport_id
FROM match
WHERE country.country_id = match.winner_id
AND match.sport_id = m_c.sport_id
GROUP BY match.sport_id
) t
WHERE country.country_id IN (m_c.player1, m_c.player2)
GROUP BY m_c.sport_id, t.wins
) t
ORDER BY t.wins ASC, t.match_count DESC
) sport_wins
JOIN sport ON sport.sport_id = sport_wins.sport_id
If you do not take into account losses, but only the number of wins is of interest, you can use a query like this one
WITH cte AS (
SELECT
country.country_id,
sport.sport_id,
SUM(CASE WHEN match.winner_id = country.country_id THEN 1 ELSE 0 END) AS wins
FROM country
CROSS JOIN sport
JOIN match ON match.sport_id = sport.sport_id
AND country.country_id IN (match.player1, match.player2)
GROUP BY country.country_id, sport.sport_id
)
SELECT
country.name,
sport.name,
t.min_wins AS wins
FROM (
SELECT
country_id,
MIN(wins) AS min_wins
FROM cte
GROUP BY country_id
) t
JOIN cte ON cte.country_id = t.country_id AND cte.wins = min_wins
JOIN country ON cte.country_id = country.country_id
JOIN sport ON cte.sport_id = sport.sport_id
This query only considers sports in which the country has actually played a match. A sport the country never competed in is excluded; otherwise it would count as 0 wins and always be the minimum.
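The wins-only approach can be tried out with a small sketch. It uses Python's sqlite3 module rather than SQL Server, and all rows below are made up for illustration, since the sample data in the question is incomplete. The table name match is quoted because MATCH is a keyword in SQLite. 'Tennis' is a hypothetical sport with no matches, included to show that unplayed sports are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (country_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sport   (sport_id   INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "match" (match_id INTEGER PRIMARY KEY, player1 INTEGER,
                          player2 INTEGER, winner_id INTEGER, sport_id INTEGER);
    -- Hypothetical data, not taken from the question.
    INSERT INTO country VALUES (1, 'France'), (2, 'England');
    INSERT INTO sport VALUES (1, 'Football'), (2, 'Basketball'), (3, 'Tennis');
    INSERT INTO "match" VALUES
        (1, 1, 2, 1, 1), (2, 1, 2, 1, 1), (3, 1, 2, 2, 1),
        (4, 1, 2, 1, 2), (5, 1, 2, 2, 2), (6, 1, 2, 2, 2);
""")

# Step 1 (cte): wins per country and sport, only for sports actually played.
# Step 2: keep the sport(s) whose win count equals the country's minimum.
query = """
    WITH cte AS (
        SELECT c.country_id, s.sport_id,
               SUM(CASE WHEN m.winner_id = c.country_id THEN 1 ELSE 0 END) AS wins
        FROM country c
        CROSS JOIN sport s
        JOIN "match" m ON m.sport_id = s.sport_id
                      AND c.country_id IN (m.player1, m.player2)
        GROUP BY c.country_id, s.sport_id
    )
    SELECT c.name, s.name, t.min_wins
    FROM (SELECT country_id, MIN(wins) AS min_wins
          FROM cte GROUP BY country_id) t
    JOIN cte ON cte.country_id = t.country_id AND cte.wins = t.min_wins
    JOIN country c ON c.country_id = cte.country_id
    JOIN sport s ON s.sport_id = cte.sport_id
    ORDER BY c.name
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With this data France has 2 football wins and 1 basketball win, and England the reverse, so each country's weakest sport has 1 win and Tennis never shows up.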
You first need to cross-join the sports with the countries and then count the wins.
You can then use a row-number approach to get the bottom country in each sport:
SELECT
c.Country,
c.Sport,
c.Wins
FROM (
SELECT
c.name Country,
s.name Sport,
COUNT(m.winner_id) Wins,
ROW_NUMBER() OVER (PARTITION BY s.sport_id, s.name ORDER BY COUNT(m.winner_id)) rn
FROM country c
CROSS JOIN sport s
LEFT JOIN [match] m
ON s.sport_id = m.sport_id AND m.winner_id = c.country_id
GROUP BY
s.sport_id,
s.name,
c.country_id,
c.name
) c
WHERE c.rn = 1;

How to use partition on a join to get a count

I'm confused about how to get a count on a join without using GROUP BY.
I know I can get the desired results using GROUP BY, but the joins are long and there are many selected columns with CASE expressions, so I was hoping to avoid that.
I'm sure I've seen this done before using OVER (PARTITION BY ...), but I can't find a good example that uses it on a join. Maybe it's not possible?
I've tried
select
p.FirstName,
p.Surname,
count(pr.RelativePersonId) over (partition by pr.RelativePersonId) as [RelativesOnRecord]
from People p
left join PersonRelatives pr
on p.PersonId = pr.PersonId
For my tables:
People
PersonId | FirstName | Surname
1 Jim Bo
2 Harry Bo
3 Strong Bo
PersonRelatives
Id | PersonId | RelativePersonId
1 1 2
2 1 3
Where I'm trying to get
PersonId | FirstName | Surname | RelativesOnRecord
1 Jim Bo 2
I also tried joining with a SELECT TOP 1 but that just gives me the one result so one count. Is this even possible without group by?
It seems you are partitioning by the wrong column. You want the number of relatives for each person from People, right? Use:
count(pr.RelativePersonId) over (partition by pr.PersonId) as [RelativesOnRecord]
Based on your example, you want aggregation:
select p.PersonId, p.FirstName, p.Surname, count(*) as [RelativesOnRecord]
from People p join
PersonRelatives pr
on p.PersonId = pr.PersonId
group by p.PersonId, p.FirstName, p.Surname;
You could use apply or a correlated subquery, but window functions do not seem appropriate here.
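The correlated-subquery alternative mentioned above might look like the following sketch. It is not from the original answer; it uses Python's sqlite3 module and the sample tables from the question. No GROUP BY is needed in the outer query, so the long select list can stay as it is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (PersonId INTEGER PRIMARY KEY, FirstName TEXT, Surname TEXT);
    CREATE TABLE PersonRelatives (Id INTEGER PRIMARY KEY, PersonId INTEGER,
                                  RelativePersonId INTEGER);
    INSERT INTO People VALUES (1, 'Jim', 'Bo'), (2, 'Harry', 'Bo'), (3, 'Strong', 'Bo');
    INSERT INTO PersonRelatives VALUES (1, 1, 2), (2, 1, 3);
""")

# The subquery is evaluated once per row of People and counts
# that person's relatives, so the outer query needs no GROUP BY.
query = """
    SELECT p.PersonId, p.FirstName, p.Surname,
           (SELECT COUNT(*)
            FROM PersonRelatives pr
            WHERE pr.PersonId = p.PersonId) AS RelativesOnRecord
    FROM People p
    ORDER BY p.PersonId
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Unlike the inner join with GROUP BY, this version also lists people with no relatives, with a count of 0; a WHERE clause could filter them out if only Jim is wanted.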

SQL Left join Query Issue

I have the following tables
Users
ID Name
1 John Smith
2 James Jones
3 Peter Brown
Purchases
USERID NAME
1 Apple
1 Pear
1 Banana
2 Apple
2 Pear
3 Apple
How can I write a SQL query that returns the users who bought all three items (apple, pear and banana)?
So for the tables above it would return only 'John Smith'.
Thanks
This is an example of a set-within-sets query. A flexible way to solve this is using group by and having:
select userid
from purchases
where name in ('apple', 'pear', 'banana')
group by userid
having count(*) = 3;
To get the name, you would join in the Users table.
If, unlike in your sample data, duplicates are allowed in the table, then use count(distinct name) = 3 in the having clause.
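The effect of duplicates can be illustrated with a short sketch. It uses Python's sqlite3 module and the sample data from the question, plus one hypothetical extra row (a second 'Apple' purchase for James Jones) that is not in the original tables. The join to Users is included to return names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Purchases (UserID INTEGER, Name TEXT);
    INSERT INTO Users VALUES (1, 'John Smith'), (2, 'James Jones'), (3, 'Peter Brown');
    INSERT INTO Purchases VALUES
        (1, 'Apple'), (1, 'Pear'), (1, 'Banana'),
        (2, 'Apple'), (2, 'Pear'),
        (2, 'Apple'),   -- hypothetical duplicate purchase
        (3, 'Apple');
""")

template = """
    SELECT u.Name
    FROM Purchases p
    JOIN Users u ON u.ID = p.UserID
    WHERE p.Name IN ('Apple', 'Pear', 'Banana')
    GROUP BY u.ID, u.Name
    HAVING {condition}
    ORDER BY u.Name
"""

# COUNT(*) counts rows, so James (Apple, Pear, Apple) wrongly qualifies.
naive = [r[0] for r in conn.execute(template.format(condition="COUNT(*) = 3"))]
# COUNT(DISTINCT ...) counts different products, so only John qualifies.
distinct = [r[0] for r in conn.execute(
    template.format(condition="COUNT(DISTINCT p.Name) = 3"))]
print(naive, distinct)
```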
You can JOIN both tables and COUNT products:
SELECT u.Name
FROM Purchases p
JOIN Users u
ON p.UserID = u.ID
WHERE p.Name IN ('Apple', 'Pear', 'Banana')
GROUP BY userid
HAVING COUNT(DISTINCT p.Name) = 3;
There is possibility that user has bought 2 apples and one banana so you should count distinct product names.
If you are using PostgreSQL, SQL Server or Oracle, you need to wrap Name in an aggregate function:
SELECT MAX(u.Name) AS Name
FROM Purchases p
JOIN Users u
ON p.UserID = u.ID
WHERE p.Name IN ('Apple', 'Pear', 'Banana')
GROUP BY userid
HAVING COUNT(DISTINCT p.Name) = 3;
SELECT users.name, purchased.items
FROM users
INNER JOIN (
select userid as id, array_agg(distinct name) items
from purchases
group by userid
having array_length(array_agg(distinct name),1)=3
) purchased USING(id);
This does not work if you are only looking for those specific items. If other items can be purchased and you only want users who purchased those specific items, you would have to do something similar to what Gordon did, with a correction for the number of times an item was purchased.

MSSQL: Display Rows for a Select with Case and Count even if Count = 0

I hope I can explain my problem in enough detail for you to understand it.
I've created a small example.
I have a table which looks like this:
City | Name
Berlin | Mike
Berlin City| Peter
Stuttgart | Boris
here is my Query:
SELECT CASE
WHEN City like '%Berlin%' THEN 'Count Person in Berlin:'
WHEN City like '%Stuttgart%' THEN 'Count Person in Stuttgart:'
WHEN City like '%Dresden%' THEN 'Count Person in Dresden:'
ELSE 'unknown'
END AS Text,
COUNT(Name) AS countPersons
FROM tblTest
GROUP BY City
This is the result:
Count Person in Berlin: 2
Count Person in Stuttgart: 1
But my desired result is:
Count Person in Berlin: 2
Count Person in Stuttgart: 1
Count Person in Dresden: 0
How can I get my desired result? I hope you can help me.
Thanks in advance.
If you don't have a table with the list of cities, then you can use a subquery. The key to solving this type of problem is left outer join:
select cities.city, count(t.city) as numpeople
from (select 'Berlin' as city union all
select 'Stuttgart' union all
select 'Dresden'
) cities left outer join
tbltest t
on t.city = cities.city
group by cities.city;
If you want to have 'unknown' as well, then full outer join can be used:
select coalesce(cities.city, 'unknown') as city, count(t.city) as numpeople
from (select 'Berlin' as city union all
select 'Stuttgart' union all
select 'Dresden'
) cities full outer join
tbltest t
on t.city = cities.city
group by coalesce(cities.city, 'unknown');
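The left outer join idea can be checked with a small sketch using Python's sqlite3 module and the sample table from the question. One change from the answer is an assumption on my part: the join condition uses LIKE, as the question's CASE expression does, so that 'Berlin City' is counted under Berlin. String concatenation is written with || here; SQL Server would use + instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblTest (City TEXT, Name TEXT);
    INSERT INTO tblTest VALUES
        ('Berlin', 'Mike'), ('Berlin City', 'Peter'), ('Stuttgart', 'Boris');
""")

# The derived table lists every city that must appear in the output.
# LEFT JOIN keeps cities without a match; COUNT(t.Name) ignores the
# NULLs produced for them, which yields 0 for Dresden.
query = """
    SELECT cities.city, COUNT(t.Name) AS numpeople
    FROM (SELECT 'Berlin' AS city UNION ALL
          SELECT 'Stuttgart' UNION ALL
          SELECT 'Dresden') cities
    LEFT JOIN tblTest t
           ON t.City LIKE '%' || cities.city || '%'
    GROUP BY cities.city
    ORDER BY cities.city
"""
rows = conn.execute(query).fetchall()
print(rows)
```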

Finding group maxes in SQL join result [duplicate]

Possible Duplicate:
SQL: Select first row in each GROUP BY group?
Two SQL tables. One contestant has many entries:
Contestants Entries
Id Name Id Contestant_Id Score
-- ---- -- ------------- -----
1 Fred 1 3 100
2 Mary 2 3 22
3 Irving 3 1 888
4 Grizelda 4 4 123
5 1 19
6 3 50
Low score wins. Need to retrieve current best scores of all contestants ordered by score:
Best Entries Report
Name Entry_Id Score
---- -------- -----
Fred 5 19
Irving 2 22
Grizelda 4 123
I can certainly get this done with many queries. My question is whether there's a way to get the result with one, efficient SQL query. I can almost see how to do it with GROUP BY, but not quite.
In case it's relevant, the environment is Rails ActiveRecord and PostgreSQL.
Here is a PostgreSQL-specific way of doing this:
SELECT DISTINCT ON (c.id) c.name, e.id, e.score
FROM Contestants c
JOIN Entries e ON c.id = e.Contestant_id
ORDER BY c.id, e.score
Update: to order the results by score:
SELECT *
FROM (SELECT DISTINCT ON (c.id) c.name, e.id, e.score
FROM Contestants c
JOIN Entries e ON c.id = e.Contestant_id
ORDER BY c.id, e.score) t
ORDER BY score
The easiest way to do this is with the ranking functions:
select name, Entry_id, score
from (select e.id as Entry_id, e.contestant_id, e.score, c.name,
row_number() over (partition by e.contestant_id order by e.score) as seqnum
from entries e join
contestants c
on e.contestant_id = c.id
) ec
where seqnum = 1
I'm not familiar with PostgreSQL, but something along these lines should work:
SELECT c.*, s.Score
FROM Contestants c
JOIN (SELECT MIN(Score) Score, Contestant_Id FROM Entries GROUP BY Contestant_Id) s
ON c.Id=s.Contestant_Id
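The query above returns the best score but not the entry it came from. One way to recover the entry id, sketched here with Python's sqlite3 module and the sample data from the question, is to join the per-contestant minimum back to Entries. This is an illustration rather than part of the original answer, and it would return several rows for a contestant whose best score is tied across entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contestants (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Entries (Id INTEGER PRIMARY KEY, Contestant_Id INTEGER, Score INTEGER);
    INSERT INTO Contestants VALUES (1, 'Fred'), (2, 'Mary'), (3, 'Irving'), (4, 'Grizelda');
    INSERT INTO Entries VALUES
        (1, 3, 100), (2, 3, 22), (3, 1, 888),
        (4, 4, 123), (5, 1, 19), (6, 3, 50);
""")

# s holds each contestant's lowest score; joining it back to Entries
# on both contestant and score picks out the winning entry row.
query = """
    SELECT c.Name, e.Id AS Entry_Id, e.Score
    FROM Contestants c
    JOIN (SELECT Contestant_Id, MIN(Score) AS Score
          FROM Entries
          GROUP BY Contestant_Id) s
      ON s.Contestant_Id = c.Id
    JOIN Entries e
      ON e.Contestant_Id = s.Contestant_Id AND e.Score = s.Score
    ORDER BY e.Score
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Mary has no entries, so the inner joins leave her out, which matches the report in the question.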
One solution is:
select min(e.score),c.name,c.id from entries e
inner join contestants c on e.contestant_id = c.id
group by e.contestant_id,c.name,c.id
Here is an example:
http://sqlfiddle.com/#!3/9e307/27
This simple query should do the trick:
Select contestants.name as name, entries.id as entry_id, MIN(entries.score) as score
FROM entries
JOIN contestants ON contestants.id = entries.contestant_id
GROUP BY name
ORDER BY score
This grabs the minimum score for each contestant and orders the results by score, ascending.