How to use conditional group by aggregations correctly - sql

I want to be able to count the total type of apples (organic only) from each continent, broken down by countries; including the total count if they're mixed.
For example, food item B1 is organic golden apples from the USA. Thus there should be a count of "1" golden_bag and "1" for organic. Now, A1 is also organic from Argentina - however, it has both granny and red delicious apples - thus it is counted as "1" mixed_bag and "1" for granny_bag and "1" for red_bag as well.
Finally, E1 and F1 are both fuji apples from Laos, but one is organic and the other isn't; so the total count is 2 for fuji_bag and 1 for organic_fd.
Table X:
food_item | food_area | food_loc | food_exp
A1 | lxgs | argentina | 1/1/20
B1 | iyan | usa | 5/31/21
C1 | lxgs | peru | 4/1/20
D1 | wa8e | norway | 10/1/19
E1 | 894a | laos | 5/1/19
F1 | 894a | laos | 9/17/19
Table Y:
food_item | organic
A1 | Y
B1 | Y
C1 | N
D1 | N
E1 | Y
F1 | N
Table Z:
food_item | food_type
A1 | 189
A1 | 190
B1 | 191
C1 | 189
D1 | 192
E1 | 193
F1 | 193
SELECT continent, country,
SUM(organic) AS organic_fd, SUM(Granny) AS granny_bag,
SUM(Red_delc) AS red_bag, SUM(Golden) AS golden_bag,
SUM(Gala) AS gala_bag, SUM(Fuji) AS fuji_bag,
SUM(CASE WHEN Granny + Red_delc + Golden + Gala + Fuji > 1 THEN 1 ELSE 0 END) AS mixed_bag
FROM (SELECT (CASE SUBSTR (x.food_area, 4, 1)
WHEN 's' THEN 'SA' WHEN 'n' THEN 'NA'
WHEN 'e' THEN 'EU' WHEN 'a' THEN 'AS' ELSE NULL END) continent,
x.food_loc country, COUNT(y.organic) AS Organic,
COUNT(CASE WHEN z.food_type = '189' THEN 1 END) AS Granny,
COUNT(CASE WHEN z.food_type = '190' THEN 1 END) AS Red_delc,
COUNT(CASE WHEN z.food_type = '191' THEN 1 END) AS Golden,
COUNT(CASE WHEN z.food_type = '192' THEN 1 END) AS Gala,
COUNT(CASE WHEN z.food_type = '193' THEN 1 END) AS Fuji
FROM x LEFT JOIN z ON x.food_item = z.food_item
LEFT JOIN y on x.food_item = y.food_item and y.organic = 'Y'
WHERE x.food_exp > sysdate
GROUP BY SUBSTR (x.food_area, 4, 1), x.food_loc, y.organic) h
GROUP BY h.continent, h.country, h.organic
I'm not getting the correct output, since for example, Laos will show TWICE to account for the organic count and non-organic count. So it will show 1 organic_fd and 0 organic_fd and 1 fuji_bag and the other line will be another 1 fuji_bag. I would like the TOTAL count. (Also, if I add more food items, my mixed_bag shows mostly "1" count for each record/lines).
Below is the desired output:
| continent | country | organic_fd | granny_bag | red_bag | golden_bag | gala_bag | fuji_bag | mixed_bag |
| SA | argentina | 1 | 1 | 1 | 0 | 0 | 0 | 1 |
| SA | peru | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| NA | usa | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| EU | norway | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| AS | laos | 1 | 0 | 0 | 0 | 0 | 2 | 0 |
So, say I want to add another food item, G1 from Norway and it has 3 types of organic apples: fuji, red, granny... then Norway will now have a count of 1 for the following columns: mixed_bag, organic_fd, fuji_bag, red_bag ,granny_bag (in addition to the previous count of 1 gala_bag). If you add H1, which is exactly the same as G1, then it will now have a total count of 2 for the following: mixed_bag, organic_fd, fuji_bag,red_bag, granny_bag

The query:
WITH
t AS (
SELECT
CASE SUBSTR(X.food_area, LENGTH(X.food_area), 1)
WHEN 's' THEN 'SA'
WHEN 'n' THEN 'NA'
WHEN 'e' THEN 'EU'
WHEN 'a' THEN 'AS'
ELSE NULL
END AS continent,
x.food_loc AS country,
COUNT(DISTINCT CASE Y.organic WHEN 'Y' THEN X.food_item END) OVER (
PARTITION BY x.food_loc
) AS organic_fd,
CASE
WHEN MIN(Z.food_type) OVER (
PARTITION BY x.food_loc, X.food_item
) = Z.food_type AND
MAX(Z.food_type) OVER (
PARTITION BY x.food_loc, X.food_item
) > Z.food_type THEN 1 END AS mixed,
Z.food_type
FROM X
JOIN Y ON X.food_item = Y.food_item
JOIN Z ON Y.food_item = Z.food_item
)
SELECT
continent, country, organic_fd,
COUNT(CASE WHEN food_type = '189' THEN 1 END) AS Granny,
COUNT(CASE WHEN food_type = '190' THEN 1 END) AS Red_delc,
COUNT(CASE WHEN food_type = '191' THEN 1 END) AS Golden,
COUNT(CASE WHEN food_type = '192' THEN 1 END) AS Gala,
COUNT(CASE WHEN food_type = '193' THEN 1 END) AS Fuji,
COUNT(mixed) AS mixed_bag
FROM t
GROUP BY continent, country, organic_fd
You can try this query here: https://rextester.com/TSSH87409.

You have a one-to-many relationship between x and z, so the join can produce several rows for a single row in x, as it does for A1. So first number the rows in x; this is what my subquery t1 does, in addition to mapping the values to 0/1 flags. Then group by that row number and take max() of each counted column (granny, organic, etc.), as in subquery t2. Finally, sum the values per country.
dbfiddle demo
with
t1 as (
select rn, food_item, food_area, food_loc country, food_exp, food_type,
decode(substr(food_area, 4, 1), 's', 'SA', 'n', 'NA', 'e', 'EU', 'a', 'AS') continent,
case organic when 'Y' then 1 else 0 end org,
case when food_type = '189' then 1 else 0 end gra,
case when food_type = '190' then 1 else 0 end red,
case when food_type = '191' then 1 else 0 end gol,
case when food_type = '192' then 1 else 0 end gal,
case when food_type = '193' then 1 else 0 end fuj
from (select rownum rn, x.* from x) x join y using (food_item) join z using (food_item)
where food_exp > sysdate),
t2 as (
select rn, country, continent, max(org) org, max(gra) gra,
max(red) red, max(gol) gol, max(gal) gal, max(fuj) fuj,
case when max(gra) + max(red) + max(gol) + max(gal) + max(fuj) > 1
then 1 else 0
end mix
from t1 group by rn, country, continent)
select continent, country, sum(org) organic_fd, sum(gra) granny_bag, sum(red) red_bag,
sum(gol) golden_bag, sum(gal) gala_bag, sum(fuj) fuji_bag, sum(mix) mixed_bag
from t2
group by continent, country
The query above gave the expected output; please test it and adjust if needed. I noticed you use left joins. If some rows in X may have no matching data in Y or Z, you may have to add nvl() calls to the calculations. You might also move the hardcoded mappings into tables, since hardcoding them is not good practice. Hope this helps :)
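To check the two-step idea (collapse to one row per food item with max(), then sum per country), here is a minimal sketch run against the sample rows. It assumes SQLite through Python's sqlite3 module rather than Oracle, groups by food_item instead of a rownum, and leaves out the food_exp filter so that all six sample items are counted. Those choices are assumptions for illustration, not part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x (food_item TEXT, food_area TEXT, food_loc TEXT);
CREATE TABLE y (food_item TEXT, organic TEXT);
CREATE TABLE z (food_item TEXT, food_type TEXT);
INSERT INTO x VALUES ('A1','lxgs','argentina'), ('B1','iyan','usa'),
                     ('C1','lxgs','peru'),      ('D1','wa8e','norway'),
                     ('E1','894a','laos'),      ('F1','894a','laos');
INSERT INTO y VALUES ('A1','Y'), ('B1','Y'), ('C1','N'),
                     ('D1','N'), ('E1','Y'), ('F1','N');
INSERT INTO z VALUES ('A1','189'), ('A1','190'), ('B1','191'), ('C1','189'),
                     ('D1','192'), ('E1','193'), ('F1','193');
""")

QUERY = """
WITH per_item AS (
    -- Step 1: one row per food item. MAX() collapses the rows produced by
    -- the one-to-many join with z into 0/1 flags.
    SELECT x.food_item,
           x.food_loc AS country,
           CASE substr(x.food_area, 4, 1)
               WHEN 's' THEN 'SA' WHEN 'n' THEN 'NA'
               WHEN 'e' THEN 'EU' WHEN 'a' THEN 'AS'
           END AS continent,
           MAX(CASE WHEN y.organic = 'Y' THEN 1 ELSE 0 END) AS org,
           MAX(CASE WHEN z.food_type = '189' THEN 1 ELSE 0 END) AS gra,
           MAX(CASE WHEN z.food_type = '190' THEN 1 ELSE 0 END) AS red,
           MAX(CASE WHEN z.food_type = '191' THEN 1 ELSE 0 END) AS gol,
           MAX(CASE WHEN z.food_type = '192' THEN 1 ELSE 0 END) AS gal,
           MAX(CASE WHEN z.food_type = '193' THEN 1 ELSE 0 END) AS fuj
    FROM x
    LEFT JOIN y ON y.food_item = x.food_item
    LEFT JOIN z ON z.food_item = x.food_item
    GROUP BY x.food_item, x.food_loc, x.food_area
)
-- Step 2: sum the per-item flags for each country; an item is "mixed"
-- when more than one of its type flags is set.
SELECT continent, country,
       SUM(org), SUM(gra), SUM(red), SUM(gol), SUM(gal), SUM(fuj),
       SUM(CASE WHEN gra + red + gol + gal + fuj > 1 THEN 1 ELSE 0 END)
FROM per_item
GROUP BY continent, country
ORDER BY country
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because the flags are built per item before anything is summed, Laos comes out as a single row with 2 for fuji and 1 for organic, and only A1 counts as a mixed bag.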

Related

SQL - Making 4 new columns in a result from another column

So, I'm making a database for my college class. It's about a foreign languages school, and I need (using a single query) the number of people attending a certain language class, separated by age group. For example, this is how the result table should look:
Language | 14-25 | 25-35 | 35-50 | 50+ |
German | 1 | 0 | 0 | 0 |
Italian | 2 | 1 | 0 | 0 |
English | 5 | 0 | 0 | 0 |
I need to do this by joining the tables "Class" that has attributes (Language, Number of students), and "Student" that has attributes (ID, name, surname, age, prior knowledge ( eg. A1, B2, ... ))
So I somehow have to figure out in which age group a certain individual goes to, then if he goes there, increment the number of students for that age group by one.
You can build the sum and group the entries using CASE WHEN, so your query will look like this:
SELECT c.language,
SUM(CASE WHEN s.age BETWEEN 14 AND 25 THEN 1 ELSE 0 END) AS '14-25',
SUM(CASE WHEN s.age BETWEEN 25 AND 35 THEN 1 ELSE 0 END) AS '25-35',
SUM(CASE WHEN s.age BETWEEN 35 AND 50 THEN 1 ELSE 0 END) AS '35-50',
SUM(CASE WHEN s.age >= 50 THEN 1 ELSE 0 END) AS '50+'
FROM class c
JOIN student_class sc ON c.language = sc.class_language
JOIN student s ON s.id = sc.student_id
GROUP BY c.language;
Take care: a person whose age is exactly 25, for example, will be counted in both groups "14-25" and "25-35". If this is not intended, you could do something like this:
...SUM(CASE WHEN s.age BETWEEN 14 AND 25 THEN 1 ELSE 0 END) AS '14-25',
SUM(CASE WHEN s.age BETWEEN 26 AND 35 THEN 1 ELSE 0 END) AS '25-35',
SUM(CASE WHEN s.age BETWEEN 36 AND 50 THEN 1 ELSE 0 END) AS '35-50',
SUM(CASE WHEN s.age > 50 THEN 1 ELSE 0 END) AS '50+'...
Please see the working example here: db<>fiddle
You could add an ORDER BY c.language at the end if you want.
A last note: The column aliases shown here ('14-25' etc.) will not work on every DB type and might be replaced depending on DB type and personal "taste".
Assuming you have a table called something like ClassStudent which is linking the individual students to the class (which you absolutely need to fulfil this requirement)...
SELECT c.Language,
[14-25] = SUM(IIF(s.age BETWEEN 14 AND 25, 1, 0)),
[25-35] = SUM(IIF(s.age BETWEEN 25 AND 35, 1, 0)),
[35-50] = SUM(IIF(s.age BETWEEN 35 AND 50, 1, 0)),
[50+] = SUM(IIF(s.age >= 50, 1, 0))
FROM Class c
INNER JOIN ClassStudent cs ON c.Language = cs.Language /* you need this table */
INNER JOIN Student s ON cs.StudentID = s.ID
GROUP BY c.Language
Here, IIF is like a ternary operator in SQL form, and the SUM lets you count up where the condition is met.
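To see the boundary issue concretely, here is a small sketch that runs both bucket definitions on three made-up students. It assumes SQLite through Python's sqlite3 module; the sample rows and the column aliases (age_14_25 and so on) are hypothetical, and the table names follow the student / student_class naming used in the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, age INTEGER);
CREATE TABLE student_class (student_id INTEGER, class_language TEXT);
-- Hypothetical data: ages 25, 30 and 60, all attending Italian.
INSERT INTO student VALUES (1, 25), (2, 30), (3, 60);
INSERT INTO student_class VALUES (1, 'Italian'), (2, 'Italian'), (3, 'Italian');
""")

# Buckets share their end points, so age 25 matches two of them.
OVERLAPPING = """
SELECT sc.class_language,
       SUM(CASE WHEN s.age BETWEEN 14 AND 25 THEN 1 ELSE 0 END) AS age_14_25,
       SUM(CASE WHEN s.age BETWEEN 25 AND 35 THEN 1 ELSE 0 END) AS age_25_35,
       SUM(CASE WHEN s.age BETWEEN 35 AND 50 THEN 1 ELSE 0 END) AS age_35_50,
       SUM(CASE WHEN s.age >= 50 THEN 1 ELSE 0 END) AS age_50_plus
FROM student_class sc
JOIN student s ON s.id = sc.student_id
GROUP BY sc.class_language
"""

# Buckets do not share end points, so every age matches exactly one.
DISJOINT = """
SELECT sc.class_language,
       SUM(CASE WHEN s.age BETWEEN 14 AND 25 THEN 1 ELSE 0 END) AS age_14_25,
       SUM(CASE WHEN s.age BETWEEN 26 AND 35 THEN 1 ELSE 0 END) AS age_26_35,
       SUM(CASE WHEN s.age BETWEEN 36 AND 50 THEN 1 ELSE 0 END) AS age_36_50,
       SUM(CASE WHEN s.age > 50 THEN 1 ELSE 0 END) AS age_over_50
FROM student_class sc
JOIN student s ON s.id = sc.student_id
GROUP BY sc.class_language
"""

overlapping = conn.execute(OVERLAPPING).fetchone()
disjoint = conn.execute(DISJOINT).fetchone()
print(overlapping)  # bucket counts add up to 4 for 3 students
print(disjoint)     # bucket counts add up to 3
```

With overlapping ranges the bucket counts add up to more than the number of students, which is usually the first sign that a boundary value is being counted twice.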

Group by one column and return several columns on multiple conditions - T-SQL

I have two tables which I can generate with SELECT statements (joining multiple tables) as follows:
Table 1:
ID | Site | type | time
1 | Dallas | 2 | 01-01-2021
2 | Denver | 1 | 02-01-2021
3 | Chicago | 1 | 03-01-2021
4 | Chicago | 2 | 29-11-2020
5 | Denver | 1 | 28-02-2020
6 | Toronto | 2 | 11-05-2019
Table 2:
ID | Site | collected | deposited
1 | Denver | NULL | 29-01-2021
2 | Denver | 01-04-2021 | 29-01-2021
3 | Chicago | NULL | 19-01-2020
4 | Dallas | NULL | 29-01-2019
5 | Winnipeg | 13-02-2021 | 17-01-2021
6 | Toronto | 14-02-2020 | 29-01-2020
I would like the result to be grouped by Site, with one column each for the COUNT of type=1, type=2, deposited and collected, all four restricted to a selected time interval. Example (interval between 01-06-2020 and 01-06-2021):
Site | type1 | type2 | deposited | collected
Dallas | 0 | 1 | 0 | 0
Denver | 1 | 0 | 2 | 1
Chicago | 1 | 1 | 0 | 0
Toronto | 0 | 0 | 0 | 0
Winnipeg | 0 | 0 | 1 | 1
How about union all and aggregation?
select site,
sum(case when type = 1 then 1 else 0 end) as type_1,
sum(case when type = 2 then 1 else 0 end) as type_2,
sum(deposited) as deposited, sum(collected) as collected
from ((select site, type, 0 as deposited, 0 as collected
from table1
) union all
(select site, null,
(case when deposited is not null then 1 else 0 end),
(case when collected is not null then 1 else 0 end)
from table2
)
) t12
group by site;
Combine your tables 1 and 2 with a join on Site
Use COUNT(CASE WHEN type = 1 then 1 END) as type1 and a similar construct for type 2
Use COUNT(CASE WHEN somedate BETWEEN '2020-06-01' and '2021-06-01' then 1 END) as ... for your dates
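One way to combine the two suggestions above is to stack the tables with UNION ALL and put the date interval inside each COUNT(CASE ...). The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, stores the sample dates as ISO strings (reading the originals as dd-mm-yyyy), and renames the time column to event_time. The table names table1 and table2 are taken from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, site TEXT, type INTEGER, event_time TEXT);
CREATE TABLE table2 (id INTEGER, site TEXT, collected TEXT, deposited TEXT);
INSERT INTO table1 VALUES
    (1, 'Dallas',  2, '2021-01-01'), (2, 'Denver',  1, '2021-01-02'),
    (3, 'Chicago', 1, '2021-01-03'), (4, 'Chicago', 2, '2020-11-29'),
    (5, 'Denver',  1, '2020-02-28'), (6, 'Toronto', 2, '2019-05-11');
INSERT INTO table2 VALUES
    (1, 'Denver',   NULL,         '2021-01-29'),
    (2, 'Denver',   '2021-04-01', '2021-01-29'),
    (3, 'Chicago',  NULL,         '2020-01-19'),
    (4, 'Dallas',   NULL,         '2019-01-29'),
    (5, 'Winnipeg', '2021-02-13', '2021-01-17'),
    (6, 'Toronto',  '2020-02-14', '2020-01-29');
""")

# The interval test lives inside each CASE rather than in a WHERE clause,
# so sites with nothing in range (Toronto) still appear with zeros.
QUERY = """
SELECT site,
       COUNT(CASE WHEN type = 1 AND event_time BETWEEN :lo AND :hi THEN 1 END) AS type1,
       COUNT(CASE WHEN type = 2 AND event_time BETWEEN :lo AND :hi THEN 1 END) AS type2,
       COUNT(CASE WHEN deposited BETWEEN :lo AND :hi THEN 1 END) AS deposited,
       COUNT(CASE WHEN collected BETWEEN :lo AND :hi THEN 1 END) AS collected
FROM (
    SELECT site, type, event_time, NULL AS deposited, NULL AS collected
    FROM table1
    UNION ALL
    SELECT site, NULL, NULL, deposited, collected
    FROM table2
) t12
GROUP BY site
ORDER BY site
"""

rows = conn.execute(QUERY, {"lo": "2020-06-01", "hi": "2021-06-01"}).fetchall()
for row in rows:
    print(row)
```

Stacking the tables instead of joining them on Site avoids multiplying rows when a site has several entries in both tables, which would inflate the counts.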

Display output is columns based on filter criteria

I am trying to display data in columns/subcolumns based on certain filter criteria using a CASE WHEN statement, but I am not getting the required output.
data:
ID ID2 Country Type
1 001 US A
1 009 US A
2 002 AU B
3 003 CA A
3 005 CA A
4 007 US B
5 001 FR B
6 003 US B
7 002 US A
8 004 NZ A
based on my current case statement, here is how my output looks:
Type Country Count
B Other 2
B US 1
B Subtotal 3
A Other 4
A US 3
A Subtotal 7
Total 10
I want to display the following format, bonus if I can get the subtotal/totals:
Type-A Type-B
US Other US Other
3 4 1 2
I also need Subtotals, and Grandtotals, but these need to be calculated separately.
SubTotal: 7 SubTotal: 3
Grand Total: 10
You can do it like this:
select
sum(case when Type = 'A' and Country = 'US' then 1 else 0 end) as US_TYPE_A,
sum(case when Type = 'A' and Country != 'US' then 1 else 0 end) as Other_TYPE_A,
sum(case when Type = 'B' and Country = 'US' then 1 else 0 end) as US_TYPE_B,
sum(case when Type = 'B' and Country != 'US' then 1 else 0 end) as Other_TYPE_B
from myTable
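The subtotals and the grand total can be produced by the same kind of conditional sum. Here is a minimal sketch, assuming SQLite through Python's sqlite3 module and the ten sample rows exactly as listed in the question; the subtotal and grand_total aliases are my own naming. Note that the rows as listed give 3/3 for Type A and 2/2 for Type B, which is not the same split as the counts shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (ID INTEGER, ID2 TEXT, Country TEXT, Type TEXT);
INSERT INTO myTable VALUES
    (1, '001', 'US', 'A'), (1, '009', 'US', 'A'), (2, '002', 'AU', 'B'),
    (3, '003', 'CA', 'A'), (3, '005', 'CA', 'A'), (4, '007', 'US', 'B'),
    (5, '001', 'FR', 'B'), (6, '003', 'US', 'B'), (7, '002', 'US', 'A'),
    (8, '004', 'NZ', 'A');
""")

QUERY = """
SELECT
    SUM(CASE WHEN Type = 'A' AND Country =  'US' THEN 1 ELSE 0 END) AS US_TYPE_A,
    SUM(CASE WHEN Type = 'A' AND Country <> 'US' THEN 1 ELSE 0 END) AS Other_TYPE_A,
    SUM(CASE WHEN Type = 'A' THEN 1 ELSE 0 END)                     AS subtotal_A,
    SUM(CASE WHEN Type = 'B' AND Country =  'US' THEN 1 ELSE 0 END) AS US_TYPE_B,
    SUM(CASE WHEN Type = 'B' AND Country <> 'US' THEN 1 ELSE 0 END) AS Other_TYPE_B,
    SUM(CASE WHEN Type = 'B' THEN 1 ELSE 0 END)                     AS subtotal_B,
    COUNT(*)                                                        AS grand_total
FROM myTable
"""

row = conn.execute(QUERY).fetchone()
print(row)
```

Each subtotal simply drops the Country condition, and the grand total drops the condition altogether, so everything comes back in one row.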

Find count group by id in SQL Server

I need some help to solve this query. I have a table which contains the ages of the passengers who are going to stay in a room which is mentioned below:
Age RoomId
----- ---
1 1
12 1
8 1
19 1
3 2
12 2
18 2
21 3
Also, I have properties table which contains the maximum age of the child and maximum age of the infant. Based on the age of the passenger, I need to segregate them to adult, child, and infant to each of the properties.
Properties table structure
Property Id Maximum_child_age Maximum_infant_age
-------------------------------------------------
1 11 2
Desired output
RoomId Adult Child Infant PropertyId
--------------------------------
1 2 1 1 1
2 2 1 0 1
3 1 0 0 1
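Use conditional aggregation:
SELECT
pas.RoomId,
SUM(CASE WHEN pas.age > ppt.Maximum_child_age THEN 1 ELSE 0 END) AS Adult,
-- a passenger at exactly Maximum_infant_age is counted as an infant, not a child
SUM(CASE WHEN pas.age > ppt.Maximum_infant_age AND pas.age <= ppt.Maximum_child_age THEN 1 ELSE 0 END) AS Child,
SUM(CASE WHEN pas.age <= ppt.Maximum_infant_age THEN 1 ELSE 0 END) AS Infant,
ppt.id
FROM
passengers pas
CROSS JOIN properties ppt
-- group by room as well, since the desired output has one row per room
GROUP BY pas.RoomId, ppt.id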
Cross join the properties and then do conditional aggregation.
SELECT pa.roomid,
count(CASE
WHEN pa.ages > pr.maximum_child_age THEN
1
END) adult,
count(CASE
WHEN pa.ages > pr.maximum_infant_age
AND pa.ages <= pr.maximum_child_age THEN
1
END) child,
count(CASE
WHEN pa.ages <= pr.maximum_infant_age THEN
1
END) infant,
pr.propertyid
FROM passengers pa
CROSS JOIN properties pr
-- one row per room and property, as in the desired output
GROUP BY pa.roomid, pr.propertyid;
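Here is a runnable sketch of the cross join plus conditional aggregation on the sample data, grouped by room as well as property so that it produces one row per room like the desired output. It assumes SQLite through Python's sqlite3 module rather than SQL Server, and the column names (age, roomid, propertyid) are my reading of the tables in the question. It also assumes that a passenger at exactly the maximum infant or child age still belongs to that group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE passengers (age INTEGER, roomid INTEGER);
CREATE TABLE properties (propertyid INTEGER,
                         maximum_child_age INTEGER,
                         maximum_infant_age INTEGER);
INSERT INTO passengers VALUES (1, 1), (12, 1), (8, 1), (19, 1),
                              (3, 2), (12, 2), (18, 2),
                              (21, 3);
INSERT INTO properties VALUES (1, 11, 2);
""")

QUERY = """
SELECT pa.roomid,
       -- older than the child limit
       COUNT(CASE WHEN pa.age > pr.maximum_child_age THEN 1 END) AS adult,
       -- above the infant limit, up to and including the child limit
       COUNT(CASE WHEN pa.age > pr.maximum_infant_age
                   AND pa.age <= pr.maximum_child_age THEN 1 END) AS child,
       -- up to and including the infant limit
       COUNT(CASE WHEN pa.age <= pr.maximum_infant_age THEN 1 END) AS infant,
       pr.propertyid
FROM passengers pa
CROSS JOIN properties pr
GROUP BY pa.roomid, pr.propertyid
ORDER BY pa.roomid, pr.propertyid
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The three conditions are mutually exclusive and cover every age, so adult + child + infant always equals the number of passengers in the room.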

Grouping sub query in one row

ClientID Amount flag
MMC 600 1
MMC 700 1
FDN 800 1
FDN 350 2
FDN 700 1
Using SQL Server, with the query below I am getting 2 rows for FDN. I would just like to combine each client's values into one row.
Output should be like
Client | gtcount | totalAmountGreaterThan500 | lscount | AmountLessThan500
MMC | 2 | 1300 | 0 | 0
FDN | 2 | 1500 | 1 | 350
SELECT
f.ClientID,f.flag,
case when flag = 1 then count(*) END as gtcount,
SUM(CASE WHEN flag = 1 THEN Amount END) AS totalAmountGreaterThan500,
case when flag = 2 then count(*) END as lscount,
SUM(CASE WHEN Flag = 2 THEN Amount END) AS AmountLessThan500
from
( select ClientID, Amount,flag from #myTable)f
group by ClientID,f.flag
Try
SELECT ClientID,
SUM(CASE WHEN flag = 1 THEN 1 ELSE 0 END) AS gtcount,
SUM(CASE WHEN flag = 1 THEN Amount ELSE 0 END) AS totalAmountGreaterThan500,
SUM(CASE WHEN flag = 2 THEN 1 ELSE 0 END) AS lscount,
SUM(CASE WHEN Flag = 2 THEN Amount ELSE 0 END) AS AmountLessThan500
FROM Table1
GROUP BY ClientID
Output:
| CLIENTID | GTCOUNT | TOTALAMOUNTGREATERTHAN500 | LSCOUNT | AMOUNTLESSTHAN500 |
|----------|---------|---------------------------|---------|-------------------|
| FDN | 2 | 1500 | 1 | 350 |
| MMC | 2 | 1300 | 0 | 0 |
Here is SQLFiddle demo
Looks like your desired output is off -- there aren't any mmc records less than 500. You can accomplish this using sum with case for each of your fields, removing flag from the group by:
SELECT
ClientID,
SUM(CASE WHEN flag = 1 THEN 1 END) as gtcount,
SUM(CASE WHEN flag = 1 THEN Amount END) AS totalAmountGreaterThan500,
SUM(CASE WHEN flag = 2 THEN 1 END) as ltcount,
SUM(CASE WHEN Flag = 2 THEN Amount END) AS AmountLessThan500
from myTable
group by ClientID
SQL Fiddle Demo
On a different note, not sure why you need the Flag field. If it's just being used to denote less than records, just add the logic to the query:
SUM(CASE WHEN Amount <= 500 Then ...)
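One practical difference between the two answers is the missing ELSE: SUM over a CASE with no ELSE branch returns NULL when no row matches, whereas ELSE 0 returns 0. Here is a small sketch of that on the sample rows, assuming SQLite through Python's sqlite3 module (where NULL comes back as None) and a hypothetical table name my_table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (ClientID TEXT, Amount INTEGER, flag INTEGER);
INSERT INTO my_table VALUES ('MMC', 600, 1), ('MMC', 700, 1),
                            ('FDN', 800, 1), ('FDN', 350, 2), ('FDN', 700, 1);
""")

# ELSE 0: a client with no flag = 2 rows gets 0 in both flag-2 columns.
WITH_ELSE = """
SELECT ClientID,
       SUM(CASE WHEN flag = 1 THEN 1 ELSE 0 END)      AS gtcount,
       SUM(CASE WHEN flag = 1 THEN Amount ELSE 0 END) AS totalAmountGreaterThan500,
       SUM(CASE WHEN flag = 2 THEN 1 ELSE 0 END)      AS lscount,
       SUM(CASE WHEN flag = 2 THEN Amount ELSE 0 END) AS AmountLessThan500
FROM my_table
GROUP BY ClientID
ORDER BY ClientID
"""

# No ELSE: the CASE yields NULL for non-matching rows, and SUM over
# nothing but NULLs is NULL.
WITHOUT_ELSE = """
SELECT ClientID,
       SUM(CASE WHEN flag = 1 THEN 1 END)      AS gtcount,
       SUM(CASE WHEN flag = 1 THEN Amount END) AS totalAmountGreaterThan500,
       SUM(CASE WHEN flag = 2 THEN 1 END)      AS lscount,
       SUM(CASE WHEN flag = 2 THEN Amount END) AS AmountLessThan500
FROM my_table
GROUP BY ClientID
ORDER BY ClientID
"""

with_else = conn.execute(WITH_ELSE).fetchall()
without_else = conn.execute(WITHOUT_ELSE).fetchall()
print(with_else)
print(without_else)
```

If zeros are wanted in the output, as in the desired result for MMC, either keep the ELSE 0 or wrap the sums in COALESCE(..., 0).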