AWS Athena create a binary matrix - sql

Is it possible to create a binary matrix in AWS Athena from a table? For example, we have the following table:
name   product
John   Bike
John   Shirt
John   Ball
Blake  Shirt
Mike   Ball
Mike   Hat
To be converted to the following:
name   Bike  Shirt  Ball  Hat
John   1     1      1     0
Blake  0     1      0     0
Mike   0     0      1     1

I suggest using a CASE expression, with one conditional aggregate per product and the rows grouped by name:
select name,
sum(case when product = 'Bike' then 1 else 0 end) as "Bike",
sum(case when product = 'Shirt' then 1 else 0 end) as "Shirt",
sum(case when product = 'Ball' then 1 else 0 end) as "Ball",
sum(case when product = 'Hat' then 1 else 0 end) as "Hat"
from tableName
group by name
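If the same name and product pair can appear more than once, a sum returns a count rather than a 0/1 flag. The following is a minimal sketch, not the answer's exact query: it swaps in MAX so each cell stays binary, and it inlines the question's rows with a VALUES list (the alias purchases is hypothetical) so it can run without an existing table, assuming Athena's Trino-style SQL.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
-- "purchases" is a hypothetical alias, not a real table.
SELECT name,
       MAX(CASE WHEN product = 'Bike'  THEN 1 ELSE 0 END) AS "Bike",
       MAX(CASE WHEN product = 'Shirt' THEN 1 ELSE 0 END) AS "Shirt",
       MAX(CASE WHEN product = 'Ball'  THEN 1 ELSE 0 END) AS "Ball",
       MAX(CASE WHEN product = 'Hat'   THEN 1 ELSE 0 END) AS "Hat"
FROM (
    VALUES
        ('John', 'Bike'), ('John', 'Shirt'), ('John', 'Ball'),
        ('Blake', 'Shirt'),
        ('Mike', 'Ball'), ('Mike', 'Hat')
) AS purchases (name, product)
GROUP BY name;
```

The product list is still hard-coded, so a new product needs a new column in the query.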

Related

SQL: SUM OR COUNT with CASE WHEN condition in multiple criteria

Course name  Section number  Course type
MATH 101     1               In person
MATH 101     2               In person
MATH 101     3               Online
MATH 101     4               In person
SOC 101      1               In person
SOC 101      2               In person
SOC 101      3               In person
ENGL 201     1               In person
ENGL 201     2               Online
ENGL 201     3               Online
ENGL 201     4               In person
PHY 101      1               Online
PHY 101      2               Online
From this table, I'd like to count the courses that have only 'In person' sections, the courses that have only 'Online' sections, and the courses that have both types.
The query I tried is below.
SELECT
SUM(CASE WHEN coursetype = 'Inperson' AND coursetype = 'Online' THEN 1 ELSE 0 END) AS bothtype,
SUM(CASE WHEN coursetype = 'Online' THEN 1 ELSE 0 END) AS Onlineonly,
SUM(CASE WHEN coursetype = 'Inperson' THEN 1 ELSE 0 END) AS Inpersononly
From Course
The result I expected is
bothtype  Onlineonly  Inpersononly
2         1           1
but I got
bothtype  Onlineonly  Inpersononly
0         7           6
Please advise me to get through this.
Thank you.
My solution uses double conditional aggregation: the inner query counts the sections of each type per course, and the outer query classifies each course from those two counts.
SELECT SUM(CASE WHEN In_Person > 0 AND Online > 0 THEN 1 ELSE 0 END) as bothtype,
SUM(CASE WHEN In_Person > 0 AND Online = 0 THEN 1 ELSE 0 END) as inpersononly,
SUM(CASE WHEN In_Person = 0 AND Online > 0 THEN 1 ELSE 0 END) as onlineonly
FROM (
SELECT Course_name,
SUM(CASE WHEN Course_type='In person' THEN 1 ELSE 0 END) as In_Person,
SUM(CASE WHEN Course_type='Online' THEN 1 ELSE 0 END) as Online
FROM Course
GROUP BY Course_name
) tot
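To see why the two-level aggregation works, it can help to run the inner step on its own. This is a sketch under assumptions: Course_demo is a hypothetical scratch table loaded with the question's rows, and the column names follow the answer's Course_name and Course_type spelling.

```sql
-- Scratch copy of the question's data; Course_demo is a hypothetical table name.
CREATE TABLE Course_demo (
    Course_name    VARCHAR(20),
    Section_number INT,
    Course_type    VARCHAR(20)
);

INSERT INTO Course_demo (Course_name, Section_number, Course_type) VALUES
    ('MATH 101', 1, 'In person'), ('MATH 101', 2, 'In person'),
    ('MATH 101', 3, 'Online'),    ('MATH 101', 4, 'In person'),
    ('SOC 101', 1, 'In person'),  ('SOC 101', 2, 'In person'),
    ('SOC 101', 3, 'In person'),
    ('ENGL 201', 1, 'In person'), ('ENGL 201', 2, 'Online'),
    ('ENGL 201', 3, 'Online'),    ('ENGL 201', 4, 'In person'),
    ('PHY 101', 1, 'Online'),     ('PHY 101', 2, 'Online');

-- Inner step only: count the sections of each type per course.
SELECT Course_name,
       SUM(CASE WHEN Course_type = 'In person' THEN 1 ELSE 0 END) AS in_person_sections,
       SUM(CASE WHEN Course_type = 'Online'    THEN 1 ELSE 0 END) AS online_sections
FROM Course_demo
GROUP BY Course_name;
-- Per course (row order not guaranteed):
--   MATH 101: 3 and 1, SOC 101: 3 and 0, ENGL 201: 2 and 2, PHY 101: 0 and 2.
-- So two courses have both types, one is in person only, and one is online only.
```

The outer query then only has to test whether each of the two counts is zero or not.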
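SUGGESTION (using a MySQL stored procedure):
-- In the mysql client, change the statement delimiter (for example DELIMITER //) before this definition.
CREATE PROCEDURE countCourses(OUT bothtype INT, OUT Inpersononly INT, OUT Onlineonly INT)
BEGIN
  -- Count sections of each type per course first, then count the courses in each class.
  -- courseName is an assumed column name for the course identifier.
  SELECT COALESCE(SUM(n_in_person > 0 AND n_online > 0), 0),
         COALESCE(SUM(n_in_person > 0 AND n_online = 0), 0),
         COALESCE(SUM(n_in_person = 0 AND n_online > 0), 0)
    INTO bothtype, Inpersononly, Onlineonly
    FROM (SELECT courseName,
                 SUM(courseType = 'In person') AS n_in_person,
                 SUM(courseType = 'Online') AS n_online
            FROM Course
           GROUP BY courseName) AS per_course;
END;
CALL countCourses(@bothtype, @Inpersononly, @Onlineonly);
SELECT @bothtype, @Inpersononly, @Onlineonly;
EXPLANATION:
Create a procedure that stores the number of courses of each kind in an OUT variable.
Call the procedure with session variables as the OUT arguments.
Select the session variables to read the results.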

Group by one column and return several columns on multiple conditions - T-SQL

I have two tables which I can generate with SELECT statements (joining multiple tables) as follows:
Table 1:
ID  Site     type  time
1   Dallas   2     01-01-2021
2   Denver   1     02-01-2021
3   Chicago  1     03-01-2021
4   Chicago  2     29-11-2020
5   Denver   1     28-02-2020
6   Toronto  2     11-05-2019
Table 2:
ID  Site      collected   deposited
1   Denver    NULL        29-01-2021
2   Denver    01-04-2021  29-01-2021
3   Chicago   NULL        19-01-2020
4   Dallas    NULL        29-01-2019
5   Winnipeg  13-02-2021  17-01-2021
6   Toronto   14-02-2020  29-01-2020
I would like the result grouped by Site, with one column each for the COUNT of type = 1, type = 2, deposited, and collected, where all four columns count only dates within a selected time interval. Example (interval between 01-06-2020 and 01-06-2021):
Site      type1  type2  deposited  collected
Dallas    0      1      0          0
Denver    1      0      2          1
Chicago   1      1      0          0
Toronto   0      0      0          0
Winnipeg  0      0      1          1
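How about union all and aggregation? The date tests go inside the CASE expressions rather than in a WHERE clause, so that sites with no rows in the interval still appear with zeros:
select site,
       sum(type_1) as type_1, sum(type_2) as type_2,
       sum(deposited) as deposited, sum(collected) as collected
from ((select site,
              (case when type = 1 and time between '2020-06-01' and '2021-06-01' then 1 else 0 end) as type_1,
              (case when type = 2 and time between '2020-06-01' and '2021-06-01' then 1 else 0 end) as type_2,
              0 as deposited, 0 as collected
       from table1
      ) union all
      (select site, 0, 0,
              (case when deposited between '2020-06-01' and '2021-06-01' then 1 else 0 end),
              (case when collected between '2020-06-01' and '2021-06-01' then 1 else 0 end)
       from table2
      )
     ) t12
group by site;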
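Combine tables 1 and 2 by Site. Aggregate each table per Site before joining, because a plain join on Site pairs every table 1 row with every table 2 row for that site and inflates the counts.
Use COUNT(CASE WHEN type = 1 THEN 1 END) AS type1 and a similar construct for type 2.
Use COUNT(CASE WHEN somedate BETWEEN '2020-06-01' AND '2021-06-01' THEN 1 END) AS ... for your dates.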
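The steps above can be sketched in T-SQL as follows. This is an illustration under assumptions, not the answerer's own query: #t1_demo and #t2_demo are hypothetical temp tables holding the question's rows, the date columns are assumed to be of type DATE, and each table is aggregated per Site before a FULL OUTER JOIN so that sites present in only one table are kept.

```sql
-- Scratch temp tables with the question's rows; the #..._demo names are hypothetical.
CREATE TABLE #t1_demo (ID INT, Site VARCHAR(20), type INT, [time] DATE);
CREATE TABLE #t2_demo (ID INT, Site VARCHAR(20), collected DATE, deposited DATE);

INSERT INTO #t1_demo (ID, Site, type, [time]) VALUES
    (1, 'Dallas',  2, '2021-01-01'), (2, 'Denver',  1, '2021-01-02'),
    (3, 'Chicago', 1, '2021-01-03'), (4, 'Chicago', 2, '2020-11-29'),
    (5, 'Denver',  1, '2020-02-28'), (6, 'Toronto', 2, '2019-05-11');

INSERT INTO #t2_demo (ID, Site, collected, deposited) VALUES
    (1, 'Denver',   NULL,         '2021-01-29'),
    (2, 'Denver',   '2021-04-01', '2021-01-29'),
    (3, 'Chicago',  NULL,         '2020-01-19'),
    (4, 'Dallas',   NULL,         '2019-01-29'),
    (5, 'Winnipeg', '2021-02-13', '2021-01-17'),
    (6, 'Toronto',  '2020-02-14', '2020-01-29');

DECLARE @from DATE = '2020-06-01', @to DATE = '2021-06-01';

-- Aggregate each table per Site first, then join the two one-row-per-site results.
SELECT COALESCE(a.Site, b.Site)  AS Site,
       COALESCE(a.type1, 0)      AS type1,
       COALESCE(a.type2, 0)      AS type2,
       COALESCE(b.deposited, 0)  AS deposited,
       COALESCE(b.collected, 0)  AS collected
FROM (SELECT Site,
             COUNT(CASE WHEN type = 1 AND [time] BETWEEN @from AND @to THEN 1 END) AS type1,
             COUNT(CASE WHEN type = 2 AND [time] BETWEEN @from AND @to THEN 1 END) AS type2
      FROM #t1_demo
      GROUP BY Site) AS a
FULL OUTER JOIN
     (SELECT Site,
             COUNT(CASE WHEN deposited BETWEEN @from AND @to THEN 1 END) AS deposited,
             COUNT(CASE WHEN collected BETWEEN @from AND @to THEN 1 END) AS collected
      FROM #t2_demo
      GROUP BY Site) AS b
  ON a.Site = b.Site;
```

Because the date test sits inside COUNT(CASE ...) instead of a WHERE clause, a site such as Toronto whose rows all fall outside the interval still appears, with zeros in every column.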

Finding fields to update based on combinations

I need my results to show which records need updates. I have a temp table I created that looks like the one below. The rule is that each ID cannot have more than one record with MASTER = 1. That record must have FULLTIME = 1, and all other records for the ID must have FULLTIME = 0 and PARTTIME = 1. This is quite difficult because the check spans multiple records per ID.
I've tried combinations using maxes, count distinct, subqueries, etc. No luck getting it done. I've even tried to do some manipulation in Excel but it's totally confusing to me.
select distinct
x.ID,
COUNT(x.ID) AS ID_Count
from #FT0PT1M1Version2 as x
join (
select
ID, NAME, MASTER, FULLTIME, PARTTIME
from #FT0PT1M1Version2
WHERE E = 'P'
GROUP BY
ID, NAME, MASTER, FULLTIME, PARTTIME
HAVING COUNT(ID) = '1'
) as y
on x.ID = y.ID
WHERE
x.PARTTIME = '1' and
x.MASTER = '1'
group by x.ID
HAVING COUNT(x.ID) = '1'
order by 1
Temp Table
ID  NAME            MASTER  FULLTIME  PARTTIME
1   JAMES JONES     0       1         0
1   JAMES JONES     1       0         1
1   JAMES JONES     0       0         1
2   MICHEAL JORDAN  1       1         0
2   MICHEAL JORDAN  0       0         1
2   MICHEAL JORDAN  0       0         1
3   JOHN DOE        1       1         0
3   JOHN DOE        0       0         1
Expected Results
ID  NAME            MASTER  FULLTIME  PARTTIME  UPDATE
1   JAMES JONES     0       1         0         Y
1   JAMES JONES     1       0         1         Y
1   JAMES JONES     0       0         1         N
2   MICHEAL JORDAN  1       1         0         N
2   MICHEAL JORDAN  0       0         1         N
2   MICHEAL JORDAN  0       0         1         N
3   JOHN DOE        1       1         0         N
3   JOHN DOE        1       0         1         Y
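You could try the query below, although I would prefer a check constraint on the table that only allows the valid MASTER, FULLTIME, and PARTTIME combinations in the first place.
-- Row-level check only: it flags invalid combinations on a single record,
-- but it does not detect an ID that has more than one MASTER = 1 record.
SELECT ID, NAME, MASTER, FULLTIME, PARTTIME,
CASE
WHEN (MASTER = 1 AND FULLTIME = 1 AND PARTTIME = 0)
OR (MASTER = 0 AND FULLTIME = 0 AND PARTTIME = 1)
THEN 'N'
ELSE 'Y'
END AS "Update"
FROM #FT0PT1M1Version2;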
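The rule in the question also limits each ID to a single MASTER = 1 record, which a row-by-row CASE cannot see. One possible way to check that part, sketched here in T-SQL as an assumption rather than as the answerer's method, is a windowed SUM over each ID. The temp table #staff_demo and the column names masters_for_id and master_count_problem are hypothetical.

```sql
-- Scratch temp table with the question's rows; #staff_demo is a hypothetical name.
CREATE TABLE #staff_demo (ID INT, NAME VARCHAR(50), MASTER INT, FULLTIME INT, PARTTIME INT);

INSERT INTO #staff_demo (ID, NAME, MASTER, FULLTIME, PARTTIME) VALUES
    (1, 'JAMES JONES',    0, 1, 0),
    (1, 'JAMES JONES',    1, 0, 1),
    (1, 'JAMES JONES',    0, 0, 1),
    (2, 'MICHEAL JORDAN', 1, 1, 0),
    (2, 'MICHEAL JORDAN', 0, 0, 1),
    (2, 'MICHEAL JORDAN', 0, 0, 1),
    (3, 'JOHN DOE',       1, 1, 0),
    (3, 'JOHN DOE',       0, 0, 1);

-- The window SUM repeats the per-ID number of MASTER = 1 records on every row,
-- so each record can be flagged without collapsing the rows with GROUP BY.
-- The CAST keeps the SUM valid if MASTER is stored as a bit column.
SELECT ID, NAME, MASTER, FULLTIME, PARTTIME,
       SUM(CAST(MASTER AS INT)) OVER (PARTITION BY ID) AS masters_for_id,
       CASE WHEN SUM(CAST(MASTER AS INT)) OVER (PARTITION BY ID) <> 1
            THEN 'Y' ELSE 'N'
       END AS master_count_problem
FROM #staff_demo;
```

In this sample every ID has exactly one MASTER = 1 record, so master_count_problem is 'N' on every row. The flag would turn to 'Y' for all records of an ID with zero or several masters.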

Select sport results ordering by medals

I have a table:
sport  country  place
ski    swe      1
ski    nor      2
ski    rus      3
luge   swe      1
luge   usa      2
luge   ger      3
bob    nor      1
bob    rus      2
bob    ger      3
where place is 1 for gold, 2 for silver, and 3 for bronze.
The usual way to display this is a list of countries ordered by most golds first, then silvers, then bronzes. For this example it would be:
swe g:2 s:0 b:0 sum:2
rus g:0 s:1 b:1 sum:2
usa g:0 s:1 b:0 sum:1
nor g:0 s:0 b:2 sum:2
What would be the SQL query to get the list of countries ordered that way?
regards
select
country,
sum(case when place = 1 then 1 else 0 end) as gold,
sum(case when place = 2 then 1 else 0 end) as silver,
sum(case when place = 3 then 1 else 0 end) as bronze,
count(*) as allmedals
from tab
group by country
For ordering the result you might do
order by sum(4 - place) desc -- weighted medals
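The weighted sum gives 3 points per gold, 2 per silver, and 1 per bronze, which is not the same as ranking by golds first. A small sketch with made-up data shows the difference: medals_demo and the countries 'aaa' and 'bbb' are hypothetical and are not taken from the question.

```sql
-- Hypothetical data: 'aaa' has one gold, 'bbb' has two silvers.
CREATE TABLE medals_demo (sport VARCHAR(10), country VARCHAR(3), place INT);

INSERT INTO medals_demo (sport, country, place) VALUES
    ('ski',  'aaa', 1),
    ('luge', 'bbb', 2),
    ('bob',  'bbb', 2);

-- Weighted ordering: aaa scores 3 (one gold), bbb scores 2 + 2 = 4,
-- so bbb is listed before aaa.
SELECT country,
       SUM(4 - place) AS weighted
FROM medals_demo
GROUP BY country
ORDER BY weighted DESC;

-- Gold-first ordering: aaa has 1 gold and bbb has none,
-- so aaa is listed before bbb.
SELECT country,
       SUM(CASE WHEN place = 1 THEN 1 ELSE 0 END) AS gold,
       SUM(CASE WHEN place = 2 THEN 1 ELSE 0 END) AS silver,
       SUM(CASE WHEN place = 3 THEN 1 ELSE 0 END) AS bronze
FROM medals_demo
GROUP BY country
ORDER BY gold DESC, silver DESC, bronze DESC;
```

Which of the two orderings is right depends on whether two silvers should outrank one gold.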
Ordering by multiple columns is the key. Here is the query, thanks to user dnoeth:
select
country,
sum(case when place = 1 then 1 else 0 end) as gold,
sum(case when place = 2 then 1 else 0 end) as silver,
sum(case when place = 3 then 1 else 0 end) as bronze,
count(*) as allmedals
from tab
group by country
order by gold desc, silver desc, bronze desc

Grouping sub query in one row

ClientID  Amount  flag
MMC       600     1
MMC       700     1
FDN       800     1
FDN       350     2
FDN       700     1
Using SQL Server, the query below returns 2 rows for FDN. I would just like to combine each client's values into one row.
Output should be like
Client  gtcount  totalAmountGreaterThan500  lscount  AmountLessThan500
MMC     2        1300                       0        0
FDN     2        1500                       1        350
SELECT
f.ClientID,f.flag,
case when flag = 1 then count(*) END as gtcount,
SUM(CASE WHEN flag = 1 THEN Amount END) AS totalAmountGreaterThan500,
case when flag = 2 then count(*) END as lscount,
SUM(CASE WHEN Flag = 2 THEN Amount END) AS AmountLessThan500,
from
( select ClientID, Amount,flag from #myTable)f
group by ClientID,f.flag
Try
SELECT ClientID,
SUM(CASE WHEN flag = 1 THEN 1 ELSE 0 END) AS gtcount,
SUM(CASE WHEN flag = 1 THEN Amount ELSE 0 END) AS totalAmountGreaterThan500,
SUM(CASE WHEN flag = 2 THEN 1 ELSE 0 END) AS lscount,
SUM(CASE WHEN Flag = 2 THEN Amount ELSE 0 END) AS AmountLessThan500
FROM Table1
GROUP BY ClientID
Output:
| CLIENTID | GTCOUNT | TOTALAMOUNTGREATERTHAN500 | LSCOUNT | AMOUNTLESSTHAN500 |
|----------|---------|---------------------------|---------|-------------------|
| FDN | 2 | 1500 | 1 | 350 |
| MMC | 2 | 1300 | 0 | 0 |
Looks like your desired output is off -- there aren't any mmc records less than 500. You can accomplish this using sum with case for each of your fields, removing flag from the group by:
SELECT
ClientID,
SUM(CASE WHEN flag = 1 THEN 1 END) as gtcount,
SUM(CASE WHEN flag = 1 THEN Amount END) AS totalAmountGreaterThan500,
SUM(CASE WHEN flag = 2 THEN 1 END) as ltcount,
SUM(CASE WHEN Flag = 2 THEN Amount END) AS AmountLessThan500
from myTable
group by ClientID
On a different note, I'm not sure why you need the Flag field. If it only marks which records are at or below 500, just add that logic to the query:
SUM(CASE WHEN Amount <= 500 Then ...)
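A fuller version of that idea might look like the sketch below. It assumes that flag = 1 simply means Amount > 500 and flag = 2 means Amount <= 500, which the question does not state outright, and it uses a hypothetical temp table #amounts_demo loaded with the question's rows.

```sql
-- Scratch temp table with the question's rows; #amounts_demo is a hypothetical name.
CREATE TABLE #amounts_demo (ClientID VARCHAR(10), Amount INT);

INSERT INTO #amounts_demo (ClientID, Amount) VALUES
    ('MMC', 600), ('MMC', 700),
    ('FDN', 800), ('FDN', 350), ('FDN', 700);

-- Derive both buckets from Amount directly instead of reading a flag column.
-- Assumption: the threshold is 500, with exactly 500 counted in the lower bucket.
SELECT ClientID,
       SUM(CASE WHEN Amount >  500 THEN 1      ELSE 0 END) AS gtcount,
       SUM(CASE WHEN Amount >  500 THEN Amount ELSE 0 END) AS totalAmountGreaterThan500,
       SUM(CASE WHEN Amount <= 500 THEN 1      ELSE 0 END) AS lscount,
       SUM(CASE WHEN Amount <= 500 THEN Amount ELSE 0 END) AS AmountLessThan500
FROM #amounts_demo
GROUP BY ClientID;
-- MMC: 2, 1300, 0, 0
-- FDN: 2, 1500, 1, 350
```

The ELSE 0 branches keep the columns at 0 rather than NULL for a client with no rows in a bucket, which matches the output the question asks for.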