Presto SQL pivoting (for lack of a better word) data - sql

I am working with some course data in a Presto database. The data in the table looks like:
student_id period score completed
1 2016_Q1 3 Y
1 2016_Q3 4 Y
3 2017_Q1 4 Y
4 2018_Q1 2 N
I would like to format the data so that it looks like:
student_id 2018_Q1_score 2018_Q1_completed 2017_Q3_score
1 0 N 5
3 4 Y 4
4 2 N 2
I know that I could do this by joining to the table for each time period, but I wanted to ask here to see if any gurus had a recommendation for a more scalable solution (e.g. perhaps not having to manually create a new join for each period). Any suggestions?

You can just use conditional aggregation:
select student_id,
max(case when period = '2018_Q1' then score else 0 end) as score_2018q1,
max(case when period = '2018_Q1' then completed else 'N' end) as completed_2018q1,
max(case when period = '2017_Q3' then score else 0 end) as score_2017q3
from t
group by student_id
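To check the conditional-aggregation query against the sample rows from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Presto here (an assumption made for the sake of a runnable example), and the table name t is taken from the answer. The CASE and GROUP BY parts are standard SQL, so the same statement should carry over, but that is not verified against Presto itself.

```python
import sqlite3

# In-memory SQLite database standing in for the Presto table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (student_id INTEGER, period TEXT, score INTEGER, completed TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "2016_Q1", 3, "Y"),
        (1, "2016_Q3", 4, "Y"),
        (3, "2017_Q1", 4, "Y"),
        (4, "2018_Q1", 2, "N"),
    ],
)

# One output column per (period, field) pair; rows from other periods
# fall into the ELSE branch and contribute the default value.
pivot_sql = """
SELECT student_id,
       MAX(CASE WHEN period = '2018_Q1' THEN score ELSE 0 END) AS score_2018q1,
       MAX(CASE WHEN period = '2018_Q1' THEN completed ELSE 'N' END) AS completed_2018q1,
       MAX(CASE WHEN period = '2017_Q3' THEN score ELSE 0 END) AS score_2017q3
FROM t
GROUP BY student_id
ORDER BY student_id
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Each student ends up on exactly one row, and periods with no data for a student show the defaults (0 and 'N'). The table is scanned once, with no self-join per period.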

Related

How to check the count of each values repeating in a row

I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
. .
goes on goes on
Those are my two tables.
I want to count how many times each user appears with each status_id.
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate counts, but without indicating which status_id each count belongs to.
This is called a pivot, and it works in two steps:
1. extract the data for each specific field value using a CASE expression
2. aggregate the data per user, so that every field value ends up on the same record for each user
SELECT Username,
SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
FROM t2
INNER JOIN t1
ON t2.user_id = t1.ID
GROUP BY Username
ORDER BY Username
Note: This solution assumes that there are exactly 3 status_id values. If you need to generalize over an unknown number of status IDs, you would need a dynamic query. In any case, it's better to avoid dynamic queries if you can.
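For the case where the set of status IDs is not known in advance, one possible shape of such a dynamic query is sketched below: read the distinct values first, then generate one SUM(CASE ...) column per value in application code. This is only an illustration under assumptions: it uses Python's sqlite3 module with an in-memory SQLite database in place of the real server, the table names t1 and t2 come from the answer, and the generated aliases status_id_N are hypothetical names chosen to match it.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (ID INTEGER, Username TEXT);
CREATE TABLE t2 (user_id INTEGER, status_id INTEGER);
INSERT INTO t1 VALUES (1, 'Dan'), (2, 'Eli'), (3, 'Sean'), (4, 'John');
INSERT INTO t2 VALUES (1, 2), (1, 3), (4, 1), (3, 2), (2, 3),
                      (1, 1), (3, 3), (3, 3), (3, 3);
""")

# Step 1: discover which status IDs exist. int() ensures that only
# integers are ever interpolated into the generated SQL text.
status_ids = [
    int(r[0])
    for r in conn.execute("SELECT DISTINCT status_id FROM t2 ORDER BY status_id")
]

# Step 2: build one conditional-aggregation column per status ID.
columns = ",\n       ".join(
    f"SUM(CASE WHEN status_id = {s} THEN 1 ELSE 0 END) AS status_id_{s}"
    for s in status_ids
)
dynamic_sql = f"""
SELECT Username,
       {columns}
FROM t2
INNER JOIN t1 ON t2.user_id = t1.ID
GROUP BY Username
ORDER BY Username
"""

# Step 3: run the generated statement.
dynamic_rows = conn.execute(dynamic_sql).fetchall()
for row in dynamic_rows:
    print(row)
```

The trade-off is that the result's column list now depends on the data, so any consumer of the result has to cope with a varying shape. That is one reason to prefer the fixed query when the set of values is small and stable.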

SQL transform table with sum based on values

I have a table like this:
operation_id order_id qty qty_type detail_type
1 1 240 ready glued
1 1 199 ready unglued
1 1 100 done glued
1 2 50 ready glued
I would like to transform it into the table below. That means adding 4 columns, each summing qty from the table above under a condition such as detail_type = 'glued' and qty_type = 'ready'.
operation_id order_id qty_glued_ready qty_unglued_ready qty_glued_done qty_unglued_done
1 1 240 199 10 10
Can somebody help me with what the query should look like?
I assume the expected output in your question is only an illustration, because it does not match the table data you posted.
For instance, I don't understand how your qty_glued_done comes out as 10.
But here is something you can start from:
SELECT o.`operation_id`, o.`order_id`,
SUM(CASE WHEN o.`detail_type`='glued' AND o.`qty_type`='ready' THEN o.`qty` ELSE 0 END) AS qty_glued_ready,
SUM(CASE WHEN o.`detail_type`='unglued' AND o.`qty_type`='ready' THEN o.`qty` ELSE 0 END) AS qty_unglued_ready
-- (and so on for the 'done' columns, each preceded by a comma)
FROM `operation_table` o
GROUP BY o.`operation_id`, o.`order_id`

SQL - Impala - How to unfold one categorical column into many?

I have the following table :
user category number
1 A 8
1 B 6
2 A 1
2 C 9
3 B 5
I want to "unfold" or "dummify" the category column and fill them with the "number" column to obtain:
user cat_A cat_B cat_C
1 8 6 0
2 1 0 9
3 0 5 0
Is it possible to achieve this in SQL (Impala) ?
I found this question How to create dummy variable columns for thousands of categories in Google BigQuery?
However it seems a little bit complex and I'd rather do it in Pandas.
Is there a simpler solution, knowing that I have 10 categories (A, B, C, D etc)?
You can use conditional aggregation:
SELECT user,
SUM(CASE WHEN category = 'A' THEN number ELSE 0 END) cat_A,
SUM(CASE WHEN category = 'B' THEN number ELSE 0 END) cat_B,
SUM(CASE WHEN category = 'C' THEN number ELSE 0 END) cat_C
FROM T
GROUP BY user

SQL Server : how can I get difference between counts of total rows and those with only data

I have a table with data as shown below (the table is built every day with current date, but I left off that field for ease of reading).
This table keeps track of people and the doors they enter on a daily basis.
Table entrance_t:
id entrance entered
------------------------
1 a 0
1 b 0
1 c 0
1 d 0
2 a 1
2 b 0
2 c 0
2 d 0
3 a 0
3 b 1
3 c 1
3 d 1
My goal is to report on people and count the entrances they did not use (grouping on people), but ONLY for people who entered at least once (entered=1).
So using the above table, I would like the results of query to be...
id count
----------
2 3
3 1
(id=2 did not use 3 of the entrances and id=3 did not use 1)
I tried several queries (some with inner joins on two instances of the same table) and I can get the entrances not used, but always for everybody. Like this...
id count
----------
1 4
2 3
3 1
How do I not display results id=1 since they did not enter at all?
You could use conditional aggregation:
SELECT id, count(CASE WHEN entered = 0 THEN 1 END) AS cnt
FROM entrance_t
GROUP BY id
HAVING count(CASE WHEN entered = 1 THEN 1 END) > 0;
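The effect of the HAVING clause can be checked on the question's sample rows. The sketch below assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for SQL Server. An ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entrance_t (id INTEGER, entrance TEXT, entered INTEGER)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO entrance_t VALUES (?, ?, ?)",
    [
        (1, "a", 0), (1, "b", 0), (1, "c", 0), (1, "d", 0),
        (2, "a", 1), (2, "b", 0), (2, "c", 0), (2, "d", 0),
        (3, "a", 0), (3, "b", 1), (3, "c", 1), (3, "d", 1),
    ],
)

# COUNT ignores NULLs, and a CASE without ELSE yields NULL for rows that
# do not match, so each COUNT(CASE ...) counts only the matching rows.
unused_sql = """
SELECT id, COUNT(CASE WHEN entered = 0 THEN 1 END) AS cnt
FROM entrance_t
GROUP BY id
HAVING COUNT(CASE WHEN entered = 1 THEN 1 END) > 0
ORDER BY id
"""
unused_rows = conn.execute(unused_sql).fetchall()
for row in unused_rows:
    print(row)
```

The group for id=1 is dropped by HAVING because it contains no row with entered = 1. The remaining groups report their number of unused entrances.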

SQL: A count inside a case inside a case perhaps?

Good day all.
Below is an image relating to what I am attempting to achieve.
In one table there are two fields: one is an ID and one is a Type.
I figured a picture paints a thousand words, so check the image below.
I have tried a few things with CASE and other approaches, but none worked.
One thing to note: we cannot use temporary tables, inserts, or deletes due to certain limitations.
Data Sample:
ID Type
3 bad
2 zeal
4 tro
3 pol
2 tro
2 lata
4 wrong
3 dead
2 wrong
3 dead
4 wrong
3 lata
2 bad
2 zeal
First of all you need a table containing the type groups:
type typegroup
bad 1
tro 1
zeal 1
dead 2
lata 2
wrong 2
pol 3
Then join the two tables, group by type group to get one result row per type group, and count.
select
tg.typegroup,
count(case when id = 2 then 1 end) as id2,
count(case when id = 3 then 1 end) as id3,
count(case when id = 4 then 1 end) as id4
from typegroups tg
join mytable m on m.type = tg.type
group by tg.typegroup
order by tg.typegroup;
UPDATE: Of course, you can create such a table on the fly.
...
from
(
select 'bad' as type, 1 as typegroup
union all
select 'tro' as type, 1 as typegroup
union all
...
) tg
join mytable m on m.type = tg.type
...
And you can move this to a WITH clause if you prefer.
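As a rough sketch of the WITH-clause variant, the example below defines the type groups inline as a common table expression and runs the pivot over the question's sample data. It assumes an in-memory SQLite database via Python's sqlite3 module. The WITH ... AS (VALUES ...) form works in SQLite, but other databases may need the UNION ALL form shown above inside the CTE instead. No temporary table, insert, or delete is needed on the real data (the inserts below only set up the demo).

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, type TEXT)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (3, "bad"), (2, "zeal"), (4, "tro"), (3, "pol"), (2, "tro"),
        (2, "lata"), (4, "wrong"), (3, "dead"), (2, "wrong"), (3, "dead"),
        (4, "wrong"), (3, "lata"), (2, "bad"), (2, "zeal"),
    ],
)

# The type-to-group mapping lives in a CTE, so nothing is written
# to the database to hold it.
with_sql = """
WITH typegroups(type, typegroup) AS (
    VALUES ('bad', 1), ('tro', 1), ('zeal', 1),
           ('dead', 2), ('lata', 2), ('wrong', 2),
           ('pol', 3)
)
SELECT tg.typegroup,
       COUNT(CASE WHEN m.id = 2 THEN 1 END) AS id2,
       COUNT(CASE WHEN m.id = 3 THEN 1 END) AS id3,
       COUNT(CASE WHEN m.id = 4 THEN 1 END) AS id4
FROM typegroups tg
JOIN mytable m ON m.type = tg.type
GROUP BY tg.typegroup
ORDER BY tg.typegroup
"""
group_rows = conn.execute(with_sql).fetchall()
for row in group_rows:
    print(row)
```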