SQL transform table with sum based on values - sql

I have a table like this:
operation_id  order_id  qty  qty_type  detail_type
------------  --------  ---  --------  -----------
1             1         240  ready     glued
1             1         199  ready     unglued
1             1         100  done      glued
1             2         50   ready     glued
I would like to transform it into the following: add four columns, each summing qty from the table above under a condition such as detail_type = 'glued' AND qty_type = 'ready'.
operation_id  order_id  qty_glued_ready  qty_unglued_ready  qty_glued_done  qty_unglued_done
------------  --------  ---------------  -----------------  --------------  ----------------
1             1         240              199                10              10
Can somebody help me with how the query should look?

I assume the expected output in your question is just an example and does not exactly match the sample data you posted; for instance, I don't see how qty_glued_done comes out as 10.
Here is something you can start from:
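-- One SUM(CASE ...) per (detail_type, qty_type) combination
SELECT o.`operation_id`, o.`order_id`,
    SUM(CASE WHEN o.`detail_type`='glued' AND o.`qty_type`='ready' THEN o.`qty` ELSE 0 END) AS qty_glued_ready,
    SUM(CASE WHEN o.`detail_type`='unglued' AND o.`qty_type`='ready' THEN o.`qty` ELSE 0 END) AS qty_unglued_ready,
    SUM(CASE WHEN o.`detail_type`='glued' AND o.`qty_type`='done' THEN o.`qty` ELSE 0 END) AS qty_glued_done,
    SUM(CASE WHEN o.`detail_type`='unglued' AND o.`qty_type`='done' THEN o.`qty` ELSE 0 END) AS qty_unglued_done
FROM `operation_table` o
GROUP BY o.`operation_id`, o.`order_id`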
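To check the approach against the sample rows from the question, here is a self-contained sketch. The table name operation_table comes from the answer; the column types are assumptions. It groups by both operation_id and order_id so that each order gets its own row.

```sql
-- Sample data from the question; column types are assumed.
CREATE TABLE operation_table (
    operation_id INT,
    order_id     INT,
    qty          INT,
    qty_type     VARCHAR(10),
    detail_type  VARCHAR(10)
);

INSERT INTO operation_table VALUES
    (1, 1, 240, 'ready', 'glued'),
    (1, 1, 199, 'ready', 'unglued'),
    (1, 1, 100, 'done',  'glued'),
    (1, 2,  50, 'ready', 'glued');

-- Conditional aggregation: each SUM only counts rows matching its condition.
SELECT operation_id,
       order_id,
       SUM(CASE WHEN detail_type = 'glued'   AND qty_type = 'ready' THEN qty ELSE 0 END) AS qty_glued_ready,
       SUM(CASE WHEN detail_type = 'unglued' AND qty_type = 'ready' THEN qty ELSE 0 END) AS qty_unglued_ready,
       SUM(CASE WHEN detail_type = 'glued'   AND qty_type = 'done'  THEN qty ELSE 0 END) AS qty_glued_done,
       SUM(CASE WHEN detail_type = 'unglued' AND qty_type = 'done'  THEN qty ELSE 0 END) AS qty_unglued_done
FROM operation_table
GROUP BY operation_id, order_id
ORDER BY operation_id, order_id;

-- Result for the sample data:
-- 1 | 1 | 240 | 199 | 100 | 0
-- 1 | 2 |  50 |   0 |   0 | 0
```

With the posted rows, qty_glued_done for order 1 is 100 and qty_unglued_done is 0, not the 10 shown in the question's expected output.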

Related

Transpose data in HIVE

I have the following dataset in Hive, and I would like to transpose rows into columns.
Customer  Status  Quantity
--------  ------  --------
25        Paid    5
25        N Paid  2
67        Open    12
67        Paid    4
45        N Paid  3
45        Open    2
After the transpose, I would like a new table with only one row per customer and one column per Status, e.g.
Customer  Paid  N Paid  Open
--------  ----  ------  ----
25        5     2       0
67        4     0       12
45        0     3       2
I tried some examples I found on the Internet, but I could not make them work. For the sake of simplicity I listed only three statuses here, but in fact there could be more.
In SAS, I used to do something like the following:
proc transpose
    data = imputtable
    out = outputtable;
    by customer;
    id status;
    var quantity;
run;
SAS gets all the existing statuses and pivots them into columns. I was looking to do the same in Hive.
Regards,
Marcio
Use conditional aggregation:
select Customer,
    sum(case when Status = 'Paid' then Quantity else 0 end) as Paid,
    sum(case when Status = 'N Paid' then Quantity else 0 end) as `N Paid`,
    sum(case when Status = 'Open' then Quantity else 0 end) as Open
from your_table -- placeholder: replace with the actual table name
group by Customer
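As a quick check, the same pattern can be run against the sample rows from the question. This is only a sketch: the CTE name orders and the underscore column aliases are my own placeholders, and it assumes a recent Hive version that accepts SELECT without FROM and UNION ALL inside a CTE.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH orders AS (
    SELECT 25 AS customer, 'Paid' AS status, 5 AS quantity
    UNION ALL SELECT 25, 'N Paid', 2
    UNION ALL SELECT 67, 'Open',   12
    UNION ALL SELECT 67, 'Paid',   4
    UNION ALL SELECT 45, 'N Paid', 3
    UNION ALL SELECT 45, 'Open',   2
)
-- One output column per status; a status missing for a customer sums to 0.
SELECT customer,
       SUM(CASE WHEN status = 'Paid'   THEN quantity ELSE 0 END) AS paid,
       SUM(CASE WHEN status = 'N Paid' THEN quantity ELSE 0 END) AS n_paid,
       SUM(CASE WHEN status = 'Open'   THEN quantity ELSE 0 END) AS open_qty
FROM orders
GROUP BY customer
ORDER BY customer;

-- Result for the sample data:
-- 25 | 5 | 2 | 0
-- 45 | 0 | 3 | 2
-- 67 | 4 | 0 | 12
```

Unlike PROC TRANSPOSE, this does not discover the statuses by itself: each additional status needs its own SUM(CASE ...) line.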

Presto SQL pivoting (for lack of a better word) data

I am working with some course data in a Presto database. The data in the table looks like:
student_id period score completed
1 2016_Q1 3 Y
1 2016_Q3 4 Y
3 2017_Q1 4 Y
4 2018_Q1 2 N
I would like to format the data so that it looks like:
student_id 2018_Q1_score 2018_Q1_completed 2017_Q3_score
1 0 N 5
3 4 Y 4
4 2 N 2
I know that I could do this by joining to the table for each time period, but I wanted to ask here to see if any gurus had a recommendation for a more scalable solution (e.g. perhaps not having to manually create a new join for each period). Any suggestions?
You can just use conditional aggregation:
select student_id,
    max(case when period = '2018_Q1' then score else 0 end) as score_2018q1,
    max(case when period = '2018_Q1' then completed else 'N' end) as completed_2018q1,
    max(case when period = '2017_Q3' then score else 0 end) as score_2017q3
from t
group by student_id
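Since the question asks for something that scales with the number of periods, one possible variant in Presto is to aggregate each student's rows into maps once and then look periods up by key. The sketch below uses Presto's map_agg and element_at functions; the CTE name per_student and the output aliases are my own, and it assumes at most one row per (student_id, period).

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH t AS (
    SELECT *
    FROM (VALUES
        (1, '2016_Q1', 3, 'Y'),
        (1, '2016_Q3', 4, 'Y'),
        (3, '2017_Q1', 4, 'Y'),
        (4, '2018_Q1', 2, 'N')
    ) AS v (student_id, period, score, completed)
),
-- One pass over the data: period -> score and period -> completed per student.
per_student AS (
    SELECT student_id,
           map_agg(period, score)     AS scores,
           map_agg(period, completed) AS completions
    FROM t
    GROUP BY student_id
)
-- element_at returns NULL for a missing key, so coalesce supplies the default.
SELECT student_id,
       coalesce(element_at(scores, '2018_Q1'), 0)        AS score_2018q1,
       coalesce(element_at(completions, '2018_Q1'), 'N') AS completed_2018q1,
       coalesce(element_at(scores, '2017_Q3'), 0)        AS score_2017q3
FROM per_student
ORDER BY student_id;
```

The output columns still have to be listed explicitly, but adding a period is one more element_at lookup rather than another join or another aggregate.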

Working out the percentage of outcomes in a column within a table

I am using SQL developer and have a table called table1 which looks like this (but with loads more data):
item_id seller_id warranty postage_class
------- --------- -------- -------------
14 2 1 2
17 6 1 1
14 2 1 1
14 2 1 2
14 2 1 1
14 2 1 2
I want to identify the percentage of items sent by first class.
If anyone could help me out that would be amazing!
You can use conditional aggregation. The simplest method is probably:
select avg(case when postage_class = 1 then 1.0 else 0 end)
from t;
Note this calculates a ratio between 0 and 1. If you want a "percentage" between 0 and 100, then use 100.0 instead of 1.0.
Some databases make it possible to shorten this even further. For instance, in Postgres, you can do:
select avg( (postage_class = 1)::int )
from t;
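Applied to the six sample rows, the percentage version might look like the sketch below. It is written for Oracle, since the question mentions SQL Developer, and it assumes, as the answer does, that postage_class = 1 means first class. The alias pct_first_class is my own.

```sql
-- Sample rows from the question, inlined via DUAL so the query runs on its own.
WITH table1 (item_id, seller_id, warranty, postage_class) AS (
    SELECT 14, 2, 1, 2 FROM dual UNION ALL
    SELECT 17, 6, 1, 1 FROM dual UNION ALL
    SELECT 14, 2, 1, 1 FROM dual UNION ALL
    SELECT 14, 2, 1, 2 FROM dual UNION ALL
    SELECT 14, 2, 1, 1 FROM dual UNION ALL
    SELECT 14, 2, 1, 2 FROM dual
)
-- AVG of a 0/1 flag is the fraction of rows with the flag set; times 100 gives a percentage.
SELECT ROUND(100 * AVG(CASE WHEN postage_class = 1 THEN 1 ELSE 0 END), 1) AS pct_first_class
FROM table1;

-- 3 of the 6 sample rows have postage_class = 1, so this returns 50.
```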

Grouping by multiple fields

I have a table of ParentID's which are products made by combining the required amount of the corresponding BaseID product.
Product table:
ParentID BaseID Required UOH
-------------------------------------
1 55 1 400
1 56 .5 400
2 55 1 400
2 57 1 400
3 58 1 0
I need to select the ParentID's where there are enough of each required base product (UOH) to create the Parent.
The Query should return
ParentID
----------------
1
2
The only way I know how to do this is by using a pivot view. Is there another or a better way to accomplish this?
Thanks
You can use group by and having:
select parentid
from product t
group by parentid
having sum(case when uoh < required then 1 else 0 end) = 0
The having clause counts the number of times where uoh is less than required. If the count is zero, then all base ids have sufficient amounts.
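Run against the sample rows, the idea can be sketched as follows. The table name product and the column types are assumptions based on the question's "Product table" heading.

```sql
-- Sample data from the question; column types are assumed.
CREATE TABLE product (
    parentid INT,
    baseid   INT,
    required DECIMAL(10, 2),
    uoh      INT
);

INSERT INTO product VALUES
    (1, 55, 1,   400),
    (1, 56, 0.5, 400),
    (2, 55, 1,   400),
    (2, 57, 1,   400),
    (3, 58, 1,   0);

-- Keep a parent only if none of its base products is short (uoh < required).
SELECT parentid
FROM product
GROUP BY parentid
HAVING SUM(CASE WHEN uoh < required THEN 1 ELSE 0 END) = 0
ORDER BY parentid;

-- Result for the sample data: parentid 1 and 2.
-- Parent 3 is dropped because its only base product has uoh = 0 < required = 1.
```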

Inserting a new indicator column to tell if a given row maximizes another column in SQL

I currently have a table in SQL that looks like this
PRODUCT_ID_1 PRODUCT_ID_2 SCORE
1 2 10
1 3 100
1 10 3000
2 10 10
3 35 100
3 2 1001
That is, PRODUCT_ID_1,PRODUCT_ID_2 is a primary key for this table.
What I would like to do is add a column to this table that tells whether or not the current row is the one that maximizes SCORE for its value of PRODUCT_ID_1.
In other words, what I would like to get is the following table:
PRODUCT_ID_1 PRODUCT_ID_2 SCORE IS_MAX_SCORE_FOR_ID_1
1 2 10 0
1 3 100 0
1 10 3000 1
2 10 10 1
3 35 100 0
3 2 1001 1
I am wondering how I can compute the IS_MAX_SCORE_FOR_ID_1 column and insert it into the table without having to create a new table.
You can try it like this:
SELECT b.PRODUCT_ID_1, b.PRODUCT_ID_2, b.SCORE,
    (CASE WHEN b.SCORE =
        (SELECT MAX(a.SCORE) FROM TableName a WHERE a.PRODUCT_ID_1 = b.PRODUCT_ID_1)
    THEN 1 ELSE 0 END) AS IS_MAX_SCORE_FOR_ID_1
FROM TableName b
You can use a window function for this:
select product_id_1,
product_id_2,
score,
case
when score = max(score) over (partition by product_id_1) then 1
else 0
end as is_max_score_for_id_1
from the_table
order by product_id_1;
(The above is ANSI SQL and should run on any modern DBMS)
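Both answers compute the flag in a SELECT. If the column really has to be stored in the existing table, as the question asks, one possible approach is to add the column and fill it with a correlated UPDATE. The sketch below is written in PostgreSQL-style syntax and assumes the table is called the_table; the ALTER TABLE syntax differs slightly between databases (SQL Server omits the COLUMN keyword, for example), and MySQL rejects an UPDATE whose subquery reads from the table being updated.

```sql
-- Sample data from the question; table name and column types are assumed.
CREATE TABLE the_table (
    product_id_1 INT,
    product_id_2 INT,
    score        INT,
    PRIMARY KEY (product_id_1, product_id_2)
);

INSERT INTO the_table VALUES
    (1, 2,  10),
    (1, 3,  100),
    (1, 10, 3000),
    (2, 10, 10),
    (3, 35, 100),
    (3, 2,  1001);

-- Add the indicator column in place; no new table is created.
ALTER TABLE the_table ADD COLUMN is_max_score_for_id_1 INT DEFAULT 0;

-- Flag each row whose score equals the maximum score within its product_id_1 group.
UPDATE the_table
SET is_max_score_for_id_1 =
    CASE
        WHEN score = (SELECT MAX(t2.score)
                      FROM the_table t2
                      WHERE t2.product_id_1 = the_table.product_id_1)
        THEN 1
        ELSE 0
    END;

-- Flags after the update, in the order the rows were inserted: 0, 0, 1, 1, 0, 1
```

A stored flag goes stale whenever scores change, so the UPDATE has to be rerun after inserts or score updates. As with the SELECT versions, rows that tie for the maximum are all flagged with 1.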