How can I pivot this query in BigQuery?

I've got a query in BigQuery whose output I'd like to pivot on Project Site Name and Study ID.
This is the code:
SELECT
Project_Site_Name,
NIHR_Portfolio_Study_ID,
CASE WHEN (Recruitment_Year = '2017/18')
THEN COUNT(Patient_Local_Number) END AS rec1718,
CASE WHEN (Recruitment_Year = '2018/19')
THEN COUNT(Patient_Local_Number) END AS rec1819
FROM `fourth-jigsaw-118116.Partners.PARData`
WHERE
NIHR_Portfolio_Study_ID = 1358 AND
Recruitment_Year IN ('2017/18', '2018/19')
GROUP BY
Project_Site_Name,
NIHR_Portfolio_Study_ID,
Recruitment_Year
Currently, this is the output:
And I'd like it to look like this:
I've been searching for a solution for some time now. I think it might have something to do with nesting the aggregation in a subquery, but that's about as far as I've got. Any suggestions?
best wishes
Dave

If you want to pivot out result columns based on the recruitment year, then you should not be grouping by that column. Instead, use conditional aggregation to determine the counts for each recruitment year.
SELECT
Project_Site_Name,
NIHR_Portfolio_Study_ID,
COUNT(CASE WHEN Recruitment_Year = '2017/18' THEN 1 END) AS rec1718,
COUNT(CASE WHEN Recruitment_Year = '2018/19' THEN 1 END) AS rec1819
FROM `fourth-jigsaw-118116.Partners.PARData`
WHERE
NIHR_Portfolio_Study_ID = 1358 AND
Recruitment_Year IN ('2017/18', '2018/19')
GROUP BY
Project_Site_Name,
NIHR_Portfolio_Study_ID;
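To see the effect of dropping Recruitment_Year from the GROUP BY, here is a minimal sketch that runs the same conditional aggregation against an in-memory SQLite database. The table name par_data, the shortened column names, and the sample rows are made up for illustration; only the shape of the query comes from the answer above.

```python
import sqlite3

# Hypothetical stand-in for the PARData table, with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE par_data ("
    "project_site_name TEXT, study_id INTEGER, "
    "recruitment_year TEXT, patient_local_number TEXT)"
)
conn.executemany(
    "INSERT INTO par_data VALUES (?, ?, ?, ?)",
    [
        ("Site A", 1358, "2017/18", "p1"),
        ("Site A", 1358, "2017/18", "p2"),
        ("Site A", 1358, "2018/19", "p3"),
        ("Site B", 1358, "2018/19", "p4"),
    ],
)

# COUNT ignores NULLs, so each CASE counts only the rows of one year.
pivot_sql = """
SELECT project_site_name,
       study_id,
       COUNT(CASE WHEN recruitment_year = '2017/18' THEN 1 END) AS rec1718,
       COUNT(CASE WHEN recruitment_year = '2018/19' THEN 1 END) AS rec1819
FROM par_data
WHERE study_id = 1358
  AND recruitment_year IN ('2017/18', '2018/19')
GROUP BY project_site_name, study_id
ORDER BY project_site_name
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

Each site now comes back as a single row with one count column per year, instead of one row per site and year.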

Related

SQL Oracle: sum values in multiple columns

I'm looking for a solution to my problem.
I have a query:
select em_name, sum(abs_day_left)
from pp_employees,
pp_types_abs,
pp_abs
where em_id = abs_em_id and abs_abs_id = abs_id and
abs_kod in ('12','13','14','15')
group by em_name
I want to add more columns, each summing over a different set of abs_kod values (image attachment),
for example:
second column
... abs_kod in
('656','44','323','33')
third column
... abs_kod in
('63','55','565','556')
and more...
example table
Thanks for the help, and have a nice weekend.
One more thing...
The formula counts days for the whole month.
How can I make it count the days correctly when the parameters cover only part of a month, for example from 1980-01-01 to 1980-03-15?
Thanks in advance,
bob
I think that you are looking for conditional aggregation:
select
em_name,
sum(case when abs_kod in (12,13,14,15) then abs_day_left end) abs_day_left_1,
sum(case when abs_kod in (656,44,323,33) then abs_day_left end) abs_day_left_2,
sum(case when abs_kod in (63,55,565,556) then abs_day_left end) abs_day_left_3
from pp_employees
inner join pp_abs on em_id = abs_em_id
inner join pp_types_abs on abs_abs_id = abs_id
where abs_kod in (12,13,14,15,656,44,323,33,63,55,565,556)
group by em_name
Notes:
always use explicit joins instead of old-school, implicit joins - I tried to fix this, but I am unsure whether I did it correctly, for a reason that lies in the following point...
always qualify the columns in the query with the table they belong to
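One detail worth knowing about the query above: a CASE without an ELSE yields NULL for non-matching rows, so a group with no matching rows gets a NULL sum rather than 0. The sketch below shows this in SQLite, assuming a single flattened table named absences with invented rows in place of the three joined tables; wrapping the sum in COALESCE is one way to get 0 instead.

```python
import sqlite3

# Hypothetical flattened table standing in for the three joined tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE absences (em_name TEXT, abs_kod TEXT, abs_day_left INTEGER)"
)
conn.executemany(
    "INSERT INTO absences VALUES (?, ?, ?)",
    [("Ann", "12", 3), ("Ann", "13", 2), ("Ann", "44", 5), ("Bob", "12", 4)],
)

# group_2 is NULL for Bob because none of his rows match that code list;
# group_2_zero turns that NULL into 0 with COALESCE.
sql = """
SELECT em_name,
       SUM(CASE WHEN abs_kod IN ('12', '13') THEN abs_day_left END) AS group_1,
       SUM(CASE WHEN abs_kod IN ('44', '656') THEN abs_day_left END) AS group_2,
       COALESCE(SUM(CASE WHEN abs_kod IN ('44', '656') THEN abs_day_left END), 0)
           AS group_2_zero
FROM absences
GROUP BY em_name
ORDER BY em_name
"""
totals = conn.execute(sql).fetchall()
for row in totals:
    print(row)
```

Whether NULL or 0 is the better result depends on how the report distinguishes "no absences of this kind" from "zero days left".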

AND OR SQL operator with multiple records

I have the following query where if brand1/camp1 taken individually, query returns the correct value but if I specify more than one brand or campaigns, it returns some other number and I am not sure what the math is behind that. It is not the total of the two either.
I think the IN operator treats the comma-separated values as OR, whereas I need it to behave like AND.
select campaign,
sum(case when campaign in ('camp1', 'camp2') and description in ('brand1', 'brand2') then orders else 0 end) as brand_convs
from data.camp_results
where campaign in ('camp1', 'camp2') and channel='prog' and type='sbc'
group by campaign
having brand_convs > 0
order by brand_convs desc;
Any thoughts?
The problem is in the IN part, as you suspected: the two IN operators do not affect each other in any way, so campaign can be camp1 while description is brand2.
If your DBMS supports multiple columns in an IN expression, you can use a single IN:
SELECT campaign, SUM(
CASE WHEN (campaign, description) IN (
('camp1', 'brand1'),
('camp2', 'brand2')
) THEN orders ELSE 0 END
) [rest of query...]
If not, you're probably going to have to use ANDs and ORs:
SELECT campaign, SUM(
CASE WHEN
(campaign='camp1' AND description='brand1')
OR (campaign='camp2' AND description='brand2')
THEN orders ELSE 0 END
) [rest of query...]
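To make the difference concrete, the following sketch compares the two independent IN tests with the paired AND/OR form on a few invented rows. It uses SQLite and a simplified, hypothetical camp_results table with only the three columns the CASE needs.

```python
import sqlite3

# Hypothetical, simplified version of data.camp_results with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE camp_results (campaign TEXT, description TEXT, orders INTEGER)"
)
conn.executemany(
    "INSERT INTO camp_results VALUES (?, ?, ?)",
    [
        ("camp1", "brand1", 10),
        ("camp1", "brand2", 7),   # pairing that should not be counted
        ("camp2", "brand1", 4),   # pairing that should not be counted
        ("camp2", "brand2", 20),
    ],
)

# Two independent IN tests accept every campaign/brand combination.
independent_sql = """
SELECT campaign,
       SUM(CASE WHEN campaign IN ('camp1', 'camp2')
                 AND description IN ('brand1', 'brand2')
                THEN orders ELSE 0 END) AS brand_convs
FROM camp_results GROUP BY campaign ORDER BY campaign
"""

# Explicit pairs accept only camp1/brand1 and camp2/brand2.
paired_sql = """
SELECT campaign,
       SUM(CASE WHEN (campaign = 'camp1' AND description = 'brand1')
                  OR (campaign = 'camp2' AND description = 'brand2')
                THEN orders ELSE 0 END) AS brand_convs
FROM camp_results GROUP BY campaign ORDER BY campaign
"""
independent_rows = conn.execute(independent_sql).fetchall()
paired_rows = conn.execute(paired_sql).fetchall()
print(independent_rows)
print(paired_rows)
```

The independent form adds the cross pairings to each campaign's total, which would explain a number that matches neither individual result nor their sum.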

SubQuery Aggregates in ActiveRecord

I'm trying to avoid using straight up SQL in my Rails app, but need to do a quite large version of this:
SELECT ds.product_id,
( SELECT SUM(units) FROM daily_sales WHERE (date BETWEEN '2015-01-01' AND '2015-01-08') AND service_type = 1 ) as wk1,
( SELECT SUM(units) FROM daily_sales WHERE (date BETWEEN '2015-01-09' AND '2015-01-16') AND service_type = 1 ) as wk2
FROM daily_sales as ds group by ds.product_id
I'm sure it can be done, but I'm struggling to write this as an Active Record statement. Can anyone help?
If you must do this in a single query, you'll need to write some SQL for the CASE statements. The following is what you need:
ranges = [ # ordered array of all your date-ranges
Date.new(2015, 1, 1)..Date.new(2015, 1, 8),
Date.new(2015, 1, 9)..Date.new(2015, 1, 16)
]
overall_range = (ranges.first.min)..(ranges.last.max)
grouping_sub_str = \
ranges.map.with_index do |range, i|
"WHEN (date BETWEEN '#{range.min}' AND '#{range.max}') THEN 'week#{i}'"
end.join(' ')
grouping_condition = "CASE #{grouping_sub_str} END"
grouping_columns = ['product_id', grouping_condition]
DailySale.where(date: overall_range).group(grouping_columns).sum(:units)
That will produce a hash with array keys and numeric values. A key will be of the form [product_id, 'week1'] and the value will be the corresponding sum of units for that week.
Simplify your SQL to the following and try converting it..
SELECT ds.product_id
, SUM(CASE WHEN date BETWEEN '2015-01-01' AND '2015-01-08' AND service_type = 1
THEN units
END) WK1
, SUM(CASE WHEN date BETWEEN '2015-01-09' AND '2015-01-16' AND service_type = 1
THEN units
END) WK2
FROM daily_sales as ds
group by ds.product_id
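The simplified form is not only shorter. The scalar subqueries in the question are not correlated with ds.product_id, so as written they return the same overall total for every product, whereas the conditional aggregation sums per product. A small sketch in SQLite with invented rows (the daily_sales columns are taken from the question, the data is hypothetical) shows the difference:

```python
import sqlite3

# Invented sample rows for the daily_sales table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_sales ("
    "product_id INTEGER, date TEXT, service_type INTEGER, units INTEGER)"
)
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?, ?)",
    [
        (1, "2015-01-02", 1, 5),
        (1, "2015-01-10", 1, 3),
        (2, "2015-01-03", 1, 7),
        (2, "2015-01-12", 2, 9),  # service_type 2, so it is never summed
    ],
)

# The subquery ignores ds.product_id, so every product gets the grand total.
uncorrelated_sql = """
SELECT ds.product_id,
       (SELECT SUM(units) FROM daily_sales
         WHERE date BETWEEN '2015-01-01' AND '2015-01-08'
           AND service_type = 1) AS wk1
FROM daily_sales AS ds
GROUP BY ds.product_id
ORDER BY ds.product_id
"""

# Conditional aggregation sums only the rows of the current group.
conditional_sql = """
SELECT ds.product_id,
       SUM(CASE WHEN date BETWEEN '2015-01-01' AND '2015-01-08'
                 AND service_type = 1 THEN units END) AS wk1,
       SUM(CASE WHEN date BETWEEN '2015-01-09' AND '2015-01-16'
                 AND service_type = 1 THEN units END) AS wk2
FROM daily_sales AS ds
GROUP BY ds.product_id
ORDER BY ds.product_id
"""
uncorrelated_rows = conn.execute(uncorrelated_sql).fetchall()
conditional_rows = conn.execute(conditional_sql).fetchall()
print(uncorrelated_rows)
print(conditional_rows)
```

Product 2 has no qualifying rows in the second week, so its wk2 comes back as NULL (None in Python) rather than 0.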
Every Rails developer sooner or later bangs their head against the limits of the Active Record query interface, only to find the solution in Arel.
Arel gives you the flexibility you need to build your query without using loops, etc. I am not going to give runnable code, just some hints on how to do it yourself:
1. We are going to use Arel tables to build the query. For a model called, for example, Product, getting the Arel table is as easy as products = Product.arel_table
2. Getting the sum of a column looks like daily_sales.project(daily_sales[:units].sum).where(daily_sales[:date].gt(BEGIN_DATE)).where(daily_sales[:date].lt(END_DATE)). You can chain as many wheres as you want, and they will be translated into SQL ANDs.
3. Since we need multiple sums in the end result, you need to make use of Common Table Expressions (CTEs). Take a look at the docs and this answer for more info on this.
4. You can use the CTEs from step 3 in combination with group, and you are done!

SQL multiple SELECT too slow (7 min)

This query returns the right result, but it is too slow.
Function:
Selecting all rows if SC and %%5 and 2013.07.11 < date < 2013.07.18
and
some older lines represent lines
Method:
Finding X count rows.
one by one to see whether there is consistency 28 days
select efi_name, efi_id, count(*) as dupes, id, mlap_date
from address m
where
mlap_date > "2013.07.11"
and mlap_date < "2013.07.18"
and mlap_type = "SC"
and calendar_id not like "%%5"
and concat(efi_id,irsz,ucase(city), ucase(address)) in (
select concat(k.efi_id,k.irsz,ucase(k.city), ucase(k.address)) as dupe
from address k
where k.mlap_date > adddate(m.`mlap_date`,-28)
and k.mlap_date < m.mlap_date
and k.mlap_type = "SC"
and k.calendar_id not like "%%5"
and k.status = 'Befejezett'
group by concat(k.efi_id,k.irsz,ucase(k.city), ucase(k.address))
having (count(*) > 1)
)
group by concat(efi_id,irsz,ucase(city), ucase(address))
Thanks for helping!
NOT LIKE plus wildcard-prefixed terms are index-usage killers.
You could also try replacing the IN + inline table with an inner join: does the optimizer run the NOT LIKE query twice (see your explain plan)?
It looks like you might be using MySQL, in which case you could build a hash column based on
efi_id
irsz
ucase(city)
ucase(address)
and compare that column directly. This is a way of implementing a hash join in MySQL.
I don't think you need a subquery to do this. You should be able to do it just with the outer group by and conditional aggregations.
select efi_name, efi_id,
sum(case when mlap_date > "2013.07.11" and mlap_date < "2013.07.18" then 1 else 0 end) as dupes,
id, mlap_date
from address m
where mlap_type = 'SC' and calendar_id not like '%%5'
group by efi_id,irsz, ucase(city), ucase(address)
having sum(case when m.status = 'Befejezett' and
m.mlap_date <= '2013.07.11' and
m.mlap_date > adddate(date('2013.07.11'), -28)
then 1
else 0
end) > 1
This produces a slightly different result from your query. Instead of looking at the 28 days before each record, it looks at all records in the week period and then at the four weeks before that period. Despite this subtle difference, it is still identifying dupes in the four-week period before the one-week period.

Sum in Decode statement?

I have to provide counts for different activities. Some of the columns come from one table, but a few I have to get by joining other tables. In the last column I have to add the counts *of 5 columns together in one field*. Please see my query below and advise on the best way to achieve my results :)
SELECT web. OID,web. MARKETING_GROUP,
SUM(DECODE(WEB.EVENT_TYPE,5,WEB.ACTIVITY_COUNT,0)) AS DISCUSSIONCOMMENT ,
SUM(DECODE(WEB.EVENT_TYPE,6,WEB.ACTIVITY_COUNT,0)) AS DISCUSSIONSTART ,
SUM(DECODE(WEB.EVENT_TYPE,7,WEB.ACTIVITY_COUNT,0)) AS DISCUSSIONVIEW,
SUM(case when o.when _clicked is not null then c(*))AS clickcount,
**SUM(case when WEB.EVENT_TYPE in(5,6,7,8)then WEB.ACTIVITY_COUNT +c.count(*))as Total** --------- This is where I am getting stuck and confused???
from GMMI_AIR.WEB_ACTIVITY_FCT WEB join GMMI_AIR.COUPON_WEB_ACTIVITY C
on WEB.OID_WEB_ACTIVITY_FCT = C.OID_COUPON_WEB_ACTIVITY
I am really stuck, as the joined table doesn't have a field named activity_count or activity_type, so to see the counts I can only do count(*). Adding that to all the other fields in one column and summing it up is very confusing.
Any help?
SELECT web.OID, web.MARKETING_GROUP,
SUM(DECODE(WEB.EVENT_TYPE,5,WEB.ACTIVITY_COUNT,0)) AS DISCUSSIONCOMMENT,
SUM(DECODE(WEB.EVENT_TYPE,6,WEB.ACTIVITY_COUNT,0)) AS DISCUSSIONSTART,
SUM(DECODE(WEB.EVENT_TYPE,7,WEB.ACTIVITY_COUNT,0)) AS DISCUSSIONVIEW,
-- assumes when_clicked is a column of the joined table C (the alias "o" in the question is not defined)
SUM(CASE WHEN C.WHEN_CLICKED IS NOT NULL THEN 1 ELSE 0 END) AS clickcount,
-- the CASE must be closed with END inside the SUM; COUNT(*) is then added outside it
SUM(CASE WHEN WEB.EVENT_TYPE IN (5,6,7,8) THEN WEB.ACTIVITY_COUNT ELSE 0 END) + COUNT(*) AS Total
FROM GMMI_AIR.WEB_ACTIVITY_FCT WEB JOIN GMMI_AIR.COUPON_WEB_ACTIVITY C
ON WEB.OID_WEB_ACTIVITY_FCT = C.OID_COUPON_WEB_ACTIVITY
GROUP BY web.OID, web.MARKETING_GROUP;