Using BigQuery, how to use a conditional statement involving dates - google-bigquery

I need a column that counts the number of days, i.e. (Order Date - 01Jan2020).
The condition is: if Order Date lies between 01Jan2020 and 31Mar2020
then (DATE_DIFF('2020-01-31', Order Date, DAY))
else 0
Question: how do I write this conditional statement in BigQuery?
Table -
Customer ID | Order Date
298 | 2020-02-28
78 | 2020-04-02
31 | 2021-01-09
345 | 2021-09-09
74 | 2020-01-20
I tried -
if((Order Date <'2020-01-01') and (Order Date >'2020-03-31'),(DATE_DIFF('2020-01-31' ,Order Date, DAY)
,0))

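Try a SELECT statement with a WHERE clause. This returns only the rows inside the date range, so rows outside it are dropped rather than shown with 0: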
SELECT id, orderdate, DATE_DIFF('2020-01-31', orderdate, DAY) as datediff FROM `yourdataset.ordertable`
WHERE orderdate BETWEEN '2020-01-01' AND '2020-03-31';
Output:
id orderdate datediff
298 2020-02-28 -28
74 2020-01-20 11
To keep every row and return 0 outside the range, use a CASE expression:
SELECT
id,
orderdate,
CASE
WHEN (orderdate BETWEEN '2020-01-01' AND '2020-03-31') THEN DATE_DIFF('2020-01-31', orderdate, DAY)
ELSE 0
END AS `datediff`
FROM `yourdataset.ordertable`
Output:
id orderdate datediff
78 2020-04-02 0
298 2020-02-28 -28
345 2021-09-09 0
31 2021-01-09 0
74 2020-01-20 11
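The CASE logic above can be checked outside BigQuery with a small Python sketch. It only illustrates the arithmetic and is not BigQuery code: the function name days_to_cutoff and the constants are hypothetical, and the dates are the sample rows from the question.

```python
from datetime import date

# Assumed constants, taken from the condition stated in the question.
RANGE_START = date(2020, 1, 1)
RANGE_END = date(2020, 3, 31)
CUTOFF = date(2020, 1, 31)

def days_to_cutoff(order_date):
    """Mirror CASE WHEN order_date BETWEEN start AND end THEN DATE_DIFF(...) ELSE 0."""
    if RANGE_START <= order_date <= RANGE_END:  # BETWEEN is inclusive on both ends
        return (CUTOFF - order_date).days       # negative when the order is after the cutoff
    return 0

# Customer ID -> Order Date, as listed in the question.
orders = {298: date(2020, 2, 28), 78: date(2020, 4, 2), 31: date(2021, 1, 9),
          345: date(2021, 9, 9), 74: date(2020, 1, 20)}
for customer_id, order_date in orders.items():
    print(customer_id, order_date, days_to_cutoff(order_date))
```

The negative value for customer 298 is expected, because 2020-02-28 is inside the range but later than the 2020-01-31 cutoff.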

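Try the syntax below. Once it runs, replace current_date with Order_Date. Note that the range check needs >= and <= joined by AND; a date can never be both before '2020-01-01' and after '2020-03-31', so the original condition is always false.
select
if(current_date >= '2020-01-01' and current_date <= '2020-03-31', DATE_DIFF('2020-01-31', current_date, DAY), 0);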

Related

Add first and last date of a sequence

I am working on a database that has a huge number of rows. I want to update it so that repeated records are removed. I have a date column in the table and I want to convert it into startDate and endDate. Please check:
id | date | price | minutes | prefixId | sellerId | routeTypeId
1234 2020-01-01 0.123 0 1 1 1
1235 2020-01-04 0.123 0 1 1 1
1236 2020-01-05 0.123 123 1 1 1
1237 2020-01-06 0.123 31 1 1 1
1238 2020-01-07 0.123 23 1 1 1
1239 2020-01-08 0.130 41 1 2 1
1240 2020-01-09 0.130 0 1 1 1
What I am looking for is:
id | startDate | endDate | price | minutes | prefixId | sellerId | routeTypeId
1234 2020-01-01 2020-01-01 0.123 0 1 1 1
1235 2020-01-04 2020-01-07 0.123 0 1 1 1
1239 2020-01-08 2020-01-08 0.130 41 1 2 1
1240 2020-01-09 2020-01-09 0.130 0 1 2 2
Dates are considered part of one series if price, prefixId, sellerId and routeTypeId are the same as in the previous row and the date column forms a sequence without any gap between dates. For example, 2020-01-01, 2020-01-02, 2020-01-10 form two different series.
This is a gaps-and-islands problem. You can use lag() and a cumulative sum:
select min(id) as id,
min(date) as startDate, max(date) as endDate,
price, min(minutes) as minutes, prefixId, sellerId, routeTypeId
from (select t.*,
-- running count of island starts, kept inside each attribute group
sum(case when prev_date = date - interval '1 day' then 0 else 1 end)
over (partition by price, prefixId, sellerId, routeTypeId order by date) as grp
from (select t.*,
lag(date) over (partition by price, prefixId, sellerId, routeTypeId order by date) as prev_date
from t
) t
) t
group by grp, price, prefixId, sellerId, routeTypeId
This is a "Gaps & Islands" problem. You can do it using:
select
min(id) as id,
min(date) as start_date,
max(date) as end_date,
min(price) as price,
...
from (
select *,
sum(inc) over(order by id) as grp
from (
select *,
case when price = lag(price) over(order by id)
and date = lag(date) over(
partition by price, prefixId, sellerId, routeTypeId
order by id)
+ interval '1 day'
then 0 else 1 end as inc
from t
) x
) y
group by grp
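The flag-and-running-sum idea behind these queries can be traced by hand with a minimal Python sketch. It assumes the "same as the previous row" reading of the question, uses the sample rows as given, and the helper name islands is hypothetical.

```python
from datetime import date, timedelta

# Sample rows from the question:
# (id, date, price, minutes, prefixId, sellerId, routeTypeId)
rows = [
    (1234, date(2020, 1, 1), 0.123, 0, 1, 1, 1),
    (1235, date(2020, 1, 4), 0.123, 0, 1, 1, 1),
    (1236, date(2020, 1, 5), 0.123, 123, 1, 1, 1),
    (1237, date(2020, 1, 6), 0.123, 31, 1, 1, 1),
    (1238, date(2020, 1, 7), 0.123, 23, 1, 1, 1),
    (1239, date(2020, 1, 8), 0.130, 41, 1, 2, 1),
    (1240, date(2020, 1, 9), 0.130, 0, 1, 1, 1),
]

def islands(rows):
    """Collapse rows into (first_id, start_date, end_date, key) islands."""
    result = []
    prev_key, prev_date = None, None
    for row_id, day, price, _minutes, prefix, seller, route in sorted(rows, key=lambda r: r[1]):
        key = (price, prefix, seller, route)
        # Same test as the SQL flag: continue the island only when the attributes
        # match the previous row and the date is exactly one day later.
        if key == prev_key and day == prev_date + timedelta(days=1):
            result[-1][2] = day                      # extend the current island
        else:
            result.append([row_id, day, day, key])   # start a new island
        prev_key, prev_date = key, day
    return [tuple(r) for r in result]

for first_id, start, end, key in islands(rows):
    print(first_id, start, end, key)
```

On the sample data this yields four islands starting at ids 1234, 1235, 1239 and 1240, with 1235 running from 2020-01-04 to 2020-01-07.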

Customized week number with date

How can I make week 1 start on 5/1/2016 and week 52 end on 4/30/2017? For example, 5/1/2016 - 5/7/2016 should be week 1. Can SQL Server count weeks with respect to a fiscal year? Is this possible? I am using SQL Server 2016.
select order_date,
datepart(week, order_date) as weekorder, product_code
from my_table
where order_date > '4/30/2016'
and order_date < '5/1/2017'
Current output:
order_date weekorder product_code
2017-03-01 9 16PSS
2016-11-26 48 16PZS
2016-11-18 47 16PSST
2016-05-31 23 16PRS
Requested:
order_date weekorder product_code
2017-03-01 47 16PSS
2016-11-26 22 16PZS
2016-11-18 21 16PSST
2016-05-31 5 16PRS
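Well, you can use date arithmetic. Count the days from the first day of the fiscal year, divide by 7 (integer division) and add 1, so that 5/1/2016 - 5/7/2016 is week 1:
select order_date,
datediff(day, '2016-05-01', order_date) / 7 + 1 as weekorder,
product_code
from my_table
where order_date > '2016-04-30' and
order_date < '2017-05-01';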
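As a quick check of the week arithmetic, here is a minimal Python sketch that counts 7-day blocks from an assumed fiscal start of 2016-05-01, with that day beginning week 1. The name fiscal_week is hypothetical and the sketch is not SQL Server code.

```python
from datetime import date

FISCAL_START = date(2016, 5, 1)  # assumed first day of week 1, from the question

def fiscal_week(order_date, start=FISCAL_START):
    """Week number counted in 7-day blocks from the fiscal start date (week 1 = first 7 days)."""
    return (order_date - start).days // 7 + 1

for d in (date(2016, 5, 1), date(2016, 5, 7), date(2016, 5, 8),
          date(2016, 5, 31), date(2017, 4, 30)):
    print(d, fiscal_week(d))
```

One consequence worth noting: the period 5/1/2016 to 4/30/2017 has 365 days, which is 52 full weeks plus one day, so with strict 7-day weeks 4/30/2017 lands in week 53 rather than week 52.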

Sum Based on Date

I currently have this code, with which I want to sum every quantity based on the year. I wrote it expecting it to sum all the charges in 2016 and 2017, but it isn't working correctly.
I added two different PARTITION BY clauses to test whether either would work, and neither does. When I take them out, the Annual column just shows me the quantity for that specific receipt.
Here is my current code:
SELECT
ReceiptNumber
,Quantity
,Date
,sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER (PARTITION BY Date)
as Annual2016
,sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER (PARTITION BY ReceiptNumber)
as Annual2017
FROM Table1
GROUP BY ReceiptNumber, Quantity, Date
I would like my data to look like this
ReceiptNumber Quantity Date Annual2016 Annual2017
1 5 2016-01-05 17 13
2 11 2017-04-03 17 13
3 12 2016-11-11 17 13
4 2 2017-09-09 17 13
Here is a sample of some of the data I am pulling from:
ReceiptNumber Quantity Date
1 5 2016-01-05
2 11 2017-04-03
3 12 2016-11-11
4 2 2017-09-09
5 96 2015-07-08
6 15 2016-12-12
7 24 2016-04-19
8 31 2017-01-02
9 10 2017-04-04
10 18 2015-10-10
11 56 2017-06-02
Try something like this:
Select
..
,sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2016
,sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2017
..
Where Date >= '2016-01-01' and Date < '2018-01-01'
If you want it printed only once at the top then you should run it in a separate query like:
SELECT YEAR(Date) y, sum(Quantity) s FROM Table1 GROUP BY YEAR(Date)
and then do the main query like this:
SELECT * FROM table1
Easy, peasey ... ;-)
Your original question could also be answered with:
SELECT *,
(SELECT SUM(Quantity) FROM Table1 WHERE YEAR(Date)=2016 ) Annual2016,
(SELECT SUM(Quantity) FROM Table1 WHERE YEAR(Date)=2017 ) Annual2017
FROM table1
You need conditional aggregation inside a window aggregate. Simply remove both PARTITION BY clauses, as you're already filtering by year in the CASE:
SELECT
ReceiptNumber
,Quantity
,Date
,sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2016
,sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2017
FROM Table1
You probably don't need the final GROUP BY ReceiptNumber, Quantity, Date.
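To see what SUM(CASE ...) OVER () produces, here is a minimal Python sketch over the eleven sample rows. The helper annual_total is hypothetical, and receipt 9 is assumed to be dated 2017-04-04 (the question prints it as "2017-0404").

```python
from datetime import date

# Sample rows from the question: (ReceiptNumber, Quantity, Date).
# Assumption: receipt 9, printed as "2017-0404", is 2017-04-04.
receipts = [
    (1, 5, date(2016, 1, 5)), (2, 11, date(2017, 4, 3)), (3, 12, date(2016, 11, 11)),
    (4, 2, date(2017, 9, 9)), (5, 96, date(2015, 7, 8)), (6, 15, date(2016, 12, 12)),
    (7, 24, date(2016, 4, 19)), (8, 31, date(2017, 1, 2)), (9, 10, date(2017, 4, 4)),
    (10, 18, date(2015, 10, 10)), (11, 56, date(2017, 6, 2)),
]

def annual_total(rows, year):
    """Conditional sum: add Quantity only for rows whose date falls in the given year."""
    return sum(qty for _, qty, day in rows if day.year == year)

annual_2016 = annual_total(receipts, 2016)
annual_2017 = annual_total(receipts, 2017)

# Like SUM(...) OVER (), the same two totals are attached to every row.
report = [(num, qty, day, annual_2016, annual_2017) for num, qty, day in receipts]
for line in report:
    print(*line)
```

Over all eleven sample rows the totals come to 56 for 2016 and 110 for 2017; the 17 and 13 shown in the question's desired output are the totals of only the four rows listed there.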

SQL: Get date when cumulative sum reaches a mark

I have a table in the following format:
APP_iD| Date | Impressions
113 2015-01-01 10
113 2015-01-02 5
113 2015-01-03 50
113 2015-01-04 35
113 2015-01-05 30
113 2015-01-06 75
Now, I need to know the date when cumulative SUM of those impressions crossed 65/100/150 and so on.
I tried using CASE WHEN statement:
CASE WHEN SUM(impressions) >100
THEN date
but it doesn't sum the data down the column. It only checks each individual row.
My final result should look like:
APP_iD | Date_65 | Date_100 | Date_150
113 2015-01-03 2015-01-04 2015-01-06
Does anyone know how to do this?
Is this even possible?
Use sum() over() to get the running sum and check it against each threshold with a case expression. Then aggregate with min() to get the first date per threshold, with one row per app_id. (The date column is called dt here.)
select app_id, min(dt_65) as date_65, min(dt_100) as date_100, min(dt_150) as date_150
from (
select app_id
-- dt is kept on every row at or past the threshold; min() then picks the first one
,case when sum(impressions) over(partition by app_id order by dt) >= 65 then dt end dt_65
,case when sum(impressions) over(partition by app_id order by dt) >= 100 then dt end dt_100
,case when sum(impressions) over(partition by app_id order by dt) >= 150 then dt end dt_150
from t) x
group by app_id
A PostgreSQL version using DISTINCT ON:
with c as (
select
app_id, date,
sum(impressions) over (partition by app_id order by date) as c
from t
)
select app_id, s65.date, s100.date, s150.date
from
(
select distinct on (app_id) app_id, date
from c
where c >= 65 and c < 100
order by app_id, date
) s65
left join
(
select distinct on (app_id) app_id, date
from c
where c >= 100 and c <150
order by app_id, date
) s100 using (app_id)
left join
(
select distinct on (app_id) app_id, date
from c
where c >= 150
order by app_id, date
) s150 using (app_id)
;
app_id | date | date | date
--------+------------+------------+------------
113 | 2015-01-03 | 2015-01-04 | 2015-01-06
Without the pivot:
select distinct on (app_id, break) app_id, break, date
from (
select *,
case
when c < 100 then 65
when c < 150 then 100
else 150
end as break
from (
select
app_id, date,
sum(impressions) over (partition by app_id order by date) as c
from t
) t
where c >= 65
) t
order by app_id, break, date
;
app_id | break | date
--------+-------+------------
113 | 65 | 2015-01-03
113 | 100 | 2015-01-04
113 | 150 | 2015-01-06
You can try this for the desired result:
with t as (select app_id, date, sum(Impressions)
over (partition by app_id order by date) AS s from tbl)
select app_id,
min(date_65) AS date_65 ,
min(date_100) AS date_100,
min(date_150) AS date_150
-- more columns to observe other sum of Impressions
from
(select app_id,
CASE WHEN (s >= 65 and s < 100) THEN date END AS date_65,
CASE WHEN (s >= 100 and s < 150) THEN date END AS date_100,
CASE WHEN (s >= 150 ) THEN date END AS date_150
-- more cases to observe other sum of Impressions
from t) q
group by q.app_id
If you want to track more impression thresholds, just add more conditions.
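The running-sum idea shared by these answers can be sketched in Python as follows. This is a minimal illustration on the question's sample rows, taking "crossed" to mean the first date on which the running total is at or above the threshold; the function name first_crossing_dates is hypothetical.

```python
from datetime import date
from itertools import accumulate

# Sample rows for app 113 from the question, already ordered by date.
rows = [
    (date(2015, 1, 1), 10), (date(2015, 1, 2), 5), (date(2015, 1, 3), 50),
    (date(2015, 1, 4), 35), (date(2015, 1, 5), 30), (date(2015, 1, 6), 75),
]

def first_crossing_dates(rows, thresholds):
    """Return {threshold: first date on which the running total is >= threshold}."""
    running = list(accumulate(impressions for _, impressions in rows))
    result = {}
    for threshold in thresholds:
        result[threshold] = next(
            (day for (day, _), total in zip(rows, running) if total >= threshold),
            None,  # threshold never reached
        )
    return result

print(first_crossing_dates(rows, [65, 100, 150]))
```

Testing with >= threshold rather than a bounded range also covers the case where one day's impressions jump the running total past two thresholds at once; both thresholds then report the same date.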

SQL select MAX data within each 12 hours

I have a table called temp. In this table I have Date and Value.
Date | Value
2016/04/01 07:00am | 1
2016/04/01 09:00am | 2
2016/04/01 11:00am | 3
...
2016/04/01 07:00pm | 5
2016/04/01 11:00pm | 2
...
2016/04/02 07:00am | 10
2016/04/02 09:00am | 13
2016/04/02 11:00am | 1
...
2016/04/02 07:00pm | 32
2016/04/02 09:00pm | 40
I would like to return:
Date | Value
04/01/2016 11:00am | 3
04/01/2016 07:00pm | 5
04/02/2016 09:00am | 13
04/02/2016 09:00pm | 40
The idea is to group in 12 hour intervals and then find the max value of said group.
So far I have:
SELECT t.date, max(t.value)
FROM temp t
WHERE t.Date between DATEADD(hour, 7, '04/01/2016') and DATEADD(minute, 1859, '04/02/2016')
GROUP BY DATEPART(Hour, t.date)%12, t.date
ORDER BY Date
But it returns all the data, no 12 hour groups.
Any ideas?
You don't want MAX, because you don't want to group by the date; you want the single row whose value is largest in each period. You can use ROW_NUMBER with a PARTITION BY on the calendar date and the AM/PM half of the day, ordered by t.value DESC, and keep the first row of each partition:
SELECT date, value
FROM
(SELECT t.date,
t.value,
ROW_NUMBER()
OVER(PARTITION BY CAST(t.date AS date), CASE WHEN DATEPART(hour, t.date) < 12 THEN 0 ELSE 1 END
ORDER BY t.value DESC) AS rownum
FROM temp t
WHERE t.Date between DATEADD(hour, 7, '04/01/2016') and DATEADD(minute, 1859, '04/02/2016')
) max_val
WHERE max_val.rownum = 1
ORDER BY Date
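The partition-and-pick-the-top-row logic can be mirrored in a short Python sketch. It uses only the rows visible in the question (the elided "..." rows are not reproduced), and the helper name max_per_half_day is hypothetical.

```python
from datetime import datetime

# Rows visible in the question; the elided "..." rows are left out.
readings = [
    (datetime(2016, 4, 1, 7), 1), (datetime(2016, 4, 1, 9), 2), (datetime(2016, 4, 1, 11), 3),
    (datetime(2016, 4, 1, 19), 5), (datetime(2016, 4, 1, 23), 2),
    (datetime(2016, 4, 2, 7), 10), (datetime(2016, 4, 2, 9), 13), (datetime(2016, 4, 2, 11), 1),
    (datetime(2016, 4, 2, 19), 32), (datetime(2016, 4, 2, 21), 40),
]

def max_per_half_day(rows):
    """Keep, for each (calendar date, AM/PM) bucket, the row with the largest value."""
    best = {}
    for stamp, value in rows:
        bucket = (stamp.date(), 0 if stamp.hour < 12 else 1)  # same key as the PARTITION BY
        if bucket not in best or value > best[bucket][1]:
            best[bucket] = (stamp, value)
    return [best[bucket] for bucket in sorted(best)]

for stamp, value in max_per_half_day(readings):
    print(stamp, value)
```

If two rows in a bucket tie on value, this sketch keeps the earlier one it encounters, whereas ROW_NUMBER picks one of the tied rows arbitrarily unless the ORDER BY includes a tie-breaker.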