Postgres query to get data datewise - sql

I am using PostgreSQL.
I have a table like below:
ID  product_id  Date        Qty
-----------------------------------
1   12          2008-06-02  50
2   3           2008-07-12  5
3   12          2009-02-10  25
4   10          2012-11-01  22
5   2           2011-03-25  7
Now I want a result like the one below (i.e. the product-wise sum of the qty field over the last 4 years):
product_id
QTY(current_year)
QTY( current year + last_year)
QTY_last2_years
QTY > 2 years

SELECT product_id
,sum(CASE WHEN mydate >= x.t THEN qty END) AS qty_current_year
,sum(CASE WHEN mydate >= (x.t - interval '1 y') THEN qty END) AS qty_since_last_year
,sum(CASE WHEN mydate >= (x.t - interval '2 y')
AND mydate < x.t THEN qty END) AS qty_last_2_year
,sum(CASE WHEN mydate < (x.t - interval '2 y') THEN qty END) AS qty_older
FROM tbl
CROSS JOIN (SELECT date_trunc('year', now()) AS t) x -- calculate once
GROUP BY 1;
To reuse the calculated beginning of the current year, I CROSS JOIN it as subquery x.
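As a way to sanity-check the bucket boundaries outside the database, here is a minimal Python sketch of the same conditional sums. The function name bucket_quantities and the fixed "today" are assumptions made for illustration; they are not part of the original answer. Unlike the SQL, which yields NULL for an empty bucket, this sketch reports 0.

```python
from collections import defaultdict
from datetime import date


def bucket_quantities(rows, today):
    """Sum qty per product into the four buckets used by the query.

    rows  -- iterable of (product_id, day, qty) tuples
    today -- reference date standing in for now() (an assumption for testing)
    """
    t = date(today.year, 1, 1)             # like date_trunc('year', now())
    one_back = date(today.year - 1, 1, 1)  # t - interval '1 y'
    two_back = date(today.year - 2, 1, 1)  # t - interval '2 y'

    totals = defaultdict(lambda: {
        "current_year": 0,
        "since_last_year": 0,
        "last_2_years": 0,
        "older": 0,
    })
    for product_id, day, qty in rows:
        bucket = totals[product_id]
        if day >= t:
            bucket["current_year"] += qty
        if day >= one_back:
            bucket["since_last_year"] += qty
        if two_back <= day < t:
            bucket["last_2_years"] += qty
        if day < two_back:
            bucket["older"] += qty
    return dict(totals)


# Sample rows from the question; the reference date is hypothetical.
sample_rows = [
    (12, date(2008, 6, 2), 50),
    (3, date(2008, 7, 12), 5),
    (12, date(2009, 2, 10), 25),
    (10, date(2012, 11, 1), 22),
    (2, date(2011, 3, 25), 7),
]
result = bucket_quantities(sample_rows, today=date(2012, 12, 1))
```

With the reference date set to 2012-12-01, product 10 falls into the current-year bucket and products 12 and 3 fall entirely into the "older" bucket.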

Related

Calculate number of workdays PER MONTH from start_date and end_date

So I have a table that looks like this :
task_id | start_date |end_date
I want to calculate the number of workdays (Mondays to Fridays only, ignoring holidays) per month.
For example, if a task ran from 02-01-2022 to 05-02-2022, I need the result to look something like this:
task_id | january |february |march |april .............|december
1 21 4 0 0 .......... 0
You can use the generate_series function to generate one row per day between start_date and end_date, then count those rows with conditional aggregation to build the pivot.
extract returns the month number (month) or the ISO day of the week (isodow, where 1 to 5 are Monday to Friday) from a timestamp, so both can be used as conditions inside the aggregate function.
SELECT t1.task_id,
count(CASE WHEN extract(isodow from dt) BETWEEN 1 AND 5 AND EXTRACT(MONTH from dt) = 1 THEN 1 END) january,
count(CASE WHEN extract(isodow from dt) BETWEEN 1 AND 5 AND EXTRACT(MONTH from dt) = 2 THEN 1 END) february,
count(CASE WHEN extract(isodow from dt) BETWEEN 1 AND 5 AND EXTRACT(MONTH from dt) = 3 THEN 1 END) march
-- more months you can write
FROM T t1
CROSS JOIN generate_series(t1.start_date,t1.end_date,'1 day'::interval) dt
group by t1.task_id
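To check the expected numbers from the question independently of the database, a small Python sketch can walk the same day-by-day series. The function name workdays_per_month is hypothetical, and the example dates are read as DD-MM-YYYY, which is an assumption that happens to match the expected output in the question.

```python
from collections import Counter
from datetime import date, timedelta


def workdays_per_month(start, end):
    """Count Monday-to-Friday days per month number, both ends inclusive."""
    counts = Counter()
    day = start
    while day <= end:
        # isoweekday(): Monday is 1 ... Sunday is 7, like extract(isodow ...)
        if day.isoweekday() <= 5:
            counts[day.month] += 1
        day += timedelta(days=1)
    return counts


# Task from the question: 2 January 2022 to 5 February 2022.
counts = workdays_per_month(date(2022, 1, 2), date(2022, 2, 5))
print(counts[1], counts[2], counts[3])  # prints "21 4 0"
```

Like the SQL, this keys on the month number only, so a task spanning more than one year would add the same month of different years together.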

SQL - In a week get result count of records in that week and count of records ageing 7days from that week

This is redshift SQL
I'm trying to get 2 results for a week:
Total records in that week
Total records ageing greater than 7 days from that week.
Say there are 100 sample records in the format below; in this example there are 7 records per week:
day code week
1/1/2020 P001 1
1/2/2020 P002 1
1/3/2020 P003 1
1/4/2020 P004 1
1/5/2020 P005 2
1/6/2020 P006 2
1/7/2020 P007 2
1/8/2020 P008 2
1/9/2020 P009 2
1/10/2020 P010 2
1/11/2020 P011 2
.....................
4/8/2020 P099 15
Trying to get output like this:
Week  count  count>7 days
1     7      0
2     7      7
3     7      14
4     7      21
15    7      98
Basically, for the latest week, I'm trying to get the distinct number of records ageing more than 7 days. In the actual use case, the number of records per week will vary.
What I've tried:
calendar_week_number,
count(code) as count 1,
count(DISTINCT (case when datediff(day, trunc(completion_date-7), '2020-01-01') then code end)) as count 2,
count(case when completion_date between TO_DATE('20200101','YYYYMMDD') and TO_DATE(completion_date,'YYYYMMDD')-7 then code end) as count 3
from rbsrpt.RBS_DAILY_ASIN_PROC_SNPSHT ul
LEFT JOIN rbsrpt.dim_rbs_time t ON Trunc(ul.completion_date) = trunc(t.cal_date)
where
mp=1
and calendar_year=2020
group by
calendar_week_number
order by calendar_week_number desc
but my output is as below:
week count1 count 2 count 3
51 2866 2866 0
50 3211 3211 0
49 6377 6377 0
48 9013 9013 0
47 5950 5950 0
One option uses lateral joins. It is probably more efficient to aggregate the calendar table by week first, then run the searches against the dataset one week at a time.
Assuming Postgres (since there is no TO_DATE() in MySQL):
select t.cal_date, c1.*, c2.*
from (
select calendar_week_number, min(cal_date) as cal_date
from rbsrpt.dim_rbs_time t
group by calendar_week_number
) t
cross join lateral (
select count(*) as cnt
from rbsrpt.rbs_daily_asin_proc_snpsht r
where r.completion_date >= t.cal_date
and r.completion_date < t.cal_date + interval '7 day'
) c1
cross join lateral (
select count(*) as cnt_aged
from rbsrpt.rbs_daily_asin_proc_snpsht r
where r.completion_date >= t.cal_date - interval '7' day
and r.completion_date < t.cal_date
) c2
This ages out records after 7 days. If you wanted 30 days instead, you would change the where clause of the second subquery:
cross join lateral (
select count(*) as cnt_aged
from rbsrpt.rbs_daily_asin_proc_snpsht r
where r.completion_date >= t.cal_date - interval '30 day'
and r.completion_date < t.cal_date - interval '23 day'
) c2
Edit: if your database does not support lateral joins, you can use subqueries instead:
select t.cal_date,
(
select count(*)
from rbsrpt.rbs_daily_asin_proc_snpsht r
where r.completion_date >= t.cal_date
and r.completion_date < t.cal_date + interval '7 day'
) as cnt,
(
select count(*)
from rbsrpt.rbs_daily_asin_proc_snpsht r
where r.completion_date >= t.cal_date - interval '7' day
and r.completion_date < t.cal_date
) as cnt_aged
from (
select calendar_week_number, min(cal_date) as cal_date
from rbsrpt.dim_rbs_time t
group by calendar_week_number
) t
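The two date windows used by these queries can be illustrated with a short Python sketch. The function name weekly_counts and the one-record-per-day sample data are assumptions for illustration only. Note that, as written, the second window covers only the 7 days immediately before each week start, so it does not produce the cumulative totals (0, 7, 14, 21, ...) shown in the question.

```python
from datetime import date, timedelta


def weekly_counts(week_starts, completion_dates):
    """Return (week_start, cnt, cnt_aged) for each week start.

    cnt      -- records with week_start <= completion_date < week_start + 7 days
    cnt_aged -- records with week_start - 7 days <= completion_date < week_start
    """
    week = timedelta(days=7)
    results = []
    for start in week_starts:
        cnt = sum(1 for d in completion_dates if start <= d < start + week)
        cnt_aged = sum(1 for d in completion_dates if start - week <= d < start)
        results.append((start, cnt, cnt_aged))
    return results


# Hypothetical data: one record per day for three weeks starting 2020-01-01.
dates = [date(2020, 1, 1) + timedelta(days=i) for i in range(21)]
starts = [date(2020, 1, 1), date(2020, 1, 8), date(2020, 1, 15)]
rows = weekly_counts(starts, dates)
```

For this sample, every week has a count of 7, and the aged count is 0 for the first week and 7 for each of the following weeks.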

db2 compare year and month side by side

I need to compare the companies' values side by side: current year vs. last year, and current month vs. the same month of the previous year.
I use this query to get the values
SELECT STORE, SUM(TOTAL) as VAL, DATE FROM MYTABLE
WHERE DATE=CURRENT_DATE GROUP BY STORE ORDER BY STORE
below the results
STORE | VAL | DATE
1 10 CURRENT_DATE (2018-27-03)
1 20 2018-26-03
1 30 2018-25-03
2 20 CURRENT_DATE (2018-27-03)
2 20 2018-26-02
and i need this
STORE | VALUE CURRENT YEAR | VALUE LAST YEAR
1 60 30 (CALCULATED)
2 40 50 (CALCULATED)
STORE | VALUE CURRENT MONTH | VALUE SAME MONTH OF LAST YEAR
1 60 30 (CALCULATED)
2 20 50 (CALCULATED)
Thank you
You could just join two sub-selects together.
E.g with this DDL and Data
CREATE TABLE MYTABLE (STORE int, VAL int, D DATE);
INSERT INTO MYTABLE VALUES
( 1, 10, '2018-03-27')
,( 1, 20, '2018-03-26')
,( 1, 10, '2018-02-25')
,( 1, 35, '2017-03-25')
,( 2, 20, '2018-03-27')
,( 2, 15, '2017-03-26');
This will get you the values for the current month and for the same month last year:
SELECT C.*, LY.VAL_CURR_MONTH_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_MONTH
FROM MYTABLE WHERE INT(D)/100=INT(CURRENT_DATE)/100
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE INT(D)/100 = INT(CURRENT_DATE)/100 -100
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
Then this for years
SELECT C.*, LY.VAL_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_YEAR
FROM MYTABLE WHERE INT(D)/10000=INT(CURRENT_DATE)/10000
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_LY
FROM MYTABLE
WHERE INT(D)/10000 = INT(CURRENT_DATE)/10000 -1
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
P.S. there are many other ways to manipulate dates, but casting to INT is maybe one of the easier ways
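To make the integer-casting trick concrete: INT(D) turns a date into the number yyyymmdd, so integer division by 100 leaves yyyymm and subtracting 100 from that moves back one year. Below is a minimal Python sketch of the month comparison using the sample rows from the DDL above. The helper names and the fixed "today" are assumptions for illustration. Unlike the LEFT JOIN version, this sketch also lists stores that have no rows in the current month.

```python
from datetime import date


def yyyymmdd(d):
    """Mimic INT(date): 2018-03-27 becomes 20180327."""
    return d.year * 10000 + d.month * 100 + d.day


def month_totals(rows, today):
    """Return {store: (current month total, same month last year total)}."""
    current = yyyymmdd(today) // 100  # e.g. 201803
    previous = current - 100          # e.g. 201703
    totals = {}
    for store, val, d in rows:
        cur_sum, prev_sum = totals.get(store, (0, 0))
        year_month = yyyymmdd(d) // 100
        if year_month == current:
            cur_sum += val
        elif year_month == previous:
            prev_sum += val
        totals[store] = (cur_sum, prev_sum)
    return totals


sample = [
    (1, 10, date(2018, 3, 27)),
    (1, 20, date(2018, 3, 26)),
    (1, 10, date(2018, 2, 25)),
    (1, 35, date(2017, 3, 25)),
    (2, 20, date(2018, 3, 27)),
    (2, 15, date(2017, 3, 26)),
]
print(month_totals(sample, today=date(2018, 3, 27)))
# prints {1: (30, 35), 2: (20, 15)}
```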
Also, here is a more flexible way to get the "Same Month of Last Year" value. A similar method can get "last Year" values.
SELECT T.*
, AVG(VAL) OVER(
PARTITION BY STORE
ORDER BY YEAR_MONTH
RANGE BETWEEN 101 PRECEDING AND 100 PRECEDING
) AS SAME_MONTH_PREV_YEAR
FROM
( SELECT STORE
, INTEGER(D)/100 AS YEAR_MONTH
, SUM(VAL) AS VAL
FROM
MYTABLE T
GROUP BY
STORE
, INTEGER(D)/100
) AS T
;
Gives
STORE YEAR_MONTH VAL SAME_MONTH_PREV_YEAR
----- ---------- --- --------------------
    1     201703  35                 NULL
    1     201802  10                 NULL
    1     201803  30                   35
    2     201703  15                 NULL
    2     201803  20                   15
It is better to avoid applying functions to table columns in WHERE clauses. Check the following SQL statements, which are based on P. Vernon's sample table.
Note: These SQLs are for DB2 LUW 11.1
For month:
SELECT STORE,
SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CURR_MONTH,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN val
ELSE 0 END) as VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE D between first_day(current date) and last_day(current date)
or D between first_day(current date - 1 year) and last_day(current date - 1 year)
GROUP BY STORE
ORDER BY STORE
For year:
SELECT STORE, SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CY,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN val
ELSE 0 END) as VAL_LY
FROM MYTABLE
WHERE D between first_day(current date - (month(current date) - 1) months)
and last_day(current date + (12 - month(current date)) months)
or D between first_day(current date - (month(current date) - 1) months - 1 year)
and last_day(current date + (12 - month(current date)) months - 1 year)
GROUP BY STORE
ORDER BY STORE

Sum Based on Date

I currently have this code, with which I want to sum every quantity by year. I wrote code that I thought would sum all the charges in 2016 and 2017, but it isn't working correctly.
I added the two different types of partition by statements to test and see if either would work and they don't. When I take them out, the Annual column just shows me the quantity for that specific receipt.
Here is my current code:
SELECT
ReceiptNumber
,Quantity
,Date
,sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER (PARTITION BY Date)
as Annual2016
,sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER (PARTITION BY ReceiptNumber)
as Annual2017
FROM Table1
GROUP BY ReceiptNumber, Quantity, Date
I would like my data to look like this
ReceiptNumber Quantity Date Annual2016 Annual2017
1 5 2016-01-05 17 13
2 11 2017-04-03 17 13
3 12 2016-11-11 17 13
4 2 2017-09-09 17 13
Here is a sample of some of the data I am pulling from:
ReceiptNumber Quantity Date
1 5 2016-01-05
2 11 2017-04-03
3 12 2016-11-11
4 2 2017-09-09
5 96 2015-07-08
6 15 2016-12-12
7 24 2016-04-19
8 31 2017-01-02
9 10 2017-04-04
10 18 2015-10-10
11 56 2017-06-02
Try something like this
Select
..
sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2016,
sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2017
..
Where Date >= '2016-01-01' and Date < '2018-01-01'
If you want it printed only once at the top then you should run it in a separate query like:
SELECT YEAR(Date) y, sum(Quantity) s FROM Table1 GROUP BY YEAR(Date)
and then do the main query like this:
SELECT * FROM table1
Easy, peasey ... ;-)
Your original question could also be answered with:
SELECT *,
(SELECT SUM(Quantity) FROM Table1 WHERE YEAR(Date)=2016 ) Annual2016,
(SELECT SUM(Quantity) FROM Table1 WHERE YEAR(Date)=2017 ) Annual2017
FROM table1
You need conditional aggregation over a window aggregate. Simply remove both PARTITION BY clauses, as you're already filtering the year in the CASE:
SELECT
ReceiptNumber
,Quantity
,Date
,sum(CASE WHEN (Date >= '2016-01-01' and Date < '2017-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2016
,sum(CASE WHEN (Date >= '2017-01-01' and Date < '2018-01-01') THEN
Quantity
ELSE 0 END)
OVER () as Annual2017
FROM Table1
You probably don't need the final GROUP BY ReceiptNumber, Quantity, Date
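The effect of an empty OVER () can be pictured as computing each total once over all rows and then attaching it to every row. Here is a minimal Python sketch of that idea, using the four rows from the desired output in the question. The function name add_annual_totals is hypothetical.

```python
from datetime import date


def add_annual_totals(rows):
    """Append the 2016 and 2017 quantity totals to every row.

    rows -- list of (receipt_number, quantity, day) tuples
    """
    # Computed once over the whole input, like sum(CASE ...) OVER ()
    total_2016 = sum(qty for _, qty, day in rows if day.year == 2016)
    total_2017 = sum(qty for _, qty, day in rows if day.year == 2017)
    return [
        (receipt, qty, day, total_2016, total_2017)
        for receipt, qty, day in rows
    ]


receipts = [
    (1, 5, date(2016, 1, 5)),
    (2, 11, date(2017, 4, 3)),
    (3, 12, date(2016, 11, 11)),
    (4, 2, date(2017, 9, 9)),
]
for row in add_annual_totals(receipts):
    print(row)  # every row ends with 17 and 13
```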

GROUP BY next months over N years

I need to aggregate amounts grouped by "horizon", i.e. successive windows of the next 12 months over 5 years.
Assuming today is 2015-08-15:
SUM amount from 0 to 12 next months (from 2015-08-16 to 2016-08-15)
SUM amount from 12 to 24 next months (from 2016-08-16 to 2017-08-15)
SUM amount from 24 to 36 next months ...
SUM amount from 36 to 48 next months
SUM amount from 48 to 60 next months
Here is a fiddled dataset example:
+----+------------+--------+
| id | date | amount |
+----+------------+--------+
| 1 | 2015-09-01 | 10 |
| 2 | 2015-10-01 | 10 |
| 3 | 2016-10-01 | 10 |
| 4 | 2017-06-01 | 10 |
| 5 | 2018-06-01 | 10 |
| 6 | 2019-05-01 | 10 |
| 7 | 2019-04-01 | 10 |
| 8 | 2020-04-01 | 10 |
+----+------------+--------+
Here is the expected result:
+---------+--------+
| horizon | amount |
+---------+--------+
| 1 | 20 |
| 2 | 20 |
| 3 | 10 |
| 4 | 20 |
| 5 | 10 |
+---------+--------+
How can I get these 12 next months grouped "horizons" ?
I tagged PostgreSQL but I'm actually using an ORM so it's just to find the idea. (by the way I don't have access to the date formatting functions)
I would split by 12 months time frame and group by this:
SELECT
FLOOR(
(EXTRACT(EPOCH FROM date) - EXTRACT(EPOCH FROM now()))
/ EXTRACT(EPOCH FROM INTERVAL '12 month')
) + 1 AS "horizon",
SUM(amount) AS "amount"
FROM dataset
GROUP BY horizon
ORDER BY horizon;
Inspired by: Postgresql SQL GROUP BY time interval with arbitrary accuracy (down to milli seconds)
Assuming you need intervals from current date to this day next year and so on, I would query this like this:
SELECT 1 AS horizon, SUM(amount) FROM dataset
WHERE date > now()
AND date < (now() + '12 months'::INTERVAL)
UNION
SELECT 2 AS horizon, SUM(amount) FROM dataset
WHERE date > (now() + '12 months'::INTERVAL)
AND date < (now() + '24 months'::INTERVAL)
UNION
SELECT 3 AS horizon, SUM(amount) FROM dataset
WHERE date > (now() + '24 months'::INTERVAL)
AND date < (now() + '36 months'::INTERVAL)
UNION
SELECT 4 AS horizon, SUM(amount) FROM dataset
WHERE date > (now() + '36 months'::INTERVAL)
AND date < (now() + '48 months'::INTERVAL)
UNION
SELECT 5 AS horizon, SUM(amount) FROM dataset
WHERE date > (now() + '48 months'::INTERVAL)
AND date < (now() + '60 months'::INTERVAL)
ORDER BY horizon;
You can generalize it and make something like this using additional variable:
SELECT number AS horizon, SUM(amount) FROM dataset
WHERE date > (now() + ((number - 1) * '12 months'::INTERVAL))
AND date < (now() + (number * '12 months'::INTERVAL));
Where number is an integer from range [1,5]
Here is what I get from the Fiddle:
| horizon | sum |
|---------|-----|
| 1 | 20 |
| 2 | 20 |
| 3 | 10 |
| 4 | 20 |
| 5 | 10 |
Perhaps CTE?
WITH RECURSIVE grps AS
(
SELECT 1 AS Horizon, (date '2015-08-15') + interval '1' day AS FromDate, (date '2015-08-15') + interval '1' year AS ToDate
UNION ALL
SELECT Horizon + 1, ToDate + interval '1' day AS FromDate, ToDate + interval '1' year
FROM grps WHERE Horizon < 5
)
SELECT
Horizon,
(SELECT SUM(amount) FROM dataset WHERE date BETWEEN g.FromDate AND g.ToDate) AS SumOfAmount
FROM
grps g
Rather simply:
SELECT horizon, sum(amount) AS amount
FROM generate_series(1, 5) AS s(horizon)
JOIN dataset ON "date" >= current_date + (horizon - 1) * interval '1 year'
AND "date" < current_date + horizon * interval '1 year'
GROUP BY horizon
ORDER BY horizon;
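The half-open windows used in this join can be checked against the sample dataset with a short Python sketch. The names add_years and sum_by_horizon are hypothetical, and the reference date is fixed at 2015-08-15 as assumed in the question rather than taken from the system clock. The date shift assumes the reference date is not 29 February.

```python
from datetime import date


def add_years(d, years):
    """Shift a date by whole years (assumes d is not 29 February)."""
    return d.replace(year=d.year + years)


def sum_by_horizon(rows, ref, horizons=5):
    """Sum amounts per 12-month horizon starting at ref.

    Horizon h covers ref + (h - 1) years <= day < ref + h years.
    Horizons without any matching row are omitted, as with the inner join.
    """
    totals = {}
    for h in range(1, horizons + 1):
        lower = add_years(ref, h - 1)
        upper = add_years(ref, h)
        amounts = [amount for day, amount in rows if lower <= day < upper]
        if amounts:
            totals[h] = sum(amounts)
    return totals


dataset = [
    (date(2015, 9, 1), 10),
    (date(2015, 10, 1), 10),
    (date(2016, 10, 1), 10),
    (date(2017, 6, 1), 10),
    (date(2018, 6, 1), 10),
    (date(2019, 5, 1), 10),
    (date(2019, 4, 1), 10),
    (date(2020, 4, 1), 10),
]
print(sum_by_horizon(dataset, ref=date(2015, 8, 15)))
# prints {1: 20, 2: 20, 3: 10, 4: 20, 5: 10}
```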
You need a union and an aggregate function:
select 1 as horizon,
sum(amount) amount
from the_table
where date >= current_date
and date < current_date + interval '12' month
union all
select 2 as horizon,
sum(amount) amount
from the_table
where date >= current_date + interval '12' month
and date < current_date + interval '24' month
union all
select 3 as horizon,
sum(amount) amount
from the_table
where date >= current_date + interval '24' month
and date < current_date + interval '36' month
... and so on ...
I don't know how to do that through an obfuscation layer (aka ORM), but I'm sure it supports (or at least should support) aggregation and unions.
This could easily be wrapped up into a PL/PgSQL function where you pass the "horizon" and the SQL is built dynamically so that all you need to call is something like: select * from sum_horizon(5) where 5 indicates the number of years.
Btw: date is a horrible name for a column. For one because it's a reserved word, but more importantly because it doesn't document the meaning of the column. Is it a "release date"? A "due date"? An "order date"?
Try this
select
id,
sum(case when date >= current_date and date < current_date + interval '1 year' then amount else 0 end) as year1,
sum(case when date >= current_date + interval '1 year' and date < current_date + interval '2 year' then amount else 0 end) as year2,
sum(case when date >= current_date + interval '2 year' and date < current_date + interval '3 year' then amount else 0 end) as year3,
sum(case when date >= current_date + interval '3 year' and date < current_date + interval '4 year' then amount else 0 end) as year4,
sum(case when date >= current_date + interval '4 year' and date < current_date + interval '5 year' then amount else 0 end) as year5
from dataset
group by id