Curious if there are any methods to sum a total based on a weekly classification for a n day period aside from union - sql

I am looking to sum a total based on a case that ranges over a week, using the query below, and accumulate it over a 90-day period. I can currently accomplish this by limiting the dates in each subselect and combining them with UNION; however, is there another way?
The given query covers only 2 weeks; I would have to keep adding UNIONed subselects to cover all 90 days.
select comp_id, sum(total) from (
(
SELECT CASE
WHEN AVG(amount) < 10 THEN 0
WHEN COUNT(p_id) < SUM(amount)*.5
THEN SUM(amount)*.5
ELSE COUNT(p_id)
END as total, avg(amount), comp_id
FROM p_container INNER JOIN chg ON chg_p_id = p_id
INNER JOIN c_type ON c_type_id = chtype_id
where correction_name like '%correction word%'
AND p_date BETWEEN GETDATE () - 9 AND GETDATE () - 2
group by comp_id
) UNION ALL (
SELECT CASE
WHEN AVG(amount) < 5 THEN 0
WHEN COUNT(p_id) < SUM(amount)*.06
THEN SUM(amount)*.06
ELSE COUNT(p_id)
END as total, avg(amount), comp_id
FROM p_container INNER JOIN chg ON chg_p_id = p_id
INNER JOIN c_type ON c_type_id = chtype_id
where correction_name like '%correction word%'
AND p_date BETWEEN GETDATE () - 17 AND GETDATE () - 10
group by comp_id
)) group by comp_id

You can use a case expression:
SELECT (CASE WHEN p_date BETWEEN GETDATE() - 9 AND GETDATE() - 2 THEN 'group1'
WHEN p_date BETWEEN GETDATE() - 17 AND GETDATE() - 10 THEN 'group2'
END) as grp,
(CASE WHEN AVG(amount) < 10 THEN 0
WHEN COUNT(p_id) < SUM(amount)*0.5 THEN SUM(amount)*0.5
ELSE COUNT(p_id)
END) as total, AVG(amount),
comp_id
FROM p_container INNER JOIN
chg
ON chg_p_id = p_id INNER JOIN
c_type
ON c_type_id = chtype_id
WHERE correction_name like '%correction word%' AND
p_date >= GETDATE() - 17
GROUP BY grp, comp_id;
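To stretch this to the full 90 days without writing one CASE branch per week, one option is to compute the week bucket arithmetically and aggregate twice. Below is a minimal sketch of that idea using Python's sqlite3. The flattened `charges` table and its `days_ago` column are hypothetical stand-ins for the joined tables and for GETDATE() - p_date, and the sketch applies the first week's thresholds to every week, which is an assumption because the second branch in the question uses different constants.

```python
import sqlite3

# Hypothetical flattened table standing in for the joined p_container/chg/c_type rows.
# days_ago plays the role of GETDATE() - p_date.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE charges (comp_id INTEGER, p_id INTEGER, amount REAL, days_ago INTEGER)"
)
conn.executemany(
    "INSERT INTO charges VALUES (?, ?, ?, ?)",
    [
        (1, 1, 20.0, 2), (1, 2, 20.0, 5), (1, 3, 20.0, 8),  # comp 1, week 0
        (1, 4, 4.0, 9), (1, 5, 4.0, 12),                    # comp 1, week 1
        (2, 6, 10.0, 3),                                    # comp 2, week 0
        (2, 7, 500.0, 95),                                  # outside the 90-day range
    ],
)

sql = """
SELECT comp_id, SUM(total) AS total
FROM (
    SELECT comp_id,
           (days_ago - 2) / 7 AS week_no,  -- integer division: 0 for days 2-8, 1 for 9-15, ...
           CASE
               WHEN AVG(amount) < 10 THEN 0
               WHEN COUNT(p_id) < SUM(amount) * 0.5 THEN SUM(amount) * 0.5
               ELSE COUNT(p_id)
           END AS total
    FROM charges
    WHERE days_ago BETWEEN 2 AND 91        -- 90 days, starting 2 days back
    GROUP BY comp_id, (days_ago - 2) / 7   -- one row per company per week
) AS weekly
GROUP BY comp_id                           -- add the weekly totals together
ORDER BY comp_id
"""
weekly_totals = conn.execute(sql).fetchall()
print(weekly_totals)
```

The inner query produces one classified total per company and week, and the outer query sums them, which is what the chain of UNION ALL branches was doing by hand.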

Related

How to get difference in value over a sliding time window?

I'm attempting to write a SQL query which returns every product where the most recent price on an order within the last 30 days is different than the most recent price in the previous 30 days, and that calculated variance. I'm currently using PostgreSQL 11.
Data Model
Right now, the data is structured into three tables: orders, products, and a pivot table, order_product. Here is the simplified version of the table structure:
Orders
id  order_date
1   2022-01-15
2   2022-02-15
3   2022-03-08
Products
id  name
1   Some product
2   Another product
3   Yet another product
Order_Product
order_id  product_id  unit_price
1         1           10
1         2           20
1         3           10
2         1           12
2         2           20
2         3           5
3         1           15
Desired Output
The desired output would be something like the following:
id  name                 order_date  latest_unit_price  previous_unit_price  variance
1   Some product         2022-03-08  15                 10                   5
3   Yet another product  2022-02-15  5                  10                   -5
What I've done so far
I've been able to write a join that combines the Orders and Products via the order_product table, within the 60-day window, which is seemingly the easy part:
SELECT
"products"."id",
"products"."name",
"order_product"."unit_price",
"orders"."order_date"
FROM
products
JOIN order_product ON products.id = order_product.product_id
JOIN orders ON order_product.order_id = orders.id
WHERE
order_date BETWEEN now() - INTERVAL '60 days'
AND now()
I've been trying to work with RANK() and LAG(); however, where I'm getting stuck is ranking the rows within each 30-day time window and then calculating the variance between the two windows.
Any help would be much appreciated!
Update: Added solution
Building off of the answer by D-Shih, I had to tweak this to work based on the time window starting from the current date:
WITH CTE AS (
SELECT
"products"."id",
"products"."name",
"order_product"."unit_price",
"orders"."order_date"
FROM
products
JOIN order_product ON products.id = order_product.product_id
JOIN orders ON order_product.order_id = orders.id
WHERE
order_date BETWEEN now() - INTERVAL '60 days' AND now()
),
CTE2 AS (
SELECT
*,
EXTRACT(DAYS FROM now() - order_date :: timestamp) gap_days
FROM
CTE
),
CTE3 AS (
SELECT
*,
(CASE WHEN gap_days < 30 THEN 1 ELSE 0 END) grp
FROM
CTE2
)
SELECT
id,
name,
MAX(CASE WHEN grp = 1 THEN order_date END) order_date,
MAX(CASE WHEN grp = 1 THEN unit_price END) latest_unit_price,
MAX(CASE WHEN grp = 0 THEN unit_price END) previous_unit_price,
SUM(CASE WHEN grp = 1 THEN unit_price ELSE - unit_price END) variance
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ID, grp ORDER BY order_date DESC) rn
FROM
CTE3
) t1
WHERE
rn = 1
GROUP BY
id,
name
HAVING
MAX(CASE WHEN grp = 1 THEN unit_price END) <> MAX(CASE WHEN grp = 0 THEN unit_price END)
You can use EXTRACT with the LAG window function to get the difference in days between each order_date and the previous order_date for each product id.
Then use a running SUM window function over those gaps to assign the group:
grp = 1: orders more than 30 days after the product's first order in the window (the most recent price)
grp = 0: the earlier orders (the previous price)
The query would look like this:
WITH CTE AS (
SELECT "products"."id",
"products"."name",
"order_product"."unit_price",
"orders"."order_date"
FROM
products
JOIN order_product ON products.id = order_product.product_id
JOIN orders ON order_product.order_id = orders.id
WHERE
order_date BETWEEN now() - INTERVAL '60 days'
AND now()
), CTE2 AS (
SELECT *,EXTRACT(DAYS FROM order_date - LAG(order_date,1,order_date) OVER(PARTITION BY id ORDER BY order_date)) gap_days
FROM CTE
), CTE3 AS (
SELECT *,(CASE WHEN SUM(gap_days) OVER(PARTITION BY id ORDER BY order_date) > 30 THEN 1 ELSE 0 END) grp
FROM CTE2
)
SELECT id,
name,
MAX(CASE WHEN grp = 1 THEN order_date END) order_date,
MAX(CASE WHEN grp = 1 THEN unit_price END) latest_unit_price,
MAX(CASE WHEN grp = 0 THEN unit_price END) previous_unit_price,
SUM(CASE WHEN grp = 1 THEN unit_price ELSE - unit_price END) variance
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY ID,grp ORDER BY order_date DESC) rn
FROM CTE3
) t1
WHERE rn = 1
GROUP BY id,
name
HAVING MAX(CASE WHEN grp = 1 THEN unit_price END) <> MAX(CASE WHEN grp = 0 THEN unit_price END)
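For anyone who wants to try the two-window idea locally, here is a minimal sketch using Python's sqlite3 with the sample rows from the question. It is not the PostgreSQL query above: it assumes a fixed reference date of 2022-03-10 in place of now() (a value picked only so the sample data falls into both windows), uses julianday() for the day arithmetic, and needs SQLite 3.25 or later for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, order_date TEXT);
CREATE TABLE products (id INTEGER, name TEXT);
CREATE TABLE order_product (order_id INTEGER, product_id INTEGER, unit_price INTEGER);
INSERT INTO orders VALUES (1, '2022-01-15'), (2, '2022-02-15'), (3, '2022-03-08');
INSERT INTO products VALUES
    (1, 'Some product'), (2, 'Another product'), (3, 'Yet another product');
INSERT INTO order_product VALUES
    (1, 1, 10), (1, 2, 20), (1, 3, 10),
    (2, 1, 12), (2, 2, 20), (2, 3, 5),
    (3, 1, 15);
""")

sql = """
WITH priced AS (
    SELECT p.id, p.name, op.unit_price, o.order_date,
           -- grp = 1: last 30 days, grp = 0: the 30 days before that
           CASE WHEN julianday(:today) - julianday(o.order_date) < 30
                THEN 1 ELSE 0 END AS grp
    FROM products p
    JOIN order_product op ON op.product_id = p.id
    JOIN orders o ON o.id = op.order_id
    WHERE julianday(:today) - julianday(o.order_date) BETWEEN 0 AND 60
),
ranked AS (
    -- rn = 1 marks the most recent order of each product within each window
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id, grp ORDER BY order_date DESC) AS rn
    FROM priced
)
SELECT id, name,
       MAX(CASE WHEN grp = 1 THEN order_date END) AS order_date,
       MAX(CASE WHEN grp = 1 THEN unit_price END) AS latest_unit_price,
       MAX(CASE WHEN grp = 0 THEN unit_price END) AS previous_unit_price,
       MAX(CASE WHEN grp = 1 THEN unit_price END)
         - MAX(CASE WHEN grp = 0 THEN unit_price END) AS variance
FROM ranked
WHERE rn = 1
GROUP BY id, name
HAVING MAX(CASE WHEN grp = 1 THEN unit_price END)
    <> MAX(CASE WHEN grp = 0 THEN unit_price END)
ORDER BY id
"""
changed = conn.execute(sql, {"today": "2022-03-10"}).fetchall()
for row in changed:
    print(row)
```

With that reference date the sketch returns products 1 and 3, matching the desired output in the question; product 2 is dropped because its price is 20 in both windows.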

Select data where sum for last 7 from max-date is greater than x

I have a data set as such:
Date Value Type
2020-06-01 103 B
2020-06-01 100 A
2020-06-01 133 A
2020-06-11 150 A
2020-07-01 1000 A
2020-07-21 104 A
2020-07-25 140 A
2020-07-28 1600 A
2020-08-01 100 A
The desired output looks like this:
Type ISHIGH
A 1
B 0
Here's the query I tried:
select type, case when sum(value) > 10 then 1 else 0 end as total_usage
from table_a
where (select sum(value) as usage from tableA where date = max(date)-7)
group by type, date
This is clearly not right. What is a simple way to do this?
It is a simple GROUP BY, except that you need to be able to access the max date before grouping:
select type
, max(date) as last_usage_date
, sum(value) as total_usage
, case when sum(case when date >= cutoff_date then value end) >= 1000 then 'y' end as [is high!]
from t
cross apply (
select dateadd(day, -6, max(date))
from t as x
where x.type = t.type
) as ca(cutoff_date)
group by type, cutoff_date
If you want just those two columns then a simpler approach is:
select t.type, case when sum(value) >= 1000 then 'y' end as [is high!]
from t
left join (
select type, dateadd(day, -6, max(date)) as cutoff_date
from t
group by type
) as a on t.type = a.type and t.date >= a.cutoff_date
group by t.type
Find the max date by type, then use it to find the last 7 days and sum() the value.
with
cte as
(
select [type], max([Date]) as MaxDate
from tableA
group by [type]
)
select c.[type], sum(a.Value),
case when SUM(a.Value) > 1000 then 1 else 0 end as ISHIGH
from cte c
inner join tableA a on a.[type] = c.[type]
and a.[Date] >= DATEADD(DAY, -7, c.MaxDate)
group by c.[type]
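The same per-type cutoff can be checked quickly against the sample data from the question. This is only a sketch in Python's sqlite3, so it uses SQLite's date(x, '-7 days') where the answer uses DATEADD; the 1000 threshold is taken from the answer, not from the question, which only says "x".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (date TEXT, value INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?)",
    [
        ("2020-06-01", 103, "B"),
        ("2020-06-01", 100, "A"),
        ("2020-06-01", 133, "A"),
        ("2020-06-11", 150, "A"),
        ("2020-07-01", 1000, "A"),
        ("2020-07-21", 104, "A"),
        ("2020-07-25", 140, "A"),
        ("2020-07-28", 1600, "A"),
        ("2020-08-01", 100, "A"),
    ],
)

sql = """
WITH last_seen AS (
    -- the most recent date per type
    SELECT type, MAX(date) AS max_date
    FROM tableA
    GROUP BY type
)
SELECT a.type,
       CASE WHEN SUM(a.value) > 1000 THEN 1 ELSE 0 END AS ishigh
FROM tableA a
JOIN last_seen l ON l.type = a.type
WHERE a.date >= date(l.max_date, '-7 days')  -- only rows in the window before max_date
GROUP BY a.type
ORDER BY a.type
"""
is_high = conn.execute(sql).fetchall()
print(is_high)
```

For type A the window starts at 2020-07-25, so the summed values are 140, 1600 and 100; type B only has its single row of 103.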
This can be done through a cumulative total as follows:
;With CTE As (
Select [type], [date],
SUM([value]) Over (Partition by [type] Order by [date] Desc) As Total,
Row_Number() Over (Partition by [type] Order by [date] Desc) As Row_Num
From Tbl)
Select Distinct CTE.[type], Case When C.[type] Is Not Null Then 1 Else 0 End As ISHIGH
From CTE Left Join CTE As C On (CTE.[type]=C.[type]
And DateDiff(dd,CTE.[date],C.[date])<=7
And C.Total>1000)
Where CTE.Row_Num=1
I think you are quite close with your initial attempt to solve this. Just a tiny edit:
select type, case when sum(value) > 1000 then 1 else 0 end as total_usage
from tableA
where date > (select max(date)-7 from tableA)
group by type

sql get balance at end of year

I have a transactions table for a single year, where a negative amount indicates a debit transaction and a positive amount indicates a credit transaction.
In a given month, if the number of debit records is less than 3 or the sum of debits for the month is less than 100, I want to charge a fee of 5.
I want to build an SQL query for this in PostgreSQL:
select sum(amount), count(1), date_part('month', date) as month from transactions where amount < 0 group by month;
I am able to get the records at month level, but I am stuck on how to proceed further and get the result.
You can start by generating the series of months with generate_series(). Then join that with an aggregate query on transactions, and finally implement the business logic in the outer query:
select sum(t.balance)
- 5 * count(*) filter(where coalesce(t.cnt, 0) < 3 or coalesce(t.debit, 0) < 100) as balance
from generate_series(date '2020-01-01', date '2020-12-01', '1 month') as d(dt)
left join (
select date_trunc('month', date) as dt, count(*) cnt, sum(amount) as balance,
sum(-amount) filter(where amount < 0) as debit
from transactions t
group by date_trunc('month', date)
) t on t.dt = d.dt
Demo on DB Fiddle:
| balance |
| ------: |
| 2746 |
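To see why the calendar of months matters (months with no transactions at all still incur the fee), here is a small sketch in Python's sqlite3. SQLite has no generate_series() by default, so a recursive CTE builds months 1 to 12 instead; the five transactions are made-up illustration data, not the data behind the fiddle result above, and the fee rule counts debit rows only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (amount INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1000, "2020-01-05"),  # credit
        (-50, "2020-01-10"),   # January: 3 debits totalling 120, so no fee
        (-30, "2020-01-15"),
        (-40, "2020-01-20"),
        (-200, "2020-02-03"),  # February: only 1 debit, so the fee applies
    ],
)

sql = """
WITH RECURSIVE months(m) AS (
    SELECT 1 UNION ALL SELECT m + 1 FROM months WHERE m < 12
),
monthly AS (
    SELECT CAST(strftime('%m', date) AS INTEGER) AS m,
           SUM(amount) AS balance,
           SUM(CASE WHEN amount < 0 THEN 1 ELSE 0 END) AS debit_cnt,
           SUM(CASE WHEN amount < 0 THEN -amount ELSE 0 END) AS debit_sum
    FROM transactions
    GROUP BY CAST(strftime('%m', date) AS INTEGER)
)
SELECT COALESCE(SUM(monthly.balance), 0)
       -- months without any rows come back as NULL and count as fee months
       - 5 * SUM(CASE WHEN COALESCE(monthly.debit_cnt, 0) < 3
                        OR COALESCE(monthly.debit_sum, 0) < 100
                      THEN 1 ELSE 0 END) AS balance
FROM months
LEFT JOIN monthly ON monthly.m = months.m
"""
year_end_balance = conn.execute(sql).fetchone()[0]
print(year_end_balance)
```

Here the raw balance is 680, and 11 of the 12 months trigger the fee (February plus the ten empty months), giving 680 - 55 = 625.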
How about this approach?
SELECT
SUM(
CASE
WHEN -usage.amount_s >= 100
AND usage.event_c >= 3
THEN 0
ELSE 5
END
) AS YEAR_FEE
FROM (SELECT 1 AS month UNION
SELECT 2 UNION
SELECT 3 UNION
SELECT 4 UNION
SELECT 5 UNION
SELECT 6 UNION
SELECT 7 UNION
SELECT 8 UNION
SELECT 9 UNION
SELECT 10 UNION
SELECT 11 UNION
SELECT 12
) months
LEFT OUTER JOIN
(
SELECT
sum(amount) AS amount_s,
count(1) event_c,
date_part('month', date) AS month
FROM transactions
WHERE amount < 0
GROUP BY month
) usage ON months.month = usage.month;
First, you need a result set that returns all the months (1-12), joined with a LEFT JOIN to your table.
Then aggregate to get the sum of each month's amount and, with conditional aggregation, subtract 5 for the months that meet your conditions.
Finally, use the SUM() window function to sum the results of all months:
SELECT DISTINCT SUM(
COALESCE(SUM(t.Amount), 0) -
CASE
WHEN SUM((t.Amount < 0)::int) < 3
OR SUM(CASE WHEN t.Amount < 0 THEN -t.Amount ELSE 0 END) < 100 THEN 5
ELSE 0
END
) OVER () total
FROM generate_series(1, 12, 1) m(month) LEFT JOIN transactions t
ON m.month = date_part('month', t.date) AND date_part('year', t.date) = 2020
GROUP BY m.month
See the demo.
Results:
> | total |
> | ----: |
> | 2746 |
I think you can use the HAVING clause.
Select ( sum(a.total) - (12- count(b.cnt ))*5 ) as result From
(Select sum(amount) as total , 'A' as name from transactions ) as a left join
(Select count(amount) as cnt , 'A' as name
From transactions
where amount <0
group by month(date)
having not(count(amount) <3 or sum(amount) >-100) ) as b
on a.name = b.name
select
sum(amount) - 5*(12-(
select count(*)
from(select month, count(amount),sum(amount)
from transactions
where amount<0
group by month
having Count(amount)>=3 And Sum(amount)<=-100))) as balance
from transactions ;

How to pull data from multiple date ranges with one SQL query?

I have two queries. Each query pulls the total count of orders between organization and customer, and the sum of receivables for the orders. The queries are identical except for the date range.
SELECT org.organization_id, org.name, cust.name as customer,
count(*) as num_orders, round (sum(cast(o.total_charge as real))) as receivables
FROM
organization as org, orders as o, organization as cust, reconcile_order as ro
WHERE org.organization_id = o.shipper_org_id
and o.broker_org_id = cust.organization_id
and o.order_id = ro.order_id
and o.status = 'D'
and (ro.receive_payment_in_full = 0 or ro.receive_payment_in_full is NULL)
and (NOW()::DATE - o.delivery_confirmed_date::DATE) < 31
group by org.organization_id, org.name,
cust.name
order by org.name asc limit 20
SELECT org.organization_id, org.name, cust.name as customer,
count(*) as num_orders, round (sum(cast(o.total_charge as real))) as receivables
FROM
organization as org, orders as o, organization as cust, reconcile_order as ro
WHERE org.organization_id = o.shipper_org_id
and o.broker_org_id = cust.organization_id
and o.order_id = ro.order_id
and o.status = 'D'
and (ro.receive_payment_in_full = 0 or ro.receive_payment_in_full is NULL)
and (NOW()::DATE - o.delivery_confirmed_date::DATE) between 31 and 60
group by org.organization_id, org.name,
cust.name
order by org.name asc limit 20
But I need to make this one query so that the output is a single table with columns for orders and receivables in the first date range, and next to those columns another pair of columns for the second date range. (i.e. num_orders < 31, receivables < 31, num_orders 31-60, receivables 31-60)
You can put condition statements inside the count() and sum() functions.
So if you adjusted your where clause to bring back all the orders (across both date ranges) then you could make multiple result columns in your select clause, each counting and summing from just the date range you want.
SELECT ...
count(CASE WHEN (NOW()::DATE - o.delivery_confirmed_date::DATE) < 31 THEN 1 ELSE NULL END) as num_orders_a,
round(sum(CASE WHEN (NOW()::DATE - o.delivery_confirmed_date::DATE) < 31 THEN cast(o.total_charge as real) ELSE NULL END)) as receivables_a,
count(CASE WHEN (NOW()::DATE - o.delivery_confirmed_date::DATE) BETWEEN 31 AND 60 THEN 1 ELSE NULL END) as num_orders_b,
round(sum(CASE WHEN (NOW()::DATE - o.delivery_confirmed_date::DATE) BETWEEN 31 AND 60 THEN cast(o.total_charge as real) ELSE NULL END)) as receivables_b
(same FROM, WHERE, GROUP BY, and ORDER BY sections)
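A stripped-down version of this pattern can be run in Python's sqlite3 to see the shape of the output. The single `delivered_orders` table and its `days_since_delivery` column are hypothetical simplifications of the four-table join and the NOW()::DATE - delivery_confirmed_date expression, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per delivered, unpaid order.
conn.execute(
    "CREATE TABLE delivered_orders "
    "(org TEXT, customer TEXT, total_charge REAL, days_since_delivery INTEGER)"
)
conn.executemany(
    "INSERT INTO delivered_orders VALUES (?, ?, ?, ?)",
    [
        ("Acme", "Bob", 100.0, 5),
        ("Acme", "Bob", 25.0, 20),
        ("Acme", "Bob", 50.0, 40),
        ("Acme", "Cara", 70.0, 59),
        ("Acme", "Cara", 10.0, 90),  # older than 60 days, filtered out
    ],
)

sql = """
SELECT org, customer,
       COUNT(CASE WHEN days_since_delivery < 31 THEN 1 END) AS num_orders_a,
       SUM(CASE WHEN days_since_delivery < 31 THEN total_charge END) AS receivables_a,
       COUNT(CASE WHEN days_since_delivery BETWEEN 31 AND 60 THEN 1 END) AS num_orders_b,
       SUM(CASE WHEN days_since_delivery BETWEEN 31 AND 60 THEN total_charge END) AS receivables_b
FROM delivered_orders
WHERE days_since_delivery <= 60      -- widened to cover both ranges
GROUP BY org, customer
ORDER BY org, customer
"""
report = conn.execute(sql).fetchall()
for row in report:
    print(row)
```

Note that Cara has no orders in the first range, so her count is 0 but her sum is NULL (None in Python); wrap the SUM in COALESCE(..., 0) if a zero is preferred.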
There are a number of ways to skin this cat, and there is a real potential trade-off here between performance and code maintainability.
A CTE here would help with code readability / transparency / maintainability. This is a little bit of a hack way to do it, but this is one idea:
with order_data as (
SELECT
org.organization_id, org.name, cust.name as customer,
o.total_charge::real,
case
when current_date - o.delivery_confirmed_date::DATE < 31 then 1
when current_date - o.delivery_confirmed_date::date < 61 then 2
else 3
end as cat
FROM
organization as org,
orders as o,
organization as cust,
reconcile_order as ro
WHERE
org.organization_id = o.shipper_org_id
and o.broker_org_id = cust.organization_id
and o.order_id = ro.order_id
and o.status = 'D'
and (ro.receive_payment_in_full = 0 or ro.receive_payment_in_full is NULL)
)
select
organization_id, name, customer,
sum (case when cat = 1 then 1 else 0 end) as "Orders < 31",
round (sum (case when cat = 1 then total_charge else 0 end)) as "Rec < 31",
sum (case when cat = 2 then 1 else 0 end) as "Orders 31-60",
round (sum (case when cat = 2 then total_charge else 0 end)) as "Rec 31-60",
sum (case when cat = 3 then 1 else 0 end) as "Orders 61+",
round (sum (case when cat = 3 then total_charge else 0 end)) as "Rec 61+"
from order_data
group by
organization_id, name, customer
order by name asc
I think the more common approach might be to pass a "days_delta" column from the CTE (as current_date - o.delivery_confirmed_date::DATE) and have your sum functions look more like this:
sum (case when days_delta between 31 and 60 then ... end) as "31-60"
And... anyone who says you don't need a CTE -- they're right. You don't. For me it just makes the code more pleasant to deal with.
-- EDIT --
The less attractive (and less functional) cousin of the CTE, the subquery:
select
organization_id, name, customer,
sum (case when cat = 1 then 1 else 0 end) as "Orders < 31",
round (sum (case when cat = 1 then total_charge else 0 end)) as "Rec < 31",
sum (case when cat = 2 then 1 else 0 end) as "Orders 31-60",
round (sum (case when cat = 2 then total_charge else 0 end)) as "Rec 31-60",
sum (case when cat = 3 then 1 else 0 end) as "Orders 61+",
round (sum (case when cat = 3 then total_charge else 0 end)) as "Rec 61+"
from (
SELECT
org.organization_id, org.name, cust.name as customer,
o.total_charge::real,
case
when current_date - o.delivery_confirmed_date::DATE < 31 then 1
when current_date - o.delivery_confirmed_date::date < 61 then 2
else 3
end as cat
FROM
organization as org,
orders as o,
organization as cust,
reconcile_order as ro
WHERE
org.organization_id = o.shipper_org_id
and o.broker_org_id = cust.organization_id
and o.order_id = ro.order_id
and o.status = 'D'
and (ro.receive_payment_in_full = 0 or ro.receive_payment_in_full is NULL)
) as order_data
group by
organization_id, name, customer
order by name asc
I'm not sure that I understand your exact question, but how about this:
Select earlier_ones.organization_id, earlier_ones.name, earlier_ones.customer, earlier_ones.receivables, later_ones.receivables
FROM (
SELECT org.organization_id, org.name, cust.name as customer,
count(*) as num_orders, round (sum(cast(o.total_charge as real))) as receivables
FROM
organization as org, orders as o, organization as cust, reconcile_order as ro
WHERE org.organization_id = o.shipper_org_id
and o.broker_org_id = cust.organization_id
and o.order_id = ro.order_id
and o.status = 'D'
and (ro.receive_payment_in_full = 0 or ro.receive_payment_in_full is NULL)
and (NOW()::DATE - o.delivery_confirmed_date::DATE) < 31
group by org.organization_id, org.name,
cust.name
order by org.name asc limit 20
) earlier_ones
LEFT JOIN (
SELECT org.organization_id, org.name, cust.name as customer,
count(*) as num_orders, round (sum(cast(o.total_charge as real))) as receivables
FROM
organization as org, orders as o, organization as cust, reconcile_order as ro
WHERE org.organization_id = o.shipper_org_id
and o.broker_org_id = cust.organization_id
and o.order_id = ro.order_id
and o.status = 'D'
and (ro.receive_payment_in_full = 0 or ro.receive_payment_in_full is NULL)
and (NOW()::DATE - o.delivery_confirmed_date::DATE) between 31 and 60
group by org.organization_id, org.name,
cust.name
order by org.name asc limit 20
) later_ones ON earlier_ones.organization_id = later_ones.organization_id AND earlier_ones.name = later_ones.name AND earlier_ones.customer = later_ones.customer;

SQL query to group by age range from date created

I want to get statistics with sql query. My table is like this:
ID MATERIAL CREATEDATE DEPARTMENT
1 M1 10.10.1980 D1
2 M2 11.02.1970 D2
2 M3 18.04.1971 D3
.....................
.....................
.....................
How can I get a range of data count like this
DEPARTMENT AGE<10 10<AGE<20 20<AGE
D1 24 123 324
D2 24 123 324
Assuming that CREATEDATE is a date column, in PostgreSQL you can use the AGE function:
select DEPARTMENT, age(CREATEDATE) as AGE
from Materials
and with date_part you can get the age in years. To show the data in the format that you want, you could use this GROUP BY query:
select
DEPARTMENT,
sum(case when date_part('year', age(CREATEDATE))<10 then 1 end) as "age<10",
sum(case when date_part('year', age(CREATEDATE))>=10 and date_part('year', age(CREATEDATE))<20 then 1 end) as "10<age<20",
sum(case when date_part('year', age(CREATEDATE))>=20 then 1 end) as "20<age"
from
Materials
group by
DEPARTMENT
which can be simplified as:
with mat_age as (
select DEPARTMENT, date_part('year', age(CREATEDATE)) as mage
from Materials
)
select
DEPARTMENT,
sum(case when mage<10 then 1 end) as "age<10",
sum(case when mage>=10 and mage<20 then 1 end) as "10<age<20",
sum(case when mage>=20 then 1 end) as "20<age"
from
mat_age
group by
DEPARTMENT;
If you are using PostgreSQL 9.4 or later, you can use FILTER:
with mat_age as (
select DEPARTMENT, date_part('year', age(CREATEDATE)) as mage
from Materials
)
select
DEPARTMENT,
count(*) filter (where mage<10) as "age<10",
count(*) filter (where mage>=10 and mage<20) as "10<age<20",
count(*) filter (where mage>=20) as "20<age"
from
mat_age
group by
DEPARTMENT;
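If you want to check the bucket boundaries without a PostgreSQL instance, here is a rough sketch in Python's sqlite3. SQLite has no age() function, so the age in completed years is computed by hand from strftime(); the reference date 2024-06-15 and the five rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE materials (material TEXT, createdate TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO materials VALUES (?, ?, ?)",
    [
        ("M1", "2020-01-01", "D1"),  # 4 years old on the reference date
        ("M2", "2010-06-16", "D1"),  # 13: the 14th anniversary is one day away
        ("M3", "2004-06-15", "D1"),  # exactly 20
        ("M4", "2015-12-31", "D2"),  # 8
        ("M5", "2014-06-15", "D2"),  # exactly 10
    ],
)

sql = """
WITH mat_age AS (
    SELECT department,
           -- year difference, minus 1 if this year's anniversary has not happened yet
           CAST(strftime('%Y', :today) AS INTEGER)
             - CAST(strftime('%Y', createdate) AS INTEGER)
             - (strftime('%m-%d', :today) < strftime('%m-%d', createdate)) AS mage
    FROM materials
)
SELECT department,
       SUM(CASE WHEN mage < 10 THEN 1 ELSE 0 END) AS "age<10",
       SUM(CASE WHEN mage >= 10 AND mage < 20 THEN 1 ELSE 0 END) AS "10<age<20",
       SUM(CASE WHEN mage >= 20 THEN 1 ELSE 0 END) AS "20<age"
FROM mat_age
GROUP BY department
ORDER BY department
"""
buckets = conn.execute(sql, {"today": "2024-06-15"}).fetchall()
print(buckets)
```

Using ELSE 0 here makes empty buckets show as 0 rather than NULL, which is the one visible difference from the sum(case ... then 1 end) form above.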
The following solution assumes that your CREATEDATE column is stored as a valid Postgres date type. If this is not the case, and it is being stored as text, you will first have to convert it to a date for the query to work.
SELECT DEPARTMENT,
SUM(CASE WHEN date_part('year', age(CREATEDATE)) < 10 THEN 1 ELSE 0 END) AS "AGE<10",
SUM(CASE WHEN date_part('year', age(CREATEDATE)) >= 10 AND
date_part('year', age(CREATEDATE)) < 20 THEN 1 ELSE 0 END) AS "10<AGE<20",
SUM(CASE WHEN date_part('year', age(CREATEDATE)) >= 20 THEN 1 ELSE 0 END) AS "20<AGE"
FROM Materials
GROUP BY DEPARTMENT
You can use extract(year FROM age(createdate)) to get the age in whole years, e.g.
select extract(year FROM age(timestamp '01-01-1989')) age
will give you
Result:
age
---
27
so you can use following select statement to get your desired output:
SELECT dept
,sum(CASE WHEN age < 10 THEN 1 END) "age<10"
,sum(CASE WHEN age >= 10 AND age < 20 THEN 1 END) "10<age<20"
,sum(CASE WHEN age >= 20 THEN 1 END) "20<age"
FROM (
SELECT dept,extract(year FROM age(crdate)) age
FROM dt
) t
GROUP BY dept
If you don't want to use a sub select use this.
SELECT dept
,sum(CASE WHEN extract(year FROM age(crdate)) < 10 THEN 1 END) "age<10"
,sum(CASE WHEN extract(year FROM age(crdate)) >= 10 AND extract(year FROM age(crdate)) < 20 THEN 1 END) "10<age<20"
,sum(CASE WHEN extract(year FROM age(crdate)) >= 20 THEN 1 END) "20<age"
FROM dt
GROUP BY dept