Compare between values from the same table in postgresql - sql

I have the following table:
id partid orderdate qty price
1 10 01/01/2017 10 3
2 10 02/01/2017 5 9
3 11 01/01/2017 0.5 0.001
4 145 02/01/2017 5 18
5 10 12/12/2016 8 7
6 10 05/07/2010 81 7.5
Basically, I want to compare the most recent purchase of each part to the other purchases of the same part within the last 24 months. For example, compare id=2 to id=1 and id=5.
I want to check if the price of the latest orderdate (per part) is larger than the average price of that part in the last 24 months.
So first I need to calculate the avg price:
partid avgprice
10 (3+9+7)/3=6.33 (7.5 is out of range)
11 0.001
145 18
I also need to know the latest orderdate of each part:
id partid
2 10
3 11
4 145
and then I need to check if the prices of id=2, id=3, id=4 (the latest purchases) are bigger than the average. If they are, I need to return their partid.
So I should have something like this:
id partid avgprice lastprice
2 10 6.33 9
3 11 0.001 0.001
4 145 18 18
Finally I need to return partid=10 since 9>6.33
Now to my questions...
I'm not sure how I can find the latest order in PostgreSQL.
I tried:
select id, distinct partid,orderdate
from table
where orderdate> current_date - interval '24 months'
order by orderdate desc
This gives :
ERROR: syntax error at or near "distinct".
I'm a bit lost here. I know what I want to do but I can't translate it to SQL. Can anyone help?

Get the average per part and the last order per part and join these:
select
lastorder.id,
lastorder.partid,
lastorder.orderdate,
lastorder.price as lastprice,
avgorder.price as avgprice
from
(
select
partid,
avg(price) as price
from mytable
where orderdate >= current_date - interval '24 months'
group by partid
) avgorder
join
(
select distinct on (partid)
id,
partid,
orderdate,
price
from mytable
order by partid, orderdate desc
) lastorder on lastorder.partid = avgorder.partid
and lastorder.price > avgorder.price;

This can be solved without distinct (which is heavy on the DB anyways):
with avg_price as (
select partid, avg(price) as price
from mytable
where orderdate > current_date - interval '24 months'
group by partid
)
select f.id, f.partid, av.price as avgprice, f.price as lastprice
from (
-- rank 1 = latest order of each part (ties on orderdate all get rank 1)
select id, partid, orderdate, price, rank() over (partition by partid order by orderdate desc) as rnk
from mytable
) as f
join avg_price av on f.partid = av.partid
where f.rnk = 1
and av.price < f.price
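To check the logic of the answers above against the sample data, here is a minimal sketch that runs an equivalent query in SQLite through Python's sqlite3 module. Several things are assumptions and not part of the original question: the dates are read as dd/mm/yyyy and stored as ISO strings, the fixed reference date '2017-06-01' stands in for current_date, SQLite's date(..., '-24 months') replaces the PostgreSQL interval arithmetic, and row_number() with an id tie-breaker replaces distinct on. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, partid INTEGER, orderdate TEXT, qty REAL, price REAL)"
)
# Sample rows from the question; dates assumed to be dd/mm/yyyy, stored as ISO text.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "2017-01-01", 10, 3),
        (2, 10, "2017-01-02", 5, 9),
        (3, 11, "2017-01-01", 0.5, 0.001),
        (4, 145, "2017-01-02", 5, 18),
        (5, 10, "2016-12-12", 8, 7),
        (6, 10, "2010-07-05", 81, 7.5),
    ],
)

query = """
WITH avg_price AS (
    -- average price per part over the 24 months before the reference date
    SELECT partid, AVG(price) AS avgprice
    FROM mytable
    WHERE orderdate >= date(:today, '-24 months')
    GROUP BY partid
),
ranked AS (
    -- rn = 1 marks the latest order of each part; id breaks date ties
    SELECT id, partid, price,
           ROW_NUMBER() OVER (PARTITION BY partid ORDER BY orderdate DESC, id DESC) AS rn
    FROM mytable
)
SELECT r.id, r.partid, a.avgprice, r.price AS lastprice
FROM ranked r
JOIN avg_price a ON a.partid = r.partid
WHERE r.rn = 1 AND r.price > a.avgprice
ORDER BY r.partid
"""
# '2017-06-01' is a hypothetical "today" so the result is reproducible.
result = conn.execute(query, {"today": "2017-06-01"}).fetchall()
conn.close()
print(result)
```

With these assumptions only partid 10 comes back, since its latest price (9) exceeds its 24-month average (about 6.33), while parts 11 and 145 have a latest price equal to their average.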

Related

How to use SUM() OVER (partition by)?

Imagine, from 1st to 3rd november you have sold a certain amount of goods (there are two types A and B), and now you need to determine how much was sold in total for the day.
How can I query the last 2 columns (the quantity and amount totals per date) so that my table looks like this?
Date Type Quantity Amount Sum_Quantity Sum_Amount
01-11 A 2 100 5 300
01-11 B 3 200 5 300
02-11 A 1 700 3 950
02-11 B 2 250 3 950
03-11 A 2 600 7 800
03-11 B 5 200 7 800
And how can I query, if I want to take the results partitioned by month?
SELECT date,
type,
quantity,
amount,
-- Partition by date
SUM(quantity) OVER (PARTITION BY date) AS sum_quantity_date_part,
SUM(amount) OVER (PARTITION BY date) AS sum_amount_date_part,
-- Partition by month
SUM(quantity) OVER (
PARTITION BY EXTRACT(YEAR FROM date),
EXTRACT(MONTH FROM date)
) AS sum_quantity_month_part,
SUM(amount) OVER (
PARTITION BY EXTRACT(YEAR FROM date),
EXTRACT(MONTH FROM date)
) AS sum_amount_month_part
FROM sales
ORDER BY date, type
;
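As a quick check of the query above, here is a minimal sketch that runs the same idea on the sample rows in SQLite via Python's sqlite3 module. The year 2023 is an assumption (the question only gives day and month), and strftime('%Y-%m', date) is used as a SQLite stand-in for the EXTRACT(YEAR ...), EXTRACT(MONTH ...) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, type TEXT, quantity INTEGER, amount INTEGER)")
# Sample rows from the question; the year 2023 is assumed.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2023-11-01", "A", 2, 100),
        ("2023-11-01", "B", 3, 200),
        ("2023-11-02", "A", 1, 700),
        ("2023-11-02", "B", 2, 250),
        ("2023-11-03", "A", 2, 600),
        ("2023-11-03", "B", 5, 200),
    ],
)

rows = conn.execute(
    """
    SELECT date, type, quantity, amount,
           -- totals over all rows sharing the same date
           SUM(quantity) OVER (PARTITION BY date) AS sum_quantity,
           SUM(amount)   OVER (PARTITION BY date) AS sum_amount,
           -- totals over all rows sharing the same year and month
           SUM(quantity) OVER (PARTITION BY strftime('%Y-%m', date)) AS month_quantity,
           SUM(amount)   OVER (PARTITION BY strftime('%Y-%m', date)) AS month_amount
    FROM sales
    ORDER BY date, type
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Every input row is kept, and the window sums are repeated on each row of the same partition, which is what the Sum_Quantity and Sum_Amount columns in the question show.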

Calculating average time between customer orders and average order value in Postgres

In PostgreSQL I have an orders table that represents orders made by customers of a store:
SELECT * FROM orders
order_id customer_id value created_at
1 1 188.01 2020-11-24
2 2 25.74 2022-10-13
3 1 159.64 2022-09-23
4 1 201.41 2022-04-01
5 3 357.80 2022-09-05
6 2 386.72 2022-02-16
7 1 200.00 2022-01-16
8 1 19.99 2020-02-20
For a specified time range (e.g. 2022-01-01 to 2022-12-31), I need to find the following:
Average 1st order value
Average 2nd order value
Average 3rd order value
Average 4th order value
E.g. the 1st purchases for each customer are:
for customer_id 1, order_id 8 is their first purchase
customer 2, order 6
customer 3, order 5
So, the 1st-purchase average order value is (19.99 + 386.72 + 357.80) / 3 = $254.84
This needs to be found for the 2nd, 3rd and 4th purchases also.
I also need to find the average time between purchases:
order 1 to order 2
order 2 to order 3
order 3 to order 4
The final result would ideally look something like this:
order_number AOV av_days_since_last_order
1 254.84 0
2 300.00 28
3 322.22 21
4 350.00 20
Note that average days since last order for order 1 would always be 0 as it's the 1st purchase.
Thanks.
select order_number
,round(avg(value),2) as AOV
,coalesce(round(avg(days_between_orders),0),0) as av_days_since_last_order
from
(
select *
,row_number() over(partition by customer_id order by created_at) as order_number
,created_at - lag(created_at) over(partition by customer_id order by created_at) as days_between_orders
from t
) t
where created_at between '2022-01-01' and '2022-12-31'
group by order_number
order by order_number
order_number aov av_days_since_last_order
1 372.26 0
2 25.74 239
3 200.00 418
4 201.41 75
5 159.64 175
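The result above can be reproduced with a minimal sketch in SQLite via Python's sqlite3 module. Two things are assumptions: the table is called orders here (the answer reads from t), and julianday() differences replace the PostgreSQL date subtraction. As in the answer, the order numbers and gaps are computed over a customer's whole history, and the date range only filters which orders are averaged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, value REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 1, 188.01, "2020-11-24"),
        (2, 2, 25.74, "2022-10-13"),
        (3, 1, 159.64, "2022-09-23"),
        (4, 1, 201.41, "2022-04-01"),
        (5, 3, 357.80, "2022-09-05"),
        (6, 2, 386.72, "2022-02-16"),
        (7, 1, 200.00, "2022-01-16"),
        (8, 1, 19.99, "2020-02-20"),
    ],
)

rows = conn.execute(
    """
    SELECT order_number,
           ROUND(AVG(value), 2) AS aov,
           COALESCE(ROUND(AVG(days_between_orders)), 0) AS av_days_since_last_order
    FROM (
        SELECT value, created_at,
               -- n-th order of this customer, counted over all of their orders
               ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at) AS order_number,
               -- days since the customer's previous order (NULL for the first one)
               julianday(created_at)
                   - julianday(LAG(created_at) OVER (PARTITION BY customer_id ORDER BY created_at))
                   AS days_between_orders
        FROM orders
    )
    WHERE created_at BETWEEN '2022-01-01' AND '2022-12-31'
    GROUP BY order_number
    ORDER BY order_number
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Because the filter runs after the window functions, customer 1's orders from 2020 still count as their 1st and 2nd order, which is why the first-order average here (372.26) differs from the 254.84 in the question.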
I suppose it should be something like this:
WITH prep_data AS (
-- number each customer's orders within the requested date range
SELECT order_id,
customer_id,
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY created_at) AS purchase_num,
created_at,
value
FROM orders
WHERE created_at BETWEEN :date_from AND :date_to
), prep_data2 AS (
-- pd2 is the same customer's previous purchase (NULL for the first one)
SELECT pd1.customer_id,
pd1.purchase_num,
pd1.created_at - pd2.created_at AS date_diff,
pd1.value
FROM prep_data pd1
LEFT JOIN prep_data pd2 ON (pd1.customer_id = pd2.customer_id AND pd1.purchase_num = pd2.purchase_num + 1)
)
SELECT purchase_num,
avg(value) AS avg_val,
avg(date_diff) AS avg_date_diff
FROM prep_data2
GROUP BY purchase_num
ORDER BY purchase_num

How to calculate needed amount for supply order?

Table "client_orders":
date ordered id
28.05 50 1
23.06 60 2
24.05 50 1
25.06 130 2
Table "stock":
id amount date
1 60 23.04
2 90 25.04
1 10 24.04
2 10 24.06
I want to calculate the amount I need to order (to replenish the stock) and by what date. For instance, it should be:
-30 by 28.05 (60+10-50-50=-30) for id = 1
-90 by 25.06 (90-60+10-130=-90) for id = 2
I tried to do it with LAG function, but the problem is that the stock here is not updating.
SELECT *,
SUM(amount - ordered) OVER (PARTITION BY sd.id ORDER BY d.date ASC)
FROM stock sd
LEFT JOIN (SELECT date,
id,
ordered
FROM client_orders) AS d
ON sd.id = d.id
Couldn't find anything similar on the web. Grateful if you share articles/examples how to do that.
You could make a union of the two tables and sum all stock amounts with the negative of ordered amounts. For the date you could instead take the corresponding maximum value.
SELECT id,
SUM(amount),
MAX(date)
FROM (SELECT id,
-ordered AS amount,
date
FROM client_orders
-- UNION ALL keeps identical rows; plain UNION would drop duplicates
UNION ALL
SELECT id,
amount,
date
FROM stock
) stock_and_orders
GROUP BY id
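The union approach can be checked against the sample data with a minimal sketch in SQLite via Python's sqlite3 module. The year 2023 is an assumption (the question gives only dd.mm), and the dates are stored as ISO strings so that MAX(date) compares chronologically. The sketch uses UNION ALL so that two identical movements (same id, amount and date) would both be counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client_orders (date TEXT, ordered INTEGER, id INTEGER)")
conn.execute("CREATE TABLE stock (id INTEGER, amount INTEGER, date TEXT)")
# Sample rows from the question; dd.mm converted to ISO dates with an assumed year.
conn.executemany(
    "INSERT INTO client_orders VALUES (?, ?, ?)",
    [
        ("2023-05-28", 50, 1),
        ("2023-06-23", 60, 2),
        ("2023-05-24", 50, 1),
        ("2023-06-25", 130, 2),
    ],
)
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?)",
    [
        (1, 60, "2023-04-23"),
        (2, 90, "2023-04-25"),
        (1, 10, "2023-04-24"),
        (2, 10, "2023-06-24"),
    ],
)

rows = conn.execute(
    """
    SELECT id, SUM(amount) AS balance, MAX(date) AS last_date
    FROM (
        -- client orders reduce the stock, so they enter as negative amounts
        SELECT id, -ordered AS amount, date FROM client_orders
        UNION ALL
        SELECT id, amount, date FROM stock
    )
    GROUP BY id
    ORDER BY id
    """
).fetchall()
conn.close()
print(rows)
```

A negative balance is the shortfall to order: 30 units for id 1 by 28.05 and 90 units for id 2 by 25.06, matching the figures in the question.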

Add a column with customers orders count at the time they passed the order

I have the following table
order_id created_at customer_id
1 2020-01-02 11
2 2020-02-03 12
3 2020-02-03 11
I would like to add a column "customer_orders_count" that assigns to each transaction the number of orders the customer had placed up to and including that one, i.e. obtain this table:
order_id created_at customer_id customer_orders_count
1 2020-01-02 11 1
2 2020-02-03 12 1
3 2020-02-03 11 2
My problem is that I can't find how to calculate a local "customer_orders_count" for each order. I only managed to add a column with the global "customer_orders_count"; for example, for the first row (order_id=1) I get customer_orders_count=2 whereas I would like it to be 1.
Does anyone have an idea?
Use cumulative count:
with mytable as (
select 1 as order_id, date '2020-01-02' as created_at, 11 as customer_id union all
select 2, '2020-02-03', 12 union all
select 3 , '2020-02-03', 11
)
select *, count(*) over (partition by customer_id order by created_at) as customer_orders_count
from mytable
order by order_id
Use row_number():
select t.*,
row_number() over (partition by customer_id order by created_at) as customer_order_count
from t;
This is subtly different from using a cumulative count(). This version guarantees that the numbers for a given customer are never duplicated, even when the dates are the same. A cumulative count has no such guarantee.
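The difference between the two answers only shows up when a customer has several orders on the same date. Here is a minimal sketch in SQLite via Python's sqlite3 module. The fourth row is hypothetical (it is not in the question) and exists only to create such a tie, and order_id is added to the row_number() ordering as an assumed tie-breaker so the numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, created_at TEXT, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2020-01-02", 11),
        (2, "2020-02-03", 12),
        (3, "2020-02-03", 11),
        (4, "2020-02-03", 11),  # hypothetical extra order: same customer, same date as order 3
    ],
)

rows = conn.execute(
    """
    SELECT order_id,
           -- default frame includes all rows tied on created_at, so ties share a value
           COUNT(*) OVER (PARTITION BY customer_id ORDER BY created_at) AS cumulative_count,
           -- row_number never repeats within a partition
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at, order_id) AS row_num
    FROM orders
    ORDER BY order_id
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Orders 3 and 4 both get a cumulative count of 3, while row_number() gives them 2 and 3. If created_at carried a full timestamp with no ties, both versions would agree.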

How to do a Min and Max of date but following the changes in price points

I'm not really sure how to word this question better so I'll provide the data that I have and the result that I'm after.
This is the data that I have
sku sales qty date
A 100 1 1-Jan-19
A 200 2 2-Jan-19
A 100 1 3-Jan-19
A 240 2 4-Jan-19
A 360 3 5-Jan-19
A 360 4 6-Jan-19
A 200 2 7-Jan-19
A 90 1 8-Jan-19
B 100 1 9-Jan-19
B 200 2 10-Jan-19
And this is the result that I'm after
sku price sum(qty) sum(sales) min(date) max(date)
A 100 4 400 1-Jan-19 3-Jan-19
A 120 5 600 4-Jan-19 5-Jan-19
A 90 4 360 6-Jan-19 6-Jan-19
A 100 2 200 7-Jan-19 7-Jan-19
A 90 1 90 8-Jan-19 8-Jan-19
B 100 3 300 9-Jan-19 10-Jan-19
As you can see, I'm trying to get the min and max date of each price point, where price = sales/qty. At this point, I can get the min and max date of the same price, but I can't separate them when there's another price in between. I think I have to use some sort of min(date) over (partition by sales/qty order by date) but I can't figure it out yet.
I'm using Redshift SQL
This is a gaps-and-islands query. You can do this by generating a sequence and subtracting that from the date. Then aggregate:
select sku, price, sum(qty), sum(sales),
min(date), max(date)
from (select t.*,
-- price is not stored in the table, so derive it here
sales / qty as price,
row_number() over (partition by sku, sales / qty order by date) as seqnum
from t
) t
group by sku, price, (date - seqnum * interval '1 day')
order by sku, price, min(date);
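To see why subtracting the sequence number from the date isolates each run, here is a minimal sketch of the same technique in SQLite via Python's sqlite3 module. Some details are assumptions: the date column is named day and stored as ISO text, julianday(day) minus the row number replaces the interval arithmetic, and the rows are sorted by first date to match the layout of the desired result. Like the query above, it relies on each sku having exactly one row per consecutive day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (sku TEXT, sales INTEGER, qty INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("A", 100, 1, "2019-01-01"),
        ("A", 200, 2, "2019-01-02"),
        ("A", 100, 1, "2019-01-03"),
        ("A", 240, 2, "2019-01-04"),
        ("A", 360, 3, "2019-01-05"),
        ("A", 360, 4, "2019-01-06"),
        ("A", 200, 2, "2019-01-07"),
        ("A", 90, 1, "2019-01-08"),
        ("B", 100, 1, "2019-01-09"),
        ("B", 200, 2, "2019-01-10"),
    ],
)

rows = conn.execute(
    """
    SELECT sku, price, SUM(qty), SUM(sales), MIN(day), MAX(day)
    FROM (
        SELECT sku, sales, qty, day,
               sales * 1.0 / qty AS price,
               -- within one (sku, price), consecutive days share the same day-minus-rank value
               julianday(day)
                   - ROW_NUMBER() OVER (PARTITION BY sku, sales * 1.0 / qty ORDER BY day) AS grp
        FROM t
    )
    GROUP BY sku, price, grp
    ORDER BY sku, MIN(day)
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

For sku A at price 100, the days 1, 2, 3 and 7 January get ranks 1 to 4, so the first three share one grp value and 7 January gets another, which is what splits the two price-100 runs.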
You can do this with a subquery and LAG:
SELECT SKU, Price, SUM(Qty) SumQty, SUM(Sales) SumSales, MIN(date) MinDate, MAX(date) MaxDate
FROM (
SELECT SKU,Price,SUM(is_change) OVER(order by SKU, date) is_change,Sales, Qty,date
FROM (SELECT SKU, Sales/Qty AS Price, Sales, Qty,date,
CASE WHEN Sales/Qty = lag(Sales/Qty) over (order by SKU, date)
and SKU = lag(SKU) OVER (order by SKU, date) then 0 ELSE 1 END AS is_change
FROM Tbl
)InnerSelect
) X GROUP BY sku, price,is_change
ORDER BY SKU,MIN(date)