How to do a Min and Max of date but following the changes in price points - sql

I'm not really sure how to word this question better so I'll provide the data that I have and the result that I'm after.
This is the data that I have
sku sales qty date
A 100 1 1-Jan-19
A 200 2 2-Jan-19
A 100 1 3-Jan-19
A 240 2 4-Jan-19
A 360 3 5-Jan-19
A 360 4 6-Jan-19
A 200 2 7-Jan-19
A 90 1 8-Jan-19
B 100 1 9-Jan-19
B 200 2 10-Jan-19
And this is the result that I'm after
sku price sum(qty) sum(sales) min(date) max(date)
A 100 4 400 1-Jan-19 3-Jan-19
A 120 5 600 4-Jan-19 5-Jan-19
A 90 4 360 6-Jan-19 6-Jan-19
A 100 2 200 7-Jan-19 7-Jan-19
A 90 1 90 8-Jan-19 8-Jan-19
B 100 3 300 9-Jan-19 10-Jan-19
As you can see, I'm trying to get the min and max date of each price point, where price = sales/qty. At this point, I can get the min and max date of the same price, but I can't separate the ranges when there's another price in between. I think I have to use some sort of min(date) over (partition by sales/qty order by date) but I can't figure it out yet.
I'm using Redshift SQL

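This is a gaps-and-islands problem. Generate a sequence number per sku and price, subtract it from the date, and aggregate on the difference, which stays constant within a run of consecutive days at the same price. This assumes each sku has exactly one row per day. The price has to be computed as sales/qty first, because t has no price column:
select sku, price, sum(qty), sum(sales),
       min(date), max(date)
from (select t.*,
             row_number() over (partition by sku, price order by date) as seqnum
      from (select t.*, sales / qty as price
            from t
           ) t
     ) t
group by sku, price, (date - seqnum * interval '1 day')
order by sku, min(date);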
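
To check the row-number subtraction idea against the sample data from the question, here is a minimal sketch that runs the same logic in SQLite through Python's sqlite3 module. SQLite stands in for Redshift here, so the date arithmetic is an assumption: `julianday()` replaces the interval subtraction, and the dates are rewritten in ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (sku TEXT, sales INTEGER, qty INTEGER, date TEXT)")
rows = [
    ("A", 100, 1, "2019-01-01"), ("A", 200, 2, "2019-01-02"),
    ("A", 100, 1, "2019-01-03"), ("A", 240, 2, "2019-01-04"),
    ("A", 360, 3, "2019-01-05"), ("A", 360, 4, "2019-01-06"),
    ("A", 200, 2, "2019-01-07"), ("A", 90, 1, "2019-01-08"),
    ("B", 100, 1, "2019-01-09"), ("B", 200, 2, "2019-01-10"),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

query = """
SELECT sku, price, SUM(qty), SUM(sales), MIN(date), MAX(date)
FROM (
    SELECT p.*,
           -- day number minus per-price sequence: constant within an island
           CAST(julianday(date) AS INTEGER)
             - ROW_NUMBER() OVER (PARTITION BY sku, price ORDER BY date) AS grp
    FROM (SELECT t.*, sales / qty AS price FROM t) AS p
)
GROUP BY sku, price, grp
ORDER BY sku, MIN(date)
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

Each printed tuple is (sku, price, total qty, total sales, first date, last date), one per uninterrupted run of a price.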

You can do this with a subquery and LAG:
SELECT SKU, Price, SUM(Qty) SumQty, SUM(Sales) SumSales, MIN(date) MinDate, MAX(date) MaxDate
FROM (
    SELECT SKU, Price, SUM(is_change) OVER (ORDER BY SKU, date) is_change, Sales, Qty, date
    FROM (SELECT SKU, Sales/Qty AS Price, Sales, Qty, date,
                 CASE WHEN Sales/Qty = LAG(Sales/Qty) OVER (ORDER BY SKU, date)
                       AND SKU = LAG(SKU) OVER (ORDER BY SKU, date) THEN 0 ELSE 1 END AS is_change
          FROM Tbl
         ) InnerSelect
) X
GROUP BY SKU, Price, is_change
ORDER BY SKU, MIN(date)
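
One property of the LAG approach is that it does not depend on the dates being consecutive: a new group starts only when the price changes. The sketch below illustrates this in SQLite via Python's sqlite3 module. The four rows are hypothetical (2 Jan is deliberately missing), and `PARTITION BY sku` is used in place of the SKU comparison, which is a variation on the query above rather than a copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (sku TEXT, sales INTEGER, qty INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        ("A", 100, 1, "2019-01-01"),
        ("A", 200, 2, "2019-01-03"),  # same price as before, but 2 Jan has no row
        ("A", 240, 2, "2019-01-04"),
        ("A", 100, 1, "2019-01-05"),
    ],
)

query = """
WITH priced AS (
    SELECT sku, sales, qty, date, sales / qty AS price FROM tbl
), flagged AS (
    -- 1 marks the first row of a new price run, 0 continues the current run
    SELECT *,
           CASE WHEN price = LAG(price) OVER (PARTITION BY sku ORDER BY date)
                THEN 0 ELSE 1 END AS is_change
    FROM priced
), grouped AS (
    -- the running sum of the flags gives every run its own group number
    SELECT *,
           SUM(is_change) OVER (PARTITION BY sku ORDER BY date) AS grp
    FROM flagged
)
SELECT sku, price, SUM(qty), SUM(sales), MIN(date), MAX(date)
FROM grouped
GROUP BY sku, grp, price
ORDER BY sku, MIN(date)
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The first two rows end up in one group spanning 1 Jan to 3 Jan even though a day is missing between them.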

Related

How to use SUM() OVER (partition by)?

Imagine that from 1st to 3rd November you sold a certain amount of goods (there are two types, A and B), and now you need to determine how much was sold in total on each day.
How can I query the last 2 columns (total quantity and total amount per date) so that my table looks like this?
Date Type Quantity Amount Sum_Quantity Sum_Amount
01-11 A 2 100 5 300
01-11 B 3 200 5 300
02-11 A 1 700 3 950
02-11 B 2 250 3 950
03-11 A 2 600 7 800
03-11 B 5 200 7 800
And how can I query, if I want to take the results partitioned by month?
SELECT date,
       type,
       quantity,
       amount,
       -- Partition by date
       SUM(quantity) OVER (PARTITION BY date) AS sum_quantity_date_part,
       SUM(amount) OVER (PARTITION BY date) AS sum_amount_date_part,
       -- Partition by month
       SUM(quantity) OVER (
           PARTITION BY EXTRACT(YEAR FROM date),
                        EXTRACT(MONTH FROM date)
       ) AS sum_quantity_month_part,
       SUM(amount) OVER (
           PARTITION BY EXTRACT(YEAR FROM date),
                        EXTRACT(MONTH FROM date)
       ) AS sum_amount_month_part
FROM sales
ORDER BY date, type
;
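
As a quick check of the per-date and per-month partitions, here is a small sketch that loads the six sample rows into SQLite through Python's sqlite3 module. The year 2023 is an assumption (the question only gives day and month), and `strftime('%Y-%m', ...)` stands in for the two EXTRACT calls because SQLite has no EXTRACT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, type TEXT, quantity INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2023-11-01", "A", 2, 100), ("2023-11-01", "B", 3, 200),
        ("2023-11-02", "A", 1, 700), ("2023-11-02", "B", 2, 250),
        ("2023-11-03", "A", 2, 600), ("2023-11-03", "B", 5, 200),
    ],
)

query = """
SELECT date, type, quantity, amount,
       -- every row keeps its detail and also carries the total of its date
       SUM(quantity) OVER (PARTITION BY date) AS sum_quantity,
       SUM(amount)   OVER (PARTITION BY date) AS sum_amount,
       -- the same idea with the year-month as the partition key
       SUM(quantity) OVER (PARTITION BY strftime('%Y-%m', date)) AS month_quantity,
       SUM(amount)   OVER (PARTITION BY strftime('%Y-%m', date)) AS month_amount
FROM sales
ORDER BY date, type
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

Unlike GROUP BY, the window version does not collapse rows, which is why both A and B rows of a date show the same daily totals.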

Calculating average time between customer orders and average order value in Postgres

In PostgreSQL I have an orders table that represents orders made by customers of a store:
SELECT * FROM orders
order_id customer_id value created_at
1 1 188.01 2020-11-24
2 2 25.74 2022-10-13
3 1 159.64 2022-09-23
4 1 201.41 2022-04-01
5 3 357.80 2022-09-05
6 2 386.72 2022-02-16
7 1 200.00 2022-01-16
8 1 19.99 2020-02-20
For a specified time range (e.g. 2022-01-01 to 2022-12-31), I need to find the following:
Average 1st order value
Average 2nd order value
Average 3rd order value
Average 4th order value
E.g. the 1st purchases for each customer are:
for customer_id 1, order_id 8 is their first purchase
customer 2, order 6
customer 3, order 5
So, the 1st-purchase average order value is (19.99 + 386.72 + 357.80) / 3 = $254.84
This needs to be found for the 2nd, 3rd and 4th purchases also.
I also need to find the average time between purchases:
order 1 to order 2
order 2 to order 3
order 3 to order 4
The final result would ideally look something like this:
order_number AOV av_days_since_last_order
1 254.84 0
2 300.00 28
3 322.22 21
4 350.00 20
Note that average days since last order for order 1 would always be 0 as it's the 1st purchase.
Thanks.
select order_number
      ,round(avg(value),2) as AOV
      ,coalesce(round(avg(days_between_orders),0),0) as av_days_since_last_order
from
(
    select *
          ,row_number() over(partition by customer_id order by created_at) as order_number
          ,created_at - lag(created_at) over(partition by customer_id order by created_at) as days_between_orders
    from orders
) t
where created_at between '2022-01-01' and '2022-12-31'
group by order_number
order by order_number
order_number aov av_days_since_last_order
1 372.26 0
2 25.74 239
3 200.00 418
4 201.41 75
5 159.64 175
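
The row_number/lag query numbers each customer's orders over their whole history and only then filters by date, so an order placed in 2022 can be a customer's 3rd or 5th overall. A minimal sketch reproducing this on the sample data, using SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption: `julianday()` differences replace PostgreSQL's date subtraction):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, value REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 1, 188.01, "2020-11-24"), (2, 2, 25.74, "2022-10-13"),
        (3, 1, 159.64, "2022-09-23"), (4, 1, 201.41, "2022-04-01"),
        (5, 3, 357.80, "2022-09-05"), (6, 2, 386.72, "2022-02-16"),
        (7, 1, 200.00, "2022-01-16"), (8, 1, 19.99, "2020-02-20"),
    ],
)

query = """
SELECT order_number,
       ROUND(AVG(value), 2) AS aov,
       CAST(COALESCE(ROUND(AVG(days_between_orders)), 0) AS INTEGER) AS av_days
FROM (
    -- window functions see every order; the date filter is applied afterwards
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at) AS order_number,
           julianday(created_at)
             - julianday(LAG(created_at) OVER (PARTITION BY customer_id ORDER BY created_at))
             AS days_between_orders
    FROM orders
)
WHERE created_at BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY order_number
ORDER BY order_number
"""
stats = conn.execute(query).fetchall()
for row in stats:
    print(row)
```

If order numbers should restart inside the chosen date range instead, the date filter would have to move into the inner query, which changes both the numbering and the day gaps.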
I suppose it should be something like this:
WITH prep_data AS (
    SELECT order_id,
           customer_id,
           -- number each customer's purchases in date order
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at) AS purchase_num,
           created_at,
           value
    FROM orders
    WHERE created_at BETWEEN :date_from AND :date_to
), prep_data2 AS (
    SELECT pd1.purchase_num,
           pd1.created_at - pd2.created_at AS date_diff, -- days since the previous purchase
           pd1.value
    FROM prep_data pd1
    LEFT JOIN prep_data pd2
           ON pd1.customer_id = pd2.customer_id
          AND pd1.purchase_num = pd2.purchase_num + 1
)
SELECT purchase_num,
       avg(value) AS avg_val,
       avg(date_diff) AS avg_date_diff
FROM prep_data2
GROUP BY purchase_num
ORDER BY purchase_num

Running total with if clause or do while

Consider the following table with 3 columns.
Use this to create a SQL query to list the top products by revenue that make up 25% of the total revenue in 2020.
(i.e. If total revenue is 1000 then list of top products that account for <= 250)
Table ProductRevenue:
Date_DD ... date(YYYY-MM-DD)
Product_Name ... varchar(250)
Revenue ... decimal(10,2)
Sample data:
Date_DD Product_Name Revenue
-------------------------------------
2020-11-30 a 100
2020-10-02 b 100
2020-07-07 c 100
2020-04-04 d 100
2020-05-05 f 50
2020-06-06 g 120
2020-05-30 h 90
2020-11-13 k 120
2020-01-30 l 120
I wrote the query below but don't know where to put the WHERE clause. Can anyone help?
SELECT
    product_name, revenue,
    SUM(revenue) OVER (ORDER BY revenue DESC, product_name) AS running_total
FROM
    TABLE_PRODUCT_REVENUE
New code:
select product_name, revenue, running_total
from (SELECT product_name, revenue,
             SUM(revenue) OVER (ORDER BY revenue DESC, product_name) AS running_total
      FROM TABLE_PRODUCT_REVENUE) o
where running_total < (select max(running_total)
                       from (SELECT product_name, revenue,
                                    SUM(revenue) OVER (ORDER BY revenue DESC, product_name) AS running_total
                             FROM TABLE_PRODUCT_REVENUE) o) * 0.25
group by product_name, revenue, running_total
order by running_total
I just need to know where I can add a WHERE clause with YEAR([Date_DD]) = 2020. Can anyone help?
The question is not very descriptive, but the following might help you narrow down the issue.
This shows the running total for each product:
SELECT
    product_name, revenue,
    SUM(revenue) OVER (partition by product_name ORDER BY revenue DESC, product_name) AS running_total
FROM
    TABLE_PRODUCT_REVENUE
The query below returns a product only if its running total reaches a given amount (xxx is a placeholder for that amount):
select
    *, case when running_total >= 1000 then 'top selling product' else null end
from
(
    SELECT
        product_name, revenue,
        SUM(revenue) OVER (partition by product_name ORDER BY revenue DESC, product_name) AS running_total
    FROM
        TABLE_PRODUCT_REVENUE
) t
where running_total >= xxx
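
For the original task (top products whose cumulative revenue stays within 25% of the 2020 total), one possible shape is to apply the year filter first, inside a CTE, and compute the running total only on the filtered rows. The sketch below is an illustration under assumptions, not the accepted method: it runs in SQLite through Python's sqlite3 module, so `strftime('%Y', ...)` replaces T-SQL's `YEAR()`, the table name `product_revenue` is made up, and product `z` from 2019 is a hypothetical extra row added to show that the filter matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_revenue (date_dd TEXT, product_name TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO product_revenue VALUES (?, ?, ?)",
    [
        ("2020-11-30", "a", 100), ("2020-10-02", "b", 100), ("2020-07-07", "c", 100),
        ("2020-04-04", "d", 100), ("2020-05-05", "f", 50), ("2020-06-06", "g", 120),
        ("2020-05-30", "h", 90), ("2020-11-13", "k", 120), ("2020-01-30", "l", 120),
        ("2019-12-31", "z", 500),  # hypothetical row outside 2020, must be ignored
    ],
)

query = """
WITH yearly AS (
    -- the year filter goes here, before any window function runs
    SELECT product_name, SUM(revenue) AS revenue
    FROM product_revenue
    WHERE strftime('%Y', date_dd) = '2020'
    GROUP BY product_name
), ranked AS (
    SELECT product_name, revenue,
           SUM(revenue) OVER (ORDER BY revenue DESC, product_name) AS running_total,
           SUM(revenue) OVER () AS total_revenue
    FROM yearly
)
SELECT product_name, revenue, running_total
FROM ranked
WHERE running_total <= 0.25 * total_revenue
ORDER BY running_total
"""
top_products = conn.execute(query).fetchall()
print(top_products)
```

With the sample data the 2020 total is 900, so the cutoff is 225 and only product g (running total 120) qualifies; the next product would push the running total to 240.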

T-SQL calculate the percent increase or decrease between the earliest and latest for each project

I have a table like the one below. I am trying to write a T-SQL query that gets the earliest and latest cost for each project_id according to the date column, calculates the percent cost increase or decrease, and returns the data set shown in the second table (I have simplified the table in this question).
project_id date cost
-------------------------------
123 7/1/17 5000
123 8/1/17 6000
123 9/1/17 7000
123 10/1/17 8000
123 11/1/17 9000
456 7/1/17 10000
456 8/1/17 9000
456 9/1/17 8000
876 1/1/17 8000
876 6/1/17 5000
876 8/1/17 10000
876 11/1/17 8000
Result:
(Edit: Fixed the result)
project_id "cost incr/decr pct"
------------------------------------------------
123 80% which is (9000-5000)/5000
456 -20%
876 0%
Whatever query I run I get duplicates.
This is what I tried:
select distinct
p1.Proj_ID, p1.date, p2.[cost], p3.cost,
(nullif(p2.cost, 0) / nullif(p1.cost, 0)) * 100 as 'OVER UNDER'
from
[PROJECT] p1
inner join
(select
[Proj_ID], [cost], min([date]) min_date
from
[PROJECT]
group by
[Proj_ID], [cost]) p2 on p1.Proj_ID = p2.Proj_ID
inner join
(select
[Proj_ID], [cost], max([date]) max_date
from
[PROJECT]
group by
[Proj_ID], [cost]) p3 on p1.Proj_ID = p3.Proj_ID
where
p1.date in (p2.min_date, p3.max_date)
Unfortunately, SQL Server does not have a first_value() aggregation function. It does have an analytic function, though. So, you can do:
select distinct project_id,
       first_value(cost) over (partition by project_id order by date asc) as first_cost,
       first_value(cost) over (partition by project_id order by date desc) as last_cost,
       (first_value(cost) over (partition by project_id order by date desc) /
        first_value(cost) over (partition by project_id order by date asc)
       ) - 1 as ratio
from project;
If cost is an integer, you may need to convert to a representation with decimal places.
On versions prior to SQL Server 2012, you can use row_number and OUTER APPLY over top 1 ...:
select
    min_.projectid,
    latest_.cost - min_.cost [Calculation]
from
    (select
         row_number() over (partition by projectid order by date) rn
        ,projectid
        ,cost
     from projectable) min_ -- get the first dates per project
outer apply (
    select
        top 1
        cost
    from projectable
    where
        projectid = min_.projectid -- get the latest cost for each project
    order by date desc
) latest_
where min_.rn = 1
This might perform a little better
;with costs as (
    select *,
           ROW_NUMBER() over (PARTITION BY project_id ORDER BY date) mincost,
           ROW_NUMBER() over (PARTITION BY project_id ORDER BY date desc) maxcost
    from table1
)
select project_id,
       min(case when mincost = 1 then cost end) as cost1,
       max(case when maxcost = 1 then cost end) as cost2,
       (max(case when maxcost = 1 then cost end) - min(case when mincost = 1 then cost end)) * 100
           / min(case when mincost = 1 then cost end) as [OVER UNDER]
from costs a
group by project_id
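
To see the first-versus-last comparison produce the percentages from the question, here is a minimal sketch of the first_value idea in SQLite through Python's sqlite3 module. SQLite stands in for SQL Server, and the dates are assumed to be month/day/year, rewritten in ISO format so they sort correctly as text. Multiplying by 100.0 forces decimal arithmetic, which addresses the integer-division caveat mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (project_id INTEGER, date TEXT, cost INTEGER)")
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?)",
    [
        (123, "2017-07-01", 5000), (123, "2017-08-01", 6000), (123, "2017-09-01", 7000),
        (123, "2017-10-01", 8000), (123, "2017-11-01", 9000),
        (456, "2017-07-01", 10000), (456, "2017-08-01", 9000), (456, "2017-09-01", 8000),
        (876, "2017-01-01", 8000), (876, "2017-06-01", 5000),
        (876, "2017-08-01", 10000), (876, "2017-11-01", 8000),
    ],
)

query = """
SELECT DISTINCT project_id,
       FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date) AS first_cost,
       FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date DESC) AS last_cost,
       -- 100.0 makes the division floating point
       100.0 * (FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date DESC)
                - FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date))
             / FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date) AS pct_change
FROM project
ORDER BY project_id
"""
changes = conn.execute(query).fetchall()
for row in changes:
    print(row)
```

DISTINCT collapses the per-row window results to one row per project, which is what removes the duplicates the question complains about.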

Compare between values from the same table in postgresql

I have the following table:
id partid orderdate qty price
1 10 01/01/2017 10 3
2 10 02/01/2017 5 9
3 11 01/01/2017 0.5 0.001
4 145 02/01/2017 5 18
5 10 12/12/2016 8 7
6 10 05/07/2010 81 7.5
Basically, I want to compare the most recent purchase of each part with the other purchases of the same part within a 24-month period. For example, compare id=2 with id=1 and id=5.
I want to check if the price of the latest orderdate (per part) is larger than the average price of that part in the last 24 months.
So first I need to calculate the avg price:
partid avgprice
10 (3+9+7)/3=6.33 (7.5 is out of range)
11 0.001
145 18
I also need to know the latest orderdate of each part:
id partid
2 10
3 11
4 145
and then I need to check whether the prices of id=2, id=3 and id=4 (the latest purchases) are higher than the average. If they are, I need to return their partid.
So I should have something like this:
id partid avgprice lastprice
2 10 6.33 9
3 11 0.001 0.001
4 145 18 18
Finally I need to return partid=10 since 9>6.33
Now to my questions...
I'm not sure how I can find the latest order in PostgreSQL.
I tried:
select id, distinct partid,orderdate
from table
where orderdate> current_date - interval '24 months'
order by orderdate desc
This gives :
ERROR: syntax error at or near "distinct".
I'm a bit lost here. I know what I want to do but I can't translate it to SQL. Can anyone help?
Get the average price per part and the last order per part, then join the two:
select
    lastorder.id,
    lastorder.partid,
    lastorder.orderdate,
    lastorder.price as lastprice,
    avgorder.price as avgprice
from
(
    select
        partid,
        avg(price) as price
    from mytable
    where orderdate >= current_date - interval '24 months'
    group by partid
) avgorder
join
(
    select distinct on (partid)
        id,
        partid,
        orderdate,
        price
    from mytable
    order by partid, orderdate desc
) lastorder on lastorder.partid = avgorder.partid
           and lastorder.price > avgorder.price;
This can be solved without distinct (which is heavy on the DB anyways):
with avg_price as (
    select partid, avg(price) as price
    from mytable
    where orderdate > current_date - interval '24 months'
    group by partid
)
select f.id, f.partid, av.price as avgprice, f.price as lastprice
from (
    -- rnk = 1 marks the most recent order of each part
    select id, partid, orderdate, price,
           rank() over (partition by partid order by orderdate desc) as rnk
    from mytable
) as f
join avg_price av on f.partid = av.partid
where f.rnk = 1
  and av.price < f.price
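
Because the sample orders date from 2016 and 2017, a query based on current_date returns nothing when run today. The sketch below replays the same logic in SQLite through Python's sqlite3 module with a fixed reference date passed as a parameter. The `:today` value of 2017-02-01 is a hypothetical choice, the dates are assumed to be day/month/year rewritten in ISO format, and SQLite's `date(..., '-24 months')` replaces the PostgreSQL interval arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, partid INTEGER, orderdate TEXT, qty REAL, price REAL)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "2017-01-01", 10, 3), (2, 10, "2017-01-02", 5, 9),
        (3, 11, "2017-01-01", 0.5, 0.001), (4, 145, "2017-01-02", 5, 18),
        (5, 10, "2016-12-12", 8, 7), (6, 10, "2010-07-05", 81, 7.5),
    ],
)

query = """
WITH avg_price AS (
    -- average over the 24 months before the reference date
    SELECT partid, AVG(price) AS avgprice
    FROM mytable
    WHERE orderdate > date(:today, '-24 months')
    GROUP BY partid
), ranked AS (
    -- rn = 1 marks the most recent order of each part
    SELECT id, partid, price,
           ROW_NUMBER() OVER (PARTITION BY partid ORDER BY orderdate DESC) AS rn
    FROM mytable
)
SELECT r.id, r.partid, ROUND(a.avgprice, 2) AS avgprice, r.price AS lastprice
FROM ranked r
JOIN avg_price a ON a.partid = r.partid
WHERE r.rn = 1
  AND r.price > a.avgprice
"""
flagged = conn.execute(query, {"today": "2017-02-01"}).fetchall()
print(flagged)
```

Only part 10 is returned, since its latest price of 9 exceeds the 24-month average of 6.33, while parts 11 and 145 have a latest price equal to their average.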