SQL - Same Table join to calculate profit from last entry - sql

I have a table of transactions for various products. I want to calculate the profit made on each
Product Date Profit Incremental Profit
--------------------- --------------------------- -----------
Apple 2016-05-21 100
Banana 2016-05-21 60
Apple 2016-06-15 30
Apple 2016-08-20 10
Banana 2016-08-20 5
Can I create a SQL query that groups by product and gives me the incremental profit on every date for each product? For example, on 2016-05-21 the incremental profit is 0, since it is the first date. But on 2016-06-15 it will be -70 (30 - 100).
The expected output is:
Product Date Profit Incremental Profit
--------------------- --------------------------- -----------
Apple 2016-05-21 100 0
Banana 2016-05-21 60 0
Apple 2016-06-15 30 -70
Apple 2016-08-20 10 -20
Banana 2016-08-20 5 -55

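Maybe you can use this:
select
a.product
,a.date
,a.profit
-- order by x.date desc makes top 1 pick the most recent earlier row for the product
,isnull(a.profit - (select top 1 x.profit from profit x where x.product = a.product and x.date < a.date order by x.date desc),0) as incremental_profit
from PROFIT a
order by product, date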

Try this
DECLARE @Tbl TABLE (Product NVARCHAR(50), Date_ DATETIME, Profit INT)
INSERT INTO @Tbl
VALUES
('Apple' , '2016-05-21', 100),
('Banana', '2016-05-21', 60 ),
('Apple', '2016-06-15', 30 ),
('Apple', '2016-08-20', 10 ),
('Banana', '2016-08-20', 5 )
;WITH CTE
AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Date_) RowId
FROM @Tbl
)
SELECT
CurrentRow.Product ,
CurrentRow.Date_ ,
CurrentRow.Profit ,
CurrentRow.Profit - ISNULL(PrevRow.Profit, CurrentRow.Profit) AS [Incremental Profit]
FROM
CTE CurrentRow LEFT JOIN
(SELECT CTE.Product ,CTE.Profit, CTE.RowId + 1 RowId FROM CTE) PrevRow ON CurrentRow.Product = PrevRow.Product AND
CurrentRow.RowId = PrevRow.RowId
ORDER BY CurrentRow.Date_, CurrentRow.Product
Result:
Product Date_ Profit Incremental Profit
Apple 2016-05-21 100 0
Banana 2016-05-21 60 0
Apple 2016-06-15 30 -70
Apple 2016-08-20 10 -20
Banana 2016-08-20 5 -55
Edit:
-- Assumes @Tbl was declared with an additional [Incremental Profit] INT column.
UPDATE T
SET [Incremental Profit] = A.[Incremental Profit]
FROM @Tbl T
INNER JOIN
(
SELECT
CurrentRow.Product ,
CurrentRow.Date_ ,
CurrentRow.Profit ,
CurrentRow.Profit - ISNULL(PrevRow.Profit, CurrentRow.Profit) AS [Incremental Profit]
FROM
(SELECT *, ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Date_) RowId FROM @Tbl) CurrentRow LEFT JOIN
(SELECT *, ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Date_) + 1 RowId FROM @Tbl) PrevRow ON CurrentRow.Product = PrevRow.Product AND
CurrentRow.RowId = PrevRow.RowId
) A ON T.Product = A.Product AND
T.Date_ = A.Date_
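On engines that support the LAG window function, the previous row's profit can be read directly, without a self-join. The following is a minimal sketch rather than any answerer's method: it runs the query in SQLite through Python's sqlite3 module so that it is self-contained, and the table name `profit` and column `date_` are illustrative assumptions. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

# Sample rows copied from the question; table and column names are illustrative.
rows = [
    ("Apple", "2016-05-21", 100),
    ("Banana", "2016-05-21", 60),
    ("Apple", "2016-06-15", 30),
    ("Apple", "2016-08-20", 10),
    ("Banana", "2016-08-20", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profit (product TEXT, date_ TEXT, profit INTEGER)")
conn.executemany("INSERT INTO profit VALUES (?, ?, ?)", rows)

# LAG(profit) is NULL on the first row of each product, so the difference is
# NULL there and COALESCE turns it into 0.
query = """
SELECT product,
       date_,
       profit,
       COALESCE(profit - LAG(profit) OVER (PARTITION BY product ORDER BY date_), 0)
           AS incremental_profit
FROM profit
ORDER BY date_, product
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The same SELECT should carry over to SQL Server 2012 or later with little change, but that is untested here.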

Related

Calculating average time between customer orders and average order value in Postgres

In PostgreSQL I have an orders table that represents orders made by customers of a store:
SELECT * FROM orders
order_id customer_id value created_at
-------- ----------- ------ ----------
1 1 188.01 2020-11-24
2 2 25.74 2022-10-13
3 1 159.64 2022-09-23
4 1 201.41 2022-04-01
5 3 357.80 2022-09-05
6 2 386.72 2022-02-16
7 1 200.00 2022-01-16
8 1 19.99 2020-02-20
For a specified time range (e.g. 2022-01-01 to 2022-12-31), I need to find the following:
Average 1st order value
Average 2nd order value
Average 3rd order value
Average 4th order value
E.g. the 1st purchases for each customer are:
for customer_id 1, order_id 8 is their first purchase
customer 2, order 6
customer 3, order 5
So, the 1st-purchase average order value is (19.99 + 386.72 + 357.80) / 3 = $254.84
This needs to be found for the 2nd, 3rd and 4th purchases also.
I also need to find the average time between purchases:
order 1 to order 2
order 2 to order 3
order 3 to order 4
The final result would ideally look something like this:
order_number AOV av_days_since_last_order
------------ ------ ------------------------
1 254.84 0
2 300.00 28
3 322.22 21
4 350.00 20
Note that average days since last order for order 1 would always be 0 as it's the 1st purchase.
Thanks.
select order_number
,round(avg(value),2) as AOV
,coalesce(round(avg(days_between_orders),0),0) as av_days_since_last_order
from
(
select *
,row_number() over(partition by customer_id order by created_at) as order_number
,created_at - lag(created_at) over(partition by customer_id order by created_at) as days_between_orders
from t
) t
where created_at between '2022-01-01' and '2022-12-31'
group by order_number
order by order_number
order_number aov av_days_since_last_order
------------ ------ ------------------------
1 372.26 0
2 25.74 239
3 200.00 418
4 201.41 75
5 159.64 175
Fiddle
I suppose it should be something like this:
WITH prep_data AS (
    SELECT order_id,
           customer_id,
           -- number each customer's purchases in date order
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at) AS purchase_num,
           created_at,
           value
    FROM orders
    WHERE created_at BETWEEN :date_from AND :date_to
), prep_data2 AS (
    SELECT pd1.customer_id,
           pd1.purchase_num,
           pd1.created_at - pd2.created_at AS date_diff, -- days since the previous purchase
           pd1.value
    FROM prep_data pd1
    LEFT JOIN prep_data pd2 ON (pd1.customer_id = pd2.customer_id AND pd1.purchase_num = pd2.purchase_num + 1)
)
SELECT purchase_num,
       avg(value) AS avg_val,
       avg(date_diff) AS avg_date_diff
FROM prep_data2
GROUP BY purchase_num
ORDER BY purchase_num

Table with daily historical stock prices. How to pull stocks where the price reached a certain number for the first time

I have a table with historical stocks prices for hundreds of stocks. I need to extract only those stocks that reached $10 or greater for the first time.
Stock Price Date
----- ----- ----------
AAA 9 2021-10-01
AAA 10 2021-10-02
AAA 8 2021-10-03
AAA 10 2021-10-04
BBB 9 2021-10-01
BBB 11 2021-10-02
BBB 12 2021-10-03
Is there a way to count how many times each stock hit >= 10 in order to pull only those where count = 1 (in this case it would be stock BBB considering it never reached 10 in the past)?
Since I couldn't figure out how to create the count, I've tried the manipulations below with min/max dates, but this looks like a rather awkward approach. Any idea of a simpler solution?
with query1 as (
select Stock, min(date) as min_greater10_dt
from t
where Price >= 10
group by Stock
), query2 as (
select Stock, max(date) as max_greater10_dt
from t
where Price >= 10
group by Stock
)
select Stock
from t a
join query1 b on b.Stock = a.Stock
join query2 c on c.Stock = a.Stock
where not(a.Price < 10 and a.Date between b.min_greater10_dt and c.max_greater10_dt)
This is a type of gaps-and-islands problem which can be solved as follows:
detect the change from < 10 to >= 10 using a lagged price
count the number of such changes
filter in only stock where this has happened exactly once
and take the first row since you only want the stock (you could group by here but a row number allows you to select the entire row should you wish to).
declare @Table table (Stock varchar(3), Price money, [Date] date);
insert into @Table (Stock, Price, [Date])
values
('AAA', 9, '2021-10-01'),
('AAA', 10, '2021-10-02'),
('AAA', 8, '2021-10-03'),
('AAA', 10, '2021-10-04'),
('BBB', 9, '2021-10-01'),
('BBB', 11, '2021-10-02'),
('BBB', 12, '2021-10-03');
with cte1 as (
select Stock, Price, [Date]
, row_number() over (partition by Stock, case when Price >= 10 then 1 else 0 end order by [Date] asc) rn
, lag(Price,1,0) over (partition by Stock order by [Date] asc) LaggedStock
from @Table
), cte2 as (
select Stock, Price, [Date], rn, LaggedStock
, sum(case when Price >= 10 and LaggedStock < 10 then 1 else 0 end) over (partition by Stock) StockOver10
from cte1
)
select Stock
--, Price, [Date], rn, LaggedStock, StockOver10 -- debug
from cte2
where Price >= 10
and StockOver10 = 1 and rn = 1;
Returns:
Stock
BBB
Note: providing DDL+DML as shown above makes it much easier for people to assist.
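The GROUP BY variant mentioned in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the answerer's exact query: it runs in SQLite (3.25 or later) through Python's sqlite3 module, and the table name `prices` and column `date_` are made up for the example. A crossing is a row whose price is at least 10 while the previous price (0 when there is none) is below 10.

```python
import sqlite3

# Sample rows copied from the question; table and column names are illustrative.
rows = [
    ("AAA", 9, "2021-10-01"),
    ("AAA", 10, "2021-10-02"),
    ("AAA", 8, "2021-10-03"),
    ("AAA", 10, "2021-10-04"),
    ("BBB", 9, "2021-10-01"),
    ("BBB", 11, "2021-10-02"),
    ("BBB", 12, "2021-10-03"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (stock TEXT, price INTEGER, date_ TEXT)")
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)

# Step 1: attach the previous price per stock (0 for the first row).
# Step 2: count the rows where the price moves from < 10 to >= 10.
# Step 3: keep only the stocks with exactly one such crossing.
query = """
WITH lagged AS (
    SELECT stock,
           price,
           LAG(price, 1, 0) OVER (PARTITION BY stock ORDER BY date_) AS prev_price
    FROM prices
)
SELECT stock
FROM lagged
GROUP BY stock
HAVING SUM(CASE WHEN price >= 10 AND prev_price < 10 THEN 1 ELSE 0 END) = 1
ORDER BY stock
"""
result = conn.execute(query).fetchall()
print(result)
```

AAA crosses the threshold twice (on 2021-10-02 and again on 2021-10-04), so only BBB is returned.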

T-SQL calculate the percent increase or decrease between the earliest and latest for each project

I have a table like the one below. I am trying to run a query in T-SQL to get the earliest and latest costs for each project_id according to the date column, calculate the percent cost increase or decrease, and return the dataset shown in the second table (I have simplified the table in this question).
project_id date cost
-------------------------------
123 7/1/17 5000
123 8/1/17 6000
123 9/1/17 7000
123 10/1/17 8000
123 11/1/17 9000
456 7/1/17 10000
456 8/1/17 9000
456 9/1/17 8000
876 1/1/17 8000
876 6/1/17 5000
876 8/1/17 10000
876 11/1/17 8000
Result:
(Edit: Fixed the result)
project_id "cost incr/decr pct"
------------------------------------------------
123 80% which is (9000-5000)/5000
456 -20%
876 0%
Whatever query I run I get duplicates.
This is what I tried:
select distinct
p1.Proj_ID, p1.date, p2.[cost], p3.cost,
(nullif(p2.cost, 0) / nullif(p1.cost, 0)) * 100 as 'OVER UNDER'
from
[PROJECT] p1
inner join
(select
[Proj_ID], [cost], min([date]) min_date
from
[PROJECT]
group by
[Proj_ID], [cost]) p2 on p1.Proj_ID = p2.Proj_ID
inner join
(select
[Proj_ID], [cost], max([date]) max_date
from
[PROJECT]
group by
[Proj_ID], [cost]) p3 on p1.Proj_ID = p3.Proj_ID
where
p1.date in (p2.min_date, p3.max_date)
Unfortunately, SQL Server does not have a first_value() aggregation function. It does have an analytic function, though. So, you can do:
select distinct project_id,
first_value(cost) over (partition by project_id order by date asc) as first_cost,
first_value(cost) over (partition by project_id order by date desc) as last_cost,
(first_value(cost) over (partition by project_id order by date desc) /
first_value(cost) over (partition by project_id order by date asc)
) - 1 as ratio
from project;
If cost is an integer, you may need to convert to a representation with decimal places.
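To illustrate the integer caveat: multiplying by 100.0 before dividing forces decimal arithmetic, so the percentage is not truncated. The sketch below is an assumption-laden example, not the answerer's code. It runs in SQLite (3.25 or later) through Python's sqlite3 module, the table name `project` and column `date_` are illustrative, and the dates are stored as ISO strings so that they sort chronologically (the question shows them as M/D/YY).

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings (an assumption).
rows = [
    (123, "2017-07-01", 5000), (123, "2017-08-01", 6000), (123, "2017-09-01", 7000),
    (123, "2017-10-01", 8000), (123, "2017-11-01", 9000),
    (456, "2017-07-01", 10000), (456, "2017-08-01", 9000), (456, "2017-09-01", 8000),
    (876, "2017-01-01", 8000), (876, "2017-06-01", 5000),
    (876, "2017-08-01", 10000), (876, "2017-11-01", 8000),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (project_id INTEGER, date_ TEXT, cost INTEGER)")
conn.executemany("INSERT INTO project VALUES (?, ?, ?)", rows)

# FIRST_VALUE ordered ascending gives the earliest cost; ordered descending it
# gives the latest cost. The 100.0 literal keeps the division from truncating.
query = """
SELECT DISTINCT project_id,
       (FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date_ DESC)
        - FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date_ ASC)) * 100.0
       / FIRST_VALUE(cost) OVER (PARTITION BY project_id ORDER BY date_ ASC) AS pct_change
FROM project
ORDER BY project_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the sample data this yields 80, -20 and 0 percent for projects 123, 456 and 876, matching the expected result in the question.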
You can use row_number and OUTER APPLY over top 1 ... prior to SQL 2012
select
min_.projectid,
latest_.cost - min_.cost [Calculation]
from
(select
row_number() over (partition by projectid order by date) rn
,projectid
,cost
from projectable) min_ -- get the first dates per project
outer apply (
select
top 1
cost
from projectable
where
projectid = min_.projectid -- get the latest cost for each project
order by date desc
) latest_
where min_.rn = 1
This might perform a little better
;with costs as (
select *,
ROW_NUMBER() over (PARTITION BY project_id ORDER BY date) mincost,
ROW_NUMBER() over (PARTITION BY project_id ORDER BY date desc) maxcost
from table1
)
select project_id,
min(case when mincost = 1 then cost end) as cost1,
max(case when maxcost = 1 then cost end) as cost2,
(max(case when maxcost = 1 then cost end) - min(case when mincost = 1 then cost end)) * 100 / min(case when mincost = 1 then cost end) as [OVER UNDER]
from costs a
group by project_id

How to show one column in two column base on second column in SQL Server

I have a table sales with columns
Month SalesAmount
--------------------------
4 50000
5 60000
6 70000
7 50000
8 60000
9 40000
I want result like this
From Month To Month Result
-----------------------------------------------
4 6 Increasing
6 7 Decreasing
7 8 Increasing
8 9 Decreasing
without using a cursor
Try this. Basically, you need to join the table to itself by the month (+1), then pull the data you want/perform any calcs.
Select
M1.Month as [From],
M2.Month as [To],
Case
When M2.SalesAmount > M1.SalesAmount Then 'Increasing'
When M2.SalesAmount < M1.SalesAmount Then 'Decreasing'
Else 'Holding Steady'
End
From sales M1
Inner Join sales M2 on M2.Month = M1.Month + 1
This works if you want the breakdown month by month. However, your example data set compresses months 4-6. Without more details on how you determine what to compress, I'm going to make the following assumptions:
You want detailed data for the last 3 periods, and a compressed summary of all other periods.
You wish only the overall trend between the first month and the last month inside the compressed period. i.e. you want to know the difference between the first, and the last month values.
To do that, the query starts to get more complicated. I've done it with two Unioned queries:
With
compressed_range as
( select min([Month]) as min_month, max([Month]) - 3 as max_month from sales )
Select
M1.[Month] as [From],
M2.[Month] as [To],
Case
When M2.SalesAmount > M1.SalesAmount Then 'Increasing'
When M2.SalesAmount < M1.SalesAmount Then 'Decreasing'
Else 'Holding Steady'
End
From sales M1
Inner Join sales M2 on M2.[Month] = ( select max_month from compressed_range )
Where M1.Month = ( select min_month from compressed_range )
Union All
Select
M1.Month as [From],
M2.Month as [To],
Case
When M2.SalesAmount > M1.SalesAmount Then 'Increasing'
When M2.SalesAmount < M1.SalesAmount Then 'Decreasing'
Else 'Holding Steady'
End
From sales M1
Inner Join sales M2 on M2.Month = M1.Month + 1
Where M2.Month >= (Select max_month + 1 from compressed_range)
This gives your desired result:
DECLARE @T TABLE (Month INT, SalesAmount MONEY);
INSERT @T
VALUES (4, 50000), (5, 60000), (6, 70000), (7, 50000), (8, 60000), (9, 40000);
WITH CTE AS
( SELECT FromMonth = T2.Month,
ToMonth = T.Month,
Result = CASE T2.Result
WHEN -1 THEN 'Decreasing'
WHEN 0 THEN 'Static'
WHEN 1 THEN 'Increasing'
END,
GroupingSet = ROW_NUMBER() OVER(ORDER BY T.Month) - ROW_NUMBER() OVER(PARTITION BY T2.Result ORDER BY T.Month)
FROM @T T
CROSS APPLY
( SELECT TOP 1
T2.SalesAmount,
T2.Month,
Result = SIGN(T.SalesAmount - T2.SalesAmount)
FROM @T T2
WHERE T2.Month < T.Month
ORDER BY T2.Month DESC
) T2
)
SELECT FromMonth = MIN(FromMonth),
ToMonth = MAX(ToMonth),
Result
FROM CTE
GROUP BY Result, GroupingSet
ORDER BY FromMonth;
The first stage is to get the sales amount for the previous month each time:
SELECT *
FROM @T T
CROSS APPLY
( SELECT TOP 1
T2.SalesAmount,
T2.Month,
Result = SIGN(T.SalesAmount - T2.SalesAmount)
FROM @T T2
WHERE T2.Month < T.Month
ORDER BY T2.Month DESC
) T2
ORDER BY T.MONTH
Will Give:
Month SalesAmount SalesAmount Month Result
5 60000.00 50000.00 4 1.00
6 70000.00 60000.00 5 1.00
7 50000.00 70000.00 6 -1.00
8 60000.00 50000.00 7 1.00
9 40000.00 60000.00 8 -1.00
Where Result is just an indicator of whether the amount has increased or decreased. You then need to apply an ordering trick: for consecutive members of a sequence, the overall row number minus the row number within that sequence's Result group stays constant. So with the above data set, if we added:
RN1 = ROW_NUMBER() OVER(ORDER BY T.Month),
RN2 = ROW_NUMBER() OVER(PARTITION BY T2.Result ORDER BY T.Month)
Month SalesAmount SalesAmount Month Result RN1 RN2 | RN1 - RN2
5 60000.00 50000.00 4 1.00 1 1 | 0
6 70000.00 60000.00 5 1.00 2 2 | 0
7 50000.00 70000.00 6 -1.00 3 1 | 2
8 60000.00 50000.00 7 1.00 4 3 | 1
9 40000.00 60000.00 8 -1.00 5 2 | 3
So you can see that for the first 2 rows the final column RN1 - RN2 remains the same, as they are both increasing; then, when the result changes, the difference between these two row numbers changes, which creates a new group.
You can then group by this calculation (the GroupingSet column in the original query), to group your consecutive periods of increase and decrease together.
Example on SQL Fiddle
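The row-number-difference trick described above can also be reproduced outside SQL Server. The following is a rough sketch under stated assumptions: it runs in SQLite (3.25 or later) through Python's sqlite3 module, uses LAG in place of CROSS APPLY and a CASE expression in place of SIGN, and the names `sales`, `changes` and `grouped` are illustrative.

```python
import sqlite3

# Sample rows copied from the question; table and column names are illustrative.
rows = [(4, 50000), (5, 60000), (6, 70000), (7, 50000), (8, 60000), (9, 40000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month INTEGER, sales_amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# changes: pair every month with the previous one and label the direction.
# grouped: overall row number minus row number within the same label is
#          constant across a consecutive run, so it identifies each run.
query = """
WITH changes AS (
    SELECT LAG(month) OVER (ORDER BY month) AS from_month,
           month AS to_month,
           CASE
               WHEN sales_amount > LAG(sales_amount) OVER (ORDER BY month) THEN 'Increasing'
               WHEN sales_amount < LAG(sales_amount) OVER (ORDER BY month) THEN 'Decreasing'
               ELSE 'Static'
           END AS result
    FROM sales
),
grouped AS (
    SELECT from_month,
           to_month,
           result,
           ROW_NUMBER() OVER (ORDER BY to_month)
           - ROW_NUMBER() OVER (PARTITION BY result ORDER BY to_month) AS grouping_set
    FROM changes
    WHERE from_month IS NOT NULL  -- the first month has no predecessor
)
SELECT MIN(from_month) AS from_month,
       MAX(to_month) AS to_month,
       result
FROM grouped
GROUP BY result, grouping_set
ORDER BY MIN(from_month)
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Months 4 to 5 and 5 to 6 share the same difference (0) and the same label, so they collapse into the single 4 to 6 row requested in the question.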
If you are using only the month number in your table structure, you can try something like this:
SELECT s1.month AS From_Month,
s2.month AS To_Month,
CASE
WHEN s2.salesamount > s1.salesamount THEN 'Increasing'
ELSE 'Decreasing'
END AS res
FROM sales AS s1,
sales AS s2
WHERE s1.month + 1 = s2.month
demo at http://sqlfiddle.com/#!6/0819d/11

Calculation of balance after each transaction

I have table like this:
cust_id acc_no trans_id trans_type amount
1111 1001 10 credit 2000.0
1111 1001 11 credit 1000.0
1111 1001 12 debit 1000.0
2222 1002 13 credit 2000.0
2222 1002 14 debit 1000.0
I want a Hive or SQL query that calculates the balance after every transaction done by a customer.
I want output as follows:
cust_id acc_no trans_id trans_type amount balance
1111.0 1001.0 10.0 credit 2000.0 2000.0
1111.0 1001.0 11.0 credit 1000.0 3000.0
1111.0 1001.0 12.0 debit 1000.0 2000.0
2222.0 1002.0 13.0 credit 2000.0 2000.0
2222.0 1002.0 14.0 debit 1000.0 1000.0
I've tried
SELECT *
FROM (SELECT cust_id,
acc_no,
trans_id,
trans_type,
amount,
CASE
WHEN Trim(trans_type) = 'credit' THEN ball =
Trim(bal) + Trim(amt)
ELSE ball = Trim(bal) - Trim(amt)
end
FROM ban) l;
This query will do the trick :
SELECT t1.cust_id,t1.acc_no,t1.trans_id,t1.trans_type,t1.amount,
sum(t2.amount*case when t2.trans_type = 'credit' then 1
else -1 end) as balance
FROM Table1 t1
INNER JOIN Table1 t2 ON t1.cust_id = t2.cust_id AND
t1.acc_no = t2.acc_no AND
t1.trans_id >= t2.trans_id
GROUP BY t1.cust_id,t1.acc_no,t1.trans_id,t1.trans_type,t1.amount
See SQLFIDDLE : http://www.sqlfiddle.com/#!2/3b5d8/15/0
EDIT :
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE Table1
(`cust_id` int, `acc_no` int, `trans_id` int,
`trans_type` varchar(6), `amount` int)
;
INSERT INTO Table1
(`cust_id`, `acc_no`, `trans_id`, `trans_type`, `amount`)
VALUES
(1111, 1001, 10, 'credit', 2000.0),
(1111, 1001, 11, 'credit', 1000.0),
(1111, 1001, 12, 'debit', 1000.0),
(2222, 1002, 13, 'credit', 2000.0),
(2222, 1002, 14, 'debit', 1000.0)
;
Query 1:
SELECT t1.cust_id,t1.acc_no,t1.trans_id,t1.trans_type,t1.amount,
sum(t2.amount*case when t2.trans_type = 'credit' then 1
else -1 end) as balance
FROM Table1 t1
INNER JOIN Table1 t2 ON t1.cust_id = t2.cust_id AND
t1.acc_no = t2.acc_no AND
t1.trans_id >= t2.trans_id
GROUP BY t1.cust_id,t1.acc_no,t1.trans_id,t1.trans_type,t1.amount
Results:
| CUST_ID | ACC_NO | TRANS_ID | TRANS_TYPE | AMOUNT | BALANCE |
|---------|--------|----------|------------|--------|---------|
| 1111 | 1001 | 10 | credit | 2000 | 2000 |
| 1111 | 1001 | 11 | credit | 1000 | 3000 |
| 1111 | 1001 | 12 | debit | 1000 | 2000 |
| 2222 | 1002 | 13 | credit | 2000 | 2000 |
| 2222 | 1002 | 14 | debit | 1000 | 1000 |
A simple solution is to give each transaction a sign (- or +) based on trans_type and then take a cumulative sum with a window function.
SELECT cust_id,
acc_no,
trans_id,
trans_type,
amount,
Sum (real_amount)
OVER (PARTITION BY cust_id, acc_no ORDER BY trans_id) AS balance
FROM (SELECT cust_id,
acc_no,
trans_id,
trans_type,
amount,
( CASE trans_type
WHEN 'credit' THEN amount
WHEN 'debit' THEN -amount
END ) AS real_amount
FROM test) t
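The ORDER BY key of a cumulative window sum decides which rows count as "so far". With the default frame, rows that tie on the key are peers and all receive the same total, so ordering by a non-unique column such as cust_id does not produce a per-transaction balance. The sketch below compares both choices. It is an illustration under stated assumptions: it runs in SQLite (3.25 or later) through Python's sqlite3 module, and the table name `transactions` is made up for the example.

```python
import sqlite3

# Sample rows copied from the question; the table name is illustrative.
rows = [
    (1111, 1001, 10, "credit", 2000),
    (1111, 1001, 11, "credit", 1000),
    (1111, 1001, 12, "debit", 1000),
    (2222, 1002, 13, "credit", 2000),
    (2222, 1002, 14, "debit", 1000),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(cust_id INTEGER, acc_no INTEGER, trans_id INTEGER, trans_type TEXT, amount INTEGER)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", rows)

# Credits count as positive amounts, debits as negative ones.
signed = "CASE trans_type WHEN 'credit' THEN amount ELSE -amount END"

# Ordering by cust_id alone: every row of a customer is a peer of the others,
# and the total also carries over from one customer to the next.
by_customer = conn.execute(
    f"SELECT trans_id, SUM({signed}) OVER (ORDER BY cust_id) "
    "FROM transactions ORDER BY trans_id"
).fetchall()

# Partitioning per account and ordering by the unique trans_id gives a true
# running balance that restarts for each account.
by_transaction = conn.execute(
    f"SELECT trans_id, SUM({signed}) OVER (PARTITION BY cust_id, acc_no ORDER BY trans_id) "
    "FROM transactions ORDER BY trans_id"
).fetchall()

print(by_customer)
print(by_transaction)
```

Only the second query reproduces the balance column requested in the question (2000, 3000, 2000, 2000, 1000).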
You could do this easily through a view. Calculating the balance directly on the table is possible, but it leads to performance and scalability issues (the database will slow down as the table grows). With a view, the calculation is performed as needed; if you index the view, you can keep the balances up to date without impacting the performance of the transaction table.
If you really insist on keeping it in the transaction table itself, you could possibly use a calculated column that runs a user-defined function to determine the current balance. However, this will depend largely on the specific SQL backend you're using.
Here's a basic SELECT Statement which calculates the current balance by Account:
select
acc_no,
sum(case trans_type
when 'credit' then amount
when 'debit' then amount * -1
end) as Amount
from Transactions
group by acc_no
You can use window function:
select cust_id,
acc_no, trans_id, trans_type, amount,
sum(pre_balance) over (partition by cust_id order by trans_id) as balance
from
(select cust_id, acc_no, trans_id, trans_type,
amount,
amount as pre_balance from test
where trans_type = 'credit'
union
select cust_id, acc_no, trans_id, trans_type,
amount, -amount as pre_balance from
test where trans_type = 'debit'
order by trans_id) as sub;
with current_balances as (
SELECT
id,
user_id,
SUM(amount) OVER (PARTITION BY user_id ORDER BY created ASC) as current_balance
FROM payments_transaction pt
ORDER BY created DESC
)
SELECT
pt.id,
amount,
pt.user_id,
cb.current_balance as running_balance
FROM
payments_transaction pt
INNER JOIN
current_balances cb
ON pt.id = cb.id
ORDER BY created DESC
LIMIT 10;
This works efficiently on large result sets and won't break when you filter or limit. Note that if you select only one user or a subset of users, you should apply the user_id filter in both the current_balances CTE and the main select to avoid a full table scan.
Table (Transaction):
"id" "amount" "is_credit"
1 10000 1
2 2000 0
3 5000 1
Query :
SELECT *
FROM (
SELECT id, amount, is_credit, SUM(CASE When is_credit=1 Then amount Else -amount End) OVER (ORDER BY id) AS balance
FROM `Transaction`
) t
ORDER BY id ;
Output :
"id" "amount" "is_credit" "balance"
1 10000 1 10000
2 2000 0 8000
3 5000 1 13000