Generate billing balance depending on the number of payments made - sql

Below is the table I have created and inserted values in it:
CREATE TABLE Invoices
(
InvID int,
InvAmount int
)
GO
INSERT INTO Invoices
VALUES (1, 543), (2, 749)
CREATE TABLE payments
(
PayID int IDENTITY (1, 1),
InvID int,
PayAmount int,
PayDate date
)
INSERT INTO payments
VALUES (1, 20, '2016-01-01'),
(1, 35, '2016-01-07'),
(1, 78, '2016-01-13'),
(1, 52, '2016-01-25'),
(2, 40, '2016-01-03'),
(2, 54, '2016-01-15'),
(2, 63, '2016-01-17'),
(2, 59, '2016-01-28')
SELECT * FROM Invoices
SELECT * FROM payments
The Invoices table lists customer billings (the first billing totals 543, the second totals 749).
The payments table lists the payments the customer made against each billing. For example, on January 1st the customer paid 20 USD toward billing no. 1 (which totals 543 USD), and on January 3rd the customer paid 40 USD toward billing no. 2 (which totals 749 USD).
Now the question is:
Write a query that displays the billing balance, based on the number of payments made so far.
The query result should exactly look like the screenshot below:
This is what I have tried:
SELECT
payments.InvID,
InvAmount - SUM(PayAmount) OVER (PARTITION BY payments.InvID ORDER BY PayID
ROWS BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING) AS 'InvAmount',
PayDate, PayAmount,
InvAmount - SUM(PayAmount) OVER (PARTITION BY payments.InvID ORDER BY PayID) AS 'Balance'
FROM
Invoices
JOIN
payments ON payments.InvID = Invoices.InvID
Running the query gives nearly the result I want.
The only problem is that InvAmount returns exactly the same values as Balance on every row. I am not able to retain the starting values of InvAmount, which are 543 (InvID = 1) and 749 (InvID = 2) respectively.
How can this issue be solved?

You can add back the PayAmount in the calculation
InvAmount
+ PayAmount
- SUM(PayAmount) OVER (PARTITION BY payments.InvID
ORDER BY PayID
ROWS BETWEEN UNBOUNDED PRECEDING
AND 0 FOLLOWING) AS InvAmount
Or use ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING. In that case the frame is empty for the first row of each invoice, so SUM returns NULL and you need to handle it:
InvAmount
- ISNULL(SUM(PayAmount) OVER (PARTITION BY payments.InvID
ORDER BY PayID
ROWS BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING), 0) AS InvAmount
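To check the add-back idea end to end, here is a minimal sketch that loads the question's sample data into an in-memory SQLite database from Python. SQLite stands in for SQL Server here (an assumption made only so the example is self-contained; window functions need SQLite 3.25 or later), and CURRENT ROW is used in place of 0 FOLLOWING.

```python
import sqlite3

# In-memory SQLite stand-in for the SQL Server tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invoices (InvID INTEGER, InvAmount INTEGER);
INSERT INTO Invoices VALUES (1, 543), (2, 749);
CREATE TABLE payments (
    PayID INTEGER PRIMARY KEY AUTOINCREMENT,
    InvID INTEGER, PayAmount INTEGER, PayDate TEXT
);
INSERT INTO payments (InvID, PayAmount, PayDate) VALUES
    (1, 20, '2016-01-01'), (1, 35, '2016-01-07'),
    (1, 78, '2016-01-13'), (1, 52, '2016-01-25'),
    (2, 40, '2016-01-03'), (2, 54, '2016-01-15'),
    (2, 63, '2016-01-17'), (2, 59, '2016-01-28');
""")

rows = conn.execute("""
SELECT p.InvID,
       -- opening balance: running balance with the current payment added back
       i.InvAmount + p.PayAmount
         - SUM(p.PayAmount) OVER (PARTITION BY p.InvID ORDER BY p.PayID
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS InvAmount,
       p.PayDate,
       p.PayAmount,
       -- closing balance after the current payment
       i.InvAmount
         - SUM(p.PayAmount) OVER (PARTITION BY p.InvID ORDER BY p.PayID
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Balance
FROM Invoices AS i
JOIN payments AS p ON p.InvID = i.InvID
ORDER BY p.InvID, p.PayID
""").fetchall()

for row in rows:
    print(row)
```

Each row's InvAmount then equals the previous row's Balance, and the first row of each invoice shows the full invoice amount.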

You can use a PARTITION BY clause on the invoice identifier and then deduct the running payment total from the invoice amount, as shown below:
;WITH CTE_Balance as
(
SELECT i.InvID, i.InvAmount, p.PayAmount, p.PayDate
, InvAmount - sum(payamount) over (partition by p.invid order by paydate rows between unbounded preceding and current row) as balance
, ROW_NUMBER() over(partition by p.invid order by p.paydate) as rnk
FROM payments as p
inner join Invoices as i
on i.InvID = p.InvID
)
SELECT invid, case when rnk =1 then invamount else lag(balance) over(partition by invid order by paydate) end as invamount
,payAmount, paydate, balance
FROM CTE_Balance
invid  invamount  payAmount  paydate     balance
1      543        20         2016-01-01  523
1      523        35         2016-01-07  488
1      488        78         2016-01-13  410
1      410        52         2016-01-25  358
2      749        40         2016-01-03  709
2      709        54         2016-01-15  655
2      655        63         2016-01-17  592
2      592        59         2016-01-28  533

Related

Calculating average time between customer orders and average order value in Postgres

In PostgreSQL I have an orders table that represents orders made by customers of a store:
SELECT * FROM orders
order_id  customer_id  value   created_at
1         1            188.01  2020-11-24
2         2            25.74   2022-10-13
3         1            159.64  2022-09-23
4         1            201.41  2022-04-01
5         3            357.80  2022-09-05
6         2            386.72  2022-02-16
7         1            200.00  2022-01-16
8         1            19.99   2020-02-20
For a specified time range (e.g. 2022-01-01 to 2022-12-31), I need to find the following:
Average 1st order value
Average 2nd order value
Average 3rd order value
Average 4th order value
E.g. the 1st purchases for each customer are:
for customer_id 1, order_id 8 is their first purchase
customer 2, order 6
customer 3, order 5
So, the 1st-purchase average order value is (19.99 + 386.72 + 357.80) / 3 = $254.84
This needs to be found for the 2nd, 3rd and 4th purchases also.
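The first-purchase figure above can be reproduced with a small sketch: number each customer's orders by date, then average the rows numbered 1. The example runs the SQL through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption for self-containment; SQLite 3.25 or later is needed for ROW_NUMBER), and it numbers orders over the whole history rather than within a date range.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, value REAL, created_at TEXT);
INSERT INTO orders VALUES
    (1, 1, 188.01, '2020-11-24'), (2, 2, 25.74, '2022-10-13'),
    (3, 1, 159.64, '2022-09-23'), (4, 1, 201.41, '2022-04-01'),
    (5, 3, 357.80, '2022-09-05'), (6, 2, 386.72, '2022-02-16'),
    (7, 1, 200.00, '2022-01-16'), (8, 1, 19.99, '2020-02-20');
""")

first_order_avg = conn.execute("""
SELECT ROUND(AVG(value), 2)
FROM (
    SELECT value,
           -- 1 = the customer's earliest order, 2 = the next one, and so on
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at) AS order_number
    FROM orders
) AS numbered
WHERE order_number = 1
""").fetchone()[0]

print(first_order_avg)  # 254.84
```

Changing the filter to order_number = 2, 3 or 4 (or grouping by order_number) would give the other averages in the same way.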
I also need to find the average time between purchases:
order 1 to order 2
order 2 to order 3
order 3 to order 4
The final result would ideally look something like this:
order_number  AOV     av_days_since_last_order
1             254.84  0
2             300.00  28
3             322.22  21
4             350.00  20
Note that average days since last order for order 1 would always be 0 as it's the 1st purchase.
Thanks.
select order_number
,round(avg(value),2) as AOV
,coalesce(round(avg(days_between_orders),0),0) as av_days_since_last_order
from
(
select *
,row_number() over(partition by customer_id order by created_at) as order_number
,created_at - lag(created_at) over(partition by customer_id order by created_at) as days_between_orders
from orders
) t
where created_at between '2022-01-01' and '2022-12-31'
group by order_number
order by order_number
order_number  aov     av_days_since_last_order
1             372.26  0
2             25.74   239
3             200.00  418
4             201.41  75
5             159.64  175
I suppose it should be something like this:
WITH prep_data AS (
SELECT order_id,
customer_id,
-- number each customer's purchases within the date range, oldest first
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY created_at) AS purchase_num,
created_at,
value
FROM orders
WHERE created_at BETWEEN :date_from AND :date_to
), prep_data2 AS (
SELECT pd1.customer_id,
pd1.purchase_num,
pd1.created_at - pd2.created_at AS date_diff, -- days since the same customer's previous purchase
pd1.value
FROM prep_data pd1
LEFT JOIN prep_data pd2 ON (pd1.customer_id = pd2.customer_id AND pd1.purchase_num = pd2.purchase_num + 1)
)
SELECT purchase_num,
avg(value) AS avg_val,
avg(date_diff) AS avg_date_diff
FROM prep_data2
GROUP BY purchase_num
ORDER BY purchase_num

Filter Table results Self Join

Imagine a large table that contains receipt information. Since it holds so much data, you are required to return a subset of the data, excluding or consolidating rows where possible.
Here is the SQL and results table showing how the data should be returned.
create table table1
(RecieptNo smallint, Customer varchar(10), ReceiptDate date,
ItemDesc varchar(10), Amount smallint)
insert into table1 values
(100, 'Matt','2022-01-05','Ball', 10),
(101, 'Mark','2022-01-07','Hat', 20),
(101, 'Mark','2022-01-07','Jumper', -20),
(101, 'Mark','2022-01-14','Spoon', 30),
(102, 'Luke','2022-01-15','Fork', 15),
(102, 'Luke','2022-01-17','Spork', -10),
(103, 'John','2022-01-20','Orange', 13),
(103, 'John','2022-01-25','Pear', 12)
If rows on the same receipt have negative and positive amounts that cancel out, do not return either row.
If a receipt has a negative amount that does not exceed the positive amount, the negative amount should be deducted from the positive line.
RecieptNo  Customer  ReceiptDate  ItemDesc  Amount
100        Matt      2022-01-05   Ball      10
101        Mark      2022-01-14   Spoon     30
102        Luke      2022-01-15   Fork      5
103        John      2022-01-20   Orange    13
103        John      2022-01-25   Pear      12
This is proving tricky, any ideas?
Based on the table you provided, I assume that when a receipt has multiple rows whose amounts net to a positive total, you want only the row with the earliest date, carrying that net amount.
;WITH cte AS (
SELECT *
, SUM( amount) OVER (PARTITION BY RecieptNo ORDER BY RecieptNo, ReceiptDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS ActualAmount
, ROW_NUMBER() OVER (PARTITION BY RecieptNo ORDER BY RecieptNo, ReceiptDate) AS rn
FROM table1)
SELECT RecieptNo, Customer, ReceiptDate, ItemDesc, ActualAmount
FROM cte
WHERE ActualAmount > 0 AND rn = 1
It is worth reading up on window functions and CTEs, though.

How to distribute sales with partitions

I have 2 tables:
1st table columns: ItemCode int, Amount float (I have over 1000 ItemCodes)
2nd table columns: ItemCode int, SoldAmount float, Price float (I have over 10000 sale rows for different items)
Example:
ItemCode 1528's Amount in the 1st table is 244. That item's sales in the 2nd table are as below:
Sale 1 Amount = 120, Price = 10
Sale 2 Amount = 120, Price = 30
Sale 3 Amount = 100, Price = 20
Sale 4 Amount = 10, Price = 25
ItemCode  Amount
1528      244
1530      150
ItemCode  Date        Amount  Price
1528      2021.11.01  120     10
1530      2021.10.01  120     30
1528      2021.09.01  100     20
1530      2021.08.01  10      25
I tried a cursor and a loop, but did not get the desired output.
The desired outcome is to distribute the amount across the sales in order. For example, distributing an amount of 100:
Sale 1 Amount 60: 100 - 60 = 40 with price 5 --- So we continue to the next row and subtract whatever is left
Sale 2 Amount 30: 40 - 30 = 10 with price 6 --- So we continue to the next row and subtract whatever is left
Sale 3 Amount 20: 10 - 20 = -10 with price 7 --- So we stop here as the amount is equal to 0 or below.
As the result we should get this:
60 * 5 = 300
30 * 6 = 180
10 * 7 = 70 (that 10 is derived from whatever could be subtracted before it hits 0)
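The arithmetic in this worked example can be written as a short loop. This is only a procedural sketch of the rule as described (take each sale in order, cap the last one at whatever is left); the function name allocate and the tuple layout are assumptions for illustration, not part of the original schema.

```python
def allocate(amount, sales):
    """Spread `amount` over (sale_amount, price) pairs in order until it runs out."""
    allocations = []
    remaining = amount
    for sale_amount, price in sales:
        if remaining <= 0:
            break  # nothing left to distribute
        taken = min(sale_amount, remaining)  # the last sale may be only partly covered
        allocations.append((taken, price, taken * price))
        remaining -= taken
    return allocations


result = allocate(100, [(60, 5), (30, 6), (20, 7)])
for taken, price, line_total in result:
    print(taken, price, line_total)
```

For the example above this yields the three lines 60 * 5 = 300, 30 * 6 = 180 and 10 * 7 = 70. A set-based SQL version expresses the same idea with a running total per item instead of a loop.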
Desired table as below
ItemCode  Date        Amount  Price
1528      2021.11.01  120     10
1528      2021.10.01  120     30
1528      2021.09.01  4       20
My last attempt was as below
WITH CTE AS (
SELECT ItemCode, SUM(Amount) AS Amount
FROM table 1
GROUP BY STOCKREF )
SELECT *,
IIF(LAG(table1.Amount - table2.amount) OVER (PARTITION BY table1.Amount ORDER BY Date DESC) IS NULL, table1.Amount - table2.amount,
LAG(table1.Amount - table2.amount) OVER (PARTITION BY CTE.ItemCode ORDER BY Date DESC) - table2.AMOUNT) AS COL
FROM CTE JOIN (SELECT ItemCode, DATE_, AMOUNT, PRICE FROM table2) table 2 ON table1.ItemCode = table2.Amount
Hopefully this addresses the right question - if you're trying to create a running total per item_code, deducting the sale quantity from starting inventory from first-to-last sale, maybe this would work:
CREATE TABLE #items (item_code INT,
item_amount INT);
INSERT INTO #items (item_code, item_amount)
VALUES (1528, 244);
INSERT INTO #items (item_code, item_amount)
VALUES (1529, 240);
CREATE TABLE #sales (item_code INT,
sale_date DATE,
sale_amount INT,
sale_price DECIMAL(12,2));
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-12-01', 50, 5);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-11-29', 120, 6.76292);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-11-15', 120, 6.6453);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-11-01', 100, 6.96875);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-30', 48, 7.2);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-18', 48, 3.5);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-09', 96, 3.9);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-05', 96, 3.75);
;WITH all_sales_with_running_totals AS ( --Calculate the running total of each item, deducting sale amount from total starting amount, in order of first sale to last
SELECT s.item_code,
s.sale_date,
s.sale_price,
i.item_amount AS starting_amount,
s.sale_amount,
i.item_amount - SUM(sale_amount) OVER(PARTITION BY s.item_code
ORDER BY s.sale_date
ROWS UNBOUNDED PRECEDING
) AS running_sale_amount
FROM #sales AS s
JOIN #items AS i ON s.item_code = i.item_code
),
sales_with_prev_running_total AS ( --Add the previous rows' running total, to assist with the final calculation
SELECT item_code,
sale_date,
sale_price,
starting_amount,
sale_amount,
running_sale_amount,
LAG(running_sale_amount, 1, NULL) OVER(PARTITION BY item_code
ORDER BY sale_date
)AS prev_running_sale_amount
FROM all_sales_with_running_totals
)
SELECT item_code, --Return the final running sale amount for each sale - if the inventory has already run out, return null. If there is insufficient inventory to fill the order, fill it with the qty remaining. Otherwise, fill the entire order.
sale_date,
sale_price,
starting_amount,
sale_amount,
running_sale_amount,
prev_running_sale_amount,
CASE WHEN prev_running_sale_amount <= 0
THEN NULL
WHEN running_sale_amount < 0
THEN prev_running_sale_amount
ELSE sale_amount
END AS result_sale_amount
FROM sales_with_prev_running_total;

Exclude null or zeroes in previous trading days average calculation

I thought I had this solved, but I do not. I am working with trading data and need the average stock price over trading days only. I used the query below for a 3-day average, but recently found out that there can be dividends on a trading holiday, so for those days the fact table has a row for the stock code with a closeprice that is either zero or NULL.
Please help me improve my query so that it ignores zeros and NULLs when averaging the 3 preceding trading days.
select StockCode, datekey, ClosePrice,
AVG(ClosePrice) OVER (partition by StockCode order by datekey
ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) Avg3Days
from Fact
You can partition by StockCode AND sign(NullIf([ClosePrice],0)) rather than having to know the trading days.
Example
Declare @YourTable Table ([datekey] date,[StockCode] varchar(50),[ClosePrice] money)
Insert Into @YourTable Values
('2019-06-15','xyx',5)
,('2019-06-16','xyx',10)
,('2019-06-17','xyx',NULL)
,('2019-06-18','xyx',0)
,('2019-06-19','xyx',15)
,('2019-06-20','xyx',20)
Select *
,AvgPrice = AVG(ClosePrice) over (partition by StockCode,sign(NullIf([ClosePrice],0)) order By datekey rows between 3 preceding and 1 preceding )
from @YourTable
Order By datekey
Returns
datekey StockCode ClosePrice AvgPrice
2019-06-15 xyx 5.00 NULL
2019-06-16 xyx 10.00 5.00
2019-06-17 xyx NULL NULL
2019-06-18 xyx 0.00 NULL
2019-06-19 xyx 15.00 7.50
2019-06-20 xyx 20.00 10.00
Update
A little uglier, but perhaps something like this
Select *
,AvgPrice = case when sum(1) over (partition by StockCode,sign(NullIf([ClosePrice],0)) order By datekey rows between 3 preceding and 1 preceding ) = 3
then avg(ClosePrice) over (partition by StockCode,sign(NullIf([ClosePrice],0)) order By datekey rows between 3 preceding and 1 preceding )
else null end
from @YourTable
Order By datekey
Assuming you have a flag that indicates trading days, you can do something like this:
SELECT StockCode, datekey, ClosePrice,
(CASE WHEN isTradingDay = 1
THEN AVG(ClosePrice) OVER (PARTITION BY StockCode, isTradingDay
ORDER BY datekey
ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING
)
END) as Avg3Days
FROM Fact;
This takes the average of the previous three trading days. The value is NULL on non-trading days.
If the ClosePrice is NULL, it will not be included in the average anyway. If the only indicator is the closePrice, then one method is:
SELECT f.StockCode, f.datekey, f.ClosePrice,
(CASE WHEN v.isTradingDay = 1
THEN AVG(f.ClosePrice) OVER (PARTITION BY f.StockCode, v.isTradingDay
ORDER BY f.datekey
ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING
)
END) as Avg3Days
FROM Fact f CROSS APPLY
(VALUES (CASE WHEN f.ClosePrice > 0 THEN 1 ELSE 0 END)
) v(isTradingDay);
Personally, I would prefer to have an explicit trading day indicator rather than relying on special values of the close price. For instance, trading on a single stock might be suspended for some company-specific reason.
You may want to also have WHERE f.StockCode <> '' to filter out invalid stock codes.
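As a quick check of the close-price-based variant, here is a sketch that runs an equivalent query on the six sample rows used earlier in this thread. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption; SQLite 3.25 or later is required for window functions), so the CROSS APPLY (VALUES ...) trick is replaced by a derived table that computes the hypothetical isTradingDay flag.

```python
import sqlite3

# In-memory SQLite stand-in for the Fact table, loaded with the sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Fact (datekey TEXT, StockCode TEXT, ClosePrice REAL);
INSERT INTO Fact VALUES
    ('2019-06-15', 'xyx', 5), ('2019-06-16', 'xyx', 10),
    ('2019-06-17', 'xyx', NULL), ('2019-06-18', 'xyx', 0),
    ('2019-06-19', 'xyx', 15), ('2019-06-20', 'xyx', 20);
""")

rows = conn.execute("""
SELECT datekey, ClosePrice,
       CASE WHEN isTradingDay = 1
            THEN AVG(ClosePrice) OVER (PARTITION BY StockCode, isTradingDay
                                       ORDER BY datekey
                                       ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING)
       END AS Avg3Days
FROM (
    -- a zero or NULL close price marks a non-trading day
    SELECT *, CASE WHEN ClosePrice > 0 THEN 1 ELSE 0 END AS isTradingDay
    FROM Fact
) AS f
ORDER BY datekey
""").fetchall()

for row in rows:
    print(row)
```

Because non-trading rows sit in their own partition, they never occupy a slot in the 3-row frame of a trading day, and the CASE expression leaves their own average as NULL.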

SQL -- Get items responsible for top 50% of sales

I have a table like this:
ITEM_SALES
ITEM_NAME SALES
Item_name_1 5000
...
Item_name_x 3
What I want is the items that represent the top 50% of sales. For example, if total sales were 10,000, item_name_1 alone would represent 50% of sales.
I can obviously get total sales with:
select sum(sales) from ITEM_SALES.
...and then divide by 2 to get how many sales 50% of sales is.
However, I don't know how I'd go from there to getting the top items that represent 50% of sales.
You can do this using analytic functions:
select s.*
from (select item_name, sum(sales) as sumsales,
sum(sum(sales)) over (order by sum(sales) desc) as cumesales,
sum(sum(sales)) over () as totsales
from item_sales
group by item_name
) s
where (cumesales - sumsales) < 0.5 * totsales;
The subquery calculates the sales for each item, as well as two other values:
The cumulative sales, from highest to that item.
The total sales.
The where clause then gets items up to and including the one that passes the 50% threshold.
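To see how the (cumesales - sumsales) condition behaves at the threshold, here is a small sketch on made-up data (items a to d with sales 40, 30, 20, 10 are assumptions for illustration). It runs through Python's sqlite3 module (SQLite 3.25 or later), does the per-item aggregation in an inner query, and adds item_name as a tie-breaker in the ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_sales (item_name TEXT, sales INTEGER);
INSERT INTO item_sales VALUES ('a', 40), ('b', 30), ('c', 20), ('d', 10);
""")

rows = conn.execute("""
SELECT item_name, sumsales, cumesales
FROM (
    SELECT item_name, sumsales,
           -- running total from the best-selling item down to this one
           SUM(sumsales) OVER (ORDER BY sumsales DESC, item_name
                               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumesales,
           SUM(sumsales) OVER () AS totsales
    FROM (SELECT item_name, SUM(sales) AS sumsales
          FROM item_sales
          GROUP BY item_name) AS per_item
) AS s
-- keep an item while the total *before* it is still under half
WHERE (cumesales - sumsales) < 0.5 * totsales
ORDER BY cumesales
""").fetchall()

for row in rows:
    print(row)
```

Here item b is kept even though its cumulative total (70) is above half of 100, because the items before it only reach 40. A filter of the form cumesales <= totsales / 2 would stop at item a instead.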
Oracle Setup:
CREATE TABLE ITEM_SALES ( ITEM_NAME, SALES ) AS
SELECT 'item_name_' || LEVEL, 50 - 3 * (LEVEL - 1)
FROM DUAL
CONNECT BY LEVEL <= 16;
Query:
SELECT *
FROM (
SELECT ITEM_NAME,
SALES,
SUM( SALES ) OVER ( ORDER BY SALES DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS cumulative_sales,
SUM( SALES ) OVER ( ORDER BY SALES DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS total_sales
FROM ITEM_SALES
)
WHERE cumulative_sales <= total_sales/2;
Results:
ITEM_NAME SALES CUMULATIVE_SALES TOTAL_SALES
------------ ----- ---------------- -----------
item_name_1 50 50 440
item_name_2 47 97 440
item_name_3 44 141 440
item_name_4 41 182 440
item_name_5 38 220 440