Calculate balance amount in redshift - sql

DB-Fiddle
CREATE TABLE inventory (
id SERIAL PRIMARY KEY,
stock_date DATE,
inbound_quantity INT,
outbound_quantity INT
);
INSERT INTO inventory
(stock_date, inbound_quantity, outbound_quantity
)
VALUES
('2020-01-01', '900', '0'),
('2020-01-02', '0', '300'),
('2020-01-03', '400', '250'),
('2020-01-04', '0', '100'),
('2020-01-05', '700', '500');
Expected Output:
stock_date | inbound_quantity | outbound_quantity | balance
-----------|------------------|-------------------|--------
2020-01-01 | 900              | 0                 | 900
2020-01-02 | 0                | 300               | 600
2020-01-03 | 400              | 250               | 750
2020-01-04 | 0                | 100               | 650
2020-01-05 | 700              | 500               | 850
Query:
SELECT
iv.stock_date AS stock_date,
iv.inbound_quantity AS inbound_quantity,
iv.outbound_quantity AS outbound_quantity,
SUM(iv.inbound_quantity - iv.outbound_quantity) OVER (ORDER BY stock_date ASC) AS Balance
FROM inventory iv
GROUP BY 1,2,3
ORDER BY 1;
With the above query I am able to calculate the running balance of inbound_quantity minus outbound_quantity in PostgreSQL.
However, when I run the same query in Amazon-Redshift I get this error:
Amazon Invalid operation: Aggregate window functions with an ORDER BY clause require a frame clause;
1 statement failed.
How do I need to change the query to also make it work in Redshift?

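As the error message says, you need to add a frame clause, namely ROWS UNBOUNDED PRECEDING, to your window function. PostgreSQL falls back to a default frame when none is given, whereas Redshift requires an explicit one for aggregate window functions that have an ORDER BY.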
SELECT iv.stock_date AS stock_date,
iv.inbound_quantity AS inbound_quantity,
iv.outbound_quantity AS outbound_quantity,
SUM(iv.inbound_quantity - iv.outbound_quantity) OVER (
ORDER BY stock_date ASC
ROWS UNBOUNDED PRECEDING
) AS Balance
FROM inventory iv
GROUP BY 1,2,3
ORDER BY 1;
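For reference, the same running total can be written with the frame spelled out in full. This is only a sketch against the question's inventory table. It assumes one row per stock_date, which is why the GROUP BY is dropped here.

```sql
-- ROWS UNBOUNDED PRECEDING is shorthand for the frame written out below.
-- Assumption: inventory holds one row per stock_date, so no GROUP BY is needed.
SELECT
    iv.stock_date,
    iv.inbound_quantity,
    iv.outbound_quantity,
    SUM(iv.inbound_quantity - iv.outbound_quantity) OVER (
        ORDER BY iv.stock_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS balance
FROM inventory iv
ORDER BY iv.stock_date;
```

Because the frame is counted in rows rather than in value ranges, each row's balance includes exactly the rows up to and including itself.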

Related

Calculate balance amount per product in redshift

DB-Fiddle
CREATE TABLE inventory (
id SERIAL PRIMARY KEY,
stock_date DATE,
product VARCHAR(255),
inbound_quantity INT,
outbound_quantity INT
);
INSERT INTO inventory
(stock_date, product, inbound_quantity, outbound_quantity
)
VALUES
('2020-01-01', 'Product_A', '900', '0'),
('2020-01-02', 'Product_A', '0', '300'),
('2020-01-03', 'Product_A', '400', '250'),
('2020-01-04', 'Product_A', '0', '100'),
('2020-01-05', 'Product_A', '700', '500'),
('2020-01-03', 'Product_B', '740', '0'),
('2020-01-08', 'Product_B', '100', '120'),
('2020-02-20', 'Product_B', '0', '360'),
('2020-02-25', 'Product_B', '410', '230'),
('2020-03-09', 'Product_B', '290', '0');
Expected Result:
stock_date | product   | inbound_quantity | outbound_quantity | balance
-----------|-----------|------------------|-------------------|--------
2020-01-01 | Product_A | 900              | 0                 | 900
2020-01-02 | Product_A | 0                | 300               | 600
2020-01-03 | Product_A | 400              | 250               | 750
2020-01-04 | Product_A | 0                | 100               | 650
2020-01-05 | Product_A | 700              | 500               | 850
2020-01-03 | Product_B | 740              | 0                 | 740
2020-01-08 | Product_B | 100              | 120               | 720
2020-02-20 | Product_B | 0                | 360               | 360
2020-02-25 | Product_B | 410              | 230               | 540
2020-03-09 | Product_B | 290              | 0                 | 830
I want to calculate the balance per product.
So far I have developed the query below, but it does not work.
I get the error: window "product" does not exist.
SELECT
iv.stock_date AS stock_date,
iv.product AS product,
iv.inbound_quantity AS inbound_quantity,
iv.outbound_quantity AS outbound_quantity,
SUM(iv.inbound_quantity - iv.outbound_quantity) OVER
(product ORDER BY stock_date ASC ROWS UNBOUNDED PRECEDING) AS Balance
FROM inventory iv
GROUP BY 1,2,3,4
ORDER BY 2,1;
How do I need to modify the query to make it work?
You are almost there. You need to add PARTITION BY in front of product, so the query should look like this:
SELECT
iv.stock_date AS stock_date,
iv.product AS product,
iv.inbound_quantity AS inbound_quantity,
iv.outbound_quantity AS outbound_quantity,
SUM(iv.inbound_quantity - iv.outbound_quantity) OVER
(PARTITION BY product ORDER BY stock_date ASC ROWS UNBOUNDED PRECEDING) AS Balance
FROM inventory iv
GROUP BY 1,2,3,4
ORDER BY 2,1;
The error message window "product" does not exist occurs because a bare identifier at the start of the OVER clause is interpreted as the name of a window defined elsewhere in the query (in a WINDOW clause), and no window named "product" exists.
To fix this, put PARTITION BY in front of product in the OVER clause, so that the SUM function calculates the balance per product.
Try this query:
SELECT
iv.stock_date AS stock_date,
iv.product AS product,
iv.inbound_quantity AS inbound_quantity,
iv.outbound_quantity AS outbound_quantity,
SUM(iv.inbound_quantity - iv.outbound_quantity) OVER
(PARTITION BY product ORDER BY stock_date ASC ROWS UNBOUNDED PRECEDING) AS Balance
FROM inventory iv
GROUP BY 1,2,3,4
ORDER BY 2,1;
This way, the SUM function accumulates inbound_quantity - outbound_quantity separately for each product instead of across all products.
The query will return the expected result.
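To illustrate why a bare name inside OVER (...) is looked up as a window, here is a sketch that defines such a named window on purpose. It uses PostgreSQL's WINDOW clause; whether Redshift accepts this clause is not confirmed here, and the window name per_product is made up for the example.

```sql
-- PostgreSQL sketch: per_product is a hypothetical window name.
-- A bare identifier at the start of OVER (...) refers to a window like this one,
-- which is why OVER (product ORDER BY ...) looked for a window called "product".
SELECT
    iv.stock_date,
    iv.product,
    iv.inbound_quantity,
    iv.outbound_quantity,
    SUM(iv.inbound_quantity - iv.outbound_quantity) OVER (
        per_product
        ORDER BY iv.stock_date
        ROWS UNBOUNDED PRECEDING
    ) AS balance
FROM inventory iv
WINDOW per_product AS (PARTITION BY iv.product)
ORDER BY iv.product, iv.stock_date;
```

The named window carries only the PARTITION BY, and the OVER clause adds the ordering and the frame on top of it.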

Generate billing balance depending on the number of payments made

Below is the table I have created and inserted values in it:
CREATE TABLE Invoices
(
InvID int,
InvAmount int
)
GO
INSERT INTO Invoices
VALUES (1, 543), (2, 749)
CREATE TABLE payments
(
PayID int IDENTITY (1, 1),
InvID int,
PayAmount int,
PayDate date
)
INSERT INTO payments
VALUES (1, 20, '2016-01-01'),
(1, 35, '2016-01-07'),
(1, 78, '2016-01-13'),
(1, 52, '2016-01-25'),
(2, 40, '2016-01-03'),
(2, 54, '2016-01-15'),
(2, 63, '2016-01-17'),
(2, 59, '2016-01-28')
SELECT * FROM Invoices
SELECT * FROM payments
The Invoices table lists the customer billings (the first billing totals 543, the second billing totals 749).
The payments table lists the payments the customer made against each billing. For example, on January 1st the customer paid 20 USD toward billing no. 1 (which totals 543 USD), and on January 3rd the customer paid 40 USD toward billing no. 2 (which totals 749 USD).
Now the question is:
Write a query that displays the billing balance, based on the number of payments made so far.
The query result should look exactly like this: [screenshot of the expected result]
This is what I have tried:
SELECT
payments.InvID,
InvAmount - SUM(PayAmount) OVER (PARTITION BY payments.InvID ORDER BY PayID
ROWS BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING) AS 'InvAmount',
PayDate, PayAmount,
InvAmount - SUM(PayAmount) OVER (PARTITION BY payments.InvID ORDER BY PayID) AS 'Balance'
FROM
Invoices
JOIN
payments ON payments.InvID = Invoices.InvID
After running the query, I got the following result: [screenshot of the query result]
I nearly got the result I wanted.
The only problem is that InvAmount returns exactly the same values as Balance. I am not able to retain the starting values of InvAmount, which are 543 (InvID = 1) and 749 (InvID = 2) respectively.
How can this issue be solved?
You can add the current row's PayAmount back into the calculation:
InvAmount
+ PayAmount
- SUM(PayAmount) OVER (PARTITION BY payments.InvID
ORDER BY PayID
ROWS BETWEEN UNBOUNDED PRECEDING
AND 0 FOLLOWING) AS InvAmount
Alternatively, use ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING. In that case you need to handle the NULL that the sum returns for the very first row of each invoice, whose frame is empty:
InvAmount
- ISNULL(SUM(PayAmount) OVER (PARTITION BY payments.InvID
ORDER BY PayID
ROWS BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING), 0) AS InvAmount
db<>fiddle demo
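Put together, the second variant slots into the full query roughly as follows. This is a sketch that assumes the Invoices and payments tables from the question and SQL Server syntax (ISNULL). The aliases i and p are introduced here for brevity.

```sql
-- Sketch for SQL Server, using the question's Invoices and payments tables.
SELECT
    p.InvID,
    -- Amount still open before this payment: the frame stops one row earlier,
    -- so the first payment of each invoice sees an empty frame (NULL -> 0).
    i.InvAmount - ISNULL(SUM(p.PayAmount) OVER (
                             PARTITION BY p.InvID
                             ORDER BY p.PayID
                             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS InvAmount,
    p.PayDate,
    p.PayAmount,
    -- Amount still open after this payment.
    i.InvAmount - SUM(p.PayAmount) OVER (
                      PARTITION BY p.InvID
                      ORDER BY p.PayID
                      ROWS UNBOUNDED PRECEDING) AS Balance
FROM Invoices AS i
JOIN payments AS p ON p.InvID = i.InvID
ORDER BY p.InvID, p.PayID;
```

On the first payment row of each invoice, InvAmount keeps the original invoice total (543 and 749 in the sample data), and on later rows it equals the previous row's Balance.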
You can use a PARTITION BY clause on the invoice identifier and then deduct the running payment total from the invoice amount, as shown below:
;WITH CTE_Balance as
(
SELECT i.InvID, i.InvAmount, p.PayAmount, p.PayDate
, InvAmount - sum(payamount) over (partition by p.invid order by paydate rows between unbounded preceding and current row) as balance
, ROW_NUMBER() over(partition by p.invid order by p.paydate) as rnk
FROM payments as p
inner join Invoices as i
on i.InvID = p.InvID
)
SELECT invid, case when rnk =1 then invamount else lag(balance) over(partition by invid order by paydate) end as invamount
,payAmount, paydate, balance
FROM CTE_Balance
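invid | invamount | payAmount | paydate    | balance
------|-----------|-----------|------------|--------
1     | 543       | 20        | 2016-01-01 | 523
1     | 523       | 35        | 2016-01-07 | 488
1     | 488       | 78        | 2016-01-13 | 410
1     | 410       | 52        | 2016-01-25 | 358
2     | 749       | 40        | 2016-01-03 | 709
2     | 709       | 54        | 2016-01-15 | 655
2     | 655       | 63        | 2016-01-17 | 592
2     | 592       | 59        | 2016-01-28 | 533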

Filter Table results Self Join

Imagine a large table that contains receipt information. Since it holds so much data, you are required to return a subset of the data, excluding or consolidating rows where possible.
Here is the SQL and results table showing how the data should be returned.
create table table1
(RecieptNo smallint, Customer varchar(10), ReceiptDate date,
ItemDesc varchar(10), Amount smallint)
insert into table1 values
(100, 'Matt','2022-01-05','Ball', 10),
(101, 'Mark','2022-01-07','Hat', 20),
(101, 'Mark','2022-01-07','Jumper', -20),
(101, 'Mark','2022-01-14','Spoon', 30),
(102, 'Luke','2022-01-15','Fork', 15),
(102, 'Luke','2022-01-17','Spork', -10),
(103, 'John','2022-01-20','Orange', 13),
(103, 'John','2022-01-25','Pear', 12)
If rows on the same receipt have negative and positive amounts that cancel out, do not return either row.
If a receipt has a negative amount that does not exceed the positive amount, the negative amount should be deducted from the positive line.
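RecieptNo | Customer | ReceiptDate | ItemDesc | Amount
----------|----------|-------------|----------|-------
100       | Matt     | 2022-01-05  | Ball     | 10
101       | Mark     | 2022-01-14  | Spoon    | 30
102       | Luke     | 2022-01-15  | Fork     | 5
103       | John     | 2022-01-20  | Orange   | 13
103       | John     | 2022-01-25  | Pear     | 12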
This is proving tricky, any ideas?
Based on the table you provided, I assume that when a receipt has multiple rows that net to a positive amount after deduction, you want only the row with the earliest date.
;WITH cte AS (
SELECT *
, SUM( amount) OVER (PARTITION BY RecieptNo ORDER BY RecieptNo, ReceiptDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS ActualAmount
, ROW_NUMBER() OVER (PARTITION BY RecieptNo ORDER BY RecieptNo, ReceiptDate) AS rn
FROM table1)
SELECT RecieptNo, Customer, ReceiptDate, ItemDesc, ActualAmount
FROM cte
WHERE ActualAmount > 0 AND rn = 1
It is worth reading up on window functions and CTEs, though.
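As an alternative sketch, the two rules from the question can be applied row by row instead of collapsing each receipt to one row. It rests on assumptions that the question does not state: a negative row cancels a positive row only when the absolute amounts are identical, and any remaining negative total is deducted from the earliest remaining positive row, which is assumed to be large enough. The names numbered, unmatched, ranked, k, neg_total and rn are introduced here for illustration.

```sql
WITH numbered AS (
    -- Number rows that share receipt, absolute amount and sign,
    -- so each negative row can pair with at most one positive row.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY RecieptNo, ABS(Amount), SIGN(Amount)
                              ORDER BY ReceiptDate, ItemDesc) AS k
    FROM table1
),
unmatched AS (
    -- Drop both rows of every pair that cancels out exactly.
    SELECT n.*
    FROM numbered n
    WHERE NOT EXISTS (SELECT 1
                      FROM numbered m
                      WHERE m.RecieptNo = n.RecieptNo
                        AND m.Amount = -n.Amount
                        AND m.k = n.k)
),
ranked AS (
    -- Total the leftover negatives per receipt and rank positive rows first.
    SELECT u.*,
           SUM(CASE WHEN Amount < 0 THEN Amount ELSE 0 END)
               OVER (PARTITION BY RecieptNo) AS neg_total,
           ROW_NUMBER() OVER (PARTITION BY RecieptNo
                              ORDER BY CASE WHEN Amount > 0 THEN 0 ELSE 1 END,
                                       ReceiptDate, ItemDesc) AS rn
    FROM unmatched u
)
SELECT RecieptNo, Customer, ReceiptDate, ItemDesc,
       -- Deduct the leftover negatives from the earliest positive row only.
       CASE WHEN rn = 1 THEN Amount + neg_total ELSE Amount END AS Amount
FROM ranked
WHERE Amount > 0
ORDER BY RecieptNo, ReceiptDate;
```

With the sample rows, Hat and Jumper cancel out on receipt 101, Spork reduces Fork to 5 on receipt 102, and both rows of receipt 103 are returned unchanged.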

How to distribute sales with partitions

I have 2 tables:
1st table columns: ItemCode int, Amount float (I have over 1000 ItemCodes)
2nd table columns: ItemCode int, SoldAmount float, Price float (I have over 10000 sale rows for different items)
Example:
The Amount for ItemCode 1528 in the 1st table is 244. That item's sales in the 2nd table are as follows:
Sale 1 Amount = 120, Price = 10
Sale 2 Amount = 120, Price = 30
Sale 3 Amount = 100, Price = 20
Sale 4 Amount = 10, Price = 25
ItemCode | Amount
---------|-------
1528     | 244
1530     | 150

ItemCode | Date       | Amount | Price
---------|------------|--------|------
1528     | 2021.11.01 | 120    | 10
1530     | 2021.10.01 | 120    | 30
1528     | 2021.09.01 | 100    | 20
1530     | 2021.08.01 | 10     | 25
I tried a cursor and a loop, but did not get the desired output.
The desired outcome is to distribute the amount across the sales. For example, with an amount of 100:
Sale 1 Amount 60: 100 - 60 = 40 with price 5 --- so we continue to the next row and subtract whatever is left
Sale 2 Amount 30: 40 - 30 = 10 with price 6 --- so we continue to the next row and subtract whatever is left
Sale 3 Amount 20: 10 - 20 = -10 with price 7 --- so we stop here, as the remaining amount is 0 or below.
As the result we should get this:
60 * 5 = 300
30 * 6 = 180
10 * 7 = 70 (that 10 is derived from whatever could be subtracted before it hits 0)
Desired table as below
ItemCode | Date       | Amount | Price
---------|------------|--------|------
1528     | 2021.11.01 | 120    | 10
1528     | 2021.10.01 | 120    | 30
1528     | 2021.09.01 | 4      | 20
My last attempt was as below
WITH CTE AS (
SELECT ItemCode, SUM(Amount) AS Amount
FROM table 1
GROUP BY STOCKREF )
SELECT *,
IIF(LAG(table1.Amount - table2.amount) OVER (PARTITION BY table1.Amount ORDER BY Date DESC) IS NULL, table1.Amount - table2.amount,
LAG(table1.Amount - table2.amount) OVER (PARTITION BY CTE.ItemCode ORDER BY Date DESC) - table2.AMOUNT) AS COL
FROM CTE JOIN (SELECT ItemCode, DATE_, AMOUNT, PRICE FROM table2) table 2 ON table1.ItemCode = table2.Amount
Hopefully this addresses the right question - if you're trying to create a running total per item_code, deducting the sale quantity from starting inventory from first-to-last sale, maybe this would work:
CREATE TABLE #items (item_code INT,
item_amount INT);
INSERT INTO #items (item_code, item_amount)
VALUES (1528, 244);
INSERT INTO #items (item_code, item_amount)
VALUES (1529, 240);
CREATE TABLE #sales (item_code INT,
sale_date DATE,
sale_amount INT,
sale_price DECIMAL(12,2));
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-12-01', 50, 5);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-11-29', 120, 6.76292);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-11-15', 120, 6.6453);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1528, '2021-11-01', 100, 6.96875);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-30', 48, 7.2);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-18', 48, 3.5);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-09', 96, 3.9);
INSERT INTO #sales (item_code, sale_date, sale_amount, sale_price)
VALUES (1529, '2021-11-05', 96, 3.75);
;WITH all_sales_with_running_totals AS ( --Calculate the running total of each item, deducting sale amount from total starting amount, in order of first sale to last
SELECT s.item_code,
s.sale_date,
s.sale_price,
i.item_amount AS starting_amount,
s.sale_amount,
i.item_amount - SUM(sale_amount) OVER(PARTITION BY s.item_code
ORDER BY s.sale_date
ROWS UNBOUNDED PRECEDING
) AS running_sale_amount
FROM #sales AS s
JOIN #items AS i ON s.item_code = i.item_code
),
sales_with_prev_running_total AS ( --Add the previous rows' running total, to assist with the final calculation
SELECT item_code,
sale_date,
sale_price,
starting_amount,
sale_amount,
running_sale_amount,
LAG(running_sale_amount, 1, NULL) OVER(PARTITION BY item_code
ORDER BY sale_date
)AS prev_running_sale_amount
FROM all_sales_with_running_totals
)
SELECT item_code, --Return the final running sale amount for each sale - if the inventory has already run out, return null. If there is insufficient inventory to fill the order, fill it with the qty remaining. Otherwise, fill the entire order.
sale_date,
sale_price,
starting_amount,
sale_amount,
running_sale_amount,
prev_running_sale_amount,
CASE WHEN prev_running_sale_amount <= 0
THEN NULL
WHEN running_sale_amount < 0
THEN prev_running_sale_amount
ELSE sale_amount
END AS result_sale_amount
FROM sales_with_prev_running_total;
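As a complementary sketch, the allocation shown in the question's desired table (fill sales newest-first until the stock is used up, trimming the last sale) can also be expressed with a single running total and a filter. Note that this orders sales by date descending, unlike the query above. The inline sample rows are hypothetical and follow the question's desired table for item 1528 (stock 244). The names sold_so_far and allocated_amount are introduced here for illustration.

```sql
WITH items AS (
    -- Hypothetical stock row taken from the question's example.
    SELECT * FROM (VALUES (1528, 244)) AS v(item_code, item_amount)
),
sales AS (
    -- Hypothetical sales rows, newest first.
    SELECT * FROM (VALUES
        (1528, CAST('2021-11-01' AS date), 120, 10.0),
        (1528, CAST('2021-10-01' AS date), 120, 30.0),
        (1528, CAST('2021-09-01' AS date), 100, 20.0),
        (1528, CAST('2021-08-01' AS date),  10, 25.0)
    ) AS v(item_code, sale_date, sale_amount, sale_price)
),
running AS (
    -- Cumulative quantity sold per item, walking from the newest sale backwards.
    SELECT s.*,
           i.item_amount,
           SUM(s.sale_amount) OVER (PARTITION BY s.item_code
                                    ORDER BY s.sale_date DESC
                                    ROWS UNBOUNDED PRECEDING) AS sold_so_far
    FROM sales AS s
    JOIN items AS i ON i.item_code = s.item_code
)
SELECT item_code,
       sale_date,
       -- Full sale while stock lasts, otherwise only what is left.
       CASE WHEN sold_so_far <= item_amount
            THEN sale_amount
            ELSE item_amount - (sold_so_far - sale_amount)
       END AS allocated_amount,
       sale_price
FROM running
-- Keep only sales that start before the stock is exhausted.
WHERE sold_so_far - sale_amount < item_amount
ORDER BY item_code, sale_date DESC;
```

For these rows the allocated amounts are 120, 120 and 4, and the oldest sale is dropped because the stock of 244 is already used up by then. Multiplying allocated_amount by sale_price gives the per-sale values the question describes.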

Assign total value of month to each day of month

DB-Fiddle
CREATE TABLE sales (
id SERIAL PRIMARY KEY,
country VARCHAR(255),
sales_date DATE,
sales_volume DECIMAL,
fix_costs DECIMAL
);
INSERT INTO sales
(country, sales_date, sales_volume, fix_costs
)
VALUES
('DE', '2020-01-03', '500', '0'),
('FR', '2020-01-03', '350', '0'),
('None', '2020-01-31', '0', '2000'),
('DE', '2020-02-15', '0', '0'),
('FR', '2020-02-15', '0', '0'),
('None', '2020-02-29', '0', '5000'),
('DE', '2020-03-27', '180', '0'),
('FR', '2020-03-27', '970', '0'),
('None', '2020-03-31', '0', '4000');
Expected Result:
sales_date | country | sales_volume | fix_costs
--------------|-------------|-------------------|-----------------
2020-01-03 | DE | 500 | 2000
2020-01-03 | FR | 350 | 2000
2020-02-15 | DE | 0 | 5000
2020-02-15 | FR | 0 | 5000
2020-03-27 | DE | 180 | 4000
2020-03-27 | FR | 970 | 4000
As you can see in my table I have a total of fix_costs assigned to the last day of each month.
In my results I want to assign this total of fix_costs to each day of the month.
Therefore, I tried to go with this query:
SELECT
s.sales_date,
s.country,
s.sales_volume,
f.fix_costs
FROM sales s
JOIN
(SELECT
((date_trunc('MONTH', sales_date) + INTERVAL '1 MONTH - 1 DAY')::date) AS month_ld,
SUM(fix_costs) AS fix_costs
FROM sales
WHERE country = 'None'
GROUP BY month_ld) f ON f.month_ld = LAST_DAY(s.sales_date)
WHERE country <> 'None'
GROUP BY 1,2,3;
For this query I get an error on LAST_DAY(s.sales_date), since this function does not exist in PostgreSQL.
However, I have no clue how I can replace it correctly in order to get the expected result.
Can you help me?
(MariaDB Fiddle as comparison)
demos:db<>fiddle
SELECT
s1.sales_date,
s1.country,
s1.sales_volume,
s2.fix_costs
FROM sales s1
JOIN sales s2 ON s1.country <> 'None' AND s2.country = 'None'
AND date_trunc('month', s1.sales_date) = date_trunc('month', s2.sales_date)
You need a self-join. The join conditions are:
First table without None records (s1.country <> 'None')
Second table only None records (s2.country = 'None')
Date: Only consider the year and month part and ignore the day. This can be achieved by normalizing the dates of both tables to the first of the month using date_trunc(). So, e.g., '2020-02-15' results in '2020-02-01' and '2020-02-29' results in '2020-02-01' too, which works well as a comparison and join condition.
Alternatively:
SELECT
*
FROM (
SELECT
sales_date,
country,
sales_volume,
SUM(fix_costs) OVER (PARTITION BY date_trunc('month', sales_date)) as fix_costs
FROM sales
) s
WHERE country <> 'None'
You can use the SUM() window function over the month group produced by date_trunc(), as described above. Then you need to filter out the None records afterwards.
If I understand correctly, use window functions:
select s.*,
sum(fix_costs) over (partition by date_trunc('month', sales_date)) as month_fixed_costs
from sales s;
Note that this assumes that fixed costs are NULL or 0 on other days -- which is true for the data in the question.
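To keep the structure of the original query instead, the LAST_DAY() call can be replaced by the same month-end expression the subquery already builds. The following is a sketch for PostgreSQL, assuming the sales table from the question. The outer GROUP BY is dropped because nothing is aggregated there.

```sql
-- PostgreSQL sketch: month end = first day of the month + 1 month - 1 day.
SELECT
    s.sales_date,
    s.country,
    s.sales_volume,
    f.fix_costs
FROM sales s
JOIN (
    SELECT
        (date_trunc('month', sales_date)
            + INTERVAL '1 month' - INTERVAL '1 day')::date AS month_ld,
        SUM(fix_costs) AS fix_costs
    FROM sales
    WHERE country = 'None'
    GROUP BY 1
) f ON f.month_ld = (date_trunc('month', s.sales_date)
                        + INTERVAL '1 month' - INTERVAL '1 day')::date
WHERE s.country <> 'None'
ORDER BY s.sales_date, s.country;
```

Both sides of the join are normalized to the last day of their month, so '2020-02-15' and '2020-02-29' both map to '2020-02-29'.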