Fetch conditional rows in SQL Server - sql

I need a query like the one below. The ApplicationID and InvoiceNumber columns identify purchases. Negative values in the Revenue column indicate refunds. When a purchase is refunded, the ApplicationID stays the same but the refund gets a different InvoiceNumber. I identify returns by finding rows under the same ApplicationID, with different InvoiceNumbers, whose Revenue values sum to zero. For example, customer A bought 4 products under ApplicationID=11 (InvoiceNumber=AA) and then refunded 2 of them (InvoiceNumber=BB). I want the rows that remain after the refunded purchases are removed. In this example, rows 1-2 and 5-6 cancel each other out for ApplicationID=11, so only rows 3-4 remain. The ApplicationID=22 and ApplicationID=33 rows should also be returned because they contain no refunds. The final result should be rows 3, 4, 7, 8 and 9. How do I do this?
CustomerCode ApplicationID InvoiceNumber Date Revenue
A 11 AA 1.01.2020 150
A 11 AA 2.01.2020 200
A 11 AA 1.01.2020 250
A 11 AA 1.01.2020 300
A 11 BB 5.01.2020 -150
A 11 BB 5.01.2020 -200
A 22 CC 7.02.2020 500
A 22 DD 7.02.2020 700
A 11 AA 2.01.2020 800
I have written out the result I want below. I want to exclude the rows whose Revenue sums to zero within the same CustomerCode and ApplicationID, and fetch all columns of the remaining rows.
example code:
select a.CustomerCode,a.ApplicationID from Table a
group by CustomerCode,a.ApplicationID
having SUM(Revenue)>0
My desired result:
CustomerCode ApplicationID InvoiceNumber Date Revenue
A 11 AA 1.01.2020 250
A 11 AA 1.01.2020 300
A 22 CC 7.02.2020 500
A 22 DD 7.02.2020 700
A 11 AA 2.01.2020 800

I think you've gone down the route of summing your results to remove certain rows from your data, but that isn't necessary.
You can use a LEFT JOIN back to the same table, joining on CustomerCode, ApplicationID and Revenue = -Revenue; this effectively finds "purchase" rows that have an associated "refund" row (and vice versa). You can then filter them out with your WHERE clause.
Here's the code I used:
DROP TABLE IF EXISTS #Orders
CREATE TABLE #Orders (CustomerCode VARCHAR(1), ApplicationID INT, InvoiceNumber VARCHAR(2), [Date] DATE, Revenue INT)
INSERT INTO #Orders (CustomerCode, ApplicationID, InvoiceNumber, Date, Revenue)
VALUES ('A', 11, 'AA', '2020-01-01', 150),
('A', 11, 'AA', '2020-01-02', 200),
('A', 11, 'AA', '2020-01-01', 250),
('A', 11, 'AA', '2020-01-01', 300),
('A', 11, 'BB', '2020-01-05', -150),
('A', 11, 'BB', '2020-01-05', -200),
('A', 22, 'CC', '2020-01-07', 500),
('A', 22, 'DD', '2020-01-07', 700),
('A', 11, 'AA', '2020-01-02', 800)
SELECT O.CustomerCode, O.ApplicationID, O.InvoiceNumber, O.Date, O.Revenue
FROM #Orders AS O
LEFT JOIN #Orders AS O2 ON O2.ApplicationID = O.ApplicationID AND O2.CustomerCode = O.CustomerCode AND O.Revenue = -O2.Revenue
WHERE O2.ApplicationID IS NULL
And this is the output:
CustomerCode ApplicationID InvoiceNumber Date Revenue
A 11 AA 2020-01-01 250
A 11 AA 2020-01-01 300
A 22 CC 2020-01-07 500
A 22 DD 2020-01-07 700
A 11 AA 2020-01-02 800
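One caveat with the self-join: it removes every purchase whose amount matches any refund, so two purchases of 150 and a single refund of -150 would both disappear. A minimal sketch of one way around this, assuming the n-th purchase of an amount should be paired with the n-th refund of the same amount (the `Numbered` CTE and the `rn` column are names introduced here for illustration, not part of the original answer):

```sql
-- Sketch only: number rows that share the same customer, application and amount,
-- then cancel a purchase only against the refund carrying the same number.
WITH Numbered AS (
    SELECT O.CustomerCode, O.ApplicationID, O.InvoiceNumber, O.[Date], O.Revenue,
           ROW_NUMBER() OVER (
               PARTITION BY O.CustomerCode, O.ApplicationID, O.Revenue
               ORDER BY O.[Date], O.InvoiceNumber
           ) AS rn
    FROM #Orders AS O
)
SELECT P.CustomerCode, P.ApplicationID, P.InvoiceNumber, P.[Date], P.Revenue
FROM Numbered AS P
LEFT JOIN Numbered AS R
       ON R.CustomerCode  = P.CustomerCode
      AND R.ApplicationID = P.ApplicationID
      AND R.Revenue       = -P.Revenue   -- opposite sign, same amount
      AND R.rn            = P.rn         -- pair 1st with 1st, 2nd with 2nd, ...
WHERE R.ApplicationID IS NULL;           -- keep rows that found no partner
```

On the sample data above this returns the same five rows as the original query, since no amount is repeated there.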

Related

How can I get cumulative sum using SQL?

I have a table of individual sales, for which I would like to summarize into two columns, with a monthly total in one and a cumulative sum in another.
Company A and Company B are subsidiary company under the same parent company, thus, need to be considered as one for calculating cumulative income.
I tried this code and output is following:
SUM(INCOME) OVER(PARTITION BY COMPANY ORDER BY MONTH ROWS UNBOUNDED PRECEDING) AS CUMULATIVE
Company Month Income Cumulative
Company A 1 20 20
Company B 1 0 20
Company C 1 20 20
Company A 2 20 40
Company B 2 0 40
But I want to return 0 when Company B has 0 income for cumulative
Company Month Income Cumulative
Company A 1 20 20
Company B 1 0 0
Company C 1 20 20
Company A 2 20 40
Company B 2 0 0
How can I return 0 for cumulative when either company A or company B has income of 0?!
Your table design is poorly suited to what you want.
By adding an identity column for sorting purposes and converting the short month names into numbers, you can achieve it with a window function.
A column for the year is also necessary, as you can't sort correctly when there are two Januaries for the same company.
CREATE TABLE tabl1 (
id_num int IDENTITY(1,1),
Company VARCHAR(9),
Month VARCHAR(3),
Income INTEGER
);
INSERT INTO tabl1
(Company, Month, Income)
VALUES
('Company A', 'Jan', '20'),
('Company B', 'Jan', '0'),
('Company A', 'Feb', '20'),
('Company B', 'Feb', '0'),
('Company C', 'Jan', '20');
5 rows affected
SELECT
Company, Month, Income,
SUM(Income) OVER(PARTITION BY company ORDER BY
CASE WHEN Month= 'Jan' THEN 1
WHEN Month = 'Feb' THEN 2
WHEN Month = 'Dec' THEN 12
END) as Cumulative
FROM
tabl1
ORDER BY id_num
Company Month Income Cumulative
Company A Jan 20 20
Company B Jan 0 0
Company A Feb 20 40
Company B Feb 0 0
Company C Jan 20 20
So a better design would be:
CREATE TABLE tabl1 (
id_num int IDENTITY(1,1),
Company VARCHAR(9),
Month VARCHAR(3),
[Year] int,
Income INTEGER
);
INSERT INTO tabl1
(Company, Month,[Year], Income)
VALUES
('Company A', 'Jan',2023, '20'),
('Company B', 'Jan',2023, '0'),
('Company A', 'Feb',2023, '20'),
('Company B', 'Feb',2023, '0'),
('Company C', 'Jan',2023, '20');
5 rows affected
SELECT
Company, Month,[Year], Income,
SUM(Income) OVER(PARTITION BY company ORDER BY
[Year], -- sort by year first so months from different years do not interleave
CASE WHEN Month= 'Jan' THEN 1
WHEN Month = 'Feb' THEN 2
WHEN Month = 'Dec' THEN 12
END) as Cumulative
FROM
tabl1
ORDER BY id_num
Company Month Year Income Cumulative
Company A Jan 2023 20 20
Company B Jan 2023 0 0
Company A Feb 2023 20 40
Company B Feb 2023 0 0
Company C Jan 2023 20 20
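The CASE expression above only maps Jan, Feb and Dec, so any other month would get NULL as its sort key. A sketch of one alternative, assuming the three-letter English abbreviations used in the sample data, is to join to a small lookup built with a VALUES constructor (the alias `m` and its columns `MonthName` and `MonthNo` are made up for this example):

```sql
-- Sketch only: map month abbreviations to numbers through an inline lookup table.
SELECT t.Company, t.Month, t.[Year], t.Income,
       SUM(t.Income) OVER (
           PARTITION BY t.Company
           ORDER BY t.[Year], m.MonthNo
           ROWS UNBOUNDED PRECEDING
       ) AS Cumulative
FROM tabl1 AS t
JOIN (VALUES ('Jan', 1), ('Feb', 2), ('Mar', 3), ('Apr', 4),
             ('May', 5), ('Jun', 6), ('Jul', 7), ('Aug', 8),
             ('Sep', 9), ('Oct', 10), ('Nov', 11), ('Dec', 12)
     ) AS m (MonthName, MonthNo)
  ON m.MonthName = t.Month
ORDER BY t.id_num;
```

Storing the month as a number (or storing a real date) in the table would remove the need for the lookup altogether.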

Display Average Billing Amount For Each Customer only between years 2019-2021

QUESTION : Display Average Billing Amount For Each Customer ONLY between YEAR(2019-2021).
If customer doesn't have any billing amount for any of the particular year then consider as 0.
-------: OUTPUT :
Customer_ID | Customer_Name | AVG_Billed_Amount
-------------------------------------------------------------------------
1 | A | 87.00
2 | B | 200.00
3 | C | 183.00
--------: EXPLANATION :
If any customer doesn't have any billing records for these 3 years then we need to consider as one record with billing_amount = 0
Like Customer C doesn't have any record for Year 2020, so for C Average will be
(250+300+0)/3 = 183.33 OR 183.00
TEMP TABLE HAS FOLLOWING DATA
DROP TABLE IF EXISTS #TEMP;
CREATE TABLE #TEMP
(
Customer_ID INT
, Customer_Name NVARCHAR(100)
, Billing_ID NVARCHAR(100)
, Billing_creation_Date DATETIME
, Billed_Amount INT
);
INSERT INTO #TEMP
SELECT 1, 'A', 'ID1', TRY_CAST('10-10-2020' AS DATETIME), 100 UNION ALL
SELECT 1, 'A', 'ID2', TRY_CAST('11-11-2020' AS DATETIME), 150 UNION ALL
SELECT 1, 'A', 'ID3', TRY_CAST('12-11-2021' AS DATETIME), 100 UNION ALL
SELECT 2, 'B', 'ID4', TRY_CAST('10-11-2019' AS DATETIME), 150 UNION ALL
SELECT 2, 'B', 'ID5', TRY_CAST('11-11-2020' AS DATETIME), 200 UNION ALL
SELECT 2, 'B', 'ID6', TRY_CAST('12-11-2021' AS DATETIME), 250 UNION ALL
SELECT 3, 'C', 'ID7', TRY_CAST('01-01-2018' AS DATETIME), 100 UNION ALL
SELECT 3, 'C', 'ID8', TRY_CAST('05-01-2019' AS DATETIME), 250 UNION ALL
SELECT 3, 'C', 'ID9', TRY_CAST('06-01-2021' AS DATETIME), 300
-----------------------------------------------------------------------------------
Here, 'A' has 3 transactions - TWICE in year 2020(100+150) and 1 in year 2021(100), but none in 2019(SO, Billed_Amount= 0).
so the average will be calculated as (100+150+100+0)/4
DECLARE @BILL_DATE DATE = (SELECT Billing_creation_Date FROM #TEMP GROUP BY Customer_ID, Billing_creation_Date) /*-- THIS THROWS AN ERROR AS @BILL_DATE WON'T ACCEPT MULTIPLE VALUES.*/
OUTPUT should look like this:
Customer_ID | Customer_Name | AVG_Billed_Amount
1 | A | 87.00
2 | B | 200.00
3 | C | 183.00
You just need a formula to count the number of missing years.
That's 3 - COUNT(DISTINCT YEAR(Billing_creation_Date)).
Then the average = SUM() / (COUNT() + (3 - COUNT(DISTINCT YEAR))):
SELECT
Customer_ID,
Customer_Name,
SUM(Billed_Amount) * 1.0
/
(COUNT(*) + 3 - COUNT(DISTINCT YEAR(Billing_creation_Date)))
AS AVG_Billed_amount
FROM
#temp
WHERE
Billing_creation_Date >= '2019-01-01'
AND Billing_creation_Date < '2022-01-01'
GROUP BY
Customer_ID,
Customer_Name
Demo : https://dbfiddle.uk/ILcfiGWL
Note: The WHERE clause in another answer here would cause a scan of the table, due to hiding the filtered column behind a function. The way I've formed the WHERE clause allows a "Range Seek" if the column is in an index.
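For comparison, here is a sketch of a different way to get the same numbers: build the three years explicitly and LEFT JOIN the bills onto them, so that a year without bills contributes a single row counted as 0. This is not the query from the answer above; the aliases `c`, `y` and `Yr` are introduced for illustration, and DATEFROMPARTS assumes SQL Server 2012 or later.

```sql
-- Sketch only: one row per customer per year, with missing years counted as 0.
SELECT c.Customer_ID,
       c.Customer_Name,
       AVG(COALESCE(t.Billed_Amount, 0) * 1.0) AS AVG_Billed_Amount
FROM (SELECT DISTINCT Customer_ID, Customer_Name FROM #TEMP) AS c
CROSS JOIN (VALUES (2019), (2020), (2021)) AS y (Yr)
LEFT JOIN #TEMP AS t
       ON t.Customer_ID = c.Customer_ID
      AND t.Billing_creation_Date >= DATEFROMPARTS(y.Yr, 1, 1)
      AND t.Billing_creation_Date <  DATEFROMPARTS(y.Yr + 1, 1, 1)
GROUP BY c.Customer_ID, c.Customer_Name;
```

For customer A this averages four rows (0 for 2019, 100 and 150 for 2020, 100 for 2021), which matches the (100+150+100+0)/4 calculation in the question.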
Here is a query that can do that :
select s.Customer_ID, s.Customer_Name, sum(Billed_amount) / (6 - count(1)) as AVG_Billed_Amount from (
select Customer_ID, Customer_Name, sum(Billed_Amount) as Billed_amount
from TEMP
where year(Billing_creation_Date) between 2019 and 2021
group by Customer_ID, Customer_Name, year(Billing_creation_Date)
) as s
group by s.Customer_ID, s.Customer_Name;
According to your description, the average for customer C will be 137.5000, not 183.00, since 2018 is not counted and there are no records for 2020.

Cumulative sum of a column

I have a table that has the below data.
COUNTRY LEVEL NUM_OF_DUPLICATES
US 9 6
US 8 24
US 7 12
US 6 20
US 5 39
US 4 81
US 3 80
US 2 130
US 1 178
US 0 254
I wrote a query that calculates the cumulative sum of the rows and got the output below.
COUNTRY LEVEL NUM_OF_DUPLICATES POOL
US 9 6 6
US 8 24 30
US 7 12 42
US 6 20 62
US 5 39 101
US 4 81 182
US 3 80 262
US 2 130 392
US 1 178 570
US 0 254 824
Now I want to filter the data and keep only the rows where POOL <= 300. If no row has a POOL of exactly 300, I should also take the first value after 300. In the example above there is no POOL value of 300, so we take the next value after 300, which is 392. So I need a query that returns the rows where POOL <= 392 (as per the example above), which gives the following output:
COUNTRY LEVEL NUM_OF_DUPLICATES POOL
US 9 6 6
US 8 24 30
US 7 12 42
US 6 20 62
US 5 39 101
US 4 81 182
US 3 80 262
US 2 130 392
Please let me know your thoughts. Thanks in advance.
declare @t table(Country varchar(5), Level int, Num_of_Duplicates int);
insert into @t(Country, Level, Num_of_Duplicates)
values
('US', 9, 6),
('US', 8, 24),
('US', 7, 12),
('US', 6, 20),
('US', 5, 39),
('US', 4, 81),
('US', 3, 80),
('US', 2, 130/*-92*/),
('US', 1, 178),
('US', 0, 430);
-- diagnostic query: shows the running total and two equivalent "starts before 300" flags
select *, sum(Num_of_Duplicates) over(partition by country order by Level desc) as Pool,
(sum(Num_of_Duplicates) over(partition by country order by Level desc)-Num_of_Duplicates) / 300 as flag,--any row which starts before 300 will have flag=0
--or
case when sum(Num_of_Duplicates) over(partition by country order by Level desc)-Num_of_Duplicates < 300 then 1 else 0 end as startsbefore300
from @t;
-- final query: keep every row whose running total before the row is still below 300
select *
from
(
select *, sum(Num_of_Duplicates) over(partition by country order by Level desc) as Pool
from @t
) as t
where Pool - Num_of_Duplicates < 300 ;
The logic here is quite simple:
Calculate the running sum POOL value up to the current row.
Filter rows so that the previous row's total is < 300. You can either subtract the current row's value or use a second sum.
If the total up to the current row is exactly 300, the previous row's total will be less, so this row will be included.
If the current row's total is more than 300 but the previous row's total is less, then it will also be included.
All higher rows are excluded.
It's unclear what ordering you want. I've used the NUM_OF_DUPLICATES column ascending, but you may want something else.
SELECT
COUNTRY,
LEVEL,
NUM_OF_DUPLICATES,
POOL
FROM (
SELECT *,
POOL = SUM(NUM_OF_DUPLICATES) OVER (ORDER BY NUM_OF_DUPLICATES ROWS UNBOUNDED PRECEDING)
-- alternative calculation
-- ,POOLPrev = SUM(NUM_OF_DUPLICATES) OVER (ORDER BY NUM_OF_DUPLICATES ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
FROM YourTable
) t
WHERE POOL - NUM_OF_DUPLICATES < 300;
-- you could also use POOLPrev above
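The POOLPrev alternative mentioned in the comments could be spelled out roughly as follows. This is a sketch under two assumptions: `YourTable` is the same placeholder name used above, and the running total is ordered by LEVEL descending within each COUNTRY, which is what the expected output in the question suggests. The first row has no preceding rows, so its POOLPrev is NULL and is treated as 0.

```sql
-- Sketch only: filter on the running total of all rows *before* the current one.
SELECT COUNTRY, [LEVEL], NUM_OF_DUPLICATES, POOL
FROM (
    SELECT *,
           SUM(NUM_OF_DUPLICATES) OVER (
               PARTITION BY COUNTRY ORDER BY [LEVEL] DESC
               ROWS UNBOUNDED PRECEDING
           ) AS POOL,
           SUM(NUM_OF_DUPLICATES) OVER (
               PARTITION BY COUNTRY ORDER BY [LEVEL] DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ) AS POOLPrev
    FROM YourTable
) AS t
WHERE COALESCE(POOLPrev, 0) < 300;  -- NULL on the first row means nothing precedes it
```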
I used two temp tables to get the answer.
DECLARE @t TABLE(Country VARCHAR(5), [Level] INT, Num_of_Duplicates INT);
INSERT INTO @t(Country, [Level], Num_of_Duplicates)
VALUES ('US', 9, 6),
('US', 8, 24),
('US', 7, 12),
('US', 6, 20),
('US', 5, 39),
('US', 4, 81),
('US', 3, 80),
('US', 2, 130),
('US', 1, 178),
('US', 0, 254);
-- step 1: number the rows from the highest Level down and store the running total
SELECT
Country
,[Level]
, Num_of_Duplicates
, SUM (Num_of_Duplicates) OVER (ORDER BY id) AS [POOL]
INTO #temp_table
FROM
(
SELECT
Country,
[Level],
Num_of_Duplicates,
ROW_NUMBER() OVER (ORDER BY [Level] DESC) AS id
FROM @t
) AS A;
-- step 2: rank the running totals that reach 300 or more; rank 1 is the cut-off value
SELECT
[POOL],
ROW_NUMBER() OVER (ORDER BY [POOL] ) AS [rank]
INTO #Temp_2
FROM #temp_table
WHERE [POOL] >= 300;
-- step 3: return every row up to and including the cut-off
SELECT *
FROM #temp_table WHERE
[POOL] <= (SELECT [POOL] FROM #Temp_2 WHERE [rank] = 1 );
DROP TABLE #temp_table;
DROP TABLE #Temp_2;

SQL to find inventory

I am very new to SQL and I have the following situation:
BTW, "qoh" = "quantity on hand"
product table:
prod_code qoh
11QER/31 8
13-Q2/P2 32
14-Q1/L3 18
1546-QQ2 15
1558-QW1 23
2232/QTY 8
2232/QWE 6
2238/QPD 12
23109-HB 23
23114-AA 8
54778-2T 43
89-WRE-Q 11
PVC23DRT 188
SM-18277 172
SW-23116 237
WR3/TT3 18
invoiced detail table:
prod_code quantity-sold
13-Q2/P2 1
23109-HB 1
54778-2T 2
2238/QPD 1
1546-QQ2 1
13-Q2/P2 5
54778-2T 3
23109-HB 2
PVC23DRT 12
SM-18277 3
2232/QTY 1
23109-HB 1
89-WRE-Q 1
13-Q2/P2 2
54778-2T 1
PVC23DRT 5
WR3/TT3 3
23109-HB 1
I want to write a SQL query to find the current inventory, i.e. the QOH in the product table minus the related quantity sold:
prod_code quantity-sold
13-Q2/P2 24 (32 - 1 - 5 - 2)
54778-2T 37 (43 - 2 - 3 - 1)
I tried to use join tables etc and the results did not come out right.
How to write the SQL?
Much appreciated!
Philip
A correlated subquery is a simple way to write this:
select p.*,
p.qoh - (select coalesce(sum(id.quantity_sold), 0)
from invoice_detail id
where id.prod_code = p.prod_code
) as current_inventory
from product p;
It is not clear why your result set has only two products. This version would include all products. With an index on invoice_detail(prod_code, quantity_sold), it should be the fastest method to do what you want.
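A sketch of the index mentioned above, assuming the table and column names used in the correlated subquery (`invoice_detail`, `prod_code`, `quantity_sold`). The index name is hypothetical, and whether the optimizer actually uses it depends on the database and the data:

```sql
-- Sketch only: a covering index so the per-product SUM can be answered from the index alone.
CREATE INDEX ix_invoice_detail_prod_code_quantity_sold
    ON invoice_detail (prod_code, quantity_sold);
```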
SELECT
product.prod_code,
product.qoh,
COALESCE( qtySold.total_sold, 0 ) AS qty_sold,
( product.qoh - COALESCE( qtySold.total_sold, 0 ) ) AS current_inventory
FROM
product
LEFT OUTER JOIN
(
SELECT
prod_code,
SUM( quantity_sold ) AS total_sold
FROM
invoiced_detail
GROUP BY
prod_code
) AS qtySold ON
product.prod_code = qtySold.prod_code
ORDER BY
product.prod_code
A left join may be used if you summarized the total quantity sold first and used this as a subquery to join on. See example with your sample data below:
Example
Schema (SQLite v3.30)
CREATE TABLE products (
`prod_code` VARCHAR(8),
`qoh` INTEGER
);
INSERT INTO products
(`prod_code`, `qoh`)
VALUES
('11QER/31', '8'),
('13-Q2/P2', '32'),
('14-Q1/L3', '18'),
('1546-QQ2', '15'),
('1558-QW1', '23'),
('2232/QTY', '8'),
('2232/QWE', '6'),
('2238/QPD', '12'),
('23109-HB', '23'),
('23114-AA', '8'),
('54778-2T', '43'),
('89-WRE-Q', '11'),
('PVC23DRT', '188'),
('SM-18277', '172'),
('SW-23116', '237'),
('WR3/TT3', '18');
CREATE TABLE invoice_details (
`prod_code` VARCHAR(8),
`quantity-sold` INTEGER
);
INSERT INTO invoice_details
(`prod_code`, `quantity-sold`)
VALUES
('13-Q2/P2', '1'),
('23109-HB', '1'),
('54778-2T', '2'),
('2238/QPD', '1'),
('1546-QQ2', '1'),
('13-Q2/P2', '5'),
('54778-2T', '3'),
('23109-HB', '2'),
('PVC23DRT', '12'),
('SM-18277', '3'),
('2232/QTY', '1'),
('23109-HB', '1'),
('89-WRE-Q', '1'),
('13-Q2/P2', '2'),
('54778-2T', '1'),
('PVC23DRT', '5'),
('WR3/TT3', '3'),
('23109-HB', '1');
Query #1
SELECT
p.prod_code,
p.qoh - IFNULL( t1.total_quantity_sold,0) as current_inventory
FROM
products p
LEFT JOIN
(SELECT
id.`prod_code`,
sum(id.`quantity-sold`) as total_quantity_sold
FROM
invoice_details id
GROUP BY
id.prod_code
) t1 ON p.prod_code = t1.prod_code;
prod_code current_inventory
11QER/31 8
13-Q2/P2 24
14-Q1/L3 18
1546-QQ2 14
1558-QW1 23
2232/QTY 7
2232/QWE 6
2238/QPD 11
23109-HB 18
23114-AA 8
54778-2T 37
89-WRE-Q 10
PVC23DRT 171
SM-18277 169
SW-23116 237
WR3/TT3 15

How to pick the latest row

I have two tables, namely Price List (Table A) and Order Record (Table B):-
Table A
SKU Offer Date Amt
AAA 20120115 22
AAA 20120223 24
AAA 20120331 25
AAA 20120520 28
Table B
Customer SKU Order Date
A001 AAA 20120201
B001 AAA 20120410
C001 AAA 20120531
I have to retrieve the latest pricing for each customer. The expected output should be like this:-
Customer SKU Order Date Amt
A001 AAA 20120201 28
B001 AAA 20120410 28
C001 AAA 20120531 28
Thanks.
Here is a T-SQL version. I'm not sure which database you are running; add that as a tag on your questions for better answers. I wrote this before the OP was edited, so double-check the column names.
EDITED per x-zeros' comment
SELECT B.CUSTOMER,S.SKU,B.ORDERDATE,S.Amt
FROM TABLE_B B
INNER JOIN
( SELECT C.SKU,C.OFFERDATE,C.Amt,
ROW_NUMBER() OVER (PARTITION BY C.SKU ORDER BY C.OFFERDATE DESC) X
FROM TABLE_A C
)S ON S.X = 1 AND B.SKU = S.SKU
ORDER BY B.CUSTOMER
CREATE TABLE TABLE_A
(SKU varchar(8), OfferDate Date, Amt int)
INSERT INTO TABLE_A
VALUES('AAA', '2012-01-15', 22),
('AAA' ,'2012-02-23', 24),
('AAA' ,'2012-03-31', 25),
('AAA' ,'2012-05-20', 28),
('BBB','2011-01-15 00:00:00.000', 33),
('BBB','2011-02-23 00:00:00.000', 35),
('BBB','2011-03-31 00:00:00.000', 36),
('BBB','2011-05-20 00:00:00.000', 39),
('CCC', '2012-01-15', 43),
('CCC' ,'2012-02-23', 45),
('CCC' ,'2012-03-31', 47),
('CCC' ,'2012-04-18', 44)
CREATE TABLE TABLE_B
(CUSTOMER varchar(8),SKU varchar(8), OrderDate Date)
INSERT INTO TABLE_B
VALUES('A001','AAA','2012-02-01'),
('B001','AAA','2012-04-10'),
('C001','AAA','2012-05-31'),
('A001','BBB','2011-02-01'),
('B001','BBB','2011-04-10'),
('C001','BBB','2011-05-31'),
('B001','CCC','2011-04-10'),
('C001','CCC','2011-05-31')
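As a comparison, the same "latest price per SKU" lookup can be sketched with CROSS APPLY and TOP (1) instead of ROW_NUMBER. This assumes SQL Server and the TABLE_A / TABLE_B definitions above; the alias `P` is introduced here for illustration.

```sql
-- Sketch only: for each order, fetch the most recent offer for the same SKU.
SELECT B.CUSTOMER, B.SKU, B.OrderDate, P.Amt
FROM TABLE_B AS B
CROSS APPLY (
    SELECT TOP (1) A.Amt
    FROM TABLE_A AS A
    WHERE A.SKU = B.SKU
    ORDER BY A.OfferDate DESC   -- latest offer first
) AS P
ORDER BY B.CUSTOMER;
```

CROSS APPLY drops orders whose SKU has no price row at all; OUTER APPLY would keep them with a NULL Amt.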