Get NULL value when using an aggregate function - sql

Here are the tables:
https://dbfiddle.uk/markdown?rdbms=sqlserver_2019&fiddle=effc94afe681b2dfdb3e2c02c2b005ea
I want to find the average TotalAmount of the last 3 orders (the 3 highest OrderID values) for each customer. If a customer has fewer than 3 operations, the result should be NULL.
Here is my answer (T-SQL):
SELECT s.CustomerID,avg(s.TotalAmount) as AverageofLast3_operation
FROM (SELECT OrderID, CustomerID, EventDate, TotalAmount,
ROW_NUMBER() over (partition by CustomerID ORDER BY OrderID asc) as Row_num
FROM CustomerOperation
)s
WHERE s.Row_num>3
GROUP BY CustomerID
And the result is:
CustomerID AverageofLast3_operation
1 7833
2 1966
According to the question, I should also have a row like this:
CustomerID AverageofLast3_operation
3 NULL
How can I achieve this with T-SQL?

You need conditional aggregation:
SELECT CustomerID,
AVG(CASE WHEN counter >= 3 THEN TotalAmount END) AS AverageofLast3_operation
from (
SELECT OrderID, CustomerID, EventDate, TotalAmount,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderID DESC) AS Row_num,
COUNT(*) OVER (PARTITION BY CustomerID) counter
FROM CustomerOperation
) s
WHERE Row_num <= 3
GROUP BY CustomerID;
Or:
SELECT CustomerID,
CASE WHEN COUNT(*) = 3 THEN AVG(TotalAmount) END AS AverageofLast3_operation
from (
SELECT OrderID, CustomerID, EventDate, TotalAmount,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderID DESC) AS Row_num
FROM CustomerOperation
) s
WHERE Row_num <= 3
GROUP BY CustomerID;

You can use a conditional average like so:
with t as (
select CustomerID, TotalAmount,
-- number each customer's orders from newest to oldest
Row_Number() over (partition by CustomerID order by OrderID desc) as rn,
Count(*) over (partition by CustomerID) as cnt
from CustomerOperation
)
select CustomerID, Avg(case when cnt >= 3 then TotalAmount end) as Average
from t
-- filter on the row number rather than on TotalAmount > 0, so zero or negative amounts are not dropped
where rn <= 3
group by CustomerID
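To try the conditional aggregation out without a SQL Server instance, here is a minimal sketch that runs the second query from the first answer through Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (for window functions) as a stand-in for SQL Server, and the sample rows are made up for illustration, not taken from the fiddle.

```python
import sqlite3


def average_of_last_three(rows):
    """rows: (OrderID, CustomerID, TotalAmount) tuples; returns one row per customer."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE CustomerOperation "
        "(OrderID INTEGER, CustomerID INTEGER, TotalAmount REAL)"
    )
    con.executemany("INSERT INTO CustomerOperation VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT CustomerID,
               -- CASE without ELSE yields NULL when fewer than 3 rows survive the filter
               CASE WHEN COUNT(*) = 3 THEN AVG(TotalAmount) END AS AverageofLast3_operation
        FROM (
            SELECT CustomerID, TotalAmount,
                   ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderID DESC) AS Row_num
            FROM CustomerOperation
        ) s
        WHERE Row_num <= 3
        GROUP BY CustomerID
        ORDER BY CustomerID
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical data: customer 1 has four orders, customer 2 has only two.
sample = [
    (1, 1, 100.0), (2, 1, 200.0), (3, 1, 300.0), (4, 1, 400.0),
    (5, 2, 50.0), (6, 2, 70.0),
]
print(average_of_last_three(sample))  # [(1, 300.0), (2, None)]
```

The customer with only two orders still appears in the output because the filter keeps its rows; only the CASE expression turns its average into NULL (None in Python).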

Related

Query for parents where all children have a pair/duplicate

I'm looking for direction on a query that returns the TransIDs of transactions in which every child row (product, qty, price) has a pair / duplicate. Example here:
TransID Product QTY Price
1 a 2 1.0
1 a 2 1.0
1 b 3 2.5
2 a 1 1.0
2 a 1 1.0
2 b 2 2.0
2 b 2 2.0
3 a 5 2.0
3 a 4 3.0
4 a 1 2.0
4 a 1 2.0
4 b 2 2.0
4 b 2 2.0
4 c 1 1.0
In this example, only transID 2 would be returned.
So far, I'm stuck along the lines of
select transid, product, qty, price
, row_number() over (partition by transid, product, qty, price order by transID desc) rk
from x
But I think I'm on the wrong track there. Appreciate any direction.
You can do this using count() instead of row_number():
select transid
from (select x.*,
count(*) over (partition by transid, product, qty, price) as cnt
from x
) x
group by transid
having min(cnt) > 1;
However, that is sort of overkill. You could also use group by in the subquery:
select transid
from (select transid, product, qty, price, count(*) as cnt
from x
group by transid, product, qty, price
) x
group by transid
having min(cnt) > 1;
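As a quick check of the group-by version, the sketch below loads the sample rows from the question into an in-memory SQLite database (assumed here as a stand-in for SQL Server) and runs the same query. The function name is hypothetical.

```python
import sqlite3


def fully_paired_transactions(rows):
    """Return the transids where every (product, qty, price) combination occurs more than once."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE x (transid INTEGER, product TEXT, qty INTEGER, price REAL)")
    con.executemany("INSERT INTO x VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT transid
        FROM (SELECT transid, product, qty, price, COUNT(*) AS cnt
              FROM x
              GROUP BY transid, product, qty, price) g
        GROUP BY transid
        HAVING MIN(cnt) > 1   -- the least-repeated combination must still be repeated
        ORDER BY transid
        """
    ).fetchall()
    con.close()
    return [r[0] for r in result]


# The sample data from the question.
sample_rows = [
    (1, "a", 2, 1.0), (1, "a", 2, 1.0), (1, "b", 3, 2.5),
    (2, "a", 1, 1.0), (2, "a", 1, 1.0), (2, "b", 2, 2.0), (2, "b", 2, 2.0),
    (3, "a", 5, 2.0), (3, "a", 4, 3.0),
    (4, "a", 1, 2.0), (4, "a", 1, 2.0), (4, "b", 2, 2.0), (4, "b", 2, 2.0), (4, "c", 1, 1.0),
]
print(fully_paired_transactions(sample_rows))  # [2]
```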
If I understand correctly, this should get you the answer you want:
CREATE TABLE dbo.SampleData (TransID int, Product char(1), Qty int, Price decimal(2,1));
INSERT INTO dbo.SampleData (TransID,
Product,
Qty,
Price)
VALUES (1,'a',2,1.0),
(1,'a',2,1.0),
(1,'a',2,1.0),
(1,'b',3,2.5),
(2,'a',1,1.0),
(2,'a',1,1.0),
(2,'b',2,2.0),
(2,'b',2,2.0),
(3,'a',5,2.0),
(3,'a',4,3.0);
WITH Counts AS (
SELECT TransID,Product,Qty,Price,
COUNT(*) AS Dups
FROM dbo.SampleData
GROUP BY TransID, Product, Qty, Price)
SELECT TransID
FROM Counts
GROUP BY TransID
HAVING MIN(Dups) >= 2;
DROP TABLE dbo.SampleData;
Use EXCEPT: start from every TransID and remove those that have at least one row without a duplicate.
select TransID
from x
except
select TransID
from x
group by TransID, Product, QTY, Price
having count(*) = 1
Try this query:
select transid,
product,
qty,
price
from (
select transid,
product,
qty,
price,
count(*) over (partition by transid, product) cntproduct,
count(*) over (partition by transid, qty) cntqty,
count(*) over (partition by transid, price) cntprice
from my_table
) a where cntprice > 1 and cntproduct > 1 and cntqty > 1
SELECT DISTINCT TransID FROM
(
SELECT COUNT(*) AS Cnt, TransID, Product, QTY, Price
FROM x
GROUP BY TransID, Product, QTY, Price
HAVING COUNT(*) = 2
) AS Table1
WHERE TransID NOT IN
(
SELECT TransID FROM
(
SELECT COUNT(*) AS Cnt, TransID, Product, QTY, Price
FROM x
GROUP BY TransID, Product, QTY, Price
HAVING COUNT(*) = 1
) AS Table2
)
Then read the TransID. Done!

selecting specific occurrences from a group SQL

Data set looks like
id statusid statusdate
100 22 04/12/2016
100 22 04/14/2016
100 25 04/16/2016
100 25 04/17/2016
100 25 04/19/2016
100 22 04/22/2016
100 22 05/14/2016
100 27 05/19/2016
100 27 06/14/2016
100 25 06/18/2016
100 22 07/14/2016
100 22 07/18/2016
The task is to select the first time each status was logged and the number of distinct times each status was logged.
Example :
For Status 22
First time status date: 04/12/2016
Last time Status first date: 07/14/2016
Number of Unique times it went to that status: 3
This is more complicated than it looks. I assume you want the unique consecutive times for a given status.
Use a difference of row numbers approach to classify consecutive rows into groups. Then get the row numbers in those groups. Finally aggregate to get the first day in the first and last group and the number of distinct groups.
select statusid
,max(case when rn_grp_asc=1 then statusdate end) as first_time_first_status
,max(case when rn_grp_desc=1 then statusdate end) as last_time_first_status
,count(distinct grp) as unique_times_in_status
from (select t.*
,row_number() over(partition by statusid order by grp,statusdate) as rn_grp_asc
,row_number() over(partition by statusid order by grp desc,statusdate) as rn_grp_desc
from (select t.*,row_number() over(order by statusdate)
-row_number() over(partition by statusid order by statusdate) as grp
from tbl t
) t
) t
group by statusid
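The difference-of-row-numbers idea can be sketched on the sample data as follows. This is a simplified variation rather than the exact query above: it first collapses each run of consecutive rows (an "island") to its start date, then aggregates the islands per status. It assumes SQLite 3.25 or later via Python's sqlite3 as a stand-in for SQL Server, stores the dates as ISO strings so they sort correctly, and omits the id column because every sample row has id 100.

```python
import sqlite3


def status_summary(rows):
    """rows: (statusid, statusdate) tuples with ISO-formatted dates."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (statusid INTEGER, statusdate TEXT)")
    con.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT statusid,
               MIN(island_start) AS first_time_first_status,
               MAX(island_start) AS last_time_first_status,
               COUNT(*)          AS unique_times_in_status
        FROM (
            -- one row per island: a run of consecutive rows with the same status
            SELECT statusid, grp, MIN(statusdate) AS island_start
            FROM (
                SELECT statusid, statusdate,
                       ROW_NUMBER() OVER (ORDER BY statusdate)
                     - ROW_NUMBER() OVER (PARTITION BY statusid ORDER BY statusdate) AS grp
                FROM tbl
            ) numbered
            GROUP BY statusid, grp
        ) islands
        GROUP BY statusid
        ORDER BY statusid
        """
    ).fetchall()
    con.close()
    return result


# The sample data from the question, with dates rewritten as YYYY-MM-DD.
status_rows = [
    (22, "2016-04-12"), (22, "2016-04-14"), (25, "2016-04-16"), (25, "2016-04-17"),
    (25, "2016-04-19"), (22, "2016-04-22"), (22, "2016-05-14"), (27, "2016-05-19"),
    (27, "2016-06-14"), (25, "2016-06-18"), (22, "2016-07-14"), (22, "2016-07-18"),
]
for row in status_summary(status_rows):
    print(row)
# (22, '2016-04-12', '2016-07-14', 3)
# (25, '2016-04-16', '2016-06-18', 2)
# (27, '2016-05-19', '2016-05-19', 1)
```

Within one status, the difference between the two row numbers stays constant for as long as the rows are consecutive, which is why grouping by (statusid, grp) isolates each run.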
select ID,
statusid,
Status,
max(case when rn_grp_asc=1 then statusdate end) as first_time_first_status,
max(case when rn_grp_desc=1 then statusdate end) as last_time_first_status,
count(distinct grp) as unique_times_in_status
from ( select t.*,
row_number() over(partition by id,statusid order by grp,statusdate) as rn_grp_asc,
row_number() over(partition by id,statusid order by grp desc,statusdate) as rn_grp_desc
from ( Select ID,
StatusID,
Status,
StatusDate,
row_number() over(partition by id order by Systemdate)-row_number() over(partition by id,statusid order by Systemdate) as grp
from t
where ID=100
) t
) t
group by id,statusid,Status
The query above returns the correct results, but the following query does not.
select ID,
statusid,
Status,
max(case when rn_grp_asc=1 then statusdate end) as first_time_first_status,
max(case when rn_grp_desc=1 then statusdate end) as last_time_first_status,
count(distinct grp) as unique_times_in_status
from ( select t.*,
row_number() over(partition by id,statusid order by grp,statusdate) as rn_grp_asc,
row_number() over(partition by id,statusid order by grp desc,statusdate) as rn_grp_desc
from ( Select ID,
StatusID,
Status,
StatusDate,
row_number() over(partition by id order by Systemdate)-row_number() over(partition by id,statusid order by Systemdate) as grp
from t
) t
) t
where ID=100
group by id,statusid,Status

T-SQL: Select partitions which have more than 1 row

I've managed to use this query
SELECT
PartGrp,VendorPn, customer, sum(sales) as totalSales,
ROW_NUMBER() OVER (PARTITION BY partgrp, vendorpn ORDER BY SUM(sales) DESC) AS seqnum
FROM
BG_Invoice
GROUP BY
PartGrp, VendorPn, customer
ORDER BY
PartGrp, VendorPn, totalSales DESC
This gives a result set like the one below: a list of sales totals grouped by part group, product ID (VendorPn), and customer, with a sequence number partitioned by part group and product ID.
PartGrp VendorPn Customer totalSales seqnum
------------------------------------------------------------
AGS-AS 002A0002-252 10021013 19307.00 1
AGS-AS 002A0006-86 10021013 33092.00 1
AGS-AS 010-63078-8 10020987 10866.00 1
AGS-SQ B71040-39 10020997 7174.00 1
AGS-SQ B71040-39 10020998 2.00 2
AIRFRAME 0130-25 10017232 1971.00 1
AIRFRAME 0130-25 10000122 1243.00 2
AIRFRAME 0130-25 10008637 753.00 3
HARDWARE MS28775-261 10005623 214.00 1
M250 23066682 10013266 175.00 1
How can I filter the result set to return only the partitions that contain more than one row (that is, more than one seqnum)? I would like the result set to look like this
PartGrp VendorPn Customer totalSales seqnum
------------------------------------------------------------
AGS-SQ B71040-39 10020997 7174.00 1
AGS-SQ B71040-39 10020998 2.00 2
AIRFRAME 0130-25 10017232 1971.00 1
AIRFRAME 0130-25 10000122 1243.00 2
AIRFRAME 0130-25 10008637 753.00 3
Out of the first result set example, only rows with VendorPn "B71040-39" and "0130-25" had multiple customers purchase the product. All products which had only 1 customer were removed. Note that my desired result set isn't simply seqnum > 1, because I still need the first seqnum per partition.
I would add a windowed count to your query and filter on it in an outer query (a window function cannot be referenced in HAVING):
SELECT PartGrp, VendorPn, customer, totalSales, seqnum
FROM (
SELECT PartGrp,
VendorPn,
customer,
sum(sales) as totalSales,
ROW_NUMBER() OVER (PARTITION BY partgrp,vendorpn ORDER BY SUM(sales) DESC) as seqnum,
COUNT(1) OVER (PARTITION BY partgrp,vendorpn) as cnt
FROM BG_Invoice
GROUP BY PartGrp,VendorPn, customer
) t
WHERE cnt > 1
ORDER BY PartGrp,VendorPn, totalSales desc
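The windowed-count idea can be tried out with the sketch below. It assumes SQLite 3.25 or later via Python's sqlite3 as a stand-in for SQL Server, computes the per-customer totals in a CTE before applying the window functions, and uses a handful of invoice rows that are made up to resemble the question's output (the individual invoice amounts are hypothetical).

```python
import sqlite3


def partitions_with_multiple_rows(invoices):
    """invoices: (PartGrp, VendorPn, customer, sales) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE BG_Invoice (PartGrp TEXT, VendorPn TEXT, customer TEXT, sales REAL)")
    con.executemany("INSERT INTO BG_Invoice VALUES (?, ?, ?, ?)", invoices)
    result = con.execute(
        """
        WITH totals AS (
            SELECT PartGrp, VendorPn, customer, SUM(sales) AS totalSales
            FROM BG_Invoice
            GROUP BY PartGrp, VendorPn, customer
        ),
        ranked AS (
            SELECT PartGrp, VendorPn, customer, totalSales,
                   ROW_NUMBER() OVER (PARTITION BY PartGrp, VendorPn ORDER BY totalSales DESC) AS seqnum,
                   COUNT(*)     OVER (PARTITION BY PartGrp, VendorPn) AS cnt
            FROM totals
        )
        SELECT PartGrp, VendorPn, customer, totalSales, seqnum
        FROM ranked
        WHERE cnt > 1   -- keep whole partitions, including their seqnum = 1 row
        ORDER BY PartGrp, VendorPn, totalSales DESC
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical invoice rows: only AGS-SQ / B71040-39 has more than one customer.
invoice_rows = [
    ("AGS-AS", "002A0002-252", "10021013", 19307.0),
    ("AGS-SQ", "B71040-39", "10020997", 7000.0),
    ("AGS-SQ", "B71040-39", "10020997", 174.0),
    ("AGS-SQ", "B71040-39", "10020998", 2.0),
    ("HARDWARE", "MS28775-261", "10005623", 214.0),
]
for row in partitions_with_multiple_rows(invoice_rows):
    print(row)
# ('AGS-SQ', 'B71040-39', '10020997', 7174.0, 1)
# ('AGS-SQ', 'B71040-39', '10020998', 2.0, 2)
```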
You can try something like:
SELECT PartGrp,VendorPn, customer, sum(sales) as totalSales,
ROW_NUMBER() OVER (PARTITION BY partgrp,vendorpn ORDER BY SUM(sales) DESC) as seqnum
FROM BG_Invoice
GROUP BY PartGrp,VendorPn, customer
HAVING seqnum <> '1'
ORDER BY PartGrp,VendorPn, totalSales desc
WITH CTE AS (
SELECT
PartGrp,VendorPn, customer, sum(sales) as totalSales,
ROW_NUMBER() OVER (PARTITION BY partgrp, vendorpn ORDER BY SUM(sales) DESC) AS seqnum
FROM
BG_Invoice
GROUP BY
PartGrp, VendorPn, customer)
SELECT DISTINCT
a.*
FROM
CTE a
JOIN
CTE b
ON a.PartGrp = b.PartGrp
AND a.VendorPn = b.VendorPn
WHERE
b.seqnum > 1
ORDER BY
a.PartGrp,
a.VendorPn,
a.totalSales DESC;

Find the Last Price that is different from the Current Price

I have sales transactions in a SQL Server table like this:
ItemNumber, TrxDate, UnitPrice
ABC, 1/1/2013, 10.00
ABC, 2/1/2013, 10.00
ABC, 3/1/2013, 13.00
ABC, 4/1/2013, 14.00
ABC, 5/1/2013, 14.00
XYZ, 1/1/2013, 18.00
XYZ, 2/1/2013, 18.00
XYZ, 3/1/2013, 20.00
XYZ, 4/1/2013, 20.00
XYZ, 5/1/2013, 20.00
I need a stored procedure to produce output that would look like this
ItemNumber, LastPrice, PriorPrice
ABC, 14.00, 13.00
XYZ, 20.00, 18.00
Assuming SQL Server 2005+:
;WITH CTE AS
(
SELECT *,
RN=ROW_NUMBER() OVER(PARTITION BY ItemNumber ORDER BY TrxDate DESC)
FROM ( SELECT ItemNumber,
MAX(TrxDate) TrxDate,
UnitPrice
FROM YourTable
GROUP BY ItemNumber,
UnitPrice) A
)
SELECT ItemNumber,
MIN(CASE WHEN RN = 1 THEN UnitPrice END) LastPrice,
MIN(CASE WHEN RN = 2 THEN UnitPrice END) PriorPrice
FROM CTE
GROUP BY ItemNumber
You can do this using the lag() function first to find when the price changes:
select ItemNumber,
max(case when seqnum = 1 then UnitPrice end) as LastPrice,
max(case when seqnum = 2 then UnitPrice end) as PriorPrice
from (select t.*, row_number() over (partition by ItemNumber order by TrxDate desc) as seqnum
from (select t.*,
lag(UnitPrice) over (partition by ItemNumber order by TrxDate) as PrevPrice
from t
) t
where UnitPrice <> PrevPrice or PrevPrice is NULL
) t
group by ItemNumber;
Lag is only available starting with SQL Server 2012.
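Here is a minimal sketch of the lag() approach run against the question's sample data. It assumes SQLite 3.25 or later via Python's sqlite3 as a stand-in for SQL Server 2012+, uses a hypothetical table name SalesTrx, and stores the dates as ISO strings so they sort chronologically.

```python
import sqlite3


def last_and_prior_price(rows):
    """rows: (ItemNumber, TrxDate, UnitPrice) tuples with ISO-formatted dates."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE SalesTrx (ItemNumber TEXT, TrxDate TEXT, UnitPrice REAL)")
    con.executemany("INSERT INTO SalesTrx VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT ItemNumber,
               MAX(CASE WHEN seqnum = 1 THEN UnitPrice END) AS LastPrice,
               MAX(CASE WHEN seqnum = 2 THEN UnitPrice END) AS PriorPrice
        FROM (
            -- number the price-change rows from newest to oldest
            SELECT ItemNumber, UnitPrice,
                   ROW_NUMBER() OVER (PARTITION BY ItemNumber ORDER BY TrxDate DESC) AS seqnum
            FROM (
                SELECT ItemNumber, TrxDate, UnitPrice,
                       LAG(UnitPrice) OVER (PARTITION BY ItemNumber ORDER BY TrxDate) AS PrevPrice
                FROM SalesTrx
            ) with_prev
            -- keep only rows where the price differs from the previous row (or there is none)
            WHERE UnitPrice <> PrevPrice OR PrevPrice IS NULL
        ) changes
        GROUP BY ItemNumber
        ORDER BY ItemNumber
        """
    ).fetchall()
    con.close()
    return result


# The sample data from the question, with dates rewritten as YYYY-MM-DD.
price_rows = [
    ("ABC", "2013-01-01", 10.0), ("ABC", "2013-02-01", 10.0), ("ABC", "2013-03-01", 13.0),
    ("ABC", "2013-04-01", 14.0), ("ABC", "2013-05-01", 14.0),
    ("XYZ", "2013-01-01", 18.0), ("XYZ", "2013-02-01", 18.0), ("XYZ", "2013-03-01", 20.0),
    ("XYZ", "2013-04-01", 20.0), ("XYZ", "2013-05-01", 20.0),
]
print(last_and_prior_price(price_rows))  # [('ABC', 14.0, 13.0), ('XYZ', 20.0, 18.0)]
```

An item whose price never changed would keep only its first row, so its PriorPrice comes back as NULL.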
If you don't have lag(), you can do the same thing with a correlated subquery:
select ItemNumber,
max(case when seqnum = 1 then UnitPrice end) as LastPrice,
max(case when seqnum = 2 then UnitPrice end) as PriorPrice
from (select t.*, row_number() over (partition by ItemNumber order by TrxDate desc) as seqnum
from (select t.*,
(select top 1 t2.UnitPrice
from t t2
where t.ItemNumber = t2.ItemNumber and
t.TrxDate > t2.TrxDate
order by t2.TrxDate desc
) as PrevPrice
from t
) t
where UnitPrice <> PrevPrice or PrevPrice is NULL
) t
group by ItemNumber;

Is there a way to do something like SQL NOT top statement?

I'm trying to make a SQL statement that gives me the top X records and then sums all the others. The first part is easy...
select top 3 Department, Sum(sales) as TotalSales
from Sales
group by Department
What would be nice is if I union a second query something like...
select NOT top 3 "Others" as Department, Sum(sales) as TotalSales
from Sales
group by Department
... for a result set that looks like,
Department TotalSales
----------- -----------
Mens Clothes 120.00
Jewelry 113.00
Shoes 98.00
Others 312.00
Is there a way to do an equivalent to a NOT operator on a TOP? (I know I can probably make a temp table of the top X and work with that, but I'd prefer a solution that was just a single sql statement.)
WITH q AS
(
SELECT ROW_NUMBER() OVER (ORDER BY SUM(sales) DESC) rn,
CASE
WHEN ROW_NUMBER() OVER (ORDER BY SUM(sales) DESC) <= 3 THEN
department
ELSE
'Others'
END AS dept,
SUM(sales) AS sales
FROM sales
GROUP BY
department
)
SELECT dept, SUM(sales)
FROM q
GROUP BY
dept
ORDER BY
MAX(rn)
WITH cte
As (SELECT Department,
Sum(sales) as TotalSales
from Sales
group by Department),
cte2
AS (SELECT *,
CASE
WHEN ROW_NUMBER() OVER (ORDER BY TotalSales DESC) <= 3 THEN
ROW_NUMBER() OVER (ORDER BY TotalSales DESC)
ELSE 4
END AS Grp
FROM cte)
SELECT MAX(CASE
WHEN Grp = 4 THEN 'Others'
ELSE Department
END) AS Department,
SUM(TotalSales) AS TotalSales
FROM cte2
GROUP BY Grp
ORDER BY Grp
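The ranking approach used in the two queries above can be sketched as follows: rank the per-department totals with ROW_NUMBER(), keep the name for ranks 1 to 3, and fold everything else into 'Others'. The sketch assumes SQLite 3.25 or later via Python's sqlite3 as a stand-in for SQL Server; the individual sales rows are made up so that the totals match the result set shown in the question.

```python
import sqlite3


def top3_plus_others(rows):
    """rows: (Department, sales) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Sales (Department TEXT, sales REAL)")
    con.executemany("INSERT INTO Sales VALUES (?, ?)", rows)
    result = con.execute(
        """
        WITH totals AS (
            SELECT Department, SUM(sales) AS TotalSales
            FROM Sales
            GROUP BY Department
        ),
        ranked AS (
            SELECT Department, TotalSales,
                   ROW_NUMBER() OVER (ORDER BY TotalSales DESC) AS rn
            FROM totals
        )
        SELECT CASE WHEN rn <= 3 THEN Department ELSE 'Others' END AS Department,
               SUM(TotalSales) AS TotalSales
        FROM ranked
        GROUP BY CASE WHEN rn <= 3 THEN Department ELSE 'Others' END
        ORDER BY MIN(rn)   -- top three in rank order, 'Others' last
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical rows; the four smallest departments add up to 312.
dept_rows = [
    ("Mens Clothes", 70.0), ("Mens Clothes", 50.0), ("Jewelry", 113.0), ("Shoes", 98.0),
    ("Toys", 90.0), ("Books", 80.0), ("Garden", 75.0), ("Pets", 67.0),
]
for row in top3_plus_others(dept_rows):
    print(row)
# ('Mens Clothes', 120.0)
# ('Jewelry', 113.0)
# ('Shoes', 98.0)
# ('Others', 312.0)
```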
You can use a union to sum all other departments. A common table expression makes this a little bit more readable:
; with Top3Sales as
(
select top 3 Department
, Sum(sales) as TotalSales
from Sales
group by
Department
order by
Sum(sales) desc
)
select Department
, TotalSales
from Top3Sales
union all
select 'Other'
, SUM(Sales)
from Sales
where Department not in (select Department from Top3Sales)
Example at data.stackexchange.com.
SELECT Department, TotalSales
FROM (SELECT TOP 3 Department, SUM(Sales) AS TotalSales
FROM Sales
GROUP BY Department
ORDER BY SUM(Sales) DESC) T
UNION ALL
SELECT 'Others', SUM(s.Sales)
FROM Sales s
WHERE s.Department NOT IN
(SELECT Department
FROM (SELECT TOP 3 Department, SUM(Sales) AS TotalSales
FROM Sales
GROUP BY Department
ORDER BY SUM(Sales) DESC) D)