For my SQL Server 2016 project I have an Orders table that looks like the one below, and I want to write a SQL query that shows the oldest order for each customer/product combination. There are thousands of orders in the table today and I expect it to grow, so I want the query to perform well.
The goal is for the output to look like this:
OrderID  CustomerID  ProductID  OrderDt   OrderAmt
--------------------------------------------------
123      1           1          1/1/2021  $50
456      1           2          1/2/2021  $20
345      2           1          1/1/2021  $30
The data in the Orders table today look like this:
OrderID  CustomerID  ProductID  OrderDt   OrderAmt
--------------------------------------------------
123      1           1          1/1/2021  $50
758      1           1          1/2/2021  $80
563      1           2          1/3/2021  74
684      1           2          1/4/2021  23
456      1           2          1/2/2021  $20
345      2           1          1/1/2021  $30
The canonical method is to use row_number():
select t.*
from (select t.*,
             row_number() over (partition by customerid, productid order by orderdt, orderid) as seqnum
      from t
     ) t
where seqnum = 1;
With an index on (customerid, productid, orderdt), a correlated subquery might be a smidgen faster:
select t.*
from t
where t.orderdt = (select min(t2.orderdt)
                   from t t2
                   where t2.productid = t.productid and t2.customerid = t.customerid
                  );
Or a slightly less performant method without subqueries:
select top (1) with ties t.*
from t
order by row_number() over (partition by productid, customerid order by orderdt);
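The first two queries above can be checked against the sample rows from the question. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made for portability; window functions need SQLite 3.25 or newer), it reads the dates as month/day/year and stores them in ISO form, and the index name ix_orders is made up.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption, not the real target).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        ProductID  INTEGER,
        OrderDt    TEXT,     -- ISO dates, so text order matches date order
        OrderAmt   INTEGER
    )""")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?, ?)",
    [
        (123, 1, 1, "2021-01-01", 50),
        (758, 1, 1, "2021-01-02", 80),
        (563, 1, 2, "2021-01-03", 74),
        (684, 1, 2, "2021-01-04", 23),
        (456, 1, 2, "2021-01-02", 20),
        (345, 2, 1, "2021-01-01", 30),
    ],
)
# The index suggested for the correlated-subquery variant (name is hypothetical).
conn.execute("CREATE INDEX ix_orders ON Orders (CustomerID, ProductID, OrderDt)")

ROW_NUMBER_SQL = """
    SELECT OrderID, CustomerID, ProductID, OrderDt, OrderAmt
    FROM (SELECT o.*,
                 ROW_NUMBER() OVER (PARTITION BY CustomerID, ProductID
                                    ORDER BY OrderDt, OrderID) AS seqnum
          FROM Orders o) AS ranked
    WHERE seqnum = 1
    ORDER BY CustomerID, ProductID"""

CORRELATED_SQL = """
    SELECT o.OrderID, o.CustomerID, o.ProductID, o.OrderDt, o.OrderAmt
    FROM Orders o
    WHERE o.OrderDt = (SELECT MIN(o2.OrderDt)
                       FROM Orders o2
                       WHERE o2.CustomerID = o.CustomerID
                         AND o2.ProductID = o.ProductID)
    ORDER BY o.CustomerID, o.ProductID"""

oldest_by_row_number = conn.execute(ROW_NUMBER_SQL).fetchall()
oldest_by_subquery = conn.execute(CORRELATED_SQL).fetchall()
print(oldest_by_row_number)
```

On this sample both queries return orders 123, 456 and 345. They can diverge when two orders for the same customer and product share the earliest date: row_number() keeps exactly one of them, while the correlated subquery keeps all of them.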
Related
I have two tables, tblOrder and tblOrderDetails. I want to get the order number, the total price per order (Quantity*UnitCost), and the OrderDate, as shown below.
Order No  Total  OrderDate
---------------------------
ORD 1     3000   01/01/2021
ORD 2     2750   01/03/2021
What I've tried gives me an error saying that Quantity is not part of an aggregate function:
SELECT tblOrder.OrderNo, tblOrderDetails.UnitCost*tblOrderDetails.Quantity AS Total, OrderDate
FROM tblOrderDetails INNER JOIN tblOrder ON tblOrderDetails.OrderId = tblOrder .OrderId
GROUP BY tblOrder.OrderNo;
Table structures and data
Table tblOrder:
OrderId  OrderNo  OrderDate
----------------------------
1        ORD 1    01/01/2021
2        ORD 2    01/03/2021
Table tblOrderDetails:
OrderDetailId  Quantity  UnitCost  OrderId
------------------------------------------
1              100       30        1
2              50        40        2
2              10        15        2
2              20        30        2
select o.OrderNo
      ,od.total
      ,o.OrderDate
from
(
    select OrderId
          ,sum(Quantity*UnitCost) as total
    from tblOrderDetails
    group by OrderId
) od join tblOrder o on o.OrderId = od.OrderId
OrderNo  total  OrderDate
--------------------------
ORD 1    3000   2021-01-01
ORD 2    2750   2021-01-03
Your requirements are not 100% clear, but maybe you can just do this, without any subquery:
SELECT tblOrder.OrderNo,
SUM(tblOrderDetails.UnitCost*tblOrderDetails.Quantity) AS Total,
OrderDate
FROM tblOrderDetails
INNER JOIN tblOrder ON tblOrderDetails.OrderId = tblOrder.OrderId
GROUP BY tblOrder.OrderNo,OrderDate;
To see the difference to Danny's answer - which might also be fine - have a look here: db<>fiddle
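Both answers can be run side by side on the sample data from the question. This is a rough sketch, assuming Python's sqlite3 module as a stand-in for SQL Server and ISO-formatted dates; the aliases o, d and od are only for brevity.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblOrder (OrderId INTEGER, OrderNo TEXT, OrderDate TEXT);
    CREATE TABLE tblOrderDetails (OrderDetailId INTEGER, Quantity INTEGER,
                                  UnitCost INTEGER, OrderId INTEGER);
    INSERT INTO tblOrder VALUES (1, 'ORD 1', '2021-01-01'), (2, 'ORD 2', '2021-01-03');
    INSERT INTO tblOrderDetails VALUES
        (1, 100, 30, 1), (2, 50, 40, 2), (2, 10, 15, 2), (2, 20, 30, 2);
""")

# Variant 1: aggregate the detail rows first, then join to the order header.
AGGREGATE_THEN_JOIN = """
    SELECT o.OrderNo, od.total, o.OrderDate
    FROM (SELECT OrderId, SUM(Quantity * UnitCost) AS total
          FROM tblOrderDetails
          GROUP BY OrderId) AS od
    JOIN tblOrder o ON o.OrderId = od.OrderId
    ORDER BY o.OrderNo"""

# Variant 2: join first, then group by every non-aggregated output column.
JOIN_THEN_GROUP = """
    SELECT o.OrderNo, SUM(d.UnitCost * d.Quantity) AS Total, o.OrderDate
    FROM tblOrderDetails d
    INNER JOIN tblOrder o ON d.OrderId = o.OrderId
    GROUP BY o.OrderNo, o.OrderDate
    ORDER BY o.OrderNo"""

totals_subquery = conn.execute(AGGREGATE_THEN_JOIN).fetchall()
totals_grouped = conn.execute(JOIN_THEN_GROUP).fetchall()
print(totals_subquery)
```

The original attempt failed because UnitCost*Quantity and OrderDate were selected without being aggregated or listed in GROUP BY; both variants here address that, in different ways.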
I want to write a query that finds the clients who purchased two specific product categories and, at the same time, returns each client's first transaction date and the first item they purchased. Because I used GROUP BY, I can only get the customer id, not the first item purchased. Any thoughts on how to solve this?
I have a transaction table (t), a customer table (c), and a product table (p). I am on SQL Server 2008.
Update
SELECT t.customer_id
,t.product_category
,MIN(t.transaction_date) AS FIRST_TRANSACTION_DATE
,SUM(t.quantity) AS TOTAL_QTY
,SUM(t.sales) AS TOTAL_SALES
FROM transaction t
WHERE t.product_category IN ('VEGETABLES', 'FRUITS')
AND t.transaction_date BETWEEN '2020/01/01' AND '2022/09/30'
GROUP BY t.customer_id
HAVING COUNT(DISTINCT t.product_category) = 2
Customer_id  transaction_date  product_category  quantity  sales
-----------------------------------------------------------------
1            2022-05-30        VEGETABLES        1         100
1            2022-08-30        VEGETABLES        1         100
2            2022-07-30        VEGETABLES        1         100
2            2022-07-30        FRUITS            1         50
2            2022-07-30        VEGETABLES        2         200
3            2022-07-30        VEGETABLES        3         300
3            2022-08-01        FRUITS            1         50
3            2022-08-05        FRUITS            1         50
4            2022-08-07        FRUITS            1         50
4            2022-09-05        FRUITS            2         100
In the above, what I want to show after executing the SQL query is
Customer_id  FIRST_TRANSACTION_DATE  first_product_category  TOTAL_QUANTITY  TOTAL_SALES
-----------------------------------------------------------------------------------------
2            2022-07-30              VEGETABLES, FRUITS      4               350
3            2022-07-30              VEGETABLES              5               400
Customer_id 1 and 4 will not be shown, as each purchased only vegetables or only fruits, not both.
Check this now. By the way, the logic for product_category still needs to be worked out.
select CustomerId, transaction_date, product_category, quantity, sales
from (
    select CustomerId, transaction_date, product_category,
           sum(quantity) over(partition by CustomerId) as quantity,
           sum(sales) over(partition by CustomerId) as sales,
           row_number() over(partition by CustomerId order by transaction_date ASC) rn
    from (
        select CustomerId, transaction_date, product_category, quantity, sales
        from tablee t
        where (product_category = 'FRUITS' and
               EXISTS (select CustomerId
                       from tablee tt
                       where product_category = 'VEGETABLES'
                         and t.CustomerId = tt.CustomerId)) OR
              (product_category = 'VEGETABLES' and
               EXISTS (select CustomerId
                       from tablee tt
                       where product_category = 'FRUITS'
                         and t.CustomerId = tt.CustomerId))) x) over_all
where rn = 1;
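The query above can be tried against the sample rows from the question. The sketch below assumes Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer) and flattens the innermost derived table into one level, which should not change the result. It leaves product_category out of the final projection on purpose: customer 2 has three rows on the same first date, so which category row_number() lands on is not deterministic.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tablee (CustomerId INTEGER, transaction_date TEXT,
                                     product_category TEXT, quantity INTEGER,
                                     sales INTEGER)""")
conn.executemany(
    "INSERT INTO tablee VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2022-05-30", "VEGETABLES", 1, 100),
        (1, "2022-08-30", "VEGETABLES", 1, 100),
        (2, "2022-07-30", "VEGETABLES", 1, 100),
        (2, "2022-07-30", "FRUITS", 1, 50),
        (2, "2022-07-30", "VEGETABLES", 2, 200),
        (3, "2022-07-30", "VEGETABLES", 3, 300),
        (3, "2022-08-01", "FRUITS", 1, 50),
        (3, "2022-08-05", "FRUITS", 1, 50),
        (4, "2022-08-07", "FRUITS", 1, 50),
        (4, "2022-09-05", "FRUITS", 2, 100),
    ],
)

# Keep only customers who bought both categories, total their rows with
# window sums, then keep the earliest row per customer.
BOTH_CATEGORIES_SQL = """
    SELECT CustomerId, transaction_date, quantity, sales
    FROM (
        SELECT CustomerId, transaction_date, product_category,
               SUM(quantity) OVER (PARTITION BY CustomerId) AS quantity,
               SUM(sales)    OVER (PARTITION BY CustomerId) AS sales,
               ROW_NUMBER()  OVER (PARTITION BY CustomerId
                                   ORDER BY transaction_date ASC) AS rn
        FROM tablee t
        WHERE (product_category = 'FRUITS'
               AND EXISTS (SELECT 1 FROM tablee tt
                           WHERE tt.product_category = 'VEGETABLES'
                             AND tt.CustomerId = t.CustomerId))
           OR (product_category = 'VEGETABLES'
               AND EXISTS (SELECT 1 FROM tablee tt
                           WHERE tt.product_category = 'FRUITS'
                             AND tt.CustomerId = t.CustomerId))
    ) AS over_all
    WHERE rn = 1
    ORDER BY CustomerId"""

first_purchases = conn.execute(BOTH_CATEGORIES_SQL).fetchall()
print(first_purchases)
```

On the sample rows this returns customers 2 and 3 with the first date and totals the question asks for. It does not apply the date range from the original query, and it does not build the combined "VEGETABLES, FRUITS" label.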
I have a control table in which prices are tracked by item number and date.
id ItemNo Price Date
---------------------------
1 a001 100 1/1/2003
2 a001 105 1/2/2003
3 a001 110 1/3/2003
4 b100 50 1/1/2003
5 b100 55 1/2/2003
6 b100 60 1/3/2003
7 c501 35 1/1/2003
8 c501 38 1/2/2003
9 c501 42 1/3/2003
10 a001 95 1/1/2004
This is the query I am running.
SELECT pr.*
FROM prices pr
INNER JOIN
(
SELECT ItemNo, max(date) max_date
FROM prices
GROUP BY ItemNo
) p ON pr.ItemNo = p.ItemNo AND
pr.date = p.max_date
order by ItemNo ASC
I am getting below values
id ItemNo Price Date
------------------------------
10 a001 95 2004-01-01
6 b100 60 2003-01-03
9 c501 42 2003-01-03
My question: is my query right or wrong, even though I am getting the desired result?
Your query does what you want, and is a valid approach to solve your problem.
An alternative option would be to use a correlated subquery for filtering:
select p.*
from prices p
where p.date = (select max(p1.date) from prices p1 where p1.itemno = p.itemno)
The upside of this query is that it can take advantage of an index on (itemno, date).
You can also use window functions:
select *
from (
select p.*, rank() over(partition by itemno order by date desc) rn
from prices p
) p
where rn = 1
I would recommend benchmarking the three options against your real data to assess which one performs better.
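Before benchmarking, it is worth confirming that the three options agree on the sample data. This is a minimal sketch, assuming Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer) and ISO-formatted dates; the index name ix_prices is made up.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER, ItemNo TEXT, Price INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        (1, "a001", 100, "2003-01-01"), (2, "a001", 105, "2003-01-02"),
        (3, "a001", 110, "2003-01-03"), (4, "b100", 50, "2003-01-01"),
        (5, "b100", 55, "2003-01-02"), (6, "b100", 60, "2003-01-03"),
        (7, "c501", 35, "2003-01-01"), (8, "c501", 38, "2003-01-02"),
        (9, "c501", 42, "2003-01-03"), (10, "a001", 95, "2004-01-01"),
    ],
)
# Index mentioned in the answer (name is hypothetical).
conn.execute("CREATE INDEX ix_prices ON prices (ItemNo, date)")

QUERIES = {
    "join_to_max": """
        SELECT pr.id, pr.ItemNo, pr.Price, pr.date
        FROM prices pr
        INNER JOIN (SELECT ItemNo, MAX(date) AS max_date
                    FROM prices GROUP BY ItemNo) p
                ON pr.ItemNo = p.ItemNo AND pr.date = p.max_date
        ORDER BY pr.ItemNo""",
    "correlated_subquery": """
        SELECT p.id, p.ItemNo, p.Price, p.date
        FROM prices p
        WHERE p.date = (SELECT MAX(p1.date) FROM prices p1
                        WHERE p1.ItemNo = p.ItemNo)
        ORDER BY p.ItemNo""",
    "rank_window": """
        SELECT id, ItemNo, Price, date
        FROM (SELECT p.*, RANK() OVER (PARTITION BY ItemNo
                                       ORDER BY date DESC) AS rn
              FROM prices p) ranked
        WHERE rn = 1
        ORDER BY ItemNo""",
}

results = {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
for name, rows in results.items():
    print(name, rows)
```

All three return ids 10, 6 and 9 here. Because rank() is used rather than row_number(), the window variant also keeps every row when an item has several prices on its latest date, which matches the behavior of the other two.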
I am using SQL Server 2008 and I was wondering how to remove duplicate customers either from the table or exclude it in my query. An Account_ID can only have 1 product associated with it. And the account with the most recent purchase date is what should be showing. An example is below:
Account_ID, Account_Purchase, Purchase_Date
1 Product 1 1/1/2016
2 Product 1 1/2/2016
3 Product 2 1/5/2016
1 Product 3 3/12/2016
4 Product 3 1/5/2016
Ideally I would only see:
Account_ID, Account_Purchase, Purchase_Date
2 Product 1 1/2/2016
3 Product 2 1/5/2016
1 Product 3 3/12/2016
4 Product 3 1/5/2016
This should not show up because it is not the most recent purchase from account 1
Account_ID, Account_Purchase, Purchase_Date
1 Product 1 1/1/2016
Thank you all for help, folks!
Simply acquire the latest purchase_date using max and group by account_id. Then use inner join to get the other details from the acquired details.
SELECT TABLE_NAME.* FROM TABLE_NAME
INNER JOIN(
SELECT Account_ID, MAX(Purchase_Date) AS Purchase_Date
FROM TABLE_NAME
GROUP BY Account_ID
) LatestPurchases
ON TABLE_NAME.Account_ID = LatestPurchases.Account_ID
AND TABLE_NAME.Purchase_Date = LatestPurchases.Purchase_Date
Try the query below, replacing TABLENAME with your table name:
WITH CTE
AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Purchase_Date DESC) AS RN
FROM TABLENAME
)
SELECT
*
FROM CTE
WHERE RN = 1
Here is another query
SELECT
t.Account_id,
t.Account_Purchase,
t.Purchase_Date
FROM
tablename t
WHERE
t.Purchase_Date = (SELECT MAX(Purchase_date) FROM Tablename WHERE Account_ID = t.Account_ID)
ORDER BY
t.Purchase_Date DESC
There is a table tbl_products that contains data as shown below:
Id Name
----------
1 P1
2 P2
3 P3
4 P4
5 P5
6 P6
And another table tbl_inputs that contains data as shown below:
Id Product_Id Price Register_Date
----------------------------------------
1 1 10 2010-01-01
2 1 20 2010-10-11
3 1 30 2011-01-01
4 2 100 2010-01-01
5 2 200 2009-01-01
6 3 500 2011-01-01
7 3 270 2010-10-15
8 4 80 2010-01-01
9 4 50 2010-02-02
10 4 92 2011-01-01
I want to select all products(id, name, price, register_date) with maximum date in each group.
For Example:
Id Name Price Register_Date
----------------------------------------
3 P1 30 2011-01-01
4 P2 100 2010-01-01
6 P3 500 2011-01-01
10 P4 92 2011-01-01
select
id
,name
,code
,price
from tbl_products tp
cross apply (
select top 1 price
from tbl_inputs ti
where ti.product_id = tp.id
order by register_date desc
) tii
Although it is not the optimal way, you can also do it like this:
;with gb as (
select
product_id
,max(register_date) As max_register_date
from tbl_inputs
group by product_id
)
select
ti.id
,ti.product_id
,ti.price
,ti.register_date
from tbl_inputs ti
join gb
on ti.product_id=gb.product_id
and ti.register_date = gb.max_register_date
But as I said earlier .. this is not the way to go in this case.
;with cte as
(
select t1.id, t1.name, t1.code, t2.price, t2.register_date,
row_number() over (partition by product_id order by register_date desc) rn
from tbl_products t1
join tbl_inputs t2
on t1.id = t2.product_id
)
select id, name, code, price, register_date
from cte
where rn = 1
Something like this..
select id, product_id, price, max(register_date)
from tbl_inputs
group by id, product_id, price
You can use the MAX function and the GROUP BY clause. If you only need results from the table tbl_inputs, you don't even need a join:
select product_id, max(register_date), price
from tbl_inputs
group by product_id, price
If you need fields from tbl_products, you have to use a join:
select p.name, p.code, i.id, i.price, max(i.register_date)
from tbl_products p join tbl_inputs i on p.id=i.product_id
group by p.name, p.code, i.id, i.price
Try this:
SELECT T1.id, T1.product_id, T1.price, T1.register_date
FROM tbl_inputs T1 INNER JOIN
(
SELECT product_id, MAX(register_date) As Max_register_date
FROM tbl_inputs
GROUP BY product_id
) T2 ON(T1.product_id= T2.product_id AND T1.register_date= T2.Max_register_date)
This is, of course, assuming your dates are unique. If they are not, you need to add the DISTINCT keyword to the outer SELECT statement.
edit
Sorry, I didn't explain it very well. Your dates can be duplicated; that is not a problem as long as they are unique per product id. If you can have duplicated dates per product id, then you will get more than one row per product from the select statement I suggested, and you will have to find a way to reduce it to one row per product.
For example:
Suppose you have records like these (the last date for a product appears more than once in your table, with different prices):
id | product_Id | price | register_date
--------------------------------------------
1 | 1 | 10.00 | 01/01/2000
2 | 1 | 20.00 | 01/01/2000
The query will then return both of these records.
However, if the register_date is unique per product id, then you will get only one result for each product id.
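The two-row example above can be reproduced directly. The sketch below assumes Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer) and ISO-formatted dates. The tie-breaker in the second query, highest id wins, is purely an assumption for illustration; any rule that orders the tied rows would do.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tbl_inputs (id INTEGER, product_id INTEGER,
                                         price REAL, register_date TEXT)""")
conn.executemany(
    "INSERT INTO tbl_inputs VALUES (?, ?, ?, ?)",
    [(1, 1, 10.0, "2000-01-01"), (2, 1, 20.0, "2000-01-01")],
)

# Join to the per-product maximum date: both tied rows survive,
# and DISTINCT does not help because the prices differ.
JOIN_SQL = """
    SELECT DISTINCT T1.id, T1.product_id, T1.price, T1.register_date
    FROM tbl_inputs T1
    INNER JOIN (SELECT product_id, MAX(register_date) AS Max_register_date
                FROM tbl_inputs
                GROUP BY product_id) T2
            ON T1.product_id = T2.product_id
           AND T1.register_date = T2.Max_register_date
    ORDER BY T1.id"""

# row_number() with an explicit tie-breaker (id DESC, an assumed rule)
# always yields exactly one row per product.
ROW_NUMBER_SQL = """
    SELECT id, product_id, price, register_date
    FROM (SELECT i.*,
                 ROW_NUMBER() OVER (PARTITION BY product_id
                                    ORDER BY register_date DESC, id DESC) AS rn
          FROM tbl_inputs i) ranked
    WHERE rn = 1"""

tied_rows = conn.execute(JOIN_SQL).fetchall()
one_per_product = conn.execute(ROW_NUMBER_SQL).fetchall()
print(len(tied_rows), one_per_product)
```

The join returns two rows for product 1, while the row_number() variant returns only the row with id 2.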