Getting max and latest rows in SQL - sql

I have a table containing Orders, where in the same day multiple orders can be created for a given Name. I need to return the latest Order for a given date and name, and if there are multiple orders on that day for a name, return the one with the largest order value.
Sample data:
ID | NAME | OrderDate | OrderValue
----+------+--------------+--------------
1 | A | 2019-01-15 | 100
2 | B | 2019-01-15 | 200
3 | A | 2019-01-15 | 150
4 | C | 2019-01-17 | 450
5 | D | 2019-01-18 | 300
6 | C | 2019-01-17 | 500
Result returned should be:
ID | NAME | OrderDate | OrderValue
----+------+--------------+--------------
2 | B | 2019-01-15 | 200
3 | A | 2019-01-15 | 150
5 | D | 2019-01-18 | 300
6 | C | 2019-01-17 | 500
I can do this in multiple SQL queries, but is there a simple single query to achieve the above result?

Starting with SQL Server 2005, just use ROW_NUMBER():
SELECT ID, Name, OrderDate, OrderValue
FROM (
SELECT
o.*,
ROW_NUMBER() OVER(PARTITION BY Name, OrderDate ORDER BY OrderValue DESC) rn
FROM orders o
) x WHERE rn = 1
ROW_NUMBER() assigns a sequential number to each record within groups of records having the same Name and OrderDate, sorted by OrderValue in descending order. The record with the highest order value gets row number 1.
With older versions, one way to filter the table is a correlated subquery with a NOT EXISTS condition:
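As a quick check, the query can be run against the sample data from the question. The sketch below is only an illustration, not part of the original answer: it assumes Python's built-in sqlite3 module with an SQLite build of 3.25 or later (needed for window functions) instead of SQL Server, and the trailing ORDER BY ID is added only to make the output order stable.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (ID INTEGER, Name TEXT, OrderDate TEXT, OrderValue INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "A", "2019-01-15", 100),
        (2, "B", "2019-01-15", 200),
        (3, "A", "2019-01-15", 150),
        (4, "C", "2019-01-17", 450),
        (5, "D", "2019-01-18", 300),
        (6, "C", "2019-01-17", 500),
    ],
)

# Number the rows of each (Name, OrderDate) group by descending OrderValue,
# then keep only the first row of every group.
top_orders = conn.execute(
    """
    SELECT ID, Name, OrderDate, OrderValue
    FROM (
        SELECT o.*,
               ROW_NUMBER() OVER (PARTITION BY Name, OrderDate
                                  ORDER BY OrderValue DESC) AS rn
        FROM orders o
    ) x
    WHERE rn = 1
    ORDER BY ID
    """
).fetchall()

for row in top_orders:
    print(row)
```

With the sample data this keeps IDs 2, 3, 5 and 6, which is the result asked for in the question.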
SELECT ID, Name, OrderDate, OrderValue
FROM orders o
WHERE NOT EXISTS (
SELECT 1
FROM orders o1
WHERE
o1.Name = o.Name
AND o1.OrderDate = o.OrderDate
AND o1.OrderValue > o.OrderValue
)
The NOT EXISTS condition ensures that there is no other record with a higher OrderValue for the same Name and OrderDate.
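One difference worth knowing: when two orders tie for the highest value, NOT EXISTS keeps all of them, whereas ROW_NUMBER() keeps exactly one. The sketch below illustrates this using Python's built-in sqlite3 module (an assumption made for the demo, the thread itself is about SQL Server). The row with ID 7 is hypothetical and was added only to create a tie with ID 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (ID INTEGER, Name TEXT, OrderDate TEXT, OrderValue INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "A", "2019-01-15", 100),
        (2, "B", "2019-01-15", 200),
        (3, "A", "2019-01-15", 150),
        (4, "C", "2019-01-17", 450),
        (5, "D", "2019-01-18", 300),
        (6, "C", "2019-01-17", 500),
        (7, "D", "2019-01-18", 300),  # hypothetical row: ties with ID 5
    ],
)

# Keep a row only if no other row of the same Name and OrderDate has a
# strictly higher OrderValue. Tied rows do not exclude each other.
not_exists_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT ID
        FROM orders o
        WHERE NOT EXISTS (
            SELECT 1
            FROM orders o1
            WHERE o1.Name = o.Name
              AND o1.OrderDate = o.OrderDate
              AND o1.OrderValue > o.OrderValue
        )
        ORDER BY ID
        """
    )
]
print(not_exists_ids)
```

Both ID 5 and ID 7 come back here. If exactly one row per group is required, a tie-breaker (for example the highest ID) has to be added to the condition.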

Use cross apply:
select o.id, name, orderdate, o.ordervalue
from orders o
cross apply (select top 1 id, ordervalue from orders where name=o.name and orderdate=o.OrderDate order by ordervalue desc) oo
where o.id=oo.id
order by o.id

Related

How to join tables only with the latest record in SQL SERVER [duplicate]

I want to list all customers with their latest phone number and most recent customer type.
The phone number and type of a customer change periodically, so I want only the latest record, not the old values, based on the latestUpdate column.
Customer:
+------------+--------+-------+--------+
|latestUpdate| CustID | AddID | TypeID |
+------------+--------+-------+--------+
| 2020-03-01 | 1 | 1 | 1 |
| 2020-04-07 | 2 | 2 | 2 |
| 2020-06-13 | 3 | 3 | 3 |
| 2020-03-29 | 4 | 4 | 4 |
| 2020-02-06 | 5 | 5 | 5 |
+------------+--------+-------+--------+
CustomerAddress:
+------------+--------+-----------+
|latestUpdate| AddID | Mobile |
+------------+--------+-----------+
| 2020-03-01 | 1 | 66666 |
| 2020-04-07 | 1 | 55555 |
| 2020-06-13 | 2 | 99999 |
| 2020-03-29 | 3 | 11111 |
| 2020-02-06 | 3 | 22222 |
+------------+--------+-----------+
CustomerType:
+------------+--------+-----------+
|latestUpdate| TypeId | TypeName |
+------------+--------+-----------+
| 2020-03-01 | 1 | First |
| 2020-04-07 | 1 | Second |
| 2020-06-13 | 3 | Third |
| 2020-03-29 | 4 | Fourth |
| 2020-02-06 | 5 | Fifth |
+------------+--------+-----------+
When I join the tables, I always get duplicated CustID rows rather than only the latest record.
I want to display Customer.CustID, CustomerType.TypeName and CustomerAddress.Mobile.
You need to make sub-queries for most recent customer type and latest phone number like this:
SELECT *
FROM (
SELECT latestUpdate, CustID, AddID, TypeID,
ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY latestUpdate DESC) AS RowNumber
FROM Customer
) AS c
INNER JOIN (
SELECT latestUpdate, AddID, Mobile,
ROW_NUMBER() OVER (PARTITION BY AddId ORDER BY latestUpdate DESC) AS RowNumber
FROM CustomerAddress
) AS t
ON c.AddId = t.AddId
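INNER JOIN (
SELECT latestUpdate, TypeId, TypeName,
ROW_NUMBER() OVER (PARTITION BY TypeId ORDER BY latestUpdate DESC) AS RowNumber
FROM CustomerType
) AS ct
ON ct.TypeId = c.TypeId AND ct.RowNumber = 1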
WHERE c.RowNumber = 1
AND t.RowNumber = 1
A simpler way than using row_number would be using cross apply together with top 1 in an ordered subquery:
select c.CustId, p.Mobile
from Customer c
cross apply (
select top 1 Mobile
from CustomerAddress a
where c.AddID = a.AddID -- match on the address key
order by a.latestUpdate desc -- newest row first, so top 1 is the latest
) p
You need to use some subqueries:
SELECT *
FROM Customer AS C
LEFT OUTER JOIN (SELECT *, ROW_NUMBER() OVER(PARTITION BY AddID ORDER BY latestUpdate DESC) AS N
FROM CustomerAddress) AS A
ON C.AddID = A.AddID AND A.N = 1
LEFT OUTER JOIN (SELECT *, ROW_NUMBER() OVER(PARTITION BY TypeId ORDER BY latestUpdate DESC) AS N
FROM CustomerType) AS T
ON C.TypeID = T.TypeId AND T.N = 1
If you had used a temporal table, which is an ISO SQL standard feature for keeping the history of a table's data, the latest rows would always be in the main table. Old rows stay in the history table and can be queried with a point-in-time or date-interval restriction.
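The ROW_NUMBER() subquery approach can be tried on the sample tables from the question. This is a sketch under a few assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) rather than SQL Server, it joins Customer to CustomerAddress on AddID and to CustomerType on TypeID, and it uses LEFT JOINs so that customers without an address or type row still show up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (latestUpdate TEXT, CustID INTEGER, AddID INTEGER, TypeID INTEGER);
    CREATE TABLE CustomerAddress (latestUpdate TEXT, AddID INTEGER, Mobile TEXT);
    CREATE TABLE CustomerType (latestUpdate TEXT, TypeId INTEGER, TypeName TEXT);

    INSERT INTO Customer VALUES
        ('2020-03-01', 1, 1, 1), ('2020-04-07', 2, 2, 2), ('2020-06-13', 3, 3, 3),
        ('2020-03-29', 4, 4, 4), ('2020-02-06', 5, 5, 5);
    INSERT INTO CustomerAddress VALUES
        ('2020-03-01', 1, '66666'), ('2020-04-07', 1, '55555'), ('2020-06-13', 2, '99999'),
        ('2020-03-29', 3, '11111'), ('2020-02-06', 3, '22222');
    INSERT INTO CustomerType VALUES
        ('2020-03-01', 1, 'First'), ('2020-04-07', 1, 'Second'), ('2020-06-13', 3, 'Third'),
        ('2020-03-29', 4, 'Fourth'), ('2020-02-06', 5, 'Fifth');
    """
)

# Each subquery numbers the rows of one key by descending latestUpdate;
# joining on N = 1 keeps only the newest address and the newest type.
latest = conn.execute(
    """
    SELECT C.CustID, A.Mobile, T.TypeName
    FROM Customer AS C
    LEFT JOIN (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY AddID ORDER BY latestUpdate DESC) AS N
        FROM CustomerAddress
    ) AS A ON C.AddID = A.AddID AND A.N = 1
    LEFT JOIN (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY TypeId ORDER BY latestUpdate DESC) AS N
        FROM CustomerType
    ) AS T ON C.TypeID = T.TypeId AND T.N = 1
    ORDER BY C.CustID
    """
).fetchall()

for row in latest:
    print(row)
```

Each customer appears once. Customer 1 gets the newer mobile 55555 and the newer type Second, while customers with no matching address or type row get NULL (None in Python) in that column.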
This is it:
select * from (
select a.CustID, b.Mobile, c.TypeName,
RANK() OVER (
PARTITION BY b.AddID
ORDER BY b.latestUpdate DESC
) as rank1
from
Customer a
left join
CustomerAddress b
on
a.AddID=b.AddID
left join
CustomerType c
on
a.TypeId =c.TypeId
) t where rank1=1;
You should join the tables using the "APPLY" operator.

How to use ORDER BY with 2 columns created by different aggregate functions

I'm trying to sort by both SALESPERSON_CUSTOMER_REVENUE (each salesperson's revenue per customer) and the total revenue of each salesperson in ORDER BY. Currently, I can only use SALESPERSONPERSONID and SALESPERSON_CUSTOMER_REVENUE in the ORDER BY clause.
SALES_ORDERS
-------------------------------------------------------------------------
| SALESPERSONPERSONID | CUSTOMERID | ORDERID |
-------------------------------------------------------------------------
| 3 | 10 | 324371 |
-------------------------------------------------------------------------
SALES_ORDERLINES
--------------------------------------------------------------------
| ORDERID | ORDERLINEID | QUANTITY | UNITPRICE |
--------------------------------------------------------------------
| 324371 | 10 | 32 | 100 |
--------------------------------------------------------------------
My current query
SELECT
ORD.SALESPERSONPERSONID,
ORD.CUSTOMERID,
SUM(LINE.QUANTITY * LINE.UNITPRICE) AS SALESPERSON_CUSTOMER_REVENUE
FROM SALES_ORDERS ORD
INNER JOIN SALES_ORDERLINES LINE
ON ORD.ORDERID = LINE.ORDERID
GROUP BY ORD.SALESPERSONPERSONID, ORD.CUSTOMERID
ORDER BY ORD.SALESPERSONPERSONID, SALESPERSON_CUSTOMER_REVENUE DESC
expected result
--------------------------------------------------------------------
| SALESPERSONPERSONID | CUSTOMERID | SALESPERSON_CUSTOMER_REVENUE |
--------------------------------------------------------------------
| 3 | 10 | 3200 |
--------------------------------------------------------------------
| 3 | 12 | 2200 |
--------------------------------------------------------------------
| 1 | 2 | 2000 |
--------------------------------------------------------------------
| 1 | 1 | 1200 |
--------------------------------------------------------------------
| 2 | 3 | 3000 |
TLDR:
I want to sort salespeople by their total revenue and for each salesperson, I want to sort by revenue for each customer.
Please let me know your idea. Thank you!
Well, you are getting the same results because you are performing the exact same operation in each aggregate column. I think this would be a great use case for a window function.
Since you want to aggregate over different groupings in the same query, PARTITION BY solves that:
SELECT DISTINCT
ORD.SALESPERSONPERSONID,
ORD.CUSTOMERID,
SUM(LINE.QUANTITY * LINE.UNITPRICE) OVER (PARTITION BY ORD.SALESPERSONPERSONID) AS SALESPERSON_REVENUE,
SUM(LINE.QUANTITY * LINE.UNITPRICE) OVER (PARTITION BY ORD.SALESPERSONPERSONID, ORD.CUSTOMERID) AS SALESPERSON_CUSTOMER_REVENUE
FROM
SALES_ORDERS AS ORD
INNER JOIN
SALES_ORDERLINES AS LINE ON ORD.ORDERID = LINE.ORDERID
ORDER BY
SALESPERSON_REVENUE DESC,
SALESPERSON_CUSTOMER_REVENUE DESC
You will see that column SALESPERSON_REVENUE will aggregate the operation SUM(LINE.QUANTITY * LINE.UNITPRICE) per salesperson
And column SALESPERSON_CUSTOMER_REVENUE will aggregate the operation SUM(LINE.QUANTITY * LINE.UNITPRICE) per salesperson/customer combination.
You can put window functions in the ORDER BY clause:
SELECT ORD.SALESPERSONPERSONID, ORD.CUSTOMERID,
SUM(LINE.QUANTITY * LINE.UNITPRICE) AS SALESPERSON_CUSTOMER_REVENUE
FROM SALES_ORDERS ORD JOIN
SALES_ORDERLINES LINE
ON ORD.ORDERID = LINE.ORDERID
GROUP BY ORD.SALESPERSONPERSONID, ORD.CUSTOMERID
ORDER BY SUM(SUM(LINE.QUANTITY * LINE.UNITPRICE)) OVER (PARTITION BY ORD.SALESPERSONPERSONID) DESC,
ORD.SALESPERSONPERSONID,
SALESPERSON_CUSTOMER_REVENUE DESC
Note that there are three ORDER BY keys. The middle one is important. It handles the case when two sales persons have the same total revenue and ensures that the rows for each sales person remain together.
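The three-key ordering can be checked on a small data set. The sketch below is an illustration under stated assumptions: it runs on Python's built-in sqlite3 module (SQLite 3.25 or later), only order 324371 comes from the question and every other order and order line is hypothetical, chosen to reproduce the revenues in the expected result. It also computes the per-salesperson total in a separate CTE instead of nesting SUM(SUM(...)) in ORDER BY, and sorts that total in descending order to match the expected result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SALES_ORDERS (SALESPERSONPERSONID INTEGER, CUSTOMERID INTEGER, ORDERID INTEGER);
    CREATE TABLE SALES_ORDERLINES (ORDERID INTEGER, ORDERLINEID INTEGER, QUANTITY INTEGER, UNITPRICE INTEGER);

    -- Only order 324371 is from the question; the rest are hypothetical.
    INSERT INTO SALES_ORDERS VALUES
        (3, 10, 324371), (3, 12, 324372), (1, 2, 324373), (1, 1, 324374), (2, 3, 324375);
    INSERT INTO SALES_ORDERLINES VALUES
        (324371, 10, 32, 100),
        (324372, 11, 22, 100),
        (324373, 12, 10, 100), (324373, 13, 10, 100),
        (324374, 14, 12, 100),
        (324375, 15, 30, 100);
    """
)

ranked = conn.execute(
    """
    WITH per_customer AS (
        -- Revenue per salesperson/customer pair.
        SELECT ORD.SALESPERSONPERSONID, ORD.CUSTOMERID,
               SUM(LINE.QUANTITY * LINE.UNITPRICE) AS SALESPERSON_CUSTOMER_REVENUE
        FROM SALES_ORDERS ORD
        JOIN SALES_ORDERLINES LINE ON ORD.ORDERID = LINE.ORDERID
        GROUP BY ORD.SALESPERSONPERSONID, ORD.CUSTOMERID
    ),
    with_total AS (
        -- Attach each salesperson's total revenue to every one of their rows.
        SELECT *,
               SUM(SALESPERSON_CUSTOMER_REVENUE)
                   OVER (PARTITION BY SALESPERSONPERSONID) AS SALESPERSON_REVENUE
        FROM per_customer
    )
    SELECT SALESPERSONPERSONID, CUSTOMERID, SALESPERSON_CUSTOMER_REVENUE
    FROM with_total
    ORDER BY SALESPERSON_REVENUE DESC,
             SALESPERSONPERSONID,
             SALESPERSON_CUSTOMER_REVENUE DESC
    """
).fetchall()

for row in ranked:
    print(row)
```

Salesperson 3 (total 5400) comes first, then salesperson 1 (3200), then salesperson 2 (3000), and within each salesperson the customers are listed by descending revenue.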

Greatest count for each customer in PostgreSQL

customer | category | count
------------+---------------+-------
4846 | Vegetables | 1
1687 | Fast-Food | 7
2654 | Drink | 2
2654 | Vegetables | 3
1597 | Vegetables | 1
4846 | Drink | 2
2654 | Fast-Food | 1
1597 | Drink | 6
1597 | Snack | 3
How can I select the category that has the greatest count for each customer in this table?
This is called the mode. You can use distinct on:
select distinct on (customer) t.*
from t
order by customer, count desc;
You can use window function row_number().
select
customer,
category,
count
from
(
select
*,
row_number() over (partition by customer order by count desc) as rnk
from yourTable
) val
where rnk = 1
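To see the row_number() version work on the sample rows, here is a small sketch. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) in place of PostgreSQL, the table name purchases is made up for the demo, and the count column is double-quoted only to avoid any confusion with the count() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "purchases" is a hypothetical table name; the question does not give one.
conn.execute('CREATE TABLE purchases (customer INTEGER, category TEXT, "count" INTEGER)')
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (4846, "Vegetables", 1),
        (1687, "Fast-Food", 7),
        (2654, "Drink", 2),
        (2654, "Vegetables", 3),
        (1597, "Vegetables", 1),
        (4846, "Drink", 2),
        (2654, "Fast-Food", 1),
        (1597, "Drink", 6),
        (1597, "Snack", 3),
    ],
)

# Number each customer's categories by descending count and keep the first.
top_categories = conn.execute(
    """
    SELECT customer, category, "count"
    FROM (
        SELECT *,
               row_number() OVER (PARTITION BY customer ORDER BY "count" DESC) AS rnk
        FROM purchases
    ) val
    WHERE rnk = 1
    ORDER BY customer
    """
).fetchall()

for row in top_categories:
    print(row)
```

If a customer had two categories with the same top count, row_number() would return just one of them, chosen arbitrarily unless a tie-breaker is added to the ORDER BY.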
Here is a simple query you can try:
SELECT c.*
FROM (SELECT customer, max(count) as max_count
FROM customers
GROUP BY customer) as max_count_table
JOIN customers as c on max_count_table.customer = c.customer and max_count_table.max_count = c.count

Getting date, and count of unique customers when first order was placed

I have a table called orders that looks like this:
+--------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| memberid | int(11) | YES | | NULL | |
| deliverydate | date | YES | | NULL | |
+--------------+---------+------+-----+---------+-------+
And that contains the following data:
+------+----------+--------------+
| id | memberid | deliverydate |
+------+----------+--------------+
| 1 | 991 | 2019-10-25 |
| 2 | 991 | 2019-10-26 |
| 3 | 992 | 2019-10-25 |
| 4 | 992 | 2019-10-25 |
| 5 | 993 | 2019-10-24 |
| 7 | 994 | 2019-10-21 |
| 6 | 994 | 2019-10-26 |
| 8 | 995 | 2019-10-26 |
+------+----------+--------------+
I would like a result set returning each unique date, and a separate column showing how many customers that placed their first order that day.
I'm having problems with querying this the right way, especially when the data consists of multiple orders the same day from the same customer.
My approach has been to:
1. Get all unique memberids that placed an order during the time period I want to look at
2. Keep only the ones that placed their first order during the period, by excluding the memberids that placed an order before the time period
3. Group by delivery date and count all unique memberids (but this obviously counts unique memberids for each day individually!)
Here's the corresponding SQL:
SELECT deliverydate,COUNT(DISTINCT memberid) FROM orders
WHERE
MemberId IN (SELECT DISTINCT memberid FROM orders WHERE deliverydate BETWEEN '2019-10-25' AND '2019-10-26')
AND NOT
MemberId In (SELECT DISTINCT memberid FROM orders WHERE deliverydate < '2019-10-25')
GROUP BY deliverydate
ORDER BY deliverydate ASC;
But this results in the following with the above data:
+--------------+--------------------------+
| deliverydate | COUNT(DISTINCT memberid) |
+--------------+--------------------------+
| 2019-10-25 | 2 |
| 2019-10-26 | 2 |
+--------------+--------------------------+
The count for 2019-10-26 should be 1.
Appreciate any help :)
You can aggregate twice:
select first_deliverydate, count(*) cnt
from (
select min(deliverydate) first_deliverydate
from orders
group by memberid
) t
group by first_deliverydate
order by first_deliverydate
The subquery gives you the first order date of each member, then the outer query aggregates and counts by first order date.
This demo on DB Fiddle with your sample data returns:
first_deliverydate | cnt
:----------------- | --:
2019-10-21 | 1
2019-10-24 | 1
2019-10-25 | 2
2019-10-26 | 1
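The double aggregation can also be checked locally. The following is a minimal sketch that assumes Python's built-in sqlite3 module instead of MySQL; the query itself uses nothing beyond MIN, COUNT and GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, memberid INTEGER, deliverydate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 991, "2019-10-25"),
        (2, 991, "2019-10-26"),
        (3, 992, "2019-10-25"),
        (4, 992, "2019-10-25"),
        (5, 993, "2019-10-24"),
        (7, 994, "2019-10-21"),
        (6, 994, "2019-10-26"),
        (8, 995, "2019-10-26"),
    ],
)

# Inner query: one row per member with that member's first delivery date.
# Outer query: count how many members share each first delivery date.
first_order_counts = conn.execute(
    """
    SELECT first_deliverydate, COUNT(*) AS cnt
    FROM (
        SELECT MIN(deliverydate) AS first_deliverydate
        FROM orders
        GROUP BY memberid
    ) t
    GROUP BY first_deliverydate
    ORDER BY first_deliverydate
    """
).fetchall()

for row in first_order_counts:
    print(row)
```

Member 992 has two orders on 2019-10-25 but is counted once, because the inner query collapses each member to a single row before counting.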
In MySQL 8.0, this can also be achieved with window functions:
select deliverydate first_deliverydate, count(*) cnt
from (
select deliverydate, row_number() over(partition by memberid order by deliverydate) rn
from orders
) t
where rn = 1
group by deliverydate
order by deliverydate
Demo on DB Fiddle
You first have to figure out each member's first delivery date:
SELECT firstdeliverydate,COUNT(DISTINCT memberid) FROM (
select memberid, min(deliverydate) as firstdeliverydate
from orders
WHERE
MemberId IN (SELECT DISTINCT memberid FROM orders WHERE deliverydate BETWEEN '2019-10-25' AND '2019-10-26')
AND NOT
MemberId In (SELECT DISTINCT memberid FROM orders WHERE deliverydate < '2019-10-25')
group by memberid)
t1
group by firstdeliverydate
Get the first order of each customer with NOT EXISTS and then GROUP BY deliverydate to count the distinct customers who placed their order:
select o.deliverydate, count(distinct o.memberid) counter
from orders o
where not exists (
select 1 from orders
where memberid = o.memberid and deliverydate < o.deliverydate
)
group by o.deliverydate
See the demo.
Results:
| deliverydate | counter |
| ------------------- | ------- |
| 2019-10-21 00:00:00 | 1 |
| 2019-10-24 00:00:00 | 1 |
| 2019-10-25 00:00:00 | 2 |
| 2019-10-26 00:00:00 | 1 |
But if you want results for all the dates in the table, including those dates where there were no orders from new customers (so the counter will be 0):
select d.deliverydate, count(distinct o.memberid) counter
from (
select distinct deliverydate
from orders
) d left join orders o
on o.deliverydate = d.deliverydate and not exists (
select 1 from orders
where memberid = o.memberid and deliverydate < o.deliverydate
)
group by d.deliverydate

Query item with closest date based on current date

I am trying to get the closest date for item no and price based on the current date. The query is giving me output, but not the way I want.
There is a different price for the same item and it's not filtering.
Here's my query:
SELECT distinct [ITEM_NO]
,min(REQUIRED_DATE) as Date
,[PRICE]
FROM [DATA_WAREHOUSE].[app].[OHCMS_HOPS_ORDERS]
where (REQUIRED_DATE) >= GETDATE() and PRICE is not null
group by ITEM_NO,PRICE
order by ITEM_NO
Any Ideas?
You can try using the ROW_NUMBER window function:
SELECT ITEM_NO,
REQUIRED_DATE,
PRICE
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY ITEM_NO ORDER BY REQUIRED_DATE) rn
FROM [DATA_WAREHOUSE].[app].[OHCMS_HOPS_ORDERS]
where REQUIRED_DATE >= GETDATE() and PRICE is not null
)t1
WHERE rn = 1
Could you order by the absolute value of DATEDIFF?
ORDER BY ABS(DATEDIFF(day, REQUIRED_DATE, GETDATE()))
This seems like an iteration of the greatest-n-per-group problem.
I'm not quite certain what constraints you're looking to impose:
1. Largest date
2. Most recent date (but not in the future)
3. Closest date to today (past or future)
Here's an example table and which row we'd want if queried on 6/3/2019:
| Item | RequiredDate | Price |
|------|--------------|-------|
| A | 2019-05-29 | 10 |
| A | 2019-06-01 | 20 | <-- #2
| A | 2019-06-04 | 30 | <-- #3
| A | 2019-06-05 | 40 | <-- #1
| B | 2019-06-01 | 80 |
But I'm going to guess you're looking for #2.
We can identify the largest date per item by grouping by item and using an aggregate operation like MAX on each group:
SELECT o.Item, MAX(o.RequiredDate) AS MostRecentDt
FROM Orders o
WHERE o.RequiredDate <= GETDATE()
GROUP BY o.Item
Which returns this:
| Item | MostRecentDt |
|------|--------------|
| A | 2019-06-01 |
| B | 2019-06-01 |
However, once we've grouped the records, the trouble is joining back to the original table to get the full row, so that we can select columns that were not part of the original GROUP BY clause.
Using ROW_NUMBER we can sort elements in a set, and indicate their order (highest...lowest)
SELECT *, ROW_NUMBER() OVER(PARTITION BY Item ORDER BY RequiredDate DESC) rn
FROM Orders o
WHERE o.RequiredDate <= GETDATE()
| Item | RequiredDate | Price | rn |
|------|--------------|-------|----|
| A | 2019-06-01 | 20 | 1 |
| A | 2019-05-29 | 10 | 2 |
| B | 2019-06-01 | 80 | 1 |
Since we've sorted DESC, now we just want to query this group to get the most recent values per group (rn=1)
WITH OrderedPastItems AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY Item ORDER BY RequiredDate DESC) rn
FROM Orders o
WHERE o.RequiredDate <= GETDATE()
)
SELECT *
FROM OrderedPastItems
WHERE rn = 1
Here's a MCVE in SQL Fiddle
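For a reproducible local check, the example table can be queried with a fixed reference date. The sketch below rests on a few assumptions: Python's built-in sqlite3 module (SQLite 3.25 or later) stands in for SQL Server, the constant AS_OF replaces GETDATE() so the output does not depend on when it is run, and SQLite's julianday() is used for the day difference in place of DATEDIFF. It covers interpretation #2 (most recent date not in the future) and interpretation #3 (closest date in either direction).

```python
import sqlite3

AS_OF = "2019-06-03"  # fixed stand-in for GETDATE(), matching the example above

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Item TEXT, RequiredDate TEXT, Price INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        ("A", "2019-05-29", 10),
        ("A", "2019-06-01", 20),
        ("A", "2019-06-04", 30),
        ("A", "2019-06-05", 40),
        ("B", "2019-06-01", 80),
    ],
)

# Interpretation #2: drop future rows, then take the newest row per item.
most_recent = conn.execute(
    """
    SELECT Item, RequiredDate, Price
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY Item ORDER BY RequiredDate DESC) AS rn
        FROM Orders
        WHERE RequiredDate <= ?
    ) AS t
    WHERE rn = 1
    ORDER BY Item
    """,
    (AS_OF,),
).fetchall()

# Interpretation #3: keep all rows and rank by absolute distance in days.
closest = conn.execute(
    """
    SELECT Item, RequiredDate, Price
    FROM (
        SELECT *, ROW_NUMBER() OVER (
                   PARTITION BY Item
                   ORDER BY ABS(julianday(RequiredDate) - julianday(?))) AS rn
        FROM Orders
    ) AS t
    WHERE rn = 1
    ORDER BY Item
    """,
    (AS_OF,),
).fetchall()

print(most_recent)
print(closest)
```

For item A the two interpretations pick different rows: 2019-06-01 for the most recent past date, and 2019-06-04 for the closest date overall.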