SQL left join with latest record - sql

I want to left join a table with the latest record only.
I have Customer1 table:
+--------+----------+
| CustID | CustName |
+--------+----------+
| 1 | ABC123 |
| 2 | 456XYZ |
| 3 | 5PQR3 |
| 4 | 789XYZ |
| 5 | 789A |
+--------+----------+
SalesInvoice table:
+------------+--------+-----------+
| InvDate | CustID | InvNumber |
+------------+--------+-----------+
| 2020-03-01 | 1 | IV236 |
| 2020-04-07 | 1 | IV644 |
| 2020-06-13 | 2 | IV869 |
| 2020-03-29 | 3 | IV436 |
| 2020-02-06 | 3 | IV126 |
+------------+--------+-----------+
And I want this required output:
+--------+------------+-----------+
| CustID | InvDate | InvNumber |
+--------+------------+-----------+
| 1 | 2020-04-07 | IV644 |
| 2 | 2020-06-13 | IV869 |
| 3 | 2020-03-29 | IV436 |
| 4 | | |
| 5 | | |
+--------+------------+-----------+
For quick and easy, below is the sample code.
drop table if exists #Customer1
create table #Customer1(CustID int, CustName varchar (100))
insert into #Customer1 values
(1,'ABC123'),
(2,'456XYZ'),
(3,'5PQR3'),
(4,'789XYZ'),
(5,'789A')
drop table if exists #SalesInvoice
create table #SalesInvoice(InvDate DATE, CustID INT, InvNumber varchar (100))
insert into #SalesInvoice values
('2020-03-01',1,'IV236'),
('2020-04-07',1,'IV644'),
('2020-06-13',2,'IV869'),
('2020-03-29',3,'IV436'),
('2020-02-06',3,'IV126')

I like using TOP 1 WITH TIES in this case:
SELECT TOP 1 WITH TIES c.CustID, i.InvDate, i.InvNumber
FROM #Customer1 c
LEFT JOIN #Invoices i ON c.CustID = i.CustID
ORDER BY ROW_NUMBER() OVER (PARTITION BY c.CustID ORDER BY i.InvDate DESC);
Demo
The top 1 trick here is to order by row number, assigning a sequence to each customer, with the sequence descending by invoice date. Then, this approach retains just the most recent invoice record for each customer.

I recommend outer apply:
select c.*, i.*
from #c c outer apply
(select top (1) i.*
from #invoices i
where i.custId = c.custId
order by i.invDate desc
) i;
outer apply implements a special type of join called a "lateral join". This is a very powerful construct. But when learning about them, you can think of a lateral join as a correlated subquery that can return more than one column and more than one row.

You can try ROW_NUMBER window function instead of lateral joins with this simple self-explaining T-SQL
SELECT c.CustID
, d.InvDate
, d.InvNumber
FROM #C c
LEFT JOIN (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY InvDate DESC) AS RowNo
FROM #D
) d
ON c.CustID = d.CustID
AND d.RowNo = 1
Basically ROW_NUMBER is used to filter the "last" invoice in one table scan, instead of performing SELECT TOP 1 ... ORDER BY in the correlated query which has to be executed multiple times -- as much as the number of customers.

Related

How to join tables only with the latest record in SQL SERVER [duplicate]

This question already has answers here:
Join to only the "latest" record with t-sql
(7 answers)
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
Closed 4 months ago.
I want to list all customer with the latest phone number and most recent customer type
the phone number and type of customers are changing periodically so I want the latest record only without getting old values based on the lastestupdate column
Customer:
+------------+--------------------+------------+
|latestUpdate| CustID | AddID | TypeID |
+------------+--------+-----------+-------------
| 2020-03-01 | 1 | 1 | 1 |
| 2020-04-07 | 2 | 2 | 2 |
| 2020-06-13 | 3 | 3 | 3 |
| 2020-03-29 | 4 | 4 | 4 |
| 2020-02-06 | 5 | 5 | 5 |
+------------+--------+------------+----------+
CustomerAddress:
+------------+--------+-----------+
|latestUpdate| AddID | Mobile |
+------------+--------+-----------+
| 2020-03-01 | 1 | 66666 |
| 2020-04-07 | 1 | 55555 |
| 2020-06-13 | 2 | 99999 |
| 2020-03-29 | 3 | 11111 |
| 2020-02-06 | 3 | 22222 |
+------------+--------+-----------+
CustomerType:
+------------+--------+-----------+
|latestUpdate| TypeId | TypeName |
+------------+--------+-----------+
| 2020-03-01 | 1 | First |
| 2020-04-07 | 1 | Second |
| 2020-06-13 | 3 | Third |
| 2020-03-29 | 4 | Fourth |
| 2020-02-06 | 5 | Fifth |
+------------+--------+-----------+
When I tried to join I am always getting duplicated customerID not only the latest record
I want to Display Customer.CustID and CustomerType.TypeName and CustomerAddress.Mobile
You need to make sub-queries for most recent customer type and latest phone number like this:
SELECT *
FROM (
SELECT latestUpdate, CustID, AddID, TypeID,
ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY latestUpdate DESC) AS RowNumber
FROM Customer
) AS c
INNER JOIN (
SELECT latestUpdate, AddID, Mobile,
ROW_NUMBER() OVER (PARTITION BY AddId ORDER BU ltestUpdate DESC) AS RowNumber
FROM CustomerAddress
) AS t
ON c.AddId = t.AddId
INNER JOIN CustomerType ct
ON ct.TypeId = c.TypeId
WHERE c.RowNumber = 1
AND t.RowNumber = 1
A simpler way than using row_number would be using cross apply together with top 1 in an ordered subquery:
select c.CustId, p.Mobile
from Customer c
cross apply (
select top 1 Mobile
from CustomerAddress a
where c.CustId = a.AddId
order by a.latestUpdate
) p
You need to use some subqueries :
SELECT *
FROM Customer AS C
LETF OUTER JOIN (SELECT *, ROW_NUMBER() OVER(PARTITION BY CustID ORDER BY LastestUpdate DESC) AS N
FROM CustomerAddress) AS A
ON C.CustID = A.CustID AND N = 1
LETF OUTER JOIN (SELECT *, ROW_NUMBER() OVER(PARTITION BY CustID ORDER BY LastestUpdate DESC) AS N
FROM CustomerType) AS T
ON C.CustID = T.CustID AND N = 1
If you have had used Temporal table which is an ISO SQL Standard feature for data history of table, you will always have the lastest rows inside the main table, old rows stays into history table and can be queried with a time point or date interval restriction.
This is it:
select * from (select *,RANK() OVER (
PARTITION BY b.AddID
ORDER BY b.latestUpdate DESC,
) as rank1
from
Customer a
left join
CustomerAddress b
on
a.AddID=b.AddID
left join
CustomerType c
on
v.TypeId =c.TypeId
) where rank1=1;
You should join the tables using the "APPLY" operator.
See: Link

Postgresql left join

I have two tables cars and usage. I create a record in usage once a month for some of cars.
Now I want to get distinct list of cars with their latest usage that I saved.
first of all look at the tables please
cars:
| id | model | reseller_id |
|----|-------------|-------------|
| 1 | Samand Sall | 324228 |
| 2 | Saba 141 | 92933 |
usages:
| id | car_id | year | month | gas |
|----|--------|------|-------|-----|
| 1 | 2 | 2020 | 2 | 68 |
| 2 | 2 | 2020 | 3 | 94 |
| 3 | 2 | 2020 | 4 | 33 |
| 4 | 2 | 2020 | 5 | 12 |
The problem is here
I need only the latest usage of year and month
I tried a lot of ways but none of them is good enough. because sometimes this query gets me one ofnot latest records of usages.
SELECT * FROM cars AS c
LEFT JOIN
(select *
from usages
) u on (c.id = u.car_id)
order by u.gas desc
You can do this with a DISTINCT ON in the derived table:
SELECT *
FROM cars AS c
LEFT JOIN (
select distinct on (u.car_id) *
from usages u
order by u.car_id, u.year desc, u.month desc
) lu on c.id = lu.car_id
order by u.gas desc;
I think you need window function row_number. Here is the demo.
select
id,
model,
reseller_id
from
(
select
c.id,
model,
reseller_id,
row_number() over (partition by u.car_id order by u.id desc) as rn
from cars c
left join usages u
on c.id = u.car_id
) subq
where rn = 1

Getting date, and count of unique customers when first order was placed

I have a table called orders that looks like this:
+--------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| memberid | int(11) | YES | | NULL | |
| deliverydate | date | YES | | NULL | |
+--------------+---------+------+-----+---------+-------+
And that contains the following data:
+------+----------+--------------+
| id | memberid | deliverydate |
+------+----------+--------------+
| 1 | 991 | 2019-10-25 |
| 2 | 991 | 2019-10-26 |
| 3 | 992 | 2019-10-25 |
| 4 | 992 | 2019-10-25 |
| 5 | 993 | 2019-10-24 |
| 7 | 994 | 2019-10-21 |
| 6 | 994 | 2019-10-26 |
| 8 | 995 | 2019-10-26 |
+------+----------+--------------+
I would like a result set returning each unique date, and a separate column showing how many customers that placed their first order that day.
I'm having problems with querying this the right way, especially when the data consists of multiple orders the same day from the same customer.
My approach has been to
Get all unique memberids that placed an order during the time period I want to look at
Filter out the ones that placed their first order during the period by comparing the memberids that has placed an order before the timeperiod
Grouping by delivery date, and counting all unique memberids (but this obviously counts unique memberids each day individually!)
Here's the corresponding SQL:
SELECT deliverydate,COUNT(DISTINCT memberid) FROM orders
WHERE
MemberId IN (SELECT DISTINCT memberid FROM orders WHERE deliverydate BETWEEN '2019-10-25' AND '2019-10-26')
AND NOT
MemberId In (SELECT DISTINCT memberid FROM orders WHERE deliverydate < '2019-10-25')
GROUP BY deliverydate
ORDER BY deliverydate ASC;
But this results in the following with the above data:
+--------------+--------------------------+
| deliverydate | COUNT(DISTINCT memberid) |
+--------------+--------------------------+
| 2019-10-25 | 2 |
| 2019-10-26 | 2 |
+--------------+--------------------------+
The count for 2019-10-26 should be 1.
Appreciate any help :)
You can aggregate twice:
select first_deliverydate, count(*) cnt
from (
select min(deliverydate) first_deliverydate
from orders
group by memberid
) t
group by first_deliverydate
order by first_deliverydate
The subquery gives you the first order data of each member, then the outer query aggregates and counts by first order date.
This demo on DB Fiddle with your sample data returns:
first_deliverydate | cnt
:----------------- | --:
2019-10-21 | 1
2019-10-24 | 1
2019-10-25 | 2
2019-10-26 | 1
In MySQL 8.0, This can also be achieved with window functions:
select deliverydate first_deliverydate, count(*) cnt
from (
select deliverydate, row_number() over(partition by memberid order by deliverydate) rn
from orders
) t
where rn = 1
group by deliverydate
order by deliverydate
Demo on DB Fiddle
you have first to figure out when was the first delivery date:
SELECT firstdeliverydate,COUNT(DISTINCT memberid) FROM (
select memberid, min(deliverydate) as firstdeliverydate
from orders
WHERE
MemberId IN (SELECT DISTINCT memberid FROM orders WHERE deliverydate BETWEEN '2019-10-25' AND '2019-10-26')
AND NOT
MemberId In (SELECT DISTINCT memberid FROM orders WHERE deliverydate < '2019-10-25')
group by memberid)
t1
group by firstdeliverydate
Get the first order of each customer with NOT EXISTS and then GROUP BY deliverydate to count the distinct customers who placed their order:
select o.deliverydate, count(distinct o.memberid) counter
from orders o
where not exists (
select 1 from orders
where memberid = o.memberid and deliverydate < o.deliverydate
)
group by o.deliverydate
See the demo.
Results:
| deliverydate | counter |
| ------------------- | ------- |
| 2019-10-21 00:00:00 | 1 |
| 2019-10-24 00:00:00 | 1 |
| 2019-10-25 00:00:00 | 2 |
| 2019-10-26 00:00:00 | 1 |
But if you want results for all the dates in the table including those dates where there where no orders from new customers (so the counter will be 0):
select d.deliverydate, count(distinct o.memberid) counter
from (
select distinct deliverydate
from orders
) d left join orders o
on o.deliverydate = d.deliverydate and not exists (
select 1 from orders
where memberid = o.memberid and deliverydate < o.deliverydate
)
group by d.deliverydate

Retrieve the minimal create date with multiple rows

I have an issue with an SQL query that I am trying to write. I am trying to retrieve the row that has the minimal create_dt for each inst (see table) and amount (which isn't unique).
Unfortunately I can't use group by as the amount column isn't unique.
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company A | 400 | 4545 | 01/11/2018 |
| Company A | 200 | 4545 | 31/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
| Company B | 212 | 4893 | 04/10/2016 |
| Company B | 100 | 4893 | 10/10/2017 |
| Company B | 20 | 4893 | 04/10/2018 |
+--------------+--------+------+-------------+
In the above example I expect to see:
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
+--------------+--------+------+-------------+
Code:
SELECT
bill_company, bill_name, account_no
FROM
dbo.customer_information;
SELECT
balance_id, balance_id2, minus_balance,new_balance,
create_date, account_no
FROM
dbo.btr
SELECT
balance_id, balance_id2, expired_Date, amount, balance_type, account_no
FROM
dbo.btr_balance
SELECT
balance_ist, expired_date, account_no, balance_type
FROM
dbo.BALANCE_inst
Retrieve the minimal create data for a balance instance with the lowest balance for a balance inst.
(SELECT
bill_company,
bill_name,
account_no,
balance_ist,
amount,
MIN(create_date)
FROM
dbo.mtr btr
LEFT JOIN
btr_balance btrb ON btr.balance_id = btrb.balance_id
AND btr.balance_id2 = btrb.balance_id2
LEFT JOIN
balance_inst bali ON btr.account_no = bali.account_no
AND btrb.expired_date = bali.expired_date
GROUP BY
bill_company, bill_name, account_no,amount, balance_ist)
I have seen some solutions about using correlated query but can't see to get my head around it.
Common Table Expression (CTE) will help you.
;with cte as (
select *, row_number() over(partition by company_name order by create_date) rn
from dbo.myTable
)
select * from cte
where rn = 1;
use row_number() i assumed bill_company is your company name
select * from
( SELECT bill_company,
bill_name,
account_no,
balance_ist,
amount,
create_date,
row_number() over(partition by bill_company order by create_date) rn
FROM dbo.mtr btr left join btr_balance btrb
on btr.balance_id = btrb.balance_id and btr.balance_id2 = btrb.balance_id2
left join balance_inst bali
on btr.account_no = bali.account_no and btrb.expired_date = bali.expired_date
) t where t.rn=1

Real life example, when to use OUTER / CROSS APPLY in SQL

I have been looking at CROSS / OUTER APPLY with a colleague and we're struggling to find real life examples of where to use them.
I've spent quite a lot of time looking at When should I use CROSS APPLY over INNER JOIN? and googling but the main (only) example seems pretty bizarre (using the rowcount from a table to determine how many rows to select from another table).
I thought this scenario may benefit from OUTER APPLY:
Contacts Table (contains 1 record for each contact)
Communication Entries Table (can contain a phone, fax, email for each contact)
But using subqueries, common table expressions, OUTER JOIN with RANK() and OUTER APPLY all seem to perform equally. I'm guessing this means the scenario isn't applicable to APPLY.
Please share some real life examples and help explain the feature!
Some uses for APPLY are...
1) Top N per group queries (can be more efficient for some cardinalities)
SELECT pr.name,
pa.name
FROM sys.procedures pr
OUTER APPLY (SELECT TOP 2 *
FROM sys.parameters pa
WHERE pa.object_id = pr.object_id
ORDER BY pr.name) pa
ORDER BY pr.name,
pa.name
2) Calling a Table Valued Function for each row in the outer query
SELECT *
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle)
3) Reusing a column alias
SELECT number,
doubled_number,
doubled_number_plus_one
FROM master..spt_values
CROSS APPLY (SELECT 2 * CAST(number AS BIGINT)) CA1(doubled_number)
CROSS APPLY (SELECT doubled_number + 1) CA2(doubled_number_plus_one)
4) Unpivoting more than one group of columns
Assumes 1NF violating table structure....
CREATE TABLE T
(
Id INT PRIMARY KEY,
Foo1 INT, Foo2 INT, Foo3 INT,
Bar1 INT, Bar2 INT, Bar3 INT
);
Example using 2008+ VALUES syntax.
SELECT Id,
Foo,
Bar
FROM T
CROSS APPLY (VALUES(Foo1, Bar1),
(Foo2, Bar2),
(Foo3, Bar3)) V(Foo, Bar);
In 2005 UNION ALL can be used instead.
SELECT Id,
Foo,
Bar
FROM T
CROSS APPLY (SELECT Foo1, Bar1
UNION ALL
SELECT Foo2, Bar2
UNION ALL
SELECT Foo3, Bar3) V(Foo, Bar);
There are various situations where you cannot avoid CROSS APPLY or OUTER APPLY.
Consider you have two tables.
MASTER TABLE
x------x--------------------x
| Id | Name |
x------x--------------------x
| 1 | A |
| 2 | B |
| 3 | C |
x------x--------------------x
DETAILS TABLE
x------x--------------------x-------x
| Id | PERIOD | QTY |
x------x--------------------x-------x
| 1 | 2014-01-13 | 10 |
| 1 | 2014-01-11 | 15 |
| 1 | 2014-01-12 | 20 |
| 2 | 2014-01-06 | 30 |
| 2 | 2014-01-08 | 40 |
x------x--------------------x-------x
CROSS APPLY
There are many situation where we need to replace INNER JOIN with CROSS APPLY.
1. If we want to join 2 tables on TOP n results with INNER JOIN functionality
Consider if we need to select Id and Name from Master and last two dates for each Id from Details table.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
INNER JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
The above query generates the following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
x------x---------x--------------x-------x
See, it generated results for last two dates with last two date's Id and then joined these records only in outer query on Id, which is wrong. To accomplish this, we need to use CROSS APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
CROSS APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
and forms he following result.
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
x------x---------x--------------x-------x
Here is the working. The query inside CROSS APPLY can reference the outer table, where INNER JOIN cannot do this(throws compile error). When finding the last two dates, joining is done inside CROSS APPLY ie, WHERE M.ID=D.ID.
2. When we need INNER JOIN functionality using functions.
CROSS APPLY can be used as a replacement with INNER JOIN when we need to get result from Master table and a function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
CROSS APPLY dbo.FnGetQty(M.ID) C
And here is the function
CREATE FUNCTION FnGetQty
(
#Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=#Id
)
which generated the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
x------x---------x--------------x-------x
OUTER APPLY
1. If we want to join 2 tables on TOP n results with LEFT JOIN functionality
Consider if we need to select Id and Name from Master and last two dates for each Id from Details table.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
LEFT JOIN
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
ORDER BY CAST(PERIOD AS DATE)DESC
)D
ON M.ID=D.ID
which forms the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | NULL | NULL |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
This will bring wrong results ie, it will bring only latest two dates data from Details table irrespective of Id even though we join with Id. So the proper solution is using OUTER APPLY.
SELECT M.ID,M.NAME,D.PERIOD,D.QTY
FROM MASTER M
OUTER APPLY
(
SELECT TOP 2 ID, PERIOD,QTY
FROM DETAILS D
WHERE M.ID=D.ID
ORDER BY CAST(PERIOD AS DATE)DESC
)D
which forms the following desired result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-08 | 40 |
| 2 | B | 2014-01-06 | 30 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
2. When we need LEFT JOIN functionality using functions.
OUTER APPLY can be used as a replacement with LEFT JOIN when we need to get result from Master table and a function.
SELECT M.ID,M.NAME,C.PERIOD,C.QTY
FROM MASTER M
OUTER APPLY dbo.FnGetQty(M.ID) C
And the function goes here.
CREATE FUNCTION FnGetQty
(
#Id INT
)
RETURNS TABLE
AS
RETURN
(
SELECT ID,PERIOD,QTY
FROM DETAILS
WHERE ID=#Id
)
which generated the following result
x------x---------x--------------x-------x
| Id | Name | PERIOD | QTY |
x------x---------x--------------x-------x
| 1 | A | 2014-01-13 | 10 |
| 1 | A | 2014-01-11 | 15 |
| 1 | A | 2014-01-12 | 20 |
| 2 | B | 2014-01-06 | 30 |
| 2 | B | 2014-01-08 | 40 |
| 3 | C | NULL | NULL |
x------x---------x--------------x-------x
Common feature of CROSS APPLY and OUTER APPLY
CROSS APPLY or OUTER APPLY can be used to retain NULL values when unpivoting, which are interchangeable.
Consider you have the below table
x------x-------------x--------------x
| Id | FROMDATE | TODATE |
x------x-------------x--------------x
| 1 | 2014-01-11 | 2014-01-13 |
| 1 | 2014-02-23 | 2014-02-27 |
| 2 | 2014-05-06 | 2014-05-30 |
| 3 | NULL | NULL |
x------x-------------x--------------x
When you use UNPIVOT to bring FROMDATE AND TODATE to one column, it will eliminate NULL values by default.
SELECT ID,DATES
FROM MYTABLE
UNPIVOT (DATES FOR COLS IN (FROMDATE,TODATE)) P
which generates the below result. Note that we have missed the record of Id number 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
x------x-------------x
In such cases a CROSS APPLY or OUTER APPLY will be useful
SELECT DISTINCT ID,DATES
FROM MYTABLE
OUTER APPLY(VALUES (FROMDATE),(TODATE))
COLUMNNAMES(DATES)
which forms the following result and retains Id where its value is 3
x------x-------------x
| Id | DATES |
x------x-------------x
| 1 | 2014-01-11 |
| 1 | 2014-01-13 |
| 1 | 2014-02-23 |
| 1 | 2014-02-27 |
| 2 | 2014-05-06 |
| 2 | 2014-05-30 |
| 3 | NULL |
x------x-------------x
One real life example would be if you had a scheduler and wanted to see what the most recent log entry was for each scheduled task.
select t.taskName, lg.logResult, lg.lastUpdateDate
from task t
cross apply (select top 1 taskID, logResult, lastUpdateDate
from taskLog l
where l.taskID = t.taskID
order by lastUpdateDate desc) lg
To answer the point above knock up an example:
create table #task (taskID int identity primary key not null, taskName varchar(50) not null)
create table #log (taskID int not null, reportDate datetime not null, result varchar(50) not null, primary key(reportDate, taskId))
insert #task select 'Task 1'
insert #task select 'Task 2'
insert #task select 'Task 3'
insert #task select 'Task 4'
insert #task select 'Task 5'
insert #task select 'Task 6'
insert #log
select taskID, 39951 + number, 'Result text...'
from #task
cross join (
select top 1000 row_number() over (order by a.id) as number from syscolumns a cross join syscolumns b cross join syscolumns c) n
And now run the two queries with a execution plan.
select t.taskID, t.taskName, lg.reportDate, lg.result
from #task t
left join (select taskID, reportDate, result, rank() over (partition by taskID order by reportDate desc) rnk from #log) lg
on lg.taskID = t.taskID and lg.rnk = 1
select t.taskID, t.taskName, lg.reportDate, lg.result
from #task t
outer apply ( select top 1 l.*
from #log l
where l.taskID = t.taskID
order by reportDate desc) lg
You can see that the outer apply query is more efficient. (Couldn't attach the plan as I'm a new user... Doh.)