Find Segment with Longest Stay Per Booking


We have a number of bookings, and one of the requirements is that we display the Final Destination for a booking based on its segments. Our business has defined the Final Destination as the city in which we have the longest stay, and the Origin as the first departure point.
Please note that this is not the segment with the longest travel time, i.e. DATEDIFF(MINUTE, DepartDate, ArrivalDate). It is the one with the longest gap between one segment's arrival and the next segment's departure.
This is a simplified version of the tables:
Create Table Segments
(
BookingID int,
SegNum int,
DepartureCity varchar(100),
DepartDate datetime,
ArrivalCity varchar(100),
ArrivalDate datetime
);
Create Table Bookings
(
BookingID int identity(1,1),
Locator varchar(10)
);
Insert into Segments values (1,2,'BRU','2010-03-06 10:40','FIH','2010-03-06 20:20:00')
Insert into Segments values (1,4,'FIH','2010-03-13 21:50:00','BRU', '2010-03-14 07:25:00')
Insert into Segments values (2,2,'BOD','2010-02-10 06:50:00','AMS','2010-02-10 08:50:00')
Insert into Segments values (2,3,'AMS','2010-02-10 10:40:00','EBB','2010-02-10 20:40:00')
Insert into Segments values (2,4,'EBB','2010-02-28 22:55:00','AMS','2010-03-01 05:35:00')
Insert into Segments values (2,5,'AMS','2010-03-01 10:25:00','BOD','2010-03-01 12:15:00')
insert into Segments values (3,2,'BRU','2010-03-09 12:10:00','IAD','2010-03-09 14:46:00')
Insert into Segments Values (3,3,'IAD','2010-03-13 17:57:00','BRU','2010-03-14 07:15:00')
insert into segments values (4,2,'BRU','2010-07-27','ADD','2010-07-28')
insert into segments values (4,4,'ADD','2010-07-28','LUN','2010-07-28')
insert into segments values (4,5,'LUN','2010-08-23','ADD','2010-08-23')
insert into segments values (4,6,'ADD','2010-08-23','BRU','2010-08-24')
Insert into Bookings values('5MVL7J')
Insert into Bookings values ('Y2IMXQ')
insert into bookings values ('YCBL5C')
Insert into bookings values ('X7THJ6')
I have created a SQL Fiddle with real data here:
SQL Fiddle Example
I have tried to do the following, however this doesn't appear to be correct.
SELECT Locator, fd.*
FROM Bookings ob
OUTER APPLY
(
SELECT Top 1 DepartureCity, ArrivalCity
from
(
SELECT DISTINCT
seg.segnum ,
seg.DepartureCity ,
seg.DepartDate ,
seg.ArrivalCity ,
seg.ArrivalDate,
(SELECT
DISTINCT
DATEDIFF(MINUTE , seg.ArrivalDate , s2.DepartDate)
FROM Segments s2
WHERE s2.BookingID = seg.BookingID AND s2.segnum = seg.segnum + 1) 'LengthOfStay'
FROM Bookings b(NOLOCK)
INNER JOIN Segments seg (NOLOCK) ON seg.bookingid = b.bookingid
WHERE b.Locator = ob.locator
) a
Order by a.lengthofstay desc
)
FD
The results I expect are:
Locator Origin Destination
5MVL7J BRU FIH
Y2IMXQ BOD EBB
YCBL5C BRU IAD
X7THJ6 BRU LUN
I get the feeling that a CTE would be the best approach; however, my attempts to do this have so far failed miserably. Any help would be greatly appreciated.
I have managed to get the following query working, but because of the TOP 1 it only works for one booking at a time, and I'm not sure how to tweak it:
WITH CTE AS
(
SELECT distinct s.DepartureCity, s.DepartDate, s.ArrivalCity, s.ArrivalDate, b.Locator , ROW_NUMBER() OVER (PARTITION BY b.Locator ORDER BY SegNum ASC) RN
FROM Segments s
JOIN bookings b ON s.bookingid = b.BookingID
)
SELECT C.Locator, c.DepartureCity, a.ArrivalCity
FROM
(
SELECT TOP 1 C.Locator, c.ArrivalCity, c1.DepartureCity, DATEDIFF(MINUTE,c.ArrivalDate, c1.DepartDate) 'ddiff'
FROM CTE c
JOIN cte c1 ON c1.Locator = C.Locator AND c1.rn = c.rn + 1
ORDER BY ddiff DESC
) a
JOIN CTE c ON C.Locator = a.Locator
WHERE c.rn = 1

You can try something like this:
;WITH CTE_Start AS
(
--Ordering of segments to eliminate gaps
SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY SegNum) RN
FROM dbo.Segments
)
, RCTE_Stay AS
(
--recursive CTE to calculate stay between segments
SELECT *, 0 AS Stay FROM CTE_Start s WHERE RN = 1
UNION ALL
SELECT sNext.*, DATEDIFF(Mi, s.ArrivalDate, sNext.DepartDate)
FROM CTE_Start sNext
INNER JOIN RCTE_Stay s ON s.RN + 1 = sNext.RN AND s.BookingID = sNext.BookingID
)
, CTE_Final AS
(
--Search for max(stay) for each bookingID
SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY Stay DESC) AS RN_Stay
FROM RCTE_Stay
)
--join Start and Final on RN=1 to find origin and departure
SELECT b.Locator, s.DepartureCity AS Origin, f.DepartureCity AS Destination
FROM CTE_Final f
INNER JOIN CTE_Start s ON f.BookingID = s.BookingID
INNER JOIN dbo.Bookings b ON b.BookingID = f.BookingID
WHERE s.RN = 1 AND f.RN_Stay = 1
SQLFiddle DEMO
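For what it's worth, on SQL Server 2012 or later the same gap can probably be computed without recursion. The following is only a sketch under that version assumption, using the table and column names from the question; the CTE names Stays and Ranked are made up for illustration. LEAD() looks up the next segment's DepartDate within the booking, and the segment followed by the largest gap supplies the destination.

```sql
-- Sketch only: assumes SQL Server 2012+ (for LEAD) and the question's tables.
WITH Stays AS
(
    SELECT  BookingID,
            SegNum,
            DepartureCity,
            ArrivalCity,
            -- minutes between arriving on this segment and departing on the next one;
            -- NULL for the last segment of a booking
            DATEDIFF(MINUTE, ArrivalDate,
                     LEAD(DepartDate) OVER (PARTITION BY BookingID ORDER BY SegNum)) AS Stay
    FROM    dbo.Segments
),
Ranked AS
(
    SELECT  *,
            ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY SegNum)    AS RN_First,
            ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY Stay DESC) AS RN_Stay
    FROM    Stays
)
SELECT  b.Locator,
        o.DepartureCity AS Origin,
        d.ArrivalCity   AS Destination
FROM    dbo.Bookings b
JOIN    Ranked o ON o.BookingID = b.BookingID AND o.RN_First = 1  -- first segment
JOIN    Ranked d ON d.BookingID = b.BookingID AND d.RN_Stay  = 1; -- longest stay
```

Against the sample data this should produce the four rows listed in the question. If two stays in a booking are equally long, ROW_NUMBER picks one of them arbitrarily.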

You can use the OUTER APPLY and TOP operators to find the next SegNum value for each segment. Once the gap between segments is known, MIN/MAX aggregate functions with an OVER clause serve as the conditions in the CASE expressions:
;WITH cte AS
(
SELECT seg.BookingID,
CASE WHEN MIN(seg.segNum) OVER(PARTITION BY seg.BookingID) = seg.segNum
THEN seg.DepartureCity END AS Origin,
CASE WHEN MAX(DATEDIFF(MINUTE, seg.ArrivalDate, o.DepartDate)) OVER(PARTITION BY seg.BookingID)
= DATEDIFF(MINUTE, seg.ArrivalDate, o.DepartDate)
THEN o.DepartureCity END AS Destination
FROM Segments seg (NOLOCK)
OUTER APPLY (
SELECT TOP 1 seg2.DepartDate, seg2.DepartureCity
FROM Segments seg2
WHERE seg.BookingID = seg2.BookingID
AND seg.SegNum < seg2.SegNum
ORDER BY seg2.SegNum ASC
) o
)
SELECT b.Locator, MAX(c.Origin) AS Origin, MAX(c.Destination) AS Destination
FROM cte c JOIN Bookings b ON c.BookingID = b.BookingID
GROUP BY b.Locator
See demo on SQLFiddle

The statement below:
;WITH DataSource AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY BookingID ORDER BY DATEDIFF(SS,DepartDate,ArrivalDate) DESC) AS Row
,Segments.BookingID
,Segments.SegNum
,Segments.DepartureCity
,Segments.DepartDate
,Segments.ArrivalCity
,Segments.ArrivalDate
,DATEDIFF(SS,DepartDate,ArrivalDate) AS DiffInSeconds
FROM Segments
)
SELECT *
FROM DataSource DS
INNER JOIN Bookings B
ON DS.[BookingID] = B.[BookingID]
will give one row per segment, numbered within each booking from the largest DATEDIFF(SS, DepartDate, ArrivalDate) to the smallest (the screenshot of the output is not preserved here).
So, adding the following clause to the above statement:
WHERE Row = 1
will give you what you need.
A few important things:
As the output shows, there are two records with the same difference in seconds. If you want to show both of them (or all of them, if there are more), use the RANK function instead of ROW_NUMBER.
The return type of DATEDIFF is INT, so there is a limit on the maximum difference in seconds. The documentation describes it as follows:
"If the return value is out of range for int (-2,147,483,648 to +2,147,483,647), an error is returned. For millisecond, the maximum difference between startdate and enddate is 24 days, 20 hours, 31 minutes and 23.647 seconds. For second, the maximum difference is 68 years."
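To make the limit concrete, here is a small sketch using dates from the sample segments. It assumes SQL Server 2016 or later, where DATEDIFF_BIG (which returns bigint) is available; on older versions a coarser unit such as MINUTE or SECOND is the usual way around the limit.

```sql
-- Sketch only; the literal dates come from the question's sample data.
-- Minutes fit comfortably in an int for any realistic stay.
SELECT DATEDIFF(MINUTE, '2010-03-06 20:20', '2010-03-13 21:50') AS StayMinutes;

-- DATEDIFF(MILLISECOND, '2010-07-28', '2010-08-23') would raise an overflow error:
-- 26 days is 2,246,400,000 ms, which exceeds the int maximum of 2,147,483,647.
-- DATEDIFF_BIG (assumed available, SQL Server 2016+) returns bigint instead.
SELECT DATEDIFF_BIG(MILLISECOND, '2010-07-28', '2010-08-23') AS StayMs;
```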

Related

SQL - Get the sum of several groups of records

DESIRED RESULT
Get the hours SUM of all [Hours] including only a single result from each [DevelopmentID] where [Revision] is highest value
e.g SUM 1, 2, 3, 5, 6 (Result should be 22.00)
I'm stuck trying to get the appropriate grouping.
DECLARE @CompanyID INT = 1
SELECT
SUM([s].[Hours]) AS [Hours]
FROM
[dbo].[tblDev] [d] WITH (NOLOCK)
JOIN
[dbo].[tblSpec] [s] WITH (NOLOCK) ON [d].[DevID] = [s].[DevID]
WHERE
[s].[Revision] = (
SELECT MAX([s2].[Revision]) FROM [tblSpec] [s2]
)
GROUP BY
[s].[Hours]

Use ROW_NUMBER() to identify the latest revision for each DevID:
SELECT SUM([Hours])
FROM (
SELECT s.[Hours], R = ROW_NUMBER() OVER (PARTITION BY d.DevID
ORDER BY s.Revision DESC) -- DESC so that R = 1 is the highest revision
FROM [dbo].[tblDev] d
JOIN [dbo].[tblSpec] s
ON d.[DevID] = s.[DevID]
) d
WHERE R = 1
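A minimal, self-contained sketch of the same idea follows. The table variable @Spec and its rows are invented for illustration (the question's real data is not shown), so only the pattern carries over: number the revisions per DevID from highest to lowest, then sum only the rows numbered 1.

```sql
-- Hypothetical sample data; not the asker's real tables.
DECLARE @Spec TABLE (DevID int, Revision int, [Hours] decimal(5,2));
INSERT INTO @Spec (DevID, Revision, [Hours]) VALUES
    (1, 1, 4.00), (1, 2, 5.00),               -- latest revision for DevID 1 is 2
    (2, 1, 3.00),                             -- only one revision
    (3, 1, 2.00), (3, 2, 6.00), (3, 3, 7.00); -- latest revision for DevID 3 is 3

SELECT SUM([Hours]) AS TotalHours             -- 5.00 + 3.00 + 7.00 = 15.00
FROM (
    SELECT [Hours],
           ROW_NUMBER() OVER (PARTITION BY DevID ORDER BY Revision DESC) AS R
    FROM @Spec
) latest
WHERE R = 1;
```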

If you want one row per DevId, then that should be in the GROUP BY (and presumably in the SELECT as well):
SELECT s.DevId, SUM(s.Hours) as hours
FROM [dbo].[tblDev] d JOIN
[dbo].[tblSpec] s
ON [d].[DevID] = [s].[DevID]
WHERE s.Revision = (SELECT MAX(s2.Revision) FROM tblSpec s2 WHERE s2.DevID = s.DevID) -- correlated, so the maximum is taken per DevID
GROUP BY s.DevId;
Also, don't use WITH (NOLOCK) unless you really know what you are doing -- and I'm guessing you do not. It is basically a license that says: "You can get me data even if it is not 100% accurate."
I would also dispense with all the square brackets. They just make the query harder to write and to read.

SQL - select last and previous different to last

The problem: a simplified membership table containing membership id, starting date for each membership and membership level description:
CREATE TABLE cover
(
[membership_id] int,
[cover_from_date] date,
[description] varchar(57)
);
INSERT INTO cover ([membership_id], [cover_from_date], [description])
VALUES (1, '1/1/2011', 'AA'),
(1, '1/2/2011', 'BB'),
(1, '1/3/2011', 'CC'),
(1, '1/4/2011', 'CC');
The task: to list the current membership and the immediate previous membership different to the current one. So from the above table I would like to see something like:
1, 1/4/2011, CC, 1/2/2011, BB
The attempted solution: I have managed to come up with a solution but it takes an enormous time to run on a large database and I'm sure there are better ways of resolving this problem. My no-doubt over complicated query is as follows:
with cte as
(
select
cover.membership_id, cover.cover_from_date,
cover.description,
row_number() over (partition by cover.membership_id order by cover.cover_from_date desc) AS version_no
from
cover
)
select
cte.membership_id,
cover_now.cover_from_date, cover_now.description,
cover_prev.cover_from_date, cover_prev.description
from
cte
left outer join
cte cover_now on cte.membership_id = cover_now.membership_id
and cover_now.version_no = 1
left outer join
cte cover_prev on cte.membership_id = cover_prev.membership_id
and cover_prev.version_no = (select min(x.version_no)
from cte x
where x.version_no >= 2
and x.membership_id = cover_now.membership_id
and x.description <> cover_now.description)
group by
cte.membership_id, cover_now.cover_from_date, cover_now.description,
cover_prev.cover_from_date, cover_prev.description
The entire fiddle is located here. Any tips on how to optimise the query would be appreciated.

First create an index on membership_id and cover_from_date in descending order. It will be heavily used by this query.
create index cover_by_date on cover (membership_id asc, cover_from_date desc)
Then:
select
membership.membership_id,
membership.cover_from_date,
membership.description,
previous_membership.cover_from_date,
previous_membership.description
from
(
select membership_id, description, cover_from_date, row_number() over (partition by membership_id order by cover_from_date desc) as rank
from cover
) as membership
left join (
select previous.membership_id, previous.description, previous.cover_from_date, row_number() over (partition by previous.membership_id order by previous.cover_from_date desc) as rank
from cover
join cover as previous on
cover.membership_id = previous.membership_id and
cover.description <> previous.description and
cover.cover_from_date > previous.cover_from_date
) as previous_membership on
previous_membership.membership_id = membership.membership_id and
previous_membership.rank = 1
where
membership.rank = 1
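Another way to write it, offered only as a sketch: take the latest row per membership with ROW_NUMBER, then use OUTER APPLY with TOP 1 to fetch the most recent earlier row whose description differs. The CTE name latest and the output aliases prev_from_date and prev_description are made up for illustration. The lookup inside the APPLY is the kind of access the index above is meant to help with, but whether it is actually faster on a large table would need to be measured.

```sql
-- Sketch only; uses the cover table from the question.
WITH latest AS
(
    SELECT membership_id, cover_from_date, description,
           ROW_NUMBER() OVER (PARTITION BY membership_id
                              ORDER BY cover_from_date DESC) AS rn
    FROM cover
)
SELECT l.membership_id,
       l.cover_from_date,
       l.description,
       p.cover_from_date AS prev_from_date,
       p.description     AS prev_description
FROM latest l
OUTER APPLY
(
    -- most recent earlier cover with a different description, if any
    SELECT TOP 1 c.cover_from_date, c.description
    FROM cover c
    WHERE c.membership_id = l.membership_id
      AND c.cover_from_date < l.cover_from_date
      AND c.description <> l.description
    ORDER BY c.cover_from_date DESC
) p
WHERE l.rn = 1;
```

With the sample rows this should return the CC row dated 1/4/2011 together with the BB row dated 1/2/2011. Memberships with no different earlier description get NULLs in the last two columns.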

Find duplicates in MS SQL table

I know that this question has been asked several times but I still cannot figure out why my query is returning values which are not duplicates. I want my query to return only the records which have identical value in the column Credit. The query executes without any errors but values which are not duplicated are also being returned. This is my query:
Select
_bvGLTransactionsFull.AccountDesc,
_bvGLAccountsFinancial.Description,
_bvGLTransactionsFull.TxDate,
_bvGLTransactionsFull.Description,
_bvGLTransactionsFull.Credit,
_bvGLTransactionsFull.Reference,
_bvGLTransactionsFull.UserName
From
_bvGLAccountsFinancial Inner Join
_bvGLTransactionsFull On _bvGLAccountsFinancial.AccountLink =
_bvGLTransactionsFull.AccountLink
Where
_bvGLTransactionsFull.Credit
IN
(SELECT Credit AS NumOccurrences
FROM _bvGLTransactionsFull
GROUP BY Credit
HAVING (COUNT(Credit) > 1 ) )
Group By
_bvGLTransactionsFull.AccountDesc, _bvGLAccountsFinancial.Description,
_bvGLTransactionsFull.TxDate, _bvGLTransactionsFull.Description,
_bvGLTransactionsFull.Credit, _bvGLTransactionsFull.Reference,
_bvGLTransactionsFull.UserName, _bvGLAccountsFinancial.Master_Sub_Account,
IsNumeric(_bvGLTransactionsFull.Reference), _bvGLTransactionsFull.TrCode
Having
_bvGLTransactionsFull.TxDate > 01 / 11 / 2014 And
_bvGLTransactionsFull.Reference Like '5_____' And
_bvGLTransactionsFull.Credit > 0.01 And
_bvGLAccountsFinancial.Master_Sub_Account = '90210'

That's because you're matching on the credit field back to your table, which contains duplicates. You need to isolate the rows that are duplicated with ROW_NUMBER:
;WITH CTE AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY CREDIT ORDER BY (SELECT NULL)) AS RN
FROM _bvGLTransactionsFull)
Select
CTE.AccountDesc,
_bvGLAccountsFinancial.Description,
CTE.TxDate,
CTE.Description,
CTE.Credit,
CTE.Reference,
CTE.UserName
From
_bvGLAccountsFinancial Inner Join
CTE On _bvGLAccountsFinancial.AccountLink = CTE.AccountLink
WHERE CTE.RN > 1
Group By
CTE.AccountDesc, _bvGLAccountsFinancial.Description,
CTE.TxDate, CTE.Description,
CTE.Credit, CTE.Reference,
CTE.UserName, _bvGLAccountsFinancial.Master_Sub_Account,
IsNumeric(CTE.Reference), CTE.TrCode
Having
CTE.TxDate > 01 / 11 / 2014 And
CTE.Reference Like '5_____' And
CTE.Credit > 0.01 And
_bvGLAccountsFinancial.Master_Sub_Account = '90210'
Just as a side note, I would consider using aliases to shorten your queries and make them more readable. Prefixing the table name before each column in a join is very difficult to read.
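If the goal is to return every row whose Credit value occurs more than once (rather than only the second and later occurrences), a windowed COUNT is one option. This is a sketch, not a drop-in replacement: the CTE name Counted and the aliases CreditCount and AccountDescription are hypothetical, and it counts duplicates across the whole transactions table, as the original IN subquery does. Note also that 01 / 11 / 2014 without quotes is integer arithmetic (it evaluates to 0), not a date, so that filter is left out here rather than guessing the intended date.

```sql
-- Sketch only; table and column names are taken from the question.
WITH Counted AS
(
    SELECT t.*,
           COUNT(*) OVER (PARTITION BY t.Credit) AS CreditCount  -- rows sharing this Credit
    FROM _bvGLTransactionsFull t
)
SELECT c.AccountDesc,
       a.Description AS AccountDescription,
       c.TxDate,
       c.Description,
       c.Credit,
       c.Reference,
       c.UserName
FROM Counted c
JOIN _bvGLAccountsFinancial a ON a.AccountLink = c.AccountLink
WHERE c.CreditCount > 1              -- keep only Credit values that repeat
  AND c.Reference LIKE '5_____'
  AND c.Credit > 0.01
  AND a.Master_Sub_Account = '90210';
```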

I trust your code in terms of extracting all the data per your criteria, so let me take a different approach and use your script as-is. First, keep all the records in a temp table.
Select
_bvGLTransactionsFull.AccountDesc,
_bvGLAccountsFinancial.Description,
_bvGLTransactionsFull.TxDate,
_bvGLTransactionsFull.Description,
_bvGLTransactionsFull.Credit,
_bvGLTransactionsFull.Reference,
_bvGLTransactionsFull.UserName
-- temp table
INTO #tmpTable
From
_bvGLAccountsFinancial Inner Join
_bvGLTransactionsFull On _bvGLAccountsFinancial.AccountLink =
_bvGLTransactionsFull.AccountLink
Where
_bvGLTransactionsFull.Credit
IN
(SELECT Credit AS NumOccurrences
FROM _bvGLTransactionsFull
GROUP BY Credit
HAVING (COUNT(Credit) > 1 ) )
Group By
_bvGLTransactionsFull.AccountDesc, _bvGLAccountsFinancial.Description,
_bvGLTransactionsFull.TxDate, _bvGLTransactionsFull.Description,
_bvGLTransactionsFull.Credit, _bvGLTransactionsFull.Reference,
_bvGLTransactionsFull.UserName, _bvGLAccountsFinancial.Master_Sub_Account,
IsNumeric(_bvGLTransactionsFull.Reference), _bvGLTransactionsFull.TrCode
Having
_bvGLTransactionsFull.TxDate > 01 / 11 / 2014 And
_bvGLTransactionsFull.Reference Like '5_____' And
_bvGLTransactionsFull.Credit > 0.01 And
_bvGLAccountsFinancial.Master_Sub_Account = '90210'
Then drop the "single occurrence" rows by creating a row index per Credit and filtering out every row whose index is 1.
SELECT * FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY Credit ORDER BY Credit) AS rowIdx
, *
FROM #tmpTable) AS innerTmp
WHERE
rowIdx != 1
You can change the grouping column through PARTITION BY <column name>.
If you have any concerns, please raise them; this is how I have understood your case so far.
EDIT: To include all rows for those credits that have duplicates:
SELECT
tmp1.*
FROM #tmpTable tmp1
RIGHT JOIN (
SELECT
Credit
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY Credit ORDER BY Credit) AS rowIdx
, *
FROM #tmpTable) AS innerTmp
WHERE
rowIdx != 1
) AS tmp2
ON tmp1.Credit = tmp2.Credit

Better way to detect the number of consecutive records for a lot of different unique user IDs

So I'm currently waiting for my query to run (which takes about 5 minutes).
I have developed a query that will look at a student's attendance record, identify a run of Attendance Codes and group them based on date and class time.
This query works great if I'm trying to find the highest number of absences for a particular student, but becomes trickier when I try to create a table which shows the highest run of absences for all active students.
To achieve this I am using a cursor to assign the StudentNo (unique id) to a variable, then running my original query and placing the results into a table variable called @Results.
Here is my code:
DECLARE @StudentId INT
DECLARE @getStudentId CURSOR
DECLARE @Results TABLE(StudentNo INT,AttendanceCode VarChar(2),StartDate DateTime,EndDate DateTime,"# of Classes" INT)
SET @getStudentId = CURSOR FOR
SELECT StudentNo
FROM [dbo].[Students]
OPEN @getStudentId
FETCH NEXT
FROM @getStudentId INTO @StudentId
WHILE @@FETCH_STATUS = 0
BEGIN
WITH AttendanceCodeMaster AS
(SELECT
[dbo].[Students].StudentNo,
CAST(CONVERT(date,[dbo].[CourseOfferingSchedule].ClassDate,101) as DATETIME) + CAST(CONVERT(time,dbo.CourseOfferingSchedule.ClassStartTime,101) AS DATETIME) as ClassDate,
[dbo].[CourseOfferingAttendanceScheduled].AttendanceCode
FROM [dbo].[CourseOfferingAttendanceScheduled]
INNER JOIN [dbo].[Students] on [dbo].[CourseOfferingAttendanceScheduled].StudentNo = [dbo].[Students].StudentNo
INNER JOIN dbo.[CourseOfferingSchedule] on [dbo].[CourseOfferingAttendanceScheduled].ScheduleID = [dbo].[CourseOfferingSchedule].ScheduleID
INNER JOIN [dbo].[StudentStatus] on [dbo].[Students].StudentStatusID = [dbo].[StudentStatus].StudentStatusID
where
[dbo].[Students].StudentNo = @StudentId and StudentStatus = 'Active' and Complete = 'Y'
),
RunGroup AS
(SELECT StudentNo,ClassDate, AttendanceCode, (SELECT COUNT(*) From AttendanceCodeMaster as G WHERE G.AttendanceCode <> GR.AttendanceCode AND G.ClassDate <= GR.ClassDate) as RunGroup
FROM AttendanceCodeMaster as GR ),
AbsenceStreaks AS
(SELECT
StudentNo,
AttendanceCode,
MIN(ClassDate) as StartDate,
MAX(ClassDate)as EndDate,
COUNT(*) as '# of Classes'
FROM RunGroup
where AttendanceCode = 'A'
GROUP BY StudentNo,AttendanceCode, RunGroup),
LongestStreak AS
(SELECT TOP 1 * FROM AbsenceStreaks
Order BY '# of Classes' Desc)
INSERT INTO @Results SELECT * FROM LongestStreak
FETCH NEXT
FROM @getStudentId INTO @StudentId
END
CLOSE @getStudentId
DEALLOCATE @getStudentId
SELECT * from @Results
where "# of Classes" >= 30
order by StudentNo

You do not need a cursor to solve this problem. The following may be making some assumptions on field contents and names (because it is hard to follow your query logic), but it should give you the right approach.
The key is that a string of absences can be recognized by enumerating all class days for a student (in a class) and then enumerating all class days for a student by whether or not they attended. The difference between these values is constant, for a sequence of absences or presences.
select studentNo, grp, AttendanceCode, count(*) as numInRow,
min(ClassDate) as DateStart, max(ClassDate) as DateEnd
from (select acm.*,
(row_number() over (partition by StudentNo order by ClassDate) -
row_number() over (partition by StudentNo, AttendanceCode order by ClassDate)
) as grp
from AttendanceCodeMaster acm
) acm
group by studentNo, grp, AttendanceCode;
(I'm not sure if there is a class code in there somewhere as well.)
This should give you the information you want to find sequences of absences or presences.
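To go from there to the longest absence streak per student without a cursor, something along these lines might work. It is a self-contained sketch: the table variable @Attendance and its rows are invented for illustration, standing in for whatever AttendanceCodeMaster returns for all students, and the CTE names are hypothetical.

```sql
-- Hypothetical sample data standing in for AttendanceCodeMaster.
DECLARE @Attendance TABLE (StudentNo int, ClassDate datetime, AttendanceCode varchar(2));
INSERT INTO @Attendance (StudentNo, ClassDate, AttendanceCode) VALUES
    (1, '2014-01-06', 'P'), (1, '2014-01-07', 'A'), (1, '2014-01-08', 'A'),
    (1, '2014-01-09', 'P'), (1, '2014-01-10', 'A'),
    (2, '2014-01-06', 'A'), (2, '2014-01-07', 'P');

WITH Grouped AS
(
    -- the difference of the two row numbers is constant within a run of one code
    SELECT StudentNo, ClassDate, AttendanceCode,
           ROW_NUMBER() OVER (PARTITION BY StudentNo ORDER BY ClassDate)
         - ROW_NUMBER() OVER (PARTITION BY StudentNo, AttendanceCode ORDER BY ClassDate) AS grp
    FROM @Attendance
),
Streaks AS
(
    SELECT StudentNo,
           MIN(ClassDate) AS StartDate,
           MAX(ClassDate) AS EndDate,
           COUNT(*)       AS NumClasses
    FROM Grouped
    WHERE AttendanceCode = 'A'
    GROUP BY StudentNo, grp
),
Ranked AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY StudentNo
                              ORDER BY NumClasses DESC, StartDate) AS rn
    FROM Streaks
)
SELECT StudentNo, StartDate, EndDate, NumClasses
FROM Ranked
WHERE rn = 1          -- longest absence streak per student
ORDER BY StudentNo;
```

With the sample rows, student 1's longest streak is the two classes on 7 and 8 January, and student 2's is the single class on 6 January. A filter such as NumClasses >= 30 can be added to the final SELECT to mirror the original report.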

sql max/min query and data transformation

update: changed one time to show that the times per shipment may not be in sequential order always.
here is my input
create table test
(
shipment_id int,
stop_seq tinyint,
time datetime
)
insert into test values (1,1,'2009-8-10 8:00:00')
insert into test values (1,2,'2009-8-10 9:00:00')
insert into test values (1,3,'2009-8-10 10:00:00')
insert into test values (2,1,'2009-8-10 13:00:00')
insert into test values (2,2,'2009-8-10 14:00:00')
insert into test values (2,3,'2009-8-10 20:00:00')
insert into test values (2,4,'2009-8-10 18:00:00')
the output that i want is below
shipment_id start end
----------- ----- ---
1 8:00 10:00
2 13:00 18:00
i need to take the time from the min(stop) row for each shipment and the time from the max(stop) row and place in start/end respectively. i know this can be done with multiple queries rather easily but i am looking to see if a single select query can do this.
thanks!

I think the only way you'll be able to do it is with sub-queries.
SELECT shipment_id
, (SELECT TOP 1 time
FROM test AS [b]
WHERE b.shipment_id = a.shipment_id
AND b.stop_seq = MIN(a.stop_seq)) AS [start]
, (SELECT TOP 1 time
FROM test AS [b]
WHERE b.shipment_id = a.shipment_id
AND b.stop_seq = MAX(a.stop_seq)) AS [end]
FROM test AS [a]
GROUP BY shipment_id
You'll need to use the DATEPART function to chop up the time column to get your exact output.
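DATEPART would work, but as an alternative sketch, CONVERT with style 108 (hh:mi:ss) truncated to five characters gives hh:mi directly. The derived-table alias r and the column names rn_up and rn_down are made up for illustration; the rest uses the question's test table. Note that this yields 08:00 rather than 8:00 for the first shipment.

```sql
-- Sketch only; formats the first and last stop's time per shipment as hh:mi.
SELECT shipment_id,
       CONVERT(varchar(5), MAX(CASE WHEN rn_up   = 1 THEN [time] END), 108) AS [start],
       CONVERT(varchar(5), MAX(CASE WHEN rn_down = 1 THEN [time] END), 108) AS [end]
FROM (
    SELECT shipment_id, [time],
           ROW_NUMBER() OVER (PARTITION BY shipment_id ORDER BY stop_seq)      AS rn_up,
           ROW_NUMBER() OVER (PARTITION BY shipment_id ORDER BY stop_seq DESC) AS rn_down
    FROM test
) r
GROUP BY shipment_id
ORDER BY shipment_id;
```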

Use a Common Table Expression (CTE) - this works (at least on my SQL Server 2008 test system):
WITH SeqMinMax(SeqID, MinID, MaxID) AS
(
SELECT Shipment_ID, MIN(stop_seq), MAX(stop_seq)
FROM test
GROUP BY Shipment_ID
)
SELECT
SeqID 'Shipment_ID',
(SELECT TIME FROM test
WHERE shipment_id = smm.seqid AND stop_seq = smm.minid) 'Start',
(SELECT TIME FROM test
WHERE shipment_id = smm.seqid AND stop_seq = smm.maxid) 'End'
FROM seqminmax smm
The SeqMinMax CTE selects the min and max "stop_seq" values for each "shipment_id", and the rest of the query then builds on those values to retrieve the associated times from the table "test".
CTEs are supported from SQL Server 2005 onwards (and are a SQL:2003 standard feature, not a Microsoft "invention", really).
Marc

Am I correct in thinking that you want the first time rather than the 'min' time, and the last time in the sequence rather than the 'max' time?

SELECT C.shipment_id, C.start, B2.time AS stop FROM
(
SELECT A.shipment_id, B1.time AS start, A.max_stop_seq FROM
(
SELECT shipment_id, MIN(stop_seq) as min_stop_seq, MAX(stop_seq) as max_stop_seq
FROM test
GROUP BY shipment_id
) AS A
INNER JOIN
(
SELECT shipment_id, stop_seq, time FROM test
) AS B1
ON A.shipment_id = B1.shipment_id AND A.min_stop_seq = B1.stop_seq
) AS C
INNER JOIN
(
SELECT shipment_id, stop_seq, time FROM test
) AS B2
ON C.shipment_id = B2.shipment_id AND C.max_stop_seq = B2.stop_seq

select t1.shipment_id, t1.time start, t2.time [end]
from (
select shipment_id, min(stop_seq) min, max(stop_seq) max
from test
group by shipment_id
) a
inner join test t1 on a.shipment_id = t1.shipment_id and a.min = t1.stop_seq
inner join test t2 on a.shipment_id = t2.shipment_id and a.max = t2.stop_seq

I suggest you take advantage of row_number and PIVOT. This may look messy, but I think it will perform well, and it's more adaptable to various assumptions. For example, it doesn't assume that the latest datetime value corresponds to the largest stop_seq value for a given shipment.
with test_ranked(shipment_id,stop_seq,time,rankup,rankdown) as (
select
shipment_id, stop_seq, time,
row_number() over (
partition by shipment_id
order by stop_seq
),
row_number() over (
partition by shipment_id
order by stop_seq desc
)
from test
), test_extreme_times(shipment_id,tag,time) as (
select
shipment_id, 'start', time
from test_ranked where rankup = 1
union all
select
shipment_id, 'end', time
from test_ranked where rankdown = 1
)
select
shipment_id, [start], [end]
from test_extreme_times
pivot (max(time) for tag in ([start],[end])) P
order by shipment_id;
go
The PIVOT isn't really needed, but it's handy. However, do note that the MAX inside the PIVOT expression doesn't do anything useful. There's only one [time] value for each tag, so MIN would work just as well. The syntax requires an aggregate function in this position.
Addendum: Here's an adaptation of CptSkippy's solution that may be more efficient than using MIN and MAX if you have a shipments table:
SELECT shipment_id
, (SELECT TOP 1 time
FROM test AS [b]
WHERE b.shipment_id = a.shipment_id
ORDER BY stop_seq ASC) AS [start]
, (SELECT TOP 1 time
FROM test AS [b]
WHERE b.shipment_id = a.shipment_id
ORDER BY stop_seq DESC) AS [end]
FROM shipments_table AS [a];
