Compare two rows in SQL Server and return only one row

I have a table (trips) that has response data with columns:
TripDate
Job
Address
DispatchDateTime
OnSceneDateTime
Vehicle
Often two vehicles will respond to the same address on the same date, and I need to find the one that was there first.
I've tried this:
SELECT
TripDate,
Job,
Vehicle,
DispatchDateTime,
(SELECT min(OnSceneDateTime)
FROM Trips AS FirstOnScene
WHERE AllTrips.TripDate = FirstOnScene.TripDate
AND AllTrips.Address = FirstOnScene.Address) AS FirstOnScene
FROM
Trips AS AllTrips
But I still get both records returned, and both have the same FirstOnScene time.
How do I get only the record for the vehicle that arrived first, with its DispatchDateTime and OnSceneDateTime, and not the row of the trip that was on scene second?
Here are a few example rows from the table:
2016-01-01 0169-a 150 Main St 2016-01-01 16:52 2016-01-01 16:59 Truck 1
2016-01-01 0171-a 150 Main St 2016-01-01 16:53 2016-01-01 17:05 Truck 2
2016-01-01 0190-a 29 Spring St 2016-01-01 17:19 2016-01-01 17:30 Truck 5
2016-01-02 0111-a 8 Fist St 2016-01-02 09:30 2016-01-02 09:40 Truck 1
2016-01-02 0112-a 8 Fist St 2016-01-02 09:32 2016-01-02 09:38 Truck 2
In the above examples I need to return the first, third, and last row of that data set.

Here is a shot in the dark based on the sparse information provided. I don't really know what defines a given incident, so I have assumed it is the combination of TripDate and Address; adjust the partition accordingly.
with sortedValues as
(
select TripDate
, Job
, Vehicle
, DispatchDateTime
, OnSceneDateTime
, ROW_NUMBER() over(partition by TripDate, Address order by OnSceneDateTime) as RowNum
from Trips
)
select TripDate
, Job
, Vehicle
, DispatchDateTime
, OnSceneDateTime
from sortedValues
where RowNum = 1
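To check the ROW_NUMBER approach against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) only stands in for SQL Server here, and treating TripDate plus Address as the definition of an incident is an assumption.

```python
import sqlite3

# Sample rows from the question:
# TripDate, Job, Address, DispatchDateTime, OnSceneDateTime, Vehicle
rows = [
    ("2016-01-01", "0169-a", "150 Main St", "2016-01-01 16:52", "2016-01-01 16:59", "Truck 1"),
    ("2016-01-01", "0171-a", "150 Main St", "2016-01-01 16:53", "2016-01-01 17:05", "Truck 2"),
    ("2016-01-01", "0190-a", "29 Spring St", "2016-01-01 17:19", "2016-01-01 17:30", "Truck 5"),
    ("2016-01-02", "0111-a", "8 Fist St", "2016-01-02 09:30", "2016-01-02 09:40", "Truck 1"),
    ("2016-01-02", "0112-a", "8 Fist St", "2016-01-02 09:32", "2016-01-02 09:38", "Truck 2"),
]

# Assumption: one incident = one (TripDate, Address) pair.
# Ascending order puts the earliest OnSceneDateTime at RowNum = 1.
QUERY = """
WITH sortedValues AS (
    SELECT TripDate, Job, Vehicle, DispatchDateTime, OnSceneDateTime,
           ROW_NUMBER() OVER (PARTITION BY TripDate, Address
                              ORDER BY OnSceneDateTime) AS RowNum
    FROM Trips
)
SELECT TripDate, Job, Vehicle, DispatchDateTime, OnSceneDateTime
FROM sortedValues
WHERE RowNum = 1
ORDER BY TripDate, Job
"""


def first_on_scene(trip_rows):
    """Return one row per incident: the trip that arrived on scene first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Trips (TripDate TEXT, Job TEXT, Address TEXT, "
        "DispatchDateTime TEXT, OnSceneDateTime TEXT, Vehicle TEXT)"
    )
    conn.executemany("INSERT INTO Trips VALUES (?, ?, ?, ?, ?, ?)", trip_rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


first = first_on_scene(rows)
for row in first:
    print(row)
```

With the sample data this keeps jobs 0169-a, 0190-a and 0112-a, which are the first, third, and last rows the question asks for.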

You can just filter the rows down by selecting only the MIN OnSceneDateTime like below:
SELECT TripDate, Job, Vehicle, DispatchDateTime,OnSceneDateTime FirstOnScene
FROM Trips as AllTrips
WHERE AllTrips.OnSceneDateTime = (SELECT MIN(OnSceneDateTime)
FROM Trips as FirstOnScene
WHERE AllTrips.TripDate = FirstOnScene.TripDate
and AllTrips.Address = FirstOnScene.Address
)

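How about using an ORDER BY on OnSceneDateTime and then keeping only the first row? SQL Server uses TOP rather than LIMIT. Note that this returns a single row for the whole result, so it only answers the question once you have filtered to one address and date. A simplified version:
SELECT TOP 1 TripDate, Job, Vehicle, DispatchDateTime, OnSceneDateTime FROM trips ORDER BY OnSceneDateTime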

Related

Expanding/changing my query to find more entries using (potentially) IFELSE

My question will use this dataset as an example. I have a query setup (I have changed variables to more generic variables for the sake of posting this on the internet so the query may not make perfect sense) that picks the most recent date for a given account. So the query returns values with a reason_type of 1 with the most recent date. This query has effective_date set to is not null.
account date effective_date value reason_type
123456 4/20/2017 5/1/2017 5 1
123456 1/20/2017 2/1/2017 10 1
987654 2/5/2018 3/1/2018 15 1
987654 12/31/2017 2/1/2018 20 1
456789 4/27/2018 5/1/2018 50 1
456789 1/24/2018 2/1/2018 60 1
456123 4/25/2017 null 15 2
789123 5/1/2017 null 16 2
666888 2/1/2018 null 31 2
333222 1/1/2018 null 20 2
What I am looking to do now is to basically use that logic to only apply to reason_type
if there is an entry for it, otherwise have it default to reason_type
I think I should be using an IFELSE, but I'm admittedly not knowledgeable about how I would go about that.
Here is the code that I currently have to return the reason_type 1s most recent entry.
I hope my question is clear.
SELECT account, date, effective_date, value, reason_type
from
(
SELECT account, date, effective_date, value, reason_type,
ROW_NUMBER() over (partition by account order by date desc) rn
from mytable
WHERE value is not null
AND effective_date is not null
)
WHERE rn =1
I think you might want something like this (do you really have a column named date by the way? That seems like a bad idea):
SELECT account, date, effective_date, value, reason_type
FROM (
SELECT account, date, effective_date, value, reason_type
, ROW_NUMBER() OVER ( PARTITION BY account ORDER BY date DESC ) AS rn
FROM mytable
WHERE value IS NOT NULL
) WHERE rn = 1
-- effective_date IS NULL or is on or before today's date
AND ( effective_date IS NULL OR effective_date < TRUNC(SYSDATE+1) );
Hope this helps.

How to group consecutive rows together in SQL by multiple columns

I have rows in a query that return something like:
Date User Time Location Service Count
1/1/2018 Nick 12:00 Location A X 1
1/1/2018 Nick 12:01 Location A Y 1
1/1/2018 John 12:02 Location B Z 1
1/1/2018 Harry 12:03 Location A X 1
1/1/2018 Harry 12:04 Location A X 1
1/1/2018 Harry 12:05 Location B Y 1
1/1/2018 Harry 12:06 Location B X 1
1/1/2018 Nick 12:07 Location A X 1
1/1/2018 Nick 12:08 Location A Y 1
where the query returns locations visited by a user and a count of picks done at each location. Results are sorted by user and time ascending. I need to group CONSECUTIVE rows with the same User and Location, with a SUM of the Count column and a comma-separated list of the unique values in the Service column. The final result should look something like this:
Date User Start Time End Time Location Service Count
1/1/2018 Nick 12:00 12:01 Location A X,Y 2
1/1/2018 John 12:02 12:02 Location B Z 1
1/1/2018 Harry 12:03 12:04 Location A X 2
1/1/2018 Harry 12:05 12:06 Location B X,Y 2
1/1/2018 Nick 12:07 12:08 Location A X,Y 2
I'm not sure where to start. Maybe lag or partition clauses? hoping an SQL guru can help here...
This is a gaps and islands problem. One method for solving it uses row_number():
select Date, User, min(Time) as start_time, max(time) as end_time,
Location,
listagg(Service, ',') within group (order by service),
count(*) as cnt
from (select t.*,
row_number() over (partition by date order by time) as seqnum,
row_number() over (partition by user, date, location order by time) as seqnum_2
from t
) t
group by Date, User, Location, (seqnum - seqnum_2);
It is a bit tricky to explain how this works. My suggestion is to run the subquery and you will see how the difference of row numbers defines the groups that you are looking for.
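To see the difference of row numbers in action, here is a small sketch that runs the subquery on the sample rows using Python's built-in sqlite3 module. SQLite (3.25 or later) only stands in for the real database, so GROUP_CONCAT replaces listagg, and the column names visit_date, user_name, visit_time and pick_count are hypothetical renames chosen to avoid keyword clashes.

```python
import sqlite3

# Sample rows from the question: date, user, time, location, service, count
visits = [
    ("2018-01-01", "Nick", "12:00", "Location A", "X", 1),
    ("2018-01-01", "Nick", "12:01", "Location A", "Y", 1),
    ("2018-01-01", "John", "12:02", "Location B", "Z", 1),
    ("2018-01-01", "Harry", "12:03", "Location A", "X", 1),
    ("2018-01-01", "Harry", "12:04", "Location A", "X", 1),
    ("2018-01-01", "Harry", "12:05", "Location B", "Y", 1),
    ("2018-01-01", "Harry", "12:06", "Location B", "X", 1),
    ("2018-01-01", "Nick", "12:07", "Location A", "X", 1),
    ("2018-01-01", "Nick", "12:08", "Location A", "Y", 1),
]

# seqnum counts every row of the day; seqnum_2 counts rows per user/location.
# Their difference stays constant only while the rows are consecutive.
SUBQUERY = """
SELECT visit_date, user_name, visit_time, location, service, pick_count,
       ROW_NUMBER() OVER (PARTITION BY visit_date ORDER BY visit_time) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY user_name, visit_date, location
                          ORDER BY visit_time) AS seqnum_2
FROM t
"""

# Group on the difference to collapse each island into one row.
GROUPED = f"""
SELECT user_name, MIN(visit_time), MAX(visit_time), location,
       GROUP_CONCAT(DISTINCT service), SUM(pick_count)
FROM ({SUBQUERY}) AS s
GROUP BY visit_date, user_name, location, seqnum - seqnum_2
ORDER BY MIN(visit_time)
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (visit_date TEXT, user_name TEXT, visit_time TEXT, "
    "location TEXT, service TEXT, pick_count INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", visits)

# Difference of the two row numbers, in time order.
diffs = [
    r[6] - r[7]
    for r in conn.execute(SUBQUERY + " ORDER BY visit_time").fetchall()
]
islands = conn.execute(GROUPED).fetchall()
conn.close()

print(diffs)
for row in islands:
    print(row)
```

The printed differences change exactly where a new run of the same user and location begins. Two different users can share the same difference, which is why user and location also appear in the GROUP BY.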
Use lag to get the user and location values of the previous row. Then use a running sum to start a new group whenever the user or location changes. Finally, aggregate on the classified groups, user, location and date.
select Date, User, min(Time) as start_time, max(Time) as end_time, Location,
listagg(Service, ',') within group (order by Service) as services,
count(*) as cnt
from (select Date, User, Time, Location, Service,
sum(case when prev_location = Location and prev_user = User then 0 else 1 end) over(order by Date, Time) as grp
from (select Date, User, Time, Location, Service,
lag(Location) over(order by Date, Time) as prev_location,
lag(User) over(order by Date, Time) as prev_user
from t
) t
) t
group by Date, User, Location, grp;

Advanced Sql query solution required

player team start_date end_date points
John Jacob SportsBallers 2015-01-01 2015-03-31 100
John Jacob SportsKings 2015-04-01 2015-12-01 115
Joe Smith PointScorers 2014-01-01 2016-12-31 125
Bill Johnson SportsKings 2015-01-01 2015-06-31 175
Bill Johnson AllStarTeam 2015-07-01 2016-12-31 200
The above table has many more rows. I was asked the below questions in an interview.
1.) For each player, which team did they play for on 2015-01-01?
I could not answer this one.
2.) For each player, how can we get the team for which they scored the most points?
select team from Players
where points in (select max(points) from players group by player)
Please provide solutions for both.
1
select *
from PlayerTeams
where start_date <= '2015-01-01' and end_date >= '2015-01-01'
2
Select player, team, points
from(
Select *, row_number() over (partition by player order by points desc) as rn
From PlayerTeams) as ranked
where rn = 1
For #1:
Select Player
,Team
From table
Where '2015-01-01' between start_date and end_date
For #2:
select t.Player
,t.Team
from table t
inner join (select Player
,Max(points) as points
from table
group by Player) m
on t.Player = m.Player
and t.points = m.points
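As a quick check of both answers, the sketch below runs them against the five sample rows in an in-memory SQLite database via Python. The table name PlayerTeams is borrowed from the first answer and is an assumption, and Bill Johnson's end date is entered as 2015-06-30 because June has only 30 days.

```python
import sqlite3

# Sample rows from the question: player, team, start_date, end_date, points
players = [
    ("John Jacob", "SportsBallers", "2015-01-01", "2015-03-31", 100),
    ("John Jacob", "SportsKings", "2015-04-01", "2015-12-01", 115),
    ("Joe Smith", "PointScorers", "2014-01-01", "2016-12-31", 125),
    # The question shows 2015-06-31, which is not a real date; 06-30 assumed.
    ("Bill Johnson", "SportsKings", "2015-01-01", "2015-06-30", 175),
    ("Bill Johnson", "AllStarTeam", "2015-07-01", "2016-12-31", 200),
]

# Question 1: the stint whose date range contains 2015-01-01.
TEAM_ON_DATE = """
SELECT player, team
FROM PlayerTeams
WHERE '2015-01-01' BETWEEN start_date AND end_date
ORDER BY player
"""

# Question 2: join each row to the player's maximum points.
# The MAX column needs an alias so the join condition can refer to it.
BEST_TEAM = """
SELECT t.player, t.team, t.points
FROM PlayerTeams AS t
INNER JOIN (SELECT player, MAX(points) AS points
            FROM PlayerTeams
            GROUP BY player) AS m
        ON t.player = m.player
       AND t.points = m.points
ORDER BY t.player
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PlayerTeams (player TEXT, team TEXT, "
    "start_date TEXT, end_date TEXT, points INTEGER)"
)
conn.executemany("INSERT INTO PlayerTeams VALUES (?, ?, ?, ?, ?)", players)
team_on_date = conn.execute(TEAM_ON_DATE).fetchall()
best_team = conn.execute(BEST_TEAM).fetchall()
conn.close()

print(team_on_date)
print(best_team)
```

Dates are stored as ISO-formatted text here, so plain string comparison orders them correctly. The join version returns every tied team if a player scored the same maximum for two teams, whereas a row_number() version would keep only one.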

SQL JOIN - retrieve MAX DateTime from second table and the first DateTime after previous MAX for other value

I have issue with creating a proper SQL expression.
I have table TICKET with column TICKETID
TICKETID
1000
1001
I then have table STATUSHISTORY, from which I need to retrieve the last (maximum) time that the ticket entered VENDOR status and the time it exited that status. By exiting VENDOR status I mean the first INPROG status after that last VENDOR status; the next status after VENDOR is always INPROG. It is also possible that a ticket has no VENDOR status at all in STATUSHISTORY (then nulls should be returned), but INPROG always exists. It can occur before VENDOR and also after it, once the ticket is no longer in VENDOR status.
Here is the example of STATUSHISTORY.
ID TICKETID STATUS DATETIME
1 1000 INPROG 01.01.2017 10:00
2 1000 VENDOR 02.01.2017 10:00
3 1000 INPROG 03.01.2017 10:00
4 1000 VENDOR 04.01.2017 10:00
5 1000 INPROG 05.01.2017 10:00
6 1000 HOLD 06.01.2017 10:00
7 1000 INPROG 07.01.2017 10:00
8 1001 INPROG 02.02.2017 10:00
9 1001 VENDOR 03.02.2017 10:00
10 1001 INPROG 04.02.2017 10:00
11 1001 VENDOR 05.02.2017 10:00
So the result when doing the query from TICKET table and doing the JOIN with table STATUSHISTORY should be:
ID VENDOR_ENTERED VENDOR_EXITED
1000 04.01.2017 10:00 05.01.2017 10:00
1001 05.02.2017 10:00 null
Because for ID 1000 last VENDOR status was at 04.01.2017 and the first INPROG status after the VENDOR status for that ID was at 05.01.2017 while for ID 1001 the last VENDOR status was at 05.02.2017 and after that INPROG status did not happen yet.
If VENDOR did not exist then both columns should be null in result.
I am really stuck with this, trying different JOINs but without any progress.
Thank you in advance if you can help me.
You can do this with window functions. First, assign a "vendor" group to the tickets. You can do this using a cumulative sum counting the number of "vendor" records on or before each record.
Then, aggregate the records to get one record per "vendor" group. And use row numbers to get the most recent records. So:
with vg as (
select ticketid,
min(datetime) as vendor_entered,
min(case when status = 'INPROG' then datetime end) as vendor_exited
from (select sh.*,
sum(case when status = 'VENDOR' then 1 else 0 end) over (partition by ticketid order by datetime) as grp
from statushistory sh
) sh
group by ticketid, grp
)
select vg.ticketid, vg.vendor_entered, vg.vendor_exited
from (select vg.*,
row_number() over (partition by ticketid order by vendor_entered desc) as seqnum
from vg
) vg
where seqnum = 1;
You can aggregate to get max time, then join onto all of the date values higher than that time, and then re-aggregate:
select a.TicketID,
a.VENDOR_ENTERED,
min( EXIT_TIME ) as VENDOR_EXITED
from (
select TicketID,
max( DATETIME ) as VENDOR_ENTERED
from StatusHistory
where Status = 'VENDOR'
group by TicketID
) as a
left join
(
select TicketID,
DATETIME as EXIT_TIME
from StatusHistory
where Status = 'INPROG'
) as b
on a.TicketID = b.TicketID
and EXIT_TIME >= a.VENDOR_ENTERED
group by a.TicketID,
a.VENDOR_ENTERED
DB2 is not supported in SQLfiddle, but a standard SQL example can be found here.
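The aggregate-then-join idea can be tried on the sample data with the sketch below, which uses Python's sqlite3 module as a stand-in for DB2. Several things here are assumptions: the dates are read as dd.mm.yyyy and stored as ISO text, the DATETIME column is renamed to the hypothetical STATUS_TIME, the query starts from TICKET with left joins so that tickets without a VENDOR row still appear, and ticket 1002 is a made-up row added only to exercise that case.

```python
import sqlite3

# STATUSHISTORY rows from the question: ID, TICKETID, STATUS, time
history = [
    (1, 1000, "INPROG", "2017-01-01 10:00"),
    (2, 1000, "VENDOR", "2017-01-02 10:00"),
    (3, 1000, "INPROG", "2017-01-03 10:00"),
    (4, 1000, "VENDOR", "2017-01-04 10:00"),
    (5, 1000, "INPROG", "2017-01-05 10:00"),
    (6, 1000, "HOLD", "2017-01-06 10:00"),
    (7, 1000, "INPROG", "2017-01-07 10:00"),
    (8, 1001, "INPROG", "2017-02-02 10:00"),
    (9, 1001, "VENDOR", "2017-02-03 10:00"),
    (10, 1001, "INPROG", "2017-02-04 10:00"),
    (11, 1001, "VENDOR", "2017-02-05 10:00"),
    # Hypothetical ticket that never entered VENDOR (not in the question).
    (12, 1002, "INPROG", "2017-03-01 10:00"),
]

# Step 1 (v): last VENDOR time per ticket.
# Step 2 (b): INPROG rows strictly after that time; MIN picks the first one.
QUERY = """
SELECT t.TICKETID,
       v.VENDOR_ENTERED,
       MIN(b.STATUS_TIME) AS VENDOR_EXITED
FROM TICKET AS t
LEFT JOIN (SELECT TICKETID, MAX(STATUS_TIME) AS VENDOR_ENTERED
           FROM STATUSHISTORY
           WHERE STATUS = 'VENDOR'
           GROUP BY TICKETID) AS v
       ON v.TICKETID = t.TICKETID
LEFT JOIN STATUSHISTORY AS b
       ON b.TICKETID = t.TICKETID
      AND b.STATUS = 'INPROG'
      AND b.STATUS_TIME > v.VENDOR_ENTERED
GROUP BY t.TICKETID, v.VENDOR_ENTERED
ORDER BY t.TICKETID
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TICKET (TICKETID INTEGER)")
conn.execute(
    "CREATE TABLE STATUSHISTORY (ID INTEGER, TICKETID INTEGER, "
    "STATUS TEXT, STATUS_TIME TEXT)"
)
conn.executemany("INSERT INTO TICKET VALUES (?)", [(1000,), (1001,), (1002,)])
conn.executemany("INSERT INTO STATUSHISTORY VALUES (?, ?, ?, ?)", history)
vendor_times = conn.execute(QUERY).fetchall()
conn.close()

for row in vendor_times:
    print(row)
```

Because a comparison with NULL never matches, a ticket with no VENDOR row gets no INPROG rows joined to it, so both of its columns come back as nulls.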

Get MAX count but keep the repeated calculated value if highest

I have the following table, I am using SQL Server 2008
BayNo FixDateTime FixType
1 04/05/2015 16:15:00 tyre change
1 12/05/2015 00:15:00 oil change
1 12/05/2015 08:15:00 engine tuning
1 04/05/2016 08:11:00 car tuning
2 13/05/2015 19:30:00 puncture
2 14/05/2015 08:00:00 light repair
2 15/05/2015 10:30:00 super op
2 20/05/2015 12:30:00 wiper change
2 12/05/2016 09:30:00 denting
2 12/05/2016 10:30:00 wiper repair
2 12/06/2016 10:30:00 exhaust repair
4 12/05/2016 05:30:00 stereo unlock
4 17/05/2016 15:05:00 door handle repair
On any given day I need to find the highest number of fixes made in a given bay, and if that highest count is repeated on another day, that day should also appear in the result set.
I would like to see the result set as follows:
BayNo FixDateTime noOfFixes
1 12/05/2015 00:15:00 2
2 12/05/2016 09:30:00 2
4 12/05/2016 05:30:00 1
4 17/05/2016 15:05:00 1
I managed to get the counts for each day, but I am struggling to get the max while keeping repeated occurrences of the highest value. Can someone help, please?
Use window functions.
Get the count for each day by bayno and also find the min fixdatetime for each day per bayno.
Then use dense_rank to compute the highest ranked row for each bayno based on the number of fixes.
Finally get the highest ranked rows.
select distinct bayno,minfixdatetime,no_of_fixes
from (
select bayno,minfixdatetime,no_of_fixes
,dense_rank() over(partition by bayno order by no_of_fixes desc) rnk
from (
select t.*,
count(*) over(partition by bayno,cast(fixdatetime as date)) no_of_fixes,
min(fixdatetime) over(partition by bayno,cast(fixdatetime as date)) minfixdatetime
from tablename t
) x
) y
where rnk = 1
You are looking for rank() or dense_rank(). I would write the query like this:
select bayno, thedate, numFixes
from (select bayno, cast(fixdatetime as date) as thedate,
count(*) as numFixes,
rank() over (partition by bayno order by count(*) desc) as seqnum
from t
group by bayno, cast(fixdatetime as date)
) b
where seqnum = 1;
Note that this returns the date in question. The date does not have a time component.
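The ranking approach can be checked against the sample table with the sketch below, which uses Python's sqlite3 module (SQLite 3.25 or later) in place of SQL Server 2008. The table name fixes is hypothetical, the dates are assumed to be dd/mm/yyyy and are stored as ISO text, and SQLite's date() function plays the role of cast(... as date).

```python
import sqlite3

# Sample rows from the question: BayNo, FixDateTime, FixType
fixes = [
    (1, "2015-05-04 16:15:00", "tyre change"),
    (1, "2015-05-12 00:15:00", "oil change"),
    (1, "2015-05-12 08:15:00", "engine tuning"),
    (1, "2016-05-04 08:11:00", "car tuning"),
    (2, "2015-05-13 19:30:00", "puncture"),
    (2, "2015-05-14 08:00:00", "light repair"),
    (2, "2015-05-15 10:30:00", "super op"),
    (2, "2015-05-20 12:30:00", "wiper change"),
    (2, "2016-05-12 09:30:00", "denting"),
    (2, "2016-05-12 10:30:00", "wiper repair"),
    (2, "2016-06-12 10:30:00", "exhaust repair"),
    (4, "2016-05-12 05:30:00", "stereo unlock"),
    (4, "2016-05-17 15:05:00", "door handle repair"),
]

# Inner query: one row per bay and day with its count and earliest fix time.
# Middle query: rank the days within each bay by count, ties share a rank.
# Outer query: keep every day tied for the highest count.
QUERY = """
SELECT bayno, first_fix, no_of_fixes
FROM (SELECT bayno, first_fix, no_of_fixes,
             DENSE_RANK() OVER (PARTITION BY bayno
                                ORDER BY no_of_fixes DESC) AS rnk
      FROM (SELECT bayno,
                   MIN(fixdatetime) AS first_fix,
                   COUNT(*) AS no_of_fixes
            FROM fixes
            GROUP BY bayno, date(fixdatetime)) AS per_day
     ) AS ranked
WHERE rnk = 1
ORDER BY bayno, first_fix
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fixes (bayno INTEGER, fixdatetime TEXT, fixtype TEXT)")
conn.executemany("INSERT INTO fixes VALUES (?, ?, ?)", fixes)
busiest_days = conn.execute(QUERY).fetchall()
conn.close()

for row in busiest_days:
    print(row)
```

Bay 4 appears twice because both of its days tie at one fix each, which matches the result set the question asks for.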