SQL: Group By on Consecutive Records Part 2 - sql

Problem 1
I have the following table which shows the location of a person at 1 hour intervals
Id EntityID EntityName LocationID Timex delta
1 1 Mickey Club house 0300 1
2 1 Mickey Club house 0400 1
3 1 Mickey Park 0500 2
4 1 Mickey Minnies Boutique 0600 3
5 1 Mickey Minnies Boutique 0700 3
6 1 Mickey Club house 0800 4
7 1 Mickey Club house 0900 4
8 1 Mickey Park 1000 5
9 1 Mickey Club house 1100 6
The delta increments by +1 every time the location changes.
I would like to return an aggregate grouped by delta as per example below.
EntityName LocationID StartTime EndTime
Mickey Club house 0300 0500
Mickey Park 0500 0600
Mickey Minnies Boutique 0600 0800
Mickey Club house 0800 1000
Mickey Park 1000 1100
Mickey Club house 1100 1200
I am using the following query which was taken and adapted from here
SQL: Group By on Consecutive Records
(which works fine):
select
min(timex) as start_date
,end_date
,entityid
,entityname
,locationid
,delta
from
(
select
s1.timex
,(
select
max(timex)
from
[locationreport2] s2
where
s2.entityid = s1.entityid
and s2.delta = s1.delta
and not exists
(
select
null
from
[dbo].[locationreport2] s3
where
s3.timex < s2.timex
and s3.timex > s1.timex
and s3.entityid <> s1.entityid
and s3.entityname <> s1.entityname
and s3.delta <> s1.delta
)
) as end_date
, s1.entityid
, s1.entityname
, s1.locationid
, s1.delta
from
[dbo].[locationreport2] s1
) Result
group by
end_date
, entityid
, entityname
, locationid
, delta
order by
1 asc
However I would like to not use the delta (it takes effort to calculate and populate it); instead I am wondering if there is any way to calculate it as part / whilst running the query.
I wouldn’t mind using a view either.
 
Problem 2
I have the following table which shows the location of different people at 1 hour intervals
Id EntityID EntityName LocationID Timex Delta
1 1 Mickey Club house 0900 1
2 1 Mickey Club house 1000 1
3 1 Mickey Park 1100 2
4 2 Donald Club house 0900 1
5 2 Donald Park 1000 2
6 2 Donald Park 1100 2
7 3 Goofy Park 0900 1
8 3 Goofy Club house 1000 2
9 3 Goofy Park 1100 3
I would like to return an aggregate grouped by person and location.
For example
EntityID EntityName LocationID StartTime EndTime
1 Mickey Club house 0900 1100
1 Mickey Park 1100 1200
2 Donald Club house 0900 1000
2 Donald Park 1000 1200
3 Goofy Park 0900 1000
3 Goofy Club house 1000 1100
3 Goofy Park 1100 1200
What modifications do I need to the above query (Problem 1)?

Sounds like a case for analytic functions. You would need to add 1 hour to the EndTime, but that will be dependent on the data type.
SELECT EntityName, LocationID, StartTime, EndTime
FROM ( SELECT EntityName, LocationID
,MIN(Timex) OVER (PARTITION BY EntityID, delta ORDER BY delta) AS StartTime
,MAX(Timex) OVER (PARTITION BY EntityID, delta ORDER BY delta) AS EndTime
FROM locationreport2
) x
GROUP BY EntityName, LocationID, StartTime, EndTime
ORDER BY EntityName, StartTime

Related

JOIN Tables based on Service Date

I have 2 Tables (History and Responsible). They need to be JOINED based on Service Date.
History Table:
Id
ServiceDate
Hours
ClientId
ClientName
1
2021-10-15
3
123
Tom Holland
2
2021-10-25
5
123
Tom Holland
3
2022-01-14
2
123
Tom Holland
Responsible Table:
2999-12-31 means Responsible has no end date (current)
ClientId
ClientName
ResponsibleId
ResponsibleName
ResponsibleStartDate
ResponsibleEndtDate
123
Tom Holland
77
Thomas Anderson
2020-09-17
2021-10-17
123
Tom Holland
88
Tom Cruise
2021-10-18
2999-12-31
123
Tom Holland
99
Sten Lee
2022-01-07
2999-12-31
My code produces multiple rows, because 2022-01-14 Service date falls under multiple date ranges from Responsible Table:
SELECT h.Id,
h.ServiceDate,
h.Hours,
h.ClientId,
h.ClientName,
r.ResponsibleName
FROM History AS h
LEFT JOIN Responsible AS r
ON (h.ClientId = r.ClientId AND h.ServiceDate BETWEEN r.ResponsibleStartDate AND r.ResponsibleEndtDate)
The output of the query above is:
Id
ServiceDate
Hours
ClientId
ClientName
ResponsibleName
1
2021-10-15
3
123
Tom Holland
Thomas Anderson
2
2021-10-25
5
123
Tom Holland
Tom Cruise
3
2022-01-14
2
123
Tom Holland
Tom Cruise
3
2022-01-14
2
123
Tom Holland
Sten Lee
Technically, output is correct (because 2022-01-14 is between 2021-10-18 - 2999-12-31 as well between 2022-01-07 - 2999-12-31), but not what I need.
I would like to know if possible to achieve 2 outputs:
1) If Service Date falls in multiple date ranges from Responsible Table, Responsible Should be the person who's ResponsibleStartDate is closer to the ServiceDate:
Id
ServiceDate
Hours
ClientId
ClientName
ResponsibleName
1
2021-10-15
3
123
Tom Holland
Thomas Anderson
2
2021-10-25
5
123
Tom Holland
Tom Cruise
3
2022-01-14
2
123
Tom Holland
Sten Lee
2) Keep all rows, if Service Date falls in multiple date ranges from Responsible Table, but split Hours evenly between Responsible:
Id
ServiceDate
Hours
ClientId
ClientName
ResponsibleName
1
2021-10-15
3
123
Tom Holland
Thomas Anderson
2
2021-10-25
5
123
Tom Holland
Tom Cruise
3
2022-01-14
1
123
Tom Holland
Tom Cruise
3
2022-01-14
1
123
Tom Holland
Sten Lee
First one, we can use a window function to apply a row number, based on how close to ServiceDate the ResponsibleStartDate is, then we can just pick the first row per h.Id. If there is a tie we can break it by picking something that will give us deterministic order, e.g. ORDER BY {DATEDIFF expression}, ResponsibleName.
;WITH cte AS
(
SELECT h.Id,
h.ServiceDate,
h.Hours,
h.ClientId,
h.ClientName,
r.ResponsibleName,
RankOrderedByProximityToServiceDate = ROW_NUMBER() OVER
(PARTITION BY h.Id
ORDER BY ABS(DATEDIFF(DAY, ResponsibleStartDate, ServiceDate)))
FROM dbo.History AS h
LEFT JOIN dbo.Responsible AS r
ON (h.ClientId = r.ClientId
AND h.ServiceDate BETWEEN r.ResponsibleStartDate AND r.ResponsibleEndtDate)
)
SELECT Id, ServiceDate, Hours, ClientId, ClientName, ResponsibleName
FROM cte WHERE RankOrderedByProximityToServiceDate = 1;
Output:
Id
ServiceDate
Hours
ClientId
ClientName
ResponsibleName
1
2021-10-15
3
123
Tom Holland
Thomas Anderson
2
2021-10-25
5
123
Tom Holland
Tom Cruise
3
2022-01-14
2
123
Tom Holland
Sten Lee
Second one doesn't require a CTE, we can simply divide the Hours in h by the number of rows that exist for that h.Id, then limit it to 2 decimal places:
SELECT h.Id,
h.ServiceDate,
Hours = CONVERT(decimal(11,2),
h.Hours * 1.0
/ COUNT(h.Id) OVER (PARTITION BY h.Id)),
h.ClientId,
h.ClientName,
r.ResponsibleName
FROM dbo.History AS h
LEFT JOIN dbo.Responsible AS r
ON (h.ClientId = r.ClientId
AND h.ServiceDate BETWEEN r.ResponsibleStartDate AND r.ResponsibleEndtDate);
Output:
Id
ServiceDate
Hours
ClientId
ClientName
ResponsibleName
1
2021-10-15
3.00
123
Tom Holland
Thomas Anderson
2
2021-10-25
5.00
123
Tom Holland
Tom Cruise
3
2022-01-14
1.00
123
Tom Holland
Tom Cruise
3
2022-01-14
1.00
123
Tom Holland
Sten Lee
Both demonstrated in this db<>fiddle.
My attempt at part 1 - it doesn't work if there's more than one Responsible as of the same start date.
WITH
"all_services" AS (
SELECT
h.Id,
h.ServiceDate,
h.Hours,
h.ClientId,
h.ClientName,
r.ResponsibleName,
r.ResponsibleStartDate
FROM History AS h
LEFT JOIN Responsible AS r
ON h.ClientId = r.ClientId
AND h.ServiceDate BETWEEN r.ResponsibleStartDate AND r.ResponsibleEndtDate
),
"most_recent_key" AS (
SELECT
ServiceDate,
ClientId,
MAX(ResponsibleStartDate) AS "ResponsibleStartDate"
FROM all_services
GROUP BY ServiceDate, ClientId
)
SELECT Id, ServiceDate, Hours, ClientId, ClientName, ResponsibleName
FROM all_services
INNER JOIN most_recent_key
USING (ServiceDate, ClientId, ResponsibleStartDate)
Posting it anyway as a contrast to Aaron's better solution as a learning point for myself.

Creating a timetable with SQL (calculated start times for slots) and filtering by a person to show them their slots

I'm working in iMIS CMS (iMIS 200) and trying to create an IQA (an iMIS query, using SQL) that will give me a timetable of slots assigned to people per day (I've got this working); but then I want to be able to filter that timetable on a person's profile so they just see the slots they are assigned to.
(This is for auditions for an orchestra. So people make an application per instrument, then those applications are assigned to audition slots, of which there are several slots per day)
As the start/end times for slots are calculated using SUM OVER, when I filter this query by the person ID, I lose the correct start/end times for slots (as the other slots aren't in the data for it to SUM, I guess!)
Table structure:
tblContacts
===========
ContactID ContactName
---------------------------
1 Steve Jones
2 Clare Philips
3 Bob Smith
4 Helen Winters
5 Graham North
6 Sarah Stuart
tblApplications
===============
AppID FKContactID Instrument
-----------------------------------
1 1 Violin
2 1 Viola
3 2 Cello
4 3 Cello
5 4 Trumpet
6 5 Clarinet
7 5 Horn
8 6 Trumpet
tblAuditionDays
===============
AudDayID AudDayDate AudDayVenue AudDayStart
-------------------------------------------------
1 16-Sep-19 London 10:00
2 17-Sep-19 Manchester 10:00
3 18-Sep-19 Birmingham 13:30
4 19-Sep-19 Leeds 10:00
5 19-Sep-19 Glasgow 11:30
tblAuditionSlots
================
SlotID FKAudDayID SlotOrder SlotType SlotDuration FKAppID
-----------------------------------------------------------------
1 1 1 Audition 20 3
2 1 2 Audition 20 4
3 1 3 Chat 10 3
4 1 5 Chat 10 4
5 1 4 Audition 20
6 2 1 Audition 20 1
7 2 2 Audition 20 6
8 2 4 Chat 10 6
9 2 3 Chat 10 1
10 2 5 Audition 20
11 3 2 Chat 10 8
12 3 1 Audition 20 2
13 3 4 Chat 5 2
14 3 3 Audition 20 8
15 5 1 Audition 30 5
16 5 2 Audition 30 7
17 5 3 Chat 15 7
18 5 4 Chat 15 5
Current SQL for listing all the slots each day (in date/slot order, with the slot timings calculcated correctly) is:
SELECT
[tblAuditionSlots].[SlotOrder] as [Order],
CASE
WHEN
SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) is null
THEN
CONVERT(VARCHAR(5), [tblAuditionDays].[AudDayStart], 108)
ELSE
CONVERT(VARCHAR(5), Dateadd(minute, SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), [tblAuditionDays].[AudDayStart]), 108)
END
+ ' - ' +
CASE
WHEN
SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) is null
THEN
CONVERT(VARCHAR(5), [tblAuditionDays].[AudDayStart], 108)
ELSE
CONVERT(VARCHAR(5), Dateadd(minute, SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), [tblAuditionDays].[AudDayStart]), 108)
END AS [Slot],
[tblAuditionSlots].[SlotType] AS [Type],
[tblContacts].[ContactName] as [Name],
FROM
tblAuditionSlots
LEFT JOIN tblAuditionDays ON tblAuditionSlots.FKAudDayID = tblAuditionDays.AudDayID
LEFT JOIN tblApplications ON tblAuditionSlots.FKAppID = tblApplications.AppID
LEFT JOIN tblContacts ON tblApplications.FKContactID = tblContacts.ContactID
GROUP BY
[tblAuditionSlots].[SlotOrder],
[tblAuditionSlots].[SlotType],
[tblAuditionSlots].[SlotDuration],
[tblAuditionDays].[AudDayStart],
[tblContacts].[ContactName],
[tblContacts].[ContactID],
[tblAuditionDays].[AudDayID],
[tblAuditionDays].[AudDayDate]
ORDER BY
[tblAuditionDays].[DayDate],
[tblAuditionSlots].[Order]
iMIS, the CMS we're using, is limited by what you can create in an IQA (query).
You can basically insert (some) SQL as a column and give it an alias; you can add (non-calculated) fields to the order by; you can't really control the Group By (whatever fields are added are included in the Group By).
Ultimately, I'd like to be able to filter this by a Contact ID so I can see all their audition slots, but with the times correctly calculated.
From the sample data, for example:
STEVE JONES AUDITIONS
=====================
Date Slot Venue Type Instrument
----------------------------------------------------------------
17-Sep-19 10:00 - 10:20 Manchester Audition Violin
17-Sep-19 10:40 - 10:50 Manchester Chat Violin
18-Sep-19 13:30 - 13:50 Birmingham Audition Viola
18-Sep-19 14:30 - 14:35 Birmingham Chat Viola
HELEN WINTERS AUDITIONS
=======================
Date Slot Venue Type Instrument
----------------------------------------------------------------
19-Sep-19 11:30 - 12:00 Glasgow Audition Trumpet
19-Sep-19 12:45 - 13:00 Glasgow Chat Trumpet
Hopefully that all makes sense and I've provided enough information.
(In this version of iMIS [200], you can't do subqueries, in case that comes up...)
Thanks so much in advance for whatever help/tips/advice you can offer!
Chris

SQL - Grouping COUNT Results

The query I'm trying to answer is 'How many sales above or equal to 60 has each person made?'
My table (sales$):
SaleID name salevalue
1 Steve 100
2 John 50
3 Ellen 25
4 Steve 100
5 Mary 60
6 Mary 80
7 John 70
8 Mary 55
9 Steve 65
10 Ellen 120
11 Ellen 30
12 Ellen 40
13 John 40
14 Mary 60
15 Steve 50
My code is:
select name,
COUNT(*) as 'sales above 60'
from Sales$
group by salevalue, name
having salevalue >= 60;
Which gives:
Ellen 1
John 1
Mary 2
Mary 1
Steve 1
Steve 2
The information is correct in that Mary & Steve both have 3 sales, however I'm forced by the HAVING command to group them out.
Any ideas? I'm sure I've just taken a wrong turning.
You can use conditional aggregation for this:
select name,
COUNT(case when salevalue >= 60 then 1 end) as 'sales above 60'
from Sales$
group by name
This way COUNT will take into consideration only records having salevalue >= 60.
I've swapped the HAVING statement for a WHERE and achieved the desired result:
select name, count(*) 'sales above 50'
from sales$
where salevalue >=60
group by name
(Lightbulb moment after posting)

3 Tables, JOIN query and alphabetic order

I am currently working with three tables where I am trying to figure out how to use a join to display once the title_id of any book with Dennis McCann as an editor. The tables have in common title_id and editor_id. Cant find a way to piece it all together. How to display once the title_id of any book with Dennis McCann as an editor?
SELECT * FROM title_editors;
EDITOR_ID TITLE_ EDITOR_ORDER
----------- ------ ------------
826-11-9034 Bu2075 2
826-11-9034 PS2091 2
826-11-9034 Ps2106 2
826-11-9034 PS3333 2
826-11-9034 PS7777 2
826-11-9034 pS1372 2
885-23-9140 MC2222 2
885-23-9140 MC3021 2
885-23-9140 Tc3281 2
885-23-9140 TC4203 2
885-23-9140 TC7777 2
321-55-8906 bU1032 2
321-55-8906 BU1111 2
321-55-8906 BU7832 2
321-55-8906 PC1035 2
321-55-8906 PC8888 2
321-55-8906 BU2075 3
777-02-9831 pc1035 3
777-02-9831 PC8888 3
943-88-7920 BU1032 1
943-88-7920 bu1111 1
943-88-7920 BU2075 1
943-88-7920 BU7832 1
943-88-7920 PC1035 1
943-88-7920 pc8888 1
993-86-0420 PS1372 1
993-86-0420 PS2091 1
993-86-0420 PS2106 1
993-86-0420 PS3333 1
993-86-0420 pS7777 1
993-86-0420 MC2222 1
993-86-0420 MC3021 1
993-86-0420 Tc3218 1
993-86-0420 TC4203 1
993-86-0420 TC7777 1
35 rows selected.
SQL> SELECT * FROM title_authors;
AUTHOR_ID TITLE_ AUTHOR_ORDER ROYALTY_SHARE
----------- ------ ------------ -------------
409-56-7008 Bu1032 1 .6
486-29-1786 PS7777 1 1
486-29-1786 pC9999 1 1
712-45-1867 MC2222 1 1
172-32-1176 Ps3333 1 1
213-46-8915 BU1032 2 .4
238-95-7766 PC1035 1 1
213-46-8915 Bu2075 1 1
998-72-3567 pS2091 1 .5
899-46-2035 PS2091 2 .5
998-72-3567 PS2106 1 1
722-51-5454 mc3021 1 .75
899-46-2035 MC3021 2 .25
807-91-6654 tC3218 1 1
274-80-9391 BU7832 1 1
427-17-2319 pC8888 1 .5
846-92-7186 PC8888 2 .5
756-30-7391 PS1372 1 .75
724-80-9391 PS1372 2 .25
724-80-9391 bu1111 1 .6
267-41-2394 bU1111 2 .4
672-71-3249 TC7777 1 .4
267-41-2394 TC7777 2 .3
472-27-2349 Tc7777 3 .3
648-92-1872 TC4203 1 1
25 rows selected.
SQL> SELECT * FROM editors;
EDITOR_ID EDITOR_LNAME EDITOR_FNAME EDITOR_POSITION PHONE ADDRESS CITY ST ZIP
----------- ----------------- ------------- --------------- ------------ -------------------- ------------ -- ------
321-55-8906 DeLongue Martinella Project 415 843-2222 3000 6th St. BERKELEY Ca 94710
723-48-9010 Sparks MANfred cOPY 303 721-3388 15 Sail DENVER Co 80237
777-02-9831 Samuelson Bernard proJect 415 843-6990 27 Yosemite OAKLAND Ca 94609
777-66-9902 Almond Alfred copy 312 699-4177 1010 E. DeVON CHICAGO Il 60018
826-11-9034 Himmel Eleanore pRoject 617 423-0552 97 Bleaker BOSTON Ma 02210
885-23-9140 Rutherford-Hayes Hannah PROJECT 301 468-3909 32 Rockbill Pike ROCKBILL MD 20852
993-86-0420 McCann Dennis acQuisition 301 468-3909 32 Rockbill Pike ROCKBill MD 20852
943-88-7920 Kaspchek Christof acquisitiOn 415 549-3909 18 Severe Rd. BERKELEY CA 94710
234-88-9720 Hunter Amanda acquisition 617 432-5586 18 Dowdy Ln. BOSTON MA 02210
You can try join on the table Editors and Ttile_Editors using the Editor_ID that will give you the matching records and you can filter out Only for the ' Dennis McCann ' using either multiple conditions in join or the where clause as,
WITHOUT WHERE
SELECT DISTINCT te.title_id,ed.EDITOR_ID,ed.EDITOR_LNAME,ed.EDITOR_FNAME
FROM
title_editors te JOIN editors ed
ON te.EDITOR_ID = ed.EDITOR_ID
AND ed.EDITOR_LNAME = 'McCann'
AND ed.EDITOR_FNAME = 'Dennis'
ORDER BY te.title_id
USing WHERE
SELECT DISTINCT te.title_id,ed.EDITOR_ID,ed.EDITOR_LNAME,ed.EDITOR_FNAME
FROM
title_editors te JOIN editors ed
ON te.EDITOR_ID = ed.EDITOR_ID
WHERE
ed.EDITOR_LNAME = 'McCann'
AND ed.EDITOR_FNAME = 'Dennis'
ORDER BY te.title_id
It would be easier with the in operator:
SELECT DISTINCT title_id
FROM title_editors
WHERE editor_id IN (SELECT editor_id
FROM editors
WHERE editor_fname = 'Dennis' AND
editor_lname = 'McCann')
ORDER BY title_id ASC

Retrieve top 48 unique records from database based on a sorted Field

I have database table that I am after some SQL for (Which is defeating me so far!)
Imagine there are 192 Athletic Clubs who all take part in 12 Track Meets per season.
So that is 2304 individual performances per season (for example in the 100Metres)
I would like to find the top 48 (unique) individual performances from the table, these 48 athletes are then going to take part in the end of season World Championships.
So imagine the 2 fastest times are both set by "John Smith", but he can only be entered once in the world champs. So i would then look for the next fastest time not set by "John Smith"... so on and so until I have 48 unique athletes..
hope that makes sense.
thanks in advance if anyone can help
PS
I did have a nice screen shot created that would explain it much better. but as a newish user i cannot post images.
I'll try a copy and paste version instead...
ID AthleteName AthleteID Time
1 Josh Lewis 3 11.99
2 Joe Dundee 4 11.31
3 Mark Danes 5 13.44
4 Josh Lewis 3 13.12
5 John Smith 1 11.12
6 John Smith 1 12.18
7 John Smith 1 11.22
8 Adam Bennett 6 11.33
9 Ronny Bower 7 12.88
10 John Smith 1 13.49
11 Adam Bennett 6 12.55
12 Mark Danes 5 12.12
13 Carl Tompkins 2 13.11
14 Joe Dundee 4 11.28
15 Ronny Bower 7 12.14
16 Carl Tompkin 2 11.88
17 Nigel Downs 8 14.14
18 Nigel Downs 8 12.19
Top 4 unique individual performances
1 John Smith 1 11.12
3 Joe Dundee 4 11.28
5 Adam Bennett 6 11.33
6 Carl Tompkins 2 11.88
Basically something like this:
select top 48 *
from (
select athleteId,min(time) as bestTime
from theRaces
where raceId = '123' -- e.g., 123=100 meters
group by athleteId
) x
order by bestTime
try this --
select x.ID, x.AthleteName , x.AthleteID , x.Time
(
select rownum tr_count,v.AthleteID AthleteID, v.AthleteName AthleteName, v.Time Time,v.id id
from
(
select
tr1.AthleteName AthleteName, tr1.Time time,min(tr1.id) id, tr1.AthleteID AthleteID
from theRaces tr1
where time =
(select min(time) from theRaces tr2 where tr2.athleteId = tr1.athleteId)
group by tr1.AthleteName, tr1.AthleteID, tr1.Time
having tr1.Time = ( select min(tr2.time) from theRaces tr2 where tr1.AthleteID =tr2.AthleteID)
order by tr1.time
) v
) x
where x.tr_count < 48