Function to get rolling average with lowest 2 values eliminated? - sql

This is my sample data, with the Current_rating column showing my desired output.
Date Name Subject Importance Location Time Rating Current_rating
12/08/2020 David Work 1 London - - 4
1/08/2020 David Work 3 London 23.50 4 3.66
2/10/2019 David Emails 3 New York 18.20 3 4.33
2/08/2019 David Emails 3 Paris 18.58 4 4
11/07/2019 David Work 1 London - 3 4
1/06/2019 David Work 3 London 23.50 4 4
2/04/2019 David Emails 3 New York 18.20 3 5
2/03/2019 David Emails 3 Paris 18.58 5 -
12/08/2020 George Updates 2 New York - - 2
1/08/2019 George New Appointments 5 London 55.10 2 -
I need to use a function to get the values in the Current_rating column. Current_rating takes the previous 5 results from the Rating column for each name, eliminates the lowest 2 results, then averages the remaining 3. Some names may not have 5 results: with 3 or fewer results I just need the average of what is there, and with 4 results I need to eliminate the lowest value and average the remaining 3. To get the right 5 previous results, the rows need to be sorted by date. Is this possible? Thanks for your time in advance.

What a pain! I think the simplest method might be to use arrays and then unnest() and aggregate:
select t.*, r.current_rating
from (select t.*,
array_agg(rating) over (partition by name order by date rows between 5 preceding and 1 preceding) as rating_5
from t
) t cross join lateral
(select avg(r) as current_rating
from (select u.*
from unnest(t.rating_5) with ordinality u(r, n)
where r is not null
order by r desc
limit 3
) r
) r
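To sanity-check the rule against the sample data, here is a minimal Python sketch of the same logic. It assumes the window is the five rows before the current one (which is what the Current_rating column in the question implies) and that missing ratings are skipped. The function name `current_ratings` is made up for illustration.

```python
from typing import List, Optional, Sequence


def current_ratings(ratings: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Ratings for one name, ordered oldest to newest; None marks a missing rating."""
    results: List[Optional[float]] = []
    for i in range(len(ratings)):
        # Up to five previous rows (the current row is excluded), ignoring missing ratings.
        window = [r for r in ratings[max(0, i - 5):i] if r is not None]
        # Keeping the three highest values is the same as dropping the lowest two of five.
        top3 = sorted(window, reverse=True)[:3]
        results.append(sum(top3) / len(top3) if top3 else None)
    return results


# David's ratings from the question, oldest (2/03/2019) to newest (12/08/2020).
david = current_ratings([5, 3, 4, 3, 4, 3, 4, None])
```

With David's rows this reproduces the Current_rating column from the question, read from the bottom row upwards.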

Related

Select a row based on where have maximum by one column

I have a table called match_score which has the following data:
id participant round score
1 gabe 1 100
2 john 1 90
3 duff 1 80
4 vlad 1 85
5 gabe 2 75
6 john 2 70
Let's just say that round 1 is the preliminary round and 2 is the final round
I want to rank the results by score, grouped by participant. If I use a plain SQL query with group by participant and order by score desc, vlad comes 1st, duff 2nd and gabe 3rd, which is wrong.
What I want is:
1st gabe with 75 point in the final round
2nd john with 70 point in the final round
3rd vlad with 85 point in the preliminary round
4th duff with 80 point in the preliminary round
Maybe something like this:
select
h.participant,
h.round,
h.score
from
match_score h
where
not exists (
select 1
from
match_score t
where
t.participant = h.participant and t.round > h.round
)
order by
h.round desc,
h.score desc;
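As a quick check, the query can be run against the sample rows. This sketch uses Python's built-in sqlite3 module purely as a stand-in engine, which is an assumption: the question does not say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table match_score (id integer, participant text, round integer, score integer)"
)
conn.executemany(
    "insert into match_score values (?, ?, ?, ?)",
    [
        (1, "gabe", 1, 100),
        (2, "john", 1, 90),
        (3, "duff", 1, 80),
        (4, "vlad", 1, 85),
        (5, "gabe", 2, 75),
        (6, "john", 2, 70),
    ],
)

# Keep only each participant's row from their latest round, then rank by
# round (finalists first) and score within the round.
ranking = conn.execute(
    """
    select h.participant, h.round, h.score
    from match_score h
    where not exists (
        select 1
        from match_score t
        where t.participant = h.participant and t.round > h.round
    )
    order by h.round desc, h.score desc
    """
).fetchall()
```

Here `ranking` comes back as gabe (75), john (70), vlad (85), duff (80), matching the order asked for.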

Group overlapping differing date-ranges with different start- and enddate in SQL

I need help with grouping overlapping date-ranges in Microsoft SQL Server 18.1. A sample of the data looks like this. I need to be able to group based on ID, Name, StartDate and EndDate. If one of ID 1's date-ranges overlaps with, or is less than 7 days apart from, the next row of ID 1's date-ranges, both rows should be assigned the same grouping ID. If the time gap between two rows is bigger than 7 days, they should be separated into two groupings.
The data is characterized by having differing Start- and EndDate for almost all of the rows, so it cannot be grouped by Start- and EndDate. Instead the goal is to group all rows for one person where the rows date-ranges overlap with less than 7 days, and show the Start- and EndDate for the full period by using MIN and MAX as shown in the "desired output window". Each Person in the dataset can have up to 200 lines of differing date-ranges that need to be grouped if it overlaps with other date-ranges for the person in the dataset.
I suppose the solution requires running through all rows until all rows are grouped based on ID, Name and overlapping date ranges. The case is that match-grouping for period 1-4, firstly requires period 1 to be matched with period 2, then period 1-2 needs to be matched to period 3, and period 1-3 to be matched with period 4. It can be the case that Period 1's date-range (e.g. 01-01-2019 - 30-05-2019) ends later than period 2's (e.g. 05-02-2019 - 24-04-2019), where the period 1-2 comparison/match to period 3 should be the MAX EndDate for period 1-2 meaning 30-05-2019 in this case.
Period 1 Period 2 Period 3 Period 4
X
X
X
X
I need help with making a code that gets me from step 0 - Raw Data to step 1 - Grouping by overlapping date-ranges (less than 7 days apart). I have tried CASE, LAG, LEAD, PARTITION BY and some different kind of loops but haven't found a solution on how to solve the problem.
Step 0 - Raw Data:
ID Name StartDate EndDate
1 Peter Hanson 01-01-2018 15-02-2019
1 Peter Hanson 05-01-2019 23-02-2019
1 Peter Hanson 30-02-2019 18-04-2019
2 Eric Schmidt 05-01-2019 18-03-2019
2 Eric Schmidt 07-01-2019 25-05-2019
3 Martin Boyle 08-03-2018 12-01-2019
3 Martin Boyle 23-01-2019 17-04-2019
3 Martin Boyle 18-04-2019 12-05-2019
3 Martin Boyle 29-04-2019 31-09-2019
Step 1- Grouping by overlapping date-ranges (less than 7 days apart):
ID Name StartDate EndDate Grouping
1 Peter Hanson 01-01-2018 15-02-2019 1
1 Peter Hanson 05-01-2019 23-02-2019 1
1 Peter Hanson 30-02-2019 18-04-2019 2
2 Eric Schmidt 05-01-2019 18-03-2019 3
2 Eric Schmidt 07-01-2019 25-05-2019 3
3 Martin Boyle 08-03-2018 12-01-2019 4
3 Martin Boyle 23-01-2019 17-04-2019 5
3 Martin Boyle 18-04-2019 12-05-2019 5
3 Martin Boyle 29-04-2019 31-09-2019 5
Step 2 - Desired Output window:
ID Name StartDate EndDate Grouping
1 Peter Hanson 01-01-2019 23-02-2019 1
1 Peter Hanson 30-02-2019 18-04-2019 2
2 Eric Schmidt 05-01-2019 25-05-2019 3
3 Martin Boyle 08-03-2018 12-01-2019 4
3 Martin Boyle 23-01-2019 31-09-2019 5
I hope that somebody can help with this task.
You want to identify where a group starts. Based on your description, you can use lag() -- although if total overlaps are allowed then a cumulative max() is more appropriate.
Then, the groups are the cumulative sum of the starts . . . and the rest is aggregation:
select id, name, min(startdate), max(enddate),
dense_rank() over (order by id, min(startdate)) as grouping
from (select t.*,
-- running count of group starts: a row starts a new group when the previous
-- row ended more than 7 days before this row starts (or there is no previous row)
sum(case when prev_enddate >= dateadd(day, -7, startdate) then 0 else 1 end
) over (partition by id order by startdate) as grp
from (select t.*,
lag(enddate) over (partition by id order by startdate) as prev_enddate
from t
) t
) t
group by id, name, grp;
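The cumulative-max variant mentioned above is easier to see outside SQL. Below is a small Python sketch of that logic, not a translation of the query: a row joins the current group when it starts no more than 7 days after the latest end date seen so far in that group. The function name `group_ranges` and the rows for id 9 are made up for illustration; the other rows are taken from the question.

```python
from datetime import date, timedelta


def group_ranges(rows, max_gap_days=7):
    """Merge (id, start, end) rows per id, tracking the running maximum end date."""
    merged = []
    for pid, start, end in sorted(rows):
        last = merged[-1] if merged else None
        if last and last[0] == pid and start <= last[2] + timedelta(days=max_gap_days):
            # Same group: extend it, keeping the latest end date seen so far.
            last[2] = max(last[2], end)
        else:
            # New id, or the gap is larger than max_gap_days: start a new group.
            merged.append([pid, start, end])
    return [tuple(m) for m in merged]


sample = [
    (2, date(2019, 1, 5), date(2019, 3, 18)),
    (2, date(2019, 1, 7), date(2019, 5, 25)),
    (3, date(2018, 3, 8), date(2019, 1, 12)),
    (3, date(2019, 1, 23), date(2019, 4, 17)),
    (3, date(2019, 4, 18), date(2019, 5, 12)),
    # Containment case described in the question; the third period is hypothetical.
    (9, date(2019, 1, 1), date(2019, 5, 30)),
    (9, date(2019, 2, 5), date(2019, 4, 24)),
    (9, date(2019, 6, 2), date(2019, 6, 20)),
]
groups = group_ranges(sample)
```

For id 9 the third period is compared against 30-05-2019 (the running maximum), not against 24-04-2019, so all three rows end up in one group. Comparing only against the previous row's end date would split them.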

Creating a timetable with SQL (calculated start times for slots) and filtering by a person to show them their slots

I'm working in iMIS CMS (iMIS 200) and trying to create an IQA (an iMIS query, using SQL) that will give me a timetable of slots assigned to people per day (I've got this working); but then I want to be able to filter that timetable on a person's profile so they just see the slots they are assigned to.
(This is for auditions for an orchestra. So people make an application per instrument, then those applications are assigned to audition slots, of which there are several slots per day)
As the start/end times for slots are calculated using SUM OVER, when I filter this query by the person ID, I lose the correct start/end times for slots (as the other slots aren't in the data for it to SUM, I guess!)
Table structure:
tblContacts
===========
ContactID ContactName
---------------------------
1 Steve Jones
2 Clare Philips
3 Bob Smith
4 Helen Winters
5 Graham North
6 Sarah Stuart
tblApplications
===============
AppID FKContactID Instrument
-----------------------------------
1 1 Violin
2 1 Viola
3 2 Cello
4 3 Cello
5 4 Trumpet
6 5 Clarinet
7 5 Horn
8 6 Trumpet
tblAuditionDays
===============
AudDayID AudDayDate AudDayVenue AudDayStart
-------------------------------------------------
1 16-Sep-19 London 10:00
2 17-Sep-19 Manchester 10:00
3 18-Sep-19 Birmingham 13:30
4 19-Sep-19 Leeds 10:00
5 19-Sep-19 Glasgow 11:30
tblAuditionSlots
================
SlotID FKAudDayID SlotOrder SlotType SlotDuration FKAppID
-----------------------------------------------------------------
1 1 1 Audition 20 3
2 1 2 Audition 20 4
3 1 3 Chat 10 3
4 1 5 Chat 10 4
5 1 4 Audition 20
6 2 1 Audition 20 1
7 2 2 Audition 20 6
8 2 4 Chat 10 6
9 2 3 Chat 10 1
10 2 5 Audition 20
11 3 2 Chat 10 8
12 3 1 Audition 20 2
13 3 4 Chat 5 2
14 3 3 Audition 20 8
15 5 1 Audition 30 5
16 5 2 Audition 30 7
17 5 3 Chat 15 7
18 5 4 Chat 15 5
Current SQL for listing all the slots each day (in date/slot order, with the slot timings calculated correctly) is:
SELECT
[tblAuditionSlots].[SlotOrder] as [Order],
CASE
WHEN
SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[AudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) is null
THEN
CONVERT(VARCHAR(5), [tblAuditionDays].[AudDayStart], 108)
ELSE
CONVERT(VARCHAR(5), Dateadd(minute, SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[AudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), [tblAuditionDays].[AudDayStart]), 108)
END
+ ' - ' +
CASE
WHEN
SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[AudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) is null
THEN
CONVERT(VARCHAR(5), [tblAuditionDays].[AudDayStart], 108)
ELSE
CONVERT(VARCHAR(5), Dateadd(minute, SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionDays].[AudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), [tblAuditionDays].[AudDayStart]), 108)
END AS [Slot],
[tblAuditionSlots].[SlotType] AS [Type],
[tblContacts].[ContactName] as [Name]
FROM
tblAuditionSlots
LEFT JOIN tblAuditionDays ON tblAuditionSlots.FKAudDayID = tblAuditionDays.AudDayID
LEFT JOIN tblApplications ON tblAuditionSlots.FKAppID = tblApplications.AppID
LEFT JOIN tblContacts ON tblApplications.FKContactID = tblContacts.ContactID
GROUP BY
[tblAuditionSlots].[SlotOrder],
[tblAuditionSlots].[SlotType],
[tblAuditionSlots].[SlotDuration],
[tblAuditionDays].[AudDayStart],
[tblContacts].[ContactName],
[tblContacts].[ContactID],
[tblAuditionDays].[AudDayID],
[tblAuditionDays].[AudDayDate]
ORDER BY
[tblAuditionDays].[AudDayDate],
[tblAuditionSlots].[SlotOrder]
iMIS, the CMS we're using, is limited by what you can create in an IQA (query).
You can basically insert (some) SQL as a column and give it an alias; you can add (non-calculated) fields to the order by; you can't really control the Group By (whatever fields are added are included in the Group By).
Ultimately, I'd like to be able to filter this by a Contact ID so I can see all their audition slots, but with the times correctly calculated.
From the sample data, for example:
STEVE JONES AUDITIONS
=====================
Date Slot Venue Type Instrument
----------------------------------------------------------------
17-Sep-19 10:00 - 10:20 Manchester Audition Violin
17-Sep-19 10:40 - 10:50 Manchester Chat Violin
18-Sep-19 13:30 - 13:50 Birmingham Audition Viola
18-Sep-19 14:30 - 14:35 Birmingham Chat Viola
HELEN WINTERS AUDITIONS
=======================
Date Slot Venue Type Instrument
----------------------------------------------------------------
19-Sep-19 11:30 - 12:00 Glasgow Audition Trumpet
19-Sep-19 12:45 - 13:00 Glasgow Chat Trumpet
Hopefully that all makes sense and I've provided enough information.
(In this version of iMIS [200], you can't do subqueries, in case that comes up...)
Thanks so much in advance for whatever help/tips/advice you can offer!
Chris
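The effect described above (filtering by person changes the calculated times) can be reproduced outside iMIS. The Python sketch below only illustrates the order of operations and is not an IQA solution. It uses the Glasgow day from the sample data; the names `timetable`, `app_contact` and `day_start` are made up for illustration.

```python
from datetime import datetime, timedelta

# (SlotID, FKAudDayID, SlotOrder, SlotDuration, FKAppID) for audition day 5 (Glasgow).
slots = [(15, 5, 1, 30, 5), (16, 5, 2, 30, 7), (17, 5, 3, 15, 7), (18, 5, 4, 15, 5)]
day_start = {5: "11:30"}        # AudDayID -> AudDayStart
app_contact = {5: 4, 7: 5}      # AppID -> FKContactID (4 = Helen Winters)


def timetable(slots, day_start):
    """Return (SlotID, AppID, 'HH:MM - HH:MM') using a running sum of durations per day."""
    out, elapsed = [], {}
    for slot_id, day_id, _order, minutes, app_id in sorted(slots, key=lambda s: (s[1], s[2])):
        begin = datetime.strptime(day_start[day_id], "%H:%M") + timedelta(
            minutes=elapsed.get(day_id, 0)
        )
        finish = begin + timedelta(minutes=minutes)
        elapsed[day_id] = elapsed.get(day_id, 0) + minutes
        out.append((slot_id, app_id, f"{begin:%H:%M} - {finish:%H:%M}"))
    return out


# Calculate times over every slot first, then filter to one contact.
helen = [row for row in timetable(slots, day_start) if app_contact.get(row[1]) == 4]

# Filtering first drops the other people's durations from the running sum.
filtered_first = timetable([s for s in slots if app_contact.get(s[4]) == 4], day_start)
```

`helen` gives 11:30 - 12:00 and 12:45 - 13:00, as in the expected output for Helen Winters, whereas `filtered_first` puts her chat slot at 12:00 - 12:15. So the running sum has to see every slot of the day before the person filter is applied.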

Access SQL - Select only the last sequence

I have a table with an ID and multiple informative columns. Sometimes however, I can have multiple data for an ID, so I added a column called "Sequence". Here is a shortened example:
ID Sequence Name Tel Date Amount
124 1 Bob 873-4356 2001-02-03 10
124 2 Bob 873-4356 2002-03-12 7
124 3 Bob 873-4351 2006-07-08 24
125 1 John 983-4568 2007-02-01 3
125 2 John 983-4568 2008-02-08 13
126 1 Eric 345-9845 2010-01-01 18
So, I would like to obtain only these lines:
124 3 Bob 873-4351 2006-07-08 24
125 2 John 983-4568 2008-02-08 13
126 1 Eric 345-9845 2010-01-01 18
Anyone could give me a hand on how I could build a SQL query to do this ?
Thanks !
You can calculate the maximum sequence using group by. Then you can use join to get only the maximum in the original data.
Assuming your table is called t:
select t.*
from t join
(select id, MAX(sequence) as maxs
from t
group by id
) tmax
on t.id = tmax.id and
t.sequence = tmax.maxs
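A quick way to try the join is to load the example rows into an in-memory database. The sketch below uses Python's sqlite3 module as a stand-in for Access, so it only checks the logic of the query and not Access-specific syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer, sequence integer, name text, tel text, date text, amount integer)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?, ?, ?)",
    [
        (124, 1, "Bob", "873-4356", "2001-02-03", 10),
        (124, 2, "Bob", "873-4356", "2002-03-12", 7),
        (124, 3, "Bob", "873-4351", "2006-07-08", 24),
        (125, 1, "John", "983-4568", "2007-02-01", 3),
        (125, 2, "John", "983-4568", "2008-02-08", 13),
        (126, 1, "Eric", "345-9845", "2010-01-01", 18),
    ],
)

# Join each row to the highest sequence number for its id; only the last row per id survives.
latest = conn.execute(
    """
    select t.*
    from t join
         (select id, max(sequence) as maxs
          from t
          group by id
         ) tmax
         on t.id = tmax.id and
            t.sequence = tmax.maxs
    order by t.id
    """
).fetchall()
```

The `order by t.id` is only there to make the result order predictable for the check; it is not part of the answer's query.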

Retrieve top 48 unique records from database based on a sorted Field

I have a database table that I need some SQL for (which is defeating me so far!).
Imagine there are 192 Athletic Clubs who all take part in 12 Track Meets per season.
So that is 2304 individual performances per season (for example in the 100Metres)
I would like to find the top 48 (unique) individual performances from the table, these 48 athletes are then going to take part in the end of season World Championships.
So imagine the 2 fastest times are both set by "John Smith", but he can only be entered once in the world champs. So I would then look for the next fastest time not set by "John Smith", and so on until I have 48 unique athletes.
hope that makes sense.
thanks in advance if anyone can help
PS
I did have a nice screenshot that would explain it much better, but as a newish user I cannot post images.
I'll try a copy and paste version instead...
ID AthleteName AthleteID Time
1 Josh Lewis 3 11.99
2 Joe Dundee 4 11.31
3 Mark Danes 5 13.44
4 Josh Lewis 3 13.12
5 John Smith 1 11.12
6 John Smith 1 12.18
7 John Smith 1 11.22
8 Adam Bennett 6 11.33
9 Ronny Bower 7 12.88
10 John Smith 1 13.49
11 Adam Bennett 6 12.55
12 Mark Danes 5 12.12
13 Carl Tompkins 2 13.11
14 Joe Dundee 4 11.28
15 Ronny Bower 7 12.14
16 Carl Tompkins 2 11.88
17 Nigel Downs 8 14.14
18 Nigel Downs 8 12.19
Top 4 unique individual performances
1 John Smith 1 11.12
3 Joe Dundee 4 11.28
5 Adam Bennett 6 11.33
6 Carl Tompkins 2 11.88
Basically something like this:
select top 48 *
from (
select athleteId,min(time) as bestTime
from theRaces
where raceId = '123' -- e.g., 123=100 meters
group by athleteId
) x
order by bestTime
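The same idea can be checked against the sample rows from the question. This sketch runs it through Python's sqlite3 module as a stand-in engine, so it uses `limit 4` where the query above uses `top 48`. It also leaves out the raceId filter, because the sample table has no such column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table theRaces (ID integer, AthleteName text, AthleteID integer, Time real)")
conn.executemany(
    "insert into theRaces values (?, ?, ?, ?)",
    [
        (1, "Josh Lewis", 3, 11.99), (2, "Joe Dundee", 4, 11.31),
        (3, "Mark Danes", 5, 13.44), (4, "Josh Lewis", 3, 13.12),
        (5, "John Smith", 1, 11.12), (6, "John Smith", 1, 12.18),
        (7, "John Smith", 1, 11.22), (8, "Adam Bennett", 6, 11.33),
        (9, "Ronny Bower", 7, 12.88), (10, "John Smith", 1, 13.49),
        (11, "Adam Bennett", 6, 12.55), (12, "Mark Danes", 5, 12.12),
        (13, "Carl Tompkins", 2, 13.11), (14, "Joe Dundee", 4, 11.28),
        (15, "Ronny Bower", 7, 12.14), (16, "Carl Tompkins", 2, 11.88),
        (17, "Nigel Downs", 8, 14.14), (18, "Nigel Downs", 8, 12.19),
    ],
)

# One row per athlete (their best time), then the fastest four athletes overall.
top4 = conn.execute(
    """
    select AthleteID, min(Time) as bestTime
    from theRaces
    group by AthleteID
    order by bestTime
    limit 4
    """
).fetchall()
```

`top4` lists athletes 1, 4, 6 and 2 with 11.12, 11.28, 11.33 and 11.88, the same four performances as the "Top 4" table in the question.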
try this --
select x.ID, x.AthleteName, x.AthleteID, x.Time
from (
select rownum tr_count,v.AthleteID AthleteID, v.AthleteName AthleteName, v.Time Time,v.id id
from
(
select
tr1.AthleteName AthleteName, tr1.Time time,min(tr1.id) id, tr1.AthleteID AthleteID
from theRaces tr1
where time =
(select min(time) from theRaces tr2 where tr2.athleteId = tr1.athleteId)
group by tr1.AthleteName, tr1.AthleteID, tr1.Time
having tr1.Time = ( select min(tr2.time) from theRaces tr2 where tr1.AthleteID =tr2.AthleteID)
order by tr1.time
) v
) x
where x.tr_count <= 48