Checking for maximum value in the same column repeatedly and replacing with new found maximum - sql

I'm trying to find the maximum value in a column and keep that value until a larger one appears, at which point the new value replaces it. The data lives in MS SQL Server, so this needs to be done in SQL.
The algorithm would basically keep a global variable MaxDelay and overwrite it whenever a new maximum is found, or run two 'for' loops, one over the current row and one over row-1, until the next project is found.
Should I be using a mix of a recursive query and a function in SQL?
I tried Lag() but I'm unable to use Max() of Lag() due to windowed functions limitation.
SELECT *,
(select max(v) from (VALUES([Delay],Lag([Delay], 1) OVER(ORDER BY [Milestones], [Project Name])))as value(v)) as [MaxDate]
FROM dbo.[Scenario Testing with Previous Row]
error:
Msg 4108, Level 15, State 1, Line 2
Windowed functions can only appear in the SELECT or ORDER BY clauses.
Would appreciate if someone could give me some pointers.
Project Name  Milestones  Baseline Date  Actual Date  Delay  Max Delay  Worst Case Date
Project 1     MS_1        12/12/2016     15/12/2016   3      3          15/12/2016
Project 1     MS_2        14/12/2016     16/12/2016   2      3          17/12/2016
Project 1     MS_3        31/12/2016     09/01/2017   9      9          09/01/2017
Project 1     MS_4        11/01/2017     12/01/2017   1      9          20/01/2017
Project 1     MS_5        21/01/2017     24/01/2017   3      9          30/01/2017
Project 1     MS_6        01/02/2017     15/02/2017   14     14         15/02/2017
Project 1     MS_7        15/02/2017     16/02/2017   1      14         01/03/2017
Project 1     MS_8        26/02/2017     26/02/2017   0      14         12/03/2017
Project 1     MS_9        31/03/2017     31/03/2017   0      14         14/04/2017

If you want a cumulative max, then use the appropriate function. Something like this:
select t.*,
max(delay) over (order by [Milestones], [Project Name]) as running_max
from dbo.[Scenario Testing with Previous Row] t;
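As a quick check of what a cumulative max should return for the sample data, here is a minimal Python sketch. It is not part of the original answer; the list is simply the Delay column copied from the question in milestone order.

```python
from itertools import accumulate

# Delay column from the question's sample data, in milestone order (MS_1 .. MS_9).
delays = [3, 2, 9, 1, 3, 14, 1, 0, 0]

# accumulate(..., max) keeps the largest value seen so far, which is the
# behaviour expected from MAX(delay) OVER (ORDER BY ...).
running_max = list(accumulate(delays, max))

print(running_max)  # [3, 3, 9, 9, 9, 14, 14, 14, 14]
```

The output matches the Max Delay column in the question. If the table holds more than one project, the window would probably also need a PARTITION BY [Project Name] so that the maximum resets per project; that is an assumption about the data, not something the question states.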

Related

Calculate overlap time in seconds for groups in SQL

I have a bunch of timestamps grouped by ID and type in the sample data shown below.
I would like to find overlapped time between start_time and end_time columns in seconds for each group of ID and between each lead and follower combinations. I would like to show the overlap time only for the first record of each group which will always be the "lead" type.
For example, for ID 1, the follower's start and end times in row 3 overlap with the lead's in row 1 for 193 seconds (from 09:00:00 to 09:03:13). The follower's times in row 3 also overlap with the lead's in row 2 for 133 seconds (from 09:01:00 to 09:03:13). That's a total of 326 seconds (193 + 133).
I used the partition clause to rank rows by ID and type and order them by start_time as a start.
How do I get the overlap column?
row#  ID  type      start_time           end_time             rank  overlap
1     1   lead      2020-05-07 09:00:00  2020-05-07 09:03:34  1     326
2     1   lead      2020-05-07 09:01:00  2020-05-07 09:03:13  2
3     1   follower  2020-05-07 08:59:00  2020-05-07 09:03:13  1
4     2   lead      2020-05-07 11:23:00  2020-05-07 11:33:00  1     540
4     2   follower  2020-05-07 11:27:00  2020-05-07 11:32:00  1
5     3   lead      2020-05-07 14:45:00  2020-05-07 15:00:00  1     305
6     3   follower  2020-05-07 14:44:00  2020-05-07 14:44:45  1
7     3   follower  2020-05-07 14:50:00  2020-05-07 14:55:05  2
In your example, the times completely cover the total duration. If this is always true, you can use the following logic:
select id,
sum(datediff(second, start_time, end_time)) -
datediff(second, min(start_time), max(end_time)) as overlap
from t
group by id;
To add this as an additional column, either use window functions or join in the result from the above query.
If the overall time has gaps, then the problem is quite a bit more complicated. I would suggest that you ask a new question and set up a db fiddle for the problem.
I tried this a couple of ways and got it to work.
I first joined two tables holding the individual records for each type, 'lead' and 'follower', and wrote CASE expressions that take the later start time and the earlier end time for each lead and follower combination. I stored the result in a temp table.
CASE
WHEN lead_table.start_time > follower_table.start_time THEN lead_table.start_time
ELSE follower_table.start_time -- later of the two start times (also covers equal values)
END as overlap_start_time,
CASE
WHEN follower_table.end_time < lead_table.end_time THEN follower_table.end_time
ELSE lead_table.end_time -- earlier of the two end times (also covers equal values)
END as overlap_end_time
Then I wrote an outer query against that temp table to find the difference in seconds between the overlap start and end times for each lead and follower combination:
select temp_table.id,
temp_table.overlap_start_time,
temp_table.overlap_end_time,
DATEDIFF_BIG(second,
temp_table.overlap_start_time,
temp_table.overlap_end_time) as overlap_time FROM temp_table
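The pairwise idea (later start, earlier end, then the difference) can be checked against the ID 1 rows with a small Python sketch. The helper names overlap_seconds and ts are illustrative only and do not come from the thread. Unlike the query above, the sketch clamps the result at zero, so a lead and follower that do not intersect contribute nothing instead of a negative number.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def ts(text):
    """Parse a timestamp in the format used by the sample data."""
    return datetime.strptime(text, FMT)

def overlap_seconds(a_start, a_end, b_start, b_end):
    """Seconds shared by two intervals; 0 when they do not intersect."""
    start = max(a_start, b_start)  # later of the two start times
    end = min(a_end, b_end)        # earlier of the two end times
    return max(0, int((end - start).total_seconds()))

# Rows for ID 1 from the question.
leads = [
    (ts("2020-05-07 09:00:00"), ts("2020-05-07 09:03:34")),
    (ts("2020-05-07 09:01:00"), ts("2020-05-07 09:03:13")),
]
followers = [(ts("2020-05-07 08:59:00"), ts("2020-05-07 09:03:13"))]

# Sum the overlap over every lead/follower combination.
total = sum(
    overlap_seconds(ls, le, fs, fe)
    for ls, le in leads
    for fs, fe in followers
)
print(total)  # 326
```

The two pairs contribute 193 and 133 seconds, which matches the 326 seconds described in the question.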

How do i join the last record from one table where the date is older than other table?

This is my first post here, and the first problem I haven't been able to solve on my own. I have a MainTable that contains the fields Date and MinutesActiveWork (and other fields that are not relevant). I have a second table that contains the fields ID, id_Workarea, GoalOfActiveMinutes and GoalActiveFrom.
I want to write a query that returns all records from MainTable, together with the goal that was active on each date.
Example:
Maintable (Date = dd/mm/yyyy)
ID Date ActvWrkMin WrkAreaID
1 01-01-2019 45 1
2 02-01-2019 50 1
3 03-01-2019 48 1
GoalTable:
ID id_Workarea Goal GlActvFrm
1 1 45 01-01-2019
2 2 90 01-01-2019
3 1 50 03-01-2019
What i want from my query:
IDMain Date ActvWrkMin Goal WrkAreaID
1 01-01-2019 45 45 1
2 02-01-2019 50 45 1
3 03-01-2019 48 50 1
The query that I have now is really close to what I want. The problem is that it outputs every goal whose start date is earlier than the date from MainTable (it makes sense why, but I don't know what criteria to add to fix it). Like so:
IDMain Date ActvWrkMin Goal WrkAreaID
1 01-01-2019 45 45 1
2 02-01-2019 50 45 1
3 03-01-2019 48 45 1 <-- Dont want this one
3 03-01-2019 48 50 1
My query
SELECT tblMain.Date, tblMain.ActiveWorkMins, tblGoal.Goal
FROM VtblSumpMain AS tblMain LEFT JOIN (
SELECT VtblGoalsForWorkareas.idWorkArea, VtblGoalsForWorkareas.Goal, VtblGoalsForWorkareas.GoalActiveFrom (THIS IS THE DATE FIELD)
FROM VtblGoalsForWorkareas
WHERE VtblGoalsForWorkareas.idWorkArea= 1) AS tblGoal ON tblMain.Date > tblGoal.GoalActiveFrom
ORDER BY tblMain.Date
(I know I could do this pretty simply with DLookup, but that is just not fast enough.)
Thanks for any advice!
For this, I think you need a correlated subquery that picks the most recent goal on or before each date, as shown below.
select tblMain.id,tblMain.Date,tblMain.ActvWrkMin, tblMain.WrkAreaID,
(select top 1 Goal
from GoalTable as gtbl
where gtbl.id_workarea = 1
and tblmain.[Date] >= gtbl.glActvFrm order by gtbl.glActvFrm desc) as Goal
from Maintable as tblMain
Check the below image for the result which is generated from this query.
I hope this will solve your issue.
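The rule the subquery applies (take the goal with the latest start date that is not after the row's date) can be sanity-checked with a short Python sketch. The function name goal_on is made up for this illustration, and the data is the work area 1 subset of GoalTable from the question.

```python
from bisect import bisect_right
from datetime import date

# (GlActvFrm, Goal) rows for work area 1, sorted by start date.
goals = [(date(2019, 1, 1), 45), (date(2019, 1, 3), 50)]
goal_starts = [start for start, _ in goals]

def goal_on(day):
    """Return the goal whose start date is the latest one <= day, or None."""
    i = bisect_right(goal_starts, day)  # number of goals that started on or before day
    return goals[i - 1][1] if i else None

main_dates = [date(2019, 1, 1), date(2019, 1, 2), date(2019, 1, 3)]
result = [goal_on(d) for d in main_dates]
print(result)  # [45, 45, 50]
```

Each date gets exactly one goal, which is the output the question asks for. The comparison is inclusive (<=), matching the >= in the subquery, so a goal applies from its start date onward.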

How to calculate a running total that is a distinct sum of values

Consider this dataset:
id site_id type_id value date
------- ------- ------- ------- -------------------
1 1 1 50 2017-08-09 06:49:47
2 1 2 48 2017-08-10 08:19:49
3 1 1 52 2017-08-11 06:15:00
4 1 1 45 2017-08-12 10:39:47
5 1 2 40 2017-08-14 10:33:00
6 2 1 30 2017-08-09 07:25:32
7 2 2 32 2017-08-12 04:11:05
8 3 1 80 2017-08-09 19:55:12
9 3 2 75 2017-08-13 02:54:47
10 2 1 25 2017-08-15 10:00:05
I would like to construct a query that returns a running total for each date by type. I can get close with a window function, but I only want the latest value for each site to be summed for the running total (a simple window function will not work because it sums all values up to a date--not just the last values for each site). So I guess it could be better described as a running distinct total?
The result I'm looking for would be like this:
type_id date sum
------- ------------------- -------
1 2017-08-09 06:49:47 50
1 2017-08-09 07:25:32 80
1 2017-08-09 19:55:12 160
1 2017-08-11 06:15:00 162
1 2017-08-12 10:39:47 155
1 2017-08-15 10:00:05 150
2 2017-08-10 08:19:49 48
2 2017-08-12 04:11:05 80
2 2017-08-13 02:54:47 155
2 2017-08-14 10:33:00 147
The key here is that the sum is not a running sum. It should only be the sum of the most recent values for each site, by type, at each date. I think I can help explain it by walking through the result set I've provided above. For my explanation, I'll walk through the original data chronologically and try to explain the expected result.
The first row of the result starts us off, at 2017-08-09 06:49:47, where chronologically, there is only one record of type 1 and it is 50, so that is our sum for 2017-08-09 06:49:47.
The second row of the result is at 2017-08-09 07:25:32, at this point in time we have 2 unique sites with values for type_id = 1. They have values of 50 and 30, so the sum is 80.
The third row of the result occurs at 2017-08-09 19:55:12, where now we have 3 sites with values for type_id = 1. 50 + 30 + 80 = 160.
The fourth row is where it gets interesting. At 2017-08-11 06:15:00 there are 4 records with a type_id = 1, but 2 of them are for the same site. I'm only interested in the most recent value for each site so the values I'd like to sum are: 30 + 80 + 52 resulting in 162.
The 5th row is similar to the 4th since the value for site_id:1, type_id:1 has changed again and is now 45. This results in the latest values for type_id:1 at 2017-08-12 10:39:47 are now: 30 + 80 + 45 = 155.
Reviewing the 6th row is also interesting when we consider that at 2017-08-15 10:00:05, site 2 has a new value for type_id 1, which gives us: 80 + 45 + 25 = 150 for 2017-08-15 10:00:05.
You can get a cumulative total (running total) by including an ORDER BY clause in your window frame.
select
type_id,
date,
sum(value) over (partition by type_id order by date) as sum
from your_table;
The ORDER BY works because
The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
SELECT type_id,
date,
SUM(value) OVER (PARTITION BY type_id ORDER BY type_id, date) - (SUM(value) OVER (PARTITION BY type_id, site_id ORDER BY type_id, date) - value) AS sum
FROM your_table
ORDER BY type_id,
date
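To check any candidate query against the expected result, the requirement can be restated procedurally: walk the rows in date order, remember the latest value per site for each type, and sum those remembered values. The following Python sketch does that for the sample data; latest_value_totals is a hypothetical helper name, not something from the thread.

```python
# (site_id, type_id, value, date) rows copied from the question.
rows = [
    (1, 1, 50, "2017-08-09 06:49:47"),
    (1, 2, 48, "2017-08-10 08:19:49"),
    (1, 1, 52, "2017-08-11 06:15:00"),
    (1, 1, 45, "2017-08-12 10:39:47"),
    (1, 2, 40, "2017-08-14 10:33:00"),
    (2, 1, 30, "2017-08-09 07:25:32"),
    (2, 2, 32, "2017-08-12 04:11:05"),
    (3, 1, 80, "2017-08-09 19:55:12"),
    (3, 2, 75, "2017-08-13 02:54:47"),
    (2, 1, 25, "2017-08-15 10:00:05"),
]

def latest_value_totals(rows):
    """For each row, sum the most recent value of every site of the same type."""
    latest = {}  # type_id -> {site_id: most recent value}
    out = []
    # ISO-formatted timestamps sort chronologically as plain strings.
    for site_id, type_id, value, when in sorted(rows, key=lambda r: r[3]):
        latest.setdefault(type_id, {})[site_id] = value
        out.append((type_id, when, sum(latest[type_id].values())))
    return sorted(out)  # order by type_id, then date

totals = latest_value_totals(rows)
for type_id, when, total in totals:
    print(type_id, when, total)
```

For type 1 this yields 50, 80, 160, 162, 155, 150 and for type 2 it yields 48, 80, 155, 147, the same values as the expected result in the question.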

Getting date difference between consecutive rows in the same group

I have a database with the following data:
Group ID Time
1 1 16:00:00
1 2 16:02:00
1 3 16:03:00
2 4 16:09:00
2 5 16:10:00
2 6 16:14:00
I am trying to find the difference in times between the consecutive rows within each group. Using LAG() and DATEDIFF() (ie. https://stackoverflow.com/a/43055820), right now I have the following result set:
Group ID Difference
1 1 NULL
1 2 00:02:00
1 3 00:01:00
2 4 00:06:00
2 5 00:01:00
2 6 00:04:00
However I need the difference to reset when a new group is reached, as in below. Can anyone advise?
Group ID Difference
1 1 NULL
1 2 00:02:00
1 3 00:01:00
2 4 NULL
2 5 00:01:00
2 6 00:04:00
The code would look something like:
select t.*,
datediff(second, lag(time) over (partition by group order by id), time)
from t;
This returns the difference as a number of seconds, but you seem to know how to convert that to a time representation. You also seem to know that group is not acceptable as a column name, because it is a SQL keyword.
Based on the question, you have put group in the order by clause of the lag(), not the partition by.
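The effect of partitioning can be illustrated outside SQL as well. This Python sketch, written for this illustration and not taken from the answer, remembers the previous time separately for each group, so the first row of every group has no predecessor and gets None, just as LAG returns NULL at the start of a partition.

```python
from datetime import datetime

# (group, id, time) rows copied from the question.
rows = [
    (1, 1, "16:00:00"),
    (1, 2, "16:02:00"),
    (1, 3, "16:03:00"),
    (2, 4, "16:09:00"),
    (2, 5, "16:10:00"),
    (2, 6, "16:14:00"),
]

def diffs_within_group(rows):
    """Seconds since the previous row of the same group; None for the first row."""
    previous = {}  # group -> time of the previous row seen in that group
    out = []
    for group, row_id, text in sorted(rows, key=lambda r: (r[0], r[1])):
        current = datetime.strptime(text, "%H:%M:%S")
        prev = previous.get(group)
        seconds = None if prev is None else int((current - prev).total_seconds())
        out.append((group, row_id, seconds))
        previous[group] = current
    return out

diffs = diffs_within_group(rows)
print(diffs)
# [(1, 1, None), (1, 2, 120), (1, 3, 60), (2, 4, None), (2, 5, 60), (2, 6, 240)]
```

Row 4 now shows None instead of the 6-minute gap to row 3, which is the reset the question asks for.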

Max date among records and across tables - SQL Server

I tried my best to provide the data in table format, but it does not render well on Stack Overflow, so I am attaching a snapshot of the two tables. Apologies for the formatting.
SQL Server 2012
MS Table
mId tdId name dueDate
1 1 forecastedDate 1/1/2015
2 1 hypercareDate 11/30/2016
3 1 LOE 1 7/4/2016
4 1 LOE 2 7/4/2016
5 1 demo for yy test 10/15/2016
6 1 Implementation – testing 7/4/2016
7 1 Phased Rollout – final 7/4/2016
8 2 forecastedDate 1/7/2016
9 2 hypercareDate 11/12/2016
10 2 domain - Forte NULL
11 2 Fortis completion 1/1/2016
12 2 Certification NULL
13 2 Implementation 7/4/2016
-----------------------------------------------
MSRevised
mId revisedDate
1 1/5/2015
1 1/8/2015
3 3/25/2017
2 2/1/2016
2 12/30/2016
3 4/28/2016
4 4/28/2016
5 10/1/2016
6 7/28/2016
7 7/28/2016
8 4/28/2016
9 8/4/2016
9 5/28/2016
11 10/4/2016
11 10/5/2016
13 11/1/2016
----------------------------------------
The required output is
1. Will be passing the 'tId' number, for instance 1, lets call it tid (1)
2. Want to compare tId (1)'s all milestones (except hypercareDate) with tid(1)'s forecastedDate milestone
3. return if any of the milestone date (other than hypercareDate) is greater than the forecastedDate
The above 3 steps are simple, but I have to first compare the milestones date with its corresponding revised dates, if any, from the revised table, and pick the max date among all that needs to be compared with the forecastedDate
I managed to solve this. Posting the answer in the hope that it helps somebody.
-- Insert the milestones to compare (everything except forecastedDate and hypercareDate) into a temp table
INSERT INTO #mstab
SELECT [mId]
, [tId]
, [msDate]
FROM [dbo].[MS]
WHERE ([msName] NOT LIKE 'forecastedDate' AND [msName] NOT LIKE 'hypercareDate')
-- This scalar function gets the max date between the forecasted due date and the forecasted revised date
SELECT @maxForecastedDate = [dbo].[fnGetMaxDate] ('forecastedDate');
-- This gets the max date from the temp table, to be compared with the forecasted date
SET @maxmilestoneDate = (SELECT MAX(maxDate)
FROM ( SELECT ms.msDueDate AS dueDate
, mr.msRevisedDate AS revDate
FROM #mstab as ms
LEFT JOIN [MSRev] as mr on ms.msId = mr.msId
) maxDate
UNPIVOT (maxDate FOR DateCols IN (dueDate, revDate))up );