Label rows by most time spent in a time interval - pandas

I have a data set of actions, each with a start and end time. I'd like to label each row with a part of day (morning, noon, evening, night). Since some actions start in one part and end in another, I'd like to label each row by where most of the time was spent.
Say that morning is 6am-11am and noon is 11am-2pm, and I have an action from 10:30am to 1pm: it should be labeled as noon.
One approach I thought of was to create a column for each part of day, calculate the number of seconds spent in each part (per row), then use idxmax to find the part of day. But then how do I calculate the time overlap between (start, stop) and the part of day?

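import pandas as pd

df = pd.DataFrame([[0, 4],
                   [2, 5.2],
                   [0.2, 6],
                   [3, 4.1]], columns=['start', 'end'])
periods = {'morning': (0, 3),
           'afternoon': (3, 6)}
for name, (start, stop) in periods.items():
    # Overlap with the period = earliest end minus latest start
    df['i_start'] = start
    df['i_end'] = stop
    overlap = df[['end', 'i_end']].min(axis=1) - df[['start', 'i_start']].max(axis=1)
    # A negative value means no overlap; those cells are left as NaN
    df.loc[overlap >= 0, name] = overlap[overlap >= 0]
# Label each row with the period it overlaps the most
result = df[list(periods)].idxmax(axis=1)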
... should do the job (as long as you don't have actions spanning from one day to the next).
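The caveat about actions spanning midnight can be illustrated with a minimal standard-library sketch. It is not the answer's pandas code: the function names are hypothetical, and only the "morning" and "noon" boundaries come from the question, while "evening" and "night" are assumed values for illustration. The overlap rule is the same one used above (earliest end minus latest start, floored at zero), applied to every calendar day the action touches.

```python
from datetime import datetime, timedelta

# Hours since midnight. "evening" and "night" are assumed boundaries;
# "night" runs from 22:00 to 06:00 the next day, written as hour 30.
PERIODS = {
    "morning": (6, 11),
    "noon": (11, 14),
    "evening": (14, 22),
    "night": (22, 30),
}


def overlap_seconds(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals, never negative."""
    return max(0.0, (min(a_end, b_end) - max(a_start, b_start)).total_seconds())


def label_part_of_day(start, end, periods=PERIODS):
    """Return the name of the period in which [start, end] spends the most time."""
    totals = dict.fromkeys(periods, 0.0)
    # Begin one day early so a "night" window that started yesterday is counted.
    day = datetime.combine(start.date(), datetime.min.time()) - timedelta(days=1)
    while day <= end:
        for name, (h0, h1) in periods.items():
            totals[name] += overlap_seconds(
                start, end, day + timedelta(hours=h0), day + timedelta(hours=h1)
            )
        day += timedelta(days=1)
    # On a tie, the period listed first in `periods` wins.
    return max(totals, key=totals.get)


print(label_part_of_day(datetime(2023, 1, 2, 10, 30), datetime(2023, 1, 2, 13, 0)))
```

With the question's example (10:30am to 1pm), half an hour falls in the morning and two hours in noon, so the label is "noon".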

Related

Plotly: highlight regular trading market hours

Goal: highlight regular trading market hours in a plotly chart.
- Using a df with standard datetime and 1 minute intervals.
- Regular trading hours = 9:30am EST to 4pm EST
- In case you're interested:
  - pre market = 4am to 9:30am
  - post market = 4pm to 8pm
Stack Overflow has good answers on highlighting weekend data, but the one I wanted to link was removed by its author just as I tried to post it. In any case, it is too difficult for me to translate that approach to specific times of day.
This is relatively easy to do using fig.add_vrect()
I built a similar highlighting system for night and day:
from datetime import timedelta

import pandas as pd

# `df` has a "time" index level and `fig` is an existing plotly figure.
time = df.index.get_level_values("time")
# Getting info for plotting the day/night indicator
# time[0].date() picks out 00:00 (midnight) and then we add 6 hours to get 6 am.
start_morning = pd.to_datetime(time[0].date()) + pd.Timedelta(hours=6)
# 30 hours after midnight of the last day is 6 am on the following day.
end_morning = pd.to_datetime(time[-1].date()) + pd.Timedelta(hours=30)
num_mornings = (end_morning - start_morning).days
# Now we build up the morning times, every day at 6 am
mornings = [start_morning + timedelta(days=x) for x in range(num_mornings)]
for morning in mornings:
    fig.add_vrect(
        # Highlighted region starts at 6 am and ends at 6 pm, every day.
        x0=morning,
        x1=morning + timedelta(hours=12),
        fillcolor="white",
        opacity=0.1,
        line_width=0,
    )
For you, it would just be a simple matter of adjusting the times. So for instance, for 9:30 am you can use
morning = pd.to_datetime(time[0].date()) + pd.Timedelta(hours=9.5)
to get the first day of your data, at 9:30 am. Now in fig.add_vrect() use
x0= morning
x1= morning + timedelta(hours=6.5)
to highlight between 9:30 am and 4 pm.
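To cover every day in the data rather than only the first, the (x0, x1) pairs can be generated up front. The sketch below uses only the standard library; `session_windows` is a hypothetical helper, and skipping Saturdays and Sundays is an assumption on my part (market holidays are not handled). Each returned pair could then be passed as x0 and x1 to fig.add_vrect() in a loop like the one above.

```python
from datetime import date, datetime, time, timedelta


def session_windows(first_day, last_day, open_at=time(9, 30), close_at=time(16, 0)):
    """Return one (open, close) datetime pair per weekday from first_day to last_day."""
    windows = []
    day = first_day
    while day <= last_day:
        if day.weekday() < 5:  # Monday=0 ... Friday=4; weekends are skipped
            windows.append(
                (datetime.combine(day, open_at), datetime.combine(day, close_at))
            )
        day += timedelta(days=1)
    return windows


# Friday 2023-01-06 through Monday 2023-01-09 yields two sessions.
for x0, x1 in session_windows(date(2023, 1, 6), date(2023, 1, 9)):
    print(x0, x1)
```

Pre-market and post-market bands would follow the same pattern with different `open_at` and `close_at` values.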

Adding Period to startDate doesn't produce endDate

I have two LocalDates declared as following:
val startDate = LocalDate.of(2019, 10, 31) // 2019-10-31
val endDate = LocalDate.of(2019, 9, 30) // 2019-09-30
Then I calculate the period between them using Period.between function:
val period = Period.between(startDate, endDate) // P-1M-1D
Here the period has the negative amount of months and days, which is expected given that endDate is earlier than startDate.
However when I add that period back to the startDate, the result I'm getting is not the endDate, but the date one day earlier:
val endDate1 = startDate.plus(period) // 2019-09-29
So the question is, why doesn't the invariant
startDate.plus(Period.between(startDate, endDate)) == endDate
hold for these two dates?
Is it Period.between who returns an incorrect period, or LocalDate.plus who adds it incorrectly?
If you look how plus is implemented for LocalDate
@Override
public LocalDate plus(TemporalAmount amountToAdd) {
    if (amountToAdd instanceof Period) {
        Period periodToAdd = (Period) amountToAdd;
        return plusMonths(periodToAdd.toTotalMonths()).plusDays(periodToAdd.getDays());
    }
    ...
}
you'll see plusMonths(...) and plusDays(...) there.
plusMonths handles cases when one month has 31 days, and the other has 30. So the following code will print 2019-09-30 instead of non-existent 2019-09-31
println(startDate.plusMonths(period.months.toLong()))
After that, subtracting one day results in 2019-09-29. This is the correct result, since 2019-09-29 and 2019-10-31 are 1 month 1 day apart
The Period.between calculation is weird and in this case boils down to
LocalDate end = LocalDate.from(endDateExclusive);
long totalMonths = end.getProlepticMonth() - this.getProlepticMonth();
int days = end.day - this.day;
long years = totalMonths / 12;
int months = (int) (totalMonths % 12); // safe
return Period.of(Math.toIntExact(years), months, days);
where getProlepticMonth is the total number of months counted from year 0 (year * 12 + month - 1). In this case, the result is -1 month and -1 day.
From my understanding, it's a bug in a Period.between and LocalDate#plus for negative periods interaction, since the following code has the same meaning
val startDate = LocalDate.of(2019, 10, 31)
val endDate = LocalDate.of(2019, 9, 30)
val period = Period.between(endDate, startDate)
println(endDate.plus(period))
but it prints the correct 2019-10-31.
The problem is that LocalDate#plusMonths normalises the date so that it is always valid. In the following code, you can see that after subtracting 1 month from 2019-10-31 the result is 2019-09-31, which is then normalised to 2019-09-30
public LocalDate plusMonths(long monthsToAdd) {
    ...
    return resolvePreviousValid(newYear, newMonth, day);
}
private static LocalDate resolvePreviousValid(int year, int month, int day) {
    switch (month) {
        ...
        case 9:
        case 11:
            day = Math.min(day, 30);
            break;
    }
    return new LocalDate(year, month, day);
}
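The two-step behaviour described here (shift the months with clamping, then shift the days) can be reproduced outside Java. The following is a rough Python imitation for illustration only, not java.time itself; `plus_months` and `plus_period` are hypothetical names.

```python
import calendar
from datetime import date, timedelta


def plus_months(d, months):
    """Shift by whole months, clamping the day to the last valid day of the target month."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def plus_period(d, months, days):
    """Mimic the order described above: months first, then days."""
    return plus_months(d, months) + timedelta(days=days)


start = date(2019, 10, 31)
end = date(2019, 9, 30)

print(plus_months(start, -1))      # 2019-09-30 (2019-09-31 does not exist)
print(plus_period(start, -1, -1))  # 2019-09-29, one day before `end`
print(plus_period(end, 1, 1))      # 2019-10-31, the forward direction round-trips
```

The forward direction works because adding one month to 2019-09-30 needs no clamping, whereas going backwards from 2019-10-31 does.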
I believe that you are simply out of luck. The invariant that you have invented sounds reasonable, but doesn’t hold in java.time.
It seems that the between method just subtracts the month numbers and the days of month and, since the results have the same sign, is content with this result. I think I agree that a better decision could probably have been taken here, but as @Meno Hochschild has correctly stated, math involving the 29th, 30th or 31st of a month can hardly be clear-cut, and I dare not suggest what the better rule would have been.
I bet they are not going to change it now. Not even if you file a bug report (which you can always try). Too much code is already relying on how it’s been working for more than five and a half years.
Adding P-1M-1D back into the start date works the way I would have expected. Subtracting 1 month from (really adding –1 month to) October 31 yields September 30, and subtracting 1 day yields September 29. Again, it's not clear-cut; you could argue in favour of September 30 instead.
Analyzing your expectation (in pseudo code)
startDate.plus(Period.between(startDate, endDate)) == endDate
we have to discuss several topics:
- How to handle separate units like months or days?
- How is the addition of a duration (or "period") defined?
- How to determine the temporal distance (duration) between two dates?
- How is the subtraction of a duration (or "period") defined?
Let's first look at the units. Days are no problem because they are the smallest possible calendar unit, and every calendar date differs from any other date by a whole number of days. So in pseudo code the following always holds, whether the difference is positive or negative:
startDate.plus(ChronoUnit.DAYS.between(startDate, endDate), ChronoUnit.DAYS) == endDate
Months, however, are tricky because the Gregorian calendar defines calendar months with different lengths. So adding a whole number of months to a date can produce an invalid date:
[2019-08-31] + P1M = [2019-09-31]
The decision of java.time to reduce the end date to a valid one - here [2019-09-30] - is reasonable and corresponds to the expectations of most users because the final date still preserves the calculated month. However, this addition including an end-of-month-correction is NOT reversible, see the reverted operation called subtraction:
[2019-09-30] - P1M = [2019-08-30]
The result is also reasonable because a) the basic rule of month addition is to keep the day-of-month as much as possible and b) [2019-08-30] + P1M = [2019-09-30].
What is the addition of a duration (period) exactly?
In java.time, a Period is a composition of items consisting of years, months and days with any integer partial amounts. So the addition of a Period can be resolved to the addition of the partial amounts to the starting date. Since years are always convertible to 12-multiples of months, we can first combine years and months and then add the total in one step in order to avoid strange side effects in leap years. The days can be added in the last step. A reasonable design as done in java.time.
How to determine the right Period between two dates?
Let's first discuss the case when the duration is positive, meaning the starting date is before the ending date. Then we can always define the duration by first determining the difference in months and then in days. This order is important to achieve a month component because otherwise every duration between two dates would only consist of days. Using your example dates:
[2019-09-30] + P1M1D = [2019-10-31]
Technically, the starting date is first moved forward by the calculated difference in months between start and end. Then the day delta as difference between the moved start date and the end date is added to the moved start date. This way we can calculate the duration as P1M1D in the example. So far so reasonable.
How to subtract a duration?
The most interesting point in the previous addition example is that, by accident, there is NO end-of-month-correction. Nevertheless, java.time fails to do the reverse subtraction.
It first subtracts the months and then the days:
[2019-10-31] - P1M1D = [2019-09-29]
If java.time had instead tried to reverse the steps in the addition before then the natural choice would have been to first subtract the days and then the months. With this changed order, we would get [2019-09-30]. The changed order in the subtraction would help as long as there was no end-of-month-correction in the corresponding addition step. This is especially true if the day-of-month of any starting or ending date is not bigger than 28 (the minimum possible month length). Unfortunately java.time has defined another design for the subtraction of Period which leads to less consistent results.
Is the addition of a duration reversible in the subtraction?
First we have to understand that the suggested changed order in the subtraction of a duration from a given calendar date does not guarantee the reversibility of the addition. Counter example which has an end-of-month-correction in the addition:
[2011-03-31] + P3M1D = [2011-06-30] + P1D = [2011-07-01] (ok)
[2011-07-01] - P3M1D = [2011-06-30] - P3M = [2011-03-30] :-(
Changing the order is not bad because it yields more consistent results. But how to cure the remaining deficiencies? The only way left is to change the calculation of the duration, too. Instead of using P3M1D, we can see that the duration P2M31D will work in both directions:
[2011-03-31] + P2M31D = [2011-05-31] + P31D = [2011-07-01] (ok)
[2011-07-01] - P2M31D = [2011-05-31] - P2M = [2011-03-31] (ok)
So the idea is to change the normalization of the computed duration. This can be done by looking if the addition of the computed month delta is reversible in a subtraction step - i.e. avoids the need for an end-of-month-correction. java.time does unfortunately not offer such a solution. It is not a bug, but can be considered as a design limitation.
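The idea of backing off the month count until the month step needs no end-of-month-correction can be sketched in a few lines. This is a hedged Python illustration of the reasoning above for start <= end, not the algorithm Time4J actually uses; all function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def plus_months(d, months):
    """Shift by whole months, clamping the day to the last valid day of the target month."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def add(d, months, days):
    """Addition: months first, then days."""
    return plus_months(d, months) + timedelta(days=days)


def subtract_days_first(d, months, days):
    """Subtraction in the reverse order of the addition: days first, then months."""
    return plus_months(d - timedelta(days=days), -months)


def reversible_between(start, end):
    """Return (months, days) for start <= end that round-trips in both directions."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # Do not overshoot the end date.
    while months > 0 and plus_months(start, months) > end:
        months -= 1
    # Back off until the month step is reversible (no end-of-month correction).
    while months > 0 and plus_months(plus_months(start, months), -months) != start:
        months -= 1
    return months, (end - plus_months(start, months)).days


d1, d2 = date(2011, 3, 31), date(2011, 7, 1)
print(subtract_days_first(d2, 3, 1))   # 2011-03-30, P3M1D is not reversible
months, days = reversible_between(d1, d2)
print(months, days)                    # 2 31, i.e. P2M31D
print(add(d1, months, days) == d2, subtract_days_first(d2, months, days) == d1)
```

Falling back to zero months always terminates the search, at the cost of a duration expressed purely in days.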
Alternatives?
I have enhanced my time library Time4J by reversible metrics which deploy the ideas given above. See following example:
PlainDate d1 = PlainDate.of(2011, 3, 31);
PlainDate d2 = PlainDate.of(2011, 7, 1);
TimeMetric<CalendarUnit, Duration<CalendarUnit>> metric =
Duration.inYearsMonthsDays().reversible();
Duration<CalendarUnit> duration =
metric.between(d1, d2); // P2M31D
Duration<CalendarUnit> invDur =
metric.between(d2, d1); // -P2M31D
assertThat(d1.plus(duration), is(d2)); // first invariance
assertThat(invDur, is(duration.inverse())); // second invariance
assertThat(d2.minus(duration), is(d1)); // third invariance

Excel Macro VBA: Add minutes and hours together where total can exceed 24 hours

hopefully this is a silly question with an easy answer.
I have no choice really what language I use, which is why I'm doing this in Excel with VBA.
I'm basically calculating total downtime hours over a month. I need to add small amounts of minutes together to find out a total that will be over 24 hours of course.
Here is the scenario:
Server A was down for 3 hours and 52 minutes this month.
Server B was down for 15 hours and 25 minutes this month.
Server B had 7 hours and 23 minutes downtime during a critical period, so this is multiplied by 3 to equate it to non-critical downtime.
Server A has: 3 hours 52 minutes at x1
Server B has: 8 hours 2 minutes at x1
Server B has: 7 hours 23 minutes at x3
All downtimes and restoration times are manually listed in a sheet in time formats recognised by excel, eg:
event 1 : 19/11/2017 5:00 : 19/11/2017 14:12
event 2 : 13/11/2017 6:00 : 13/11/2017 6:40
event 3 : 13/11/2017 7:57 : 13/11/2017 9:01
event 4 : 17/11/2017 6:15 : 18/11/2017 8:10
Weekends are not counted
Only minutes between 6am and 6pm are counted
Minutes increase in priority during certain time periods:
06:00-07:00, 07:00-09:00, 09:00-10:00, 10:00-14:00
High priority minutes are multiplied to equate peak time usage with lower standard time usage
I'm struggling to find a way to add times together to count hours; Excel tries to give answers relative to 01/01/1900 or some "real" date.
I'm going the opposite way, I have the real dates, I need to work with the hours between them. Is there a data format that is in plain hours:minutes?
I thought it was obvious but I'll state clearly in case, start time and end times are not necessarily on the same day. They can be any time, any relationship, sometimes start time will be after the end time due to how faults are reported. Obviously that counts as 0 minutes in that case.
My current methodology for attacking this problem is:
increase the start time until it becomes valid charge time
calculate the minutes until there is a change such as end of day or higher priority time slot, or start time = end time
add the calculated minutes to a total
increase the start time by the calculated minutes
start the cycle again from the new 'start time' and loop until there are no minutes remaining between start time and end time
startof:
    'move to start of next chargeable day, if not on a chargeable day
    'eg weekends, public holidays, easy function to write
    Do While testForChargeable() = False
        opnDate = DateAdd("d", 1, opnDate)
        opnTime = "06:00"
    Loop
    'check if open time is past the end of chargeable time, 18:00
    If (opnTime >= endofdayTime) Then
        'move to start of next chargeable day
        opnDate = DateAdd("d", 1, opnDate)
        opnTime = "06:00"
    End If
    'check if open time is after close time and fault is excluded
    If (opnDate >= bisDate) And (opnTime >= bisTime) Then
        GoTo last
    End If
    'check if close time is on same day as start time
    If DateDiff("d", opnDate, bisDate) = 0 Then
        'if it is, add minutes between opntime and bistime
        chargeTime = chargeTime + calculateChargeTime(opnTime, bisTime)
        'calculation ends, loop naturally terminates
    Else
        'if not, add remaining minutes of day to chargeable time
        chargeTime = chargeTime + calculateChargeTime(opnTime, endofdayTime)
        'move to start of next day
        opnDate = DateAdd("d", 1, opnDate)
        opnTime = "06:00"
        GoTo startof
    End If
last:
Cheers
Edit: Now that we're on the same page and I have what I think is a workable solution for you, I'll replace my previous answer [re: How Excel dates are related to value (ie., 1 day = 1)] with this one. The previous answer (and my computer messing up while trying to post it) is viewable in the Edit History.
So, you need a way to count minutes duration, between two DateTimes, and include or exclude sub-time-ranges based on criteria that might require ongoing adjustment, and you want this in a VBA function for use in automation of downtime data analysis.
Try this:
Option Explicit
Function MinsBetween(ByVal startDateTime As Date, ByVal stopDateTime As Date, _
                     ByVal count_StartTime As Date, ByVal count_StopTime As Date) As Long
    Dim startTime As Date, stopTime As Date
    'ignore dates, use only the times
    startTime = startDateTime - Int(startDateTime)
    stopTime = stopDateTime - Int(stopDateTime)
    If startTime >= count_StopTime Or stopTime <= count_StartTime Then
        'entire period falls outside of times to count
        MinsBetween = 0
        Exit Function
    End If
    'clamp the local copies to the counted window if necessary
    '(ByVal plus local variables, so the caller's values are never modified)
    If startTime < count_StartTime Then startTime = count_StartTime
    If stopTime > count_StopTime Then stopTime = count_StopTime
    'calculate & return minutes between (never return negative number)
    MinsBetween = Abs(DateDiff("n", startTime, stopTime))
End Function
This function counts only the minutes between startDateTime and stopDateTime that also fall between count_StartTime and count_StopTime.
Expects:
- count_StartTime & count_StopTime to be an Excel Time (or number between 0 and 1)
- startDateTime & stopDateTime to be an Excel Time or DateTime.
Returns a long integer. Could be referenced in VBA or as a worksheet function.
Example usage:
The outage 'event' occurred from 05:00 to 07:03 on 2017/11/19, but only the times between 6am and 6pm should be counted:
Debug.Print MinsBetween("2017/11/19 05:00", "2017/11/19 07:03", "06:00", "18:00")
The outage 'event' occurred from 05:00 to 14:12 on 2017/11/19. The minutes that fall within the [peak period] from 1pm to 3pm have higher priority and should be counted as "double-time":
Debug.Print (2 * MinsBetween("2017/11/19 05:00", "2017/11/19 14:12", "13:00", "15:00") )
As weekends are ignored entirely, those reports could be excluded with a simple check like this:
Function isWeekend(wDateTime As Date) As Boolean
    isWeekend = Weekday(DateValue(wDateTime)) = vbSaturday Or Weekday(DateValue(wDateTime)) = vbSunday
End Function
...returns TRUE if the supplied date (or datetime) falls on a weekend, otherwise returns FALSE.
You could use a combination of these functions to build sub or worksheet function around your custom criteria and adjust as needed.
For example:
Function DownTimeMinutes(startDateTime As Date, stopDateTime As Date) As Long
    'you could process your custom criteria for each start/stop period here
    Dim dtMinutes As Long
    'for example:
    'IGNORE DOWNTIME ON WEEKENDS
    If isWeekend(startDateTime) Then
        'ignore weekends
        DownTimeMinutes = 0
        Exit Function
    End If
    'COUNT MINUTES BETWEEN 6AM-6PM with "x1" multiplier
    dtMinutes = MinsBetween(startDateTime, stopDateTime, "06:00", "18:00")
    'DON'T COUNT LUNCH BREAK (or something like that)
    '(subtract these minutes from total)
    dtMinutes = dtMinutes - MinsBetween(startDateTime, stopDateTime, "12:00", "12:30")
    'COUNT MINUTES BETWEEN 14:00-15:00 as "x3"
    '(already counted as "x1" so add "2x these minutes")
    dtMinutes = dtMinutes + (2 * MinsBetween(startDateTime, stopDateTime, "14:00", "15:00"))
    'return adjusted minutes for this downtime event
    DownTimeMinutes = dtMinutes
End Function
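Because MinsBetween looks only at the time of day, an event that runs past midnight (such as event 4 in the question) needs to be split per day first. As a language-neutral illustration of that per-day clamping, here is a small Python sketch; it is not a translation of the VBA above, the function names are hypothetical, and the 06:00 to 18:00 window and weekend rule are taken from the question. It also shows one way to print totals above 24 hours as plain hours:minutes.

```python
from datetime import datetime, time, timedelta


def minutes_in_window(start, end, win_start, win_end, skip_weekends=True):
    """Minutes of [start, end) that fall inside the daily window [win_start, win_end)."""
    total = 0.0
    day = start.date()
    while day <= end.date():
        if not (skip_weekends and day.weekday() >= 5):  # 5 = Saturday, 6 = Sunday
            lo = max(start, datetime.combine(day, win_start))
            hi = min(end, datetime.combine(day, win_end))
            if hi > lo:  # an end before the start contributes nothing
                total += (hi - lo).total_seconds() / 60
        day += timedelta(days=1)
    return int(total)


def format_hours(minutes):
    """Format a minute count as H:MM; the hours may exceed 24."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}:{mins:02d}"


# Event 4 from the question: Friday 17/11/2017 06:15 to Saturday 18/11/2017 08:10.
event4 = minutes_in_window(
    datetime(2017, 11, 17, 6, 15), datetime(2017, 11, 18, 8, 10), time(6, 0), time(18, 0)
)
print(event4, format_hours(event4))  # 705 11:45
print(format_hours(1500))            # 25:00
```

A priority multiplier can be layered on in the same way as in DownTimeMinutes: call the function again with the narrower window and add the extra multiples.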
Side note: This is the short-story of the long-example I was getting at when I thought part of your issue was trouble converting varying M/D/Y , MM/DD/YY , M-DD-YYYY , etc, manual entries to DateTimes that Excel would recognize regardless of the user's Regional date settings.
As I understand it, you don't need it now but I figured I might as well add it to my answer anyway :
=DATE(MID(RIGHT(LEFT(A1,FIND(" ",A1)-1),LEN(LEFT(A1,FIND(" ",A1)-1))-FIND("/",LEFT(A1,FIND(" ",A1)))),FIND("/",RIGHT(LEFT(A1,FIND(" ",A1)-1),LEN(LEFT(A1,FIND(" ",A1)-1))-FIND("/",LEFT(A1,FIND(" ",A1)))))+1,4),LEFT(RIGHT(LEFT(A1,FIND(" ",A1)-1),LEN(LEFT(A1,FIND(" ",A1)-1))-FIND("/",LEFT(A1,FIND(" ",A1)))),FIND("/",RIGHT(LEFT(A1,FIND(" ",A1)-1),LEN(LEFT(A1,FIND(" ",A1)-1))-FIND("/",LEFT(A1,FIND(" ",A1)))))-1),FIND("/",RIGHT(LEFT(A1,FIND(" ",A1)-1),LEN(LEFT(A1,FIND(" ",A1)-1))-FIND("/",LEFT(A1,FIND(" ",A1))))))+TIMEVALUE(RIGHT(A1,LEN(A1)-FIND(" ",A1)))
Generally I don't condone the use of gigantic formulas (I was more concerned about getting it into a single function than about readability), and there are other ways to deal with date issues caused by Regional differences in shared workbooks (including Windows API), but in most cases I find text manipulation will do the job too.

Sum values on date table where one column equals a selected value from another

I have a DimDate table that has a Billable Day Portion field that can be between 0 and 1. For each day that's in the current Bonus Period I want to multiply that Day Portion by 10, and then return the total sum.
To find out what Bonus Period we're in, I return ContinuousBonusPeriod where the date equals today:
Current Continuous Bonus Period:= CALCULATE(MAX(DimDate[ContinuousBonusPeriod]), FILTER(DimDate, DimDate[DateKey] = TODAY()))
I can see in the measure display this is correctly coming back as Bonus Period 1. However, when I then use ContinuousBonusPeriod in the measure to determine the number of days in the current period, it only returns 10, 1 day multiplied by the static 10.
Billable Hours This Period:= CALCULATE(SUMX(DimDate, DimDate[Billable Day Portion] * 10), FILTER(DimDate, DimDate[ContinuousBonusPeriod] = [Current Continuous Bonus Period]))
It appears to be counting only today's DimDate record instead of all the records where ContinuousBonusPeriod = 'Bonus Period 1', as I'd expect.
I needed to make sure no existing filter was applied to the DimDate table when calculating the Current Continuous Bonus Period:
Current Continuous Bonus Period:= CALCULATE(MAX(DimDate[ContinuousBonusPeriod]), FILTER(ALL(DimDate), DimDate[DateKey] = TODAY()))
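(Notice the ALL() wrapper. Because the measure is referenced inside FILTER(DimDate, ...), it is evaluated once per DimDate row with that row acting as a filter, so without ALL() only today's row could ever return a matching value.)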

calculate difference in time in days, hours and minutes

UPDATE: I am updating the question to reflect the full solution. Using the time_diff gem Brett mentioned below, the following code worked.
code:
cur_time = Time.now.strftime('%Y-%m-%d %H:%M')
Time.diff(Time.parse('2011-08-12 09:00'), Time.parse(cur_time))
Thanks, Brett.
Without using an external gem, you can easily get the difference between two times using a method like this (1.days and String#to_time come from ActiveSupport, so this assumes Rails or ActiveSupport is loaded):
def release(time)
  delta = time - Time.now
  %w[days hours minutes].collect do |step|
    seconds = 1.send(step)
    (delta / seconds).to_i.tap do
      delta %= seconds
    end
  end
end
release(("2011-08-12 09:00:00").to_time)
# => [7, 17, 37]
which will return an array of days, hours and minutes and can be easily extended to include years, month and seconds as well:
def release(time)
  delta = time - Time.now
  %w[years months days hours minutes seconds].collect do |step|
    seconds = 1.send(step)
    (delta / seconds).to_i.tap do
      delta %= seconds
    end
  end
end
release(("2011-08-12 09:00:00").to_time)
# => [0, 0, 7, 17, 38, 13]
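The same divide-then-take-the-remainder idea can be written without any framework. As a cross-language illustration (Python rather than Ruby, standard library only, with `breakdown` as a hypothetical name), a sketch for non-negative differences might look like this:

```python
from datetime import datetime, timedelta


def breakdown(delta):
    """Split a non-negative timedelta into (days, hours, minutes, seconds)."""
    total_seconds = int(delta.total_seconds())
    days, rest = divmod(total_seconds, 86400)   # 86400 seconds per day
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return days, hours, minutes, seconds


delta = datetime(2011, 8, 12, 9, 0) - datetime(2011, 8, 4, 15, 22, 47)
print(breakdown(delta))  # (7, 17, 37, 13)
```

Months and years are left out on purpose, since they have no fixed length in seconds.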
I've used time_diff to achieve this sort of thing easily before, you may want to check it out.