Convert HH:MM:SS and MM:SS to seconds in same column in Pandas

I am trying to convert a timestamp string into an integer and having trouble.
The column in my dataframe looks like this:
time
30:03
1:15:02
I have tried df['time'].str.split(':').apply(lambda x: int(x[0]) * 60 + int(x[1])), but that only gives the right result if every value in the column is MM:SS, and my column is mixed. Some people took 30 minutes to finish the task, some took over an hour, and so on.

You can make a custom convert function where you check for time format:
def convert(x):
    x = x.split(":")
    if len(x) == 2:
        return int(x[0]) * 60 + int(x[1])
    return int(x[0]) * 3600 + int(x[1]) * 60 + int(x[2])

df["time"] = df["time"].apply(convert)
print(df)
Prints:
time
0 1803
1 4502
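The length check can also be avoided by folding over the parts, so that one function handles SS, MM:SS and HH:MM:SS alike. This is a minimal sketch in plain Python; the name to_seconds is hypothetical and not part of pandas.

```python
def to_seconds(text):
    """Convert 'SS', 'MM:SS' or 'HH:MM:SS' to a whole number of seconds."""
    total = 0
    # Each step shifts the running total up one base-60 place.
    for part in text.strip().split(":"):
        total = total * 60 + int(part)
    return total


print(to_seconds("30:03"))    # 1803
print(to_seconds("1:15:02"))  # 4502
```

Assuming the column holds strings, it could be applied with df["time"].apply(to_seconds).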

Related

How to convert an integer value to time (HH:MM:SS) in SQL Server?

I have a set of integer value which can be either a single digit to 6 digit number. Now I want to convert this set of values to time representing the HH:MM:SS format. I thought of converting first to varchar then to time but it didn't work out. Can anyone help me out with the problem?
You can use TIMEFROMPARTS
SELECT
TIMEFROMPARTS(
YourColumn / 10000,
YourColumn / 100 % 100,
YourColumn % 100
)
FROM YourTable;
This happens to be what run times look like in msdb..sysjobschedules, which I've addressed here. Assuming "val" is your integer, try:
select dateadd(s, val - ((val / 100) * 40) - ((val / 10000) * 2400), 0/*or some date*/)
(subtracting out 40 seconds per minute and 40*100 + 2400 seconds per hour to get the actual number of seconds, then adding that many seconds to a date.)
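To see why the digit-splitting and the subtraction trick agree, here is a small Python sketch of the same arithmetic. It assumes the integer is laid out as HHMMSS digits; both function names are made up for illustration.

```python
def hhmmss_parts(val):
    """Split an HHMMSS-style integer into (hours, minutes, seconds)."""
    return val // 10000, val // 100 % 100, val % 100


def hhmmss_to_seconds(val):
    """Seconds since midnight via the subtraction trick."""
    # val // 100 is HHMM: removing 40 per unit takes 40 off each minute
    # and 4000 off each hour. Removing another 2400 per hour leaves
    # 3600 per hour and 60 per minute.
    return val - (val // 100) * 40 - (val // 10000) * 2400


h, m, s = hhmmss_parts(11502)
print(h, m, s)                   # 1 15 2
print(hhmmss_to_seconds(11502))  # 4502
```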
Try:
date (dateadd(second,value,'19700101'))

Convert float value from a db2 database to a date format?

I have a database from a custom application.
The values in the database for a date is 44416.254005, and the application that uses this database resolves that number to the following date 8/8/2021 6:05:46 AM.
Another example is 44416.268113 to 8/8/2021 6:26:05 AM.
ex. 44328.93713 = 5/12/2021 10:29:28 PM
I have an old backup of this database and need to figure out how the conversions are made?
This looks like an Excel serial date. If so, you can convert the value to seconds and add it to 1899-12-30:
select timestamp '1899-12-30 00:00:00' + (44416.268113 * 60 * 60 * 24) second
In Python, counting days from 1899-12-30:
import datetime

def xldate_to_datetime(xldate):
    start = datetime.datetime(1899, 12, 30)
    delta = datetime.timedelta(days=xldate)
    return start + delta
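As a check, the three values from the question can be run through this kind of conversion. The sketch below rounds to the nearest second, which is an assumption about how the application displays its values; the function name is hypothetical.

```python
import datetime

EXCEL_EPOCH = datetime.datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel-style serial day number to a datetime, rounded to the second."""
    seconds = round(serial * 86400)  # 86400 seconds per day
    return EXCEL_EPOCH + datetime.timedelta(seconds=seconds)


for serial in (44416.254005, 44416.268113, 44328.93713):
    print(serial, "->", serial_to_datetime(serial))
# 44416.254005 -> 2021-08-08 06:05:46
# 44416.268113 -> 2021-08-08 06:26:05
# 44328.93713 -> 2021-05-12 22:29:28
```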

Pandas: Read dates in column which has different formats

I would like to read a csv, with dates in a column, but the dates are in different formats within the column.
Specifically, some dates are in "dd/mm/yyyy" format, and some are in "4####" format (excel 1900 date system, serial number represents days elapsed since 1900/1/1).
Is there any way to use read_csv or pandas.to_datetime to convert the column to datetime?
Have tried using pandas.to_datetime with no parameters to no avail.
df["Date"] = pd.to_datetime(df["Date"])
Returns
ValueError: year 42613 is out of range
Presumably it can read the "dd/mm/yyyy" format fine but produces an error for the "4####" format.
Note: the column is mixed type as well
Appreciate any help
Example
dates = ['25/07/2016', '42315']
df = pd.DataFrame(dates, columns=['Date'])
#desired output ['25/07/2016', '07/11/2015']
Let's try:
dates = pd.to_datetime(df['Date'], dayfirst=True, errors='coerce')
m = dates.isna()
dates.loc[m] = (
    pd.TimedeltaIndex(df.loc[m, 'Date'].astype(int), unit='d')
    + pd.Timestamp(year=1899, month=12, day=30)
)
df['Date'] = dates
Or alternatively with seconds conversion:
dates = pd.to_datetime(df['Date'], dayfirst=True, errors='coerce')
m = dates.isna()
dates.loc[m] = pd.to_datetime(
    (df.loc[m, 'Date'].astype(int) - 25569) * 86400.0,
    unit='s'
)
df['Date'] = dates
df:
Date
0 2016-07-25
1 2015-11-07
Explanation:
1. First convert to datetime everything that pd.to_datetime can handle:
dates = pd.to_datetime(df['Date'], dayfirst=True, errors='coerce')
2. Check which values couldn't be converted:
m = dates.isna()
3. Convert the remaining NaT values:
a. Offset as days since 1899-12-30 using TimedeltaIndex + pd.Timestamp:
dates.loc[m] = (
    pd.TimedeltaIndex(df.loc[m, 'Date'].astype(int), unit='d')
    + pd.Timestamp(year=1899, month=12, day=30)
)
b. Or convert serial days to seconds mathematically:
dates.loc[m] = pd.to_datetime(
    (df.loc[m, 'Date'].astype(int) - 25569) * 86400.0,
    unit='s'
)
4. Update the Date column:
df['Date'] = dates
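The same two-pass idea can be sketched without pandas, which may help when checking individual values. The code below assumes every string that is not a dd/mm/yyyy date is an integer Excel serial; parse_mixed is a hypothetical name. It also shows where the constant 25569 comes from.

```python
from datetime import date, datetime, timedelta

EXCEL_EPOCH = date(1899, 12, 30)


def parse_mixed(text):
    """Parse 'dd/mm/yyyy' text, falling back to an Excel serial day number."""
    try:
        return datetime.strptime(text, "%d/%m/%Y").date()
    except ValueError:
        # Not a dd/mm/yyyy string, so treat it as days since the Excel epoch.
        return EXCEL_EPOCH + timedelta(days=int(text))


print(parse_mixed("25/07/2016"))  # 2016-07-25
print(parse_mixed("42315"))       # 2015-11-07

# 25569 is the number of days from the Excel epoch to the Unix epoch.
print((date(1970, 1, 1) - EXCEL_EPOCH).days)  # 25569
```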

Converting Numeric to Time

I have a column which contains NUMERIC(5,2) data. The data is representative of shift start times. A few examples are 6.30, 10.30, 13.30 and 15.30. What's the best way to convert this into a time so I can do time calculations against another field? The other field is a datetime field.
Edit: The values represent times. For example 6.30 = 06:30, 15.30 = 03:30 PM.
You can do something like this:
select dateadd(minute, floor(col) * 60 + (col % 1) * 100, 0)
Replace the dot(.) with colon(:) and compare against your time column.
select *
from yourtable
Where timecol = replace(numericcol,'.',':')
considering timecol is of datatype time, rhs will be implicitly converted to time.
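For checking the arithmetic outside the database, here is a small Python sketch of the same hours-dot-minutes decoding. It uses Decimal rather than float so that 6.30 stays exactly 6.30; the function name is an assumption for illustration only.

```python
from datetime import time
from decimal import Decimal


def shift_to_time(value):
    """Interpret a number like 15.30 as 15 hours 30 minutes."""
    value = Decimal(str(value))
    hours = int(value)                    # whole part is the hour
    minutes = int((value - hours) * 100)  # two decimal digits are the minutes
    return time(hours, minutes)


print(shift_to_time("6.30"))   # 06:30:00
print(shift_to_time("15.30"))  # 15:30:00
```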

Converting hours into minutes in SQL

I'm trying to write a regular expression that will convert hours into minutes.
Currently I have a value which gives the number of hours before the dot and the number of minutes after the dot; for example, 6.10 is 6 hours and 10 minutes.
Can I use a regular expression to take the value before the dot, multiply it by 60 (to get minutes), and then add the value after the dot (which is already in minutes)?
So as per the example used before, it would do:
(6 * 60) + 10
Thank you!
There's a STRTOK function to tokenize a string and then it's just:
select '12.34' as x,
cast(strtok(x,'.',1) as int) * 60 + cast(strtok(x,'.',2) as int)
If your value "6.10" is stored as a string, convert it to a number first:
SET @Input = CAST(@InputString AS DECIMAL(10, 2))
Then, use Modulus and Floor.
Total Minutes = The fractional part * 100 + the whole number part * 60.
So if your value 6.10 is contained in a variable named @Input, then you can get the value you wish with
SET @Output = (@Input % 1) * 100 + FLOOR(@Input) * 60
If it's in a table then something more like this would be appropriate:
SELECT (Input % 1) * 100 + FLOOR(Input) * 60 Output
FROM Table
The above only works if minutes are zero-padded, e.g. 6 hours five minutes = 6.05. If you instead represent minutes without padding (6 hours five minutes = 6.5), then split the string on the dot instead.
A regular expression is probably overkill for such a simple string manipulation. All you need is the INDEX function to get the position of the dot and the SUBSTR function to get the hours and minutes.
Dot position:
select INDEX([yourstring], '.')
Hours (before the dot):
select SUBSTR([yourstring], 1, INDEX([yourstring], '.') - 1)
Minutes (after the dot):
select SUBSTR([yourstring], INDEX([yourstring], '.') + 1)
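Putting the pieces together, the split-on-the-dot approach looks like this in Python. It is a sketch that assumes the value arrives as a string; hours_dot_minutes_to_minutes is a made-up name.

```python
def hours_dot_minutes_to_minutes(text):
    """Convert 'H.MM' text such as '6.10' to total minutes."""
    hours, _, minutes = text.partition(".")
    # A missing minutes part (e.g. '6') counts as zero minutes.
    return int(hours) * 60 + int(minutes or 0)


print(hours_dot_minutes_to_minutes("6.10"))   # 370
print(hours_dot_minutes_to_minutes("12.34"))  # 754
print(hours_dot_minutes_to_minutes("6.5"))    # 365
```

Because the minutes are read as text, '6.5' is taken as 5 minutes rather than 50, which is the unpadded case mentioned earlier.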