How do I convert a non zero padded day string to a useful date in pandas - pandas

I'm trying to import a date string with non-zero padded day, zero padded month, and year without century to create a datetime e.g. (11219 to 01/12/19). However, pandas cannot distinguish between the day and the month (e.g. 11219 could be 11th February, 2019 or 1st December, 2019).
I've tried using 'dayfirst' and the '#' in the day, e.g. %#d, but nothing works. Code below, any advice?
Code:
df_import['newDate'] = pd.to_datetime(df_import['Date'], format='%d/%m/%Y', dayfirst = True)
Error:
time data '11219' does not match format '%d/%m/%Y' (match)

Since only the day is not zero-padded, the dates are unambiguous. They can simply be parsed by Pandas if we add the pad:
pd.to_datetime(df_import['Date'].str.zfill(6), format='%d%m%y')
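To see the padding approach end to end, here is a minimal sketch. The sample values are made up, and the `astype(str)` call is an assumption: it is only needed if the column was read in as integers rather than strings.

```python
import pandas as pd

# Hypothetical sample data: day is not zero-padded, month and year are.
df_import = pd.DataFrame({'Date': [11219, 311219, 50120]})

# Cast to string first (assumption: the column may hold integers),
# then left-pad to six characters so every value reads as ddmmyy.
padded = df_import['Date'].astype(str).str.zfill(6)
df_import['newDate'] = pd.to_datetime(padded, format='%d%m%y')

print(df_import)
```

With this data, 11219 becomes 011219 and parses as 1 December 2019.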

Use zfill().
A custom function can also be used if you want to handle more cases.
def getDate(value):
    return  # logic to parse
df_import['newDate'] = df_import['Date'].apply(getDate)
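One possible shape for such a custom function is sketched below. The name `parse_dmy` and the decision to return NaT for anything that is not a 5- or 6-digit value are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def parse_dmy(value):
    """Parse a (d)dmmyy value such as '11219'; return NaT if it does not fit."""
    text = str(value).strip()
    # Only 5 or 6 digits can be a valid (d)dmmyy value.
    if not text.isdigit() or len(text) not in (5, 6):
        return pd.NaT
    # errors='coerce' turns impossible dates (e.g. month 13) into NaT.
    return pd.to_datetime(text.zfill(6), format='%d%m%y', errors='coerce')

# Hypothetical input mixing valid and invalid values.
dates = pd.Series(['11219', '311219', 'n/a', '11319']).apply(parse_dmy)
print(dates)
```

Returning NaT instead of raising keeps the apply call from failing on a single bad row; whether that is desirable depends on the data.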

Related

TypeError: dtype datetime64[ns] cannot be converted to timedelta64[ns]

I have a column of years from the sunspots dataset.
I want to convert the integer column 'year' (e.g. 1992) to datetime format, then find the time delta, and eventually compute the cumulative total seconds to use as the time index column of a time series.
I am trying to use the following code but I get the error
TypeError: dtype datetime64[ns] cannot be converted to timedelta64[ns]
sunspots_df['year'] = pd.to_timedelta(pd.to_datetime(sunspots_df['year'], format='%Y') ).dt.total_seconds()
pandas.Timedelta "[r]epresents a duration, the difference between two dates or times." So you're trying to get Python to tell you the difference between a particular datetime and...nothing. That's why it's failing.
If it's important that you store your index this way (and there may be better ways), then you need to pick a start datetime and compute the difference to get a timedelta.
For example, this code...
import pandas as pd
df = pd.DataFrame({'year': [1990,1991,1992]})
diff = (pd.to_datetime(df['year'], format='%Y') - pd.to_datetime('1990', format='%Y'))\
.dt.total_seconds()
...returns a series whose values are seconds from January 1st, 1990. You'll note that it doesn't invoke pd.to_timedelta(), because it doesn't need to: the result of subtracting one datetime from another is automatically a timedelta Series (dtype timedelta64).
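As a quick check of that claim, the sketch below subtracts the first year from every row and inspects the result. Using the first row as the reference point is an assumption; any fixed start datetime would work the same way.

```python
import pandas as pd

df = pd.DataFrame({'year': [1990, 1991, 1992]})
years = pd.to_datetime(df['year'], format='%Y')

# Subtracting a reference timestamp yields a timedelta Series,
# so no call to pd.to_timedelta() is needed.
elapsed = years - years.iloc[0]
print(elapsed.dtype)

# Seconds elapsed since the reference point, usable as a numeric time index.
df['seconds'] = elapsed.dt.total_seconds()
print(df)
```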

Outputting pandas timestamp to tuple with just month and day

I have a pandas dataframe with a timestamp field which I have successfully converted to datetime format, and now I want to output just the month and day as a tuple for the first date value in the data frame. It is for a test and the output must not have leading zeros. I have tried a number of things but I cannot find an answer without converting the timestamp to a string, which does not work.
This is the format
2021-05-04 14:20:00.426577
df_cleaned['trans_timestamp']=pd.to_datetime(df_cleaned['trans_timestamp']) is as far as I have got with the code.
I have been working on this for days and cannot get output the checker will accept.
Update
If you want to extract month and day from the first record (solution proposed by @FObersteiner)
>>> df['trans_timestamp'].iloc[0].timetuple()[1:3]
(5, 4)
If you want to extract the month and day for every row of your dataframe, use:
# Setup
df = pd.DataFrame({'trans_timestamp': ['2021-05-04 14:20:00.426577']})
df['trans_timestamp'] = pd.to_datetime(df['trans_timestamp'])
# Extract tuple
df['month_day'] = df['trans_timestamp'].apply(lambda x: (x.month, x.day))
print(df)
# Output
trans_timestamp month_day
0 2021-05-04 14:20:00.426577 (5, 4)
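For the first record only, the month and day can also be read straight from the Timestamp attributes. This is a small sketch reusing the setup above; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'trans_timestamp': ['2021-05-04 14:20:00.426577']})
df['trans_timestamp'] = pd.to_datetime(df['trans_timestamp'])

# .month and .day are plain integers, so there are no leading zeros to strip.
first = df['trans_timestamp'].iloc[0]
month_day = (first.month, first.day)
print(month_day)  # (5, 4)
```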

Is there an R function for changing a strange character date to POSIXct?

My dates look like 7-Feb-20 (character data), and I need to convert them into a date/time so that I can eventually use as.Date() or as.POSIXct() to move them into SQL Server as datetime. SQL Server uses datetime as the default when moving from R to SQL.
The as.Date() function will convert character data into a date format.
You can specify the input format by using the format = argument.
In this case, it looks like your format is %d-%b-%y
%d will give you day
%b will give you abbreviated month name (compare with %B, which gives full month name)
%y will give you two-digit year.
You will also need to include the hyphens in your format, e.g. as.Date("7-Feb-20", format = "%d-%b-%y").
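The same %d, %b and %y directives are used by Python's strptime, so the format string can be tried out there as well. This is only an illustration in Python, not the R call itself, and it assumes an English locale so that "Feb" is recognised as a month abbreviation.

```python
from datetime import datetime

# %d = day, %b = abbreviated month name, %y = two-digit year,
# with literal hyphens between the parts.
parsed = datetime.strptime("7-Feb-20", "%d-%b-%y")
print(parsed.date())  # 2020-02-07
```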

Format date where the position of the parts is variable

We have a file that needs to be imported that has dates in it. The dates are in a format that I have not seen before: the day part can vary in length (but seemingly not the month or year), so the position of the parts depends on whether the day is double digit or not, i.e.
Dates:
13082014 is 13th August 2014
9092013 is 9th September 2013
The current script tries to substring the parts out, but fails on the second one as there is not enough data. I could write an if or case to check the length, but is there a SQL format that can be used to reliably import this data?
To clarify this is MSSQL and the date format is ddmmyyyy or dmmyyyy
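One simple way is to use STUFF. Note that this assumes an eight-character string, so a seven-character value such as 9092013 needs a leading zero first (for example with RIGHT('0' + value, 8)).
example:
select STUFF(STUFF('13082014',3,0,'/'),6,0,'/');
-- result: 13/08/2014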
Good luck.
LPAD a zero when it is missing, so that you always get an eight-character date string. Here is an example with Oracle; other DBMSs may have other string and date functions to achieve the same.
select to_date(datestring, 'ddmmyyyy')
from
(
select lpad('13082014', 8, '0') as datestring from dual
union all
select lpad('9092013', 8, '0') as datestring from dual
);
Result:
13.08.2014
09.09.2013
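The pad-then-parse idea is not specific to SQL. As a rough sketch, the same two sample values can be handled in pandas before the data reaches the database; the Series name `raw` is hypothetical.

```python
import pandas as pd

# The two sample values from the question: ddmmyyyy and dmmyyyy.
raw = pd.Series(['13082014', '9092013'])

# Left-pad to eight characters, then parse as day, month, four-digit year.
parsed = pd.to_datetime(raw.str.zfill(8), format='%d%m%Y')
print(parsed.dt.strftime('%d.%m.%Y').tolist())  # ['13.08.2014', '09.09.2013']
```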
You can convert the dates to a suitable date format and then import the data (adjust the logic to match the date format).
Something like this:
select Convert(varchar(10),CONVERT(date,YourDateColumn,106),103)

how to check linq comparing date format in vb.net

Say I want the records of all employees for the date "07/03/2013", where the format is "MM/dd/yyyy".
My LINQ expression is:
_dbContext.EmployeeDetails.Where(Function(f) f.EmpId = _empId And f.Date="07/03/2013")
How does LINQ know that the date is in "MM/dd/yyyy" format here? Or how can I compare f.Date using MM/dd/yyyy?
Say I do this instead:
_dbContext.EmployeeDetails.Where(Function(f) f.EmpId = _empId And f.Date.ToString("MM/dd/yyyy")="07/03/2013")
It is not allowed. If my date were "07/13/2013", the format could be matched to f.Date unambiguously, but what about dates where the day is 12 or less? In fact, with the first expression I am getting records for March rather than July.
How can I resolve this issue?
You are comparing string representations of the date/time, which depend on the current system's formatting options. Use a date literal (must be in 'M/d/yyyy') to create a new DateTime object:
_dbContext.EmployeeDetails.Where(Function(f) f.EmpId = _empId And f.Date=#7/3/2013#)
If it's a variable, say dte:
_dbContext.EmployeeDetails.Where(Function(f) f.EmpId = _empId And f.Date=dte)
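The underlying point, that a string like "07/03/2013" is ambiguous while a date object is not, can be illustrated outside VB.NET. The sketch below uses Python purely for illustration; the `records` list is hypothetical.

```python
from datetime import date, datetime

text = "07/03/2013"

# The same string means two different dates depending on the assumed format.
as_month_first = datetime.strptime(text, "%m/%d/%Y").date()  # 3 July 2013
as_day_first = datetime.strptime(text, "%d/%m/%Y").date()    # 7 March 2013
print(as_month_first, as_day_first)

# Comparing date objects avoids the ambiguity entirely.
records = [date(2013, 7, 3), date(2013, 3, 7)]  # hypothetical stored dates
target = date(2013, 7, 3)
matches = [d for d in records if d == target]
print(matches)
```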
It should be this:
_dbContext.EmployeeDetails.Where(Function(f) f.EmpId = _empId And f.Date.ToString("dd/MM/yyyy")="07/03/2013")
You were passing the month first, so it parses as March. Change the ToString order so that the day comes before the month (July).