I am reading a CSV file with pandas. The file has a column with dates in dd/mm/yyyy format.
import datetime as dt
import pandas as pd

def load_csv():
    mydateparser = lambda x: dt.datetime.strptime(x, "%d/%m/%Y")
    return pd.read_csv('myfile.csv', delimiter=';', parse_dates=['data'], date_parser=mydateparser)
With this parser, the dtype of the 'data' column becomes datetime64[ns], but the dates are displayed as yyyy-mm-dd.
I need the 'data' column to have dtype datetime64[ns] and be formatted as dd/mm/yyyy.
How can it be done?
Regards,
Elio Fernandes
A date is not stored in yyyy-mm-dd or dd/mm/yyyy format. It is stored as a datetime value, and pandas displays it as yyyy-mm-dd by default. The display format does not change how the value is stored.
You will get a better idea if you add a time to the data and then display it.
To get what you want, convert the dates to strings right before displaying them. The column stays datetime in the DataFrame, and you get the string format you specify only in the output.
The following uses Series.dt.strftime() to convert to strings. Documentation here.
df['data'].dt.strftime('%d/%m/%Y')
or
The following uses datetime.strftime() to convert to strings. Documentation here.
df['data'].apply(lambda x: x.strftime('%d/%m/%Y'))
For further reference check out strftime-and-strptime-behavior.
This question is helpful for understanding how datetime is stored in Python:
How does Python store datetime internally?
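A minimal sketch of the point above, assuming a small inline CSV in place of myfile.csv (the sample rows and the 'valor' column are made up for illustration). It parses the column with pd.to_datetime, shows that adding a time changes what is displayed, and builds a separate string view for output while the column itself stays datetime.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for myfile.csv.
csv_text = "data;valor\n25/12/2021;10\n01/02/2022;20\n"

df = pd.read_csv(io.StringIO(csv_text), sep=";")

# Parse dd/mm/yyyy strings into real datetime values.
df["data"] = pd.to_datetime(df["data"], format="%d/%m/%Y")

# Adding a time shows that the stored value is a timestamp, not a string.
shifted = df["data"] + pd.Timedelta(hours=13, minutes=30)

# Build the dd/mm/yyyy strings only for display; df["data"] is unchanged.
display_dates = df["data"].dt.strftime("%d/%m/%Y")

print(df["data"].dtype)
print(shifted.iloc[0])
print(display_dates.tolist())
```

Note that display_dates holds strings, so sorting or date arithmetic should still be done on df["data"].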
I'm trying to format a date for an API. The desired format is: yyyy-MM-ddTHH:mm:ss.fffffff+HH:mm
(e.g. 2022-10-12T09:52:14.1234567+03:00). I'm using Date.ParseExact in the following way:
Date.ParseExact("2022-10-12T09:52:14.1234567+03:00", "yyyy-MM-ddTHH:mm:ss.fffffff+HH:mm", CultureInfo.InvariantCulture)
Initially I used 'Now' instead of this string, but then I saw that the string and the desired format have to match. The error I'm getting is 'DateTime pattern 'H' appears more than once with different values.'. Is there a way to avoid that? Also is it possible to use 'Now' in this line?
Thank you
I suspect that you don't have a parsing issue and don't need ParseExact at all. You have a Date and want to return it as a formatted string. Use ToString, with zzz for the UTC offset:
string result = DateTime.Now.ToString("yyyy-MM-ddTHH:mm:ss.fffffffzzz");
Read also: Custom date and time format strings
I have a CSV file where a timestamp column has values in the following format: 2022-05-12T07:09:33.727-07:00
When I try something like:
df['timestamp'] = pd.to_datetime(df['timestamp'])
It seems to fail silently, as the dtype of that column is still object. How can I parse such a value?
Also, what strategy would keep this robust to a variety of input time formats?
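One possible cause, offered as an assumption since the file itself is not shown: when a column mixes UTC offsets (for example -07:00 and -08:00 across a daylight saving change), pandas versions before 3.0 can return an object column of Python datetimes instead of a datetime64 dtype. A minimal sketch of a workaround is to normalise everything to UTC with utc=True. The sample values below are made up, and errors="coerce" is shown as one way to tolerate unparseable entries.

```python
import pandas as pd

# Hypothetical sample values with two different UTC offsets.
raw = pd.Series([
    "2022-05-12T07:09:33.727-07:00",
    "2022-11-12T07:09:33.727-08:00",
])

# utc=True converts every value to UTC, giving one timezone-aware dtype.
parsed = pd.to_datetime(raw, utc=True)

# errors="coerce" turns values that cannot be parsed into NaT
# instead of raising, which helps with messy input.
messy = pd.Series(["2022-05-12T07:09:33.727-07:00", "not a date"])
tolerant = pd.to_datetime(messy, utc=True, errors="coerce")

print(parsed.dtype)
print(tolerant.isna().tolist())
```

If the original local time matters, the UTC column can be converted afterwards with Series.dt.tz_convert.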
I got my DataFrame from a JSON file, and the date is labelled as 2000M01 for January 2000, 2000M02 for February 2000, etc. I need it in a different format: 2000Jan, 2000Feb, etc. I have another data set in that format. I could also bring both to a third format, such as 2000-01 or some official date format, if that's easier.
My main issue is that, as far as I know, 2000M01 is not an official date format in any way, so I can't just convert it directly.
Any ideas how I could convert this?
You can easily feed a custom format to pd.to_datetime, in your case it would be '%YM%m', e.g.:
pd.to_datetime('2000M01', format = '%YM%m')
Then you can convert it to any format you want.
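As a sketch of the two steps just described, assuming a small made-up Series of labels: parse with the custom format, then write the dates back out with Series.dt.strftime. The %b directive depends on the current locale, so the month abbreviations are only guaranteed to be English under an English or C locale.

```python
import pandas as pd

# Hypothetical labels in the 2000M01 style.
labels = pd.Series(["2000M01", "2000M02", "2000M12"])

# Parse with the custom format; the day defaults to the 1st.
dates = pd.to_datetime(labels, format="%YM%m")

# Convert to whichever output format is needed.
as_iso_month = dates.dt.strftime("%Y-%m")
as_abbrev = dates.dt.strftime("%Y%b")  # locale-dependent month names

print(as_iso_month.tolist())
print(as_abbrev.tolist())
```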
You can change the date format with the datetime module:
import datetime

def reformat_date(date_from_json):
    date = datetime.datetime.strptime(date_from_json, "%YM%m")
    return date.strftime("%Y%b")
As described in the strftime and strptime section of the datetime documentation, %YM%m handles the unusual input format (the day defaults to the 1st), and %Y%b produces the format you want.
Then map the function over the pandas column:
dataframe['DATE_COLUMN'] = dataframe['OLD_DATE_COLUMN'].map(reformat_date)
I'm doing some ETL from a CSV file in GCS to BQ. Everything works fine except for dates. The field name in my table is TEST_TIME and the type is DATE, so in the TableRow I tried passing a java.util.Date, a com.google.api.client.util.DateTime, a String, and a Long value with the number of seconds, but none of them worked.
I got error messages like these:
Could not convert non-string JSON value to DATE type. Field: TEST_TIME; Value: ...
When using DateTime I got this error:
JSON object specified for non-record field: TEST_TIME.
//tableRow.set("TEST_TIME", date);
//tableRow.set("TEST_TIME", new DateTime(date));
//tableRow.set("TEST_TIME", date.getTime()/1000);
//tableRow.set("TEST_TIME", dateFormatter.format(date)); //e.g. 05/06/2016
I think you're expected to pass a String in the format YYYY-MM-DD, similar to what you would send if you were using the REST API directly with JSON. Try this:
tableRow.set("TEST_TIME", "2017-04-06");
If that works, then you can convert the actual date that you have to that format and it should also work.
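The conversion step itself is language-neutral. As a rough illustration in Python rather than Java, assuming the incoming value is day-first like the 05/06/2016 example in the question (that ordering is a guess), a hypothetical helper could parse the text and emit YYYY-MM-DD:

```python
from datetime import datetime


def to_bigquery_date(text, input_format="%d/%m/%Y"):
    """Hypothetical helper: reformat a date string as YYYY-MM-DD.

    input_format is an assumption about the source data; pass
    "%m/%d/%Y" instead if the CSV uses month-first dates.
    """
    return datetime.strptime(text, input_format).date().isoformat()


print(to_bigquery_date("05/06/2016"))
print(to_bigquery_date("04/06/2017", "%m/%d/%Y"))
```

In the Java pipeline, the same idea would apply to the string passed to tableRow.set().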
While working with Google Cloud Dataflow, I used a wrapper from Google for timestamps: com.google.api.client.util.DateTime.
This worked for me when inserting rows into BigQuery tables. So, instead of
tableRow.set("TEST_TIME" , "2017-04-07");
I would recommend
tableRow.set("TEST_TIME" , new DateTime(new Date()));
I find this to be a lot cleaner than passing the timestamp as a string.
Using the Java class com.google.api.services.bigquery.model.TableRow, to set a value in milliseconds since the epoch (UTC) into a BigQuery TIMESTAMP, do this:
tableRow.set("timestamp", millisecondsSinceUTC / 1000.0d);
tableRow.set() expects a floating point number representing seconds since the epoch (UTC) with up to microsecond precision.
This is very non-standard and undocumented (set() boxes the value in an Object, so it's unclear what data types set() accepts). The other proposed solution, using com.google.api.client.util.DateTime, did not work for me.
Is it possible to change a date's format according to a specific pattern? I need to write a function that takes two parameters: the first is the date and the second is the pattern. I need to convert several date variants. The goal of this function is to convert between US and European date formats.
For example, I need to convert
EU: dd:MM:yyyy hh:mm:ss
to
US: MM:dd:yyyy hh:mm:ss
On another page I need to change
EU: dd/MM/yyyy
to
US: MM/dd/yyyy
And I have several more variants to convert.
I want to write a function like this:
Formater(euDate, pattern)
BEGIN
....
RETURN usDate
My production server is unfortunately SQL Server 2005, which doesn't support the FORMAT() function. And CONVERT() doesn't support some of the date variants I need to convert. So in my current solution I split the EU date into individual parts (@day = day(@euDate), @month, @year, ...) and join them into a new string. I compare the pattern parameter in a CASE expression and return the branch that matches it. I want to make this function more general and simpler.
Thank you for your advice.
You almost certainly can use the CONVERT function. You can read more about all the options here.
If there is some obscure variant you need, check out this blog by Anubhav Goyal.