pandas Timestamp subtraction?

Why does the following code produce the same result for both lines?
Is there a way to subtract a date offset from a pandas Timestamp?
print(s['ADM.DT'] + pd.DateOffset(month=2))
print(s['ADM.DT'] - pd.DateOffset(month=2))
s['ADM.DT'] is a pandas.tslib.Timestamp object.

If you use
pd.DateOffset(month=2)
the singular keyword sets the month field to 2, so the date moves to February of the same year whether you add or subtract the offset. To shift the date by two months, use the plural keyword:
pd.DateOffset(months=2)
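The difference between the two keywords can be checked with a short sketch. The timestamp below is a made-up stand-in for s['ADM.DT'], which is not shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for s['ADM.DT']
ts = pd.Timestamp("2014-05-15")

# Singular `month=2` replaces the month field, so + and - give the same date.
set_plus = ts + pd.DateOffset(month=2)
set_minus = ts - pd.DateOffset(month=2)

# Plural `months=2` is a relative shift of two calendar months.
later = ts + pd.DateOffset(months=2)
earlier = ts - pd.DateOffset(months=2)

print(set_plus, set_minus)
print(later, earlier)
```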

Related

Remove hours and extract only month and year

I am trying to keep only the month and year in this df. I tried several solutions, but none of them works. Can you help me?
You need to do this (since you posted no data, you'll need to adapt it to your case):
from datetime import datetime
datetime_object = datetime.now()
print(datetime_object)
2021-11-30 15:57:20.812209
And to get the year and month do this:
new_date_month= datetime_object.month
print(new_date_month)
new_date_year = datetime_object.year
print(new_date_year)
11
2021
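If you need them as new columns in your df, use the .dt accessor on the datetime column (here assumed to be called 'date'; substitute your own column name):
df['year'] = df['date'].dt.year
df['Month'] = df['date'].dt.month
Note that this will not work if your column is not a datetime. Given the format of the dates you have, you will need to parse them first: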
st = '2021-11-30 15:57:20.812209'
datetime.strptime(st, '%Y-%m-%d %H:%M:%S.%f')
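Since the question posted no data, here is a minimal pandas sketch of the whole flow on an invented two-row frame. The column name `date` and the sample values are assumptions, not taken from the question.

```python
import pandas as pd

# Invented sample data; the real df was not posted.
df = pd.DataFrame(
    {"date": ["2021-11-30 15:57:20.812209", "2022-01-05 08:00:00.000000"]}
)

# Parse the strings into datetimes first, otherwise `.dt` is unavailable.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M:%S.%f")

# Per-row year and month as separate columns.
df["year"] = df["date"].dt.year
df["Month"] = df["date"].dt.month

# Or a single "YYYY-MM" string column that drops day and time.
df["year_month"] = df["date"].dt.strftime("%Y-%m")

print(df)
```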

Filter a pandas DataFrame by datetime

I have a forecast of 24 months in my DataFrame. How can I filter the dates down to 12 months?
I know how to filter by a fixed date.
But my dates are extended by one month every month, so I need a variable filter.
The solution should keep the 12 months starting from the current month.
Thanks a lot
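Try this, which keeps the rows from the start of the current month up to, but not including, the same month one year later (the column must already be a datetime):
import pandas as pd
start = pd.Timestamp.today().normalize().replace(day=1)
end = start + pd.DateOffset(months=12)
df = df[(df['Date_column_name'] >= start) & (df['Date_column_name'] < end)]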
Hope it helps...
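To make the moving window easy to check, the filter can be wrapped in a small function that accepts a reference date. This is only a sketch: `next_12_months`, the `today` parameter and the sample forecast are illustrative names and data, not part of the original answer.

```python
import pandas as pd

def next_12_months(df, column, today=None):
    """Keep rows from the first day of the current month for 12 months."""
    today = pd.Timestamp.today() if today is None else pd.Timestamp(today)
    start = today.normalize().replace(day=1)
    end = start + pd.DateOffset(months=12)  # exclusive upper bound
    mask = (df[column] >= start) & (df[column] < end)
    return df[mask]

# Invented 24-month forecast, one row per month start.
forecast = pd.DataFrame(
    {"Date_column_name": pd.date_range("2021-11-01", periods=24, freq="MS")}
)

# Fixing `today` makes the result reproducible.
window = next_12_months(forecast, "Date_column_name", today="2021-11-30")
print(window)
```

Leaving `today` unset uses the current date, so the window moves forward by itself each month.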

How to find the last month data based on current date?

Is there a way to select the last month of data relative to a given date? I have searched a lot but can't find a good, precise solution. If the current date is at index 420, the date is 2012-01-09, and I want a data frame with the data from 2011-12-09 until 2012-01-09.
import pandas as pd
import numpy as np
times = pd.DataFrame(pd.date_range('2012-01-01', '2012-04-01', freq='30min'), columns=['date'])
times['date'] = pd.to_datetime(times['date'])
times['value'] = np.random.randint(1, 6, times.shape[0])
months = times.iloc[0:420].sort_values(by='date', ascending=True).set_index('date').last('1M')
With .last('1M') the result only goes back to 2012-01-01, because it returns the last calendar month. I understand why, but is there a way to get the last one month of data without using timedelta or relativedelta? With both of those, an error appears if a date is missing, which is also a problem.
Thank you.
I think what you're looking for is pd.Period. You can convert all of your datetimes to month periods and then search on those. Note that this selects a calendar month, not a rolling 30-day window.
# turn your datetimes into month periods
times["month"] = times["date"].dt.to_period("M")
# turn your search date into a month period
your_date = pd.Period(your_date, "M")
# select the rows that fall in that month
times[times["month"] == your_date]
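The sketch below runs the Period lookup on invented daily data and, for comparison, builds the rolling 2011-12-09 to 2012-01-09 window from the question with a boolean mask. The mask compares values rather than looking up index labels, so the boundary dates do not need to be present in the data. The date range and the `value` column are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Invented daily data covering the dates mentioned in the question.
times = pd.DataFrame({"date": pd.date_range("2011-12-01", "2012-02-01", freq="D")})
times["value"] = np.arange(len(times))

# Calendar-month lookup via Period.
times["month"] = times["date"].dt.to_period("M")
your_date = pd.Period("2012-01-09", "M")
calendar_month = times[times["month"] == your_date]

# Rolling one-month window ending on the current date.
end = pd.Timestamp("2012-01-09")
start = end - pd.DateOffset(months=1)
rolling = times[(times["date"] >= start) & (times["date"] <= end)]

print(len(calendar_month), len(rolling))
```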

How to get the data for the last 12 months and split it month-wise in Hive?

The date column is stored in the format "yyyyMMdd", and I use the following expression to convert it to the standard format so that Hive's day, month and year functions can be applied to it.
(from_unixtime(unix_timestamp(cast(created_day as STRING) ,'yyyyMMdd'), 'yyyy-MM-dd'))
To get the current year's data, I subtract the year of each record from the year of the current date; if the result is zero, the record falls in this year.
(year(current_date()) - year(from_unixtime(unix_timestamp(cast(created_day as STRING) ,'yyyyMMdd'), 'yyyy-MM-dd'))) = 0
Problem: if the current date falls in January, I get only January's data, but I need the data from February (last year) to January (current year).
I also need to scale this to cover the last 24 months.
I always set my date range parameters outside of Hive and pass them as arguments as this lends itself to reproducibility and testability.
select <fields> from <table> where created_day between ${hiveconf:start_day} and ${hiveconf:end_day}
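One way to compute those two parameters outside of Hive is a few lines of Python. This is a sketch under assumptions: the helper name `month_window` is made up, and it assumes created_day holds integers such as 20190115.

```python
from datetime import date

def month_window(today, months_back=12):
    """Return (start_day, end_day) as yyyyMMdd integers.

    start_day is the first day of the month that lies
    `months_back - 1` months before today's month.
    """
    # Count months since year 0 so the year rollover is plain arithmetic.
    index = today.year * 12 + (today.month - 1) - (months_back - 1)
    start = date(index // 12, index % 12 + 1, 1)
    return int(start.strftime("%Y%m%d")), int(today.strftime("%Y%m%d"))

# A January date yields February of the previous year as the start.
print(month_window(date(2019, 1, 15)))
print(month_window(date(2019, 1, 15), months_back=24))
```

The two values can then be passed on the command line with `--hiveconf start_day=... --hiveconf end_day=...`, and the rows grouped by month inside the query.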

How to retrieve the WeekofMonth for a given date in Hive

I have a date field in Hive, 2018-06-10, from which I need to get the WeekOfMonth
WEEKOFYEAR(order_time)
I need the output for 2018-06-10 to be 3 (the 3rd week, assuming the week starts on Sunday).
Is there a built-in function in Hive to retrieve the WeekOfMonth? I couldn't find one. I tried the following conversion based on seconds:
from_unixtime(unix_timestamp(CURRENT_DATE())+7200)
but it does not give the correct value.
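For the week of the month, you can take the day of the month and divide it by 7, rounding up. Use integer division (div), because / returns a double in Hive. Note that this counts 7-day blocks starting on the 1st and ignores the weekday the month starts on, so it returns 2 for 2018-06-10.
select case
when DAYOFMONTH(order_time) % 7 = 0
then DAYOFMONTH(order_time) div 7
else DAYOFMONTH(order_time) div 7 + 1
end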
You can also use the date_format function with the 'W' (week in month) pattern, which takes the starting weekday into account:
select date_format('2018-06-10','W');
See more format patterns here: SimpleDateFormat
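The difference between the two approaches is easy to verify outside of Hive. The Python sketch below reproduces both calculations; the function names are illustrative, and the Sunday-based version assumes a locale where weeks start on Sunday and a partial first week counts as week 1.

```python
from datetime import date

def week_of_month_blocks(d):
    """7-day blocks counted from the 1st (same as the CASE expression)."""
    return (d.day - 1) // 7 + 1

def week_of_month_sunday(d):
    """Calendar weeks starting on Sunday; a partial first week counts as 1."""
    first = d.replace(day=1)
    # weekday() is Monday=0..Sunday=6; shift so that Sunday=0.
    offset = (first.weekday() + 1) % 7
    return (d.day + offset - 1) // 7 + 1

d = date(2018, 6, 10)  # a Sunday; 2018-06-01 was a Friday
print(week_of_month_blocks(d))
print(week_of_month_sunday(d))
```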