Convert a Particular Dataframe Column Into Customized Date Or Time - pandas
As you can see, a date column and a time column are being saved in this CSV file. The problem is that the date and time are in a format like 30-1-2022 and 20:08:00,
but I want them to look like 30th Jan 22 and 8:08 PM.
Is there any code for that?
import requests
import pandas as pd
from datetime import datetime
from datetime import date
currentd = date.today()
s = requests.Session()
headers = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
url = 'https://www.nseindia.com/'
step = s.get(url,headers=headers)
today = datetime.now().strftime('%d-%m-%Y')
api_url = f'https://www.nseindia.com/api/corporate-announcements?index=equities&from_date={today}&to_date={today}'
resp = s.get(api_url,headers=headers).json()
result = pd.DataFrame(resp)
result.drop(['difference', 'dt','exchdisstime','csvName','old_new','orgid','seq_id','sm_isin','bflag','symbol','sort_date'], axis = 1, inplace = True)
result.rename(columns = {'an_dt':'DateandTime', 'attchmntFile':'Source','attchmntText':'Topic','desc':'Type','smIndustry':'Sector','sm_name':'Company Name'}, inplace = True)
result[['Date','Time']] = result.DateandTime.str.split(expand=True)
result.drop(['DateandTime'], axis = 1, inplace = True)
result.to_csv( ( str(currentd.day) +'-'+str(currentd.month) +'-'+'CA.csv'), index=True)
print('Saved the CSV File')
Try creating a temporary column:
result['Full_date']=pd.to_datetime(result['Date']+' '+result['Time'])
Then format 'Date' and 'Time':
result['Date']=result['Full_date'].dt.strftime('%b %d, %Y')
# %I is the 12-hour clock; %R is 24-hour HH:MM and would give e.g. 20:08PM
result['Time']=result['Full_date'].dt.strftime('%I:%M %p')
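The temporary-column approach above can be sketched on a couple of made-up rows. The sample values and the explicit format string are assumptions (the real API payload is not shown here). The date pattern uses a two-digit year to match the "30th Jan 22" style from the question, without the ordinal suffix.

```python
import pandas as pd

# Hypothetical sample rows standing in for the downloaded announcements
result = pd.DataFrame({
    "Date": ["30-Jan-2022", "30-Jan-2022"],
    "Time": ["20:08:00", "09:05:00"],
})

# Parse date and time together into one temporary datetime Series
full = pd.to_datetime(result["Date"] + " " + result["Time"],
                      format="%d-%b-%Y %H:%M:%S")

# Re-format each part: %y is the two-digit year, %I the 12-hour clock
result["Date"] = full.dt.strftime("%d %b %y")
result["Time"] = full.dt.strftime("%I:%M %p")

print(result)
```

Note that %I zero-pads the hour, so 8 PM is rendered as 08:08 PM.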
Try this:
# Remove comment if needed
# import locale
# locale.setlocale(locale.LC_TIME, 'C')
# https://stackoverflow.com/a/16671271
def ordinal(n):  # named ordinal so the built-in ord() is not shadowed
    return str(n)+("th" if 4<=n%100<=20 else {1:"st",2:"nd",3:"rd"}.get(n%10, "th"))
result['Date'] = pd.to_datetime(result['Date'], format='%d-%b-%Y')
result['Date'] = result['Date'].dt.day.map(ordinal) + result['Date'].dt.strftime(' %b %Y')
# %-I is the 12-hour clock without zero padding (Linux/macOS; use %#I on Windows)
result['Time'] = pd.to_datetime(result['Time']).dt.strftime('%-I:%M %p')
# Now you can export
Output:
>>> result[['Date', 'Time']]
             Date      Time
0   30th Jan 2022   9:07 PM
1   30th Jan 2022   8:57 PM
2   30th Jan 2022   7:40 PM
3   30th Jan 2022   6:55 PM
4   30th Jan 2022   6:53 PM
5   30th Jan 2022   6:09 PM
6   30th Jan 2022   5:44 PM
7   30th Jan 2022   4:01 PM
8   30th Jan 2022   3:21 PM
9   30th Jan 2022   3:16 PM
10  30th Jan 2022   3:10 PM
11  30th Jan 2022   3:06 PM
12  30th Jan 2022   2:29 PM
13  30th Jan 2022   2:15 PM
14  30th Jan 2022   1:41 PM
15  30th Jan 2022  12:20 PM
16  30th Jan 2022  12:09 PM
17  30th Jan 2022  12:07 PM
18  30th Jan 2022  10:58 AM
19  30th Jan 2022  10:42 AM
20  30th Jan 2022  10:40 AM
21  30th Jan 2022  10:39 AM
22  30th Jan 2022  10:06 AM
23  30th Jan 2022   9:39 AM
24  30th Jan 2022   9:36 AM
25  30th Jan 2022   9:25 AM
26  30th Jan 2022   8:43 AM
27  30th Jan 2022   1:00 AM
28  30th Jan 2022  12:59 AM
29  30th Jan 2022  12:13 AM
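The unpadded-hour directives (%-H, %-I) are platform extensions of strftime and are not available on Windows. As a hedged alternative, the sketch below formats with the portable %I and strips the leading zero afterwards. The sample frame and the helper name ordinal are made up for illustration. The year uses %y to give the two-digit form asked for in the question.

```python
import pandas as pd

def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

# Hypothetical sample rows; the real data comes from the API call in the question
sample = pd.DataFrame({
    "Date": ["30-Jan-2022", "01-Feb-2022", "22-Mar-2022"],
    "Time": ["20:08:00", "00:59:00", "12:20:00"],
})

dates = pd.to_datetime(sample["Date"], format="%d-%b-%Y")
times = pd.to_datetime(sample["Time"], format="%H:%M:%S")

# Day with ordinal suffix, then abbreviated month and two-digit year
sample["Date"] = dates.dt.day.map(ordinal) + dates.dt.strftime(" %b %y")
# Portable 12-hour time: format with zero padding, then drop a leading zero
sample["Time"] = times.dt.strftime("%I:%M %p").str.lstrip("0")

print(sample)
```

Stripping the zero from the string avoids any platform-specific directive. Hours 10 to 12 are unaffected because they do not start with a zero.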
Related
How do you extract a variable that appears multiple times in a table only once?
I'm trying to extract the names of space organisations from a table, but the closest I can get is the number of times each name appears, shown next to the name of the organisation. I just want the name of each organisation, not the number of times it is named in the table. If you can help me, please leave a comment on my Google Colab: https://colab.research.google.com/drive/1m4zI4YGguQ5aWdDVyc7Bdpr-78KHdxhR?usp=sharing
What I get:
variable number  organisation  time of launch
0                SpaceX        Fri Aug 07, 2020 05:12 UTC
1                CASC          Thu Aug 06, 2020 04:01 UTC
2                SpaceX        Tue Aug 04, 2020 23:57 UTC
3                Roscosmos     Thu Jul 30, 2020 21:25 UTC
4                ULA           Thu Jul 30, 2020 11:50 UTC
...              ...           ...
4319             US Navy       Wed Feb 05, 1958 07:33 UTC
4320             AMBA          Sat Feb 01, 1958 03:48 UTC
4321             US Navy       Fri Dec 06, 1957 16:44 UTC
4322             RVSN USSR     Sun Nov 03, 1957 02:30 UTC
4323             RVSN USSR     Fri Oct 04, 1957 19:28 UTC
etc.
What I want:
organisation
RVSN USSR
Arianespace
CASC
General Dynamics
NASA
VKS RF
US Air Force
ULA
Boeing
Martin Marietta
etc.
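One possible way to get each name once is sketched below. The Colab notebook is not reproduced here, so the DataFrame name launches and its column names are assumptions based on the table shown in the question.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of the table in the question
launches = pd.DataFrame({
    "organisation": ["SpaceX", "CASC", "SpaceX", "Roscosmos", "ULA"],
    "time of launch": [
        "Fri Aug 07, 2020 05:12 UTC",
        "Thu Aug 06, 2020 04:01 UTC",
        "Tue Aug 04, 2020 23:57 UTC",
        "Thu Jul 30, 2020 21:25 UTC",
        "Thu Jul 30, 2020 11:50 UTC",
    ],
})

# value_counts() returns names together with how often each occurs
counts = launches["organisation"].value_counts()

# drop_duplicates() keeps only the first occurrence of each name, in table order
unique_orgs = launches["organisation"].drop_duplicates().reset_index(drop=True)

print(unique_orgs)
```

If only the values are needed, launches["organisation"].unique() returns the same names as an array.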
Merge Time Series-Data with different time delta
I am trying to merge two dataframes with different time deltas. One holds the daily returns of an asset (df2) and the other holds the inflation rate (df1), which is published once a month but not at a regular interval.
df1 =
                       First Release
Original Release Date
30 Jun 2010 10:01               1.4%
30 Jul 2010 10:00               1.7%
31 Aug 2010 10:00               1.6%
30 Sep 2010 10:00               1.8%
29 Oct 2010 10:02               1.9%
...                              ...
17 Mar 2022 11:00               5.9%
21 Apr 2022 10:00               7.4%
18 May 2022 10:00               7.4%
17 Jun 2022 10:00               8.1%
19 Jul 2022 10:00               8.6%
[145 rows x 1 columns]
df2 =
Date
2010-08-11   -0.001654
2010-08-12   -0.028538
2010-08-13    0.001072
2010-08-16   -0.007665
2010-08-17    0.002667
...
2022-01-25    0.029663
2022-01-26    0.026082
2022-01-27   -0.000115
2022-01-28    0.002425
2022-01-31    0.007184
The inflation rate should be placed in the new column from the day after it is released until there is a new release. For example, 30 June is the first announcement and 30 July the second, so 1 July to 30 July should show 1.4%. The figure is published on the 30th, but to avoid look-ahead bias it is more appropriate to have it apply from the following day. Does someone have an idea, or has anyone encountered a similar problem?
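One way to express the "valid from the day after release" rule is an as-of merge that excludes exact matches. This is a minimal sketch on made-up rows. The frame names inflation and returns and all column names are assumptions, not the asker's actual data.

```python
import pandas as pd

# Hypothetical monthly releases (irregular dates), as in df1
inflation = pd.DataFrame({
    "release": pd.to_datetime(
        ["30 Jun 2010 10:01", "30 Jul 2010 10:00", "31 Aug 2010 10:00"],
        format="%d %b %Y %H:%M",
    ),
    "inflation": ["1.4%", "1.7%", "1.6%"],
})

# Hypothetical daily returns, as in df2
returns = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2010-07-29", "2010-07-30", "2010-08-02", "2010-08-31", "2010-09-01"],
        format="%Y-%m-%d",
    ),
    "ret": [0.001, -0.002, 0.003, -0.001, 0.002],
})

# Drop the time of day so a release is compared with whole trading days
inflation["release_day"] = inflation["release"].dt.normalize()

# merge_asof needs both keys sorted and of the same dtype
inflation["release_day"] = inflation["release_day"].astype("datetime64[ns]")
returns["Date"] = returns["Date"].astype("datetime64[ns]")

# For each trading day take the latest release strictly before that day;
# allow_exact_matches=False makes a release count only from the next day on
merged = pd.merge_asof(
    returns,
    inflation[["release_day", "inflation"]],
    left_on="Date",
    right_on="release_day",
    direction="backward",
    allow_exact_matches=False,
)

print(merged[["Date", "ret", "inflation"]])
```

In this sketch the row for 2010-07-30 still carries 1.4%, and 1.7% first appears on the next trading day. Trading days before the first release would get NaN.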
SQL group by 7am to 7am
How do I simply group by a 24-hour interval from 7am to 7am, in a manner similar to:
select format(t_stamp,'yyyy-MMM') from mytable group by format(t_stamp,'yyyy-MMM')
If the input is like
3,Wed Mar 23 20:40:40 EDT 2022
3,Wed Mar 23 20:40:39 EDT 2022
4,Wed Mar 23 03:36:10 EDT 2022
3,Wed Mar 22 15:46:44 EST 2022
3,Tue Mar 22 04:16:52 EST 2022
4,Sat Mar 22 03:13:08 EDT 2022
3,Sat Mar 22 03:13:05 EDT 2022
4,Sat Mar 21 04:10:36 EDT 2022
the output should be like
6, Mar 23
7, Mar 22
10, Mar 21
4, Mar 20
SQL Custom unique Ordering with repeated sequence
I have a datetime column (data type timestamp without time zone) named time. I can best explain my issue with an example.
Example
I have the following data in this column (prettifying the timestamps for this example):
ID  TIME
1   1 Mar 2022 - 1PM
2   1 Mar 2022 - 2PM
3   1 Mar 2022 - 1PM
4   1 Mar 2022 - 3PM
5   1 Mar 2022 - 2PM
6   2 Mar 2022 - 2PM
7   2 Mar 2022 - 1PM
8   2 Mar 2022 - 3PM
9   2 Mar 2022 - 1PM
10  1 Mar 2022 - 3PM
11  2 Mar 2022 - 2PM
12  2 Mar 2022 - 3PM
13  3 Mar 2022 - 4PM
14  3 Mar 2022 - 3PM
15  3 Mar 2022 - 3PM
16  3 Mar 2022 - 4PM
If I do ORDER BY time, I get the following result:
ID  TIME
1   1 Mar 2022 - 1PM
3   1 Mar 2022 - 1PM
2   1 Mar 2022 - 2PM
5   1 Mar 2022 - 2PM
4   1 Mar 2022 - 3PM
10  1 Mar 2022 - 3PM
7   2 Mar 2022 - 1PM
9   2 Mar 2022 - 1PM
6   2 Mar 2022 - 2PM
11  2 Mar 2022 - 2PM
8   2 Mar 2022 - 3PM
12  2 Mar 2022 - 3PM
14  3 Mar 2022 - 3PM
15  3 Mar 2022 - 3PM
13  3 Mar 2022 - 4PM
16  3 Mar 2022 - 4PM
But I want the result in this way:
ID  TIME
1   1 Mar 2022 - 1PM
2   1 Mar 2022 - 2PM
4   1 Mar 2022 - 3PM
13  3 Mar 2022 - 4PM
3   1 Mar 2022 - 1PM
5   1 Mar 2022 - 2PM
10  1 Mar 2022 - 3PM
16  3 Mar 2022 - 4PM
7   2 Mar 2022 - 1PM
6   2 Mar 2022 - 2PM
8   2 Mar 2022 - 3PM
9   2 Mar 2022 - 1PM
11  2 Mar 2022 - 2PM
12  2 Mar 2022 - 3PM
14  3 Mar 2022 - 3PM
13  3 Mar 2022 - 4PM
As you can see, the first 4 rows have unique timestamps, and the sequence should repeat based on time (1PM, 2PM, 3PM). How can we do this in SQL? I'm using PostgreSQL as my DB and Rails for my backend.
EDIT: I have added more context to the example to explain my scenario.
One way is to use the ROW_NUMBER window function with the REPLACE function:
SELECT time
FROM (
    SELECT *,
           REPLACE(time,'PM','') val,
           ROW_NUMBER() OVER(PARTITION BY REPLACE(time,'PM','')) rn
    FROM T
) t1
ORDER BY rn,val
For example, a repeating sequence on the column a:
with tbl(a, othercol) as (
    SELECT 1,1 UNION ALL
    SELECT 1,2 UNION ALL
    SELECT 1,3 UNION ALL
    SELECT 2,4 UNION ALL
    SELECT 2,5 UNION ALL
    SELECT 2,6 UNION ALL
    SELECT 3,7 UNION ALL
    SELECT 3,8 UNION ALL
    SELECT 3,9
),
cte as (
    SELECT *, row_number() over(partition by a order by a) rn
    from tbl
)
select a, othercol
from cte
order by rn, a
The problem you have at hand is a direct result of not choosing the correct data type for the values you store. To get the sorting correct, you need to convert the string to a proper time value. There is no to_time() function in Postgres, but you can convert it to a timestamp and then cast it to a time:
order by to_timestamp("time", 'hham')::time
You should fix your database design and convert that column to a proper time type, which will also prevent storing invalid values ('3 in the afternoon' or '128foo') in that column.
Databricks: replicate columns
Suppose I have the following DataFrame:
YEAR  MONTH  Value
2019  JAN    100
2019  JAN    200
2019  MAR    400
2019  MAR    100
And I do the pivot, grouping by YEAR (df.groupBy().pivot()....):
YEAR  JAN  MAR
2019  300  500
But I also want a column for every month of the year, even if there is no data in that month, which means I would like to have:
YEAR  JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC
2019  300  0    500  0    0    0    0    0    0    0    0    0
Thanks