Convert a Particular Dataframe Column Into Customized Date Or Time - pandas

As you can see, a Date column and a Time column are being saved in this CSV file. The problem is that the date and time are in a format like 30-1-2022 and 20:08:00.
But I want them to look like 30th Jan 22 and 8:08 PM.
Is there any code for that?
import requests
import pandas as pd
from datetime import datetime
from datetime import date
currentd = date.today()
s = requests.Session()
headers = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
url = 'https://www.nseindia.com/'
step = s.get(url,headers=headers)
today = datetime.now().strftime('%d-%m-%Y')
api_url = f'https://www.nseindia.com/api/corporate-announcements?index=equities&from_date={today}&to_date={today}'
resp = s.get(api_url,headers=headers).json()
result = pd.DataFrame(resp)
result.drop(['difference', 'dt','exchdisstime','csvName','old_new','orgid','seq_id','sm_isin','bflag','symbol','sort_date'], axis = 1, inplace = True)
result.rename(columns = {'an_dt':'DateandTime', 'attchmntFile':'Source','attchmntText':'Topic','desc':'Type','smIndustry':'Sector','sm_name':'Company Name'}, inplace = True)
result[['Date','Time']] = result.DateandTime.str.split(expand=True)
result.drop(['DateandTime'], axis = 1, inplace = True)
result.to_csv( ( str(currentd.day) +'-'+str(currentd.month) +'-'+'CA.csv'), index=True)
print('Saved the CSV File')

Try creating a temporary column:
result['Full_date'] = pd.to_datetime(result['Date'] + ' ' + result['Time'], dayfirst=True)
Then format 'Date' and 'Time':
result['Date'] = result['Full_date'].dt.strftime('%b %d, %Y')
# %I is the 12-hour clock; %R would give 24-hour HH:MM and is not available on every platform
result['Time'] = result['Full_date'].dt.strftime('%I:%M %p')

Try this:
# Remove comment if needed
# import locale
# locale.setlocale(locale.LC_TIME, 'C')
# https://stackoverflow.com/a/16671271
def ordinal(n):
    # 4th to 20th all take "th"; otherwise the last digit decides
    return str(n) + ("th" if 4 <= n % 100 <= 20 else {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th"))
result['Date'] = pd.to_datetime(result['Date'], format='%d-%b-%Y')
result['Date'] = result['Date'].dt.day.map(ordinal) + result['Date'].dt.strftime(' %b %Y')
# %-I is the 12-hour clock without zero padding (Linux and macOS; Windows uses %#I)
result['Time'] = pd.to_datetime(result['Time']).dt.strftime('%-I:%M %p')
# Now you can export
Output:
>>> result[['Date', 'Time']]
             Date      Time
0   30th Jan 2022   9:07 PM
1   30th Jan 2022   8:57 PM
2   30th Jan 2022   7:40 PM
3   30th Jan 2022   6:55 PM
4   30th Jan 2022   6:53 PM
5   30th Jan 2022   6:09 PM
6   30th Jan 2022   5:44 PM
7   30th Jan 2022   4:01 PM
8   30th Jan 2022   3:21 PM
9   30th Jan 2022   3:16 PM
10  30th Jan 2022   3:10 PM
11  30th Jan 2022   3:06 PM
12  30th Jan 2022   2:29 PM
13  30th Jan 2022   2:15 PM
14  30th Jan 2022   1:41 PM
15  30th Jan 2022  12:20 PM
16  30th Jan 2022  12:09 PM
17  30th Jan 2022  12:07 PM
18  30th Jan 2022  10:58 AM
19  30th Jan 2022  10:42 AM
20  30th Jan 2022  10:40 AM
21  30th Jan 2022  10:39 AM
22  30th Jan 2022  10:06 AM
23  30th Jan 2022   9:39 AM
24  30th Jan 2022   9:36 AM
25  30th Jan 2022   9:25 AM
26  30th Jan 2022   8:43 AM
27  30th Jan 2022   1:00 AM
28  30th Jan 2022  12:59 AM
29  30th Jan 2022  12:13 AM
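
If the platform-specific padding flags are a concern, a variant along these lines might work as well. It is only a sketch: the sample rows are made up to stand in for the API response, it assumes the date arrives as day-abbreviated month-year (as the format string above suggests), and it assumes an English locale for %b and %p. It also uses the two-digit year the question asked for.

```python
import pandas as pd


def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 30 -> '30th'."""
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


# Hypothetical sample rows standing in for the API response.
result = pd.DataFrame({
    "Date": ["30-Jan-2022", "1-Feb-2022", "22-Mar-2022"],
    "Time": ["20:08:00", "00:13:00", "12:20:00"],
})

# Parse date and time together into one datetime column.
stamp = pd.to_datetime(result["Date"] + " " + result["Time"],
                       format="%d-%b-%Y %H:%M:%S")

# Day with ordinal suffix, then abbreviated month and two-digit year.
result["Date"] = stamp.dt.day.map(ordinal) + stamp.dt.strftime(" %b %y")
# %I is zero-padded on every platform, so strip the leading zero by hand.
result["Time"] = stamp.dt.strftime("%I:%M %p").str.lstrip("0")

print(result)
```

Stripping the leading zero with str.lstrip avoids choosing between %-I and %#I, at the cost of one extra string operation.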

Related

How do you extract a value that appears multiple times in a table only once?

I'm trying to extract the names of space organisations from a table, but the closest I can get is each name alongside the number of times it appears. I just want the name of each organisation, listed once, not how many times it occurs in the table.
If you can help, please leave a comment on my Google Colab notebook:
https://colab.research.google.com/drive/1m4zI4YGguQ5aWdDVyc7Bdpr-78KHdxhR?usp=sharing
What I get:
variable number  organisation  time of launch
0                SpaceX        Fri Aug 07, 2020 05:12 UTC
1                CASC          Thu Aug 06, 2020 04:01 UTC
2                SpaceX        Tue Aug 04, 2020 23:57 UTC
3                Roscosmos     Thu Jul 30, 2020 21:25 UTC
4                ULA           Thu Jul 30, 2020 11:50 UTC
...              ...           ...
4319             US Navy       Wed Feb 05, 1958 07:33 UTC
4320             AMBA          Sat Feb 01, 1958 03:48 UTC
4321             US Navy       Fri Dec 06, 1957 16:44 UTC
4322             RVSN USSR     Sun Nov 03, 1957 02:30 UTC
4323             RVSN USSR     Fri Oct 04, 1957 19:28 UTC
etc              etc           etc
What I want:
organisation
RVSN USSR
Arianespace
CASC
General Dynamics
NASA
VKS RF
US Air Force
ULA
Boeing
Martin Marietta
etc
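
One possible approach, sketched here with a made-up five-row frame rather than the data from the notebook, is to drop the repeated names instead of counting them. The frame name launches and the column name organisation are assumptions based on the table above.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of the table in the question.
launches = pd.DataFrame({
    "organisation": ["SpaceX", "CASC", "SpaceX", "Roscosmos", "ULA"],
    "time of launch": [
        "Fri Aug 07, 2020 05:12 UTC",
        "Thu Aug 06, 2020 04:01 UTC",
        "Tue Aug 04, 2020 23:57 UTC",
        "Thu Jul 30, 2020 21:25 UTC",
        "Thu Jul 30, 2020 11:50 UTC",
    ],
})

# value_counts() would return each name together with how often it occurs;
# drop_duplicates() keeps only the first occurrence of each name.
organisations = (
    launches["organisation"]
    .drop_duplicates()
    .reset_index(drop=True)
)

print(organisations.tolist())
```

Series.unique() would give the same names as a NumPy array if a Series is not needed.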

Merge Time Series-Data with different time delta

I am trying to merge two dataframes with different time deltas. One holds the daily returns of an asset (df2) and the other the inflation rate (df1), which is published once a month but not at a regular interval. I am trying to merge those two.
df1 =
                      First Release
Original Release Date
30 Jun 2010 10:01              1.4%
30 Jul 2010 10:00              1.7%
31 Aug 2010 10:00              1.6%
30 Sep 2010 10:00              1.8%
29 Oct 2010 10:02              1.9%
...                             ...
17 Mar 2022 11:00              5.9%
21 Apr 2022 10:00              7.4%
18 May 2022 10:00              7.4%
17 Jun 2022 10:00              8.1%
19 Jul 2022 10:00              8.6%
[145 rows x 1 columns]
df2 =
Date
2010-08-11 -0.001654
2010-08-12 -0.028538
2010-08-13 0.001072
2010-08-16 -0.007665
2010-08-17 0.002667
...
2022-01-25 0.029663
2022-01-26 0.026082
2022-01-27 -0.000115
2022-01-28 0.002425
2022-01-31 0.007184
Obviously, the inflation rate should be placed in the new column from the day after it is released until there is a new release. For example, 30 June is the first announcement and 30 July the second, so 1 July to 30 July should be 1.4%. The result is published on the 30th, but to avoid look-ahead bias it is more appropriate to apply it from the following day. Does anyone have an idea, or has anyone encountered a similar problem?
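
One way this could be approached is pandas.merge_asof, which matches each daily row to the most recent earlier release. The sketch below uses small made-up frames, and the names inflation, returns, release_day and ret are hypothetical. Setting allow_exact_matches=False makes a release apply only from the following day, which is the look-ahead rule described above.

```python
import pandas as pd

# Hypothetical stand-ins for df1 (releases) and df2 (daily returns).
inflation = pd.DataFrame({
    "release": pd.to_datetime(
        ["2010-06-30 10:01", "2010-07-30 10:00", "2010-08-31 10:00"]
    ),
    "inflation": ["1.4%", "1.7%", "1.6%"],
})
returns = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2010-07-01", "2010-07-30", "2010-08-02", "2010-08-31", "2010-09-01"]
    ),
    "ret": [-0.001654, -0.028538, 0.001072, -0.007665, 0.002667],
})

# Drop the time of day so a release is compared by calendar day only.
inflation["release_day"] = inflation["release"].dt.normalize()

# For each return date, take the latest release strictly before that date.
merged = pd.merge_asof(
    returns.sort_values("Date"),
    inflation.sort_values("release_day")[["release_day", "inflation"]],
    left_on="Date",
    right_on="release_day",
    direction="backward",
    allow_exact_matches=False,  # a release on day D applies from D+1
)

print(merged[["Date", "ret", "inflation"]])
```

Both frames must be sorted on their keys, and return dates that fall before the first release end up with a missing inflation value.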

SQL group by 7am to 7am

How do I simply group by a 24 hour interval from 7am to 7am in a manner similar to:
select format(t_stamp,'yyyy-MMM')
from mytable
group by format(t_stamp,'yyyy-MMM')
If the input is like
3,Wed Mar 23 20:40:40 EDT 2022
3,Wed Mar 23 20:40:39 EDT 2022
4,Wed Mar 23 03:36:10 EDT 2022
3,Wed Mar 22 15:46:44 EST 2022
3,Tue Mar 22 04:16:52 EST 2022
4,Sat Mar 22 03:13:08 EDT 2022
3,Sat Mar 22 03:13:05 EDT 2022
4,Sat Mar 21 04:10:36 EDT 2022
the output should be like
6, Mar 23
7, Mar 22
10, Mar 21
4, Mar 20
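
The usual idea is to shift every timestamp back by seven hours, so that 07:00 becomes midnight, and then group on the calendar date of the shifted value. The sketch below illustrates that idea in pandas rather than SQL. The sample rows are made up, time zones are ignored, and the names events and bucket are hypothetical. In SQL the same shift would be applied to t_stamp before grouping, with whatever date arithmetic the database provides.

```python
import pandas as pd

# Hypothetical sample of (value, timestamp) rows; time zones are left out.
events = pd.DataFrame({
    "value": [3, 3, 4, 3, 3, 4],
    "t_stamp": pd.to_datetime([
        "2022-03-23 20:40:40",
        "2022-03-23 20:40:39",
        "2022-03-23 03:36:10",
        "2022-03-22 15:46:44",
        "2022-03-22 04:16:52",
        "2022-03-21 04:10:36",
    ]),
})

# Shift back by 7 hours so 07:00 becomes midnight, then truncate to the day.
events["bucket"] = (events["t_stamp"] - pd.Timedelta(hours=7)).dt.normalize()

# Sum the values per 07:00-to-07:00 window, newest window first.
totals = events.groupby("bucket")["value"].sum().sort_index(ascending=False)

print(totals)
```

A row stamped 03:36 on 23 March lands in the 22 March bucket, because it falls before that day's 07:00 cut-off.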

SQL Custom unique Ordering with repeated sequence

I have a datetime column (data type timestamp without time zone) named time. I can best explain my issue with an example.
For example, I have the following data in this column (prettifying the timestamps for this example):
ID TIME
1 1 Mar 2022 - 1PM
2 1 Mar 2022 - 2PM
3 1 Mar 2022 - 1PM
4 1 Mar 2022 - 3PM
5 1 Mar 2022 - 2PM
6 2 Mar 2022 - 2PM
7 2 Mar 2022 - 1PM
8 2 Mar 2022 - 3PM
9 2 Mar 2022 - 1PM
10 1 Mar 2022 - 3PM
11 2 Mar 2022 - 2PM
12 2 Mar 2022 - 3PM
13 3 Mar 2022 - 4PM
14 3 Mar 2022 - 3PM
15 3 Mar 2022 - 3PM
16 3 Mar 2022 - 4PM
If i do ORDER BY time, i get the following result:
ID TIME
1 1 Mar 2022 - 1PM
3 1 Mar 2022 - 1PM
2 1 Mar 2022 - 2PM
5 1 Mar 2022 - 2PM
4 1 Mar 2022 - 3PM
10 1 Mar 2022 - 3PM
7 2 Mar 2022 - 1PM
9 2 Mar 2022 - 1PM
6 2 Mar 2022 - 2PM
11 2 Mar 2022 - 2PM
8 2 Mar 2022 - 3PM
12 2 Mar 2022 - 3PM
14 3 Mar 2022 - 3PM
15 3 Mar 2022 - 3PM
13 3 Mar 2022 - 4PM
16 3 Mar 2022 - 4PM
But i want the result in this way:
ID TIME
1 1 Mar 2022 - 1PM
2 1 Mar 2022 - 2PM
4 1 Mar 2022 - 3PM
13 3 Mar 2022 - 4PM
3 1 Mar 2022 - 1PM
5 1 Mar 2022 - 2PM
10 1 Mar 2022 - 3PM
16 3 Mar 2022 - 4PM
7 2 Mar 2022 - 1PM
6 2 Mar 2022 - 2PM
8 2 Mar 2022 - 3PM
9 2 Mar 2022 - 1PM
11 2 Mar 2022 - 2PM
12 2 Mar 2022 - 3PM
14 3 Mar 2022 - 3PM
13 3 Mar 2022 - 4PM
As you can see, the first 4 rows have unique timestamps, and the sequence should repeat based on time (1PM, 2PM, 3PM).
How can we do this in SQL? I'm using PostgreSQL as my DB and Rails for my backend.
EDIT:
I have added more context to the example to explain my scenario.
One way is to use the ROW_NUMBER window function together with the REPLACE function:
SELECT time
FROM (
    SELECT *, REPLACE(time, 'PM', '') val,
        ROW_NUMBER() OVER (PARTITION BY REPLACE(time, 'PM', '')) rn
    FROM T
) t1
ORDER BY rn, val
For example, to repeat the sequence of column a:
with tbl(a, othercol) as
(
    SELECT 1,1 UNION ALL
    SELECT 1,2 UNION ALL
    SELECT 1,3 UNION ALL
    SELECT 2,4 UNION ALL
    SELECT 2,5 UNION ALL
    SELECT 2,6 UNION ALL
    SELECT 3,7 UNION ALL
    SELECT 3,8 UNION ALL
    SELECT 3,9
),
cte as (
    SELECT *, row_number() over(partition by a order by a) rn
    from tbl
)
select a, othercol
from cte
order by rn, a
The problem you have at hand is a direct result of not choosing the correct data type for the values you store.
To get the sorting correct, you need to convert the string to a proper time value. There is no to_time() function in Postgres, but you can convert it to a timestamp then cast it to a time:
order by to_timestamp("time", 'hham')::time
You should fix your database design and convert that column to a proper time type, which will also prevent storing invalid values ('3 in the afternoon' or '128foo') in that column.

Databricks: replicate columns

Suppose I have the following DataFrame:
YEAR MONTH Value
2019 JAN 100
2019 JAN 200
2019 MAR 400
2019 MAR 100
And I pivot it, grouped by YEAR (df.groupBy().pivot()....):
YEAR JAN MAR
2019 300 500
But I also want a column for every month of the year, even when there is no data for that month,
which means I would like to have
YEAR JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
2019 300 0 500 0 0 0 0 0 0 0 0 0
Thanks
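
The general idea is to give the pivot the full list of months up front instead of letting it discover them from the data. The sketch below shows that idea in pandas, not Spark, so it runs without a cluster, and the constant name MONTHS is an assumption. In PySpark, GroupedData.pivot also accepts an explicit list of values as its second argument, which should serve the same purpose once the resulting nulls are filled with 0.

```python
import pandas as pd

# All twelve month labels, in calendar order.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# The sample data from the question.
df = pd.DataFrame({
    "YEAR": [2019, 2019, 2019, 2019],
    "MONTH": ["JAN", "JAN", "MAR", "MAR"],
    "Value": [100, 200, 400, 100],
})

# Sum per year and month, then force one column per month, filling gaps with 0.
pivoted = (
    df.pivot_table(index="YEAR", columns="MONTH", values="Value",
                   aggfunc="sum", fill_value=0)
      .reindex(columns=MONTHS, fill_value=0)
)

print(pivoted)
```

Reindexing on a fixed list also fixes the column order, which a plain pivot would otherwise sort alphabetically.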