Merging DataFrames on a specific column together - pandas

I tried to run my self-written function in a for loop.
Some remarks in advance:
ma_strategy is my function and requires three inputs
ticker_list is a list of strings
result is a pandas DataFrame with 7 columns, and I can access the column 'return_cum' with result['return_cum']. The rows of this column contain floating-point numbers.
My intention is the following:
The for loop should iterate over the items in my ticker_list and should save the 'return_cum' columns in a DataFrame. Then the different 'return_cum' columns should be stored together so that at the end I get a DataFrame with all the 'return_cum' columns of my ticker list.
How can I achieve that goal?
My approach is:
for i in ticker_list:
    result = ma_strategy(i, 20, 5)
    x = result['return_cum'].to_frame()
But at this stage I need some help.

If I understood you correctly, this should work:
import pandas as pd
result_df = pd.DataFrame()
for i in ticker_list:
    result = ma_strategy(i, 20, 5)
    # store each ticker's cumulative return in its own column
    result_df[i + '_return_cum'] = result['return_cum']
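As a variation on the loop above, here is a minimal sketch that collects the columns in a dict and joins them once with pd.concat. The ma_strategy below is only a hypothetical stand-in (the real function is not shown in the question), and its parameter names and the ticker symbols are made up for illustration.

```python
import numpy as np
import pandas as pd


def ma_strategy(ticker, param_a, param_b):
    """Hypothetical stand-in for the asker's function.

    Returns a DataFrame that has a 'return_cum' column of floats.
    """
    base = float(len(ticker))
    return pd.DataFrame({"return_cum": np.linspace(0.0, base, num=5)})


ticker_list = ["AAA", "BB", "C"]  # made-up tickers

# One Series per ticker, keyed by the column name it should get.
columns = {
    f"{ticker}_return_cum": ma_strategy(ticker, 20, 5)["return_cum"]
    for ticker in ticker_list
}

# The dict keys become the column names; rows are aligned on the index.
result_df = pd.concat(columns, axis=1)
print(result_df)
```

Because pd.concat aligns on the index, tickers whose results cover different dates would simply get NaN where they have no data.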

Related

Compile a count of similar rows in a Pandas Dataframe based on multiple column values

I have two Dataframes, one containing my data read in from a CSV file and another that has the data grouped by all of the columns but the last and reindexed to contain a column for the count of the size of the groups.
df_k1 = pd.read_csv(filename, sep=';')
columns_for_groups = list(df_k1.columns)[:-1]
k1_grouped = df_k1.groupby(columns_for_groups).size().reset_index(name="Count")
I need to create a series such that every row(i) in the series corresponds to row(i) in my original Dataframe but the contents of the series need to be the size of the group that the row belongs to in the grouped Dataframe. I currently have this, and it works for my purposes, but I was wondering if anyone knew of a faster or more elegant solution.
size_by_row = []
for row in df_k1.itertuples():
    for group in k1_grouped.itertuples():
        if row[1:-1] == group[1:-1]:
            size_by_row.append(group[-1])
            break
group_size = pd.Series(size_by_row)
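One possible way to avoid the nested loops is groupby with transform('size'), which returns one value per original row. This is only a sketch on made-up data: the small df_k1 below replaces the CSV from the question, and its column names are assumptions.

```python
import pandas as pd

# Made-up stand-in for the CSV data in the question.
df_k1 = pd.DataFrame(
    {
        "a": [1, 1, 2, 1],
        "b": ["x", "x", "y", "x"],
        "label": [0, 1, 0, 1],
    }
)

# Group by every column except the last one, as in the question.
columns_for_groups = list(df_k1.columns)[:-1]
last_column = df_k1.columns[-1]

# transform('size') broadcasts each group's size back to its rows,
# so the result is aligned with df_k1's index.
group_size = df_k1.groupby(columns_for_groups)[last_column].transform("size")
print(group_size.tolist())
```

Note that rows with NaN in a grouping column are excluded from the groups by default, so they would need separate handling.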

pandas: split pandas columns of unequal length list into multiple columns

I have a dataframe with one column of lists of unequal length, which I want to split into multiple columns (the item values will be the column names). An example is given below
I have done this with iterrows, iterating through the rows and examining the list in each row. It seems workable as my dataframe has few rows. However, I wonder if there is a cleaner method.
I have also tried additional_df = pd.DataFrame(venue_df.location.values.tolist())
However, the lists break down as below
Thanks for your help
Can you try this code? It assumes venue_df.location contains the lists you have shown in the cells.
venue_df['school'] = venue_df.location.apply(lambda x: ('school' in x)+0)
venue_df['office'] = venue_df.location.apply(lambda x: ('office' in x)+0)
venue_df['home'] = venue_df.location.apply(lambda x: ('home' in x)+0)
venue_df['public_area'] = venue_df.location.apply(lambda x: ('public_area' in x)+0)
Hope this helps!
First, let's explode your location column so we can get your wanted end result.
import pandas as pd
s = df['Location'].explode()
Then let's use crosstab on that series, with the original row index as the rows:
pd.crosstab(s.index, s)
I didn't test it because I don't know your base df.
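The explode-then-crosstab idea can be sketched on made-up data as follows. The venue_df contents are hypothetical, since the original example table is not available, and crosstab is given the row index explicitly as its first argument.

```python
import pandas as pd

# Hypothetical data: one column holding lists of unequal length.
venue_df = pd.DataFrame(
    {
        "location": [
            ["school", "office"],
            ["home"],
            ["public_area", "school", "home"],
        ]
    }
)

# One row per list item; the original row label is repeated in the index.
s = venue_df["location"].explode()

# Count each item per original row: rows are the original index,
# columns are the distinct item values.
indicators = pd.crosstab(s.index, s)
print(indicators)
```

The cells hold counts, so an item that appears twice in one list would show 2 rather than 1.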

Pandas split list inside a column into separate columns

I have a dataset with 71 columns and 113 rows. Each column is an array of values. I want to split these arrays into separate columns. Then rename the columns with the prefix
!wget https://raw.githubusercontent.com/pranavn91/sample/master/audioonly.csv
audio = pd.read_csv("audioonly.csv")
zcr = pd.DataFrame(audio['zcr'].str.split().values.tolist())
zcr.columns = ['zcr_' + str(col) for col in zcr.columns]
I can do it for each column individually and combine as single dataframe.
Please propose a faster method.
You can use concat and a list comprehension:
audio_exploded = pd.concat(
    [pd.DataFrame(audio[col].str.split().values.tolist()).add_prefix(f'{col}_')
     for col in audio.columns],
    axis=1)
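Here is a small self-contained sketch of the same concat idea, using str.split(expand=True) instead of building the frame from a list. The two-column audio frame and the name 'rms' are made up for illustration; the real CSV has 71 columns.

```python
import pandas as pd

# Made-up stand-in for the CSV: each cell is a whitespace-separated string.
audio = pd.DataFrame(
    {
        "zcr": ["0.1 0.2 0.3", "0.4 0.5 0.6"],
        "rms": ["1.0 2.0", "3.0 4.0"],
    }
)

# Split every column into its own block of columns, prefix each block
# with the source column name, then join the blocks side by side.
audio_exploded = pd.concat(
    [
        audio[col].str.split(expand=True).add_prefix(f"{col}_")
        for col in audio.columns
    ],
    axis=1,
).astype(float)

print(audio_exploded.columns.tolist())
```

The final astype(float) assumes every piece is numeric and that all rows of a column split into the same number of pieces; otherwise the missing cells would need handling first.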

How to concat 3 dataframes with each into sequential columns

I'm trying to understand how to concat three individual dataframes (i.e df1, df2, df3) into a new dataframe say df4 whereby each individual dataframe has its own column left to right order.
I've tried using concat with axis = 1 to do this, but it appears not possible to automate this with a single action.
Table1_updated = pd.DataFrame(columns=['3P','2PG-3Io','3Io'])
Table1_updated=pd.concat([get_table1_3P,get_table1_2P_max_3Io,get_table1_3Io])
Note that with the exception of get_table1_2P_max_3Io, which has two columns, all other dataframes have one column
For example,
get_table1_3P =
get_table1_2P_max_3Io =
get_table1_3Io =
Ultimately, I would like to see the following:
I believe you need to concat first and then change the order with a list of column names:
Table1_updated=pd.concat([get_table1_3P,get_table1_2P_max_3Io,get_table1_3Io], axis=1)
Table1_updated = Table1_updated[['3P','2PG-3Io','3Io']]
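A minimal runnable sketch of this concat-then-reorder step is below. The values are made up, and the second column of get_table1_2P_max_3Io is given the hypothetical name '2P', since the real tables are not shown.

```python
import pandas as pd

# Made-up stand-ins for the three tables in the question.
get_table1_3P = pd.DataFrame({"3P": [1.0, 2.0]})
get_table1_2P_max_3Io = pd.DataFrame({"2P": [5.0, 6.0], "2PG-3Io": [3.0, 4.0]})
get_table1_3Io = pd.DataFrame({"3Io": [7.0, 8.0]})

# axis=1 places the frames side by side, aligned on the row index.
Table1_updated = pd.concat(
    [get_table1_3P, get_table1_2P_max_3Io, get_table1_3Io], axis=1
)

# Selecting with a list of names both orders the columns and drops
# any column that is not listed (here the hypothetical '2P').
Table1_updated = Table1_updated[["3P", "2PG-3Io", "3Io"]]
print(Table1_updated)
```

If the frames do not share the same index, concat with axis=1 produces NaN in the non-matching rows, so resetting the index first may be necessary.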

Merging Dataframe within a for loop

I tried to perform my self-created function on a for loop, but it does not work as expected.
Some remarks in advance:
ma_strategy is my function and requires three inputs
ticker_list is a list with strings
result is a pandas Dataframe with 7 columns and I can call the column 'return_cum' with result['return_cum']. The rows of this column are containing floating point numbers.
These for loops don't work:
for i in ticker_list:
    result = ma_strategy(i, 20, 5)
    x = result['return_cum']
    sample_returns = pd.DataFrame
    y = pd.merge(x.to_frame(),sample_returns, left_index=True)
for i in ticker_list:
    result = ma_strategy(i, 20, 5)
    x = result[['return_cum']]
    sample_returns = pd.DataFrame
    y = pd.concat([sample_returns, x], axis=1)
My intention is the following:
The for loop should iterate over the items in my ticker_list and should save the 'return_cum' columns in x. Then the 'return_cum' columns should be stored in y together so that at the end I get a DataFrame with all the 'return_cum' columns of my ticker list.
How can I achieve that goal? I tried pd.concat and merge, but nothing works.
Thanks for your help!