I have a dataframe with one column containing lists of unequal length, which I want to split into multiple columns (the item values will become the column names). An example is given below.
I have done this with iterrows, iterating through the rows and examining the list in each row. It seems workable because my dataframe has few rows. However, I wonder if there is a cleaner method.
I have also tried additional_df = pd.DataFrame(venue_df.location.values.tolist())
However, the lists are broken down as shown below.
Thanks for your help.
Can you try this code? It is built on the assumption that venue_df.location contains the lists you have shown in the cells.
venue_df['school'] = venue_df.location.apply(lambda x: ('school' in x)+0)
venue_df['office'] = venue_df.location.apply(lambda x: ('office' in x)+0)
venue_df['home'] = venue_df.location.apply(lambda x: ('home' in x)+0)
venue_df['public_area'] = venue_df.location.apply(lambda x: ('public_area' in x)+0)
Hope this helps!
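If the set of items is not known in advance, the same idea can be written as a loop instead of one line per item. This is only a minimal sketch: the sample venue_df below is hypothetical (the real data is not shown in the question), and the categories variable is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real venue_df is not shown in the question.
venue_df = pd.DataFrame({
    "location": [
        ["school", "office"],
        ["home"],
        ["public_area", "home", "school"],
    ]
})

# Collect every distinct item that appears in any of the lists.
categories = sorted({item for items in venue_df["location"] for item in items})

# Add one 0/1 indicator column per item.
for name in categories:
    venue_df[name] = venue_df["location"].apply(
        lambda items, name=name: int(name in items)
    )

print(venue_df)
```

Binding name as a default argument of the lambda keeps each column tied to its own item, and int(...) does the same job as the +0 trick above.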
First, let's explode your location column so that each list item gets its own row.
import pandas as pd
s = df['Location'].explode()
Then let's use crosstab on that series, with its index as the rows, to get your end result:
pd.crosstab(s.index, s)
I didn't test it out because I don't know your base df.
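For reference, here is a small self-contained sketch of the explode-then-crosstab idea. The sample df is hypothetical, since the original data is not shown, and indicators is a name introduced here. Note that crosstab needs both a row key and a column key, so the exploded series is tabulated against its own index.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dataframe.
df = pd.DataFrame({
    "Location": [
        ["school", "office"],
        ["home"],
        ["public_area", "home", "school"],
    ]
})

# One row per list item; the original row label is repeated in the index.
s = df["Location"].explode()

# Count each item per original row: rows are the original index,
# columns are the distinct items.
indicators = pd.crosstab(s.index, s)

print(indicators)
```

The cells are counts, so they are 0/1 only as long as no item is repeated inside a single list.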
I have a data frame with multiple columns, and some of these have missing values.
I would like to filter it so that I get back a dataframe containing only the rows that have missing values in one or two specific columns.
Can anyone help me figure out how to do that?
Given a dataframe "df" with a column "A":
df_missing = df[df['A'].isnull()]
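The same pattern extends to two columns. The sketch below uses a hypothetical df and hypothetical column names "A" and "B"; whether "missing in either" or "missing in both" is wanted is an assumption the asker would need to settle.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are placeholders.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0, np.nan],
    "B": ["x", "y", None, None],
    "C": [10, 20, 30, 40],
})

cols = ["A", "B"]

# Rows where at least one of the chosen columns is missing.
missing_any = df[df[cols].isna().any(axis=1)]

# Rows where all of the chosen columns are missing.
missing_all = df[df[cols].isna().all(axis=1)]

print(missing_any)
print(missing_all)
```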
I have a dataframe whose columns are mostly blank, but a few of the rows contain strings, and those are the only rows I want to see. I have tried the code below, but I don't know how to select only the strings in each column and append them to get a new dataframe that contains only the rows with strings.
columns = list(df)
for i in columns:
df1 = df[df[i]== ]
Can someone please help?
df[df['column_name'].isna()]
should do the trick
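Note that isna() keeps the rows where the column is blank; if the goal is to keep the rows that do contain strings, the complement is needed. The sketch below assumes that "blank" can mean either NaN or an empty string, which the question does not specify. The sample df and the has_text helper are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with both NaN and empty-string blanks.
df = pd.DataFrame({
    "notes": [np.nan, "call back", "", "done"],
    "tag": [np.nan, np.nan, "x", ""],
})


def has_text(column):
    """Return a boolean Series: True where the cell is a non-empty string."""
    return column.apply(lambda v: isinstance(v, str) and v.strip() != "")


# Rows where one specific column contains a string.
rows_with_notes = df[has_text(df["notes"])]

# Rows where any column contains a string.
rows_with_any_text = df[df.apply(has_text).any(axis=1)]

print(rows_with_notes)
print(rows_with_any_text)
```

If the blanks are guaranteed to be NaN, df[df['column_name'].notna()] is the shorter equivalent for a single column.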
I'm trying to understand how to concat three individual dataframes (i.e. df1, df2, df3) into a new dataframe, say df4, so that each individual dataframe keeps its own columns, placed side by side in left-to-right order.
I've tried using concat with axis=1 to do this, but it does not seem possible to do it in a single step.
Table1_updated = pd.DataFrame(columns=['3P','2PG-3Io','3Io'])
Table1_updated=pd.concat([get_table1_3P,get_table1_2P_max_3Io,get_table1_3Io])
Note that, with the exception of get_table1_2P_max_3Io, which has two columns, all the other dataframes have one column.
For example,
get_table1_3P =
get_table1_2P_max_3Io =
get_table1_3Io =
Ultimately, I would like to see the following:
I believe you need to concat first and then change the column order with a list of column names:
Table1_updated=pd.concat([get_table1_3P,get_table1_2P_max_3Io,get_table1_3Io], axis=1)
Table1_updated = Table1_updated[['3P','2PG-3Io','3Io']]
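A runnable sketch of these two steps follows. The values and the extra "2PG" column name are hypothetical, since the original tables are not shown. Resetting the index first is an added precaution, on the assumption that the frames should be matched by row position, because concat with axis=1 aligns on index labels.

```python
import pandas as pd

# Hypothetical stand-ins for the three tables; values are made up.
get_table1_3P = pd.DataFrame({"3P": [1.0, 2.0, 3.0]})
get_table1_2P_max_3Io = pd.DataFrame({
    "2PG": [0.5, 0.6, 0.7],      # hypothetical name for the second column
    "2PG-3Io": [0.1, 0.2, 0.3],
})
get_table1_3Io = pd.DataFrame({"3Io": [9.0, 8.0, 7.0]})

frames = [get_table1_3P, get_table1_2P_max_3Io, get_table1_3Io]

# Place the frames side by side, matching rows by position.
Table1_updated = pd.concat(
    [frame.reset_index(drop=True) for frame in frames], axis=1
)

# Keep only the wanted columns, in the wanted left-to-right order.
Table1_updated = Table1_updated[["3P", "2PG-3Io", "3Io"]]

print(Table1_updated)
```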
I have two csv files that I want to merge by adding the columns from one csv to the other. They have no common key between them, but they do have the same number of rows, and the rows are in matching order. I have seen many examples of joining csv files on a shared index or key, but my files have nothing like that; they are only in the same order. I've tried a few different examples with no luck.
mycsvfile1
"a","1","mike"
"b","2","sally"
"c","3","derek"
mycsvfile2
"boy","63","retired"
"girl","55","employed"
"boy","22","student"
Desired outcome for outcsvfile3
"a","1","mike","boy","63","retired"
"b","2","sally","girl","55","employed"
"c","3","derek","boy","22","student"
Code:
import csv
import pandas as pd
df1 = pd.read_csv("mycsvfile1.csv", header=None)
df2 = pd.read_csv("mycsvfile2.csv", header=None)
df3 = pd.merge(df1,df2)
Using
df3 = pd.concat([df1,df2])
adds the data as new rows, which doesn't help me. Any assistance is greatly appreciated.
If both dataframes have numbered indexes (i.e. starting at 0 and increasing by 1 - which is the default behaviour of pd.read_csv), and assuming that both DataFrames are already sorted in the correct order so that the rows match up, then this should do it:
df3 = pd.merge(df1,df2, left_index=True, right_index=True)
You do not have any common columns between df1 and df2 besides the index, so we can use concat:
pd.concat([df1,df2],axis=1)
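As an end-to-end sketch, the sample rows from the question can be read from in-memory strings instead of files. The io.StringIO buffers, the dtype=str setting, and the quoting option are assumptions added here so the output matches the quoted style of the desired result.

```python
import csv
import io

import pandas as pd

# The sample rows from the question, held in memory instead of on disk.
csv1 = '"a","1","mike"\n"b","2","sally"\n"c","3","derek"\n'
csv2 = '"boy","63","retired"\n"girl","55","employed"\n"boy","22","student"\n'

# dtype=str keeps every field as text, exactly as it appears in the file.
df1 = pd.read_csv(io.StringIO(csv1), header=None, dtype=str)
df2 = pd.read_csv(io.StringIO(csv2), header=None, dtype=str)

# Place the columns side by side; ignore_index renumbers the columns 0..5.
df3 = pd.concat([df1, df2], axis=1, ignore_index=True)

# Serialise without index or header, quoting every field.
out = df3.to_csv(index=False, header=False, quoting=csv.QUOTE_ALL)
print(out)
```

To write a file instead, pass a path such as "outcsvfile3.csv" as the first argument of to_csv.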