pandas how to explode from two cells element-wise [duplicate]

This question already has answers here:
Efficient way to unnest (explode) multiple list columns in a pandas DataFrame
(7 answers)
Closed 9 months ago.
I have a dataframe:
df =
A B C
1 [2,3] [4,5]
And I want to explode it element-wise based on [B,C] to get:
df =
A B C
1 2 4
1 3 5
What is the best way to do so?
B and C are always at the same length.
Thanks

Try this in pandas 1.3.2 (passing a list of columns to explode requires pandas 1.3.0 or later):
df.explode(['B', 'C'])
Output:
A B C
0 1 2 4
0 1 3 5
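For reference, here is a minimal, self-contained sketch of the answer above. The frame is rebuilt from the question's example; the `reset_index` step is an optional extra that is not part of the original answer, for when a fresh 0..n-1 index is wanted instead of the repeated label.

```python
import pandas as pd

# Sample frame from the question: B and C hold lists of equal length per row.
df = pd.DataFrame({"A": [1], "B": [[2, 3]], "C": [[4, 5]]})

# Multi-column explode (pandas >= 1.3.0). The original index label 0 is
# repeated for every element that came from the same row.
out = df.explode(["B", "C"])
print(out)

# Optional (not in the original answer): renumber the rows 0..n-1.
out_reset = out.reset_index(drop=True)
print(out_reset)
```

If the lists in B and C had different lengths within a row, `explode` would raise a `ValueError` rather than pad, so the "always the same length" condition from the question matters.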

Related

Merge and interleave rows of two dataframes [duplicate]

This question already has answers here:
Pandas - Interleave / Zip two DataFrames by row
(5 answers)
Closed 20 days ago.
Suppose we have:
>>> df1
A B
0 1 a
1 2 a
2 3 a
3 4 a
>>> df2
A B
0 1 b
1 2 b
2 3 b
3 5 b
I would like to merge them on "A" and then list them by interleaving rows like:
A B
0 1 a
0 1 b
1 2 a
1 2 b
2 3 a
2 3 b
I tried merge, but it lists them column by column. For example, if I have 3 or more data frames, merge can join them on some columns, but my problem would then be to interleave them.
If you need to match on A, filter the rows with Series.isin in boolean indexing, then pass both frames to concat and sort with DataFrame.sort_index:
df = pd.concat([df1[df1.A.isin(df2.A)],
                df2[df2.A.isin(df1.A)]]).sort_index(kind='stable')
print(df)
A B
0 1 a
0 1 b
1 2 a
1 2 b
2 3 a
2 3 b
EDIT:
For general data, it is possible to sort each frame by A and create a default index so that the rows interleave correctly:
df = (pd.concat([df1[df1.A.isin(df2.A)].sort_values('A', kind='stable').reset_index(drop=True),
df2[df2.A.isin(df1.A)].sort_values('A', kind='stable').reset_index(drop=True)])
.sort_index(kind='stable'))
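As a runnable sketch of the first approach, the snippet below rebuilds `df1` and `df2` from the question and applies the same filter, concat, and stable sort. The names `left`, `right`, and `out` are only illustrative.

```python
import pandas as pd

# Frames from the question.
df1 = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["a"] * 4})
df2 = pd.DataFrame({"A": [1, 2, 3, 5], "B": ["b"] * 4})

# Keep only rows whose A value appears in the other frame.
left = df1[df1["A"].isin(df2["A"])]
right = df2[df2["A"].isin(df1["A"])]

# Stack the frames, then sort by index. A stable sort keeps the concat order
# (left before right) among rows that share an index label.
out = pd.concat([left, right]).sort_index(kind="stable")
print(out)
```

This relies on matching rows sharing the same index label in both frames, which holds for the sample data. When that is not the case, the sort-and-reset variant from the EDIT above is the one to reach for.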

Transform a dataframe in this specific way [duplicate]

This question already has answers here:
Reshape Pandas DataFrame to a Series with columns prefixed with indices
(1 answer)
efficiently flatten multiple columns into a single row in pandas
(1 answer)
Closed 8 months ago.
(Please help me to rephrase the title. I looked at questions with similar titles but they are not asking the same thing.)
I have a dataframe like this:
A B C
0 1 4 7
1 2 5 8
2 3 6 9
(the first column is the index and is not important)
I need to transform it so it ends up like this:
A A-1 A-2 B B-1 B-2 C C-1 C-2
1 2 3 4 5 6 7 8 9
I know about DataFrame.T, which seems like a step in the right direction, but how do I programmatically change the column headers and move the rows "beside each other" to make a single row?
First use DataFrame.unstack, convert the resulting Series to a one-column DataFrame with Series.to_frame, and transpose. Last, flatten the MultiIndex in a list comprehension with if-else to get the expected output:
df1 = df.unstack().to_frame().T
df1.columns = [a if b == 0 else f'{a}-{b}' for a, b in df1.columns]
print (df1)
A A-1 A-2 B B-1 B-2 C C-1 C-2
0 1 2 3 4 5 6 7 8 9
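A self-contained sketch of these steps, with the frame rebuilt from the question, may make the intermediate shapes easier to follow. The variable `s` is only introduced here for illustration.

```python
import pandas as pd

# Frame from the question, with the default 0..2 index.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# On a frame with a flat index, unstack() returns a Series whose MultiIndex
# is (column name, row label): (A, 0), (A, 1), (A, 2), (B, 0), ...
s = df.unstack()

# One-column frame, then transpose so the MultiIndex becomes the columns.
df1 = s.to_frame().T

# Flatten: keep the bare name for row label 0, otherwise append "-<label>".
df1.columns = [a if b == 0 else f"{a}-{b}" for a, b in df1.columns]
print(df1)
```

The `b == 0` test assumes the original index starts at 0, as in the question; with a different index the suffixes would follow those labels instead.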

python - List of Lists into pandas dataframe including name of columns

I would like to transfer a list of lists into a dataframe with columns based on the lists in the list.
This is still easy.
list = [[....],[....],[...]]
df = pd.DataFrame(list)
df = df.transpose()
The problem is: I would like to give the columns a column-name based on entries I have in another list:
list_two = [A,B,C,...]
This is the part I'm still struggling with.
Is there any approach to solve this problem?
Thanks a lot in advance for your help.
Best regards
Sascha
Use zip with dict to build a dictionary of lists and pass it to the DataFrame constructor:
import pandas as pd
L = [[1,2,3,5],[4,8,9,8],[1,2,5,3]]
list_two = list('ABC')
df = pd.DataFrame(dict(zip(list_two, L)))
print (df)
A B C
0 1 4 1
1 2 8 2
2 3 9 5
3 5 8 3
Or pass the list to the index parameter and transpose, so that its entries become the column names:
df = pd.DataFrame(L, index=list_two).T
print (df)
A B C
0 1 4 1
1 2 8 2
2 3 9 5
3 5 8 3
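The two variants can be compared side by side in one short sketch. The names `df_zip` and `df_t` are only illustrative; the data is the sample from the answer.

```python
import pandas as pd

L = [[1, 2, 3, 5], [4, 8, 9, 8], [1, 2, 5, 3]]
list_two = ["A", "B", "C"]

# Variant 1: pair each name with its inner list, then build the frame.
df_zip = pd.DataFrame(dict(zip(list_two, L)))

# Variant 2: use the names as row labels, then transpose into columns.
df_t = pd.DataFrame(L, index=list_two).T

print(df_zip)
print(df_t)
```

One difference worth knowing: `zip` stops at the shorter input, so a missing name silently drops a list in the first variant, while the second variant raises an error when the number of names does not match the number of lists.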

Dataframe group by numerical column and then combine with the original dataframe [duplicate]

This question already has answers here:
Pandas new column from groupby averages
(2 answers)
Closed 2 years ago.
I have a pandas data frame and I would like to first group by one of the columns and calculate the mean of another column within each group. Then I would like to combine this grouped result with the original data frame.
An example:
df =
a b orders
1 3 5
5 8 10
2 3 6
Group by column b and take the mean of orders:
groupby_df =
b mean(orders)
3 5.5
8 10
End result:
df =
a b orders mean(orders)
1 3 5 5.5
5 8 10 10
2 3 6 5.5
I know I can group by b and then do an inner join on b, but I feel it can be done in a much cleaner, one-line way. Is it possible to do better than that?
This is a job for GroupBy.transform, which broadcasts each group's mean back to the original rows:
df['mean']=df.groupby('b').orders.transform('mean')
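Here is the same one-liner as a runnable sketch on the question's sample data. The column name `mean` follows the answer above.

```python
import pandas as pd

# Frame from the question.
df = pd.DataFrame({"a": [1, 5, 2], "b": [3, 8, 3], "orders": [5, 10, 6]})

# transform returns a Series with the same index as df, so it can be
# assigned directly as a new column without any join.
df["mean"] = df.groupby("b")["orders"].transform("mean")
print(df)
```

Unlike an aggregation followed by a merge, `transform` keeps the original row order and index, which is why no join on b is needed.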

flatten pandas dataframe column levels [duplicate]

This question already has answers here:
Pandas: combining header rows of a multiIndex DataFrame
(1 answer)
How to flatten a hierarchical index in columns
(19 answers)
Closed 4 years ago.
I'm surprised I haven't found anything relevant.
I simply need to flatten this DataFrame with some unifying symbol, e.g. "_".
So, I need this
A B
a1 a2 b1 b2
id
264 0 0 1 1
321 1 1 2 2
to look like this:
A_a1 A_a2 B_b1 B_b2
id
264 0 0 1 1
321 1 1 2 2
Try this:
df.columns = df.columns.map('_'.join)
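A minimal sketch of this on a frame shaped like the one in the question follows. The construction of the two-level columns is an assumption about how the original frame was built.

```python
import pandas as pd

# Two-level columns as in the question (construction is illustrative).
cols = pd.MultiIndex.from_tuples(
    [("A", "a1"), ("A", "a2"), ("B", "b1"), ("B", "b2")]
)
df = pd.DataFrame(
    [[0, 0, 1, 1], [1, 1, 2, 2]],
    index=pd.Index([264, 321], name="id"),
    columns=cols,
)

# Each column label is a tuple such as ("A", "a1"); joining it with "_"
# yields a single flat string label.
df.columns = df.columns.map("_".join)
print(df)
```

`"_".join` only works when every level holds strings. For mixed or numeric levels, a list comprehension such as `[f"{a}_{b}" for a, b in df.columns]` is one way to do the same thing.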