Filtering on dataframe column values with a combination of values - pandas

I have a dataframe which has 2 columns named TABLEID and STATID.
There are different values in both columns.
When I filter the dataframe on the values '101PC' and 'ST101' it gives me 14K records, and when I filter it on '102HT' and 'ST102' it also gives me 14K records. The issue is that when I try to combine both filters as below, I get an empty dataframe. I was expecting 28K records in the resulting dataframe. Any help is much appreciated.
df[df[['TABLEID', 'STATID']].apply(tuple, axis=1).isin([('101PC', 'ST101'), ('102HT', 'ST102')])]
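For reference, here is a minimal, self-contained sketch of the same two-pair filter on toy data (the values are made up). If this pattern returns rows on toy data but an empty frame on the real data, the literal values may not match exactly (e.g. stray whitespace or differing case in TABLEID/STATID):

import pandas as pd

df = pd.DataFrame({
    "TABLEID": ["101PC", "102HT", "103XX", "101PC"],
    "STATID":  ["ST101", "ST102", "ST103", "ST101"],
})

pairs = [("101PC", "ST101"), ("102HT", "ST102")]

# Tuple-based filter, as in the question
out = df[df[["TABLEID", "STATID"]].apply(tuple, axis=1).isin(pairs)]

# Equivalent: an explicit OR of the two per-pair masks
mask = ((df["TABLEID"] == "101PC") & (df["STATID"] == "ST101")) | (
        (df["TABLEID"] == "102HT") & (df["STATID"] == "ST102"))
out2 = df[mask]
print(out.equals(out2))  # True on this toy frame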

Related

Merge two dataframes based on a column which has split values

I have two data frames (both were shown as screenshots that are not reproduced here). In the first, the products column contains values like 1;3;5. After merging the two frames, I split and explode the products column:
Merge_Store_Transaction['products'] = Merge_Store_Transaction['products'].str.split(';')
Merge_Store_Transaction = Merge_Store_Transaction.explode('products')
This gives me a result where all the other values are duplicated, which I don't want. Is there a way to divide the profit column by the respective number of products, or just fill the extra rows with zero?
I think that once you have this result, you can do something like the following:
Merge_Store_Transaction["profit"] = Merge_Store_Transaction.groupby(["group_id", "date"])["profit"].mean().reset_index(0, drop=True)
Same thing for the revenue_in_usd column.
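If the goal is instead to split each profit evenly across the exploded product rows, a minimal sketch (the order_id key and the toy values are assumptions, not from the original post):

import pandas as pd

df = pd.DataFrame({
    "order_id": [1, 2],
    "products": ["1;3;5", "2;4"],
    "profit": [30.0, 10.0],
})

df["products"] = df["products"].str.split(";")
df = df.explode("products")

# Divide each order's profit evenly across its exploded rows
df["profit"] = df["profit"] / df.groupby("order_id")["profit"].transform("size")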

How to filter a pandas dataframe for missing values?

I have a data frame with multiple columns, and some of them have missing values.
I would like to filter so that I get back a dataframe that has missing values in one or two specific columns.
Can anyone help me figure out how to do that?
Given a dataframe "df" with a column "A":
df_missing = df[df['A'].isnull()]
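To filter on two specific columns at once (the column names "A" and "B" are placeholders), the per-column masks can be combined, for example:

import pandas as pd

df = pd.DataFrame({"A": [1, None, 3], "B": [None, None, 6]})

# Rows where A is missing
df_missing = df[df["A"].isnull()]

# Rows where A or B (or both) is missing
df_missing_any = df[df[["A", "B"]].isnull().any(axis=1)]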

Iterate over df columns to get rows with text

I have a dataframe with columns. The columns have mostly blank rows, but a few of the rows contain strings, and those are the only rows I want to see. I have tried the code below but don't know how to select only the strings in the columns, so that I can build a new dataframe with just the rows that contain strings.
columns = list(df)
for i in columns:
    df1 = df[df[i] == ]
Can someone please help?
df[df['column_name'].notna()]
should do the trick
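Note that notna() only treats NaN/None as missing. If the blanks are empty strings, one hedged approach (the replace step is an assumption about the data) is to convert them to NA first, then keep the rows where any column has text:

import pandas as pd

df = pd.DataFrame({
    "a": ["", "hello", ""],
    "b": ["", "", "world"],
})

# Treat empty strings as missing, then keep rows with text in any column
mask = df.replace("", pd.NA).notna().any(axis=1)
rows_with_text = df[mask]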

Convert Series to DataFrame where the Series index is the DataFrame column names

I am selecting row by row as follows:
for i in range(num_rows):
    row = df.iloc[i]
As a result I get a Series object where row.index.values contains the names of the df columns.
But I wanted instead a dataframe with only one row, keeping the dataframe columns in place.
When I do row.to_frame(), instead of a 1x85 dataframe (1 row, 85 cols) I get an 85x1 dataframe whose index contains the column names, and the resulting frame's columns output
Int64Index([0], dtype='int64').
But all I want is the original dataframe's columns with only one row. How do I do it?
Or: how do I turn the row.index values into column values and change the 85x1 shape to 1x85?
You just need to add .T to transpose:
row.to_frame().T
Alternatively, change your for loop to use [], so that iloc returns a one-row DataFrame directly:
for i in range(num_rows):
    row = df.iloc[[i]]
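A small, self-contained illustration of both approaches (toy 2-column frame):

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

row = df.iloc[0]          # Series; its index holds the column names
frame = row.to_frame().T  # 1x2 DataFrame with the original columns

frame2 = df.iloc[[0]]     # slicing with a list yields a DataFrame directly

print(frame.shape, frame2.shape)  # (1, 2) (1, 2)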

How to group by and sum several columns?

I have a big dataframe with several columns containing strings, numbers, etc. I am trying to group by SCENARIO and then sum only the year columns between 2020 and 2050. The only thing I have managed so far is to sum a single column, as shown below, but I need to replace this '2050' with all the columns between 2020 and 2050, for instance.
df1 = df.groupby(["SCENARIO"])['2050'].sum().sum(axis=0)
You are creating a subset of the df with only that single column. I can't tell what your dataset looks like from the information provided, but try:
df.groupby(["SCENARIO"]).sum()
This should sum up all the rows in every column.
Alternatively, select the columns you want to perform the summation on:
df.groupby(["SCENARIO"])[["column1","column2"]].sum()