NaN output when multiplying row and column of dataframe in pandas

I have two data frames the first one looks like this:
and the second one like so:
I am trying to multiply the values in the "number of donors" column of the second data frame (96 values) with the values in the first row of the first data frame, columns 0-95 (also 96 values).
Below is the code I have for multiplying the two right now, but as you can see, the resulting values are all NaN:
Does anyone know how to fix this?

Your second dataframe has dtype object; you must convert it to float:
df_sls.iloc[0,3:-1].astype(float)
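A minimal sketch of that fix, using hypothetical stand-ins for both frames (the quoted numbers mimic the object dtype; multiplying through .values also sidesteps index misalignment, another common source of all-NaN results):

```python
import pandas as pd

# Hypothetical stand-in for df_sls: three label columns, the numeric
# columns stored as strings (dtype object), then a trailing column.
df_sls = pd.DataFrame([["a", "b", "c", "1.5", "2.0", "3.0", "total"]])

# Hypothetical "number of donors" column from the second frame.
donors = pd.Series([10.0, 20.0, 30.0], name="number of donors")

row = df_sls.iloc[0, 3:-1].astype(float)   # object -> float
result = row.values * donors.values        # positional multiply, no index alignment
print(result)                              # [15. 40. 90.]
```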

Related

Merge two dataframe based on column which has splitted value

I have two data frames. One of the data frames appears as follows:
The Products column contains data like 1;3;5.
The other data frame looks like:
I am merging both of the frames:
Merge_Store_Transaction['products'] = Merge_Store_Transaction['products'].str.split(';')
Merge_Store_Transaction = Merge_Store_Transaction.explode('products')
This gives me a result where all the other values are duplicated, which I don't want. Is there a way to divide the profit column by the respective number of products, or just fill the extra rows with zero?
I think that once you have this result, you can do something like the following:
Merge_Store_Transaction["profit"] = Merge_Store_Transaction.groupby(["group_id", "date"])["profit"].mean().reset_index(0, drop=True)
Same thing for the revenue_in_usd column.
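If the goal is instead to split the profit evenly across the exploded rows, a sketch along those lines (the order_id column and the values are made up for illustration):

```python
import pandas as pd

# Toy version of the merged frame, keyed by a hypothetical order_id.
df = pd.DataFrame({
    "order_id": [1, 2],
    "products": ["1;3;5", "2;4"],
    "profit": [30.0, 10.0],
})

df["products"] = df["products"].str.split(";")
df = df.explode("products")

# Divide each order's profit evenly across its exploded rows.
df["profit"] = df["profit"] / df.groupby("order_id")["profit"].transform("size")
print(df["profit"].tolist())  # [10.0, 10.0, 10.0, 5.0, 5.0]
```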

Is there a way to combine two columns in a dataset, keeping the larger float64 using Pandas?

I'll try to keep it simple, but these are very large datasets I am working with.
Theoretically I am trying to combine columns A and B of my data frame.
But if A has a value in a row then B doesn't, and vice versa; each hole is filled with NaN:
A {1,2,NaN,4,5}
B {NaN,NaN,3,NaN,NaN}
I need A to equal {1,2,3,4,5}
EDIT:
Using
df.rename(columns={"a": "b"})
before you concatenate your data lets the columns combine easily, with the real values layering over the NaNs.
df['A'] = df['A'].fillna(df['B'])
What this code does is fill all missing values of column A with the values found in column B.
For more options see: https://datascience.stackexchange.com/questions/17769/how-to-fill-missing-value-based-on-other-columns-in-pandas-dataframe
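A runnable sketch of the fillna approach on the sample values from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, np.nan, 4, 5],
    "B": [np.nan, np.nan, 3, np.nan, np.nan],
})

# Fill the holes in A with the corresponding values from B.
df["A"] = df["A"].fillna(df["B"])
print(df["A"].tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0]
```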

Filteration on dataframe column value with combination of values

I have a dataframe which has 2 columns named TABLEID and STATID
There are different values in the both the columns.
When I filter the dataframe on the values '101PC' and 'ST101', it gives me 14K records, and when I filter on the values '102HT' and 'ST102', it also gives me 14K records. The issue is that when I try to combine both filters, I get a blank dataframe. I was expecting 28K records in my resultant dataframe. Any help is much appreciated.
df[df[['TABLEID','STATID']].apply(tuple, axis = 1).isin([('101PC', 'ST101'), ('102HT','ST102')])]
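For context, chaining the two equality filters with & asks for rows matching both pairs at once, which no row does, hence the empty result; the tuple/isin approach treats the pairs as alternatives. A sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({
    "TABLEID": ["101PC", "102HT", "103XX"],
    "STATID":  ["ST101", "ST102", "ST103"],
})

# Keep rows whose (TABLEID, STATID) pair matches either wanted combination.
mask = df[["TABLEID", "STATID"]].apply(tuple, axis=1).isin(
    [("101PC", "ST101"), ("102HT", "ST102")]
)
print(df[mask])  # rows with ('101PC', 'ST101') and ('102HT', 'ST102') survive
```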

counting each value in dataframe

So I want to create a plot or graph. I have time series data.
My dataframe looks like that:
df.head()
I need to count values in df['status'] (there are 4 different values) and df['group_name'] (2 different values) for each day.
So I want a date index and a count of how many times each value from df['status'] appears, as well as each value from df['group_name']. It should return a Series.
I used spam.groupby('date')['column'].value_counts().unstack().fillna(0).astype(int) and it works as it should. Thank you all for the help.
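A small sketch of that groupby/value_counts pattern on made-up data:

```python
import pandas as pd

spam = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "status": ["open", "closed", "open"],
})

# Per-day counts of each status value; missing combinations become 0.
counts = spam.groupby("date")["status"].value_counts().unstack().fillna(0).astype(int)
print(counts)
```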

Convert floats to ints in pandas dataframe

I have a pandas dataframe with a column 'Distance', and it is of datatype 'float64'.
Distance
14.827379
0.754254
0.2284546
1.833768
I want to convert these numbers to whole numbers (14, 0, 0, 1). I tried the following, but I get the error "ValueError: Cannot convert NA to integer".
df['distance(kmint)'] = result['Distance'].astype('int')
Any help would be appreciated!!
I filtered out the NaN's from the dataframe using this:
result = result[np.isfinite(result['distance(km)'])]
Then, I was able to convert from float to int.
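Put together, the filter-then-cast route looks like this on the sample values (note that astype(int) truncates toward zero):

```python
import numpy as np
import pandas as pd

result = pd.DataFrame({
    "distance(km)": [14.827379, 0.754254, 0.2284546, np.nan, 1.833768],
})

# Drop rows where the distance is NaN or infinite, then cast.
result = result[np.isfinite(result["distance(km)"])].copy()
result["distance(kmint)"] = result["distance(km)"].astype(int)
print(result["distance(kmint)"].tolist())  # [14, 0, 0, 1]
```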
An alternative approach would be to convert the NaN values as part of your data import and cleaning process. The more generalized solution involves specifying which values count as NaN in the read_table command by setting the na_values flag. What you want to make sure of is that there isn't some malformed data, like 1.5km in one of your fields, that's getting picked up as a NaN value.
pandas.read_table(..., na_values=None, keep_default_na=True, na_filter=True, ....)
Subsequently, once the dataframe is populated and the NaN values are identified properly, you can use the fillna method to substitute in zeros or the values that you identified as your distances.
Finally, it would probably be best to use notnull rather than isfinite before converting over to integers.
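A sketch of that cleaning route, assuming zero is an acceptable substitute for a missing distance:

```python
import numpy as np
import pandas as pd

result = pd.DataFrame({"distance(km)": [14.827379, np.nan, 1.833768]})

# Substitute 0 for missing distances, then cast safely to int.
result["distance(kmint)"] = result["distance(km)"].fillna(0).astype(int)
print(result["distance(kmint)"].tolist())  # [14, 0, 1]
```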