Find maximum numbers from n-Numpy array - numpy

Is there a way to find the highest individual numbers given a list of NumPy arrays?
e.g.
import numpy as np
a = np.array([10, 2])
b = np.array([3, 4])
c = np.array([5, 6])
The output should be a NumPy array of the following form:
np.array([10, 6])

You can stack all arrays together and get the max afterwards:
np.stack((a, b, c)).max(0)
# array([10, 6])
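As a minimal sketch of the same idea, the snippet below runs the stacking approach on the arrays from the question and compares it with np.maximum.reduce, which folds the element-wise maximum over the list. The names stacked_max and highest are made up for illustration.

```python
import numpy as np

a = np.array([10, 2])
b = np.array([3, 4])
c = np.array([5, 6])

# Stack into shape (3, 2), then take the maximum down axis 0 (across the arrays).
stacked_max = np.stack((a, b, c)).max(axis=0)

# Same result by folding the element-wise maximum over the list of arrays.
highest = np.maximum.reduce([a, b, c])

print(stacked_max)  # [10  6]
print(highest)      # [10  6]
```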

Related

How I print specific values from a multidimensional array with numpy?

I have a multidimensional np.array like: [[2, 55, 62], [3, 56, 63], [4, 57, 64], ...].
I want to print only the rows whose value in the first column is greater than 2, giving output like: [[3, 56, 63], [4, 57, 64], ...]
How can I do that?
All you need to do is to select just the values you want to print.
Short answer:
import numpy as np
a = np.array([[1,2,3],[3,2,1]])
print(a[a>2])
What's going on?
Well, first, a>2 returns a boolean mask telling whether the condition is met at each position of the array. This is a numpy array with exactly the same shape as a, but with dtype=bool.
Then, this mask is used to select only the values where the mask is True, which are the ones that meet your condition.
Finally, you just print them.
Step by step, you can write as follows:
import numpy as np
a = np.array([[1,2,3],[3,2,1]])
print(a.shape) # output is (2, 3)
mask = a > 2
print(mask.shape) # output is (2, 3)
print(mask.dtype) # output is bool
print(mask) # here you can see True only for those positions where condition is met
print(a[mask])
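Note that a[a > 2] returns a flat array of the matching values. The question asks for whole rows whose first column is greater than 2, so a sketch for that case would build the mask from the first column only. The sample data is taken from the question; row_mask and rows are hypothetical names.

```python
import numpy as np

a = np.array([[2, 55, 62], [3, 56, 63], [4, 57, 64]])

# One boolean per row: True where the value in the first column exceeds 2.
row_mask = a[:, 0] > 2

# Indexing with a 1-D boolean mask keeps whole rows.
rows = a[row_mask]
print(rows)
```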

Extract all odd numbers from numpy array

How can I extract all the odd numbers from a NumPy array?
Try:
import numpy as np
a = np.array([1,2,3,4,5,6,6,7,7,8,9])
a[a % 2 == 1]
Out[13]: array([1, 3, 5, 7, 7, 9])
b = np.where(a%2)
print(f'Array with Odd Numbers: {a[b]}')
Here a is the array containing all the numbers, and b holds the indices where a % 2 is nonzero.
The odd numbers from that array are given by a[b].
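The two approaches above can be put side by side. This is a small sketch using the array from the answer; the variable names are illustrative. np.where called with a single argument returns a tuple of index arrays, which is why it can be used directly as an index.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 6, 7, 7, 8, 9])

# Boolean mask: True at every position holding an odd number.
odd_mask = a % 2 == 1

# Tuple with one array of the indices where a % 2 is nonzero.
odd_idx = np.where(a % 2)

odds_from_mask = a[odd_mask]
odds_from_where = a[odd_idx]
print(odds_from_mask)
print(odds_from_where)
```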

Elementwise multiplication of NumPy arrays of different shapes

When I use numpy.multiply(a,b) to multiply numpy arrays with shapes (2, 1),(2,) I get a 2 by 2 matrix. But what I want is element-wise multiplication.
I'm not familiar with numpy's rules. Can anyone explain what's happening here?
When doing an element-wise operation between two arrays, which are not of the same dimensionality, NumPy will perform broadcasting. In your case Numpy will broadcast b along the rows of a:
import numpy as np
a = np.array([[1],
              [2]])
# b must be an array (not a list) so that b[:, np.newaxis] works below
b = np.array([3, 4])
print(a * b)
Gives:
[[3 4]
[6 8]]
To prevent this, you need to make a and b of the same dimensionality. You can add dimensions to an array by using np.newaxis or None in your indexing, like this:
print(a * b[:, np.newaxis])
Gives:
[[3]
[8]]
Let's say you have two arrays, a and b, with shape (2,3) and (2,) respectively:
a = np.random.randint(10, size=(2,3))
b = np.random.randint(10, size=(2,))
The two arrays, for example, contain:
a = np.array([[8, 0, 3],
[2, 6, 7]])
b = np.array([7, 5])
To multiply a*b element by element, you have to tell NumPy how to handle the missing axis=1 of array b. You can do so by adding None:
result = a*b[:,None]
With result being:
array([[56, 0, 21],
[10, 30, 35]])
Here are the input arrays a and b with the same shapes as you mentioned:
In [136]: a
Out[136]:
array([[0],
[1]])
In [137]: b
Out[137]: array([0, 1])
Now, when we do multiplication using either * or numpy.multiply(a, b), we get:
In [138]: a * b
Out[138]:
array([[0, 0],
[0, 1]])
The result is a (2,2) array because numpy uses broadcasting.
#         b
# a  |  0    1
# ------------
# 0  | 0*0  0*1
# 1  | 1*0  1*1
I just explained the broadcasting rules in broadcasting arrays in numpy
In your case
(2,1) + (2,) => (2,1) + (1,2) => (2,2)
It has to add a dimension to the 2nd argument, and can only add it at the beginning (to avoid ambiguity).
So if you want a (2,1) result, you have to expand the 2nd argument yourself, with reshape or [:, np.newaxis].
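The shape rule can be checked directly. The following is a minimal sketch with example values chosen for illustration; it compares the broadcast product with the two ways of expanding b by hand.

```python
import numpy as np

a = np.array([[1], [2]])   # shape (2, 1)
b = np.array([3, 4])       # shape (2,)

# b is treated as shape (1, 2), so the result broadcasts to (2, 2).
outer = a * b

# Expanding b to shape (2, 1) gives a true element-wise product.
elementwise = a * b[:, np.newaxis]
same = a * b.reshape(-1, 1)

print(outer.shape)        # (2, 2)
print(elementwise.shape)  # (2, 1)
```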

Add single element to array as first entry in numpy

How to achieve this?
I have a numpy array containing:
[1, 2, 3]
I want to create an array containing:
[8, 1, 2, 3]
That is, I want to add an element as the first element of the array.
Ref:Add single element to array in numpy
The most basic operation is concatenate:
x=np.array([1,2,3])
np.concatenate([[8],x])
# array([8, 1, 2, 3])
np.r_ and np.insert make use of this. Even if they are more convenient to remember, or use in more complex cases, you should be familiar with concatenate.
Use numpy.insert(). The docs are here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.insert.html#numpy.insert
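A short sketch of the numpy.insert() call for this case, using the values from the question. Index 0 places the new value in front; the variable names are illustrative. The call returns a new array and leaves the original unchanged.

```python
import numpy as np

x = np.array([1, 2, 3])

# Insert the value 8 before index 0; x itself is not modified.
y = np.insert(x, 0, 8)

print(y)  # [8 1 2 3]
print(x)  # [1 2 3]
```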
You can also use numpy's np.r_, a short-cut for concatenation along the first axis:
>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = np.r_[8, a]
>>> b
array([8, 1, 2, 3])

Seaborn groupby pandas Series

I want to visualize my data into box plots that are grouped by another variable shown here in my terrible drawing:
So what I do is to use a pandas series variable to tell pandas that I have grouped variables so this is what I do:
import pandas as pd
import seaborn as sns
#example data for reproducibility
a = pd.DataFrame(
[
[2, 1],
[4, 2],
[5, 1],
[10, 2],
[9, 2],
[3, 1]
])
#converting second column to Series
a.ix[:,1] = pd.Series(a.ix[:,1])
#Plotting by seaborn
sns.boxplot(a, groupby=a.ix[:,1])
And this is what I get:
However, what I would have expected to get was to have two boxplots each describing only the first column, grouped by their corresponding column in the second column (the column converted to Series), while the above plot shows each column separately which is not what I want.
A column in a Dataframe is already a Series, so your conversion is not necessary. Furthermore, if you only want to use the first column for both boxplots, you should only pass that to Seaborn.
So:
#example data for reproducibility
df = pd.DataFrame(
[
[2, 1],
[4, 2],
[5, 1],
[10, 2],
[9, 2],
[3, 1]
], columns=['a', 'b'])
#Plotting by seaborn
sns.boxplot(df.a, groupby=df.b)
I changed your example a little bit, giving columns a label makes it a bit more clear in my opinion.
edit:
If you want to plot all columns separately, you (I think) basically want all combinations of the values in your groupby column and every other column. So if your DataFrame looks like this:
    a   b  grouper
0   2   5        1
1   4   9        2
2   5   3        1
3  10   6        2
4   9   7        2
5   3  11        1
and you want boxplots for columns a and b, grouped by the column grouper, you should flatten the columns and change the groupby column to contain values like a1, a2, b1 etc.
Here is a crude way which I think should work, given the DataFrame shown above:
dfpiv = df.pivot(index=df.index, columns='grouper')
cols_flat = [dfpiv.columns.levels[0][i] + str(dfpiv.columns.levels[1][j]) for i, j in zip(dfpiv.columns.labels[0], dfpiv.columns.labels[1])]
dfpiv.columns = cols_flat
dfpiv = dfpiv.stack(0)
sns.boxplot(dfpiv, groupby=dfpiv.index.get_level_values(1))
Perhaps there are fancier ways of restructuring the DataFrame. The flattening of the hierarchy after pivoting is especially hard to read, and I don't like it.
This is a new answer to an old question, because seaborn and pandas have changed across version updates. Because of these changes, Rutger's answer no longer works.
The most important changes are from seaborn==v0.5.x to seaborn==v0.6.0. I quote the log:
Changes to boxplot() and violinplot() will probably be the most disruptive. Both functions maintain backwards-compatibility in terms of the kind of data they can accept, but the syntax has changed to be more similar to other seaborn functions. These functions are now invoked with x and/or y parameters that are either vectors of data or names of variables in a long-form DataFrame passed to the new data parameter.
Let's now go through the examples:
# preamble
import pandas as pd # version 1.1.4
import seaborn as sns # version 0.11.0
sns.set_theme()
Example 1: Simple Boxplot
df = pd.DataFrame([[2, 1] ,[4, 2],[5, 1],
[10, 2],[9, 2],[3, 1]
], columns=['a', 'b'])
#Plotting by seaborn with x and y as parameter
sns.boxplot(x='b', y='a', data=df)
Example 2: Boxplot with grouper
df = pd.DataFrame([[2, 5, 1], [4, 9, 2],[5, 3, 1],
[10, 6, 2],[9, 7, 2],[3, 11, 1]
], columns=['a', 'b', 'grouper'])
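# using pandas melt; newer pandas versions reject a value_name that matches
# an existing column, so neutral names are used for the new columns
df_long = pd.melt(df, "grouper", var_name='column', value_name='value')
# join two columns together
df_long['column'] = df_long['column'].astype(str) + df_long['grouper'].astype(str)
sns.boxplot(x='column', y='value', data=df_long)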
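To see what the reshaping in Example 2 produces before plotting, here is a pandas-only sketch. The column names column and value are illustrative choices that deliberately avoid reusing the names of existing columns, and labels is a hypothetical helper variable.

```python
import pandas as pd

df = pd.DataFrame([[2, 5, 1], [4, 9, 2], [5, 3, 1],
                   [10, 6, 2], [9, 7, 2], [3, 11, 1]],
                  columns=['a', 'b', 'grouper'])

# Long form: one row per (grouper, original column, value) triple.
df_long = pd.melt(df, "grouper", var_name='column', value_name='value')

# Combine the original column name with the group, e.g. 'a' and 1 -> 'a1'.
df_long['column'] = df_long['column'].astype(str) + df_long['grouper'].astype(str)

# These labels become the categories on the x axis of the box plot.
labels = sorted(df_long['column'].unique())
print(df_long.shape)  # (12, 3)
print(labels)         # ['a1', 'a2', 'b1', 'b2']
```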
Example 3: rearranging the DataFrame to pass it directly to seaborn
def df_rename_by_group(data: pd.DataFrame, col: str) -> pd.DataFrame:
    '''This function takes a DataFrame, groups by one column and returns
    a new DataFrame where the old column names are extended by the group item.
    '''
    # group the DataFrame that was passed in, not the global df
    grouper = data.groupby(col)
    max_length_of_group = max(len(values) for values in grouper.indices.values())
    _df = pd.DataFrame(index=range(max_length_of_group))
    for i in grouper.groups.keys():
        # drop the grouping column and tag the remaining columns with the group key
        helper = grouper.get_group(i).drop(col, axis=1).add_suffix(str(i))
        helper.reset_index(drop=True, inplace=True)
        _df = _df.join(helper)
    return _df
df = pd.DataFrame([[2, 5, 1], [4, 9, 2],[5, 3, 1],
[10, 6, 2],[9, 7, 2],[3, 11, 1]
], columns=['a', 'b', 'grouper'])
df_new = df_rename_by_group(data=df, col='grouper')
sns.boxplot(data=df_new)
I really hope this answer helps to avoid some confusion.
sns.boxplot() does not take a groupby argument.
You will probably see:
TypeError: boxplot() got an unexpected keyword argument 'groupby'.
One option is to group the data first and pass the grouped DataFrame to boxplot as the data argument.
import seaborn as sns
grouDataFrame = nameDataFrame.groupby(['A'])['B'].agg('sum').reset_index()
sns.boxplot(y='B', x='A', data=grouDataFrame)
Here column B contains numeric values and the grouping is done on column A. The values of B within each group are summed, and the box plot is drawn from the result. Hope this helps.