If I have a NumPy array a, then a[a == 8] returns a new array consisting of all the 8's. If I assign to such an expression, as in a[a == 8] = 0, I would expect the object returned by a[a == 8] to be set to zero. Instead, each element of a that was equal to 8 is set to zero, which is obviously more intuitive. The problem is that I cannot understand how this comes about by the rules of the language.
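For reference, a minimal sketch of the mechanism: Python never evaluates `a[a == 8]` as an intermediate object on the left-hand side of an assignment. Subscript assignment is translated into a single `__setitem__` call on `a`, so NumPy writes through the boolean mask directly, while the stand-alone expression `a[a == 8]` is a `__getitem__` call that returns a copy.

```python
import numpy as np

a = np.array([1, 8, 3, 8])

# a[a == 8]       is sugar for a.__getitem__(a == 8)    -> returns a new copy
# a[a == 8] = 0   is sugar for a.__setitem__(a == 8, 0) -> writes into a itself
a[a == 8] = 0
print(a)  # [1 0 3 0]
```

Because the assignment form never creates the copy, the mask is applied in place, which is why the elements of a change.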
So I have a dataframe column made of arrays that I have already sliced using indexes from other columns. I take a part of each array that depends on a position and an index: the position is a list of tuples giving start and end points (which can be equal), and the index is a value. Both are stored in columns as well. The code is the following:
df['relative_cells_array'] = df.apply(lambda x: x['cells_array'][:, x['position'][x['relative_track']][0]:x['position'][x['relative_track']][1]+1] if x['relative_track']<=len(x['position']) else np.nan, axis=1)
This works. But a problem arises when I use other, modified arrays; in this case the array uses spatial binomial weights to interpolate values. Because of the standardization, this transformation of the original array produces floats when dividing by the neighboring cells. I convert the values to integers and print the array, but it still gives me an error; I tried other things and also got an error about the tuple (position is a list of tuples). Why did it work before?
The code for this is the following:
df['relative_cells_array_weighted1'] = df.apply(lambda x: [[int(y) for y in sublist] for sublist in x['cells_weighted1'][:, x['position'][x['relative_track']][0]:x['position'][x['relative_track']][1]+1]] if x['relative_track']<=len(x['position']) else np.nan, axis=1)
df['relative_average_weighted1_cell_reading'] = df['relative_cells_array_weighted1'].apply(lambda x: [num for sublist in x for num in sublist])
This is the error: TypeError: 'float' object is not iterable
And after making some changes I got the tuple error (I don't remember the changes; I used ChatGPT).
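One plausible cause, judging only from the code shown: rows where the condition `x['relative_track'] <= len(x['position'])` fails are stored as `np.nan`, which is a plain Python float. The second `apply` then tries to flatten that float with a list comprehension, and iterating over a float raises exactly this `TypeError`. A minimal reproduction of that hypothesis:

```python
import numpy as np

x = np.nan  # what the first apply stores when the condition fails

try:
    flat = [num for sublist in x for num in sublist]
except TypeError as e:
    # iterating a float is not allowed
    print(e)  # 'float' object is not iterable
```

If that is the issue, guarding the flattening lambda with a check such as `isinstance(x, float)` before iterating would avoid the error.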
I have two arrays, a and b, one 2D and one 1D, containing values of two related quantities that are filled in the same order, such that a[0] is related to b[0] and so on.
I would like to access the element of b where a is equal to a given value, where the value is a 1D array itself.
For example
a=np.array([[0,0],[0,1],[1,0],[1,1]])
b=np.array([0, 7, 9, 4])
value = np.array([0,1])
In 1D cases I could use boolean indexing easily and do
b[a==value]
The result I want is 7.
But in this case it does not work, because the comparison checks individual elements instead of comparing whole rows (subarrays)...
Is there a quick way to do this?
The question doesn't seem to match the example, but this returns [7]:
b[(a == value).all(axis=-1)]
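Putting the answer's one-liner into the question's own setup: `a == value` broadcasts `value` against each row, and `.all(axis=-1)` collapses the element-wise comparison into one boolean per row, which can then index `b`.

```python
import numpy as np

a = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
b = np.array([0, 7, 9, 4])
value = np.array([0, 1])

# Compare element-wise (broadcasting value over the rows of a),
# then require every column in a row to match.
mask = (a == value).all(axis=-1)  # [False, True, False, False]
print(b[mask])  # [7]
```

Note the result is a length-1 array, not the scalar 7; use `b[mask][0]` or `b[mask].item()` if a scalar is needed.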
I'm practicing on a Kaggle Data Cleaning exercise.
In the parsing-dates example, I can't figure out what the [1] does at the end of the indices object.
Thanks.
# Finding indices corresponding to rows in different date format
indices = np.where([date_lengths == 24])[1]
print('Indices with corrupted data:', indices)
earthquakes.loc[indices]
As described in the documentation, numpy.where called with a single argument is equivalent to calling np.asarray([date_lengths == 24]).nonzero().
numpy.nonzero returns a tuple with as many items as the input array has dimensions, containing the indexes of the non-zero values.
>>> np.nonzero([1,0,2,0])
(array([0, 2]),)
Indexing with [1] gets the second element (i.e. the second dimension), but since the input was wrapped in […], this is equivalent to doing:
np.where(date_lengths == 24)[0]
>>> np.nonzero([1,0,2,0])[0]
array([0, 2])
It is an artefact of the extra [] around the condition. For example:
a = np.arange(10)
To find, for example, indices where a>3 can be done like this:
np.where(a > 3)
gives as output a tuple with one array
(array([4, 5, 6, 7, 8, 9]),)
So the indices can be obtained as
indices = np.where(a > 3)[0]
In your case, the condition is wrapped in [], which is unnecessary but still works.
np.where([a > 3])
returns a tuple whose first element is an array of zeros (the row indices of the wrapping dimension) and whose second element is the array of indices you want:
(array([0, 0, 0, 0, 0, 0]), array([4, 5, 6, 7, 8, 9]))
so the indices are obtained as
indices = np.where([a > 3])[1]
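The two shapes can be seen side by side in one runnable sketch:

```python
import numpy as np

a = np.arange(10)

# Without the extra brackets: a 1-D input yields a 1-tuple.
print(np.where(a > 3))       # (array([4, 5, 6, 7, 8, 9]),)
print(np.where(a > 3)[0])    # array([4, 5, 6, 7, 8, 9])

# With [a > 3] the input becomes 2-D (shape (1, 10)), so the tuple
# has two arrays: row indices (all zeros) and column indices.
print(np.where([a > 3]))
print(np.where([a > 3])[1])  # array([4, 5, 6, 7, 8, 9])
```

Dropping the brackets and using `np.where(date_lengths == 24)[0]` is the cleaner form.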
This question already has answers here:
How to remove the adjacent duplicate value in a numpy array?
(4 answers)
Closed 5 years ago.
I have an array like
[0,0,0,0,1,1,1,1,0,0,0,0,1,1,0,0]
and I want to determine the number of non-zero intervals. I know how to do that in a for loop, of course, but I wonder whether there is a nice NumPy solution.
The method I am looking for is supposed to "collapse" the array whenever a value repeats itself. The above array would then become, for example,
[0,1,0,1,0]
for the sake of counting, it would of course be sufficient to return just
[1,1]
but I'd like to know a general approach that might also be able to handle more than two different elements such as
[1,1,1,2,2,2,3,3,0,0,1,1,2,2]
or so.
One option is to pick up the values when there is a change with boolean indexing:
import numpy as np
a = np.array([1,1,1,2,2,2,3,3,0,0,1,1,2,2])
a[np.concatenate(([True], np.diff(a) != 0))]
# array([1, 2, 3, 0, 1, 2])
np.count_nonzero(a[np.concatenate(([True], np.diff(a) != 0))])
# 5
First case:
b = np.array([0,0,0,0,1,1,1,1,0,0,0,0,1,1,0,0])
b[np.concatenate(([True], np.diff(b) != 0))]
# array([0, 1, 0, 1, 0])
np.count_nonzero(b[np.concatenate(([True], np.diff(b) != 0))])
# 2
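The same idea packaged as small helpers (the function names here are my own, not from the answer): `np.diff` marks positions where the value changes, a prepended `True` keeps the first element of the first run, and counting non-zeros in the collapsed array counts the non-zero intervals.

```python
import numpy as np

def collapse_runs(arr):
    # Keep one representative element per run of repeated values.
    arr = np.asarray(arr)
    keep = np.concatenate(([True], np.diff(arr) != 0))
    return arr[keep]

def count_nonzero_intervals(arr):
    # Each non-zero element of the collapsed array is one interval.
    return int(np.count_nonzero(collapse_runs(arr)))

print(collapse_runs([0,0,0,0,1,1,1,1,0,0,0,0,1,1,0,0]))          # [0 1 0 1 0]
print(count_nonzero_intervals([0,0,0,0,1,1,1,1,0,0,0,0,1,1,0,0]))  # 2
print(count_nonzero_intervals([1,1,1,2,2,2,3,3,0,0,1,1,2,2]))      # 5
```

This handles any number of distinct values, as the second example shows.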
What's the easiest, most Pythonic way to divide one numpy array by another (of the same shape, element-wise) only where both arrays are non-zero?
Where either the divisor or the dividend is zero, the corresponding element in the output array should be zero. (Zero is already the default output when the dividend is zero, but dividing by zero yields np.inf by default, or np.nan when both are zero.)
This still tries to divide by 0, but it gives the correct result:
np.where(b==0, 0, a/b)
To avoid doing the divide-by-zero, you can do:
m = b!=0
c = np.zeros_like(a)
np.place(c, m, a[m]/b[m])
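Another standard way to skip the forbidden divisions entirely is the `where=` parameter that NumPy ufuncs such as `np.divide` accept; elements excluded by the mask are left untouched in the `out` array, so seeding `out` with zeros gives the desired default:

```python
import numpy as np

a = np.array([1.0, 0.0, 6.0, 8.0])
b = np.array([2.0, 5.0, 0.0, 4.0])

# out supplies the 0 default; where= skips positions with b == 0,
# so no divide-by-zero warning is ever raised.
c = np.divide(a, b, out=np.zeros_like(a), where=(b != 0))
print(c)  # [0.5 0.  0.  2. ]
```

This is a one-liner and, like the np.place version, never actually evaluates a division by zero.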
I would do it in two lines:
z = x/y
z[y == 0] = 0
As you said, if only the element in x is 0, z will already be 0 at that position. So let NumPy handle that, and then fix up the places where y is 0 by using NumPy's boolean indexing.