Deleting multiple columns from a numpy array - numpy

I have an array of random column indices that have to be deleted from a NumPy array. When I try the code below, the expected number of columns does not get deleted. Any suggestions?
np.array([np.delete(image[row], columns[row].astype(int), axis=0) for row in range(height)])

I'm not sure what some parts of your example are, such as image[row] and columns[row], but the example below works for deleting multiple columns. The call np.delete(n,[0,2],1) means: for array n, delete the first (index 0) and the third (index 2) columns, since axis=1 selects columns.
import numpy as np

n = np.array([
    [2,3,4,6],
    [3,3,0,8],
    [8,4,1,0],
    [9,4,2,0]])
print(np.delete(n,[0,2],1))
output
[[3 6]
[3 8]
[4 0]
[4 0]]
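For the case in the question, where each row has its own set of columns to delete, a boolean mask avoids the Python loop. The following is only a sketch under assumptions: image is 2-D, columns has one row of indices per image row, and every row lists the same number of distinct indices. The function name delete_columns_per_row is made up for illustration. If a row of random indices contains duplicates, fewer columns are removed from that row, which may be why fewer columns than expected disappear in the original code.

```python
import numpy as np

def delete_columns_per_row(image, columns):
    """Remove a different set of columns from each row of a 2-D array.

    Assumes every row of `columns` holds the same number of distinct indices.
    """
    image = np.asarray(image)
    columns = np.asarray(columns, dtype=int)
    height = image.shape[0]
    # Start by keeping everything, then switch off the listed positions.
    keep = np.ones(image.shape, dtype=bool)
    keep[np.arange(height)[:, None], columns] = False
    # Boolean indexing flattens the result, so restore the row structure.
    return image[keep].reshape(height, -1)

image = np.arange(12).reshape(3, 4)
columns = np.array([[0, 1], [1, 3], [2, 3]])
result = delete_columns_per_row(image, columns)
print(result)
```

If duplicates are possible, the rows end up with different lengths and the reshape no longer gives the intended layout, so the indices would need to be drawn without replacement first.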

Related

pandas dataframe function mean() not working correctly to ignore nan values

By default, the mean() method should ignore NaN values, but in my case it does not. The NaN still ends up in the result.
import numpy as np
import pandas as pd

a = np.array([1,9])
b = np.array([3,np.nan])
c = np.array([7,8])
d = {'value': [a,b,a,c], 'group': [3,3,4,4], 'garbage':['asd','acas','asdasdc','ghfas']}
df = pd.DataFrame(data=d)
df
OUTPUT:
value group garbage
0 [1, 9] 3 asd
1 [3.0, nan] 3 acas
2 [1, 9] 4 asdasdc
3 [7, 8] 4 ghfas
for i,j in df.groupby('group')['value']:
    print(j.mean())
    print("=========")
OUTPUT:
[ 2. nan]
=========
[4. 8.5]
=========
I am not sure what you are trying to do here, but I'll take a stab at it.
Firstly, the value column is a column of numpy arrays, so every cell holds an array rather than a scalar. When you run groupby, j becomes a pd.Series of numpy arrays. Calling mean on it therefore averages the arrays element by element, and a NaN inside an array is not skipped, because pandas only skips missing entries of the Series itself. This is also inadvisable because the arrays can differ in shape, which would cause an error.
I think what you are trying to do is take the mean across all the arrays in each group. You can do that with:
for i,j in df.groupby('group')['value']:
    print(np.nanmean(np.concatenate(j.values)))
Whatever you are trying to do, it is going to be way easier to interact with once you combine the values in your loop.
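As a sketch of the two readings of "mean per group", the snippet below computes both the overall mean of all numbers in a group and the element-wise mean across the arrays, ignoring NaN in both cases. The variable names overall and elementwise are only illustrative, and the element-wise version assumes all arrays in a group have the same length.

```python
import numpy as np
import pandas as pd

a = np.array([1, 9])
b = np.array([3, np.nan])
c = np.array([7, 8])
df = pd.DataFrame({'value': [a, b, a, c], 'group': [3, 3, 4, 4]})

overall = {}
elementwise = {}
for key, series in df.groupby('group')['value']:
    arrays = list(series)  # the numpy arrays belonging to this group
    # One number per group: flatten everything, then skip NaN.
    overall[int(key)] = np.nanmean(np.concatenate(arrays))
    # One array per group: stack into 2-D and average down the rows.
    elementwise[int(key)] = np.nanmean(np.stack(arrays), axis=0)

print(overall)
print(elementwise)
```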

Possible to use np.where to check a condition in vector, but output rows in a 2D array

I have a series and a dataframe. I want to check if the values in a series pass a condition, and modify the row of the dataframe if they do, otherwise leave as is.
NumPy has a broadcasting issue with this - is there another way to do this?
ser = pd.Series([74, 80, 24], pd.date_range(start='2020-01-01', periods=3, freq='D'))
test = pd.DataFrame([pd.Series([1, 2], index=['a', 'b'])] * len(ser), index=ser.index)
np.where(ser<50, (test*2), test)
ValueError: operands could not be broadcast together with shapes (3,) (3,2) (3,2)
I think a workaround would be to modify ser to be a dataframe with all equivalent columns, but it seems a little bit clunky.
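Use broadcasting in NumPy. Convert the Series to an array and add a second axis so the condition has shape (3,1). The values are then matched by position, not aligned by index, so the Series and the DataFrame only need to have the same length: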
a = np.where(ser.to_numpy()[:, None]<50, (test*2), test)
print (a)
[[1 2]
[1 2]
[2 4]]
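np.where returns a plain array, so the index and column labels are lost. If they are needed afterwards, one option is to wrap the result back into a DataFrame. This is a minimal sketch reusing the names from the question; cond, values and result are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

ser = pd.Series([74, 80, 24], pd.date_range(start='2020-01-01', periods=3, freq='D'))
test = pd.DataFrame([pd.Series([1, 2], index=['a', 'b'])] * len(ser), index=ser.index)

# Shape (3, 1): one condition per row, broadcast across the columns.
cond = ser.to_numpy()[:, None] < 50
values = np.where(cond, test.to_numpy() * 2, test.to_numpy())

# Reattach the original labels to the positional result.
result = pd.DataFrame(values, index=test.index, columns=test.columns)
print(result)
```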

Numpy, how to retrieve sub-array of array (specific indices)?

I have an array:
>>> arr1 = np.array([[1,2,3], [4,5,6], [7,8,9]])
array([[1 2 3]
[4 5 6]
[7 8 9]])
I want to retrieve a list (or 1d-array) of elements of this array by giving a list of their indices, like so:
indices = [[0,0], [0,2], [2,0]]
print(arr1[indices])
# result
[1,6,7]
But it does not work. I have been looking for a solution for a while, but I only found ways to select per row and/or per column, not per specific index pair.
Does anyone have an idea?
Cheers
Aymeric
First make indices an array instead of a nested list:
indices = np.array([[0,0], [0,2], [2,0]])
Then, index the first dimension of arr1 using the first values of indices, likewise the second:
arr1[indices[:,0], indices[:,1]]
It gives array([1, 3, 7]) (which is correct, your [1, 6, 7] example output is probably a typo).
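The same lookup can be sketched end to end as follows. Transposing indices splits it into one array of row positions and one of column positions, which is what NumPy's integer indexing expects. The names rows, cols, picked and same are illustrative.

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
indices = np.array([[0, 0], [0, 2], [2, 0]])

# Split the (row, col) pairs into separate row and column arrays.
rows, cols = indices.T
picked = arr1[rows, cols]

# Equivalent form: a tuple of index arrays, one per dimension.
same = arr1[tuple(indices.T)]
print(picked)
```

The tuple form also extends to arrays with more than two dimensions, as long as indices has one column per dimension.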

Combining Matrices in Numpy to preserve outer dimension

Let's say I have this numpy array: [[1,2],[3,4]]. How could I stack it K times so that the dimensions of the inner matrices are preserved?
[[[1,2],[3,4]],
[[1,2],[3,4]],
[[1,2],[3,4]]]....
You can use the built-in np.repeat for this, where a is your array and wrapping it in a list adds the outer dimension to repeat along:
np.repeat([a],K,axis=0)
output:
[[[1 2]
[3 4]]
[[1 2]
[3 4]]
...
[[1 2]
[3 4]]]
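A short sketch with concrete values, assuming K = 3, shows that np.repeat, np.tile and np.broadcast_to all give the same (K, 2, 2) result. The last one returns a read-only view instead of a copy, which can matter for large K.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
K = 3

# Add a leading axis of length 1, then repeat along it.
stacked = np.repeat(a[None, :, :], K, axis=0)

# Tile K copies along a new leading axis.
tiled = np.tile(a, (K, 1, 1))

# Read-only view with the same shape, no data copied.
view = np.broadcast_to(a, (K,) + a.shape)

print(stacked.shape)
```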

How to get those rows having the equal value and their subscript if there is a [10,1] tensor?

I am new to TensorFlow. Given a [10,1] tensor, I want to find all rows that have the same value, together with their indices.
For example, there is a tensor like [[1],[2],[3],[4],[5],[1],[2],[3],[4],[6]].
By comparing each element in the matrix, it is easy to get a dictionary structure like
{'1': [0,5], '2': [1,6], '3': [2,7], '4': [3,8], '5': [4], '6': [9]} in Python, which records where each element occurs in the matrix.
I expect to achieve this result in TensorFlow. Could someone please give me a hand? Thanks a lot.
This is probably a longer method than necessary, and the elements and their indices still are not associated in a single data structure.
There are likely shorter ways to do it.
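import tensorflow as tf  # TensorFlow 1.x API (sessions)

t = tf.constant([[1],[2],[3],[4],[5],[1],[2],[3],[4],[6]])
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
# Unique values, the unique-value index of each element, and the counts
y, idx, count = tf.unique_with_counts(tf.squeeze(t))
y1, idx1, count1 = sess.run([y, idx, count])
for i in range(len(y1)):
    # Row indices of every entry equal to the i-th unique value
    print(sess.run(tf.where(tf.equal(t, y1[i]))[:, 0]))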
Output is
[0 5]
[1 6]
[2 7]
[3 8]
[4]
[9]
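To associate each value with its indices in one structure, as the question asks, the grouping can be done after the tensor's values have been fetched as a NumPy array. The sketch below uses NumPy rather than TensorFlow ops, so it is a swapped-in technique and not a pure TensorFlow solution. The names values, flat and positions are illustrative, and the keys are integers rather than the strings shown in the question.

```python
import numpy as np

# Stand-in for the fetched tensor values (for example, the result of sess.run(t)).
values = np.array([[1], [2], [3], [4], [5], [1], [2], [3], [4], [6]])
flat = values.ravel()

# Map every distinct value to the list of row indices where it occurs.
positions = {
    int(v): np.flatnonzero(flat == v).tolist()
    for v in np.unique(flat)
}
print(positions)
```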