A question about numpy ndarray transformation - numpy

Is there any simple way to change this array
[[ 3 4 0 1 2]
[ 8 9 5 6 7]
[13 14 10 11 12]]
into:
[[ 0 0 0 1 2]
[ 0 0 5 6 7]
[ 0 0 10 11 12]]
?
Edit: maximum supported dimension for an ndarray is 32, found 306 for transpose

Use Slicing:
>>> a[:,:2] = 0
>>> a
array([[ 0, 0, 0, 1, 2],
[ 0, 0, 5, 6, 7],
[ 0, 0, 10, 11, 12]])
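
For reference, here is a minimal self-contained sketch of the same slice assignment. The name `zeroed` and the use of `.copy()` are my own additions, there only so the original array stays available for comparison; the answer above modifies `a` in place.

```python
import numpy as np

a = np.array([[3, 4, 0, 1, 2],
              [8, 9, 5, 6, 7],
              [13, 14, 10, 11, 12]])

# Work on a copy so `a` is left untouched (assumption: you want to keep it).
zeroed = a.copy()

# Every row, first two columns -> 0. Slice assignment writes in place.
zeroed[:, :2] = 0

print(zeroed)
```

If the original values are not needed, dropping the copy and assigning to `a[:, :2]` directly avoids the extra allocation.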

Related

How to get the specific output for NumPy array slicing?

x is an array of shape (n_dim, n_row, n_col) holding the first n natural numbers.
b is a boolean array of shape (2,) with the elements True, False.
import numpy as np
def array_slice(n,n_dim,n_row,n_col):
    x = np.arange(0,n).reshape(n_dim,n_row,n_col)
    b = np.full((2,),True)
    print(x[b])
    print(x[b,:,1:3])
expected output
[[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]]
[[[ 1 2]
[ 6 7]
[11 12]]]
my output:-
[[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
[[15 16 17 18 19]
[20 21 22 23 24]
[25 26 27 28 29]]]
[[[ 1 2]
[ 6 7]
[11 12]]
[[16 17]
[21 22]
[26 27]]]
An example:
In [83]: x= np.arange(24).reshape(2,3,4)
In [84]: b = np.full((2,),True)
In [85]: x
Out[85]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [86]: b
Out[86]: array([ True, True])
With two True values, b selects both planes of the 1st dimension:
In [87]: x[b]
Out[87]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
A b with a mix of true and false:
In [88]: b = np.array([True, False])
In [89]: b
Out[89]: array([ True, False])
In [90]: x[b]
Out[90]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]])
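
Putting this together for the question's function: a sketch that builds `b` as `[True, False]` instead of `np.full((2,), True)`, which should give the expected output. It assumes `n_dim == 2` (the mask length must match the first axis), and returning the two results as well as printing them is my own change.

```python
import numpy as np

def array_slice(n, n_dim, n_row, n_col):
    x = np.arange(n).reshape(n_dim, n_row, n_col)
    # Assumption: n_dim == 2, so a length-2 mask matches the first axis.
    b = np.array([True, False])
    first = x[b]            # keeps only the first plane, shape (1, n_row, n_col)
    cols = x[b, :, 1:3]     # same plane, all rows, columns 1 and 2
    print(first)
    print(cols)
    return first, cols

first, cols = array_slice(30, 2, 3, 5)
```

`np.full((2,), True)` produces `[True, True]`, which is why the original code returned both planes.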

Getting the values of only one channel of a numpy array

I would very much be grateful if you don't close this question without even giving a hint of how to solve this problem,please.
I have the following
import numpy as np
layer1 = np.zeros((5,3,4),dtype=np.uint8)
layer1[0,0,0]=20
layer1[1,1,0]=20
layer1[2,2,0]=20
layer1[3,1,0]=20
layer1[4,0,0]=20
layer1[0,0,1] =50
layer1[1,0,1]=50
layer1[2,0,1]=50
print(layer1)
print("---------------")
which gives me
[[[20 50 0 0]
[ 0 0 0 0]
[ 0 0 0 0]]
[[ 0 50 0 0]
[20 0 0 0]
[ 0 0 0 0]]
[[ 0 50 0 0]
[ 0 0 0 0]
[20 0 0 0]]
[[ 0 0 0 0]
[20 0 0 0]
[ 0 0 0 0]]
[[20 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]]]
How can I get the values of only one channel?
For example for channel=0
I want to get
[[20 0 0]
[ 0 20 0]
[ 0 0 20]
[ 0 20 0]
[20 0 0]]
where channel can be 0,1,2 or 3
EDIT: Just in case, the layer1[0,0,0]=20 is just a convenient way to fill up the matrix. My question is how to transform layer1, once filled, into a matrix of shape (5,3).
EDIT: if the "channel" is 1 then I would get
[[50 0 0]
[ 50 0 0]
[ 50 0 0]
[ 0 0 0]
[0 0 0]]
numpy array indexing is well documented. Don't skip it!
In [1]: layer1 = np.zeros((5,3,4),dtype=np.uint8)
...: layer1[0,0,0]=20
...: layer1[1,1,0]=20
...: layer1[2,2,0]=20
...: layer1[3,1,0]=20
...: layer1[4,0,0]=20
...:
...: layer1[0,0,1] =50
...: layer1[1,0,1]=50
...: layer1[2,0,1]=50
In [2]: layer1.shape
Out[2]: (5, 3, 4)
In [3]: layer1[:,:,0]
Out[3]:
array([[20, 0, 0],
[ 0, 20, 0],
[ 0, 0, 20],
[ 0, 20, 0],
[20, 0, 0]], dtype=uint8)
In [4]: layer1[:,:,2]
Out[4]:
array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]], dtype=uint8)
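
The same indexing can be wrapped in a small helper that takes the channel as a parameter. `select_channel` is a hypothetical name, not a NumPy function; this is only a sketch of the idea shown above.

```python
import numpy as np

def select_channel(arr, channel):
    """Return the 2-D slice for one channel of the last axis (hypothetical helper)."""
    # `...` stands for "all leading axes", so this equals arr[:, :, channel] for 3-D input.
    return arr[..., channel]

layer1 = np.zeros((5, 3, 4), dtype=np.uint8)
for i, j in [(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)]:
    layer1[i, j, 0] = 20
for i in range(3):
    layer1[i, 0, 1] = 50

print(select_channel(layer1, 0))
print(select_channel(layer1, 1))
```

Basic indexing like this returns a view, so writing to the result also changes `layer1`; call `.copy()` on it if an independent array is needed.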

How does tensorflow get indices of unique value in Tensorflow Tensor?

Suppose I have one input 1D tensor, I want to get indices for unique elements in 1D tensor.
input 1D tensor
[ 1 3 0 0 0 3 5 6 8 9 12 2 5 7 0 11 6 7 0 0]
expected output
Values: [1, 3, 0, 5, 6, 8, 9, 12, 2, 7, 11]
indices: [0, 1, 2, 6, 7, 8, 9, 10, 11, 13, 15]
Here is my strategy now.
input = [ 1, 3, 0, 0, 0, 3, 5, 6, 8, 9, 12, 2, 5, 7, 0, 11, 6, 7, 0, 0,]
unique_value_in_input, _ = tf.unique(input) # [1 3 0 5 6 8 9 12 2 7 11]
number_of_unique_value = tf.shape(unique_value_in_input)[0] #11
y = tf.reshape(unique_value_in_input, (number_of_unique_value, 1)) #[[1], [3], [0], [5], [6], [8], [9], ..]
input_matrix = tf.tile(input, [number_of_unique_value]) # repeat the tensor for tf.equal()
input_matrix = tf.reshape(input_matrix, [number_of_unique_value,-1])
cols = tf.where(tf.equal(input_matrix, y))[:,-1] #[[ 0 0] [ 1 1] [ 1 5] [ 2 6] [ 2 12] ...]
Because the input contains repeated values, the tf.where() step returns several matches (duplicated True entries) for the same unique value.
Is there any function I can use for this?
You should be able to get the desired output as follows. For each value in the unique values, build a boolean tensor marking where the input equals that value, then take the index of its maximum (i.e. only the first maximum index) through tf.argmax.
import tensorflow as tf
input = tf.constant([ 1, 3, 0, 0, 0, 3, 5, 6, 8, 9, 12, 2, 5, 7, 0, 11, 6, 7, 0, 0,], tf.int64)
unique_vals, _ = tf.unique(input)
res = tf.map_fn(
    lambda x: tf.argmax(tf.cast(tf.equal(input, x), tf.int64)),
    unique_vals)
with tf.Session() as sess:
    print(sess.run(res))
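
As a cross-check outside TensorFlow, the same first-occurrence indices can be computed in NumPy. This is a swapped-in technique, not the TensorFlow answer above: it uses `np.unique(..., return_index=True)`, which returns sorted unique values with the index of each first occurrence, and then re-sorts by index to restore order of appearance. The variable names are my own.

```python
import numpy as np

data = np.array([1, 3, 0, 0, 0, 3, 5, 6, 8, 9, 12, 2, 5, 7, 0, 11, 6, 7, 0, 0])

# return_index=True gives, for each sorted unique value, where it first occurs.
_, first_idx = np.unique(data, return_index=True)

# Sorting those indices restores the order in which values first appear.
indices = np.sort(first_idx)
values = data[indices]

print(values)
print(indices)
```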

Efficiently Creating A Pandas DataFrame From A Numpy 3d array

Suppose we start with
import numpy as np
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
How can this efficiently be made into a pandas DataFrame equivalent to
import pandas as pd
>>> pd.DataFrame({'a': [0, 0, 1, 1], 'b': [1, 3, 5, 7], 'c': [2, 4, 6, 8]})
a b c
0 0 1 2
1 0 3 4
2 1 5 6
3 1 7 8
The idea is to have the a column have the index in the first dimension in the original array, and the rest of the columns be a vertical concatenation of the 2d arrays in the latter two dimensions in the original array.
(This is easy to do with loops; the question is how to do it without them.)
Longer Example
Using @Divakar's excellent suggestion:
>>> np.random.randint(0,9,(4,3,2))
array([[[0, 6],
[6, 4],
[3, 4]],
[[5, 1],
[1, 3],
[6, 4]],
[[8, 0],
[2, 3],
[3, 1]],
[[2, 2],
[0, 0],
[6, 3]]])
Should be made to something like:
>>> pd.DataFrame({
'a': [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
'b': [0, 6, 3, 5, 1, 6, 8, 2, 3, 2, 0, 6],
'c': [6, 4, 4, 1, 3, 4, 0, 3, 1, 2, 0, 3]})
a b c
0 0 0 6
1 0 6 4
2 0 3 4
3 1 5 1
4 1 1 3
5 1 6 4
6 2 8 0
7 2 2 3
8 2 3 1
9 3 2 2
10 3 0 0
11 3 6 3
Here's one approach that does most of the processing on NumPy before finally putting it out as a DataFrame, like so -
m,n,r = a.shape
out_arr = np.column_stack((np.repeat(np.arange(m),n),a.reshape(m*n,-1)))
out_df = pd.DataFrame(out_arr)
If you know for certain that the last dimension has exactly 2 columns, so that b and c are the last two columns and a is the first one, you can add column names like so -
out_df = pd.DataFrame(out_arr,columns=['a', 'b', 'c'])
Sample run -
>>> a
array([[[2, 0],
[1, 7],
[3, 8]],
[[5, 0],
[0, 7],
[8, 0]],
[[2, 5],
[8, 2],
[1, 2]],
[[5, 3],
[1, 6],
[3, 2]]])
>>> out_df
a b c
0 0 2 0
1 0 1 7
2 0 3 8
3 1 5 0
4 1 0 7
5 1 8 0
6 2 2 5
7 2 8 2
8 2 1 2
9 3 5 3
10 3 1 6
11 3 3 2
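
For convenience, the approach above can be collected into one self-contained function. `array3d_to_frame` and its `names` parameter are hypothetical names I chose for this sketch; the body is the same `np.repeat` / `reshape` / `np.column_stack` combination.

```python
import numpy as np
import pandas as pd

def array3d_to_frame(a, names=None):
    """Flatten a 3-D array to a DataFrame with the first-axis index as column 0 (sketch)."""
    m, n, r = a.shape
    # Each first-axis index is repeated once per row of its 2-D block.
    first = np.repeat(np.arange(m), n)
    # Stack the m blocks vertically: (m, n, r) -> (m*n, r).
    out_arr = np.column_stack((first, a.reshape(m * n, r)))
    return pd.DataFrame(out_arr, columns=names)

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
df = array3d_to_frame(a, names=["a", "b", "c"])
print(df)
```

Note that `np.column_stack` produces a single array with one common dtype, so if `a` holds floats the index column will be float as well.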
Using Panel (pd.Panel was removed in pandas 0.25, so this only runs on older versions):
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
b=pd.Panel(np.rollaxis(a,2)).to_frame()
c=b.set_index(b.index.labels[0]).reset_index()
c.columns=list('abc')
then a is :
[[[1 2]
[3 4]]
[[5 6]
[7 8]]]
b is :
             0  1
major minor
0     0      1  2
      1      3  4
1     0      5  6
      1      7  8
and c is :
a b c
0 0 1 2
1 0 3 4
2 1 5 6
3 1 7 8
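
On pandas versions without Panel, a similar intermediate frame can be built with a MultiIndex. This is a sketch of an alternative, not the Panel answer itself; the level names `major` and `minor` are chosen only to mirror the output shown above.

```python
import numpy as np
import pandas as pd

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
m, n, r = a.shape

# One (major, minor) pair per row of the flattened array.
idx = pd.MultiIndex.from_product([range(m), range(n)], names=["major", "minor"])
b = pd.DataFrame(a.reshape(m * n, r), index=idx)

# Drop the inner level, then turn the outer level into an ordinary column.
c = b.reset_index(level="minor", drop=True).reset_index()
c.columns = list("abc")
print(c)
```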

Deleting Chained Duplicates

Let's say I have a list:
lits = [1, 1, 1, 2, 0, 0, 0, 0, 3, 3, 1, 4, 5, 2, 2, 2, 0, 0, 0]
and I need this to become [1, 1, 2, 0, 0, 3, 3, 1, 4, 5, 2, 2, 0, 0]
(Delete duplicates, but only within a chain of duplicates, keeping the first and last element of each chain.) I am going to do this on a huge HDF5 file with pandas and numpy, and would rather not use a for loop iterating through all elements.
table = table.drop_duplicates(cols='[SPEED OVER GROUND [kts]]', take_last=True)
Is there a modification I can make to this code?
In pandas you can build a boolean mask that selects a row only if it differs from either the preceding or the succeeding value:
>>> df=pd.DataFrame({ 'lits':lits })
>>> df[ (df.lits != df.lits.shift(1)) | (df.lits != df.lits.shift(-1)) ]
lits
0 1
2 1
3 2
4 0
7 0
8 3
9 3
10 1
11 4
12 5
13 2
15 2
16 0
18 0
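
The same mask can be written in plain NumPy, which may be handy when the data is already an array. This is a sketch of an equivalent, not part of the pandas answer; `trim_chains` is a hypothetical name.

```python
import numpy as np

def trim_chains(values):
    """Keep only the first and last element of each run of equal values (sketch)."""
    a = np.asarray(values)
    keep = np.ones(len(a), dtype=bool)  # the two end elements are always kept
    # An interior element survives if it differs from its left or its right neighbour.
    keep[1:-1] = (a[1:-1] != a[:-2]) | (a[1:-1] != a[2:])
    return a[keep]

lits = [1, 1, 1, 2, 0, 0, 0, 0, 3, 3, 1, 4, 5, 2, 2, 2, 0, 0, 0]
result = trim_chains(lits)
print(result)
```

Both versions are vectorised, so neither loops over elements in Python; the comparison runs on whole shifted slices at once.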