How to flatten along 3rd dimension in numpy? - numpy

I have a 3d array in numpy that I want to flatten into a 1d array. I want to flatten each 2d "layer" of the array, copying each successive layer into the 1d array.
e.g., for an array with arr[:, :, 0] = [[1, 2], [3, 4]] and arr[:, :, 1] = [[5, 6], [7, 8]], I want the output to be [1, 2, 3, 4, 5, 6, 7, 8].
Currently I have the following code:
out = np.empty(arr.size)
for c in range(arr.shape[2]):  # xrange on Python 2
    layer = arr[:, :, c]
    out[c * layer.size:(c + 1) * layer.size] = layer.ravel()
Is there a way to accomplish this efficiently in numpy (without using a for loop)? I have tried messing around with reshape, transpose, and flatten to no avail.

I figured it out:
out = arr.transpose((2, 0, 1)).flatten()

Or (the last axis moves to the front): np.rollaxis(arr, -1).ravel()
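
As a quick check of the two answers above, the sketch below rebuilds the example array from the question and compares both one-liners against the original loop. np.moveaxis is not mentioned in the thread; it is included here as an assumed alternative spelling of the same axis move.

```python
import numpy as np

# Build the example from the question: arr[:, :, 0] and arr[:, :, 1] are the layers.
arr = np.stack([np.array([[1, 2], [3, 4]]),
                np.array([[5, 6], [7, 8]])], axis=2)  # shape (2, 2, 2)

# Reference result using the explicit loop from the question.
loop_out = np.empty(arr.size, dtype=arr.dtype)
for c in range(arr.shape[2]):
    layer = arr[:, :, c]
    loop_out[c * layer.size:(c + 1) * layer.size] = layer.ravel()

# Move the layer axis to the front, then flatten in C (row-major) order.
via_transpose = arr.transpose((2, 0, 1)).flatten()
via_rollaxis = np.rollaxis(arr, -1).ravel()
via_moveaxis = np.moveaxis(arr, -1, 0).ravel()  # not from the thread

print(via_transpose)  # [1 2 3 4 5 6 7 8]
```

flatten always returns a copy, while ravel returns a view when it can; after the axis move the data is no longer contiguous, so both end up copying here.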

Related

Pandas dataframe to 2D numpy array

I have the following dataframe:
d = {'histogram' : [[1,2],[3,4],[5,6]]}
df = pd.DataFrame(d)
The histograms always have the same length (2 in this example),
and I would like to convert the 'histogram' column into a 2D numpy array to feed into a neural net. The preferred output is:
output_array = np.array(d["histogram"])
i.e.:
array([[1, 2],
       [3, 4],
       [5, 6]])
however when I try:
df["histogram"].to_numpy()
the result is an array of lists rather than a numpy array of arrays:
array([list([1, 2]), list([3, 4]), list([5, 6])], dtype=object)
This is problematic for neural nets, as I have to specify the dimensions/shape.
I tried to solve the issue by converting each list to a numpy array:
df["histogram_arrays"] = df["histogram"].apply(lambda x: np.array(x))
df["histogram_arrays"].to_numpy()
which returns a 1D array of arrays and not the 2D array.
array([array([1, 2]), array([3, 4]), array([5, 6])], dtype=object)
How can I get the histograms into a 2D array?
Try this:
np.vstack(df['histogram'])
Your question is essentially: how do I convert a NumPy array of (identically-sized) lists to a two-dimensional NumPy array.
That makes it a (near) duplicate of this SO question, but since your actual question is somewhat hidden, I'll put an answer here anyway.
Use numpy.vstack:
>>> data = df['histogram'].to_numpy()
>>> data
array([list([1, 2]), list([3, 4]), list([5, 6])], dtype=object)
>>> data = np.vstack(data)
>>> data.dtype, data.shape
(dtype('int64'), (3, 2))
>>> data
array([[1, 2],
       [3, 4],
       [5, 6]])
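
A minimal sketch of the same conversion without pandas: it rebuilds by hand the 1D object array of lists that df['histogram'].to_numpy() returns, then stacks it. The round trip through tolist() is an assumed alternative, not something the answers above propose.

```python
import numpy as np

# Recreate the 1D object array of Python lists shown in the question.
hists = [[1, 2], [3, 4], [5, 6]]
data = np.empty(len(hists), dtype=object)
for i, h in enumerate(hists):
    data[i] = h  # each element stays a plain Python list

stacked = np.vstack(data)            # stack the rows into a 2D array
from_list = np.array(data.tolist())  # alternative: go through nested lists

print(stacked.shape)  # (3, 2)
```

Both routes require every histogram to have the same length; with ragged rows neither can produce a rectangular 2D array.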

tensorflow - resize dimension

I am new to tensorflow and wondering if it is possible to resize a single dimension within a tensor.
Let's say I have a given tensor t:
t = [[1, 10], [2, 20]]
shape(t) = [2, 2]
now I want to modify the shape of this tensor, so that:
shape(t) = [2, 3]
So far I just found the functions:
reshape --> this function can reshape the tensor as long as the total number of elements stays the same (as far as I understood)
shape(t) = [1, 4] | [4, 1] | [4]
expand_dims --> this function can add a new dimension of size 1
shape(t) = [1, 2, 2] | [2, 1, 2] | [2, 2, 1]
Is there a function for the purpose I described? If not: why? (Maybe it doesn't make sense to have such a function?)
Kind regards
You can do this with tf.concat. Here is an example.
import tensorflow as tf
t = tf.constant([[1, 10], [2, 20]], dtype=tf.int32)
# the new column to append, with shape [2]
TBA_a = tf.constant([3,30], dtype=tf.int32)
# reshape TBA_a to [2,1], then concat it to t along axis 1 (columns)
new_t = tf.concat([t, tf.reshape(TBA_a, [2,1])], axis=1)
# TensorFlow 1.x session API; in TensorFlow 2.x eager mode, print(new_t.numpy()) needs no session
sess = tf.InteractiveSession()
print(new_t.eval())
It will give us
[[ 1 10  3]
 [ 2 20 30]]
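
On the "why" part of the question: reshape only rearranges existing elements, so it cannot turn 4 values into 6, and the extra column has to come from somewhere. The sketch below illustrates this in NumPy rather than TensorFlow (an assumption, made so it runs without a session). np.concatenate plays the role of tf.concat, and np.pad, which the answer does not mention, is one way to fill the new column with zeros instead.

```python
import numpy as np

t = np.array([[1, 10], [2, 20]])  # shape (2, 2), 4 elements

# reshape must preserve the element count, so a (2, 3) target is rejected.
try:
    t.reshape(2, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True

# Supply the new column explicitly, as in the tf.concat answer.
new_col = np.array([3, 30]).reshape(2, 1)
new_t = np.concatenate([t, new_col], axis=1)  # shape (2, 3)

# Or pad axis 1 with one trailing zero per row.
padded = np.pad(t, ((0, 0), (0, 1)), mode="constant")

print(new_t.shape, padded.shape)  # (2, 3) (2, 3)
```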

Repeat element from a 2D matrix to a 3D matrix with numpy

I have a 2-D numpy matrix, for example:
M = np.matrix([[1,2],[3,4],[5,6]])
I would like, starting from M, to have a matrix like:
M = np.matrix([[[1,2],[1,2],[1,2]],[[3,4],[3,4],[3,4]],[[5,6],[5,6],[5,6]]])
Thus, the new matrix has 3 dimensions. How can I do this?
NumPy matrix class can't hold 3D data. So, assuming you are okay with NumPy array as output, we can extend the array version of it to 3D with None/np.newaxis and then use np.repeat -
np.repeat(np.asarray(M)[:,None],3,axis=1)
Sample run -
In [233]: M = np.matrix([[1,2],[3,4],[5,6]])
In [234]: np.repeat(np.asarray(M)[:,None],3,axis=1)
Out[234]:
array([[[1, 2],
        [1, 2],
        [1, 2]],

       [[3, 4],
        [3, 4],
        [3, 4]],

       [[5, 6],
        [5, 6],
        [5, 6]]])
Alternatively, with np.tile -
np.tile(np.asarray(M),3).reshape(-1,3,M.shape[-1])
This should work for you:
np.array([list(np.array(i)) * 3 for i in M])
As another answer already said, a matrix can't be three-dimensional.
Instead, you can build a 3-dimensional np.array as shown below.
import numpy as np
M = np.matrix([[1,2],[3,4],[5,6]])
M = np.array(M)
M = np.array([ [x, x, x] for x in M])
M
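
To compare the approaches from these answers side by side, here is a small sketch that starts from a plain ndarray (an assumption; the question uses np.matrix, which np.asarray converts). np.broadcast_to is not from the answers; it is added as a hedged option when a read-only view without copying is enough.

```python
import numpy as np

M = np.array([[1, 2], [3, 4], [5, 6]])  # plain ndarray instead of np.matrix

# Insert a new middle axis, then repeat along it (copies the data).
repeated = np.repeat(M[:, None], 3, axis=1)

# Tile the columns, then fold each row back into 3 copies of itself.
tiled = np.tile(M, 3).reshape(-1, 3, M.shape[-1])

# List-comprehension version from the last answer.
listcomp = np.array([[x, x, x] for x in M])

# Read-only broadcast view; no data is copied (not from the answers).
view = np.broadcast_to(M[:, None, :], (M.shape[0], 3, M.shape[1]))

print(repeated.shape)  # (3, 3, 2)
```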

Normalize numpy ndarray data

My data is a numpy ndarray with shape (2, 3, 4), shown below.
I tried to normalize each column to the 0-1 scale with sklearn's normalize:
import numpy as np
from sklearn.preprocessing import normalize
x = np.array([[[1, 2, 3, 4],
               [2, 2, 3, 4],
               [3, 2, 3, 4]],
              [[4, 2, 3, 4],
               [5, 2, 3, 4],
               [6, 2, 3, 4]]])
x.shape  # ==> (2, 3, 4)
x = normalize(x, norm='max', axis=0)
However, I get the error:
ValueError: Found array with dim 3. the normalize function expected <= 2.
How do I solve this problem?
Thank you.
It seems scikit-learn's normalize expects ndarrays with at most two dimensions. One fix is to reshape to 2D, pass that to normalize, and reshape the resulting 2D array back to the original shape -
from sklearn.preprocessing import normalize
normalize(x.reshape(x.shape[0],-1), norm='max', axis=0).reshape(x.shape)
Alternatively, it's much simpler with NumPy, which works on ndarrays of any dimensionality -
x/np.linalg.norm(x, ord=np.inf, axis=0, keepdims=True)
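
The sketch below uses NumPy only and checks that the two routes agree on the array from the question. The reshape route is written out by hand (dividing by the column-wise maximum absolute value) as a stand-in for what normalize(..., norm='max', axis=0) is described as computing above; it does not call scikit-learn.

```python
import numpy as np

x = np.array([[[1, 2, 3, 4],
               [2, 2, 3, 4],
               [3, 2, 3, 4]],
              [[4, 2, 3, 4],
               [5, 2, 3, 4],
               [6, 2, 3, 4]]], dtype=float)

# Max-norm along axis 0: divide by the largest absolute value in each position.
scale = np.abs(x).max(axis=0, keepdims=True)  # shape (1, 3, 4)
x_max = x / scale

# Same thing via np.linalg.norm with the infinity norm, as in the answer.
via_norm = x / np.linalg.norm(x, ord=np.inf, axis=0, keepdims=True)

# Reshape route: flatten to 2D, scale each column, restore the shape.
flat = x.reshape(x.shape[0], -1)
via_reshape = (flat / np.abs(flat).max(axis=0, keepdims=True)).reshape(x.shape)

print(x_max[0, :, 0].tolist())  # [0.25, 0.4, 0.5]
```

Max-norm scaling lands in [0, 1] only for non-negative data, and a position whose values are all zero would divide by zero, so real inputs may need a guard.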

Tensorflow: index per row

Suppose I have a Tensor of shape (100, 20) and a Tensor of indices of shape (100,). How can I obtain a Tensor of shape (100,) or (100, 1) that holds, for each of the 100 rows, the value selected by the corresponding entry in indices?
Small example:
So let's say tensor A is
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
and tensor B is
[0,2,1]
then I want as output
[1,6,8]
You can join your B tensor with an appropriate range to create two-dimensional indices (in your example [[0, 0], [1, 2], [2, 1]]) and then extract the elements using tf.gather_nd:
b_2 = tf.expand_dims(b, 1)
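row_idx = tf.expand_dims(tf.range(tf.shape(b)[0]), 1)  # renamed to avoid shadowing the built-in range
ind = tf.concat([row_idx, b_2], axis=1)  # values first, then axis (argument order since TensorFlow 1.0)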
res = tf.gather_nd(a, ind)
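
To make the index construction concrete, here is the same idea as a NumPy analogue (not TensorFlow, and not part of the answer): pair each row number with its column index, then pick one element per pair. np.take_along_axis is shown as an assumed alternative that skips building the pairs.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = np.array([0, 2, 1])  # one column index per row

# Pair every row number with its column index: [[0, 0], [1, 2], [2, 1]].
rows = np.arange(A.shape[0])
ind = np.stack([rows, B], axis=1)

# Pick one element per (row, column) pair, like tf.gather_nd(a, ind).
res = A[ind[:, 0], ind[:, 1]]

# Alternative without explicit pairs (not from the answer).
also = np.take_along_axis(A, B[:, None], axis=1)[:, 0]

print(res)  # [1 6 8]
```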