strange implicit conversion after append an empty list in numpy - numpy

After the thread strange implicit conversion of data type in numpy, I found another strange conversion with numpy
import numpy as np
a = np.array([1,2,3], dtype=int)
c = np.append(a, [])
print the c gives:
array([1., 2., 3.])
However, if:
c = np.append(a, [4])
gives:
array([1, 2, 3, 4])
why is there such strange automatic conversion? It does not make any sense at all

The empty list has to be first turned into an array:
In [149]: np.array([])
Out[149]: array([], dtype=float64)
np.append actually does:
In [151]: np.ravel([])
Out[151]: array([], dtype=float64)
The append code:
def append(arr, values, axis=None):
arr = asanyarray(arr)
if axis is None:
if arr.ndim != 1:
arr = arr.ravel()
values = ravel(values)
axis = arr.ndim-1
return concatenate((arr, values), axis=axis)

Related

Why can I not use linspace on this basic function that contains a for-loop?

I have a larger, more complex function that essentially boils down to the below function: f(X). Why can't this function take in linspace integers as a parameter?
This error is given:
"TypeError: only integer scalar arrays can be converted to a scalar index"
import numpy
from matplotlib import*
from numpy import*
import numpy as np
from math import*
import random
from random import *
import matplotlib.pyplot as plt
def f(x):
for i in range(x):
x+=1
return x
xlist = np.linspace(0, 100, 10)
ylist = f(xlist)
plt.figure(num=0)
linspace produces a numpy array, not a list. Don't confuse the two.
In [3]: x = np.linspace(0,100,10)
In [4]: x
Out[4]:
array([ 0. , 11.11111111, 22.22222222, 33.33333333,
44.44444444, 55.55555556, 66.66666667, 77.77777778,
88.88888889, 100. ])
The numbers will look nicer if we take into account that linspace includes the end points:
In [5]: x = np.linspace(0,100,11)
In [6]: x
Out[6]: array([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90., 100.])
As an array, we can simply add a scalar; no need to write a loop:
In [7]: x + 1
Out[7]: array([ 1., 11., 21., 31., 41., 51., 61., 71., 81., 91., 101.])
range takes one number (or 3), but never an array (of more than 1 element):
In [8]: for i in range(5): print(i)
0
1
2
3
4
We can't modify i in such a loop, or the "x" used to create the range. But we can write a list comprehension:
In [9]: [i+1 for i in range(5)]
Out[9]: [1, 2, 3, 4, 5]
references:
https://www.w3schools.com/python/ref_func_range.asp
https://numpy.org/doc/stable/reference/generated/numpy.linspace.html
xlist = np.linspace(0, 100, 10) + 1
Based on your comment it seems you are just trying to plot a few points. I think your use of range and linspace are redundant and you don't need a loop at all. You do not need a for loop to plot many points, unless you are trying to plot multiple curves or do something fancy with each point. For example, this plots 10 points:
x = np.linspace(0, 100, 10)
y=x*x
plt.plot(x,y,'.')
I think this is a good place to start: https://matplotlib.org/3.5.1/tutorials/introductory/pyplot.html

Wrong result of np.dot

I'm new in python and I'm trying to do the multiplication of a 2d matrix with a 1d one. I use np.dot to do it but it gives me a wrong output. I'm trying to do this:
#X_train.shape = 60000
w = np.zeros([784, 1])
lista = range (0, len(X_train))
for i in lista:
score = np.dot(X_train[i,:], w)
print score.shape
Out-> (1L,)
the output should be (60000,1)
Any idea of how I can resolve the problem?
You should avoid the for loop altogether. Indeed, np.dot is supposed to work on N-dim arrays and does the looping internally. See for example
In [1]: import numpy as np
In [2]: a = np.random.rand(1,2) # a.shape = (1,2)
In [3]: b = np.random.rand(2,3) # b.shape = (2,3)
In [4]: np.dot(a,b)
Out[4]: array([[ 0.33735571, 0.29272468, 0.09361096]])

Eigenvector normalization in numpy

I'm using the linalg in numpy to compute eigenvalues and eigenvectors of matrices of signed reals.
I've read this previous question but still don't grasp the normalization of eigenvectors.
Here is an example straight off Wikipedia:
import numpy as np
from numpy import linalg as la
a = np.matrix([[2, 1], [1, 2]], dtype=np.float)
eigh_vals, eigh_vects = np.linalg.eig(a)
print 'eigen_values='
print eigh_vals
print 'eigen_vectors='
print eigh_vects
The eigenvalues are 1 and 3.
For eigenvectors we expect scalar multiples of [1, -1] and [1, 1], which I get:
eig_vals=
[ 3. 1.]
eig_vets=
[[ 0.70710678 -0.70710678]
[ 0.70710678 0.70710678]]
I understand the 1/sqrt(2) factor is to have the norm=1 but why?
Can normalization be 'switched off'?
Thanks!
The key message for the first eigenvector in the Wikipedia article is
Any non-zero vector with v1 = −v2 solves this equation.
So the actual solution is V1 = [x, -x]. Picking the vector V1 = [1, -1] may be pleasing to the human eye, but it is just as aritrary as picking a vector V1 = [104051, -104051] or any other real value.
Actually, picking V1 = [1, -1] / sqrt(2) is the least arbitrary. Of all the possible vectors for V1, it's the only one that is of unit length.
However if instead of unit length you prefer the first value to be 1, you can do
eigh_vects /= eigh_vects[:, 0]
import numpy as np
import sympy as sp
v = sp.Matrix([[2, 1], [1, 2]])
v_vec = v.eigenvects()
v_vec is a list contains 2 tuples:
[(1, 1, [Matrix([
[-1],
[ 1]])]), (3, 1, [Matrix([
[1],
[1]])])]
1 and 3 is the two eigenvalues. The '1' behind 1 & 3 is the number of the eigenvalues. In each tuple, the third element is the eigenvector of each eigenvalue. It is a Matrix object in sp. You can convert a Matrix object to the np array.
v_vec1 = np.array(v_vec[0][2], dtype=float)
v_vec2 = np.array(v_vec[1][2], dtype=float)
print('v_vec1 =', v_vec1)
print('v_vec2 =', v_vec2)
Here is the normalized eigenvectors you would get:
v_vec1 = [[-1. 1.]]
v_vec2 = [[1. 1.]]
If sympy is an option for you, it appears to normalize less aggressively:
import sympy
a = sympy.Matrix([[2, 1], [1, 2]])
a.eigenvects()
# [(1, 1, [Matrix([
# [-1],
# [ 1]])]), (3, 1, [Matrix([
# [1],
# [1]])])]

How to transpose each element of a Numpy Matrix [duplicate]

I'm starting off with a numpy array of an image.
In[1]:img = cv2.imread('test.jpg')
The shape is what you might expect for a 640x480 RGB image.
In[2]:img.shape
Out[2]: (480, 640, 3)
However, this image that I have is a frame of a video, which is 100 frames long. Ideally, I would like to have a single array that contains all the data from this video such that img.shape returns (480, 640, 3, 100).
What is the best way to add the next frame -- that is, the next set of image data, another 480 x 640 x 3 array -- to my initial array?
A dimension can be added to a numpy array as follows:
image = image[..., np.newaxis]
Alternatively to
image = image[..., np.newaxis]
in #dbliss' answer, you can also use numpy.expand_dims like
image = np.expand_dims(image, <your desired dimension>)
For example (taken from the link above):
x = np.array([1, 2])
print(x.shape) # prints (2,)
Then
y = np.expand_dims(x, axis=0)
yields
array([[1, 2]])
and
y.shape
gives
(1, 2)
You could just create an array of the correct size up-front and fill it:
frames = np.empty((480, 640, 3, 100))
for k in xrange(nframes):
frames[:,:,:,k] = cv2.imread('frame_{}.jpg'.format(k))
if the frames were individual jpg file that were named in some particular way (in the example, frame_0.jpg, frame_1.jpg, etc).
Just a note, you might consider using a (nframes, 480,640,3) shaped array, instead.
Pythonic
X = X[:, :, None]
which is equivalent to
X = X[:, :, numpy.newaxis] and
X = numpy.expand_dims(X, axis=-1)
But as you are explicitly asking about stacking images,
I would recommend going for stacking the list of images np.stack([X1, X2, X3]) that you may have collected in a loop.
If you do not like the order of the dimensions you can rearrange with np.transpose()
You can use np.concatenate() use the axis parameter to specify the dimension that should be concatenated. If the arrays being concatenated do not have this dimension, you can use np.newaxis to indicate where the new dimension should be added:
import numpy as np
movie = np.concatenate((img1[:,np.newaxis], img2[:,np.newaxis]), axis=3)
If you are reading from many files:
import glob
movie = np.concatenate([cv2.imread(p)[:,np.newaxis] for p in glob.glob('*.jpg')], axis=3)
Consider Approach 1 with reshape method and Approach 2 with np.newaxis method that produce the same outcome:
#Lets suppose, we have:
x = [1,2,3,4,5,6,7,8,9]
print('I. x',x)
xNpArr = np.array(x)
print('II. xNpArr',xNpArr)
print('III. xNpArr', xNpArr.shape)
xNpArr_3x3 = xNpArr.reshape((3,3))
print('IV. xNpArr_3x3.shape', xNpArr_3x3.shape)
print('V. xNpArr_3x3', xNpArr_3x3)
#Approach 1 with reshape method
xNpArrRs_1x3x3x1 = xNpArr_3x3.reshape((1,3,3,1))
print('VI. xNpArrRs_1x3x3x1.shape', xNpArrRs_1x3x3x1.shape)
print('VII. xNpArrRs_1x3x3x1', xNpArrRs_1x3x3x1)
#Approach 2 with np.newaxis method
xNpArrNa_1x3x3x1 = xNpArr_3x3[np.newaxis, ..., np.newaxis]
print('VIII. xNpArrNa_1x3x3x1.shape', xNpArrNa_1x3x3x1.shape)
print('IX. xNpArrNa_1x3x3x1', xNpArrNa_1x3x3x1)
We have as outcome:
I. x [1, 2, 3, 4, 5, 6, 7, 8, 9]
II. xNpArr [1 2 3 4 5 6 7 8 9]
III. xNpArr (9,)
IV. xNpArr_3x3.shape (3, 3)
V. xNpArr_3x3 [[1 2 3]
[4 5 6]
[7 8 9]]
VI. xNpArrRs_1x3x3x1.shape (1, 3, 3, 1)
VII. xNpArrRs_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
VIII. xNpArrNa_1x3x3x1.shape (1, 3, 3, 1)
IX. xNpArrNa_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
a = np.expand_dims(a, axis=-1)
or
a = a[:, np.newaxis]
or
a = a.reshape(a.shape + (1,))
There is no structure in numpy that allows you to append more data later.
Instead, numpy puts all of your data into a contiguous chunk of numbers (basically; a C array), and any resize requires allocating a new chunk of memory to hold it. Numpy's speed comes from being able to keep all the data in a numpy array in the same chunk of memory; e.g. mathematical operations can be parallelized for speed and you get less cache misses.
So you will have two kinds of solutions:
Pre-allocate the memory for the numpy array and fill in the values, like in JoshAdel's answer, or
Keep your data in a normal python list until it's actually needed to put them all together (see below)
images = []
for i in range(100):
new_image = # pull image from somewhere
images.append(new_image)
images = np.stack(images, axis=3)
Note that there is no need to expand the dimensions of the individual image arrays first, nor do you need to know how many images you expect ahead of time.
You can use stack with the axis parameter:
img.shape # h,w,3
imgs = np.stack([img1,img2,img3,img4], axis=-1) # -1 = new axis is last
imgs.shape # h,w,3,nimages
For example: to convert grayscale to color:
>>> d = np.zeros((5,4), dtype=int) # 5x4
>>> d[2,3] = 1
>>> d3.shape
Out[30]: (5, 4, 3)
>>> d3 = np.stack([d,d,d], axis=-2) # 5x4x3 -1=as last axis
>>> d3[2,3]
Out[32]: array([1, 1, 1])
I followed this approach:
import numpy as np
import cv2
ls = []
for image in image_paths:
ls.append(cv2.imread('test.jpg'))
img_np = np.array(ls) # shape (100, 480, 640, 3)
img_np = np.rollaxis(img_np, 0, 4) # shape (480, 640, 3, 100).
This worked for me:
image = image[..., None]
This will help you add axis anywhere you want
import numpy as np
signal = np.array([[0.3394572666491664, 0.3089068053925853, 0.3516359279582483], [0.33932706934615525, 0.3094755563319447, 0.3511973743219001], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256]])
print(signal.shape)
#(4,3)
print(signal[...,np.newaxis].shape) or signal[...:none]
#(4, 3, 1)
print(signal[:, np.newaxis, :].shape) or signal[:,none, :]
#(4, 1, 3)
there is three-way for adding new dimensions to ndarray .
first: using "np.newaxis" (something like #dbliss answer)
np.newaxis is just given an alias to None for making it easier to
understand. If you replace np.newaxis with None, it works the same
way. but it's better to use np.newaxis for being more explicit.
import numpy as np
my_arr = np.array([2, 3])
new_arr = my_arr[..., np.newaxis]
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
second: using "np.expand_dims()"
Specify the original ndarray in the first argument and the position
to add the dimension in the second argument axis.
my_arr = np.array([2, 3])
new_arr = np.expand_dims(my_arr, -1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
third: using "reshape()"
my_arr = np.array([2, 3])
new_arr = my_arr.reshape(*my_arr.shape, 1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)

Is there a simple pad in numpy?

Is there a numpy function that pads an array this way?
import numpy as np
def pad(x, length):
tmp = np.zeros((length,))
tmp[:x.shape[0]] = x
return tmp
x = np.array([1,2,3])
print pad(x, 5)
Output:
[ 1. 2. 3. 0. 0.]
I couldn't find a way to do it with numpy.pad()
You can use ndarray.resize():
>>> x = np.array([1,2,3])
>>> x.resize(5)
>>> x
array([1, 2, 3, 0, 0])
Note that this functions behaves differently from numpy.resize(), which pads with repeated copies of the array itself. (Consistency is for people who can't remember everything.)
Sven Marnach's suggestion to use ndarray.resize() is probably the simplest way to do it, but for completeness, here's how it can be done with numpy.pad:
In [13]: x
Out[13]: array([1, 2, 3])
In [14]: np.pad(x, [0, 5-x.size], mode='constant')
Out[14]: array([1, 2, 3, 0, 0])