I want to add two 3D (shape (N, 3)) numpy arrays conditionally where the condition is specified as a 1D array (shape of N).
What's an efficient (vectorized) way to do this? numpy.where() only supports a conditional operation where all three arrays (including the condition) have matching dimensions.
For instance:
a = np.asarray([[1, 1], [2, 2], [3, 3], [4, 4]])
b = np.asarray([[0.2, 0.2], [0.3, 0.3], [0.4, 0.4], [0.5, 0.5]])
c = np.asarray([0, 1, 1, 0])
I would like to be able to do:
np.where(c == 1, a + b, a)
i.e., do an element-wise add of a + b as long as the corresponding element in c is equal to 1 at the same index in the array.
However, I get an error instead:
ValueError: operands could not be broadcast together with shapes (4,) (4,3) (4,3)
Use boolean indexing:
import numpy as np
a = np.random.randint(20, 30, size=(5, 3))
#[[23 29 23]
# [20 27 24]
# [28 26 26]
# [27 20 26]
# [23 24 23]]
b = np.random.randint(20, 30, size=(5, 3))
#[[22 25 20]
# [28 29 20]
# [29 22 29]
# [28 28 21]
# [22 26 27]]
c = np.random.randint(0, 2, size=5).astype(bool)
# [ True True True False False]
r = a + b
r[~c] = a[~c] # keeping the default value if the corresponding value is 0 (or False) in c.
print(r)
[[45 54 43]
[48 56 44]
[57 48 55]
[27 20 26]
[23 24 23]]
Related
I'm unable to understand how NumPy calculates the inner product of two 2D matrices.
For example, this program:
mat = [[1, 2, 3, 4],
[5, 6, 7, 8]]
result = np.inner(mat, mat)
print('\n' + 'result: ')
print(result)
print('')
produces this output:
result:
[[ 30 70]
[ 70 174]]
How are these numbers calculated ??
Before somebody says "read the documentation" I did, https://numpy.org/doc/stable/reference/generated/numpy.inner.html, it's not clear to me from this how this result is calculated.
Before somebody says "check the Wikipedia article" I did, https://en.wikipedia.org/wiki/Frobenius_inner_product shows various math symbols I'm not familiar with and does not explain how a calculation such as the one above is performed.
Before somebody says "Google it", I did, most examples are for 1-d arrays (which is an easy calculation), and others like this video https://www.youtube.com/watch?v=_YtHyjcQ1gw produce a different result than NumPy does.
Any clarification would be greatly appreciated.
In [55]: mat = [[1, 2, 3, 4],
...: [5, 6, 7, 8]]
...:
In [56]: arr = np.array(mat)
In [58]: arr.dot(arr.T)
Out[58]:
array([[ 30, 70],
[ 70, 174]])
That's a matrix product of a (2,4) with a (4,2), resulting in a (2,2). This is the usual 'scan across the columns, down the rows' method.
A couple of other expressions that do this:
I like the expressiveness of einsum, where the sum-of-products is on the j dimension:
In [60]: np.einsum('ij,kj->ik',arr,arr)
Out[60]:
array([[ 30, 70],
[ 70, 174]])
With broadcasted elementwise multiplication and summation:
In [61]: (arr[:,None,:]*arr[None,:,:]).sum(axis=-1)
Out[61]:
array([[ 30, 70],
[ 70, 174]])
Without the sum, the products are:
In [62]: (arr[:,None,:]*arr[None,:,:])
Out[62]:
array([[[ 1, 4, 9, 16],
[ 5, 12, 21, 32]],
[[ 5, 12, 21, 32],
[25, 36, 49, 64]]])
Which are the values you discovered.
I finally found this site https://www.tutorialspoint.com/numpy/numpy_inner.htm which explains things a little better. The above is computed as follows:
(1*1)+(2*2)+(3*3)+(4*4) (1*5)+(2*6)+(3*7)+(4*8)
1 + 4 + 9 + 16 5 + 12 + 21 + 32
= 30 = 70
(5*1)+(6*2)+(7*3)+(8*4) (5*5)+(6*6)+(7*7)+(8*8)
5 + 12 + 21 + 32 25 + 36 + 49 + 64
= 70 = 174
I'd like to select elements from an array along a specific axis given an index array. For example, given the arrays
a = np.arange(30).reshape(5,2,3)
idx = np.array([0,1,1,0,0])
I'd like to select from the second dimension of a according to idx, such that the resulting array is of shape (5,3). Can anyone help me with that?
You could use fancy indexing
a[np.arange(5),idx]
Output:
array([[ 0, 1, 2],
[ 9, 10, 11],
[15, 16, 17],
[18, 19, 20],
[24, 25, 26]])
To make this more verbose this is the same as:
x,y,z = np.arange(a.shape[0]), idx, slice(None)
a[x,y,z]
x and y are being broadcasted to the shape (5,5). z could be used to select any columns in the output.
I think this gives the results you are after - it uses np.take_along_axis, but first you need to reshape your idx array so that it is also a 3d array:
a = np.arange(30).reshape(5, 2, 3)
idx = np.array([0, 1, 1, 0, 0]).reshape(5, 1, 1)
results = np.take_along_axis(a, idx, 1).reshape(5, 3)
Giving:
[[ 0 1 2]
[ 9 10 11]
[15 16 17]
[18 19 20]
[24 25 26]]
Question
Why are the numpy tuple indexing behaviors inconsistent? Please explain the rational or design decision behind these behaviors. In my understanding, Z[(0,2)] and Z[(0, 2), (0)] are both tuple indexing and expected the consistent behavior for copy/view. If this is incorrect, please explain,
import numpy as np
Z = np.arange(36).reshape(3, 3, 4)
print("Z is \n{}\n".format(Z))
b = Z[
(0,2) # Select Z[0][2]
]
print("Tuple indexing Z[(0,2)] is \n{}\nIs view? {}\n".format(
b,
b.base is not None
))
c = Z[ # Select Z[0][0][1] & Z[0][2][1]
(0,2),
(0)
]
print("Tuple indexing Z[(0, 2), (0)] is \n{}\nIs view? {}\n".format(
c,
c.base is not None
))
Z is
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]
[[24 25 26 27]
[28 29 30 31]
[32 33 34 35]]]
Tuple indexing Z[(0,2)] is
[ 8 9 10 11]
Is view? True
Tuple indexing Z[(0, 2), (0)] is
[[ 0 1 2 3]
[24 25 26 27]]
Is view? False
Numpy indexing is confusing and wonder how people built the understanding. If there is a good way to understand or cheat-sheets, please advise.
It's the comma that creates a tuple. The () just set boundaries where needed.
Thus
Z[(0,2)]
Z[0,2]
are the same, select on the first 2 dimension. Whether that returns an element, or an array depends on how many dimensions Z has.
The same interpretation applies to the other case.
Z[(0, 2), (0)]
Z[( np.array([0,2]), 0)]
Z[ np.array([0,2]), 0]
are the same - the first dimensions is indexed with a list/array, and thus is advanced indexing. It's a copy.
[ 8 9 10 11]
is a row of the 3d array; its a contiguous block of Z
[[ 0 1 2 3]
[24 25 26 27]]
is 2 rows from Z. They aren't contiguous, so there's no way of identifying them with just shape and strides (and offset in the databuffer).
details
__array_interface__ gives details about the underlying data of an array
In [146]: Z = np.arange(36).reshape(3,3,4)
In [147]: Z.__array_interface__
Out[147]:
{'data': (38255712, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (3, 3, 4),
'version': 3}
In [148]: Z.strides
Out[148]: (96, 32, 8)
For the view:
In [149]: Z1 = Z[0,2]
In [150]: Z1
Out[150]: array([ 8, 9, 10, 11])
In [151]: Z1.__array_interface__
Out[151]:
{'data': (38255776, False), # 38255712+8*8
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (4,),
'version': 3}
The data buffer pointer is 8 elements further along in Z buffer. Shape is much reduced.
In [152]: Z2 = Z[[0,2],0]
In [153]: Z2
Out[153]:
array([[ 0, 1, 2, 3],
[24, 25, 26, 27]])
In [154]: Z2.__array_interface__
Out[154]:
{'data': (31443104, False), # an entirely different location
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (2, 4),
'version': 3}
Z2 is the same as two selections:
In [158]: Z[0,0]
Out[158]: array([0, 1, 2, 3])
In [159]: Z[2,0]
Out[159]: array([24, 25, 26, 27])
It is not
Z[0][0][1] & Z[0][2][1]
Z[0,0,1] & Z[0,2,1]
Compare that with a 2 row slice:
In [156]: Z3 = Z[0:2,0]
In [157]: Z3.__array_interface__
Out[157]:
{'data': (38255712, False), # same as Z's
'strides': (96, 8),
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (2, 4),
'version': 3}
A view is returned if the new array can be described with shape, strides and all or part of the original data buffer.
My question is how to make a vandermonde matrix. This is the definition:
In linear algebra, a Vandermonde matrix, named after Alexandre-Théophile Vandermonde, is a matrix with the terms of a geometric progression in each row, i.e., an m × n matrix
I would like to make a 4*4 version of this.
So farI have defined values but only for one row as follows
a=2
n=4
for a in range(n):
for i in range(n):
v.append(a**i)
v = np.array(v)
print(v)
I dont know how to scale this. Please help!
Given a starting column a of length m you can create a Vandermonde matrix v with n columns a**0 to a**(n-1)like so:
import numpy as np
m = 4
n = 4
a = range(1, m+1)
v = np.array([a]*n).T**range(n)
print(v)
#[[ 1 1 1 1]
# [ 1 2 4 8]
# [ 1 3 9 27]
# [ 1 4 16 64]]
As proposed by michael szczesny you could use numpy.vander.
But this will not be according to the definition on Wikipedia.
x = np.array([1, 2, 3, 5])
N = 4
np.vander(x, N)
#array([[ 1, 1, 1, 1],
# [ 8, 4, 2, 1],
# [ 27, 9, 3, 1],
# [125, 25, 5, 1]])
So, you'd have to use numpy.fliplr aswell:
x = np.array([1, 2, 3, 5])
N = 4
np.fliplr(np.vander(x, N))
#array([[ 1, 1, 1, 1],
# [ 1, 2, 4, 8],
# [ 1, 3, 9, 27],
# [ 1, 5, 25, 125]])
This could also be achieved without numpy using nested list comprehensions:
x = [1, 2, 3, 5]
N = 4
[[xi**i for i in range(N)] for xi in x]
# [[1, 1, 1, 1],
# [1, 2, 4, 8],
# [1, 3, 9, 27],
# [1, 5, 25, 125]]
# Vandermonde Matrix
def Vandermonde_Matrix(D, k):
'''
D = {(x_i,y_i): 0<=i<=n}
----------------
k degree
'''
n = len(D)
V = np.zeros(shape=(n, k))
for i in range(n):
V[i] = np.power(np.array(D[i][0]), np.arange(k))
return V
I wonder whether there are some ways to apply constraints on the batches to generate in Tensorflow. For example, let's say we are training a CNN on a huge dataset to do image classification. Is it possible to force Tensorflow to generate batches where all samples are with the same class? Like, one batch of images all tagged with "Apple", the other one where samples all tagged with "Orange".
The reason I ask this question is I want to do some experiments to see how different levels of shuffling influence the final trained models. It's common practice to do sample-level shuffling for CNN training, and everybody is doing it. I just want to check it myself, thus obtaining a more vivid and first-hand knowledge about it.
Thanks!
Dataset.filter() can be used:
labels = np.random.randint(0, 10, (10000))
data = np.random.uniform(size=(10000, 5))
ds = tf.data.Dataset.from_tensor_slices((data, labels))
ds = ds.filter(lambda data, labels: tf.equal(labels, 1)) #comment this line out for unfiltered case
ds = ds.batch(5)
iterator = ds.make_one_shot_iterator()
vals = iterator.get_next()
with tf.Session() as sess:
for _ in range(5):
py_data, py_labels = sess.run(vals)
print(py_labels)
with ds.filter():
> [1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
without ds.filter():
> [8 0 7 6 3]
[2 4 7 6 1]
[1 8 5 5 5]
[7 1 7 4 0]
[7 1 8 0 0]
Edit. The following code shows how to use a feedable iterator to perform batch label selection on the fly. See "Creating an iterator"
labels = ['Apple'] * 100 + ['Orange'] * 100
data = list(range(200))
random.shuffle(labels)
batch_size = 4
ds_apple = tf.data.Dataset.from_tensor_slices((data, labels)).filter(
lambda data, label: tf.equal(label, 'Apple')).batch(batch_size)
ds_orange = tf.data.Dataset.from_tensor_slices((data, labels)).filter(
lambda data, label: tf.equal(label, 'Orange')).batch(batch_size)
handle = tf.placeholder(tf.string, [])
iterator = tf.data.Iterator.from_string_handle(
handle, ds_apple.output_types, ds_apple.output_shapes)
batch = iterator.get_next()
apple_iterator = ds_apple.make_one_shot_iterator()
orange_iterator = ds_orange.make_one_shot_iterator()
with tf.Session() as sess:
apple_handle = sess.run(apple_iterator.string_handle())
orange_handle = sess.run(orange_iterator.string_handle())
# loop and switch back and forth between apples and oranges
for _ in range(3):
feed_dict = {handle: apple_handle}
print(sess.run(batch, feed_dict=feed_dict))
feed_dict = {handle: orange_handle}
print(sess.run(batch, feed_dict=feed_dict))
Typical output for this is as follows. Note that the data values increase monotonically across Apple and Orange batches showing that the iterators are not resetting.
> (array([2, 3, 6, 7], dtype=int32), array([b'Apple', b'Apple', b'Apple', b'Apple'], dtype=object))
(array([0, 1, 4, 5], dtype=int32), array([b'Orange', b'Orange', b'Orange', b'Orange'], dtype=object))
(array([ 9, 13, 15, 19], dtype=int32), array([b'Apple', b'Apple', b'Apple', b'Apple'], dtype=object))
(array([ 8, 10, 11, 12], dtype=int32), array([b'Orange', b'Orange', b'Orange', b'Orange'], dtype=object))
(array([21, 22, 23, 25], dtype=int32), array([b'Apple', b'Apple', b'Apple', b'Apple'], dtype=object))
(array([14, 16, 17, 18], dtype=int32), array([b'Orange', b'Orange', b'Orange', b'Orange'], dtype=object))