Would it be possible to use numpy/scipy to multiply matrices composed of polynomials?
Specifically, I wish to multiply a 120 by 120 sparse matrix, whose entries can look like a+7*b+c, by itself.
Honestly, I haven't tried very hard to do this. I see that there is a polynomial module in numpy but I have no experience with it. I am just hoping that someone sees this and says "obviously it's possible, do this".
There is one relevant question asked before from what I've seen: Matrices whose entries are polynomials
I don't know about sparse, but numpy object arrays work fine.
In [1]: import numpy as np; from numpy.polynomial import Polynomial as P
In [2]: a = np.array([[P([1,2]), P([3,4])]]*2)
In [3]: a
Out[3]:
array([[Polynomial([ 1., 2.], [-1, 1], [-1, 1]),
Polynomial([ 3., 4.], [-1, 1], [-1, 1])],
[Polynomial([ 1., 2.], [-1, 1], [-1, 1]),
Polynomial([ 3., 4.], [-1, 1], [-1, 1])]], dtype=object)
In [4]: np.dot(a, a)
Out[4]:
array([[Polynomial([ 4., 14., 12.], [-1., 1.], [-1., 1.]),
Polynomial([ 12., 34., 24.], [-1., 1.], [-1., 1.])],
[Polynomial([ 4., 14., 12.], [-1., 1.], [-1., 1.]),
Polynomial([ 12., 34., 24.], [-1., 1.], [-1., 1.])]], dtype=object)
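A self-contained version of the session above, as a minimal sketch. It fills an object array element by element (an assumption on my part, to avoid relying on how np.array interprets Polynomial objects) and then uses the ordinary dot product. Note that numpy.polynomial.Polynomial is univariate, so an entry such as a+7*b+c with several variables would need a different representation.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

# Build a 2x2 object array whose entries are polynomials in one variable.
a = np.empty((2, 2), dtype=object)
a[0, 0], a[0, 1] = P([1, 2]), P([3, 4])   # 1 + 2x, 3 + 4x
a[1, 0], a[1, 1] = P([1, 2]), P([3, 4])

# For object arrays, dot falls back to the entries' own * and +,
# so each entry of the product is again a Polynomial.
prod = a.dot(a)

top_left = prod[0, 0].coef    # coefficients of (1+2x)(1+2x) + (3+4x)(1+2x)
top_right = prod[0, 1].coef   # coefficients of (1+2x)(3+4x) + (3+4x)(3+4x)
print(top_left, top_right)
```

The coefficients should agree with Out[4] above: 4 + 14x + 12x^2 and 12 + 34x + 24x^2.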
Given tensor data
[[[ 0., 0.],
[ 1., 1.],
[-1., -1.]],
[[-1., -1.],
[ 4., 4.],
[ 5., 5.]]]
I want to remove [-1,-1] and get
[[[ 0., 0.],
[ 1., 1.]],
[[ 4., 4.],
[ 5., 5.]]]
How can I get the above without using the ragged tensor feature in TensorFlow?
You can try this:
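import tensorflow as tf

x = tf.constant(
    [[[ 0.,  0.],
      [ 1.,  1.],
      [-1., -1.]],
     [[-1., -1.],
      [ 4.,  4.],
      [ 5.,  5.]]])
# Keep a row unless both of its components equal -1
mask = tf.reduce_any(tf.math.not_equal(x, [-1., -1.]), axis=-1)
result = tf.boolean_mask(x, mask)
shape = tf.shape(x)
# Assumes every slice along axis 0 loses the same number of rows
result = tf.reshape(result, (shape[0], -1, shape[2]))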
You could do it like this:
import tensorflow as tf
import numpy as np
data = [[[ 0., 0.],
[ 1., 1.],
[-1., -1.]],
[[-1., -1.],
[ 4., 4.],
[ 5., 5.]]]
data = tf.constant(data)
indices = tf.math.not_equal(data, tf.constant([-1., -1.]))
res = data[indices]
shape = tf.shape(data)
total = tf.reduce_sum(
    tf.cast(tf.math.logical_and(indices[:, :, 0], indices[:, :, 1])[0], tf.int32))
res = tf.reshape(res, (shape[0], total, shape[-1]))
with tf.Session() as sess:
    print(sess.run(res))
# [[[0. 0.]
# [1. 1.]]
# [[4. 4.]
# [5. 5.]]]
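The masking logic can be checked in plain NumPy before moving it to TensorFlow. This is only a sketch of the idea, not either answer's code: it reduces the element-wise comparison over the last axis so that a row is dropped only when every component equals -1, and it assumes each slice along the first axis loses the same number of rows (otherwise the final reshape cannot work without ragged tensors).

```python
import numpy as np

x = np.array([[[ 0.,  0.],
               [ 1.,  1.],
               [-1., -1.]],
              [[-1., -1.],
               [ 4.,  4.],
               [ 5.,  5.]]])

# Element-wise comparison has the same shape as x: (2, 3, 2).
elementwise = x != -1

# Collapse the last axis: a row is kept if any component differs from -1.
row_mask = np.any(elementwise, axis=-1)      # shape (2, 3)

# Boolean indexing flattens the leading axes, giving shape (4, 2).
kept = x[row_mask]

# Restore the batch axis; valid only if every slice kept equally many rows.
result = kept.reshape(x.shape[0], -1, x.shape[-1])
print(result)
```

With an element-wise mask, a row such as [-1, 3] would lose only one of its two components, which is why the reduction over the last axis matters.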
I have a 2D tensor. I would like to take each vector in it and apply tf.tensordot(vector, matrix, axes=1) with the matrix that sits at the same index in a 3D tensor.
Essentially, I'd like the same result as I'd get with this for loop, but by doing tensorflow matrix operations rather than numpy and looping:
tensor2d = np.array([[1.,1.,1.,0.,0.],
[1.,1.,0.,0.,0.]],
np.float32)
tensor3d = np.array([
[
[1., 2., 3.],
[2., 2., 3.],
[3., 2., 3.],
[4., 2., 3.],
[5., 2., 3.],
],
[
[1., 2., 3.],
[2., 2., 3.],
[3., 2., 3.],
[4., 2., 3.],
[5., 2., 3.],
]
], np.float32)
results = []
for i in range(len(tensor2d)):
    results.append(np.tensordot(tensor2d[i], tensor3d[i], axes=1))
Output of this should be a matrix that looks like this (though types would be different):
[array([6., 6., 9.], dtype=float32), array([3., 4., 6.], dtype=float32)]
OK, the self-found answer boils down to using tf.math.multiply and messing around with transposes until the result has the desired shape. It would be great if someone could come up with a more principled answer at some point, but for now, this worked:
result = tf.transpose(tf.math.multiply(tensor2d, tensor3d.transpose([2,0,1])), [1,2,0])
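One more principled way to express this batched product is an einsum with the subscripts 'ij,ijk->ik'. The sketch below uses NumPy so it can be compared directly with the loop from the question; tf.einsum takes the same subscript string. As far as I can tell, the multiply-and-transpose expression above yields the element-wise products with shape (2, 5, 3), so a sum over axis 1 is still needed to reach the (2, 3) result; the last lines check that reading in NumPy.

```python
import numpy as np

tensor2d = np.array([[1., 1., 1., 0., 0.],
                     [1., 1., 0., 0., 0.]], np.float32)
block = np.array([[1., 2., 3.],
                  [2., 2., 3.],
                  [3., 2., 3.],
                  [4., 2., 3.],
                  [5., 2., 3.]], np.float32)
tensor3d = np.stack([block, block])          # shape (2, 5, 3), as in the question

# Reference: the loop from the question.
looped = np.stack([np.tensordot(v, m, axes=1)
                   for v, m in zip(tensor2d, tensor3d)])

# Batched form: for each i, sum over j of tensor2d[i, j] * tensor3d[i, j, k].
batched = np.einsum('ij,ijk->ik', tensor2d, tensor3d)

# Same thing spelled out with broadcasting: multiply, then sum over axis 1.
products = tensor2d[:, :, None] * tensor3d   # shape (2, 5, 3)
summed = products.sum(axis=1)                # shape (2, 3)
print(batched)
```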
I'm trying to understand how numpy.where works with a condition.
>>> import numpy as np
>>> x = np.arange(9.).reshape(3, 3)
>>> x
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
>>> np.where( x > 5 )
(array([2, 2, 2]), array([0, 1, 2]))
In the above case, what does the output actually mean? I can see array([0, 1, 2]) in the input, but what is array([2, 2, 2])?
The first array gives the row numbers and the second array gives the corresponding column numbers.
If the array is the following:
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
Then the following
(array([2, 2, 2]), array([0, 1, 2]))
Can be interpreted as
x[2, 0] => 6
x[2, 1] => 7
x[2, 2] => 8
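The pairing of the two index arrays can be made explicit with a short sketch: zip them to get (row, column) pairs, and pass them back as an index to recover the matching values. The variable names here are only illustrative.

```python
import numpy as np

x = np.arange(9.).reshape(3, 3)

# np.where with only a condition returns one index array per dimension.
rows, cols = np.where(x > 5)

# Reading the two arrays position by position gives (row, column) pairs.
pairs = list(zip(rows.tolist(), cols.tolist()))

# Indexing with both arrays picks out the elements at those positions.
values = x[rows, cols]

# np.argwhere returns the same positions as one array of [row, column] rows.
positions = np.argwhere(x > 5)
print(pairs, values)
```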
You might also want to see where those values sit in your array. In that case, you can return the array's value where the condition is True and a placeholder value where it is False. In the example below, the value of x is returned at each position where x > 5; everywhere else, -1 is assigned.
x = np.arange(9.).reshape(3, 3)
np.where(x>5, x, -1)
array([[-1., -1., -1.],
[-1., -1., -1.],
[ 6., 7., 8.]])
Three elements are found, located at (2,0), (2,1) and (2,2).
By the way, try help(np.where); it will help you a lot.
I'm trying to feed 1D numpy arrays (flattened images) via a generator into an h5py data file in order to create training and validation matrices.
The following code was adapted from a solution (I can't find it now) in which the data argument of create_dataset, a method of h5py's File object, is given the result of a call to np.fromiter, which takes a generator as one of its arguments.
from scipy.misc import imread
import h5py
import numpy as np
import os
# Creating h5 data file
f = h5py.File('../data.h5', 'w')
# Source directory for image data
src = '/datasets/aic540/train/images/'
# Showing quantity and dimensionality of data
images = os.listdir(src)
ex_img = imread(src + images[0])
flat_img = ex_img.flatten()
print "# of images is {}".format(len(images))
print "image shape is {}".format(ex_img.shape)
print "flattened image shape is {}".format(flat_img.shape)
# Creating generator to feed in data to h5py's `create_dataset` function
gen = (imread(src + i).flatten().astype(np.int8) for i in os.listdir(src))
# Creating h5 dataset
f.create_dataset(name='training',
#shape=(59482, 1555200),
data=np.fromiter(gen, dtype=np.int8))
Output:
# of images is 59482
image shape is (540, 960, 3)
flattened image shape is (1555200,)
Traceback (most recent call last):
File "process_images.py", line 30, in <module>
data=np.fromiter(gen, dtype=np.int8))
ValueError: setting an array element with a sequence.
While searching for this error in this context, I read that the problem is that np.fromiter() needs a list and not a generator (which seems contrary to what the name "fromiter" implies). Wrapping the generator in a list call, list(gen), allows the code to run, but it of course uses up all the memory expanding the list before create_dataset is ever called.
How do I use a generator to feed data into an H5py data file?
If my approach is entirely wrong, what is the correct way to build a very large numpy matrix that doesn't fit in memory -- using H5py or otherwise?
The "setting an array element with a sequence" error comes from what you are trying to feed to fromiter, not from the generator part.
In Python 3, range is generator-like:
In [15]: np.fromiter(range(3),dtype=int)
Out[15]: array([0, 1, 2])
In [16]: np.fromiter((2*x for x in range(3)),dtype=int)
Out[16]: array([0, 2, 4])
But if I start with a 2d array (which imread produces, right?), and create a generator expression as you do:
In [17]: gen = (np.ones((2,3)).flatten().astype(np.int8) for i in range(3))
In [18]: list(gen)
Out[18]:
[array([1, 1, 1, 1, 1, 1], dtype=int8),
array([1, 1, 1, 1, 1, 1], dtype=int8),
array([1, 1, 1, 1, 1, 1], dtype=int8)]
I generate a list of arrays.
In [19]: gen = (np.ones((2,3)).flatten().astype(np.int8) for i in range(3))
In [21]: np.fromiter(gen, np.int8)
...
ValueError: setting an array element with a sequence.
np.fromiter creates a 1d array from an iterator that provides 'numbers' one at a time, not something that dishes out lists or arrays.
In any case, np.fromiter creates a full array, not some sort of generator. There's nothing like an array 'generator'.
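To illustrate the point about np.fromiter wanting one number at a time, here is a small sketch that flattens a generator of arrays with itertools.chain.from_iterable before handing it over. The generator rows() is a hypothetical stand-in for the image generator. The result is still a full in-memory array, so this shows how the error goes away, not how to save memory.

```python
import itertools
import numpy as np

def rows():
    # Hypothetical stand-in for the image generator: three flat arrays.
    for i in range(3):
        yield np.full(6, i, dtype=np.int8)

# chain.from_iterable walks each array and yields its scalars one by one,
# which is the kind of input np.fromiter expects.
flat = np.fromiter(itertools.chain.from_iterable(rows()), dtype=np.int8)

# The row structure has to be restored afterwards.
matrix = flat.reshape(3, 6)
print(matrix)
```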
Even without chunking you can write data to the file by 'row' or other slice.
In [28]: f = h5py.File('test.h5', 'w')
In [29]: data = f.create_dataset(name='test',shape=(100,10))
In [30]: for i in range(100):
    ...:     data[i,:] = np.arange(i,i+10)
    ...:
In [31]: data
Out[31]: <HDF5 dataset "test": shape (100, 10), type "<f4">
The equivalent in your case is to load an image, reshape it, and write it immediately to the h5py dataset. No need to collect all the images in an array or list.
Read 10 rows:
In [33]: data[:10,:]
Out[33]:
array([[ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.],
[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
[ 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.],
[ 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.],
[ 4., 5., 6., 7., 8., 9., 10., 11., 12., 13.],
[ 5., 6., 7., 8., 9., 10., 11., 12., 13., 14.],
[ 6., 7., 8., 9., 10., 11., 12., 13., 14., 15.],
[ 7., 8., 9., 10., 11., 12., 13., 14., 15., 16.],
[ 8., 9., 10., 11., 12., 13., 14., 15., 16., 17.],
[ 9., 10., 11., 12., 13., 14., 15., 16., 17., 18.]], dtype=float32)
Enabling chunking might help with really large datasets, but I don't have experience in that area.
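Since the question also asks about options other than h5py, the same write-one-row-at-a-time pattern can be sketched with a NumPy memory-mapped file. This is a swapped-in alternative, not the method shown above, and the generator fake_images, the file name and the tiny sizes are all assumptions chosen so the example runs quickly. It uses uint8 on the assumption that the images hold 0 to 255 pixel values, which would not fit in int8.

```python
import os
import tempfile
import numpy as np

n_images, n_pixels = 4, 12   # tiny stand-ins for 59482 and 1555200

def fake_images(n, size):
    # Hypothetical replacement for reading and flattening image files.
    for i in range(n):
        yield np.full(size, i, dtype=np.uint8)

path = os.path.join(tempfile.mkdtemp(), 'training.dat')

# The array lives on disk; only the rows being touched need to be in memory.
out = np.memmap(path, dtype=np.uint8, mode='w+', shape=(n_images, n_pixels))
for row, img in enumerate(fake_images(n_images, n_pixels)):
    out[row, :] = img        # write each image as soon as it is produced
out.flush()
del out

# Reopen read-only; the shape and dtype must be supplied again.
back = np.memmap(path, dtype=np.uint8, mode='r', shape=(n_images, n_pixels))
first_column = back[:, 0].tolist()
print(first_column)
```

Unlike an HDF5 dataset, a raw memmap file stores no shape or dtype, so those have to be recorded somewhere else.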
Namely, rearranging rows, adding multiples of rows, and multiplying by scalars.
I don't see these methods defined in http://docs.scipy.org/doc/numpy/reference/generated/numpy.matrix.html or elsewhere.
And if they aren't defined, then why not?
Yes, you can manipulate array rows, adding and multiplying them. For example:
In [1]: import numpy as np
In [2]: m = np.ones((3, 4))
In [3]: m
Out[3]:
array([[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.]])
In [4]: m[1, :] = 2*m[1, :] # Multiply
In [5]: m
Out[5]:
array([[ 1., 1., 1., 1.],
[ 2., 2., 2., 2.],
[ 1., 1., 1., 1.]])
In [6]: m[0, :] = m[0, :] + 2*m[1, :] # Multiply and add
In [7]: m
Out[7]:
array([[ 5., 5., 5., 5.],
[ 2., 2., 2., 2.],
[ 1., 1., 1., 1.]])
In [8]: m[ (0, 2), :] = m[ (2, 0), :] # Swap rows
In [9]: m
Out[9]:
array([[ 1., 1., 1., 1.],
[ 2., 2., 2., 2.],
[ 5., 5., 5., 5.]])
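If you use these operations often, they are easy to wrap yourself. The helper names below (swap_rows, scale_row, add_multiple) are hypothetical, not part of NumPy; this is a minimal sketch that modifies the array in place and assumes a float array, since in-place scaling of an integer array by a float would fail.

```python
import numpy as np

def swap_rows(m, i, j):
    # Fancy indexing on the right makes a copy, so the swap is safe.
    m[[i, j], :] = m[[j, i], :]

def scale_row(m, i, c):
    # Multiply row i by the scalar c.
    m[i, :] *= c

def add_multiple(m, target, source, c):
    # Add c times row `source` to row `target`.
    m[target, :] += c * m[source, :]

# Replay the session above with the helpers.
m = np.ones((3, 4))
scale_row(m, 1, 2)           # row 1 becomes all 2s
add_multiple(m, 0, 1, 2)     # row 0 becomes 1 + 2*2 = 5
swap_rows(m, 0, 2)           # rows 0 and 2 trade places
print(m)
```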