Multiply each channel by a different matrix? - tensorflow

Is there a way, in tensorflow, to multiply each channel by a different matrix?
Imagine you have a 2D array A of dimensions (N, D1).
You can multiply it by an array B of size (D1, D2) to get output size (N, D2).
Now imagine A is instead a 3D array of dimensions (N, D1, 3).
Suppose you had B1, B2, B3, all of size (D1, D2). Combining the per-channel outputs A[:, :, 0] * B1, A[:, :, 1] * B2, A[:, :, 2] * B3, you could form an array of size (N, D2, 3).
But is there a way to get an output size of (N, D2, 3) by just doing multiplication once?
I looked into tf.transpose and tf.matmul, but they don't seem to work for this purpose.
Thank you!

tf.einsum() could be applied here.
To make the code below easier to understand, I renamed D1 = O and D2 = P.
import tensorflow as tf

N, O, P = 2, 4, 5  # example sizes; any values work
A = tf.random_normal([N, O, 3])
B = tf.random_normal([O, P, 3])  # equivalently: B = tf.stack([B1, B2, B3], axis=2)
res = tf.einsum("noi,opi->npi", A, B)  # res.shape = (N, P, 3)
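To sanity-check the einsum against the per-channel products described in the question, here is a small verification sketch (the Python loop is for checking only, not part of the approach):
import numpy as np

# Reference result: multiply channel i of A by channel i of B, then stack.
ref = tf.stack([tf.matmul(A[:, :, i], B[:, :, i]) for i in range(3)], axis=2)
with tf.Session() as sess:
    r1, r2 = sess.run([res, ref])
    print(np.allclose(r1, r2))  # True: both compute the same per-channel products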

You could use tf.matmul here; it's just that you will have to transpose the dimensions.
Consider N = 2, D1 = 4, D2 = 5. First create two tensors with shapes N x D1 x 3 and D1 x D2 x 3.
import numpy as np
import tensorflow as tf

a = tf.constant(np.arange(1, 25, dtype=np.int32), shape=[2, 4, 3])
b = tf.constant(np.arange(1, 61, dtype=np.int32), shape=[4, 5, 3])
Transpose both so that the channel dimension (3) comes first; tf.matmul then treats it as a batch dimension.
a = tf.transpose(a, (2, 0, 1)) # a.shape = (3, 2, 4)
b = tf.transpose(b, (2, 0, 1)) # b.shape = (3, 4, 5)
Perform the multiplication as usual.
r = tf.matmul(a,b) # r.shape = (3, 2, 5)
r = tf.transpose(r, (1, 2, 0)) # r.shape = (2, 5, 3)
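As a cross-check, this is the same computation as the einsum in the first answer; a quick sketch using the already-transposed a and b from above:
# a is (3, 2, 4) = (i, n, o) and b is (3, 4, 5) = (i, o, p) after the transposes.
r_einsum = tf.einsum("ino,iop->npi", a, b)
with tf.Session() as sess:
    r1, r2 = sess.run([r, r_einsum])
    print((r1 == r2).all())  # True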
Hope this helps.

Related

how to read the shape of the numpy array after numpy stack

I am doing a stack operation on two 2D arrays using numpy.
a = np.random.randint(1, 5, size=(4, 4))
b = np.random.randint(6, 10, size=(4, 4))
f = np.stack((a, b), axis=2)
I checked the shape of the f array.
f.shape
(4, 4, 2)
In the obtained shape (4, 4, 2), what do the first 4, the second 4, and the third element 2 represent?
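For reference, the axes can be checked directly: the first 4 is the row count of a and b, the second 4 is the column count, and the 2 indexes the two stacked arrays along axis=2. A quick sketch using the arrays above:
import numpy as np

a = np.random.randint(1, 5, size=(4, 4))
b = np.random.randint(6, 10, size=(4, 4))
f = np.stack((a, b), axis=2)
print(np.array_equal(f[:, :, 0], a))  # True: index 0 on the last axis recovers a
print(np.array_equal(f[:, :, 1], b))  # True: index 1 on the last axis recovers b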

Algorithm to define a 2d grid

Suppose a grid is defined by a set of grid parameters: its origin (x0, y0), an angle between one side and the x axis, and increments along the two sides.
There are scattered points with known coordinates on the grid but they don’t exactly fall on grid intersections. Is there an algorithm to find a set of grid parameters to define the grid so that the points are best fit to grid intersections?
Suppose the known coordinates are:
(2 , 5.464), (3.732, 6.464), (5.464, 7.464)
(3 , 3.732), (4.732, 4.732), (6.464, 5.732)
(4 , 2 ), (5.732, 3 ), (7.464, 4 ).
I expect the algorithm to find the origin (4, 2), the angle 30 degree, and the increments both 2.
You can solve the problem by finding a matrix that transforms points from positions (0, 0), (0, 1), ... (2, 2) onto the given points.
Although the grid has only 5 degrees of freedom (position of the origin + angle + scale), it is easier to define the transformation using a 2x3 matrix A, because the problem can then be made linear.
Let a point with index (x0, y0) be transformed into point (x0', y0') on the grid, for example (0, 0) -> (2, 5.464), and let a_ij be the coefficients of matrix A. Then this pair of points yields 2 equations:
a_00 * x0 + a_01 * y0 + a_02 = x0'
a_10 * x0 + a_11 * y0 + a_12 = y0'
The unknowns are a_ij, so these equations can be written in the form
a_00 * x0 + a_01 * y0 + a_02 + a_10 * 0 + a_11 * 0 + a_12 * 0 = x0'
a_00 * 0 + a_01 * 0 + a_02 * 0 + a_10 * x0 + a_11 * y0 + a_12 = y0'
or in matrix form
K0 * (a_00, a_01, a_02, a_10, a_11, a_12)^T = (x0', y0')^T
where
K0 = (
x0, y0, 1, 0, 0, 0
0, 0, 0, x0, y0, 1
)
These equations for each pair of points can be combined into a single equation
K * (a_00, a_01, a_02, a_10, a_11, a_12)^T = (x0', y0', x1', y1', ..., xn', yn')^T
or K * a = b
where
K = (
x0, y0, 1, 0, 0, 0
0, 0, 0, x0, y0, 1
x1, y1, 1, 0, 0, 0
0, 0, 0, x1, y1, 1
...
xn, yn, 1, 0, 0, 0
0, 0, 0, xn, yn, 1
)
and (xi, yi), (xi', yi') are pairs of corresponding points.
This can be solved as a non-homogeneous system of linear equations. In that case the solution minimizes the sum of squared distances from each point to its nearest grid intersection. The transform can also be seen as maximizing the overall likelihood under the assumption that points are shifted from grid intersections by normally distributed noise.
a = (K^T * K)^-1 * K^T * b
This algorithm is easy to implement if a linear algebra library is available. Below is an example in Python:
import math
import numpy as np

n_points = 9
aligned_points = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
grid_points = [(2, 5.464), (3.732, 6.464), (5.464, 7.464), (3, 3.732), (4.732, 4.732), (6.464, 5.732), (4, 2), (5.732, 3), (7.464, 4)]
K = np.zeros((n_points * 2, 6))
b = np.zeros(n_points * 2)
for i in range(n_points):
    K[i * 2, 0] = aligned_points[i][0]
    K[i * 2, 1] = aligned_points[i][1]
    K[i * 2, 2] = 1
    K[i * 2 + 1, 3] = aligned_points[i][0]
    K[i * 2 + 1, 4] = aligned_points[i][1]
    K[i * 2 + 1, 5] = 1
    b[i * 2] = grid_points[i][0]
    b[i * 2 + 1] = grid_points[i][1]
# '@' is matrix multiplication: a = (K^T * K)^-1 * K^T * b
a = np.linalg.inv(np.transpose(K) @ K) @ np.transpose(K) @ b
A = a.reshape(2, 3)
print(A)
[[ 1.     1.732  2.   ]
 [-1.732  1.     5.464]]
Then the parameters can be extracted from this matrix:
theta = math.degrees(math.atan2(A[1, 0], A[0, 0]))
scale_x = math.sqrt(A[1, 0] ** 2 + A[0, 0] ** 2)
scale_y = math.sqrt(A[1, 1] ** 2 + A[0, 1] ** 2)
origin_x = A[0, 2]
origin_y = A[1, 2]
theta = -59.99927221917264
scale_x = 1.99995599951599
scale_y = 1.9999559995159895
origin_x = 1.9999999999999993
origin_y = 5.464
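As a side note, the explicit normal-equation inverse above can be swapped for a least-squares solver, which is numerically more robust; a one-line equivalent using the same K and b:
# Equivalent to a = (K^T K)^-1 K^T b without forming the inverse explicitly.
a, *_ = np.linalg.lstsq(K, b, rcond=None)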
However there remains a minor issue: matrix A corresponds to an affine transform. This means that the grid axes are not guaranteed to be perpendicular. If this is a problem, the first two columns of the matrix can be modified in such a way that the transform preserves angles.
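One way to do that (a sketch, not part of the original answer) is to replace the 2x2 linear part of A with the nearest rotation scaled uniformly, via the SVD:
# Polar decomposition: U @ Vt is the closest orthogonal matrix to A[:, :2].
U, s, Vt = np.linalg.svd(A[:, :2])
rotation = U @ Vt
scale = s.mean()  # collapse the two singular values into a single scale
A_similarity = np.hstack([scale * rotation, A[:, 2:]])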
Update: I fixed the mistakes and resolved sign ambiguities, so now this algorithm produces the expected result. However it should be tested to see if all cases are handled correctly.
Here is another attempt to solve this problem. The idea is to decompose the transformation into a non-uniform scaling matrix and a rotation matrix, A = R * S, and then solve for the coefficients sx, sy, r1, r2 of these matrices given the restriction that r1^2 + r2^2 = 1. The minimization problem is described here: How to find a transformation (non-uniform scaling and similarity) that maps one set of points to another?
import math

def shift_points(points):
    n_points = len(points)
    shift = tuple(sum(coords) / n_points for coords in zip(*points))
    shifted_points = [(point[0] - shift[0], point[1] - shift[1]) for point in points]
    return shifted_points, shift

n_points = 9
aligned_points = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
grid_points = [(2, 5.464), (3.732, 6.464), (5.464, 7.464), (3, 3.732), (4.732, 4.732), (6.464, 5.732), (4, 2), (5.732, 3), (7.464, 4)]
aligned_points, aligned_shift = shift_points(aligned_points)
grid_points, grid_shift = shift_points(grid_points)
c1, c2 = 0, 0
b11, b12, b21, b22 = 0, 0, 0, 0
for i in range(n_points):
    c1 += aligned_points[i][0] ** 2   # sum of squared x coordinates
    c2 += aligned_points[i][1] ** 2   # sum of squared y coordinates
    b11 -= 2 * aligned_points[i][0] * grid_points[i][0]
    b12 -= 2 * aligned_points[i][1] * grid_points[i][0]
    b21 -= 2 * aligned_points[i][0] * grid_points[i][1]
    b22 -= 2 * aligned_points[i][1] * grid_points[i][1]
k = (b11 ** 2 * c2 + b22 ** 2 * c1 - b21 ** 2 * c2 - b12 ** 2 * c1) / \
    (b21 * b11 * c2 - b12 * b22 * c1)
# r1_sqr and r2_sqr might need to be swapped
r1_sqr = 2 / (k ** 2 + 4 + k * math.sqrt(k ** 2 + 4))
r2_sqr = 2 / (k ** 2 + 4 - k * math.sqrt(k ** 2 + 4))
for sign1, sign2 in [(1, 1), (-1, 1), (1, -1), (-1, -1)]:
    r1 = sign1 * math.sqrt(r1_sqr)
    r2 = sign2 * math.sqrt(r2_sqr)
    scale_x = -b11 / (2 * c1) * r1 - b21 / (2 * c1) * r2
    scale_y = b12 / (2 * c2) * r2 - b22 / (2 * c2) * r1
    if scale_x >= 0 and scale_y >= 0:
        break
theta = math.degrees(math.atan2(r2, r1))
There might be ambiguities in choosing r1_sqr and r2_sqr. The origin can be estimated from aligned_shift and grid_shift, but I didn't implement that yet.
theta = -59.99927221917264
scale_x = 1.9999559995159895
scale_y = 1.9999559995159895
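For completeness, the unimplemented origin estimate could look like this (a sketch, assuming A = R * S with the r1, r2, scale_x, scale_y solved above; the origin is the image of aligned point (0, 0)):
import numpy as np

R = np.array([[r1, -r2], [r2, r1]])  # rotation, since r1^2 + r2^2 = 1
S = np.diag([scale_x, scale_y])      # non-uniform scaling
M = R @ S
# Centering gave grid - grid_shift ~= M @ (aligned - aligned_shift),
# so the image of the aligned point (0, 0) is:
origin = np.array(grid_shift) - M @ np.array(aligned_shift)
print(origin)  # approximately [2.0, 5.464], matching the first answer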

6th-order polynomial regression: no result

I just learnt about tensorflow. To become more familiar with the syntax, I built a toy model to perform polynomial regression.
The toy dataset that I created is
import numpy as np

x_data = np.linspace(-1, 1, 300) + np.random.uniform(-0.05, 0.05, 300)
y_data = np.linspace(-1, 1, 300) ** 2 + np.random.uniform(-0.05, 0.05, 300)
The model that I built is
batch_size = 20
x = tf.placeholder(tf.float64, [1, batch_size])
y = tf.placeholder(tf.float64, [1, batch_size])
a0 = tf.Variable(np.random.rand(1))
a1 = tf.Variable(np.random.rand(1))
a2 = tf.Variable(np.random.rand(1))
a3 = tf.Variable(np.random.rand(1))
a4 = tf.Variable(np.random.rand(1))
a5 = tf.Variable(np.random.rand(1))
a6 = tf.Variable(np.random.rand(1))
op = a6 * x ** 6 + a5 * x ** 5 + a4 * x ** 4 + a3 * x ** 3 + a2 * x ** 2 + a1 * x ** 1 + a0
error = tf.reduce_sum(tf.square(op - y))
init = tf.global_variables_initializer()
optimizer = tf.train.GradientDescentOptimizer(0.0001)
train = optimizer.minimize(error)
sess = tf.Session()
steps = 100000
sess.run(init)
for i in range(steps):
    rand_int = np.random.randint(0, 300, batch_size)
    x_temp = x_data[rand_int].reshape(1, batch_size)
    y_temp = y_data[rand_int].reshape(1, batch_size)
    feed = {x: x_temp, y: y_temp}
    sess.run(train, feed)
a0, a1, a2, a3, a4, a5, a6 = sess.run([a0, a1, a2, a3, a4, a5, a6])
However, after running the model, the result I got is:
[a0, a1, a2, a3, a4, a5, a6] = [array([ nan]), array([ nan]), array([ nan]), array([ nan]), array([ nan]), array([ nan]), array([ nan])]
Why did the model learn nothing? I've reduced the learning rate by an order of magnitude, yet the outcome is still the same.
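One hedged debugging step for a NaN like this (a sketch, not a recorded answer from the thread): fetch the loss alongside the train op to see whether it grows before turning into NaN, which separates divergence from a data or dtype problem:
# Sketch: monitor the loss to locate where NaN first appears.
for i in range(steps):
    rand_int = np.random.randint(0, 300, batch_size)
    feed = {x: x_data[rand_int].reshape(1, batch_size),
            y: y_data[rand_int].reshape(1, batch_size)}
    _, err = sess.run([train, error], feed)
    if i % 10000 == 0 or not np.isfinite(err):
        print(i, err)
        if not np.isfinite(err):
            break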

Pick random tensors from another one in Tensorflow

I have a Tensor X with shape [B, L, E] (say, B batches of L vectors of length E). From this Tensor X, I want to randomly pick N vectors in each batch, to create Y with shape [B, N, E].
I tried to combine tf.random_uniform and tf.gather, but I really struggle with the dimensions and can't get Y.
You can use something like this:
import tensorflow as tf
import numpy as np
B = 3
L = 5
E = 2
N = 3
input = np.array(range(B * L * E)).reshape([B, L, E])
print(input)
print("#################################")
X = tf.constant(input)
batch_range = tf.tile(tf.reshape(tf.range(B, dtype=tf.int32), shape=[B, 1, 1]), [1, N, 1])
# maxval is exclusive for integer dtypes, so use L (not L - 1) so every position can be picked
random = tf.random_uniform([B, N, 1], minval=0, maxval=L, dtype=tf.int32)
indices = tf.concat([batch_range, random], axis=2)
output = tf.gather_nd(X, indices)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(indices))
    print("#################################")
    print(sess.run(output))
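Note that this samples with replacement, so the same vector can be picked more than once within a batch. If distinct picks are needed, one hedged alternative (a sketch reusing batch_range and X from above) shuffles the positions per batch and takes the first N:
# Sketch: N distinct positions per batch via per-batch shuffles.
shuffled = tf.stack([tf.random_shuffle(tf.range(L, dtype=tf.int32)) for _ in range(B)])
random_distinct = tf.reshape(shuffled[:, :N], [B, N, 1])
indices_distinct = tf.concat([batch_range, random_distinct], axis=2)
output_distinct = tf.gather_nd(X, indices_distinct)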

Tensorflow - tf.matmul of conv features and a vector as a batch matmul

I tried the following code
batch_size = 128
c1 = tf.zeros([128,32,32,16])
c2 = tf.zeros([128,32,32,16])
c3 = tf.zeros([128,32,32,16])
c = tf.stack([c1, c2, c3], 4)  # shape: [128, 32, 32, 16, 3]
alpha = tf.zeros([128,3,1])
M = tf.matmul(c,alpha)
This raises an error at the tf.matmul call.
What I want is just the linear combination alpha[0]*c1 + alpha[1]*c2 + alpha[2]*c3 for each sample. When the batch size is 1 this code is fine, but when it is not, how can I do it?
Should I reshape c1, c2, c3?
I think this code works; I verified it.
import tensorflow as tf

batch_size = 128
c1 = tf.ones([128, 32, 32, 16])
c2 = tf.ones([128, 32, 32, 16])
c3 = tf.ones([128, 32, 32, 16])
c = tf.stack([c1, c2, c3], 4)  # shape: [128, 32, 32, 16, 3]

# Build a test alpha of shape [128, 3] whose row j is [j, j, j].
alpha = tf.zeros([1, 3])
for j in range(127):
    z = alpha[j] + 1
    z = tf.expand_dims(z, 0)
    alpha = tf.concat([alpha, z], 0)

# Per-sample linear combination over the last (channel) axis.
M = tf.einsum('aijkl,al->aijk', c, alpha)

with tf.Session() as sess:
    _alpha = sess.run(alpha)
    _M = sess.run(M)
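To address the question's original tf.matmul attempt directly: reshaping works as well. A sketch, assuming the c above and an alpha reshaped to [128, 3, 1] as in the question:
# Flatten the spatial dims, batch-matmul against alpha, then restore the shape.
alpha3 = tf.reshape(alpha, [128, 3, 1])          # [B, 3, 1]
c_flat = tf.reshape(c, [128, 32 * 32 * 16, 3])   # [B, H*W*C, 3]
M2 = tf.reshape(tf.matmul(c_flat, alpha3), [128, 32, 32, 16])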