Logical AND/OR in Keras Backend - tensorflow

Tensorflow has tf.logical_and() and tf.logical_or() for comparison of two boolean tensors, i.e. tf.logical_and(x,y)==TRUE if x==TRUE and y==TRUE (doc). I can't find anything like this in the Keras backend though. They have keras.backend.any() and .all(), but this is for aggregation within a tensor, not between. I've been having to use workarounds with nested K.switch() functions, but it is painfully inelegant.

Let x and y be boolean keras tensors of the same shape.
To take elementwise or, do the following:
keras.backend.any(keras.backend.stack([x, y], axis=0), axis=0)
To take elementwise and, do the following:
keras.backend.all(keras.backend.stack([x, y], axis=0), axis=0)
Here keras.backend.stack([x, y], axis=0) stacks x and y into a new tensor with an additional dimension at axis 0. After that, keras.backend.any takes the logical or along the new dimension, and keras.backend.all takes the logical and.
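The stack-then-reduce idea can be checked outside Keras. This is a minimal sketch in NumPy rather than the Keras backend (an assumption made so it runs without a session); np.stack, np.any and np.all stand in for the keras.backend functions of the same names.

```python
import numpy as np

x = np.array([True, True, False, False])
y = np.array([True, False, True, False])

# Stack along a new leading axis: shape (2, 4).
stacked = np.stack([x, y], axis=0)

# Reduce over the new axis to combine x and y element by element.
elementwise_or = np.any(stacked, axis=0)   # True, True, True, False
elementwise_and = np.all(stacked, axis=0)  # True, False, False, False
```

The same pattern extends to more than two tensors by adding them to the stacked list.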

My solution (perhaps not the best, because I haven't found others either), is:
A = K.cast(someBooleanTensor, K.floatx())
B = K.cast(anotherBooleanTensor, K.floatx())
A_and_B = A * B #this is also something I use a lot for gathering elements
A_or_B = 1 -((1-A)*(1-B))
But thinking about it now... I never tested python operators... perhaps they work?
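To see that the arithmetic trick reproduces the truth tables, here is a small sketch in NumPy (an assumption; astype plays the role of K.cast). In NumPy the & and | operators also work on boolean arrays; whether the same holds for Keras tensors depends on the backend, so that part is not shown.

```python
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

# Cast booleans to floats: True -> 1.0, False -> 0.0.
A = a.astype(np.float32)
B = b.astype(np.float32)

A_and_B = A * B                  # 1 only where both are 1
A_or_B = 1 - (1 - A) * (1 - B)   # 0 only where both are 0
```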

Related

using gather on argmax is different than taking max

I'm trying to train a double-DQN algorithm in TensorFlow and it doesn't work. To make sure everything is fine, I wanted to check that using tf.gather on the argmax is exactly the same as taking the max. Let's say I have a network called target_network:
first let's take the max:
next_qvalues_target1 = target_network.get_symbolic_qvalues(next_obs_ph) #returns tensor of qvalues
next_state_values_target1 = tf.reduce_max(next_qvalues_target1, axis=1)
let's try it in a different way- using argmax and gather:
next_qvalues_target2 = target_network.get_symbolic_qvalues(next_obs_ph) #returns same tensor of qvalues
chosen_action = tf.argmax(next_qvalues_target2, axis=1)
next_state_values_target2 = tf.gather(next_qvalues_target2, chosen_action)
diff = tf.reduce_sum(next_state_values_target1) - tf.reduce_sum(next_state_values_target2)
next_state_values_target2 and next_state_values_target1 are supposed to be completely identical, so running the session should output diff = 0. But it does not.
What am I missing?
Thanks.
Found out what went wrong. chosen_action holds one action index per row, so I thought that using gather on a tensor of shape (n, 4) would give one value per row. It turns out this isn't true, because gather takes complete rows. I needed to turn chosen_action into a tensor of shape (n, 2): instead of [action1, action2, action3, ...] it has to be [[0, action1], [1, action2], [2, action3], ...] (row indices start at 0), and then use gather_nd instead of gather to take specific elements from next_qvalues_target2.
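The difference between taking whole rows and taking one element per row can be reproduced in NumPy. This is a sketch under the assumption that plain integer indexing behaves like tf.gather and paired (row, column) indexing like tf.gather_nd; the Q-values below are made up for illustration.

```python
import numpy as np

# Made-up Q-values for n = 3 states and 4 actions.
q = np.array([[0.1, 0.9, 0.3, 0.2],
              [0.5, 0.4, 0.8, 0.1],
              [0.7, 0.2, 0.6, 0.3]])

actions = np.argmax(q, axis=1)          # best action per row

# Analogue of gather: indexing with actions selects complete rows.
rows = q[actions]                       # shape (3, 4), not (3,)

# Analogue of gather_nd: pair each row index with its action.
idx = np.stack([np.arange(len(q)), actions], axis=1)   # shape (3, 2)
picked = q[idx[:, 0], idx[:, 1]]        # shape (3,), one value per row
```

Only picked agrees with q.max(axis=1); rows is a reshuffled copy of whole rows, which is why the two sums in the question differ.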

no broadcasting for tf.matmul in tensorflow for 4D 3D tensors

First, I found another question here: No broadcasting for tf.matmul in TensorFlow
But that question does not solve my problem.
My problem is multiplying a batch of matrices by a batch of vectors.
x=tf.placeholder(tf.float32,shape=[10,1000,3,4])
y=tf.placeholder(tf.float32,shape=[1000,4])
x is a batch of matrices. There are 10*1000 matrices, each of shape [3,4].
y is a batch of vectors. There are 1000 vectors, each of shape [4].
Dim 1 of x and dim 0 of y are the same (here, 1000).
If tf.matmul supported broadcasting, I could write
y=tf.reshape(y,[1,1000,4,1])
result=tf.matmul(x,y)
result=tf.reshape(result,[10,1000,3])
But tf.matmul does not support broadcasting
If I use the approach of the question I referenced above
x=tf.reshape(x,[10*1000*3,4])
y=tf.transpose(y,perm=[1,0]) #[4,1000]
result=tf.matmul(x,y)
result=tf.reshape(result,[10,1000,3,1000])
The result is of shape [10,1000,3,1000], not [10,1000,3].
I don't know how to remove the redundant 1000.
How can I get the same result as a tf.matmul that supports broadcasting?
I solved it myself:
x=tf.transpose(x,perm=[1,0,2,3]) #[1000,10,3,4]
x=tf.reshape(x,[1000,30,4])
y=tf.reshape(y,[1000,4,1])
result=tf.matmul(x,y) #[1000,30,1]
result=tf.reshape(result,[1000,10,3])
result=tf.transpose(result,perm=[1,0,2]) #[10,1000,3]
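The transpose and reshape steps can be sanity-checked in NumPy. This is a sketch only: the shapes are shrunk to B=2 and N=5 (hypothetical values chosen to keep it fast), and np.einsum serves as an independent reference for the intended result.

```python
import numpy as np

B, N = 2, 5                                  # stand-ins for 10 and 1000
rng = np.random.default_rng(0)
x = rng.standard_normal((B, N, 3, 4))        # batch of [3,4] matrices
y = rng.standard_normal((N, 4))              # batch of [4] vectors

# Move the shared dimension N to the front and fold B into the row axis.
xt = np.transpose(x, (1, 0, 2, 3)).reshape(N, B * 3, 4)
yt = y.reshape(N, 4, 1)

result = np.matmul(xt, yt)                   # shape (N, B*3, 1)
result = result.reshape(N, B, 3)
result = np.transpose(result, (1, 0, 2))     # shape (B, N, 3)

# Reference: contract the last axis of each matrix with its vector.
expected = np.einsum('bnij,nj->bni', x, y)
print(np.allclose(result, expected))
```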
As indicated here, you can use a function to work around:
def broadcast_matmul(A, B):
    "Compute A @ B, broadcasting over the first `N-2` ranks"
    with tf.variable_scope("broadcast_matmul"):
        return tf.reduce_sum(A[..., tf.newaxis] * B[..., tf.newaxis, :, :],
                             axis=-2)
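Why the multiply-and-sum works can be seen from the shapes. Below is a NumPy sketch of the same expression (broadcast_matmul_np is a hypothetical name, and the shapes are shrunk for illustration): A gains a trailing axis, B gains an axis before its last two, and the sum over axis -2 performs the contraction.

```python
import numpy as np

def broadcast_matmul_np(A, B):
    """NumPy analogue: A is (..., m, k), B is (..., k, n), result is (..., m, n)."""
    # A[..., None] is (..., m, k, 1); B[..., None, :, :] is (..., 1, k, n).
    # Their product broadcasts to (..., m, k, n); summing over k gives A @ B.
    return np.sum(A[..., np.newaxis] * B[..., np.newaxis, :, :], axis=-2)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 3, 4))
y = rng.standard_normal((5, 4))

# y is reshaped to (1, 5, 4, 1) so that its batch axes broadcast against x.
result = broadcast_matmul_np(x, y.reshape(1, 5, 4, 1)).reshape(2, 5, 3)
```

The trade-off is memory: the intermediate product holds m*k*n values per batch entry before the sum, so this suits small matrices better than large ones.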

What does tf.gather_nd intuitively do?

Can you intuitively explain, or give more examples of, tf.gather_nd for indexing and slicing into high-dimensional tensors in TensorFlow?
I read the API documentation, but it is so concise that I find it hard to follow the function's concept.
Ok, so think about it like this:
You are providing a list of index values to index the provided tensor to get those slices. The first dimension of the indices you provide is for each index you will perform. Let's pretend that tensor is just a list of lists.
[[0]] means you want to get one specific slice(list) at index 0 in the provided tensor. Just like this:
[tensor[0]]
[[0], [1]] means you want to get two specific slices at indices 0 and 1 like this:
[tensor[0], tensor[1]]
Now what if tensor has more than one dimension? We do the same thing:
[[0, 0]] means you want to get one slice at index [0,0] of the 0-th list. Like this:
[tensor[0][0]]
[[0, 1], [2, 3]] means you want to return two slices at the indices and dimensions provided. Like this:
[tensor[0][1], tensor[2][3]]
I hope that makes sense. I tried using Python indexing to help explain how it would look in Python to do this to a list of lists.
You provide a tensor and indices representing locations in that tensor. It returns the elements of the tensor corresponding to the indices you provide.
EDIT: An example
import tensorflow as tf
sess = tf.Session()
x = [[1,2,3],[4,5,6]]
y = tf.gather_nd(x, [[1,1],[1,2]])
print(sess.run(y))
[5, 6]
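The list-of-lists picture above can be turned into a small NumPy stand-in. This is only a sketch of the indexing idea, not TensorFlow's implementation; gather_nd_np is a hypothetical helper name.

```python
import numpy as np

def gather_nd_np(params, indices):
    """Each row of indices addresses one element or slice of params."""
    params = np.asarray(params)
    indices = np.asarray(indices)
    # Split the last axis of indices into one index array per dimension,
    # so params[i0, i1, ...] is evaluated once per row of indices.
    return params[tuple(np.moveaxis(indices, -1, 0))]

x = [[1, 2, 3], [4, 5, 6]]

elements = gather_nd_np(x, [[1, 1], [1, 2]])   # x[1][1], x[1][2]
slices = gather_nd_np(x, [[0], [1]])           # x[0], x[1] as whole rows
```

When each index row is as long as the tensor's rank you get single elements; shorter index rows return whole slices.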

NumPy vectorization with integration

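I have a vector ws and wish to make another vector of the same length whose k-th component is the integral over [-1, 1] of f(x, ws[k]) * log(sum_j f(x, ws[j])), with f as defined in the code below.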
The question is: how can we vectorize this for speed? NumPy vectorize() is actually a for loop, so it doesn't count.
Veedrac pointed out that "There is no way to apply a pure Python function to every element of a NumPy array without calling it that many times". Since I'm using NumPy functions rather than "pure Python" ones, I suppose it's possible to vectorize, but I don't know how.
import numpy as np
from scipy.integrate import quad
ws = 2 * np.random.random(10) - 1
n = len(ws)
integrals = np.empty(n)
def f(x, w):
    if w < 0: return np.abs(x * w)
    else: return np.exp(x) * w
def temp(x): return np.array([f(x, w) for w in ws]).sum()
def integrand(x, w): return f(x, w) * np.log(temp(x))
## Python for loop
for k in range(n):
    integrals[k] = quad(integrand, -1, 1, args = ws[k])[0]
## NumPy vectorize
integrals = np.vectorize(quad)(integrand, -1, 1, args = ws)[0]
On a side note, is a Cython for loop always faster than NumPy vectorization?
The function quad executes an adaptive algorithm, which means the computations it performs depend on the specific thing being integrated. This cannot be vectorized in principle.
In your case, a for loop of length 10 is a non-issue. If the program takes long, it's because integration takes long, not because you have a for loop.
When you absolutely need to vectorize integration (not in the example above), use a non-adaptive method, with the understanding that precision may suffer. These can be directly applied to a 2D NumPy array obtained by evaluating all of your functions on some regularly spaced 1D array (a linspace). You'll have to choose the linspace yourself since the methods aren't adaptive.
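numpy.trapz is the simplest and least precise (it is exposed as numpy.trapezoid in NumPy 2.0 and later).
scipy.integrate.simps is equally easy to use and more precise (Simpson's rule requires an odd number of samples, but the method works around having an even number, too). Newer SciPy releases call it scipy.integrate.simpson.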
scipy.integrate.romb is in principle of higher accuracy than Simpson (for smooth data) but it requires the number of samples to be 2**n+1 for some integer n.
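As a sketch of the non-adaptive route, the snippet below integrates a small family of functions in one call by evaluating them on a shared linspace. The family x**k is a made-up example chosen because the exact integrals are known; the trapezoid name is looked up defensively since it differs between NumPy versions.

```python
import numpy as np

# np.trapezoid exists in NumPy 2.0+, np.trapz in older releases.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

xs = np.linspace(-1, 1, 2001)            # grid chosen by hand, not adaptively
ks = np.arange(4)                        # exponents 0, 1, 2, 3

# Row k holds x**k sampled on the grid: shape (4, 2001).
values = xs[np.newaxis, :] ** ks[:, np.newaxis]

# One vectorized call integrates every row along the x axis.
integrals = trapezoid(values, x=xs, axis=1)

exact = np.array([2.0, 0.0, 2.0 / 3.0, 0.0])
print(np.max(np.abs(integrals - exact)))
```

The accuracy depends entirely on the grid spacing, which is the price paid for dropping the adaptive algorithm.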
@zaq's answer focusing on quad is spot on. So I'll look at some other aspects of the problem.
In a recent answer (https://stackoverflow.com/a/41205930/901925) I argue that vectorize is of most value when you need to apply the full broadcasting mechanism to a function that only takes scalar values. Your quad qualifies as taking scalar inputs. But you are only iterating on one array, ws. The x that is passed on to your functions is generated by quad itself. quad and integrand are still Python functions, even if they use numpy operations.
cython improves low level iteration, stuff that it can convert to C code. Your primary iteration is at a high level, calling an imported function, quad. Cython can't touch or rewrite that.
You might be able to speed up integrand (and on down) with cython, but first focus on getting the most speed from that with regular numpy code.
def f(x, w):
    if w < 0: return np.abs(x * w)
    else: return np.exp(x) * w
With if w<0 w must be scalar. Can it be written so it works with an array w? If so, then
np.array([f(x, w) for w in ws]).sum()
could be rewritten as
fn(x, ws).sum()
Alternatively, since both x and w are scalar, you might get a bit of speed improvement by using math.exp etc instead of np.exp. Same for log and abs.
I'd try to write f(x,w) so it takes arrays for both x and w, returning a 2d result. If so, then temp and integrand would also work with arrays. Since quad feeds a scalar x, that may not help here, but with other integrators it could make a big difference.
If f(x,w) can be evaluated on a regular nx10 grid of x=np.linspace(-1,1,n) and ws, then an integral (of sorts) just requires a couple of summations over that space.
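One way to make f accept arrays for both arguments is np.where, which picks between the two branches element by element. This is a sketch, not the original poster's code: f_scalar repeats the question's function under a new name, and f_vec is a hypothetical vectorized version.

```python
import numpy as np

def f_scalar(x, w):
    # The scalar function from the question.
    if w < 0:
        return np.abs(x * w)
    return np.exp(x) * w

def f_vec(x, ws):
    """Evaluate f for every (x, w) pair; returns shape x.shape + (len(ws),)."""
    x = np.asarray(x, dtype=float)[..., np.newaxis]
    # Both branches are computed everywhere, then selected per element.
    return np.where(ws < 0, np.abs(x * ws), np.exp(x) * ws)

rng = np.random.default_rng(0)
ws = 2 * rng.random(10) - 1
xs = np.linspace(-1, 1, 5)

grid = f_vec(xs, ws)          # shape (5, 10): rows are x values, columns are ws
temp = grid.sum(axis=1)       # the question's temp(x), for all x at once
```

Computing both branches is wasteful for a single value but usually cheaper than a Python loop once the arrays are large.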
You can use quadpy for fully vectorized computation. You'll have to adapt your function to allow for vector inputs first, but that is done rather easily:
import numpy as np
import quadpy
np.random.seed(0)
ws = 2 * np.random.random(10) - 1
def f(x):
    out = np.empty((len(ws), *x.shape))
    out0 = np.abs(np.multiply.outer(ws, x))
    out1 = np.multiply.outer(ws, np.exp(x))
    out[ws < 0] = out0[ws < 0]
    out[ws >= 0] = out1[ws >= 0]
    return out
def integrand(x):
    return f(x) * np.log(np.sum(f(x), axis=0))
val, err = quadpy.quad(integrand, -1, +1, epsabs=1.0e-10)
print(val)
[0.3266534 1.44001826 0.68767868 0.30035222 0.18011948 0.97630376
0.14724906 2.62169217 3.10276876 0.27499376]

Creating new vector in tensorflow from argmax performed on another tensor

I have tensor that has shape (?, 3), looks like this [x, y, z] and I need to create function that take argmax of it, creates new vector and assign values with respect to dimension and argmax.
Example:
def f(y):
    v = tf.Variable(tf.zeros(y.get_shape()))
    index = tf.argmax(y)
    v[index] = 1.0
    return v
Unfortunately this doesn't work and I can't figure out how can one do it.
Are you sure that you want to create and assign to a tf.Variable here? It would probably be simpler to use the tf.one_hot() op (available from version 0.8 onwards) to build the result functionally, as you wouldn't have to worry about initialization, etc. For example, you could do the following:
def f(y):
    index = tf.argmax(y, 1)
    return tf.one_hot(index, tf.shape(y)[1], 1.0, 0.0)
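To see what the functional approach produces, here is a NumPy sketch of the same idea (an assumption: rows of an identity matrix stand in for tf.one_hot, and the input values are made up).

```python
import numpy as np

# Made-up batch of two rows with 3 scores each, shape (2, 3).
y = np.array([[0.2, 0.5, 0.3],
              [0.9, 0.05, 0.05]])

index = np.argmax(y, axis=1)                 # position of the max in each row

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(y.shape[1], dtype=np.float32)[index]
```

No variable is created or assigned; the result is built directly from the indices, which is the point of the answer above.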