Let's suppose I have an input layer with shape (h, w, f) = (1, 1, 256),
and I build two sequences of layers:
Case 1:
input = keras.layers.Input((1, 1, 256))
x = keras.layers.Conv2D(filters=32, kernel_size=(1, 1), strides=1)(input)
x = keras.layers.ReLU()(x)
x = keras.layers.Conv2D(filters=256, kernel_size=(1, 1), strides=1)(x)
Case 2:
input = keras.layers.Input((1, 1, 256))
x = keras.layers.Flatten()(input)
x = keras.layers.Dense(32)(x)
x = keras.layers.ReLU()(x)
x = keras.layers.Dense(256)(x)
x = keras.layers.Reshape((1, 1, 256))(x)
Is the output x the same in these two cases?
I am making an attention module that is similar to SE-Net, but not identical.
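Yes. With a 1x1 spatial input, a 1x1 convolution computes the same thing as a Dense layer over the channels, so given the same weights the outputs match. You also do not need Flatten() and Reshape() in case 2: Dense is applied along the last axis automatically.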
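The equivalence can be checked numerically. The following is a minimal sketch, assuming TensorFlow 2.x in eager mode; the batch size of 4 and the random input are made up for illustration. It copies the weights of a 1x1 Conv2D into a Dense layer and applies both directly to the 4-D input, without Flatten() or Reshape().

```python
import numpy as np
import tensorflow as tf

# Hypothetical batch of 4 inputs with shape (1, 1, 256).
rng = np.random.default_rng(0)
x = rng.random((4, 1, 1, 256)).astype("float32")

conv = tf.keras.layers.Conv2D(filters=32, kernel_size=(1, 1), strides=1)
dense = tf.keras.layers.Dense(32)

y_conv = conv(x)  # first call creates the conv weights
dense(x)          # first call creates the dense weights

# Copy the conv kernel (1, 1, 256, 32) into the dense kernel (256, 32).
kernel, bias = conv.get_weights()
dense.set_weights([kernel.reshape(256, 32), bias])

# Dense acts on the last axis, so the 4-D input can be passed as-is.
y_dense = dense(x)

max_abs_diff = float(np.max(np.abs(y_conv.numpy() - y_dense.numpy())))
print(y_conv.shape, y_dense.shape, max_abs_diff)
```

The two outputs have the same shape and differ only by floating-point rounding.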
I'm using TensorFlow 2.3.1, and the following code is not shuffling the data.
I have time series data from which I want to build features and a target for prediction.
Code:
data_len = len(calls['Calls'])
# Load data as TensorFlow dataset
dataset = tf.data.Dataset.from_tensor_slices(calls['Calls'])
# Create data windows and shifting values. Drop incomplete arrays
dataset = dataset.window(5, shift=1, drop_remainder=True)
# Convert into tensors for each window
dataset = dataset.flat_map(lambda window: window.batch(5))
# Split into features and target
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))
# Shuffle data
dataset.shuffle(buffer_size=data_len)
dataset.batch(2).prefetch(1)
# Print output
for x, y in dataset:
    print('x = {}'.format(x.numpy()))
    print('y = {}'.format(y.numpy()))
Result:
x = [1659 4928 3961 3663]
y = [2452]
x = [4928 3961 3663 2452]
y = [2195]
x = [3961 3663 2452 2195]
y = [3796]
x = [3663 2452 2195 3796]
y = [2997]
x = [2452 2195 3796 2997]
y = [2598]
x = [2195 3796 2997 2598]
y = [2605]
x = [3796 2997 2598 2605]
y = [2603]
x = [2997 2598 2605 2603]
y = [2332]
x = [2598 2605 2603 2332]
y = [2025]
x = [2605 2603 2332 2025]
Found out the issue.
I wasn't assigning the batching and shuffling back to the dataset.
Thanks anyway.
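For reference, tf.data transformations return a new dataset rather than modifying the existing one in place, so each result has to be assigned back. A minimal sketch of the corrected pipeline, assuming TensorFlow 2.x; tf.range(10) is a made-up stand-in for calls['Calls']:

```python
import tensorflow as tf

# Hypothetical stand-in for calls['Calls'].
values = tf.range(10)

dataset = tf.data.Dataset.from_tensor_slices(values)
dataset = dataset.window(5, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(5))
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))

# Assign the results back: shuffle() and batch() do not work in place.
dataset = dataset.shuffle(buffer_size=10, seed=0)
dataset = dataset.batch(2).prefetch(1)

batches = [(x.numpy(), y.numpy()) for x, y in dataset]
for x, y in batches:
    print('x = {}'.format(x))
    print('y = {}'.format(y))
```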
I am trying to train a name-generation LSTM network. I am not using the predefined TensorFlow cells (tf.contrib.rnn.BasicLSTMCell, etc.); I have written the LSTM cell myself. But the error does not decrease beyond a certain point. It only drops about 30% from its initial value (when random weights were used in forward propagation) and then starts increasing. Also, the gradients and weights become very small after a few thousand training steps.
I think the reason for non-convergence can be one of two:
1. The design of the TensorFlow graph I have created, OR
2. The loss function I used.
I am feeding a one-hot vector for each character of the word at each time step of the network. The code I used for graph generation and the loss function is below. Tx is the number of time steps in the RNN; n_x, n_a and n_y are the lengths of the input vector, the LSTM cell vector and the output vector respectively.
It would be great if someone could help me identify what I am doing wrong here.
n_x = vocab_size
n_y = vocab_size
n_a = 100
Tx = 50
Ty = Tx
with open("trainingnames_file.txt") as f:
    examples = f.readlines()
examples = [x.lower().strip() for x in examples]
X0 = [[char_to_ix[x1] for x1 in list(x)] for x in examples]
X1 = np.array([np.concatenate([np.array(x), np.zeros([Tx-len(x)])]) for x in X0], dtype=np.int32).T
Y0 = [(x[1:] + [char_to_ix["\n"]]) for x in X0]
Y1 = np.array([np.concatenate([np.array(y), np.zeros([Ty-len(y)])]) for y in Y0], dtype=np.int32).T
m = len(X0)
Wf = tf.get_variable(name="Wf", shape = [n_a,(n_a+n_x)])
Wu = tf.get_variable(name="Wu", shape = [n_a,(n_a+n_x)])
Wc = tf.get_variable(name="Wc", shape = [n_a,(n_a+n_x)])
Wo = tf.get_variable(name="Wo", shape = [n_a,(n_a+n_x)])
Wy = tf.get_variable(name="Wy", shape = [n_y,n_a])
bf = tf.get_variable(name="bf", shape = [n_a,1])
bu = tf.get_variable(name="bu", shape = [n_a,1])
bc = tf.get_variable(name="bc", shape = [n_a,1])
bo = tf.get_variable(name="bo", shape = [n_a,1])
by = tf.get_variable(name="by", shape = [n_y,1])
X_input = tf.placeholder(dtype = tf.int32, shape = [Tx,None])
Y_input = tf.placeholder(dtype = tf.int32, shape = [Ty,None])
X = tf.one_hot(X_input, axis = 0, depth = n_x)
Y = tf.one_hot(Y_input, axis = 0, depth = n_y)
X.shape
a_prev = tf.zeros(shape = [n_a,m])
c_prev = tf.zeros(shape = [n_a,m])
a_all = []
c_all = []
for i in range(Tx):
    ac = tf.concat([a_prev,tf.squeeze(tf.slice(input_=X,begin=[0,i,0],size=[n_x,1,m]))], axis=0)
    ct = tf.tanh(tf.matmul(Wc,ac) + bc)
    tug = tf.sigmoid(tf.matmul(Wu,ac) + bu)
    tfg = tf.sigmoid(tf.matmul(Wf,ac) + bf)
    tog = tf.sigmoid(tf.matmul(Wo,ac) + bo)
    c = tf.multiply(tug,ct) + tf.multiply(tfg,c_prev)
    a = tf.multiply(tog,tf.tanh(c))
    y = tf.nn.softmax(tf.matmul(Wy,a) + by, axis = 0)
    a_all.append(a)
    c_all.append(c)
    a_prev = a
    c_prev = c
    y_ex = tf.expand_dims(y,axis=1)
    if i == 0:
        y_all = y_ex
    else:
        y_all = tf.concat([y_all,y_ex], axis=1)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y,logits=y_all,dim=0))
opt = tf.train.AdamOptimizer()
train = opt.minimize(loss)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    o = sess.run(loss, feed_dict = {X_input:X1,Y_input:Y1})
    print(o.shape)
    print(o)
    sess.run(train, feed_dict = {X_input:X1,Y_input:Y1})
    o = sess.run(loss, feed_dict = {X_input:X1,Y_input:Y1})
    print(o)
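One thing that may be worth checking in the graph above, offered as a possibility rather than a confirmed diagnosis: tf.nn.softmax_cross_entropy_with_logits_v2 applies softmax internally and expects unnormalized scores, while y_all already holds softmax outputs. The following minimal NumPy sketch shows what a second softmax does to the loss. The helpers softmax and cross_entropy_from_logits and the 3-class example are hypothetical, written only for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy_from_logits(logits, label):
    """Cross-entropy that applies softmax internally, like the TF op."""
    return -np.log(softmax(logits)[label])

# A confident and correct prediction for class 0 (made-up numbers).
logits = np.array([10.0, 0.0, 0.0])
label = 0

# Passing raw logits: the loss is close to zero.
loss_correct = cross_entropy_from_logits(logits, label)

# Passing probabilities as if they were logits: softmax is applied twice,
# and the loss cannot fall below log(1 + 2/e) for 3 classes.
loss_double = cross_entropy_from_logits(softmax(logits), label)

print(loss_correct, loss_double)
```

If this is the cause, passing tf.matmul(Wy,a) + by to the loss without the tf.nn.softmax call, and applying softmax only when sampling, would be the change to try. Whether it resolves the non-convergence here is an assumption that has not been verified against this model.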
I am trying to evaluate a condition on each element of a vector y so that I get a vector whose i’th element tells me whether y[i] satisfies the condition. Is there any way to do this without using loops? So far, I have tried the following:
dim = 3
x = tf.placeholder(tf.float32, shape = [dim])
y = tf.log(x)
tf1 = tf.constant(1)
tf0 = tf.constant(0)
x_0 = tf.tile([x[0]], [dim])
delta = tf.cond(tf.equal(y,x_0), tf1, tf0)
sess = tf.Session()
a = np.ones((1,3))
print(sess.run(delta, feed_dict={x:a}))
For a given input x, I want delta[i] to be 1 if y[i] = x[0] and 0 otherwise.
I get the error:
shape must be of equal rank but are 0 and 1 for 'Select_2' (op: 'select') with input shapes [3], [],[]
I am new to TensorFlow, any help would be appreciated!
It seems you get the error because you are trying to compare tensors with different shapes.
Here is working code:
import tensorflow as tf
import numpy as np
dim = 3
x = tf.placeholder(tf.float32, shape=(1, dim), name='ktf')
y = tf.log(x)
delta = tf.cast(tf.equal(y, x[0]), dtype=tf.int32)
sess = tf.Session()
a = np.ones((1, 3))
print(sess.run(delta, feed_dict={x: a}))
For your case, there is no need to use tf.cond. tf.equal does the comparison elementwise without loops, and because of broadcasting there is no need to tile x[0]. Just use:
dim = 3
x = tf.placeholder(tf.float32, shape = [dim])
y = tf.log(x)
delta = tf.cast(tf.equal(y,x[0]),tf.float32) # or integer type
sess = tf.Session()
a = np.ones(dim)  # must match the placeholder shape [dim]
print(sess.run(delta, feed_dict={x:a}))
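The same broadcast comparison can be sketched without placeholders or a session, assuming TensorFlow 2.x in eager mode. The input values are made up so that some elements of y actually equal x[0].

```python
import tensorflow as tf

# Made-up input: log(1) == 0 == x[0], so the last two elements match.
x = tf.constant([0.0, 1.0, 1.0])
y = tf.math.log(x)  # [-inf, 0., 0.]

# x[0] is a scalar, so tf.equal broadcasts it against every y[i].
delta = tf.cast(tf.equal(y, x[0]), tf.int32)

result = delta.numpy().tolist()
print(result)  # [0, 1, 1]
```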
import tensorflow as tf
x = [[1,2,3],[4,5,6]]
y = [0,1]
z = [1,2]
x = tf.constant(x)
y = tf.constant(y)
z = tf.constant(z)
m = x[y,z]
What I expect is m = [2,6]
I can get this result with Theano or NumPy. How can I get it using TensorFlow?
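You would want to use tf.gather_nd, with y and z stacked so that each row of the index tensor is one (row, column) pair:
slices = tf.gather_nd(x, tf.stack([y, z], axis=1))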
Hope this helps.
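tf.gather_nd reads each row of its index tensor as one (row, column) pair, which matters as soon as the index matrix is not symmetric. A minimal sketch, assuming TensorFlow 2.x in eager mode; the third index pair is made up to show the difference:

```python
import tensorflow as tf

x = tf.constant([[1, 2, 3], [4, 5, 6]])
y = tf.constant([0, 1, 1])  # row indices
z = tf.constant([1, 2, 0])  # column indices

# Pair the indices up: [[0, 1], [1, 2], [1, 0]].
indices = tf.stack([y, z], axis=1)

# Equivalent to NumPy's x[y, z].
m = tf.gather_nd(x, indices)

result = m.numpy().tolist()
print(result)  # [2, 6, 4]
```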
So I have a vector x and a matrix y where y[i] = [f(x[i]),f(x[i]),f(x[i])...] is a row of experimental values of f at x. I need to flatten out y so that I have two vectors with y[i] = f(x[i]). Here's what I'm using now:
x = np.ravel([[xx]*y.shape[1] for xx in x]); y = np.ravel(y)
Is there a cleaner/faster way?
You could just use np.repeat -
x = x.repeat(y.shape[1])
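A small check that np.repeat matches the list-comprehension version from the question, using made-up values for x and y:

```python
import numpy as np

# Made-up data: 3 x-values, 2 measurements of f at each.
x = np.array([1.0, 2.0, 3.0])
y = np.array([[1.1, 0.9],
              [2.1, 1.9],
              [3.2, 2.8]])

# Original approach from the question.
x_loop = np.ravel([[xx] * y.shape[1] for xx in x])

# Repeat each x[i] once per column of y, then flatten y row by row.
x_flat = np.repeat(x, y.shape[1])
y_flat = y.ravel()

print(x_flat)  # [1. 1. 2. 2. 3. 3.]
print(y_flat)  # [1.1 0.9 2.1 1.9 3.2 2.8]
```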