As the weights are not explicitly defined, how can I pass them to a summary writer?
For example:
conv1 = tf.layers.conv2d(
    tf.reshape(X,[FLAGS.batch,3,160,320]),
    filters = 16,
    kernel_size = (8,8),
    strides=(4, 4),
    padding='same',
    kernel_initializer=tf.contrib.layers.xavier_initializer(),
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    name = 'conv1',
    activation = tf.nn.elu
)
=>
summarize_tensor(
    ??????
)
Thanks!
While Da Tong's answer is complete, it took me a while to realize how to use it. To save time for other beginners: add the following to your code to add all trainable variables to the TensorBoard summary:
for var in tf.trainable_variables():
    tf.summary.histogram(var.name, var)
merged_summary = tf.summary.merge_all()
That depends on what you want to record in TensorBoard. If you want to record every variable, calling tf.global_variables() (tf.all_variables() in older releases) or tf.trainable_variables() will give you all of them. Note that tf.layers.conv2d is just a wrapper that creates a Conv2D instance and calls its apply method. You can unwrap it like this:
conv1_layer = tf.layers.Conv2D(
    filters = 16,
    kernel_size = (8,8),
    strides=(4, 4),
    padding='same',
    kernel_initializer=tf.contrib.layers.xavier_initializer(),
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    name = 'conv1',
    activation = tf.nn.elu
)
conv1 = conv1_layer.apply(tf.reshape(X,[FLAGS.batch,3,160,320]))
Then you can use conv1_layer.kernel to access the kernel weights.
Related
For my project I need to initialize the kernel of the first CNN layer with Gammatone filters, as described in several papers ( https://www.mdpi.com/1099-4300/20/12/990/htm ), ( https://www.groundai.com/project/end-to-end-environmental-sound-classification-using-a-1d-convolutional-neural-network/1 ) and a few others. What exactly does it mean to initialize a CNN kernel with a Gammatone filter (or any filter)? How does one implement it? Is it a custom layer? Any tips and guidance would be much appreciated!
For instance:
conv_1 = Conv1D(filters = 64, kernel_size = 3, kernel_initializer = *insert Gammatone Filter*, padding = 'same', activation='relu', input_shape = (timesteps, features))(decoder_outputs3)
TIA
You could use TensorFlow's constant initializer:
gammatone_filter_kernel = np.array([...])
init_kernel = tf.constant_initializer(gammatone_filter_kernel)
# ...
conv_1 = Conv1D(filters = 64, kernel_size = 3, kernel_initializer = init_kernel, padding = 'same', activation='relu', input_shape = (timesteps, features))(decoder_outputs3)
# ...
If your filter is some kind of preprocessing step for your signal, you could set the trainable attribute of the conv layer to False and the weights will stay fixed.
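As a rough sketch of what the array passed to the constant initializer might look like, the snippet below samples the standard gammatone impulse response g(t) = t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*f*t) at a few centre frequencies and stacks the results into the (kernel_size, in_channels, filters) layout that a Keras Conv1D kernel uses. The function name gammatone_kernel, the sampling rate, the centre frequencies, the filter order and the ERB-based bandwidth rule are illustrative assumptions, not values taken from the linked papers.

```python
import numpy as np

def gammatone_kernel(kernel_size, center_freqs_hz, sample_rate_hz=16000.0,
                     order=4, in_channels=1):
    """Hypothetical helper: returns an array of shape
    (kernel_size, in_channels, n_filters) holding sampled gammatone responses."""
    t = np.arange(kernel_size) / sample_rate_hz  # time axis in seconds
    filters = []
    for fc in center_freqs_hz:
        # Assumed bandwidth rule: b = 1.019 * ERB(fc), ERB as in Glasberg & Moore
        erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
        b = 1.019 * erb
        g = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
        g = g / np.max(np.abs(g))  # scale each filter to a peak magnitude of 1
        filters.append(g)
    bank = np.stack(filters, axis=-1)  # (kernel_size, n_filters)
    # Use the same response for every input channel
    kernel = np.repeat(bank[:, np.newaxis, :], in_channels, axis=1)
    return kernel.astype(np.float32)

# Assumed example settings: 64 filters between 100 Hz and 4 kHz, 128 taps each
center_freqs = np.linspace(100.0, 4000.0, 64)
gammatone_filter_kernel = gammatone_kernel(kernel_size=128, center_freqs_hz=center_freqs)
print(gammatone_filter_kernel.shape)
```

The resulting array could then take the place of np.array([...]) above, provided kernel_size and filters in the Conv1D call match its shape. With kernel_size = 3, as in the question, only three samples of each impulse response would be kept, so a longer kernel is probably needed.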
I want to classify images with different input sizes, and I would like to use the ideas from the following paper.
'Fully Convolutional Networks for Semantic Segmentation'
https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf
I replaced the dense layers with Conv2D layers like this:
def FullyCNN(input_shape, n_classes):
    inputs = Input(shape=(None, None, 1))
    first_layer = Conv2D(filters=16, kernel_size=(12,16), strides=1, activation='relu', kernel_initializer='he_normal', name='conv1')(inputs)
    first_layer = BatchNormalization()(first_layer)
    first_layer = MaxPooling2D(pool_size=2)(first_layer)
    second_layer = Conv2D(filters=24, kernel_size=(8,12), strides=1, activation='relu', kernel_initializer='he_normal', name='conv2')(first_layer)
    second_layer = BatchNormalization()(second_layer)
    second_layer = MaxPooling2D(pool_size=2)(second_layer)
    # the third block consumes the output of the second block
    third_layer = Conv2D(filters=32, kernel_size=(5,7), strides=1, activation='relu', kernel_initializer='he_normal', name='conv3')(second_layer)
    third_layer = BatchNormalization()(third_layer)
    third_layer = MaxPooling2D(pool_size=2)(third_layer)
    # convolutional replacement for the dense layers
    fully_layer = Conv2D(64, kernel_size=8, activation='relu', kernel_initializer='he_normal')(third_layer)
    fully_layer = BatchNormalization()(fully_layer)
    fully_layer = Dropout(0.5)(fully_layer)
    fully_layer = Conv2D(n_classes, kernel_size=1)(fully_layer)
    output = Conv2DTranspose(n_classes, kernel_size=1, activation='softmax')(fully_layer)
    model = Model(inputs=inputs, outputs=output)
    return model
I also wrote a generator to use with fit_generator():
def data_generator(x_train, y_train):
    while True:
        index = np.random.randint(len(x_train)) # pick one random sample
        feature = np.expand_dims(x_train[index],-1)
        feature = np.resize(feature,(-1,feature.shape))
        feature = np.expand_dims(feature,0) # make (1,input_height,input_width,1)
        label = y_train[index]
        yield (feature,label)
These are images of my data.
However, there are some problems with the dimensions.
Since the output layer must have 4 dimensions, unlike in the original CNN model, its shape does not match the shape of the labels.
Model summary:
Original CNN model summary:
How can I handle this problem? I tried to change the dimensions of the label by expanding them:
label = np.expand_dims(label,0)
label = np.expand_dims(label,0)
label = np.expand_dims(label,0)
I think there is a better way. I also wonder whether the Conv2DTranspose layer is necessary, and whether the batch size has to be 1.
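To see where the extra dimensions come from, the spatial size of the output can be traced by hand. The sketch below assumes that the three conv blocks are chained one after another, that every Conv2D uses the Keras default 'valid' padding with stride 1, and that each MaxPooling2D halves the size (rounding down). The helper names are made up for this illustration.

```python
def conv_valid(size, kernel, stride=1):
    """Output length of a 'valid' convolution along one axis."""
    return (size - kernel) // stride + 1

def pool(size, pool_size=2):
    """Output length of max pooling whose stride equals pool_size."""
    return size // pool_size

def trace_output_size(height, width):
    # (kernel_height, kernel_width) of the three conv blocks,
    # each followed by 2x2 max pooling
    for kh, kw in [(12, 16), (8, 12), (5, 7)]:
        height = pool(conv_valid(height, kh))
        width = pool(conv_valid(width, kw))
    # The 8x8 convolution that replaces the dense layer;
    # the following 1x1 layers leave the spatial size unchanged
    return conv_valid(height, 8), conv_valid(width, 8)

print(trace_output_size(105, 125))  # (1, 1)
print(trace_output_size(128, 160))  # (3, 5)
```

Under these assumptions the output has shape (batch, h, w, n_classes), where h and w grow with the input size, so a per-image label of shape (n_classes,) only lines up with it when h = w = 1 or after the spatial axes have been reduced in some way.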
I am currently trying to train a model (the hypernetwork) that predicts the weights of another model (the main network) such that the main network's cross-entropy loss decreases. However, when I use tf.assign to assign the new weights to the main network, no gradients flow back into the hypernetwork, which renders the system non-differentiable. I have tested whether my weights are properly updated, and they seem to be, since subtracting the initial weights from the updated ones gives a non-zero sum.
This is a minimal sample of what I am trying to achieve.
import numpy as np
import tensorflow as tf
from tensorflow.contrib.layers import softmax
def random_addition(variables):
    addition_update_ops = []
    for variable in variables:
        update = tf.assign(variable, variable+tf.random_normal(shape=variable.get_shape()))
        addition_update_ops.append(update)
    return addition_update_ops
def network_predicted_addition(variables, network_preds):
    addition_update_ops = []
    for idx, variable in enumerate(variables):
        if idx == 0:
            print(variable)
            update = tf.assign(variable, variable + network_preds[idx])
            addition_update_ops.append(update)
    return addition_update_ops
def dense_weight_update_net(inputs, reuse):
    with tf.variable_scope("weight_net", reuse=reuse):
        output = tf.layers.conv2d(inputs=inputs, kernel_size=(3, 3), filters=16, strides=(1, 1),
                                  activation=tf.nn.leaky_relu, name="conv_layer_0", padding="SAME")
        output = tf.reduce_mean(output, axis=[0, 1, 2])
        output = tf.reshape(output, shape=(1, output.get_shape()[0]))
        output = tf.layers.dense(output, units=(16*3*3*3))
        output = tf.reshape(output, shape=(3, 3, 3, 16))
    return output
def conv_net(inputs, reuse):
    with tf.variable_scope("conv_net", reuse=reuse):
        output = tf.layers.conv2d(inputs=inputs, kernel_size=(3, 3), filters=16, strides=(1, 1),
                                  activation=tf.nn.leaky_relu, name="conv_layer_0", padding="SAME")
        output = tf.reduce_mean(output, axis=[1, 2])
        output = tf.layers.dense(output, units=2)
        output = softmax(output)
    return output
input_x_0 = tf.zeros(shape=(32, 32, 32, 3))
target_y_0 = tf.zeros(shape=(32), dtype=tf.int32)
input_x_1 = tf.ones(shape=(32, 32, 32, 3))
target_y_1 = tf.ones(shape=(32), dtype=tf.int32)
input_x = tf.concat([input_x_0, input_x_1], axis=0)
target_y = tf.concat([target_y_0, target_y_1], axis=0)
output_0 = conv_net(inputs=input_x, reuse=False)
target_y = tf.one_hot(target_y, 2)
crossentropy_loss_0 = tf.losses.softmax_cross_entropy(onehot_labels=target_y, logits=output_0)
conv_net_parameters = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="conv_net")
weight_net_parameters = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="weight_net")
print(conv_net_parameters)
weight_updates = dense_weight_update_net(inputs=input_x, reuse=False)
#updates_0 = random_addition(conv_net_parameters)
updates_1 = network_predicted_addition(conv_net_parameters, network_preds=[weight_updates])
with tf.control_dependencies(updates_1):
    output_1 = conv_net(inputs=input_x, reuse=True)
    crossentropy_loss_1 = tf.losses.softmax_cross_entropy(onehot_labels=target_y, logits=output_1)
check_sum = tf.reduce_sum(tf.abs(output_0 - output_1))
c_opt = tf.train.AdamOptimizer(beta1=0.9, learning_rate=0.001)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) # Needed for correct batch norm usage
with tf.control_dependencies(update_ops): # Needed for correct batch norm usage
    train_variables = weight_net_parameters #+ conv_net_parameters
    c_error_opt_op = c_opt.minimize(crossentropy_loss_1,
                                    var_list=train_variables,
                                    colocate_gradients_with_ops=True)
init=tf.global_variables_initializer()
with tf.Session() as sess:
    init = sess.run(init)
    loss_list_0 = []
    loss_list_1 = []
    for i in range(1000):
        _, checksum, crossentropy_0, crossentropy_1 = sess.run([c_error_opt_op, check_sum, crossentropy_loss_0,
                                                                crossentropy_loss_1])
        loss_list_0.append(crossentropy_0)
        loss_list_1.append(crossentropy_1)
        print(checksum, np.mean(loss_list_0), np.mean(loss_list_1))
Does anyone know how I can get tensorflow to compute the gradients for this? Thank you.
In this case your weights aren't variables, they are computed tensors based on the hypernetwork. All you really have is one network during training. If I understand you correctly you are then proposing to discard the hypernetwork and be able to use just the main network to perform predictions.
If this is the case then you can either save the weight values manually and reload them as constants, or you could use tf.cond and tf.assign to assign them as you are doing during training, but use tf.cond to choose to use the variable or the computed tensor depending on whether you're doing training or inference.
During training you will need to use the computed tensor from the hypernetwork in order to enable backprop.
Example from the comments: w is the weight you'll use. You can assign it to a variable during training to keep track of it, and then use tf.cond to select either the variable (during inference) or the computed value from the hypernetwork (during training). In this example you need to pass in a boolean placeholder, is_training_placeholder, to indicate whether you're running training or inference.
tf.assign(w_variable, w_from_hypernetwork)
w = tf.cond(is_training_placeholder, true_fn=lambda: w_from_hypernetwork, false_fn=lambda: w_variable)
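The point about backprop can be checked on a toy problem with plain NumPy. In the sketch below (all names and shapes are made up for illustration, and it is not the TensorFlow graph from the question), a linear "hypernetwork" with parameters theta produces the weights W of a linear main network, and the main network's loss is differentiated with respect to theta by the chain rule. Because W is used directly as a function of theta, rather than being copied into a separate variable, the gradient is well defined and can be compared against a finite-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=4)           # input to the toy hypernetwork
x = rng.normal(size=3)           # input to the toy main network
target = rng.normal(size=2)      # regression target for the main network
theta = rng.normal(size=(6, 4))  # hypernetwork parameters (the only trainable ones)

def main_weights(theta):
    # W is computed from theta; it is not a free variable of its own
    return (theta @ z).reshape(2, 3)

def loss(theta):
    y = main_weights(theta) @ x
    return 0.5 * np.sum((y - target) ** 2)

def grad_theta(theta):
    # Chain rule: dL/dtheta = dL/dW (flattened) outer z
    y = main_weights(theta) @ x
    dL_dW = np.outer(y - target, x)        # shape (2, 3)
    return np.outer(dL_dW.reshape(-1), z)  # shape (6, 4)

# Central finite differences as an independent check of the analytic gradient
eps = 1e-6
numeric = np.zeros_like(theta)
for i in range(theta.shape[0]):
    for j in range(theta.shape[1]):
        step = np.zeros_like(theta)
        step[i, j] = eps
        numeric[i, j] = (loss(theta + step) - loss(theta - step)) / (2 * eps)

analytic = grad_theta(theta)
print(np.allclose(analytic, numeric, atol=1e-5))
```

In the TensorFlow graph the same idea would mean feeding the hypernetwork's output tensor into the convolution itself during training, as the answer describes.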
I have built an autoencoder using tf.layers.conv2d layers and would like to train it in phases. That is to train the outer layers first then the middle layers and then the inner. I understand this is possible using tf.nn.conv2d because the weights are declared using tf.get_variable but I would think this should also be possible using tf.layers.conv2d.
If I enter a new variable scope, different from that of the original graph, in order to change the inputs to the convolutional layers (i.e. to skip the inner layers during phase 1), I am not able to reuse the weights. If I do not enter a new variable scope, I am not able to freeze the weights that I don't want to train in this phase.
Basically I am trying to use the training method from Aurélien Géron here https://github.com/ageron/handson-ml/blob/master/15_autoencoders.ipynb
Except that I would like to use a CNN instead of dense layers. How can I do it?
No need to create the variables by hand. This works just as well:
import tensorflow as tf
inputs_1 = tf.placeholder(tf.float32, (None, 512, 512, 3), name='inputs_1')
inputs_2 = tf.placeholder(tf.float32, (None, 512, 512, 3), name='inputs_2')
with tf.variable_scope('conv'):
    out_1 = tf.layers.conv2d(inputs_1, 32, [3, 3], name='conv_1')
with tf.variable_scope('conv', reuse=True):
    out_2 = tf.layers.conv2d(inputs_2, 32, [3, 3], name='conv_1')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(tf.trainable_variables())
If you give tf.layers.conv2d the same name, it will use the same weights (assuming reuse=True, otherwise there will be a ValueError).
In TensorFlow 2.0, tf.layers was replaced by Keras layers, where variables are reused by calling the same layer object:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu',
                           input_shape=(512, 512, 3)),
])
@tf.function
def f1(x):
    return model(x)
@tf.function
def f2(x):
    return model(x)
Both f1 and f2 will use the layer with the same variables
I'd recommend setting it up a little bit differently. Instead of using tf.layers.conv2d, I would explicitly make the weights using calls to tf.get_variable() and then use these weights with calls to tf.nn.conv2d(). This way, you don't blackbox the variable creation, and can reference them easily. It's also a good way to learn exactly what's going on in your network, since you wrote the shapes for every set of weights by hand!
Sample (untested) code:
inputs = tf.placeholder(tf.float32, (batch_size, 512, 512, 3), name='inputs')
# 3 input and 3 output channels, so the same kernel can be applied to its own output
weights = tf.get_variable(name='weights', shape=[5, 5, 3, 3], dtype=tf.float32)
with tf.variable_scope("convs"):
    hidden_layer_1 = tf.nn.conv2d(input=inputs, filter=weights, strides=[1, 1, 1, 1], padding="SAME")
with tf.variable_scope("convs", reuse=True):
    hidden_layer_2 = tf.nn.conv2d(input=hidden_layer_1, filter=weights, strides=[1, 1, 1, 1], padding="SAME")
This creates convolutional weights and applies it twice to your input. I haven't tested this code, so there may be bugs, but it's about how it should look. References here for variable sharing and here for tf.nn.conv2d.
Hopefully that helps! I would be more thorough, but I have no idea what your code looks like.
I read from here that it is recommended to always use tf.get_variable(...) although this seems a bit troublesome when I'm trying to implement a network.
For example:
def create_weights(shape, name = 'weights',\
                   initializer = tf.random_normal_initializer(0, 0.1)):
    weights = tf.get_variable(name, shape, initializer = initializer)
    print("weights created named: {}".format(weights.name))
    return(weights)
def LeNet(in_units, keep_prob):
    # define the network
    with tf.variable_scope("conv1"):
        conv1 = conv(in_units, create_weights([5, 5, 3, 32]), create_bias([32]))
        pool1 = maxpool(conv1)
    with tf.variable_scope("conv2"):
        conv2 = conv(pool1, create_weights([5, 5, 32, 64]), create_bias([64]))
        pool2 = maxpool(conv2)
    # reshape the network to feed it into the fully connected layers
    with tf.variable_scope("flatten"):
        flatten = tf.reshape(pool2, [-1, 1600])
        flatten = dropout(flatten, keep_prob)
    with tf.variable_scope("fc1"):
        fc1 = fc(flatten, create_weights([1600, 120]), biases = create_bias([120]))
        fc1 = dropout(fc1, keep_prob)
    with tf.variable_scope("fc2"):
        fc2 = fc(fc1, create_weights([120, 84]), biases = create_bias([84]))
    with tf.variable_scope("logits"):
        logits = fc(fc2, create_weights([84, 43]), biases = create_bias([43]))
    return(logits)
I have to use with tf.variable_scope(...) every single time I call create_weights. Furthermore, if I wanted to change the shape of the conv1 weights to [7, 7, 3, 32] instead of [5, 5, 3, 32], I would have to restart the kernel, because the variable already exists. On the other hand, if I use tf.Variable(...), I don't have any of these problems.
Am I using tf.variable_scope(...) incorrectly?
It seems that you cannot change what already exists in a variable scope, so you can only change a variable that you defined before by restarting the kernel. (In fact you create a new one, because the previous one has been deleted.)
...
That is only my guess. I would appreciate it if someone could give a detailed answer.