I usually structure my learnable parameters in the following way in TensorFlow:
learnable_weights = {
    'w1': tf.get_variable(...),
    ...
    'wn': tf.get_variable(...),
}
learnable_biases = {
    'bc1': tf.get_variable(...),
    ...
    'bd3': tf.get_variable(...)
}
The problem I have recently started to encounter is a congested TensorBoard graph, where many of the weights show up as auxiliary nodes (this is part of a big graph, and the number of these nodes is much larger):
I tried to group them with tf.name_scope. Something like this:
with tf.name_scope('learnable_params'):
    learnable_weights = {...}
    learnable_biases = {...}
But this has no effect on the graph in TensorBoard.
Any idea why, or better yet, any suggestions on how to group the learnable parameters so that they don't clutter the TensorBoard graph?
You could try using variable_scope instead of name_scope. AFAIK variables created via get_variable ignore name_scope, and I wouldn't be surprised if this applies to the graph organization in TensorBoard as well. I only ever wrap variable-creating code in variable_scope, and I've never had these issues with "unorganized" variables.
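As a rough sketch of what that would look like (the variable names and shapes here are made up for illustration), wrapping the get_variable calls in a variable_scope should make them appear nested under a single 'learnable_params' node in the TensorBoard graph:

# Sketch: group variable creation under one variable_scope so TensorBoard
# nests all of these parameters under 'learnable_params'.
with tf.variable_scope('learnable_params'):
    learnable_weights = {
        'w1': tf.get_variable('w1', shape=[784, 256]),  # illustrative shape
    }
    learnable_biases = {
        'b1': tf.get_variable('b1', shape=[256]),       # illustrative shape
    }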
I successfully managed to implement learning to rank by following the tutorial TF-Ranking for sparse features using the ANTIQUE question answering dataset.
Now my goal is to save the learned model to disk so that I can easily load it without training again. According to the TensorFlow docs, the estimator.export_saved_model() method seems to be the way to go. But I can't wrap my head around how to tell TensorFlow what my feature structure looks like. According to the docs here, the easiest way seems to be calling tf.estimator.export.build_parsing_serving_input_receiver_fn(), which returns the required input receiver function that I have to pass to the export_saved_model function. But how do I tell TensorFlow what the features of my learning-to-rank model look like?
From my current understanding, the model has context feature specs and example feature specs. So I guess I somehow have to combine those two specs into one feature description, which I can then pass to the build_parsing_serving_input_receiver_fn function?
So I think you are on the right track.
You can get a build_ranking_serving_input_receiver_fn like this (substitute context_feature_columns(...) and example_feature_columns(...) with the functions you probably already have for creating your own context and example structures for your training data):
def example_serving_input_fn():
    context_feature_spec = tf.feature_column.make_parse_example_spec(
        context_feature_columns(_VOCAB_PATHS).values())
    example_feature_spec = tf.feature_column.make_parse_example_spec(
        list(example_feature_columns(_VOCAB_PATHS).values()))
    servingInputReceiver = tfr.data.build_ranking_serving_input_receiver_fn(
        data_format=tfr.data.ELWC,
        context_feature_spec=context_feature_spec,
        example_feature_spec=example_feature_spec,
        list_size=_LIST_SIZE,
        receiver_name="input_ranking_data",
        default_batch_size=None)
    return servingInputReceiver
And then pass this to export_saved_model like this:
ranker.export_saved_model('path_to_save_model', example_serving_input_fn())
(ranker here is a tf.estimator.Estimator, maybe you called this 'estimator' in your code)
ranker = tf.estimator.Estimator(
    model_fn=model_fn,
    model_dir=_MODEL_DIR,
    config=run_config)
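In case it is useful, here is a rough sketch of loading the exported model back for inference with the TF 2.x API. The path (export_saved_model writes into a timestamped subdirectory) and the 'serving_default' signature name are assumptions, which is why the available signatures are printed first:

import tensorflow as tf

# Sketch only: point this at the timestamped directory created by export_saved_model.
loaded = tf.saved_model.load('path_to_save_model/<timestamp>')
print(list(loaded.signatures.keys()))        # inspect which signatures were exported
predict_fn = loaded.signatures['serving_default']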
I'm trying to set up a DNN for classification and at one point I want to take the tensor product of a vector with itself. I'm using the Keras functional API at the moment but it isn't immediately clear that there is a layer that does this already.
I've been attempting to use a Lambda layer and numpy in order to try this, but it's not working.
Doing a bit of googling reveals tf.linalg.LinearOperatorKronecker, which does not seem to work either.
Here's what I've tried:
I have a layer called part_layer whose output is a single vector (rank one tensor).
keras.layers.Lambda(lambda x_array: np.outer(x_array, x_array))(part_layer)
Ideally I would want this to take a vector of the form [1,2] and give me [[1,2],[2,4]].
But the error I'm getting suggests that the np.outer function is not recognizing its arguments:
AttributeError: 'numpy.ndarray' object has no attribute '_keras_history'
Any ideas on what to try next, or if there is a simple function to use?
You can use one of two operations:
If you want to take the batch size into account, you can use the Dot layer.
Otherwise, you can use the dot function.
In both cases the code should look like this:
dot_lambda = lambda x_array: tf.keras.layers.dot([x_array, x_array], axes=1)
# dot_lambda = lambda x_array: tf.keras.layers.Dot(axes=1)([x_array, x_array])
keras.layers.Lambda(dot_lambda)(part_layer)
Hope this helps.
Use tf.tensordot(x_array, x_array, axes=0) to achieve what you want. For example, the expression print(tf.tensordot([1,2], [1,2], axes=0)) gives the desired result: [[1,2],[2,4]].
Keras/TensorFlow needs to keep a history of the operations applied to tensors in order to perform the optimization. NumPy has no notion of this history, so using it in the middle of a layer is not allowed. tf.tensordot performs the same operation but keeps the history.
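As a rough sketch (assuming part_layer has shape (batch, n), which is not stated in the question), the outer product can be done inside a Lambda layer with TF ops so the history is preserved. tf.einsum is used here instead of tf.tensordot because it handles the batch dimension explicitly, whereas tensordot with axes=0 would also mix different samples of the batch:

import tensorflow as tf
from tensorflow import keras

# Sketch: batched outer product inside a Lambda layer.
# For each sample x of shape (n,), this produces the (n, n) matrix x[i] * x[j].
outer_layer = keras.layers.Lambda(
    lambda x: tf.einsum('bi,bj->bij', x, x))(part_layer)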
I think many other people like me might be interested in how they can use GPflow for their particular problems. The key question is how customizable GPflow is, and a good example would be very helpful.
In my case, I read and tried lots of comments in raised issues without any real success. Setting kernel model parameters is not straightforward (creating them with default values and then changing them via the delete-object method), and the transform method is vague.
It would be really helpful if you could add an example showing how one can initialize and set the bounds of an anisotropic kernel model (length-scale values and bounds, variances, ...), and especially how to add observation error (as an array-like alpha parameter).
If you just want to set a value, then you can do
model = gpflow.models.GPR(np.zeros((1, 1)),
                          np.zeros((1, 1)),
                          gpflow.kernels.RBF(1, lengthscales=0.2))
Alternatively
model = gpflow.models.GPR(np.zeros((1, 1)),
                          np.zeros((1, 1)),
                          gpflow.kernels.RBF(1))
model.kern.lengthscales = 0.2
If you want to change the transform, you either need to subclass the kernel, or you can also do
with gpflow.defer_build():
    model = gpflow.models.GPR(np.zeros((1, 1)),
                              np.zeros((1, 1)),
                              gpflow.kernels.RBF(1))
    transform = gpflow.transforms.Logistic(0.1, 1.)
    model.kern.lengthscales = gpflow.params.Parameter(0.3, transform=transform)
model.compile()
You need the defer_build to stop the graph being compiled before you've changed the transform. With the approach above, the compilation of the TensorFlow graph is delayed (until the explicit model.compile()), so the graph is built with the intended bounding transform.
Using an array parameter for the likelihood variance is outside the scope of GPflow. For what it's worth (and because it has been asked about before), that particular model is especially problematic, as it is not clear how test points are defined.
Setting kernel parameters can be done using the .assign() function, or through direct assignment. See the notebook https://github.com/GPflow/GPflow/blob/develop/doc/source/notebooks/understanding/tf_graphs_and_sessions.ipynb. You do not need to delete a parameter to assign a new value to it.
If you want to have per-datapoint noise, you will need to implement your own custom likelihood, which you can do by taking Gaussian likelihood in likelihoods.py as an example.
If by "bounds" you mean limiting the optimisation range for a parameter, you can use the Logistic transform. If you want to pass in a custom transformation for a parameter, you can pass a constructed Parameter object into constructors with a custom transform. Alternatively you can assign a newly created Parameter with a new transform to the model.
Here is more information on how to access and change GPflow parameters: see the viewing, getting and setting parameters documentation.
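A minimal sketch of the two assignment styles mentioned above (assuming the GPflow 1.x model from the earlier answer; the values are arbitrary):

# Both styles work on a compiled model; no need to delete the parameter first.
model.kern.lengthscales = 0.5        # direct assignment
model.kern.lengthscales.assign(1.0)  # explicit assignment via the Parameter API
print(model.kern.lengthscales)       # inspect the current value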
An extra bit for @user1018464's answer about replacing the transform on an existing parameter: changing the transform is a bit tricky, as you can't change it once the model has been compiled in TensorFlow.
E.g.
likelihood = gpflow.likelihoods.Gaussian()
likelihood.variance.transform = gpflow.transforms.Logistic(1., 10.)
----
GPflowError: Parameter "Gaussian/variance" has already been compiled.
Instead you have to reset the GPflow object:
likelihood = gpflow.likelihoods.Gaussian() # All tensors compiled
likelihood.clear()
likelihood.variance.transform = gpflow.transforms.Logistic(2, 5)
likelihood.variance = 2.5
likelihood.compile()
I am trying to implement a siamese network, similar to the image below (shown for representation).
In this, I have made a class SiameseNet which implements one CNN's output. What I am trying to do is create two instances of this class to make two different neural nets, with the compulsory condition that they both share the same weights. This is what I have tried so far, but I haven't reached a working solution due to some misconceptions about how I should vary the scopes and still manage weight sharing, or whatever else I am missing here.
class SiameseNet():
    def __init__(self, X):
        self.input_layer = X

    def model(self):
        with tf.variable_scope('layer1', reuse=True):
            layer1 = tf.layers.conv2d(inputs=self.input_layer, filters=8, kernel_size=[1, 1], padding='same', activation=tf.nn.relu)
            batch_layer1 = tf.layers.batch_normalization(inputs=layer1, axis=-1)
            dropout_layer1 = tf.layers.dropout(inputs=batch_layer1, rate=0.2)  # , training=mode == tf.estimator.ModeKeys.TRAIN)
        with tf.variable_scope('layer2', reuse=True):
            layer2 = tf.layers.conv2d(inputs=dropout_layer1, filters=8, kernel_size=[4, 4], padding='same', activation=tf.nn.relu)
            batch_layer2 = tf.layers.batch_normalization(inputs=layer2, axis=-1)
            dropout_layer2 = tf.layers.dropout(inputs=batch_layer2, rate=0.2)  # , training=mode == tf.estimator.ModeKeys.TRAIN)
        with tf.variable_scope('layer3', reuse=True):
            layer3 = tf.layers.conv2d(inputs=dropout_layer2, filters=16, kernel_size=[4, 4], padding='same', activation=tf.nn.relu)
            batch_layer3 = tf.layers.batch_normalization(inputs=layer3, axis=-1)
            dropout_layer3 = tf.layers.dropout(inputs=batch_layer3, rate=0.2)  # , training=mode == tf.estimator.ModeKeys.TRAIN)
        with tf.variable_scope('logits', reuse=True):
            flatten_layer3 = tf.layers.flatten(dropout_layer3)
            dense_layer4 = tf.layers.dense(inputs=flatten_layer3, units=1000, activation=tf.nn.relu)
            logits = tf.layers.dense(inputs=dense_layer4, units=500)
        return logits
Here is how I was intending to use it, to make two convnets with shared weights, with each receiving a different image as input:
with tf.Session() as sess:
    net1 = SiameseNet(x1).model()  # x1 = Image1
    net2 = SiameseNet(x3).model()  # x2 = Image2
    loss = Loss(2)
    optimiser = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
    for i in range(4):
        l = sess.run(loss.contrastive_loss(1, net1, net2))
        print(l)
This gives me the following error:
ValueError: Variable layer1/conv2d/kernel does not exist, or was not
created with tf.get_variable(). Did you mean to set
reuse=tf.AUTO_REUSE in VarScope?
What I'm seeking is some clarification on where I'm going wrong in terms of the correct usage of TensorFlow to make two neural nets from the single SiameseNet class, and on how to use variable scopes here so that both nets use the same weights. Also, if I don't use a variable scope at all, would that mean some duplication of the weights, or something else?
Thanks for your time.
When you set the reuse argument of tf.variable_scope to True, TensorFlow expects the variables (with the names that you provide) to already exist within the scope, and that is not the case when you define your first network. Instead you could set reuse=tf.AUTO_REUSE, and the variables will be created in case they don't exist.
If you don't use variable scope, two different networks with non-shared weights will be created. In case you want to avoid using a variable scope directly, there is also the option of setting reuse=tf.AUTO_REUSE in tf.layers.dense and tf.layers.conv2d.
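As a minimal sketch of the suggested fix (reusing the layer names and arguments from the question; only the reuse argument changes):

# Sketch: same blocks as in the question, but with reuse=tf.AUTO_REUSE so the
# first call creates the variables and the second call reuses them.
def model(self):
    with tf.variable_scope('layer1', reuse=tf.AUTO_REUSE):
        layer1 = tf.layers.conv2d(inputs=self.input_layer, filters=8,
                                  kernel_size=[1, 1], padding='same',
                                  activation=tf.nn.relu)
    # ... apply the same change to 'layer2', 'layer3' and 'logits' ...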
In TensorFlow, there's a class called GraphKeys. I have come across a lot of code where it is used, but what this class is for is not explained very well, either in the TensorFlow documentation or in the code where it appears.
Can someone please explain the usage of tf.GraphKeys?
Thank you!
As far as I know, tf.GraphKeys holds the standard names (keys) of the graph's collections of variables and ops. The usage (just as with common Python dictionaries) is to retrieve those variables and ops.
That said, here are some subsets of tf.GraphKeys I came across:
GLOBAL_VARIABLES and LOCAL_VARIABLES contain all variables of the graph, which need to be initialized before training. tf.global_variables() returns the global variables in a list and can be used with tf.variables_initializer for initialization.
Variables created with option trainable=True will be added to TRAINABLE_VARIABLES and will be fetched and updated by any optimizer under tf.train during training.
SUMMARIES contains keys for all summaries added by tf.summary (scalar, image, histogram, text, etc). tf.summary.merge_all gathers all such keys and returns an op to be run and written to file so that you can visualize them on tensorboard.
Custom functions to update some variables can be added to UPDATE_OPS and separately run at each iteration using sess.run(tf.get_collection(tf.GraphKeys.UPDATE_OPS)). In this case, these variables are set trainable=False to avoid being updated by gradient descent.
You may create your own collections using tf.add_to_collection(some_name, var_or_op) and retrieve the variables or ops later. You can retrieve the contents of a specific collection with tf.get_collection(), optionally filtering by scope.
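For illustration, a minimal sketch of the collection mechanism in TF 1.x (the custom collection name 'my_losses' and the variables are made up for this example):

import tensorflow as tf

w = tf.get_variable('w', shape=[10, 10])                # added to GLOBAL_VARIABLES and TRAINABLE_VARIABLES
b = tf.get_variable('b', shape=[10], trainable=False)   # added to GLOBAL_VARIABLES only

print(tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))  # contains w but not b

# A custom collection: add ops/variables under your own key and fetch them later.
tf.add_to_collection('my_losses', tf.reduce_sum(w))
losses = tf.get_collection('my_losses')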