How to create a tf.layers.Dense object

I want to create a dense layer in TensorFlow. I tried tf.layers.dense(input_placeholder, units), which creates the layer and applies it to the input in a single call, but what I want is just a "layer module", i.e. an object of the class tf.layers.Dense(units). I want to first declare these modules/layers in a class, and then have several member functions apply1(x, y), apply2(x, y) that use these layers.
But when I called tf.layers.Dense(units), TensorFlow returned:
layer = tf.layers.Dense(100)
AttributeError: 'module' object has no attribute 'Dense'
But if I do tf.layers.dense(x, units), there's no problem.
Any help is appreciated, thanks.

tf.layers.Dense constructs a callable layer object that you later apply to your input; the layer defines its variables when it is first called.
func = tf.layers.Dense(out_dim)
out = func(inputs)
tf.layers.dense both defines the variables and applies the dense layer to your input to compute the output, in a single call.
out = tf.layers.dense(inputs, out_dim)
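For the use case in the question (declare layers once in a class, then apply them in several member functions), a minimal sketch along these lines should work in TF >= 1.4; the class and method names are illustrative:

import tensorflow as tf

class MyModel(object):
    def __init__(self, units):
        # Variables are created the first time the layer is called.
        self.dense = tf.layers.Dense(units)

    def apply1(self, x, y):
        return self.dense(x) + y   # reuses the layer's weights

    def apply2(self, x, y):
        return self.dense(x) * y   # same weights again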

Try to avoid using placeholders; having to feed_dict them into the tf.Session is probably what is causing this issue.
Try using the new Estimator API to load the data and then use dense layers, as is done in TensorFlow's GitHub examples: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/layers/cnn_mnist.py
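If it helps, here is a minimal sketch of feeding numpy data through an input function instead of placeholders (tf.estimator.inputs.numpy_input_fn is the TF 1.x entry point; the data below is random filler):

import numpy as np
import tensorflow as tf

train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.random.rand(100, 10).astype(np.float32)},
    y=np.random.randint(0, 2, size=100).astype(np.int32),
    batch_size=32,
    num_epochs=None,   # loop over the data indefinitely
    shuffle=True)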

tf.layers.Dense was not exported in TensorFlow before version 1.4. You probably have version 1.3 or earlier installed. (You can check the version with python -c 'import tensorflow as tf; print(tf.__version__)'.)
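If upgrading is not an option right away, a hedged fallback sketch that works on both sides of the 1.4 boundary:

import tensorflow as tf

inputs = tf.zeros([1, 10])           # dummy input, just for illustration
if hasattr(tf.layers, "Dense"):      # TF >= 1.4: reusable layer object
    layer = tf.layers.Dense(100)
    out = layer(inputs)
else:                                # older TF: only the functional form exists
    out = tf.layers.dense(inputs, 100)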


tensor conversion function numpy() doesn't work within tf.estimator model function

I have tried this with both TensorFlow v2.0 and v1.12.0 (with tf.enable_eager_execution()). Apparently, if I call numpy() with the code snippet shown below in my main() function, it works perfectly. However, if I use it in my Estimator model function, i.e. model_fn(features, labels, mode, params), it complains that 'Tensor' object has no attribute 'numpy'.
ndarray = np.ones([3, 3])
tensor = tf.multiply(ndarray, 42)
print(tensor)
print(tensor.numpy())
Has anyone else experienced a similar problem? This seems like a big issue for tf.estimator, no?
It won't work. The Estimator API is tied to graph construction and does not fully support eager execution. Per the official documentation:
Calling methods of Estimator will work while eager execution is
enabled. However, the model_fn and input_fn is not executed eagerly
https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator
TF 2.0 won't even support custom estimators, only premade ones.
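As a workaround, keep the computation in graph ops inside model_fn and let the Estimator's session evaluate it. A minimal predict-mode sketch, reusing the "x" feature name and the constant 42 from the snippet above (tf.Print is the TF 1.x pass-through logging op):

import tensorflow as tf

def model_fn(features, labels, mode, params):
    # Tensors here are graph tensors, so .numpy() is unavailable.
    tensor = tf.multiply(features["x"], 42.0)
    # tf.Print passes the value through and logs it when the graph actually runs.
    tensor = tf.Print(tensor, [tensor], message="tensor: ")
    # Predict-mode sketch only; train/eval would also need loss and train_op.
    return tf.estimator.EstimatorSpec(mode=mode, predictions={"scaled": tensor})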

How to fine-tune using a pre-trained model in tf.estimator

I have a model converted from Caffe using the MMdnn tool, which turned the Caffe model into a TensorFlow-style saved_model. It's a ResNet-18 model, and I just stripped out several of the last layers. I wish I could load this architecture in the model_fn of a tf.estimator and manually add some extra layers to do my job.
The tutorial recommends using the loader.load method to load the saved_model. But I want to use it in an Estimator, and I need to define the architecture in the model_fn function. I searched SO and GitHub, but there isn't a very specific workflow for doing this. Could somebody help me out?
Here is one way of fine-tuning using tf.estimator:
1. Define your model using the SAME variable names/scopes as in your saved model.
2. Use tf.estimator's warm-start functionality to initialize your new model with the saved weights. Here is a code snippet:
if fine_tuning:
    # Initialize all variables from the saved checkpoint
    ws = tf.estimator.WarmStartSettings(
        ckpt_to_initialize_from=path_saved_model,
        vars_to_warm_start='.*')
else:
    ws = None

estimator = tf.estimator.Estimator(model_fn=model_function,
                                   warm_start_from=ws,
                                   ...
                                   )
This will initialize any variables that share names between your currently defined graph and the saved model.
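If you only want to warm-start the reused backbone and leave your newly added layers randomly initialized, vars_to_warm_start also accepts a regex. A sketch, assuming the new layers live under a scope named new_head (that name is hypothetical):

# Warm-start everything except variables under the hypothetical "new_head" scope:
ws = tf.estimator.WarmStartSettings(
    ckpt_to_initialize_from=path_saved_model,
    vars_to_warm_start="^(?!new_head).*")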

Keras + TensorFlow model conversion to Core ML exits with NameError: global name ... is not defined

I've adapted the VAE example from the keras site to train on my data, and everything runs fine. But I'm unable to convert to coreml. The error is:
NameError: global name 'batch_size' is not defined
Since batch_size clearly is defined in the python source, I'm guessing it has to do with how the conversion tool captures variable names. Does anyone know how I can fix it (or whether it is, indeed, possible to fix)?
Many thanks,
J.
I ran into a similar message when using parameters to construct the neural net. This should work:
from keras import models

batch_size = 50
# Tell load_model how to resolve the name that failed during deserialization:
model = models.load_model(filename, custom_objects={'batch_size': batch_size})
See also documentation: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
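A sketch of the full round trip, assuming the TF 1.x-era coremltools Keras converter (the exact converter entry point depends on your coremltools version, and the filenames are placeholders):

import coremltools
from keras import models

batch_size = 50
model = models.load_model("vae.h5", custom_objects={"batch_size": batch_size})
coreml_model = coremltools.converters.keras.convert(model)
coreml_model.save("vae.mlmodel")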

Tensorflow: How can I assign numpy pre-trained weights to subsections of graph?

This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...
I then load the network into my default session sess as "net" using:
images = tf.placeholder(tf.float32, [1, 224, 224, 3])
net = VGGNet_xavier({'data': images, 'label': 1})

with tf.Session() as sess:
    net.load("vgg16.npy", sess)
After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
Now suppose that I make another graph that has as its first layer "h_conv1_b":
W_conv1_b = weight_variable([3,3,3,64])
b_conv1_b = bias_variable([64])
h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
My question is: how do you assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b? (Both are now tensors.)
I suggest you have a detailed look at network.py from https://github.com/ethereon/caffe-tensorflow, especially the load() function. It will help you understand what happens when you call net.load(weight_path, session).
FYI, a TensorFlow variable can be assigned from a numpy array using var.assign(np_array), executed within a session. Here is a solution to your question:
with tf.Session() as sess:
    W_conv1_b = weight_variable([3, 3, 3, 64])
    sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
    b_conv1_b = bias_variable([64])
    sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
I would like to point out the following:
1. var.assign(data), where 'data' is a numpy array and 'var' is a TensorFlow variable, should be executed in the same session in which you want to continue running your network, whether for inference or training.
2. By default, 'var' must be created with the same shape as 'data'. So if you can obtain 'data' before creating 'var', create the variable with an initial value of that shape, e.g. var = tf.Variable(tf.zeros(data.shape)). Otherwise, create it with validate_shape=False, which leaves the variable's shape flexible; see the TensorFlow API docs for details, and the placeholder-based sketch below.
3. I extended the same caffe-tensorflow repo to support Theano, so that I can load a model transformed from Caffe into Theano. I am therefore reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.
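On point 2, a common way to satisfy the shape constraint without embedding a large constant in the graph is to assign through a placeholder. A minimal sketch (all names are illustrative):

import numpy as np
import tensorflow as tf

data = np.ones([3, 3, 3, 64], dtype=np.float32)      # stand-in for real weights
var = tf.Variable(tf.zeros(data.shape))              # shape fixed up front
ph = tf.placeholder(tf.float32, shape=data.shape)    # feed the numpy array here
assign_op = var.assign(ph)

with tf.Session() as sess:
    sess.run(assign_op, feed_dict={ph: data})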
You can get the variable values using the eval method of the tf.Variables of the first network, and load those values into the variables of the second network using the load method (also a method of tf.Variable).
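A minimal sketch of that route, inside a single session; v1 and v2 are hypothetical variables of matching shape in the two networks:

import tensorflow as tf

v1 = tf.Variable(tf.random_normal([3, 3]))   # stands in for a variable of network 1
v2 = tf.Variable(tf.zeros([3, 3]))           # the matching variable of network 2

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    value = v1.eval(session=sess)            # pull the numpy value out of network 1
    v2.load(value, session=sess)             # push it into network 2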

TensorFlow attention_decoder with RNNCell (state_is_tuple=True)

I want to build a seq2seq model with an attention_decoder, and use MultiRNNCell with LSTMCell as the encoder. Because the TensorFlow code warns that "This default behaviour (state_is_tuple=False) will soon be deprecated.", I set state_is_tuple=True for the encoder.
The problem is that, when I pass the state of encoder to attention_decoder, it reports an error:
*** AttributeError: 'LSTMStateTuple' object has no attribute 'get_shape'
This problem seems to be related to the attention() function in seq2seq.py and the _linear() function in rnn_cell.py, where the code calls get_shape() on the LSTMStateTuple object from the initial_state generated by the encoder.
Although the error disappears when I set state_is_tuple=False for the encoder, the program gives the following warning:
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.LSTMCell object at 0x11763dc50>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
I would really appreciate it if someone could give any guidance on building seq2seq with RNNCell (state_is_tuple=True).
I ran into this issue too; the LSTM states need to be concatenated or else _linear will complain. The shape of the LSTMStateTuple depends on the kind of cell you're using. With an LSTM cell, you can concatenate the states like this:
query = tf.concat(1,[state[0], state[1]])
If you're using a MultiRNNCell, concatenate the states for each layer first:
concat_layers = [tf.concat(1,[c,h]) for c,h in state]
query = tf.concat(1, concat_layers)
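Putting it together, a sketch of getting the tuple state from a two-layer LSTM encoder and flattening it for the legacy attention code (written against the TF 0.x-era API used in this thread, where tf.concat takes the axis first; all sizes are illustrative):

import tensorflow as tf

batch, input_dim, steps, hidden = 32, 50, 10, 128
# tf.nn.rnn expects a list of [batch, input_dim] tensors, one per time step:
encoder_inputs = [tf.placeholder(tf.float32, [batch, input_dim])
                  for _ in range(steps)]
cells = [tf.nn.rnn_cell.LSTMCell(hidden, state_is_tuple=True) for _ in range(2)]
encoder_cell = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=True)
_, state = tf.nn.rnn(encoder_cell, encoder_inputs, dtype=tf.float32)

# Flatten the per-layer LSTMStateTuples into the single tensor that the
# legacy attention_decoder expects as its initial state:
concat_layers = [tf.concat(1, [c, h]) for c, h in state]
query = tf.concat(1, concat_layers)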