Tensorflow vgg16 prediction dramatically slow - tensorflow

I use vgg.h5 model + Keras (Tensorflow backend on GPU) for real-time objects classification. It works good.
Then I try to use pure tensorflow graphs with weights from vgg.h5:
I've parsed vgg.h5 with h5py and have received weights for all layers in numpay.array format
I've built a graph (I store kernels and biases in tf.Variable)
But I can't receive output prediction vector. After investigation I've found out that all convolutional layers work, but the first full connection layer output (fc1 with 25088 x 4096 weights matrix) in vgg16 calculated for about 5 minutes. It's not appropriate for real time classification.
So, maybe anyone has experience of building vgg16 from scratch in tensorflow and can help? Why tensorflow as Keras backend works good, but pure tensorflow (with the same weights) can't calculate full connection output? Is there any additional optimisation in Keras for realising full connection (Dense) layers?

Here is a test variant of your code, instrumented with printing shapes of tensors in several places:
import tensorflow as tf
import numpy as np
with tf.Session() as sess:
# mock the previous layer's output with a placeholder
pool5_input = tf.placeholder(dtype = tf.float32, shape = (None,7,7,512))
# insert a print operation to print the shape
pool5 = tf.Print(pool5_input, [ tf.shape(pool5_input) ], "pool5 shape is ", summarize = 4)
layer_name = 'fc1'
wd = tf.Variable(np.ones((25088, 4096), dtype='float32'), trainable=False, name=layer_name+'_wd')
bd = tf.Variable(np.ones((4096,), dtype='float32'), trainable=False, name=layer_name+'_bd')
layer_shape = [-1, wd.get_shape().as_list()[0]]
print('layer_shape:', layer_shape)
fc1_flat = tf.reshape(pool5, shape=layer_shape)
fc1_flat = tf.Print(fc1_flat, [ tf.shape(fc1_flat) ], "fc1_flat shape is ")
fc1 = tf.nn.relu( tf.nn.bias_add( tf.matmul(fc1_flat, wd, name=layer_name), bd ) )
fc1 = tf.Print(fc1, [ tf.shape(fc1) ], "fc1 shape is ")
import time
sess.run(tf.global_variables_initializer())
# evaluate network for in input of (minibatch_size, 7, 7, 512)
minibatch_size = 32
start = time.time()
output = sess.run(fc1, feed_dict = { pool5_input: np.ones((minibatch_size, 7, 7, 512), dtype = 'float32')})
elapsed = time.time() - start
print("time to evaluate fully connected layer for minibatch size %d: %.3f seconds" % (minibatch_size, elapsed))
print("output shape is",output.shape)
I get the following output:
layer_shape: [-1, 25088]
...: I tensorflow/core/kernels/logging_ops.cc:79] pool5 shape is [32 7 7 512]
...: I tensorflow/core/kernels/logging_ops.cc:79] fc1_flat shape is [32 25088]
...: I tensorflow/core/kernels/logging_ops.cc:79] fc1 shape is [32 4096]
time to evaluate fully connected layer for minibatch size 32: 0.329 seconds
output shape is (32, 4096)
so for me it takes less than a second (on a GPU) for a minibatch size of 32.
You could insert similar tf.Print() statements into your code and verify that you have the same (or similar) dimensions. By multiplying the sizes of the dimensions you can see how much memory is used at each stage.

Related

Tensorflow input shape incompatible with layer

I'm trying to build a Sequential model with tensorflow.
import tensorflow as tf
import keras
from tensorflow.keras import layers
from keras import optimizers
import numpy as np
model = keras.Sequential (name="model")
model.add(keras.Input(shape=(786,)))
model.add(layers.Dense(2048, activation="relu", name="layer1"))
model.add(layers.Dense(786, activation="relu", name="layer2"))
model.add(layers.Dense(786, activation="relu", name="layer3"))
output = model.add(layers.Dense(786, activation="relu", name="output"))
model.summary()
model.compile(
optimizer=tf.optimizers.Adam(), # Optimizer
loss=keras.losses.CategoricalCrossentropy(),
metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
history = model.fit(
x_train,
y_train,
batch_size=1,
epochs=5,
)
The input shape is a vector with length of 768 (so the input shape is (768,) right?), representing a chess board:
def get_dataset():
container = np.load('/content/drive/MyDrive/test_data_vector.npz')
b, v = container['arr_0'], container['arr_1']
v = np.asarray(v / abs(v).max() / 2 + 0.5, dtype=np.float32) # normalization (0 - 1)
return b, v
xtrain, ytrain = get_dataset()
print(xtrain.shape)
print(ytrain.shape)
>> (37, 786) #there are 37 samples
>> (37, 786)
But I always get the error:
ValueError: Input 0 of layer model is incompatible with the layer: expected axis -1 of input shape to have value 786 but received input with shape (1, 1, 768)
I tried with np.expand_dims(), which ended in the same Error.
The error is just a typo, as the user mentioned the issue is resolved by changing the output shape from 786 to 768 and the issue is resolved.
One suggestion based on the model structure.
The number of units are not related to your input shape, you don't have to match that number.
The number of units like 2048 and 786 in dense layer is too large and this may not help the model to learn better.
Try with smaller numbers like 32,64 etc, you can refer some of the examples in the tensorflow document.

Keras Input Layer Shape On Input Layer Error

I am trying to learn ai algorithms by building. I found a question on Stackoverflow which is here.
I copied this code to try it out, and then modified it to this.
import numpy as np
import tensorflow as tf
from tensorflow import keras as keras
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from tensorflow.python.keras import activations
# Importing the dataset
dataset = np.genfromtxt("data.txt", delimiter='')
X = dataset[:, :-1]
y = dataset[:, -1]
# Splitting the dataset into the Training set and Test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.08, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Initialising the ANN
#model = Sequential()
# Adding the input layer and the first hidden layer
#model.add(Dense(32, activation = 'relu', input_dim = 6))
# Adding the second hidden layer
#model.add(Dense(units = 32, activation = 'relu'))
# Adding the third hidden layer
#model.add(Dense(units = 32, activation = 'relu'))
# Adding the output layer
#model.add(Dense(units = 1))
#model = Sequential([
# keras.Input(shape= (6),name= "digits"),
# Dense(units = 32, activation = "relu"),
# Dense(units = 32, activation = "relu"),
# Dense(units = 1 , name = "predict")##
#])
#
input = keras.Input(shape= (6),name= "digits")
#x0 = Dense(units = 6)(input)
x1 = Dense(units = 32, activation = "relu")(input)
x2 = Dense(units = 32, activation = "relu")(x1)
output = Dense(units = 1 , name = "predict")(x2)
model = keras.Model(inputs = input , outputs= output)
#model.add(Dense(1))
# Compiling the ANN
#model.compile(optimizer = 'adam', loss = 'mean_squared_error')
# Fitting the ANN to the Training set
#model.fit(X_train, y_train, batch_size = 10, epochs = 200)
optimizer = keras.optimizers.Adam(learning_rate=1e-3)
loss = keras.losses.MeanSquaredError()
epochs = 200
for epoch in range(epochs):
print("\nStart of epoch %d" % (epoch,))
# Iterate over the batches of the dataset.
for step in range(len(X_train)):
# Open a GradientTape to record the operations run
# during the forward pass, which enables auto-differentiation.
with tf.GradientTape() as tape:
# Run the forward pass of the layer.
# The operations that the layer applies
# to its inputs are going to be recorded
# on the GradientTape.
logits = model( X_train[step] , training=True) # Logits for this minibatch
# Compute the loss value for this minibatch.
loss_value = loss(y_train[step], logits)
# Use the gradient tape to automatically retrieve
# the gradients of the trainable variables with respect to the loss.
grads = tape.gradient(loss_value, model.trainable_weights)
# Run one step of gradient descent by updating
# the value of the variables to minimize the loss.
optimizer.apply_gradients(zip(grads, model.trainable_weights))
# Log every 200 batches.
if step % 200 == 0:
print(
"Training loss (for one batch) at step %d: %.4f"
% (step, float(loss_value))
)
print("Seen so far: %s samples" % ((step + 1) * 64))
y_pred = model.predict(X_test)
plt.plot(y_test, color = 'red', label = 'Real data')
plt.plot(y_pred, color = 'blue', label = 'Predicted data')
plt.title('Prediction')
plt.legend()
plt.show()
I modified the code for creating data when processing. If I use model.fit, it uses data I have given but I wanted to when epochs start to create data from a simulation and then process it.(sorry for bad english. if i couldn't explain very well)
When I start code in line 81:
Exception has occurred: ValueError
Input 0 of layer dense is incompatible with the layer: : expected min_ndim=2, found ndim=1. Full shape received: (6,)
It gives an Exception. I tried to use shape=(6,) shape=(6,1) or similar to this but it doesn't fix anything.
You need to add a batch dimension when calling the keras model:
logits = model( X_train[step][np.newaxis,:] , training=True) # Logits for this minibatch
A batch dimension is used to feed multiple samples to the network. By default, Keras assumes that the input has a batch dimension. To feed one sample, Keras expects a batch of 1 sample. In that case, it means a shape of (1,6). If you want to feed a batch of 2 samples, then the shape will be (2,6), etc.

Implementing Variational Auto Encoder using Tensoroflow_Probability NOT on MNIST data set

I know there are many questions related to Variational Auto Encoders. However, this question in two aspects differs from the existing ones: 1) it is implemented using Tensforflow V2 and Tensorflow_probability; 2) It does not use MNIST or any other image data set.
As about the problem itself:
I am trying to implement VAE using Tensorflow_probability and Keras. and I want to train and evaluate it on some synthetic data sets --as part of my research. I provided the code below.
Although the implementation is done and during the training, the loss value decreases but once I want to evaluate the trained model on my test set I face different errors.
I am somehow confident that the issue is related to input/output shape but unfortunately I did not manage the solve it.
Here is the code:
import numpy as np
import tensorflow as tf
import tensorflow.keras as tfk
import tensorflow_probability as tfp
from tensorflow.keras import layers as tfkl
from sklearn.datasets import make_classification
from tensorflow_probability import layers as tfpl
from sklearn.model_selection import train_test_split
tfd = tfp.distributions
n_epochs = 5
n_features = 2
latent_dim = 1
n_units = 4
learning_rate = 1e-3
n_samples = 400
batch_size = 32
# Generate synthetic data / load data sets
x_in, y_in = make_classification(n_samples=n_samples, n_features=n_features, n_informative=2, n_redundant=0,
n_repeated=0, n_classes=2, n_clusters_per_class=2, weights=[0.5, 0.5],
flip_y=0.01, class_sep=1.0, hypercube=True,
shift=0.0, scale=1.0, shuffle=False, random_state=42)
x_in = x_in.astype('float32')
y_in = y_in.astype('float32') # .reshape(-1, 1)
x_train, x_test, y_train, y_test = train_test_split(x_in, y_in, test_size=0.4, random_state=42, shuffle=True)
x_test, x_val, y_test, y_val = train_test_split(x_test, y_test, test_size=0.5, random_state=42, shuffle=True)
print("shapes:", x_train.shape, y_train.shape, x_test.shape, y_test.shape, x_val.shape, y_val.shape)
prior = tfd.Independent(tfd.Normal(loc=[tf.zeros(latent_dim)], scale=1.), reinterpreted_batch_ndims=1)
train_dataset = tf.data.Dataset.from_tensor_slices(x_train).batch(batch_size)
valid_dataset = tf.data.Dataset.from_tensor_slices(x_val).batch(batch_size)
test_dataset = tf.data.Dataset.from_tensor_slices(x_test).batch(batch_size)
encoder = tf.keras.Sequential([
tfkl.InputLayer(input_shape=[n_features, ], name='enc_input'),
tfkl.Lambda(lambda x: tf.cast(x, tf.float32)), # - 0.5
tfkl.Dense(n_units, activation='relu', name='enc_dense1'),
tfkl.Dense(int(n_units / 2), activation='relu', name='enc_dense2'),
tfkl.Dense(tfpl.MultivariateNormalTriL.params_size(latent_dim),
activation=None, name='mvn_triL1'),
tfpl.MultivariateNormalTriL(
# weight >> num_train_samples or some thing except 1 to convert VAE to beta-VAE
latent_dim, activity_regularizer=tfpl.KLDivergenceRegularizer(prior, weight=1.), name='bottleneck'),
])
decoder = tf.keras.Sequential([
tfkl.InputLayer(input_shape=latent_dim, name='dec_input'),
# tfkl.Dense(n_units, activation='relu', name='dec_dense1'),
# tfkl.Dense(int(n_units * 2), activation='relu', name='dec_dense2'),
tfpl.IndependentBernoulli([n_features], tfd.Bernoulli.logits, name='dec_output'),
])
vae = tfk.Model(inputs=encoder.inputs, outputs=decoder(encoder.outputs), name='VAE')
print("enoder:", encoder)
print(" ")
print("encoder.inputs:", encoder.inputs)
print(" ")
print(" encoder.outputs:", encoder.outputs)
print(" ")
print("decoder:", decoder)
print(" ")
print("decoder:", decoder.inputs)
print(" ")
print("decoder.outputs:", decoder.outputs)
print(" ")
# negative log likelihood i.e the E_{S(eps)} [p(x|z)];
# because the KL term was added in the last layer of the encoder, i.e., via activity_regularizer.
# this loss function takes two arguments, namely the original data points x, and the output of the model,
# which we call it rv_x (because it is a random variable)
negloglik = lambda x, rv_x: -rv_x.log_prob(x)
vae.compile(optimizer=tf.optimizers.Adam(learning_rate=learning_rate),
loss=negloglik,)
vae.summary()
history = vae.fit(train_dataset, epochs=n_epochs, validation_data=valid_dataset,)
print("x.shape:", x_test.shape)
x_hat = vae(x_test)
print("original:")
print(x_test)
print(" ")
print("Decoded Random Samples:")
print(x_hat.sample())
print(" ")
print("Decoded Means:")
print(x_hat.mean())
The Questions:
With the above code I receive the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 80 values, but the requested shape has 160 [Op:Reshape]
As far I know we can add as many layers as I want in the decoder model before its output layer --as it is done a convolutional VAEs, am I right?
If I uncomment the following two lines of code in decoder:
# tfkl.Dense(n_units, activation='relu', name='dec_dense1'),
# tfkl.Dense(int(n_units * 2), activation='relu', name='dec_dense2'),
I see the following warnings and the upcoming error:
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
And the error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 640 values, but the requested shape has 160 [Op:Reshape]
Now the question is why the decoder layers are not used during the training as it is mentioned in the warning.
PS, I also tried to pass the x_train, x_valid, x_test directly during the training and evaluation process but it does not help.
Any helps would be indeed appreciated.

Generating Input for LSTM from universal sentence encoder output

I am working on a multi-class classification problem using LSTM and embeddings obtained from Universal sentence encoder.
Previously I was using Glove embeddings, and I get the required input shape for LSTM (batch_size, timesteps, input_dim). I am planning to use the Universal sentence encoder found that the output of Universal Sentence Encoder is 2d [batch, feature]. How can I make the required changes.
LSTM + Universal sentence encoder
EMBED_SIZE = 512
module_url = "https://tfhub.dev/google/universal-sentence-encoder-large/3"
embed = hub.Module(module_url)
def UniversalEmbedding(x):
return embed(tf.squeeze(tf.cast(x, tf.string)),
signature="default", as_dict=True)["default"]
seq_input = Input(shape=(MAX_SEQUENCE_LENGTH,),dtype='int32')
print("seq i",seq_input.shape,seq_input)
embedded_seq = Lambda(UniversalEmbedding,
output_shape=(EMBED_SIZE,))(seq_input)
print("EMD SEQ",embedding.shape,type(embedded_seq))
# (timesteps, n_features) (,MAX_SEQUENCE_LENGTH, EMBED_SIZE) (,150,512)
x_1 = LSTM(units=NUM_LSTM_UNITS,
name='blstm_1',
dropout=DROP_RATE_LSTM)(embedded_seq)
print(x_1)
This produces following error
seq i (?, 150) Tensor("input_8:0", shape=(?, 150), dtype=int32)
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
I0529 07:24:32.504808 140127577749376 saver.py:1483] Saver not created because there are no variables in the graph to restore
EMD SEQ (?, 512) <class 'tensorflow.python.framework.ops.Tensor'>
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-34-ea634319205b> in <module>()
12 x_1 = LSTM(units=NUM_LSTM_UNITS,
13 name='blstm_1',
---> 14 dropout=DROP_RATE_LSTM)(embedded_seq)
15 print(x_1)
16
2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py in assert_input_compatibility(self, inputs)
309 self.name + ': expected ndim=' +
310 str(spec.ndim) + ', found ndim=' +
--> 311 str(K.ndim(x)))
312 if spec.max_ndim is not None:
313 ndim = K.ndim(x)
ValueError: Input 0 is incompatible with layer blstm_1: expected ndim=3, found ndim=2
LSTM + Glove embeddings
embedding_layer = Embedding(nb_words,
EMBED_SIZE,
weights=[embedding_matrix],
input_length=MAX_SEQUENCE_LENGTH,
trainable=False)
seq_input = Input(shape=(MAX_SEQUENCE_LENGTH,),dtype='int32')
print("SEQ INP",seq_input,seq_input.shape)
embedded_seq = embedding_layer(seq_input)
print("EMD SEQ",embedded_seq.shape)
# Bi-directional LSTM # (timesteps, n_features)
x_1 = Bidirectional(LSTM(units=NUM_LSTM_UNITS,
name='blstm_1',
dropout=DROP_RATE_LSTM,
recurrent_dropout=DROP_RATE_LSTM),
merge_mode='concat')(embedded_seq)
x_1 = Dropout(DROP_RATE_DENSE)(x_1)
x_1 = Dense(NUM_DENSE_UNITS,activation='relu')(x_1)
x_1 = Dropout(DROP_RATE_DENSE)(x_1)
OUTPUT (This works properly with LSTM)
SEQ INP Tensor("input_2:0", shape=(?, 150), dtype=int32) (?, 150)
EMD SEQ (?, 150, 300)
Sentence Encoder is different from word2vec or Glove, it's not word-level embeddings:
The model is trained and optimized for greater-than-word length text,
such as sentences, phrases or short paragraphs. It is trained on a
variety of data sources and a variety of tasks with the aim of
dynamically accommodating a wide variety of natural language
understanding tasks. The input is variable length English text and the
output is a 512 dimensional vector. We apply this model to the STS
benchmark for semantic similarity, and the results can be seen in the
example notebook made available. The universal-sentence-encoder model
is trained with a deep averaging network (DAN) encoder.
The example above where they used "lambda" function is for FF neural network, and the input to the next layer is 2D, unlike RNN of CNN (3D).
Shortly, what you have to do is to prepare your text before then feed it to your network with Embedding layer:
def process_text(sentences_list):
path = './processed_data'
embeddings_file = "embeddings-{}.pickle".format(len(sentences_list))
if not os.path.isfile(join(path, embeddings_file)):
module_url = "https://tfhub.dev/google/universal-sentence-encoder-large/3"
embed = hub.Module(module_url)
with tf.Session() as sess:
sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
sentences_list = sess.run(embed(sentences_list))
sentences_list = np.array(sentences_list)
sentences_list = np.array([np.reshape(embedding, (len(embedding), 1)) for embedding in sentences_list])
pickle.dump(sentences_list, open(embeddings_file, 'wb'))
else:
sentences_list = pickle.load(open(join(path, embeddings_file), 'rb'))
return sentences_list
I recommend you to save the generated embeddings, as I do in the example, because it will take few time to retrieve the embeddings.
Source: Sentiment Analysis on Twitter Data using Universal Sentence Encoder

How to load MobileNet weights with an input tensor in Keras

I'm trying to apply transfer learning to MNIST using MobileNet weights in Keras. Keras documentation to use MobileNet https://keras.io/applications/#mobilenet
Mobilenet accepts 224x224x3 as input but MNIST is 28x28x1. I'm creating a Lambda layer which can convert 28x28x1 image into 224x224x3 and send it as input to MobileNet. The following code causes
TypeError: Input layers to a Model must be InputLayer objects. Received inputs: Tensor("lambda_2/ResizeNearestNeighbor:0", shape=(?, 224, 224, 3), dtype=float32). Input 0 (0-based) originates from layer type Lambda.
height = 28
width = 28
input_image = Input(shape=(height,width,1))
def resize_image_to_inception(x):
x = K.repeat_elements(x, 3, axis=3)
x = K.resize_images(x, 8, 8, data_format="channels_last")
return x
input_image_ = Lambda(resize_image_to_inception, output_shape=(224, 224, 3))(input_image)
print(type(input_image_))
base_model = MobileNet(input_tensor=input_image_, weights='imagenet', include_top=False)