Check which are the next layers in a TensorFlow Keras model

I have a keras model which has shortcuts between layers. For each layer, I would like to get the name (or index) of the next connected layers, because simply iterating through all the model.layers will not tell me whether the layer was connected to the previous one or not.
An example model could be:
model = tf.keras.applications.resnet50.ResNet50(
    include_top=True, weights='imagenet', input_tensor=None,
    input_shape=None, pooling=None, classes=1000)

You can extract the information in dict format in this way:
First, define a utility function and collect the relevant nodes, as is done in the model.summary() method of every Functional model (code reference):
relevant_nodes = []
for v in model._nodes_by_depth.values():
    relevant_nodes += v

def get_layer_summary_with_connections(layer):
    info = {}
    connections = []
    for node in layer._inbound_nodes:
        if relevant_nodes and node not in relevant_nodes:
            # node is not part of the current network
            continue
        for inbound_layer, node_index, tensor_index, _ in node.iterate_inbound():
            connections.append(inbound_layer.name)
    info['type'] = layer.__class__.__name__
    info['parents'] = connections
    return info
Second, extract the information by iterating through the layers:
results = {}
layers = model.layers
for layer in layers:
    info = get_layer_summary_with_connections(layer)
    results[layer.name] = info
results is a nested dict with this format:
{
    'layer_name': {'type': 'the layer type', 'parents': 'list of the parent layers'},
    ...
    'layer_name': {'type': 'the layer type', 'parents': 'list of the parent layers'}
}
For ResNet50 it results in:
{
    'input_4': {'type': 'InputLayer', 'parents': []},
    'conv1_pad': {'type': 'ZeroPadding2D', 'parents': ['input_4']},
    'conv1_conv': {'type': 'Conv2D', 'parents': ['conv1_pad']},
    'conv1_bn': {'type': 'BatchNormalization', 'parents': ['conv1_conv']},
    ...
    'conv5_block3_out': {'type': 'Activation', 'parents': ['conv5_block3_add']},
    'avg_pool': {'type': 'GlobalAveragePooling2D', 'parents': ['conv5_block3_out']},
    'predictions': {'type': 'Dense', 'parents': ['avg_pool']}
}
Also, you can modify get_layer_summary_with_connections to return any other information you are interested in.
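Since the question asks for the next connected layers rather than the parents, the parent mapping above can also be inverted; here is a minimal sketch building on the results dict from this answer:

from collections import defaultdict

# Invert the parent mapping: for each layer, list the layers that consume its output.
children = defaultdict(list)
for layer_name, info in results.items():
    for parent in info['parents']:
        children[parent].append(layer_name)

# e.g. children['conv1_conv'] should contain ['conv1_bn'] for ResNet50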

You can also view the whole model and its connections with Keras's model plotting utility:
tf.keras.utils.plot_model(model, to_file='path/to/image', show_shapes=True)

Related

Weights of pre-trained BERT model not initialized

I am using the Language Interpretability Toolkit (LIT) to load and analyze a BERT model that I pre-trained on an NER task.
However, when I'm starting the LIT script with the path to my pre-trained model passed to it, it fails to initialize the weights and tells me:
modeling_utils.py:648] loading weights file bert_remote/examples/token-classification/Data/Models/results_21_03_04_cleaned_annotations/04.03._8_16_5e-5_cleaned_annotations/04-03-2021 (15.22.23)/pytorch_model.bin
modeling_utils.py:739] Weights of BertForTokenClassification not initialized from pretrained model: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
modeling_utils.py:745] Weights from pretrained model not used in BertForTokenClassification: ['bert.embeddings.position_ids']
It then simply uses the bert-base-german-cased version of BERT, which of course doesn't have my custom labels and thus fails to predict anything. I think it might have to do with PyTorch, but I can't find the error.
If relevant, here is how I load my dataset into CoNLL 2003 format (modification of the dataloader scripts found here):
def __init__(self):
    # Read CoNLL test files
    self._examples = []
    data_path = "lit_remote/lit_nlp/examples/datasets/NER_Data"
    with open(os.path.join(data_path, "test.txt"), "r", encoding="utf-8") as f:
        lines = f.readlines()
        for line in lines[:2000]:
            if line != "\n":
                token, label = line.split(" ")
                self._examples.append({
                    'token': token,
                    'label': label,
                })
            else:
                self._examples.append({
                    'token': "\n",
                    'label': "O"
                })

def spec(self):
    return {
        'token': lit_types.Tokens(),
        'label': lit_types.SequenceTags(align="token"),
    }
And this is how I initialize the model and start the LIT server (modification of the simple_pytorch_demo.py script found here):
def __init__(self, model_name_or_path):
    self.tokenizer = transformers.AutoTokenizer.from_pretrained(
        model_name_or_path)
    model_config = transformers.AutoConfig.from_pretrained(
        model_name_or_path,
        num_labels=15,  # FIXME CHANGE
        output_hidden_states=True,
        output_attentions=True,
    )
    # This is just a regular PyTorch model.
    self.model = _from_pretrained(
        transformers.AutoModelForTokenClassification,
        model_name_or_path,
        config=model_config)
    self.model.eval()

## Some omitted snippets here

def input_spec(self) -> lit_types.Spec:
    return {
        "token": lit_types.Tokens(),
        "label": lit_types.SequenceTags(align="token")
    }

def output_spec(self) -> lit_types.Spec:
    return {
        "tokens": lit_types.Tokens(),
        "probas": lit_types.MulticlassPreds(parent="label", vocab=self.LABELS),
        "cls_emb": lit_types.Embeddings()
    }
This actually seems to be expected behaviour. In the documentation of the GPT models the HuggingFace team writes:
This will issue a warning about some of the pretrained weights not being used and some weights being randomly initialized. That’s because we are throwing away the pretraining head of the BERT model to replace it with a classification head which is randomly initialized.
So it does not seem to be a problem for the fine-tuning. In my use case described above it worked despite the warning as well.
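If you want to double-check that only the pooler weights are affected, a small sanity check is to compare the checkpoint's keys with the model's state dict. A sketch, assuming the num_labels from the question; the model path is a hypothetical placeholder:

import os
import torch
import transformers

model_path = "path/to/my/pretrained_model"  # placeholder for the checkpoint directory
model = transformers.AutoModelForTokenClassification.from_pretrained(model_path, num_labels=15)
checkpoint = torch.load(os.path.join(model_path, "pytorch_model.bin"), map_location="cpu")

model_keys = set(model.state_dict().keys())
checkpoint_keys = set(checkpoint.keys())
print("Randomly initialized:", model_keys - checkpoint_keys)    # expected: the pooler weights
print("Unused from checkpoint:", checkpoint_keys - model_keys)  # expected: e.g. position_ids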

Performing inference with a BERT (TF 1.x) saved model

I'm stuck on one line of code and have been stalled on a project all weekend as a result.
I am working on a project that uses BERT for sentence classification. I have successfully trained the model, and I can test the results using the example code from run_classifier.py.
I can export the model using this example code (which has been reposted repeatedly, so I believe that it's right for this model):
def export(self):
    def serving_input_fn():
        label_ids = tf.placeholder(tf.int32, [None], name='label_ids')
        input_ids = tf.placeholder(tf.int32, [None, self.max_seq_length], name='input_ids')
        input_mask = tf.placeholder(tf.int32, [None, self.max_seq_length], name='input_mask')
        segment_ids = tf.placeholder(tf.int32, [None, self.max_seq_length], name='segment_ids')
        input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn({
            'label_ids': label_ids, 'input_ids': input_ids,
            'input_mask': input_mask, 'segment_ids': segment_ids})()
        return input_fn
    self.estimator._export_to_tpu = False
    self.estimator.export_savedmodel(self.output_dir, serving_input_fn)
I can also load the exported estimator (where the export function saves the exported model into a subdirectory labeled with a timestamp):
predict_fn = predictor.from_saved_model(self.output_dir + timestamp_number)
However, for the life of me, I cannot figure out what to provide to predict_fn as input for inference. Here is my best code at the moment:
def predict(self):
    input = 'Test input'
    guid = 'predict-0'
    text_a = tokenization.convert_to_unicode(input)
    label = self.label_list[0]
    examples = [InputExample(guid=guid, text_a=text_a, text_b=None, label=label)]
    features = convert_examples_to_features(examples, self.label_list,
                                            self.max_seq_length, self.tokenizer)
    predict_input_fn = input_fn_builder(features, self.max_seq_length, False)
    predict_fn = predictor.from_saved_model(self.output_dir + timestamp_number)
    result = predict_fn(predict_input_fn)  # this generates an error
    print(result)
It doesn't seem to matter what I provide to predict_fn: the examples array, the features array, the predict_input_fn function. Clearly, predict_fn wants a dictionary of some type - but every single thing that I've tried generates an exception due to a tensor mismatch or other errors that generally mean: bad input.
I presumed that the from_saved_model function wants the same sort of input as the model test function - apparently, that's not the case.
It seems that lots of people have asked this very question - "how do I use an exported BERT TensorFlow model for inference?" - and have gotten no answers:
Thread #1
Thread #2
Thread #3
Thread #4
Any help? Thanks in advance.
Thank you for this post. Your serving_input_fn was the piece I was missing! Your predict function needs to be changed to feed the features dict directly, rather than using the predict_input_fn:
def predict(sentences):
    labels = [0, 1]
    input_examples = [
        run_classifier.InputExample(
            guid="",         # "" is just a dummy guid
            text_a=x,
            text_b=None,
            label=0          # 0 is just a dummy label
        ) for x in sentences]
    input_features = run_classifier.convert_examples_to_features(
        input_examples, labels, MAX_SEQ_LEN, tokenizer
    )
    # this is where pred_input_fn is replaced
    all_input_ids = []
    all_input_mask = []
    all_segment_ids = []
    all_label_ids = []
    for feature in input_features:
        all_input_ids.append(feature.input_ids)
        all_input_mask.append(feature.input_mask)
        all_segment_ids.append(feature.segment_ids)
        all_label_ids.append(feature.label_id)
    pred_dict = {
        'input_ids': all_input_ids,
        'input_mask': all_input_mask,
        'segment_ids': all_segment_ids,
        'label_ids': all_label_ids
    }
    predict_fn = predictor.from_saved_model('../testing/1589418540')
    result = predict_fn(pred_dict)
    print(result)
pred_sentences = [
    "That movie was absolutely awful",
    "The acting was a bit lacking",
    "The film was creative and surprising",
    "Absolutely fantastic!",
]

predict(pred_sentences)
{'probabilities': array([[-0.3579178 , -1.2010787 ],
[-0.36648935, -1.1814401 ],
[-0.30407643, -1.3386648 ],
[-0.45970002, -0.9982413 ],
[-0.36113673, -1.1936386 ],
[-0.36672896, -1.1808994 ]], dtype=float32), 'labels': array([0, 0, 0, 0, 0, 0])}
However, the probabilities returned for the sentences in pred_sentences do not match the probabilities I get using estimator.predict(predict_input_fn), where estimator is the fine-tuned model being used within the same (Python) session. For example, [-0.27276006, -1.4324446 ] using estimator vs [-0.26713806, -1.4505868 ] using predictor.
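As an aside, the values under 'probabilities' look like log-softmax outputs rather than probabilities; if actual probabilities are needed, exponentiating should recover rows that sum to one. A quick check, assuming NumPy:

import numpy as np

log_probs = np.array([[-0.3579178, -1.2010787],
                      [-0.36648935, -1.1814401]])
probs = np.exp(log_probs)
print(probs, probs.sum(axis=1))  # each row sums to ~1.0, consistent with log-probabilities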

TF-serving with NMT

I am working on exporting a translation model for serving using TF-Serving.
I have referred to the issues in the link below:
https://github.com/tensorflow/serving/issues/712
The model being served always seems to give the same result irrespective of the input it receives. I am using the code below.
def export(self):
    infer_model = self._create_infer_model()
    with tf.Session(graph=infer_model.graph,
                    config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        feature_config = {
            'input': tf.FixedLenSequenceFeature(dtype=tf.string, shape=[], allow_missing=True),
        }
        # serialized_example = tf.placeholder(dtype=tf.string, name="tf_example")
        # tf_example = tf.parse_example(serialized_example, feature_config)
        tf_example = ['This is created just for export']
        inference_input = tf.identity(tf_example, name="inference_input")
        # batch_size_placeholder = tf.constant(1, shape=[1,], dtype=tf.int64)
        saver = infer_model.model.saver
        saver.restore(sess, self._ckpt_path)
        # initialize tables
        sess.run(tf.tables_initializer())
        sess.run(
            infer_model.iterator.initializer,
            feed_dict={
                infer_model.src_placeholder: inference_input.eval()
            })
        # get the outputs of the model
        inference_outputs, _ = infer_model.model.decode(sess=sess)
        # inference_outputs = infer_model.model.sample_words
        # take the first of the outputs as the result of inference
        inference_output = inference_outputs[0]
        # create the signature def
        # the key `seq_input` in the `inputs` dict can be changed as you wish,
        # but the client must use the same key when making an inference request.
        # the key `seq_output` in the `outputs` dict works the same way.
        inference_signature = tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={
                'seq_input': infer_model.src_placeholder
            },
            outputs={
                'seq_output': tf.convert_to_tensor(inference_output)
            }
        )
        legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
        builder = tf.saved_model.builder.SavedModelBuilder(self._export_dir)
        # the key `tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`
        # (which is actually `serving_default`) in signature_def_map can also be changed,
        # but the client must be consistent with it when making an inference request.
        builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map={
                tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: inference_signature,
            },
            legacy_init_op=legacy_init_op,
            clear_devices=True,
            assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS))
        builder.save(as_text=True)
        print("Done!")
In this case I always get the output as
"This is just for export"
Any assistance would be great.
Thanks,
Sujith.

TensorFlow input function for reading sparse data (in libsvm format)

I'm new to TensorFlow and trying to use the Estimator API for some simple classification experiments. I have a sparse dataset in libsvm format. The following input function works for small datasets:
def libsvm_input_function(file):
    def input_function():
        indexes_raw = []
        indicators_raw = []
        values_raw = []
        labels_raw = []
        i = 0
        for line in open(file, "r"):
            data = line.split(" ")
            label = int(data[0])
            for fea in data[1:]:
                id, value = fea.split(":")
                indexes_raw.append([i, int(id)])
                indicators_raw.append(int(1))
                values_raw.append(float(value))
            labels_raw.append(label)
            i = i + 1
        indexes = tf.SparseTensor(indices=indexes_raw,
                                  values=indicators_raw,
                                  dense_shape=[i, num_features])
        values = tf.SparseTensor(indices=indexes_raw,
                                 values=values_raw,
                                 dense_shape=[i, num_features])
        labels = tf.constant(labels_raw, dtype=tf.int32)
        return {"indexes": indexes, "values": values}, labels
    return input_function
However, for a dataset of a few GB size I get the following error:
ValueError: Cannot create a tensor proto whose content is larger than 2GB.
How can I avoid this error? How should I write an input function to read medium-sized sparse datasets (in libsvm format)?
When using an Estimator with libsvm data input, you can create a dense index list and a dense value list, then use feature_column.categorical_column_with_identity and feature_column.weighted_categorical_column to create a feature column, and finally pass the feature columns to the Estimator. If your input feature length is variable, you can use padded_batch to handle it.
Here is some code:
## here is the input_fn
def input_fn(data_dir, is_training, batch_size):
    def parse_csv(value):
        ## some processing here to create the feature_indices list, feature_values list and labels
        return {"index": feature_indices, "value": feature_values}, labels

    dataset = tf.data.Dataset.from_tensor_slices(your_filenames)
    ds = dataset.flat_map(
        lambda f: tf.data.TextLineDataset(f).map(parse_csv)
    )
    ds = ds.padded_batch(batch_size, ds.output_shapes, padding_values=(
        {
            "index": tf.constant(-1, dtype=tf.int32),
            "value": tf.constant(0, dtype=tf.float32),
        },
        tf.constant(False, dtype=tf.bool)
    ))
    return ds.repeat().prefetch(batch_size)

## create the feature columns
def build_model_columns():
    categorical_column = tf.feature_column.categorical_column_with_identity(
        key='index', num_buckets=your_feature_dim)
    sparse_columns = tf.feature_column.weighted_categorical_column(
        categorical_column=categorical_column, weight_feature_key='value')
    dense_columns = tf.feature_column.embedding_column(sparse_columns, your_embedding_dim)
    return [sparse_columns], [dense_columns]

## once the feature columns are created, you can pass them to an estimator, e.g. dense_columns
## to the DNN part and sparse_columns to the linear part (see the sketch below)

## for exporting a SavedModel
def raw_serving_input_fn():
    feature_spec = {"index": tf.placeholder(shape=[None, None], dtype=tf.int32),
                    "value": tf.placeholder(shape=[None, None], dtype=tf.float32)}
    return tf.estimator.export.build_raw_serving_input_receiver_fn(feature_spec)
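As the comment above suggests, the returned columns can then be wired into an estimator. A minimal sketch; the model_dir, hidden-unit sizes, data path and batch size are illustrative placeholders:

import tensorflow as tf

sparse_columns, dense_columns = build_model_columns()

estimator = tf.estimator.DNNLinearCombinedClassifier(
    model_dir="/tmp/libsvm_model",            # placeholder
    linear_feature_columns=sparse_columns,    # weighted categorical column -> linear part
    dnn_feature_columns=dense_columns,        # embedding column -> DNN part
    dnn_hidden_units=[128, 64])               # illustrative sizes

estimator.train(
    input_fn=lambda: input_fn("path/to/libsvm/files", is_training=True, batch_size=256),
    steps=1000)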
Alternatively, you can create your own custom feature column, for example something like _SparseArrayCategoricalColumn.
I have been using tensorflow.contrib.libsvm. Here's an example (I am using eager execution with generators):
import os
import tensorflow as tf
import tensorflow.contrib.libsvm as libsvm


def all_libsvm_files(folder_path):
    for file in os.listdir(folder_path):
        if file.endswith(".libsvm"):
            yield os.path.join(folder_path, file)


def load_libsvm_dataset(path_to_folder):
    return tf.data.TextLineDataset(list(all_libsvm_files(path_to_folder)))


def libsvm_iterator(path_to_folder):
    dataset = load_libsvm_dataset(path_to_folder)
    iterator = dataset.make_one_shot_iterator()
    next_element = iterator.get_next()
    yield libsvm.decode_libsvm(tf.reshape(next_element, (1,)),
                               num_features=666,
                               dtype=tf.float32,
                               label_dtype=tf.float32)
libsvm_iterator gives you a feature-label pair back on each iteration, from multiple files inside a folder that you specify.

Which variables to pass to a TensorFlow predictor for tf.feature_columns using the wide and deep learning model?

I've been trying out the wide and deep learning example from the TensorFlow site: https://www.tensorflow.org/tutorials/wide_and_deep
I can train and evaluate the model, and even predict within that same process, but I can't seem to figure out what input I need to pass into the predictor function when I try to do a prediction from a model that was saved and then reloaded via the predictor.from_saved_model function.
My feature columns and model look like this, and this runs fine:
term = tf.feature_column.categorical_column_with_vocabulary_list("term", unique_terms['term'].tolist())
cust_name = tf.feature_column.categorical_column_with_vocabulary_list("cust_name", unique_name['name'].tolist())
base_columns = [term, cust_name]
crossed_columns = [
    tf.feature_column.crossed_column(["term", "cust_name"], hash_bucket_size=100000),
]
deep_columns = [
    tf.feature_column.indicator_column(term),
    tf.feature_column.indicator_column(cust_name),
]
model_dir = export_dir
search_model = tf.estimator.DNNLinearCombinedClassifier(
    model_dir=model_dir,
    linear_feature_columns=crossed_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 50])
I saved the model like this:
feature_columns = crossed_columns + deep_columns
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
servable_model_dir = export_dir
servable_model_path = search_model.export_savedmodel(servable_model_dir, export_input_fn)
And then I load it back from file like this:
predict_fn = predictor.from_saved_model(export_dir)
predictions = predict_fn({'X':[10]})
The predict_fn is expecting a dictionary with the key "inputs", not "x", but I am not sure what the value of "inputs" should be. Can anyone help me out with this, please?
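For reference, a model exported with build_parsing_serving_input_receiver_fn expects serialized tf.Example protos under the default "inputs" key, so a feed might look like the sketch below; the feature values are illustrative and must match the vocabulary columns above, and export_dir is the path used in the question.

import tensorflow as tf
from tensorflow.contrib import predictor

# Build a tf.Example whose feature names match the feature columns used at export time.
example = tf.train.Example(features=tf.train.Features(feature={
    "term": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"some term"])),
    "cust_name": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"some name"])),
}))

predict_fn = predictor.from_saved_model(export_dir)
predictions = predict_fn({"inputs": [example.SerializeToString()]})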