I'm having trouble with a fairly basic model: I am unable to create a preprocessing layer that simply normalizes all features. My conceptual understanding of the situation is probably the problem. My thinking was that the input layer is a list or a dictionary of tf.keras.Input objects, which refer to the input tensors by name and indicate their shapes and datatypes. Normalization layers are built by first adapting them over the training dataset, and those layers can be collected in a list and concatenated. After the input layer is defined, the preprocessing layer takes the input layer as its input and passes its results downstream. Each item in the input layer list is a symbolic representation of the tensors that will flow through the model, and each normalizer will get the right tensors by virtue of having been adapted on that feature.
The error I get is as follows:
TypeError Traceback (most recent call last)
Input In [11], in <cell line: 63>()
59 concatenated_preprocessing_layer = tf.keras.layers.Concatenate(preprocessing_layers)
61 #outputs = concatenated_preprocessing_layer(input_layer.values())
---> 63 outputs = concatenated_preprocessing_layer(all_inputs)
File ~/.pyenv/versions/3.8.5/lib/python3.8/site-packages/keras/utils/traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
File ~/.pyenv/versions/3.8.5/lib/python3.8/site-packages/keras/layers/merge.py:509, in Concatenate.build(self, input_shape)
507 shape_set = set()
508 for i in range(len(reduced_inputs_shapes)):
--> 509 del reduced_inputs_shapes[i][self.axis]
510 shape_set.add(tuple(reduced_inputs_shapes[i]))
512 if len(shape_set) != 1:
TypeError: list indices must be integers or slices, not ListWrapper
And the code is as follows:
import tensorflow as tf
filepath='./taxi_data.csv'
CSV_COLUMNS = [
'fare_amount',
'pickup_datetime',
'pickup_longitude',
'pickup_latitude',
'dropoff_longitude',
'dropoff_latitude',
'passenger_count',
'key',
]
LABEL_COLUMN = 'fare_amount'
STRING_COLS = ['pickup_datetime']
NUMERIC_COLS = ['pickup_longitude', 'pickup_latitude',
'dropoff_longitude', 'dropoff_latitude',
'passenger_count']
DEFAULTS = [[0.0], ['na'], [0.0], [0.0], [0.0], [0.0], [0.0], ['na']]
DAYS = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
def map_features_and_labels(row, target_name):
label = row.pop(target_name)
row.pop('key')
row.pop('pickup_datetime')
return row, label
def create_dataset(filepath, target_name, batch_size=1, mode=tf.estimator.ModeKeys.EVAL, CSV_COLUMNS=None, column_defaults=None ):
dataset = tf.data.experimental.make_csv_dataset(file_pattern=filepath, column_names=CSV_COLUMNS, column_defaults=DEFAULTS, num_epochs=1, batch_size=1)
dataset = dataset.map(lambda X: map_features_and_labels(X, target_name))
if mode == tf.estimator.ModeKeys.TRAIN:
dataset = dataset.shuffle(buffer_size=1000).repeat()
return dataset
train_ds = create_dataset(filepath, target_name=LABEL_COLUMN, batch_size=1, CSV_COLUMNS=CSV_COLUMNS, column_defaults=DEFAULTS )
#The input layer is usually a dictionary of feature_name: Input object
input_layer = {
'pickup_longitude': tf.keras.Input(shape=(0,), name='pickup_longitude', dtype=tf.dtypes.float32),
'pickup_latitude': tf.keras.Input(shape=(0,), name='pickup_latitude', dtype=tf.dtypes.float32),
'dropoff_longitude': tf.keras.Input(shape=(0,), name='dropoff_longitude', dtype=tf.dtypes.float32),
'dropoff_latitude': tf.keras.Input(shape=(0,), name='dropoff_latitude', dtype=tf.dtypes.float32),
'passenger_count': tf.keras.Input(shape=(0,), name='passenger_count', dtype=tf.dtypes.float32),
}
preprocessing_layers = []
all_inputs = []
for column in NUMERIC_COLS:
feature_ds = train_ds.map(lambda X, y: X[column])
normalizer = tf.keras.layers.Normalization(axis=None)
normalizer.adapt(feature_ds)
preprocessing_layers.append(normalizer)
all_inputs.append(tf.keras.Input(shape=(0,), name=column, dtype=tf.dtypes.float32, ))
concatenated_preprocessing_layer = tf.keras.layers.Concatenate(preprocessing_layers)
#outputs = concatenated_preprocessing_layer(input_layer.values())
outputs = concatenated_preprocessing_layer(all_inputs)
And here is some of the data in the taxi_data.csv file
17,2014-10-25 21:39:42 UTC,-73.978713,40.78303,-74.008102,40.73881,2,unused
14.9,2012-08-22 12:01:00 UTC,-73.987667,40.728747,-74.003272,40.715202,2,unused
21.5,2013-12-18 23:26:12 UTC,-74.008969,40.716853,-73.97688,40.780289,2,unused
23.5,2014-10-04 21:58:00 UTC,-73.954153,40.806257,-74.00343,40.731867,2,unused
34.3,2012-12-17 15:23:00 UTC,-73.866917,40.770342,-73.968872,40.757482,2,unused
16.1,2009-09-24 17:37:31 UTC,-73.967549,40.762828,-73.97961,40.723133,2,unused
17.3,2010-04-26 20:52:36 UTC,-73.981381,40.749913,-73.966612,40.691132,2,unused
35,2014-08-13 20:16:00 UTC,-73.866107,40.771245,-74.013987,40.676437,2,unused
17.3,2010-12-30 17:55:00 UTC,-73.997803,40.725982,-73.982382,40.772225,2,unused
I was able to get this to work. As I suspected, my conceptual understanding was the issue. Specifically, I wasn't correctly hooking up each Input (input_placeholder) to its normalizer. The modified code is below:
preprocessing_layers = []
all_inputs = []
for column in NUMERIC_COLS:
normalizer = get_normalization_layer(column, train_ds)
input_placeholder = tf.keras.Input(shape=(1,), name=column, dtype=tf.dtypes.float32, )
encoded_feature = normalizer(input_placeholder)
preprocessing_layers.append(encoded_feature)
all_inputs.append(input_placeholder)
concatenated_preprocessing_layer = tf.keras.layers.concatenate(preprocessing_layers)
#outputs = concatenated_preprocessing_layer(input_layer.values())
preprocessing_new_model = tf.keras.Model(inputs=all_inputs, outputs=concatenated_preprocessing_layer)
preprocessing_new_model(train_features)
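As a rough illustration of what the adapted normalizers and the concatenation compute, here is a plain-Python sketch with no TensorFlow involved. It assumes each normalizer learns the mean and (population) standard deviation of its feature during adapt and then applies (x - mean) / std. The helper names adapt and normalize are made up for this example, and the three rows are taken from the sample data above.

```python
import statistics

# Three rows of (pickup_longitude, pickup_latitude) from the sample data.
rows = [
    (-73.978713, 40.78303),
    (-73.987667, 40.728747),
    (-74.008969, 40.716853),
]

def adapt(values):
    """Return the (mean, standard deviation) a normalizer would learn."""
    return statistics.fmean(values), statistics.pstdev(values)

def normalize(value, mean, std, epsilon=1e-7):
    """Scale one value to zero mean and unit variance (guarding against std == 0)."""
    return (value - mean) / max(std, epsilon)

# One "normalizer" per feature column, each adapted on its own column only.
columns = list(zip(*rows))
stats = [adapt(col) for col in columns]

# "Concatenate": each output row holds the normalized features side by side.
outputs = [
    [normalize(v, m, s) for v, (m, s) in zip(row, stats)]
    for row in rows
]
```

The point of the sketch is that each normalizer has to be applied to its own input before the results are joined, which is what calling normalizer(input_placeholder) does in the modified code.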
You need to concatenate preprocessing_layers and all_inputs by using the code below:
concatenated_preprocessing_layer = tf.keras.layers.Concatenate((preprocessing_layers,all_inputs))
As you have used
concatenated_preprocessing_layer = tf.keras.layers.Concatenate(preprocessing_layers)
You can concatenate all_inputs by using:
outputs =tf.keras.layers.Concatenate((concatenated_preprocessing_layer,all_inputs))
Please refer to this working gist for your reference.
Using the TFBertForQuestionAnswering.from_pretrained() function, we get a predefined head on top of BERT together with a loss function that are suitable for this task.
My question is how to create a custom head without relying on TFAutoModelForQuestionAnswering.from_pretrained().
I want to do this because there is no place where the architecture of the head is explained clearly. By reading the code here we can see the architecture they are using, but I can't be sure I understand their code 100%.
Starting from How to Fine-tune HuggingFace BERT model for Text Classification is good. However, it covers only the classification task, which is much simpler.
'start_positions' and 'end_positions' are created following this tutorial.
So far, I've got the following:
train_dataset
# Dataset({
# features: ['input_ids', 'token_type_ids', 'attention_mask', 'start_positions', 'end_positions'],
# num_rows: 99205
# })
train_dataset.set_format(type='tensorflow', columns=['input_ids', 'token_type_ids', 'attention_mask'])
features = {x: train_dataset[x] for x in ['input_ids', 'token_type_ids', 'attention_mask']}
labels = [train_dataset[x] for x in ['start_positions', 'end_positions']]
labels = np.array(labels).T
tfdataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(16)
input_ids = tf.keras.layers.Input(shape=(256,), dtype=tf.int32, name='input_ids')
token_type_ids = tf.keras.layers.Input(shape=(256,), dtype=tf.int32, name='token_type_ids')
attention_mask = tf.keras.layers.Input((256,), dtype=tf.int32, name='attention_mask')
bert = TFAutoModel.from_pretrained("bert-base-multilingual-cased")
output = bert([input_ids, token_type_ids, attention_mask]).last_hidden_state
output = tf.keras.layers.Dense(2, name="qa_outputs")(output)
model = tf.keras.models.Model(inputs=[input_ids, token_type_ids, attention_mask], outputs=output)
num_train_epochs = 3
num_train_steps = len(tfdataset) * num_train_epochs
optimizer, schedule = create_optimizer(
init_lr=2e-5,
num_warmup_steps=0,
num_train_steps=num_train_steps,
weight_decay_rate=0.01
)
def qa_loss(labels, logits):
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
from_logits=True, reduction=tf.keras.losses.Reduction.NONE
)
start_loss = loss_fn(labels[0], logits[0])
end_loss = loss_fn(labels[1], logits[1])
return (start_loss + end_loss) / 2.0
model.compile(
loss=qa_loss,
optimizer=optimizer
)
model.fit(tfdataset, epochs=num_train_epochs)
And I am getting the following error:
ValueError: `labels.shape` must equal `logits.shape` except for the last dimension. Received: labels.shape=(2,) and logits.shape=(256, 2)
It is complaining about the shape of the labels. This should not happen since I am using SparseCategoricalCrossentropy loss.
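The shapes in the error message match what indexing with [0] would produce if the loss receives labels of shape (batch, 2) and logits of shape (batch, 256, 2): labels[0] and logits[0] select the first example of the batch, not the start positions. The NumPy sketch below only illustrates that shape arithmetic. The batch size of 4, the random data, and the helper sparse_cross_entropy are assumptions made for the example and are not part of the original code.

```python
import numpy as np

batch_size, seq_len = 4, 256
rng = np.random.default_rng(0)

# Assumed layout: labels are (batch, 2), logits are (batch, seq_len, 2).
labels = rng.integers(0, seq_len, size=(batch_size, 2))
logits = rng.normal(size=(batch_size, seq_len, 2))

# Indexing with [0] picks the first example, not the start positions.
print(labels[0].shape, logits[0].shape)  # (2,) (256, 2)

# Slicing the last axis separates start and end across the whole batch.
start_labels, end_labels = labels[:, 0], labels[:, 1]
start_logits, end_logits = logits[:, :, 0], logits[:, :, 1]

def sparse_cross_entropy(targets, scores):
    """Per-example cross-entropy over token positions, computed from logits."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets]

loss = (sparse_cross_entropy(start_labels, start_logits)
        + sparse_cross_entropy(end_labels, end_logits)) / 2.0
```

With the last-axis slices, each loss term compares (batch,) labels against (batch, 256) logits, which is the pairing a sparse categorical cross-entropy expects.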
For future reference, I actually found a solution, which is just editing the TFBertForQuestionAnswering class itself. For example, I added an additional layer in the following code and trained the model as usual and it worked.
from transformers import TFBertPreTrainedModel
from transformers import TFBertMainLayer
from transformers.modeling_tf_utils import TFQuestionAnsweringLoss, get_initializer, input_processing
from transformers.modeling_tf_outputs import TFQuestionAnsweringModelOutput
from transformers import BertConfig
class MY_TFBertForQuestionAnswering(TFBertPreTrainedModel, TFQuestionAnsweringLoss):
# names with a '.' represents the authorized unexpected/missing layers when a TF model is loaded from a PT model
_keys_to_ignore_on_load_unexpected = [
r"pooler",
r"mlm___cls",
r"nsp___cls",
r"cls.predictions",
r"cls.seq_relationship",
]
def __init__(self, config: BertConfig, *inputs, **kwargs):
super().__init__(config, *inputs, **kwargs)
self.num_labels = config.num_labels
self.bert = TFBertMainLayer(config, add_pooling_layer=False, name="bert")
# This is the dense layer I added
self.my_dense = tf.keras.layers.Dense(
units=config.hidden_size,
kernel_initializer=get_initializer(config.initializer_range),
name="my_dense",
)
self.qa_outputs = tf.keras.layers.Dense(
units=config.num_labels,
kernel_initializer=get_initializer(config.initializer_range),
name="qa_outputs",
)
def call(
self,
input_ids = None,
attention_mask = None,
token_type_ids = None,
position_ids = None,
head_mask = None,
inputs_embeds = None,
output_attentions = None,
output_hidden_states = None,
return_dict = None,
start_positions = None,
end_positions= None,
training = False,
**kwargs,
):
r"""
start_positions (`tf.Tensor` or `np.ndarray` of shape `(batch_size,)`, *optional*):
Labels for position (index) of the start of the labelled span for computing the token classification loss.
Positions are clamped to the length of the sequence (`sequence_length`). Position outside of the sequence
are not taken into account for computing the loss.
end_positions (`tf.Tensor` or `np.ndarray` of shape `(batch_size,)`, *optional*):
Labels for position (index) of the end of the labelled span for computing the token classification loss.
Positions are clamped to the length of the sequence (`sequence_length`). Position outside of the sequence
are not taken into account for computing the loss.
"""
inputs = input_processing(
func=self.call,
config=self.config,
input_ids=input_ids,
attention_mask=attention_mask,
token_type_ids=token_type_ids,
position_ids=position_ids,
head_mask=head_mask,
inputs_embeds=inputs_embeds,
output_attentions=output_attentions,
output_hidden_states=output_hidden_states,
return_dict=return_dict,
start_positions=start_positions,
end_positions=end_positions,
training=training,
kwargs_call=kwargs,
)
outputs = self.bert(
input_ids=inputs["input_ids"],
attention_mask=inputs["attention_mask"],
token_type_ids=inputs["token_type_ids"],
position_ids=inputs["position_ids"],
head_mask=inputs["head_mask"],
inputs_embeds=inputs["inputs_embeds"],
output_attentions=inputs["output_attentions"],
output_hidden_states=inputs["output_hidden_states"],
return_dict=inputs["return_dict"],
training=inputs["training"],
)
sequence_output = outputs[0]
# You also have to add it here
my_logits = self.my_dense(inputs=sequence_output)
logits = self.qa_outputs(inputs=my_logits)
start_logits, end_logits = tf.split(value=logits, num_or_size_splits=2, axis=-1)
start_logits = tf.squeeze(input=start_logits, axis=-1)
end_logits = tf.squeeze(input=end_logits, axis=-1)
loss = None
if inputs["start_positions"] is not None and inputs["end_positions"] is not None:
labels = {"start_position": inputs["start_positions"]}
labels["end_position"] = inputs["end_positions"]
loss = self.hf_compute_loss(labels=labels, logits=(start_logits, end_logits))
if not inputs["return_dict"]:
output = (start_logits, end_logits) + outputs[2:]
return ((loss,) + output) if loss is not None else output
return TFQuestionAnsweringModelOutput(
loss=loss,
start_logits=start_logits,
end_logits=end_logits,
hidden_states=outputs.hidden_states,
attentions=outputs.attentions,
)
def serving_output(self, output: TFQuestionAnsweringModelOutput) -> TFQuestionAnsweringModelOutput:
hs = tf.convert_to_tensor(output.hidden_states) if self.config.output_hidden_states else None
attns = tf.convert_to_tensor(output.attentions) if self.config.output_attentions else None
return TFQuestionAnsweringModelOutput(
start_logits=output.start_logits, end_logits=output.end_logits, hidden_states=hs, attentions=attns
)
I've implemented an accumulated-gradient optimizer, but when I try to train a model it gives me the error below. What is the problem?
The idea behind gradient accumulation is that the loss and gradients are calculated after each mini-batch, but instead of updating the model parameters immediately, the gradients are accumulated over consecutive batches. The parameters are then updated from the cumulative gradient after a specified number of batches. This serves the same purpose as using a mini-batch with a larger number of images.
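To make the idea concrete, here is a minimal plain-Python sketch of gradient accumulation. It deliberately uses a one-parameter linear model and a plain gradient-descent update rather than Adam, and it averages the accumulated gradient instead of summing it. All names and numbers are illustrative assumptions and do not come from the optimizer class below.

```python
def gradient(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
w, lr, accum_steps, micro_batch = 0.0, 0.01, 3, 2

accumulated = 0.0
for step in range(accum_steps):
    lo, hi = step * micro_batch, (step + 1) * micro_batch
    # Compute the gradient for this micro-batch, but do not touch w yet.
    accumulated += gradient(w, xs[lo:hi], ys[lo:hi])

# One parameter update from the averaged accumulated gradient.
avg_grad = accumulated / accum_steps
full_grad = gradient(w, xs, ys)
w_new = w - lr * avg_grad
```

Because the micro-batches are the same size, the averaged accumulated gradient equals the gradient of the full batch of six examples, which is why accumulation can stand in for a larger batch.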
class AccumAdamOptimizer(keras.optimizers.Optimizer):
def __init__(self,learning_rate=0.001,steps=1,beta_1=0.9,beta_2=0.999,epsilon=1e-7,amsgrad=False,name='AccumAdamOptimizer',**kwargs):
super(AccumAdamOptimizer, self).__init__(name, **kwargs)
self._set_hyper('learning_rate', kwargs.get('lr', learning_rate))
self._set_hyper('decay', self._initial_decay)
self._set_hyper('beta_1', beta_1)
self._set_hyper('beta_2', beta_2)
self.epsilon = epsilon
self.amsgrad = amsgrad
self.iterations = tf.Variable(1, dtype='int64', name='iterations')
self.steps = steps
self.condition = tf.math.equal(self.iterations % self.steps , 0)
def _create_slots(self, var_list):
for var in var_list:
self.add_slot(var, 'm')
for var in var_list:
self.add_slot(var, 'v')
# if self.amsgrad:
# for var in var_list:
# self.add_slot(var, 'vhat')
for var in var_list:
self.add_slot(var, "ag") #accumulated gradient
def _resource_apply_dense(self, grad, var, apply_state=None):
var_device, var_dtype = var.device, var.dtype.base_dtype
m = self.get_slot(var, 'm')
v = self.get_slot(var, 'v')
ag = self.get_slot(var, 'ag')
lr=self._get_hyper('learning_rate', var_dtype)
beta1= self._get_hyper('beta_1', var_dtype)
beta2=self._get_hyper('beta_2', var_dtype)
t = tf.cast(self.iterations, tf.float32)
beta1_power=tf.math.pow(beta1, t )
beta2_power=tf.math.pow(beta2, t)
if self.condition:
new_m = beta1 * m + (1-beta1) * ag
new_v = beta2 * v + (1-beta2) * tf.math.square(ag)
m_corrected = new_m/(1-beta1_power)
v_corrected = new_v/(1-beta2_power)
new_var = var - lr * m_corrected/(tf.math.sqrt(v_corrected)+self.epsilon)
var.assign(new_var) # update weights
shape_var = tf.shape(var)
ag.assign(tf.zeros(shape_var, dtype=var.dtype))
m.assign(m_corrected)
v.assign(v_corrected)
self.iterations.assign_add(1)
else:
ag.assign_add(grad)
self.iterations.assign_add(1)
def _resource_apply_sparse(self, grad, var):
raise NotImplementedError
def get_config(self):
config = super(AccumAdamOptimizer, self).get_config()
config.update({
'learning_rate': self._serialize_hyperparameter('learning_rate'),
'decay': self._initial_decay,
'beta_1': self._serialize_hyperparameter('beta_1'),
'beta_2': self._serialize_hyperparameter('beta_2'),
'epsilon': self.epsilon,
'amsgrad': self.amsgrad,
})
return config
This is my complete error :
TypeError: in user code:
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:830 train_function *
return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:813 run_step *
outputs = model.train_step(data)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:774 train_step *
self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:530 minimize **
return self.apply_gradients(grads_and_vars, name=name)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:668 apply_gradients
apply_state)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:732 _distributed_apply
with ops.control_dependencies([control_flow_ops.group(update_ops)]):
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/control_flow_ops.py:2966 group
"'%s' with type '%s'" % (inp, type(inp)))
TypeError: Expected tf.group() expected Tensor arguments not 'None' with type '<class 'NoneType'>'
I wrote text classification code with two classes using the Roberta model, and now I want to draw the confusion matrix. How do I go about plotting the confusion matrix based on a Roberta model?
RobertaTokenizer = RobertaTokenizer.from_pretrained('roberta-base',do_lower_case=False)
roberta_model = TFRobertaForSequenceClassification.from_pretrained('roberta-base',num_labels=2)
input_ids=[]
attention_masks=[]
for sent in sentences:
bert_inp=RobertaTokenizer.encode_plus(sent,add_special_tokens = True,max_length =128,pad_to_max_length = True,return_attention_mask = True)
input_ids.append(bert_inp['input_ids'])
attention_masks.append(bert_inp['attention_mask'])
input_ids=np.asarray(input_ids)
attention_masks=np.array(attention_masks)
labels=np.array(labels)
#split
train_inp,val_inp,train_label,val_label,train_mask,val_mask=train_test_split(input_ids,labels,attention_masks,test_size=0.5)
print('Train inp shape {} Val input shape {}\nTrain label shape {} Val label shape {}\nTrain attention mask shape {} Val attention mask shape {}'.format(train_inp.shape,val_inp.shape,train_label.shape,val_label.shape,train_mask.shape,val_mask.shape))
log_dir='tensorboard_data/tb_roberta'
model_save_path='/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/callbacks.py'
callbacks = [tf.keras.callbacks.ModelCheckpoint(filepath=model_save_path,save_weights_only=True,monitor='val_loss',mode='min',save_best_only=True),keras.callbacks.TensorBoard(log_dir=log_dir)]
print('\nBert Model',roberta_model.summary())
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
optimizer = tf.keras.optimizers.Adam(learning_rate=2e-5,epsilon=1e-08)
roberta_model.compile(loss=loss,optimizer=optimizer,metrics=[metric])
history=roberta_model.fit([train_inp,train_mask],train_label,batch_size=16,epochs=2,validation_data=([val_inp,val_mask],val_label),callbacks=callbacks)
preds = roberta_model.predict([val_inp,val_mask],batch_size=16)
pred_labels = preds.argmax(axis=1)
f1 = f1_score(val_label,pred_labels)
print('F1 score',f1)
print('Classification Report')
print(classification_report(val_label,pred_labels,target_names=target_names))
c1 = confusion_matrix(val_label,pred_labels)
print('confusion_matrix ',c1)
I now have the following error:
AttributeError Traceback (most recent call last)
<ipython-input-13-abcbb1d223b8> in <module>()
106
107 preds = trained_model.predict([val_inp,val_mask],batch_size=16)
--> 108 pred_labels = preds.argmax(axis=1)
109 f1 = f1_score(val_label,pred_labels)
110 print('F1 score',f1)
AttributeError: 'TFSequenceClassifierOutput' object has no attribute 'argmax'
predict returns a TFSequenceClassifierOutput object rather than an array, so take the argmax of its logits. Replace pred_labels = preds.argmax(axis=1) with the following code:
pred_labels = np.argmax(preds.logits, axis=1)
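For reference, here is a small self-contained sketch of the same step followed by a hand-built confusion matrix. The SimpleNamespace object is only a hypothetical stand-in for the output of predict, and the logits and labels are made-up values. The matrix is laid out with true classes as rows and predicted classes as columns, the same layout that sklearn's confusion_matrix uses.

```python
from types import SimpleNamespace
import numpy as np

# Hypothetical stand-in for the object returned by predict(): scores live in .logits.
preds = SimpleNamespace(
    logits=np.array([[2.0, -1.0], [0.3, 0.9], [-0.5, 1.5], [1.2, 0.1]])
)
val_label = np.array([0, 1, 0, 0])

# Pick the highest-scoring class for each example.
pred_labels = np.argmax(preds.logits, axis=1)

# Rows are true classes, columns are predicted classes.
matrix = np.zeros((2, 2), dtype=int)
np.add.at(matrix, (val_label, pred_labels), 1)
```

A matrix like this can then be passed to any plotting routine that accepts a 2-D array.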
To get started with TF, I wanted to learn a predictor of match outcomes for a game. There are three features: the 5 heroes on team 0, the 5 heroes on team 1, and the map. The winner is the label, 0 or 1. I want to represent the teams and the maps as SparseTensors. Out of a possible 71 heroes, five will be selected. Likewise for maps, out of a possible 13, one will be selected.
import tensorflow as tf
import packunpack as source
import tempfile
from collections import namedtuple
GameRecord = namedtuple('GameRecord', 'team_0 team_1 game_map winner')
def parse(line):
parts = line.rstrip().split("\t")
return GameRecord(
game_map = parts[1],
team_0 = parts[2].split(","),
team_1 = parts[3].split(","),
winner = int(parts[4]))
def conjugate(record):
return GameRecord(
team_0 = record.team_1,
team_1 = record.team_0,
game_map = record.game_map,
winner = 0 if record.winner == 1 else 1)
def sparse_team(team):
indices = list(map(lambda x: [x], map(source.encode_hero, team)))
return tf.SparseTensor(indices=indices, values = [1] * len(indices), dense_shape=[len(source.heroes_array)])
def sparse_map(map_name):
return tf.SparseTensor(indices=[[source.encode_hero(map_name)]], values = [1], dense_shape=[len(source.maps_array)])
def make_input_fn(filename, shuffle = True, add_conjugate_games = True):
def _fn():
records = []
with open(filename, "r") as raw:
i = 0
for line in raw:
record = parse(line)
records.append(record)
if add_conjugate_games:
# since 0 and 1 are arbitrary team labels, learn and test the conjugate game whenever
# learning the original inference
records.append(conjugate(record))
print("Making team 0")
team_0s = tf.constant(list(map(lambda r: sparse_team(r.team_0), records)))
print("Making team 1")
team_1s = tf.constant(list(map(lambda r: sparse_team(r.team_1), records)))
print("making maps")
maps = tf.constant(list(map(lambda r: sparse_map(r.game_map), records)))
print("Making winners")
winners = tf.constant(list(map(lambda r: tf.constant([r.winner]), records)))
return {
"team_0": team_0s,
"team_1": team_1s,
"game_map": maps,
}, winners
#Please help me finish this function?
return _fn
team_0 = tf.feature_column.embedding_column(
tf.feature_column.categorical_column_with_vocabulary_list("team_0", source.heroes_array), len(source.heroes_array))
team_1 = tf.feature_column.embedding_column(
tf.feature_column.categorical_column_with_vocabulary_list("team_1", source.heroes_array), len(source.heroes_array))
game_map = tf.feature_column.embedding_column(
tf.feature_column.categorical_column_with_vocabulary_list("game_map", source.maps_array), len(source.maps_array))
model_dir = tempfile.mkdtemp()
m = tf.estimator.DNNClassifier(
model_dir=model_dir,
hidden_units = [1024, 512, 256],
feature_columns=[team_0, team_1, game_map])
def main():
m.train(input_fn=make_input_fn("tiny.txt"), steps = 100)
if __name__ == "__main__":
main()
This fails on team_0s = tf.constant(list(map(lambda r: sparse_team(r.team_0), records)))
It's very difficult to understand what tf wants me to return in my input_fn, because all of the examples I can find in the docs ultimately call out to a pandas or numpy helper function, and I'm not familiar with those frameworks. I thought that each dictionary value should be a Tensor containing all examples of a single feature. Each of my examples is a SparseTensor, and I want to simply embed them as their dense versions for the sake of the DNNClassifier.
I'm sure my mental model is horribly broken right now, and I appreciate any help setting it straight.
Error output:
python3 estimator.py
Making team 0
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 468, in make_tensor_proto
str_values = [compat.as_bytes(x) for x in proto_values]
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 468, in <listcomp>
str_values = [compat.as_bytes(x) for x in proto_values]
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/compat.py", line 65, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fe8b4d7aef0>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "estimator.py", line 79, in <module>
main()
File "estimator.py", line 76, in main
m.train(input_fn=make_input_fn("tiny.txt"), steps = 100)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 709, in _train_model
input_fn, model_fn_lib.ModeKeys.TRAIN)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 577, in _get_features_and_labels_from_input_fn
result = self._call_input_fn(input_fn, mode)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 663, in _call_input_fn
return input_fn(**kwargs)
File "estimator.py", line 44, in _fn
team_0s = tf.constant(list(map(lambda r: sparse_team(r.team_0), records)))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 472, in make_tensor_proto
"supported type." % (type(values), values))
TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [<tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fe8b4d7aef0>, <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fe8b4d7af28>, <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fe8b4d7af60>, <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fe8b4d7aeb8> ... ]
Ultimately it wasn't necessary to convert my text representation into sparse vectors in my input_fn. Instead I had to tell the model to expect an input of an array of strings, which it understands how to convert into a "bag of words" or n-hot vector and how to embed as dense vectors.
import tensorflow as tf
import tempfile
import os
import packunpack as source
from collections import namedtuple
GameRecord = namedtuple('GameRecord', 'team_0 team_1 game_map winner')
def parse(line):
parts = line.rstrip().split("\t")
return GameRecord(
game_map = parts[1],
team_0 = parts[2].split(","),
team_1 = parts[3].split(","),
winner = int(parts[4]))
def conjugate(record):
return GameRecord(
team_0 = record.team_1,
team_1 = record.team_0,
game_map = record.game_map,
winner = 0 if record.winner == 1 else 1)
def make_input_fn(filename, batch_size=128, shuffle = True, add_conjugate_games = True, epochs=1):
def _fn():
records = []
with open(filename, "r") as raw:
i = 0
for line in raw:
record = parse(line)
records.append(record)
if add_conjugate_games:
records.append(conjugate(record))
team_0s = tf.constant(list(map(lambda r: r.team_0, records)))
team_1s = tf.constant(list(map(lambda r: r.team_1, records)))
maps = tf.constant(list(map(lambda r: r.game_map, records)))
winners = tf.constant(list(map(lambda r: [r.winner], records)))
return {
"team_0": team_0s,
"team_1": team_1s,
"game_map": maps,
}, winners
return _fn
team_0 = tf.feature_column.embedding_column(
tf.feature_column.categorical_column_with_vocabulary_list("team_0", source.heroes_array), dimension=len(source.heroes_array))
team_1 = tf.feature_column.embedding_column(
tf.feature_column.categorical_column_with_vocabulary_list("team_1", source.heroes_array), dimension=len(source.heroes_array))
game_map = tf.feature_column.embedding_column(
tf.feature_column.categorical_column_with_vocabulary_list("game_map", source.maps_array), dimension=len(source.maps_array))
model_dir = "DNNClassifierModel_00"
os.mkdir(model_dir)
m = tf.estimator.DNNClassifier(
model_dir=model_dir,
hidden_units = [1024, 512, 256],
feature_columns=[team_0, team_1, game_map])
def main():
m.train(input_fn=make_input_fn("training.txt"))
results = m.evaluate(input_fn=make_input_fn("validation.txt"))
print("model directory = %s" % model_dir)
for key in sorted(results):
print("%s: %s" % (key, results[key]))
if __name__ == "__main__":
main()
Note that this code isn't perfect yet. I need to add in batching.
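To illustrate the "bag of words" or n-hot encoding mentioned above, here is a plain-Python sketch of turning a list of names into a vector over a vocabulary. It is not what the feature columns run internally; the function multi_hot and the seven placeholder names are assumptions made for this example, and the real vocabulary would be source.heroes_array.

```python
def multi_hot(names, vocabulary):
    """Encode a list of names as an n-hot vector over the vocabulary."""
    index = {name: i for i, name in enumerate(vocabulary)}
    vector = [0] * len(vocabulary)
    for name in names:
        vector[index[name]] = 1
    return vector

# Hypothetical vocabulary of seven heroes and one team of five.
heroes = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf"]
team = ["charlie", "delta", "foxtrot", "alpha", "golf"]
encoded = multi_hot(team, heroes)
print(encoded)  # [1, 0, 1, 1, 0, 1, 1]
```

The order of the names in the team does not matter, which is why the input can stay a plain array of strings.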
I am working with a toy example to check how tensorflow.metrics.sparse_precision_at_k works.
From the documentation:
labels: int64 Tensor or SparseTensor with shape
[D1, ... DN, num_labels] or [D1, ... DN], where the latter implies
num_labels=1. N >= 1 and num_labels is the number of target classes for
the associated prediction. Commonly, N=1 and labels has shape
[batch_size, num_labels]. [D1, ... DN] must match predictions. Values
should be in range [0, num_classes), where num_classes is the last
dimension of predictions. Values outside this range are ignored.
predictions: Float Tensor with shape [D1, ... DN, num_classes] where
N >= 1. Commonly, N=1 and predictions has shape [batch size, num_classes].
The final dimension contains the logit values for each class. [D1, ... DN]
must match labels.
k: Integer, k for @k metric.
So I wrote the following example accordingly:
import tensorflow as tf
import numpy as np
pred = np.asarray([[.8,.1,.1,.1],[.2,.9,.9,.9]]).T
print(pred.shape)
segm = [0,1,1,1]
segm = np.asarray(segm, np.float32)
print(segm.shape)
segm_tf = tf.Variable(segm, np.int64)
pred_tf = tf.Variable(pred, np.float32)
print("segm_tf", segm_tf.shape)
print("pred_tf", pred_tf.shape)
prec,_ = tf.metrics.sparse_precision_at_k(segm_tf, pred_tf, 1, class_id=1)
sess = tf.InteractiveSession()
tf.variables_initializer([prec, segm_tf, pred_tf])
However, I am getting an error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-c6243802dedc> in <module>()
25 print("pred_tf", pred_tf.shape)
26
---> 27 prec,_ = tf.metrics.sparse_precision_at_k(segm_tf, pred_tf, 1, class_id=1)
28 sess = tf.InteractiveSession()
29 tf.variables_initializer([prec, segm_tf, pred_tf])
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/metrics_impl.py in sparse_precision_at_k(labels, predictions, k, class_id, weights, metrics_collections, updates_collections, name)
2828 metrics_collections=metrics_collections,
2829 updates_collections=updates_collections,
-> 2830 name=scope)
2831
2832
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/metrics_impl.py in _sparse_precision_at_top_k(labels, predictions_idx, k, class_id, weights, metrics_collections, updates_collections, name)
2726 tp, tp_update = _streaming_sparse_true_positive_at_k(
2727 predictions_idx=top_k_idx, labels=labels, k=k, class_id=class_id,
-> 2728 weights=weights)
2729 fp, fp_update = _streaming_sparse_false_positive_at_k(
2730 predictions_idx=top_k_idx, labels=labels, k=k, class_id=class_id,
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/metrics_impl.py in _streaming_sparse_true_positive_at_k(labels, predictions_idx, k, class_id, weights, name)
1743 tp = _sparse_true_positive_at_k(
1744 predictions_idx=predictions_idx, labels=labels, class_id=class_id,
-> 1745 weights=weights)
1746 batch_total_tp = math_ops.to_double(math_ops.reduce_sum(tp))
1747
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/metrics_impl.py in _sparse_true_positive_at_k(labels, predictions_idx, class_id, weights, name)
1689 name, 'true_positives', (predictions_idx, labels, weights)):
1690 labels, predictions_idx = _maybe_select_class_id(
-> 1691 labels, predictions_idx, class_id)
1692 tp = sets.set_size(sets.set_intersection(predictions_idx, labels))
1693 tp = math_ops.to_double(tp)
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/metrics_impl.py in _maybe_select_class_id(labels, predictions_idx, selected_id)
1651 if selected_id is None:
1652 return labels, predictions_idx
-> 1653 return (_select_class_id(labels, selected_id),
1654 _select_class_id(predictions_idx, selected_id))
1655
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/metrics_impl.py in _select_class_id(ids, selected_id)
1627 filled_selected_id = array_ops.fill(
1628 filled_selected_id_shape, math_ops.to_int64(selected_id))
-> 1629 result = sets.set_intersection(filled_selected_id, ids)
1630 return sparse_tensor.SparseTensor(
1631 indices=result.indices, values=result.values, dense_shape=ids_shape)
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/sets_impl.py in set_intersection(a, b, validate_indices)
191 intersections.
192 """
--> 193 a, b, _ = _convert_to_tensors_or_sparse_tensors(a, b)
194 return _set_operation(a, b, "intersection", validate_indices)
195
/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/sets_impl.py in _convert_to_tensors_or_sparse_tensors(a, b)
82 b = sparse_tensor.convert_to_tensor_or_sparse_tensor(b, name="b")
83 if b.dtype.base_dtype != a.dtype.base_dtype:
---> 84 raise TypeError("Types don't match, %s vs %s." % (a.dtype, b.dtype))
85 if (isinstance(a, sparse_tensor.SparseTensor) and
86 not isinstance(b, sparse_tensor.SparseTensor)):
TypeError: Types don't match, <dtype: 'int64'> vs <dtype: 'float32'>.
Below is a simple example of using this metric.
sess = tf.Session()
predictions = tf.constant([[0.1, 0.3, 0.2, 0.4], [0.1, 0.2, 0.3, 0.4]],
dtype=tf.float32)
labels = tf.constant([3, 2], tf.int64)
precision_op, update_op = tf.metrics.sparse_precision_at_k(
labels=labels,
predictions=predictions,
k=1,
class_id=3)
sess.run(tf.local_variables_initializer())
print(sess.run(update_op))
This example prints 0.5 because our predictions put class 3 first for both examples and only one of them is correct.
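The arithmetic behind that 0.5 can be checked with a plain-Python sketch. This is a simplified re-implementation for a single label per example, not TensorFlow's own code, and the function name precision_at_k is chosen only for this example.

```python
def precision_at_k(labels, predictions, k, class_id):
    """Precision of `class_id` among each example's top-k predicted classes."""
    true_pos = false_pos = 0
    for label, scores in zip(labels, predictions):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if class_id in ranked[:k]:
            if label == class_id:
                true_pos += 1
            else:
                false_pos += 1
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")

predictions = [[0.1, 0.3, 0.2, 0.4], [0.1, 0.2, 0.3, 0.4]]
labels = [3, 2]
result = precision_at_k(labels, predictions, k=1, class_id=3)
print(result)  # 0.5
```

Both examples rank class 3 first, giving one true positive (label 3) and one false positive (label 2).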
The two returned ops (precision_op and update_op) can be confusing. Please read this guide: https://www.tensorflow.org/api_guides/python/contrib.metrics. It talks about "streaming" metrics, but the same logic applies to all metrics. Basically, update_op updates the metric's variables using the predictions and labels you gave it, while precision_op is idempotent: it simply returns the current value of the metric. If you never run update_op, the current value of the metric is undefined, likely nan.
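The split between an updating op and a read-only op can be sketched in plain Python. This is only an analogy under simplifying assumptions (binary labels and predictions, a made-up class name StreamingPrecision), not how TensorFlow implements its metrics.

```python
class StreamingPrecision:
    """Keeps running counts; update() changes state, result() only reads it."""

    def __init__(self):
        self.true_pos = 0
        self.false_pos = 0

    def update(self, labels, predicted):
        """Fold one batch into the running counts and return the new value."""
        for label, pred in zip(labels, predicted):
            if pred == 1:
                if label == 1:
                    self.true_pos += 1
                else:
                    self.false_pos += 1
        return self.result()

    def result(self):
        """Read the current value without modifying any state."""
        total = self.true_pos + self.false_pos
        return self.true_pos / total if total else float("nan")

metric = StreamingPrecision()
before = metric.result()                # nan: nothing accumulated yet
first = metric.update([1, 0], [1, 1])   # 1 TP, 1 FP -> 0.5
second = metric.update([1, 1], [1, 1])  # 3 TP, 1 FP -> 0.75
```

Here update plays the role of update_op and result plays the role of precision_op: calling result repeatedly never changes the value, while each update folds another batch into the running counts.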
In regard to your code, the shapes are not correct. In the simplest case, labels should just give the correct label for each example in the batch. In your case, there are just two examples, so there should be just two labels. Also, you don't need to create variables yourself - sparse_precision_at_k does it for you.