DCN recommender system recommend function - tensorflow

I am working on a recommender system using DCN, following this tutorial https://www.tensorflow.org/recommenders/examples/dcn
But this tutorial lacks a recommend function that I can pass the user_id to and have it output a prediction, similar to what happens in the basic_ranking tutorial:
https://github.com/tensorflow/recommenders/blob/main/docs/examples/basic_ranking.ipynb
Is there a way to do that in DCN as well?
Thank you.

I have used the DCN model on a custom dataset, but followed the tutorial.
Following is the model training code (copy-pasted from the DCN tutorial):
def run_models(use_cross_layer, deep_layer_sizes, projection_dim=None, num_runs=5):
    models = []
    rmses = []
    for i in range(num_runs):
        model = DCN(use_cross_layer=use_cross_layer,
                    deep_layer_sizes=deep_layer_sizes,
                    projection_dim=projection_dim)
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate))
        models.append(model)
        model.fit(cached_train, epochs=epochs, verbose=False)
        metrics = model.evaluate(cached_test, return_dict=True)
        rmses.append(metrics["RMSE"])
    mean, stdv = np.average(rmses), np.std(rmses)
    return {"model": models, "mean": mean, "stdv": stdv}

dcn_result = run_models(use_cross_layer=True,
                        deep_layer_sizes=[192, 192])
After this you can choose a particular model from the list of models trained, for example:
model = dcn_result['model'][3]
Then you can pass a dictionary with your input features to the model to get recommendations with their scores, for example:
test_ratings = model({"user_id": np.array(["00012a2ce6f8dcda20d059ce98491703"]),
                      "customer_city": np.array(["sao paulo"])})
for product, score in test_ratings.items():
    print(f"{product}: {score}")
This will produce the output:
43ee88561093499d9e571d4db5f20b79: [[1.3206882]]
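If you want a reusable recommend function like the one in the basic_ranking tutorial, a minimal sketch could look like the following (the candidate_ids list and the "product_id" feature name are assumptions; they must match the features your model was trained on):
def recommend(model, user_id, city, candidate_ids, k=5):
    # Score every candidate product for this user in a single batch.
    n = len(candidate_ids)
    scores = model({
        "user_id": np.array([user_id] * n),
        "customer_city": np.array([city] * n),
        "product_id": np.array(candidate_ids),  # hypothetical item feature
    })
    scores = np.squeeze(np.array(scores))
    # Return the k highest-scoring (product, score) pairs.
    top = np.argsort(scores)[::-1][:k]
    return [(candidate_ids[i], float(scores[i])) for i in top]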

Related

How to compute the mean of weights of multiple models?

Hi, I'm a student working on a federated learning problem. Before doing it with the proper tools like OpenFL or Flower, I started a little experiment to try this technique locally.
I managed to train multiple models using IID data; now I'm struggling with the local_update() function that should collect the models, take all their weights, and compute the mean. I read some documentation of Keras and TensorFlow, which I'm using for my work, and I found some functions, but I can't get it to work properly.
Currently this is my local_update(), which is not working:
def local_update(self, models):
    weights = []
    # Take the weights of the models and compute the mean,
    # then return the weights to an updated model
    for model in models:
        for layer in model.layers:
            weights = layer.get_weights()
    # Compute the mean of weights
    weights = np.mean(weights, axis=0)
    for layer in self.model.layers:
        self.model.set_weights(weights)
    return self.model
In TensorFlow/Keras there are many ways to do this, but what is the best and simplest one?
Thank you in advance for the help!
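For reference, one simple way to do the averaging (a sketch, assuming all models share the same architecture) is to average the corresponding arrays returned by get_weights() and load the result with set_weights():
import numpy as np

def local_update(self, models):
    # One flat weight list per model (kernels and biases of all layers).
    all_weights = [model.get_weights() for model in models]
    # Average corresponding arrays element-wise across the models.
    mean_weights = [np.mean(arrays, axis=0) for arrays in zip(*all_weights)]
    self.model.set_weights(mean_weights)
    return self.model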

BERT for Text Summarization

I'm trying to build a text summarization model using a seq2seq architecture in Keras. I've followed this tutorial https://keras.io/examples/lstm_seq2seq/ and implemented it with an Embedding layer, which works fine. But now I want to use BERT. Can pretrained BERT embeddings be used in such a task? I usually see text classification, but not the encoder-decoder architecture, used with BERT.
I access the BERT model from TF Hub and have a Layer class implemented from this tutorial https://github.com/strongio/keras-bert/blob/master/keras-bert.ipynb. I also tokenize accordingly with the BERT tokenizer. Below is my model:
# Assuming tf.keras; BertLayer comes from the keras-bert notebook linked above.
from tensorflow.keras.layers import Input, LSTM, Dense, BatchNormalization

# Encoder: BERT embeddings -> batch norm -> LSTM (final states seed the decoder)
enc_in_id = Input(shape=(None,), name="Encoder-Input-Ids")
enc_in_mask = Input(shape=(None,), name="Encoder-Input-Masks")
enc_in_segment = Input(shape=(None,), name="Encoder-Input-Segment-Ids")
bert_encoder_inputs = [enc_in_id, enc_in_mask, enc_in_segment]
encoder_embeddings = BertLayer(name='Encoder-Bert-Layer')(bert_encoder_inputs)
encoder_embeddings = BatchNormalization(name='Encoder-Batch-Normalization')(encoder_embeddings)
encoder_lstm = LSTM(latent_size, return_state=True, name='Encoder-LSTM')
encoder_out, e_state_h, e_state_c = encoder_lstm(encoder_embeddings)
encoder_states = [e_state_h, e_state_c]

# Decoder: BERT embeddings -> batch norm -> LSTM initialized with encoder states
dec_in_id = Input(shape=(None,), name="Decoder-Input-Ids")
dec_in_mask = Input(shape=(None,), name="Decoder-Input-Masks")
dec_in_segment = Input(shape=(None,), name="Decoder-Input-Segment-Ids")
bert_decoder_inputs = [dec_in_id, dec_in_mask, dec_in_segment]
decoder_embeddings_layer = BertLayer(name='Decoder-Bert-Layer')
decoder_embeddings = decoder_embeddings_layer(bert_decoder_inputs)
decoder_batchnorm_layer = BatchNormalization(name='Decoder-Batch-Normalization-1')
decoder_batchnorm = decoder_batchnorm_layer(decoder_embeddings)
decoder_lstm = LSTM(latent_size, return_state=True, return_sequences=True, name='Decoder-LSTM')
decoder_out, _, _ = decoder_lstm(decoder_batchnorm, initial_state=encoder_states)

# Project each decoder step onto the vocabulary
dense_batchnorm_layer = BatchNormalization(name='Decoder-Batch-Normalization-2')
decoder_out_batchnorm = dense_batchnorm_layer(decoder_out)
decoder_dense_id = Dense(vocabulary_size, activation='softmax', name='Dense-Id')
dec_outputs_id = decoder_dense_id(decoder_out_batchnorm)
The model builds, and after a couple of epochs accuracy rises to 1 and the loss drops below 0.5, but the predictions are awful. Since I'm working on a dev set of 5 samples, with at most 30 WordPiece tokens, and predicting on the same data, I only get the first one or maybe two tokens right; then the model just repeats the last seen token or the [PAD] token.
There are different methods for summarizing a text, i.e. extractive and abstractive.
Extractive summarization means identifying important sections of the text and generating them verbatim, producing a subset of the sentences from the original text; abstractive summarization reproduces important material in a new way after interpretation and examination of the text, using advanced natural language techniques to generate a new, shorter text that conveys the most critical information from the original one.
For a transformer-based approach you just need an additional attention layer, which you can add to an encoder-decoder model, or you can use pre-trained transformers (and maybe fine-tune them) like BERT, GPT, T5, etc.
You can have a look at: https://huggingface.co/transformers/
For abstractive summarization, T5 works pretty well. Here's a nice and simple example: https://github.com/faiztariq/FzLabs/blob/master/abstractive-text-summarization-t5.ipynb
For extractive summarization, you may take a look at: https://pypi.org/project/bert-extractive-summarizer/
There's a paper (Attention Is All You Need) that explains transformers pretty well; you may also take a look at it: https://arxiv.org/abs/1706.03762
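To illustrate the T5 route mentioned above, here is a minimal sketch using the Hugging Face transformers API (the t5-small checkpoint and the generation settings are just illustrative choices):
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

text = "..."  # the document you want to summarize
# T5 is a text-to-text model, so the task is passed as a "summarize:" prefix.
inputs = tokenizer("summarize: " + text, return_tensors="pt",
                   max_length=512, truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=60,
                             num_beams=4, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))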
I think this work might prove helpful; there are many other text summarization models that you can try out here, and they also come with their own blogs discussing in detail how they were made.
Hope this is helpful!

How to initialize a keras tensor employed in an API model

I am trying to implement a memory-augmented neural network, in which the memory and the read/write/usage weight vectors are updated according to a combination of their previous values. These weights are different from the classic weight matrices between layers that are automatically updated by the fit() function! My problem is the following: how can I correctly initialize these weights as Keras tensors and use them in the model? I'll explain it better with the following simplified example.
My functional API model is something like:
input = Input(shape=(5, 6))
controller = LSTM(20, activation='tanh', stateful=False, return_sequences=True)(input)
write_key = Dense(4, activation='tanh')(controller)
read_key = Dense(4, activation='tanh')(controller)
w_w = Add()([w_u, w_r])                              # <---- UPDATE OF WRITE WEIGHTS
to_write = Dot()([w_w, write_key])
M = Add()([M, to_write])
cos_sim = Dot()([M, read_key])
w_r = Lambda(lambda x: softmax(x, axis=1))(cos_sim)  # <---- UPDATE OF READ WEIGHTS
w_u = Add()([w_u, w_r, w_w])                         # <---- UPDATE OF USAGE WEIGHTS
retrieved_memory = Dot()([w_r, M])
controller_output = concatenate([controller, retrieved_memory])
final_output = Dense(6, activation='sigmoid')(controller_output)
You can see that, in order to compute w_w^t, I first have to define w_r^{t-1} and w_u^{t-1}. So, at the beginning, I have to provide a valid initialization for these vectors. What is the best way to do it? The initializations I would like to have are:
M = K.variable(numpy.zeros((10, 4)))    # MEMORY
w_r = K.variable(numpy.zeros((1, 10)))  # READ WEIGHTS
w_u = K.variable(numpy.zeros((1, 10)))  # USAGE WEIGHTS
But, analogously to what is said in #2486 (entron), these commands do not return a Keras tensor with all the needed metadata, so this produces the following error:
AttributeError: 'NoneType' object has no attribute 'inbound_nodes'
I also thought of using the old M, w_r and w_u as further inputs at each iteration and, analogously, getting the same variables as outputs to close the loop. But this means that I would have to use the fit() function to train the model online with just the target as final output (Model 1), and employ the predict() function on the model with all the secondary outputs (Model 2) to get the variables to use at the next iteration. I would also have to pass the weight matrices from Model 1 to Model 2 using get_weights() and set_weights(). As you can see, it becomes a bit messy and too slow.
Do you have any suggestions for this problem?
P.S. Please don't focus too much on the API model above, because it is a simplified (almost meaningless) version of the complete one, in which I skipped several key steps.
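Not the asker's code, but one common workaround in current tf.keras (a sketch with illustrative names) is to create this kind of state inside a custom layer via add_weight(), so the resulting tensor carries the Keras metadata that a bare K.variable() lacks:
import tensorflow as tf

class MemoryState(tf.keras.layers.Layer):
    """Exposes a persistent, zero-initialized tensor (e.g. the memory M)."""
    def __init__(self, state_shape, **kwargs):
        super().__init__(**kwargs)
        self.state_shape = state_shape

    def build(self, input_shape):
        # Non-trainable: updated by the model's own ops, not by fit().
        self.state = self.add_weight(name='state', shape=self.state_shape,
                                     initializer='zeros', trainable=False)

    def call(self, inputs):
        # The input only ties the layer into the graph; the output is the state.
        return self.state

# Usage sketch: M = MemoryState((10, 4))(controller), and similarly for w_r and w_u.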

Tensorflow: training and testing in the same graph with input queues

I am facing an issue that I can't solve with what I found on the internet.
I have built my neural network and connected it to an input pipeline,
reading data from TFRecords with tf.train.batch, QueueRunners, Coordinators, etc.
I have built my NN in a Python class named "Model" that I use like:
model = Model(...all hyperparameter here...)
...
model.predict()
or
model.step()
The whole training phase works very well.
But now I would like to add a test phase every X epochs/steps of training.
I really don't know how to do this.
I have several ideas, but I can't find the best one:
Duplicate the code in my class to get loss_train and loss_test, and so on for each node of my graph (sharing variables between train and test)?
Create 2 instances of my model:
model_train = Model(reuse=False)
model_test = Model(reuse=True)
Use tf.make_template? I really can't find any good example of this function...
Any other solution?
I would appreciate any suggestion.
I came across the same problem when experimenting with TFRecord datasets. There are several possibilities. Since I wanted to do this on a computer with only one GPU anyway, I implemented it as follows:
# Training dataset
train_dataset = tf.contrib.data.TFRecordDataset(train_files)
train_dataset = train_dataset.map(parse_function)
train_dataset = train_dataset.shuffle(buffer_size=10000)
train_dataset = train_dataset.batch(200)

# Validation dataset
validation_dataset = tf.contrib.data.TFRecordDataset(val_files)
validation_dataset = validation_dataset.map(parse_function)
validation_dataset = validation_dataset.batch(200)

# A feedable iterator is defined by a handle placeholder and its structure. We
# could use the `output_types` and `output_shapes` properties of either
# `train_dataset` or `validation_dataset` here, because they have identical
# structure.
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.contrib.data.Iterator.from_string_handle(
    handle, train_dataset.output_types, train_dataset.output_shapes)
next_element = iterator.get_next()

# Generate the iterators
training_iterator = train_dataset.make_initializable_iterator()
validation_iterator = validation_dataset.make_one_shot_iterator()

# The `Iterator.string_handle()` method returns a tensor that can be evaluated
# and used to feed the `handle` placeholder.
training_handle = sess.run(training_iterator.string_handle())
validation_handle = sess.run(validation_iterator.string_handle())
Then, to access the elements, you can just do:
img, lbl = sess.run(next_element, feed_dict={handle: training_handle})
and exchange the handle depending on what you want to do at the moment.
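For the "test phase every X steps" part of the question, the loop might look like this sketch (the step counts are illustrative, and a one-shot validation iterator raises OutOfRangeError once exhausted):
# Initialize the training iterator once, then alternate handles as needed.
sess.run(training_iterator.initializer)
for step in range(100000):
    img, lbl = sess.run(next_element, feed_dict={handle: training_handle})
    # ... run a training step on (img, lbl) ...
    if step % 1000 == 0:
        val_img, val_lbl = sess.run(next_element,
                                    feed_dict={handle: validation_handle})
        # ... compute validation metrics on (val_img, val_lbl) ...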
Keep in mind, however, that this is not parallelizable. Following this link, you can get insight into the different methods of creating multiple input pipelines: Tensorflow | Reading Data.

access trained parameters in CNTK

With a model like this, how can one access the trained parameters, such as the weights and biases of each layer?
model = Sequential([
    Dense(xx, activation=cntk.sigmoid),
    Dense(outputs)])
z = model(features)
Thanks.
The specific mechanisms are shown in this tutorial. Here is the sample that shows how to access the parameters:
model = create_model()
print(len(model.layers))        # number of layers in the model
print(model.layers[0].E.shape)  # shape of the first layer's parameter (E is an embedding matrix in that tutorial)
print(model.layers[2].b.value)  # bias values of the third layer
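Applied to the Sequential model from the question, a sketch (assuming the cntk.layers library, where Dense layers expose their parameters as W and b) would be:
print(model.layers[0].W.value)  # weight matrix of the first Dense layer
print(model.layers[0].b.value)  # bias vector of the first Dense layer
print(model.layers[1].W.shape)  # shape of the output layer's weights

# Alternatively, enumerate every parameter of the applied function:
for p in z.parameters:
    print(p.name, p.shape)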