Most recent output in Colab notebook - google-colaboratory

I forgot to capture the history (history = model.fit(...)). The model trained for 2 hours, and my cell looks like this:
model.fit( ....)
Is there a variable (as in matlab's "ans"), for the most recent output?
I tried output, out, out[-1], etc. None of them work.
Any help or workarounds would be appreciated.

The global dictionary Out holds all cell outputs. So, for example, if you executed the code in cell 20, Out[20] has the result.
You can also get the output of the most recent execution with the _ variable.

Related

How to use Huggingface Data Collator

I was following this tutorial which comes with this notebook.
I plan to use Tensorflow for my project, so I followed this tutorial and added the line
tokenized_datasets = tokenized_datasets["train"].to_tf_dataset(columns=["input_ids"], shuffle=True, batch_size=16, collate_fn=data_collator)
to the end of the notebook.
However, when I ran it, I got the following error:
RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and Long for the source.
Why didn't this work? How can I use the collator?
The issue is not your code, but how the collator is set up. (It's set up to not use Tensorflow by default.)
If you look at this, you'll see that their collator uses the return_tensors="tf" argument. If you add this to your collator, your code for using the collator will work.
In short, your collator creation should look like
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15, return_tensors="tf")
This will fix the issue.

ValueError: Error when checking input: expected keras_layer_input to have 4 dimensions, but got array with shape (10, 1)

Before this gets marked as a duplicate: I have already tried all of the similar questions. Most of them were never resolved, and where there was an answer, it did not work for my problem. The original code has more than 10 samples.
Input: list of model input np.arrays. sample_train_emb1 has length = 2
Problem: model.fit() error ValueError: Error when checking input: expected keras_layer_input to have 4 dimensions, but got array with shape (10, 1)
Here is my plot_model image:
The model.fit() looks like this:
model.fit(
sample_train_emb1,
sample_y_train,
validation_data=(sample_valid_emb1, sample_y_valid),
epochs=epoch,
batch_size=batch_size,
verbose=1,
)
Thank you! Let me know if you need more details to help me solve this problem. There are many similar posts that remain unresolved, so I thought this would help anybody who might face the same problem in the future.
What I've tried so far:
Swapping the two features.
Converting the image feature into a `TensorShape([Dimension(1), Dimension(224), Dimension(224), Dimension(3)])` based on a similar question's answer
I eventually figured it out, using the answer from this post:
sample_train_emb1[1] = np.array([x for x in sample_train_emb1[1]])
Hope this helps someone in the future.
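A minimal sketch of why that line can help, assuming the image feature was stored as a container of separate per-sample arrays rather than one dense array. The names `images` and `dense` and the small (4, 4, 3) sample shape are made up for illustration; the real data would be (224, 224, 3) per sample.

```python
import numpy as np

# Hypothetical stand-in for the image feature: an object array that
# holds ten separate (4, 4, 3) arrays instead of one dense 4-D array.
images = np.empty(10, dtype=object)
for i in range(10):
    images[i] = np.zeros((4, 4, 3), dtype=np.float32)

# The container itself is 1-D, so a model expecting 4-D input rejects it.
print(images.shape, images.dtype)

# Rebuilding from a plain list of the elements lets NumPy stack them
# along a new leading (batch) axis.
dense = np.array([x for x in images])
print(dense.shape, dense.dtype)

# np.stack does the same thing and fails loudly if the shapes differ.
stacked = np.stack(list(images))
```

If the per-sample shapes are not all identical, np.stack raises an error, which makes it a useful check before calling model.fit().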

Issue with np.vstack

I have split my data into training and test sets, and then split the training set again into a new train set and a validation set. I am implementing a random forest regression, so in the next step I apply the transformation below to these two sets and try to combine them into one. The issue is that np.vstack isn't returning the correct shape:
Output:
(2, 1) <- should have been (25455, 2394)
(21636, 2394)
(3819, 2394)
Could someone tell me what I am doing wrong?
Xtrain_rf, Xtest_rf = train_test_split(insurance_data_prep, test_size=0.15, random_state=42)
Xtrain2_rf, Xval_rf = train_test_split(Xtrain_rf, test_size=0.15, random_state=42)
full_transform_rf = ColumnTransformer([
    ("num", StandardScaler(), attributes_num),
    ("cat", OneHotEncoder(handle_unknown='ignore'), attributes_cat),
])
## fit transform in the train set
Xtrain2_rf_att_prepared = full_transform_rf.fit_transform(Xtrain2_rf_att)
## transform in the validation set
Xval_rf_att_prepared = full_transform_rf.transform(Xval_rf_att)
whole_train_set_attributes_rf = np.vstack((Xtrain2_rf_att_prepared, Xval_rf_att_prepared))
print(whole_train_set_attributes_rf.shape)
print(Xtrain2_rf_att_prepared.shape)
print(Xval_rf_att_prepared.shape)
I tried to replicate your code with a dataset of my own, and one problem I ran into is that the Xtrain2_rf_att and Xval_rf_att variables are not defined in your code snippet. Those are the variables that you passed to your ColumnTransformer.
I guess that in your case the variables are defined but what you really want to pass to the ColumnTransformer are Xtrain2_rf and Xval_rf.
If this does not solve your problem, could you edit your question and add the following information?:
insurance_data_prep shape and columns (or some of them)
attributes_num and attributes_cat
clearly provide the outputs from all three print statements at the end of your code, not just one of them
You might need to use a minimal reproducible example (MRE) to replace your dataframe, we will see
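Another possible cause, not confirmed anywhere in this thread: with a OneHotEncoder inside it, a ColumnTransformer can return SciPy sparse matrices. np.vstack does not understand those and wraps each one as a single object, which would be consistent with the (2, 1) shape reported. A minimal sketch under that assumption, with small made-up matrices `train` and `val` standing in for the transformed sets:

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-ins for the transformed train and validation sets.
train = sparse.csr_matrix(np.eye(4, 3))
val = sparse.csr_matrix(np.ones((2, 3)))

# np.vstack treats each sparse matrix as one opaque object.
naive = np.vstack((train, val))
print(naive.shape, naive.dtype)

# scipy.sparse.vstack stacks the rows and keeps the result sparse.
stacked = sparse.vstack((train, val))
print(stacked.shape)  # (6, 3)

# Alternatively, densify first if the data fits in memory.
dense = np.vstack((train.toarray(), val.toarray()))
print(dense.shape)  # (6, 3)
```

A quick way to check whether this applies is scipy.sparse.issparse(Xtrain2_rf_att_prepared).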

I'm getting "DataFrame object is not callable" on k=f1_score(temp, train_y)

Here, temp is an array in which I had stored prediction values. y_train is the dataframe with the target variable values. y_train has only one column with the target values.
This is weird because I've used this code before and it worked fine. But suddenly it's giving this error: "DataFrame object is not callable"
Anyone know the reason for this?
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

train_f1 = []
test_f1 = []
for i in range(10, 100, 10):
    clf = RandomForestClassifier(min_samples_split=i)
    clf.fit(x_train2, y_train2)
    # F1 on the training set
    temp = clf.predict(x_train2)
    temp = f1_score(y_train2, temp)
    train_f1.append(temp)
    # F1 on the validation set
    tmp = clf.predict(x_valid)
    tmp = f1_score(y_valid, tmp)
    test_f1.append(tmp)
I'm using jupyter notebook and I'm using the above code twice in my file, once with RandomForest and the second time with Knn. Both are otherwise identical. This error appears in whichever block of code I run second. Earlier, I was running the randomforest loop first and the knn loop second and the knn loop was showing this error. Now, I ran the knn block first and randomforest second. Now the randomforest loop gives this error, even if the code was unchanged.
Should I convert y_train2 dataframe into an array and give the array as the input to the f1_score function? However, I don't know how to convert it. y_train2 has a single column. But then again, this same code (with the dataframe as input) works fine on whichever loop I run first.
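One possible explanation, offered as an assumption because the thread has no answer: somewhere between the two loops, a name that is later called like a function (for example f1_score) gets rebound to a DataFrame, so the first loop works and the second fails. The sketch below reproduces that error with a hypothetical stand-in scorer `f1_like` (not scikit-learn's implementation) and also shows one way to turn a single-column DataFrame into a 1-D array.

```python
import numpy as np
import pandas as pd

def f1_like(y_true, y_pred):
    """Hypothetical stand-in for a binary F1 scorer."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return 2 * tp / (2 * tp + fp + fn)

# Single-column DataFrame of targets, as described in the question.
y_train2 = pd.DataFrame({"target": [1, 0, 1, 1]})
pred = np.array([1, 0, 0, 1])

# Convert the single column to a 1-D NumPy array.
y_1d = y_train2.to_numpy().ravel()

scorer = f1_like
first = scorer(y_1d, pred)          # works: scorer is still a function

# Rebinding the same name to a DataFrame breaks every later call.
scorer = pd.DataFrame({"f1": [first]})
try:
    scorer(y_1d, pred)
except TypeError as err:
    message = str(err)
    print(message)
```

If this is the cause, checking type(f1_score) just before the failing call would show a DataFrame, and re-running the import (or renaming the variable that holds the results) would restore the function.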

How to visualize a tensor summary in tensorboard

I'm trying to visualize a tensor summary in tensorboard. However I can't see the tensor summary at all in the board. Here is my code:
out = tf.strided_slice(logits, begin=[self.args.uttWindowSize-1, 0], end=[-self.args.uttWindowSize+1, self.args.numClasses],
strides=[1, 1], name='softmax_truncated')
tf.summary.tensor_summary('softmax_input', out)
where out is a multi-dimensional tensor. I guess there must be something wrong with my code. Probably I used the tensor_summary function incorrectly.
You create a summary op, but you never run it or write the result to disk (see the documentation).
To actually create a summary you need to do the following:
# Create a summary operation
summary_op = tf.summary.tensor_summary('softmax_input', out)
# Create the summary
summary_str = sess.run(summary_op)
# Create a summary writer (named tf.summary.FileWriter from TensorFlow 1.0 onward)
writer = tf.train.SummaryWriter(...)
# Write the summary
writer.add_summary(summary_str)
Explicitly writing a summary (last two lines) is only necessary if you don't have a higher level helper like a Supervisor. Otherwise you invoke
sv.summary_computed(sess, summary_str)
and the Supervisor will handle it.
For more information, see also:
How to manually create a tf.Summary()
Here is a workaround that hopefully achieves what you want.
If you wish to view the tensor values, you can convert them using as_string, then use summary.text. The values will appear in the tensorboard text tab.
I have not tried this with 3D tensors, but feel free to slice as needed.
The code snippet below also inserts a print op (tf.Print) to get console output:
predictions = tf.argmax(reshaped_logits, 1)
txtPredictions = tf.Print(tf.as_string(predictions),[tf.as_string(predictions)], message='predictions', name='txtPredictions')
txtPredictions_op = tf.summary.text('predictions', txtPredictions)
Not sure whether this is kinda obvious, but you could use something like
def make_tensor_summary(tensor, name='defaultTensorName'):
    # Emit one scalar summary per entry of a 2-D tensor with a static shape
    rows, cols = tensor.get_shape().as_list()
    for i in range(rows):
        for j in range(cols):
            tf.summary.scalar(name + str(i) + '_' + str(j), tensor[i, j])
in case you know it is a 'matrix-shaped' Tensor in advance.