latest aws xgb image does not support reg: - xgboost

The following code snippet is inspired by this.
hyperparameters = {
"max_depth":"5",
"eta":"0.2",
"gamma":"4",
"min_child_weight":"6",
"subsample":"0.7",
"objective":"reg:squarederror",
"num_round":"10"}
output_path = 's3://{}/{}/output'.format(s3_bucket_name, s3_prefix)
estimator = sagemaker.estimator.Estimator(image_uri=sagemaker.image_uris.retrieve("xgboost", region_name, "1.2-2"),
hyperparameters=hyperparameters,
role=role,
instance_count=1,
instance_type='ml.m5.2xlarge',
volume_size=1, # 1 GB
output_path=output_path)
estimator.fit({'train': s3_input_train, 'validation': s3_input_val})
It works fine. I was trying to use:
training_image_name = image_uris.retrieve(framework='xgboost', region=region_name, version='latest')
instead of:
sagemaker.image_uris.retrieve("xgboost", region_name, "1.2-2")
to (I believe) get hold of the latest training image but reg:squarederror is not supported? Is my code to get hold of the latest image name incorrect?

Using "latest" it not suggested as per documentation(see note): https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html
Use specific versions as they are more stable.

Related

How to use tf.data.Dataset.ignore_errors to ignore errors in a Tensorflow Dataset?

When loading images from a directory in Tensorflow, you use something like:
dataset = tf.keras.utils.image_dataset_from_directory(
"S:\\Images",
batch_size=32,
image_size=(128,128),
label_mode=None,
validation_split=0.20, #Reserve 20% of images for validation
subset='training', #If we specify a validation_split, we *must* specify subset
seed=619 #If using validation_split we *must* specify a seed to ensure there is no overlap between training and validation data
)
But of course some of the images (.jpg, .png, .gif, .bmp) will be invalid. So we want to ignore those errors; just skip them (and ideally log the filenames so they can be repaired, removed, or deleted).
There have been some ideas along the way of how to ignore invalid images:
Method 1: tf.contrib.data.ignore_errors (Tensorflow 1.x only)
Warning: The tf.contrib module will not be included in TensorFlow 2.0.
Sample usage:
dataset = dataset.apply(tf.contrib.data.ignore_errors())
The only down-side of this method is that it was only available in Tensorflow 1. Trying to use it today simply won't work, as the tf.contib namespace no longer exists. That led to a built-in method:
Method 2: tf.data.experimental.ignore_errors(log_warning=False) (deprecated)
From the documentation:
Creates a Dataset from another Dataset and silently ignores any errors. (deprecated)
Deprecated: THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Use tf.data.Dataset.ignore_errors instead.
Sample usage:
dataset = dataset.apply(tf.data.experimental.ignore_errors(log_warning=True))
And this method works. It works great. And it has the advantage of working.
But it's apparently deprecated, and they documentation says we should use method 3:
Method 3 - tf.data.Dataset.ignore_errors(log_warning=False, name=None)
Drops elements that cause errors.
Sample usage:
dataset = dataset.ignore_errors(log_warning=True, name="Loading images from directory")
Except it doesn't work
The dataset.ignore_errors attribute doesn't work, and gives the error:
AttributeError: 'BatchDataset' object has no attribute 'ignore_errors'
Which means:
the thing that works is deprecated
they tell us to use this other thing instead
and "provide the instructions for updating"
but the other thing doesn't work
So we ask Stackoverflow:
How do i use tf.data.Dataset.ignore_errors to ignore errors?
Bonus Reading
TensorFlow Dataset `.map` - Is it possible to ignore errors?
TensorFlow: How to skip broken data
Untested Workaround
Not only is it not what i was asking, but people are not allowed to read this:
It looks like the tf.data.Dataset.ignore_errors() method is not
available in the BatchDataset object, which is what you are using in
your code. You can try using tf.data.Dataset.filter() to filter out
elements that cause errors when loading the images. You can use a
try-except block inside the lambda function passed to filter() to
catch the errors and return False for elements that cause errors,
which will filter them out. Here's an example of how you can use
filter() to achieve this:
def filter_fn(x):
try:
# Load the image and do some processing
# Return True if the image is valid, False otherwise
return True
except:
return False
dataset = dataset.filter(filter_fn)
Alternatively, you can use the tf.data.experimental.ignore_errors()
method, which is currently available in TensorFlow 2.x. This method
will silently ignore any errors that occur while processing the
elements of the dataset. However, keep in mind that this method is
experimental and may be removed or changed in a future version.
tf.data.Dataset.ignore_errors was introduced in TensorFlow 2.11. You can use tf.data.experimental.ignore_errors for older versions like so:
dataset.apply(tf.data.experimental.ignore_errors())

Problem with connecting google Colab with google Cloud TPUs

I have this code which based on t5 notebook (https://colab.research.google.com/github/google-research/text-to-text-transfer-transformer/blob/master/notebooks/t5-trivia.ipynb)
FINETUNE_STEPS = 3000##param {type: "integer"}
model.finetune(
mixture_or_task_name="text_diacritization_short",
pretrained_model_dir=PRETRAINED_DIR,
finetune_steps=FINETUNE_STEPS
)
my code was working fine in 8 Augustus then something happened resulting of this error.
these two lines appeared when my model worked so i don't think they are the problem.
INFO:root:system_path_file_exists:gs://my_bucket/my_file/models/small/operative_config.gin
ERROR:root:Path not found: gs://my_bucket/my_file/models/small/operative_config.gin
Rest of the error.
From /usr/local/lib/python3.7/dist-packages/tensorflow/python/training/training_util.py:399: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:absl:Using an uncached FunctionDataset for training is not recommended since it often results in insufficient shuffling on restarts, resulting in overfitting. It is highly recommended that you cache this task before training with it or use a data source that supports lower-level shuffling (e.g., FileDataSource).
SimdMeshImpl ignoring devices ['', '', '', '', '', '', '', '']
Using default tf glorot_uniform_initializer for variable encoder/block_000/layer_000/SelfAttention/relative_attention_bias The initialzer will guess the input and output dimensions based on dimension order.
Using default tf glorot_uniform_initializer for variable decoder/block_000/layer_000/SelfAttention/relative_attention_bias The initialzer will guess the input and output dimensions based on dimension order.
From /usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py:1161: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
From /usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py:758: Variable.load (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Prefer Variable.assign which has equivalent behavior in 2.X.
I changed the google cloud account and the Colab notebook to completely new gmail account, I think the problem is that something got updated in google Colab regarding connecting to Google Cloud TPUs.
Also, I can connect to my bucket normally using this code.
BASE_DIR = "gs://my_bucket/my_file" ##param { type: "string" }
if not BASE_DIR or BASE_DIR == "gs://":
raise ValueError("You must enter a BASE_DIR.")
DATA_DIR = os.path.join(BASE_DIR, "data")
FINETUNE_MODELS_DIR = os.path.join(BASE_DIR, "models")
ON_CLOUD = True
if ON_CLOUD:
print("Setting up GCS access...")
import tensorflow_gcs_config
from google.colab import auth
# Set credentials for GCS reading/writing from Colab and TPU.
TPU_TOPOLOGY = "v2-8"
try:
tpu = tf.distribute.cluster_resolver.TPUClusterResolver() # TPU detection
TPU_ADDRESS = tpu.get_master()
print('Running on TPU:', TPU_ADDRESS)
except ValueError:
raise BaseException('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')
auth.authenticate_user()
tf.enable_eager_execution()
tf.config.experimental_connect_to_host(TPU_ADDRESS)
tensorflow_gcs_config.configure_gcs_from_colab_auth()
tf.disable_v2_behavior()
# Improve logging.
from contextlib import contextmanager
import logging as py_logging
if ON_CLOUD:
tf.get_logger().propagate = False
py_logging.root.setLevel('INFO')
#contextmanager
def tf_verbosity_level(level):
og_level = tf.logging.get_verbosity()
tf.logging.set_verbosity(level)
yield
tf.logging.set_verbosity(og_level)
it would be great if someone can help me I have been looking in the issue for a week and found nothing, is there any changes to how Google Colab works that I am not aware of.
Thanks in advance.

Streamlit with Tensorflow to analyse image and return the probability if is positive or negative

I'm trying to use Tensorflow to Machine Learning to analyze an image and return the probability if is positive or negative based on a model created (extension .h5). I couldn't found a documentation exactly for that, or repository, so even a link to read will be awesome.
Link for the application: https://share.streamlit.io/felipelx/hackathon/IDC_Detector.py
Libraries that I'm trying to use.
import numpy as np
import streamlit as st
import tensorflow as tf
from keras.models import load_model
The function to load the model.
#st.cache(allow_output_mutation=True)
def loadIDCModel():
model_idc = load_model('models/IDC_model.h5', compile=False)
model_idc.summary()
return model_idc
The function to work the image, and what I'm trying to see: model.predict - I can see but is not updating the %, independent of the image the value is always the same.
if uploaded_file is not None:
# transform image to numpy array
file_bytes = tf.keras.preprocessing.image.load_img(uploaded_file, target_size=(96,96), grayscale = False, interpolation = 'nearest', color_mode = 'rgb', keep_aspect_ratio = False)
c.image(file_bytes, channels="RGB")
Genrate_pred = st.button("Generate Prediction")
if Genrate_pred:
model = loadMetModel()
input_arr = tf.keras.preprocessing.image.img_to_array(file_bytes)
input_arr = np.array([input_arr])
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
prediction = probability_model.predict(input_arr)
dict_pred = {0: 'Benigno/Normal', 1: 'Maligno'}
result = dict_pred[np.argmax(prediction)]
value = 0
if result == 'Benigno/Normal':
value = str(((prediction[0][0])*100).round(2)) + '%'
else:
value = str(((prediction[0][1])*100).round(2)) + '%'
c.metric('Predição', result, delta=value, delta_color='normal')
Thank you in advance to any help.
The first thing I'm noticing is that your function for loading the model is named loadIDCModel, but then the function you call for loading the model is loadMetModel. When I check your source code, though, it looks like you've already addressed this issue. I'd recommend updating your question to reflect this.
Playing around with your application, I think the issue is your model itself. I tried various images — images containing carcinomas, and even a picture of a cat — and each gave me a probability around 73%. The lowest score I got was 72.74%, and the highest was 73.11% (this one was the cat). It seems that the output percentage is varying slightly, hinting that rather than something being wrong in the code, your model itself is likely at fault. You might need to retrain your model, as it seems to have learned to always return a value of approximately 0.73.

Adding metdata to tensorflow lite file

I have retrained a MobileNet model with tweaks in the model and custom outputs in TensorFlow. I have to run the model on Android using Google ML KIT but the problem is it requires metadata. But whenever I go through he process it is giving me an error:
ValueError: File, '/content/drive/My Drive/labels.txt', is recorded in the metadata, but has not been loaded into the populator.
Here is my code for adding metadata:
model_metadata=_metadata_fb.ModelMetadataT()
model_metadata.name="MobileNet_with_Metadata"
model_metadata.description="This model is trained on plant village leaf disease dataset so that it can be used for detectiong crop diseases"
model_metadata.version="v1.0.0.0"
model_metadata.author="open-source"
model_metadata.license=("Apache License. Version 2.0 "
"http://www.apache.org/licenses/LICENSE-2.0.")
input_metadata=_metadata_fb.TensorMetadataT()
output_metadata=_metadata_fb.TensorMetadataT()
input_metadata.name="image"
input_metadata.description="input_meta.description = (
"Input image to be classified. The expected image is {0} x {1}, with "
"three channels (red, blue, and green) per pixel. Each value in the "
"tensor is a single byte between 0 and 1.".format(256, 256))"
input_normalization = _metadata_fb.ProcessUnitT()
input_normalization.optionsType = (
_metadata_fb.ProcessUnitOptions.NormalizationOptions)
input_normalization.options = _metadata_fb.NormalizationOptionsT()
input_normalization.options.mean = [127.5]
input_normalization.options.std = [127.5]
input_metadata.processUnits = [input_normalization]
input_stats = _metadata_fb.StatsT()
input_stats.max = [255]
input_stats.min = [0]
input_metadata.stats = input_stats
output_metadata.name="Probability"
output_metadata.description="Probabbility of 50 classes"
output_stats = _metadata_fb.StatsT()
output_stats.max = [1.0]
output_stats.min = [0.0]
output_metadata.stats = output_stats
label_file = _metadata_fb.AssociatedFileT()
label_file.name = "/content/drive/My Drive/labels.txt"
label_file.description = "Labels for objects that the model can recognize."
label_file.type = _metadata_fb.AssociatedFileType.TENSOR_AXIS_LABELS
output_metadata.associatedFiles = [label_file]
subgraph = _metadata_fb.SubGraphMetadataT()
subgraph.inputTensorMetadata = [input_metadata]
subgraph.outputTensorMetadata = [output_metadata]
model_metadata.subgraphMetadata = [subgraph]
b = flatbuffers.Builder(0)
b.Finish(
model_metadata.Pack(b),
_metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER)
metadata_buf = b.Output()
populator = _metadata.MetadataPopulator.with_model_file("/content/drive/My Drive/MobileNet_Model_latest.tflite")
populator.load_metadata_buffer(metadata_buf)
populator.load_associated_files(["/content/drive/My Drive/labels.txt"])
populator.populate()
I am doing this for first time and I couldn't get any proper help or documentation. Also, can anyone tell me properly on how to add meta data to the tflite model?
I have referred this link: Add Metadata to TensorFlow
I have faced this issue before while trying to deploy TF2 object detection models to mlkit vision sample. If I remember correctly it was trying to read labels from tflite metadata instead of the labelmap in assets folder.
There are existing issues on github for this metadata problem. Seems like it is not yet fixed as others pointed out. There are a few ways to solve it. Have a look at these issues,
https://github.com/tensorflow/models/issues/9341
https://github.com/tensorflow/tensorflow/issues/43583
Edit: The way shown below is not correct for adding metadata outputs. Correct way should in links above.
For adding metadata to object detection model have a look at these,
https://nbviewer.jupyter.org/github/quickgrid/CodeLab/blob/master/tensorflow/TFlite_Object_Detection_Custom_Model_Export_With_Metadata_TF1.ipynb
https://nbviewer.jupyter.org/github/quickgrid/CodeLab/blob/master/tensorflow/TFlite_Object_Detection_Custom_Model_Export_With_Metadata_TF2.ipynb
ok, I think you were guided by the mentioned code sections, but at the top the paragraph gives you a link to a github with the complete metadata code: https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/metadata/metadata_writer_for_image_classifier.py#L63-L74

TensorFlow: Opening log data written by SummaryWriter

After following this tutorial on summaries and TensorBoard, I've been able to successfully save and look at data with TensorBoard. Is it possible to open this data with something other than TensorBoard?
By the way, my application is to do off-policy learning. I'm currently saving each state-action-reward tuple using SummaryWriter. I know I could manually store/train on this data, but I thought it'd be nice to use TensorFlow's built in logging features to store/load this data.
As of March 2017, the EventAccumulator tool has been moved from Tensorflow core to the Tensorboard Backend. You can still use it to extract data from Tensorboard log files as follows:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
event_acc = EventAccumulator('/path/to/summary/folder')
event_acc.Reload()
# Show all tags in the log file
print(event_acc.Tags())
# E. g. get wall clock, number of steps and value for a scalar 'Accuracy'
w_times, step_nums, vals = zip(*event_acc.Scalars('Accuracy'))
Easy, the data can actually be exported to a .csv file within TensorBoard under the Events tab, which can e.g. be loaded in a Pandas dataframe in Python. Make sure you check the Data download links box.
For a more automated approach, check out the TensorBoard readme:
If you'd like to export data to visualize elsewhere (e.g. iPython
Notebook), that's possible too. You can directly depend on the
underlying classes that TensorBoard uses for loading data:
python/summary/event_accumulator.py (for loading data from a single
run) or python/summary/event_multiplexer.py (for loading data from
multiple runs, and keeping it organized). These classes load groups of
event files, discard data that was "orphaned" by TensorFlow crashes,
and organize the data by tag.
As another option, there is a script
(tensorboard/scripts/serialize_tensorboard.py) which will load a
logdir just like TensorBoard does, but write all of the data out to
disk as json instead of starting a server. This script is setup to
make "fake TensorBoard backends" for testing, so it is a bit rough
around the edges.
I think the data are encoded protobufs RecordReader format. To get serialized strings out of files you can use py_record_reader or build a graph with TFRecordReader op, and to deserialize those strings to protobuf use Event schema. If you get a working example, please update this q, since we seem to be missing documentation on this.
I did something along these lines for a previous project. As mentioned by others, the main ingredient is tensorflows event accumulator
from tensorflow.python.summary import event_accumulator as ea
acc = ea.EventAccumulator("folder/containing/summaries/")
acc.Reload()
# Print tags of contained entities, use these names to retrieve entities as below
print(acc.Tags())
# E. g. get all values and steps of a scalar called 'l2_loss'
xy_l2_loss = [(s.step, s.value) for s in acc.Scalars('l2_loss')]
# Retrieve images, e. g. first labeled as 'generator'
img = acc.Images('generator/image/0')
with open('img_{}.png'.format(img.step), 'wb') as f:
f.write(img.encoded_image_string)
You can also use the tf.train.summaryiterator: To extract events in a ./logs-Folder where only classic scalars lr, acc, loss, val_acc and val_loss are present you can use this GIST: tensorboard_to_csv.py
Chris Cundy's answer works well when you have less than 10000 data points in your tfevent file. However, when you have a large file with over 10000 data points, Tensorboard will automatically sampling them and only gives you at most 10000 points. It is a quite annoying underlying behavior as it is not well-documented. See https://github.com/tensorflow/tensorboard/blob/master/tensorboard/backend/event_processing/event_accumulator.py#L186.
To get around it and get all data points, a bit hacky way is to:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
class FalseDict(object):
def __getitem__(self,key):
return 0
def __contains__(self, key):
return True
event_acc = EventAccumulator('path/to/your/tfevents',size_guidance=FalseDict())
It looks like for tb version >=2.3 you can streamline the process of converting your tb events to a pandas dataframe using tensorboard.data.experimental.ExperimentFromDev().
It requires you to upload your logs to TensorBoard.dev, though, which is public. There are plans to expand the capability to locally stored logs in the future.
https://www.tensorflow.org/tensorboard/dataframe_api
You can also use the EventFileLoader to iterate through a tensorboard file
from tensorboard.backend.event_processing.event_file_loader import EventFileLoader
for event in EventFileLoader('path/to/events.out.tfevents.xxx').Load():
print(event)
Surprisingly, the python package tb_parse has not been mentioned yet.
From documentation:
Installation:
pip install tensorflow # or tensorflow-cpu pip install -U tbparse # requires Python >= 3.7
Note: If you don't want to install TensorFlow, see Installing without TensorFlow.
We suggest using an additional virtual environment for parsing and plotting the tensorboard events. So no worries if your training code uses Python 3.6 or older versions.
Reading one or more event files with tbparse only requires 5 lines of code:
from tbparse import SummaryReader
log_dir = "<PATH_TO_EVENT_FILE_OR_DIRECTORY>"
reader = SummaryReader(log_dir)
df = reader.scalars
print(df)