"Empty table from specified data source" error in Create ML - createml

I'm trying to train a new object detection model using the Create ML tool from Apple. I've already used RectLabel to generate annotations for all of the JPEG images in my directory of training images.
However, every time I try loading the directory in Create ML, I receive this error message:
Empty table from specified data source
I already looked at a thread on the Apple Developer forums, but it incorrectly claims the problem was solved in a previous update.
What causes this error? How can I get Create ML to accept my training data?
I'm using Create ML Version 2.0 (53.2.2) and RectLabel Version 3.04.2 (3.04.2) on macOS Big Sur 11.0.1 (20B29).

The “Empty table from specified data source” error occurs if any of the filenames contain spaces.
My solution was to rename all the files so the filenames don't contain spaces.
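If you want to script the renaming, here is a minimal sketch (the folder name is hypothetical, and keep in mind that annotations.json refers to images by filename, so those entries must be updated to match the new names):
import os

# Hypothetical folder of training images; adjust the path to your own setup.
folder = "training-images"

for name in os.listdir(folder):
    if " " in name:
        # Replace spaces with underscores. Remember to update annotations.json
        # so it references the renamed files as well.
        os.rename(os.path.join(folder, name),
                  os.path.join(folder, name.replace(" ", "_")))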

Make sure that only the images and the annotations.json file are in your directory of training images.
If there are any other files in the folder, including a .mlproj file, Create ML shows the "Empty table from specified data source" error.
When you create a new project in Create ML, save the project file outside the directory of training images.
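As a quick way to spot offending files, a rough sketch along these lines (the folder name is hypothetical) lists anything that is neither a JPEG nor annotations.json:
import os

folder = "training-images"  # hypothetical path to the training images
allowed_extensions = {".jpg", ".jpeg"}

for name in os.listdir(folder):
    ext = os.path.splitext(name)[1].lower()
    if name != "annotations.json" and ext not in allowed_extensions:
        # Stray files such as a .mlproj project or .DS_Store should be
        # moved out of the folder before loading it in Create ML.
        print("Move or delete:", os.path.join(folder, name))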

Related

Keras "SavedModel file does not exist at..." for a model retrieved from an online URL

Keras "SavedModel file does not exist at..." error occurs for a model retrieved from an online URL and never manually saved at any local directory.
The code ran just fine for as long as I've been working on it, but I reopened the project today and, without changing anything, it now gives me this error.
Code Snippet & Error Screenshot
Managed to solve it myself. Simply visit the file directory the error mentions and delete the folder with the random numbers and letters in it. Rerun the program and it'll properly generate the files needed.
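If you prefer to do this programmatically, a hedged sketch is simply to remove the hash-named folder and let the next run re-download the model (the path below is a placeholder; use the exact directory named in the error message):
import shutil

# Placeholder path: replace it with the directory mentioned in the
# "SavedModel file does not exist at..." error message.
cached_model_dir = "/path/from/error/message/abc123def456"

# Deleting the folder forces the model to be fetched again on the next run.
shutil.rmtree(cached_model_dir, ignore_errors=True)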

How to migrate MlFlow experiments from one Databricks workspace to another with registered models?

So, unfortunately, we have to redeploy our Databricks workspace, in which we use the MLflow functionality with the Experiments and the registering of Models.
However, if you export the user folder where the experiment is saved as a DBC and import it into the new workspace, the Experiments are not migrated and are just missing.
So the easiest solution did not work. The next thing I tried was to create a new experiment in the new workspace, copy all the experiment data from the DBFS of the old workspace to the new one (with dbfs cp -r dbfs:/databricks/mlflow source, and then the same again to upload it to the new workspace), and then just point the experiment at the location of the data, as in the following picture:
This is also not working; no run is visible, although the path already exists.
The next idea was that the registered models are the most important ones, so at least those should still be there and accessible. For that I used the documentation here: https://www.mlflow.org/docs/latest/model-registry.html.
With the following code you get a list of the registered models on the old workspace, with references to the run_id and location:
from pprint import pprint
from mlflow.tracking import MlflowClient

client = MlflowClient()
# Each entry includes the model's versions with their run_id and source location.
for rm in client.list_registered_models():
    pprint(dict(rm), indent=4)
And with this code you can add models to a model registry with a reference to the location of the artifact data (on the new workspace):
# first the general model must be defined
client.create_registered_model(name='MyModel')
# then the run you want to register is added to the model as version one
client.create_model_version(
    name='MyModel',
    run_id='9fde022012046af935fe52435840cf1',
    source='dbfs:/databricks/mlflow/experiment_id/run_id/artifacts/model',
)
But that did not work out either. If you go into the Model Registry, you get a message like this:
And I really checked: at the given path (the source), the data has really been uploaded and a model does exist.
Do you have any new ideas to migrate those models in Databricks?
There is no official way to migrate experiments from one workspace to another. However, leveraging the MLflow API, there is an "unofficial" tool that can migrate experiments minus the notebook revision associated with a run.
mlflow-tools
As an addition to @Andre's answer, you can also check mlflow-export-import from the same developer:
mlflow-export-import

Loading Keras Model in [Google App Engine]

Use-case:
I am trying to load a pre-trained Keras Model as .h5 file in Google App Engine. I am running App Engine on a Python runtime 3.7 and Standard Environment.
Issue:
I tried using the load_model() Keras function. Unfortunately, the load_model function requires a 'file_path', and I failed to load the model from the Google App Engine file explorer. Further, Google Cloud Storage does not seem to be an option, as it is not recognized as a file path.
Questions:
(1) How can I load a pretrained model (e.g. .h5) into Google App Engine (without saving it locally first)?
(2) Maybe there is a way to load the model.h5 into Google App Engine from Google Storage that I have not thought of, e.g. by using another function (other than tf.keras.models.load_model()) or another format?
I just want to read the model in order to make predictions. Writing or training the model is not required.
I finally managed to load the Keras Model in Google App Engine -- overcoming four challenges:
Solution:
First challenge: As of today, Google App Engine only provides TensorFlow version 2.0.0x. Hence, make sure to set the correct version in your requirements.txt file. I ended up using 2.0.0b1 for my project.
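For reference, the relevant pin in requirements.txt would look something like this (the version is the one mentioned above; adjust it to whatever App Engine currently supports):
tensorflow==2.0.0b1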
Second challenge: In order to use a pretrained model, make sure the model has been saved using the same TensorFlow version that is running on Google App Engine.
Third challenge: Google App Engine does not allow you to read from disk. The only way to read or store data is to use memory or the /tmp folder (as correctly pointed out by user bhito). I ended up connecting my GCloud bucket and loading the model.h5 file as a blob into the /tmp folder.
Fourth challenge: By default, the instance class of Google App Engine is limited to 256 MB. Due to my model size, I needed to increase the instance class accordingly.
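For what it's worth, in the standard environment the instance class is set in app.yaml; a minimal sketch (F4 is just an example class, check the App Engine docs for current memory limits):
runtime: python37
# The default instance class (F1) has the smallest memory limit; a larger
# class such as F4 gives the instance more memory for bigger models.
instance_class: F4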
In summary: YES, tf.keras.models.load_model() does work on App Engine, reading from Cloud Storage, given the right TF version and the right instance class (with enough memory).
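Putting the pieces together, a minimal sketch of the approach (bucket and object names are made up): download model.h5 from Cloud Storage into /tmp and load it from there.
import tensorflow as tf
from google.cloud import storage

def load_model_from_gcs(bucket_name="my-models-bucket", blob_name="model.h5"):
    # /tmp is the only writable location on App Engine standard.
    local_path = "/tmp/model.h5"
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    bucket.blob(blob_name).download_to_filename(local_path)
    return tf.keras.models.load_model(local_path)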
I hope this will help future folks who want to use Google App Engine to deploy their ML models.
You will have to download the file before using it; Cloud Storage paths can't be used to access objects directly. There is a sample of how to download objects in the documentation:
from google.cloud import storage


def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)

    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )
Then write the file to the /tmp temporary folder, which is the only writable one available in App Engine. But take into consideration that once the instance using the file is deleted, the file will be deleted as well.
To be more specific to your question: to load a Keras model, it's useful to have it as a pickle, as this tutorial shows:
import pickle

from google.cloud import storage

# MODEL is a module-level variable; MODEL_BUCKET and MODEL_FILENAME are
# configuration values defined elsewhere in the tutorial.
def _load_model():
    global MODEL
    client = storage.Client()
    bucket = client.get_bucket(MODEL_BUCKET)
    blob = bucket.get_blob(MODEL_FILENAME)
    s = blob.download_as_string()
    MODEL = pickle.loads(s)
I have also been able to find another Stack Overflow post that covers what you're actually looking for.

Google Vision AutoML > Datasets | Validation data in csv doesn't upload

I am using Google Vision AutoML. In order to train a model, the data needs to be uploaded. There are the following two ways:
Upload directly from your computer
Upload to a Google bucket and make a CSV which contains the paths to the image files.
See the following image
Since I want to compare my locally pre-trained model with the model I will train on Google AutoML, I want to ensure that the same data splits are used (train, test, validation). So way #2 is the best option.
Issue:
I have made the CSV in the following format, but when I upload it, only the train and test sets are loaded.
I solved it by putting "Validation" instead of "Validate" in the set column.
So the issue was the wording used on the upload form, which says the following:
Optionally, you can specify the TRAIN, VALIDATE, or TEST split.
This is misleading, and they also did not show a sample row for validation.
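For illustration, a hypothetical classification CSV (the bucket paths and labels are made up) would use rows like these, with VALIDATION rather than VALIDATE in the set column; note that the row format differs for object detection, see the linked docs:
TRAIN,gs://my-bucket/images/img001.jpg,cat
VALIDATION,gs://my-bucket/images/img002.jpg,cat
TEST,gs://my-bucket/images/img003.jpg,dog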
For more details:
https://cloud.google.com/vision/automl/docs/prepare#csv

unable to create a tfrecord

I am unable to create a TFRecord. There is no error in the code; it just keeps executing when I run this command:
python generate_tfrecords.py --csv_input=images\train_labels.csv --image_dir=images\train --output_path=train.record
I have also checked the image files; they are all in one folder and in the same format.
Please help me create the TFRecord.
Thank you.
I created the annotations for the images using labelImg and also did the protobuf part.
There are no errors, but it is not creating the train record.