I failed to convert a Caffe model into an mlmodel using coremltools 5 - python-3.8

I am trying to convert a Caffe model using coremltools v5.
This is my code:
import coremltools

caffe_model = ('oxford102.caffemodel', 'deploy.prototxt')
labels = 'flower-labels.txt'

coreml_model = coremltools.converters.caffe.convert(
    caffe_model,
    class_labels=labels,
    image_input_names='data'
)

coreml_model.save('FlowerClassifier.mlmodel')
I run the conversion with the command below:
python3 convert-script.py
and I get the error message below.
error message
Has anybody faced this problem and found a solution for it?

I just came across this while having the same problem. Caffe support is not available in the newer versions of the coremltools API. To make this code run, an older version of coremltools (such as 3.4) must be used, which requires Python 2.7 - best done in a virtual environment.
I assume you've solved your issue already, but I'm adding this in case anyone else stumbles onto this question.
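A quick way to confirm this is the cause is to check the installed version and whether the caffe converter attribute is still present - a minimal sketch, assuming only that coremltools is importable:
import coremltools

# On 3.x releases the caffe converter is exposed; on recent releases (per the answer above)
# the attribute is gone, which is why the original script fails.
print(coremltools.__version__)
print(hasattr(coremltools.converters, 'caffe'))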

There are several solutions depending on your case:
I had the same issue on my M1 Mac. You can resolve it by duplicating your Terminal and running it with Rosetta (this worked for me):
cd ~/.virtualenvs/<your venv name here>/bin
mkdir bk; cp python bk; mv -f bk/python .;rmdir bk
codesign -s - --preserve-metadata=identifier,entitlements,flags,runtime -f python
For more solutions and discussion, you can follow this issue on GitHub.
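To double-check that the duplicated Terminal (and the Python inside your venv) really is running under Rosetta, a quick standard-library check can help - just a sketch:
import platform

# 'x86_64' indicates the interpreter is running under Rosetta translation;
# 'arm64' means it is running natively on Apple Silicon.
print(platform.machine())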

I had the same error running Python 3.7.
In the virtualenv, the solution is to run:
pip install coremltools==3.0
You don't have to change Python versions; just rerun the script.
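After downgrading and rerunning, a quick way to confirm the conversion produced a usable model is to load it back - a minimal sketch, assuming the convert script above completed and wrote FlowerClassifier.mlmodel:
import coremltools

# Load the converted model and print its input/output description as a sanity check.
model = coremltools.models.MLModel('FlowerClassifier.mlmodel')
print(model.get_spec().description)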

Related

Tensorflow serving failing with std::bad_alloc

I'm trying to run tensorflow-serving using docker compose (served model + microservice) but the tensorflow serving container fails with the error below and then restarts.
microservice | To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
tensorflow-serving | terminate called after throwing an instance of 'std::bad_alloc'
tensorflow-serving | what(): std::bad_alloc
tensorflow-serving | /usr/bin/tf_serving_entrypoint.sh: line 3: 7 Aborted
(core dumped) tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$#"
I monitored the memory usage and it seems like there's plenty of memory. I also increased the resource limit using Docker Desktop but still get the same error. Each request to the model is fairly small as the microservice is sending tokenized text with batch size of one. Any ideas?
I was encountering the same problem, and this fix worked for me:
I uninstalled and reinstalled tensorflow, tensorflow-gpu, etc. at version 2.9.0 (and trained and built my model), then
ran docker pull and docker run with tensorflow/serving:2.8.0 (this did the trick and finally got rid of the problem).
Had the same error when using tensorflow/serving:latest. Based on Hanafi's response, I used tensorflow/serving:2.8.0 and it worked.
For reference, I used
sudo docker run -p 8501:8501 \
  --mount type=bind,source=[PATH_TO_MODEL_DIRECTORY],target=/models/[MODEL_NAME] \
  -e MODEL_NAME=[MODEL_NAME] -t tensorflow/serving:2.8.0
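Once the 2.8.0 container is up, a minimal call against the REST endpoint can confirm the server no longer aborts - a hedged sketch, where my_model and the input payload are placeholders for your own model name and input signature:
import json
import requests

MODEL_NAME = 'my_model'                      # placeholder: use the name passed via -e MODEL_NAME
payload = {'instances': [[1.0, 2.0, 3.0]]}   # placeholder: must match your model's input shape

# TF Serving exposes a predict endpoint on the REST port (8501) mapped above.
resp = requests.post(
    'http://localhost:8501/v1/models/{}:predict'.format(MODEL_NAME),
    data=json.dumps(payload),
)
print(resp.status_code, resp.text)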
The issue is solved for TensorFlow and TensorFlow Serving 2.11 (not yet released), and the fix is included in the nightly release of TF Serving. You can build the nightly Docker image or use the pre-compiled version.
TensorFlow 2.9 and 2.10 were also patched to fix this issue. Refer to the PRs here. [1, 2]

Tensorflow hash error during installation (pip)

I am trying to install tensorflow in a Docker container on a Raspberry Pi Zero.
I get a weird hash error, see below.
root@123456:/# sudo pip3 install --no-cache-dir tensorflow
Collecting tensorflow
Downloading https://www.piwheels.org/simple/tensorflow/tensorflow-1.14.0-cp35-none-linux_armv6l.whl (94.2MB)
100% |████████████████████████████████| 94.2MB 888kB/s
THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
tensorflow from https://www.piwheels.org/simple/tensorflow/tensorflow-1.14.0-cp35-none-linux_armv6l.whl#sha256=cba22b6d9a3e7a92c07e142bd5256c9773fd20c18090cb1d222357d3b3028655:
Expected sha256 cba22b6d9a3e7a92c07e142bd5256c9773fd20c18090cb1d222357d3b3028655
Got 65c83ef17cd950cf40d021070f3e7e1fa99499a99815c15495920ddc3440a98f
No space issues:
root@123456:/# df -h /
Filesystem Size Used Avail Use% Mounted on
overlay 29G 9.8G 19G 36% /
I've tried removing files with rm -rf /var/lib/apt/lists/partial and performed apt-get update & upgrade, but with the same result. What can I try next?
Update
The hashes were fixed, so the hash mismatch warning should be gone now.
Original answer
Unfortunately, this is a common issue with https://www.piwheels.org when large wheels are uploaded. If you take a close look at the simple URLs of tensorflow wheels, you'll notice that the wheels
tensorflow-1.14.0-cp36-none-linux_armv7l.whl
tensorflow-1.14.0-cp36-none-linux_armv6l.whl
tensorflow-1.14.0-cp35-none-linux_armv7l.whl
tensorflow-1.14.0-cp35-none-linux_armv6l.whl
tensorflow-1.14.0-cp34-none-linux_armv7l.whl
tensorflow-1.14.0-cp34-none-linux_armv6l.whl
all have the same sha256 hash in their download links, which means the hashes are just wrong. The workaround is to download the wheel and install it from disk:
$ wget https://www.piwheels.org/simple/tensorflow/tensorflow-1.14.0-cp35-none-linux_armv6l.whl
$ pip install tensorflow-1.14.0-cp35-none-linux_armv6l.whl
I have also reported the wrong hashes here, so the issue should be fixed sooner or later.
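In the meantime, it can help to check whether the downloaded wheel actually matches the hash the index advertises, so you know whether the file or the index metadata is at fault - a small sketch using the expected hash from the error above:
import hashlib

wheel = 'tensorflow-1.14.0-cp35-none-linux_armv6l.whl'
expected = 'cba22b6d9a3e7a92c07e142bd5256c9773fd20c18090cb1d222357d3b3028655'

# Compute the sha256 of the local file and compare it with the advertised hash.
with open(wheel, 'rb') as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print(digest == expected, digest)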

Install RAPIDS library on Google Colab notebook

I was wondering if I could install the RAPIDS library (for executing machine learning tasks entirely on the GPU) in a Google Colaboratory notebook?
I've done some research, but I haven't been able to find a way to do it...
This is now possible with the new T4 instances: https://medium.com/rapids-ai/run-rapids-on-google-colab-for-free-1617ac6323a8
To enable cuGraph too, you can replace the wget command with:
!conda install -c nvidia/label/cuda10.0 -c rapidsai/label/cuda10.0 -c pytorch \
-c numba -c conda-forge -c numba -c defaults \
boost cudf=0.6 cuml=0.6 python=3.6 cugraph=0.6 -y
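After the conda install finishes, a quick import check confirms the RAPIDS packages landed in the Colab runtime - a minimal sketch, assuming the cell above completed without errors:
# Report the RAPIDS versions that are actually importable in this runtime.
import cudf
import cuml
import cugraph

print(cudf.__version__, cuml.__version__, cugraph.__version__)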
Dec 2019 update
New process for RAPIDS v0.11+
Because:
- RAPIDS v0.11 has dependencies (pyarrow) which were not covered by the prior install script,
- the notebooks-contrib repo, which contains the RAPIDS demo notebooks (e.g. colab_notebooks) and the Colab install script, now follows RAPIDS' standard version-specific branch structure*, and
- some Colab users still enjoy v0.10,
our honorable notebooks-contrib overlord taureandyernv has updated the script, which now:
- if running v0.11 or higher, updates the pyarrow library to 0.15.x.
Here's the code cell to run in Colab for v0.11:
# Install RAPIDS
!wget -nc https://raw.githubusercontent.com/rapidsai/notebooks-contrib/890b04ed8687da6e3a100c81f449ff6f7b559956/utils/rapids-colab.sh
!bash rapids-colab.sh
import sys, os
dist_package_index = sys.path.index("/usr/local/lib/python3.6/dist-packages")
sys.path = sys.path[:dist_package_index] + ["/usr/local/lib/python3.6/site-packages"] + sys.path[dist_package_index:]
sys.path
if os.path.exists('update_pyarrow.py'): ## This file only exists if you're using RAPIDS version 0.11 or higher
    exec(open("update_pyarrow.py").read(), globals())
For a walkthrough of setting up Colab and implementing this script, see How to Install RAPIDS in Google Colab.
* e.g. branch-0.11 for v0.11 and branch-0.12 for v0.12, with the default set to the current version
It looks like various subparts are not yet pip-installable, so the only way to get them on Colab would be to build them on Colab, which might be more effort than you're interested in investing in this :)
https://github.com/rapidsai/cudf/issues/285 is the issue to watch for rapidsai/cudf (presumably the other rapidsai/ libs will follow suit).
Latest solution:
!wget -nc https://github.com/rapidsai/notebooks-extended/raw/master/utils/rapids-colab.sh
!bash rapids-colab.sh
import sys, os
sys.path.append('/usr/local/lib/python3.6/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
was pushed a few days ago; see issues #104 or #110, or the full rapids-colab.sh script, for more info.
Note: installation currently requires a Tesla T4 instance; checking for this can be done with:
# check gpu type
!nvidia-smi
import pynvml
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
device_name = pynvml.nvmlDeviceGetName(handle)
# your dolphin is broken, please reset & try again
if device_name != b'Tesla T4':
    raise Exception("""Unfortunately this instance does not have a T4 GPU.
    Please make sure you've configured Colab to request a GPU instance type.
    Sometimes Colab allocates a Tesla K80 instead of a T4. Resetting the instance.
    If you get a K80 GPU, try Runtime -> Reset all runtimes...""")
# got a T4, good to go
else:
    print('Woo! You got the right kind of GPU!')

How to set up Spark to use pandas managed by anaconda?

We've updated Spark from version 2.2 to 2.3, but the admins didn't update pandas, so our jobs fail with the following error:
ImportError: Pandas >= 0.19.2 must be installed; however, your version was 0.18.1
Our admin team suggested creating a virtual environment with the latest Anaconda packages (using the command conda create -n myenv anaconda).
I did that, and after activating the local environment with source activate myenv and launching pyspark2, I found it was picking up the new version of pandas.
But when I submit a job using the spark2-submit command, it doesn't work. I added the configuration below to the spark2-submit command:
--conf spark.pyspark.virtualenv.enabled=true
--conf spark.pyspark.virtualenv.type=conda
--conf spark.pyspark.virtualenv.requirements=/home/<user>/.conda/requirements_conda.txt
--conf spark.pyspark.virtualenv.bin.path=/home/<user>/.conda/envs/myenv/bin
I also zipped the whole Python 2.7 folder and passed it in the --py-files option along with the other .py files (--py-files /home/<user>/python.zip), but I am still getting the same version issue for pandas.
I tried to follow the instructions at https://community.hortonworks.com/articles/104947/using-virtualenv-with-pyspark.html, but no luck yet.
How can I fix this so that spark2-submit picks up the proper pandas?
I think you may need to define environment variables such as SPARK_HOME and PYTHONPATH pointing to the corresponding locations in your virtualenv.
export SPARK_HOME=path_to_spark_in_virtualenv
export PYTHONPATH=$SPARK_HOME/python
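Once those variables point at the environment, a quick check from inside a PySpark session shows whether the driver and the executors both resolve the new pandas - a hedged sketch, assuming pyspark is importable from that environment:
from __future__ import print_function

import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Version seen by the driver process.
print('driver pandas:', pd.__version__)

# Versions seen by the executor processes.
executor_versions = (
    spark.sparkContext
    .parallelize(range(2), 2)
    .map(lambda _: __import__('pandas').__version__)
    .distinct()
    .collect()
)
print('executor pandas:', executor_versions)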

Shipping and using virtualenv in a pyspark job

PROBLEM: I am attempting to run a spark-submit script from my local machine to a cluster of machines. The work done by the cluster uses numpy. I currently get the following error:
ImportError:
Importing the multiarray numpy extension module failed. Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control). Otherwise reinstall numpy.
Original error was: cannot import name multiarray
DETAIL:
In my local environment I have set up a virtualenv that includes numpy, a private repo I use in my project, and various other libraries. I created a zip file (lib/libs.zip) from the site-packages directory at venv/lib/site-packages, where 'venv' is my virtual environment. I ship this zip to the remote nodes. My shell script for performing the spark-submit looks like this:
$SPARK_HOME/bin/spark-submit \
--deploy-mode cluster \
--master yarn \
--conf spark.pyspark.virtualenv.enabled=true \
--conf spark.pyspark.virtualenv.type=native \
--conf spark.pyspark.virtualenv.requirements=${parent}/requirements.txt \
--conf spark.pyspark.virtualenv.bin.path=${parent}/venv \
--py-files "${parent}/lib/libs.zip" \
--num-executors 1 \
--executor-cores 2 \
--executor-memory 2G \
--driver-memory 2G \
$parent/src/features/pi.py
I also know that the remote nodes have /usr/local/bin/python2.7 with a Python 2.7 install.
So in my conf/spark-env.sh I have set the following:
export PYSPARK_PYTHON=/usr/local/bin/python2.7
export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python2.7
When I run the script I get the error above. If I print the installed_distributions I get a zero-length list []. Also, my private library imports correctly (which tells me it is actually accessing my libs.zip site-packages). My pi.py file looks something like this:
from myprivatelibrary.bigData.spark import spark_context
spark = spark_context()
import numpy as np
spark.parallelize(range(1, 10)).map(lambda x: np.__version__).collect()
EXPECTATION/MY THOUGHTS:
I expect this to import numpy correctly, especially since I know numpy works in my local virtualenv. I suspect the problem is that I'm not actually using the version of Python that is installed in my virtualenv on the remote node. My questions: first, how do I fix this, and second, how do I use my virtualenv-installed Python on the remote nodes instead of the Python that is manually installed and currently sitting on those machines? I've seen some write-ups on this, but frankly they are not well written.
With --conf spark.pyspark.* and export PYSPARK_PYTHON=/usr/local/bin/python2.7 you set options for your local environment / your driver. To set options for the cluster (executors), use the following syntax:
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON
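For illustration, the same executor-side setting can also be expressed programmatically via SparkConf - a hedged sketch, where './venv/bin/python' is an assumed path to a relocatable environment shipped to the executors; in cluster deploy mode these keys normally belong on the spark-submit command line instead:
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = (
    SparkConf()
    # Python used by the YARN application master (the driver in cluster mode).
    .set('spark.yarn.appMasterEnv.PYSPARK_PYTHON', './venv/bin/python')
    # Python used by the executor processes.
    .set('spark.executorEnv.PYSPARK_PYTHON', './venv/bin/python')
)
spark = SparkSession.builder.config(conf=conf).getOrCreate()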
Furthermore, I guess you should make your virtualenv relocatable (this is experimental, however). <edit 20170908> This means that the virtualenv uses relative instead of absolute links. </edit>
What we did in such cases: we shipped an entire anaconda distribution over hdfs.
<edit 20170908>
If we are talking about different environments (MacOs vs. Linux, as mentioned in the comment below), you cannot just submit a virtualenv, at least not if your virtualenv contains packages with binaries (as is the case with numpy). In that case I suggest you create yourself a 'portable' anaconda, i.e. install Anaconda in a Linux VM and zip it.
Regarding --archives vs. --py-files:
--py-files adds Python files/packages to the Python path. From the spark-submit documentation:
For Python applications, simply pass a .py file in the place of <application-jar> instead of a JAR, and add Python .zip, .egg or .py files to the search path with --py-files.
--archives means these are extracted into the working directory of each executor (YARN clusters only).
However, a crystal-clear distinction is lacking, in my opinion - see for example this SO post.
In the given case, add the anaconda.zip via --archives, and your 'other python files' via --py-files.
</edit>
See also: Running Pyspark with Virtualenv, a blog post by Henning Kropp.