I have a Jupyter notebook on GitHub and want to run it in Binder so other people can play with it.
However, it complains that pandas is not installed.
The error is:
ModuleNotFoundError: No module named 'pandas'
How can I get Binder to install pandas for this instance?
You need to edit or create requirements.txt at the base of the repo. I tried the pip install method in a cell, but this did not work, because Binder prevents live installations in your session.
You can list the modules you need and specify versions if you need to.
There is an example in this GitHub repository:
https://github.com/binder-examples/requirements/blob/master/requirements.txt
Contents are:
numpy==1.16.*
matplotlib==3.*
seaborn==0.8.1
pandas
Note that the requirements.txt file has to be created in the GitHub repository, not in the Binder UI. The Binder session is read-only and will not sync any file back to GitHub: a requirements.txt created in the Binder UI will not be picked up, and it will be lost when the runtime reloads or the page is refreshed. Once requirements.txt has been committed to the repository, launch Binder again from its start page, pointing it at the GitHub repository.
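Once the new session starts, a quick sanity check in a notebook cell confirms that the pinned packages were installed (a minimal sketch, assuming the example requirements.txt above):

import pandas
import numpy

# versions should match the pins in requirements.txt (e.g. numpy 1.16.x)
print(pandas.__version__)
print(numpy.__version__)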
Background
I've been trying to follow the tutorial in this video. The goal is to install TensorFlow and TensorFlow's object_detection module.
Goal
How do I install it so that I can follow the rest of the tutorial? I only want to install the CPU version.
Additional Information
Errors that I ran into
ERROR: Could not find a version that satisfies the requirement tensorflow==2.1.0 (from versions: None)
ERROR: No matching distribution found for tensorflow
ERROR: tensorflow.whl is not a supported wheel on this platform.
Research
https://github.com/tensorflow/tensorflow/issues/39130
Tensorflow installation error: not a supported wheel on this platform
Prologue
I found this ridiculously complex; if anyone else has a simpler way to install this package, please let everyone know.
Main resource is https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/install.html#set-env
Summary of Steps
Install the newest version of Python (64-bit), which you can get here - https://www.python.org/downloads/
Create a virtual environment from that newest version of python
Get the latest version of TensorFlow from Google - https://www.tensorflow.org/install/pip#package-location
Install the latest version of TensorFlow using pip with the --upgrade flag and the link from the step above
Get latest version of protoc (data transfer protocol) - https://github.com/protocolbuffers/protobuf/releases
Install protoc and add location to path so you can easily call it later
Get TensorFlow Garden files from here - https://github.com/tensorflow/models
Copy the files to a location of your choice and add a models folder to the structure
Compile Protobufs for each model from the TensorFlow Garden using protoc
Set up COCO API to connect to the COCO dataset
Copy the TensorFlow 2 setup file from the TensorFlow Garden object_detection module
Run the installation for object_detection module & hope for the best
Detailed Descriptions
I ran into a problem when first attempting to install object_detection because my version of Python wasn't supported.
Get the latest version by going to this page - https://www.python.org/downloads/
Click "Download Python 3.9.X"
Once downloaded, run the installation file
Navigate to where python was installed and copy the path to the executable.
Open up command prompt by going Windows Key -> cmd
Navigate to where you would like to create the virtual environment using the cd command: cd "path/to/change/directory/to"
then type "previously/copied/python/executable/path/python.exe" -m venv "name_of_your_virtual_environment"
TensorFlow seems to be distributed through the Google Storage API rather than through pip alone. To find the link to the latest stable TensorFlow, use this website: https://www.tensorflow.org/install/pip#package-location
Now grab the TensorFlow installation link that matches your version of python.
Since mine was version 3.9 and windows I got this link - https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow_cpu-2.6.0-cp39-cp39-win_amd64.whl
Install TensorFlow using the python.exe from your virtual environment "name_of_your_virtual_environment":
"name_of_your_virtual_environment/Scripts/python.exe" -m pip install --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow_cpu-2.6.0-cp39-cp39-win_amd64.whl
Note that you have to use the --upgrade flag for some reason
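Once the install finishes, a quick check run with the same interpreter confirms the wheel took (a minimal sketch; the version shown matches the wheel above):

import tensorflow as tf

# should print the version from the wheel you installed, e.g. 2.6.0
print(tf.__version__)
# CPU-only build, so only CPU devices are expected here
print(tf.config.list_physical_devices("CPU"))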
Because TensorFlow is a Google project, it uses Google's data interchange format, Protocol Buffers (protobuf)
Find the latest version of this tool by navigating to their website - https://github.com/protocolbuffers/protobuf/releases
Find the link under the newest release that matches your operating system (Windows) and architecture (x64)
I chose https://github.com/protocolbuffers/protobuf/releases/download/v3.17.3/protoc-3.17.3-win64.zip
To install it, extract the .zip file and put the contents into "C:/Program Files/GoogleProtoc"
Get the folder location that has the protoc executable and add it to your environment variables
To edit your environment variables, press the Windows Key and search for "Environment Variables", then click "Edit the system environment variables"
Then click "Environment Variables"
Navigate to the "Path" environment variable under your user, select it and click edit
Click new and paste the executable location of protoc, aka "C:/Program Files/GoogleProtoc/bin"
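Once a new command prompt is opened (see the restart note below), running protoc --version should print the installed version if the Path edit worked.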
Now to get the actual code for the object_detection module, which is supported by researchers and is separate from base TensorFlow
Navigate to TensorFlow Garden - https://github.com/tensorflow/models
Download or clone the repository
Copy the files to another location using the following structure:
TensorFlow
-> models (you have to add this folder)
   -> community
   -> official
   -> orbit
   -> research
Restart your command prompt. It needs to be restarted to pick up changes to environment variables - in this case Path, because you added protoc to it to make it easier to call from your command prompt
Again that is Windows Key -> Search cmd
Navigate inside the research folder with cd "TensorFlow/models/research/"
Run the following command to compile the Protobuf libraries: for /f %i in ('dir /b object_detection\protos\*.proto') do protoc object_detection\protos\%i --python_out=.
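If the cmd-specific for /f syntax gives you trouble, the same compilation can be scripted from Python (a sketch, assuming protoc is on your Path and the working directory is TensorFlow/models/research):

import glob
import subprocess

# compile every .proto under object_detection/protos into a _pb2.py module
for proto in glob.glob("object_detection/protos/*.proto"):
    subprocess.run(["protoc", proto, "--python_out=."], check=True)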
Install COCO API so that you can access the dataset. It is a requirement of TensorFlow's object_detection api
Ensure you are still in the folder "TensorFlow/models/research/"
Copy the setup python file to the folder you're in using copy object_detection\packages\tf2\setup.py .
Now use pip to perform the installation from the current directory (note the trailing dot): "name_of_your_virtual_environment/Scripts/python.exe" -m pip install --use-feature=2020-resolver .
Move the setup python file for TensorFlow 2 into the directory which will install the object_detection module.
Go into "TensorFlow/models/research/object_detection/packages/tf2/setup.py" and move that to "TensorFlow/models/research/object_detection/setup.py"
Now run the installation process for the object_detection module
Open CMD and navigate to "TensorFlow/models/research/object_detection/" by using cd command
Using your virtual environment, run the script: "name_of_your_virtual_environment/Scripts/python.exe" setup.py install
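A quick smoke test, run with the same virtual-environment interpreter, tells you whether the install actually worked (a sketch; the import raises if object_detection is missing):

# run with "name_of_your_virtual_environment/Scripts/python.exe"
from object_detection.utils import label_map_util

print("object_detection imported successfully")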
Error Guides
ERROR: Could not find a version that satisfies the requirement tensorflow==2.1.0 (from versions: None)
ERROR: No matching distribution found for tensorflow
This occurs because your version of Python isn't correct or the architecture is wrong (32-bit instead of 64-bit). Fix this by downloading a new version of Python and creating a new virtual environment.
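A quick way to confirm which interpreter and architecture you are actually running (a minimal sketch):

import platform
import struct

print(platform.python_version())   # e.g. 3.9.6
print(struct.calcsize("P") * 8)    # 64 for a 64-bit interpreter, 32 for 32-bit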
ERROR: tensorflow.whl is not a supported wheel on this platform.
Similar to the above: your version of Python might be wrong, or you have selected the wrong link from the TensorFlow repository on the Google Storage API. Start at the beginning: download the newest version of Python, create your new virtual environment, and then download the version of TensorFlow that matches your Python version and your operating system (e.g. macOS, Linux, or Windows).
I have 2 separate projects that I created in PyCharm CE 2018.3.
The first project:
set up without a venv
its packages installed with pip are in Program Files\Python\Python37\Lib\site-packages, which is read-only
successfully made into an .exe using PyInstaller. All dependencies were found and accessed by PyInstaller's Analyzer without needing to add paths to the .spec (https://pythonhosted.org/PyInstaller/when-things-go-wrong.html)
The second project:
set up with a venv
its packages installed with pip are in 'projectroot\venv\Lib\site-packages', which is read-only
PyInstaller's Analyzer does not automatically find the imported modules on the default path and populates a list of the missing modules in projectroot\build\warn-projectname
adding projectroot\venv\Lib\site-packages as another path in the .spec causes PyInstaller to correctly look there, but then encounters the PermissionError:
PermissionError: [Errno 13] Permission denied: '\\projectroot\\venv\\Lib\\site-packages'
(Sorry, I had trouble formatting the output, so I just included the last line. It looks very similar to this: Permission Error When Trying to Use PyInstaller)
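For reference, the pathex addition in the .spec looked roughly like this ('main.py' stands in for the actual entry script):

# excerpt from the project's .spec file
a = Analysis(
    ['main.py'],
    pathex=['projectroot\\venv\\Lib\\site-packages'],
)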
I copied \site-packages outside of the projectroot and added the path to the .spec. Again encountered the PermissionError for this new location.
Is there something special about venv which is causing this different reaction? Perhaps there's something else not on my radar?
Thank you for your advice; I'm feeling a bit frustrated at the moment.
3/19/19 EDIT:
I ended up removing PyCharm and the Python interpreter and did a fresh install. When setting up my environments again, I chose venv and used all the default suggestions. Then, I created a new project, dropped my script into that environment, and used pip to install the most recent versions of all my dependencies. Everything worked...with the exception of setuptools. At the suggestion of some other post (I've forgotten now where it was), I checked to see what version of setuptools the system environment was running. It was outdated and I updated it. Lo and behold, everything then worked like a charm (pun intended). This other post claimed that venv's isolation was not always complete and I may have stumbled across a related issue. In any case, setuptools was unrelated to the original issue and I suspect I was having pathing errors that were reset with the fresh install.
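For the record, updating setuptools amounted to running python -m pip install --upgrade setuptools against the system interpreter.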
I am trying to share my code on GitHub using Binder beta. Binder generates an environment, but it raises an error on importing the numpy library. The error is "ModuleNotFoundError: No module named 'numpy'"
How may I solve the problem?
Check that you created a .yml file in your repo, like they did here.
The environment.yml file should list all Python libraries on which
your notebooks depend, specified as though they were created using the
following conda commands:
source activate example-environment
conda env export --no-builds -f environment.yml
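For instance, a minimal environment.yml for a notebook that only needs numpy might look like this (the name and channel are illustrative):

name: example-environment
channels:
  - conda-forge
dependencies:
  - numpy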
I'm attempting to train the NER within SpaCy to recognize a new set of entities. Everything works just fine until I try to save and reload the model.
I'm attempting to follow the SpaCy doc recommendations from https://spacy.io/usage/training#saving-loading, so I have been saving with:
model.to_disk("save_this_model")
and then going to the Command Line and attempting to turn it into a package using:
python -m spacy package save_this_model saved_model_package
so I can then use
spacy.load('saved_model_package')
to pull the model back up.
However, when I'm attempting to use spacy package from the Command Line, I keep getting the error message "Can't locate model data"
I've looked in the save_this_model folder and there is a meta.json there, as well as folders for the various pipes (I've tried this with all pipes saved and with the non-NER pipes disabled; neither works).
Does anyone know what I might be doing wrong here?
I'm pretty inexperienced, so I think it's very possible that I'm attempting to make a package incorrectly or committing some other basic error. Thank you very much for your help in advance!
The spacy package command will create an installable, loadable Python package based on your model data, which you can store as a single .tar.gz file and pip install. If you just want to load a model you've saved out, you usually don't even need to package it – you can simply pass the path to the model directory to spacy.load. For example:
nlp = spacy.load('/path/to/save_this_model')
spacy.load can take either a path to a model directory, a model package name or the name of a shortcut link, if available.
If you're new to spaCy and just experimenting with training models, loading them from a directory is usually the simplest solution. Model packages come in handy if you want to share your model with others (because you can share it as one installable file), or if you want to integrate it into your CI workflow or test suite (because the model can be a component of your application, like any other package it depends on).
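For example, a minimal round trip with a directory load (the path is a placeholder):

import spacy

# load the pipeline straight from the directory written by model.to_disk(...)
nlp = spacy.load('/path/to/save_this_model')

# sanity-check the restored NER component
doc = nlp(u'Some text containing the entities you trained on')
print([(ent.text, ent.label_) for ent in doc.ents])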
So if you do want a Python package, you'll first need to build it by running the package setup from within the directory created by spacy package:
cd saved_model_package
python setup.py sdist
You can find more details here in the docs. The above command will create a .tar.gz archive in a dist directory, which you can then install in your environment:
pip install /path/to/en_example_model-1.0.0.tar.gz
If the model installed correctly, it should show up in the installed packages when you run pip list or pip freeze. To load it, you can call spacy.load with the package name, which is usually the language code plus the name you specified when you packaged the model. In this example, en_example_model:
nlp = spacy.load('en_example_model')
I'm trying to embed a Python 3 interpreter in an Objective C Cocoa app on a Mac, following instructions in this answer (which extends this article) and building Python and PyObjC by hand.
I'd like to be able to run Python code as plugins. I specifically don't want to rely on the stock Apple Python (v2.7). I have most of it working but can't seem to reliably load the plugin scripts. It looks like the embedded Python interpreter is unable to create the __pycache__/*.pyc files. This may be a symptom, or a cause. If I import the plugin file manually from the Python3 REPL (via import or the imp or importlib modules) the .pyc is generated and the plugin then loads correctly. If I don't do this manually the .pyc is not created and I receive a ValueError "Unmarshallable object".
I've tried loosening permissions on the script directory to no avail. The cache_tag looks OK, both from the REPL and from within the bouncer script:
>>> sys.implementation.cache_tag
'cpython-35'
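A quick check of whether the embedded interpreter can actually write the cache directory looks like this (a sketch; the plugin path is a placeholder):

import os
import sys

plugin_dir = '/path/to/plugin/scripts'
print(sys.implementation.cache_tag)    # 'cpython-35' here
print(os.access(plugin_dir, os.W_OK))  # True if __pycache__ can be created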
py_compile raises a Cocoa exception if I try to compile the plugin file manually (I'm still digging into that).
I'm using the following:
OS X 10.11.5 (El Capitan)
XCode 7.2.1
Python v3.5.2
PyObjC v3.11
I had to make a couple of tweaks to the process outlined in the linked SO answer:
Compiling Python 3 required Homebrew versions of OpenSSL and zlib, and appropriate LDFLAGS and CPPFLAGS:
export CPPFLAGS="-I$(brew --prefix openssl)/include -I$(brew --prefix zlib)/include"
export LDFLAGS="-L$(brew --prefix openssl)/lib -L$(brew --prefix zlib)/lib"
I also ensure pip is installed OK when configuring Python to build:
./configure --prefix="/path/to/python/devbuild/python3.5.2" --with-ensurepip=install
There is a fork of the original article source (which uses the stock Python 2) that works fine here, so I suspect I'm not too far off the mark. Any idea what I've missed? Do I need to sign, or otherwise give permission to, the embedded Python? Are there compilation/configuration options I've neglected to set?
TIA
Typical. It's always the last thing you try, isn't it? Adding the directory containing the plugin scripts to sys.path seems to do the trick, although I'm not sure why importlib needs this (I thought the point was to allow you to circumvent the normal import mechanism). Perhaps it's to do with the way the default importlib.machinery.SourceFileLoader is implemented?
Something like:
sys.path.append(os.path.abspath("/path/to/plugin/scripts"))
makes the "Unmarshallable object" problem go away. The cache directory and .pyc files are created correctly.