Unable to load fasttext-wiki-news-subwords-300 - api

Starting with the gensim api:
import gensim.downloader as api
api.load('fasttext-wiki-news-subwords-300')
I get the error:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/user.name/gensim-data/fasttext-wiki-news-subwords-300/fasttext-wiki-news-subwords-300.gz'
I also tried the cli:
python3 -m gensim.downloader --download fasttext-wiki-news-subwords-300
and when I check the ~/gensim-data/fasttext-wiki-news-subwords-300 folder it only contains:
__init__.py
__pycache__
Have there been any changes to the API or the dataset in the last few months?
Note
I am using Python 3.8 and gensim==4.2.0.
I have checked that the certificates are installed ('Install Certificates.command').

I ended up deleting the ~/gensim-data folder and downgrading gensim to 3.8.3, which seems to be working now. Leaving the question and answer here because (1) the error message was a red herring and (2) the solution was not straightforward.
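For reference, the workaround in code form, as a minimal sketch assuming gensim has already been downgraded (e.g. with pip install gensim==3.8.3):
import os
import shutil
import gensim.downloader as api
# Remove any stale or partial download so the loader re-fetches the archive
shutil.rmtree(os.path.expanduser('~/gensim-data'), ignore_errors=True)
# Under gensim 3.8.3 this downloads and loads the vectors as expected
model = api.load('fasttext-wiki-news-subwords-300')
print(model.most_similar('computer', topn=3))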

Related

Setting up TensorFlow with OpenCV, SciPy, scikit-learn on MacBook Pro M1

I think I have read most of the guides on setting up tensorflow, tensorflow-hub and object detection on a Mac M1 running Big Sur v11.6. I managed to work through most of the errors after more than two weeks, but I am stuck on the OpenCV setup. I tried to compile it from source, but it seems it can't find the modules from its core package, so make consistently fails after a successful cmake configure. It fails at different stages, complaining about different libraries even though they are present, and it never got past 31% despite multiple cmake runs and deleting the build folder or the CMake cache file. So I am not sure what to do to get the build to succeed.
I git cloned and unzipped opencv-4.5.0 and opencv_contrib-4.5.0 in my miniforge3 directory, then created a "build" folder inside opencv-4.5.0. My miniforge conda environment is called silicon, and I made sure I am using arch arm64 in the bash environment. The cmake command I use from the build folder is:
cmake -DCMAKE_SYSTEM_PROCESSOR=arm64 \
      -DCMAKE_OSX_ARCHITECTURES=arm64 \
      -DWITH_OPENJPEG=OFF \
      -DWITH_IPP=OFF \
      -D CMAKE_BUILD_TYPE=RELEASE \
      -D CMAKE_INSTALL_PREFIX=/usr/local \
      -D OPENCV_EXTRA_MODULES_PATH=/Users/adi/miniforge3/opencv_contrib-4.5.0/modules \
      -D PYTHON3_EXECUTABLE=/Users/adi/miniforge3/envs/silicon/bin/python3.8 \
      -D BUILD_opencv_python2=OFF \
      -D BUILD_opencv_python3=ON \
      -D INSTALL_PYTHON_EXAMPLES=ON \
      -D INSTALL_C_EXAMPLES=OFF \
      -D OPENCV_ENABLE_NONFREE=ON \
      -D BUILD_EXAMPLES=ON \
      /Users/adi/miniforge3/opencv-4.5.0
It fails like this:
[ 20%] Linking CXX shared library ../../lib/libopencv_core.dylib
[ 20%] Built target opencv_core
make: *** [all] Error 2
In other attempts it initially complained about calib3d or dnn instead, but those modules are present in the opencv-4.5.0 source tree.
The other way I tried to install OpenCV is with conda:
conda install opencv
But then when I test with
python -c "import cv2; cv2.__version__"
it looks for ffmpeg from Homebrew (I didn't install any of these via Homebrew, but with conda). So it complained:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/adi/miniforge3/envs/silicon/lib/python3.8/site-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *
ImportError: dlopen(/Users/adi/miniforge3/envs/silicon/lib/python3.8/site-packages/cv2/cv2.cpython-38-darwin.so, 2): Library not loaded: /opt/homebrew/opt/ffmpeg/lib/libavcodec.58.dylib
  Referenced from: /Users/adi/miniforge3/envs/silicon/lib/python3.8/site-packages/cv2/cv2.cpython-38-darwin.so
  Reason: image not found
I do have these libs, though. When I searched with find /usr/ -name 'libavcodec.58.dylib' I found several locations:
find: /usr//sbin/authserver: Permission denied
find: /usr//local/mysql-8.0.22-macos10.15-x86_64/keyring: Permission denied
find: /usr//local/mysql-8.0.22-macos10.15-x86_64/data: Permission denied
find: /usr//local/hw_mp_userdata/Internet_Manager/OnlineUpdate: Permission denied
/usr//local/lib/libavcodec.58.dylib
/usr//local/Cellar/ffmpeg/4.4_2/lib/libavcodec.58.dylib
So I tried symlinking one of the found copies to the path the import expects:
(silicon) MacBook-Pro:opencv-4.5.0 adi$ ln -s /usr/local/Cellar/ffmpeg/4.4_2/lib/libavcodec.58.dylib /opt/homebrew/opt/ffmpeg/lib/libavcodec.58.dylib
ln: /opt/homebrew/opt/ffmpeg/lib/libavcodec.58.dylib: No such file or directory
One of the guides said to also install Homebrew in the arm64 environment, so I did that with:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
export PATH="/opt/homebrew/bin:/usr/local/bin:$PATH"
alias ibrew='arch -x86_64 /usr/local/bin/brew' # create brew for intel (ibrew) and arm/ silicon
Not sure whether that matters, but it seems to have made no difference: it still uses /opt/homebrew/ instead of /usr/local/.
Any help would be highly appreciated to make either approach work. Ultimately I want to use TensorFlow Model Zoo object detection models. All the other dependencies seem fine (for now); the problem is that either OpenCV doesn't work, or, when it does work via conda install, scipy and scikit-learn don't.
In my case I also had a lot of trouble trying to install both modules. I finally managed to do so, though to be honest I am not really sure how or why. I leave the requirements below in case you want to recreate the environment that worked for me. You should have conda Miniforge 3 installed:
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: osx-arm64
absl-py=1.0.0=pypi_0
astunparse=1.6.3=pypi_0
autocfg=0.0.8=pypi_0
blas=2.113=openblas
blas-devel=3.9.0=13_osxarm64_openblas
boto3=1.22.10=pypi_0
botocore=1.25.10=pypi_0
c-ares=1.18.1=h1a28f6b_0
ca-certificates=2022.2.1=hca03da5_0
cachetools=5.0.0=pypi_0
certifi=2021.10.8=py39hca03da5_2
charset-normalizer=2.0.12=pypi_0
cycler=0.11.0=pypi_0
expat=2.4.4=hc377ac9_0
flatbuffers=2.0=pypi_0
fonttools=4.31.1=pypi_0
gast=0.5.3=pypi_0
gluoncv=0.10.5=pypi_0
google-auth=2.6.0=pypi_0
google-auth-oauthlib=0.4.6=pypi_0
google-pasta=0.2.0=pypi_0
grpcio=1.42.0=py39h95c9599_0
h5py=3.6.0=py39h7fe8675_0
hdf5=1.12.1=h5aa262f_1
idna=3.3=pypi_0
importlib-metadata=4.11.3=pypi_0
jmespath=1.0.0=pypi_0
keras=2.8.0=pypi_0
keras-preprocessing=1.1.2=pypi_0
kiwisolver=1.4.0=pypi_0
krb5=1.19.2=h3b8d789_0
libblas=3.9.0=13_osxarm64_openblas
libcblas=3.9.0=13_osxarm64_openblas
libclang=13.0.0=pypi_0
libcurl=7.80.0=hc6d1d07_0
libcxx=12.0.0=hf6beb65_1
libedit=3.1.20210910=h1a28f6b_0
libev=4.33=h1a28f6b_1
libffi=3.4.2=hc377ac9_2
libgfortran=5.0.0=11_1_0_h6a59814_26
libgfortran5=11.1.0=h6a59814_26
libiconv=1.16=h1a28f6b_1
liblapack=3.9.0=13_osxarm64_openblas
liblapacke=3.9.0=13_osxarm64_openblas
libnghttp2=1.46.0=h95c9599_0
libopenblas=0.3.18=openmp_h5dd58f0_0
libssh2=1.9.0=hf27765b_1
llvm-openmp=12.0.0=haf9daa7_1
markdown=3.3.6=pypi_0
matplotlib=3.5.1=pypi_0
mxnet=1.6.0=pypi_0
ncurses=6.3=h1a28f6b_2
numpy=1.21.2=py39hb38b75b_0
numpy-base=1.21.2=py39h6269429_0
oauthlib=3.2.0=pypi_0
openblas=0.3.18=openmp_h3b88efd_0
opencv-python=4.5.5.64=pypi_0
openssl=1.1.1m=h1a28f6b_0
opt-einsum=3.3.0=pypi_0
packaging=21.3=pypi_0
pandas=1.4.1=pypi_0
pillow=9.0.1=pypi_0
pip=22.0.4=pypi_0
portalocker=2.4.0=pypi_0
protobuf=3.19.4=pypi_0
pyasn1=0.4.8=pypi_0
pyasn1-modules=0.2.8=pypi_0
pydot=1.4.2=pypi_0
pyparsing=3.0.7=pypi_0
python=3.9.7=hc70090a_1
python-dateutil=2.8.2=pypi_0
python-graphviz=0.8.4=pypi_0
pytz=2022.1=pypi_0
pyyaml=6.0=pypi_0
readline=8.1.2=h1a28f6b_1
requests=2.27.1=pypi_0
requests-oauthlib=1.3.1=pypi_0
rsa=4.8=pypi_0
s3transfer=0.5.2=pypi_0
scipy=1.8.0=pypi_0
setuptools=58.0.4=py39hca03da5_1
six=1.16.0=pyhd3eb1b0_1
sqlite=3.38.0=h1058600_0
tensorboard=2.8.0=pypi_0
tensorboard-data-server=0.6.1=pypi_0
tensorboard-plugin-wit=1.8.1=pypi_0
tensorflow-deps=2.8.0=0
tensorflow-macos=2.8.0=pypi_0
termcolor=1.1.0=pypi_0
tf-estimator-nightly=2.8.0.dev2021122109=pypi_0
tk=8.6.11=hb8d0fd4_0
tqdm=4.63.1=pypi_0
typing-extensions=4.1.1=pypi_0
tzdata=2021e=hda174b7_0
urllib3=1.26.9=pypi_0
werkzeug=2.0.3=pypi_0
wheel=0.37.1=pyhd3eb1b0_0
wrapt=1.14.0=pypi_0
xz=5.2.5=h1a28f6b_0
yacs=0.1.8=pypi_0
zipp=3.7.0=pypi_0
zlib=1.2.11=h5a0b063_4
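To recreate it, save the listing above to a file and run the command from its header comment, e.g. (the file name here is just an example):
conda create --name silicon --file requirements.txt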

TensorFlow data loading issue

I'm working with an example program that uses the MNIST dataset.
It tries to load the dataset using this line:
import tensorflow_datasets as tfds
dataset = tfds.load(name='mnist', split=split)
However, this yields the following error:
2020-07-30 12:08:17.926262: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'".
Traceback (most recent call last):
  File "/home/tflynn/pylocal/lib/python3.7/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 399, in try_reraise
    yield
  File "/home/tflynn/pylocal/lib/python3.7/site-packages/tensorflow_datasets/core/registered.py", line 244, in builder
    return builder_cls(name)(**builder_kwargs)
  File "/home/tflynn/pylocal/lib/python3.7/site-packages/tensorflow_datasets/core/api_utils.py", line 69, in disallow_positional_args_dec
    return fn(*args, **kwargs)
  File "/home/tflynn/pylocal/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 206, in __init__
    self.info.initialize_from_bucket()
  File "/home/tflynn/pylocal/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_info.py", line 423, in initialize_from_bucket
    data_files = gcs_utils.gcs_dataset_info_files(self.full_name)
  File "/home/tflynn/pylocal/lib/python3.7/site-packages/tensorflow_datasets/core/utils/gcs_utils.py", line 71, in gcs_dataset_info_files
    return gcs_listdir(posixpath.join(GCS_DATASET_INFO_DIR, dataset_dir))
  File "/home/tflynn/pylocal/lib/python3.7/site-packages/tensorflow_datasets/core/utils/gcs_utils.py", line 64, in gcs_listdir
    if is_gcs_disabled() or not tf.io.gfile.exists(root_dir):
  File "/home/tflynn/.local/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 267, in file_exists_v2
    _pywrap_file_io.FileExists(compat.as_bytes(path))
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error executing an HTTP request: libcurl code 77 meaning 'Problem with the SSL CA cert (path? access rights?)', error details: error setting certificate verify locations:
  CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
when reading metadata of gs://tfds-data/dataset_info/mnist/3.0.1
I've searched on Google but couldn't find any other instances of this error with TensorFlow. The node is connected to the internet, if that makes a difference.
I've had the same issue on a Fedora 32 system. The directory /etc/ssl/certs/ exists and contains a file ca-bundle.crt. The following command solved the problem:
sudo ln -s /etc/ssl/certs/ca-bundle.crt /etc/ssl/certs/ca-certificates.crt
Probably the same as [1]. Run
apt-get update
apt-get install -y ca-certificates
if on Linux, before executing your code; or run commands of similar effect on your OS.
[1] https://github.com/tensorflow/serving/issues/1022
If you are unable to create a symbolic link or install using apt, you can try to upgrade to a more recent version of tfds. This issue is not present in the nightly build version 3.2.1.
pip install tfds-nightly: released every day, contains the latest versions of the datasets.
According to the TensorFlow/datasets GitHub repository, one commenter suggests downgrading to 3.0.0; however, I have not tried this to see if it works.
The error says curl couldn't find the root SSL certificate file on your computer. On my machine this file is stored at /etc/ssl/certs/ca-bundle.crt.
You can override the path curl looks in by setting the CURL_CA_BUNDLE environment variable. For example, either add this to the top of your Python script/notebook:
import os
os.environ['CURL_CA_BUNDLE'] = "/etc/ssl/certs/ca-bundle.crt"
or you could set the environment variable in a shell, e.g.
export CURL_CA_BUNDLE=/etc/ssl/certs/ca-bundle.crt
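Since the bundle's location varies by distro, here is a slightly more defensive variant of the snippet above; the candidate paths are common defaults, not guaranteed:
import os
# Try common certificate-bundle locations; which one exists varies by distro
candidates = [
    '/etc/ssl/certs/ca-certificates.crt',  # Debian/Ubuntu
    '/etc/ssl/certs/ca-bundle.crt',        # Fedora/RHEL
    '/etc/pki/tls/certs/ca-bundle.crt',    # older CentOS
]
for path in candidates:
    if os.path.exists(path):
        os.environ['CURL_CA_BUNDLE'] = path
        break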

Launching Tensorboard: bad interpreter: No such file or directory

I am unable to run tensorboard, and get the message:
bad interpreter: No such file or directory
Steps to reproduce:
Installed TF on Ubuntu, using a virtualenv and pip, as per the install instructions
Confirmed TF was correctly installed by running the mnist example. Output was as expected
Attempted to run tensorboard using:
tensorboard --logdir=/tmp/tensorflow/mnist/logs/mnist_with_summaries/
Checked that this location does contain the summary files within the "test" and "train" directories
Command and error:
(tensorflow_1_4_0) js#pchome01:~$ tensorboard --logdir=/tmp/tensorflow/mnist/logs/mnist_with_summaries/
bash: /home/js/tensorflow_1_4_0/bin/tensorboard: /home/js/tensorflow_1_3/bin/python3: bad interpreter: No such file or directory
In my virtenv folder for tensorflow_1_4_0, a tensorboard script exists:
#!/home/js/tensorflow_1_3/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from tensorboard.main import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
When I run the following in the Python interpreter, no errors are reported:
from tensorboard.main import main
Thank you
Just spotted my silly mistake and posting the resolution in case others encounter this.
The meaning of the error message is that the interpreter of the code (in this case python3) cannot be found.
The first line of the tensorboard script:
#!/home/js/tensorflow_1_3/bin/python3
This shebang tells the operating system to look for python3 at this location; however, the path is incorrect, as the virtual environment is actually called tensorflow_1_4_0.
Therefore changing this line to the following fixed the error:
#!/home/js/tensorflow_1_4_0/bin/python3
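If you have several scripts with the same stale path, here is a small sketch that makes the one-line fix programmatically (the paths are the ones from this question; adjust for your setup):
from pathlib import Path
script = Path('/home/js/tensorflow_1_4_0/bin/tensorboard')
lines = script.read_text().splitlines()
# Point the shebang at the interpreter inside the venv that actually exists
lines[0] = '#!/home/js/tensorflow_1_4_0/bin/python3'
script.write_text('\n'.join(lines) + '\n')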

python -m pip install urllib gives a syntax error during installation

Here is what happened: when I run the above command in cmd, I get an error while installing, pointing at this line of the package's code:
s.connect((base64.b64decode(rip),17620)
syntax error: invalid token in line 191
It is also giving me problems with some other modules.
(I ran into this myself using a Jupyter notebook.)
As you are using Python 3, you don't need to install urllib, as it is part of the standard library: https://github.com/python/cpython/tree/3.6/Lib/urllib/
Its submodules were restructured, so you need to change Python 2 code like
import urllib
...
urllib.urlopen
into
import urllib.request
...
urllib.request.urlopen
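For completeness, a minimal runnable Python 3 example using the standard-library module (the URL is just a placeholder):
from urllib.request import urlopen
# urllib ships with Python 3; there is nothing to pip install
with urlopen('https://www.python.org') as resp:
    print(resp.status)  # e.g. 200
    body = resp.read()  # response body as bytes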

Numpy and matplotlib without compiling/building in virtualenv

I'm trying to set up virtualenv with numpy. I've found that the recommended way to do it is by using
python setup.py install
in the numpy directory while the virtual environment is active.
I was wondering if it's possible to avoid the Fortran compilation and just use a numpy binary available for the OS.
Has anyone tried this? I couldn't figure out where numpy is located.
UPDATE:
Managed to do something:
I searched for "numpy" in my file system and found it in /usr/lib/pymodules/python2.7/numpy.
Then I just copied that into my virtualenv folder under lib/pymodules/python2.7.
For now, I have been able to call every numpy method I tried.
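For what it's worth, a less fragile route to the same result is creating the virtualenv with the --system-site-packages flag, which lets the environment import the system-installed numpy without copying files by hand.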
UPDATE:
Tried to install matplotlib since numpy is a dependency for it. That failed:
REQUIRED DEPENDENCIES
numpy: 1.5.1
freetype2: found, but unknown version (no pkg-config)
* WARNING: Could not find 'freetype2' headers in any
* of '/usr/include', '.', '/usr/include/freetype2',
* './freetype2'.
pymods ['pylab']
packages ['matplotlib', 'matplotlib.backends', 'matplotlib.backends.qt4_editor', 'matplotlib.projections', 'matplotlib.testing', 'matplotlib.testing.jpl_units', 'matplotlib.tests', 'mpl_toolkits', 'mpl_toolkits.mplot3d', 'mpl_toolkits.axes_grid', 'mpl_toolkits.axes_grid1', 'mpl_toolkits.axisartist', 'matplotlib.sphinxext', 'matplotlib.tri', 'matplotlib.delaunay', 'pytz', 'dateutil', 'dateutil.zoneinfo']
warning: no files found matching 'KNOWN_BUGS'
warning: no files found matching 'INTERACTIVE'
warning: no files found matching 'MANIFEST'
warning: no files found matching '__init__.py'
warning: no files found matching 'examples/data/*'
warning: no files found matching 'lib/mpl_toolkits'
warning: no files found matching 'LICENSE*' under directory 'license'
In file included from ./CXX/Extensions.hxx:37:0,
from src/ft2font.h:6,
from src/ft2font.cpp:3:
./CXX/WrapPython.h:58:20: fatal error: Python.h: No such file or directory
compilation terminated.
error: Setup script exited with error: command 'gcc' failed with exit status 1
It does seem that numpy isn't what's causing the errors. Trying to diagnose the cause of the error...
UPDATE:
Manually went through all the REQUIRED DEPENDENCIES and installed them.
The build output flew by too fast, so I hadn't noticed the errors and believed there were none.
You probably need the Python dev package. Try this:
sudo apt-get install python2.7-dev
Not sure what OS you're using, but I would just use an EPD Free binary for this. Granted, you get SciPy and some other stuff along with it, but it's about as hassle-free as you can get.