Pandas incompatible with numpy

I am using anaconda 3. When I try to import pandas I receive the following message:
ImportError: this version of pandas is incompatible with numpy < 1.15.4
your numpy version is 1.15.3.
Please upgrade numpy to >= 1.15.4 to use this pandas version
Printing numpy.__path__ gives me the following
['C:\Users\andrei\AppData\Roaming\Python\Python37\site-packages\numpy']
In conda list, my numpy version is 1.19.1. I checked the above directory and found that it contains only numpy 1.15.3 and nothing else. Spyder is using this path instead of Anaconda's path to numpy for some arcane reason.

It looks like you have somehow installed several versions of NumPy. Try to remove them all by running conda remove numpy and pip uninstall numpy several times; if an installer has two copies, its uninstall command needs to be run twice. After that, install a fresh version with conda install numpy.
You can verify whether you still have a version of NumPy installed with
conda list | grep numpy
pip list | grep numpy
Note that these commands show only one version number even if you have several copies installed.
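To confirm which copy of NumPy the interpreter is actually resolving, you can also ask Python directly — a minimal sketch that works whether or not NumPy is installed:

```python
import importlib.util

# find_spec locates the module the way "import numpy" would, without importing it.
spec = importlib.util.find_spec("numpy")
if spec is None:
    print("numpy is not importable from this interpreter")
else:
    print(spec.origin)  # the __init__.py actually picked up, e.g. under site-packages
```

If the printed path points at a leftover copy (such as one under AppData\Roaming), that is the one shadowing your conda install.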

You can use conda to upgrade your numpy. Run this command in the terminal:
conda update numpy

You need to remove this directory
C:\Users\andrei\AppData\Roaming\Python\
to fix this problem. It seems at some point you used pip to install numpy and that's interfering with the packages installed by conda (which is reporting the right version, as you said).
Furthermore, please be aware that pip and conda packages can be binary incompatible, so you should avoid mixing them as much as possible.
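The reason the Roaming copy wins is that the per-user site-packages directory (where pip install --user puts packages) is placed on sys.path ahead of the environment's own site-packages. A quick sketch to see both locations (exact paths differ per machine):

```python
import site
import sys

# Per-user site-packages, e.g. ...\AppData\Roaming\Python\Python37\site-packages
# on Windows. Packages here shadow same-named packages in the environment.
print(site.getusersitepackages())
print(site.ENABLE_USER_SITE)  # False means user site-packages are being ignored
print(sys.path[:5])           # import order: earlier entries win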

Related

Issue with 'pandas on spark' used with conda: "No module named 'pyspark.pandas'" even though both pyspark and pandas are installed

I have installed both Spark 3.1.3 and Anaconda 4.12.0 on Ubuntu 20.04.
I have set PYSPARK_PYTHON to be the python bin of a conda environment called my_env
export PYSPARK_PYTHON=~/anaconda3/envs/my_env/bin/python
I installed several packages on conda environment my_env using pip. Here is a portion of the output of pip freeze command:
numpy==1.22.3
pandas==1.4.1
py4j==0.10.9.3
pyarrow==7.0.0
N.B.: the package pyspark is not installed in the conda environment my_env. I would like to be able to launch a pyspark shell on different conda environments without having to reinstall pyspark in every environment (I would like to only modify PYSPARK_PYTHON). This would also avoid having different versions of Spark on different conda environments (which is sometimes desirable, but not always).
When I launch a pyspark shell using pyspark command, I can indeed import pandas and numpy which confirms that PYSPARK_PYTHON is properly set (my_env is the only conda env with pandas and numpy installed, moreover pandas and numpy are not installed on any other python installation even outside conda, and finally if I change PYSPARK_PYTHON I am no longer able to import pandas or numpy).
Inside the pyspark shell, the following code works fine (creating and showing a toy Spark dataframe):
sc.parallelize([(1,2),(2,4),(3,5)]).toDF(["a", "b"]).show()
However, if I try to convert the above dataframe into a pandas on spark dataframe it does not work. The command
sc.parallelize([(1,2),(2,4),(3,5)]).toDF(["t", "a"]).to_pandas_on_spark()
returns:
AttributeError: 'DataFrame' object has no attribute 'to_pandas_on_spark'
I tried to first import pandas (which works fine) and then pyspark.pandas before running the above command but when I run
import pyspark.pandas as ps
I obtain the following error:
ModuleNotFoundError: No module named 'pyspark.pandas'
Any idea why this happens?
Thanks in advance.
From here, it seems that you need Apache Spark 3.2, not 3.1.3. Update to 3.2 and you will have the desired API.
pip install pyspark  # needs Spark >= 3.2
import pyspark.pandas as ps
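The pyspark.pandas module first shipped with Spark 3.2, so the import fails on 3.1.x no matter which pandas you have installed. A small helper (hypothetical, for illustration) that gates on the version string before attempting the import:

```python
def supports_pandas_on_spark(spark_version: str) -> bool:
    """True when the running Spark is new enough for pyspark.pandas (>= 3.2)."""
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    return (major, minor) >= (3, 2)

print(supports_pandas_on_spark("3.1.3"))  # False: pyspark.pandas does not exist yet
print(supports_pandas_on_spark("3.2.0"))  # True
```

In practice you would pass pyspark.__version__ to such a check.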

ModuleNotFoundError after installing from github

I installed the OSMNX package from GitHub with pip using
pip install git+git://github.com/gboeing/osmnx.git
and I confirmed OSMNX was installed as it showed up on pip list.
However, when trying to import osmnx I receive an error that NumPy cannot be found. I definitely have NumPy installed, and I confirmed that NumPy shows up in pip list, so I'm not sure why OSMNX can't find NumPy. Any ideas on how to make OSMNX recognize NumPy?
Here's the full error
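One common cause of this pattern (not confirmed from the error here, since the traceback is a screenshot) is that the pip on your PATH belongs to a different interpreter than the one running import osmnx. A minimal sketch to check which interpreter you are in and whether it can see NumPy:

```python
import importlib.util
import sys

print(sys.executable)  # the interpreter actually running this code

spec = importlib.util.find_spec("numpy")
print(spec.origin if spec else "numpy is not importable from this interpreter")
```

If sys.executable differs from the interpreter your pip installs into, running python -m pip install numpy targets the right one.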

`save_model` requires h5py error occurs even after installing the h5py and cython packages

I need to save my new Sequential model, but when I use model.save(filename) it shows an error like save_model requires h5py. I tried installing h5py in conda with the conda install -c anaconda h5py command, and I also installed Cython, but the error persists. What should I do?
After downloading the package into the conda environment, one should also add it to the PyCharm project interpreter.
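To verify the mismatch, you can check whether the interpreter PyCharm is configured to use can actually import h5py — a minimal sketch:

```python
import importlib.util

def have_h5py() -> bool:
    """True if h5py is importable from the current interpreter."""
    return importlib.util.find_spec("h5py") is not None

print(have_h5py())
# If this prints False inside PyCharm but True in your conda shell,
# the project interpreter is not the conda environment you installed into.
```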

I can't import numpy in PyCharm and it shows errors even after running pip install numpy in cmd

Even after running pip install numpy in cmd, I cannot get the numpy module installed in PyCharm.
I have also tried to add numpy using the project interpreter.
numpy is listed in the available modules, but it cannot be installed.
The screenshot of the error is placed below. Please help me out.
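A reliable way to make sure pip installs into the same interpreter PyCharm runs is to invoke pip as a module of that interpreter via sys.executable. A sketch (the install line is left commented out so the snippet does nothing on its own):

```python
import subprocess
import sys

# sys.executable is the exact interpreter running this script;
# "-m pip" guarantees pip installs into that interpreter, not some other one on PATH.
cmd = [sys.executable, "-m", "pip", "install", "numpy"]
print("Run:", " ".join(cmd))
# subprocess.check_call(cmd)  # uncomment to actually install
```

Run this from PyCharm's own Python console (or terminal) so sys.executable points at the project interpreter.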

Theano fails due to NumPy Fortran mixup under Ubuntu

I installed Theano on my machine, but the nosetests break with a NumPy/Fortran-related error message. It looks to me like NumPy was compiled with a different Fortran compiler than Theano. I already reinstalled Theano (sudo pip uninstall theano + sudo pip install --upgrade --no-deps theano) and NumPy/SciPy (apt-get install --reinstall python-numpy python-scipy), but this did not help.
What steps would you recommend?
Complete error message:
ImportError: ('/home/Nick/.theano/compiledir_Linux-2.6.35-31-generic-x86_64-with-Ubuntu-10.10-maverick--2.6.6/tmpIhWJaI/0c99c52c82f7ddc775109a06ca04b360.so: undefined symbol: _gfortran_st_write_done'
My research:
The Installing SciPy / BuildingGeneral page says this about the undefined symbol: _gfortran_st_write_done error:
If you see an error message
ImportError: /usr/lib/atlas/libblas.so.3gf: undefined symbol: _gfortran_st_write_done
when building SciPy, it means that NumPy picked up the wrong Fortran compiler during build (e.g. ifort).
Recompile NumPy using:
python setup.py build --fcompiler=gnu95
or whichever is appropriate (see python setup.py build --help-fcompiler).
But:
Nick#some-serv2:/usr/local/lib/python2.6/dist-packages/numpy$ python setup.py build --help-fcompiler
This is the wrong setup.py file to run
Used software versions:
scipy 0.10.1 (scipy.test() works)
NumPy 1.6.2 (numpy.test() works)
theano 0.5.0 (several tests fail with undefined symbol: _gfortran_st_write_done)
python 2.6.6
Ubuntu 10.10
[UPDATE]
So I removed numpy and scipy from my system with apt-get remove and by using find -name XXX -delete on what was left.
Then I installed numpy and scipy from the GitHub sources with sudo python setup.py install.
Afterwards I ran sudo pip uninstall theano and sudo pip install --upgrade --no-deps theano again.
Error persists :/
I also tried the apt-get source ... + apt-get build-dep ... approach, but for my old Ubuntu (10.10) it installs too old version of numpy and scipy for theano: ValueError: numpy >= 1.4 is required (detected 1.3.0 from /usr/local/lib/python2.6/dist-packages/numpy/__init__.pyc)
I had the same problem, and after reviewing the source code, user212658's answer seemed like it would work (I have not tried it). I then looked for a way to deploy user212658's hack without modifying the source code.
Put these lines in your theanorc file:
[blas]
ldflags = -lblas -lgfortran
This worked for me.
Have you tried to recompile NumPy from the sources?
I'm not familiar with the Ubuntu package system, so I can't check what's in your dist-packages/numpy. With a clean archive of the NumPy sources, you should have a setup.py at the same level as the directories numpy, tools and benchmarks (among others). I'm pretty sure that's the one you want to use for a python setup.py build.
[EDIT]
Now that you have recompiled numpy with the proper --fcompiler option, perhaps you could try to do the same with Theano, that is, compile it directly from source without relying on apt-get or even pip. You will have better control over the build process that way, which will make debugging/trying to find a solution easier.
I had the same problem. The solution I found is to add a hack in theano/gof/cmodule.py to link against gfortran whenever 'blas' is in the libs. That fixed it.
class GCC_compiler(object):
    ...
    @staticmethod
    def compile_str(module_name, src_code, location=None,
                    include_dirs=None, lib_dirs=None, libs=None,
                    preargs=None):
        ...
        cmd.extend(['-l%s' % l for l in libs])
        if 'blas' in libs:
            cmd.append('-lgfortran')
A better fix is to remove ATLAS and install OpenBLAS. OpenBLAS is faster than ATLAS. Also, OpenBLAS doesn't require gfortran and is the BLAS that NumPy was linked with, so it will work out of the box.
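To see which BLAS/LAPACK NumPy was actually linked against (and hence whether it pulls in gfortran symbols), NumPy exposes its build configuration:

```python
import numpy

# Prints the BLAS/LAPACK libraries NumPy was compiled against,
# e.g. openblas or atlas entries with their library directories.
numpy.__config__.show()
```

If the output mentions atlas rather than openblas, that matches the mixup described above.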