"no module named numpy.core" error after installing numpy in a layer in AWS Lambda - pandas

When I deployed my serverless FastAPI app with AWS Lambda and API Gateway, I got the following error:
No module named numpy.core. Unable to import module 'path/to/handler': Unable to import required dependencies: numpy
And I also got this:
The python version is: Python3.7 from "/var/lang/bin/python3.7", the numpy version is: "1.21.6"
I installed the numpy module in a layer and pointed the serverless.yml file at the directory where the module is installed, but I still get the "no module named numpy.core" error.
My serverless.yml file looks like this:
layers:
  something:
    compatibleRuntimes:
      - python3.8
      - python3.7
      - python3.6
    path: 'path/to/layer'
I also tried without the compatibleRuntimes block, and I still got the same error.
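(Worth double-checking here: Lambda extracts layers under /opt, and the Python runtime only picks up packages from a top-level python/ directory inside the layer, so path/to/layer should contain python/numpy/... rather than numpy/... directly. Also, numpy ships compiled C extensions, and a copy built on macOS or Windows will not import on Lambda's Amazon Linux; errors like this one are often a symptom of one of these two problems.)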
Since numpy is installed and the error reports the version I expect, I suspect the issue is not with the numpy module itself.
And, actually, in my code, what I import is the pandas module, but when I install pandas, it also installs numpy.
import pandas as pd
# functions
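To narrow things down, a minimal diagnostic like this inside the handler shows which interpreter and which numpy Lambda actually loads (a sketch; the handler name is an assumption):
import sys

def handler(event, context):
    # Print the interpreter and search path Lambda is actually using.
    print("python:", sys.executable)
    print("sys.path:", sys.path)
    try:
        import numpy
        # If the import succeeds, show where numpy was loaded from.
        print("numpy:", numpy.__version__, "from", numpy.__file__)
    except Exception as exc:
        print("numpy import failed:", exc)
    return {"statusCode": 200}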
Any advice or tips? I have been stuck on this for a long time. Thank you in advance!

Related

Error in Power BI while importing pandas library in Python script

Below is the error I get while importing the pandas library in a Python script in Power BI.
Details: "ADO.NET: Python script error.
C:\USERS\YADAVP\ANACONDA3\lib\site-packages\numpy\__init__.py:140: UserWarning: mkl-service package failed to import, therefore Intel(R) MKL initialization ensuring its correct out-of-the box operation under condition when Gnu OpenMP had already been loaded by Python process is not assured. Please install mkl-service package, see http://github.com/IntelPython/mkl-service
from . import _distributor_init
Traceback (most recent call last):
File "PythonScriptWrapper.PY", line 2, in <module>
import os, pandas, matplotlib
File "C:\USERS\YADAVP\ANACONDA3\lib\site-packages\pandas\__init__.py", line 17, in <module>
"Unable to import required dependencies:\n" + "\n".join(missing_dependencies)
ImportError: Unable to import required dependencies:
numpy:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy c-extensions failed.
- Try uninstalling and reinstalling numpy.
- If you have already done that, then:
1. Check that you expected to use Python3.7 from "C:\USERS\YADAVP\ANACONDA3\python.exe",
and that you have no directories in your PATH or PYTHONPATH that can
interfere with the Python and numpy version "1.18.1" you're trying to use.
2. If (1) looks fine, you can open a new issue at
https://github.com/numpy/numpy/issues. Please include details on:
- how you installed Python
- how you installed numpy
- your operating system
- whether or not you have multiple versions of Python installed
- if you built from source, your compiler versions and ideally a build log
- If you're working with a numpy git repository, try `git clean -xdf`
(removes all files not under version control) and rebuild numpy.
Note: this error has many possible causes, so please don't comment on
an existing issue about this - open a new one instead.
Original error was: DLL load failed: The specified module could not be found.
How can I resolve this kind of error in Power BI?
Forget Anaconda and use WinPython.
I tried Anaconda for days with all the workarounds available in StackOverflow and other forums, and they took me nowhere.
Then I tried WinPython, and it worked immediately. Of course, you will need to change the PowerBI options accordingly.
To install WinPython: https://github.com/winpython/winpython
To change the detected Python home directory: https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-python-scripts#enable-python-scripting
If you follow this answer, you won't need to downgrade Python, Power BI, or anything else.
I had the same error. Unfortunately, Power BI won't work with the Python bundled with Jupyter Notebook.
So you have to install a standard Python distribution: https://www.python.org/downloads/
Then configure the Python installation you want to use in Power BI and install the libraries you need via pip.
Edit: Please use Python 3.8, because 3.9 doesn't support NumPy at the time of writing.

ModuleNotFoundError: No module named 'pandas.io' for json_normalize

Please read carefully. In my Python script I have the following:
import json
import pandas
from pandas.io.json import json_normalize
and it returns the following error:
from pandas.io.json import json_normalize
ModuleNotFoundError: No module named 'pandas.io'; 'pandas' is not a package
My steps:
I have uninstalled and reinstalled pandas
I have upgraded pip and pandas
I have installed io (pip install -U pandas.io)
I have installed data_reader and replaced the pandas.io.json part with that: from pandas_datareader import json_normalize
I have tried every solution I saw on Stack Overflow and GitHub, and nothing worked. The only one I have not tried is installing Anaconda, but it should have worked with what I already tried. Do you think there is a Windows setting I must change?
PS: My Python version is 3.7.4
Try:
Go to ...\Lib\site-packages\pytrends on your local disk and open the file request.py
Change
from pandas.io.json._normalize import nested_to_record
to
from pandas.io.json.normalize import nested_to_record
I had the same error, and this fixed it for me.
also change
from pandas.io.json.normalize
to
from pandas.io.json._normalize
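Which direction you need depends on your pandas version: the module was pandas.io.json.normalize in releases before 0.25 and became the private pandas.io.json._normalize afterwards. If all you need is json_normalize itself, recent pandas (1.0+) exposes it at the top level, which avoids the private module entirely; a minimal sketch:
from pandas import json_normalize

# Flattens nested dicts into dotted columns like "info.name".
data = [{"id": 1, "info": {"name": "a"}}, {"id": 2, "info": {"name": "b"}}]
print(json_normalize(data))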
In my case, the cause of the problem was that my own Python file was named pandas.py, so it shadowed the real package. After renaming it, the code worked normally without errors.
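A quick way to catch this kind of shadowing is to check where the module actually resolves from (a minimal check, run from the same directory):
import pandas

# If this prints a path to your own script instead of site-packages,
# a local file named pandas.py is shadowing the real library.
print(pandas.__file__)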
I had the same problem and solved it by uninstalling the extra Python versions installed on my Windows machine. Now I have only the one Python installed by Anaconda, and everything works perfectly.

No module named lstm_predictor

When I am trying to import the following module,
>>> from lstm_predictor import lstm_model
The error says no module named lstm_predictor.
How can I solve the problem?
It seems like you are using the lstm_predictor module from https://github.com/tgjeon/TensorFlow-Tutorials-for-Time-Series.
Since this is not a standard module, make sure you have cloned that project and that the lstm_predictor.py file is in the same folder as the script you are running (your current working directory).
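If you keep the clone elsewhere, you can also add it to sys.path before the import (a sketch; the clone location is an assumption):
import sys

# Assumed location of the cloned repository.
sys.path.append("/path/to/TensorFlow-Tutorials-for-Time-Series")

from lstm_predictor import lstm_model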

Enthought Canopy Pandas not installing

Using Enthought Canopy, the command import pandas produces this error message:
ImportError: C extension: hashtable not built. If you want to import pandas
from the source directory, you may need to run 'python setup.py build_ext --inplace'
to build the C extensions first.
I understand this to mean that the package hasn't been built with its C extensions? I thought Canopy's environment handled module installations. I have tried removing and updating pandas with no luck.
Does anyone know how to correctly use Pandas in Enthought Canopy?
Forcing a reinstallation of Pandas and its dependencies with enpkg pandas --forceall run from a Canopy Terminal/Command Prompt seems to have fixed the problem.
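After the reinstall, a quick sanity check that pandas and its compiled extensions load correctly (a minimal sketch):
import pandas as pd

print(pd.__version__)
# Prints the versions of pandas' dependencies, including numpy.
pd.show_versions()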

No module named numpy when spark-submitting

I’m spark-submitting a python file that imports numpy but I’m getting a no module named numpy error.
$ spark-submit --py-files projects/other_requirements.egg projects/jobs/my_numpy_als.py
Traceback (most recent call last):
File "/usr/local/www/my_numpy_als.py", line 13, in <module>
from pyspark.mllib.recommendation import ALS
File "/usr/lib/spark/python/pyspark/mllib/__init__.py", line 24, in <module>
import numpy
ImportError: No module named numpy
I was thinking I would pull in an egg for numpy via --py-files, but I'm having trouble figuring out how to build that egg. Then it occurred to me that pyspark itself uses numpy, and it would be silly to pull in my own version of numpy.
Any idea on the appropriate thing to do here?
It looks like Spark is using a version of Python that does not have numpy installed. It could be because you are working inside a virtual environment.
Try this:
# Make PySpark workers use the Python interpreter that is running this
# script. This is handy inside a virtualenv, where Spark would otherwise
# fall back to the default system Python (which may lack numpy).
import os
import sys

os.environ['PYSPARK_PYTHON'] = sys.executable
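The assignment has to happen before the SparkContext is created, for example (a usage sketch; the app name is an assumption):
from pyspark import SparkContext

# PYSPARK_PYTHON (set above) is read when the context launches its workers.
sc = SparkContext(appName="my_numpy_als")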
I got this to work by installing numpy on all the emr-nodes by configuring a small bootstrapping script that contains the following (among other things).
#!/bin/bash -xe
sudo yum install python-numpy python-scipy -y
Then configure the bootstrap script to be executed when you start your cluster by adding the following option to the aws emr command (this example also passes an argument to the bootstrap script):
--bootstrap-actions Path=s3://some-bucket/keylocation/bootstrap.sh,Name=setup_dependencies,Args=[s3://some-bucket]
This can be used when setting up a cluster automatically from DataPipeline as well.
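Once the cluster is up, a quick way to confirm that numpy is importable on the executors (a sketch; assumes an active SparkContext sc, e.g. from the pyspark shell):
def numpy_version(_):
    # Imported inside the function so it runs on the executor, not the driver.
    import numpy
    return numpy.__version__

print(sc.parallelize(range(2), 2).map(numpy_version).collect())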
Sometimes, when you import certain libraries (for example via from numpy import *), your namespace gets polluted with numpy functions. Functions such as min, max, and sum are especially prone to this. Whenever in doubt, locate calls to these functions and replace them with __builtin__.sum etc.; doing so is sometimes faster than locating the source of the pollution.
Make sure your spark-env.sh has PYSPARK_PYTHON pointing to the correct Python executable. Add export PYSPARK_PYTHON=/your_python_exe_path to the conf/spark-env.sh file in your Spark installation.