How to use python boto3 inside of an AWS CodeBuild build? - aws-codebuild

I need to send SNS notifications directly from inside my CodeBuild script, but I'm getting this error:
ImportError: No module named boto3
Is it possible to fix? Or is the CodeBuild environment just too restrictive to allow this sort of thing?

CodeBuild curated images for Python don't have boto3 installed. You can install it during the build by adding pip install boto3 to the install phase of buildspec.yml. For example, if your Python file is main.py, your buildspec.yml should look like this:
version: 0.2
phases:
  install:
    commands:
      - pip install boto3
      # add other install commands here if needed
  build:
    commands:
      - python main.py
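With boto3 installed, the notification itself takes only a few lines. The following is a minimal sketch rather than a definitive implementation: the function name send_build_notification, the default subject text, and the placeholder topic ARN are assumptions for illustration, and the CodeBuild service role is assumed to allow sns:Publish on the topic.

```python
def send_build_notification(sns_client, topic_arn, message, subject="CodeBuild notification"):
    """Publish one message to an SNS topic and return its MessageId."""
    response = sns_client.publish(
        TopicArn=topic_arn,
        Subject=subject,
        Message=message,
    )
    return response["MessageId"]


def main():
    # boto3 is imported here, so it is only required when main() is actually called.
    import boto3

    # Region and credentials are taken from the build environment.
    sns = boto3.client("sns")
    # Hypothetical ARN; replace it with the ARN of your own topic.
    topic_arn = "arn:aws:sns:us-east-1:123456789012:my-build-topic"
    print(send_build_notification(sns, topic_arn, "Build step finished"))
```

In main.py you would call main() at the point in the build where the notification should go out.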

Related

install "pip undetected-chromedriver" for selenium python

I'm trying to make an autofiller using selenium, but I couldn't get it to work, so I decided to use undetected-chromedriver to finish the automation.
I am having some difficulty importing undetected-chromedriver.
I already installed it by running this on the command line: pip install undetected-chromedriver
But when I add import undetected_chromedriver as uc, the interpreter doesn't recognize it.
Below is the Error message after trying to import undetected-chromedriver:
import undetected_chromedriver as uc
ModuleNotFoundError: No module named 'undetected_chromedriver'
Use the following command to check if the undetected_chromedriver package is in the list
pip list
or
pip3 list
Try the following
# navigate into the project directory with your python script
cd presearch
# create virtual environment
python3 -m venv venv
# activate the virtual environment
source venv/bin/activate
# install required pip packages
pip3 install undetected-chromedriver
If you have multiple python versions installed, you might check if you actually installed it in the right one.
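One way to check which interpreter is running and whether it can see the package is sketched below. It uses only the standard library; the helper name locate_module is an assumption made for this example.

```python
import importlib.util
import sys


def locate_module(module_name):
    """Return the file a top-level module would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


# Shows the interpreter that is running this script.
print("Interpreter:", sys.executable)
# Prints a path if this interpreter can import the package, otherwise None.
print("undetected_chromedriver:", locate_module("undetected_chromedriver"))
```

If the second line prints None, the package was installed for a different interpreter than the one printed on the first line; installing with that interpreter's own pip (python -m pip install undetected-chromedriver) targets the right one.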

How to install pandas using PIP for AWS Web Services written in Python

I'm installing pandas to work with CSV files in AWS APIs written in Python, and I'm getting build errors.
Code in my build spec file:
- npm install
- pip install --target [FILE_PATH] pandas
- pip install --target [FILE_PATH] lxml
- pip install --target [FILE_PATH] zeep
- pip install --target [FILE_PATH] xlrd
- pip install --target [FILE_PATH] requests
Error information:
[ERROR] Runtime.ImportModuleError: Unable to import required dependencies:
numpy:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.
We have compiled some common reasons and troubleshooting tips at:
https://numpy.org/devdocs/user/troubleshooting-importerror.html
Please note and check the following:
The Python version is: Python3.8 from "/var/lang/bin/python3.8"
The NumPy version is: "1.19.0"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: No module named 'numpy.core._multiarray_umath'

tensor flow install problems [duplicate]

I'm trying to use pip to install a package. I try to run pip install from the Python shell, but I get a SyntaxError. Why do I get this error? How do I use pip to install the package?
>>> pip install selenium
^
SyntaxError: invalid syntax
pip is run from the command line, not the Python interpreter. It is a program that installs modules, so you can use them from Python. Once you have installed the module, then you can open the Python shell and do import selenium.
The Python shell is not a command line, it is an interactive interpreter. You type Python code into it, not commands.
Use the command line, not the Python shell (DOS, PowerShell in Windows).
C:\Program Files\Python2.7\Scripts> pip install XYZ
If you installed Python into your PATH using the latest installers, you don't need to be in that folder to run pip
Terminal in Mac or Linux
$ pip install XYZ
As @sinoroc suggested, the correct way to install a package via pip from code is to run it in a separate process, since pip may close a thread or may require an interpreter restart to load the newly installed package. This is the right way to use the API: subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'SomeProject']). However, Python does let you access the internal API, so if you know what you are using it for you may want to use it anyway, e.g. if you are building your own GUI package manager with alternative resources like https://www.lfd.uci.edu/~gohlke/pythonlibs/
The following solution is OUT OF DATE; instead of downvoting, suggest updates. See https://github.com/pypa/pip/issues/7498 for reference.
UPDATE: Since pip version 10.x there is no longer a get_installed_distributions() or main method under import pip; use import pip._internal as pip instead.
UPDATE ca. v.18: get_installed_distributions() has been removed. Instead you may use the freeze generator like this:
from pip._internal.operations.freeze import freeze
print([package for package in freeze()])
# eg output ['pip==19.0.3']
If you want to use pip inside the Python interpreter, try this:
import pip
package_names=['selenium', 'requests'] #packages to install
pip.main(['install'] + package_names + ['--upgrade'])
# --upgrade to install or update existing packages
If you need to update every installed package, use following:
import pip
for i in pip.get_installed_distributions():
    pip.main(['install', i.key, '--upgrade'])
If you want to stop installing other packages if any installation fails, use it in one single pip.main([]) call:
import pip
package_names = [i.key for i in pip.get_installed_distributions()]
pip.main(['install'] + package_names + ['--upgrade'])
Note: When you install from a list in a file with the -r / --requirement parameter, you do NOT need the open() function.
pip.main(['install', '-r', 'filename'])
Warning: Some arguments, such as a simple --help, may cause the Python interpreter to exit.
Curiosity: By using pip.exe you actually use python interpreter and pip module anyway. If you unpack pip.exe or pip3.exe regardless it's python 2.x or 3.x, inside is the SAME single file __main__.py:
# -*- coding: utf-8 -*-
import re
import sys
from pip import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
To run pip in Python 3.x, just follow the instructions on Python's page: Installing Python Modules.
python -m pip install SomePackage
Note that this is run from the command line and not the python shell (the reason for syntax error in the original question).
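If the install really has to be triggered from a script, the separate-process approach mentioned above can be sketched as follows. The helper names pip_install_command and pip_install are assumptions for this example; only subprocess.check_call and the python -m pip invocation come from the answers.

```python
import subprocess
import sys


def pip_install_command(packages, upgrade=False):
    """Build the argument list for running pip under the current interpreter."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.extend(packages)
    return command


def pip_install(packages, upgrade=False):
    """Run pip in a child process; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(packages, upgrade=upgrade))
```

Calling pip_install(["selenium"]) is then equivalent to running python -m pip install selenium with the same interpreter that runs the script, which avoids installing into a different Python by accident.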
I installed Python, and running the pip command used to throw an error.
Make sure the pip path is added to your environment variables. For me, the Python and pip installation paths are:
Python: C:\Users\fhhz\AppData\Local\Programs\Python\Python38\
pip: C:\Users\fhhz\AppData\Local\Programs\Python\Python38\Scripts
Both of these paths were added to PATH in the environment variables.
Now open a new cmd window and type pip; you should see pip's usage help.
Now type pip install <<package-name>>. Here I'm installing the package spyder, so my command is pip install spyder.
I hope we are done with this!
You need to type it in cmd, not in IDLE, because IDLE is not a command prompt. If you want to install something from IDLE, type this:
>>> from pip.__main__ import _main as main
>>> main(['install', 'requests'])  # arguments as a list, one item per space-separated word
This calls pip the same way as pip <commands> in a terminal; each word you would separate with a space there becomes one list item.
If you are doing it from command line,
try -
python -m pip install selenium
or (for Python3 and above)
python3 -m pip install selenium

Missing required dependencies ['numpy'] in AWS Lambda function

I guess many people have come across the same issue. I have read every blog post I could find and tried every method. I have reached this point and am stuck here.
I am using Serverless framework and virtualenv.
serverless.yml:
service: test-pandas
provider:
  name: aws
  runtime: python2.7
plugins:
  - serverless-python-requirements
package:
  exclude:
    - venv/**
    - node_modules/**
functions:
  hello:
    handler: validation.hello
validation.py:
import pandas as pd
import numpy as np
def hello(event, context):
    return "hello world"
I am using python 2.7. I have run these commands in Virtualenv:
virtualenv venv --python=python2
source venv/bin/activate
pip install pandas
pip freeze > requirements.txt
cat requirements.txt
Before creating the requirements.txt, the error was "No import module named pandas", and after I set up serverless-python-requirements, I am getting "Missing required dependencies ['numpy']".
Am I missing something here?
I used Docker to package and deploy the Lambda function with the libraries.
Add the following in serverless.yml:
custom:
  pythonRequirements:
    dockerizePip: non-linux
Make sure Docker is running on your machine, then deploy using the serverless commands. Another thing I noticed is that, after using Docker, the .zip file size dropped to almost half of the original.

Pandas & AWS Lambda

Does anyone have a fully compiled version of pandas that is compatible with AWS Lambda?
After searching around for a few hours, I cannot seem to find what I'm looking for and the documentation on this subject is non-existent.
I need access to the package in a lambda function however I have been unsuccessful at getting the package to compile properly for usage in a Lambda function.
In lieu of the compilation can anyone provide reproducible steps to create the binaries?
Unfortunately I have not been able to successfully reproduce any of the guides on the subjects as they mostly combine pandas with scipy which I don't need and adds an extra layer of burden.
I believe you should be able to use the recent pandas version (or likely, the one on your machine). You can create a lambda package with pandas by yourself like this,
First find where the pandas package is installed on your machine i.e. Open a python terminal and type
import pandas
pandas.__file__
That should print something like '/usr/local/lib/python3.4/site-packages/pandas/__init__.py'
Now copy the pandas folder from that location (in this case '/usr/local/lib/python3.4/site-packages/pandas') and place it in your repository.
Package your Lambda code with pandas like this:
zip -r9 my_lambda.zip pandas/
zip -9 my_lambda.zip my_lambda_function.py
You can also deploy your code to S3 and make your Lambda use the code from S3.
aws s3 cp my_lambda.zip s3://dev-code//projectx/lambda_packages/
Here's the repo that will get you started
After some tinkering around and lots of googling, I was able to make everything work and set up a repo that can just be cloned in the future.
Key takeaways:
All static packages have to be compiled on an ec2 amazon Linux instance
The python code needs to load the libraries in the lib/ folder before executing.
Github repo:
https://github.com/moesy/AWS-Lambda-ML-Microservice-Skeleton
The repo mthenw/awesome-layers lists several publicly available aws lambda layers.
In particular, keithrozario/Klayers has pandas+numpy and is up-to-date as of today with pandas 0.25.
Its ARN is arn:aws:lambda:us-east-1:113088814899:layer:Klayers-python37-pandas:1
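A layer like this can also be attached from code. The sketch below assumes a boto3 Lambda client is passed in; the helper name attach_layer and the function name my-function in the usage note are made up for illustration. Note that update_function_configuration replaces the function's whole layer list, so the existing ARNs are read first.

```python
def attach_layer(lambda_client, function_name, layer_arn):
    """Add a layer to a function while keeping the layers it already has."""
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    # "Layers" is absent from the response when the function has no layers yet.
    layer_arns = [layer["Arn"] for layer in config.get("Layers", [])]
    if layer_arn not in layer_arns:
        layer_arns.append(layer_arn)
    # The Layers parameter replaces the whole list, so send every ARN to keep.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Layers=layer_arns,
    )
    return layer_arns
```

With boto3 this would be called as attach_layer(boto3.client("lambda", region_name="us-east-1"), "my-function", layer_arn). The layer has to be in the same region as the function, which is why the region is set to match the ARN.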
I know the question was asked a couple years ago and Lambda was on a different stage back then.
I faced similar issues lately and I thought it would be a good idea to add the newest solution here for future users facing the same problem.
It turns out that Amazon introduced the concept of layers at re:Invent 2018. It is a great feature. This post on Medium describes it much better than I could here: Creating New AWS Lambda Layer For Python Pandas Library
The easiest way to get pandas working in a Lambda function is to utilize Lambda Layers and AWS Data Wrangler. A Lambda Layer is a zip archive that contains libraries or dependencies. According to the AWS documentation, using layers keeps your deployment package small, making development easier.
The AWS Data Wrangler is an open source package that extends the power of pandas to AWS services.
Follow the instructions (under AWS Lambda Layer) here.
Another option is to download the pre-compiled wheel files as discussed on this post: https://aws.amazon.com/premiumsupport/knowledge-center/lambda-python-package-compatible/
Essentially, you need to go to the project page on https://pypi.org and download the files named like the following:
For Python 2.7: module-name-version-cp27-cp27mu-manylinux1_x86_64.whl
For Python 3.6: module-name-version-cp36-cp36m-manylinux1_x86_64.whl
Then unzip the .whl files to your project directory and re-zip the contents together with your lambda code.
NOTE: The main Python function file(s) must be in the root folder of the resulting deployment package .zip file. Other Python modules and dependencies can be in sub-folders. Something like:
my_lambda_deployment_package.zip
├───lambda_function.py
├───numpy
│ ├───[subfolders...]
├───pandas
│ ├───[subfolders...]
└───[additional package folders...]
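The layout above can be produced with a short script. This is a minimal sketch using only the standard library; the function name build_deployment_zip and its parameters are assumptions for this example, and the package folders are expected to be already unpacked (for instance from the .whl files).

```python
import os
import zipfile


def build_deployment_zip(handler_file, package_dirs, output_zip):
    """Zip a handler file at the archive root plus each package directory beside it."""
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        # The handler goes in the root of the archive, whatever folder it came from.
        archive.write(handler_file, arcname=os.path.basename(handler_file))
        for package_dir in package_dirs:
            parent = os.path.dirname(os.path.abspath(package_dir))
            for folder, _subfolders, files in os.walk(package_dir):
                for name in files:
                    full_path = os.path.join(folder, name)
                    # Store paths relative to the package's parent, e.g. pandas/__init__.py
                    arcname = os.path.relpath(os.path.abspath(full_path), parent)
                    archive.write(full_path, arcname=arcname)
    return output_zip
```

For example, build_deployment_zip("lambda_function.py", ["numpy", "pandas"], "my_lambda_deployment_package.zip") would give an archive with lambda_function.py in the root and the two package folders next to it.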
@ashtonium's answer actually works and is most likely the easiest; however, a few additional steps are required. Also, pandas requires pytz (mentioned in the link provided by @b3rt0), so that package is needed as well.
1. Download the whl-files from PyPI (the pandas file ends with ...manylinux1_x86_64.whl; there is only one pytz file of relevance)
2. Unzip the whl-files using a terminal command, e.g. unzip filename.whl (Linux/MacOS)
3. Create a new folder structure python/lib/python3.7/site-packages/ (swap 3.7 for the version of your choice)
4. Move the folders from step 2 to the site-packages folder from step 3
5. Zip the root folder of the new structure, i.e. python
6. Create a new layer in the AWS management console and upload the zip-file there
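The unzip, move and zip steps above can also be scripted. The following is a rough sketch under the assumption that the wheel files have already been downloaded; the function name build_layer_zip and its parameters are invented for this example. It relies on the fact that a .whl file is an ordinary zip archive.

```python
import os
import tempfile
import zipfile


def build_layer_zip(wheel_files, output_zip, python_version="3.7"):
    """Unpack wheels into python/lib/pythonX.Y/site-packages and zip the result."""
    site_packages = "python/lib/python{}/site-packages".format(python_version)
    with tempfile.TemporaryDirectory() as staging:
        target = os.path.join(staging, *site_packages.split("/"))
        os.makedirs(target)
        for wheel in wheel_files:
            # A .whl file is a zip archive, so zipfile can extract it directly.
            with zipfile.ZipFile(wheel) as archive:
                archive.extractall(target)
        with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as layer:
            for folder, _subfolders, files in os.walk(staging):
                for name in files:
                    full_path = os.path.join(folder, name)
                    # Paths inside the layer start at the top-level python folder.
                    layer.write(full_path, arcname=os.path.relpath(full_path, staging))
    return output_zip
```

The resulting zip file is what gets uploaded when creating the layer. The python_version argument should match the runtime of the Lambda function that will use it.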
This is a very common question, I hope my solution helps.
Update on Aug 19, 2020:
Wheel-files aren't available for all packages. In these cases you can skip to step 3, go into the site-packages folder and install the package in there with pip3 install PACKAGE_NAME -t . (no venv required). Some packages are easier than others, some are trickier. Psycopg2 for example, requires you to move only one of the two (as of this writing) package folders.
/Cheers
There are some precompiled packages on github by ryfeus.
My solution has been to maintain 2 requirements.txt style files of packages that go in my layer, one named provided_packages.txt and one named provided_linux_installs.txt
Before deployment (if the packages are not already installed) I run:
pip install -r provided_packages.txt -t layer_name/python/lib/python3.8/site-packages/.
pip download -r provided_linux_installs.txt --platform manylinux1_x86_64 --no-deps -d layer_name/python/lib/python3.8/site-packages
cd layer_name/python/lib/python3.8/site-packages
unzip \*.whl
rm *.whl
Then deploy normally (I am using cdk synth & cdk deploy \* --profile profile_name)
In case helpful, my provided_linux_installs.txt looks like this:
pandas==1.1.0
numpy==1.19.1
pytz==2020.1
python-dateutil==2.8.1
I have started to maintain a GitHub repo for easy and quick access to layers. https://github.com/kuharan/Lambda-Layers
I have been using these for my open-source projects and stuff.
I managed to deploy pandas code to AWS Lambda using the python3.6 runtime. These are the steps I followed:
Add the required libraries to requirements.txt
Build the project in a Docker container (using the AWS SAM CLI: sam build --use-container)
Run the code (sam local invoke --event test.json)
This template may help: https://github.com/ysfmag/aws-lambda-py-pandas-template
# All steps are done on an AWS EC2 Linux free-tier instance so that the libraries are compatible with the Lambda environment
# Install the required packages into a packages folder
mkdir packages
cd packages
pip3 install -t . pandas
pip3 install -t . numpy --upgrade
pip3 install -t . wikipedia --upgrade
# scikit-learn is the PyPI name of the sklearn package
pip3 install -t . scikit-learn --upgrade
pip3 install -t . pickle-mixin --upgrade
pip3 install -t . fuzzywuzzy --upgrade
# Now remove all unnecessary files
sudo rm -rf *.whl *.dist-info __pycache__
# Go back up and create the directory layout that Lambda recognizes
cd ..
sudo mkdir -p build/python/lib/python3.6/site-packages
# Now move all the files from the packages folder to the site-packages folder
sudo mv /home/ec2-user/packages/* build/python/lib/python3.6/site-packages/
# Now move into the build folder
cd build
# Now zip everything from the python folder down to site-packages
sudo zip -r python.zip .
Finally, upload the zip file to Lambda Layers.
python 3.8 windows 10 lambda aws pandas
You need to do the following steps on a Linux machine with Python 3.8:
1. sudo mkdir python
2. sudo pip3 install --target python pandas
3. sudo zip -r pandas.zip python
4. Create a public S3 bucket, upload pandas.zip, and grab the public URL.
5. Create a new Lambda layer using the S3 URL from above.
6. Add the layer to the Lambda function and import pandas as pd like you normally would.
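Creating the layer from the uploaded zip can also be done with boto3 instead of the console. The sketch below is a variation on the steps above, not the author's exact method: it refers to the object by bucket and key rather than by public URL, the helper name publish_layer_from_s3 is an assumption, and the credentials in use are assumed to be able to read the object.

```python
def publish_layer_from_s3(lambda_client, layer_name, bucket, key, runtimes=("python3.8",)):
    """Publish a new layer version from a zip stored in S3 and return its ARN."""
    response = lambda_client.publish_layer_version(
        LayerName=layer_name,
        # Points at the uploaded pandas.zip by bucket name and object key.
        Content={"S3Bucket": bucket, "S3Key": key},
        CompatibleRuntimes=list(runtimes),
    )
    return response["LayerVersionArn"]
```

A call such as publish_layer_from_s3(boto3.client("lambda"), "pandas", "my-bucket", "pandas.zip") returns the ARN to add to the function; the bucket name here is a placeholder.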
No Linux machine? Launch an Ubuntu EC2 instance or container:
1. sudo apt install python3.8 zip unzip python3-pip
2. Run 1-3 above.
3. Now you need to copy the zip to your local machine. Open a command terminal, change directory to the folder containing your EC2 instance's pem file, and run: scp -i yourPemFile.pem ubuntu@'EC2.Instance.IP.Here':/home/path/to/pandas.zip C:\Users\YourUser\Desktop
4. Run steps 4-6 from above.
*for number 3 above: you need to grab your EC2 IP and insert it. You may get an error about the permissions on the pem file; if you do, right click the pem file > properties > security > advanced > disable inheritance and make sure only your user is in the "permission entries." Lastly, fix the paths to point to where the pandas.zip file is on the EC2 instance and where you want the file to end up locally.
**pay attention to the python runtime of the lambda function. Make sure it matches the version of python you're using to do the pip stuff (which should be 3.8).
***the original folder name "python" is named that for a reason as per AWS documentation.
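A quick way to confirm the version match before running pip is sketched below. The helper name matches_runtime is an assumption; the runtime string follows the Lambda naming used in this thread, such as python3.8.

```python
import sys


def matches_runtime(runtime, version_info=None):
    """Return True if the interpreter's major.minor version matches a runtime such as 'python3.8'."""
    if version_info is None:
        version_info = sys.version_info
    local = "python{}.{}".format(version_info[0], version_info[1])
    return runtime == local
```

Running matches_runtime("python3.8") on the build machine returns False when pip3 belongs to a different Python version, which is a sign that compiled packages in the layer may fail to import in the function.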
After lots of googling and messing around, I found that the concept of layers is great, and it works for me.
This GitHub repo from keithrozario has loads of pre-built layers that you can simply add to your Lambda via the ARN, including great stuff like pandas, requests and sqlalchemy.
I've created a template to compile and upload a layer (containing Python dependencies) to Lambda using the AWS CLI, which you can find in my GitLab repo here.
I'm running this on an Amazon Linux EC2 instance, using a virtual environment (venv) to install libraries from a requirements.txt file and then uploading the zipped files to Lambda using the AWS CLI.
Note the folder structure my_zip_file/python/binaries, which is required for Lambda.
Note: pandas is quite a large library. Your zipped layer file must be below 70 MB.
You may also encounter the horrible "OpenBLAS WARNING - could not determine the L2 cache size on this system" error message. I had to increase the memory from the default 128 MB in order for the Lambda to run successfully.
After searching around for a few hours, I cannot seem to find what I’m looking for and the documentation on this subject is non-existent.
So I decided to build the libraries myself to support the Amazon Linux 2 arch.
Read full blog here https://khanakia.medium.com/add-pandas-and-numpy-python-to-aws-lambda-layers-python-3-7-3-8-694db42f6119