Upload entire Bitbucket repo to S3 using Bitbucket Pipeline

I'm using Bitbucket Pipelines. I want it to push the entire contents of my repo (very small) to S3. I don't want to have to zip it up, push to S3 and then unzip things. I just want it to take the existing file/folder structure in my Bitbucket repo and push that to S3.
What should the yaml file and .py file look like to accomplish this?
Here is the current yaml file:
image: python:3.5.1

pipelines:
  branches:
    master:
      - step:
          script:
            # - apt-get update # required to install zip
            # - apt-get install -y zip # required if you want to zip repository objects
            - pip install boto3==1.3.0 # required for s3_upload.py
            # the first argument is the name of the existing S3 bucket to upload the artefact to
            # the second argument is the artefact to be uploaded
            # the third argument is the bucket key
            # html files
            - python s3_upload.py my-bucket-name html/index_template.html html/index_template.html # run the deployment script
            # Example command line parameters. Replace with your values
            #- python s3_upload.py bb-s3-upload SampleApp_Linux.zip SampleApp_Linux # run the deployment script
And here is my current python:
from __future__ import print_function
import os
import sys
import argparse
import boto3
from botocore.exceptions import ClientError

def upload_to_s3(bucket, artefact, bucket_key):
    """
    Uploads an artefact to Amazon S3
    """
    try:
        client = boto3.client('s3')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False
    try:
        client.put_object(
            Body=open(artefact, 'rb'),
            Bucket=bucket,
            Key=bucket_key
        )
    except ClientError as err:
        print("Failed to upload artefact to S3.\n" + str(err))
        return False
    except IOError as err:
        print("Failed to access artefact in this directory.\n" + str(err))
        return False
    return True

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("bucket", help="Name of the existing S3 bucket")
    parser.add_argument("artefact", help="Name of the artefact to be uploaded to S3")
    parser.add_argument("bucket_key", help="Name of the S3 Bucket key")
    args = parser.parse_args()
    if not upload_to_s3(args.bucket, args.artefact, args.bucket_key):
        sys.exit(1)

if __name__ == "__main__":
    main()
This requires me to list every single file in the repo in the yaml file as another command. I just want it to grab everything and upload it to S3.

You can switch to the Docker image https://hub.docker.com/r/abesiyo/s3/
It runs quite well.
bitbucket-pipelines.yml
image: abesiyo/s3

pipelines:
  default:
    - step:
        script:
          - s3 --region "us-east-1" rm s3://<bucket name>
          - s3 --region "us-east-1" sync . s3://<bucket name>
Please also set up these environment variables in the Bitbucket Pipelines settings:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
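If you prefer to stay with a plain Python/boto3 step instead of that image, note that boto3 resolves those same two environment variables automatically, so no credentials need to appear in the script. A minimal sketch (the bucket name is a placeholder):

import boto3

# boto3 picks up AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the
# Bitbucket Pipelines environment variables; nothing is hard-coded here.
s3 = boto3.client('s3')

# quick sanity check that the credentials and bucket are reachable
s3.head_bucket(Bucket='my-bucket-name')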

Figured it out myself. Here is the python file, 's3_upload.py'
from __future__ import print_function
import os
import sys
import argparse
import boto3
#import zipfile
from botocore.exceptions import ClientError

def upload_to_s3(bucket, artefact, is_folder, bucket_key):
    try:
        client = boto3.client('s3')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False
    if is_folder == 'true':
        for root, dirs, files in os.walk(artefact, topdown=False):
            print('Walking it')
            for file in files:
                #add a check like this if you just want certain file types uploaded
                #if file.endswith('.js'):
                try:
                    print(file)
                    client.upload_file(os.path.join(root, file), bucket, os.path.join(root, file))
                except ClientError as err:
                    print("Failed to upload artefact to S3.\n" + str(err))
                    return False
                except IOError as err:
                    print("Failed to access artefact in this directory.\n" + str(err))
                    return False
                #else:
                #    print('Skipping file:' + file)
    else:
        print('Uploading file ' + artefact)
        client.upload_file(artefact, bucket, bucket_key)
    return True

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("bucket", help="Name of the existing S3 bucket")
    parser.add_argument("artefact", help="Name of the artefact to be uploaded to S3")
    parser.add_argument("is_folder", help="True if its the name of a folder")
    parser.add_argument("bucket_key", help="Name of file in bucket")
    args = parser.parse_args()
    if not upload_to_s3(args.bucket, args.artefact, args.is_folder, args.bucket_key):
        sys.exit(1)

if __name__ == "__main__":
    main()
and here is the bitbucket-pipelines.yml file:
---
image: python:3.5.1

pipelines:
  branches:
    dev:
      - step:
          script:
            - pip install boto3==1.4.1 # required for s3_upload.py
            - pip install requests
            # the first argument is the name of the existing S3 bucket to upload the artefact to
            # the second argument is the artefact to be uploaded
            # the third argument is if the artefact is a folder
            # the fourth argument is the bucket_key to use
            - python s3_emptyBucket.py dev-slz-processor-repo
            - python s3_upload.py dev-slz-processor-repo lambda true lambda
            - python s3_upload.py dev-slz-processor-repo node_modules true node_modules
            - python s3_upload.py dev-slz-processor-repo config.dev.json false config.json
    stage:
      - step:
          script:
            - pip install boto3==1.3.0 # required for s3_upload.py
            - python s3_emptyBucket.py staging-slz-processor-repo
            - python s3_upload.py staging-slz-processor-repo lambda true lambda
            - python s3_upload.py staging-slz-processor-repo node_modules true node_modules
            - python s3_upload.py staging-slz-processor-repo config.staging.json false config.json
    master:
      - step:
          script:
            - pip install boto3==1.3.0 # required for s3_upload.py
            - python s3_emptyBucket.py prod-slz-processor-repo
            - python s3_upload.py prod-slz-processor-repo lambda true lambda
            - python s3_upload.py prod-slz-processor-repo node_modules true node_modules
            - python s3_upload.py prod-slz-processor-repo config.prod.json false config.json
As an example for the dev branch, it grabs everything in the "lambda" folder, walks the entire structure of that folder, and for each item it finds, it uploads it to the dev-slz-processor-repo bucket
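If you would rather push the whole repository root in one go (the original goal), the same walk works with "." as the artefact; a hedged variant below also prunes the .git directory so the clone metadata is not uploaded, and builds the key relative to the repo root (the function and bucket names here are just placeholders, not part of the script above):

import os
import boto3

def upload_tree(bucket, root_dir='.'):
    """Upload every file under root_dir to S3, skipping .git (a sketch, not the author's script)."""
    client = boto3.client('s3')
    for root, dirs, files in os.walk(root_dir):
        dirs[:] = [d for d in dirs if d != '.git']  # prune the clone metadata
        for name in files:
            path = os.path.join(root, name)
            key = os.path.relpath(path, root_dir)  # key relative to the repo root
            client.upload_file(path, bucket, key)

# e.g. upload_tree('dev-slz-processor-repo')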
Lastly, here is a little helpful function, 's3_emptyBucket', to remove all objects from the bucket before uploading the new ones:
from __future__ import print_function
import os
import sys
import argparse
import boto3
#import zipfile
from botocore.exceptions import ClientError

def empty_bucket(bucket):
    try:
        resource = boto3.resource('s3')
    except ClientError as err:
        print("Failed to create boto3 resource.\n" + str(err))
        return False
    print("Removing all objects from bucket: " + bucket)
    resource.Bucket(bucket).objects.delete()
    return True

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("bucket", help="Name of the existing S3 bucket to empty")
    args = parser.parse_args()
    if not empty_bucket(args.bucket):
        sys.exit(1)

if __name__ == "__main__":
    main()
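One caveat worth adding: objects.delete() only removes the current objects. If versioning is enabled on the bucket, the old versions (and delete markers) stay behind; in that case the object_versions collection can be cleared instead. A small sketch, assuming a versioned bucket:

import boto3

def empty_versioned_bucket(bucket_name):
    """Delete all object versions and delete markers (only needed when versioning is enabled)."""
    bucket = boto3.resource('s3').Bucket(bucket_name)
    bucket.object_versions.delete()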

Atlassian now offers "Pipes" to simplify configuration of some common tasks. There's one for S3 upload as well.
No need to specify a different image type:
image: node:8

pipelines:
  branches:
    master:
      - step:
          script:
            - pipe: atlassian/aws-s3-deploy:0.2.1
              variables:
                AWS_ACCESS_KEY_ID: $AWS_ACCESS_KEY_ID
                AWS_SECRET_ACCESS_KEY: $AWS_SECRET_ACCESS_KEY
                AWS_DEFAULT_REGION: "us-east-1"
                S3_BUCKET: "your.bucket.name"
                LOCAL_PATH: "dist"

For deploying a static website to Amazon S3 I have this bitbucket-pipelines.yml configuration file:
image: attensee/s3_website

pipelines:
  default:
    - step:
        script:
          - s3_website push
I’m using the attensee/s3_website docker image because that one has the awesome s3_website tool installed.
The configuration file of s3_website (s3_website.yml, created in the root directory of the repository in Bitbucket) looks something like this:
s3_id: <%= ENV['S3_ID'] %>
s3_secret: <%= ENV['S3_SECRET'] %>
s3_bucket: bitbucket-pipelines
site: .
We have to define the environment variables S3_ID and S3_SECRET in the Bitbucket Pipelines settings.
Thanks to https://www.savjee.be/2016/06/Deploying-website-to-ftp-or-amazon-s3-with-BitBucket-Pipelines/
for the solution.

Related

How to run a Lambda Docker with serverless offline

I would like to run serverless offline using a Lambda function that points to a Docker image.
When I try to run serverless offline, I am just receiving:
Offline [http for lambda] listening on http://localhost:3002
Function names exposed for local invocation by aws-sdk:
* hello-function: sample-app3-dev-hello-function
If I try to access http://localhost:3002/hello, a 404 error is returned
serverless.yml
service: sample-app3
frameworkVersion: '3'
plugins:
  - serverless-offline
provider:
  name: aws
  ecr:
    images:
      sampleapp3image:
        path: ./app/
        platform: linux/amd64
functions:
  hello-function:
    image:
      name: sampleapp3image
    events:
      - httpApi:
          path: /hello
          method: GET
app/myfunction.py
def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': 'Hello World!'
    }
app/Dockerfile
FROM public.ecr.aws/lambda/python:3.9
COPY myfunction.py ./
CMD ["myfunction.lambda_handler"]
At the moment such functionality is not supported in the serverless-offline plugin. There's an open issue where the discussion started around supporting this use case: https://github.com/dherault/serverless-offline/issues/1324
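Until that lands, one workaround is to exercise the handler directly with a stubbed event, outside serverless-offline entirely. A minimal sketch, assuming app/ is importable from the project root and that an HTTP API (payload v2) style event shape is close enough for this handler:

# local_invoke.py - hypothetical smoke test, not part of the serverless tooling
from app.myfunction import lambda_handler

# stripped-down HTTP API v2 style event; this particular handler ignores it anyway
event = {
    "rawPath": "/hello",
    "requestContext": {"http": {"method": "GET", "path": "/hello"}},
}

response = lambda_handler(event, None)
print(response)  # expected: {'statusCode': 200, 'body': 'Hello World!'}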

Conan Error: install loop detected in context host .. requires .. which is an ancestor to

I know this might seem obvious to some of you, but I've been digging everywhere for answers with no success. I'm trying to conan install my own package from my repo but I can't get past this error. It tells me that I have a loop in my requires, but my package has no requirements.
I use this recipe for the upload with conan export-pkg:
# Standard library imports
import configparser
import os
import sys

# Related third party imports
import conans

class TlConanFile(conans.ConanFile):
    settings = "os", "arch"

    def __init__(self, output, runner, display_name="", user=None, channel=None):  # pylint: disable=too-many-arguments
        super().__init__(output, runner, display_name, user, channel)
        if "--build-folder" in sys.argv:
            # Conan checks the arguments and fails if the value is missing, the next argument is always the value
            build_folder = sys.argv[sys.argv.index("--build-folder") + 1]
            self.__class__.exports = os.path.relpath(os.path.join(build_folder, "..", "conanfile.txt"),
                                                     os.path.dirname(__file__))
        elif "-bf" in sys.argv:
            # Conan checks the arguments and fails if the value is missing, the next argument is always the value
            build_folder = sys.argv[sys.argv.index("-bf") + 1]
            self.__class__.exports = os.path.relpath(os.path.join(build_folder, "..", "conanfile.txt"),
                                                     os.path.dirname(__file__))
        elif "-pf" in sys.argv:
            # Conan checks the arguments and fails if the value is missing, the next argument is always the value
            build_folder = sys.argv[sys.argv.index("-pf") + 1]
            self.__class__.exports = os.path.relpath(os.path.join(build_folder, "..", "conanfile.txt"),
                                                     os.path.dirname(__file__))
        elif "--package-folder" in sys.argv:
            # Conan checks the arguments and fails if the value is missing, the next argument is always the value
            build_folder = sys.argv[sys.argv.index("--package-folder") + 1]
            self.__class__.exports = os.path.relpath(os.path.join(build_folder, "..", "conanfile.txt"),
                                                     os.path.dirname(__file__))
        else:
            # Simply assume that we are running the command in the build directory
            build_folder = os.getcwd()
            self.__class__.exports = os.path.relpath(os.path.join(build_folder, "..", "conanfile.txt"),
                                                     os.path.dirname(__file__))

    def package(self):
        self.copy("*.h", dst="include/aveer", src="output/include/aveer")
        self.copy("*.i", dst="include/aveer/swig", src="output/include/aveer/swig")
        self.copy("*.so*", dst="lib", src="output/lib", symlinks=True)
        self.copy("*.cmake", dst="lib/cmake/aveer", src="output/lib/cmake/aveer")
        self.copy("*.so", dst="lib/python3/dist-packages/aveer", src="output/lib/python3/dist-packages/aveer")
        self.copy("*", dst="lib/python3/dist-packages/aveer", src="output/lib/python3/dist-packages/aveer")
        self.copy("*.yml", dst="share/gnuradio/grc/blocks", src="output/share/gnuradio/grc/blocks")
        #self.copy("*", dst="share/gnuradio/grc/blocks", src="share/gnuradio/grc/blocks") doc?

    def package_info(self):
        self.cpp_info.libs = conans.tools.collect_libs(self)

    def requirements(self):
        with open("../conanfile.txt") as conanfile_txt:
            config = configparser.ConfigParser(allow_no_value=True, delimiters=["\0"])
            config.optionxform = str
            config.read_file(conanfile_txt)
            for requirement in config['requires']:
                self.requires(requirement)
to go with this conanfile.py, I also have this conanfile.txt:
[requires]
[generators]
cmake
[options]
That's it for the upload.
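For what it's worth, the requirements() override above can be checked on its own: parsing the upload-side conanfile.txt with the same configparser settings yields an empty [requires] section, i.e. nothing is passed to self.requires(). A standalone sketch, not part of the recipe:

import configparser

content = """[requires]
[generators]
cmake
[options]
"""

config = configparser.ConfigParser(allow_no_value=True, delimiters=["\0"])
config.optionxform = str
config.read_string(content)
print(list(config['requires']))  # prints [] - no requirement gets added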
and I use this conanfile.txt for the install:
[requires]
lib-grplugin/1.0.45@aveer_repo/Release
[generators]
cmake
[options]
[imports]
include/aveer, *.h -> ./output/include/aveer
include/aveer/swig, *.i -> ./output/include/aveer/swig
lib, *.so* -> ./output/lib
lib/cmake/aveer, *.cmake -> ./output/lib/cmake/aveer
lib/python3/dist-packages/aveer, *.so -> ./output/lib/python3/dist-packages/aveer
lib/python3/dist-packages/aveer, *.py -> ./output/lib/python3/dist-packages/aveer
share/gnuradio/grc/blocks, *.yml -> ./output/share/gnuradio/grc/blocks
When I try to run my conan install .., it gives me the following in the prompt:
Picture from the command prompt with the error
I also tried to install another package to test my profile/config and it has worked as intended.
As you can see I'm new here and even newer to Conan, so if you need more info that I've not mentioned here, please let me know.

Not able to use BigQuery operator in Apache Airflow

I am creating a data pipeline where I fetch data from BigQuery either through the BigQuery operator or the Google Cloud library, but I am always getting an error. Following is the DAG for the BigQuery operator:
from airflow import DAG
from datetime import datetime, timedelta
from airflow.contrib.operators.bigquery_operator import BigQueryOperator
from airflow.contrib.operators.bigquery_check_operator import BigQueryCheckOperator
from airflow.operators.python_operator import PythonOperator  # needed for PythonOperator below
from read_val_send1 import read, validating_hit, track_google_analytics_event, add_gcp_connection

default_args = {
    "owner": "Airflow",
    "depends_on_past": False,
    "start_date": datetime(2021, 5, 9),
    "email": ["airflow@airflow.com"],
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 0,
    "retry_delay": timedelta(seconds=5)
}

dag = DAG("Automp", default_args=default_args, schedule_interval="@daily", catchup=False)

activateGCP = PythonOperator(
    task_id='add_gcp_connection_python',
    python_callable=add_gcp_connection,
    provide_context=True, dag=dag
)

BQ_CONN_ID = "my_gcp_conn"
BQ_PROJECT = 'pii-test'
BQ_DATASET = 'some_Dataset'

t1 = BigQueryCheckOperator(
    task_id='bq_check',
    sql='''
    #standardSQL
    Select * from table''',
    use_legacy_sql=False,
    bigquery_conn_id=BQ_CONN_ID,
    dag=dag
)

activateGCP >> t1
Error
I have attached the error image
Broken DAG: [/usr/local/airflow/dags/Automp.py] No module named 'httplib2'
I am also not able to install Python packages in Airflow with the requirements.txt file. Following is the compose file:
version: '2.1'
services:
    redis:
        image: 'redis:5.0.5'
        # command: redis-server --requirepass redispass

    postgres:
        image: postgres:9.6
        environment:
            - POSTGRES_USER=airflow
            - POSTGRES_PASSWORD=airflow
            - POSTGRES_DB=airflow
            # Uncomment these lines to persist data on the local filesystem.
            # - PGDATA=/var/lib/postgresql/data/pgdata
        # volumes:
        #     - ./pgdata:/var/lib/postgresql/data/pgdata

    webserver:
        image: puckel/docker-airflow:1.10.9
        restart: always
        depends_on:
            - postgres
            - redis
        environment:
            - LOAD_EX=n
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery
            # - POSTGRES_USER=airflow
            # - POSTGRES_PASSWORD=airflow
            # - POSTGRES_DB=airflow
            # - REDIS_PASSWORD=redispass
        volumes:
            - ./dags:/usr/local/airflow/dags
            # Uncomment to include custom plugins
            # - ./plugins:/usr/local/airflow/plugins
        ports:
            - "8080:8080"
        command: webserver
        healthcheck:
            test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
            interval: 30s
            timeout: 30s
            retries: 3

    flower:
        image: puckel/docker-airflow:1.10.9
        restart: always
        depends_on:
            - redis
        environment:
            - EXECUTOR=Celery
            # - REDIS_PASSWORD=redispass
        ports:
            - "5555:5555"
        command: flower

    scheduler:
        image: puckel/docker-airflow:1.10.9
        restart: always
        depends_on:
            - webserver
        volumes:
            - ./dags:/usr/local/airflow/dags
            - ./requirements.txt:/requirements.txt
            # Uncomment to include custom plugins
            # - ./plugins:/usr/local/airflow/plugins
        environment:
            - LOAD_EX=n
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery
            # - POSTGRES_USER=airflow
            # - POSTGRES_PASSWORD=airflow
            # - POSTGRES_DB=airflow
            # - REDIS_PASSWORD=redispass
        command: scheduler

    worker:
        image: puckel/docker-airflow:1.10.9
        restart: always
        depends_on:
            - scheduler
        volumes:
            - ./dags:/usr/local/airflow/dags
            - ./requirements.txt:/requirements.txt
            # Uncomment to include custom plugins
            # - ./plugins:/usr/local/airflow/plugins
        environment:
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery
            # - POSTGRES_USER=airflow
            # - POSTGRES_PASSWORD=airflow
            # - POSTGRES_DB=airflow
            # - REDIS_PASSWORD=redispass
        command: worker
My folder structure looks like this:
Folder Structure
The image that you are using does not include the httplib2 package, which is perhaps used by the imports coming from the read_val_send1 directory.
What you can do is add the following line to your ./requirements.txt:
httplib2==0.19.1
Puckel's docker-airflow setup has an entrypoint.sh that runs pip install -r requirements.txt, so this should be sufficient.
In case something goes wrong, you can always use docker logs or an interactive docker exec bash session to see what is going wrong.
I also recommend using the latest docker-compose for Airflow to have a smoother workflow.

How to create the custom module in the ansible

This is the custom module I have written to get the datetime from the current system. I have put the module in the /usr/share/my_modules folder.
#!/usr/bin/python
import datetime
import json

date = str(datetime.datetime.now())
print(json.dumps({
    "time": date
}))

def main():
    module = AnsibleModule(
        argument_spec = dict(
            state = dict(default='present', choices=['present', 'absent']),
            name = dict(required=True),
            enabled = dict(required=True, type='bool'),
            something = dict(aliases=['whatever'])
        )
    )
    module.exit_json(changed=True, something_else=12345)
    module.fail_json(msg="Something fatal happened")

from ansible.module_utils.basic import *
from ansible.module_utils.basic import AnsibleModule

if __name__ == '__main__':
    main()
And now when I try to execute it using the command ansible local -m timetest
I am getting this error:
127.0.0.1 | FAILED! => {
    "failed": true,
    "msg": "The module timetest was not found in configured module paths. Additionally, core modules are missing. If this is a checkout, run 'git submodule update --init --recursive' to correct this problem."
}
Why is it not executing my custom module? Please help me resolve this issue.
You can create a library directory inside the directory where your playbook exists; your file structure will look like this:
.
|-- playbook.yml
|-- library
`-- your-custom-module.py
Hope that might help you
Have you tried following Ansible test module instructions at http://ansible-docs.readthedocs.io/zh/stable-2.0/rst/developing_modules.html#testing-modules?
git clone git://github.com/ansible/ansible.git --recursive
source ansible/hacking/env-setup
chmod +x ansible/hacking/test-module
ansible/hacking/test-module -m ./timetest.py
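Once the module is on Ansible's module path (ANSIBLE_LIBRARY, a ./library directory, or the configured module path), the module body is usually written entirely inside main() rather than printing JSON at import time. A minimal sketch of what timetest.py might look like (the returned time key is just an example, not required by Ansible):

#!/usr/bin/python
# timetest.py - a minimal sketch of a custom module returning the current time
import datetime

from ansible.module_utils.basic import AnsibleModule

def main():
    module = AnsibleModule(argument_spec=dict())  # this module takes no parameters
    module.exit_json(changed=False, time=str(datetime.datetime.now()))

if __name__ == '__main__':
    main()

With that in place, ansible localhost -m timetest should return the time key in the module result.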

How can a README.md file be included in a PyPI module package using setup.py?

I want to include a README.md file with my module package for PyPI such that it can be read by a function in my setup.py. However, it is not obvious to me how to get setup.py and related infrastructure to actually include the README.md file.
I have included a MANIFEST.in file in my package that itself lists README.md and I have set the setuptools.setup argument include_package_data to True but this has not worked.
MANIFEST.in:
junkmodule.py
junkmodule_script.py
LICENSE
MANIFEST.in
README.md
setup.py
setup.py:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import pypandoc
import setuptools

def main():
    setuptools.setup(
        name = "junkmodule",
        version = "2017.01.13.1416",
        description = "junk testing module",
        long_description = pypandoc.convert("README.md", "rst"),
        url = "https://github.com/user/junkmodule",
        author = "LRH",
        author_email = "lhr@psern.ch",
        license = "GPLv3",
        include_package_data = True,
        py_modules = [
            "junkmodule"
        ],
        install_requires = [
            "numpy"
        ],
        scripts = [
            "junkmodule_script.py"
        ],
        entry_points = """
            [console_scripts]
            junkmodule = junkmodule:junkmodule
        """
    )

if __name__ == "__main__":
    main()
The commands I use to register and upload the module to PyPI are as follows:
python setup.py register -r https://pypi.python.org/pypi
python setup.py sdist upload -r https://pypi.python.org/pypi
I'm using this in my modules, try:
import pypandoc
try:
    description = pypandoc.convert('README.md', 'rst')
except (IOError, ImportError):
    description = open('README.md').read()
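As a side note, newer setuptools releases (roughly 38.6.0 and later, uploading with twine) let you skip the pandoc conversion entirely and publish the Markdown as-is via long_description_content_type. A hedged sketch of that variant, reusing the question's metadata:

import setuptools

# read the Markdown README directly; no pypandoc conversion needed
with open("README.md", encoding="utf-8") as readme:
    long_description = readme.read()

setuptools.setup(
    name="junkmodule",
    version="2017.01.13.1416",
    description="junk testing module",
    long_description=long_description,
    long_description_content_type="text/markdown",  # requires setuptools >= 38.6.0 and twine for upload
    py_modules=["junkmodule"],
)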