Persistent Bitbucket pipeline build artifacts greater than 14 days - amazon-s3

I have a pipeline which loses build artifacts after 14 days. I.e, after 14 days, without S3 or Artifactory integration, the pipeline of course loses "Deploy" button functionality - it becomes greyed out since the build artifact is removed. I understand this is by intention by BB/Atlassian to reduce costs etc (detail in below link).
Please check last section of this page "Artifact downloads and Expiry" - https://support.atlassian.com/bitbucket-cloud/docs/use-artifacts-in-steps/
If you need artifact storage for longer than 14 days (or more than 1
GB), we recommend using your own storage solution, like Amazon S3 or a
hosted artifact repository like JFrog Artifactory.
Question:
Is anyone able to provide advice or sample code on how to approach BB Pipeline integration with Artifactory (or S3) in order to retain artifacts. Is the Artifactory generic upload/download pipe approach the only way or is the quote above hinting at a more native BB "repository setting" to provide integration with S3 or Artifactory? https://www.jfrog.com/confluence/display/RTF6X/Bitbucket+Pipelines+Artifactory+Pipes

Bitbucket give an example of linking to an S3 bucket on their site.
https://support.atlassian.com/bitbucket-cloud/docs/publish-and-link-your-build-artifacts/
The key is Step 4 where you link the artefact to the build.
However the example doesn't actually create an artefact that is linked to S3, but rather adds a status code with a description that links to the uploaded item's in S3. To use these in further steps you would then have to download the artefacts.
This can be done using the aws cli and an image that has this installed, for example the amazon/aws-sam-cli-build-image-nodejs14.x (SAM was required in my case).
The following is an an example that:
Creates an artefact ( a txt file ) and uploads to an AWS S3 bucket
Creates a "link" as a build status against the commit that triggered the pipeline, as per Amazon's suggestion ( this is just added for reference after the 14 days... meh)
Carrys out a "deployment", where by the artefact is downloaded from AWS S3, in this stage I also then set the downloaded S3 artefact as a BitBucket artefact, I mean why not... it may expire after 14 days but at if I've just re-deployed then I may want this available for another 14 days....
image: amazon/aws-sam-cli-build-image-nodejs14.x
pipelines:
branches:
main:
- step:
name: Create artefact
script:
- mkdir -p artefacts
- echo "This is an artefact file..." > artefacts/buildinfo.txt
- echo "Generating Build Number:\ ${BITBUCKET_BUILD_NUMBER}" >> artefacts/buildinfo.txt
- echo "Git Commit Hash:\ ${BITBUCKET_COMMIT}" >> artefacts/buildinfo.txt
- aws s3api put-object --bucket bitbucket-artefact-test --key ${BITBUCKET_BUILD_NUMBER}/buildinfo.txt --body artefacts/buildinfo.txt
- step:
name: Link artefact to AWS S3
script:
- export S3_URL="https://bitbucket-artefact-test.s3.eu-west-2.amazonaws.com/${BITBUCKET_BUILD_NUMBER}/buildinfo.txt"
- export BUILD_STATUS="{\"key\":\"doc\", \"state\":\"SUCCESSFUL\", \"name\":\"DeployArtefact\", \"url\":\"${S3_URL}\"}"
- curl -H "Content-Type:application/json" -X POST --user "${BB_AUTH_STRING}" -d "${BUILD_STATUS}" "https://api.bitbucket.org/2.0/repositories/${BITBUCKET_REPO_OWNER}/${BITBUCKET_REPO_SLUG}/commit/${BITBUCKET_COMMIT}/statuses/build"
- step:
name: Test - Deployment
deployment: Test
script:
- mkdir artifacts
- aws s3api get-object --bucket bitbucket-artefact-test --key ${BITBUCKET_BUILD_NUMBER}/buildinfo.txt artifacts/buildinfo.txt
- cat artifacts/buildinfo.txt
artifacts:
- artifacts/**
Note:
I've got the following secrets/variables against the repository:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
BB_AUTH_STRING

Related

Download from s3 into a actions workflow

I'm working on 2 github actions workflows:
Train a model and save it to s3 (monthly)
Download the model from s3 and use it in predictions (daily)
Using https://github.com/jakejarvis/s3-sync-action I was able to complete the first workflow. I train a model and then sync a dir, 'models' with a bucket on s3.
I had planned on using the same action to download the model for use in prediction but it looks like this action is one directional, upload only no download.
I found out the hard way by creating a workflow and attempting to sync with the runner:
retreive-model-s3:
runs-on: ubuntu-latest
steps:
- name: checkout current repo
uses: actions/checkout#master
- name: make dir to sync with s3
run: mkdir models
- name: checkout s3 sync action
uses: jakejarvis/s3-sync-action#master
with:
args: --follow-symlinks
env:
AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_S3_ENDPOINT: ${{ secrets.AWS_S3_ENDPOINT }}
AWS_REGION: 'us-south' # optional: defaults to us-east-1
SOURCE_DIR: 'models' # optional: defaults to entire repository
- name: dir after
run: |
ls -l
ls -l models
- name: Upload model as artifact
uses: actions/upload-artifact#v2
with:
name: xgb-model
path: models/regression_model_full.rds
At the time of running, when I login to the UI I can see the object regression_model_full.rds is indeed there, it's just not downloading. I'm still unsure if this is expected or not (the name of the action 'sync' is what's confusing me).
For our s3 we must use the parameter AWS_S3_ENDPOINT. I found another action, AWS S3 here but unlike the sync action I started out with there's no option to add AWS_S3_ENDPOINT. Looking at the repo too it's two years old except a update tot he readme 8 months ago.
What's the 'prescribed' or conventional way to download from s3 during a workflow?
Soo I had the same problem as you. I was trying to download from S3 to update a directory folder in GitHub.
What I learned from actions is if you're updating some files in the repo you must follow the normal approach as if you were doing it locally eg) checkout, make changes, push.
So for your particular workflow you must checkout your repo in the workflow using actions/checkout#master and after you sync with a particular directory the main problem I was not doing was then pushing the changes back to the repo! This allowed me to update my folder daily.
Anyway, here is my script and hope you find it useful. I am using the AWS S3 action you mention towards the end.
# This is a basic workflow to help you get started with Actions
name: Fetch data.
# Controls when the workflow will run
on:
schedule:
# Runs "at hour 6 past every day" (see https://crontab.guru)
- cron: '00 6 * * *'
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# This workflow contains a single job called "build"
build:
# The type of runner that the job will run on
runs-on: ubuntu-latest
steps:
- uses: actions/checkout#v2
- uses: keithweaver/aws-s3-github-action#v1.0.0 # Verifies the recursive flag
name: sync folder
with:
command: sync
source: ${{ secrets.S3_BUCKET }}
destination: ./data/
aws_access_key_id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws_region: ${{ secrets.AWS_REGION }}
flags: --delete
- name: Commit changes
run: |
git config --local user.email "action#github.com"
git config --local user.name "GitHub Action"
git add .
git diff-index --quiet HEAD || git commit -m "{commit message}" -a
git push origin main:main
Sidenote: the flag --delete allows you to keep your current folder up to date with your s3 folder by deleting any files that are not present in your s3 folder anymore

How to copy data from private S3 bucket to Azure Blob storage via Azure yaml pipeline

I have one S3 bucket storing CSV files in it. New CSV files get added to this bucket at the beginning of each month. I want these new files to be uploaded automatically to the Azure blob storage at the beginning of each month.
The way I was thinking to do this is to create a script(bash/PowerShell) that pulls data from the AWS S3 bucket to Azure blob storage via AZ Copy command. and then plug this script into an Azure YAML pipeline which runs every start of the month to execute this script. but I can't find a way to integrate this script in an Azure YAML pipeline. Is this command feasible with the YAML pipeline? or is there any simple way to do this?
We can copy data from S3 bucket to Azure Blob Storage using azcopy.
azcopy copy "<s3-bucket-uri>" "https://StorageAccountName.blob.core.windows.net/container-name/?sas-token" --recursive
We can integrate the azcopy with YAML pipeline.
Firstly, install azcopy in pipeline agent as below :
- task: Bash#3
displayName: Install azcopy
inputs:
targetType: 'inline'
script: |
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
mkdir $(Agent.ToolsDirectory)/azcopy
wget -O $(Agent.ToolsDirectory)/azcopy/azcopy_v10.tar.gz https://aka.ms/downloadazcopy-v10-linux
tar -xf $(Agent.ToolsDirectory)/azcopy/azcopy_v10.tar.gz -C $(Agent.ToolsDirectory)/azcopy --strip-components=1
Create cli-task with azcopy in pipeline to copy data from S3 bucket to Azure Blob Storage using azcopy
Reference code :
- task: AzureCLI#2
displayName: Download using azcopy
inputs:
azureSubscription: 'Service-Connection'
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
end=`date -u -d "180 minutes" '+%Y-%m-%dT%H:%M:00Z'`
$(Agent.ToolsDirectory)/azcopy/azcopy copy "<s3-bucket-uri>" "https://StorageAccountName.blob.core.windows.net/container-name/?sas-token" --recursive --check-md5=FailIfDifferent
Reference SO thread : Azure Pipelines - Download files with azcopy - Stack Overflow

Dotnet core code publish push to s3 as Zip from gitlab CI/CD

How can I zip the artifacts before copying to s3 bucket, this is done as the beanstalk requires zip file to update.
I wanted to deploy the dotnet publish code in beanstalk. I am using Gitlab CI/CD to trigger the build when new changes are pushed to the gitlab repo
In my .gitlab-ci.yml file what am doing is
build and publish the code using dotnet publish
copy the published folder artifact to s3 bucket as zip
create new beanstalk application version
update beanstalk environment to reflect the new changes.
Here I was able to perform all the steps except step 3. Can anyone please help me on how can I Zip the published folder and copy that zip to s3 bucket. Please find my relavant code below:
build:
image: mcr.microsoft.com/dotnet/core/sdk:3.1
script:
- dotnet publish -c Release -o /builds/maskkk/samplewebapplication/publish/
stage: build
artifacts:
paths:
- /builds/maskkk/samplewebapplication/publish/
deployFile:
image: python:latest
stage: deploy
script:
- pip install awscli
- aws configure set aws_access_key_id $AWS_ACCESS_KEY_ID
- aws configure set aws_secret_access_key $AWS_SECRET_ACCESS_KEY
- aws configure set region us-east-2
- aws s3 cp --recursive /builds/maskkk/samplewebapplication/publish/ s3://elasticbeanstalk-us-east-2-654654456/JBGood-$CI_PIPELINE_ID
- aws elasticbeanstalk create-application-version --application-name Test5 --version-label JBGood-$CI_PIPELINE_ID --source-bundle S3Bucket=elasticbeanstalk-us-east-2-654654456,S3Key=JBGood-$CI_PIPELINE_ID
- aws elasticbeanstalk update-environment --application-name Test5 --environment-name Test5-env --version-label JBGood-$CI_PIPELINE_ID````
I got the answer to this issue we can simply run a
zip -r ../published.zip *
this will create a zip file and can upload this zip folder to s3.
Please let me know if we have any other better solution to this.

Transporting with data from S3 amazon to local server

I am trying to import data from S3 and using the described below script (which I sort of inherited). It's a bit long...The problem is I kept receiving following output:
The config profile (importer) could not be found
I am not a bash person-so be gentle, please. It seemed there are some credentials missing or something else is wrong with configuration of "importer" on local machine.
In S3 configs(the console) - there is a user with the same name, which, according to permissions can perform access the bucket and download data.
I have tried changing access keys in amazon console for the user and creating file, named "credentials" in home/.aws(there was no .aws folder in home dir by default-created it), including the new keys in the file, tried upgrading AWS CLI with pip - nothing helped
Then I have modified the "credentials", placing [importer] as profile name, so it looked like:
[importer]
aws_access_key = xxxxxxxxxxxxxxxxx
aws_secret+key = xxxxxxxxxxxxxxxxxxx
Appears, that I have gone through the "miss-configuration":
A client error (InvalidAccessKeyId) occurred when calling the ListObjects operation: The AWS Access Key Id you provided does not exist in our records.
Completed 1 part(s) with ... file(s) remaining
And here's the part, where I am stuck...I placed the keys, I have obtained from the amazon into that config file. Double checked...Any suggestions? I can't produce anymore keys-aws quota/user. Below is part of the script:
#!/bin/sh
echo "\n$0 started at: `date`"
incomming='/Database/incomming'
IFS='
';
mkdir -p ${incomming}
echo "syncing files from arrivals bucket to ${incomming} incomming folder"
echo aws --profile importer \
s3 --region eu-west-1 sync s3://path-to-s3-folder ${incomming}
aws --profile importer \
s3 --region eu-west-1 sync s3://path-to-s3-folder ${incomming}
count=0
echo ""
echo "Searching for zip files in ${incomming} folder"
for f in `find ${incomming} -name '*.zip'`;
do
echo "\n${count}: ${f} --------------------"
count=$((count+1))
name=`basename "$f" | cut -d'.' -f1`
dir=`dirname "$f"`
if [ -d "${dir}/${name}" ]; then
echo "\tWarning: directory "${dir}/${name}" already exist for file: ${f} ... - skipping - not imported"
continue
fi

Travis CI: Uploading artifacts to S3 results in "The bucket you are attempting to access must be addressed using the specified endpoint"

I have a Travis CI build that is configured to upload the build artifacts to S3. I've followed the Travis artifacts documentation but when the build completes I get the following error (and the S3 container is empty).
ERROR: failed to upload: /home/travis/build/jonburney/KingsgateMediaPlayer-Android/
app/build/outputs/apk/app-release-unsigned.apk
err: The bucket you are attempting to access must be addressed using the specified
endpoint. Please send all future requests to this endpoint.
I have tried to specify the "endpoint" option in the configuration but it was ignored. It appears to be attempting to upload the file to
https://s3.amazonaws.com/kmp-build-output/jonburney/KingsgateMediaPlayer-Android/30/30.1/app/build/outputs/apk/app-release-unsigned.apk.
Here is a copy of the relevant section from my .travis.yml file
addons:
artifacts: true
s3_region: "us-west-2"
artifacts:
paths:
- $(git ls-files -o app/build/outputs | tr "\n" ":")
Have I missed a configuration option for this scenario? Any help is appreciated!
This was fixed after an email to the Travis-CI support team and some investigation. The code in my .travis.yml file was modified to ensure that "artifacts" was only present once, like so:
addons:
artifacts:
s3_region: "us-west-2"
paths:
- $(git ls-files -o app/build/outputs | tr "\n" ":")