Read bucket object in CDK - amazon-s3

In Terraform, to read an object from an S3 bucket at deployment time I can use a data source:
data aws_s3_bucket_object { }
Is there a similar concept in CDK? I've seen various methods of uploading assets to s3, as well as importing an existing bucket, but not getting an object from the bucket. I need to read a configuration file from the bucket that will affect further deployment.

It's important to remember that CDK itself is not a deployment option. It can deploy, but the code you are writing in a CDK stack is the definition of your resources, not a method for deployment.
So, you can do one of a few things.
Use the AWS SDK for your language to make a call to the S3 bucket and load the data directly. This is perfectly acceptable and an understood way to gather information you need before deployment: each time the stack synths (which it does before every cdk deploy), that code will run and pull your data.
Use a CodePipeline to set up a proper pipeline, and give it two sources - one your version control repo and the second your s3 bucket:
https://docs.aws.amazon.com/codebuild/latest/userguide/sample-multi-in-out.html
The preferred way - drop the JSON file, and use Parameter Store. CDK contains modules that will create a token version of this parameter on synth, and when it deploys it will resolve that token against the Systems Manager Parameter Store:
https://docs.aws.amazon.com/cdk/v2/guide/get_ssm_value.html
If your parameters change after deployment, you can have that as part of your cdk stack pretty easily (using cfn outputs). If they change in the middle/during deployment, you really need to be using a CodePipeline to manage these steps instead of just CDK.
Because remember: the cdk deploy command is just a convenience. It will execute everything and has no way to pause in the middle and execute specific steps (other than very basic "this resource depends on that resource" ordering).
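The first option above (calling the SDK at synth time) can be sketched roughly as follows. This is only an illustration in Python: the function name, bucket name, key and the instanceCount setting are made up, and the client is passed in so the sketch does not assume how your credentials are set up. It relies only on the get_object call that boto3's S3 client provides.

```python
import json


def load_deploy_config(s3_client, bucket, key):
    """Fetch a JSON configuration object from S3 and return it as a dict.

    `s3_client` is expected to behave like boto3.client("s3"): its
    get_object(Bucket=..., Key=...) returns a dict whose "Body" has .read().
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    raw = response["Body"].read()
    return json.loads(raw)


# Hypothetical usage at the top of a CDK app, before any stacks are defined
# (bucket, key and setting names are placeholders, not from the question):
#   import boto3
#   config = load_deploy_config(boto3.client("s3"), "my-config-bucket", "deploy/config.json")
#   instance_count = config.get("instanceCount", 1)
```

Because this runs as ordinary code during synth, the values it returns are plain Python values, so they can drive conditionals and loops in the stack definition. A Parameter Store token, by contrast, is only resolved at deploy time.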

Related

Does Serverless, Inc ever see my AWS credentials?

I would like to start using serverless-framework to manage lambda deploys at my company, but we handle PHI so security’s tight. Our compliance director and CTO had concerns about passing our AWS key and secret to another company.
When doing a serverless deploy, do AWS credentials ever actually pass through to Serverless, Inc?
If not, can someone point me to where in the code I can prove that?
Thanks!
Running serverless deploy isn't just one call, it's many.
AWS example (oversimplification):
Check if deployment s3 bucket already exists
Create an S3 bucket
Upload packages to s3 bucket
Call CloudFormation
Check CloudFormation stack status
Get info of created resources (e.g. endpoint urls of created APIs)
And those calls can change depending on what you are doing and what you have done before.
The point I'm trying to make is that these calls, which contain your credentials, are not all located in one place, and if you want to do a full code review of Serverless Framework and all its dependencies, have fun with that.
But under the hood, we know that it's actually using the JavaScript aws-sdk (go check out the package.json), and we know what endpoints that uses: {service}.{region}.amazonaws.com.
So to prove to your employers that nothing with your credentials is going anywhere except AWS, you can just run a serverless deploy with Wireshark running (other network packet analyzers are available). That way you can see anything that's not going to amazonaws.com.
But wait, why are calls being made to serverless.com and serverlessteam.com when I run a deploy?
Well that's just tracking some stats and you can see what they track here. But if you are uber paranoid, this can be turned off with serverless slstats --disable.
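If you export the list of hostnames contacted during a deploy from your packet analyzer, a few lines of Python can flag anything that is not under amazonaws.com. This is a minimal sketch: the function name and the sample hostnames are invented for illustration, and it assumes every legitimate call goes to a {service}.{region}.amazonaws.com style endpoint as described above.

```python
def non_aws_hosts(hostnames):
    """Return the unique hostnames that are not under amazonaws.com, in first-seen order."""
    seen = []
    for host in hostnames:
        # Normalize case and drop a trailing dot from fully qualified names.
        normalized = host.strip().rstrip(".").lower()
        # Require a dot before the suffix so "not-amazonaws.com" does not pass as AWS.
        is_aws = normalized == "amazonaws.com" or normalized.endswith(".amazonaws.com")
        if normalized and not is_aws and normalized not in seen:
            seen.append(normalized)
    return seen


# Hypothetical capture; replace with the hostnames from your own trace.
captured = [
    "s3.us-east-1.amazonaws.com",
    "cloudformation.us-east-1.amazonaws.com",
    "serverless.com",
    "not-amazonaws.com",
]
print(non_aws_hosts(captured))  # ['serverless.com', 'not-amazonaws.com']
```

Anything this prints is worth inspecting in the capture itself to see what was actually sent.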

Automatic sync S3 to local

I want to automatically sync my local folder with an S3 bucket. I mean, when I change a file in S3, that file should automatically be updated in the local folder.
I tried using a scheduled task and the AWS CLI, but I think there is a better way to do that.
Do you know some app or better solution?
Hope you can help me.
@mgg, you can mount the S3 bucket on the local server using s3fs; this way you can sync your local changes to the S3 bucket.
You could execute code (Lambda functions) that responds to events in a given bucket (such as a file being changed, deleted, or created), so you could have a simple HTTP service that receives a POST or GET request from that Lambda and updates your local data accordingly.
Read more:
Tutorial, Using AWS Lambda with Amazon S3
Working with Lambda Functions
The other approach (I don't recommend this) is to have some code "pulling" for changes in a bucket and then reflecting those changes locally. At first glance it looks easier to implement, but it gets complicated when you try to handle more than just creation events.
And of course, for each cycle of your "pulling" component you have to check all elements in your local directory against all elements in the bucket, which is a performance-killing approach!
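To make the cost of that approach concrete, here is a rough Python sketch of the comparison each cycle would have to do. It assumes you have already built two mappings from object key to some fingerprint (for example an ETag or a size plus modification time); the function name and the sample data are purely illustrative.

```python
def plan_sync(remote, local):
    """Compare two {key: fingerprint} mappings and return (to_download, to_delete).

    to_download: keys that are new in the bucket or whose fingerprint differs locally.
    to_delete: keys that exist locally but no longer exist in the bucket.
    """
    to_download = sorted(
        key for key, fingerprint in remote.items()
        if key not in local or local[key] != fingerprint
    )
    to_delete = sorted(key for key in local if key not in remote)
    return to_download, to_delete


# Hypothetical listings: a.txt changed, c.txt was created, old.txt was deleted.
remote = {"a.txt": "v2", "b.txt": "v1", "c.txt": "v1"}
local = {"a.txt": "v1", "b.txt": "v1", "old.txt": "v1"}
print(plan_sync(remote, local))  # (['a.txt', 'c.txt'], ['old.txt'])
```

Note that both full listings have to be rebuilt on every cycle even when nothing changed, which is exactly the overhead the event-driven Lambda approach avoids.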

How to add rollback functionality to a basic S3 CodeBuild deploy

I have followed this instruction to get a very basic CI workflow in AWS. It works flawlessly, but I want one extra piece of functionality: rollback. At first I thought it would work "out-of-the-box", but not in my case. If I select the previous job in CodeBuild that I want to roll back to and hit "Retry", I get this error message: "Error ArtifactsOverride must be set when using artifacts type CodePipelines". I have also tried to rerun the whole pipeline from the pipeline history page, but it's just a list of builds without any functionality.
My question is: how do I add a rollback function to my workflow? It doesn't have to be in the same pipeline, but it should not touch git.
AWS CloudFormation now supports rolling back based on a CloudWatch alarm.
I'd put a CloudFront distribution in front of your S3 bucket with the origin path set to a folder within that bucket. Every time you deploy to S3 from CodeBuild you deploy to a random new S3 folder.
You then pass the folder name in a JSON file as an output artifact from your CodeBuild step. You can use this artifact as a parameter to a CloudFormation template updated by a CloudFormation action in your pipeline.
The CloudFormation template would update the OriginPath field of your CloudFront distribution to the folder containing your new deployment.
If the alarm fires then the CloudFormation template would roll back and flip back to the old folder.
There are several advantages to this approach:
Customers should only see either the new or old version while the deployment is happening rather than seeing potentially mixed files while the deployment is running.
The deployment logic is simpler because you're uploading a fresh set of files every time, rather than figuring out which files are new and which need to be deleted.
The rollback is pretty simple because you're flipping back to files which are still there rather than re-deploying the old files.
Your pipeline would need to contain both the CodeBuild and a sequential CloudFormation action.
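As a rough sketch of the CodeBuild side of this, the build could pick a fresh folder name and write it to a small JSON file that is emitted as an output artifact. The Python below is only an illustration: the function name, the "deploy" prefix and the "OriginPath" key in the JSON are my own choices, not anything CodeBuild requires.

```python
import json
import uuid


def new_deployment(prefix="deploy"):
    """Return (folder, artifact_json) for one deployment.

    folder: unique S3 key prefix to upload this build's files under.
    artifact_json: JSON text recording the matching CloudFront origin path.
    """
    folder = f"{prefix}-{uuid.uuid4().hex}"
    # CloudFront's OriginPath must start with "/" and must not end with one.
    artifact_json = json.dumps({"OriginPath": f"/{folder}"})
    return folder, artifact_json


folder, artifact_json = new_deployment()
# Upload the site to s3://<your-bucket>/<folder>/ and save artifact_json to a
# file (for example deployment.json) that is listed in the build's artifacts.
print(folder, artifact_json)
```

In the pipeline's CloudFormation action, a parameter override using Fn::GetParam can then read the OriginPath value out of that artifact file and pass it to the template.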

Terraform Shared State

Terraform 0.9.5.
I am in the process of putting together a group of modules that our infrastructure team and automation team will use to create resources in a standard fashion and in turn create stacks to provision different envs. All working well.
Like all teams using Terraform, shared state becomes a concern. I have configured Terraform to use an S3 backend that is versioned and encrypted, and added a lock via a DynamoDB table. Perfect. All works with local accounts... Okay, the problem...
We have multiple aws accounts, 1 for IAM, 1 for billing, 1 for production, 1 for non-production, 1 for shared services etc... you get where I am going. My problem is as follows.
I authenticate as a user in our IAM account and assume the required role. This has been working like a dream until I introduced Terraform backend configuration to utilise S3 for shared state. It looks like the backend config within Terraform requires default credentials to be set within ~/.aws/credentials. It also looks like these have to be a user that is local to the account where the S3 bucket was created.
Is there a way to get the backend configuration set up in such a way that it will use the creds and role configured within the provider? Is there a better way to configure shared state and locking? Any suggestions welcome :)
Update: Got this working. I created a new user within the account where the S3 bucket is created. Created a policy to allow that new user just s3:DeleteObject, s3:GetObject, s3:PutObject, s3:ListBucket and dynamodb:* on the specific S3 bucket and DynamoDB table. Created a custom credentials file and added a default profile with the access and secret keys assigned to that new user. Used a backend config similar to:
terraform {
  required_version = ">= 0.9.5"

  backend "s3" {
    bucket                  = "remote_state"
    key                     = "/NAME_OF_STACK/terraform.tfstate"
    region                  = "us-east-1"
    encrypt                 = "true"
    shared_credentials_file = "PATH_TO_CUSTOM_CREDENTIALS_FILE"
    lock_table              = "MY_LOCK_TABLE"
  }
}
It works but there is an initial configuration that needs to happen within your profile to get it working. If anybody knows of a better setup or can identify problems with my backend config please let me know.
Terraform expects backend configuration to be static, and does not allow it to include interpolated variables as might be true elsewhere in the config due to the need for the backend to be initialized before any other work can be done.
Due to this, applying the same config multiple times using different AWS accounts can be tricky, but is possible in one of two ways.
The lowest-friction way is to create a single S3 bucket and DynamoDB table dedicated to state storage across all environments, and use S3 permissions and/or IAM policies to impose granular access controls.
Organizations adopting this strategy will sometimes create the S3 bucket in a separate "administrative" AWS account, and then grant restrictive access to the individual state objects in the bucket to the specific roles that will run Terraform in each of the other accounts.
This solution has the advantage that once it has been set up correctly in S3 Terraform can be used routinely without any unusual workflow: configure the single S3 bucket in the backend, and provide appropriate credentials via environment variables to allow them to vary. Once the backend is initialized, use workspaces (known as "state environments" prior to Terraform 0.10) to create a separate state for each of the target environments of a single configuration.
The disadvantage is the need to manage a more-complicated access configuration around S3, rather than simply relying on coarse access control with whole AWS accounts. It is also more challenging with DynamoDB in the mix, since the access controls on DynamoDB are not as flexible.
There is a more complete description of this option in the Terraform S3 backend documentation, under Multi-account AWS Architecture.
If a complex S3 configuration is undesirable, the complexity can instead be shifted into the Terraform workflow by using partial configuration. In this mode, only a subset of the backend settings are provided in config and additional settings are provided on the command line when running terraform init.
This allows options to vary between runs, but since it requires extra arguments to be provided most organizations adopting this approach will use a wrapper script to configure Terraform appropriately based on local conventions. This can be just a simple shell script that runs terraform init with suitable arguments.
This then allows, for example, the custom credentials file to vary by providing it on the command line. In this case, state environments are not used, and instead switching between environments requires re-initializing the working directory against a new backend configuration.
The advantage of this solution is that it does not impose any particular restrictions on the use of S3 and DynamoDB, as long as the differences can be represented as CLI options.
The disadvantage is the need for unusual workflow or wrapper scripts to configure Terraform.
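A wrapper for the partial-configuration approach can be very small. The following Python sketch only illustrates the idea: the function names are made up, and it assumes the settings that vary per environment (for example key and shared_credentials_file) are left out of the backend block and supplied through terraform init's -backend-config option instead.

```python
import subprocess


def terraform_init_command(backend_settings):
    """Build the argv for `terraform init` with one -backend-config flag per setting."""
    command = ["terraform", "init"]
    # Sort so the generated command is stable from run to run.
    for name, value in sorted(backend_settings.items()):
        command.append(f"-backend-config={name}={value}")
    return command


def init_environment(backend_settings):
    """Run terraform init in the current directory; raises if it exits non-zero."""
    subprocess.run(terraform_init_command(backend_settings), check=True)


# Hypothetical per-environment settings; the values are placeholders.
production = {
    "key": "NAME_OF_STACK/terraform.tfstate",
    "shared_credentials_file": "PATH_TO_CUSTOM_CREDENTIALS_FILE",
}
print(" ".join(terraform_init_command(production)))
```

Keeping the command construction separate from the call that runs it makes it easy to print or log exactly what will be executed before switching a working directory to another environment.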

Backing up a Serverless Framework deployment

I'm familiar with Terraform and its terraform.tfstate file where it keeps track of which local resource identifiers map to which remote resources. I've noticed that there is a .serverless directory on my machine which seems to contain files such as CloudFormation templates and ZIP files containing Lambda code.
Suppose I create and deploy a project from my laptop, and Serverless spins up fooxyz.cloudfront.net which points to a Lambda function arn:aws:lambda:us-east-1:123456789012:function:handleRequest456. If I naively try to run Serverless again from another machine (or if I git clean my working directory), it'll spin up a new CloudFront endpoint since it doesn't know that fooxyz.cloudfront.net already represents the same application. I'm looking to back up the state it keeps internally, so that it modifies an existing resource rather than creates a new one. (The equivalent in Terraform would be to back up the terraform.tfstate file.)
If I wished to back up or restore a Serverless deployment state, which files would I back up? In the case of AWS, it seems like I should be backing up the CloudFormation templates; I don't want to back up the Lambda code since it's directly generated from the source. However, I'm likely going to use more than just AWS in the future, and so don't want to "special-case" the CloudFormation templates if at all possible.
How can I back up only the files I cannot regenerate?
I think what you are asking is: if I or a colleague checks out the Serverless code from git on a different machine, will we still be able to deploy and update the same Lambda functions and the same API Gateway endpoints?
And the answer to that is yes! Serverless keeps track of all of that for you within their files. Unless you run serverless remove, no operation will create a new Lambda or API endpoint.
My team and I are using this method: we commit all code to a git repo, and one of us checks it out and deploys a function or the entire thing, and it updates the existing set of functions properly. If you set up an environment file, that's all you need to worry about really. And I recommend leaving it outside of git entirely.
For AWS, Serverless Framework keeps track of your deployment via CloudFormation (CF) parameters/identifiers which are specific to an account/region. The CF stack templates are uploaded to an (auto-generated) S3 bucket, so they are already backed up for you.
So all you really need to have is the original deployment code in a git repo and have access to your keys. Everything else is already backed up for you.