Retrieve application config from secure location during task start - amazon-s3

I want to make sure I'm not storing sensitive keys and credentials in source or in docker images. Specifically I'd like to store my MySQL RDS application credentials and copy them when the container/task starts. The documentation provides an example of retrieving the ecs.config file from s3 and I'd like to do something similar.
I'm using the Amazon ECS optimized AMI with an auto scaling group that registers with my ECS cluster. I'm using the ghost docker image without any customization. Is there a way to configure what I'm trying to do?

You can define a volume on the host and map it into the container with read-only privileges.
Please refer to the following documentation for configuring ECS volume for an ECS task.
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html
Even though the container does not have the config at build time, it will read the configs as if they are available in its own file system.
There are many ways to secure the config on the host OS.
In my past projects, I have achieved the same by disabling ssh into the host and injecting the config at boot-up using cloud-init.
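As a rough illustration of the host-volume approach described above, here is a minimal sketch that builds the relevant part of an ECS task definition as a Python dictionary. The `volumes`, `host.sourcePath`, `mountPoints`, `sourceVolume`, `containerPath` and `readOnly` keys are the task definition fields covered in the linked documentation; the function name, the volume name `app-config`, the memory value and both directory paths are assumptions made up for this example, and the host directory is assumed to have been populated at boot (for instance by cloud-init).

```python
import json


def ghost_task_definition(host_config_dir: str, container_config_dir: str) -> dict:
    """Build a task definition that bind-mounts a host directory read-only.

    Both paths are hypothetical; adjust them to wherever the config is
    written on the host and wherever the application expects to read it.
    """
    return {
        "family": "ghost",
        "volumes": [
            # Host directory that holds the injected config files.
            {"name": "app-config", "host": {"sourcePath": host_config_dir}},
        ],
        "containerDefinitions": [
            {
                "name": "ghost",
                "image": "ghost",
                "memory": 512,  # illustrative value
                "mountPoints": [
                    {
                        "sourceVolume": "app-config",
                        "containerPath": container_config_dir,
                        "readOnly": True,  # container cannot modify the config
                    }
                ],
            }
        ],
    }


task_def = ghost_task_definition("/etc/ghost-config", "/var/lib/ghost/config")
print(json.dumps(task_def, indent=2))
```

The printed JSON can be saved to a file and registered as a task definition; the container then sees the config under the container path without it ever being baked into the image.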

Related

Which s3 compatible blob storage?

I want to deploy an S3-compatible blob storage in my Kubernetes cluster. I already use GlusterFS for volumes such as MongoDB, and I tried to set up MinIO with the Helm chart https://github.com/helm/charts/tree/master/stable/minio. I just realized I can't scale up MinIO easily because of erasure coding.
So I have some questions about blob storage solutions:
Is the GlusterFS blob storage service stable and reliable (https://github.com/gluster/gluster-kubernetes/tree/master/docs/examples/gluster-s3-storage-template)?
Must I use OpenShift to deploy GlusterFS blob storage, as I have read on the web? I think not, because I can see plain Kubernetes manifests in the GlusterFS repo like this one: https://github.com/gluster/gluster-kubernetes/blob/master/deploy/kube-templates/gluster-s3-template.yaml.
Is it easy to use MinIO federation in Kubernetes? Is it easily scalable with a "helm upgrade --set replicas=X", or do I need to update the MinIO configuration manually?
As you can see, I feel lost with this s3 storage. So if you have more information/solutions, do not hesitate.
Thanks in advance !
Regarding reliability, you should read about other users' experience, for example:
An end user review of GlusterFS
Community Survey Feedback, 2019
Why OpenShift with GlusterFS:
For standalone Red Hat Gluster Storage, there is no component installation required to use it with OpenShift Container Platform. OpenShift Container Platform comes with a built-in GlusterFS volume driver, allowing it to make use of existing volumes on existing clusters. Note, however, that Red Hat Gluster Storage is a commercial storage software product based on Gluster.
How to deploy it in AWS
For MinIO, please follow the official docs:
ConfigMap allows injecting containers with configuration data even while a Helm release is deployed.
To update your MinIO server configuration while it is deployed in a release, you need to:
1. Check all the configurable values in the MinIO chart using helm inspect values stable/minio.
2. Override the minio_server_config settings in a YAML-formatted file, and then pass that file like this: helm upgrade -f config.yaml <release-name> stable/minio (helm upgrade expects the release name before the chart).
3. Restart the MinIO server(s) for the changes to take effect.
I haven't tried it myself, but as per the documentation:
For federation I can see additional environment variables in the values.yaml.
In addition, you should run MinIO in federated mode; see the Federation Quickstart Guide.
Here you can find the differences between Google and Amazon S3 storage
or Cloud Storage interoperability from gcloud perspective.
Hope this helps.

Google Cloud Managed Tomcat Service

Does Google Cloud or AWS provide a managed Apache Tomcat service that just takes a WAR file and auto-scales as load increases and decreases? Not Compute Engine: I don't want to create a VM. It should be handled by a managed service.
Google App Engine can directly take and run a WAR file - just use the appcfg deployment method.
You will have more options if you package with Docker, as this provides an image type that can be run in many places (multiple GCP, AWS and Azure options, on-prem Kubernetes, etc). This can even be as simple as building a Dockerfile that just copies the WAR into a Jetty image:
FROM jetty:latest
COPY YOUR_WAR.war /var/lib/jetty/webapps
It might be better to explode the WAR though - see the discussion in this question
AWS provides AWS Elastic Beanstalk.
The AWS Elastic Beanstalk Tomcat platform is a set of environment configurations for Java web applications that can run in a Tomcat web container. Each configuration corresponds to a major version of Tomcat, like Java 8 with Tomcat 8.
Platform-specific configuration options are available in the AWS Management Console for modifying the configuration of a running environment. To avoid losing your environment's configuration when you terminate it, you can use saved configurations to save your settings and later apply them to another environment.
To save settings in your source code, you can include configuration files. Settings in configuration files are applied every time you create an environment or deploy your application. You can also use configuration files to install packages, run scripts, and perform other instance customization operations during deployments.
It also provides auto scaling:
The Auto Scaling group in your Elastic Beanstalk environment uses two Amazon CloudWatch alarms to trigger scaling operations. The default triggers scale when the average outbound network traffic from each instance is higher than 6 MB or lower than 2 MB over a period of five minutes. To use Amazon EC2 Auto Scaling effectively, configure triggers that are appropriate for your application, instance type, and service requirements. You can scale based on several statistics including latency, disk I/O, CPU utilization, and request count.
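To make the default trigger rule quoted above concrete, here is a minimal sketch of the decision it describes: compare the average outbound network traffic per instance over the period with an upper and a lower threshold. It is a simplified model for illustration only, not how Elastic Beanstalk or CloudWatch is implemented; the function name, the return labels and the sample figures are assumptions.

```python
from statistics import mean


def scaling_action(network_out_mb, upper_mb=6.0, lower_mb=2.0):
    """Return the action implied by the default triggers described above.

    network_out_mb: outbound traffic per instance (in MB) measured over
    the five-minute period. Thresholds default to the 6 MB / 2 MB values
    quoted in the text.
    """
    average = mean(network_out_mb)
    if average > upper_mb:
        return "scale_out"  # add instances
    if average < lower_mb:
        return "scale_in"  # remove instances
    return "no_change"


print(scaling_action([7.0, 8.5]))  # average 7.75 MB
print(scaling_action([1.0, 1.5]))  # average 1.25 MB
print(scaling_action([3.0, 4.0]))  # average 3.5 MB
```

For a Tomcat application, a trigger based on CPU utilization, latency or request count is often a better fit than network traffic, which is why the quoted text recommends configuring triggers to match the workload.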

GCP - CDN Server

I'm trying to architect a system on GCP for scalable web/app servers. My initial intention was to have one disk per web server group hosting the OS, and another hosting the source code, imagery, etc. My idea was to mount the OS disk on multiple VM instances so as to have exact clones of the servers, with one place to store PHP session files (so moving between different servers would be transparent and not cause problems).
The second idea was to mount a 2nd disk, containing the source code and media files, which would then be shared with 2 web servers, one configured as a CDN server and one with the main website and backend. The backend would modify/add/delete media files, and the CDN server would supply them to the browser when requested.
My problem arises when reading that a Persistent Disk is only mountable on a single VM instance with read/write access, and if it's needed on multiple instances it can be mounted only with read-only access. I need to have one of the instances with read/write access and the others (possibly many) with read-only access.
Would you be able to suggest ways or methods on how to implement such a system on the GCP, or if it's not possible at all?
Unfortunately, it's not possible.
But you can create a single-node file server and mount it as a read/write disk on other VMs.
GCP has documentation on how to create a single-node file server.
An alternative to using persistent disks (which, as you said, only allow a single read/write mount or many read-only mounts) is to use Cloud Storage, which can be mounted through FUSE.

retrieving Apache log files from AWS Beanstalk

I know that Beanstalk's Snapshot Logs can give you a recent overview of the httpd/access_log files from among the EC2 instances under the ELB for that environment. But does anyone know a good way to get all the logs?
It's a production environment, so I want to do the processing elsewhere. But I don't want to (for obvious reasons) configure root sftp and go around collecting the files manually.
I think I had read something about configuring logging to S3?
In the "Configuration" tab for an Environment, under "Software Configuration", there is a checkbox for enabling log file rotation to S3. These are stored in an S3 bucket used specifically for Elastic Beanstalk.
You can feed your current logs to Amazon CloudWatch Logs.
CloudWatch Logs will centralise all the logs of your infrastructure, with a neat way to search and process them as well as to create metrics and alarms based on your logs.
I have a guide on how to Store aws beanstalk symfony and apache logs in cloudwatch logs. This will help you to get up and running fast, and then you can tweak it.

How to replicate Amazon EBS to S3?

We have a site where users upload files, some of them quite large. We've got multiple EC2 instances and would like to load balance them. Currently, we store the files on an EBS volume for fast access. What's the best way to replicate the files so they can be available on more than one instance?
My thought is that some automatic replication process that uploads the files to S3, and then automatically downloads them to other EC2 instances would be ideal.
EBS snapshots won't work because they replicate the entire volume, and we need to be able to replicate the directories of individual customers on demand.
You could write a shell script that would spawn s3cmd to sync your local filesystem with an S3 bucket whenever a new file is uploaded (or deleted). It would look something like:
s3cmd sync ./ s3://your-bucket/
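Since the question asks for replicating individual customers' directories on demand, the same command can be narrowed to one directory at a time. The following is a minimal sketch that only builds the s3cmd command line; the helper name, the local root `/data/uploads`, the customer name and the one-prefix-per-customer layout are all assumptions for illustration.

```python
import shlex


def s3cmd_sync_command(local_root, customer, bucket):
    """Build an `s3cmd sync` command for a single customer's directory.

    Assumes each customer has a directory under local_root and a prefix
    of the same name in the bucket. The trailing slashes ask s3cmd to
    sync the directory's contents into that prefix.
    """
    source = f"{local_root.rstrip('/')}/{customer}/"
    target = f"s3://{bucket}/{customer}/"
    return ["s3cmd", "sync", source, target]


cmd = s3cmd_sync_command("/data/uploads", "customer-42", "your-bucket")
print(shlex.join(cmd))
```

On an instance where s3cmd is installed and configured, the list can be passed to `subprocess.run(cmd, check=True)`; swapping the source and target gives the download direction for the other instances.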
Depends on what OS you are running on your EC2 instances:
There isn't really any need to add S3 to the mix unless you want to store them there for some other reason (like backup).
If you are running *nix the classic choice might be to run rsync and just sync between instances.
On Windows you could still use rsync or else SyncToy from Microsoft is a simple free option. Otherwise there are probably hundreds of commercial applications in this space...
If you do want to sync to S3 then I would suggest one of the S3 client apps like CloudBerry or JungleDisk, which both have sync functionality...
If you are running Windows it's also worth considering DFS (Distributed File System) which provides replication and is part of Windows Server...
The best way is to use the Amazon CloudFront service. All of the replication is managed as part of AWS. Content is served from several different availability zones, but does not require you to have EBS volumes in those zones.
Amazon CloudFront delivers your static and streaming content using a global network of edge locations. Requests for your objects are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
http://aws.amazon.com/cloudfront/
Two ways:
Forget EBS: transfer the files to S3, use S3 as your file store rather than EBS, add CloudFront, and use the common link everywhere.
Mount S3 bucket on any machines.
1. Amazon CloudFront is a web service for content delivery. It delivers your static and streaming content using a global network of edge locations.
http://aws.amazon.com/cloudfront/
2. You can mount an S3 bucket on your Linux machine. See below:
s3fs - http://code.google.com/p/s3fs/wiki/InstallationNotes - this worked for me. It uses a FUSE file system + rsync to sync the files in S3. It keeps a copy of all filenames on the local system and makes the bucket look like a regular file/folder.
That way you can share the S3 bucket on different machines.