How to set up a volume linked to S3 in Docker Cloud with AWS? - amazon-s3

I'm running my Play! webapp with Docker Cloud (could also use Rancher) and AWS and I'd like to store all the logs in S3 (via volume). Any ideas on how I could achieve that with minimal effort?

Use Docker volumes to store the logs on the host system.
Then use the AWS CLI to sync the local log directory with an S3 bucket:
aws s3 sync /var/logs/container-logs s3://bucket/
Create a cron job to run the sync every minute or so.
Reference: aws s3 sync (aws-cli)
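Putting the answer above together, a minimal crontab sketch. The log path and bucket are the placeholders from the answer, and the CLI path and log file are assumptions; adjust them for your host:

```shell
# Add to the host's crontab (crontab -e).
# Syncs the container log directory to S3 once a minute;
# --only-show-errors keeps the cron output quiet on success.
* * * * * /usr/local/bin/aws s3 sync /var/logs/container-logs s3://bucket/ --only-show-errors >> /var/log/s3-sync.log 2>&1
```

Note that aws s3 sync only uploads new or changed files, so running it every minute is cheap when the log directory is mostly unchanged.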

Related

How to set up AWS S3 bucket as persistent volume in on-premise k8s cluster

Since NFS has a single-point-of-failure issue, I am thinking of building a storage layer using S3 or Google Cloud Storage as a PersistentVolume in my local k8s cluster.
After a lot of Google searching, I still cannot find a way. I have tried using s3 fuse to mount the bucket locally and then creating a PV by specifying a hostPath. However, a lot of my pods (for example Airflow, Jenkins) complained about missing write permissions, or errors like "version being changed".
Could someone help figure out the right way to mount an S3 or GCS bucket as a PersistentVolume from a local cluster without using AWS or GCP?
S3 is not a file system and is not intended to be used this way.
I do not recommend using S3 this way, because in my experience FUSE drivers are very unstable, and under heavy I/O you can easily end up with a broken mount and be stuck in a "Transport endpoint is not connected" nightmare for you and your infrastructure users. It can also lead to high CPU usage and memory leaks.
Useful crosslinks:
How to mount S3 bucket on Kubernetes container/pods?
Amazon S3 with s3fs and fuse, transport endpoint is not connected
How stable is s3fs to mount an Amazon S3 bucket as a local directory

AWS S3 file transfer from a non-AWS Linux server

Was wondering if anyone has a solution to transfer files from a non-AWS Linux Server A to an AWS S3 bucket location by using/running commands from a non-AWS Linux Server B? Is it possible to avoid doing two hops? The future plan is to automate the process on Server B.
new info:
I am able to upload files to S3 from ServerA such as:
aws s3 sync /path/files s3://bucket/folder
But not sure how to run/execute it from a different Linux server (ServerB)?
There are several steps to using the aws s3 sync command from any server that supports the AWS CLI, Linux or otherwise:
Enable Programmatic Access for the IAM user/account you will use with the AWS CLI and download the credentials
docs: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_console
Download and install the aws-cli for your operating system
Instructions available for:
Docker
Linux
macOS
Windows
docs: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html
Configure your aws credentials for your cli
e.g. aws configure
docs: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html
Create the bucket you will sync to and allow your aws user/identity access to this bucket
doc: https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html
Run the aws s3 sync command according to the rules outlined in the official documentation
e.g. aws s3 sync myfile s3://mybucket
docs: https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/sync.html#examples
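The steps above can be condensed into a short script for Server B. This is a sketch, not a definitive implementation: the profile name, source path, and bucket/prefix are placeholders (the path and bucket come from the question), and it assumes the AWS CLI is already installed and configured:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholders -- replace with your own values.
PROFILE="serverb-sync"        # named profile created via: aws configure --profile serverb-sync
SRC="/path/files"             # local directory on Server B
DEST="s3://bucket/folder"     # target bucket/prefix from the question

# Dry run first to preview what would be copied, then the real sync.
aws s3 sync "$SRC" "$DEST" --profile "$PROFILE" --dryrun
aws s3 sync "$SRC" "$DEST" --profile "$PROFILE"
```

Using a named profile (rather than the default credentials) keeps these sync credentials separate from anything else Server B does with AWS.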

Processing AWS ELB access logs (from S3 bucket to InfluxDB)

We would like to process AWS ELB access logs and write them into InfluxDB
to be used for application metrics and monitoring (ex. Grafana).
We configured ELB to store access logs into S3 bucket.
What would be the best way to process those logs and write them to InfluxDB?
What we tried so far was to mount the S3 bucket to the filesystem using s3fs and then use the Telegraf agent for processing. But this approach has some issues: the s3fs mount feels like a hack, and all the files in the bucket are compressed and need to be unzipped before Telegraf can process them, which overcomplicates the task.
Is there any better way?
Thanks,
Oleksandr
Can you just install the telegraf agent on the AWS instance that is generating the logs, and have them sent directly to InfluxDB in real-time?
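If you do end up processing the S3 log files yourself (e.g. pulled down with aws s3 cp rather than an s3fs mount), the parsing step is straightforward. A hedged sketch in Python: the field list follows the classic ELB access-log format, the parse_elb_line/to_line_protocol helpers and the elb_access measurement name are hypothetical, and the tag/field selection is just an example of InfluxDB line protocol:

```python
import shlex

# Field order for a classic ELB access-log entry (HTTP listener),
# per the ELB access-log format.
FIELDS = [
    "timestamp", "elb", "client_port", "backend_port",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
]

def parse_elb_line(line):
    """Split one access-log line into a dict; shlex.split keeps the
    quoted "request" and "user_agent" fields intact."""
    return dict(zip(FIELDS, shlex.split(line)))

def to_line_protocol(rec, measurement="elb_access"):
    """Render a parsed record as an InfluxDB line-protocol point
    (example tags/fields; pick whatever your dashboards need)."""
    tags = f"elb={rec['elb']},status={rec['elb_status_code']}"
    fields = (
        f"backend_time={float(rec['backend_processing_time'])},"
        f"sent_bytes={int(rec['sent_bytes'])}i"
    )
    return f"{measurement},{tags} {fields}"

sample = ('2015-05-13T23:39:43.945958Z my-loadbalancer '
          '192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 '
          '200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1" '
          '"curl/7.38.0" - -')
rec = parse_elb_line(sample)
print(to_line_protocol(rec))
```

The gzip issue from the question is handled in the same loop: open each downloaded .gz file with Python's gzip.open and feed each line through parse_elb_line, then batch the line-protocol points to InfluxDB's write endpoint.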

Does Amazon EC2 get created automatically if I use S3? - amazon-s3

Does an Amazon EC2 instance get created automatically if I use S3?
I use only S3.
No, if you use S3 it won't automatically create an Amazon EC2 instance, if that is what you are referring to. Can you clarify your question?
An AWS EC2 instance/server is different from S3.
If you use AWS S3 to upload, download, or store files, no EC2 servers will be launched.
You can access these files through the AWS console or through the AWS CLI on your local machine.
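For example, the AWS CLI alone can list, download, and upload objects from your local machine, with no EC2 instance involved (bucket and file names below are placeholders):

```shell
aws s3 ls s3://my-bucket/               # list objects in the bucket
aws s3 cp s3://my-bucket/report.csv .   # download one object
aws s3 cp ./photo.jpg s3://my-bucket/   # upload one file
```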

How to specify different AWS credentials for EMR and S3 when using MRJob

I can specify what AWS credentials to use to create an EMR cluster via environment variables. However, I would like to run a mapreduce job on another AWS user's S3 bucket for which they gave me a different set of AWS credentials.
Does MRJob provide a way to do this, or would I have to copy the bucket using my account first so that the bucket and EMR keys are the same?