Retrieving Apache log files from AWS Elastic Beanstalk

I know that Beanstalk's Snapshot Logs can give you a recent overview of the httpd/access_log files from the EC2 instances under the ELB for that environment. But does anyone know a good way to get all the logs?
It's a production environment, so I want to do the processing elsewhere, and (for obvious reasons) I don't want to configure root sftp and go around collecting the files manually.
I think I read something about configuring logging to S3?

In the "Configuration" tab for an environment, under "Software Configuration", there is a checkbox for enabling log file rotation to S3. The rotated logs are stored in an S3 bucket used specifically for Elastic Beanstalk.
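The same setting can also be applied outside the console. A minimal sketch with boto3, assuming the LogPublicationControl option under the aws:elasticbeanstalk:hostmanager namespace (which corresponds to that checkbox) and a placeholder environment name:

import boto3

eb = boto3.client("elasticbeanstalk")

# Turn on "log file rotation to S3" for the environment.
# The environment name is a placeholder.
eb.update_environment(
    EnvironmentName="my-production-env",
    OptionSettings=[
        {
            "Namespace": "aws:elasticbeanstalk:hostmanager",
            "OptionName": "LogPublicationControl",
            "Value": "true",
        }
    ],
)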

You can feed your current logs to AWS CloudWatch Logs.
CloudWatch Logs centralises all the logs of your infrastructure, gives you a neat way to search and process them, and lets you create metrics and alarms based on your logs.
I have a guide on how to store AWS Beanstalk Symfony and Apache logs in CloudWatch Logs. It will help you get up and running fast, and then you can tweak it.
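Once the Apache logs are in CloudWatch Logs, you can also search them from code. A small sketch with boto3; the log group name is an assumption and depends on how the agent in the guide is configured:

import time

import boto3

logs = boto3.client("logs")

# The log group name is an assumption -- adjust it to however the agent
# names the streamed Apache logs.
LOG_GROUP = "/aws/elasticbeanstalk/my-production-env/var/log/httpd/access_log"

now_ms = int(time.time() * 1000)
resp = logs.filter_log_events(
    logGroupName=LOG_GROUP,
    filterPattern='" 500 "',            # access-log lines containing " 500 "
    startTime=now_ms - 60 * 60 * 1000,  # the last hour
    endTime=now_ms,
)
for event in resp["events"]:
    print(event["logStreamName"], event["message"])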

Related

EKS pods logging to Elastic Cloud

I am trying to set up shipping of pod logs from EKS to Elasticsearch Cloud.
According to "Fluent Bit for Amazon EKS on AWS Fargate is here", Elasticsearch should be supported:
You can choose between CloudWatch, Elasticsearch, Kinesis Firehose and Kinesis Streams as outputs.
According to the FluentBit configuration parameters for Elasticsearch, having the Cloud_ID and Cloud_Auth parameters should be enough to ship logs to Elasticsearch Cloud.
An example here shows how to configure the ES output for FluentBit, so my config looks like:
[OUTPUT]
    Name            es
    Match           *
    Logstash_Format On
    Logstash_Prefix ${logstash_prefix}
    tls             On
    tls.verify      Off
    Pipeline        date_to_timestamp
    Cloud_ID        ${es_cloud_id}
    Cloud_Auth      ${es_cloud_auth}
    Trace_Output    On
I am running a simple nginx container to generate some logs (as in one of the linked examples), but they don't seem to appear in my Elasticsearch / Kibana.
Am I missing anything? How do I ship logs to Elasticsearch Cloud?
Also, Trace_Output On is supposed to log FluentBit's attempts to ship logs, but where can I see these logs on EKS?
I also ran into this. From what I can tell, only AWS Elasticsearch is supported when using the AWS-managed FluentBit.
https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-eks-adds-built-in-logging-support-for-aws-fargate/
You can work around this by using a sidecar FluentBit container (which can send to Elasticsearch) if that's an option for you. You will need to modify the application to have logs written to the filesystem.
Or you can use the managed FluentBit with the CloudWatch output, subscribe a Lambda function to the log group, and have it send the events to ES.
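If you go the CloudWatch-plus-Lambda route, here is a minimal sketch of such a function in Python, using only the standard library; the ES_URL, ES_AUTH and ES_INDEX environment variables and the index name are assumptions, and production code would batch documents with the _bulk API instead of posting one at a time:

import base64
import gzip
import json
import os
import urllib.request

# Connection details are assumptions -- point them at your Elastic Cloud deployment.
ES_URL = os.environ["ES_URL"]        # e.g. https://my-deployment.es.example.com:9243
ES_AUTH = os.environ["ES_AUTH"]      # base64 of "user:password"
ES_INDEX = os.environ.get("ES_INDEX", "eks-logs")

def handler(event, context):
    # CloudWatch Logs delivers each batch as base64-encoded, gzipped JSON.
    payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))
    for log_event in payload["logEvents"]:
        doc = {
            "@timestamp": log_event["timestamp"],
            "message": log_event["message"],
            "log_group": payload["logGroup"],
            "log_stream": payload["logStream"],
        }
        req = urllib.request.Request(
            f"{ES_URL}/{ES_INDEX}/_doc",
            data=json.dumps(doc).encode(),
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Basic {ES_AUTH}",
            },
            method="POST",
        )
        urllib.request.urlopen(req)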

ECS AWS Cloudwatch logs

I have a task in ECS that runs Tomcat. That Tomcat has 2 or 3 apps deployed to it. I know it's not an ideal situation, but this is what we've got. Log4j is used, and the apps log to different files under Tomcat's logs folder. Is there a way I can get those different log files from my Docker container into CloudWatch under different streams? I know that if I write logs to stdout using a Log4j appender I can get them into CloudWatch easily, but then they will not be separate; the logs from all apps will end up in one place.
Many thanks
Instead of using Log4j to send logs to STDOUT, you can set the log configuration of your task to use the awslogs Docker log driver, which sends the container logs directly to CloudWatch Logs.
Reference: https://aws.amazon.com/blogs/devops/send-ecs-container-logs-to-cloudwatch-logs-for-centralized-monitoring/
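A rough sketch of what that log configuration looks like in a task definition, registered with boto3; the family, image, log group, region and memory values are placeholders, not values from the question:

import boto3

ecs = boto3.client("ecs", region_name="us-east-1")  # region is a placeholder

ecs.register_task_definition(
    family="tomcat-apps",
    containerDefinitions=[
        {
            "name": "tomcat",
            "image": "tomcat:9",
            "memory": 1024,
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/tomcat-apps",
                    "awslogs-region": "us-east-1",
                    # Streams are named <prefix>/<container-name>/<task-id>.
                    "awslogs-stream-prefix": "tomcat",
                },
            },
        }
    ],
)

Note that the awslogs driver only captures the container's STDOUT/STDERR, so each container (not each application inside Tomcat) gets its own log stream.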

Processing AWS ELB access logs (from S3 bucket to InfluxDB)

We would like to process AWS ELB access logs and write them into InfluxDB, to be used for application metrics and monitoring (e.g. Grafana).
We configured ELB to store access logs in an S3 bucket.
What would be the best way to process those logs and write them to InfluxDB?
What we tried so far was to mount the S3 bucket on the filesystem using s3fs and then use the Telegraf agent for processing. But this approach has some issues: the s3fs mount feels like a hack, and all the files in the bucket are compressed and need to be unzipped before Telegraf can process them, which overcomplicates the task.
Is there any better way?
Thanks,
Oleksandr
Can you just install the Telegraf agent on the AWS instance that is generating the logs, and have them sent directly to InfluxDB in real time?
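If the logs have to come out of the S3 bucket (as in the approach described in the question, just without the s3fs mount), here is a rough Python sketch using boto3 and the influxdb 1.x client library; the bucket, prefix, database and the field positions (classic ELB log format) are assumptions:

import gzip

import boto3
from influxdb import InfluxDBClient  # assumes the influxdb 1.x client library

# Bucket, prefix and database names are placeholders.
BUCKET = "my-elb-access-logs"
PREFIX = "AWSLogs/"

s3 = boto3.client("s3")
influx = InfluxDBClient(host="localhost", port=8086, database="elb")

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
        if obj["Key"].endswith(".gz"):  # ALB logs are gzipped; classic ELB logs are not
            body = gzip.decompress(body)
        points = []
        for line in body.decode().splitlines():
            fields = line.split(" ")
            # Field positions assume the classic ELB access-log format
            # (timestamp, elb, client, backend, timings, status codes, ...);
            # ALB logs add a leading request-type field that shifts them.
            points.append(
                {
                    "measurement": "elb_access",
                    "time": fields[0],
                    "tags": {"elb": fields[1], "status": fields[7]},
                    "fields": {"backend_processing_time": float(fields[5])},
                }
            )
        if points:
            influx.write_points(points)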

Retrieve application config from secure location during task start

I want to make sure I'm not storing sensitive keys and credentials in source control or in Docker images. Specifically, I'd like to store my MySQL RDS application credentials and copy them in when the container/task starts. The documentation provides an example of retrieving the ecs.config file from S3, and I'd like to do something similar.
I'm using the Amazon ECS-optimized AMI with an Auto Scaling group that registers with my ECS cluster. I'm using the ghost Docker image without any customization. Is there a way to configure what I'm trying to do?
You can define a volume on the host and map it into the container with read-only privileges.
Please refer to the following documentation for configuring volumes for an ECS task.
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html
Even though the container does not have the config at build time, it will read the config as if it were part of its own filesystem.
There are many ways to secure the config on the host OS.
In my past projects, I have achieved this by disabling SSH into the host and injecting the config at boot-up using cloud-init.
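A minimal sketch of the read-only host volume described above, expressed as a task definition registered with boto3; the family, image, memory and paths are placeholders, and something like cloud-init would be responsible for writing the credentials to the host path at boot:

import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="ghost",
    # The host path is a placeholder; cloud-init (or similar) writes the
    # credentials there when the instance boots.
    volumes=[
        {"name": "ghost-config", "host": {"sourcePath": "/etc/ghost-config"}}
    ],
    containerDefinitions=[
        {
            "name": "ghost",
            "image": "ghost:latest",
            "memory": 512,
            "mountPoints": [
                {
                    "sourceVolume": "ghost-config",
                    # The container path is a placeholder for wherever the
                    # image expects its configuration.
                    "containerPath": "/var/lib/ghost/config",
                    "readOnly": True,
                }
            ],
        }
    ],
)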

Monitoring Amazon S3 logs with Splunk?

We have a large extended network of users that we track using badges. The total traffic is in the neighborhood of 60 million impressions a month. We are currently considering switching from a fairly slow, database-based logging solution (custom-built in PHP; messy...) to a simple log-based alternative that relies on Amazon S3 logs and Splunk.
After using Splunk for some other analysis tasks, I really like it. But it's not clear how to set up a source like S3 with the system. It seems that remote sources require the Universal Forwarder to be installed, which is not an option for S3.
Any ideas on this?
Very late answer, but I was looking for the same thing and found a Splunk app that does what you want: http://apps.splunk.com/app/1137/. I have not tried it yet, though.
I would suggest logging JSON-preprocessed data to a DocumentDB database, for example using Azure queues or similar service-bus messaging technologies that fit your scenario, in combination with Azure DocumentDB.
So I would keep your database-based approach and modify it to use a schemaless, easy-to-scale, document-based DB.
I use http://www.insight4storage.com/ from the AWS Marketplace to track my AWS S3 storage usage totals by prefix, bucket or storage class over time; it also shows me previous versions' storage by prefix and per bucket. It has a setting to save the S3 data as Splunk-format logs that might work for your use case, in addition to its UI and web service API.
You can use the Splunk Add-on for AWS.
This is what I understand:
1. Create a Splunk instance. Use the hosted version or the on-premise AMI of Splunk to create an EC2 instance where Splunk is running.
2. Install the Splunk Add-on for AWS application on that EC2 instance.
3. Based on the input log type (e.g. CloudTrail logs, Config logs, generic logs, etc.), configure the Add-on and supply the AWS account ID or IAM role, etc.
4. The Add-on will automatically poll the AWS S3 source and fetch the latest logs after the specified amount of time (the default is 30 seconds).
For a generic use case (like ours), you can try configuring the Generic S3 input for Splunk.