AWS Elastic Beanstalk: automating deletion of logs published to S3

I have enabled publishing of logs from AWS Elastic Beanstalk to Amazon S3 by following these instructions: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.loggingS3.title.html
This is working fine. My question is: how do I automate the deletion of old logs from S3, say those over one week old? Ideally I'd like a way to configure this within AWS, but I can't find such an option. I have considered using logrotate but was wondering if there is a better way. Any help is much appreciated.

I eventually discovered how to do this. You can create an S3 Lifecycle rule to delete particular objects, or all objects under a prefix, once they are more than N days old. Note: you can also archive instead of deleting, or archive for a while before deleting, among other things; it's a great feature.
Reference: http://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectExpiration.html
and http://docs.aws.amazon.com/AmazonS3/latest/dev/manage-lifecycle-using-console.html
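If you prefer to set the rule up from code rather than the console, here is a minimal boto3 sketch of the same idea. The bucket name and the resources/environments/logs/ prefix are placeholders for wherever Elastic Beanstalk publishes your logs, so adjust them to your setup.

import boto3

s3 = boto3.client("s3")

# Expire Elastic Beanstalk log objects under the given prefix after 7 days.
# Bucket name and prefix below are placeholders -- substitute your own.
s3.put_bucket_lifecycle_configuration(
    Bucket="elasticbeanstalk-us-east-1-123456789012",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-eb-logs-after-7-days",
                "Filter": {"Prefix": "resources/environments/logs/"},
                "Status": "Enabled",
                "Expiration": {"Days": 7},
            }
        ]
    },
)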

Related

How to stop AWS ElasticBeanstalk from creating an S3 Bucket or inserting into it?

It created an S3 bucket. If I delete it, it just creates a new one. How can I configure it not to create a bucket, or revoke its write permissions to it?
You cannot prevent AWS Elastic Beanstalk from creating an S3 bucket, as it stores your application and settings as a bundle in that bucket and uses it to execute deployments. That bucket is required for as long as you run/deploy your application with Elastic Beanstalk. Please be wary of deleting these buckets, as this may cause your deployments/applications to break. You may, however, remove older objects that are no longer in use.
Take a look at this link for detailed information on how Elastic Beanstalk uses S3 buckets for deployments: https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.S3.html

S3 objects have been deleted randomly

Is there an AWS CLI command to restore versioned files?
I've been developing a web server using Django.
One day I found that image files had been deleted from S3 seemingly at random.
I think Django's sorl-thumbnail is deleting them; I tried to fix it but failed.
So I came up with a temporary solution:
S3 offers versioning, and I use it to recover the files manually every day.
This is very annoying to do, so I am writing a script.
But I could not find a way to restore a file that has a delete marker.
Does anyone know how to handle the situation above?
Thank you!
Recovering objects is a bit tricky in S3. As per the AWS documentation, http://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjects.html :
When you delete an object from a versioned bucket, S3 creates a new object called a delete marker, which has its own, new version ID.
If you delete that "version" of the object (the delete marker), the object becomes visible again.
You can use this command
aws s3api delete-object --bucket <bucket> --key <key> --version-id <version_id_of_delete_marker>
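If many objects were deleted, a small script helps. The following boto3 sketch lists the delete markers under a prefix and removes the ones that are currently the latest version, which makes the previous versions visible again; the bucket name and prefix are placeholders.

import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"        # placeholder
prefix = "media/images/"    # placeholder

# Walk all versions under the prefix and remove delete markers that are the
# current ("latest") version; this restores the previous version's visibility.
paginator = s3.get_paginator("list_object_versions")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for marker in page.get("DeleteMarkers", []):
        if marker["IsLatest"]:
            s3.delete_object(
                Bucket=bucket,
                Key=marker["Key"],
                VersionId=marker["VersionId"],
            )
            print("restored", marker["Key"])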

How to filter or cleanup my S3 buckets clutters by log file?

I use S3 and Amazon CloudFront to serve images.
When I go to the Amazon S3 interface, it's hard to find the folder where I have put my images, because I need to scroll for 10 minutes past all the "buckets" it creates every 15 minutes/hour. There are literally thousands.
Is this normal?
Did I get something wrong in the settings of S3 or of the CloudFront distribution I connected to this S3 folder?
What should I do to delete them? It seems I can only delete them one by one.
See this snapshot (screenshot of the bucket listing, truncated; it goes on like this for thousands of log files).
Those are not buckets, but are actually log files generated by S3 because you enabled logging for your bucket and configured it to save the logs in the same bucket.
If you want to keep logging enabled but make it easier to work with the logs, just use a prefix in the logging configuration or set up logging to use a different bucket.
If you don't need the logs, just disable logging.
See http://docs.aws.amazon.com/AmazonS3/latest/dev/ServerLogs.html for more details.
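For reference, both options can also be applied from code; a rough boto3 sketch, where the bucket names and the prefix are placeholders:

import boto3

s3 = boto3.client("s3")

# Option 1: keep logging on, but write the logs to a dedicated bucket/prefix
# so they no longer clutter the bucket that holds your images.
s3.put_bucket_logging(
    Bucket="my-images-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-access-logs-bucket",
            "TargetPrefix": "cdn-logs/",
        }
    },
)

# Option 2: turn access logging off entirely.
s3.put_bucket_logging(
    Bucket="my-images-bucket",
    BucketLoggingStatus={},
)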

Not able to backup the log files during instance termination issued by Auto Scaling Policy

I have EC2 instances with Auto Scaling enabled.
Now, as part of the scale-down policy, when one of the instances is issued a termination, the log files remaining on that instance need to be backed up to S3, but I cannot find any way to upload those log files to S3 for that instance. I have tried putting the needed script in the rc0.d directory through chkconfig with the highest priority. I also tried putting my script in /lib/systemd/system/halt.service (or reboot.service or poweroff.service), but no luck so far.
I have found some threads related to this on Stack Overflow and the AWS forum, but no proper solution so far.
Can anyone please let me know the solution to this problem?
The only reliable way I have found of achieving this behaviour is to use rsyslog/syslog to transfer the log messages to a central host as soon as they are written to the syslog subsystem.
This means you will need to run another instance that receives the log files and ships them to S3, or use an SQS-based system such as Logstash.
Unfortunately there is no other way to ensure all of your log messages end up in S3; you cannot guarantee that your script will finish before Auto Scaling "pulls the plug".
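To illustrate the "ship logs off the instance as they are written" approach, here is a minimal Python sketch that forwards application log records to a central syslog host over TCP using only the standard library. The hostname and port are assumptions, and the central host would still need its own job to archive the collected logs to S3.

import logging
import logging.handlers
import socket

# Forward application logs to a central syslog host as they are written,
# instead of relying on a shutdown script that may never get to run.
handler = logging.handlers.SysLogHandler(
    address=("logs.internal.example.com", 514),   # hypothetical central host
    socktype=socket.SOCK_STREAM,                  # TCP; use SOCK_DGRAM for UDP
)
handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))

logger = logging.getLogger("myapp")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("this line leaves the instance immediately")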

Allowing users to download files as a batch from AWS s3 or Cloudfront

I have a website that allows users to search for music tracks and download the ones they select as MP3s.
I have the site on my server and all of the MP3s on S3, distributed via CloudFront. So far so good.
The client now wishes for users to be able to select a number of music tracks and then download them all in bulk, or as a batch, instead of one at a time.
Usually I would place all the files in a zip and then present the user with a link to that new zip file to download. In this case, as the files are on S3, that would require me to first copy all the files from S3 to my web server, process them into a zip, and then serve the download from my server.
Is there any way I can create a zip on S3 or CloudFront, or is there some way to batch/group files into a zip?
Maybe I could set up an EC2 instance to handle this?
I would greatly appreciate some direction.
Best
Joe
I am afraid you won't be able to create the batches without additional processing. Firing up an EC2 instance might be an option for creating a batch per user.
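To make that concrete, here is a rough boto3 sketch of what such a per-user batch job could do: download the selected keys, zip them, upload the archive back to S3 and hand out a time-limited presigned URL. All bucket names and keys are placeholders.

import boto3
import os
import tempfile
import zipfile

s3 = boto3.client("s3")
bucket = "my-music-bucket"                        # placeholder
selected_keys = ["tracks/a.mp3", "tracks/b.mp3"]  # chosen by the user
zip_key = "batches/user-123/batch.zip"            # placeholder

# Build the zip in a temp directory, then push it back to S3.
with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "batch.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        for key in selected_keys:
            local = os.path.join(tmp, os.path.basename(key))
            s3.download_file(bucket, key, local)
            zf.write(local, arcname=os.path.basename(key))
    s3.upload_file(zip_path, bucket, zip_key)

# Give the user a time-limited download link for the finished zip.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": zip_key},
    ExpiresIn=3600,
)
print(url)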
I am facing the exact same problem. So far the only thing I was able to find is the AWS CLI's s3 sync command:
https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
In my case, I am using Rails + its Paperclip add-on, which means that I have no easy way to download all of a user's images in one go, because the files are scattered across a lot of subdirectories.
However, if you can group your user's files in a better way, say like this:
/users/<ID>/images/...
/users/<ID>/songs/...
...etc., then you can solve your problem right away with:
aws s3 sync s3://<your_bucket_name>/users/<user_id>/songs /cache/<user_id>
Do keep in mind you'll have to give your server the proper credentials so the S3 CLI tools can work without prompting for them.
And that should sort you out.
Additional discussion here:
Downloading an entire S3 bucket?
S3 transfers are one HTTP request per object.
So the answer is to use threads to achieve the same thing.
The Java API uses TransferManager for this:
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html
You can get great performance with multiple threads.
There is no bulk download, sorry.
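The same idea works in Python with a plain thread pool; a rough sketch, where the bucket and keys are placeholders:

import os
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
bucket = "my-music-bucket"                        # placeholder
keys = ["tracks/a.mp3", "tracks/b.mp3", "tracks/c.mp3"]

def fetch(key):
    # One HTTP request per object; running many of these in parallel
    # is what gives the throughput boost mentioned above.
    dest = os.path.join("/tmp/downloads", os.path.basename(key))
    s3.download_file(bucket, key, dest)
    return dest

os.makedirs("/tmp/downloads", exist_ok=True)
with ThreadPoolExecutor(max_workers=8) as pool:
    for path in pool.map(fetch, keys):
        print("downloaded", path)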