AWS S3 to Glacier: did backup work? [closed] - amazon-s3

I am experimenting with backing up data in my Amazon S3 folders to Glacier using lifecycle management options. I chose one of the folders in the bucket for testing and created a lifecycle rule stating that objects with that prefix should be migrated to Glacier after 30 days. I created the rule today, but these files are all older than 30 days, so I expected them to be migrated right away. However, when I look at that S3 folder I don't see any changes. How do I find out if a backup actually occurred?

The lifecycle management policy (LMP) you applied will affect all items matching it, whether they existed before you applied the policy or were created afterwards. It takes time for the policy to propagate across all of your items in S3. See Object Lifecycle Management, in particular the text just before and after "Before You Decide to Archive Objects".
Objects transitioned by an LMP remain visible only through the S3 API, not via the Glacier API or console. You'll continue to see the objects listed in your S3 bucket, but each object's metadata will be updated to indicate that its x-amz-storage-class is GLACIER. You should be able to see this through the S3 console, or by requesting the object's metadata using the S3 API. See Object Key and Metadata, under System-Defined Metadata.
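If you'd rather check from a script than from the console, here is a minimal sketch (assuming boto3; the bucket and key names are placeholders) that reads an object's metadata and prints its storage class:

```python
# Minimal sketch: check whether an object has been transitioned to Glacier
# by inspecting its storage class via the S3 API (bucket/key are placeholders).
import boto3

s3 = boto3.client("s3")

resp = s3.head_object(Bucket="my-bucket", Key="my-folder/my-file.dat")
# StorageClass is omitted from the response for STANDARD objects,
# so treat a missing key as STANDARD.
print(resp.get("StorageClass", "STANDARD"))  # prints "GLACIER" once the transition has run
```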

Related

Cost-effective solution to upload large files to S3 from an application running on an EC2 instance [closed]

I have an application running on an EC2 instance. Using that application, users can upload large files (10 MB+) to an S3 bucket. The current mechanism copies the file to the EC2 instance and then uploads it to the S3 bucket. This is costly because the file is copied twice (first to EC2, then to S3). I have tried another approach to avoid the double copy:
API Gateway ==>> Lambda function ==>> S3 bucket.
API Gateway has a payload limit of 10 MB, i.e. the file size must be less than 10 MB.
I thought of splitting the file into small pieces, but then the Lambda function has to zip them back together (which again takes time).
The other solution is S3 pre-signed URLs, but that also seemed costly. I need an effective solution for this problem.
The solutions I found were either not cost-effective or too time-consuming.
S3 pre-signed URLs are the way to go, as mentioned by luk2302.
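As a rough illustration of that approach, here is a minimal server-side sketch (boto3; the bucket and key names are placeholders) that generates a pre-signed PUT URL so the client uploads directly to S3 and the extra copy through EC2 or API Gateway disappears:

```python
# Sketch: generate a pre-signed PUT URL on the server so the client can upload
# the file straight to S3, bypassing the EC2 instance and the API Gateway limit.
# Bucket and key names are placeholders.
import boto3

s3 = boto3.client("s3")

url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-upload-bucket", "Key": "uploads/bigfile.bin"},
    ExpiresIn=900,  # URL is valid for 15 minutes
)
# Hand `url` to the client; it then performs a plain HTTP PUT of the file body
# against it, with no AWS credentials needed on the client side.
print(url)
```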

Should I create a separate file upload server besides my main GraphQL server? [closed]

I'm creating a mobile application with React Native which is going to depend heavily on file uploading in the form of images and videos. Currently my GraphQL server handles all the database interaction, and I now want to add the functionality to upload images (right now only profile pictures, later videos). These files will be stored in cloud object storage.
It would be quite easy to use apollo-upload-client in the mobile application and graphql-upload with my Apollo server to handle the uploading of files. But I'm not sure if I should create a separate server which only handles the interaction with files, so that my main server only needs to handle the DB jobs. File uploading would also add a huge amount of load to the main GraphQL server, which needs to be very fast and responsive since most of the application depends on it.
I would be interested to hear other opinions and tips on this topic, and whether it's worth creating a separate server for interaction with files.
Or should I even look into different languages like Elixir or Python to improve performance, since we would also need to process and compress the videos and images to reduce their size?
IMO, if your final destination is cloud-based storage, you're going to be better off (and pay less) if you upload directly to the cloud. What I generally recommend is a 3-step process:
mutation to create a signed upload URL (or signed upload form)
Client uploads directly to the cloud (temporary location with TTL)
mutation to process the form which contained the upload metadata (process and move to final location)
This especially gets interesting once you start handling multiple uploads and figuring out how to process them asynchronously while the user is filling out the rest of the form.
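To make step 1 concrete, here is a hedged sketch (Python with boto3; the bucket name, temporary prefix, and function name are hypothetical) of a resolver that returns a signed upload form pointing at a temporary location:

```python
# Sketch of step 1 only: a resolver/handler that returns a signed upload form
# (S3 presigned POST) the client can use to upload directly to a temporary prefix.
# Bucket name, prefix, and size cap are hypothetical.
import uuid
import boto3

s3 = boto3.client("s3")

def create_signed_upload():
    key = f"tmp-uploads/{uuid.uuid4()}"  # temporary location; expire it via a lifecycle rule (TTL)
    form = s3.generate_presigned_post(
        Bucket="my-media-bucket",
        Key=key,
        Conditions=[["content-length-range", 0, 50 * 1024 * 1024]],  # cap uploads at 50 MB
        ExpiresIn=600,
    )
    # Return both the form and the key so the later "process upload" mutation
    # knows which temporary object to move to its final location.
    return {"url": form["url"], "fields": form["fields"], "key": key}
```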

How to make a daily backup of my EC2 instance? [closed]

I have a community-AMI-based Linux EC2 instance in AWS. Now I want to take a daily backup of my instance and upload that image to S3.
Is that the correct way to back up my EC2 instance? Can anybody point me to the correct method for backing up my EC2 instance?
Hopefully your instance is EBS-backed.
If so, you can back up your instance by taking an EBS snapshot. That can be done through aws.amazon.com (manually), using the AWS Command Line Tools (which can be automated and scheduled in cron or Windows Task Scheduler as appropriate), or through the AWS API.
You want to ensure that no changes are made to the state of the database files during the snapshot process. When I used this strategy for MySQL running on Ubuntu, I used a script to ensure a consistent snapshot; that script uses a feature of the XFS file system to freeze the filesystem while the snapshot is taken. In that deployment the snapshot only took 2-3 seconds and was performed at a very off-peak time, so any website visitors would experience at most a 2-3 second lag. For Windows, if the machine cannot be rebooted for the snapshot (i.e. you have no maintenance window at night), I would instead create a separate EBS volume (e.g. an "S:\" drive for snapshots), use SQL Server backup tools to write a .bak file to that volume, then create an EBS snapshot of that separate EBS volume.
For details on scripting the backup, see this related question:
Automating Amazon EBS snapshots: anyone have a good script or solution for this on Linux?
If you have separate storage mounted e.g. for your database, be sure you back that up too!
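For the scripted, scheduled approach mentioned above, a rough boto3 sketch might look like this (the volume ID is a placeholder; any filesystem freezing described above is still up to you):

```python
# Sketch: create an EBS snapshot of a given volume, suitable for running from cron.
# The volume ID is a placeholder; quiesce the filesystem first if consistency matters.
import datetime
import boto3

ec2 = boto3.client("ec2")

def snapshot_volume(volume_id="vol-0123456789abcdef0"):
    stamp = datetime.datetime.utcnow().strftime("%Y-%m-%d")
    snap = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Daily backup {stamp}",
    )
    return snap["SnapshotId"]

if __name__ == "__main__":
    print(snapshot_volume())
```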
UPDATE
To create a snapshot manually,
Browse to https://console.aws.amazon.com/ec2/home?#s=Volumes
Right-click on the volume you want to back up (the instance the volume is attached to is shown in the column named 'Attachment Information')
Select Create Snapshot
To create an AMI image from the instance and launch other instances just like it (e.g. instances with more resources, or to balance load):
Browse to https://console.aws.amazon.com/ec2/home?#s=Instances
Right-click on the instance you want to create the AMI from
Select Create Image (EBS AMI)
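For completeness, a hedged sketch of the programmatic equivalent of that console action (boto3; the instance ID and image name are placeholders):

```python
# Sketch: the API equivalent of "Create Image" in the console.
import datetime
import boto3

ec2 = boto3.client("ec2")

stamp = datetime.datetime.utcnow().strftime("%Y-%m-%d")
image = ec2.create_image(
    InstanceId="i-0123456789abcdef0",   # placeholder instance ID
    Name=f"daily-backup-ami-{stamp}",   # AMI names must be unique
    NoReboot=True,                      # skips the reboot, but the filesystem may not be quiesced
)
print(image["ImageId"])
```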

How to detect changes in Amazon S3? [duplicate]

Possible Duplicate:
Notification of new S3 objects
Get notified when user uploads to an S3 bucket?
What's the most efficient way to detect changes in Amazon S3? A number of distributed boxes need to synchronize local files with S3. Each box needs to synchronize with a portion of an S3 bucket. Sometimes files get dropped into a bucket from an external source, and so the boxes won't know about it.
I could write a script that continually crawls all files on S3 and notifies the appropriate box when there is a change, but that will be slow and expensive. (There will be millions of files). I thought about enabling logging on the bucket, but it takes a long time for logs to get written, and I would like to get notified of changes fairly quickly.
Any other ideas?
Amazon provides a means of notification for bucket events (as seen here), but the only event type currently supported is s3:ReducedRedundancyLostObject.
I am afraid the only ways you can do what you want today are either polling (or crawling, as you said) or modifying the clients that upload files to your bucket(s) (if you control their code) so that they notify your boxes whenever something is uploaded or changed.
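If you do end up polling, a minimal sketch might look like this (boto3; the bucket, prefix, and scheduling are placeholders, and state is kept in memory only for illustration):

```python
# Sketch of the polling approach: list a prefix of the bucket and diff the
# result against the previous run to detect new or changed objects.
import boto3

s3 = boto3.client("s3")

def list_keys(bucket="my-bucket", prefix="box-42/"):
    keys = {}
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            keys[obj["Key"]] = obj["ETag"]
    return keys

previous = list_keys()
# ... later, e.g. on a timer:
current = list_keys()
changed = [k for k, etag in current.items() if previous.get(k) != etag]
print(changed)
```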

Backup strategy for user uploaded files on Amazon S3? [closed]

We're switching from storing all user-uploaded files on our servers to using Amazon S3. It's approx. 300 GB of files.
What is the best way to keep a backup of all files? I've seen a few different suggestions:
Copy bucket to a bucket in a different S3 location
Versioning
Backup to an EBS with EC2
Pros/cons? Best practice?
What is the best way to keep a backup of all files?
In theory, you don't need to. S3 has never lost a single bit in all these years. Your data is already stored in multiple data centers.
If you're really worried about accidentally deleting files, use IAM credentials: for each IAM user, deny the delete operation. And/or turn on versioning and remove the ability of IAM users to perform the real (permanent) deletes.
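As a hedged sketch of that "deny deletes" idea (boto3; the user, policy, and bucket names are placeholders), an inline IAM policy could look like this:

```python
# Sketch: attach an inline policy to an IAM user that denies object deletion
# in the uploads bucket. User, policy, and bucket names are placeholders.
import json
import boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
        "Resource": "arn:aws:s3:::my-uploads-bucket/*",
    }],
}

iam.put_user_policy(
    UserName="app-worker",
    PolicyName="deny-s3-deletes",
    PolicyDocument=json.dumps(policy),
)
```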
If you still want a backup, EBS or a second S3 bucket is pretty trivial to implement: just run an S3 sync utility to sync between buckets or to the EBS volume. (There are a lot of them, and they're trivial to write.) Note that you pay for unused space on your EBS volume, so it's probably more expensive if you're growing. I wouldn't use EBS unless you really had a use for local access to the files.
The upside of the S3 bucket sync is that you can quickly switch your app to use the other bucket.
You could also use Glacier to back up your files, but that has some severe limitations.
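For illustration, the bucket-to-bucket sync mentioned above could be as simple as the boto3 sketch below (bucket names are placeholders; a real sync tool would also skip unchanged objects and handle deletions):

```python
# Sketch of a minimal bucket-to-bucket "sync": copy every object from the
# primary bucket into a backup bucket. No change detection is done here.
import boto3

s3 = boto3.resource("s3")
source = s3.Bucket("my-primary-bucket")
backup = s3.Bucket("my-backup-bucket")

for obj in source.objects.all():
    backup.copy({"Bucket": source.name, "Key": obj.key}, obj.key)
```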
IMHO, backing up to another S3 bucket in another region (and hence another bucket) is the best way to go:
You already have the infrastructure to manipulate S3, so there is little to change
This ensures that in the event of a catastrophic failure of S3 in one region, your backup region won't be affected
Other solutions have drawbacks that this doesn't have:
Versioning does not protect against a catastrophic failure
EBS backup requires a specific implementation to manipulate these backups directly on the disk.
I didn't try it myself, but Amazon has a versioning feature that could address your backup concerns; see: http://aws.amazon.com/about-aws/whats-new/2010/02/08/versioning-feature-for-amazon-s3-now-available/
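Enabling it is a single API call; a minimal boto3 sketch (the bucket name is a placeholder):

```python
# Sketch: turn on versioning for a bucket so that overwrites and deletes
# keep previous versions recoverable. Bucket name is a placeholder.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_versioning(
    Bucket="my-uploads-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)
```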
Copy bucket to a bucket in a different S3 location:
This may not be necessary, because S3 is already designed for very high durability (eleven nines, 99.999999999%) through redundant storage. People who want global data-access performance might keep copies of buckets in different data centers. So, unless you want to guard against some unlikely disaster like 9/11, you could, for example, keep a copy in the Tokyo data center of buckets stored in New York.
However, copying buckets to other buckets within the same data center gives you very little protection when disaster strikes that data center.
Versioning
It helps you achieve storage efficiency by avoiding redundant full backups and lets you restore faster. It is definitely a good choice.
Backup to an EBS with EC2
You probably will NEVER do this, because EBS is a much more expensive (and faster) storage option in AWS compared with S3, and its main purpose is to provide the boot and data volumes for EC2 instances. EC2 is a compute service that has nothing to do with object storage or S3; I cannot see any point in introducing EC2 into your data backup.