Cost effective solution to upload large files on S3 using application running in EC2 instance [closed] - amazon-s3

I have an application running on an EC2 instance. Through that application, users can upload large files (10MB+) to an S3 bucket. The current mechanism copies the file to the EC2 instance first and then uploads it to S3. That is costly because the file is copied twice (first to EC2, then to the S3 bucket). I have tried another approach to avoid the double copy:
API Gateway ==>> Lambda function ==>> S3 bucket.
However, API Gateway has a payload limit of 10MB, i.e. the file size must be less than 10MB.
I thought of splitting the file into small pieces, but then the Lambda function has to zip them back together, which again takes time.
The other option is S3 pre-signed URLs, but that also seemed costly. I need an effective solution for this problem.
The solutions I found so far were either not cost effective or too time consuming.

S3 pre-signed URLs are the way to go, as mentioned by luk2302. The client uploads the file directly to S3, so it never has to pass through the EC2 instance, API Gateway, or Lambda, and you avoid the double copy.
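A minimal sketch of that approach with boto3, assuming the application backend on the EC2 instance only hands out the URL; the bucket and key names below are placeholders:

```python
import boto3

# Sketch: the backend issues a pre-signed PUT URL, the client uploads the
# file directly to S3, so the bytes never pass through the EC2 instance.
s3 = boto3.client("s3")

def create_upload_url(bucket: str, key: str, expires_in: int = 900) -> str:
    """Return a pre-signed URL the client can PUT the file to."""
    return s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds until the URL stops working
    )

# Example: the client would then do something like
#   requests.put(url, data=open("big-file.bin", "rb"))
url = create_upload_url("my-upload-bucket", "uploads/big-file.bin")
print(url)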

Related

Can S3 bucket be slowed down by Internet Provider? [closed]

I have an S3 bucket in the us-east-2 region and access is mainly from Nepal. When I use my WiFi it is really slow, but when using mobile data it is fast enough. It is also fast when using a VPN that exits outside my country. What could be the reason behind this? The speed was good enough just a day before; only today did it start to slow down for no apparent reason. Is it due to my WiFi provider? What should I do in this situation?
Buckets are globally accessible, but they reside in a specific AWS Region. The geographical distance between the request and the bucket contributes to the time it takes for a response to be received.
To decrease the distance between the client and the S3 bucket, consider moving your data into a bucket in another Region that's closer to the client. You can configure cross-Region replication so that data in the source bucket is replicated into the destination bucket in the new Region. As another option, consider migrating the client closer to the S3 bucket.
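As a rough sketch (not part of the original answer), cross-Region replication could be configured with boto3 along these lines; the bucket names and IAM role ARN are placeholders, and versioning must already be enabled on both buckets for this call to succeed:

```python
import boto3

# Sketch only: replicate objects from a source bucket into a bucket in a
# Region closer to the client. Assumes versioning is enabled on both buckets
# and that the IAM role below exists with S3 replication permissions.
s3 = boto3.client("s3")

s3.put_bucket_replication(
    Bucket="my-us-east-2-bucket",  # placeholder source bucket
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # placeholder
        "Rules": [
            {
                "ID": "replicate-to-ap-south-1",
                "Prefix": "",       # empty prefix = replicate every object
                "Status": "Enabled",
                "Destination": {"Bucket": "arn:aws:s3:::my-ap-south-1-bucket"},
            }
        ],
    },
)
```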
You can also try S3 Transfer Acceleration, which manages fast, easy, and secure transfers of files over long geographic distances between the client and an S3 bucket. It takes advantage of the globally distributed edge locations in Amazon CloudFront. As the data arrives at an edge location, it is routed to Amazon S3 over an optimized network path. Transfer Acceleration is ideal for transferring gigabytes to terabytes of data regularly across continents. It's also useful for clients that upload to a centralized bucket from all over the world.
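A hedged sketch of enabling and using Transfer Acceleration with boto3; the bucket and file names are placeholders:

```python
import boto3
from botocore.config import Config

# Sketch only: turn on Transfer Acceleration for the bucket, then upload
# through the accelerate endpoint.
s3 = boto3.client("s3")
s3.put_bucket_accelerate_configuration(
    Bucket="my-us-east-2-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# A client configured to route uploads/downloads through the accelerate endpoint.
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("local-file.bin", "my-us-east-2-bucket", "remote-file.bin")
```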
Can S3 bucket be slowed down by Internet Provider?
If you are connecting to S3 over the Internet, the performance of your Internet connection can affect S3 upload and download times. Given the difference you see between WiFi and the mobile network, I encourage you to test whether the cause of the issue is your network rather than your AWS setup. AWS publishes a guide on troubleshooting slow or inconsistent speeds when downloading from or uploading to Amazon S3.

Should I create a separate file upload server besides main graphql server? [closed]

I'm creating a mobile application with React Native which is going to depend heavily on file uploads in the form of images and videos. Currently my GraphQL server handles all the database interaction, and I now want to add the ability to upload images (right now only profile pictures, later videos). These files will be stored in a cloud object storage.
It would be quite easy to use apollo-upload-client in the mobile application and graphql-upload with my Apollo server to handle the uploading of files. But I'm not sure whether I should create a separate server that only handles file interaction, so that my main server only needs to handle the DB jobs. File uploading would also add a huge amount of load to the main GraphQL server, which needs to be fast and responsive since most of the application depends on it.
I would be interested to hear other opinions and tips on this topic, and whether it's worth creating a separate server for file handling.
I'm also wondering whether to look into different languages like Elixir or Python to improve performance, because we would also need to process and compress the videos and images to reduce their size.
IMO, if your final destination is cloud-based storage, you're going to be better off (and pay less) if you upload directly to the cloud. What I generally recommend is a 3-step process:
mutation to create a signed upload URL (or signed upload form)
Client uploads directly to the cloud (temporary location with TTL)
mutation to process the form which contained the upload metadata (process and move to final location)
This gets especially interesting once you start handling multiple uploads and figuring out how to process them asynchronously while the user is filling out the rest of the form.
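For step 1, a resolver could hand back a pre-signed POST (the "signed upload form"). This is only a sketch with boto3; the bucket, key prefix, and size limit are placeholders, and the TTL on the temporary location would come from a lifecycle rule on that prefix:

```python
import boto3

# Sketch: generate the URL and form fields the client will POST the file with.
s3 = boto3.client("s3")

def create_signed_upload_form(bucket: str, key: str, max_bytes: int = 50 * 1024 * 1024):
    """Return {'url': ..., 'fields': {...}} for a direct-to-S3 upload."""
    return s3.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Conditions=[["content-length-range", 1, max_bytes]],  # cap upload size
        ExpiresIn=600,  # form is valid for 10 minutes
    )

form = create_signed_upload_form("my-media-bucket", "tmp/uploads/profile-123.jpg")
# form["url"] and form["fields"] go back to the React Native client, which POSTs
# the file directly to S3; step 3 then processes it and moves it to its final location.
```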

AWS S3 to Glacier: did backup work? [closed]

I am experimenting with backing up data in my Amazon S3 folders to Glacier using lifecycle management options. I chose one of the folders in the bucket for testing and created a lifecycle rule that states that objects with that prefix should be migrated to Glacier after 30 days. I created the rule today, but these files are all older than 30 days, so I expected them to be migrated right away. However, I am looking at that S3 folder and not noticing any changes. How do I find out if a backup actually occurred?
The lifecycle management policy (LMP) you applied will affect all items matching it, whether they existed before you applied the policy or were created after you applied it. It takes time for the policy to synchronize across all of your items in S3. See Object Lifecycle Management just before and after "Before You Decide to Archive Objects".
The objects moved by a LMP are only visible through the S3 API, not via the Glacier API or console. You'll continue to see the objects listed in your S3 bucket, but the object's metadata will be updated to indicate that the x-amz-storage-class is Glacier. You should be able to see this through the S3 console, or by making a request for the object's metadata using the S3 API. See Object Key and Metadata for the System-Defined Metadata.
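One quick way to check is to read the object's metadata, for example with boto3 (bucket and key below are placeholders):

```python
import boto3

# Sketch: confirm that the lifecycle rule has transitioned an object by
# inspecting its storage class.
s3 = boto3.client("s3")

resp = s3.head_object(Bucket="my-backup-bucket", Key="archive/2014/report.csv")

# StorageClass is omitted for STANDARD objects and reported as "GLACIER"
# once the transition has happened.
print(resp.get("StorageClass", "STANDARD"))
```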

How to make a daily back up of my ec2 instance? [closed]

I have a community-AMI-based Linux EC2 instance in AWS. Now I want to take a daily backup of my instance and store that image in S3.
Is that the correct way to back up my EC2 instance? Can anybody point me to the correct method for backing up my EC2 instance?
Hopefully your instance is EBS backed.
If so, you can back up your instance by taking an EBS snapshot. That can be done through aws.amazon.com (manually), using the AWS Command Line Tools (which can be automated and scheduled in cron or Windows Task Scheduler as appropriate), or through the AWS API.
You want to ensure that no changes are made to the state of the database backup files during the snapshot process. When I used this strategy for MySQL running on Ubuntu, I used a script to ensure a consistent snapshot. That script uses a feature of the XFS file system to freeze the filesystem during the snapshot. In that deployment, the snapshot only took 2-3 seconds and was performed at a very off-peak time, so any website visitors would experience at most a 2-3 second lag. For Windows, if the machine cannot be rebooted for the snapshot (i.e. you have no maintenance window at night), I would instead create a separate EBS volume (e.g. an "S:\" drive for snapshots), use the SQL Server backup tools to create a .bak file on that volume, then create an EBS snapshot of that separate volume.
For details on scripting the backup, see this related question:
Automating Amazon EBS snapshots anyone have a good script or solution for this on linux
If you have separate storage mounted e.g. for your database, be sure you back that up too!
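As an illustration of the scripted variant (not the script referenced above), a daily cron job could snapshot every volume attached to the instance with boto3; the instance ID below is a placeholder:

```python
import boto3
from datetime import datetime, timezone

# Sketch: snapshot every EBS volume attached to one instance.
# Run from cron (or a scheduled task) for a daily backup.
ec2 = boto3.client("ec2")

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder

volumes = ec2.describe_volumes(
    Filters=[{"Name": "attachment.instance-id", "Values": [INSTANCE_ID]}]
)["Volumes"]

for volume in volumes:
    snapshot = ec2.create_snapshot(
        VolumeId=volume["VolumeId"],
        Description=f"Daily backup of {INSTANCE_ID} on {datetime.now(timezone.utc):%Y-%m-%d}",
    )
    print(volume["VolumeId"], "->", snapshot["SnapshotId"])
```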
UPDATE
To create a snapshot manually,
Browse to https://console.aws.amazon.com/ec2/home?#s=Volumes
Right-click on the volume you want to backup (the instance the volume is attached to is in the column named 'Attachment Information')
Select Create Snapshot
To create an AMI image from the instance and launch other instances just like it (e.g. on instances with more resources, or to balance load):
Browse to https://console.aws.amazon.com/ec2/home?#s=Instances
Right-click on the instance you want to create the AMI from
Select Create Image (EBS AMI)
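The same AMI creation can also be done through the API; a small boto3 sketch, with the instance ID and image name as placeholders:

```python
import boto3

# Sketch: create an AMI from a running instance via the API instead of the console.
ec2 = boto3.client("ec2")

image = ec2.create_image(
    InstanceId="i-0123456789abcdef0",   # placeholder
    Name="daily-backup-2024-01-01",     # must be unique per account and region
    NoReboot=True,                      # skip the reboot; filesystem may not be fully quiesced
)
print(image["ImageId"])
```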

Backup strategy for user uploaded files on Amazon S3? [closed]

We're switching from storing all user-uploaded files on our own servers to using Amazon S3. It's approximately 300 GB of files.
What is the best way to keep a backup of all files? I've seen a few different suggestions:
Copy bucket to a bucket in a different S3 location
Versioning
Backup to an EBS with EC2
Pros/cons? Best practice?
What is the best way to keep a backup of all files?
In theory, you don't need to. S3 is designed for 99.999999999% (eleven nines) durability, and your data is already stored redundantly across multiple data centers.
If you're really worried about accidentally deleting the files, use IAM keys. For each IAM user, disable the delete operation. And/or turn on versioning and remove the ability for an IAM user to do the real deletes.
If you still want a backup, backing up to EBS or to another S3 bucket is pretty trivial to implement: just run an S3 sync utility to copy between buckets or to the EBS volume. (There are a lot of them, and a minimal one is trivial to write; see the sketch at the end of this answer.) Note that you pay for unused space on an EBS volume, so it's probably more expensive if your data is growing. I wouldn't use EBS unless you really need local access to the files.
The upside of the S3 bucket sync is that you can quickly switch your app to using the other bucket.
You could also use Glacier to back up your files, but that has some severe limitations.
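In that "trivial to write" spirit, a bare-bones bucket-to-bucket copy with boto3 might look like this (bucket names are placeholders; a real sync tool would also skip objects that haven't changed):

```python
import boto3

# Sketch: copy every object from the source bucket into a backup bucket.
# copy_object is a server-side copy, so the data never leaves AWS; objects
# larger than 5 GB would need a multipart copy instead.
s3 = boto3.client("s3")

SOURCE = "my-uploads-bucket"
BACKUP = "my-uploads-backup-bucket"

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=SOURCE):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket=BACKUP,
            Key=obj["Key"],
            CopySource={"Bucket": SOURCE, "Key": obj["Key"]},
        )
```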
IMHO, backing up to another S3 bucket in another region (hence another bucket) is the best way to go:
You already have the infrastructure to work with S3, so there is little to change
This ensures that in the event of a catastrophic failure in one region, the backup region won't be affected
Other solutions have drawbacks this doesn't have:
Versioning does not protect against a catastrophic failure
An EBS backup requires extra implementation work to manipulate the backups directly on the disk
I haven't tried it myself, but Amazon has a versioning feature that could address your backup concerns - see: http://aws.amazon.com/about-aws/whats-new/2010/02/08/versioning-feature-for-amazon-s3-now-available/
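Enabling versioning is a one-call operation; a small boto3 sketch with a placeholder bucket name:

```python
import boto3

# Sketch: turn on versioning so overwrites and deletes keep the previous copy.
s3 = boto3.client("s3")

s3.put_bucket_versioning(
    Bucket="my-uploads-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)
```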
Copy bucket to a bucket in a different S3 location:
This may not be necessary, because S3 already achieves very high durability (it is designed for 99.999999999% durability) through redundant storage. People who want fast data access globally might copy buckets into a different data center, and if you want to guard against an unlikely region-wide disaster, you could keep a copy of a New York bucket in the Tokyo data center.
However, copying to another bucket within the same data center gives you very little protection if disaster strikes that data center.
Versioning
It helps with storage efficiency by avoiding redundant full copies and makes restores faster. It is definitely a good choice.
Backup to an EBS with EC2
You probably will never want to do this, because EBS is a much more expensive (and faster) storage option in AWS than S3, and its main purpose is to provide block storage for EC2 instances so they boot and run quickly. EC2 itself is a compute service that has nothing to do with object storage, so I cannot see any point in introducing EC2 into your data backup.