Affordability of Amazon Simple Storage Service (S3) - amazon-s3

I have a website that attracts about 30,000 visitors per month. It has a lot of photos and PDF files which eat up a good deal of bandwidth. It's hosted by site5.com, which offers unlimited bandwidth & storage for ~$5 per month. According to site5's statistics, my site has about 20 GB of downloads per day, but I've seen it as high as 116 GB. Uploads range from 5-15 GB daily. (Though I don't really upload things every day, so I don't know where they get those numbers from.)
In anticipation of growing my site even more, perhaps by hosting videos, high-res photos, etc., I was looking into other storage options, even though site5 has been pretty good. Specifically, amazon.com's Simple Storage Service (S3) looks pretty good and is supposed to be a "highly scalable, reliable, fast, inexpensive data storage infrastructure."
Using Amazon's Simple Monthly Calculator, I multiplied out my worst-case scenario numbers:
Storage: 2 GB
Data Transfer-in: 15 GB/day * 31 days = 465 GB/month
Data Transfer-out: 116 GB/day * 31 days = 3596 GB/month
With those numbers alone, the calculator estimates my monthly bill to be a whopping $658.27!!! That's insane! Is anyone here using S3? Are your bills outrageous?
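To put that estimate in perspective, here is a rough sketch that reproduces the monthly transfer totals and turns both bills into an effective price per GB transferred. It only uses the figures quoted above. Lumping the whole bill into a single per-GB number is a simplification (the 2 GB of storage and any request charges are ignored), and the variable names are purely illustrative.

```python
DAYS_IN_MONTH = 31

# Worst-case daily figures quoted in the question.
upload_gb_per_day = 15
download_gb_per_day = 116

transfer_in_gb = upload_gb_per_day * DAYS_IN_MONTH
transfer_out_gb = download_gb_per_day * DAYS_IN_MONTH
total_transfer_gb = transfer_in_gb + transfer_out_gb

# Monthly bills as quoted in the question (not recomputed from a price list).
quoted_s3_bill_usd = 658.27
current_host_bill_usd = 5.00

# Effective price per GB moved, treating the whole bill as transfer cost.
s3_cost_per_gb = quoted_s3_bill_usd / total_transfer_gb
host_cost_per_gb = current_host_bill_usd / total_transfer_gb

print(f"Transfer in:  {transfer_in_gb} GB/month")
print(f"Transfer out: {transfer_out_gb} GB/month")
print(f"S3 estimate:  ${s3_cost_per_gb:.3f} per GB")
print(f"Current host: ${host_cost_per_gb:.4f} per GB")
```

Seen this way, the S3 estimate is an ordinary pay-per-use rate; it is the $5 plan that is unusual.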

Wow, are you sure about those stats? I suppose that's possible, but you're lucky that your host hasn't given you the boot. Leasing a dedicated server will typically get you somewhere in the neighborhood of 1.5TB/month for at least 20 times what you are paying now. If you're doing 3.5TB for $5 per month and your host isn't complaining, don't even think about moving.
(Note: most unlimited plans are in fact limited by the company's terms of service, which usually allow them to give anyone the boot for using "too many" resources.)
I would try to find some way to verify your stats before you continue.
$5/3500GB is $0.0014 per gig. That's insane.

3.6TB/month is kind of a lot. Just as a sanity check, my internet connection seems to deliver somewhere around 100kB/sec reception if I'm lucky (I assume the send/receive rates are about the same). At that bandwidth limit it would take my computer 417 days of sending continuously to deliver that amount of data.
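The sanity check above can be reproduced with a few lines. This is a minimal sketch assuming decimal units (1 TB = 10^12 bytes, 1 kB = 1000 bytes); the function names are made up for illustration. It also computes the sustained rate a server would need to push that volume in a 31-day month.

```python
SECONDS_PER_DAY = 86_400

def days_to_transfer(total_bytes: float, bytes_per_second: float) -> float:
    """Days of continuous sending needed to move total_bytes."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY

def sustained_rate(total_bytes: float, days: float) -> float:
    """Average bytes per second needed to move total_bytes in the given days."""
    return total_bytes / (days * SECONDS_PER_DAY)

monthly_volume_bytes = 3.6e12  # 3.6 TB, decimal units (assumption)
home_link_rate = 100e3         # 100 kB/s, as in the post

days = days_to_transfer(monthly_volume_bytes, home_link_rate)
rate = sustained_rate(monthly_volume_bytes, 31)

print(f"At 100 kB/s: about {days:.0f} days")
print(f"Needed for one month: about {rate / 1e6:.2f} MB/s, around the clock")
```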
10c per gigabyte seems pretty reasonable to me. NearlyFreeSpeech.net charges $1/gigabyte delivered but that decreases to 20c/gigabyte at high volumes. Mosso charges 22c/GB delivered.

If you are paying $5 for unlimited transfer and storage I would stick with your current provider as they are offering something that no-one else is going to be able to offer you for that price.
S3 is also built for content distribution at scale, and it comes with certain uptime and data storage guarantees that your host probably does not offer. When Amazon says they can deliver your 116 GB a day they really mean it, whereas your host is probably overselling their capacity and hoping people don't really use their unlimited transfer.
You are getting a steal in terms of what you use. Good luck finding that elsewhere.

Related

Why is Azure SQL database so expensive?

For a small personal coding project I recently created a SQL database in Azure. For the past weeks I have been hardly using the database, out of 2 GB available space I have been using only 13 MB.
However, the database costs me 6,70 EUR per day and I don't understand why. I have read a few topics/posts stating that the costs with similar use should be around 5-7 EUR per month, not per day.
This is the configuration for the database:
No elastic pool
General purpose, Gen5, 2 vCores
West Europe
Does anyone have an idea about what could be causing the costs per month to be so high?
You chose the General Purpose, Gen5, 2 vCores pricing tier, and that tier determines your monthly cost.
This means that you must pay for it no matter how much space you use. As you said, you have used only 13 MB, so you should change the pricing tier.
I suggest you switch your database to the Basic tier, which costs only 4.99 USD per month. The Basic pricing tier provides 5 DTUs and a maximum size of 2 GB.
You can change the pricing tier on the database overview page.
Hope this helps.
You're paying for the entire infrastructure; that's why. Cloud hosting really only saves on upfront cost. A dedicated server with Windows Server + SQL Server Web will run you at least $5K, but performance-wise, a dedicated server at a colo center will be a lot cheaper to run once you have the hardware.
I know, because I've switched several companies off of Azure. Instead of paying $2500/mo, they pay $200/mo (after the server) for 4U at a colo plus $100/mo for basic maintenance and 1TB/mo bandwidth, so it adds up. For example, I built 2 custom 1U servers (12 core/32GB) for $8500 and an open-source router (pfSense) for another $500, including OSes and SQL Server Web. Initial setup of both servers, including SQL and the router for 16 IP addresses, was about $1K. Total cost was $10K up front.
The equivalent horsepower and storage from Azure was $2500/mo, so 1 year on Azure ran $30K. 1 year on colo (hardware + hosting + maintenance) was $13,600, and the following year was $3600. So far, in 5 years, they have saved ~$122,000. There was only 15 minutes of downtime during the entire period.
Cloud hosting is a great idea, but it will never save you time or money at the rates these companies charge. As for downtime, I have been hosting for 2 decades, and the worst outage was due to a network failure (that also took out multiple cloud providers) and lasted 13 hours. The only other one was due to a fried router (about 3 hours).
Just my take on it: cloud hosting is still way too expensive for what you actually get. Redundancy is nice, but you can buy a new server every 2 months for the price difference (just get good equipment with redundant power supplies and hot-swap drives; in a 55-degree colo center, failures are rare).
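The figures in this answer are easy to check. The sketch below takes the quoted numbers at face value ($2500/mo on Azure, $10K up front plus $300/mo at the colo) and ignores anything the answer does not mention, such as hardware replacement or staff time. The function names are illustrative only.

```python
AZURE_MONTHLY_USD = 2500
COLO_UPFRONT_USD = 10_000      # servers, router, software, initial setup
COLO_MONTHLY_USD = 200 + 100   # rack space plus basic maintenance

def azure_cost(years: int) -> int:
    """Cumulative Azure spend after the given number of years."""
    return AZURE_MONTHLY_USD * 12 * years

def colo_cost(years: int) -> int:
    """Cumulative colo spend, including the one-off hardware purchase."""
    return COLO_UPFRONT_USD + COLO_MONTHLY_USD * 12 * years

# Months until the upfront purchase is paid back by the lower monthly bill.
break_even_months = COLO_UPFRONT_USD / (AZURE_MONTHLY_USD - COLO_MONTHLY_USD)

for years in (1, 2, 5):
    saved = azure_cost(years) - colo_cost(years)
    print(f"{years} yr: Azure ${azure_cost(years):,}  "
          f"colo ${colo_cost(years):,}  saved ${saved:,}")
print(f"Break-even after about {break_even_months:.1f} months")
```

Under those assumptions the five-year difference matches the ~$122,000 quoted, and the hardware pays for itself in under five months.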
It seems you don't know Azure offers a free tier. Please refer to this StackOverflow thread for details on how to take advantage of the free tier that supports databases of 32 MB of space.
If it is a small project you can run it on Ubuntu Linux and it's $3.80/month or $0.0052/hour.
On top of this, you can install MySQL or SQL Server Express. I personally find MySQL easier to access/configure.
Azure does offer a free tier, but even on a paid plan you can keep the cost very low.
You can also create a free App Service plan.
Now let's see how to reduce the cost of a small database on a paid plan:
Go to the Create SQL Database option.
Click the Configure database link.
Then select the Basic option under DTU-based.
The option selected by default is General Purpose under vCore-based, which costs $410 because it provides a 32 GB database.
With the Basic option selected, the database size changes to 2 GB instead of 32 GB, so the cost drops to just $5.64 instead of $410.

Is there any affordable cloud storage?

I'm developing a web/mobile app similar to Dropbox or Drive, but I'm running into problems with storage cost.
My application lets users store files and retrieve them later, but my users pay only once. I've found Amazon S3 and GCS too expensive because they charge every month, and they also charge per transaction and for download bandwidth, so it would be unaffordable.
During my search I've wondered how a website like YouTube can work, considering how high the cost is.
I've found Backblaze, which would be cheaper for my needs but is still very expensive.
I've considered using the YouTube API to upload videos and reduce costs, but my application also has to work offline (it would sync frequently), so I don't think YouTube would work for offline playback.
Could you help me please?
Thank you.
This is not really an answer, but your situation is of interest to me as I am asked this constantly by customers. What is the cheapest solution, and not what is the most appropriate solution?
When you try to reduce storage costs too far, reliability will usually drop significantly. The cost for S3 is dirt cheap to me and Backblaze is 4 times cheaper (I don't have personal experience with Backblaze).
Think about your business model a bit. If the service that you are offering cannot offer the reliability that will be required, you will quickly fail. A couple of data loss situations and poof, your business is gone.

Good idea to host data that will be downloaded internationally using S3?

I don't have any experience regarding server hosting performance and how slow it gets so I wanted to ask this question.
My situation is, I want to host a ~1MB data file that needs to be downloaded by clients occasionally (once every 2-3 days). Of course I would like to minimize costs as long as it does not hurt user experience too much. I have data to indicate that I have clients globally.
I wanted to ask what the ballpark figure would be for the amount of time it would take to download a file of this size from other parts of the world (data is hosted in the US). Does anyone have any idea, for instance, how long it would take to download a 1MB file from locations such as Japan?
In case people are wondering, I personally would consider it OK if it takes under 10s to download in most parts of the world.
The first thing to do when you don't know how well something works... is to try it. Create buckets in all of the regions, store a file, and then download it and see.
The official AWS-centric answer for global content distribution is to connect a CloudFront distribution to an S3 bucket, and set things up so that your content is downloaded from S3 via CloudFront. This will tend to improve download speeds more when the user is distant from the bucket, even if the content isn't cached at a CloudFront edge, because most of the distance the download has to travel, it will be traveling on the AWS "Edge Network," a global network connecting CloudFront to the AWS regions, with fewer unknowns than the Internet at large between here and wherever.
I have a global client base, but -- for example -- my shopping pages' catalog images are stored in S3 in Oregon (us-west-2), but with links pointing to CloudFront.
Interestingly, the pricing for using both services together sometimes works out a little bit less expensive than using only S3. A possible explanation for this is that edge network egress traffic represents a lower cost to AWS and the rates are set accordingly. It's not a major difference, but once you understand the pricing tables, you'll see it.
1MB in 10s equals 800kbps. I'd be very surprised if any reputable hosting provider couldn't keep up with that speed of delivery. Looking at Akamai rankings (2015)*, in Japan (as in your example) the average user's speed is 15Mbps: your file would then be downloaded in 0.53 seconds.
( *Looking at the rankings, keep in mind that in countries where fast internet infra is yet to be ubiquitous, the "average speed" will be an average of fast corporate pipes and other premium links, with actual mainstream users having substantially slower speeds.)
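The two numbers above come from simple unit conversions, sketched below. The sketch assumes decimal units (1 MB = 8,000,000 bits) and counts raw throughput only; connection setup, latency and TCP ramp-up are ignored, so real downloads of a small file will take somewhat longer. The function names are illustrative.

```python
BITS_PER_MEGABYTE = 8_000_000  # decimal megabyte (assumption)

def required_kbps(size_megabytes: float, seconds: float) -> float:
    """Throughput in kbit/s needed to deliver the file within the time budget."""
    return size_megabytes * BITS_PER_MEGABYTE / seconds / 1000

def download_seconds(size_megabytes: float, link_mbps: float) -> float:
    """Transfer time in seconds on a link of the given speed in Mbit/s."""
    return size_megabytes * BITS_PER_MEGABYTE / (link_mbps * 1_000_000)

needed = required_kbps(1, 10)        # 1 MB within the 10 s target
japan_time = download_seconds(1, 15) # 1 MB on a 15 Mbit/s link

print(f"Needed for 1 MB in 10 s: {needed:.0f} kbps")
print(f"1 MB at 15 Mbps: {japan_time:.2f} s")
```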
Then in most cases, this will be up to the user's connection speed, and further, their ISP's international links, which can be much slower than their national or regional pipes. More so in countries with less developed internet infrastructure, where operators are cutting costs and corners.
In deciding whether you need to deploy S3 or other CDN solutions, or no extra solutions at all, you'll have to start by mapping out your user demographics. If there's a substantial share of users in far-away countries with weak network infrastructure, it makes sense. Otherwise, your target speed of 1MB/10s will most likely be met even without a special means of delivery.
If you have some but not substantial traffic from countries/regions where you reckon international traffic might be slower, and if you want to eliminate extra costs, I figure your users will survive even if it takes 15-20 seconds once in a blue moon as their speeds fluctuate. (This is opinion-based relative to how picky your users are!) In such a case, I'd only bother with a CDN if I wanted to improve speeds across the board, e.g. for all requests for static resources, not just a single file requested every couple of days. That would make a more substantial contribution to the general user experience.

Microsoft Azure Blob Storage Upload Performance

I am running an Azure web role, which is storing very small blobs into Azure storage. (Blob upload is being done from the server, not from the browser.) I have searched stack overflow and the rest of the internet for tips on optimizing blob storage performance, and I believe I've checked and implemented all of the usual suspects: uploading async, allowing unlimited outgoing web connections (which now seems to be the default setting on web roles and no longer needs to be explicitly set in web.config or in code).
Tweaking the number of concurrent uploads I allow makes some difference, but regardless of what I've tried, I seem to max out at around 1,000 blob uploads per second. This is when running in the Azure web role, in the same region as the storage account (East US). My rate when running this from home over a good internet connection isn't much less, ~700 blobs/sec, which seems to tell me that it's not the network latency that's limiting the rate, it's the actual processing time of the storage service.
I wouldn't normally consider these rates horrible for this kind of a service, but I've read that Microsoft boasts a rate of ~20,000 storage transactions per second, so I've been a little disappointed with these results.
I'd like to get some feedback from those who have really tried to push the limits of blob storage. Does ~1000 small uploads per second sound about right? Or is there possibly something else I should be doing to improve this? I'll post the code if I need to, but I'd rather not receive speculative answers, I'd like to hear from developers who can either confirm that my results are reasonable, or that they've seen much higher throughput.
I should add that I'm currently running this in a small web role. I've tried it also in a medium web role, and didn't see any significant difference.
EDIT:
After a few days of development and testing, my upload rate seemed to suddenly increase. Not by a lot, but maybe by another ~200 per second. In looking around the web, I noticed a comment in the Azure documentation stating "A storage account scales automatically as usage increases." So I'm wondering if it really is capable of much higher rates, but will not automatically scale up until it sees a sustained period of high volume. Some confirmation of that would also be greatly appreciated.
Depending on how small your requests are, the problem might be caused by Nagle's algorithm (see "Nagle's Algorithm is Not Friendly towards Small Requests"), although I usually see that with queue/table operations. Try disabling Nagle's algorithm and let me know if that makes any difference. As an FYI, you have to disable it prior to establishing the connection; otherwise the change will not take effect.
Jason
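For readers unfamiliar with the setting: Nagle's algorithm batches small TCP writes, and the standard socket option `TCP_NODELAY` turns it off. In a .NET web role the switch is presumably `ServicePointManager.UseNagleAlgorithm`, set before the first request is made; the sketch below is not that code. It is only a generic socket-level illustration of the same idea, with the option set before any connection exists, and the helper name is hypothetical.

```python
import socket

def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled up front."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_nodelay_socket()
# Read the option back to confirm it took effect (non-zero means enabled).
nagle_disabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
print("Nagle disabled:", nagle_disabled)
sock.close()
```

Setting the option at creation time means every connection made with that socket starts out without batching, which mirrors the advice to change the setting before connecting.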

Offsite backups - possible with large amounts of code/source images etc?

The biggest hurdle I have in developing an effective backup strategy is being able to do some sort of offsite backup. Unfortunately, this can only be via uploading data to the offsite source but my internet cable has upload speeds which prohibit this.
Has anyone here managed to do offsite backups of large libraries of source code?
This is only relevant to home users and not in the workplace where budgets may open up doors.
EDIT: I am using Windows Vista (So 'nix solutions aren't relevant).
Thanks
I don't think your connection's upload speed will be as prohibitive as you think. Just make sure you look for a solution where your changes can be sent as diffs. Even if your initial sync takes days, daily changes would likely be more manageable.
Knowing a few more specifics about how much data you are talking about, and exactly how slow your connection is, would allow the community to make more specific suggestions.
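The "send only what changed" idea can be sketched at file granularity: keep a manifest of content hashes from the last backup and upload only files whose hash is new or different. This is a simplified illustration, not how any particular backup product works; real tools typically go further and send block-level deltas within a file. All function names here are hypothetical.

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its content hash."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }

def files_to_upload(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Files that are new or whose content changed since the previous manifest."""
    return sorted(
        name for name, digest in current.items()
        if previous.get(name) != digest
    )
```

A backup run would build the current manifest, compare it with the one saved after the last run, upload only the returned files, and then store the new manifest. After the slow initial sync, a day's worth of source changes is usually a small fraction of the total.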
Services like Mozy allow you to back up large amounts of data offsite.
They upload slowly in the background, and getting the initial sync to the servers can take a while depending on your speed and amount of data, but after that they use efficient diffs to keep the stored data in sync.
The pricing is very home-friendly too.
I think you have to define what a backup is and what's acceptable to you.
At my house, I have a hot backup of our repositories: I poll svn once an hour over the VPN and it pulls down any check-ins. This is just to catch any check-ins that are not captured by the normal backup every 24 hours. I also send a full backup every 2 days through the pipe, so that a copy exists outside of the normal 3-tier backup we do at the office. Our current repository is 2GB zipped at max compression. That takes 34 hrs at 12 k/s and 17 hrs at 24 k/s. You did not say the speed of your connection, so it's hard to judge whether that's workable.
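To judge whether a full upload is workable on your own connection, the estimate is just size divided by rate. The sketch below assumes "k/s" means kilobytes per second and uses decimal units; both are assumptions, and the function names are illustrative. Under those assumptions a full 2 GB at 12 kB/s works out to roughly 46 hours, so the 34-hour figure presumably reflects a somewhat smaller archive or a different unit convention. The second function shows how much data a given upload window actually moves.

```python
def upload_hours(size_bytes: float, bytes_per_second: float) -> float:
    """Hours of continuous upload needed for the given size."""
    return size_bytes / bytes_per_second / 3600

def gigabytes_uploaded(hours: float, bytes_per_second: float) -> float:
    """Decimal gigabytes moved in the given number of hours."""
    return hours * 3600 * bytes_per_second / 1e9

archive_bytes = 2e9  # 2 GB zipped repository, decimal units (assumption)

for rate in (12e3, 24e3):  # 12 kB/s and 24 kB/s
    hours = upload_hours(archive_bytes, rate)
    print(f"{rate / 1e3:.0f} kB/s: about {hours:.0f} h for 2 GB")

print(f"34 h at 12 kB/s moves about {gigabytes_uploaded(34, 12e3):.2f} GB")
```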
If this isn't viable, you might want to invest in a couple of 2.5" USB drives and load/swap them offsite to a safety deposit box at the bank. This used to be my responsibility, but I lacked the discipline to do it consistently each week to ensure some safety net. In the end it was just easier to live with uploading the data to an FTP site at my house.