Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
We are gathering information about different systems. At the moment we are looking for a storage solution; we will have a high volume of outgoing traffic with large files.
I want to compare S3 with Google Cloud Storage.
Google Cloud Storage costs around $0.08/GB at the 90 TB tier; S3 is around $0.06. But Google Cloud Storage already includes a CDN, which makes it considerably cheaper than Amazon S3 combined with CloudFront.
Now I read somewhere that Google Cloud Storage is much slower than S3 with very large files. Is this true?
I cannot find any information.
What alternatives do I have if I have a high volume of outgoing traffic and large files?
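For a rough sense of what per-GB rates mean at this scale, here is a back-of-the-envelope sketch. The $0.08 and $0.06 figures are the ones quoted in the question and are long outdated; real pricing is tiered, so treat them as placeholders:

```python
def flat_cost(gb: float, rate_per_gb: float) -> float:
    """Cost at a flat per-GB rate (real cloud pricing is tiered)."""
    return gb * rate_per_gb

# Rates quoted in the question, applied to 90 TB of traffic:
gcs = flat_cost(90_000, 0.08)  # $7,200
s3 = flat_cost(90_000, 0.06)   # $5,400
print(f"difference per month: ${gcs - s3:,.0f}")  # prints "difference per month: $1,800"
```

At these volumes a two-cent difference per GB is $1,800 a month, which is why the CDN question dominates the raw storage price.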
Edit:
benchmarks:
http://blog.zencoder.com/2012/07/23/first-look-at-google-compute-engine-for-video-transcoding/
2016 update - Look at this benchmark comparing AWS, Azure, and Google Cloud storage:
http://blog.zachbjornson.com/2015/12/29/cloud-storage-performance.html
As a reddit user commented:
"Not for network throughput on GCP. It consistently beats out all of the competition by such a huge margin that if you're writing an app that's latency/bandwidth sensitive on the network side, just ignore the competition."
For current pricing schemes, compare:
https://developers.google.com/storage/pricing#pricing
https://aws.amazon.com/s3/pricing/
Performance is trickier to evaluate: You can get numbers, but those numbers will depend a lot on current conditions, network, proximity to datacenters, ratio of read/writes, etc.
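Because the numbers depend so heavily on current conditions, it is worth measuring from your own location rather than trusting a published benchmark. A minimal, provider-agnostic sketch (the function name and injectable-clock design are my own, not any SDK's API; you would wrap a real S3 or GCS download call in the `fetch` callable):

```python
import time
from statistics import median

def measure_throughput(fetch, size_bytes, runs=5, clock=time.perf_counter):
    """Time `fetch()` several times and return the median rate in MB/s.

    `fetch` is any zero-argument callable that downloads `size_bytes`
    bytes, e.g. a boto3 get_object or a GCS blob download wrapped in a
    lambda. The clock is injectable so the helper is easy to test.
    """
    rates = []
    for _ in range(runs):
        start = clock()
        fetch()
        elapsed = clock() - start
        rates.append(size_bytes / elapsed / 1e6)  # bytes/s -> MB/s
    return median(rates)
```

Using the median rather than the mean keeps a single slow run (cold cache, transient congestion) from skewing the result.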
Related
Closed 5 years ago.
I'm comparing the cost of services like Google Cloud SQL to launching your own VM in the cloud with whatever SQL version you'd like.
VM instance only vs Cloud SQL only
I'm quite surprised by the results I got: Cloud SQL is more than twice the price for the same underlying system (8 vCPUs, 52 GB RAM, 2 TB storage).
So basically, you pay more to have less, and I was expecting the contrary...
Granted, you don't have to deal with maintenance and automating backups yourself, but I find the price difference ridiculous.
So my question is: when should I consider using Cloud SQL instead of running my own specialized VM?
Right now, I feel like this service is just a fancy way to milk money from the client.
Note: I took the Google Cloud example, but the result is the same with other cloud providers.
The tl;dr answer here is that a VM is very different from a fully-managed service. It's like comparing apples and oranges, honestly.
When you create a VM, you have a VM. You can do whatever you want with it, but it's just a VM. That VM may be subject to restarts, must be totally configured by you in many cases, is not redundant, has no (added) security layer, etc.
As a managed service, Cloud SQL (and other managed services) offer many things way beyond what you can do on just a VM. You mention a fraction of them, such as backups. With a managed service you're getting a ton of other things which really matter to most people, such as:
Updates, upgrades
Better performance (in your example, the IOPS of Persistent Disk and Cloud SQL do not match)
Support for the service
Added security
An IAM layer
Integrations with other services
No need to "build it yourself"
etc...
While a (very) small minority of people may want to roll their own, it's generally a waste of time and a heck of a lot riskier than using a managed service. I think if you asked most any business customer, the cost of a managed service pales in comparison to paying a fleet of people to replicate the benefits you get from one.
This is true for GCP, AWS, and Azure.
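The "twice the price" comparison changes once you count the labor needed to replicate the managed features yourself. A back-of-the-envelope sketch; every dollar figure and hour count below is a made-up placeholder, not a real GCP price or staffing estimate:

```python
def monthly_tco(infra_cost: float, ops_hours: float, hourly_rate: float) -> float:
    """Total cost of ownership: infrastructure plus the labor to run it."""
    return infra_cost + ops_hours * hourly_rate

# Placeholder numbers: a self-managed VM is cheaper on paper but needs
# someone patching, backing up, and monitoring it every month.
diy = monthly_tco(infra_cost=600, ops_hours=20, hourly_rate=80)      # 2200.0
managed = monthly_tco(infra_cost=1400, ops_hours=2, hourly_rate=80)  # 1560.0
print(diy, managed)
```

Under these (invented) assumptions the managed service comes out cheaper despite the higher sticker price; the crossover point depends entirely on how much operations time you actually spend.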
Closed 10 years ago.
We're uploading and serving/streaming media (pictures, videos) using Amazon S3 for storage combined with CloudFront for serving. The site is only lightly used, but the Amazon bill comes to $3,000 per month, and according to the billing report 90% of the cost originates from the S3 service.
I've heard that the cloud can be expensive if you don't code the right way. Now my questions:
What is the right way? And where should I pay more attention: to the way I upload files or to the way I serve them?
Has anyone else had to deal with unexpectedly high costs? If so, what was the cause?
We have an almost similar model: we stream (RTMP) from S3 and CloudFront. We have thousands of files and a decent load, but our monthly bill for S3 is around $50 (negligible compared to your figure). First, you should raise your charges with AWS technical support; they always give a good response and also suggest better ways to utilize resources. Second, if you do live streaming, where you divide the file into chunks and stream them one by one instead of streaming or downloading the whole file, it can be effective in terms of I/O when users are not watching the whole video but just part of it. Also, you can try to utilize caching at the application level.
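The chunked-streaming idea above can be sketched as a small helper that computes the byte offsets for HTTP Range requests, so a client only pays for the part of the video it actually fetches. The function name is my own, not an AWS API:

```python
def chunk_ranges(total_size: int, chunk_size: int):
    """Yield (start, end) byte offsets (inclusive, as HTTP Range uses)
    covering a file of `total_size` bytes in `chunk_size` pieces."""
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield (start, end)
        start = end + 1

# A client would send each pair as a header, e.g.
#   {"Range": f"bytes={start}-{end}"}
# and stop requesting chunks when the viewer stops watching.
print(list(chunk_ranges(10, 4)))  # prints "[(0, 3), (4, 7), (8, 9)]"
```

S3 and CloudFront both honor Range requests, so a viewer who abandons a video after the first chunk costs you one chunk of transfer, not the whole file.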
Another way to get a better picture of what's going on in your buckets: Qloudstat
Closed 2 years ago.
I need to store logs in a distributed file system.
Let's say that I have many types of logs. Each log type is recorded in a file. But this file can be huge, so it must be distributed across many nodes (with replication for data durability).
These files must support append/get operations.
Is there a distributed system that achieves my needs?
Thanks!
I would recommend Flume, a log pulling infrastructure from the folks at Cloudera:
http://github.com/cloudera/flume
You can also try out Scribe from Facebook:
http://github.com/facebook/scribe
Combine a NAS with a NoSQL database like MongoDB and you'll have something distributed, large, and fault-tolerant.
Of course, without more specific details, like how much data there is and the structure of the logs (or lack thereof), it's really hard to recommend a real product.
For example, if by "huge" you really mean 2 TB or less, and the data is highly structured, then a regular SQL server in a two-machine clustered environment for failover will do just fine.
However, if by "huge" you mean exabyte level or more, and/or unstructured data, then you need several large (and very expensive) NAS devices, on which you run a set of NoSQL databases clustered for failover and/or multi-master replication...
You can use Logstash to collect the logs and centralize them with an Elasticsearch cluster. The local logs could be rolling log files, so that they remain small.
Further you can use Graylog2 to analyze and view your logs.
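The "rolling log files" idea mentioned above is built into Python's standard library. A minimal sketch, assuming you ship the rolled files to Logstash/Elasticsearch separately; the file name and size limits are arbitrary choices:

```python
import logging
from logging.handlers import RotatingFileHandler

# Roll over at ~1 MB and keep 5 backups (app.log.1 ... app.log.5), so
# each local file stays small enough to ship and index comfortably.
handler = RotatingFileHandler("app.log", maxBytes=1_000_000, backupCount=5)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

log = logging.getLogger("app")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.info("this line goes to app.log and rolls over automatically")
```

The application only ever appends; rotation, retention, and the append/get split across a cluster are then handled by the shipper and Elasticsearch rather than by the writer.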
Closed 4 years ago.
I am creating a web-based RESTful service and want to cloud-enable it for scalability.
I don't want to get locked into one cloud provider, though. I'd like to be able to switch between GoGrid, Amazon EC2, etc. as pricing and needs evolve.
Is there a common API to control the launch, monitoring, and shutdown of cloud resources?
I've seen RightScale, but their pricing is just from another planet.
Similarly, is there a common API for cloud storage?
Take a look at libcloud
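The pattern libraries like libcloud use is one driver interface with a concrete driver per provider. Here is an illustrative pure-Python sketch of that idea; the class and method names below are invented for illustration and are not libcloud's actual API:

```python
from abc import ABC, abstractmethod

class NodeDriver(ABC):
    """One interface; each cloud provider gets its own implementation."""

    @abstractmethod
    def create_node(self, name: str) -> str: ...

    @abstractmethod
    def destroy_node(self, node_id: str) -> None: ...

class DummyDriver(NodeDriver):
    """In-memory stand-in; a real EC2 or GoGrid driver would call the
    provider's HTTP API here instead of touching a dict."""

    def __init__(self):
        self.nodes = {}

    def create_node(self, name):
        node_id = f"node-{len(self.nodes)}"
        self.nodes[node_id] = name
        return node_id

    def destroy_node(self, node_id):
        del self.nodes[node_id]

# Application code depends only on NodeDriver, so switching providers
# means swapping the driver object, not rewriting the application.
driver: NodeDriver = DummyDriver()
nid = driver.create_node("web-1")
```

This is exactly the lock-in escape hatch the question asks for: the provider-specific details live behind the interface.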
If you are working with Scala or Java, you can also check out jclouds (http://groups.google.com/group/jclouds)
For Java there is:
typica
jclouds
Dasein
AWS SDK for Java
Which platform are you looking for an API for? Our SecureBlackbox product offers the CloudBlackbox package of components for uniform access to various cloud storage services. Currently supported are S3, Azure, and Google (APIs for other services are possible on demand).
We don't have a uniform API for computational functionality (at least at the moment).
Closed 2 years ago.
I need to track unique visitor counts in my web application. I would really like to use Google Analytics, but due to the load limitations Google imposes I will not be able to use it: I am expecting WAY over 10,000 requests a day, which is the limit the Google Analytics API imposes. Is there another company, paid or free, with the same features as Google Analytics?
There definitely are.
Here are two open source and free solutions that are very polished:
Piwik - Designed as a direct competitor to Google Analytics (it looks just as nice) that you host on your own servers
Open Web Analytics
The 10,000-request limit applies to the Data API, not to the actual data collection.
That is, you can have an unlimited number of users visiting your website; it's only when you use the API to extract data from their database that you are limited to 10k requests a day.
Check this link for more details
The biggest, most obvious, most usual alternative is to simply do it yourself. Your webserver needs to log requests for security etc. anyway, so it's not a big deal to run something like webalizer on those logs. You won't get the quick, easy access to advanced information like paths users take through the site, but that can be determined if you care enough. You do gain one huge benefit, though: privacy of your own data.
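The do-it-yourself approach above can be sketched in a few lines: parse combined-format access logs and count distinct client IPs per day. IP-based uniqueness is crude (NAT and proxies undercount and overcount), but it needs no third-party service and the regex below is the only assumption:

```python
import re
from collections import defaultdict

# Matches the client IP and the date from an Apache/Nginx combined-format line.
LINE = re.compile(r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4})')

def unique_visitors(lines):
    """Return {day: number of distinct client IPs seen that day}."""
    seen = defaultdict(set)  # day -> set of client IPs
    for line in lines:
        m = LINE.match(line)
        if m:
            ip, day = m.groups()
            seen[day].add(ip)
    return {day: len(ips) for day, ips in seen.items()}

sample = [
    '1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512',
    '1.2.3.4 - - [10/Oct/2023:13:56:01 +0000] "GET /a HTTP/1.1" 200 99',
    '5.6.7.8 - - [10/Oct/2023:14:01:00 +0000] "GET / HTTP/1.1" 200 512',
]
print(unique_visitors(sample))  # prints "{'10/Oct/2023': 2}"
```

Run it over the rotated access logs you already keep, and the data never leaves your servers.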
We use Omniture here but it'll cost you.
There is SpeedTrap, a Java-based analytics package. Our company used it for years before they turned into cheap **ards and decided Google Analytics was more cost-effective (because it was free). But that's a story for another night.