KeyStore service with hard limits - amazon-s3

I'm putting together a datastore technology that relies on a backing key-value store, and has TypeScript generator code to populate an index made of tiny JSON objects.
With AWS S3 in mind as an example store, I am alarmed at the possibility of a bug in my TypeScript index generation simply continuing to write endless entries at unlimited cost. AWS has no mechanism that I know of to defend me from bankruptcy, so I don't want to use their services.
However, for certain use cases the volume of data might build up to megabytes or gigabytes, and access might be only occasional, so cheap long-term storage would be great.
What simple key-value cloud stores equivalent to S3 are there that will allow me to define limits on storage and retrieval numbers or cost? In such a service, a bug would mean I eventually start seeing e.g. 403, 429 or 507 errors as feedback that I have hit a quota, limiting my financial exposure. Preferably this would be on a per-bucket basis, rather than the whole account being frozen.
I am using S3 as a reference for the bare minimum needed to fulfil the tool's requirements, but any similar blob or object storage API (put, get, delete, list in UTF-8 order) that eventually starts rejecting requests when a quota is exceeded would be fine too.
Learning the names of qualifying systems and the terminology for their quota limit feature would give me important insights and allow me to review possible options.
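To make the required surface concrete, here is a rough TypeScript sketch of the bare-minimum API and failure mode I have in mind (the type names are only illustrative, not taken from any particular provider):

```typescript
// The minimal key-value surface the tool needs; any provider exposing an
// equivalent API with enforceable quotas would do. Names are illustrative.
interface KeyValueStore {
  put(key: string, value: Uint8Array): Promise<void>;
  get(key: string): Promise<Uint8Array | undefined>;
  delete(key: string): Promise<void>;
  // List keys beginning with `prefix`, returned in UTF-8 (lexicographic) order.
  list(prefix: string): Promise<string[]>;
}

// The behaviour I'm hoping a quota-limited service provides: once a storage
// or request quota is exhausted, requests fail with a detectable error
// (e.g. HTTP 403, 429 or 507) instead of silently accruing cost.
class QuotaExceededError extends Error {
  constructor(public readonly statusCode: number) {
    super(`Quota exceeded (HTTP ${statusCode})`);
  }
}
```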

Related

Static files as API GET targets

I'm creating a RESTful backend API for eventual use by a phone app, and am toying with the idea of making some of the API read functions nothing more than static files, created and periodically updated by my server-side code, that the app will simply GET directly.
Is this a good idea?
My hope is to significantly reduce the CPU and memory load on the server by not requiring any code to run at all for many of the API calls. However, there could potentially be a huge number of these files (at least one per user of the phone app, which will be a public app listed in the app stores that I naturally hope will get lots of downloads) and I'm wondering if that alone will lead to latency issues I'm trying to avoid.
Here are more details:
It's an Apache server
The hardware is a hosting provider's VPS with about 1 GB memory and 20 GB free disk space
The average file size (in terms of content and not disk footprint) will probably be < 1 KB
I imagine my server-side code might update a given user's data once a day or so at most.
The app will probably do GETs on these files just a few times a day. (There's no real-time interaction going on.)
I might password protect the directory the files will be in at the .htaccess level, though there's no personal or proprietary information in any of the files, so maybe I don't need to, but if I do, will that make a difference in terms of the main question of feasibility and performance?
Thanks for any help you can give me.
This is generally a good thing to do: anything that can be static rather than dynamic is a win for performance and cost (it's why we do caching!), but the main issue is with authorization (which you'll still need to do for each incoming request).
You might also want to consider using a cloud service for storage of the static data (e.g., Amazon S3 or Google Cloud Storage). There are neat ways to provide temporary authorized URLs that you can pass to users so that they can read the data for a short time and then must re-authorize to continue having access.
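For example, with the AWS SDK for JavaScript v3 a short-lived pre-signed GET URL looks roughly like this (a sketch; the bucket name and key layout are placeholders):

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: "us-east-1" });

// Return a URL the app can GET directly for the next 15 minutes,
// after which it must ask the API for a fresh one.
async function presignUserFile(userId: string): Promise<string> {
  const command = new GetObjectCommand({
    Bucket: "my-static-api-bucket",   // placeholder bucket name
    Key: `users/${userId}.json`,      // placeholder key layout
  });
  return getSignedUrl(s3, command, { expiresIn: 900 }); // seconds
}
```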

Need for metadata store while storing an object

While checking out the design of a service like pastebin, I noticed the usage of two different storage systems:
An object store (such as Amazon S3) for storing the actual "paste" data
A metadata store to store other things pertaining to that "paste" data, such as the URL hash (to access that paste data), a reference to the actual paste data, etc.
I am trying to understand the need for this metadata store.
Is this generally the recommended way? Any specific advantage we get from using the metadata store?
Do object storage systems NOT allow metadata to be stored along with the actual object in the same storage server?
Object storage systems generally do allow quite a lot of metadata to be attached to the object.
But then your metadata is at the mercy of the object store.
Your metadata search is limited to what the object store allows.
Analysis, notification (à la inotify), etc. are limited to what the object store allows.
If you wanted to move from S3 to Google Cloud Storage, or to do both, you'd have to normalize your metadata.
Your metadata size is limited by what the object store allows.
You can't do cross-object-store metadata (e.g. a link that refers to multiple paste data).
You might not be able to have binary metadata.
Typically, metadata is both very important and very heavily used by the business, so it has different usage characteristics than the data, and it makes sense to put it on storage with different characteristics.
I can't find anywhere how pastebin.com makes money, so I don't know how heavily they use metadata, but merely the lookup, the translation between URL and paste data, is not something you can do securely with object storage alone.
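To make the split concrete, here is a rough sketch of the two stores side by side (TypeScript, with hypothetical interfaces; the metadata could live in any database):

```typescript
// Hypothetical sketch of the two-store split for a pastebin-like service.
// The blob body goes to the object store; everything you want to query,
// index or evolve independently goes to the metadata store.
interface PasteMetadata {
  urlHash: string;    // short hash used in the public URL
  objectKey: string;  // reference to the blob in the object store
  createdAt: string;  // ISO timestamp
  sizeBytes: number;
}

interface ObjectStore {
  put(key: string, body: Uint8Array): Promise<void>;
  get(key: string): Promise<Uint8Array>;
}

interface MetadataStore {
  save(meta: PasteMetadata): Promise<void>;
  findByUrlHash(urlHash: string): Promise<PasteMetadata | undefined>;
}

async function createPaste(
  objects: ObjectStore,
  metadata: MetadataStore,
  urlHash: string,
  body: Uint8Array,
): Promise<void> {
  const objectKey = `pastes/${urlHash}`;
  await objects.put(objectKey, body);
  await metadata.save({
    urlHash,
    objectKey,
    createdAt: new Date().toISOString(),
    sizeBytes: body.byteLength,
  });
}
```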
Great answer above, just to add on - two more advantages are caching and scaling up both storage systems individually.
If you just use an object store and, say, a paste is 5 MB, would you cache all of it? A metadata store also lets you improve UX by caching, say, the first 10 or 100 KB of a paste for the user to preview while the complete object is fetched in the background (see the sketch below). This upper bound also helps you design the cache deterministically.
You can also scale the object store and the metadata store independently of each other as per performance/capacity needs. Lookups in the metadata store will also be quicker since it's less bulky.
Your concern that separating the storage into two systems (or mediums) adds some latency is legitimate, but it's always a compromise in system design; there is hardly a win-win situation.
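As a sketch of the preview idea: an object store that supports ranged reads (S3 does, via the HTTP Range header) lets you fetch just the first chunk of a paste. Roughly, with the AWS SDK v3 (the bucket name is a placeholder):

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

// Fetch only the first `previewBytes` of a paste for a fast preview,
// while the full object can be streamed separately in the background.
async function fetchPreview(key: string, previewBytes = 100 * 1024): Promise<Uint8Array> {
  const res = await s3.send(
    new GetObjectCommand({
      Bucket: "paste-objects",               // placeholder bucket
      Key: key,
      Range: `bytes=0-${previewBytes - 1}`,  // HTTP range request
    }),
  );
  return res.Body!.transformToByteArray();
}
```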

Add a random prefix to the key names to improve S3 performance?

You expect this bucket to immediately receive over 150 PUT requests per second. What should the company do to ensure optimal performance?
A) Amazon S3 will automatically manage performance at this scale.
B) Add a random prefix to the key names.
The correct answer was B and I'm trying to figure out why that is. Can someone please explain the significance of B and if it's still true?
As of a 7/17/2018 AWS announcement, hashing and random prefixing the S3 key is no longer required to see improved performance:
https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance/
S3 prefixes used to be determined by the first 6-8 characters; this changed in mid-2018 - see the announcement:
https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance/
But that is a half-truth. Actually, prefixes (in the old definition) still matter.
S3 is not traditional “storage” - each directory/filename is a separate object in a key/value object store. The data also has to be partitioned/sharded to scale to a quadzillion objects. So yes, this new sharding is kind of “automatic”, but not really if you create a new process that writes to it with crazy parallelism to different subdirectories. Before S3 learns from the new access pattern, you may run into S3 throttling before it reshards/repartitions the data accordingly.
Learning new access patterns takes time. Repartitioning of the data takes time.
Things did improve in mid-2018 (~10x throughput-wise for a new bucket with no statistics), but it's still not what it could be if the data is partitioned properly. Although to be fair, this may not apply to you if you don't have a ton of data, or your access pattern is not hugely parallel (e.g. running a Hadoop/Spark cluster on many TBs of data in S3 with hundreds+ of tasks accessing the same bucket in parallel).
TLDR:
"Old prefixes" still do matter.
Write data to the root of your bucket; the first-level directory there will determine the "prefix" (make it random, for example).
"New prefixes" do work, but not initially. It takes time for S3 to adapt to the load.
PS. Another approach - you can reach out to your AWS TAM (if you have one) and ask them to pre-partition a new S3 bucket if you expect a ton of data to be flooding it soon.
#tagar That's true, especially if you are not in a read-intensive scenario!
You have to read the fine print of the docs to reverse-engineer how it works internally and how you are limited by the system. There is no magic!
503 Slow Down errors are typically emitted when a single shard of S3 is in a hot-spot scenario: too many requests to a single shard. What is difficult to understand is how sharding is done internally, and that the advertised request limit is not guaranteed.
Pre-2018 behavior gives the details: it was advised to make the first 6-8 characters of the prefix random to avoid hot spots.
One can then assume that the initial sharding of an S3 bucket is done based on the first 8 characters of the prefix.
https://aws.amazon.com/blogs/aws/amazon-s3-performance-tips-tricks-seattle-hiring-event/
Post-2018: automatic sharding was put in place and AWS no longer advises bothering about the first characters of the prefix... However, from these docs:
http-5xx-errors-s3
amazon-s3-performance-tips-fb76daae65cb
One can understand that this automatic shard rebalancing can only work well if load to a prefix is PROGRESSIVELY scaled up to advertised limits:
If the request rate on the prefixes increases gradually, Amazon S3 scales up to handle requests for each of the two prefixes. (S3 will scale up to handle 3,500 PUT/POST/DELETE or 5,500 GET requests per second.) As a result, the overall request rate handled by the bucket doubles.
From my experience, 503s can appear way before the advertised levels, and there is no guarantee on the speed of the rebalancing done internally by S3.
If you are in a write-intensive scenario, for example uploading a lot of small objects, the automatic scaling won't be efficient at rebalancing your load.
In short: if you are relying on S3 performance, I advise sticking to the pre-2018 rules so that the initial sharding of your storage works immediately and does not rely on S3's auto-rebalancing algorithm:
hash the first 6 characters of the prefix, or design a data model that balances partitions uniformly across the first 6 characters of the prefix
avoid small objects (target object size ~128 MB)
Because of how lookups and writes work, using filenames that are similar or ordered can harm performance.
Adding hashes/random IDs as prefixes to the S3 key is still advisable to alleviate high loads on heavily accessed objects.
Amazon S3 Performance Tips & Tricks
Request Rate and Performance Considerations
How to introduce randomness to S3?
Prefix folder names with random hex hashes. For example: s3://BUCKET/23a6-FOLDERNAME/FILENAME.zip
Prefix file names with timestamps. For example: s3://BUCKET/FOLDERNAME/2013-26-05-15-00-00-FILENAME.zip
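As a rough TypeScript sketch of the hashing approach (the key layout is illustrative only):

```typescript
import { createHash } from "crypto";

// Derive a short, stable hex prefix from the object's own name so that
// keys spread across partitions instead of clustering under one prefix.
function randomizedKey(folder: string, fileName: string): string {
  const prefix = createHash("md5").update(fileName).digest("hex").slice(0, 4);
  return `${prefix}-${folder}/${fileName}`;
}

// Example shape of the result: "<4 hex chars>-FOLDERNAME/FILENAME.zip"
```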
B is correct because, without added randomness (entropy), keys that share a common prefix (for example, a key prefixed with the current year) are placed close to each other in the same partition of the index. When your application experiences an increase in traffic, it will be trying to read from the same section of the index, resulting in decreased performance. So app devs add random prefixes to avoid this.
Note: AWS might have taken care of this by now, so devs won't need to, but I just wanted to give the correct answer for the question as asked.
As of June 2021.
As mentioned on AWS guidebook Best practice design pattern: optimizing Amazon S3 performance, the application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket.
I think the random prefix will help to scale S3 performance.
For example, if we have 10 prefixes in one S3 bucket, it can handle up to 35,000 PUT/COPY/POST/DELETE requests and 55,000 GET/HEAD requests per second.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html

Microsoft Azure Blob Storage Upload Performance

I am running an Azure web role, which is storing very small blobs into Azure storage. (Blob upload is being done from the server, not from the browser.) I have searched stack overflow and the rest of the internet for tips on optimizing blob storage performance, and I believe I've checked and implemented all of the usual suspects: uploading async, allowing unlimited outgoing web connections (which now seems to be the default setting on web roles and no longer needs to be explicitly set in web.config or in code).
Tweaking the number of concurrent uploads I allow makes some difference, but regardless of what I've tried, I seem to max out at around 1,000 blob uploads per second. This is when running in the Azure web role, in the same region as the storage account (East US). My rate when running this from home over a good internet connection isn't much less, ~700 blobs/sec, which seems to tell me that it's not the network latency that's limiting the rate, it's the actual processing time of the storage service.
I wouldn't normally consider these rates horrible for this kind of a service, but I've read that Microsoft boasts a rate of ~20,000 storage transactions per second, so I've been a little disappointed with these results.
I'd like to get some feedback from those who have really tried to push the limits of blob storage. Does ~1000 small uploads per second sound about right? Or is there possibly something else I should be doing to improve this? I'll post the code if I need to, but I'd rather not receive speculative answers, I'd like to hear from developers who can either confirm that my results are reasonable, or that they've seen much higher throughput.
I should add that I'm currently running this in a small web role. I've tried it also in a medium web role, and didn't see any significant difference.
EDIT:
After a few days of development and testing, my upload rate seemed to suddenly increase. Not by a lot, but maybe by another ~200 per second. In looking around the web, I noticed a comment in the Azure documentation stating "A storage account scales automatically as usage increases." So I'm wondering if it really is capable of much higher rates, but will not automatically scale up until it sees a sustained period of high volume. Some confirmation of that would also be greatly appreciated.
Depending on how small your requests are, the problem might be caused by Nagle’s algorithm not being friendly towards small requests (see "Nagle’s Algorithm is Not Friendly towards Small Requests"), although usually I see that with queue/table operations. Try disabling Nagle's and let me know if that makes any difference. As an FYI, you have to disable it prior to establishing the connection, otherwise the change will not take effect.
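For illustration only: if you were doing the uploads from Node.js rather than the .NET storage client (where the usual approach is setting ServicePointManager.UseNagleAlgorithm = false before the first request), the equivalent knob is TCP_NODELAY on the socket. A rough TypeScript sketch:

```typescript
import * as https from "https";

// Sketch: disable Nagle's algorithm (enable TCP_NODELAY) on the sockets
// used for small blob uploads, before any data is written over the wire.
const agent = new https.Agent({ keepAlive: true });

function putSmallBlob(url: string, body: Buffer): Promise<number> {
  return new Promise((resolve, reject) => {
    const req = https.request(
      url,
      { method: "PUT", agent, headers: { "Content-Length": body.length } },
      (res) => {
        res.resume();                  // drain the response body
        resolve(res.statusCode ?? 0);
      },
    );
    req.on("socket", (socket) => socket.setNoDelay(true)); // TCP_NODELAY
    req.on("error", reject);
    req.end(body);
  });
}
```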
Jason

Has anyone ever reached a read or write upper-bound for an Amazon S3 bucket?

Are there known limitations of S3 scaling? Anyone ever had so many simultaneous reads or writes that a bucket started returning errors? I'm a bit more interested in writes than reads because S3 is likely to be optimized for reads.
Eric's comment sums it up already on a conceptual level, as addressed in the FAQ What happens if traffic from my application suddenly spikes? as well:
Amazon S3 was designed from the ground up to handle traffic for any Internet application. [...] Amazon S3’s massive scale enables us to spread load evenly, so that no individual application is affected by traffic spikes.
Of course, you still need to account for possible issues and Tune [your] Application for Repeated SlowDown errors (see Amazon S3 Error Best Practices):
As with any distributed system, S3 has protection mechanisms which detect intentional or unintentional resource over-consumption and react accordingly. SlowDown errors can occur when a high request rate triggers one of these mechanisms. Reducing your request rate will decrease or eliminate errors of this type. Generally speaking, most users will not experience these errors regularly; however, if you would like more information or are experiencing high or unexpected SlowDown errors, please post to our Amazon S3 developer forum http://developer.amazonwebservices.com/connect/forum.jspa?forumID=24 or sign up for AWS Premium Support http://aws.amazon.com/premiumsupport/. [emphasis mine]
While rare, these slow downs do happen of course - here is an AWS team response illustrating the issue (pretty dated though):
Amazon S3 will return this error when the request rate is high enough that servicing the requests would cause degraded service for other customers. This error is very rarely triggered. If you do receive it, you should exponentially back off. If this error occurs, system resources will be reactively rebalanced/allocated to better support a higher request rate. As a result, the time period during which this error would be thrown should be relatively short. [emphasis mine]
Your assumption about read vs. write optimization is confirmed there as well:
The threshold where this error is trigged varies and will depend, in part, on the request type and pattern. In general, you'll be able to achieve higher rps with gets vs. puts and with lots of gets for a small number of keys vs. lots of gets for a large number of keys. When geting or puting a large number of keys you'll be able to achieve higher rps if the keys are in alphanumeric order vs. random/hashed order.
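In practice, "exponentially back off" amounts to something like the following (a TypeScript sketch; the retry parameters are arbitrary, and the official AWS SDKs already implement this kind of retry for you):

```typescript
// Sketch of exponential backoff with jitter for S3 "SlowDown" (HTTP 503)
// responses, for a hand-rolled client.
async function withBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 100,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err: any) {
      const retryable = err?.statusCode === 503 || err?.name === "SlowDown";
      if (!retryable || attempt + 1 >= maxAttempts) throw err;
      // Exponential growth with full jitter: 0..(base * 2^attempt) ms.
      const delay = Math.random() * baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```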