s3 rate limit against website endpoint - amazon-s3

I'm hitting my s3 bucket via its website endpoint with various paths/keys. I'm able to get ok (200) responses when I'm hitting it at 1,000 requests per second over the course of 5 minutes. I'm using a popular tool: https://github.com/tsenart/vegeta so I have confidence in these stats.
This is surprising considering the documentation says that anything above 800 requests per second is problematic.
Is using a website endpoint different from an API call in terms of throttling? Is 800 a real rate limit or a crude threshold?
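(For anyone who wants to reproduce a similar test without vegeta, here is a minimal Python sketch; the endpoint and key list are hypothetical placeholders, and the rate control is far cruder than vegeta's.)

```python
# Minimal load sketch (hypothetical endpoint and keys). vegeta is the better
# tool for precisely rate-limited attacks; this just illustrates hammering the
# website endpoint with varied paths and tallying the response codes.
import concurrent.futures
import random
from collections import Counter

import requests  # third-party: pip install requests

ENDPOINT = "http://my-bucket.s3-website-us-east-1.amazonaws.com"  # hypothetical
KEYS = [f"objects/{i:06d}.json" for i in range(10_000)]           # hypothetical keys

def hit_random_key(_):
    resp = requests.get(f"{ENDPOINT}/{random.choice(KEYS)}", timeout=10)
    return resp.status_code

with concurrent.futures.ThreadPoolExecutor(max_workers=100) as pool:
    statuses = Counter(pool.map(hit_random_key, range(5_000)))

print(statuses)  # e.g. Counter({200: 4987, 503: 13}) -- watch for 503 Slow Down
```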

It's a soft limit, and not really a limit from the bucket-level perspective. Read the wording carefully: the documentation warns that a rapid increase in the request rate beyond 800 requests per second may result in temporary rate limiting of your requests.
S3 increases available capacity by splitting the keyspace into partitions, and it takes some time for this to happen... but buckets do scale up with the workload.
If you are requesting the same object(s) repeatedly, you are also not likely to be imposing as much load on the available resources as you would be if you were hitting 800 unique objects per second; reading between the lines, that is the threshold under discussion: the time to look up keys in the bucket index. Recent hits are probably already more accessible than cold spots in the index.
The problem that document highlights is that if your object keys are lexically sequential, then S3 will be unable to split the partitions meaningfully, because you will always be creating new objects on only one side of the split or the other, thus working against S3's scaling algorithm.
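As a toy illustration of that last point (the split boundary below is invented; S3's real partitioning is internal and undocumented), lexically sequential keys always sort to one side of an existing split, while hash-prefixed keys scatter across the keyspace:

```python
# Toy model, not S3's actual algorithm: sequential keys pile up on one side
# of any split boundary, so the split doesn't spread the write load.
import hashlib

split_point = "logs/2017-06-15"  # invented partition boundary

for day in range(16, 20):
    key = f"logs/2017-06-{day:02d}"
    side = "after the split" if key > split_point else "before the split"
    print(f"{key} -> {side}")  # every new sequential key lands after the split

# Hash-prefixed variants of the same keys spread across the keyspace,
# so future splits can actually share the load between partitions.
for day in range(16, 20):
    raw = f"logs/2017-06-{day:02d}"
    key = hashlib.md5(raw.encode()).hexdigest()[:4] + "-" + raw
    print(f"{key} -> lands in the '{key[0]}' region of the keyspace")
```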

The documentation has been updated in the meantime and the limits have been increased. The limits are now per bucket prefix, and 1,000 req/s is no longer a problem. For more, see the documentation mentioned above.

Related

KeyStore service with hard limits

I'm putting together a datastore technology that relies on a backing key-value store, and has TypeScript generator code to populate an index made of tiny JSON objects.
With AWS S3 in mind as an example store, I am alarmed at the possibility of a bug in my Typescript index generation simply continuing to write endless entries with unlimited cost. AWS has no mechanism that I know of to defend me from bankruptcy, so I don't want to use their services.
However, the volume of data might build for certain use cases to megabytes or gigabytes, and access might be only occasional, so cheap long-term storage would be great.
What simple key-value cloud stores equivalent to S3 are there that will allow me to define limits on storage and retrieval numbers or cost? In such a service, a bug would mean I eventually start seeing e.g. 403, 429 or 507 errors as feedback that I have hit a quota, limiting my financial exposure. Preferably this would be on a per-bucket basis, rather than the whole account being frozen.
I am using S3 as a reference for the bare minimum needed to fulfil the tool's requirements, but any similar blob or object storage API (put, get, delete, list in UTF-8 order) that eventually starts rejecting requests when a quota is exceeded would be fine too.
Learning the names of qualifying systems and the terminology for their quota limit feature would give me important insights and allow me to review possible options.
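(Whichever store ends up qualifying, a client-side budget can approximate the same safety net in the meantime; below is a minimal hedged sketch with made-up names and limits, not a substitute for a provider-enforced quota.)

```python
# Hypothetical client-side write budget: refuse further writes once a
# self-imposed quota is exhausted, mimicking the 429/507 feedback you want
# the provider to enforce.
class QuotaExceeded(Exception):
    pass

class BudgetedStore:
    def __init__(self, backend, max_writes: int, max_bytes: int):
        self.backend = backend        # any object exposing put(key, value)
        self.writes_left = max_writes
        self.bytes_left = max_bytes

    def put(self, key: str, value: bytes) -> None:
        if self.writes_left <= 0 or self.bytes_left < len(value):
            raise QuotaExceeded(f"refusing to write {key!r}: budget exhausted")
        self.backend.put(key, value)
        self.writes_left -= 1
        self.bytes_left -= len(value)

# usage: store = BudgetedStore(s3_like_client, max_writes=100_000, max_bytes=10**9)
```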

Google Pub/Sub + Cloud Run scalability

I have a Python application writing Pub/Sub messages into BigQuery. The Python code uses the google-cloud-bigquery library, and the TableData.insertAll() method quota is 10,000 requests per second per table (see the Quotas documentation).
Cloud Run container autoscaling is set to 100 instances with 1,000 requests per container. So technically, I should be able to reach 10,000 requests/sec, right? With the BigQuery insert API being the biggest bottleneck.
I only reach a few hundred requests per second at the moment, with multiple services running at the same time.
CPU and RAM at 50%.
Having confirmed your project structure and the details given in the comments, I would review the Pub/Sub quotas and limits, especially the Quota and Resource limits tables, where you can check this information depending on message size; the Throughput quota units section tells you how to calculate quota usage.
I would answer your question with a yes: you should be able to reach 10,000 req/sec. And, as in this related question, depending on the byte size you can insert up to 10,000 rows per request, although the recommendation is 500.
The concurrency in Cloud Run can be modified in case you need to change it.
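To keep each insertAll call well used, the usual pattern is to buffer messages and stream them in batches; a rough sketch with google-cloud-bigquery, where the table ID and batch size are placeholders:

```python
# Rough sketch: buffer incoming Pub/Sub messages and stream them to BigQuery
# in batches of up to 500 rows (the recommended rows-per-request), so each
# insert call carries many messages instead of one.
from google.cloud import bigquery

client = bigquery.Client()
TABLE_ID = "my-project.my_dataset.my_table"  # placeholder
BATCH_SIZE = 500

buffer = []

def handle_message(message_dict):
    buffer.append(message_dict)
    if len(buffer) >= BATCH_SIZE:
        flush()

def flush():
    global buffer
    if not buffer:
        return
    # insert_rows_json uses the tabledata.insertAll streaming API under the hood
    errors = client.insert_rows_json(TABLE_ID, buffer)
    if errors:
        # surface per-row insert errors instead of silently dropping them
        raise RuntimeError(f"BigQuery insert errors: {errors}")
    buffer = []
```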

AWS DynamoDB Strange Behavior -- Provisioned Capacity & Queries

I have some strange things occurring with my AWS DynamoDB tables. To give you some context, I have several tables for an AWS Lambda function to query and modify. The source code for the function is housed in an S3 bucket. The function is triggered by an AWS API.
A few days ago I noticed a massive spike in the amount of read and write requests I was being charged for in AWS. To be specific, the number of read and write requests increased by 3,000 from what my tables usually experience (they usually have fewer than 750 requests). Additionally, I have seen similar numbers in my Tier 1 S3 requests, with an increase of nearly 4,000 requests in the past six days.
Immediately, I suspected something malicious had happened, and I suspended all IAM roles and changed their keys. I couldn't see anything in the logs from Lambda denoting it was coming from my function, nor had the API received a volume of requests consistent with what was happening on the tables or the bucket.
When I was looking through the logs on the tables, I was met with this very strange behavior relating to the provisioned write and read capacity of the table. It seems like the table's capacities are ping ponging back and forth wildly as shown in the photo.
I'm relatively new to DynamoDB and AWS as a whole, but I thought I had set the table up with very specific provisioned write and read limits. The requests have continued to come in, and I am unable to figure out where in the world they're coming from.
Would one of you AWS Wizards mind helping me solve this bizarre situation?
Any advice or insight would be wildly appreciated.
Turns out that refreshing the table view in the DynamoDB management console causes the table to be read from, hence the unexplained jump in reads. I was doing it the whole time 🤦‍♂️
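(For anyone chasing a similar mystery, pulling the table's consumed-capacity metric from CloudWatch shows exactly when the reads happened; here is a hedged boto3 sketch where the table name and time window are placeholders.)

```python
# Hedged sketch: sum ConsumedReadCapacityUnits for a table over the last day
# to see when the unexpected reads occurred (e.g. while the console was open).
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedReadCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "my-table"}],  # placeholder table
    StartTime=now - timedelta(days=1),
    EndTime=now,
    Period=3600,                # one datapoint per hour
    Statistics=["Sum"],
)

for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])
```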

How to increase Google Sheets v4 API quota limitations

The new Google Sheets API v4 currently has an unlimited read/write quota per day (which is fantastic), but is restricted to 500 reads/writes per account per 100 seconds, and 100 reads/writes per key per 100 seconds (or, I have found, multiple keys coming from the same IP). This is probably plenty for most use cases, but I have an edge case that requires bringing a frequently-updated Google Sheet with 70 tabs down to a node.js server that distributes these to users' clients every ~30-60 seconds or so (users are data annotators who are student research assistants). This wasn't so bad early in the project when there were only 20-30 tabs, but now that the data is large, the server is blowing through the 100-request quota and returning errors every 10-15 minutes.
The problem is as follows:
Frequent data updates: Only data on 1-5 of the 70 tabs is likely to be updated in any given minute, but which tabs have new data is random (so I am pulling down the whole sheet of 70 tabs = 70 reads).
Update interval: The need for updates happens randomly at about 30 second to 5-minute intervals (so some within the quota, some about 3-5x the quota).
Throttling: I have tried throttling the update to be within the 100 calls/100 seconds (my previous solution), but this introduces large usability issues, significantly decreasing usability/productivity/work quality.
Quota increase: The sheets API does not currently appear to include a way to pay to increase the quota. It does allow filling out a form to request an increase in the quota, but I'm not sure what the mean response time is on this (my request is only a few days old).
Multiple service accounts: I have tried using multiple service accounts to get the full 500 requests/100 seconds quota (rather than the per-user quota), since this is a server, but Google Sheets appears to rate-limit to 100 requests/100 seconds from a given IP.
Alternatives: I have considered that this project may have just grown beyond the size that Sheets is easily able to handle, but there do not appear to be any good, usable, self-hosted, collaborative spreadsheets with easy-to-interface-to APIs out there.
Are there settings/methods suggested to achieve the full 500 calls/100 seconds for a server?
You can request a quota update in Google Cloud Platform, and it will be increased to 2,500 per account and 500 per user (regarding your #4).
You can use spreadsheets.get to read the entire spreadsheet in a single call, rather than one call per tab. Alternatively, you can use spreadsheets.values.batchGet to read multiple different ranges in a single call, if all you need are the values.
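For reference, a rough sketch of the batched read with the Python client, where the spreadsheet ID, ranges, and credentials path are placeholders:

```python
# Rough sketch: one spreadsheets.values.batchGet call fetching several tabs
# at once, instead of one values.get call per tab.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder path
    scopes=["https://www.googleapis.com/auth/spreadsheets.readonly"],
)
service = build("sheets", "v4", credentials=creds)

SPREADSHEET_ID = "your-spreadsheet-id"            # placeholder
RANGES = ["Tab1!A1:Z", "Tab2!A1:Z", "Tab3!A1:Z"]  # placeholder tab ranges

result = service.spreadsheets().values().batchGet(
    spreadsheetId=SPREADSHEET_ID, ranges=RANGES
).execute()

for value_range in result.get("valueRanges", []):
    print(value_range["range"], len(value_range.get("values", [])), "rows")
```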
The Drive API offers "push notifications", so you can get notified when changes occur and react to those, instead of polling for them. The latency of the notifications is a little on the slow side, but it gets the job done.

Add a random prefix to the key names to improve S3 performance?

You expect this bucket to immediately receive over 150 PUT requests per second. What should the company do to ensure optimal performance?
A) Amazon S3 will automatically manage performance at this scale.
B) Add a random prefix to the key names.
The correct answer was B and I'm trying to figure out why that is. Can someone please explain the significance of B and if it's still true?
As of a 7/17/2018 AWS announcement, hashing and random prefixing the S3 key is no longer required to see improved performance:
https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance/
S3 prefixes used to be determined by the first 6-8 characters.
This changed in mid-2018; see the announcement:
https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance/
But that is a half-truth. Actually, prefixes (in the old definition) still matter.
S3 is not traditional "storage": each directory/filename is a separate object in a key/value object store, and the data has to be partitioned/sharded to scale to a quadzillion objects. So yes, this new sharding is kind of "automatic", but not really if you create a new process that writes to it with heavy parallelism across different subdirectories. Before S3 learns the new access pattern, you may run into S3 throttling while it reshards/repartitions the data accordingly.
Learning new access patterns takes time. Repartitioning of the data takes time.
Things did improve in mid-2018 (~10x throughput-wise for a new bucket with no statistics), but it's still not what it could be if the data is partitioned properly. Although to be fair, this may not apply to you if you don't have a ton of data, or if your access pattern is not hugely parallel (e.g. running a Hadoop/Spark cluster on many TBs of data in S3 with hundreds of tasks accessing the same bucket in parallel).
TLDR:
"Old prefixes" still do matter.
Write data to the root of your bucket; the first-level directory there will determine the "prefix" (make it random, for example).
"New prefixes" do work, but not initially. It takes time to adapt to the load.
PS. Another approach: you can reach out to your AWS TAM (if you have one) and ask them to pre-partition a new S3 bucket if you expect a ton of data to flood it soon.
@Tagar That's true, especially if you are not in a read-intensive scenario!
You have to read the fine print of the docs to reverse engineer how it works internally and how you are limited by the system. There is no magic!
503 Slow Down errors are typically emitted when a single shard of S3 is in a hot-spot scenario: too many requests to a single shard. What is difficult to understand is how the sharding is done internally, and that the advertised request limit is not guaranteed.
The pre-2018 behavior gives the details: it was advised to make the first 6-8 characters of the prefix random to avoid hot spots.
One can then assume that the initial sharding of an S3 bucket is done based on the first 8 characters of the prefix.
https://aws.amazon.com/blogs/aws/amazon-s3-performance-tips-tricks-seattle-hiring-event/
Post-2018: automatic sharding was put in place and AWS no longer advises worrying about the first characters of the prefix... However, from these docs:
http-5xx-errors-s3
amazon-s3-performance-tips-fb76daae65cb
One can understand that this automatic shard rebalancing can only work well if the load on a prefix is PROGRESSIVELY scaled up to the advertised limits:
If the request rate on the prefixes increases gradually, Amazon S3 scales up to handle requests for each of the two prefixes. (S3 will scale up to handle 3,500 PUT/POST/DELETE or 5,500 GET requests per second.) As a result, the overall request rate handled by the bucket doubles.
In my experience, 503s can appear well before the advertised levels, and there is no guarantee on the speed of S3's internal rebalancing.
If you are in a write-intensive scenario, for example uploading a lot of small objects, the automatic scaling won't be efficient at rebalancing your load.
In short: if you are relying on S3 performance, I advise sticking to the pre-2018 rules so that the initial sharding of your storage works immediately and does not rely on S3's auto-rebalancing algorithm.
Hash the first 6 characters of the prefix, or design a data model that balances partitions uniformly across the first 6 characters of the prefix.
Avoid small objects (target object size ~128 MB).
Because of how index lookups and writes work, using filenames that are similar or ordered can harm performance.
Adding hashes/random IDs as S3 key prefixes is still advisable to alleviate high load on heavily accessed objects.
Amazon S3 Performance Tips & Tricks
Request Rate and Performance Considerations
How to introduce randomness to S3?
Prefix folder names with random hex hashes. For example: s3://BUCKET/23a6-FOLDERNAME/FILENAME.zip
Prefix file names with timestamps. For example: s3://BUCKET/FOLDERNAME/2013-26-05-15-00-00-FILENAME.zip
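A hedged sketch of the hex-hash approach (the key layout and prefix length are judgment calls, not an official recipe):

```python
# Hedged sketch: derive a short, stable hex prefix from the object's own name
# so writes spread across the keyspace instead of piling onto one partition.
import hashlib

def randomized_key(folder: str, filename: str, prefix_len: int = 4) -> str:
    digest = hashlib.md5(f"{folder}/{filename}".encode()).hexdigest()
    return f"{digest[:prefix_len]}-{folder}/{filename}"

print(randomized_key("FOLDERNAME", "FILENAME.zip"))
# e.g. "23a6-FOLDERNAME/FILENAME.zip" -- deterministic, so reads can recompute it
```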
B is correct because, without added randomness (entropy), keys that share a common prefix (for example, a key prefixed with the current year) place all the objects close to each other in the same partition of the index. When your application experiences an increase in traffic, it will be trying to read from the same section of the index, resulting in decreased performance. So app devs add random prefixes to avoid this.
Note: AWS might have taken care of this by now, so developers may no longer need to, but I just wanted to give the correct answer to the question as asked.
As of June 2021.
As mentioned in the AWS guide Best practices design patterns: optimizing Amazon S3 performance, an application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket.
I think random prefixes will still help to scale S3 performance.
For example, if we have 10 prefixes in one S3 bucket, it can handle up to 35,000 PUT/COPY/POST/DELETE requests and 55,000 GET/HEAD requests per second.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html