Why is the send rate lower than the configured rate in config.yaml (Hyperledger Caliper) even when using only one client?

I configured the send rate at 500 TPS and I am using only one client, so the send rate should be around 500 TPS, but in the generated report the send rate is around 130-140 TPS. Why is there so much deviation?
I am using the fabric ccp version of Caliper.
I expected a send rate of around 450-480 TPS, but the actual send rate is around 130-140 TPS.

Node.js executes JavaScript on a single thread (async/await just means deferred execution, not parallel execution). Caliper runs a loop with the following steps:
It waits for the rate controller to enable the next TX.
It creates an async operation in which the user module will call the blockchain adapter.
All of the pending TXs eat up some CPU time (when not waiting for I/O), plus other operations are also scheduled (like sending updates about TXs to the master process).
To reach 500 TPS, the rate controller must enable a TX every 2 ms. That's not a lot of time. Try spawning more than one local client, so the load will be shared among them (100 TPS/client for 5 clients, 50 TPS/client for 10 clients, etc.).
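The arithmetic behind that suggestion can be sketched as follows. This is only an illustration of the time budget per TX, not Caliper code; the function name per_client_budget is hypothetical.

```python
def per_client_budget(total_tps: float, clients: int) -> tuple[float, float]:
    """Return (TPS each client must send, milliseconds available per TX)."""
    tps_per_client = total_tps / clients
    interval_ms = 1000.0 / tps_per_client
    return tps_per_client, interval_ms


# Target rate taken from the question; client counts from the answer.
for clients in (1, 5, 10):
    tps, interval = per_client_budget(500, clients)
    print(f"{clients:>2} client(s): {tps:.0f} TPS each, one TX every {interval:.0f} ms")
```

With one client the loop has about 2 ms per TX; with ten clients each loop has about 20 ms, which leaves far more room for the pending TXs and bookkeeping that compete for the same thread.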

Related

Rate Limit Pattern with Redis - Accuracy

Background
I have an application that sends HTTP requests to external servers. The application communicates with other services that enforce a strict rate limit policy, for example 5 calls per second. Any call above the allowed rate gets a 429 error code.
The application is deployed in the cloud and runs as multiple instances. The tasks come from a shared queue.
The allowed rate limit is synced using the Redis Rate Limit pattern.
My current implementation
Assuming that the rate limit is 5 per second: I split time into multiple "windows". Each window has a maximum of 5 calls. Before each call I check whether the counter is less than 5. If yes, I fire the request. If no, I wait for the next window (after a second).
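The windowing described above can be sketched roughly like this. It is a single-process illustration in which a plain dictionary stands in for the Redis counters; the class name FixedWindowLimiter and its methods are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
class FixedWindowLimiter:
    """In-memory stand-in for the per-window counter (assumption: no Redis here)."""

    def __init__(self, limit: int, window_seconds: float = 1.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counters: dict[int, int] = {}  # window index -> calls already made

    def try_acquire(self, now: float) -> bool:
        """Return True if a call may be fired in the window containing `now`."""
        window = int(now // self.window_seconds)
        used = self.counters.get(window, 0)
        if used >= self.limit:
            return False  # caller should wait for the next window
        self.counters[window] = used + 1
        return True


limiter = FixedWindowLimiter(limit=5)
# Seven attempts inside the same one-second window: only the first five pass.
print([limiter.try_acquire(10.0 + i * 0.1) for i in range(7)])
# The next window starts with a fresh counter.
print(limiter.try_acquire(11.0))
```

Note that the window is chosen from the timestamp at which the check starts, so a slow check can still be answered for a window that has already ended.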
The problem
In order to sync the application instances through Redis, I need two Redis calls: INCR and EXPIRE. Let's say that each call can take around 250 ms to return, so the check takes ~500 ms. As a result, in some cases you are checking an old window, because by the time you get the answer the current second has changed. If the next second then brings another 5 quick calls, the server will respond with 429.
Question
As you can see, this pattern does not really ensure that the rate of my application stays at or below 5 calls/second.
How do you recommend doing it right?

Understanding why you would want to process Message Queues at a future time

So I'm trying to understand what practical problems Queues solve. From reading what I could find on Google, I get the high-level idea:
Push a message to a Queue for processing at a later time
I'm looking at an architecture from Company A, and they have different use cases for Job Queueing, for example:
chat messages
file conversion
searching
heavy SQL queries
Why process it at a later time?
Here's my best guess...
Let's say I have an application that can process 10 "things" at a time.
My application then maxes out its processing capacity.
An 11th request comes in, so the app puts it in the Queue for later processing.
Assuming this is a valid Use Case, wouldn't adding more servers to process more "things" make sense? Is it because it's more costly to add more servers than employ a Queue and sacrifice response time a little bit?
Given my Use Case examples, what other problems would Queues solve for them?
Have you ever lined up at a bank when it is busy? You would have waited in a queue.
"But," you could say, "wouldn't adding more staff to process more customers make sense? Is it because it's more costly to add more staff than employ a Queue and sacrifice response time a little bit?"
That would be correct. It can be quite costly to staff a bank based on the peak number of customers who would arrive each day. It is cheaper to staff below this level and have some customers wait in a queue.
Also, the number of customers each day is not 100% predictable. A queue allows excess demand to wait without breaking the system.
Queues enable decoupling.
For example, imagine an online store where customers purchase an item. They select the item, provide a credit card number and click 'Purchase'. If the credit card is declined, the online store can immediately prompt them to re-enter the number. This interaction has to take place immediately while the customer is still online.
However, there is no need to have the customer wait while an invoice is generated, a record is added to the accounting system and inventory is pulled off the shelf. This can be decoupled from the ordering process. A good way to do it is to push the order into a queue, which can be handled by the next system.
If that 'next system' happens to be offline at the moment, there is no reason to cancel the whole sale. The transaction can be processed when the 'next system' comes back online. This is much better than failing the whole process just because one component (which is not required immediately) has a failure.
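A minimal sketch of that decoupling, using Python's standard queue module as a stand-in for a real message queue. The function names checkout and drain, and the idea of passing an "online" flag, are assumptions made for illustration only.

```python
import queue


def checkout(orders: queue.Queue, order_id: int, card_ok: bool) -> str:
    """Synchronous part: the customer only waits for the card check."""
    if not card_ok:
        return "declined"  # prompt the customer to re-enter the number
    orders.put(order_id)   # invoicing, accounting, inventory happen later
    return "accepted"


def drain(orders: queue.Queue, next_system_online: bool) -> list[int]:
    """Asynchronous part: handle queued orders once the next system is up."""
    processed: list[int] = []
    while next_system_online and not orders.empty():
        processed.append(orders.get())
    return processed


orders = queue.Queue()
print(checkout(orders, 1, card_ok=True))        # accepted
print(checkout(orders, 2, card_ok=False))       # declined
print(drain(orders, next_system_online=False))  # [] (order 1 simply waits)
print(drain(orders, next_system_online=True))   # [1]
```

The sale for order 1 succeeds even while the next system is offline; the order stays in the queue and is picked up on a later pass.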
Bottom line: Queues are excellent. They enable better handling of failures. They make things more resilient (just wait a few minutes and try again!). They should be used at all times when the process is compatible with a queuing architecture.
Let's do scenarios
Scenario 1 without queue:
you request an endpoint /blabla/do-eveything/
this request does:
download an image from a very slow FTP server
e.g. 1.5 sec (can error; retry? add +X sec)
attach the image to an email
send an email (3 sec)
e.g. 1 sec (can error; retry? add +X sec)
confirmation received > store the confirmation with a third-party tracking service
e.g. 1.5 sec (can error; retry? add +X sec)
when tracking confirms, update your data at another third-party company for big data purposes
e.g. 2 sec (can error; retry? add +X sec)
... you get the idea
return the response e.g. 11 sec later (this is too slow), or more, or time out when everything failed
End user says the internet was faster 20 years ago, maybe I need to change my internet connection or change my 16 threads
Scenario 2 queue everything you can:
you request an endpoint /blabla/do-eveything/
this request does:
Queue job "DO_EVERYTHING"
e.g. 0.02 sec
Return the response in less than 0.250 sec
End user says this website/app is too fast, I can keep my 56K internet connection
In a queue/event system, a failed job can be retried later without affecting the end user
you can pause a job, and add an unlimited number of tasks/steps after the original message
better fault tolerance
Working with queues allows a better micro/nano service architecture and better testing, because you can test a single job instead of a full controller that does everything...
Yes, it is maybe more work and more thinking, but in the end there is no need to think about work during the holidays

RabbitMQ: basic ack takes very long time and blocks publishing

I'm using the Java Client 3.5.6 for RabbitMQ.
My use case is this:
I have 10-15 Channels to one queue (mostly the same connection, one connection per channel makes no difference).
I consume messages without autoAck. Every Channel has a prefetch / QoS size of 5000. So let's just assume I have 30 channels, so I can get 150000 messages.
Every full minute, I compute some things and, when successful, I use basicAck to acknowledge these messages.
However, the management web interface shows that 0 messages are delivered during that phase, which is not realistic unless they are somehow "blocked".
I'm using this queue on a 3-node cluster as an HA queue with TTL set to 1800 seconds. The nodes are connected via internal LAN and the machines are really powerful, with plenty of RAM.
My Question:
Why does this basicAck operation block the rest of the operations like publishing or delivering new messages?

Limiting a queue over time

I'm using an API that is limited for usage, let's say: no more than 10 calls per second, and no more than 5000 calls per day.
I am handling these calls in a beanstalkd queue job processor. How can I limit the processing of these jobs, keeping the API's limits in mind?
With Beanstalkd you can pause a tube for a certain number of seconds.
When you reserve a job and the API call fails while processing it, you can pause the tube for X seconds.
You can determine how long to pause the tube either from your API response (usually it tells you that you are locked out until time X), or start with something adaptive, such as pausing for the next 60 seconds, and increase/decrease on the go.
If you know in advance that you can delay or spread out the jobs, you can also add a delay to each job before placing it into your queue, so it won't execute immediately; this way your jobs are distributed over time.
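One way to compute such delays up front is sketched below. It only calculates the delay values (in whole seconds, which is the unit the beanstalkd put command accepts); the function name spread_delays is hypothetical, and actually submitting the jobs with your client library is left out.

```python
from collections import Counter


def spread_delays(job_count: int, max_per_second: int) -> list[int]:
    """Delay in whole seconds for each job, so that at most
    `max_per_second` jobs become ready within the same second."""
    return [index // max_per_second for index in range(job_count)]


# 25 jobs against the 10-calls-per-second limit from the question.
delays = spread_delays(25, 10)
print(Counter(delays))  # how many jobs share each delay value
```

A delay only fixes the earliest moment a job becomes ready, so workers that fall behind can still produce bursts; treat this as a way to smooth the load rather than a guarantee.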
Also there is a great post about distributed rate limiting using Redis
If all workers share durable state, they can update the shared status and collectively implement rate limiting.
If the only shared writable state is the queue itself, you could create ticketing tubes for the rate-limited jobs, and have a rate limit manager insert tickets (permission slips) to control when the jobs get run. This would need changes to the workers and a way to time out unused tickets, but it should be workable.
Edit: a "valid until" timestamp in the ticket might do it for per-second limits. Per-day limits might need a feedback tube back to let the rate limit manager know about actual usage (to implement a rolling 24 hour window instead of the 5000 all getting reset at midnight)
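A rough sketch of both ideas, kept in memory for illustration. The names Ticket, issue_tickets, may_run and RollingDailyCounter are hypothetical, and the tubes themselves are not modelled; in a real setup the tickets would travel through a ticketing tube and the usage reports through a feedback tube.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Ticket:
    valid_until: float  # timestamp after which the ticket must be discarded


def issue_tickets(now: float, per_second: int) -> list[Ticket]:
    """Manager side: one batch of tickets, each usable for one second."""
    return [Ticket(valid_until=now + 1.0) for _ in range(per_second)]


def may_run(ticket: Ticket, now: float) -> bool:
    """Worker side: only run the job if the ticket has not expired."""
    return now <= ticket.valid_until


class RollingDailyCounter:
    """Tracks actual usage over a rolling window instead of resetting at midnight."""

    def __init__(self, limit: int, window_seconds: float = 86400.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.calls: deque[float] = deque()

    def record(self, timestamp: float) -> None:
        """Called when a worker reports a call it actually made."""
        self.calls.append(timestamp)

    def remaining(self, now: float) -> int:
        """Calls still allowed in the window ending at `now`."""
        while self.calls and self.calls[0] <= now - self.window_seconds:
            self.calls.popleft()
        return self.limit - len(self.calls)


tickets = issue_tickets(now=100.0, per_second=10)
print(may_run(tickets[0], now=100.5), may_run(tickets[0], now=101.5))
```

The manager would stop issuing tickets whenever remaining() reaches zero, and resume as old calls age out of the 24-hour window.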

Difference between server hit rate and throughput in JMeter reports

I'm using JMeter to load test a web application. I also use the "jMeter Plugins" plugin to have more graphs.
My question is:
I can't understand the difference between the server hit rate (Server hits per second graph) and the throughput (Transactions per Second). The two graphs are very close, but they differ a bit in some places.
I also wonder whether "transaction" here means request, right?
Thx a lot :)
Both hits per second and throughput describe workload. Hits are the requests sent from the injector over time, while throughput is the load that the system is able to handle. Both graphs should look the same as long as the application hasn't reached its breaking point; after the breaking point the hits will continue increasing, triggering an increase in response times.
A test in which you can see the difference is the peak test (you increase load until you crash the application): when the application exceeds its throughput, the two plots will diverge.
As you can see, the blue curve differs from the green one after 650 RPS; then response times skyrocket and requests start failing.
If we let the test continue running, the injector will run out of threads and the hits curve will be the same as the throughput again. Configuring the injectors pool thread.
The area in between the two curves is the active requests: requests that the injector has sent and that are waiting to be processed.
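That relationship can be sketched numerically. The per-second samples below are made up for illustration (only the 650 RPS breaking point comes from the answer), the function name in_flight is hypothetical, and failed or dropped requests are ignored.

```python
from itertools import accumulate


def in_flight(hits_per_sec: list[int], throughput_per_sec: list[int]) -> list[int]:
    """Requests sent but not yet completed at the end of each second,
    i.e. the running area between the hits and throughput curves."""
    sent = accumulate(hits_per_sec)
    done = accumulate(throughput_per_sec)
    return [s - d for s, d in zip(sent, done)]


# Hypothetical samples: load keeps rising, throughput saturates at 650 RPS.
hits = [600, 650, 700, 750]
throughput = [600, 650, 650, 650]
print(in_flight(hits, throughput))  # [0, 0, 50, 150]
```

While the two curves coincide nothing accumulates; once hits exceed throughput, the backlog of waiting requests grows every second.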
The hits plot is measured in RPS; it counts requests, not transactions.
The same plot can be generated using JMeter's composite graph.
Server hit rate gives a graph of how many hits the server can handle each second for a single unit.
Throughput rate is the number of transactions produced over time during a test. It's also expressed as the amount of capacity that a website or application can handle.
http://www.joecolantonio.com/2011/07/05/performance-testing-what-is-throughput/