How to resolve the Aerospike error "Operation not allowed at this time"?

I am using Aerospike 4.8 with data persisted on disk, and I am making parallel write requests to it. With 10 parallel requests everything works, but with 100 parallel requests I get the error "Operation not allowed at this time" with code 22. I would expect Aerospike to handle hundreds or thousands of parallel requests, but I don't know what is wrong. Any guidance would be helpful.
Error log:
error : { AerospikeError: Operation not allowed at this time.
at Function.fromASError (/data/codebase/lib/node_modules/aerospike/lib/error.js:113:21)
at QueryCommand.convertError (/data/codebase/lib/node_modules/aerospike/lib/commands/command.js:91:27)
at QueryCommand.convertResponse (/data/codebase/lib/node_modules/aerospike/lib/commands/command.js:101:24)
at asCallback (/data/codebase/lib/node_modules/aerospike/lib/commands/command.js:163:24)
name: 'AerospikeError',
code: 22,
command:
QueryCommand {
client:
Client {
domain: null,
_events: {},
_eventsCount: 0,
_maxListeners: undefined,
config: [Object],
as_client: AerospikeClient {},
connected: true,
captureStackTraces: false },
args: [ 'antiSpamming', 'userTargetingMatrix', [Object], undefined ],
captureStackTraces: false,
key: undefined,
ensureConnected: true,
stream:
RecordStream {
aborted: false,
client: [Object],
_events: [Object],
_eventsCount: 3 } },
func: 'as_query_parse_records_async',
file: 'src/main/aerospike/aerospike_query.c',
line: 246,
inDoubt: false }
Warning logs:
Jan 28 06:03:25 ip-1-0-4-78 asd[32437]: Jan 28 2020 06:03:25 GMT: WARNING (scan): (scan_manager.c:103) at scan threads limit - can't start new scan
Jan 28 06:03:25 ip-1-0-4-78 asd[32437]: Jan 28 2020 06:03:25 GMT: WARNING (scan): (scan.c:676) basic scan job 5614303283813349397 failed to start (22)

You are most likely exceeding the default limit related to queries. First, see the definition for error code 22:
For scans with server 4.7 or later, no threads available (scan-threads-limit reached).
Based on your other question, you're doing scans rather than a query with a secondary index, which matches the "at scan threads limit" warning in your server log. You'll need to increase that limit (scan-threads-limit), as suggested in the 'Additional Information'. However, you have a pretty weak system in terms of CPU, so adjust the value and benchmark, comparing performance before and after with the same workload. In a real production system you'd have multiple nodes, would probably want more than two CPU cores, and would similarly tune the scan threads as needed.
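Independently of the server-side limit, the client can cap how many scans it starts at once. Below is a minimal sketch of that idea in Python (the pattern is not specific to any client library). `fake_scan` is a hypothetical stand-in for the real client call, and the limit of 10 is simply the level the question reports as working, not a recommended value.

```python
import asyncio


async def run_bounded(jobs, limit):
    """Run coroutine factories with at most `limit` of them in flight at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(job):
        async with gate:  # waits here while `limit` jobs are already running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


# Hypothetical stand-in for one scan request; a real client call would go here.
in_flight = 0
peak = 0


async def fake_scan():
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)  # record the highest concurrency observed
    await asyncio.sleep(0.01)    # simulate server-side work
    in_flight -= 1
    return "ok"


# 100 requests are submitted, but no more than 10 run concurrently.
results = asyncio.run(run_bounded([fake_scan] * 100, limit=10))
print(len(results), "scans completed, peak concurrency", peak)
```

The cap should stay at or below whatever scan-threads-limit the server is configured with, otherwise the same error 22 can still occur.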

This happened to me because a job was performing a backup while I was trying to scan the partitions.
Recommendation: do not run an extra scan job while a backup is in progress.

Related

StackExchange.Redis sync-ops and conn-sec

Every now and then we receive a large set of timeouts (around peak time for website traffic), with lots of logs in the following form:
Timeout performing GET (5000ms)
next: GET ObjectPageView.120.633.0
inst: 21
qu: 0
qs: 0
aw: False
bw: SpinningDown
rs: ReadAsync
ws: Idle
in: 0
last-in: 0
cur-in: 0
sync-ops: 456703
async-ops: 1
conn-sec: 72340.11
mc: 1/1/0
mgr: 10 of 10 available
IOCP: (Busy=0 Free=1800 Min=600 Max=1800)
WORKER: (Busy=720 Free=1080 Min=600 Max=1800)
v: 2.6.90.64945
What do sync-ops and conn-sec stand for? The rest of the numbers seem fine, but these seem high and I'm not entirely sure what they are describing.
These are statistics about the current connection:
"sync-ops" is a count of the synchronous operations (as opposed to "async-ops" for asynchronous operations) performed on the current connection.
"conn-sec" is how long the current connection has been up, in seconds (from connect until now).

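To put the sync-ops and conn-sec values from the question above in perspective, here is a quick back-of-the-envelope calculation in Python, using the values copied from that log. It works out to a connection age of about 20 hours and an average of roughly 6 synchronous operations per second, which suggests the counters look large mainly because the connection is long-lived.

```python
# Values copied from the timeout message in the question.
sync_ops = 456703    # synchronous operations issued on this connection
conn_sec = 72340.11  # seconds since the connection was established

uptime_hours = conn_sec / 3600
avg_sync_ops_per_sec = sync_ops / conn_sec

print(f"connection age: {uptime_hours:.1f} h")
print(f"average sync rate: {avg_sync_ops_per_sec:.2f} ops/s")
```

This is only an average over the whole connection lifetime; it says nothing about the rate at the moment the timeout occurred.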
StackExchange.Redis.RedisTimeoutException Timeout performing EXISTS

.NET Framework 4.7 and StackExchange.Redis version=2.5.43.
I am only seeing this error when the cache is on the server but not when running locally with the cache running in a container on my machine.
StackExchange.Redis.RedisTimeoutException:
'Timeout performing EXISTS (10000ms),
next: PSETEX MyKey-f79c9cad-c265-e611-80d8-005056b35bfa,
inst: 190,
qu: 39996,
qs: 176,
aw: True,
bw: Flushing,
rs: DequeueResult,
ws: Flushing,
in: 0,
in-pipe: 0,
out-pipe: 528424,
serverEndpoint: MyServer:6380,
mc: 1/1/0,
mgr: 9 of 10 available,
clientName: MyClient(SE.Redis-v2.5.43.42402),
IOCP: (Busy=0,Free=1000,Min=12,Max=1000),
WORKER: (Busy=1,Free=2046,Min=12,Max=2047),
v: 2.5.43.42402 (Please take a look at this article for some common client-side issues
that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)'
I have tried increasing the SyncTimeout configuration to 10000, but it made no difference.
The out-pipe value looks quite high but I am not sure what this is an indication of, or if it is a red herring.
Any ideas what could cause this timeout?
Thanks
As pointed out by @slorello, the high "qu" and "qs" values seem to be the issue: some big payloads were blocking the pipe.
Breaking the big payloads down into smaller payloads seems to have stopped the timeouts. I will also investigate ConnectionMultiplexer pooling as recommended here in point 10
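One way to break a large value into smaller ones is sketched below in Python. The `key:index` naming scheme and the 64 KiB chunk size are assumptions made for illustration, not something prescribed by StackExchange.Redis, and a suitable size would need to be found by measurement.

```python
def split_payload(key, data, chunk_size):
    """Split `data` into (chunk_key, chunk) pairs of at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (f"{key}:{index}", data[start:start + chunk_size])
        for index, start in enumerate(range(0, len(data), chunk_size))
    ]


def join_payload(parts):
    """Reassemble the original bytes from parts kept in index order."""
    return b"".join(chunk for _, chunk in parts)


payload = bytes(250_000)  # placeholder for one large cached value
parts = split_payload("MyKey", payload, 64 * 1024)

# Each (chunk_key, chunk) pair would be written as its own, smaller command.
for chunk_key, chunk in parts:
    print(chunk_key, len(chunk))
```

The reader has to fetch the parts in order and join them again, so the number of chunks (or a terminating marker) needs to be stored somewhere as well.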

How to build a histogram of methods by time spent inside with Mono?

I have tried the following:
mono --profile=log myprog.exe
to collect profiler data. Then to interpret those I invoke:
> mprof-report output.mlpd
Mono log profiler data
Profiler version: 2.0
Data version: 14
Arguments: log
Architecture: x86-64
Operating system: linux
Mean timer overhead: 51 nanoseconds
Program startup: Fri Jul 20 00:11:12 2018
Program ID: 19840
Server listening on: 59374
JIT summary
Compiled methods: 8349
Generated code size: 2621631
JIT helpers: 0
JIT helpers code size: 0
GC summary
GC resizes: 0
Max heap size: 0
Object moves: 0
Metadata summary
Loaded images: 16
Loaded assemblies: 16
Exception summary
Throws: 0
Thread summary
Thread: 0x7fb49c50a700, name: ""
Thread: 0x7fb49d27b700, name: "Threadpool worker"
Thread: 0x7fb49d07a700, name: "Threadpool worker"
Thread: 0x7fb49ce79700, name: "Threadpool worker"
Thread: 0x7fb49cc78700, name: "Threadpool worker"
Thread: 0x7fb49d6b9700, name: ""
Thread: 0x7fb4bbff1700, name: "Finalizer"
Thread: 0x7fb4bfe3f740, name: "Main"
Domain summary
Domain: (nil), friendly name: "myprog.exe"
Domain: 0x1d037f0, friendly name: "(null)"
Context summary
Context: (nil), domain: (nil)
However, there's no information about which methods were called often and took long to complete, which was the one thing I expected from profiling.
How do I use Mono profiling to gather and output information about the total run time of method calls, similar to what hprof generates with cpu=times?
The Mono docs are "slightly" wrong, as method calls are not tracked by default. Enabling call tracking creates huge profile logs and massively slows down total execution time; when combined with other options such as alloc, it also affects the execution time of the methods themselves and therefore any timings being collected.
Personally, I would recommend using calls profiling by itself and adjusting calldepth to a level that matters for your profiling, i.e. do you need to profile into the framework calls or not? A smaller call depth also greatly decreases the size of the log produced.
Example:
mono --profile=log:calls,calldepth=10 Console_Ling.exe
Produces:
Method call summary
Total(ms) Self(ms)      Calls Method name
    53358        0          1 (wrapper runtime-invoke) <Module>:runtime_invoke_void_object (object,intptr,intptr,intptr)
    53358        2          1 Console_Ling.MainClass:Main (string[])
    53340        2          1 Console_Ling.MainClass:Stuff ()
    53337        0          3 System.Linq.Enumerable:ToList<int> (System.Collections.Generic.IEnumerable`1<int>)
    53194    13347          1 System.Linq.Enumerable/WhereListIterator`1<int>:ToList ()
    33110    13181   20000000 Console_Ling.MainClass/<>c__DisplayClass0_0:<Stuff>b__0 (int)
    19928    13243   20000000 System.Collections.Generic.List`1<int>:Contains (int)
     6685     6685   20000000 System.Collections.Generic.GenericEqualityComparer`1<int>:Equals (int,int)
~~~~
Re: http://www.mono-project.com/docs/debug+profile/profile/profiler/#profiler-option-documentation
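If you want the summary ordered by time spent inside each method (the Self column) rather than by total time, the text report can be post-processed. Here is a rough sketch in Python, assuming the report rows have the four-column layout shown above. The parsing and the text bars are illustrative only and are not part of mprof-report.

```python
# Rows copied from the "Method call summary" output above.
report = """\
Total(ms) Self(ms) Calls Method name
53358 0 1 (wrapper runtime-invoke) <Module>:runtime_invoke_void_object (object,intptr,intptr,intptr)
53358 2 1 Console_Ling.MainClass:Main (string[])
53340 2 1 Console_Ling.MainClass:Stuff ()
53337 0 3 System.Linq.Enumerable:ToList<int> (System.Collections.Generic.IEnumerable`1<int>)
53194 13347 1 System.Linq.Enumerable/WhereListIterator`1<int>:ToList ()
33110 13181 20000000 Console_Ling.MainClass/<>c__DisplayClass0_0:<Stuff>b__0 (int)
19928 13243 20000000 System.Collections.Generic.List`1<int>:Contains (int)
6685 6685 20000000 System.Collections.Generic.GenericEqualityComparer`1<int>:Equals (int,int)
"""


def parse_rows(text):
    """Return (self_ms, total_ms, calls, method) tuples, skipping non-data lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 3)  # three numeric columns, then the method name
        if len(fields) == 4 and all(f.isdigit() for f in fields[:3]):
            total, self_ms, calls, method = fields
            rows.append((int(self_ms), int(total), int(calls), method))
    return rows


# Sort by self time, largest first, and draw a simple text bar per method.
rows = sorted(parse_rows(report), reverse=True)
widest = max(self_ms for self_ms, *_ in rows) or 1
for self_ms, total, calls, method in rows[:5]:
    bar = "#" * round(40 * self_ms / widest)
    print(f"{self_ms:>7} {bar:<40} {method}")
```

For the sample above, this puts the WhereListIterator ToList call, List.Contains and the lambda at the top, since that is where the self time is concentrated.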

StackExchange.Redis.RedisTimeoutException

We are experiencing timeouts in our application using Redis. We have already investigated, without success. See the timeout error below:
StackExchange.Redis.RedisTimeoutException: Timeout performing GET
USERORGANIZATIONS_D96510A4-A9A2-4DAA-84A9-BB77363DD3EA, inst: 9, mgr:
ProcessReadQueue, err: never, queue: 24, qu: 0, qs: 24, qc: 0, wr: 1, wq: 1,
in: 65536, ar: 1, clientName: RD00155D008B42, serverEndpoint:
Unspecified/xxxxxxx.redis.cache.windows.net:xxxx, keyHashSlot: 9735, IOCP:
(Busy=0,Free=1000,Min=4,Max=1000), WORKER:
(Busy=27,Free=32740,Min=200,Max=32767) (Please take a look at this article
for some common client-side issues that can cause timeouts:
http://stackexchange.github.io/StackExchange.Redis/Timeouts)
If you need more information, just ask and I'll try to provide it. Thanks in advance.
The “in: 65536” value in the timeout is very high. This value indicates how much data is sitting in the client’s socket kernel buffer, meaning the data has arrived at the local machine but has not yet been read by the application layer. This typically happens when 1) thread pool settings need to be adjusted or 2) client CPU is running high. Here are some articles I suggest you read:
Diagnosing Redis errors on the client side
Azure Redis Best Practices

StackExchange.Redis System.TimeoutException

I suddenly started getting this timeout exception when persisting a range of data. It was working before and I didn't make any changes:
Timeout performing HMSET {key}, inst: 0, mgr: ExecuteSelect, err:
never, queue: 2, qu: 1, qs: 1, qc: 0, wr: 1, wq: 1, in: 0, ar: 0,
clientName: {machine-name}, serverEndpoint:
Unspecified/localhost:6379, keyHashSlot: 2689, IOCP:
(Busy=0,Free=1000,Min=4,Max=1000), WORKER:
(Busy=0,Free=2047,Min=4,Max=2047), Local-CPU: 100% (Please take a look
at this article for some common client-side issues that can cause
timeouts:
https://github.com/StackExchange/StackExchange.Redis/tree/master/Docs/Timeouts.md)
I'm using Redis on Windows.
In your timeout error message, I see Local-CPU: 100%. This is the CPU on your client that is calling into Redis server. You might want to look into what is causing the high CPU load on your client.
This article describes why high CPU usage can lead to client-side timeouts. https://gist.github.com/JonCole/db0e90bedeb3fc4823c2#high-cpu-usage
So, I battled with this issue for a few days and almost gave up. As @Amr Reda said, breaking large sets into smaller ones might work, but that's not optimal.
In my case, I was trying to move 27,000 records into Redis and I kept encountering the issue.
To resolve the issue, increase the SyncTimeout value in your Redis connection string. In the 1.x client it defaults to 1000 ms, i.e. 1 second (2.x releases default to 5000 ms). Large datasets typically take longer to add.
I found out what was causing the issue: I was trying to bulk insert into a hash. What I did was chunk the list being inserted into smaller ones.
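Chunking a bulk hash insert can be as simple as the following sketch (Python for illustration). `batched_items` is a hypothetical helper, the 27,000 entries mirror the record count mentioned in another answer, and the batch size of 1,000 is an arbitrary starting point to tune. Each batch would then go out as its own HMSET call.

```python
from itertools import islice


def batched_items(mapping, batch_size):
    """Yield dicts holding at most `batch_size` field/value pairs from `mapping`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    items = iter(mapping.items())
    while True:
        batch = dict(islice(items, batch_size))  # take the next slice of pairs
        if not batch:
            return
        yield batch


# Placeholder data standing in for the hash fields to be written.
records = {f"field:{i}": i for i in range(27_000)}
batches = list(batched_items(records, 1_000))

print(len(batches), "batches of up to 1,000 fields each")
```

Smaller batches keep each command short, at the cost of more round trips, so the best size depends on value sizes and network latency.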
Quick suggestions that worked in my case, for a .NET console project with very high concurrency (around 30,000 threads).
In Program.cs, I added some ThreadPool settings:
// SetMinThreads takes process-wide minimums (worker threads, I/O completion threads), not per-core values.
int minWorkerThreads = 50, minIocpThreads = 100;
ThreadPool.SetMinThreads(minWorkerThreads, minIocpThreads);
Also, I had to change everything from:
var redisValue = dbCache.StringGet("SOMETHING");
To:
var redisValue = dbCache.StringGetAsync("SOMETHING").Result;
The two may look almost the same, since either way you end up waiting for the result. In my case, though, when the non-async version was used and a single thread hit a Redis timeout, the other 29,999 threads ended up waiting for Redis to time out too, whereas with the async version only that one thread timed out.