How can I check/compute the "delete hit rate" on redis?

It is well known that the Redis hit rate can be calculated from the output of the
redis-cli info command (by checking keyspace_hits and keyspace_misses).
However, these two metrics record hits and misses only for read commands (e.g., GET). Is there a way to solve my problem, which is recording the success rate of all executed DEL commands?
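
One possible client-side approach, offered only as a rough sketch: DEL replies with the number of keys it actually removed, so a thin wrapper can tally requested versus removed keys. The `DelStats` class and its method names are hypothetical, and `client` is assumed to be any object whose `delete(*keys)` returns that count (redis-py's `Redis.delete` does).

```python
class DelStats:
    """Tallies how many keys DEL was asked to remove vs. actually removed."""

    def __init__(self, client):
        # Assumption: client.delete(*keys) returns the number of keys removed.
        self.client = client
        self.requested = 0
        self.removed = 0

    def delete(self, *keys):
        removed = self.client.delete(*keys)
        self.requested += len(keys)
        self.removed += removed
        return removed

    def hit_rate(self):
        # Fraction of requested keys that existed; None before any DEL call.
        if self.requested == 0:
            return None
        return self.removed / self.requested
```

Note that this only counts DEL calls issued through the wrapper, so it reflects one client process rather than the whole server.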

Related

Why does the Redis server clear memory as I push data?

As I work through the Redis learning curve, I developed a simple application that consumes market data for government bonds and, for each price, asks a web service for the bond analytics at that price.
The analytics are provided by an API web service that might be hit several times, as prices arrive every second. The response is a JSON payload like this one: {"md":2.9070078608390455,"paridad":0.7710514176999993,"price":186.0,"ticker":"GO26","tir":0.10945225427543438,"vt":241.22904871224668, "price":185}
My strategy with Redis is to cache that payload as a string under a key formed by ticker + price (e.g. "GO26185"). That way I reduce service hits and also the query response time. So from here, if a value is not in Redis, I ask the API; otherwise, I read it from Redis.
The problem is that, while this routine runs and I push new key-value pairs to Redis, the ones I already have in memory disappear.
That is, dbsize increases as soon as I push information, but decreases when there are no new values.
This happens although I set the expiration to one day (in seconds):
await client.set(
  rediskey,
  JSON.stringify(response.data),
  {
    EX: 86399,
  }
);
Is there any configuration I might be missing that tells Redis to persist that data and avoid clearing the cache randomly?
Just to clarify, here is a glance at how the SET keys disappear while new ones are being registered:
127.0.0.1:6379> dbsize
(integer) 946;
127.0.0.1:6379> dbsize
(integer) 1046;
127.0.0.1:6379> dbsize
(integer) 1048;
127.0.0.1:6379> dbsize
(integer) 1048;
127.0.0.1:6379> dbsize
(integer) 0 << Here all my keys have disappeared

I am answering my own question. The problem was that I hadn't blocked the Redis port, and an attacker was connecting to my Redis server, causing it to reset. It seems the attacker was using the replication mechanism.

Redis Snapshot-Configuration: How are multiple changes on the same key counted?

In Redis you can configure the creation of snapshots, e.g. "save 60 10" would save the database after 60 seconds if at least 10 keys were changed.
If the SAME key was changed 10 times, would a snapshot be saved? Or does this refer to 10 unique/different keys that have to be changed?
Thank you!

The documented config doesn't say anything about "if at least 10 keys were changed". It says the snapshot will happen if "the given number of write operations against the DB occurred". Simple commands like SET and DEL count as one write operation. More complicated commands like HMSET and ZINTERSTORE might count as more than one write operation depending on the number of values they affect. Nothing takes into account the number of unique keys that were written to since the last snapshot.
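
To make the counting rule concrete, here is a simplified model of the save-point check. It is an illustration and not Redis's actual implementation; the function name and arguments are hypothetical. The point it shows is that the counter tracks write operations, so ten SETs on one key satisfy "save 60 10" just as ten SETs on ten different keys would.

```python
def snapshot_due(seconds_since_last_save, writes_since_last_save, save_points):
    """Return True if any (seconds, changes) save point is satisfied.

    Simplified model: only the number of write operations matters,
    not how many distinct keys they touched.
    """
    return any(
        seconds_since_last_save >= seconds and writes_since_last_save >= changes
        for seconds, changes in save_points
    )


# Ten SETs against the same key still count as ten write operations.
writes = 0
for _ in range(10):
    writes += 1  # one SET, always on the same key

same_key_triggers_save = snapshot_due(60, writes, [(60, 10)])
```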

Snakemake 200000 job submission

I have 200000 FASTA sequences. I am using GATK to call variants and created a wildcard for every sequence. Now I would like to submit 200000 jobs using Snakemake. Will this cause a problem for the cluster? Is there a way to submit jobs in sets of 10-20?

First off, it might take some time to calculate the DAG, but I have been told that the DAG calculation has recently been greatly improved. In any case, it might be wise to split the work up into batches.
Most clusters won't allow you to submit more than X jobs at the same time, usually in the range of 100-1000. I believe the documentation is not fully correct, but when using --cluster I believe the --jobs argument controls the number of jobs submitted at the same time, so by using snakemake --jobs 20 --cluster "myclustercommand" you should be able to control this. Note that this controls the number of submitted jobs, not the number of actively running jobs. It might be that all your jobs are waiting in the queue, so it is probably best to check with your cluster administrator, ask what the maximum number of submitted jobs is, and get as close to that number as possible.

How to interpret "evicted_keys" from Redis Info

We are using ElastiCache for Redis, and are confused by its Evictions metric.
I'm curious what the unit is on the evicted_keys metric from Redis Info? The ElastiCache docs say it is a count: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.Redis.html but for our application we have observed the "Evictions" metric (which is derived from evicted_keys) fluctuates up and down, indicating it's not a count. I would expect a count to never decrease, since we cannot "un-evict" a key. I'm wondering if evicted_keys is actually a rate (eg, evictions/sec), which would explain why it can fluctuate.
Thank you in advance for any responses!

From INFO command:
evicted_keys: Number of evicted keys due to maxmemory limit
To learn more about evictions see Using Redis as an LRU cache - Eviction policies
This counter is zero when the server starts, and it is only reset if you issue the CONFIG RESETSTAT command. However, on ElastiCache, this command is not available.
That said, ElastiCache derives the metric from this value, by calculating the difference between data-points.
Redis evicted_keys:     0   5  12  18  22  ...
CloudWatch Evictions:   0   5   7   6   4  ...
This is the usual pattern in CloudWatch metrics. This allows you to use SUM if you want the cumulative value, but also to detect rate changes or spikes easily.
Suppose, for example, that you want to alarm if evictions exceed 10,000 over a one-minute period. If ElastiCache stored the cumulative value from Redis directly as a metric, this would be hard to accomplish.
Also, by committing the metric only as the evicted keys for the period, you are protected from the data distortion caused by a server reset or a value overflow. While the Redis INFO value would go back to zero, on ElastiCache you still get the value for the period and you can still compute a running sum over any period.
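
The derivation described above can be sketched as follows. This is an illustrative model, not ElastiCache's actual code; the function name is hypothetical, and the reset handling (treating a decrease as a counter restart from zero) is an assumption.

```python
def per_period_counts(cumulative_samples):
    """Turn samples of a cumulative counter into per-period counts."""
    deltas = []
    previous = 0  # the counter is zero when the server starts
    for current in cumulative_samples:
        if current < previous:
            # Assumption: a decrease means the counter restarted from zero,
            # so everything counted since the restart belongs to this period.
            deltas.append(current)
        else:
            deltas.append(current - previous)
        previous = current
    return deltas


evictions = per_period_counts([0, 5, 12, 18, 22])
# evictions == [0, 5, 7, 6, 4]; summing them recovers the cumulative 22
```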

Which Redis data type fits best for the following example

I have the following scenario:
1. Fetch an array of numbers (from Redis) conditionally
2. For each number, do some async work (fetch something from the DB based on the number)
3. For each item in the result set from the DB, do some more async work
4. Periodically repeat 1, 2 and 3, because new numbers will constantly be added to the Redis structure. These numbers are Unix timestamps in milliseconds, so out of the box they will always be sorted by time of addition
"Conditionally" means fetching those Unix timestamps from Redis that are less than or equal to the current Unix timestamp in milliseconds (Date.now()).
The question is which Redis data type fits this use case best, bearing in mind that this code will be scaled up to N instances, so N instances will share access to a single Redis instance. To share the load equally, each instance will read, for example, the first (oldest) 5 numbers from Redis. The numbers are unique (adding the same number again should fail silently), so a Redis SET seems like a good choice, but reading the first M elements from a Redis set seems impossible.
To prevent two different instances of the code from reading the same numbers, the Redis read operation should be atomic: it should read the numbers and delete them. If any async operation fails for a specific number (steps 2 and 3), that number should be added back to Redis to be handled again. It should be re-added at the head, not at the end, so that it is handled again as soon as possible. As far as I know, SADD would push it to the tail.
SMEMBERS key would read everything, which looks like a hammer to me. I would need to include some application logic to get the first five, then check which are less than or equal to Date.now(), then delete those, and somehow wrap everything in a single transaction. Besides that, the set cardinality can be huge.
SSCAN sounds interesting, but I don't have any clue how it works in a "scaled" environment like the one described above. Besides that, per the Redis docs: The SCAN family of commands only offer limited guarantees about the returned elements since the collection that we incrementally iterate can change during the iteration process. As described above, the collection will change frequently.

A more appropriate data structure would be the Sorted Set: members have a float score that is well suited to storing a timestamp, and you can perform range searches (i.e. anything less than or equal to a given value).
The relevant starting points are the ZADD, ZRANGEBYSCORE and ZREMRANGEBYSCORE commands.
To ensure atomicity when reading and removing members, you can choose between the following options: Redis transactions, a Redis Lua script and, in the next version (v4), a Redis module.
Transactions
Using transactions simply means running the following commands from your instances:
MULTI
ZRANGEBYSCORE <keyname> -inf <now-timestamp>
ZREMRANGEBYSCORE <keyname> -inf <now-timestamp>
EXEC
Where <keyname> is your key's name and <now-timestamp> is the current time.
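
From application code, the same MULTI/EXEC sequence could look roughly like the sketch below. It assumes a redis-py style client (the question itself uses a Node.js client), and the function name `pop_due_members` is hypothetical.

```python
def pop_due_members(client, key, now_ms):
    """Atomically read and remove all members with score <= now_ms.

    Assumption: `client` is a redis-py style client, where
    pipeline(transaction=True) wraps the queued commands in MULTI/EXEC.
    """
    pipe = client.pipeline(transaction=True)
    pipe.zrangebyscore(key, "-inf", now_ms)
    pipe.zremrangebyscore(key, "-inf", now_ms)
    # execute() returns one result per queued command, in order.
    members, _removed_count = pipe.execute()
    return members
```

Members that fail later processing can be put back with ZADD using their original timestamp as the score, which places them at the front of the range again.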
Lua script
A Lua script can be cached and runs embedded in the server, so in some cases it is a preferable approach. It is definitely the best approach for short snippets of atomic logic if you need flow control (remember that a MULTI transaction returns the values only after execution). Such a script would look as follows:
local r = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
return r
To run this, first cache it using SCRIPT LOAD and then call it with EVALSHA like so:
EVALSHA <script-sha> 1 <key-name> <now-timestamp>
Where <script-sha> is the sha1 of the script returned by SCRIPT LOAD.
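
As a hedged sketch of how a client library can handle the SCRIPT LOAD / EVALSHA steps for you: redis-py's `register_script` returns a callable that invokes the script by its SHA1. The helper name `make_pop_due` is hypothetical, and a redis-py style client is assumed.

```python
POP_DUE_LUA = """
local r = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', ARGV[1])
return r
"""


def make_pop_due(client):
    """Build a function that runs the Lua script above against `client`.

    Assumption: client.register_script(source) returns a callable taking
    keys=[...] and args=[...], as redis-py does.
    """
    script = client.register_script(POP_DUE_LUA)

    def pop_due(key, now_ms):
        # KEYS[1] is the sorted set's key, ARGV[1] the current timestamp.
        return script(keys=[key], args=[now_ms])

    return pop_due
```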
Redis modules
In the near future, once v4 is GA you'll be able to write and use modules. Once this becomes a reality, you'll be able to use this module we've made that provides the ZPOP command and could be extended to cover this use case as well.
