Use case:
I am implementing a rate limiter for a web application. For every incoming HTTP request I increment a Redis counter keyed by the client's IP address. Additionally, I set a 30-minute TTL on the key with EXPIRE to avoid a memory leak.
The problem:
Now I've got thousands of entries and I'd like to get those entries which have the highest counter values. How can I do that?
One option could be to use a Redis Sorted Set with the score being the request count: to increase the count you use ZINCRBY, and to get the top N you use ZRANGE.
New request:
> ZINCRBY requests 1 10.0.0.1
Get top N:
> ZRANGE requests -5 -1 WITHSCORES
The downside is that you will not be able to set a timeout per IP, but you can overcome that by splitting the requests across different sorted sets, using a different key for each timeout period.
For example, if you want to count the requests for each day, instead of using the same key string "requests", you can construct the key as something like "requests-<date>":
> ZINCRBY requests-19/01/25 1 10.0.0.1
> EXPIREAT requests-19/01/25 1516924800
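As a rough illustration, here is a minimal sketch of that pattern with the Lettuce Java client; the key format, TTL choice and connection setup are assumptions, not part of the answer above:
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;
import java.time.LocalDate;
import java.time.ZoneOffset;

public class RateCounter {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379"); // assumed local instance
        RedisCommands<String, String> redis = client.connect().sync();

        // one sorted set per day, e.g. "requests-2024-01-19" (key format is an assumption)
        String key = "requests-" + LocalDate.now(ZoneOffset.UTC);

        // count this request for the client's IP
        redis.zincrby(key, 1, "10.0.0.1");

        // let the whole day's set expire; here, at the start of the next UTC day
        long nextMidnight = LocalDate.now(ZoneOffset.UTC).plusDays(1)
                .atStartOfDay(ZoneOffset.UTC).toEpochSecond();
        redis.expireat(key, nextMidnight);

        // top 5 IPs with their request counts
        System.out.println(redis.zrevrangeWithScores(key, 0, 4));

        client.shutdown();
    }
}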
Related
As I am climbing the Redis learning curve, I developed a simple application that consumes market data for government bonds; for each price, a routine asks a web service for the bond analytics at that price.
Analytics is provided by an API web service that may be hit several times per second as prices arrive. The response is a JSON payload like this one: {"md":2.9070078608390455,"paridad":0.7710514176999993,"price":186.0,"ticker":"GO26","tir":0.10945225427543438,"vt":241.22904871224668, "price":185}
My strategy with Redis is to cache that payload as a string under a key formed from ticker + price (e.g. "GO26185"). That way I reduce service hits and also the query response time. So, if a value is already in Redis I read it from the cache; otherwise I ask the API and store the response.
The problem I have is that while this routine runs, as I SET new key/value pairs in Redis, the ones I already have in memory disappear.
I.e., DBSIZE increases as soon as I push information, but decreases when there are no new values.
Although I set the expiration to one day (in seconds):
await client.set(rediskey, JSON.stringify(response.data), { EX: 86399 });
Is there any configuration I might be missing to tell Redis to persist that data and avoid clearing the cache seemingly at random?
Just to clarify, here is a glance at how SET keys disappear while I register new ones:
127.0.0.1:6379> dbsize
(integer) 946
127.0.0.1:6379> dbsize
(integer) 1046
127.0.0.1:6379> dbsize
(integer) 1048
127.0.0.1:6379> dbsize
(integer) 1048
127.0.0.1:6379> dbsize
(integer) 0  << Here all my keys have disappeared
I am answering my own question. The problem was that I didn't block the Redis port, and an attacker was connecting to my Redis server, causing it to be reset. It seems it was abusing the replication nodes.
I have used ZADD command to insert a bunch of IDs with their corresponding scores into a redis instance. The score is basically a timestamp at which the ZADD is called.
Now I want to retrieve a list of IDs whose score is bigger than the timestamp of the moment five minutes ago.
The client is written in Java and I am using Lettuce as the Redis client library.
I have a few questions:
Here is a link to the documentation of zrangebyscore on the Redis website (https://redis.io/commands/zrangebyscore). However, on the Lettuce website the counterpart is marked as 'Deprecated'. Is this a discrepancy between the two documentations, or has Lettuce retired support for this API?
I want to be able to retrieve a list of IDs whose score is bigger than a certain number N, but I do not care about the upper end.
In Lettuce's documentation the zrange API seems ideal for my purpose. However, what stop value can I use to express that I do not care about the upper bound? The documentation is not clear about this.
The Redis zrange command is zero-index based: indexes start at 0. What's helpful here is that you can retrieve the last element by specifying the negative index -1, the second from last with -2, and so on. See more details about zrange on the Redis website here.
To retrieve the entire range, you can run
zrange keyname 0 -1
Note that '0' can be replaced by any index, which means the above command fetches values starting at that index position. Therefore, it cannot be used directly to get all values higher than a specific score.
To retrieve from a specific score N, use
zrangebyscore keyname N +inf
Here is the right Lettuce API for zrangebyscore, available since Lettuce 4.3:
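As a hedged sketch (the key name, score units and connection details are assumptions for illustration), using Lettuce's Range type with an unbounded upper boundary:
import io.lettuce.core.Range;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;
import java.time.Instant;
import java.util.List;

public class RecentIds {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379"); // assumed local instance
        RedisCommands<String, String> redis = client.connect().sync();

        // lower bound: five minutes ago; upper bound: unbounded (the "+inf" of ZRANGEBYSCORE)
        long fiveMinutesAgo = Instant.now().minusSeconds(5 * 60).getEpochSecond();
        Range<Long> newerThan = Range.from(
                Range.Boundary.including(fiveMinutesAgo),
                Range.Boundary.unbounded());

        // "myset" is a placeholder key; scores are assumed to be Unix timestamps in seconds
        List<String> ids = redis.zrangebyscore("myset", newerThan);
        System.out.println(ids);

        client.shutdown();
    }
}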
Business Objective
I'm creating a dashboard that will depend on some time series, and I'll use Redis to implement it. I'm new to Redis and I'm trying to use Redis Streams and count the elements in a stream.
XADD conversation:9:chat_messages * id 2583 user_type Bot
XADD conversation:9:chat_messages * id 732016 user_type User
XADD conversation:9:chat_messages * id 732017 user_type Staff
XRANGE conversation:9:chat_messages - +
I'm aware that I can get the total count of the elements using the XLEN command like this:
XLEN conversation:9:chat_messages
but I also want to know the number of elements within a given period, for example something like:
XLEN conversation:9:chat_messages 1579551316273 1579551321872
I know I can use Lua to count those elements, but I want a REALLY fast way to achieve this, and I expect a native Redis command to be the fastest way.
Is there any way to achieve this with a straightforward Redis command? Or do I have to write a Lua script to do this?
Additional information
I'm limited by AWS ElastiCache to Redis 5.0.6 only, so I cannot install modules such as the RedisTimeSeries module. I'd like to use that module but it's not possible at the moment.
While the Redis Stream data structure doesn't support this, you can use a Sorted Set alongside it for keeping track of message ranges.
Basically, for each message ID you get from XADD - e.g. "1579551316273-0" - you need to do a ZADD conversation:9:ids 0 1579551316273-0. Then, you can use ZLEXCOUNT to get the "length" of a range.
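For illustration only, a minimal sketch of that pattern with the Lettuce Java client; the key names follow the question, everything else (IDs, bounds, connection) is assumed:
import io.lettuce.core.Range;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;
import java.util.Map;

public class StreamCount {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379"); // assumed local instance
        RedisCommands<String, String> redis = client.connect().sync();

        // XADD returns the generated message ID, e.g. "1579551316273-0"
        String id = redis.xadd("conversation:9:chat_messages", Map.of("id", "2583", "user_type", "Bot"));

        // mirror the ID into a sorted set with a constant score of 0
        redis.zadd("conversation:9:ids", 0, id);

        // count IDs lexicographically between two message IDs (valid because all scores are 0)
        Long count = redis.zlexcount("conversation:9:ids",
                Range.from(Range.Boundary.including("1579551316273-0"),
                           Range.Boundary.including("1579551321872-0")));
        System.out.println(count);

        client.shutdown();
    }
}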
Sorry, there is no built-in command to achieve this.
Your best option with Redis Streams is a Lua script. You will get O(N), with N being the number of elements counted, instead of the O(log N) you would get if such a command existed.
local T = redis.call('XRANGE', KEYS[1], ARGV[1], ARGV[2])
local count = 0
for _ in pairs(T) do count = count + 1 end
return count
Note that the difference between O(N) and O(log N) is significant for a large N, but for a chat application tracked per conversation it won't make that big a difference, even if chats have hundreds or thousands of entries, once you account for the total command time including the round-trip time, which dominates. The Lua script above also removes the network payload and client-side processing time.
You can switch to sorted sets if you really want O(log N) and you don't need consumer groups and other stream features. See How to store in Redis sorted set with server-side timestamp as score? if you want to use Redis server timestamp atomically.
Then you can use ZCOUNT which is O(log(N)).
If you do need Stream features, then you would need to keep the sorted set as a secondary index.
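If you go that route, a rough sketch of such a secondary index with Lettuce (key names, IDs and timestamps are illustrative assumptions):
import io.lettuce.core.Range;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;

public class CountByTime {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379"); // assumed local instance
        RedisCommands<String, String> redis = client.connect().sync();

        // alongside each XADD, index the message ID with its timestamp (ms) as the score
        redis.zadd("conversation:9:by_time", 1579551316273L, "1579551316273-0");

        // ZCOUNT returns the number of messages in a time window in O(log N)
        Long inWindow = redis.zcount("conversation:9:by_time",
                Range.create(1579551316273L, 1579551321872L));
        System.out.println(inWindow);

        client.shutdown();
    }
}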
There are proposals for sorted set item expiration in Redis (see https://groups.google.com/d/msg/redis-db/rXXMCLNkNSs/Bcbd5Ae12qQJ and https://quickleft.com/blog/how-to-create-and-expire-list-items-in-redis/). I tried the worker approach to expire geospatial indexes with the ZREMRANGEBYSCORE and ZREMRANGEBYRANK commands, unsuccessfully (nothing was removed).
I succeeded using ZREMRANGEBYLEX.
Is there a way to work with geospatial item scores other than as strings?
Update:
For example, if the time to live (TTL) of an item is 30 sec, I add it as:
geoadd 1 -8.616021 41.154503 30
Now, suppose the worker executes after 40 sec. I was expecting that
zremrangebyscore 1 0 40
would do the job, but it does not, while
ZREMRANGEBYLEX 1 [0 [40
does. Why this behavior? Does that mean the score of a geospatial item supports only lexicographical operations?
Sorted Sets have elements (strings), and every element has a score (a floating-point number). Geosets use the score to encode the element's coordinates.
Redis doesn't expire members in a Sorted Set (or a Geoset). You have to remove them yourself if that is required.
In your case, you'll need to keep two Sorted Sets - one as your GeoSet and one for managing TTLs as scores.
For example, assuming your member is called 'foo', to add it:
ZADD ttls 30 foo
GEOADD elems -8.616021 41.154503 foo
To manually expire, first find the expired members with a call to ZRANGEBYSCORE ttls, and then remove them from both sets.
Tip: it is preferable to use an absolute timestamp as the score instead of a number of seconds.
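A minimal sketch of that manual expiry with Lettuce, assuming Unix-timestamp scores (the key names follow the answer above; everything else is illustrative):
import io.lettuce.core.Range;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;
import java.time.Instant;
import java.util.List;

public class GeoTtlWorker {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379"); // assumed local instance
        RedisCommands<String, String> redis = client.connect().sync();

        long now = Instant.now().getEpochSecond();

        // add 'foo' with a 30-second TTL: expiry timestamp in 'ttls', coordinates in the GeoSet
        redis.zadd("ttls", now + 30, "foo");
        redis.geoadd("elems", -8.616021, 41.154503, "foo");

        // worker: everything whose expiry timestamp is in the past is removed from both sets
        List<String> expired = redis.zrangebyscore("ttls", Range.create(0L, now));
        if (!expired.isEmpty()) {
            String[] members = expired.toArray(new String[0]);
            redis.zrem("ttls", members);
            redis.zrem("elems", members);
        }

        client.shutdown();
    }
}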
I am not an expert in Redis at all. Today I ran into an idea, but I don't know if it is possible in Redis.
I want to store a list of values, but only for some time: for example, a list of IP addresses that visited a page in the last 5 minutes. As far as I know I can't set EXPIRE on a single list/hash item, right? So I push 1, 2, 3 into a list/hash, but after a certain constant time I want each item to expire/disappear. Or maybe instead of a list a hash structure would be more suitable: { '1': timestamp-when-to-disappear, ... }?
Or maybe the only solution is
SET test.1.1 1
EXPIRE test.1.1 60
SET test.1.2 2
EXPIRE test.1.2 60
SET test.1.3 3
EXPIRE test.1.3 60
# to retrieve, can I pipeline KEYS output to MGET?
KEYS test.1.*
Use a sorted set instead.
Log the IP along with the timestamp as the score in a sorted set. During retrieval, use that timestamp to get the entries you need. In a scheduled job, periodically delete members whose timestamps fall outside the range you care about.
Example:
zadd test 1465371055 1.1
zadd test 1465381055 1.3
zadd test 1465391055 1.1
Your sorted set will then contain 1.1 and 1.3, where 1.1 now has the updated score 1465391055.
Now on retrieval use
zrangebyscore test min max
min -> currenttime - (5 * 60)   (the example scores above are Unix timestamps in seconds)
max -> currenttime
You will get the IPs that visited in the last 5 minutes.
In another scheduler-like thread you need to delete the old, unwanted entries:
zremrangebyscore test min max
min -> 0 (or -inf)
max -> currenttime - (10 * 60) -> i.e. everything older than 10 minutes; set the cutoff to however long you want to keep entries.
Also understand that if the number of distinct IPs is large, the sorted set will grow quickly. Your scheduler thread must run reliably to keep memory under control.
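Putting the three commands together, a minimal sketch with Lettuce (seconds-based timestamps and the key name are assumptions):
import io.lettuce.core.Range;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;
import java.time.Instant;

public class RecentVisitors {
    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379"); // assumed local instance
        RedisCommands<String, String> redis = client.connect().sync();

        long now = Instant.now().getEpochSecond();

        // record a visit: the score is the visit time, so a repeat visit just refreshes the timestamp
        redis.zadd("test", now, "1.1");

        // IPs seen in the last 5 minutes
        System.out.println(redis.zrangebyscore("test", Range.create(now - 5 * 60, now)));

        // cleanup job: drop everything older than 10 minutes
        redis.zremrangebyscore("test", Range.create(0L, now - 10 * 60));

        client.shutdown();
    }
}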