I have multiple sorted sets, which I have named by keys like:
hello:user_id:2015-01-01
hello:user_id:2015-01-02
hello:user_id:2015-01-03
hello:user_id:2015-01-04
etc.
Is it possible to get all of these sets for dates between hello:user_id:2015-01-01 and hello:user_id:2015-01-04?
As @zenbeni pointed out, this is possible with ZUNIONSTORE. Here is how you can run it:
ZUNIONSTORE resultzset 4 hello:user_id:2015-01-01 hello:user_id:2015-01-02 hello:user_id:2015-01-03 hello:user_id:2015-01-04
Once that runs, the result is stored in resultzset, which you can query like any other sorted set:
ZRANGE resultzset 0 -1
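For a wider date range you can generate the key list programmatically instead of typing it out. A minimal Ruby sketch, using the hello:user_id: prefix from the question (the redis gem call in the comment is one way to issue the command, not the only one):

```ruby
require 'date'

# Build the list of sorted-set keys for an inclusive date range.
def keys_for_range(prefix, from, to)
  (Date.parse(from)..Date.parse(to)).map { |d| "#{prefix}#{d.iso8601}" }
end

keys = keys_for_range('hello:user_id:', '2015-01-01', '2015-01-04')
# With the redis gem this would be: redis.zunionstore('resultzset', keys)
puts "ZUNIONSTORE resultzset #{keys.length} #{keys.join(' ')}"
```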
Related
I have a sorted set like the below:
"id" "{"date":1664353365563}"
I need to delete entries from the sorted set whose date is less than today. How can I do that? I tried ZREM but it didn't work.
You can use ZREMRANGEBYSCORE. The command would be:
redis> ZREMRANGEBYSCORE id 0 1664353365563
Replace 1664353365563 with the millisecond timestamp for the start of today; every entry with a lower score will be removed.
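A short Ruby sketch for computing that bound (the key name id is taken from the question; the ( prefix makes the upper bound exclusive so entries scored exactly at midnight survive):

```ruby
# Millisecond timestamp for midnight today (local time); everything with
# a smaller score in the sorted set is "before today".
now = Time.now
midnight_ms = Time.new(now.year, now.month, now.day).to_i * 1000

# With the redis gem this would be:
#   redis.zremrangebyscore('id', 0, "(#{midnight_ms}")
puts "ZREMRANGEBYSCORE id 0 (#{midnight_ms}"
```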
I am following the tutorial at https://redis.io/commands/geosearch/ and have successfully migrated ~300k records (from an existing Postgres database) into a key named testkey (sorry for the unfortunate name, but I am testing it out!).
However, a query for items within 5 km returns thousands of items. I'd like to limit the results to 10 at a time, and be able to load the next 10 using some sort of keyset pagination.
So, to limit the results I am using
GEOSEARCH testkey FROMLONLAT -122.2612767 37.7936847 BYRADIUS 5 km WITHDIST COUNT 10
How can I execute GEOSEARCH queries with pagination?
Some context: I have a Postgres + PostGIS database with ~3m records. I have a service that fetches items within a radius, and even with the right indexes it is starting to get sluggish. My other endpoints can handle 3-8k rps, while this one can barely handle 1500 (8 ms average query execution time). I am exploring moving items into a Redis cache, either the entire payload or just the IDs, and then running an IN query (<1 ms query time).
I am struggling to find any articles using google search.
You can use GEOSEARCHSTORE to create a sorted set with the results from your search. You can then paginate this sorted set with ZRANGE. This is shown as an example on the GEOSEARCHSTORE page:
redis> GEOSEARCHSTORE key2 Sicily FROMLONLAT 15 37 BYBOX 400 400 km ASC COUNT 3 STOREDIST
(integer) 3
redis> ZRANGE key2 0 -1 WITHSCORES
1) "Catania"
2) "56.441257870158204"
3) "Palermo"
4) "190.44242984775784"
5) "edge2"
6) "279.7403417843143"
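Pagination over the stored set is then just index arithmetic. A small Ruby sketch (the key2 name comes from the example above; note the stored set is a snapshot of the search, so you may want to EXPIRE it and rebuild when it's gone):

```ruby
# ZRANGE start/stop are inclusive zero-based indexes, so page n of size 10
# covers indexes n*10 .. n*10 + 9 of the stored result set.
PAGE_SIZE = 10

def page_bounds(page, page_size = PAGE_SIZE)
  start = page * page_size
  [start, start + page_size - 1]
end

# With the redis gem, fetching page 2 would look like:
#   start, stop = page_bounds(2)
#   redis.zrange('key2', start, stop, withscores: true)
```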
I know that Redis doesn't really have the concept of secondary indexes, but that you can use the Z* commands to simulate one. I have a question about the best way to handle the following scenario.
We are using Redis to keep track of orders. But we also want to be able to find those orders by phone number or email ID. So here is our data:
> set 123 7245551212:dlw#email.com
> set 456 7245551212:dlw#email.com
> set 789 7245559999:kdw#email.com
> zadd phone-index 0 7245551212:123:dlw#email.com
> zadd phone-index 0 7245551212:456:dlw#email.com
> zadd phone-index 0 7245559999:789:kdw#email.com
I can see all the orders for a phone number via the following (is there a better way to get the range other than adding a 'Z' to the end?):
> zrangebylex phone-index [7245551212 (7245551212Z
1) "7245551212:123:dlw#dcsg.com"
2) "7245551212:456:dlw#dcsg.com"
My question is, is this going to perform well? Or should we just create a list that is keyed by phone number, and add an order ID to that list instead?
> rpush phone:7245551212 123
> rpush phone:7245551212 456
> rpush phone:7245559999 789
> lrange phone:7245551212 0 -1
1) "123"
2) "456"
Which would be the preferred method, especially related to performance?
RE: is there a better way to get the range other than adding a 'Z' to the end?
Yes, use the next immediate character instead of adding Z:
zrangebylex phone-index [7245551212 (7245551213
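The "next immediate character" trick can be wrapped in a small helper. A Ruby sketch (lex_bounds is a made-up name; it assumes the prefix's last character isn't 0xFF, which holds for digits):

```ruby
# Build inclusive/exclusive ZRANGEBYLEX bounds that match exactly one
# prefix, by bumping the prefix's last character.
def lex_bounds(prefix)
  upper = prefix.dup
  upper[-1] = (upper[-1].ord + 1).chr
  ["[#{prefix}", "(#{upper}"]
end

min, max = lex_bounds('7245551212')
# With the redis gem: redis.zrangebylex('phone-index', min, max)
puts "ZRANGEBYLEX phone-index #{min} #{max}"
```

Every member starting with the prefix sorts below the bumped string, so the exclusive upper bound captures the whole prefix range and nothing more.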
But certainly the second approach offers better performance.
Using a sorted set for lexicographical indexing, you need to consider that:
The addition of elements, ZADD, is O(log(N))
The query, ZRANGEBYLEX, is O(log(N)+M) with N being the number of elements in the sorted set and M the number of elements being returned
In contrast, using lists:
The addition, RPUSH, is O(1)
The query, LRANGE, is O(S+N), where S is the start offset and N the number of elements returned; starting at zero it is O(N).
You can also use sets (SADD and SMEMBERS); the difference is that lists allow duplicates and preserve insertion order, while sets ensure uniqueness but do not preserve insertion order.
A sorted set uses a skiplist for ordering plus a dict for member lookup. When all elements are added with the same score, the skiplist orders members lexicographically, so a lexicographical range search is O(log(N)).
So if you don't need range queries over phone numbers, use a list of orders keyed by phone number for precise lookups. The same works for email (you can use a hash to combine these two lists). This way, query performance will be much better than with a sorted set.
FYI: Redis n00b.
I need to store search terms in my web app.
Each term will have two attributes: "search_count" (integer) and "last_searched_at" (time)
Example I've tried:
Redis.hset("search_terms", term, {count: 1, last_searched_at: Time.now})
I can think of a few different ways to store them, but no good ways to query on the data. The report I need to generate is a "top search terms in last 30 days". In SQL this would be a where clause and an order by.
How would I do that in Redis? Should I be using a different data type?
Thanks in advance!
I would consider two sorted sets.
When a search term is submitted, get the current timestamp and:
zadd timestamps timestamp term
zincrby counts 1 term
The above two operations should be atomic (e.g. wrapped in MULTI/EXEC).
Then to find all terms in the given time interval timestamp_from, timestamp_to:
zrangebyscore timestamps timestamp_from timestamp_to
After you get these, loop over them and fetch their counts from counts.
Alternatively, I am curious whether you can use zunionstore. Here is my test in Ruby:
require 'redis'
KEYS = %w(counts timestamps results)
TERMS = %w(test0 keyword1 test0 test1 keyword1 test0 keyword0 keyword1 test0)
def redis
  @redis ||= Redis.new
end
def timestamp
  (Time.now.to_f * 1000).to_i
end
redis.del KEYS
TERMS.each {|term|
  redis.multi {|r|
    r.zadd 'timestamps', timestamp, term
    r.zincrby 'counts', 1, term
  }
  sleep rand
}
redis.zunionstore 'results', ['timestamps', 'counts'], weights: [1, 1e15]
KEYS.each {|key|
  p [key, redis.zrange(key, 0, -1, withscores: true)]
}
# top 2 terms
p redis.zrevrangebyscore 'results', '+inf', '-inf', limit: [0, 2]
EDIT: at some point you would need to clear the counts set. Something similar to what @Eli proposed (https://stackoverflow.com/a/16618932/410102).
Depends on what you want to optimize for. Assuming you want to be able to run that query very quickly and don't mind expending some memory, I'd do this as follows.
Keep a key for every second you see some search (you can go more or less granular if you like). The key should point to a hash of $search_term -> $count where $count is the number of times $search_term was seen in that second.
Keep another key for every time interval (we'll call this $time_int_key) over which you want data (in your case, this is just one key where your interval is the last 30 days). This should point to a sorted set where the items in the set are all of your search terms seen over the last 30 days, and the score they're sorted by is the number of times they were seen in the last 30 days.
Have a background worker that, every second, grabs the key for the second that occurred exactly 30 days ago and loops through the hash attached to it. For every $search_term in that key, it should subtract the $count from the score associated with that $search_term in $time_int_key.
This way, you can just use ZREVRANGE $time_int_key 0 $m-1 to grab the m top searches ([WITHSCORES] if you want the amounts they were searched) in O(log(N)+m) time. That's more than cheap enough to run as frequently as you want for just about any reasonable m, and the data is always up to date in real time.
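The bucket-and-retire mechanics above can be simulated in plain Ruby to check the bookkeeping (no Redis needed; the key names searches:&lt;second&gt; and window:30d in the comments are made up for illustration):

```ruby
# Per-second buckets of term => count, plus a running "last 30 days" tally.
buckets = Hash.new { |h, k| h[k] = Hash.new(0) }  # second => {term => count}
window  = Hash.new(0)                             # term => count in window

record = lambda do |second, term|
  buckets[second][term] += 1  # HINCRBY searches:<second> <term> 1
  window[term] += 1           # ZINCRBY window:30d 1 <term>
end

# Worker step: retire the bucket that just fell out of the window.
retire = lambda do |second|
  (buckets.delete(second) || {}).each do |term, count|
    window[term] -= count     # ZINCRBY window:30d -<count> <term>
    window.delete(term) if window[term] <= 0
  end
end

record.call(1, 'redis'); record.call(1, 'ruby'); record.call(2, 'redis')
retire.call(1)                # second 1 is now older than the window
top = window.sort_by { |_, c| -c }.first(2)  # ZREVRANGE window:30d 0 1
```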
I was wondering if anyone could provide some suggestions on how to make sorted set generation more efficient?
I am working on a project where ranking data is calculated on an hourly basis and stored in a database. The data can be filtered by member gender, country, etc. There are roughly 2 million rows to process, and it takes a long time.
We want to move to a more real-time approach where data is stored/sorted/filtered in Redis, with a daily clean rebuild.
In my prototype, I create a sorted set for each possible combination of filters, e.g. leaderboard.au.male, leaderboard.au.female, etc. I've scripted this process, but once you handle every case there are 118 sorted sets.
Ideally, I'd like to have a single ranking sorted set and hash sets for each member containing their name, gender and country. Then using Redis only return sorted set values based on the user defined filters. (e.g. only get rankings for males from Australia).
Is this possible to do natively in Redis?
I suggest you keep a sorted set with the rankings for all members:
leaderboard = { id1: score1, id2: score2, ... }
And a set for each type (gender, country etc):
members.male = { id1, id2, ... }
members.au = { id2, id3, ... }
Then, you do a ZINTERSTORE:
zinterstore leaderboard.male 2 leaderboard members.male
Or, to get a leaderboard of male AU members:
zinterstore leaderboard.au.male 3 leaderboard members.male members.au
You can control how the scoring for the resulting sorted set should be calculated using WEIGHTS and AGGREGATE.
If you don't want to keep the resulting sets for long, you could then EXPIRE them, and only build a new set if it doesn't exist.
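The intersection semantics are worth spelling out: with WEIGHTS 1 0 0, the plain sets' implicit score of 1 is zeroed out and only the leaderboard score survives. A pure-Ruby sketch with made-up sample ids and scores:

```ruby
# What ZINTERSTORE does here: keep the leaderboard score for every member
# present in all filter sets (WEIGHTS 1 0 0 ignores the sets' scores).
leaderboard  = { 'id1' => 1500.0, 'id2' => 2200.0, 'id3' => 900.0 }
members_male = %w(id1 id2)
members_au   = %w(id2 id3)

# zinterstore leaderboard.au.male 3 leaderboard members.male members.au WEIGHTS 1 0 0
leaderboard_au_male = leaderboard.select do |id, _score|
  members_male.include?(id) && members_au.include?(id)
end
```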