How to combine multi-field values and sorted time ranges using Redis - redis

I am trying to insert time-based records with multiple fields per value (with TTL enabled).
For multiple fields, the natural fit in Redis is HSET:
HSET user:32 name "johns" timecreated "3333311232" address "somewhere"
I also want to read those values by time range,
for example return all history records (say, for user 32) that were inserted in the last day.
The best fit for that would be a sorted set filled via ZADD, with the timestamp as the score (but this way I lose the hash structure that makes retrieval easy):
ZADD user:32 3333311232 "name=johns,timecreated=3333311232,address=somewhere"
On top of that, I want to add a TTL to each record.
Any idea how I could optimize my design?
I could split it into two structures, but that will require two queries when reading:
ZADD user:32 3333311232 "user:32:3333311232"
HMSET user:32:3333311232 name "johns" timecreated "3333311232" address "somewhere"
Then, to retrieve, I'll need:
//some range
ZRANGEBYSCORE user:32 3333311232 333331123
result: 1389772850
now to get all information: HGETALL user:32:1389772850
What do you think?
Thank you,
ray.

The two methods you describe are the two common approaches. If you store the entire object in the ZSET, you would typically store it as a JSON string. If you don't need "random" access to the object, that's a valid approach.
I usually go for the other approach: a ZSET combined with hashes. The two queries are not a big deal. You could even abstract them away with a Lua script; see EVAL.
Regarding the TTL, while you cannot expire individual ZSET values, you could expire the hash, and use keyspace notifications to listen for the expired event, and remove the corresponding value from the ZSET.
Let me know if you need some more specifics.
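To make the ZSET-plus-hashes layout concrete, here is a minimal Python sketch. It assumes a redis-py style client `r` (for example `redis.Redis(decode_responses=True)`); the function names and the one-day default TTL are made up for illustration. Note that it does not use keyspace notifications: as a simpler alternative, it removes index entries lazily when a read finds that the hash has already expired.

```python
def add_record(r, user_id, ts, fields, ttl_seconds=86400):
    """Store one record as a hash with a TTL and index it by timestamp."""
    record_key = f"user:{user_id}:{ts}"
    pipe = r.pipeline()  # redis-py pipelines are MULTI/EXEC by default
    pipe.hset(record_key, mapping=fields)
    pipe.expire(record_key, ttl_seconds)
    pipe.zadd(f"user:{user_id}", {record_key: ts})
    pipe.execute()
    return record_key


def records_between(r, user_id, min_ts, max_ts):
    """Return records scored in [min_ts, max_ts]; prune expired index entries."""
    index_key = f"user:{user_id}"
    record_keys = r.zrangebyscore(index_key, min_ts, max_ts)
    if not record_keys:
        return []
    pipe = r.pipeline()
    for key in record_keys:
        pipe.hgetall(key)
    hashes = pipe.execute()
    # An empty reply means the hash expired; drop its member from the ZSET.
    expired = [k for k, h in zip(record_keys, hashes) if not h]
    if expired:
        r.zrem(index_key, *expired)
    return [h for h in hashes if h]
```

A read costs two round trips (one ZRANGEBYSCORE, one pipelined batch of HGETALL), which matches the "two queries" trade-off discussed above.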

Related

REDIS usecase using large keys with small values

I have a use-case for using redis that is a little bit different.
In my MySQL I have an entity, let's call it HumanEntity. this HumanEntity has many to many relations.
HumanEntity.Urls - Many URLs per HumanEntity.
HumanEntity.UserNames - Many UserNames per HumanEntity.
HumanEntity.Phones ...
HumanEntity.Emails ...
In a typical hour, the application creates hundreds of these values.
The use case is that the application receives HTTP calls (100 per second), each carrying a HumanEntity value (URL, UserName, Phone or Email).
I need to scan my MySQL table (1,000,000 records) and return the HumanEntity.Id (integer).
Since it's OK to have some latency in data integrity, I thought about Redis.
Can I store the values as Redis keys and the HumanEntity.Id (integer) as the value?
My API needs to return the HumanEntity.Id (integer).
Does it make sense to have such a long key and such a short value? A URL, for example, may be 1,500 bytes, while the value can be 1 byte.
What is the best Redis method to implement that?
Thanks
If the values are not unique, you may have a problem. Phones, emails or usernames may be unique per user, but I am not sure about URLs or any other property stored in your database. You could overwrite the value of an identifier with another user's.
If you don't have that problem, you can proceed with the string type. The time complexity of GET and SET is O(1), which is the best you can get.
In some cases, such as checking whether a user has used a coupon, you can use a long (say 64-character) user ID as the key and 1 as the value, and use EXISTS to check it. So it is valid to use a long key with a short value.
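As a rough Python sketch of the string-key approach, assuming a redis-py style client `r`: the `lookup:` key prefix and the function names are invented here, and hashing the value with SHA-256 is an optional extra (not something the answer requires) that keeps keys at a fixed length. SET with NX is used so that an existing mapping is never silently overwritten.

```python
import hashlib


def lookup_key(kind, value):
    """Build a fixed-length key such as lookup:url:<sha256 hex digest>."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"lookup:{kind}:{digest}"


def register(r, kind, value, entity_id):
    """Map value -> entity_id. Returns False if it already maps elsewhere."""
    key = lookup_key(kind, value)
    if r.set(key, entity_id, nx=True):  # NX: only set if the key is absent
        return True
    existing = r.get(key)
    return existing is not None and int(existing) == int(entity_id)


def find_entity_id(r, kind, value):
    """Return the HumanEntity.Id for a value, or None if it is unknown."""
    found = r.get(lookup_key(kind, value))
    return int(found) if found is not None else None
```

For example, `register(r, "url", url, 7)` followed by `find_entity_id(r, "url", url)` would give back 7, and a later `register(r, "url", url, 9)` would report the conflict instead of overwriting.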

Cascade deletes in Redis

On my current project I'm implementing an autocompletion service on top of Redis, using the following approach (this article describes it in more detail):
1) To store the data itself, I have a hash whose values are the searchable objects, for instance
HSET data 1 "{\"name\":\"Kill Bill\",\"year\":2003}"
HSET data 2 "{\"name\":\"King Kong\",\"year\":2005}"
2) To store all possible sequences of input characters that could be used in a search (I generate them in advance), I use sorted sets, like
ZADD search:index:k 0 1
ZADD search:index:ki 0 1
ZADD search:index:kil 0 1
ZADD search:index:kill 0 1
The value stored in the sorted set (in my example '1') is the field under which the data is kept in the hash. So, to search for some data (for example where the name starts with 'ki') we need two steps:
data_keys = REDIS.zrevrange('search:index:ki', 0, -1)
matching_data = REDIS.hmget('data', *data_keys)
The issue I'm trying to solve: how do I automatically remove all entries from the sorted sets that relate to a hash value when I remove that value? In relational databases I can use cascade deletion for such cases, but how can I handle it in Redis?
Your design appears awkward to me. I'm unsure what you're actually trying to do with Redis, and perhaps that could be the topic of another question.
That said, to address your question: Redis does not offer a "cascading delete"-like behavior. Instead, if you're deleting hash field "1", iterate over the prefixes and ZREM it from the relevant sorted sets.
Note: do not use a Lua script for this task, as it would generate key names (i.e. sorted sets by prefix) inside the script, which is against the recommendations (it will not work on a cluster).
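A client-side version of that cleanup could look like the Python sketch below. It assumes a redis-py style client `r` and assumes the index was built from every prefix of each lowercased word in the name; the question does not spell out the exact rule, so adjust `index_prefixes` to match however the index was generated. The function names are hypothetical.

```python
def index_prefixes(name):
    """Every prefix of each lowercased word, e.g. 'Kill' -> k, ki, kil, kill."""
    prefixes = []
    for word in name.lower().split():
        for end in range(1, len(word) + 1):
            prefix = word[:end]
            if prefix not in prefixes:
                prefixes.append(prefix)
    return prefixes


def delete_entry(r, entry_id, name):
    """Remove entry_id from every prefix index, then from the data hash."""
    pipe = r.pipeline()
    for prefix in index_prefixes(name):
        pipe.zrem(f"search:index:{prefix}", entry_id)
    pipe.hdel("data", entry_id)
    pipe.execute()
```

The caller has to know the name that was indexed, so in practice you would read the hash value first (HGET data 1) and then call `delete_entry` with its name.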

Getting top results from Redis hash

I am trying to write a query in Redis to get the first 2 field values of my hash key.
Basically, when I do HVALS hashname, I want to get the values of the first 2 fields added (the oldest 2). This is somewhat like getting the TOP 2 tuples in a SQL database.
Is this possible in redis?
No, this isn't possible. The order of fields and values in a Redis Hash is for all intents and purposes random (despite the empirical evidence obtained from experimenting on smallish Hashes). For ordering elements, refer to Redis' Sorted Sets.
Update: to answer the question in the comment, IIUC it looks like you can solve it easily with just Strings. Because of Redis' nature, at any given moment there is either one user waiting for a specific match, or zero. You can SET matchmaking:blue username1:token if the key doesn't exist (i.e. zero users waiting for the match) and GET and DEL it if it exists. Be sure to use SET's "NX" option, MULTI/EXEC and/or Lua to ensure the atomicity of these two logical operations.
From what I have experimented with, HVALS returns values in the order you're looking for, i.e. oldest field first. Be aware that this only holds for small hashes that Redis keeps in its compact encoding, and it is not guaranteed behavior. With that caveat, it is up to you to pick only the first two values in the client program, e.g. HSET myhmap name "abhi", HSET myhmap email "test@test", HSET myhmap planet "earth", HSET myhmap galaxy "andromeda". HVALS myhmap will return "abhi", "test@test", "earth", "andromeda".
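If you need a field order you can rely on, one option is to track it yourself in a companion sorted set, as the first answer hints. The Python sketch below assumes a redis-py style client `r`; the `:seq` and `:order` key suffixes and the function names are invented for illustration.

```python
def hset_tracked(r, key, field, value):
    """HSET a field and remember the order in which fields were first added."""
    seq = r.incr(f"{key}:seq")  # monotonically increasing insertion counter
    pipe = r.pipeline()
    pipe.hset(key, field, value)
    # NX keeps the original score, so updating a field does not reorder it.
    pipe.zadd(f"{key}:order", {field: seq}, nx=True)
    pipe.execute()


def oldest_values(r, key, count):
    """Return the values of the `count` oldest fields, oldest first."""
    fields = r.zrange(f"{key}:order", 0, count - 1)
    return r.hmget(key, fields) if fields else []
```

With the four HSET calls from the example above routed through `hset_tracked`, `oldest_values(r, "myhmap", 2)` would return the name and email values regardless of how Redis encodes the hash internally.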

Redis - Sorted set, find item by property value

In redis I store objects in a sorted set.
In my solution, it's important to be able to run a ranged query by date, so I store the items with the score being the timestamp of each item, for example:
# Score Value
0 1443476076 {"Id":"92","Ref":"7ADT","DTime":1443476076,"ATime":1443901554,"ExTime":0,"SPName":"7ADT33CFSAU6","StPName":"7ADT33CFSAU6"}
1 1443482969 {"Id":"11","Ref":"DAJT","DTime":1443482969,"ATime":1443901326,"ExTime":0,"SPName":"DAJTJTT4T02O","StPName":"DAJTJTT4T02O"}
However, in other situations I need to find a single item in the set based on its ID.
I know I can't just query this data structure as if it were a nosql db, but I tried using ZSCAN, which didn't work.
ZSCAN MySet 0 MATCH Id:92 count 1
It returns; "empty list or set"
Maybe I need to serialize differently?
I have serialized using Json.Net.
How, if possible, can I achieve this: using dates as the score and still being able to look up an item by its ID?
Many thanks,
Lars
Edit:
Assume it's not possible, but any thoughts or inputs are welcome:
Ref: http://openmymind.net/2011/11/8/Redis-Zero-To-Master-In-30-Minutes-Part-1/
In Redis, data can only be queried by its key. Even if we use a hash,
we can't say get me the keys wherever the field race is equal to
sayan.
Edit 2:
I tried to do:
ZSCAN MySet 0 MATCH *87*
127.0.0.1:6379> ZSCAN MySet 0 MATCH *87*
1) "192"
2) 1) "{\"Id\":\"64\",\"Ref\":\"XQH4\",\"DTime\":1443837798,\"ATime\":1444187707,\"ExTime\":0,\"SPName\":\"XQH4BPGW47FM\",\"StPName\":\"XQH4BPGW47FM\"}"
   2) "1443837798"
   3) "{\"Id\":\"87\",\"Ref\":\"5CY6\",\"DTime\":1443519199,\"ATime\":1444172326,\"ExTime\":0,\"SPName\":\"5CY6DHP23RXB\",\"StPName\":\"5CY6DHP23RXB\"}"
   4) "1443519199"
And it finds the desired item, but it also finds another one with an occurrence of 87 in the property ATime. Having more unique, longer IDs might work this way, and I would have to filter the results in code to find the one with the exact value in its property.
Still open for suggestions.
I think it's very simple.
Solution 1(Inferior, not recommended)
Your ZSCAN MySet 0 MATCH Id:92 count 1 didn't work because the stored string is "{\"Id\":\"92\"... not "{\"Id:92\".... The string has been changed into another format. MATCH applies a glob-style pattern to the whole member, so the pattern has to follow the serialized JSON and be wrapped in wildcards, e.g. MATCH "*\"Id\":\"92\"*" or something like that. I'm not familiar with Json.NET, so the exact string is left for you to discover.
By the way, I have to ask you did ZSCAN MySet 0 MATCH Id:92 count 1 return a cursor? I suspect you used ZSCAN in a wrong way.
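For illustration, here is how Solution 1 might look from a client, as a Python sketch that assumes a redis-py style client `r` with decoded string responses, members serialized without whitespace (as in the question), and IDs that contain no glob characters. `find_by_id` is a made-up helper name. The glob only narrows the scan; the exact comparison happens on the parsed JSON.

```python
import json


def find_by_id(r, zset_key, wanted_id):
    """Scan the sorted set for the member whose JSON "Id" equals wanted_id."""
    pattern = f'*"Id":"{wanted_id}"*'  # glob must cover the whole member
    for member, score in r.zscan_iter(zset_key, match=pattern):
        item = json.loads(member)
        if item.get("Id") == wanted_id:  # exact check, not a substring match
            return item, score
    return None
```

This is still O(n) in the size of the set, because ZSCAN has to walk every member to apply the pattern.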
Solution 2(Better, strongly recommended)
ZSCAN is fine when your sorted set is not large and you know how to save network round trips with a Redis Lua script. It still makes the "look up by ID" operation O(n). Therefore, a better solution is to change your data model in the following way:
change sorted set
from
# Score Value
0 1443476076 {"Id":"92","Ref":"7ADT","DTime":1443476076,"ATime":1443901554,"ExTime":0,"SPName":"7ADT33CFSAU6","StPName":"7ADT33CFSAU6"}
1 1443482969 {"Id":"11","Ref":"DAJT","DTime":1443482969,"ATime":1443901326,"ExTime":0,"SPName":"DAJTJTT4T02O","StPName":"DAJTJTT4T02O"}
to
# Score Value
0 1443476076 Id:92
1 1443482969 Id:11
Move the remaining detail data into a separate set of hash keys:
# Key field-value field-value ...
0 Id:92 Ref-7ADT DTime-1443476076 ...
1 Id:11 Ref-DAJT DTime-1443482969 ...
Then you locate an item by ID with HGETALL Id:92 (key names are case-sensitive). For a ranged query by date, do ZRANGEBYSCORE sortedset mindate maxdate, then HGETALL every ID one by one. You'd better use Lua to wrap these commands into one call, and it will still be super fast!
Data in a NoSQL database needs to be organized redundantly, as above. This can make some common operations involve more than one command and round trip, but that can be tackled with Redis's Lua feature. I strongly recommend it, because it wraps the commands into one network round trip, and they are all executed on the Redis server side, atomically and very fast!
Reply if anything is unclear.
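A minimal Python sketch of the Solution 2 layout, assuming a redis-py style client `r` and the key names from the example (`MySet` for the index, `Id:<id>` for the hashes). The function names are hypothetical. It uses a pipeline rather than the Lua wrapper suggested above, so a range query takes two round trips instead of one.

```python
def save_item(r, item):
    """Store the item's fields in a hash and index its key by DTime."""
    item_key = f"Id:{item['Id']}"
    pipe = r.pipeline()
    pipe.hset(item_key, mapping=item)
    pipe.zadd("MySet", {item_key: item["DTime"]})
    pipe.execute()


def get_by_id(r, item_id):
    """O(1) lookup by ID: a single HGETALL on the item's hash."""
    return r.hgetall(f"Id:{item_id}")


def get_by_date_range(r, min_ts, max_ts):
    """ZRANGEBYSCORE for the keys, then one pipelined HGETALL per key."""
    item_keys = r.zrangebyscore("MySet", min_ts, max_ts)
    pipe = r.pipeline()
    for item_key in item_keys:
        pipe.hgetall(item_key)
    return pipe.execute()
```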

How to fetch first 100 records from Redis

I am working on a small application where I use Redis to hold my intermediate data. After inserting data, I need to reload it in the same order in which I inserted it.
I am using the KEYS command to get all keys, but the order of the returned keys is not the same as the insertion order.
You have to maintain order yourself, by keeping a separate list for inserted keys. So, instead of
SET foo bar
you may do something like this:
SET foo bar
RPUSH insert_order foo
Then you can do
LRANGE insert_order 0 99
to get the first 100 inserted keys (LRANGE indexes are zero-based and inclusive).
If you want to track actual insertions (and not updates), you can use SETNX, for example. You can also use a sorted set instead of a list (as mentioned by @Leonid). Additionally, you can wrap the whole thing in Lua, so that the bookkeeping is hidden from the client code.
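The list-based bookkeeping, including the SETNX variant, could be sketched in Python like this. It assumes a redis-py style client `r`; the helper names are invented. The two commands are not atomic here, so wrap them in MULTI/EXEC or Lua if that matters for your application.

```python
def set_tracked(r, key, value, order_key="insert_order"):
    """SET a key; append it to the order list only on first insertion."""
    if r.setnx(key, value):  # True only when the key did not exist yet
        r.rpush(order_key, key)
        return True
    r.set(key, value)  # plain update: the key keeps its original position
    return False


def first_inserted(r, count=100, order_key="insert_order"):
    """Return (key, value) pairs for the first `count` inserted keys."""
    keys = r.lrange(order_key, 0, count - 1)  # LRANGE bounds are inclusive
    return list(zip(keys, r.mget(keys))) if keys else []
```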
To index URLs and get the list in insertion order, you should use a sorted set:
zadd <your_url_list_key> <inserted_time> <url>
Detail data for a single URL should be stored elsewhere, for example in a hash:
hset <your_url_data_key> <url> <url_data>
It's better not to store the detail data in Redis at all: instead of using a Redis hash, save the URL detail data in MySQL.
You can also index md5(url) instead of the URL itself to reduce the size (the full URL is then stored in url_data).
In my project, a sorted set still works fine with about 3 million records (frequent reads and writes). But you should watch the hash size, as it grows really fast.
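Put together, the sorted-set variant with md5-shortened members might look like the Python sketch below. It assumes a redis-py style client `r`; the key names `urls:by_time` and `urls:data` and the function names are placeholders, and the detail data is stored as JSON in a Redis hash only to keep the example self-contained (per the advice above, it could live in MySQL instead).

```python
import hashlib
import json
import time


def url_id(url):
    """Fixed-length (32 hex chars) stand-in for a possibly very long URL."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def index_url(r, url, details, inserted_at=None):
    """Index the URL by insertion time and store its details under the md5."""
    uid = url_id(url)
    ts = time.time() if inserted_at is None else inserted_at
    pipe = r.pipeline()
    pipe.zadd("urls:by_time", {uid: ts}, nx=True)  # NX keeps first insert time
    pipe.hset("urls:data", uid, json.dumps({"url": url, **details}))
    pipe.execute()
    return uid


def first_urls(r, count=100):
    """Return the stored JSON strings for the `count` oldest URLs."""
    ids = r.zrange("urls:by_time", 0, count - 1)
    return r.hmget("urls:data", ids) if ids else []
```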