I was wondering if something like this is possible. I am working with a Redis list, and would like to move an item from the middle of the list to the top, like this:
LPUSH mylist "This"
LPUSH mylist "is"
LPUSH mylist "a"
LPUSH mylist "Test"
Somehow move "a" to the top
LRANGE mylist 0 -1
1. "a"
2. "This"
3. "is"
4. "Test"
Thanks for the help!
Redis lists are implemented as linked lists, and linked lists are not suitable for this kind of use (i.e., random access and efficient indexing).
You would have to store all the elements up to and including "a" somewhere, remove them from the list using LTRIM, and then push them back in the order you want (i.e., after RPOPing the last element and LPUSHing it). You could do this with an embedded Lua script, since Redis supports scripting out of the box.
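For example, here is a minimal, untested Lua sketch of a simpler variant of the same idea: instead of trimming, it uses LREM to pull one occurrence of the value out of the list and LPUSH to put it back at the head (so it assumes the value appears only once, or that removing the first occurrence is acceptable):

-- move-to-head.lua (hypothetical file name)
-- KEYS[1] = the list, ARGV[1] = the value to move to the head
local removed = redis.call('LREM', KEYS[1], 1, ARGV[1])
if removed == 1 then
  redis.call('LPUSH', KEYS[1], ARGV[1])
end
return removed

You could invoke it with something like redis-cli --eval move-to-head.lua mylist , a. Because the script runs atomically, no other client can observe the list with "a" missing.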
However, if you want each word to appear only once in your list, you could do this efficiently using a sorted set. You would just have to update the score of the specific element to something greater than all the others (ZADD), and then do a ZRANGEBYSCORE to retrieve the re-ordered set.
Using a sorted set has its trade-offs, though, mostly that insertion and deletion of elements are slower (logarithmic time) than pushing/popping values from a list (constant time). It all depends on your problem: weigh the different approaches (the Redis documentation provides the time complexity of each operation) and pick the one that fits.
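A rough illustration (the scores here are arbitrary and only for the example; ZREVRANGEBYSCORE is used so the element with the highest score comes back first):

ZADD myset 1 "This" 2 "is" 3 "a" 4 "Test"
ZADD myset 5 "a"
ZREVRANGEBYSCORE myset +inf -inf

Re-adding "a" with a score greater than every other member is what effectively moves it to the top of the range.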
I wrote a Lua script that will move an item forwards or backwards in a list:
https://github.com/stereosteve/redis-moveby
As the readme indicates, a sorted set may be a better option, and I have not used this in production, so use with care.
Given I have a list stored in a key "A", how can I duplicate that list in a different key "B"?
I know that for non-list values I can GET and then SET. But for lists, when I try to GET one I get the WRONGTYPE operation error.
Redis provides five core data structures:
Strings
Hashes
Lists
Sets
Sorted sets
Each data structure has its own set of commands.
In order to read the current list you should use the LRANGE command.
The prefix L refers to the List data structure.
(Don't confuse this with SETRANGE, which despite its name operates on the String data structure, not on Sets.)
If you read the Redis LRANGE documentation you will understand how to use it.
Here is a brief example:
LRANGE mylist 0 -1
Where mylist is the list you get the values from.
The offsets start and stop are zero-based indexes, with 0 being the first element of the list (the head of the list), 1 being the next element, and so on. -1 refers to the last element in the list.
Now use LPUSH or RPUSH, depending on which end of the new list you want to insert the elements at.
You should use LRANGE to get all elements of the first list, then use LPUSH or RPUSH to put these elements into the second list.
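If you'd rather do the copy server-side in a single atomic step, here is a rough Lua sketch (it assumes list "A" is small enough to copy in one go; delete "B" first if it might already exist, otherwise the elements are appended to it):

-- copy-list.lua (hypothetical file name)
-- KEYS[1] = source list, KEYS[2] = destination list
local vals = redis.call('LRANGE', KEYS[1], 0, -1)
for i = 1, #vals do
  redis.call('RPUSH', KEYS[2], vals[i])
end
return #vals

It can be run with something like redis-cli --eval copy-list.lua A B.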
Another way to duplicate a key's value, regardless of the data structure, is to DUMP it and then RESTORE it into the new key. In most cases this approach is also the quickest.
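A hedged one-line sketch of that approach in Lua, assuming "A" exists (DUMP of a missing key returns nil) and "B" does not (RESTORE fails with a BUSYKEY error if the destination already exists; the 0 is the TTL, meaning no expiry):

EVAL "return redis.call('RESTORE', KEYS[2], 0, redis.call('DUMP', KEYS[1]))" 2 A B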
In Redis I store objects in a sorted set.
In my solution, it's important to be able to run a ranged query by date, so I store the items with the score being the timestamp of each item, for example:
# Score Value
0 1443476076 {"Id":"92","Ref":"7ADT","DTime":1443476076,"ATime":1443901554,"ExTime":0,"SPName":"7ADT33CFSAU6","StPName":"7ADT33CFSAU6"}
1 1443482969 {"Id":"11","Ref":"DAJT","DTime":1443482969,"ATime":1443901326,"ExTime":0,"SPName":"DAJTJTT4T02O","StPName":"DAJTJTT4T02O"}
However, in other situations I need to find a single item in the set based on its ID.
I know I can't just query this data structure as if it were a NoSQL database, but I tried using ZSCAN, which didn't work:
ZSCAN MySet 0 MATCH Id:92 count 1
It returns "empty list or set".
Maybe I need to serialize differently?
I have serialized using Json.Net.
How, if possible, can I achieve this: using dates as the score and still being able to look up an item by its ID?
Many thanks,
Lars
Edit:
I assume it's not possible, but any thoughts or input are welcome:
Ref: http://openmymind.net/2011/11/8/Redis-Zero-To-Master-In-30-Minutes-Part-1/
In Redis, data can only be queried by its key. Even if we use a hash, we can't say get me the keys wherever the field race is equal to sayan.
Edit 2:
I tried to do:
127.0.0.1:6379> ZSCAN MySet 0 MATCH *87*
1) "192"
2) 1) "{\"Id\":\"64\",\"Ref\":\"XQH4\",\"DTime\":1443837798,\"ATime\":1444187707,\"ExTime\":0,\"SPName\":\"XQH4BPGW47FM\",\"StPName\":\"XQH4BPGW47FM\"}"
2) "1443837798"
3) "{\"Id\":\"87\",\"Ref\":\"5CY6\",\"DTime\":1443519199,\"ATime\":1444172326,\"ExTime\":0,\"SPName\":\"5CY6DHP23RXB\",\"StPName\":\"5CY6DHP23RXB\"}"
4) "1443519199"
And it finds the desired item, but it also finds another one with an occurrence of 87 in the ATime property. Having longer, more unique IDs might make this workable, and I would have to filter the results in code to find the one with the exact value in its property.
Still open for suggestions.
I think it's very simple.
Solution 1 (inferior, not recommended)
Your ZSCAN MySet 0 MATCH Id:92 count 1 didn't work out because the stored string is "{\"Id\":\"92\"..., not "{\"Id:92\"...; serialization has changed the string into another format. So try a pattern along the lines of MATCH *\"Id\":\"92\"* (with wildcards at both ends) so that it matches the JSON-serialized data as it is actually stored in Redis. I'm not familiar with Json.NET, so the exact string is left for you to discover.
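For instance, from redis-cli the corrected call might look something like this (the exact quoting/escaping depends on your client, and COUNT is only a hint to the scan, not a limit on results):

ZSCAN MySet 0 MATCH "*\"Id\":\"92\"*" COUNT 1000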
By the way, I have to ask: did ZSCAN MySet 0 MATCH Id:92 count 1 return a cursor? I suspect you used ZSCAN in the wrong way.
Solution 2 (better, strongly recommended)
ZSCAN is fine when your sorted set is not large and you know how to save network round-trip time with a Redis Lua script, but it still makes the "look up by ID" operation O(n). Therefore, a better solution is to change your data model in the following way:
Change the sorted set from:
# Score Value
0 1443476076 {"Id":"92","Ref":"7ADT","DTime":1443476076,"ATime":1443901554,"ExTime":0,"SPName":"7ADT33CFSAU6","StPName":"7ADT33CFSAU6"}
1 1443482969 {"Id":"11","Ref":"DAJT","DTime":1443482969,"ATime":1443901326,"ExTime":0,"SPName":"DAJTJTT4T02O","StPName":"DAJTJTT4T02O"}
to:
# Score Value
0 1443476076 Id:92
1 1443482969 Id:11
Move the rest of the detailed data into a set of hash keys:
# Key field-value field-value ...
0 Id:92 Ref-7ADT DTime-1443476076 ...
1 Id:11 Ref-DAJT DTime-1443482969 ...
Then you can locate an item by ID with HGETALL Id:92. For the ranged query by date, you do ZRANGEBYSCORE sortedset mindate maxdate and then HGETALL for each returned ID one by one. You'd better use Lua to wrap these commands into one call, and it will still be super fast!
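A rough Lua sketch of that ranged query under this model (the key names are the ones from the example above; mindate and maxdate are passed as arguments):

-- range-by-date.lua (hypothetical file name)
-- KEYS[1] = the sorted set, ARGV[1] = mindate, ARGV[2] = maxdate
local ids = redis.call('ZRANGEBYSCORE', KEYS[1], ARGV[1], ARGV[2])
local result = {}
for i = 1, #ids do
  -- each member of the sorted set (e.g. "Id:92") is also the key of its hash
  result[i] = redis.call('HGETALL', ids[i])
end
return result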
Data in a NoSQL database often needs to be organized in a redundant way like the above. This may make a common operation involve more than one command and round trip, but that can be tackled with Redis' Lua feature. I strongly recommend it: the script goes over the network in a single round trip, is executed entirely on the redis-server side, and runs atomically and very fast.
Reply if there's anything that is unclear.
I am brand new to Redis, so I have a number of questions regarding the SETBIT command.
I have a dataset of the following type:
{'items':[{'item_1':0001...1000,
...
'item_n':0011...0011}
]
}
In each item there are tens of thousands of bits, and there are hundreds of thousands of items. It seems that I can use the following to set the items:
Redis.setbit('item1', 0, 0)
Redis.setbit('item1', 1, 0)
Redis.setbit('item1', 2, 0)
Redis.setbit('item1', 3, 1)
...
However, this seems horribly inefficient. Is there any way to set all of the bits at once?
Is there some way I can group these into a set or a hash so I can look up which items are currently set? They change on a daily basis, and I need to know what was previously written so I can analyze and delete it accordingly.
How can I look up the names of the previously written items?
No and yes. SETBIT (as its name suggests) sets a single bit. However, since Redis uses the String data type to store bits, you can construct the relevant item's string in your app and then just SET it in one fell swoop. More information about the internal representation of bits in strings is at Can someone explain redis setbit command?.
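A tiny illustration, assuming the hypothetical key item1: a bitmap whose first eight bits are 00010011 is just the single byte 0x13, so the whole thing can be written with one SET (redis-cli understands \xHH escapes inside double quotes; from application code you would build the byte array with your client library instead):

SET item1 "\x13"
GETBIT item1 3

The GETBIT call returns 1, because bit offset 3 of 00010011 is set, exactly as if each bit had been written with SETBIT.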
To keep track of the items you've written to and look them up, Redis' Sets should do the trick nicely: SADD your item names to a key (possibly named after the date), and use SSCAN to fetch them back.
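Something along these lines, with a hypothetical per-day key name:

SADD written:2015-10-07 item1 item2 item3
SSCAN written:2015-10-07 0

SSCAN returns a cursor plus a batch of members; keep calling it with the returned cursor until the cursor comes back as 0.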
Using KEYS I can query for key names, as you can see below:
redis> set popo "pepe"
OK
redis> set coco "kansas"
OK
redis> set cool "rock"
OK
redis> set cool2 "punk"
OK
redis> keys *co*
1) "cool2"
2) "coco"
3) "cool"
redis> keys *ol*
1) "cool2"
2) "cool"
Is there any way to get the values instead of the keys? Something like: mget (keys *ol*)
NOTICE: As others have mentioned, along with myself in the comments on the original question, KEYS should be avoided in production environments. If you're just running queries on your own box and hacking something together, go for it. Otherwise, question whether Redis makes sense for your particular application, and whether you really need to do this - if so, impose limits and avoid large blocking calls such as KEYS. (For help with this, see the 2015 Edit, below.)
My laptop isn't readily available right now to test this, but from what I can tell there aren't any native commands that would allow you to use a pattern in that way. If you want to do it all within Redis, you might have to use EVAL to chain the commands:
eval "return redis.call('MGET', unpack(redis.call('KEYS', KEYS[1])))" 1 "*co*"
(Replacing the *co* at the end with whatever pattern you're searching for.)
http://redis.io/commands/eval
Note: This runs the string as a Lua script - I haven't dug into it much, so I don't know if it sanitizes the input in any way. Before you use it (especially if you intend to with any user input), test injecting further redis.call functions and see if it evaluates those too. If it does, then be careful about it.
Edit: Actually, this should be safe because neither Redis nor its Lua evaluation allows escaping the containing string: http://redis.io/topics/security
2015 Edit: Since my original post, Redis 2.8 has been released, which includes the SCAN command. It is a better fit for this type of functionality. It will not work for this exact question, which requests a one-liner command, but it's better under any reasonable constraints and environments.
Details about SCAN can be read at http://redis.io/commands/scan .
To use it, you essentially iterate over your data set using something like scan ${cursor} MATCH ${query} COUNT ${maxPageSize} (e.g. scan 0 MATCH *co* COUNT 500). Here, cursor should always be initialized to 0.
This returns two things: first is a new cursor value that you can use to get the next set of elements, and second is a collection of elements matching your query. You just keep updating cursor, calling this query until cursor is 0 again (meaning you've iterated over everything), and push the found elements into a collection.
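Against the keys from the example above, the loop would look something like this (the returned cursor values will differ from instance to instance; on a tiny database a single call may already come back with cursor 0):

scan 0 MATCH *co* COUNT 500
scan <returned cursor> MATCH *co* COUNT 500
MGET cool2 coco cool

where the last MGET fetches the values of whatever key names the scan calls collected.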
I know SCAN sounds like a lot more work, but I implore you, please use a solution like this instead of KEYS for anything important.
I'm trying to implement something like Google Suggest on a website I am building and am curious how to go about doing it on a very large dataset. Sure, if you've got 1,000 items you cache them and just loop through them. But how do you go about it when you have a million items? Further, suppose that the items are not one word.
Specifically, I have been really impressed by Pandora.com. For example, if you search for "wet" it brings back "Wet Sand", but it also brings back Toad The Wet Sprocket. And their autocomplete is FAST. My first idea was to group the items by the first two letters, so you would have something like:
Dictionary<string,List<string>>
where the key is the first two letters. That's OK, but what if I want to do something similar to Pandora and allow the user to see results that match the middle of the string? With my idea, "Wet" would never match Toad the Wet Sprocket because it would be in the "TO" bucket instead of the "WE" bucket. So then perhaps you could split the string up so that "Toad the Wet Sprocket" goes in the "TO", "WE" and "SP" buckets (stripping out the word "THE"), but when you're talking about a million entries, each of which may contain, say, a few words, it seems like you'd quickly start using up a lot of memory. OK, that was a long question. Thoughts?
As I pointed out in How to implement incremental search on a list, you should use structures like a trie or Patricia trie for searching patterns in large texts.
And for discovering patterns in the middle of some text there is one simple solution. I am not sure if it is the most efficient solution, but I usually do it as follows.
When I insert some new text into the trie, I just insert it, then remove the first character, insert again, remove the second character, insert again... and so on until the whole text is consumed. Then you can discover every substring of every inserted text with just one search from the root. The resulting structure is called a suffix trie (its compressed form is known as a suffix tree), and there are a lot of optimizations available.
And it is really incredibly fast. To find all texts that contain a given sequence of n characters you have to inspect at most n nodes and perform a search on the list of children for every node. Depending on the implementation of the child node collection (array, list, binary tree, skip list), you might be able to identify the required child node with as few as 5 search steps, assuming case-insensitive Latin letters only. Interpolation search might be helpful for large alphabets and nodes with many children, such as those usually found near the root.
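A minimal Lua sketch of the suffix-insertion idea, using nested tables as trie nodes (only the insertion is shown; in practice each terminal node would also record which item the suffix came from, so a match can be mapped back to results):

-- insert every suffix of `text` into a trie built from nested tables
local function insert_suffixes(trie, text)
  for start = 1, #text do
    local node = trie
    for i = start, #text do
      local ch = text:sub(i, i)
      node[ch] = node[ch] or {}   -- create the child node if it doesn't exist yet
      node = node[ch]
    end
    node.terminal = true          -- a suffix ends at this node
  end
end

local trie = {}
insert_suffixes(trie, "toad the wet sprocket")

To check whether a query such as "wet" occurs anywhere in an inserted text, walk from the root following its characters; if you never fall off the trie, the substring is present.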
Don't try to implement this yourself (unless you're just curious). Use something like Lucene or Endeca - it will save you time and hair.
Not algorithmically related to what you are asking, but make sure you have a 200 ms or longer delay (lag) after the keypress(es), so you ensure that the user has stopped typing before issuing the asynchronous request. That way you will reduce redundant HTTP requests to the server.
I would use something along the lines of a trie, and have the value of each leaf node be a list of the possibilities that contain the word represented by that leaf node. You could sort them in order of likelihood, or dynamically sort/filter them based on other words the user has entered into the search box, etc. It will execute very quickly and within a reasonable amount of RAM.
You keep the items on the server side (perhaps in a DB, if the dataset is really large and complex) and you send AJAX calls from the client's browser that return the results using json/xml. You can do this in response to the user typing, or with a timer.
If you don't want a trie and you want matches from the middle of the string, you generally want to run some sort of edit distance function (Levenshtein distance), which will give you a number indicating how well two strings match up. It's not a particularly efficient algorithm, but it doesn't matter too much for things like words, as they're relatively short. If you're running comparisons on, say, 8000-character strings, it'll probably take a few seconds. I know most languages have an implementation, or you can find code/pseudocode for it pretty easily on the internet.
I've built AutoCompleteAPI exactly for this scenario.
Sign up to get a private index, then upload your documents.
Example upload using curl on document "New York":
curl -X PUT -H "Content-Type: application/json" -H "Authorization: [YourSecretKey]" -d '{
"key": "New York",
"input": "New York"
}' "http://suggest.autocompleteapi.com/[YourAccountKey]/[FieldName]"
After indexing all documents, to get autocomplete suggestions, use:
http://suggest.autocompleteapi.com/[YourAccountKey]/[FieldName]?prefix=new
You can use any client autocomplete library to show these results to the user.