How do I get list of keys/values using Booksleeve? - redis

I'm trying to get a list of values, where key name starts with let's say "monkey".
I really couldn't find a doc on this. :(
How can I do this? What API should I use? Keys, Sets, Strings? What method?
Or it is not available yet, but a workaround?
Thanks

Redis does not have a "get all keys like {x} along with their values" command, but it does have:
get all keys like {x}
get value of key / keys
Whether your approach is sensible in the first place depends a bit on which server version you are using. If you are on a recent version then the library will use SCAN, which is not terrible. On older server versions it will use KEYS, which is to be avoided at all costs. I am not at a PC so this is pseudo-code only, but:
foreach(var batch in db.GetKeys("monkey*")
.Batchify(100))
{
list.AddRange(await db.Strings.GetString(batch));
}
Note that this isn't optimized - the batches could be fun much more concurrently than the above - but I would need a keyboard and compiler to demonstrate that!

Related

How to list keys without serial number in REDIS?

I am trying to list keys with specific pattern like below:
KEYS "*Team*"
and I am getting resultset with serial number like below:
1) "TeamMetricSummary_google_bps_app_google wfep
league_chambersc2016:04-03-2016_06-04-2016"
2) "\xac\xed\x00\x05t\x00TTeamMetricSummary_google_bps_app_google wfep
league_malini.gto:12-06-2015_04-02-2016"
My problem is that I want to avoid serial number in result set.
Is that possible?
That is not possible. Redis will return the whole key. You can use regex or string operations like split in your application logic to achieve this. For this you must know your input. For example if your key is in a pattern like xTeamNamey. where x and y are some constraints (serial number) you want to avoid, you can insert your key like x:TeamName:y. In retrieval you can use string.split(":")[1] to get the TeamName.
To answer your specific question:
Redis supports Lua scripting. If you're on a version of Redis that is bundled with Lua version 5.0 or above, you Lua script can use regular expressions. Write a Lua script that does redis.call of KEYS command and then does pattern matching to remove serial numbers for you.
Alternative:
By the way, assuming above is part of software that runs in production, here is what I would suggest: Don't use KEYS command on production! As it's documentation says, Redis has to go through all the keys to get keys matching your pattern. Alternatively, you may consider doing shadow writes to a Redis set that's trimmed of serial number every time you add a key. When you need list of keys, you can read the whole set. However, you'll have to manually handle the expiration of keys.

How do I delete all keys matching a specified key pattern using StackExchange.Redis?

I've got about 150,000 keys in a Redis cache, and need to delete > 95% of them - all keys matching a specific key prefix - as part of a cache rebuild. As I can see it, there are three ways to achieve this:
Use server.Keys(pattern) to pull out the entire key list matching my prefix pattern, and iterate through the keys calling KeyDelete for each one.
Maintain a list of keys in a Redis set - each time I insert a value, I also insert the key in the corresponding key set, and then retrieve these sets rather than using Keys. This would avoid the expensive Keys() call, but still relies on deleting tens of thousands of records one by one.
Isolate all of my volatile data in a specific numbered database, and just flush it completely at the start of a cache rebuild.
I'm using .NET and the StackExchange.Redis client - I've seen solutions elsewhere that use the CLI or rely on Lua scripting, but nothing that seems to address this particular use case - have I missed a trick, or is this just something you're not supposed to do with Redis?
(Background: Redis is acting as a view model in front of the Microsoft Dynamics CRM API, so the cache is populated on first run by pulling around 100K records out of CRM, and then kept in sync by publishing notifications from within CRM whenever an entity is modified. Data is cached in Redis indefinitely and we're dealing with a specific scenario here where the CRM plugins fail to fire for a period of time, which causes cache drift and eventually requires us to flush and rebuild the cache.)
Both options 2 & 3 are reasonable.
Steer clear of option 1. KEYS really is slow and only gets slower as your keyspace grows.
I'd normally go for 2 (without LUA, including LUA would increase the learning curve to support the solution - which of course is fine when justified and assuming it's existence is clear/documented.), but 3 could definitely be a contender, fast and simple, as long as you can be sure you won't exceed the configured DB limit.
Use scanStream instead of keys and it will work like a charm.
Docs - https://redis.io/commands/scan
The below code can get you a array of keys starting with LOGIN:: and you can loop through the array and execute redis DEL command to del the corresponding keys.
Example code in nodejs :-
const redis = require('ioredis');
let stream = redis.scanStream({
match: "LOGIN::*"
});
stream.on("data", async (keys = []) => {
let key;
for (key of keys) {
if (!keysArray.includes(key)) {
await keysArray.push(key);
}
}
});
stream.on("end", () => {
res(keysArray);
});

Alphabetical index with millions of rows in redis

For my application, I need an alphabetical index on a set with millions of rows.
When I use a sorted set, and give all members the same score, the result looks perfect.
Performance is also great, with a test set of 2 million rows, the last third does not perform noticably less than the first third of the set.
However, I need to query those results. For example, get the first (max) 100 items that start with "goo". I played around with zscan and sort, but it does not give me a working and performant result.
Since redis is very fast when inserting a new member to the sorted set, it must be technically possible to immediately (well, very quickly) go to the right memory location. I suppose redis uses some kind of quicksort mechanism to accomplish this.
But.. I don't seem to get the result when I just want to query the data, and not write to it.
We use replicated slaves for read actions, and we prefer the (default) read-only config switch. So creating a dummy key and deleting it afterward (however unelegant) is not really an option.
I'm stuck a bit, and I'm thinking about writing a ZLEX command in redis-server itself. Which I could use like this:
HELP "ZLEX" -> (ZLEX set score startswith)
-- Query the lexicographical index of a sorted set, supplying a 'startswith' string.
127.0.0.1:12345> ZLEX myset 0 goo LIMIT 0 100
1) goo
2) goof
3) goons
4) goozer
What are your thoughts? Am I missing something in the standard redis commands?
We're using Redis 2.8.4 x64 on Debian.
Kind regards, TW
Edits:
Note:
Related issue: indexing-using-redis-sorted-sets -> At least the name I gave to ZLEX seems to conform with Antirez' (Salvatore's) standards. As of 24-1-2014, I'm working on implementing ZLEX. It seems to be the easiest and most straight-forward solution for this use case, and Antirez could merge it into the main branch for everyone's benefit.
I've implemented ZLEX.
Here are the full specs.
You can grab the new functionality from here: github tw-bert
I also posted a pull request to Antirez here.
Kind regards, TW
Have you had a look at this ?
It can be useful depending on the length of the field by which you sort, this method requires b*(a^2) keys, where a is the length of the field , and b is amount of rows for this field.

REDIS - How can I query keys and get their values in one query?

Using keys I can query the keys as you can see below:
redis> set popo "pepe"
OK
redis> set coco "kansas"
OK
redis> set cool "rock"
OK
redis> set cool2 "punk"
OK
redis> keys *co*
1) "cool2"
2) "coco"
3) "cool"
redis> keys *ol*
1) "cool2"
2) "cool"
Is there any way to get the values instead of the keys? Something like: mget (keys *ol*)
NOTICE: As others have mentioned, along with myself in the comments on the original question, in production environments KEYS should be avoided. If you're just running queries on your own box and hacking something together, go for it. Otherwise, question if REDIS makes sense for your particular application, and if you really need to do this - if so, impose limits and avoid large blocking calls, such as KEYS. (For help with this, see 2015 Edit, below.)
My laptop isn't readily available right now to test this, but from what I can tell there isn't any native commands that would allow you to use a pattern in that way. If you want to do it all within redis, you might have to use EVAL to chain the commands:
eval "return redis.call('MGET', unpack(redis.call('KEYS', KEYS[1])))" 1 "*co*"
(Replacing the *co* at the end with whatever pattern you're searching for.)
http://redis.io/commands/eval
Note: This runs the string as a Lua script - I haven't dove much into it, so I don't know if it sanitizes the input in any way. Before you use it (especially if you intend to with any user input) test injecting further redis.call functions in and see if it evaluates those too. If it does, then be careful about it.
Edit: Actually, this should be safe because neither redis nor it's lua evaluation allows escaping the containing string: http://redis.io/topics/security
2015 Edit: Since my original post, REDIS has released 2.8, which includes the SCAN command, which is a better fit for this type of functionality. It will not work for this exact question, which requests a one-liner command, but it's better for all reasonable constraints / environments.
Details about SCAN can be read at http://redis.io/commands/scan .
To use this, essentially you iterate over your data set using something like scan ${cursor} MATCH ${query} COUNT ${maxPageSize} (e.g. scan 0 MATCH *co* COUNT 500). Here, cursor should always be initialized as 0.
This returns two things: first is a new cursor value that you can use to get the next set of elements, and second is a collection of elements matching your query. You just keep updating cursor, calling this query until cursor is 0 again (meaning you've iterated over everything), and push the found elements into a collection.
I know SCAN sounds like a lot more work, but I implore you, please use a solution like this instead of KEYS for anything important.

Fastest way to query for object existence in NHibernate

I am looking for the fastest way to check for the existence of an object.
The scenario is pretty simple, assume a directory tool, which reads the current hard drive. When a directory is found, it should be either created, or, if already present, updated.
First lets only focus on the creation part:
public static DatabaseDirectory Get(DirectoryInfo dI)
{
var result = DatabaseController.Session
.CreateCriteria(typeof (DatabaseDirectory))
.Add(Restrictions.Eq("FullName", dI.FullName))
.List<DatabaseDirectory>().FirstOrDefault();
if (result == null)
{
result = new DatabaseDirectory
{
CreationTime = dI.CreationTime,
Existing = dI.Exists,
Extension = dI.Extension,
FullName = dI.FullName,
LastAccessTime = dI.LastAccessTime,
LastWriteTime = dI.LastWriteTime,
Name = dI.Name
};
}
return result;
}
Is this the way to go regarding:
Speed
Separation of Concern
What comes to mind is the following: A scan will always be performed "as a whole". Meaning, during a scan of drive C, I know that nothing new gets added to the database (from some other process). So it MAY be a good idea to "cache" all existing directories prior to the scan, and look them up this way. On the other hand, this may be not suitable for large sets of data, like files (which will be 600.000 or more)...
Perhaps some performance gain can be achieved using "index columns" or something like this, but I am not so familiar with this topic. If anybody has some references, just point me in the right direction...
Thanks,
Chris
PS: I am using NHibernate, Fluent Interface, Automapping and SQL Express (could switch to full SQL)
Note:
In the given problem, the path is not the ID in the database. The ID is an auto-increment, and I can't change this requirement (other reasons). So the real question is, what is the fastest way to "check for the existance of an object, where the ID is not known, just a property of that object"
And batching might be possible, by selecting a big group with something like "starts with C:Testfiles\" but the problem then remains, how do I know in advance how big this set will be. I cant select "max 1000" and check in this buffered dictionary, because i might "hit next to the searched dir"... I hope this problem is clear. The most important part, is, is buffering really affecting performance this much. If so, does it make sense to load the whole DB in a dictionary, containing only PATH and ID (which will be OK, even if there are 1.000.000 object, I think..)
First off, I highly recommend that you (anyone using NH, really) read Ayende's article about the differences between Get, Load, and query.
In your case, since you need to check for existence, I would use .Get(id) instead of a query for selecting a single object.
However, I wonder if you might improve performance by utilizing some knowledge of your problem domain. If you're going to scan the whole drive and check each directory for existence in the database, you might get better performance by doing bulk operations. Perhaps create a DTO object that only contains the PK of your DatabaseDirectory object to further minimize data transfer/processing. Something like:
Dictionary<string, DirectoryInfo> directories;
session.CreateQuery("select new DatabaseDirectoryDTO(dd.FullName) from DatabaseDirectory dd where dd.FullName in (:ids)")
.SetParameterList("ids", directories.Keys)
.List();
Then just remove those elements that match the returned ID values to get the directories that don't exist. You might have to break the process into smaller batches depending on how large your input set is (for the files, almost certainly).
As far as separation of concerns, just keep the operation at a repository level. Have a method like SyncDirectories that takes a collection (maybe a Dictionary if you follow something like the above) that handles the process for updating the database. That way your higher application logic doesn't have to worry about how it all works and won't be affected should you find an even faster way to do it in the future.