Redis: finding keys that match some pattern - redis

I want to get from Redis all the keys whose list (each value is a list type)
has more than x items.
How do I do that?
Is there a simple way, or do I have to use Lua? If Lua, how?

There are many ways to achieve that, each with its own pros and cons. The first decision you have to make is whether you want to have the answer to your query pre-prepared or computed ad-hoc.
For pre-prepared, you'll have to maintain an index of the lists by length. For ad-hoc, you'll have to scan all the lists and get their lengths at runtime.
Assuming you are trying to implement an ad-hoc query, a server-side Lua script is a good choice if you already know how to write one. If not, you can either learn it (https://redis.io/commands/eval) or use a regular Redis client in your language of choice.
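As a rough illustration of the ad-hoc, client-side route, here is a minimal sketch. It assumes a redis-py style client that exposes `scan_iter`, `type` and `llen`; the function name `keys_with_long_lists` and its parameters are made up for this example.

```python
def keys_with_long_lists(client, min_len, match="*"):
    """Return the keys whose value is a list with more than min_len items."""
    result = []
    # SCAN walks the keyspace incrementally instead of blocking like KEYS.
    for key in client.scan_iter(match=match):
        # TYPE replies b"list" or "list" depending on decode_responses.
        if client.type(key) in (b"list", "list") and client.llen(key) > min_len:
            result.append(key)
    return result
```

With redis-py this would be called as something like `keys_with_long_lists(redis.Redis(), 100)`. Every key costs two round trips here, so for a large keyspace a pipeline or a Lua script would likely be faster.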

Related

Storing keys in a redis set

I'm just starting to use redis and I want to retrieve a part of the key-value pairs (like a SELECT or find).
My plan is to save the relevant keys in a set. So I will use SMEMBERS, get the keys and then use MGET for each element.
Is this the proper way to achieve my goal, or is there a better built-in mechanism?
Thanks.
Yes, that's the basic approach to "indexing" in Redis. If your set is
large, you'd want to use SSCAN instead of SMEMBERS. Also, don't use a
single MGET for everything, but rather make batches of constant size
(e.g. 100). These two methods will allow better concurrency.
– Itamar Haber
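The SSCAN plus batched MGET advice could look roughly like the sketch below. It assumes a redis-py style client with `sscan_iter` and `mget`; `fetch_indexed_values`, `index_key` and `batch_size` are names invented for the example.

```python
def fetch_indexed_values(client, index_key, batch_size=100):
    """Yield (key, value) pairs for every key stored in the set index_key."""
    batch = []
    # SSCAN iterates the set in chunks, unlike SMEMBERS which returns it whole.
    for key in client.sscan_iter(index_key):
        batch.append(key)
        if len(batch) == batch_size:
            yield from zip(batch, client.mget(batch))
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield from zip(batch, client.mget(batch))
```

Keys that were removed after being added to the index come back with a value of `None`, so the caller should be ready to skip those.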

Dict vs Record in elm

While implementing a simple app I ran into the problem of trying to update a nested record. I found a solution online but it really seems like a whole lot of bloated code.
As I was looking for alternatives I found Dictionaries. This seems like a solution to that problem: if I use a dictionary inside of a record I can avoid all that bloated code and get nested updates.
Seeing dictionaries and records next to each other made me wonder: why would I use a record instead of a dictionary, or vice versa? The two seem really similar to me, so I am not sure I see the advantage of one over the other. Of course I can see that there is a difference in syntax, but is that all?
I learned somewhere that the access time complexity of Dict is O(log(n)) (does it do a binary search on the keys?), but I can't find the access time complexity for records. I am wondering if it is O(1) and whether that is one of the advantages.
Either way, they both seem to map to one single data structure in other languages (e.g. Python's dictionaries, JS objects, Java hash tables), so why do we need two in Elm?
Dicts and records might seem very similar when coming from JavaScript, but in a statically typed language they are actually very different. I think just about the only property they have in common is that they are both key-value containers.
The biggest differences, I think, are that Dicts are homogeneous, meaning values must be of the same type, and "dynamically" keyed and sized, meaning keys are not statically checked (i.e. at compile time) and key-value pairs can be added at runtime. Records, on the other hand, include the key names and value types in the record type, which means they can hold values of different types, but also can't have keys added or removed at runtime without changing the type itself.
The benefit of being able to easily insert into and update a Dict is something you pay for when you try to get a value back out. Dict.get returns a Maybe which you'll then have to handle, because the type doesn't give any guarantee that the Dict contains anything at all. You also won't get a compiler error if you mistype the name of a key.
Overall, a Dict forsakes most of the benefits of static typing. I think a good rule of thumb is that if you know the key names, you should most likely go with records. If you don't, go with Dict.
You also seem about right regarding performance, but I think that's a secondary concern. Record access should be equivalent to accessing the elements of an array by index, since so much information is known at compile time that it can essentially be compiled down to a fixed-size array.

redis - see if a string contains any key from a set of keys

I have a set of strings, which I was planning to store in a redis set. What I want to do is to check if any of these strings [s] is present inside a subject string ( say S1 ).
I read about SSCAN in redis but it allows me to search if any set member matches a pattern. I want the opposite way round. I want to check if any of the patterns matches my string. Is this possible?
What you want to do is not possible, but if your plan is to match prefixes, like in an autocomplete, you can take a look at sorted sets and ZRANGEBYLEX. You can take a look at this example. Even though it's not working right now, the code is very simple.
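For the prefix case, a minimal sketch of a ZRANGEBYLEX lookup follows. It assumes a redis-py style client with `zrangebylex` and a sorted set in which every member was added with the same score (for example 0), since lexicographic ranges are only meaningful then. The helper names `prefix_range` and `members_with_prefix` are made up for the example.

```python
def prefix_range(prefix):
    """Build inclusive ZRANGEBYLEX bounds covering members that start with prefix."""
    raw = prefix.encode("utf-8")
    # The byte 0xff never appears in valid UTF-8, so it sorts after any suffix.
    return b"[" + raw, b"[" + raw + b"\xff"


def members_with_prefix(client, zset_key, prefix, limit=10):
    """Return up to limit members of zset_key that begin with prefix."""
    low, high = prefix_range(prefix)
    return client.zrangebylex(zset_key, low, high, start=0, num=limit)
```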
There are several ways to do it, it just depends how you want it done. SSCAN is a perfectly legitimate approach where you do the processing client-side and potentially over the network. Depending on your requirements, this may or may not be a good choice.
The opposite way is to let Redis do it for you, or as much as possible, to save on bandwidth, latency and client cpu. The trade off is, of course, having Redis' cpu do it so it may impact performance in some cases.
When it comes to letting Redis do the work, please first understand that the functionality you're describing is not included in it. So you need to build your own and, again, that depends on your specific use case (e.g. how big are s and S1, is S1 indexable as well, ...). Without this information it is hard to make accurate recommendations, but the naive (mine) approach would be to use Lua for the job. The script's logic should either check every substring of S1 for existence in s with SISMEMBER, or do Lua pattern matching of each of s's members against S1.
This solution, of course, has plenty of room for optimization if some assumptions/rules are set.
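For comparison, the client-side SSCAN variant of "match every member of s against S1" can be sketched as below. It assumes a redis-py style client with `sscan_iter` and treats the members as plain substrings rather than patterns; `members_found_in` is a name invented for the example.

```python
def members_found_in(client, set_key, subject):
    """Return the members of the set at set_key that occur inside subject."""
    found = []
    for member in client.sscan_iter(set_key):
        # Members arrive as bytes unless the client uses decode_responses=True.
        text = member.decode("utf-8") if isinstance(member, bytes) else member
        if text in subject:
            found.append(text)
    return found
```

A Lua script would run the same loop on the server, which saves the network transfer of the whole set at the cost of Redis CPU time.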
Edit: Sorted sets and ZLEX* stuff are also possibly good for this kind of thing, as @Soveran points out. To augment his example and for further inspiration, see here for a reversed version and think of the possibilities :) I still can't understand how someone didn't go and implement FTS in Redis!

Elasticsearch querying multiple types and grouped by types?

Suppose I am to search against two types [cars] and [buildings], and I would want the results to be separated. Is there a way one can group results by types?
I understand one simple way will be to query each types separately, but for other use cases one may actually need to query tens or hundreds of types together. Is there a native way or hacky way(like using sort) to achieve this?
This type of grouping behavior is (currently) not available in elasticsearch. It has been a long standing request:
https://github.com/elasticsearch/elasticsearch/issues/256
There are two approaches that can help, both of which are far from perfect, but may be good enough for some use cases.
Client-side aggregation. Request a lot more results than you plan on displaying and then bucket those.
Using multi-query. This allows you to easily pass down some number of queries in a single batch, but will have potential scaling problems if the number of queries gets too large.
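The client-side bucketing from the first approach can be sketched as follows. It assumes the search response has already been parsed into a dict with the usual `hits.hits` array and that each hit carries a `_type` field; `group_hits_by_type` and `per_type` are names chosen for the example.

```python
from collections import defaultdict


def group_hits_by_type(response, per_type=10):
    """Bucket search hits by their _type, keeping at most per_type hits each."""
    groups = defaultdict(list)
    for hit in response["hits"]["hits"]:
        bucket = groups[hit["_type"]]
        # Hits arrive sorted by relevance, so the first ones seen are the best.
        if len(bucket) < per_type:
            bucket.append(hit)
    return dict(groups)
```

The weakness noted above still applies: a type whose documents score low may not show up at all unless the requested result size is large enough.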
This is one feature that Solr has that elasticsearch doesn't, but I have never tried it. I used a similar feature with Autonomy IDOL years back, but the performance was abysmal.
If you want the results separated into groups of documents, you're going to have to restructure your documents, since elasticsearch is focused on finding matching documents. You might get around this by designing a document that has child documents; then you can query for matches on the parent document that represents your type.
I guess there might be some common field (let's say it's [price]) if you want to search against different types. Then it would be reasonable to add a separate type like [price_aggregator] and put the fields [type] and [price] into it. You could then easily build your query against just that one type. This requires some additional work while indexing and more memory to store the index, but it's much more performant when you search.

Lucene: Getting terms from an unstored field

Is there any way to retrieve all the terms in a particular field which is unfortunately not stored? I cannot rebuild the index. Position-based information is not necessary. I just need the list of terms.
UPDATE
I've built a sample index with one stored, another unstored field and tested it with Luke. I was wondering whether I could get access to all terms just like Luke did. This may not be the brightest idea, but might work.
Lucene uses two different concepts: indexing and storing. If you want to extract the terms, you don't need to store anything. You can use Luke, or iterate over the terms through the API. For the Java API, see: How can I get the list of unique terms from a specific field in Lucene?
Luke is Open Source, so just look at how Luke does it.