Usage of RediSearch for simple string and integer indexing on entities

I need to store simple entity objects (C#) in Redis.
I also need to search for them using secondary indexing and exact matching. After struggling with Redis native data types, I chose to try RediSearch for this task.
However I can't find how to use it the way I need.
My goal is very simple. Entity data structures are very straightforward. For example, I have a Device class which looks like this (I have omitted most string/int fields which must not be indexed and thus are not relevant for the example):
public class Device
{
    public int Id { get; set; } // primary key
    public string Name { get; set; } // no indexing on this
    public string SerialNumber { get; set; } // this is indexed and unique
    public string ServerNode { get; set; } // this is indexed
    public int Status { get; set; } // this is indexed, possible values 0..4
    public string Info { get; set; } // no indexing on this
}
I need to store instances of the Device class and retrieve them:
individually, by SerialNumber
as a list by ServerNode and/or Status
I'm switching from MySQL to Redis, and just to make a sql-equivalent,
I'm trying to replicate any of these sql-where clauses:
WHERE SerialNumber = 'RK01:"12345678"'
WHERE ServerNode = 'Node-US-01'
WHERE Status BETWEEN 1 AND 3
WHERE ServerNode = 'Node-IT-04' AND Status = 4
My project is written in C#, and I've begun using StackExchange.Redis and NRediSearch packages.
I cannot figure out how to create the correct schema indices (via redis-cli), nor can I understand how to properly store entities using the NRediSearch Client class.
I don't need any of the full-text or tokenization/stopwords/etc. features of RediSearch, just "simple plain" text and integer indexing.
My first problem is the presence of punctuation characters (mostly hyphens, but in some cases also double quotes) in the strings.
I've created a schema based on TAG and NUMERIC field types:
FT.CREATE Device NOHL NOOFFSETS NOFREQS STOPWORDS 0 SCHEMA Id NUMERIC SORTABLE SerialNumber TAG SORTABLE ServerNode TAG SORTABLE Status NUMERIC SORTABLE
I then tried to add "documents" via NRediSearch (and also via redis-cli for testing purposes) in the following way:
via redis-cli:
FT.ADD Device obj:Device:1 1.0 FIELDS Id 1 Name "FoobarDevice" SerialNumber "RK01:\"12345678\"" ServerNode "Node-US-01" Status 1 Info "this and that"
via NRediSearch
var rsc = new Client("Device", redis_database);
var doc = new Document("obj:Device:1");
doc.Set("Id", 1);
doc.Set("Name", "FoobarDevice");
doc.Set("SerialNumber", "RK01:\"12345678\"");
doc.Set("ServerNode", "Node-US-01");
doc.Set("Status", 1);
doc.Set("Info", "this and that");
rsc.AddDocument(doc);
If I type any of these commands in redis-cli, I get the right Device entity dumped on screen:
> FT.GET Device obj:Device:1
or
> HGETALL obj:Device:1
Now the problem: I am not able to perform any query on these indexes.
First of all, the proper query syntax on the command line is not clear to me; here are some non-working examples:
>FT.SEARCH Device @ServerNode:{Node-US-01}
>FT.SEARCH Device "@ServerNode:{Node-US-01}"
>FT.SEARCH Device @ServerNode:"{Node-US-01}"
>FT.SEARCH Device @ServerNode:{"Node-US-01"}
>FT.SEARCH Device @SerialNumber:{RK01:\"12345678\"}
I either get a syntax error or no results.
I know the serial number string is a little bit odd but I cannot change its format.
Should I store an escaped version of the string values in the document?
Which is the right syntax for reproducing the same results as the SQL-like WHERE clauses?
And how can I deal with string search on field values that include double quotes (") in the value itself?
Last but not least, I could not find any clarifying example or documentation about using the NRediSearch Query class and QueryBuilder namespace
(but this will maybe be a little less obscure after I understand how RediSearch "thinks").

You indexed the data just fine; the only problem is with your search query. Your data contains characters that RediSearch tokenizes on automatically (like - and :). To avoid tokenization you need to escape them in the query. Notice that when using redis-cli you must double the escape character so that the backslash (\) is actually sent to Redis.
With " it's more problematic: you must escape it both in the query and in redis-cli, so you will need three backslashes (two so that a backslash is sent to Redis, and one to escape the quote in the CLI).
It will look something like this:
127.0.0.1:6379> FT.SEARCH Device "@ServerNode:{Node\\-US\\-01}"
1) (integer) 1
2) "obj:Device:1"
3) 1) Id
2) "1"
3) Name
4) "FoobarDevice"
5) SerialNumber
6) "RK01:\"12345678\""
7) ServerNode
8) "Node-US-01"
9) Status
10) "1"
11) Info
12) "this and that"
127.0.0.1:6379> FT.SEARCH Device "@SerialNumber:{RK01\\:\\\"12345678\\\"}"
1) (integer) 1
2) "obj:Device:1"
3) 1) Id
2) "1"
3) Name
4) "FoobarDevice"
5) SerialNumber
6) "RK01:\"12345678\""
7) ServerNode
8) "Node-US-01"
9) Status
10) "1"
11) Info
12) "this and that"
You can read more about tokenization and escaping here: https://oss.redislabs.com/redisearch/Escaping/
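When the query is built from C# rather than typed into redis-cli, only the RediSearch-level escaping is needed (one backslash per special character). Here is a minimal sketch of how the WHERE clauses from the question could be written as query strings. The EscapeTag helper is hypothetical (it is not part of NRediSearch), and it deliberately escapes every character that is not a letter, digit or underscore, which is broader than strictly required.

```csharp
using System;
using System.Text;

// Hypothetical helper: put a backslash before every character that is not
// a letter, digit or underscore, so the tag value is matched as one token.
static string EscapeTag(string value)
{
    var sb = new StringBuilder(value.Length * 2);
    foreach (char c in value)
    {
        if (!char.IsLetterOrDigit(c) && c != '_')
            sb.Append('\\');
        sb.Append(c);
    }
    return sb.ToString();
}

// WHERE SerialNumber = 'RK01:"12345678"'
string bySerial = "@SerialNumber:{" + EscapeTag("RK01:\"12345678\"") + "}";
// WHERE ServerNode = 'Node-US-01'
string byNode = "@ServerNode:{" + EscapeTag("Node-US-01") + "}";
// WHERE Status BETWEEN 1 AND 3 (numeric ranges are inclusive by default)
string byStatusRange = "@Status:[1 3]";
// WHERE ServerNode = 'Node-IT-04' AND Status = 4 (a space means intersection)
string byNodeAndStatus = "@ServerNode:{" + EscapeTag("Node-IT-04") + "} @Status:[4 4]";

Console.WriteLine(bySerial);
Console.WriteLine(byNode);
Console.WriteLine(byStatusRange);
Console.WriteLine(byNodeAndStatus);
```

Assuming the NRediSearch client from the question, each of these strings should be usable as the query text, e.g. rsc.Search(new Query(byNode)). No extra redis-cli style double escaping is needed in that case because the string is sent to Redis as-is.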

Related

Redis secondary indexing maintenance: add and remove are fine, what about updates

All the articles and write-ups I've read talk about insertion only, and I have a problem with updates.
I have implemented a secondary index mechanism using transactions (MULTI) and sets in order to save time when looking up entities. Each index is stored as a set named after the property name and value.
Say we have a Person:
Person Jack = new Person() { Id = 1, Name = "Jack", Age = 30 };
Person Jena = new Person() { Id = 2, Name = "Jena", Age = 30 };
When I choose to index the Age property and insert both, I read the Age of persons 1 and 2, add their ids to the corresponding set, and update the index in the same transaction.
age_30_index holds ids 1,2
When deleting Jack, I read Jack's age (30), remove id 1 from age_30_index, and remove Jack from its set, again within one transaction. All great, well... almost.
The problem starts when I want to change the Age and update the cache. Look at the following scenario:
var p = GetEntity<Person>(id: 1);
p.Age = 31;
UpdateEntity(p);
Now, with the concept above, I will have age_30_index -> 1,2 and age_31_index -> 1.
That is because when updating the entity in the cache I don't know what property value is currently stored in the cache, and therefore can't remove the id from the old index.
Another problem is deleting by Id, or deleting like this:
var p = GetEntity<Person>(id: 1);
p.Age = 31;
DeleteEntity(p);
An easy solution would be to use a distributed lock keyed by entity name, get the entities from the cache, delete the indexes and continue, but in the tests I ran the performance was lacking.
Every other option I thought of is not thread safe because it's not atomic.
Is there any other way to achieve what I'm trying to do?
The project is C# on .NET Framework with Redis on Windows; redisearch.io seems nice but is unfortunately out of scope.

The Hard Way
If RediSearch is really something you don't want to get into because you are running Redis on Windows, my main comment is that you are using the wrong data structure for storing age. Numerics should generally be stored in sorted sets so you can query them more easily.
So for your age you would have the sorted set Person:Age, and when you add Jack (let's say Jack's Id is 1 and his age is 30, as in your example) you would set your index like so:
ZADD Person:Age 30 1
Then, when you need to update Jack's age to 31, all you need to do is update the member's score in the sorted set:
ZADD Person:Age 31 1
You'd then be able to query all the 31-year-olds with a ZRANGEBYSCORE, which in StackExchange.Redis looks like:
db.SortedSetRangeByScore("Person:Age", 31, 31, Exclude.None);
That's it: there is no need to look the entity up in an older collection and a newer one.
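To see why this avoids the stale-index problem from the question, here is a small in-memory sketch. It uses a Dictionary as a stand-in for the sorted set Person:Age (member to score), and RangeByScore is a hypothetical stand-in for ZRANGEBYSCORE with inclusive bounds; nothing here talks to Redis.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for the sorted set Person:Age (member -> score).
var personAge = new Dictionary<string, double>();

// ZADD Person:Age 30 1 and ZADD Person:Age 30 2
personAge["1"] = 30;
personAge["2"] = 30;

// ZADD Person:Age 31 1: re-adding an existing member replaces its score,
// so member 1 can never be listed under both 30 and 31.
personAge["1"] = 31;

// Hypothetical stand-in for ZRANGEBYSCORE with inclusive min and max.
static List<string> RangeByScore(Dictionary<string, double> set, double min, double max) =>
    set.Where(kv => kv.Value >= min && kv.Value <= max)
       .Select(kv => kv.Key)
       .OrderBy(m => m)
       .ToList();

Console.WriteLine(string.Join(",", RangeByScore(personAge, 30, 30))); // 2
Console.WriteLine(string.Join(",", RangeByScore(personAge, 31, 31))); // 1
```

With one set per value (age_30_index, age_31_index) the old entry has to be removed explicitly; with a single sorted set per property the update is one write and the old score simply disappears.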
Your bigger concern about atomicity should be what happens if you need to scale Redis out. If you need to do that, there's no good atomic way to update a distributed index without RediSearch, because all the keys in a multi-key operation or transaction have to be in the same hash slot, or Redis will not accept the operation. Because you are on Windows, I'm assuming you are not using a clustered multi-shard environment and that you are running either stand-alone or with Sentinel.
Assuming you are running stand-alone or with Sentinel, you can run the updates inside a Lua script, which lets you run all of your commands sequentially and atomically. Your script would update your entity and then update all the accompanying indexes. So if you have a sorted set for age, Person:Age in your example, the script to update it would look something like this (KEYS[1] is the entity's hash key, ARGV[1] the new age):
-- store the new age on the entity hash
redis.call('HSET', KEYS[1], 'Age', ARGV[1])
-- re-score the entity in the age index; ZADD overwrites the old score
redis.call('ZADD', 'Person:Age', ARGV[1], KEYS[1])
Naturally, you will have to adapt this script to whatever your index really looks like, so you'll probably want some kind of dynamic script generation. I'd use scripting over MULTI because MULTI in StackExchange.Redis behaves slightly differently from what Redis users expect, due to its architecture.
The best way with RediSearch and Redis.OM
The best way to handle this would be to use RediSearch (I'm putting this part second, even though it really is the right answer, because it sounds like something you want to avoid given your environment). With RediSearch + Redis.OM .NET you would just update the item in question and call collection.Save(). For your example, the following does everything you're looking for (insert, retrieve, update, delete):
using Redis.OM;
using TestRedisOmUpdate;
var provider = new RedisConnectionProvider("redis://localhost:6379");
provider.Connection.CreateIndex(typeof(Person));
var collection = provider.RedisCollection<Person>();
// insert
var id = await collection.InsertAsync(new() {Name = "Jack", Age = 30});
// update
foreach (var person in collection.Where(p => p.Name == "Jack"))
{
    person.Age = 31;
}
await collection.SaveAsync();
// check
foreach (var person in collection.Where(p => p.Name == "Jack"))
{
    Console.WriteLine($"{person.Name} {person.Age}");
}
// delete
provider.Connection.Unlink(id);
All of that would work off of the model:
[Document]
public class Person
{
    [Indexed]
    public string Name { get; set; }
    [Indexed]
    public int Age { get; set; }
}
And you don't have the headache or hassle of maintaining the indexes yourself.

Site search is not filtering results based on Language

For example, there is an item named "ABC" in English with corresponding versions in Japanese, Korean and Chinese (with translated content). If the search keyword is "ABC", then 0 results are expected in Korean, but instead the search returns the Korean version, even though it contains no occurrence of "ABC" except the item name.
Below is the code for filtering:
query = query.Filter(item => item.Language == Sitecore.Context.Language.Name);
Fetching results:
query = query.Where(x => x.Title.Contains(word) || x.Content.Contains(word));
Please provide your inputs for this issue.
Sitecore Version :8.0
Search engine : Lucene

Your where clause includes || x.Content.Contains(word))
The Content property of the SearchResultItem class is a concatenation of all tokenized fields, including the item name. For this reason I think the behaviour is correct.
I recommend using the specific fields you want search on rather than using Content.
You may have a field named "content" in your item. If that's the case then you can avoid the conflict of property names in your POCO by simply mapping it to a different property as follows:
[IndexField("content")]
public virtual string ContentField { get; set; }
These blog posts refer to the _content computed index field from which the Content property is derived.
http://andrewwburns.com/2015/09/03/appending-to-the-_content-field-in-sitecore-search-7-2-and-7-5/
https://sdn.sitecore.net/upload/sitecore6/66/sitecore_search_and_indexing_sc66-usletter.pdf
http://sitecoregadgets.blogspot.co.uk/2009/11/working-with-lucene-search-index-in_25.html

Problems understanding Redis ServiceStack Example

I am trying to get a grip on the ServiceStack Redis example and Redis itself and now have some questions.
Question 1:
I see some static indexes defined, eg:
static class TagIndex
{
    public static string Questions(string tag) { return "urn:tags>q:" + tag.ToLower(); }
    public static string All { get { return "urn:tags"; } }
}
What does that '>' (greater than) sign do? Is this some kind of convention?
Question 2:
public User GetOrCreateUser(User user)
{
    var userIdAliasKey = "id:User:DisplayName:" + user.DisplayName.ToLower();
    using (var redis = RedisManager.GetClient())
    {
        var redisUsers = redis.As<User>();
        var userKey = redis.GetValue(userIdAliasKey);
        if (userKey != null) return redisUsers.GetValue(userKey);
        if (user.Id == default(long)) user.Id = redisUsers.GetNextSequence();
        redisUsers.Store(user);
        redis.SetEntry(userIdAliasKey, user.CreateUrn());
        return redisUsers.GetById(user.Id);
    }
}
As far as I can understand, first a user is stored with a unique id. Is this necessary when using the client (I know it is not necessary for Redis itself)? My model has a meaningful string id (like an email address) which I would like to use. I also see that a SetEntry is done. What does SetEntry do exactly? I think it is an extra key just to set a relation between the id and a searchable key. I guess this is not necessary when storing the object itself with a meaningful key, so user.id = "urn:someusername". And how is SetEntry stored: as a Redis Set or just as an extra key?
Question 3:
This is more Redis related but I am trying to figure out how everything is stored in Redis in order to get a grip on the example so I did:
Started redis-cli.exe in a console
Typed 'keys *' this shows all keys
Typed 'get id:User:DisplayName:joseph' this showed 'urn:user:1'
Typed 'get urn:user:1' this shows the user
Now I also see keys like 'urn:user>q:1' or 'urn:tags' if I do a 'get urn:tags' I get the error 'ERR Operation against a key holding the wrong kind of value'. And tried other Redis commands like smembers but I cannot find the right query commands.

Question 1: return "urn:tags>q:" + tag.ToLower(); gives you the key (a string) for a given tag; the ">" has no meaning for Redis, it's a convention of the developer of the example, and could have been any other character.
Question 3: use the TYPE command to determine the type of the key, then you'll find the right command in redis documentation to get the values.

JsonConvert.DeserializeObject - Handling Empty Strings

I am using Json.NET to serialize an object to be sent to a compact framework 3.5 device (lucky me).
My class on the compact device is:
public class Specification
{
    public Guid Id { get; set; }
    public String Name { get; set; }
    public String Instructions { get; set; }
}
The JSON sent to the device is (notice the null for Instructions):
string json = @"{""Name"":""Test"",""Instructions"":null,""Id"":""093a886b-8ed4-48f0-abac-013f917cfd6a""}";
...and the method being used to deserialize the json is...
var result = JsonConvert.DeserializeObject<Specification>(json);
On the server I'm using the following to create the Json:
var serializer = new JsonSerializer();
serializer.Formatting = Formatting.None;
serializer.Serialize(context.HttpContext.Response.Output, this.Data);
The problem is that on the compact framework, it's failing to put a value into result.Instructions which is causing a null reference when it is referenced later in the code.
I'm using Newtonsoft.Json.Compact v3.5.0.0 (I think that's the latest version), and on the server I'm using Newtonsoft.Json 4.5.0.0.
Question
How can I either:
a) Change the server code to stick a "" instead of a null value in where a string is null.
or
b) Change the compact framework code to be able to handle null value strings.
Things I've tried
I've been looking through the documentation/examples of Json.Net and have tried a multitude of things, like implementing a DefaultContractResolver and a custom JsonContract. Maybe the answer lies within those, but my lack of understanding of Json.Net at this level isn't helping!
Further info
I was using System.Runtime.Serialization.Json.DataContractJsonSerializer for the server-side serialisation, which did generate quotes in the event of empty strings. Unfortunately, I need more flexibility with the serialization, which is why I've started using Json.Net.
Any hints/tips appreciated.

OK - no answers, but having searched all yesterday afternoon I went to bed, and searched again this morning to find: Serializing null in JSON.NET, which pretty much answers my question.

Changing the value of an Id in RavenDB

We have an entity named Organization that we use the UniqueConstraints bundle on. We have a property named NetName that is a UniqueConstraint, and an automatically generated Id.
Since this is unnecessary, we want to use the NetName property as the Id instead, so that we don't need UniqueConstraints to know that it is unique and also get the benefit of being able to use Load when we have the NetName.
We needed to clean up our NetName a bit before using it as an Id, so we created a new temporary property called TempUniqueNetName that now holds the value of:
"organizations/"+ CleanupId(this.NetName)
So we are now ready to simply move that value to our Id, but we can't get it to work. Our problem is that with the PatchRequest below we end up with a new property named Id in the database, but the actual Id still has the same value (see screenshot). Is there a better (correct) way to change the value of an Id?
The Entity:
class Organization {
    public string Id { get; set; }
    [UniqueConstraint]
    public string NetName { get; set; }
    public string TempUniqueNetName{ get; set; }
}
We want to do something like this:
_documentStore.DatabaseCommands.UpdateByIndex(typeof(Organizations).Name,
    new IndexQuery(),
    new[]
    {
        new PatchRequest()
        {
            Type = PatchCommandType.Rename,
            Name = "TempUniqueNetName",
            Value = new RavenJValue("Id")
        }
    });

I don't think you can change the document key via patching. It's not actually stored with the document or the metadata - it's copied into the @id metadata on load to give you the illusion that it's there, and the Raven Client copies it again into your own identity property in the document. But really, it's a separate value in the underlying esent document store. Raven would have to know specifically how to handle this and fake it for you.
You could manually copy the doc from the old id to the new one and delete the old, but that could be time consuming.
There isn't a great answer for renaming a document key right now. There really should be a DatabaseCommand for rekeying a single document, and separate PatchCommandType to rekey when patching. Perhaps this will be added to raven in the future.

You can check an implementation of PUT-DELETE usage for updating IDs in my GitHub repo.
It should look something like this:
store.DatabaseCommands.Put(updatedKey, null, document.DataAsJson, newMetadata);
store.DatabaseCommands.Delete(oldKey, null);
https://github.com/Sevsoad/SagaUpdater/
Also here is some Raven documentation:
https://ravendb.net/docs/article-page/3.0/csharp/client-api/commands/documents/put
