How to use redis to associate metadata with objects - redis

Say for example I have a news article in redis:
SET article:id '{ "title": "this is the title", "content": "this is the content" }'
Now say I would like to associate some metadata like a tag, say "politics". What would the idiomatic way to do this be?
Would it be to add a set for the tags with the set ID following a convention like article:<id>:tags?
SADD article:id:tags 'politics'

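You might want to consider using a Redis hash for that:
HMSET article:id "title" "this is the title" "content" "this is the content" "tag" "politics"
(On Redis 4.0 and later, HSET accepts the same arguments, and HMSET is considered deprecated.)
If you want to fetch articles by tag, a reverse index, with one set per tag holding the article IDs, might be better:
SADD tags:politics article_id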
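To make the layout concrete, here is a minimal Python sketch that only builds the commands for one article: a hash for the fields, a per-article tag set following the article:<id>:tags convention from the question, and the reverse tags:<tag> sets. The function name article_commands is made up for illustration, and nothing here talks to a Redis server.

```python
def article_commands(article_id, fields, tags):
    """Build the Redis commands for one article, its tag set, and the reverse tag index."""
    key = f"article:{article_id}"
    commands = []

    # One hash holds the article's own fields.
    hset = ["HSET", key]
    for name, value in fields.items():
        hset += [name, value]
    commands.append(hset)

    if tags:
        # Forward lookup: which tags does this article have?
        commands.append(["SADD", f"{key}:tags", *tags])

    # Reverse lookup: which articles carry this tag?
    for tag in tags:
        commands.append(["SADD", f"tags:{tag}", str(article_id)])

    return commands
```

Calling article_commands(42, {"title": "this is the title"}, ["politics"]) yields one HSET and two SADD commands; SMEMBERS tags:politics would then list the IDs of every article tagged "politics".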

Related

How to sort Redis list of objects using the properties of object

I have JSON data (see the example below) which I'm storing in a Redis list using 'rpush' with 'person' as the key.
Example data:
[
{ "name": "john", "age": 30, "role": "developer" },
{ "name": "smith", "age": 45, "role": "manager" },
{ "name": "ram", "age": 35, "role": "tester" }
]
Now when I get this data using lrange person 0 -1, it gives me results as '[object object]'.
So, to actually get them with property names, I'm storing them by stringifying them and parsing them back to objects to use the object properties.
But the issue with converting to a string is that I'm not able to sort them using any property, say name, age or role.
My question is: how do I store this JSON in a Redis list and sort it by any of the properties?
Thanks.
Very recently I posted an answer to a very similar question.
The easiest approach is to use the Redis Search module (which makes the approach portable to many clients / languages):
Store each needed object as a separate key, following a prefixed key pattern (keys named prefix:something) and a standard schema (all user keys are JSON, and all contain the field you want to sort by).
Make a search index with FT.CREATE, with the ON JSON parameter to index JSON-type keys, and likely the PREFIX parameter to index just the needed keys. In the SCHEMA, declare each needed field as x AS y t, where x is the JSONPath of the field (for example $.age), y is the alias to use in queries, and t is the type of the field (TEXT, TAG, NUMERIC, etc.; see the documentation), optionally adding SORTABLE to the fields that need to be sorted.
Use the FT.SEARCH command with any combination of "@field:value" search terms, and optionally SORTBY.
Otherwise, it's possible to just get all keys that follow a pattern using the KEYS command (or SCAN, which does not block the server on a large keyspace), and sort them with manual language-specific code. That is of course more involved, depends on the language and available libraries, and is therefore less portable.
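For the manual fallback, sorting in application code can look like the Python sketch below. It assumes the items come back from the list as JSON strings, which is what the stringify approach in the question produces; the helper name sort_people and the raw list are illustrative stand-ins, not output from a real server.

```python
import json


def sort_people(raw_items, field, reverse=False):
    """Parse JSON strings (as LRANGE would return them) and sort by one property."""
    people = [json.loads(item) for item in raw_items]
    return sorted(people, key=lambda person: person[field], reverse=reverse)


# Stand-in for the result of LRANGE person 0 -1.
raw = [
    json.dumps({"name": "john", "age": 30, "role": "developer"}),
    json.dumps({"name": "smith", "age": 45, "role": "manager"}),
    json.dumps({"name": "ram", "age": 35, "role": "tester"}),
]

by_age = sort_people(raw, "age")
by_role_desc = sort_people(raw, "role", reverse=True)
```

This keeps the list storage as it is, but every sort has to fetch and parse the whole list, which is the cost the search-index approach avoids.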

Querying json with RedisJSON

I have a list of users in JSON format that I want to display as a table to an admin user in the user interface. I want to store all the users in Redis, and I am thinking this is how I would store them:
JSON.SET user $ '{"id":"1","user_id":"1","one": "1","two": "2","three": "3","rank_score": 1.44444}' EX 60
JSON.SET user $ '{"id":"2","user_id":"2","one": "2","two": "2","three": "3","rank_score": 2.44444}'
JSON.SET user $ '{"id":"3","user_id":"3","one": "3","two": "3","three": "3","rank_score": 3.44444}'
JSON.SET user $ '{"id":"4","user_id":"4","one": "4","two": "4","three": "4","rank_score": 4.44444}'
and later get all the user records.
However, the command JSON.GET user $ only returns the 4th record.
The MULTI command still gave me one record:
127.0.0.1:6379> multi
OK
127.0.0.1:6379(TX)> json.get user
QUEUED
127.0.0.1:6379(TX)> exec
1) "{\"id\":\"4\",\"user_id\":\"4\",\"one\":\"4\",\"two\":\"4\",\"three\":\"4\",\"rank_score\":4.44444}"
I would like to:
> Get all
> Get all records where id is a given id
> Get all records order by rank_score desc
> Remove record where id is 2
How do I group all my users so that I can get all of them at once?
Edit:
I have made the users_list to nest everything in
JSON.SET user $ '{"users_list":[{"id":"1","user_id":"1","one": "1","two": "1","three": "1","rank_score": 1.44444},{"id":"2","user_id":"2","one": "2","two": "2","three": "2","rank_score": 2.44444},{"id":"3","user_id":"3","one": "3","two": "3","three": "3","rank_score": 3.44444},{"id":"4","user_id":"4","one": "4","two": "4","three": "4","rank_score": 4.44444}]}'
Readable
JSON.SET user $ '{
"users_list":[
{
"id":"1",
"user_id":"1",
"one": "1",
"two": "1",
"three": "1",
"rank_score": 1.44444
},
{
"id":"2",
"user_id":"2",
"one": "2",
"two": "2",
"three": "2",
"rank_score": 2.44444
},
{
"id":"3",
"user_id":"3",
"one": "3",
"two": "3",
"three": "3",
"rank_score": 3.44444
},
{
"id":"4",
"user_id":"4",
"one": "4",
"two": "4",
"three": "4",
"rank_score": 4.44444
}
]
}'
Will having more than 128 users result in an error?
RedisJSON is still a key-value store at its core. There is only one value for a given key. When you call JSON.SET with the same path on an existing key again, the value provided overrides the one that was there before. So each call to JSON.SET user $ ... overwrites the previous user, and only the last one remains.
Instead of using a user key, it is standard practice to use user as prefix, storing keys named like user:1, user:2, etc. This is preferable for several reasons (more performant than storing a single JSON document array, can define ACL rules for specific key patterns, etc), but most importantly for your question:
You can use RediSearch's full-text search using FT.SEARCH command to get keys that conform to any needed combination of parameters. For example, all users that have a specific id. It also allows sorting. For example, if you store users like
JSON.SET user:1 $ '{"id":"1","user_id":"1","one": "1","two": "2","three": "3","rank_score": 1.44444}'
JSON.SET user:2 $ '{"id":"2","user_id":"2","one": "2","two": "2","three": "3","rank_score": 2.44444}'
(JSON.SET does not take an EX argument; to expire a key, call EXPIRE user:1 60 after setting it.)
You can create an index using FT.CREATE for whatever fields need to be searchable. For example, to index the ids and rank_scores, the command is
FT.CREATE userSearch ON JSON PREFIX 1 user: SCHEMA $.id AS id TAG $.rank_score AS rank_score NUMERIC SORTABLE
To explain in detail, this says: create a search index named userSearch, for JSON-type keys, for all keys with a single prefix user:, with the following fields: id is a tag (the ids are stored as JSON strings, so an exact-match TAG field fits them; a NUMERIC field would need JSON numbers), and rank_score is a sortable number.
Here is how you do the things you mentioned (note that FT.SEARCH returns only the first 10 results unless you add LIMIT):
Get all (I assume you mean get all users): KEYS user:* or, through the index, FT.SEARCH userSearch "*"
Get all records where id is a given id (let's say we are looking for id 9000): FT.SEARCH userSearch "@id:{9000}"
Get all records ordered by rank_score descending: FT.SEARCH userSearch "*" SORTBY rank_score DESC
Remove the record where id is 2 (I assume you mean all matching records): first FT.SEARCH userSearch "@id:{2}", then for each of the results JSON.DEL key.
For more info read the official documentation, it describes the capabilities of the system in detail.
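To illustrate why one key per user answers the four requests, here is a small in-memory Python model. A plain dict stands in for the Redis keyspace, so this is only a sketch of the data layout under that assumption; the function names are made up and none of this is the RedisJSON or RediSearch API.

```python
import json

# Stand-in for the Redis keyspace: one JSON string per "user:<id>" key.
store = {}


def set_user(user):
    # Like JSON.SET user:<id> $ '<json>': writing the same key again overwrites it.
    store[f"user:{user['id']}"] = json.dumps(user)


def all_users():
    # Like scanning every key that matches user:*.
    return [json.loads(value) for key, value in store.items() if key.startswith("user:")]


def users_with_id(user_id):
    return [user for user in all_users() if user["id"] == user_id]


def users_by_rank_desc():
    return sorted(all_users(), key=lambda user: user["rank_score"], reverse=True)


def delete_users_with_id(user_id):
    # Find the matching keys first, then delete each one (search, then JSON.DEL).
    doomed = [key for key, value in store.items()
              if key.startswith("user:") and json.loads(value)["id"] == user_id]
    for key in doomed:
        del store[key]
    return len(doomed)


for i in range(1, 5):
    set_user({"id": str(i), "user_id": str(i), "rank_score": i + 0.44444})
```

With a single user key, every set_user call would replace the previous value, which is exactly the behaviour described in the question.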

Using of structured data markup with review authority

I'm trying to use structured data to produce a review like this in Google search results (please see the image):
According to this link I've to write the following structured data markup -
<script type="application/ld+json">
{
"@context": "http://schema.org/",
"@type": "Review",
"itemReviewed": {
"@type": "Thing",
"name": "Super Book"
},
"author": {
"@type": "Person",
"name": "Joe"
},
"reviewRating": {
"@type": "Rating",
"ratingValue": "7",
"bestRating": "10"
},
"publisher": {
"@type": "Organization",
"name": "Washington Times"
}
}
</script>
But according to this link I have to get the review from a trusted review authority. I'm wondering why we need the structured data markup at all (it contains static values such as 'ratingValue' and 'bestRating', which surely shouldn't be static), and how we can combine it with a trusted review authority to get a dynamic rating that changes over time.
If I'm understanding your question correctly, I think you are confusing two issues. Google requires reviews to be created using Schema markup in order for the review to have a chance to rank directly in the SERPs.
It is the companies that provide reviews: Yelp, Angie's List, Washington Times, etc, that have to format their content management systems to upload user generated review data into the proper markup.
So if you're a web developer working for one of these companies, then it makes sense to code the CMS so that the listings are displayed using schema markup.
If you are the marketer, your job is to get reviews, not format the way they are getting displayed.
There are of course other ways to use Schema markup on your own site to boost organic traffic. Consider for example the first SERP screenshot displayed in this article.
Here the webmaster has used schema markup to list three upcoming events in their result, which gives them four links in a single listing. This causes the listing to stand out and gives increased incentive for users to click, almost guaranteeing a higher click-thru rate than if they'd have not used the markup.
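On the "static values" worry: the markup is normally generated by the CMS at render time from whatever rating is stored, so it changes whenever the data does. A minimal Python sketch of that idea follows; the function name review_jsonld and its parameters are hypothetical, and the field layout simply mirrors the markup in the question.

```python
import json


def review_jsonld(item_name, author, rating_value, best_rating, publisher):
    """Render a schema.org Review as a JSON-LD script element from current data."""
    data = {
        "@context": "http://schema.org/",
        "@type": "Review",
        "itemReviewed": {"@type": "Thing", "name": item_name},
        "author": {"@type": "Person", "name": author},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": str(rating_value),
            "bestRating": str(best_rating),
        },
        "publisher": {"@type": "Organization", "name": publisher},
    }
    # Escape "</" so a stored value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

A page template would call something like review_jsonld("Super Book", "Joe", 7, 10, "Washington Times") with values read from the database on each request.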

Lucene - Custom Analyzer/Parser for JSON objects?

I have a requirement for a very specific Lucene implementation which stores multiple "Properties" fields containing serialized JSON strings.
Example:
Document:
ID: "99"
Text: "Lorepsum Ipsum"
Properties: "{
"lastModified": "1/2/2015",
"user": "johndoe",
"modifiedChars": 2,
"before": "text a",
"after": "text b",
}"
Properties:"{
"lastModified": "1/2/2013",
"user": "johncotton",
"modifiedChars": 6,
"before": "text aa",
"after": "text bbb",
}"
Properties: "{
"lastModified": "1/3/2015",
"user": "johnmajor",
"modifiedChars": 3,
"before": "text aa",
"after": "text b",
}"
I'm aware that ElasticSearch and Solr have implementations to lookup within JSON objects but I'm using Lucene's core API (3.0.5).
My goal is to use lucene's API and with some added implementation to search within the JSON strings, for example:
Building a type of BooleanQuery where at least one "Properties" field MUST match all the values in the query (e.g. the query +user:tom +modifiedChars:3 +before:"text A", etc.).
I have some ideas but I have no clue where to begin. What I'm asking is some high level ideas to achieve such implementation. A custom analyzer maybe to use with a query parser?
Consider it an open ended question. All suggestions are welcome.
If you will always search for the complete set of values...
Create a "property" field for each set. The value would just be the concatenated set of values, i.e. "1/2/2015:johndoe:2:text a:text b".
Alternatively... create a separate doc for each set. This would allow you to search for different combinations of values without conflating the different sets.
Yes that might mean duplicating the Text field. If it's not big then I wouldn't care too much (especially if you're not using a "stored" field).
Do you need to combine text and property in your queries? ("text:ipsum AND property:xxx")
If not, then put the text in yet another doc.
If the idea is to search in order to get the "ID" field, then some combination of the above ought to work.
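The two indexing strategies above can be modelled in a few lines of Python. This is not Lucene code: dicts stand in for Lucene documents, and the helper names (concat_properties, split_into_docs, matching_ids) are made up purely to show why one record per property set avoids mixing values from different sets.

```python
FIELDS = ("lastModified", "user", "modifiedChars", "before", "after")


def concat_properties(props, fields=FIELDS):
    """Option 1: one exact-match value per property set."""
    return ":".join(str(props[name]) for name in fields)


def split_into_docs(doc_id, text, property_sets):
    """Option 2: one flat record per property set, repeating the ID and text."""
    return [{"id": doc_id, "text": text, **props} for props in property_sets]


def matching_ids(docs, **required):
    """IDs of records where every required field matches within the same set."""
    return sorted({doc["id"] for doc in docs
                   if all(doc.get(name) == value for name, value in required.items())})


property_sets = [
    {"lastModified": "1/2/2015", "user": "johndoe", "modifiedChars": 2,
     "before": "text a", "after": "text b"},
    {"lastModified": "1/3/2015", "user": "johnmajor", "modifiedChars": 3,
     "before": "text aa", "after": "text b"},
]
docs = split_into_docs("99", "Lorepsum Ipsum", property_sets)
```

Here matching_ids(docs, user="johnmajor", modifiedChars=3) finds ID "99", while user="johndoe" combined with modifiedChars=3 finds nothing, because those two values come from different property sets.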

Ensuring Uniqueness for a sorted set in redis

I am trying to store media objects and have them retrievable by a certain time range through redis. I have chosen a sorted set data type to do this. I am adding elements like:
zAdd: key: media:552672 score: 1355264694
zAdd: key: media:552672 score: 1355248565
zAdd: key: media:552672 score: 1355209157
zAdd: key: media:552672 score: 1355208992
zAdd: key: media:552672 score: 1355208888
zAdd: key: media:552672 score: 1355208815
Where key is unique to the location id where the media was taken, and the score is the creation time of the media object. The value is a json_encode of the media object.
When I retrieve using zRevRangeByScore, there are occasionally duplicate entries. I'm essentially using Redis as a buffer in front of an external API: if users make the same API call twice within X seconds, I retrieve the results from the cache; otherwise I add them to the cache without checking whether they already exist, relying on the definition of a set not containing duplicates.
Possible known issues:
If the media object attribute changes between caching it will show up as a duplicate
Is there a better way to store this type of data without doing checks on the redis client side?
TL;DR:
What is the best way to store and retrieve objects in Redis where you can select a range of objects by timestamp and ensure that they are unique?
Let's make sure we're talking about the same things, so here is the terminology for Redis sorted sets:
ZADD key score member [score member ...]
summary: Add one or more members to a sorted set, or update the score of a member that already exists
key - the 'name' of the sorted set
score - the score (in our case a timestamp)
member - the string the score is associated with
A sorted set has many members, each with a score
It sounds like you are using a JSON-encoded string of the object as the member. The member is what is unique in a sorted set. As you say, if the object changes it will be added as a new member to the sorted set. That is probably not what you want.
A sorted set is the Redis way to store data by timestamp, but the member that is stored in the set is usually a 'pointer' to another key in Redis.
From your description I think you want this data structure:
A sorted set storing all media by created timestamp
A string or hash for each unique media
I recommend storing the media objects in a hash as this allows more flexibility.
Example:
# add some members to our sorted set
redis 127.0.0.1:6379> ZADD media 1000 media:1 1003 media:2 1001 media:3
(integer) 3
# create hashes for our members
redis 127.0.0.1:6379> HMSET media:1 id 1 name "media one" content "content string for one"
OK
redis 127.0.0.1:6379> HMSET media:2 id 2 name "media two" content "content string for two"
OK
redis 127.0.0.1:6379> HMSET media:3 id 3 name "media three" content "content string for three"
OK
There are two ways to retrieve data stored in this way. If you need to get members within specific timestamp ranges (eg: last 7 days) you will have to use ZREVRANGEBYSCORE to retrieve the members, then loop through those to fetch each hash with HGETALL or similar. See pipelining to see how you can do the loop with one call to the server.
redis 127.0.0.1:6379> ZREVRANGEBYSCORE media +inf -inf
1) "media:2"
2) "media:3"
3) "media:1"
redis 127.0.0.1:6379> HGETALL media:2
1) "id"
2) "2"
3) "name"
4) "media two"
5) "content"
6) "content string for two"
If you only want to get the last n members (or eg: 10th most recent to 100th most recent) you can use SORT to get items. See the sort documentation for syntax and how to retrieve different hash fields, limit the results and other options.
redis 127.0.0.1:6379> SORT media BY nosort GET # GET *->name GET *->content DESC
1) "media:2"
2) "media two"
3) "content string for two"
4) "media:3"
5) "media three"
6) "content string for three"
7) "media:1"
8) "media one"
9) "content string for one"
NB: sorting a sorted set by score (BY nosort) only works from Redis 2.6.
If you plan on getting media for the last day, week, month, etc., I would recommend using a separate sorted set for each one and using ZREMRANGEBYSCORE to remove old members. You can then just use SORT on these sorted sets to retrieve the data.
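To show why pointer members fix the duplicate problem, here is an in-memory Python model of the layout above. The class MediaIndex is hypothetical and uses plain dicts in place of the sorted set and the hashes; it is a sketch of the idea, not Redis client code, and it does not reproduce how Redis orders members with equal scores.

```python
class MediaIndex:
    """Models one sorted set of media:<id> members plus one hash per member."""

    def __init__(self):
        self.scores = {}  # member -> score; members are unique, as in a sorted set
        self.hashes = {}  # "media:<id>" -> fields, like one hash per media object

    def add(self, media_id, created_at, fields):
        member = f"media:{media_id}"
        self.scores[member] = created_at    # ZADD media <created_at> media:<id>
        self.hashes[member] = dict(fields)  # HMSET media:<id> field value ...

    def newest_first(self, min_score=float("-inf"), max_score=float("inf")):
        # Like ZREVRANGEBYSCORE media <max> <min>, then HGETALL for each member.
        members = [m for m, s in self.scores.items() if min_score <= s <= max_score]
        members.sort(key=lambda m: self.scores[m], reverse=True)
        return [(m, self.hashes[m]) for m in members]


index = MediaIndex()
index.add(1, 1000, {"name": "media one"})
index.add(2, 1003, {"name": "media two"})
index.add(3, 1001, {"name": "media three"})
# The object changed between two API calls: same member, so no duplicate appears.
index.add(2, 1003, {"name": "media two (edited)"})
```

Because the member is the stable media:<id> pointer rather than the encoded object, re-adding a changed object updates its hash and score instead of creating a second entry.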