Querying json with RedisJSON - redis

I have a list of users in JSON format that I want to display as a table to an admin user in the user interface. I want to store all the users in Redis, and this is how I am thinking of storing them:
JSON.SET user $ '{"id":"1","user_id":"1","one": "1","two": "2","three": "3","rank_score": 1.44444}' EX 60
JSON.SET user $ '{"id":"2","user_id":"2","one": "2","two": "2","three": "3","rank_score": 2.44444}'
JSON.SET user $ '{"id":"3","user_id":"3","one": "3","two": "3","three": "3","rank_score": 3.44444}'
JSON.SET user $ '{"id":"4","user_id":"4","one": "4","two": "4","three": "4","rank_score": 4.44444}'
and later get all user records.
However, the command JSON.GET user $ only returns the 4th record.
Wrapping the call in MULTI still gave me one record:
127.0.0.1:6379> multi
OK
127.0.0.1:6379(TX)> json.get user
QUEUED
127.0.0.1:6379(TX)> exec
1) "{\"id\":\"4\",\"user_id\":\"4\",\"one\":\"4\",\"two\":\"4\",\"three\":\"4\",\"rank_score\":4.44444}"
I would like to:
> Get all
> Get all records where id is a given id
> Get all records order by rank_score desc
> Remove record where id is 2
How do I group all my users so that I can get all of them at once?
Edit:
I have added a users_list array to nest everything in:
JSON.SET user $ '{"users_list":[{"id":"1","user_id":"1","one": "1","two": "1","three": "1","rank_score": 1.44444},{"id":"2","user_id":"2","one": "2","two": "2","three": "2","rank_score": 2.44444},{"id":"3","user_id":"3","one": "3","two": "3","three": "3","rank_score": 3.44444},{"id":"4","user_id":"4","one": "4","two": "4","three": "4","rank_score": 4.44444}]}'
In readable form:
JSON.SET user $ '{
"users_list":[
{
"id":"1",
"user_id":"1",
"one": "1",
"two": "1",
"three": "1",
"rank_score": 1.44444
},
{
"id":"2",
"user_id":"2",
"one": "2",
"two": "2",
"three": "2",
"rank_score": 2.44444
},
{
"id":"3",
"user_id":"3",
"one": "3",
"two": "3",
"three": "3",
"rank_score": 3.44444
},
{
"id":"4",
"user_id":"4",
"one": "4",
"two": "4",
"three": "4",
"rank_score": 4.44444
}
]
}'
Will having more than 128 users result in an error?

RedisJSON is still a key-value store at its core: each key holds exactly one value. When you call JSON.SET with the same path on an existing key, the new value replaces whatever was there before. So every JSON.SET user $ ... call overwrites the previous document, which is why only the 4th record is left.
Instead of a single user key, it is standard practice to use user as a prefix and store keys named user:1, user:2, etc. This is preferable for several reasons (it performs better than storing a single JSON document array, you can define ACL rules for specific key patterns, etc.), but most importantly for your question:
You can use RediSearch's FT.SEARCH command to find keys that match any needed combination of field conditions, for example all users that have a specific id. It also supports sorting. For example, if you store users like
JSON.SET user:1 $ '{"id":"1","user_id":"1","one": "1","two": "2","three": "3","rank_score": 1.44444}'
JSON.SET user:2 $ '{"id":"2","user_id":"2","one": "2","two": "2","three": "3","rank_score": 2.44444}'
(JSON.SET does not accept an EX option; to expire a key after 60 seconds, follow it with EXPIRE user:1 60.)
You can create an index with FT.CREATE for whichever fields need to be searchable. For example, to index id and rank_score, the command is
FT.CREATE userSearch ON JSON PREFIX 1 user: SCHEMA $.id AS id TAG $.rank_score AS rank_score NUMERIC SORTABLE
In detail, this says: create a search index named userSearch, for JSON-type keys, covering all keys with the single prefix user:, with the following fields: id is a tag (an exact-match string, because the documents store id as a JSON string), and rank_score is a sortable number. For a JSON index, each field is given as a JSONPath followed by the alias used in queries.
Here is how to do the things you mentioned:
Get all (I assume you mean get all users): KEYS user:* returns the key names (on a large keyspace prefer SCAN with MATCH user:*, since KEYS blocks the server), and FT.SEARCH userSearch "*" returns the documents themselves. Note that FT.SEARCH returns only the first 10 results unless you add LIMIT.
Get all records where id is a given id (let's say we are looking for id 9000): FT.SEARCH userSearch "@id:{9000}"
Get all records ordered by rank_score desc: FT.SEARCH userSearch "*" SORTBY rank_score DESC
Remove record where id is 2 (I assume you mean all matching records): first FT.SEARCH userSearch "@id:{2}", then JSON.DEL key for each of the results.
For more info read the official documentation, it describes the capabilities of the system in detail.
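As a rough sketch of the "search, then delete each result" step from application code: the helper below assumes a redis-py style client (only its execute_command method is used), an index named userSearch in which id is indexed as a TAG field, and the RESP2 reply shape of FT.SEARCH with NOCONTENT (the total count followed by the matching key names). The function name delete_users_by_id is made up for illustration.

```python
def delete_users_by_id(client, user_id, index="userSearch"):
    """Delete every JSON document whose indexed id tag equals user_id.

    Assumes `client.execute_command(*args)` sends a raw command, as the
    redis-py client does, and that `id` is a TAG field in the index.
    """
    # Tag queries use the @field:{value} form.
    query = f"@id:{{{user_id}}}"
    # NOCONTENT asks for key names only: [total, key1, key2, ...] under RESP2.
    # FT.SEARCH returns at most 10 results by default; add LIMIT to page further.
    reply = client.execute_command("FT.SEARCH", index, query, "NOCONTENT")
    keys = list(reply[1:])
    for key in keys:
        # JSON.DEL without a path removes the whole document.
        client.execute_command("JSON.DEL", key)
    return keys
```

With a real connection this would be called as delete_users_by_id(redis.Redis(decode_responses=True), "2"). If the key names already embed the id, as with user:2, a direct JSON.DEL user:2 avoids the search entirely.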

Related

How to sort Redis list of objects using the properties of object

I have JSON data(see the ex below) which I'm storing in Redis list using 'rpush' with a key as 'person'.
Ex data:
[
{ "name": "john", "age": 30, "role": "developer" },
{ "name": "smith", "age": 45, "role": "manager" },
{ "name": "ram", "age": 35, "role": "tester" }
]
Now when I get this data using lrange person 0 -1, it gives me results as '[object object]'.
So, to actually get them with property names I'm storing them by stringifying them and parsing them back to objects to use the object properties.
But the issue with converting to a string is that I'm not able to sort them using any property, say name, age or role.
My question is: how do I store this JSON in a Redis list and sort the items by any of their properties?
Thanks.
Very recently I posted an answer for a very similar question.
The easiest approach is to use the RediSearch module (which makes the approach portable across many clients and languages):
Store each object as a separate key, following a prefixed key pattern (keys named prefix:something) and a consistent schema (all such keys are JSON, and all contain the field you want to sort by).
Create a search index with FT.CREATE, using the ON JSON parameter to index JSON-type keys, and likely the PREFIX parameter to index just the needed keys, followed by a SCHEMA entry of the form x AS y TYPE for every needed search field, where x is the JSONPath of the field (for example $.age), y is the alias used in queries, and TYPE is the type of the field (TEXT, TAG, NUMERIC, etc. -- see documentation), optionally adding SORTABLE to fields that need to be sorted.
Use the FT.SEARCH command with any combination of "@field:value" search terms, and optionally SORTBY.
Otherwise, it's possible to just get all keys that follow a pattern using KEYS command, and use manual language-specific sorting code. That is of course more involved, depends on language and available libraries, and is therefore less portable.
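A minimal sketch of that manual fallback, assuming the items were stored as JSON strings (as the question describes with stringify) and have already been read back, for example with LRANGE person 0 -1. The function name sort_people and the sample list are illustrative only.

```python
import json

def sort_people(raw_items, field, reverse=False):
    """Parse JSON strings and sort the resulting dicts by one property."""
    people = [json.loads(item) for item in raw_items]
    return sorted(people, key=lambda person: person[field], reverse=reverse)

# Stand-in for what LRANGE would return: one JSON string per list element.
raw = [
    json.dumps({"name": "john", "age": 30, "role": "developer"}),
    json.dumps({"name": "smith", "age": 45, "role": "manager"}),
    json.dumps({"name": "ram", "age": 35, "role": "tester"}),
]

print([p["name"] for p in sort_people(raw, "age")])  # ['john', 'ram', 'smith']
```

This sorts in the client on every read, so it is only reasonable while the list stays small.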

Is it correct to do 1-to-1 mapping in Update API request param

There is a need for me to do bulk update of user details.
Let the object details have the following fields,
User First Name
User ID
User Last Name
User Email ID
User Country
An admin can upload updated user data through a CSV file. Values that differ from the stored data need to be updated. The most likely request format for this bulk update is (Method 1):
"data" : {
"userArray" : [
{
"id" : 2343565432,
"f_name" : "David",
"email" : "david@testmail.com"
},
{
"id" : 2344354351,
"country" : "United States",
}
.
.
.
]
}
Method 2: I would send the details grouped by field, with each entry holding an array of user ids and a parallel array of the new values for one field:
"data" : {
"userArray" : [
{
"ids" : [23234323432, 4543543543, 45654543543],
"country" : ["United States", "Israel", "Mexico"]
},
{
"ids" : [2323432334543, 567676565],
"email" : ["groove@drivein.com", "zara@foobar.com"]
},
.
.
.
]
}
In method 1, I need to query the database once for every updated user, so the number of queries grows with the number of users edited. In contrast, with method 2 I query the database only once per field (I pass the id array in the query and fetch, in a single query, the rows whose user id is in that array). I can then update each row with its respective details.
But across the internet, most update APIs take parameters in the format of method 1, which is more readable. What is the advantage of going with method 1 rather than method 2? (Method 2 saves some query time when the number of users is large, which can improve performance.)
I almost always see it being method 1 style.
With that said, I don't understand why your DB performance depends on the way the input data is structured. That's just the way information gets into your code.
You can have the client send the data as method 1 and then shim it to method 2 on the backend if that helps you structure the DB queries better.
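One possible shape for that shim, as a sketch: it takes the method 1 payload (a list of per-user objects) and regroups it by field, giving parallel id and value lists per field as in method 2. The function name group_updates_by_field and the output keys "ids" and "values" are assumptions for illustration, not part of either proposed format.

```python
from collections import defaultdict

def group_updates_by_field(user_updates):
    """Regroup per-user updates into per-field batches.

    Input:  [{"id": 1, "email": "a@x.com"}, {"id": 2, "country": "Mexico"}]
    Output: {"email": {"ids": [1], "values": ["a@x.com"]},
             "country": {"ids": [2], "values": ["Mexico"]}}
    """
    grouped = defaultdict(lambda: {"ids": [], "values": []})
    for update in user_updates:
        user_id = update["id"]
        for field, value in update.items():
            if field == "id":
                continue  # the id identifies the row, it is not an updated field
            grouped[field]["ids"].append(user_id)
            grouped[field]["values"].append(value)
    return dict(grouped)
```

Each resulting entry can then drive a single query that selects all rows whose id is in "ids", which keeps the public API in the readable method 1 form.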

How to use redis to associate metadata with objects

Say for example I have a news article in redis:
SET article:id '{ "title": "this is the title", "content": "this is the content" }'
Now say I would like to associate some metadata like a tag, say "politics". What would the idiomatic way to do this be?
Would it be to add a set for the tags with the set ID following a convention like article:<id>:tags?
SADD article:id:tags 'politics'
You might want to consider using redis hash for that
HMSET article:id "title" "this is the title" "content" "this is the content" "tag" "politics"
If you want to fetch articles by tag, a reversed set might be better
SADD tags:politics article_id
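A small sketch that keeps both directions in step, assuming a redis-py style client (only its sadd and smembers methods are used) and the key names article:<id>:tags and tags:<tag> from above. The helper names are made up for illustration.

```python
def tag_article(client, article_id, tag):
    """Record the tag on the article and the article under the tag."""
    client.sadd(f"article:{article_id}:tags", tag)   # forward: article -> tags
    client.sadd(f"tags:{tag}", article_id)           # reverse: tag -> articles

def articles_with_tag(client, tag):
    """Return the ids of all articles carrying the tag (SMEMBERS)."""
    return client.smembers(f"tags:{tag}")

def tags_of_article(client, article_id):
    """Return all tags attached to one article (SMEMBERS)."""
    return client.smembers(f"article:{article_id}:tags")
```

The two SADD calls are separate commands here; wrapping them in a MULTI/EXEC transaction would keep the two sets consistent if one call fails.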

How to do an automated index creation at ElasticSearch?

Just like wordpress? See: http://gibrown.wordpress.com/2014/02/06/scaling-elasticsearch-part-2-indexing/
In our case we create one index for every 10 million blogs, with 25 shards per index.
Any light?
Thanks!
You do it in whatever your favorite scripting language is. You first run a query getting a count of the number of documents in the index. If it's beyond a certain amount you create a new one, either via an Elasticsearch API or a curl.
Here's the query to find the number of docs:
curl -XGET 'http://localhost:9200/youroldindex/_count'
Here's the index creation curl:
curl -XPUT 'http://localhost:9200/yournewindex/' -d '{
"settings" : {
"number_of_shards" : 25,
"number_of_replicas" : 2
}
}'
You will also probably want to create aliases so that your code can always point to a single index alias and then change the alias as you change your hot index:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html
You will probably want to predefine your mappings too:
curl -XPUT 'http://localhost:9200/yournewindex/yournewmapping/_mapping' -d '
{
"document" : {
"properties" : {
"message" : {"type" : "string", "store" : true }
}
}
}
'
Elasticsearch has fairly complete documentation, a few good places to look:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html
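The rollover decision itself can be sketched in a few lines. This assumes the document count has already been read from the "count" field of the _count response, and it assumes a hypothetical naming scheme of a prefix plus a zero-padded sequence number (for example blogs-000001); the function names and the 10 million threshold default are illustrative.

```python
def next_index_name(current_index, doc_count, max_docs=10_000_000):
    """Return the index to write to: the current one, or the next in sequence.

    Assumes names look like '<prefix>-<zero-padded number>'.
    """
    if doc_count < max_docs:
        return current_index
    prefix, _, number = current_index.rpartition("-")
    # Keep the same zero-padding width when incrementing.
    return f"{prefix}-{int(number) + 1:0{len(number)}d}"

def index_settings(shards=25, replicas=2):
    """Build the settings body used when creating the new index."""
    return {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}
```

A scheduled script would call next_index_name with the latest count, and if the name changed, send index_settings() in the create-index request and repoint the alias.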

Ensuring Uniqueness for a sorted set in redis

I am trying to store media objects and have them retrievable by a certain time range through redis. I have chosen a sorted set data type to do this. I am adding elements like:
zAdd: key: media:552672 score: 1355264694
zAdd: key: media:552672 score: 1355248565
zAdd: key: media:552672 score: 1355209157
zAdd: key: media:552672 score: 1355208992
zAdd: key: media:552672 score: 1355208888
zAdd: key: media:552672 score: 1355208815
Where key is unique to the location id the media was taken at and the score is the creation time of the media object. And the value is a json_decode of the media object.
When I go to retrieve using zRevRangeByScore, occasionally there will be duplicate entries. I'm essentially using Redis as a buffer to an external API: if users make the same API call twice within X seconds, I retrieve the results from the cache; otherwise, I add them to the cache without checking whether they already exist, relying on the definition of a set not containing duplicates.
Possible known issues:
If the media object attribute changes between caching it will show up as a duplicate
Is there a better way to store this type of data without doing checks on the redis client side?
TLDR;
What is the best way to store and retrieve objects in Redis where you can select a range of objects by timestamp and ensure that they are unique?
Let's make sure we're talking about the same things, so here is the terminology for Redis sorted sets:
ZADD key score member [score] [member]
summary: Add one or more members to a sorted set, or update its score if it already exists
key - the 'name' of the sorted set
score - the score (in our case a timestamp)
member - the string the score is associated with
A sorted set has many members, each with a score
It sounds like you are using a JSON-encoded string of the object as the member. The member is what is unique in a sorted set. As you say, if the object changes it will be added as a new member to the sorted set. That is probably not what you want.
A sorted set is the Redis way to store data by timestamp, but the member that is stored in the set is usually a 'pointer' to another key in Redis.
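To make the uniqueness rule concrete, here is a toy model in plain Python, not Redis itself: a dict keyed by member stands in for the sorted set, because ZADD likewise keeps one entry per distinct member. The media objects and the id 1 are made-up sample data.

```python
import json

def zadd(sorted_set, score, member):
    """Model of ZADD: one entry per member; re-adding only updates the score."""
    sorted_set[member] = score

original = {"id": 1, "caption": "sunset"}
edited = {"id": 1, "caption": "sunset (edited)"}  # same media, one attribute changed

# Member = JSON string of the whole object: the edit produces a new member.
by_json = {}
zadd(by_json, 1355264694, json.dumps(original, sort_keys=True))
zadd(by_json, 1355264694, json.dumps(edited, sort_keys=True))

# Member = pointer to the object's key: the edit maps to the same member.
by_pointer = {}
zadd(by_pointer, 1355264694, f"media:{original['id']}")
zadd(by_pointer, 1355264694, f"media:{edited['id']}")

print(len(by_json), len(by_pointer))  # 2 1
```

The pointer form stays unique because the member depends only on the id, while the changing attributes live in the object the pointer refers to.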
From your description I think you want this data structure:
A sorted set storing all media by created timestamp
A string or hash for each unique media
I recommend storing the media objects in a hash as this allows more flexibility.
Example:
# add some members to our sorted set
redis 127.0.0.1:6379> ZADD media 1000 media:1 1003 media:2 1001 media:3
(integer) 3
# create hashes for our members
redis 127.0.0.1:6379> HMSET media:1 id 1 name "media one" content "content string for one"
OK
redis 127.0.0.1:6379> HMSET media:2 id 2 name "media two" content "content string for two"
OK
redis 127.0.0.1:6379> HMSET media:3 id 3 name "media three" content "content string for three"
OK
There are two ways to retrieve data stored in this way. If you need to get members within specific timestamp ranges (eg: last 7 days) you will have to use ZREVRANGEBYSCORE to retrieve the members, then loop through those to fetch each hash with HGETALL or similar. See pipelining to see how you can do the loop with one call to the server.
redis 127.0.0.1:6379> ZREVRANGEBYSCORE media +inf -inf
1) "media:2"
2) "media:3"
3) "media:1"
redis 127.0.0.1:6379> HGETALL media:2
1) "id"
2) "2"
3) "name"
4) "media two"
5) "content"
6) "content string for two"
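The range-then-fetch loop can be batched with a pipeline. This sketch assumes a redis-py style client (its zrevrangebyscore, pipeline, hgetall and execute methods) and the media sorted set and media:<n> hashes from the example; the function name media_in_range is made up for illustration.

```python
def media_in_range(client, newest, oldest, key="media"):
    """Return (member, hash) pairs for members scored between oldest and newest.

    Members come back newest first, as with ZREVRANGEBYSCORE key max min.
    """
    member_keys = client.zrevrangebyscore(key, newest, oldest)
    # Queue one HGETALL per member, then send them all in a single round trip.
    pipe = client.pipeline()
    for member_key in member_keys:
        pipe.hgetall(member_key)
    hashes = pipe.execute()
    return list(zip(member_keys, hashes))
```

For example, media_in_range(client, "+inf", "-inf") would fetch every media hash, and passing two timestamps restricts it to that window.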
If you only want to get the last n members (or eg: 10th most recent to 100th most recent) you can use SORT to get items. See the sort documentation for syntax and how to retrieve different hash fields, limit the results and other options.
redis 127.0.0.1:6379> SORT media BY nosort GET # GET *->name GET *->content DESC
1) "media:2"
2) "media two"
3) "content string for two"
4) "media:3"
5) "media three"
6) "content string for three"
7) "media:1"
8) "media one"
9) "content string for one"
NB: sorting a sorted set by score (BY nosort) only works from Redis 2.6.
If you plan on getting media for the last day, week, month, etc., I would recommend using a separate sorted set for each one and using ZREMRANGEBYSCORE to remove old members. You can then just use SORT on these sorted sets to retrieve the data.