How to store an array of objects in Redis?

I have an array of objects that I want to store in Redis. I can break up the array and store the elements as individual objects, but I don't see how to get something like
{0} : {"foo" : "bar", "qux" : "doe"}, {1} : {"name" : "Saras", "age" : 23}
and then search the db by name and get the matching key back. I need something like the following, but can't come close to getting it right.
incr id //correct
(integer) 3
get id //correct
"3"
SADD id {"name" : "Saras"} //wrong
SADD myset {"name" : "Saras"} //correct
(integer) 1
The first problem is getting that part right. The second is somehow getting the key from the value, i.e.
if name === "Saras"
then key = 1
which I find tough. Alternatively, I could store it directly as an array of objects and use a simple for loop:
for (var i = 0; i < userCache.users.length; i++) {
    if (userCache.users[i].userId == userId && userCache.users[i].deviceId == deviceId) {
        return i;
    }
}
Kindly suggest which route is best, ideally with an example implementation.

What I found to work was using a unique identifier as the key, stringifying the whole object when storing it, and applying JSON.parse when extracting it.
Example code:
client
    .setAsync(obj.deviceId.toString(), JSON.stringify(obj))
    .then((doc) => {
        return client.getAsync(obj.deviceId.toString());
    })
    .then((doc) => {
        return JSON.parse(doc);
    })
    .catch((err) => {
        return err;
    });
Note that stringifying and then parsing the JSON back is computationally heavy and will block the Node.js event loop if the JSON grows large. I am prepared to take that hit for the lower complexity, since I know my JSON won't be huge, but it needs to be kept in mind when going with this approach.
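For the second part of the question (finding the key from a value such as name === "Saras"), one common workaround is to write a reverse-lookup key alongside the JSON blob. A minimal sketch, assuming the same promisified node-redis client as above; the name: key prefix is only an illustrative convention, not a Redis feature:
client
    .setAsync(obj.deviceId.toString(), JSON.stringify(obj))
    .then(() => {
        // secondary index: name -> primary key
        return client.setAsync('name:' + obj.name, obj.deviceId.toString());
    })
    .then(() => client.getAsync('name:Saras'))   // find the key by name
    .then((key) => client.getAsync(key))         // fetch the object itself
    .then((doc) => JSON.parse(doc))
    .catch((err) => err);
Keep in mind the index has to be updated whenever the object's name changes, since Redis won't do that for you.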

Redis is a pretty simple key-value store. Yes, there are other data structures like sets, but its query capabilities are VERY limited. For example, if you want to find data by name, you would have to do something like this:
SET Name "serialized data of object"
SET Name2 "serialized data of object2"
SET Name3 "serialized data of object3"
then:
GET Name
would return data.
Of course this means that you can't store two entries with the same name.
You can do limited text matching on keys using SCAN: http://redis.io/commands/scan
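For example, a prefix match over the keys above might look like this (SCAN returns a cursor plus a batch of matching keys, and you repeat the call with the returned cursor until it comes back as 0):
SCAN 0 MATCH Name* COUNT 100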
To summarize: I think you should use another tool for complex queries.

The first issue you have, SADD id {"name" : "Saras"} //wrong, is obvious: the id key is not of type set; it is of type string.
In redis the only access point to data is through its key.
As kiss said, perhaps you should be looking for other tools.

Related

Redux store design – two arrays or one

I guess this could be applied to any Redux-backed system, but imagine we are building a simple React Native app that supports two actions:
fetching a list of messages from a remote API
the ability to mark those messages as having been read
At the moment I have a messagesReducer that defines its state as...
const INITIAL_STATE = {
    messages: [],
    read: []
};
The messages array stores the objects from the remote API, for example...
messages: [
    { messageId: 1234, title: 'Hello', body: 'Example' },
    { messageId: 5678, title: 'Goodbye', body: 'Example' }
];
The read array stores the numerical IDs of the messages that have been read plus some other meta data, for example...
read: [
    { messageId: 1234, meta: 'Something' },
    { messageId: 5678, meta: 'Time etc' }
];
In the React component that displays a message in a list, I run this test to see if the message should be shown as being read...
const isRead = this.props.read.filter(m => m.messageId == this.props.currentMessage.messageId).length > 0;
This is working great at the moment. Obviously I could have put a boolean isRead property on the message object but the main advantage of the above arrangement is that the entire contents of the messages array can be overwritten by what comes from the remote API.
My concern is about how well this will scale, and how expensive the array.filter method is when the array gets large. Also keep in mind that the app displays a list of messages that could be hundreds of messages, so the filtering is happening for each message in the list. It works on my modern iPhone, but it might not work so well on less powerful phones.
I'm also thinking I might be missing some well established best practice pattern for this sort of thing.
Let's call the current approach Option 1. I can think of two other approaches...
Option 2 is to put isRead and readMeta properties on the message object. This would make rendering the message list super quick. However when we get the list of messages from the remote API, instead of just overwriting the current array we would need to step through the JSON returned by the API and carefully update and delete the messages in the local store.
Option 3 is keep the current read array but also to add isRead and readMeta properties on the message object. When we get the list of messages from the remote API we can overwrite the entire messages array, and then loop through the read array and copy the data into the corresponding message objects. This would also need to happen whenever the user reads a message – data would be duplicated in two places. This makes me feel uncomfortable, but maybe it's ok.
I've struggled to find many other examples of this type of store, but it could be that I'm just Googling the wrong thing. I'm quite new to Redux and some of my terminology is probably incorrect.
I'd really value any thoughts on this.
Using reselect you can memoize the result of the array.filter call so the array isn't re-filtered when neither the messages nor the read arrays have changed, which allows you to stick with Option 1.
In this way, you can easily store the raw data in your reducers, and also access the computed data efficiently for display. A benefit from this is that you are decoupling the requirements for data structure and storage from the requirements for the way the data is displayed.
You can learn more about efficiently computing derived data in the Redux docs.
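A minimal sketch of that approach (the selector names and the state-slice paths are assumptions, not from the question):
import { createSelector } from 'reselect';

// Input selectors: where the two arrays live in the store is an assumption.
const getMessages = state => state.messages;
const getRead = state => state.read;

// Recomputed only when messages or read actually change; otherwise the
// previously computed array is returned from the memoization cache.
const getMessagesWithReadFlag = createSelector(
    [getMessages, getRead],
    (messages, read) => {
        const readIds = new Set(read.map(r => r.messageId));
        return messages.map(m => ({ ...m, isRead: readIds.has(m.messageId) }));
    }
);
Building the Set once turns the per-message filter into an O(1) lookup, so the whole list costs one pass instead of one pass per message.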
How about using a lookup table object, where the ids are the keys?
This way you don't need to filter or loop to see if a certain message id is there. Just check whether the object holds a key with the corresponding id:
So in your case it will be:
const isRead = !!this.props.read[this.props.currentMessage.messageId];
Small running example:
const read = {
    1234: {
        meta: 'Something'
    },
    5678: {
        meta: 'Time etc'
    }
};
const someMessage = {id: 5678};
const someOtherMessage = {id: 999};
const isRead = id => !!read[id];
console.log('someMessage is ',isRead(someMessage.id));
console.log('someOtherMessage is ',isRead(someOtherMessage.id));
Edit
I recommend reading about Normalizing State Shape in the Redux documentation.
There are great examples there of designing and organizing data and state.

Returning the values of a hash stored inside a set in Redis

I'm new to Redis... like 30 minutes new, and I'm using the node-redis package to build a web app.
From what I can see, the best data structure for storing a webpage's data would be a hash, but I also need to keep track of which webpages I have across the whole app. So this is what I'm doing:
//this is in Redis-CLI
//add the page and its data to a hash
HMSET pages:/myurl url /myurl title myTitle description myDescription content myContent lang_mirror /frenchurl
//then I add the page to my set
SADD pages pages:/myurl
Now I want to return the values inside pages:/myurl. Is there a single call to the set that can do this for me, or something built into node-redis that does this?
Look into using the command HGETALL, like so:
HGETALL pages:/myurl
Edited based on comment:
Ah, so look at SORT but mind its complexity and memory footprint:
127.0.0.1:6379> HMSET pages:/myurl url /myurl title myTitle description myDescription content myContent lang_mirror /frenchurl
OK
127.0.0.1:6379> SADD pages pages:/myurl
(integer) 1
127.0.0.1:6379> SORT pages BY nosort GET *->url GET *->title GET *->description GET *->content GET *->lang_mirror
1) "/myurl"
2) "myTitle"
3) "myDescription"
4) "myContent"
5) "/frenchurl"
Alternatively, you could look into using Lua server-side scripting for this.
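A rough sketch of such a script (an illustration, not tested against the data above): it reads the set members and then fetches each hash, all in a single server-side call:
EVAL "local out = {} for i, k in ipairs(redis.call('SMEMBERS', KEYS[1])) do out[i] = redis.call('HGETALL', k) end return out" 1 pages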
You will have to make at least two calls: one to get the members of the set, and another to get all the hashes. Here is some sample code using node-redis:
redis.smembers('pages', function(err, pages) {
    // queue one HGETALL per page key so they run in a single round trip
    var multi = redis.multi();
    for (var i = 0; i < pages.length; ++i) {
        multi.hgetall(pages[i]);
    }
    multi.exec(function(err, pageData) {
        console.log(pageData);
    });
});

Paging in Redis

I am running a local Redis for my application. I am using the ServiceStack.Redis client with C#.
I am storing items as objects of some type T, each under its own key.
For example:
Key "1234" : {
    object {
        name : "abcd",
        value : "1"
    }
}
I am storing something like 10000 objects of the same type, each with its own key. I would like to apply pagination when I retrieve these objects and show only about 20 per page.
Is this possible? If yes, what would be a good way to achieve it?
Thanks,
Vivek
Use the SCAN command with a count of 20.
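A minimal redis-cli sketch of cursor-based paging. Note that COUNT is only a hint to Redis: a single call may return more or fewer than 20 keys, so the client typically buffers results to present exact pages of 20:
SCAN 0 COUNT 20
SCAN <cursor-from-previous-reply> COUNT 20
Repeat with each returned cursor until Redis replies with a cursor of 0. Alternatively, the keys can be kept in a sorted set and paged deterministically with ZRANGE, at the cost of maintaining that extra index.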

Problems understanding Redis ServiceStack Example

I am trying to get a grip on the ServiceStack Redis example and Redis itself, and now have some questions.
Question 1:
I see some static indexes defined, e.g.:
static class TagIndex
{
    public static string Questions(string tag) { return "urn:tags>q:" + tag.ToLower(); }
    public static string All { get { return "urn:tags"; } }
}
What does that '>' (greater than) sign do? Is this some kind of convention?
Question 2:
public User GetOrCreateUser(User user)
{
    var userIdAliasKey = "id:User:DisplayName:" + user.DisplayName.ToLower();
    using (var redis = RedisManager.GetClient())
    {
        var redisUsers = redis.As<User>();
        var userKey = redis.GetValue(userIdAliasKey);
        if (userKey != null) return redisUsers.GetValue(userKey);
        if (user.Id == default(long)) user.Id = redisUsers.GetNextSequence();
        redisUsers.Store(user);
        redis.SetEntry(userIdAliasKey, user.CreateUrn());
        return redisUsers.GetById(user.Id);
    }
}
As far as I can understand, first a user is stored with a unique id. Is this necessary when using the client (I know it is not necessary for Redis itself)? For my model I have a meaningful string id (like an email address) which I would like to use. I also see a SetEntry being done. What does SetEntry do exactly? I think it is an extra key just to set a relation between the id and a searchable key. I guess this is not necessary when storing the object itself with a meaningful key, e.g. user.id = "urn:someusername". And how is SetEntry stored: as a Redis set, or just as an extra key?
Question 3:
This is more Redis-related, but I am trying to figure out how everything is stored in Redis in order to get a grip on the example, so I did the following:
Started redis-cli.exe in a console
Typed 'keys *', which shows all keys
Typed 'get id:User:DisplayName:joseph', which showed 'urn:user:1'
Typed 'get urn:user:1', which shows the user
Now I also see keys like 'urn:user>q:1' or 'urn:tags'. If I do a 'get urn:tags' I get the error 'ERR Operation against a key holding the wrong kind of value'. I tried other Redis commands like smembers, but I cannot find the right query commands.
Question 1: return "urn:tags>q:" + tag.ToLower(); gives you the key (a string) for a given tag. The ">" has no meaning for Redis; it is a convention of the example's developer and could have been any other character.
Question 3: use the TYPE command to determine the type of a key; then you'll find the right command in the Redis documentation to read its values.
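For example, a sketch of the kind of session this leads to (the type reported here is illustrative; check what your instance actually returns):
127.0.0.1:6379> TYPE urn:tags
zset
127.0.0.1:6379> ZRANGE urn:tags 0 -1 WITHSCORES
If TYPE reports set instead, SMEMBERS urn:tags is the command to list its members.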

What are the resources or tools used to manage temporal data in key-value stores?

I'm considering using MongoDB or CouchDB on a project that needs to maintain historical records. But I'm not sure how difficult it will be to store historical data in these databases.
For example, in his book "Developing Time-Oriented Database Applications in SQL," Richard Snodgrass points out tools for retrieving the state of data as of a particular instant, and he points out how to create schemas that allow for robust data manipulation (i.e. data manipulation that makes invalid data entry difficult).
Are there tools or libraries out there that make it easier to query, manipulate, or define temporal/historical structures for key-value stores?
Edit:
Note that from what I hear, the 'version' data that CouchDB stores is erased during normal use, and since I would need to maintain historical data, I don't think that's a viable solution.
P.S. Here's a similar question that was never answered: key-value-store-for-time-series-data
There are a couple of options if you want to store the data in MongoDB. You could store each version as a separate document; then you can query to get the object at a certain time, the object at all times, objects over ranges of time, and so on. Each document would look something like:
{
    object : whatever,
    date : new Date()
}
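Querying for the state as of a particular instant would then be something like the following (a mongo-shell sketch; the collection name foo and the variable someDate are assumptions, and it presumes one history collection per object):
// latest version written at or before the instant of interest
db.foo.find({ date : { $lte : someDate } }).sort({ date : -1 }).limit(1)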
You could also store all the versions of a document in the document itself, as mikeal suggested, using updates to push the object into a history array. In Mongo, this would look like:
db.foo.update({object: obj._id}, {$push : {history : {date : new Date(), object : obj}}})
// make changes to obj
...
db.foo.update({object: obj._id}, {$push : {history : {date : new Date(), object : obj}}})
A cooler (I think) and more space-efficient way, although less time-efficient, might be to store in the object itself a history of what changed at each point in time. You could then replay the history to build the object as of a certain time. For instance, you could have:
{
    object : startingObj,
    history : [
        { date : d1, addField : { x : 3 } },
        { date : d2, changeField : { z : 7 } },
        { date : d3, removeField : "x" },
        ...
    ]
}
Then, if you wanted to see what the object looked like between time d2 and d3, you could take the startingObj, add the field x with the value 3, set the field z to the value of 7, and that would be the object at that time.
Whenever the object changed, you could atomically push actions to the history array:
db.foo.update({object : startingObj}, {$push : {history : {date : new Date(), removeField : "x"}}})
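To reconstruct the object as of a time t, the replay could look something like this (plain JavaScript; the names come from the example above, and the function itself is an illustration):
function objectAt(doc, t) {
    var obj = Object.assign({}, doc.object);        // start from startingObj
    doc.history.forEach(function (event) {
        if (event.date > t) return;                 // ignore later changes
        if (event.addField) Object.assign(obj, event.addField);
        if (event.changeField) Object.assign(obj, event.changeField);
        if (event.removeField) delete obj[event.removeField];
    });
    return obj;
}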
Yes, in CouchDB the revisions of a document are there for replication and are usually lost during compaction. I think UbuntuOne did something to keep them around longer but I'm not sure exactly what they did.
I have a document that I need the historical data on and this is what I do.
In CouchDB I have an _update function. The document has a "history" attribute, which is an array. Each time I call the _update function to update the document, I append the current document (minus the history attribute) to the history array, then update the document with the changes in the request body. This way I have the entire revision history of the document.
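A rough sketch of such an update handler (CouchDB _update functions are plain JavaScript; the field names follow the description above):
function (doc, req) {
    var changes = JSON.parse(req.body);
    var snapshot = JSON.parse(JSON.stringify(doc)); // copy of the current doc
    delete snapshot.history;                        // minus the history attribute
    doc.history = doc.history || [];
    doc.history.push(snapshot);                     // keep the previous version
    for (var k in changes) doc[k] = changes[k];     // apply the requested changes
    return [doc, 'updated'];
}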
This is a little heavy for large documents. There are some JavaScript diff tools I was investigating, and I was thinking about storing only the diff between document versions, but I haven't done that yet.
http://wiki.apache.org/couchdb/How_to_intercept_document_updates_and_perform_additional_server-side_processing
Hope that helps.
I can't speak for MongoDB, but for CouchDB it all really hinges on how you write your views.
I don't know the specifics of what you need but if you have a unique id for a document throughout its lifetime and store a timestamp in that document then you have everything you need for robust querying of that document.
For instance:
document structure:
{ "docid" : "doc1", "ts" : <unix epoch> ...<set of key value pairs> }
map function:
function (doc) {
    if (doc.docid && doc.ts)
        emit([doc.docid, doc.ts], doc);
}
The view will now output each doc and its revisions in historical order like so:
["doc1", 1234567], ["doc1", 1234568], ["doc2", 1234567], ["doc2", 1234568]
You can use view collation and start_key or end_key to restrict the returned documents.
start_key=["doc1", 1] end_key=["doc1", 9999999999999]
will return all historical copies of doc1
start_key=["doc2", 1234567] end_key=["doc2", 123456715]
will return all historical copies of doc2 between 1234567 and 123456715 unix epoch times.
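As an actual HTTP request this might look like the following (the database, design document, and view names are assumptions for illustration):
GET /mydb/_design/app/_view/history?startkey=["doc1",1]&endkey=["doc1",9999999999999]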
See ViewCollation for more details.