What is the purpose of colons within Redis keys - redis

I'm learning how to use Redis for a project of mine. One thing I haven't got my head around is what exactly the colons are used for in the names of keys.
I have seen names of key such as these:
users:bob
color:blue
item:bag
Does the colon separate keys into categories and make finding the keys faster? If so can you use multiple colons when naming keys to break them down into sub categories? Lastly do they have anything to do with defining different databases within the Redis server?
I have read through documentation and done numerous Google searches on the matter but oddly I can't find anything discussing this.

The colons have been in earlier redis versions as a concept for storing namespaced data. In early versions redis supported only strings, if you wanted to store the email and the age of 'bob' you had to store it all as a string, so colons were used:
SET user:bob:email bob#example.com
SET user:bob:age 31
They had no special handling or performance characteristics in redis, the only purpose was namespacing the data to find it again. Nowadays you can use hashes to store most of the coloned keys:
HSET user:bob email bob#example.com
HSET user:bob age 31
You don't have to name the hash "user:bob" we could name it "bob", but namespacing it with the user-prefix we instantly know which information this hash should/could have.

Colons are a way to structure the keys. They are not interpreted by redis in any way. You can also use any other delimiter you like or none at all. I personally prefer /, which makes my keys look like file system paths. They have no influence on performance but you should not make them excessively long since redis has to keep all keys in memory.
A good key structure is important to leverage the power of the sort command, which is redis' answers to SQL's join.
GET user:bob:color -> 'blue'
GET user:alice:color -> 'red'
SMEMBERS user:peter:friends -> alice, bob
SORT user:peter:friends BY NOSORT GET user:*:color -> 'blue', 'red'
You can see that the key structure enables SORT to lookup the user's colors by referencing the structured keys.

Related

String vs Hash for string type? Hash will have only one key instead of many

For example, I see many people are doing something like the following:
> set data:1000 "some string 1"
> set data:1001 "some string 2"
But what about using a hash to minimize the number of keys?
> hset data 1000 "some string 1"
> hset data 1001 "some string 2"
In the second way, it will only create one data key instead of creating many keys in the first way.
Which way is recommended?
I just see some people and tutorial are doing hset data:10 01 xxx. This is actually not related to my question. My question is simply asking what it's recommended between set data:1001 xxx and hset data 1001 xxx.
And I don't plan to modify hash-max-zipmap-entries and hash-max-zipmap-value. That means the hash will exceed the two values eventually. In such a config, are the two ways the same? or Which way is recommended?
Reasons to use strings:
you need per value timeouts
the values are semantically isolated
you're on cluster and want the values to be sharded over different nodes to spread load (sharding is based on the key)
Reasons to use hashes:
you want to be able to purge all of them at once (del/unlink), or have a timeout that impacts all of those values at once
you want to be able to enumerate them (prefer hscan/hgetall over scan/keys)
slightly better memory usage on the keys themselves
the values are semantically related
it is OK for all the values to be on the same node (whether single-server or cluster)
This all depends on the tradeoffs you want to support. In general, using hashes will have a smaller memory footprint than using simple keys. In fact, it is about an order of magnitude less memory. And access to hash values is constant time. So, if you are using redis simply as a key-value store, then hashes are way more efficient than simple keys.
However, you will want to use simple keys if you need to support expiration, keyspace notifications, etc, then you will need to use simple keys.
Just be careful to tweak the values for hash-max-zipmap-entries and hash-max-zipmap-value in your redis.conf in order to ensure that hashes are treated correctly for your environment.
You can read all about the details in the memory optimization section of the documentation.

Key types supported by Redis

What are the different key types supported by Redis? The documentation mentions all the various types (strings, set, hashmap, etc) of values supported by Redis, but I couldn't quiet find the key type information.
From redis documentation (Data types intro):
Redis keys
Redis keys are binary safe, this means that you can use any binary sequence as a key, from a string like "foo" to the content
of a JPEG file. The empty string is also a valid key. A few other
rules about keys:
Very long keys are not a good idea. For instance a key of 1024 bytes is a bad idea not only memory-wise, but also because the
lookup of the key in the dataset may require several costly
key-comparisons. Even when the task at hand is to match the
existence of a large value, hashing it (for example with SHA1) is a
better idea, especially from the perspective of memory and
bandwidth.
Very short keys are often not a good idea. There is little point in writing "u1000flw" as a key if you can instead write
"user:1000:followers". The latter is more readable and the added
space is minor compared to the space used by the key object itself
and the value object. While short keys will obviously consume a bit
less memory, your job is to find the right balance.
Try to stick with a schema. For instance "object-type:id" is a good idea, as in "user:1000". Dots or dashes are often used for multi-word
fields, as in "comment:1234:reply.to" or "comment:1234:reply-to".
The maximum allowed key size is 512 MB.
From my experience any binary sequence typically means a String, but I may not be familiar with languages where you can achieve this by using other data types.
Keys in Redis are all strings, so it doesn't really matter what kind of value you pass into a client. Under-the-hood the RESP protocol is used and it will pass the value as a string to the engine.
Example:
ZADD some_key 1 some_value
some_key is always a string, even if you pass 3 as key, it is handled as a string. This is true for every client.

Out of Process in memory database table that supports queries for high speed caching

I have a SQL table that is accessed continually but changes very rarely.
The Table is partitioned by UserID and each user has many records in the table.
I want to save database resources and move this table closer to the application in some kind of memory cache.
In process caching is too memory intensive so it needs to be external to the application.
Key Value stores like Redis are proving inefficient due to the overhead of serializing and deserializing the table to and from Redis.
I am looking for something that can store this table (or partitions of data) in memory, but let me query only the information I need without serializing and deserializing large blocks of data for each read.
Is there anything that would provide Out of Process in memory database table that supports queries for high speed caching?
Searching has shown that Apache Ignite might be a possible option, but I am looking for more informed suggestions.
Since it's out-of-process, it has to do serialization and deserialization. The problem you concern is how to reduce the serialization/deserizliation work. If you use Redis' STRING type, you CANNOT reduce these work.
However, You can use HASH to solve the problem: mapping your SQL table to a HASH.
Suppose you have the following table: person: id(varchar), name(varchar), age(int), you can take person id as key, and take name and age as fields. When you want to search someone's name, you only need to get the name field (HGET person-id name), other fields won't be deserialzed.
Ignite is indeed a possible solution for you since you may optimize serialization/deserialization overhead by using internal binary representation for accessing objects' fields. You may refer to this documentation page for more information: https://apacheignite.readme.io/docs/binary-marshaller
Also access overhead may be optimized by disabling copy-on-read option https://apacheignite.readme.io/docs/performance-tips#section-do-not-copy-value-on-read
Data collocation by user id is also possible with Ignite: https://apacheignite.readme.io/docs/affinity-collocation
As the #for_stack said, Hash will be very suitable for your case.
you said that Each user has many rows in db indexed by the user_id and tag_id . So It is that (user_id, tag_id) uniquely specify one row. Every row is functional depends on this tuple, you could use the tuple as the HASH KEY.
For example, if you want save the row (user_id, tag_id, username, age) which values are ("123456", "FDSA", "gsz", 20) into redis, You could do this:
HMSET 123456:FDSA username "gsz" age 30
When you want to query the username with the user_id and tag_id, you could do like this:
HGET 123456:FDSA username
So Every Hash Key will be a combination of user_id and tag_id, if you want the key to be more human readable, you could add a prefix string such as "USERINFO". e.g. : USERINFO:123456:FDSA .
BUT If you want to query with only a user_id and get all rows with this user_id, this method above will be not enough.
And you could build the secondary indexes in redis for you HASH.
as the above said, we use the user_id:tag_id as the HASH key. Because it can unique points to one row. If we want to query all the rows about one user_id.
We could use sorted set to build a secondary indexing to index which Hashes store the info about this user_id.
We could add this in SortedSet:
ZADD user_index 0 123456:FDSA
As above, we set the member to the string of HASH key, and set the score to 0. And the rule is that we should set all score in this zset to 0 and then we could use the lexicographical order to do range query. refer zrangebylex.
E.g. We want to get the all rows about user_id 123456,
ZRANGEBYLEX user_index [123456 (123457
It will return all the HASH key whose prefix are 123456, and then we use this string as HASH key and hget or hmget to retrieve infomation what we want.
[ means inclusive, and ( means exclusive. and why we use 123457? it is obvious. So when we want to get all rows with a user_id, we shoud specify the upper bound to make the user_id string's leftmost char's ascii value plus 1.
More about lex index you could refer the article I mentioned above.
You can try apache mnemonic started by intel. Link -http://incubator.apache.org/projects/mnemonic.html. It supports serdeless features
For a read-dominant workload MySQL MEMORY engine should work fine (writing DMLs lock whole table). This way you don't need to change you data retrieval logic.
Alternatively, if you're okay with changing data retrieval logic, then Redis is also an option. To add to what #GuangshengZuo has described, there's ReJSON Redis dynamically loadable module (for Redis 4+) which implements document-store on top of Redis. It can further relax requirements for marshalling big structures back and forth over the network.
With just 6 principles (which I collected here), it is very easy for a SQL minded person to adapt herself to Redis approach. Briefly they are:
The most important thing is that, don't be afraid to generate lots of key-value pairs. So feel free to store each row of the table in a different key.
Use Redis' hash map data type
Form key name from primary key values of the table by a separator (such as ":")
Store the remaining fields as a hash
When you want to query a single row, directly form the key and retrieve its results
When you want to query a range, use wild char "*" towards your key. But please be aware, scanning keys interrupt other Redis processes. So use this method if you really have to.
The link just gives a simple table example and how to model it in Redis. Following those 6 principles you can continue to think like you do for normal tables. (Of course without some not-so-relevant concepts as CRUD, constraints, relations, etc.)
using Memcache and REDIS combination on top of MYSQL comes to Mind.

Redis keys function for match with multiple pattern

How i can find keys with multiple match pattern, for example i've keys with
foo:*, event:*, poi:* and article:* patterns.
how i find keys with redis keys function for match with foo:* or poi:* pattern, its like
find all keys with preffix foo:* or poi:*
You should not do this. KEYS is mainly a debug command. It is not supposed to be used for anything else.
Redis is not a database supporting ad-hoc queries: you are supposed to provide access paths for the data you put into Redis (using extra set or hash or zset indexes).
If you really need to run arbitrary boolean expressions on keys to select data, I would suggest to do it offline by using the rdb-redis-tools package.

Searching for values in Redis

Is it possible to search for occurrences of a specific value in redis?
It's easy enough to do the same for keys
SET firstname "John"
KEYS f?rstname
["firstname"]
But can one search for all occurrences of "John" or better yet "J*hn" ?
As far as I know there is no such option in Redis. As you mentioned KEYS pattern can be used to search for the keys with specific pattern, but similar functionality on values would result into search among all of the keys/fields/elements which may not be trivial since Redis has advanced data structures like hashes, sets and lists. Time complexity of this operation would be possibly even greater than O(N) which is why also KEYS command shouldn't be used in production environments.