Design Redis database table like SQL? - sql

Suppose my database table structure is like this
id name college address
1 xxx nnn xn
2 yyy nnm yn
3 zzz nnz zn
If I want to get the student details based on the name, in SQL I would write:
select * from student where name = 'xxx'
How is this possible in a Redis database?

Redis, like other NoSQL datastores, has different requirements based on what you are going to be doing.
Redis has several data structures that could be useful depending on your need. For example, given your desire for a select * from student where name = 'xxx' you could use a Redis hash.
redis 127.0.0.1:6379> hmset student:xxx id 1 college nnn address xn
OK
redis 127.0.0.1:6379> hgetall student:xxx
1) "id"
2) "1"
3) "college"
4) "nnn"
5) "address"
6) "xn"
If you have other queries though, like you want to do the same thing but select on where college = 'nnn' then you are going to have to denormalize your data. Denormalization is usually a bad thing in SQL, but in NoSQL it is very common.
If your primary query will be against the name, but you may need to query against the college, then you might do something like adding a set in addition to the hashes.
redis 127.0.0.1:6379> sadd college:nnn student:xxx
(integer) 1
redis 127.0.0.1:6379> smembers college:nnn
1) "student:xxx"
With your data structured like this, if you wanted to find all information for the students going to college nnn, you would first read the set, then fetch each hash named by the keys returned from the set.
Your requirements will generally drive the design and the structures you use.
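The two-step lookup described above (read the set, then read each hash) can be sketched in Python. This is only an illustration under stated assumptions: plain dicts and sets stand in for a Redis server, and the helper names add_student and students_at are hypothetical, not part of any Redis client.

```python
# In-memory stand-ins for Redis structures (assumption: no server involved).
hashes = {}   # key -> dict of fields, like a Redis hash
sets = {}     # key -> set of members, like a Redis set

def add_student(name, student_id, college, address):
    """Store one row as a hash and index it by college."""
    key = f"student:{name}"
    hashes[key] = {"id": student_id, "college": college, "address": address}
    sets.setdefault(f"college:{college}", set()).add(key)

def students_at(college):
    """Two-step lookup: read the set, then fetch each hash it names."""
    keys = sorted(sets.get(f"college:{college}", set()))
    return {key: hashes[key] for key in keys}

add_student("xxx", "1", "nnn", "xn")
add_student("yyy", "2", "nnm", "yn")
add_student("zzz", "3", "nnz", "zn")

by_name = hashes["student:xxx"]      # like HGETALL student:xxx
by_college = students_at("nnn")      # like SMEMBERS + HGETALL per member
```

The cost of the denormalization is visible in add_student: every write has to update both the hash and the index set, and a delete would have to clean up both as well.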

With just 6 principles (which I collected here), it is very easy for a SQL-minded person to adapt to the Redis approach. Briefly, they are:
The most important thing: don't be afraid to generate lots of key-value pairs. Feel free to store each row of the table under a different key.
Use Redis' hash map data type.
Form the key name from the table's primary key values, joined by a separator (such as ":").
Store the remaining fields as a hash.
When you want to query a single row, form the key directly and retrieve its result.
When you want to query a range, use the wildcard character "*" in the key pattern.
The link just gives a simple table example and how to model it in Redis. Following those 6 principles you can continue to think as you do for normal tables. (Of course without some not-so-relevant concepts such as CRUD, constraints, relations, etc.)
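As a rough illustration of the key-naming principles, here is a Python sketch that builds keys from a table name and primary key values, reads a single row by key, and matches a range of keys with "*". A dict stands in for Redis, fnmatch only approximates Redis glob patterns, and the helper names (make_key, put_row, get_row, match_keys) and the "course" row are hypothetical.

```python
from fnmatch import fnmatchcase

store = {}  # key -> dict of fields; a stand-in for Redis hashes

def make_key(table, *primary_key_values):
    """Join the table name and primary key values with ':'."""
    return ":".join([table, *map(str, primary_key_values)])

def put_row(table, primary_key_values, fields):
    """Store the non-key columns as the fields of one hash."""
    store[make_key(table, *primary_key_values)] = dict(fields)

def get_row(table, *primary_key_values):
    """Single-row query: build the key and read it directly."""
    return store.get(make_key(table, *primary_key_values))

def match_keys(pattern):
    """Range query: glob-match every key (a full scan, so use sparingly)."""
    return sorted(k for k in store if fnmatchcase(k, pattern))

put_row("student", [1], {"name": "xxx", "college": "nnn", "address": "xn"})
put_row("student", [2], {"name": "yyy", "college": "nnm", "address": "yn"})
put_row("course", [7], {"title": "hypothetical"})

one = get_row("student", 2)
all_students = match_keys("student:*")
```

Note that match_keys walks every key in the store, which mirrors why pattern matching over the whole keyspace is expensive on a real server.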

For plain, vanilla Redis the other answers are completely correct. However, yesterday (02 December 2016) Redis 4-rc1 was released.
Redis v4 provides support for modules, and I just wrote a small module to embed SQLite into Redis itself: rediSQL.
With that module you can actually use a fully functional SQL database inside your Redis instance.

Redis just has some basic data structures; NoSQL and SQL are different worlds. But you can use Redis like a schema-based SQL data store. There is a fun program, Redisql, on GitHub that tries to work with Redis via SQL, and the idea behind Redisql is the one @sberry mentioned.

I hope it is not too late, since the original question was asked six years ago. You may try my dbx plugin: https://github.com/cscan/dbx
It supports simple SQL for maintaining hashes in Redis. Something like this:
127.0.0.1:6379> dbx.select name, tel from phonebook where gender = 'F' order by age desc
or calling from the shell:
$ redis-cli "dbx.select name, tel from phonebook where gender = 'F' order by age desc"
Hope this helps.

You can try the searchbox framework. searchbox provides an easy way to query Redis data with its Criteria API.

OnceDB is a full-text search in-memory database based on Redis. It supports data management like SQL relational databases and NoSQL schemaless databases.
OnceDB does not change the data storage structure of Redis and is fully compatible with Redis. Redis database files can be directly operated in OnceDB and then returned to Redis for use.
OnceDB automatically creates auxiliary indexes through operators:
= Ordinary field value, no index
# Primary key
? Grouping index
* Keyword grouping index, separated by ',' between keywords
\ Sort index, the score weight of the index is the value of the field
For example, execute the following command to add user data:
upsert user username # dota password = 123456 title ? SDEI skills * java,go,c
> OK
You can search the index with an operator, for example to find user data containing the keyword c and print the username and password fields:
find user 0 -1 username = * password = * skills * c
1) (integer) 1
2) "user:dota"
3) "dota"
4) "123456"
5) "java,go,c"
Read more:OnceDB quick start

In SQL database design, we first put everything into the database and then figure out how we will query it.
In Redis design, we first figure out which queries we need to answer, and then structure our data accordingly.
That is why Redis is super fast. Redis stores data as a hash in some cases. If a record has many attributes (in your case, a student might have "age", "name" and "class" attributes), storing "student" as a hash is useful.
In Redis, when you build your application, you have to look at what you are going to store (users, sessions, products) and, based on what your app is storing, plan which data structure to use for each thing.

Related

Out of Process in memory database table that supports queries for high speed caching

I have a SQL table that is accessed continually but changes very rarely.
The Table is partitioned by UserID and each user has many records in the table.
I want to save database resources and move this table closer to the application in some kind of memory cache.
In process caching is too memory intensive so it needs to be external to the application.
Key Value stores like Redis are proving inefficient due to the overhead of serializing and deserializing the table to and from Redis.
I am looking for something that can store this table (or partitions of data) in memory, but let me query only the information I need without serializing and deserializing large blocks of data for each read.
Is there anything that would provide Out of Process in memory database table that supports queries for high speed caching?
Searching has shown that Apache Ignite might be a possible option, but I am looking for more informed suggestions.
Since it's out-of-process, it has to do serialization and deserialization. Your concern is how to reduce the serialization/deserialization work. If you use Redis' STRING type, you CANNOT reduce this work.
However, you can use a HASH to solve the problem by mapping your SQL table to HASHes.
Suppose you have the following table: person: id(varchar), name(varchar), age(int). You can take the person id as the key, and name and age as fields. When you want to look up someone's name, you only need to get the name field (HGET person-id name); the other fields won't be deserialized.
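The difference between the two layouts can be sketched in Python. This is an assumption-laden illustration: dicts stand in for the Redis keyspace, the row values and the key person:42 are made up, and JSON is just one possible serialization format.

```python
import json

# Assumption: plain dicts stand in for the Redis keyspace.
strings = {}  # key -> serialized blob, like a Redis STRING
hashes = {}   # key -> {field: value}, like a Redis HASH

person = {"name": "Ann", "age": 31}  # hypothetical row

# STRING approach: the whole row is serialized into a single value.
strings["person:42"] = json.dumps(person)

# HASH approach: each column becomes its own field.
hashes["person:42"] = {field: str(value) for field, value in person.items()}

def name_from_string(key):
    """Must decode the entire blob to read one column."""
    return json.loads(strings[key])["name"]

def name_from_hash(key):
    """Reads a single field, like HGET <key> name."""
    return hashes[key]["name"]
```

With the hash layout the cost of a read scales with the fields requested rather than with the size of the whole row, which is the point the answer makes about avoiding deserialization of unused fields.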
Ignite is indeed a possible solution for you since you may optimize serialization/deserialization overhead by using internal binary representation for accessing objects' fields. You may refer to this documentation page for more information: https://apacheignite.readme.io/docs/binary-marshaller
Also access overhead may be optimized by disabling copy-on-read option https://apacheignite.readme.io/docs/performance-tips#section-do-not-copy-value-on-read
Data collocation by user id is also possible with Ignite: https://apacheignite.readme.io/docs/affinity-collocation
As @for_stack said, a hash is very suitable for your case.
You said that each user has many rows in the database, indexed by user_id and tag_id, so (user_id, tag_id) uniquely identifies one row. Every row is functionally dependent on this tuple, so you can use the tuple as the HASH key.
For example, if you want to save the row (user_id, tag_id, username, age) with values ("123456", "FDSA", "gsz", 20) into Redis, you could do this:
HMSET 123456:FDSA username "gsz" age 20
When you want to query the username by user_id and tag_id, you could do this:
HGET 123456:FDSA username
So every hash key is a combination of user_id and tag_id. If you want the key to be more human-readable, you could add a prefix string such as "USERINFO", e.g. USERINFO:123456:FDSA.
But if you want to query with only a user_id and get all rows for that user_id, the method above is not enough.
You can build a secondary index in Redis for your hashes.
As said above, we use user_id:tag_id as the hash key because it uniquely identifies one row. Now suppose we want to query all the rows for one user_id.
We can use a sorted set as a secondary index that records which hashes store info about this user_id.
We add the hash key to the sorted set:
ZADD user_index 0 123456:FDSA
As above, we set the member to the hash key string and the score to 0. The rule is that every score in this zset must be 0, so that members are ordered lexicographically and we can run range queries over them; see ZRANGEBYLEX.
E.g. to get all rows for user_id 123456:
ZRANGEBYLEX user_index [123456 (123457
It returns all the hash keys whose prefix is 123456; we then use each of those strings as a hash key with HGET or HMGET to retrieve the information we want.
[ means inclusive and ( means exclusive. Why 123457? Because every key that starts with 123456 sorts before 123457. So when we want all rows for a user_id, we specify the upper bound by adding 1 to the ASCII value of the user_id string's rightmost character.
For more about lexicographical indexes, refer to the article mentioned above.
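The lexicographical range query can be sketched in Python with a sorted list and bisect standing in for a sorted set whose scores are all 0. The function names are hypothetical. One deliberate variation from the answer, labeled here as an assumption: the bounds in rows_for_user include the ":" separator, because the plain bounds [123456 (123457 would also match keys of a longer id such as 1234560.

```python
from bisect import bisect_left, insort

user_index = []  # sorted list of members; stands in for a zset with all scores 0

def index_add(member):
    """Like ZADD user_index 0 <member>; keeps members in sorted order."""
    if member not in user_index:
        insort(user_index, member)

def range_by_lex(low_inclusive, high_exclusive):
    """Like ZRANGEBYLEX user_index [low (high."""
    start = bisect_left(user_index, low_inclusive)
    stop = bisect_left(user_index, high_exclusive)
    return user_index[start:stop]

def rows_for_user(user_id):
    """All hash keys for one user: from 'id:' up to, not including, 'id;'."""
    # ';' is the character right after ':' in ASCII, so this range covers
    # exactly the keys that start with '<user_id>:'.
    return range_by_lex(f"{user_id}:", f"{user_id};")

for key in ["123456:FDSA", "123456:ABCD", "123457:QWER", "1234560:ZZZZ"]:
    index_add(key)

found = rows_for_user("123456")
loose = range_by_lex("123456", "123457")  # the bounds used in the answer
```

Each key in found can then be passed to HGET or HMGET to read the row itself.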
You can try Apache Mnemonic, started by Intel. Link: http://incubator.apache.org/projects/mnemonic.html. It supports serdeless features.
For a read-dominant workload, the MySQL MEMORY engine should work fine (write DML statements lock the whole table). This way you don't need to change your data retrieval logic.
Alternatively, if you're okay with changing the data retrieval logic, then Redis is also an option. To add to what @GuangshengZuo has described, there's ReJSON, a dynamically loadable Redis module (for Redis 4+) which implements a document store on top of Redis. It can further relax the requirements for marshalling big structures back and forth over the network.
With just 6 principles (which I collected here), it is very easy for a SQL-minded person to adapt to the Redis approach. Briefly, they are:
The most important thing: don't be afraid to generate lots of key-value pairs. Feel free to store each row of the table under a different key.
Use Redis' hash map data type.
Form the key name from the table's primary key values, joined by a separator (such as ":").
Store the remaining fields as a hash.
When you want to query a single row, form the key directly and retrieve its result.
When you want to query a range, use the wildcard character "*" in the key pattern. But be aware that scanning keys blocks other Redis operations, so use this method only if you really have to.
The link just gives a simple table example and how to model it in Redis. Following those 6 principles you can continue to think as you do for normal tables. (Of course without some not-so-relevant concepts such as CRUD, constraints, relations, etc.)
Using a combination of Memcache and Redis on top of MySQL comes to mind.

Mysql representation in redis

I don't have any experience with Redis and I would like to use it to cache MySQL results in PHP, but it is getting a bit complicated.
In MySQL I have two tables:
tbl_users: id, username, first_name, last_name
tbl_orders: order_id, order_name, order_date........
Suppose I would like to store both the users and orders table records in Redis. I pictured a JSON looking like:
{
"users":{
"1":{"username":"jane", "first_name":"user1", "last_name":".."},
"2":{"username":"jane", "first_name":"user1", "last_name":".."}
...........
},
"orders":{
"1":{"order_name":"jane", "order_date":"user1",....},
"2":{"order_name":"jane", "order_date":"user1", .....}
}
}
In this case, should I create two Redis servers, one for users and the other for orders, or how should I go about this?
First, you should read An introduction to Redis data types and abstractions, to understand what Redis is and what it can do.
It sounds like you have 3 things you want to cache in Redis:
A specific user:
This is what Redis is best at.
SET user:1 '{username:"jane", first_name:"user1", last_name:".."}',
then read the user JSON back with GET user:1
All users:
This probably isn't worth caching in Redis, since MySQL will be quite fast at returning all the rows in tbl_users already.
You could use a Redis list (adding rows with LPUSH and reading all of them with LRANGE {listname} 0 -1) to cache it.
But you're just duplicating the functionality of a database table at this point, so I really wouldn't recommend it.
Orders for a specific user:
Again, MySQL should be able to do this efficiently for you, with a good index on tbl_orders, but Redis can cache it using a list as well.
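Caching a single user as a JSON string could look roughly like the Python sketch below. Everything in it is a stand-in chosen for illustration: a dict plays the role of Redis, FAKE_DB and load_user_from_db are hypothetical placeholders for the MySQL query, and the user values are invented.

```python
import json

cache = {}  # stand-in for Redis string keys

# Hypothetical rows that would normally come from tbl_users in MySQL.
FAKE_DB = {1: {"username": "jane", "first_name": "Jane", "last_name": "Doe"}}
db_reads = 0

def load_user_from_db(user_id):
    """Placeholder for a SELECT against tbl_users."""
    global db_reads
    db_reads += 1
    return FAKE_DB.get(user_id)

def get_user(user_id):
    """Check the cache first, then fall back to the database and fill the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)              # like GET user:1
    if cached is not None:
        return json.loads(cached)
    row = load_user_from_db(user_id)
    if row is not None:
        cache[key] = json.dumps(row)     # like SET user:1 '<json>'
    return row

first = get_user(1)
second = get_user(1)
```

Only the first call reaches the database; the second is served from the cached JSON. A real setup would also need an expiry or invalidation rule for when the MySQL row changes, which this sketch leaves out.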

How to design Redis data structures in order to perform queries similar to DB queries in redis?

I have tables like Job and JobInfo, and I want to perform queries like the one below:
"SELECT J.JobID FROM Job J, JobInfo B WHERE B.JobID = J.JobID AND BatchID=5850 AND B.Status=0 AND J.JobType<>2"
How should I go about designing my Redis data types so that I can map such queries to Redis?
If I try to map the rows of table Job to a Redis hash, e.g. (hash j jobid 1 status 2), and similarly the rows of table JobInfo to another Redis hash, e.g. (hash jinfo jobid 1 jobtype 3),
then my tables can be sets of hashes. The Job table can be a set with entries JobSet:jobid and the JobInfo table can be a set with entries like JobInfoSet:jobid.
But I am confused about what happens when I do a SINTER on JobSet and JobInfoSet. How am I going to query those hashes to get keys? The hash content of set JobSet is not identical to the hash content of set JobInfoSet (they may have different key-value pairs).
So what exactly am I going to get as the output of SINTER? And how am I going to query that output as key-value pairs?
So the tables will be a collection of Redis hashes.
Redis is not designed to structure data the SQL way. Besides being an in-memory key-value store, it supports five types of data structures: Strings, Hashes, Lists, Sets and Sorted Sets. At a high level, this is a sufficient hint that Redis is designed to solve performance problems that arise from heavy computation in relational data models. However, if you want to execute SQL queries against an in-memory structure, you may want to look at MemSQL.
Let's break down the SQL statement into different components and I'll try to show how redis can accomplish various parts.
Select J.JobID, J.JobName from Job J;
We translate each row in "Job" into a hash in Redis, using the SQL primary key as the natural key in Redis.
For example:
SQL
==JobId==|==Name==
123 Fred
Redis
HSET Job:123 Name Fred
which can be conceptualized as
Job-123 => {"Name":"Fred"}
Thus we can store columns as hash fields in redis
Let's say we do the same thing for JobInfo. Each JobInfo object has its own ID
JobInfo-876 => {"meta1": "some value", "meta2": "bla", "JobID": "123"}
In sql normally we would make a secondary index on JobInfo.JobID but in NoSql land we maintain our own secondary indexes.
Sorted Sets are great for this.
Thus when we want to fetch JobInfo objects by some field, JobId in this case we can add it to a sorted set like this
ZADD JobInfo-JobID 123 JobInfo-876
This results in a set with 1 element in it {JobInfo-876} which has a score of 123. I realize that forcing all JobIDs into the float range for the score is a bad idea, but work with me here.
Now when we want to find all JobInfo objects for a given JobID we just do an O(log N) lookup into the index.
ZRANGEBYSCORE JobInfo-JobID 123 123
which returns "JobInfo-876"
Now to implement simple joins we simply reuse this JobInfo-JobID index by storing Job keys by their JobIDs.
ZADD JobInfo-JobID 123 Job-123
Thus when doing something akin to
SELECT J.JobID, J.Name, B.meta1 FROM Job, JobInfo USING (JobID).
This would translate to scanning through the JobInfo-JobID secondary index and reorganizing the Job and JobInfo objects returned.
ZRANGEBYSCORE JobInfo-JobID -inf +inf WITHSCORES
123 -> (Job-123, JobInfo-876)
These objects all share the same JobID. Client-side, you would then asynchronously fetch the needed fields. Or you could embed these lookups in a Lua script, but a long-running Lua script can make Redis hang for a long time. Normally Redis tries to be fair with clients and prefers short, batched queries over one long query.
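The client-side regrouping step can be sketched in Python. This is a simplified illustration, not the answer's actual implementation: a dict of member to score stands in for the JobInfo-JobID sorted set, another dict stands in for the hashes, and the helper names zadd, members_by_score and join_rows are hypothetical.

```python
from collections import defaultdict

# Stand-in for the sorted set JobInfo-JobID: member -> score (the JobID).
job_index = {}

def zadd(score, member):
    """Like ZADD JobInfo-JobID <score> <member>."""
    job_index[member] = score

def members_by_score():
    """Like ZRANGEBYSCORE ... -inf +inf WITHSCORES, grouped by score."""
    groups = defaultdict(list)
    for member, score in sorted(job_index.items(), key=lambda kv: (kv[1], kv[0])):
        groups[score].append(member)
    return dict(groups)

hashes = {
    "Job-123": {"Name": "Fred"},
    "JobInfo-876": {"meta1": "some value", "meta2": "bla", "JobID": "123"},
}
zadd(123, "JobInfo-876")
zadd(123, "Job-123")

def join_rows():
    """Pair each Job with the JobInfo objects that share its score."""
    rows = []
    for job_id, members in members_by_score().items():
        jobs = [m for m in members if m.startswith("Job-")]
        infos = [m for m in members if m.startswith("JobInfo-")]
        for job in jobs:
            for info in infos:
                rows.append((job_id, hashes[job]["Name"], hashes[info]["meta1"]))
    return rows

joined = join_rows()
```

The join here relies on the key prefix to tell Job members from JobInfo members, which is one simple way to separate the two object types that share the index.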
Now we come to a big problem, what if we want to combine secondary indexes. Let's say we have a secondary index on JobInfo.Status, and another on Job.JobType.
If we make a set of all jobs with the right JobType and use that as a filter on the JobInfo-JobID shared secondary index then we not only eliminate the bad Job elements but also every JobInfo element. We could, I guess fetch the scores(JobID) on the intersection and refetch all JobInfo objects with those scores, but we lose some of the filtering we did.
It is at this point where redis breaks down.
Here is an article on secondary indexes from the creator of redis himself: http://redis.io/topics/indexes
He touches on multi-dimensional indexes for filtering purposes. As you can see, he designed the data structures in a very versatile way. One of the most appealing properties is that sorted set elements with the same score are stored in lexicographical order. Thus you can easily give all elements a score of 0, piggyback on Redis's speed, and use it more like CockroachDB, which relies on a global order to implement many SQL features.
The other answers are completely correct for Redis up to version 3.2.
The latest releases of Redis, from 4.0 onward, include support for modules.
Modules are extremely powerful, and it happens that I just wrote a small module to embed SQLite into Redis itself: rediSQL.
With that module you can actually use a fully functional SQL database inside your Redis instance.

How to perform select on a massive dataset of 10 billion+ rows

When a user registers, email must be unique, and the registration check must take 1 second at most.
How do Facebook / Google manage to perform a select on a table with several billion rows and get an instant response?
Is it as simple as:
select email from users where email = "xxx@yyy.zzz" limit 1
Does having an index on email field and running this query on a super fast server do the trick?
Or is there more to it?
Short answer: yes. Though with that much data, you may want to look into things like sharding to make things even faster.
When using SQL, indexing and uniqueness can be ensured with primary keys. The database backend uses these primary keys to ensure that there are no duplicates in the table. Because the keys are also used to index the rows in the table, lookup is much quicker even on a large data set. Set the primary key to be the email address and you should be good to go in this case.
Even when using NoSQL databases like Mongo, Cassandra, etc. it is necessary to create indices on your data so that lookup is quick.
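A small-scale sketch of the primary-key approach, using Python's built-in sqlite3 module. SQLite is an assumption chosen only so the example runs without a server; the table layout, the function names email_taken and register, and the sample address are hypothetical, and nothing here says anything about how it behaves at billions of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key gives both the uniqueness guarantee and the lookup index.
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY)")

def email_taken(email):
    """Indexed point lookup, equivalent to the SELECT ... LIMIT 1 check."""
    row = conn.execute(
        "SELECT 1 FROM users WHERE email = ? LIMIT 1", (email,)
    ).fetchone()
    return row is not None

def register(email):
    """Insert and let the constraint reject duplicates; True on success."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False

first_try = register("user@example.com")
second_try = register("user@example.com")
```

Letting the constraint reject the duplicate insert, rather than running a SELECT and then an INSERT, also avoids the race where two registrations with the same email pass the check at the same moment.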

"Grouping" data in Redis

I have recently been looking at Redis and it seems almost perfect as I am doing something that mostly needs key-value based data structures.
As someone who has mostly used MySQL as a database, I have got used to grouping data in tables, and I am quite confused: when reading about Redis I have seen no mention of tables or any other way of grouping data. Does this mean there is no concept of tables in Redis?
For example, if I had a simple website where users could post comments about other users, in a relational database I could have a table "users" and a table "comments". How would this be done using Redis?
Hopefully this is clear enough, thanks in advance.
Yes, redis is a super-powered key-value store, not a relational database. There are no tables.
However, something can be done. Take a look at LamerNews. It's a hackernews-like site that uses redis as its data store.
Users can be stored in a SET or LIST in Redis.
User comments can be stored in a HASH, with fields named commenter:commented and the comment as the value. So if user1 comments on user2 with some text like "Hello hw do u do?", then our HASH, which we can call UserComments, will have this key and value:
Key= user1:user2
value = "Hello hw do u do?"
From the HASH you can get all the comments posted by users at any time, and if you tokenize the key you get the commenter and the commented user.
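The tokenizing idea can be sketched in Python, with a dict standing in for the single UserComments hash. The function names and the extra sample comments are hypothetical.

```python
user_comments = {}  # stand-in for one Redis hash named UserComments

def add_comment(commenter, commented, text):
    """Like HSET UserComments <commenter>:<commented> <text>."""
    user_comments[f"{commenter}:{commented}"] = text

def comments_about(user):
    """Split each field name to recover who commented on whom."""
    result = []
    for field, text in user_comments.items():
        commenter, commented = field.split(":", 1)
        if commented == user:
            result.append((commenter, text))
    return result

add_comment("user1", "user2", "Hello hw do u do?")
add_comment("user3", "user2", "Hi")
add_comment("user1", "user3", "Welcome")

about_user2 = comments_about("user2")
```

One consequence of this layout is that a hash field holds a single value, so a second comment from user1 about user2 would overwrite the first. Keeping several comments per pair would need a different structure, such as a list per pair.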