Why doesn't HBase use a consensus algorithm like Raft or Paxos?

I know replication in HBase is done via append-only log files. To keep entries in the slave's WALs in the same order as on the master, I assumed there must be some consensus protocol. How is it designed so that no consensus protocol is needed?
By ordering I mean, for example, updating a boolean column to true and then to false. If the updates were written to the replica's WAL files in reverse, the final state could be true instead of false.

HBase uses a primary/secondary architecture in which only the primary receives requests. Because all writes go to a single place, it can achieve consistency without a consensus protocol.
Your question is similar to asking why Postgres does not use a consensus protocol. Postgres also has a primary/secondary style replication mechanism.
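To make the ordering argument concrete, here is a minimal toy sketch in Python. It is not how HBase is implemented; the `Primary` and `Replica` classes and the sequence-number scheme are assumptions made for illustration. The point is only that when a single writer decides the order, a replica can replay the log deterministically, even if entries arrive out of order.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Single writer: every write gets the next sequence number (hypothetical model)."""
    log: list = field(default_factory=list)

    def put(self, key, value):
        seq = len(self.log) + 1
        self.log.append((seq, key, value))


@dataclass
class Replica:
    """Replays shipped log entries strictly in sequence order."""
    state: dict = field(default_factory=dict)
    applied_seq: int = 0
    pending: dict = field(default_factory=dict)

    def receive(self, entry):
        seq, key, value = entry
        self.pending[seq] = (key, value)
        # Apply only the next expected entry; hold back anything that arrived early.
        while self.applied_seq + 1 in self.pending:
            self.applied_seq += 1
            k, v = self.pending.pop(self.applied_seq)
            self.state[k] = v


primary = Primary()
primary.put("flag", True)
primary.put("flag", False)

replica = Replica()
# Deliver the entries in reverse to show that arrival order does not matter.
for entry in reversed(primary.log):
    replica.receive(entry)

print(replica.state)
```

No voting between nodes is needed here because only one node ever assigns sequence numbers.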

Related

Sharing Redis among services

I have a use case in which Microservice A has to do some heavy computation periodically and store the result in a cache (Redis), something like a k8s cron job.
Microservice B depends on the cache written by A (B only reads; it never modifies the cache).
But it looks like a database is being shared here. Is this a good design?
(This AWS doc shows 2 different services using the same Redis)
The contents of redis should be treated as ephemeral, not permanent. It's a cache. There is nothing wrong with your design as long as your microservices, especially Microservice B, behave gracefully if they do not find what they expect in redis.
This is actually a very common practice in projects using Redis (for example, it's exactly how you set up Redis to act as a message broker: one end writes the message and the other reads it).
Databases are meant to be shared, especially these days, when a program can consist of hundreds of micro-parts.
You shouldn't have any issues related to Redis, BUT you HAVE to implement a fallback mechanism for Microservice B to handle the case in which no value is found, for example waiting for a timeout and then reading again, or falling back to some default value.
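A fallback along those lines could look roughly like the sketch below. The helper name `read_with_fallback`, the key `report:latest`, and the retry settings are made up for illustration, and a plain dict stands in for Redis so the snippet runs on its own. With redis-py you could pass the client's `get` method instead, since it also returns `None` for a missing key.

```python
import time


def read_with_fallback(get, key, default=None, retries=3, delay=0.1):
    """Try to read `key` a few times, then fall back to `default`."""
    for attempt in range(retries):
        value = get(key)
        if value is not None:
            return value
        # Wait before the next attempt, but not after the last one.
        if attempt < retries - 1:
            time.sleep(delay)
    return default


cache = {}  # in-memory stand-in for the shared Redis cache

# Microservice A has not written anything yet, so B gets the default.
missing = read_with_fallback(cache.get, "report:latest",
                             default="no data yet", retries=2, delay=0.01)

# Once A has written its result, B reads it normally.
cache["report:latest"] = "42 rows"
found = read_with_fallback(cache.get, "report:latest", default="no data yet")
```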

How to achieve multi-tenancy in Redis?

Since I am fairly new to Redis, I am exploring options to see how I can achieve multi-tenancy with it.
I read some documentation on the official Redis Labs page, and it looks like Redis cluster mode supports multi-tenancy out of the box with Redis Enterprise.
I am wondering whether such a solution for multi-tenancy is available in Sentinel mode as well.
I may be completely confused about the multi-tenancy that Redis Enterprise provides. Maybe it works in Sentinel mode too, but nothing seems very clear to me.
Can someone shed some light on multi-tenancy in Redis and which modes support it?
If you are going to use redis-cluster, then only one DB is supported.
Redis Cluster does not support multiple databases like the stand alone version of Redis. There is just database 0 and the SELECT command is not allowed.
If you are not going to use cluster mode, then you may want to take a look at the message the creator of Redis posted about multiple databases (years ago):
I understand how this can be useful, but unfortunately I consider
Redis multiple database errors my worst decision in Redis design at
all... without any kind of real gain, it makes the internals a lot
more complex. The reality is that databases don't scale well for a
number of reason, like active expire of keys and VM. If the DB
selection can be performed with a string I can see this feature being
used as a scalable O(1) dictionary layer, that instead it is not.
With DB numbers, with a default of a few DBs, we are communication
better what this feature is and how can be used I think. I hope that
at some point we can drop the multiple DBs support at all, but I think
it is probably too late as there is a number of people relying on this
feature for their work.
Salvatore's message
Redis cluster documentation
What I would suggest is prefixing. We use this method in a SaaS application, where keys of all data types are prefixed with the related customer name. We handle some of the operations in the application layer.
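As a rough illustration of the prefixing idea, here is a small sketch. The `tenant_key` helper and the `tenant:type:id` layout are assumptions, not something Redis prescribes, and a dict stands in for the Redis keyspace so the example is self-contained.

```python
def tenant_key(tenant, *parts):
    """Build a namespaced key such as 'acme:user:42' (hypothetical key layout)."""
    if ":" in tenant:
        # Keep the separator out of tenant names so prefixes cannot collide.
        raise ValueError("tenant name must not contain ':'")
    return ":".join([tenant, *map(str, parts)])


store = {}  # in-memory stand-in for a single Redis database
store[tenant_key("acme", "user", 42)] = "Alice"
store[tenant_key("globex", "user", 42)] = "Bob"

# Each tenant only ever touches keys under its own prefix.
acme_keys = sorted(k for k in store if k.startswith("acme:"))
```

Because the tenant is part of the key, this works the same way in standalone, Sentinel, and cluster deployments. Isolation is enforced by the application, not by Redis.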
If you want to go with a single instance and multiple databases, you need to manage them in your codebase using the SELECT command. There may be libraries that manage this for you. One critical thing to keep in mind:
All databases are still persisted in the same RDB / Append Only File.

Is there any concept of auto commit in hbase?

I am new to HBase and want to learn more. Is there any auto-commit concept available in HBase?
According to the HBase documentation, it is not an ACID-compliant database. However, it does guarantee certain specific properties.
This specification enumerates the ACID properties of HBase.
There is a concept of AutoFlush in HBase which is similar to autocommit.
However, if you are using Apache Phoenix to fetch or update data in HBase, you can set the property phoenix.connection.autoCommit to true; by default it is false.
Commits mainly matter in two places: insert/update (Put in HBase) and delete (Delete in HBase).
Since we are in a Big Data environment, the requirements are different when you are ingesting huge volumes of data.
As mentioned in the documentation, autoCommit should be set to false for better performance, so that records are committed in batches rather than individually. This helps with buffer handling in general and reduces load on the HBase region servers.
Delete
HBase does not modify data in place, and so deletes are handled by creating new markers called tombstones. These tombstones, along with the dead values, are cleaned up on major compactions
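The tombstone idea can be sketched with a toy append-only store. This is only a simplified model, not HBase code: `ToyStore` and its methods are invented for illustration, and it ignores column families, timestamps, and multi-version retention.

```python
TOMBSTONE = object()  # marker meaning "this row was deleted"


class ToyStore:
    """Append-only store: deletes add a marker instead of removing data."""

    def __init__(self):
        self.cells = []   # list of (row, version, value), never edited in place
        self.version = 0

    def _append(self, row, value):
        self.version += 1
        self.cells.append((row, self.version, value))

    def put(self, row, value):
        self._append(row, value)

    def delete(self, row):
        self._append(row, TOMBSTONE)

    def get(self, row):
        # The newest cell for the row wins; a tombstone hides older values.
        for r, _, value in reversed(self.cells):
            if r == row:
                return None if value is TOMBSTONE else value
        return None

    def major_compact(self):
        # Keep only the newest cell per row and drop rows whose newest cell is a tombstone.
        latest = {}
        for row, version, value in self.cells:
            latest[row] = (row, version, value)
        self.cells = [c for c in latest.values() if c[2] is not TOMBSTONE]


table = ToyStore()
table.put("r1", "a")
table.put("r2", "b")
table.delete("r1")

cells_before = len(table.cells)   # dead value and tombstone are still stored
visible = table.get("r1")         # but the row is already invisible to readers
table.major_compact()
cells_after = len(table.cells)    # only the live row remains
```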
One last word on Phoenix: any layer built on top of HBase will ultimately work according to the HBase architecture. Hope this helps in your design.

Redis replication and not RO slaves

Good day!
Suppose we have a Redis master and several slaves. The master's goal is to store all data, while the slaves are used for querying data for users. However, the querying is a bit complex and some temporary data needs to be stored. I also want to cache the query results for a couple of minutes.
How should I configure replication so that I can save temporary data and caches?
Redis slaves have optional support for accepting writes; however, you have to understand a few limitations of writable slaves before using them, since they have non-trivial issues.
Keys created on the slaves will not support expires. Actually in recent versions of Redis they appear to work but are actually leaked instead of expired, until the next time you resynchronize the slave with the master from scratch or issue FLUSHALL or alike. There are deep reasons for this issue... it is currently not clear if we'll deprecate writable slaves at all, find a solution, or deny expires for writable slaves.
You may want, anyway, to use a different Redis numerical DB (SELECT command) in order to store your intermediate data (you may use MULTI/.../MOVE/EXEC transaction in order to generate your intermediate results in the currently selected DB where data belongs, and MOVE the keys off to some other DB, so it will be clear if keys are accumulating and you can FLUSHDB from time to time).
The keys you create on your slave are volatile: they may go away at any moment when the master resynchronizes with the slave. This does not look like an issue for you, since if the key is no longer there you can recompute it, but care should be taken.
If you promote this slave to a master, it will have additional keys inside.
So there are definitely things to keep in mind in this setup, but it is doable. However, you may want to consider alternative strategies:
Lua scripts on the slave side, in order to filter your data inside Lua. This is often not as fast as Redis C commands.
Precomputation of data directly in the actual data set in order to make your queries possible just using read only commands.
MIGRATE in order to migrate interesting keys from a slave to an instance (another master) designed specifically to perform post-computations.
It is hard to tell what the best strategy is without an in-depth analysis of the actual use case / problem, but I hope these general guidelines help.

Is this a good use-case for Redis on a ServiceStack REST API?

I'm creating a mobile app and it requires an API service backend to get/put information for each user. I'll be developing the web service on ServiceStack, but was wondering about the storage. I love the idea of a fast in-memory caching system like Redis, but I have a few questions:
I created a sample schema of what my data store should look like. Does this seem like a good case for using Redis as opposed to a MySQL DB or something like that?
schema http://www.miles3.com/uploads/redis.png
How difficult is the setup for persisting the Redis store to disk or is it kind of built-in when you do writes to the store? (I'm a newbie on this NoSQL stuff)
I currently have my setup on AWS using a Linux micro instance (because it's free for a year). I know many factors go into this answer, but in general will this be enough for my web service and Redis? Since Redis is in-memory will that be enough? I guess if my mobile app skyrockets (hey, we can dream right?) then I'll start hitting the ceiling of the instance.
What to think about when designing a NoSQL Redis application
1) To develop correctly in Redis, you should think more about how you would structure the relationships in your C# program, i.e. with the C# collection classes, rather than a relational model meant for an RDBMS. The better mindset is to think about data storage like a document database rather than RDBMS tables. Essentially everything gets blobbed in Redis via a key (index), so you just need to work out what your primary entities are (i.e. aggregate roots), which would each get kept in their own 'key namespace', and which are non-primary entities, i.e. simply metadata that should just get persisted with its parent entity.
Examples of Redis as a primary Data Store
Here is a good article that walks through creating a simple blogging application using Redis:
http://www.servicestack.net/docs/redis-client/designing-nosql-database
You can also look at the source code of RedisStackOverflow for another real world example using Redis.
Basically you would need to store and fetch the items of each type separately.
var redisUsers = redis.As<User>();
var user = redisUsers.GetById(1);
var userIsWatching = redisUsers.GetRelatedEntities<Watching>(user.Id);
The way you store relationship between entities is making use of Redis's Sets, e.g: you can store the Users/Watchers relationship conceptually with:
SET["ids:User>Watcher:{UserId}"] = [{watcherId1},{watcherId2},...]
Redis is schema-less and idempotent
Storing ids into redis sets is idempotent i.e. you can add watcherId1 to the same set multiple times and it will only ever have one occurrence of it. This is nice because it means you don't ever need to check the existence of the relationship and can freely keep adding related ids like they've never existed.
Related: writing to or reading from a Redis collection (e.g. a List) that does not exist is the same as working with an empty collection, i.e. a list gets created on the fly when you add an item to it, whilst accessing a non-existent list simply returns 0 results. This is a friction-free productivity win, since you don't have to define your schemas up front in order to use them. Should you need to, though, Redis provides the EXISTS operation to determine whether a key exists and the TYPE operation to determine its type.
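The behaviour described above can be sketched in Python. To keep the snippet runnable without a server, `TinySetStore` is a made-up in-memory stand-in that mimics only the set semantics discussed here; its method names follow the Redis commands SADD, SMEMBERS, and EXISTS, but it is not a real client.

```python
from collections import defaultdict


class TinySetStore:
    """In-memory stand-in for the Redis set commands used here."""

    def __init__(self):
        self._sets = defaultdict(set)

    def sadd(self, key, member):
        # Returns 1 if the member was new, 0 if it was already present (as SADD does).
        before = len(self._sets[key])
        self._sets[key].add(member)
        return len(self._sets[key]) - before

    def smembers(self, key):
        # A missing key behaves like an empty set and is not created by reading.
        return set(self._sets.get(key, set()))

    def exists(self, key):
        return int(bool(self._sets.get(key)))


r = TinySetStore()
first = r.sadd("ids:User>Watcher:1", 7)      # set is created on the fly
second = r.sadd("ids:User>Watcher:1", 7)     # adding again changes nothing
members = r.smembers("ids:User>Watcher:1")
empty = r.smembers("ids:User>Watcher:999")   # never written, reads as empty
```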
Create your relationships/indexes on your writes
One thing to remember is that because there are no implicit indexes in Redis, you will generally need to set up the indexes/relationships needed for reading yourself, during your writes. Basically you need to think about all your query requirements up front and ensure you set up the necessary relationships at write time. The above RedisStackOverflow source code is a good example that shows this.
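A rough Python sketch of maintaining indexes at write time follows. Plain dicts and sets stand in for Redis keys and sets, and the function names, the `urn:watcher:` key, and the reverse index key `ids:Watcher>User:` are assumptions for illustration. Only the `ids:User>Watcher:{UserId}` pattern comes from the text above.

```python
from collections import defaultdict

entities = {}               # stand-in for entity blobs stored by key
indexes = defaultdict(set)  # stand-in for Redis sets used as indexes


def add_watch(user_id, watcher_id, watcher_name):
    """Store the watcher and update both lookup directions in the same write."""
    entities[f"urn:watcher:{watcher_id}"] = {"id": watcher_id, "name": watcher_name}
    indexes[f"ids:User>Watcher:{user_id}"].add(watcher_id)
    # Hypothetical reverse index, needed only if you query "who does X watch?".
    indexes[f"ids:Watcher>User:{watcher_id}"].add(user_id)


def watchers_of(user_id):
    """Read path: follow the index, then fetch each entity by key."""
    ids = sorted(indexes.get(f"ids:User>Watcher:{user_id}", set()))
    return [entities[f"urn:watcher:{i}"]["name"] for i in ids]


add_watch(1, 10, "dana")
add_watch(1, 11, "eli")
add_watch(2, 10, "dana")
names = watchers_of(1)
```

If a query direction was not anticipated at write time, there is no index for it, and the only option left is scanning keys.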
Note: the ServiceStack.Redis C# provider assumes you have a unique field called Id that is its primary key. You can configure it to use a different field with the ModelConfig.Id() config mapping.
Redis Persistence
2) Redis supports 2 persistence modes out of the box: RDB and Append Only File (AOF). RDB writes routine snapshots, whilst the Append Only File acts like a transaction journal recording all the changes in between snapshots. I recommend enabling both until you're comfortable with what each does and what your application needs. You can read all about Redis persistence at http://redis.io/topics/persistence.
Note: Redis also supports trivial replication, which you can read more about at http://redis.io/topics/replication
Redis loves RAM
3) Since Redis operates predominantly in memory, the most important requirement is having enough RAM to hold your entire dataset in memory, plus a buffer for when it snapshots to disk. Redis is very efficient, so even a small AWS instance will be able to handle a lot of load; what you want to look for is enough RAM.
Visualizing your data with the Redis Admin UI
Finally if you're using the ServiceStack C# Redis Client I recommend installing the Redis Admin UI which provides a nice visual view of your entities. You can see a live demo of it at:
http://servicestack.net/RedisAdminUI/AjaxClient/