Failover AND Load Balancing - mutually exclusive?

For the next generation of one of our products, I have been asked to design a system that has both failover capability (i.e. there are several nodes, and if one node crashes there is minimal or no data loss) and load balancing (so each node only handles part of the data). What I can't quite grok is how I can do both. Suppose a node has all the data but only processes an agreed subset. It changes element 8, say. Now all the other nodes have a stale element 8. So I need to sync - tell all the other nodes that element 8 changed - to maintain integrity. But surely that just makes a mockery of load balancing?!

The short answer is, it depends very much on your application architecture.
It sounds like you are approaching this through a design anti-pattern: trying to solve for scale-out processing and disaster recovery (DR) at the same time, in the same layer. If each node only handles part of the data, then it can't be a failover for the other nodes. A lot of people fall into this trap, since both scale-out and DR can be implemented using a type of federation ... but don't confuse the mechanism with the objective. I would respectfully submit you need to think about this problem a little differently.
The way to approach this problem is in two entirely separate layers:
Layer 1 -- app. Devise a high-level design for your app as if there were no DR requirement. Ignore the fact that there may be another instance of this app elsewhere that will be used in DR. Focus on the functional & performance aspects of your app -- what the distinct subsystems should be, and whether any should scale out for workload reasons. This app as a whole handles 100% of the data -- decide whether a scale-out / federation approach is needed within the app itself; that decision does not relate to the DR requirement.
Layer 2 -- DR. Now think of your app as a black box. How many instances of the black box will you need to meet your availability requirements, and how will you maintain the required degree of synchronization between those instances? What are the performance requirements for failover & recovery (time to availability, allowable data loss if any, how long before you need the next failover environment up & running)?
Back to Layer 1 -- choose an implementation approach for your high-level design that uses the recovery approach and tools you identified in Layer 2. For example, if you will use a master-slave DB approach for data synchronization among DR nodes, store everything you want to preserve across a failover in the DB layer, not in app-node-local files or memory. These choices depend on the DR framework you choose.
The design of the app layer and DR layer are related, but if you pick the right tools & approach, they don't have to be strongly coupled. E.g. in Amazon Web Services, you can use IP load balancing to forward requests to the failover app instance, and if you store all relevant data (including sessions and other transient things) in a database and use the DBMS native replication capability, it's pretty simple.
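For instance, here is a minimal sketch of that last point -- keeping session state in the DB layer rather than in app-node memory, so any instance behind the load balancer (including a freshly promoted DR instance) can serve the next request. The "sessions" table and the H2-style MERGE upsert are my assumptions, not part of any particular framework; adjust for your DBMS:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Hypothetical session store: transient state lives in the replicated
    // DB layer, not in any single app node's memory, so a failover
    // instance sees it too.
    public class DbSessionStore {
        private final String jdbcUrl;

        public DbSessionStore(String jdbcUrl) {
            this.jdbcUrl = jdbcUrl;
        }

        // Upsert the session payload keyed by session id.
        // MERGE syntax here is H2-style; adjust for your DBMS.
        public void save(String sessionId, String payload) throws SQLException {
            try (Connection c = DriverManager.getConnection(jdbcUrl);
                 PreparedStatement ps = c.prepareStatement(
                         "MERGE INTO sessions (id, payload) KEY (id) VALUES (?, ?)")) {
                ps.setString(1, sessionId);
                ps.setString(2, payload);
                ps.executeUpdate();
            }
        }

        // Any node -- including a freshly promoted DR instance -- can load it.
        public String load(String sessionId) throws SQLException {
            try (Connection c = DriverManager.getConnection(jdbcUrl);
                 PreparedStatement ps = c.prepareStatement(
                         "SELECT payload FROM sessions WHERE id = ?")) {
                ps.setString(1, sessionId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString(1) : null;
                }
            }
        }
    }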
Bottom line:
Don't confuse performance scale-out nodes (app-internal) with DR nodes (entire apps)
Use your choice of DR approach to drive implementation decisions in the app layer
Good luck

Related

Multiple microservices and Redis - one database vs node per application in cloud

I would like to know the best practice for using Redis in the cloud (Google Memorystore in my case, Standard Tier) for multiple microservices/applications. From what I have researched so far, the following options are available:
Use a single cluster and database, scaled horizontally, for all the microservices. This seems most cost-effective, as I will use the exact number of nodes the whole system needs. Data isolation is impacted here, but I can reduce the impact, e.g. by prefixing the keys with the microservice name (see the sketch at the end of this question).
Use separate clusters and databases for each microservice. In this case the isolation is better, and scaling a cluster will impact a single microservice only, but this doesn't seem cost-effective, as many nodes may be underloaded (e.g. microservice M1 utilizes 50% of a node's capacity and microservice M2 utilizes 40%, so in option 1 both microservices would be served by a single node).
In theory I could use multiple databases to isolate data in a single cluster, but as far as I have read this is not supported in Redis Cluster (and using multiple databases on a single node causes performance issues).
I am leaning towards option 1., but perhaps I am missing something?
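For reference, the key prefixing mentioned in option 1 can be isolated in a thin wrapper so no service ever touches another service's keys. A minimal sketch using the Jedis client (the wrapper and all names are hypothetical):

    import redis.clients.jedis.Jedis;

    // Hypothetical per-service wrapper: every key is namespaced with the
    // owning microservice's name, giving soft isolation on a shared instance.
    public class NamespacedRedis {
        private final Jedis jedis;
        private final String prefix;

        public NamespacedRedis(Jedis jedis, String serviceName) {
            this.jedis = jedis;
            this.prefix = serviceName + ":";
        }

        public void set(String key, String value) {
            jedis.set(prefix + key, value);
        }

        public String get(String key) {
            return jedis.get(prefix + key);
        }
    }

    // Usage: new NamespacedRedis(new Jedis("memorystore-host", 6379), "orders")
    //        .set("cart:42", "...") writes the key "orders:cart:42".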
Not sure about best practices, but I can tell you my experience.
In general I would go with Option #2.
Each microservice gets its own Redis instance or cluster.
Each Redis cluster then follows its own microservice's life; e.g. it might get respawned when you redeploy or restart the service.
You might pay a bit more, but you gain in resiliency and save on maintenance hassle.

Load balanced instances of Moqui using the same DB instance

Is this configuration in Moqui possible? Everything I've seen on the subject of multiple instances (e.g. this question and the framework doc pages) involves per-instance databases, rather than a common shared data set.
We need the same data available in each application instance (and a consistent cache) so that we can load balance end-users across multiple instances. We will be supporting users world-wide, so we may potentially need to create application instances closer to the user's actual location in order to reduce latency; we also want to ensure we can make best use of elastic horizontal scaling in cloud-based deployments.
Multi-tenant and the newer multi-instance variation on that are the opposite of what you're looking for. They are for large numbers of small instances, not a single large distributed instance with multiple application server instances running against the same database.
For clustering support Moqui uses Hazelcast by default, though that is done through a series of interfaces that can be implemented with other distributed computing tools. Here is the component needed to run a multi-server cluster with Hazelcast:
https://github.com/moqui/moqui-hazelcast
The most important aspects of clustering are cache invalidation for the entity (database) caches and web session replication. It also supports other tools for distributing workload and data as mentioned in the readme.
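To make the clustering idea concrete, here is a minimal stand-alone Hazelcast sketch (Hazelcast 4+ APIs; this is not Moqui's actual wiring, which lives in the moqui-hazelcast component linked above). It shows a distributed map that every node in the cluster sees, which is the same primitive that shared caches and session replication are built on:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;

    public class ClusterCacheDemo {
        public static void main(String[] args) {
            // Each app server runs one Hazelcast instance; instances on the
            // same network discover each other and form a cluster.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // A distributed map: entries written here are visible to every
            // node, so a cache write or invalidation on one node is seen
            // by the others.
            IMap<String, String> cache = hz.getMap("entity-cache");
            cache.put("Product:1001", "cached-record");

            System.out.println(cache.get("Product:1001"));
            hz.shutdown();
        }
    }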
For distribution across multiple data centers or geographical regions there are much bigger issues. Moqui Framework is primarily for transactional applications like accounting, inventory management, etc. that need strict transactional consistency. Big-data or NoSQL-style eventual consistency and similar approaches do not do well with ERP and other transactional applications; there is no way to use locks and such in the database to protect against double spend of funds, double reservation or issuance of inventory, and so on.
Consider the challenge of distributed relational transactional databases, i.e. multi-master database clusters. With multi-master setups a transaction must propagate to and commit on all master nodes before it can be considered committed. This has performance impacts even if all master nodes are on the same local network, and an unreasonable performance impact if the master nodes are in different data centers or geographical regions.
The main solution to this is geographical sharding at the application level, usually mirroring the structure of a large business with geographic divisions. Moqui has some tool-level support for this sort of thing, using Entity Sync or other tools to feed data from geographic regions to a central server (or cluster) where reporting, etc. can be done. There is no OOTB Entity Sync or other configuration for this sort of deployment; it's not something there has been demand for yet. This only makes sense for extremely large global corporations, not a market where Moqui has any use to my knowledge.
If you're looking at doing something like ecommerce and need the ecommerce sites distributed more widely, the problem is easier than coordinating inventory or accounting across multiple global entities. For that, just have separate ecommerce instances in different data centers feeding order and other data to the Moqui ERP instance, very much like any typical external ecommerce application.

Data model design guidelines with GEODE

We are soon going to start something with GEODE for reference data. I would like to get some guidelines for it.
As you know, in the financial reference data world there exist complex relationships between various reference data entities like Instrument, Account, Client, etc., which might be available in a database in 3NF.
If my queries are mostly read-intensive and require joins across tables (2-5 tables), what's the best way to deal with this in an in-memory grid?
Case 1:
Separate regions for all tables in your database, and then do a similar join using OQL as you would in the database?
Even then, you will have to design it with care so that related entities are always co-located within the same partition.
Model 1-to-many and many-to-many relationships using an object graph?
Case 2:
If you know what your join queries look like, create a view model per join query with equi-join characteristics.
Confusion:
(1) I have one join query requiring Employee, Department using emp.deptId = dept.deptId [OK, fantastic: one region with such a view model exists]
(2) I have another join query requiring Employee, Department, Salary, Address joins to address a different requirement
So again I have to create a view model to address (2), which will contain similar Employee and Department data as (1). This may soon hit the memory threshold.
Changes in the database can still be managed by event listeners, but what are the recommendations for that?
Thanks,
Dharam
I think your general question is pretty broad, and there isn't just one recommended approach to cover all use cases (primarily all the analytical views/models of your data required by your application(s)).
Such questions involve many factors, such as the size of individual data elements, the volume of data, the frequency of access or access patterns originating from the application or applications, the timely delivery of information, how accurate the data needs to be, the size of your cluster, the physical resources of each (virtual) machine, and so on. Thus, any given approach will undoubtedly require application tuning, tuning GemFire accordingly and JVM tuning regardless of your data model. Still, a carefully crafted data model can determine the extent of such tuning.
In GemFire specifically, such tuning will involve different configuration options such as, but not limited to: data management policies, eviction (overflow) and expiration (LRU, or perhaps custom) settings along with different eviction/expiration thresholds, maybe storing data in off-heap memory, employing different partition strategies (PartitionResolver), and so on.
For example, if your Address information is relatively static, unchanging (i.e. actual "reference" data) then you might consider storing Address data in a REPLICATE Region. Data that is written to frequently (typically "transactional" data) is better off in a PARTITION Region.
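In code, that split looks something like this with the Geode Java API (the region names come from the example above; the rest is a hypothetical sketch of a single stand-alone peer):

    import org.apache.geode.cache.Cache;
    import org.apache.geode.cache.CacheFactory;
    import org.apache.geode.cache.Region;
    import org.apache.geode.cache.RegionShortcut;

    public class RegionSetup {
        public static void main(String[] args) {
            Cache cache = new CacheFactory().create();

            // Static reference data: fully copied to every member, so reads
            // (and joins against it) are always local.
            Region<String, String> address = cache
                    .<String, String>createRegionFactory(RegionShortcut.REPLICATE)
                    .create("Address");

            // Frequently written transactional data: split across members.
            Region<String, String> employee = cache
                    .<String, String>createRegionFactory(RegionShortcut.PARTITION)
                    .create("Employee");

            System.out.println(address.getName() + ", " + employee.getName());
        }
    }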
Of course, as you know, any PARTITION data (managed in separate Regions) you "join" in a query (using OQL) must be collocated. GemFire/Geode does not currently support distributed joins.
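Colocation is typically arranged by giving related entries the same routing object. One common pattern is a composite key whose prefix drives routing; a hypothetical sketch (key format and names are my assumptions):

    import org.apache.geode.cache.EntryOperation;
    import org.apache.geode.cache.PartitionResolver;

    // Hypothetical resolver: route Employee entries by department id so all
    // employees of a department land in the same bucket as the department
    // itself, making an OQL equi-join on deptId possible.
    public class DeptPartitionResolver implements PartitionResolver<String, Object> {
        @Override
        public Object getRoutingObject(EntryOperation<String, Object> op) {
            // Assume keys look like "deptId|empId"; the part before '|' routes.
            return op.getKey().split("\\|")[0];
        }

        @Override
        public String getName() {
            return "DeptPartitionResolver";
        }

        @Override
        public void close() {
        }
    }

The resolver is then attached to each related PARTITION region via PartitionAttributesFactory.setPartitionResolver(...), along with colocated-with settings.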
Additionally, certain nodes could host certain Regions, thus dividing your cluster into "transactional" vs. "analytical" nodes, where the analytical-based nodes are updated from CacheListeners on Regions in transactional nodes (be careful of this), or perhaps better yet, asynchronously using an AEQ with AsyncEventListeners. AEQs can be separately made highly available and durable as well. This transactional vs analytical approach is the basis for CQRS.
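Here is a skeletal sketch of such an AsyncEventListener (names hypothetical; the wiring via cache.createAsyncEventQueueFactory().create("analyticsQueue", listener) and the region's async-event-queue-id is the standard Geode mechanism, not something specific to this answer):

    import java.util.List;

    import org.apache.geode.cache.asyncqueue.AsyncEvent;
    import org.apache.geode.cache.asyncqueue.AsyncEventListener;

    // Hypothetical listener that feeds an analytical view from
    // transactional updates, asynchronously and off the write path.
    public class AnalyticsFeedListener implements AsyncEventListener {
        @Override
        public boolean processEvents(List<AsyncEvent> events) {
            for (AsyncEvent<?, ?> event : events) {
                // Transform/enrich and write into the analytical Region here.
                System.out.println("sync to analytics: " + event.getKey());
            }
            return true; // true = batch processed; the queue may discard it
        }

        @Override
        public void close() {
        }
    }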
The size of your data is also impacted by the form in which it is stored, i.e. serialized vs. not serialized, and GemFire's proprietary serialization format (PDX) is quite optimal compared with Java Serialization. It all depends on how "portable" your data needs to be and whether you can keep your data in serialized form.
Also, you might consider how expensive it is to join the data on the fly. Meaning, if you are able to aggregate, transform and enrich data at runtime relatively cheaply (compute vs. memory/storage), then you might consider using GemFire's Function Execution service, bringing your logic to the data rather than the data to your logic (the fundamental basis of MapReduce).
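A skeletal sketch of such a data-aware function (hypothetical; the real work of reading colocated entries is elided):

    import org.apache.geode.cache.execute.Function;
    import org.apache.geode.cache.execute.FunctionContext;

    // Hypothetical data-aware function: it runs on the members that host
    // the data, aggregates locally, and ships back only the small result.
    public class LocalAggregateFunction implements Function<Void> {
        @Override
        public void execute(FunctionContext<Void> context) {
            // ... read colocated entries via the context's region,
            // aggregate/transform/enrich in place ...
            context.getResultSender().lastResult("aggregate-result");
        }

        @Override
        public String getId() {
            return "LocalAggregateFunction";
        }
    }

It would be invoked with something like FunctionService.onRegion(region).execute(new LocalAggregateFunction()), so the computation travels to the data rather than the other way around.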
You should know, and I am sure you are aware, GemFire is a key-value store, therefore mapping a complex object graph into separate Regions is not a trivial problem. Dividing objects up by references (especially many-to-many) and knowing exactly when to eagerly vs. lazily load them is a hard problem, especially in a distributed, replicated data store such as GemFire where consistency and availability tradeoffs exist.
There are different APIs and frameworks to simplify persistence and querying with GemFire. One of the more notable approaches is Spring Data GemFire's extension of Spring Data Commons Repository abstraction.
It also might be a matter of using the right data model for the job. If you have very complex data relationships, then perhaps creating analytical models using a graph database (such as Neo4j) would be a simpler option. Spring also provides great support for Neo4j, led by the Neo4j team.
Any design choice you make will undoubtedly involve a hybrid approach. Often the path is not clear, since it really "depends" (i.e. depends on the application and data access patterns, load, all that).
But one thing is for certain: make sure you have a good working knowledge and understanding of the underlying data store and its data management capabilities, particularly as they pertain to consistency and availability.
Note, there is also a GemFire Slack channel as well as an Apache DEV mailing list you can use to reach out to the GemFire experts and the community of (advanced) GemFire/Geode users if you have more specific problems as you proceed down this architectural design path.

DDD - Persistence Model and Domain Model

I am trying to learn domain-driven design (DDD), and I think I got the basic idea. But there is something confusing me.
In DDD, are the persistence model and the domain model different things? I mean, we design our domain and classes with only domain concerns in mind; that's okay. But after that, when we are building our repositories or any other data persistence system, should we create another representation of our model to use in the persistence layer?
I was thinking our domain model is used in persistence too, meaning our repositories return our domain objects from queries. But today, I read this post, and I'm a little confused:
Just Stop It! The Domain Model Is Not The Persistence Model
If that's true what would be the advantage of having separate persistence objects from domain objects?
Just think of it this way, the domain model should be dependent upon nothing and have no infrastructure code within it. The domain model should not be serializable or inherit from some ORM objects or even share them. These are all infrastructure concerns and should be defined separate from the domain model.
But that is only if you're going for pure DDD and your project values scalability and performance over speed of initial development. Many times, mixing infrastructure concerns with your "domain model" can help you achieve great strides in speed at the cost of scalability. The point is, you need to ask yourself, "Are the benefits of pure DDD worth the cost in speed of development?". If your answer is yes, then here is the answer to your question.
Let's start with an example where your application begins with a domain model and it just so happens that the tables in the database match your domain model exactly. Now, your application grows by leaps and bounds and you begin to experience performance issues when querying the database. You have applied a few well-thought-out indexes, but your tables are growing so rapidly that it looks like you may need to de-normalize your database just to keep up. So, with the help of a DBA, you come up with a new database design that will handle your performance needs, but now the tables are vastly different from the way they were before, and chunks of your domain entities are spread across multiple tables rather than there being one table for each entity.
This is just one example, but it demonstrates why your domain model should be separate from your persistence model. In this example, you don't want to break out the classes of your domain model to match the changes you made to the persistence model design and essentially change the meaning of your domain model. Instead, you want to change the mapping between your new persistence model and the domain model.
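A tiny sketch of that mapping idea (all names hypothetical): the domain entity stays untouched while a mapper reassembles it from the de-normalized persistence shape.

    // Domain model: no persistence concerns, no ORM annotations.
    class Customer {
        private final String id;
        private final String name;

        Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }

        String id() { return id; }
        String name() { return name; }
    }

    // Persistence model: mirrors the (de-normalized) table layout, which is
    // free to change without touching the domain class above.
    class CustomerRecord {
        String customerId;
        String firstName;
        String lastName;
    }

    // Only the mapper changes when the table design changes.
    class CustomerMapper {
        Customer toDomain(CustomerRecord r) {
            return new Customer(r.customerId, r.firstName + " " + r.lastName);
        }

        CustomerRecord toRecord(Customer c) {
            CustomerRecord r = new CustomerRecord();
            r.customerId = c.id();
            String[] parts = c.name().split(" ", 2);
            r.firstName = parts[0];
            r.lastName = parts.length > 1 ? parts[1] : "";
            return r;
        }
    }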
There are several benefits to keeping these designs separate such as scalability, performance, and reaction time to emergency db changes, but you should weigh them against the cost and speed of initial development. Generally, the projects that will gain the most benefit from this level of separation are large-scale enterprise applications.
UPDATE FOR COMMENTATORS
In the world of software development, there is always any number of possible solutions. Because of this, there exists an inverse relationship between flexibility and initial speed of development. As a simple example, I could hard-code logic into a class, or I could write a class that allows dynamic logic rules to be passed into it. The former option would have a higher speed of development, but at the price of a lower degree of flexibility. The latter option would have a higher degree of flexibility, but at the cost of a lower speed of development. This holds true in every coding language because there is always more than one possible solution.
Many tools are available that help you increase your initial development speed and flexibility. For example, an ORM tool may increase the speed of development for your database access code while also giving you the flexibility to choose whatever specific database implementations the ORM supports. From your perspective, this is a net gain in both time and flexibility minus the cost of the tool (some of which are free) which may or may not be worth it to you based on the cost of development time relative to the value of the business need.
But, for this conversation about coding styles, which is essentially what Domain Driven Design is, you have to account for the time it took to write the tool you're using. If you were to write that ORM tool, or even write your database access logic in such a way that it supports all of the implementations that tool gives you, it would take much longer than if you were to just hard-code the specific implementation you plan on using.
In summary, tools can help you offset your own time to production and the price of flexibility, often by distributing the cost of that time across everyone who purchases the tool. But any code, including code that utilizes a tool, will remain subject to the speed/flexibility relationship. In this way, Domain Driven Design allows for greater flexibility than if you were to entangle your business logic, database access, service access, and UI code all together, but at the cost of time to production. Domain Driven Design serves enterprise-level applications better than small applications, because enterprise-level applications tend to have a greater cost for initial development time in relation to business value, and because they are more complex, they are also more subject to change, requiring greater flexibility at a reduced cost in time.
In DDD, are persistence model and domain model different things?
In DDD you have the domain model and the repository. That's it! Whether, inside the repository, you persist the domain model directly or convert it to a persistence model before persisting it is up to you! It's a matter of design, your design.
The domain doesn't care about how models are saved. It's an implementation detail of the repository and it doesn't matter for the domain. That's the entire purpose of Repositories: encapsulate persistence logic & details inside it.
But as developers we know it's not always possible to build a domain 100% immune from persistence interference, even though they are different things. Here in this post I detail some pros & cons of having the domain model completely free and isolated from the persistence model.
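A minimal sketch of that encapsulation (hypothetical names; an in-memory map stands in for the real persistence detail):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;

    // The domain only sees this interface; how orders are stored is invisible.
    interface OrderRepository {
        void save(Order order);
        Optional<Order> findById(String id);
    }

    // Hypothetical domain entity.
    class Order {
        final String id;
        Order(String id) { this.id = id; }
    }

    // One possible implementation; swapping it for a JDBC- or ORM-backed one
    // never touches the domain model.
    class InMemoryOrderRepository implements OrderRepository {
        private final Map<String, Order> store = new HashMap<>();

        public void save(Order order) { store.put(order.id, order); }

        public Optional<Order> findById(String id) {
            return Optional.ofNullable(store.get(id));
        }
    }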
In DDD, are persistence model and domain model different things?
Yes, but that does not necessarily imply a different set of classes to explicitly represent the persistence model.
If using a relational database for persistence, an ORM such as NHibernate can take care of representing the persistence model through mappings to domain classes. In this case there are no explicit persistence model classes. The success of this approach depends on the mapping capabilities of the ORM. NHibernate, for example, can support an intermediate mapping class through component mappings. This allows the use of an explicit persistence model class when the need arises.
If using a document database for persistence, there is usually even less need for a persistence model since the domain model only needs to be serializable in order to be persisted.
Therefore, use an explicit persistence model class when there is a complex mapping that cannot be attained with ORM mappings to the domain model. The difference between the domain model and the persistence model remains regardless of implementation.

Distributed Database Computing - Is it really possible within the RDBMS paradigm?

I am asking this in the context of NoSQL - which achieves scalability and performance without being expensive.
So, if I needed to achieve massively parallel distributed computing across databases ...
What are the various methodologies available today (within the RDBMS paradigm) to achieve distributed computing with high-scalability?
Does database clustering & mirroring contribute in any way towards distributed computing?
I guess you are asking about the scalability of RDBMS databases. NoSQL databases (those based on Amazon Dynamo and BigTable) are a whole other topic; I am talking about HBase, Cassandra, etc. There are also commercial products like Oracle Coherence, which is more like a distributed cache and key-value store, to put it crudely.
Going back to RDBMS:
Sharding
To scale an RDBMS one can do custom sharding. Sharding is a technique where you have multiple tables on possibly multiple hosts, and then you decide in a certain fashion to assign certain rows to certain tables. For example, you can say that rows 1-1M go to table1, rows 1M-2M go to table2, etc. But this is a difficult process from an administration point of view. A lot of large-scale websites scale by relying on sharding. Other techniques worth mentioning are partitioning, MySQL federation, and MySQL Cluster.
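A toy sketch of that row-range routing rule (hypothetical; real shard maps are usually table-driven and support rebalancing):

    // Hypothetical range-based shard router: rows 1-1M -> shard 0,
    // rows 1M-2M -> shard 1, and so on.
    public class RangeShardRouter {
        private static final long ROWS_PER_SHARD = 1_000_000L;
        private final String[] shardJdbcUrls;

        public RangeShardRouter(String[] shardJdbcUrls) {
            this.shardJdbcUrls = shardJdbcUrls;
        }

        // Pick the shard (host + table) that owns this row id.
        public String shardFor(long rowId) {
            int shard = (int) ((rowId - 1) / ROWS_PER_SHARD);
            if (shard >= shardJdbcUrls.length) {
                throw new IllegalArgumentException("row id beyond last shard: " + rowId);
            }
            return shardJdbcUrls[shard];
        }
    }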
MPP databases
Then there are databases that are themselves full RDBMSs and do the distribution and scaling for you. Teradata is the most successful of these companies. I believe they used Postgres core code at some point. A significant number of Fortune 500 companies and a lot of the airlines use Teradata. But it's ridiculously expensive. There are newer companies like Greenplum, Vertica, and Netezza.
Unless you're a very big company with extreme scalability requirements, you can scale your DB horizontally, with ACID guarantees, by building a cluster of identical RDBMS instances and synchronizing them with JTA transactions.
Take a look at this Java/JDBC-based article; the JEPLayer framework is used, but you can use straight JDBC and JTA code.
Within the RDBMS paradigm: Sharding.
Outside the RDBMS paradigm: Key-value stores.
My pick (I come from an RDBMS background): key-value stores of the tabular type - HBase.
Within the RDBMS paradigm, sharding will not get you far.
Use the RDBMS paradigm to design your model, to get your project up and running.
Use tabular key-value stores to SCALE OUT.
Sharding:
A good way to think about sharding is to see it as user-account-oriented DB design. All the schema entities touched by a user account are kept on one host. The assignment of user to host happens when the user creates an account: the least-loaded host gets that user. When that user signs on after account creation, he gets connected to the host that has his data. Each host has a set of user accounts.
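A toy sketch of that assignment logic (hypothetical; "load" here is just the account count per host):

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical user-account shard assignment: a new account goes to the
    // least-loaded host, and sign-ins are routed to the host that owns the user.
    public class AccountShardDirectory {
        private final Map<String, Integer> accountsPerHost = new HashMap<>();
        private final Map<String, String> userToHost = new HashMap<>();

        public AccountShardDirectory(String... hosts) {
            for (String host : hosts) accountsPerHost.put(host, 0);
        }

        // Called once, at account creation time.
        public String assign(String userId) {
            String leastLoaded = null;
            for (Map.Entry<String, Integer> e : accountsPerHost.entrySet()) {
                if (leastLoaded == null || e.getValue() < accountsPerHost.get(leastLoaded)) {
                    leastLoaded = e.getKey();
                }
            }
            accountsPerHost.merge(leastLoaded, 1, Integer::sum);
            userToHost.put(userId, leastLoaded);
            return leastLoaded;
        }

        // Called at every sign-in: the user's data lives on exactly one host.
        public String hostFor(String userId) {
            return userToHost.get(userId);
        }
    }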
The problem with this approach is that if a host gets hosed, a fraction of users will be blacked out. The solution is to have a replicated standby host that becomes the primary when the primary host encounters problems.
Also, it's a fairly rigid setup, suited to processes where the design does not change dramatically.
From the user standpoint, I've noticed that web sites with a sharded DB backend are not as quick to "turn on a dime" to create different business models on their platform. Contrast this with web sites that have truly distributed key-value stores. These businesses can host any range of services; their platform is just that - a platform. It's not relational, and it does have an API interface, but it just seems to work.