Hi, I'm looking for an in-memory data grid or something similar.
My use case:
- Data partitioned in memory, with scale-out available
- Backup nodes available
- Persistent backup available
- (optional) Free or open-source solution
I did some googling and found the candidates below:
- Apache Ignite
- Redis cluster
- Hazelcast (Community)
I prefer Ignite to Hazelcast because Ignite supports using direct buffers.
But I don't know whether Redis Cluster partitioning is stable, and I don't know whether Apache Ignite performs better than Redis Cluster.
Is Apache Ignite comparable to Redis Cluster, or is that an improper comparison?
Thanks for your answer
But I don't know whether Redis Cluster partitioning is stable
The Redis Cluster feature has been stable since the 3.x releases and is used in production by many companies.
Is Apache Ignite comparable to Redis Cluster, or is that an improper comparison?
Comparing Apache Ignite with Redis alone is misleading, because the two projects are at different levels: Redis is positioned as a storage engine, not as a data grid like Apache Ignite. For a proper comparison, Apache Ignite should be compared with Redisson - a Redis Java client with in-memory data grid features. It offers much the same feature set as Apache Ignite.
Redisson supports fully managed Redis services such as AWS ElastiCache and Azure Cache for Redis, so you don't need to deploy and maintain a Redis cluster yourself or hire DevOps to do it. Apache Ignite doesn't offer such a service, so you have to deploy and maintain it yourself.
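As a minimal sketch of how similar the two look at the data-grid level (assuming a local Redis on 127.0.0.1:6379, an embedded Ignite node, and a hypothetical map name; the calls are the standard Redisson and Ignite Java APIs):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.redisson.Redisson;
import org.redisson.api.RMap;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class GridComparisonSketch {
    public static void main(String[] args) {
        // Redisson: a distributed map backed by Redis (assumes Redis on localhost:6379).
        Config config = new Config();
        config.useSingleServer().setAddress("redis://127.0.0.1:6379");
        RedissonClient redisson = Redisson.create(config);
        RMap<String, String> redisMap = redisson.getMap("users");
        redisMap.put("id-1", "alice");

        // Ignite: a distributed cache on an embedded Ignite node.
        Ignite ignite = Ignition.start();
        IgniteCache<String, String> cache = ignite.getOrCreateCache("users");
        cache.put("id-1", "alice");

        redisson.shutdown();
        ignite.close();
    }
}
```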
I used Redis in production for one of the largest US mobile network operators (IoT department). Redis has been stable since 2.8 (master/slave), but cluster mode has been stable since 3.2. We ran 2.8 for three years and a 3.2 cluster for two years in production at about 50k TPS, with no restarts for years and no issues (except BGSAVE and memory issues, but those were due to RAM limitations).
If we compare Redis and Apache Ignite:
Performance. Redis is faster: single-threaded and 100% in memory.
Data structure. Redis is a key-value store (even that is not much of a limitation; you can model almost anything as key-value). Ignite is a data grid, as mentioned above.
If you are looking for an in-memory data grid and performance is a secondary priority, then Ignite will be more appropriate for you.
Redis only provides key-value storage, while Ignite is much more functional. Here is a good feature comparison provided by GridGain: https://www.gridgain.com/resources/product-comparisons/redis-comparison
Which one to use, depends on your requirements and expectations.
Related
We have tested both Redis installed on an Azure VM and Azure Redis Cache; both perform the same and I can't see a difference. Has anyone used both in a large-scale application? If so, can you share the performance and durability of each?
I have analysed the following:
Monitoring
In-zone replication
Multi-zone replication
Auto fail-over
Data persistence
Backup
Pricing
SSL Authentication & Encryption
For all of the above, Azure Redis Cache has the upper hand.
Still, I want to make sure which one is the best.
Does using a VM have any bottlenecks?
I would go for Azure Redis Cache, mainly because it's fully managed. At the end of the day there are still nodes under the hood, but why should you have to care about maintaining a VM? Hotfixes, patches, security updates, and so on.
I would ask the question the other way around: why should you use VMs at all?
MG
I've been doing some experiments with Apache Ignite and I've started to look into WAN replication. By this I mean there would be two (or more) data centres, each running an Ignite cluster, and some caches that I would like to keep in sync between the data centres.
Does Apache Ignite support this? If so, how is it configured? I can't find any mention of it in the documentation.
At the moment Ignite does not support caches spanning multiple clusters (nor cache mirroring). If, however, you mean a single Ignite cluster consisting of nodes located in different data centers (across a WAN), that would be possible, though most likely inefficient, since you would have to use replicated mode.
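As a rough sketch of that single-cluster approach (assuming the standard Ignite Java API and a hypothetical cache name), a replicated cache keeps a full copy on every node, which is exactly what makes it expensive over a WAN link:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class ReplicatedCacheSketch {
    public static void main(String[] args) {
        // Start an embedded Ignite node that joins the (single, WAN-spanning) cluster.
        Ignite ignite = Ignition.start();

        // REPLICATED mode stores a full copy of the cache on every node,
        // so every update travels to every data center.
        CacheConfiguration<String, String> cfg = new CacheConfiguration<>("wanCache");
        cfg.setCacheMode(CacheMode.REPLICATED);

        ignite.getOrCreateCache(cfg).put("key", "value");
    }
}
```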
GridGain provides asynchronous WAN replication on top of Ignite as part of their paid solution: https://www.gridgain.com/docs/latest/administrators-guide/data-center-replication/configuring-replication
The Spring XD documentation (http://docs.spring.io/spring-xd/docs/1.0.0.RC1/reference/html/) recommends running ZooKeeper as an ensemble so that ZooKeeper is highly available. There is not a lot of detail about making Redis highly available.
If I were to run 2 XD admin instances and, say, 4 container instances, I see 3 options:
Should I run a Redis instance on each server that runs a container or admin? In that case, does the distributed runtime work properly with different Redis instances handling transport for different modules?
OR
Should I run one Redis instance on a separate server and configure all XD instances to talk to it? In this case the single Redis instance is not highly available.
OR
Should I configure Redis Cluster or Redis Sentinel for high availability? I am not sure how XD or any other client would connect to a cluster or HA setup.
Thanks
I would suggest that you run a single Redis instance; there are some persistence settings you can change that may meet your requirements.
http://redis.io/topics/persistence
We will be adding support for Redis Sentinel, certainly in the Spring XD 1.1 release, but possibly in a maintenance release, depending on what library changes we need to pick up. Spring Data Redis and Spring Boot have recent updates to support Redis Sentinel.
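For illustration only (this is a sketch against the Spring Data Redis API, not the eventual Spring XD wiring; the master name, hosts, and ports are placeholders), a Sentinel-aware connection factory looks roughly like this:

```java
import org.springframework.data.redis.connection.RedisSentinelConfiguration;
import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;
import org.springframework.data.redis.core.StringRedisTemplate;

public class SentinelConnectionSketch {
    public static void main(String[] args) {
        // Point the client at the Sentinels; they tell it who the current master is.
        RedisSentinelConfiguration sentinelConfig = new RedisSentinelConfiguration()
                .master("mymaster")
                .sentinel("127.0.0.1", 26379)
                .sentinel("127.0.0.1", 26380);

        JedisConnectionFactory factory = new JedisConnectionFactory(sentinelConfig);
        factory.afterPropertiesSet();

        // Writes go to the current master; failover is handled by Sentinel.
        StringRedisTemplate template = new StringRedisTemplate(factory);
        template.opsForValue().set("greeting", "hello");
    }
}
```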
If you are using Redis as a message transport and want stronger guarantees, I would switch to the Rabbit HA configuration of the MessageBus.
Cheers,
Mark
I am new to:
Apache ZooKeeper : ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Apache Mesos : Apache Mesos is a cluster manager that simplifies the complexity of running applications on a shared pool of servers.
Apache Helix : Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes.
Erlang Language : Erlang is a programming language used to build massively scalable, soft real-time systems with requirements on high availability.
It sounds to me that Helix and Mesos are both useful as cluster management systems. How are they related to ZooKeeper? It would be great if someone could give me a real-world example of their usage.
I am also curious how [BOINC][1] distributes tasks to its clients. Does it use any of the above technologies? (Forget about Erlang.)
I just need a brief overview :)
Erlang was built by Ericsson and designed for use in phone systems. By design, it runs hundreds, thousands, or even tens of thousands of small processes that handle tasks by sending information between them instead of sharing memory or state. This enables all sorts of interesting features that are great for highly available distributed systems, such as:
Hot code reloading. Each process is paused, its relevant module code is swapped out, and it resumes where it left off, so deploys can happen without restarting or causing significant interruption.
Easy distributed messaging and clustering. Sending a message to a local process or a remote one is fairly seamless in most instances.
Process-local GC. Garbage collection happens in each process independently instead of a global stop-the-world event like Java's, aiding low-latency results.
Supervision trees and complex process hierarchies with monitoring and management.
A few concrete real-world examples that make great use of Erlang:
MongooseIM, a highly performant and incredibly scalable distributed XMPP/chat server
Riak, a distributed key/value store
Mesos, on the other hand, you can think of as a way of turning a datacenter of servers into a platform for teams and developers. If, say, a company owns a datacenter with 10,000 physical servers and has 1,000 engineers developing hundreds of services, Mesos gives the engineers a way to deploy and manage services across that hardware without needing to worry about the servers directly. It's an abstraction layer on top of the physical servers that allows you to share and intelligently allocate resources.
As a user of Mesos, I might say that I have Service X. It's an executable bundle that lives in location Y. Each instance of Service X needs 4 GB of RAM and 2 cores, and I need 8 instances, which will be attached to a load balancer. You can specify this in configuration and deploy based on that config. Mesos will find hardware that has enough RAM and CPU capacity available for each instance of that service and start it running in each of those locations.
It can handle a lot of other, more complex orchestration topics as well, but that's probably a bit too in-depth for this :)
ZooKeeper's most common use cases are service discovery and configuration management. You can think of it, fundamentally, a bit like a nested key-value store, where services can look at pre-defined paths to see where other services currently live.
A simple example: I have a web service using a shared database cluster. I know a simple name for that database cluster and where its configuration lives in ZooKeeper. I can look up (or repeatedly poll) that path in ZooKeeper to check the addresses of the active database hosts. On the other side, if I take a database node out of rotation and replace it with a new one, the config in ZooKeeper gets updated with the new address, and anything continually watching it will detect the change and reconnect accordingly.
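As a minimal sketch of that lookup (assuming a ZooKeeper ensemble on localhost:2181 and a hypothetical path /services/db-cluster/primary; the calls are the standard Apache ZooKeeper Java client API):

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ServiceDiscoverySketch {
    public static void main(String[] args) throws Exception {
        // Connect to the ensemble; the watcher is notified of session and node events.
        Watcher watcher = (WatchedEvent event) ->
                System.out.println("ZooKeeper event: " + event.getType());
        ZooKeeper zk = new ZooKeeper("localhost:2181", 3000, watcher);

        // Read the current database address and leave a watch so changes trigger the watcher.
        byte[] data = zk.getData("/services/db-cluster/primary", true, null);
        System.out.println("Current primary: " + new String(data));

        zk.close();
    }
}
```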
A more complex use case for ZooKeeper is how Kafka uses it (or did at the time I last used Kafka). Kafka has streams, and streams have many shards. Each consumer of each stream uses ZooKeeper to save a checkpoint for each shard after it has read and processed up to a certain point in the stream. That way, if the consumer crashes or is restarted, it knows where to pick up in the stream.
I don't know about Mesos and the Erlang language, but this article might help you with Helix and ZooKeeper.
This article tells us:
ZooKeeper is responsible for gluing all the parts together, while Helix is the cluster management component that registers all cluster details (the cluster itself, nodes, and resources).
The article is about clustering in jBPM using Helix and ZooKeeper, but it will give you a basic idea of what Helix and ZooKeeper are used for.
And from most of the articles I have read online, it seems that ZooKeeper and Helix are used together.
Apache Zookeeper can be installed on a single machine or on a cluster.
It can be used to keep track of logs. It can provide various services on a distributed platform.
Storm and Kafka rely on Zookeeper.
Storm uses Zookeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
Kafka queue consumers can use Zookeeper to store information on what has been consumed from the queue.
I'm currently developing a project running in a clustered WebLogic environment. I've successfully set up the cluster, but now I want a load-balancing solution (currently, for testing purposes only, I'm using WebLogic's HttpClusterServlet with round-robin load-balancing).
Is there any documentation that gives a clear comparison (with pros and cons) of the various ways of providing load-balancing for WebLogic?
These are the main topics I want to cover:
Performance (normal and on failover);
What failures can be detected and how fast is the failover recovery;
Transparency to failure (e.g., ability to automatically retry an idempotent request);
How well is each load-balancing solution adapted to various topologies (N-tier, clustering)
Thanks in advance for your help.
Is there any documentation that gives a clear comparison (with pros and cons) of the various ways of providing load-balancing for WebLogic?
It's not clear what kind of application you are building and what kind of technologies are involved. But...
You will find useful information in Failover and Replication in a Cluster and Load Balancing in a Cluster (also look at Cluster Implementation Procedures), but no real comparison between the different options, at least not to my knowledge. That said, the choice isn't that complex: 1. hardware load balancers perform better than software load balancers, and 2. if you go for a software load balancer, the WebLogic plug-in for Apache is the recommended (by BEA) choice for production. Actually, for web apps it's pretty usual to put the static files on a web server and thus to use the Apache mod_wl plug-in. See the Installing and Configuring the Apache HTTP Server Plug-In chapter.
These are the main topics I want to cover:
Performance (normal and on failover): If this question is about session persistence, WebLogic uses in-memory replication by default, and this works pretty well with relatively low overhead.
What failures can be detected and how fast is the failover recovery: It is unclear which protocols you're using, but see Connection Errors and Clustering Failover.
Transparency to failure (e.g., the ability to automatically retry an idempotent request): Clarifying the protocols you are using would make answering easier. If this question is about HTTP requests, then see Figure 3-1, Connection Failover.
How well is each load-balancing solution adapted to various topologies (N-tier, clustering): The question is unclear and too vague (for me), but maybe have a look at Cluster Architectures.
Oh, by the way, another nice chapter that you must read is Clustering Best Practices.