We are using Apache Ignite as an IMDG in our microservices environment.
For scalability and load balancing, we are considering a service registry such as Eureka or Consul, both of which Spring Cloud supports, for the deployed microservices.
Apache Ignite has a concept called the service grid, which supports node singletons and cluster singletons.
I also see that WCF, WebLogic and JBoss have the same sort of features.
I am trying to understand what these service grids are, and whether I can use them to get the same benefits as the Eureka service registry provided by Netflix and supported by Spring Cloud.
Can someone advise whether I can achieve the same thing using the service grid in Apache Ignite?
No, you cannot use Apache Ignite Service Grid for the same purposes as Eureka. Eureka is used for load balancing and service discovery over WAN. Using Ignite clusters spanning over multiple AWS zones and remote client machines is not the most efficient way of using it.
More information on Ignite Service Grid can be found here - http://apacheignite.gridgain.org/docs/service-grid
Thanks!
UPD (for the 1st comment):
You cannot (in most cases) span and effectively use Ignite over WAN networks with high latencies and lower throughput characteristics.
As for local clusters in non-cloud environments: go ahead! This is the best environment for systems of this kind.
We plan to run a large online application on k8s. The expected scale is about 500,000 daily active users.
The application inside k8s needs a messaging (Pub/Sub) feature, and these are the options:
Kafka
RabbitMQ
Redis
Kafka
It needs ZooKeeper, and because it depends on disk I/O it runs best directly on the OS. So how should it be installed in a k8s cluster? Will performance be worse?
And if we keep Kafka outside the k8s cluster and connect to it from the application inside the cluster, how is the performance? They are in different layers; won't that be slow?
RabbitMQ
It's slower than Kafka, but is it good enough for an application with 500,000 daily active users? If so, maybe it's a good choice.
Redis
It's another option, and maybe the simplest one. But I have read on the internet that it sometimes loses messages. If that's true, it's terrible.
So the most important question is: is running Kafka (together with ZooKeeper) on k8s a good idea for this use case?
Yes, running Kafka on Kubernetes is great. Check out this example: https://github.com/Yolean/kubernetes-kafka. It includes ZooKeeper and Kafka as StatefulSets.
PS. Running any of the services in your question on Kubernetes will be pleasant. You can Google the name of the service and "kubernetes" and find example manifests. Many examples here: https://github.com/kubernetes/charts.
For Kafka, you can find some suggestions here. Kubernetes 1.7+ supports local persistent volumes, which may be a good fit for a Kafka deployment.
You can also take a look at the following project:
https://github.com/EnMasseProject/barnabas
It's about running Kafka on Kubernetes and OpenShift. It deploys Kafka as StatefulSets, either with persistent volumes or purely in memory (for development or testing purposes). It also provides deployments for Kafka Connect and Prometheus metrics.
Another simple configuration of Kafka/Zookeeper on Kubernetes in DigitalOcean with external access:
https://github.com/StanislavKo/k8s_digitalocean_kafka
You can connect to Kafka from outside AWS/DO/GCE over the regular binary protocol. The connection is PLAINTEXT or SASL_PLAINTEXT (user/password).
The Kafka cluster is a StatefulSet, so you can scale it easily.
We have scaled out all services in our system by registering more than one instance of each in the Eureka service registry.
They are also proxied by a Zuul server at the front.
My question is: how can we ensure the scalability of the Zuul proxy itself when it is accessed by clients?
One solution I can think of is to register multiple instances of the proxy in the Eureka registry. But if we do that, how do we decide which of the instances is exposed to the clients?
We faced the same issue in our application, having multiple instances of multiple types of micro-service-type applications on our backend. All servers registered with Eureka. The problem is that we also had multiple security gateways configured (based on the architecture described in this excellent tutorial: https://spring.io/guides/tutorials/spring-security-and-angular-js/).
Eventually we decided to use a hardware http load balancer that calls our security gateways in a round-robin approach (our solution is on-prem).
We use Redis with the @EnableRedisHttpSession annotation to keep the Spring session in sync across all the servers, so the HTTP load balancer does not have to deal with sticky sessions or other stateful considerations. It just does a round-robin over all the security gateways. It doesn't matter whether the load balancer hits SG1, SG2 or SG3; they all share the same session information coming from Redis (which is also configured for fail-over with Redis Sentinel).
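To illustrate why the shared session removes the need for sticky sessions, here is a minimal sketch in plain Python. It is not our actual setup: `SharedSessionStore` is a hypothetical in-memory stand-in for Redis, and `SecurityGateway` and `RoundRobinBalancer` are made-up names for illustration only.

```python
from itertools import cycle


class SharedSessionStore:
    """Hypothetical in-memory stand-in for an external store such as Redis."""

    def __init__(self):
        self._sessions = {}

    def load(self, session_id):
        # Every gateway sees the same dict for the same session id.
        return self._sessions.setdefault(session_id, {})


class SecurityGateway:
    """Stateless gateway: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        session = self.store.load(session_id)
        session["hits"] = session.get("hits", 0) + 1
        return self.name, session["hits"]


class RoundRobinBalancer:
    """Sends each request to the next backend, ignoring who served the last one."""

    def __init__(self, backends):
        self._backends = cycle(backends)

    def route(self, session_id):
        return next(self._backends).handle(session_id)


store = SharedSessionStore()
gateways = [SecurityGateway(name, store) for name in ("SG1", "SG2", "SG3")]
balancer = RoundRobinBalancer(gateways)

# The same user lands on a different gateway each time,
# yet the session counter keeps increasing.
results = [balancer.route("user-42") for _ in range(4)]
print(results)
```

Because no gateway holds session state of its own, any of them can be added or removed without the balancer needing to know which user talked to which instance.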
Is it possible to implement a cache in WebLogic (10.3.5.0) that is accessible from every instance of a cluster?
Does WebLogic offer an API, for example over RMI, that provides this?
Is there a framework like Ehcache that provides this?
Oracle Coherence can handle this situation and it comes bundled with WebLogic 10.3.5.
http://docs.oracle.com/cd/E21764_01/apirefs.1111/e13952/taskhelp/coherence/CreateCoherenceServers.html
"Coherence servers (also known as Coherence data nodes) are stand-alone cache servers, dedicated JVM instances responsible for maintaining and managing cached data."
Recently several service discovery tools have become popular/"mainstream", and I’m wondering under what primary use cases one should employ them instead of traditional load balancers.
With LBs, you cluster a bunch of nodes behind the balancer, and clients make requests to the balancer, which then (typically) round-robins those requests across all the nodes in the cluster.
With service discovery (Consul, ZK, etc.), you let a centralized “consensus” service determine which nodes for a particular service are healthy, and your app connects to the nodes that the service deems healthy. So while service discovery and load balancing are two separate concepts, service discovery gives you load balancing as a convenient side effect.
But if the load balancer (say HAProxy or nginx) has monitoring and health checks built into it, then you pretty much get service discovery as a side effect of load balancing! Meaning, if my LB knows not to forward a request to an unhealthy node in its cluster, then that’s functionally equivalent to a consensus server telling my app not to connect to an unhealthy node.
So to me, service discovery tools feel like the “six of one, half a dozen of the other” equivalent of load balancers. Am I missing something here? If someone had an application architecture entirely predicated on load-balanced microservices, what is the benefit (or not) of switching over to a service discovery-based model?
Load balancers typically need the endpoints of the resources across which they balance traffic. With the growth of microservices and container-based applications, dynamically created containers (Docker containers) are ephemeral and do not have static endpoints; their endpoints change as containers are evicted and created for scaling or other reasons. Service discovery tools like Consul are used to store the endpoint information of these dynamically created containers. Tools like consul-registrator, running on the container hosts, register container endpoints in the service discovery tool. Tools like consul-template listen for changes to container endpoints in Consul and update the load balancer (nginx) with the endpoints to send traffic to. Thus service discovery tools like Consul and load balancing tools like nginx coexist, providing runtime service discovery and load balancing respectively.
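The register / watch / re-render loop described above can be sketched roughly as follows. This is plain Python with a toy in-memory `Registry`; it does not use the real Consul, registrator or consul-template APIs, and the `render_upstream` helper and the `api` service name are assumptions made up for the example.

```python
def render_upstream(service, endpoints):
    """Render an nginx-style `upstream` block for a set of (host, port) pairs."""
    lines = [f"upstream {service} {{"]
    for host, port in sorted(endpoints):
        lines.append(f"    server {host}:{port};")
    lines.append("}")
    return "\n".join(lines)


class Registry:
    """Toy service registry: stores endpoints and notifies watchers on change."""

    def __init__(self):
        self._endpoints = {}  # service name -> set of (host, port)
        self._watchers = []

    def watch(self, callback):
        self._watchers.append(callback)

    def register(self, service, host, port):
        # Role played by a registrator-like agent when a container starts.
        self._endpoints.setdefault(service, set()).add((host, port))
        self._notify(service)

    def deregister(self, service, host, port):
        # Role played by the same agent when a container is evicted.
        self._endpoints.get(service, set()).discard((host, port))
        self._notify(service)

    def _notify(self, service):
        endpoints = self._endpoints.get(service, set())
        for callback in self._watchers:
            callback(service, endpoints)


rendered = []  # every config version the "template" step produced
registry = Registry()
registry.watch(lambda service, endpoints: rendered.append(render_upstream(service, endpoints)))

registry.register("api", "10.0.0.1", 8080)
registry.register("api", "10.0.0.2", 8080)
registry.deregister("api", "10.0.0.1", 8080)  # container evicted
print(rendered[-1])
```

In a real deployment the last step would write the rendered file and reload the load balancer; here the rendered text is just collected in a list.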
Follow up: what are the benefits of ephemeral nodes (ones that come and go, live and die) vs. "permanent" nodes like traditional VMs?
[DDG]: Things that come quickly to my mind: ephemeral nodes like Docker containers are suited to stateless services such as APIs. (There is also traction for persistent containers using external volumes, volume drivers, etc.)
Speed: Spinning up or destroying ephemeral containers (Docker containers from an image) takes less than 500 milliseconds, as opposed to minutes for standing up traditional VMs.
Elastic Infrastructure: In the age of cloud we want to scale out and in according to user demand, which implies that containers will be ephemeral in nature (they can't hold on to IPs, etc.). Think of a week-long marketing campaign for which we expect a 200% increase in traffic (TPS): quickly scale out with containers, then destroy them after the campaign.
Resource Utilization: The data center or cloud is now one big computer (a compute cluster). Containers pack the compute cluster for maximum resource utilization, and during weak demand the infrastructure can be destroyed for a lower bill and lower resource usage.
Much of this is possible because of loose coupling with ephemeral containers and runtime discovery using a service discovery tool like Consul. Traditional VMs and tight binding to IPs can stifle this capability.
Note that the two are not necessarily mutually exclusive. It is possible, for example, that you might still direct clients to a load balancer (which might perform other roles such as throttling) but have the load balancer use a service registry to locate instances.
Also worth pointing out that service discovery enables client-side load balancing i.e. the client can invoke the service directly without the extra hop through the load balancer. My understanding is that this was one of the reasons that Netflix developed Eureka, to avoid inter-service calls having to go out and back through the external ELB for which they would have had to pay. Client-side load balancing also provides a means for the client to influence the load-balancing decision based on its own perspective of service availability.
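As a rough illustration of client-side load balancing (plain Python, not the Eureka or Ribbon API), the client below asks a registry lookup for the current instances, rotates over them locally, and skips instances it has itself seen fail. `ClientSideBalancer`, `mark_failed` and the addresses are hypothetical names chosen for the sketch.

```python
class ClientSideBalancer:
    """Picks an instance locally from whatever the registry lookup returns."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable returning the current instance list
        self._counter = 0
        self._failed = set()   # the client's own view of unavailable instances

    def mark_failed(self, instance):
        self._failed.add(instance)

    def choose(self):
        candidates = [i for i in self._lookup() if i not in self._failed]
        if not candidates:
            raise LookupError("no usable instances")
        instance = candidates[self._counter % len(candidates)]
        self._counter += 1
        return instance


# Stand-in for a registry query; a real client would call the registry here.
instances = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
balancer = ClientSideBalancer(lambda: list(instances))

first_round = [balancer.choose() for _ in range(3)]

# The client observed a timeout against one instance and stops using it,
# without waiting for the registry to notice.
balancer.mark_failed("10.0.0.2:8080")
after_failure = [balancer.choose() for _ in range(4)]
print(first_round, after_failure)
```

The call goes straight from client to instance, so there is no extra hop, and the selection logic can use information only the client has.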
If you look at the tools from a completely different perspective, namely ITSM/ITIL, load balancing becomes "just that", whereas service discovery is part of keeping your CMDB up to date with all your services and their interconnectivity. That gives better visibility of the impact of downtime, and an overview of areas that may need supplementing for high-availability applications.
Furthermore, service-discovery only gives you a picture as of the last scan, and not near-real-time (of course dependent on which scanning interval you have set), whereas load balancing will keep an up-to-date picture of your application's health.
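How stale the registry's picture can get depends on how it checks health. As a sketch only, assuming a registry that relies on periodic heartbeats with a time-to-live (one common approach, not necessarily what any particular tool does), the staleness is bounded by the TTL. `TtlRegistry` and the injected clock are hypothetical and exist just to make the example deterministic.

```python
class TtlRegistry:
    """Toy registry: an instance counts as healthy while its heartbeat is fresh."""

    def __init__(self, ttl_seconds, clock):
        self._ttl = ttl_seconds
        self._clock = clock      # injected so the example does not need to sleep
        self._last_seen = {}

    def heartbeat(self, instance):
        self._last_seen[instance] = self._clock()

    def healthy(self):
        now = self._clock()
        return sorted(
            instance
            for instance, seen in self._last_seen.items()
            if now - seen <= self._ttl
        )


now = [0.0]  # fake clock, advanced by hand
registry = TtlRegistry(ttl_seconds=10, clock=lambda: now[0])

registry.heartbeat("a")
registry.heartbeat("b")

now[0] = 8.0
registry.heartbeat("a")          # "b" has silently died and stops reporting
still_listed = registry.healthy()

now[0] = 12.0                    # more than one TTL since b's last heartbeat
after_expiry = registry.healthy()
print(still_listed, after_expiry)
```

Between the moment "b" died and the moment its TTL expired, the registry still reported it as healthy, which is the kind of lag the paragraph above refers to.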
We're using spring cloud config server. Spring config clients get updates using spring control bus (RabbitMQ).
Looks like every config client instance creates a queue connected to the 'spring.cloud.bus' exchange.
Are there any scalability limits on how many app instances can connect to a 'spring.cloud.bus' exchange?
I suppose RabbitMQ could be scaled to handle this.
Looking for any guidelines on this.
Many thanks,
The spring cloud config server can have multiple instances since it is stateless. That coupled with a RabbitMQ cluster should scale to a very large number of instances.
A viable solution would be spring cloud config behind a load balancer with a RabbitMQ cluster.