When to use Vert.x cluster - module

I have a Java (Gradle) project with multiple modules. All of the modules run on the same server, but I want them to be independent, i.e. module A should be able to start and stop regardless of whether module B is running, and vice versa.
Each module is a vert.x verticle and the modules should be able to communicate with each other.
I read that it's not good practice to create a clustered Vert.x instance when the verticles run on the same machine, but if I don't cluster them, I get separate Vert.x instances and therefore separate event buses.
What would you recommend?
(1) Cluster the verticles on the same machine
(2) Let the verticles communicate via a router (is this bad practice?)
(3) Restructure the project in some way

Deploying a verticle gives you a deployment ID, and you can undeploy it later using that same ID. That means your modules are already independent.
The simplest way would be to have them run in the same process, communicating via the event bus.
Options 1 & 2 that you listed would introduce unnecessary latency and make your application perform worse.
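For illustration, here is a minimal sketch of that setup (the module names and event bus address are made up, and it assumes Vert.x's EventBus#request API): two verticles deployed into one non-clustered Vert.x instance, sharing its event bus, each independently (un)deployable via its deployment ID.

    import io.vertx.core.AbstractVerticle;
    import io.vertx.core.Vertx;

    public class SingleProcessExample {

        // Stand-in for module A: replies to requests on the event bus.
        static class ModuleA extends AbstractVerticle {
            @Override
            public void start() {
                vertx.eventBus().consumer("module.a", msg -> msg.reply("pong"));
            }
        }

        // Stand-in for module B: sends a request to module A.
        static class ModuleB extends AbstractVerticle {
            @Override
            public void start() {
                vertx.eventBus().<String>request("module.a", "ping", ar -> {
                    if (ar.succeeded()) {
                        System.out.println("B received: " + ar.result().body());
                    }
                });
            }
        }

        public static void main(String[] args) {
            Vertx vertx = Vertx.vertx(); // one non-clustered instance, one event bus
            // Each deployment yields an id; vertx.undeploy(id) stops one module
            // without touching the other.
            vertx.deployVerticle(new ModuleA(), idA ->
                    vertx.deployVerticle(new ModuleB()));
        }
    }

Undeploying one module via its deployment ID leaves the other running, which gives you the independence you asked about without clustering.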

Related

Do I need multiple masters on OKD?

So I have a question regarding setting up OKD for our needs: our team has already established that Kubernetes is basically the simplest way for us to manage our stack. We don't have much workload; three dedicated servers could probably handle all of it, but we have a lot of services and tools that are best served by running in Docker containers, and we also benefit strongly from running our fairly monolithic core application as a container to simplify deployment and maintenance.
The question, though, is how many nodes we need; specifically, whether we need HA master nodes.
From the documentation, it seems that Infrastructure nodes are responsible for routing. Does this mean that even if the master node goes down, the other nodes are still available and routing works, so long as domains point at the infrastructure nodes? Or would a failed master make all the other nodes unreachable?
In our environment, the router pods run on the infra nodes, and we can safely turn off the master node without impact on applications.
master node: API, controllers, etcd
infra node: registry, router, metrics, logging, etc.
With the master turned off you just can't manage the cluster; the rest works fine. It is good to have more than one master node for etcd redundancy, but with such a small environment I think it makes no sense to maintain more.

Flink on YARN: use yarn-session or not?

There are two methods to deploy Flink applications on YARN. The first is to use a yarn-session, with all Flink applications deployed into that session. The second is to deploy each Flink application on YARN as its own YARN application.
My question is: what's the difference between these two methods, and which one should I choose in a production environment?
I can't find any material about this.
I think the first method saves resources, since only one JobManager (the YARN application master) is needed. But that is also its disadvantage: the single JobManager can become a bottleneck as the number of Flink applications grows.
Both modes have their uses in production environments.
Session mode generally makes sense when you will be running a bunch of short-lived jobs, and want to avoid the overhead of starting up a cluster for each one. On the other hand, there are security implications, as any credentials available to any of the jobs will be accessible to all of the jobs. Cluster-per-job mode may use more resources overall, but is, in some sense, more straightforward.
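For reference, the two modes are started differently (a sketch assuming a Flink 1.x distribution; MyJob.jar is a placeholder):

    # Session mode: one long-running cluster on YARN, shared by all submitted jobs
    ./bin/yarn-session.sh -d
    ./bin/flink run MyJob.jar

    # Per-job mode: each submission brings up its own cluster (and JobManager)
    ./bin/flink run -m yarn-cluster MyJob.jar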

Deploying ASP.NET Core application to ElasticBeanstalk without temporary HTTP 404

Currently, ElasticBeanstalk supports ASP.NET Core applications only on Windows platforms (when using the web role), and with the Windows-based platforms, you can't have Immutable updates or even RollingWithAdditionalBatch for whatever reason. If the application is running with a single instance, you end up in the situation where the only running instance is being updated. (Possible reasons for running a single instance: saving cost because it is just a small backend service, or a service that requires a lot of RAM in comparison to CPU time, so it makes more sense to run one larger instance instead of multiple smaller ones.)
As a result, during deployment of a new application version, for a period of up to 30 seconds, you first get HTTP 503, then HTTP 404, later HTTP 502 Bad Gateway, before the new application version actually becomes available. Obviously this is much worse compared to e.g. using WebDeploy on a single server in a "classic" environment.
Possible workarounds I can think of:
Blue/green deployments: slow (because they depend on DNS changes), and they seem more suitable for "supervised" deployments than for automated deploy pipelines.
Modify the Auto Scaling group to enforce 2 active instances before deployment (so that EB can do its normal rolling update), then change back. However, it is far from ideal to mess with resources created and managed by EB (like the Auto Scaling group), and it requires a fairly complex script (you need to wait for the second instance to become active, wait for the rolling deployment to finish, etc.).
I can't believe that these are the only options. Any other ideas? The minimal viable workaround for me would be to at least get rid of the temporary 404s, because these could seriously mislead API clients (or think of the SEO effect in the case of a website, if a search engine spider gets a 404 for every URL). As long as it is a 5xx, at least everybody knows it is just a temporary error.
Finally, in February 2019, AWS released Elastic Beanstalk Windows Server platform v2, which supports Immutable and RollingWithAdditionalBatch deployments and platform updates (as their Linux-based stacks have for ages):
https://docs.aws.amazon.com/elasticbeanstalk/latest/relnotes/release-2019-02-21-windows-v2.html
This solves the problem even for environments (normally) running just one instance.
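On the v2 platform you can then opt into one of the new policies via an .ebextensions option setting (a minimal sketch; the config file name is arbitrary):

    # .ebextensions/deploy.config in the application bundle
    option_settings:
      aws:elasticbeanstalk:command:
        DeploymentPolicy: Immutable   # or RollingWithAdditionalBatch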

How can I deploy a data grid application?

I am developing a web application based on Spring, and I added Apache Ignite as a Maven dependency.
It is a very simple application with only two REST APIs.
One queries by key and returns an object; the other puts data.
But I have a problem: when I build additional functionality, I don't know how to deploy the new version of this application.
The application should always be available, but if I deploy it to a single node, that node may become unavailable.
Is there a good method for deploying a distributed in-memory application?
In your case you would typically start an Ignite server node embedded in your application. You can then run multiple instances of the application, and as long as the nodes discover each other, they will share the data. For more information about discovery configuration, see here: https://apacheignite.readme.io/docs/cluster-config
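A minimal sketch of such an embedded server node (the host list is a placeholder; your real discovery settings belong in the cluster config linked above):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
    import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

    import java.util.Arrays;

    public class EmbeddedIgnite {
        public static Ignite start() {
            // Static IP finder: list the addresses where other app instances may run.
            TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
            ipFinder.setAddresses(Arrays.asList("host1:47500..47509", "host2:47500..47509"));

            TcpDiscoverySpi discovery = new TcpDiscoverySpi();
            discovery.setIpFinder(ipFinder);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setDiscoverySpi(discovery);

            // Starts an embedded server node; instances with matching discovery
            // settings find each other and share the cached data.
            return Ignition.start(cfg);
        }
    }

With two or more such instances behind a load balancer, one node can go down for redeployment while the others keep serving the data.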

Can Cloudbees instances within an app communicate directly?

I am looking to build an Akka-based application in the cloud, for a garage startup that I'm bootstrapping; by the nature of the app, it's semi-stateful, with as much as possible cached in RAM for performance. (It'll be tolerant of being shut down and restarted periodically, but we want to mostly operate via cached information inside the Actors.)
The architecture is designed for a cluster of servers, communicating between them as necessary so that a user session on node A can query a middleware Actor on node B when appropriate. So my question is, how hard is that in CloudBees?
My understanding from this page is that there is no automatic directory service to manage this sort of intra-cluster communication yet, but I can probably live with that; if worst comes to worst, I should be able to manage discovery via the DB, with each node registering itself when it comes up and opening up many-to-many communications with the others.
What I want to check, though, is that this communication is straightforward. Does each node have a reliable local IP that it can advertise for others to contact it on, that is at least stable during this run of the application? Or is there another/better way for a node to advertise its address to the rest of the nodes running this app?
(I assume that the nodes of an app all share the same DB instance.)
Any guidance here would be greatly appreciated. I'd like to choose a hosting provider soon, and keep returning to CloudBees as the most promising-looking of the options...
There are currently no limitations on instances communicating with each other; the trick is in discovering membership. There is an API that will shortly be released that will allow you to track membership, but for now, the following may work:
To get the port, look at the file names in $PWD/.genapp/ports (applications can have multiple ports): take System.getenv("PWD") + "/.genapp/ports" and list the files in that directory. Generally there will be just one, and the file name is the port. There are other ways too, for example the "sun.java.command" system property on JVM apps.
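A rough sketch of that port lookup (assuming the .genapp/ports convention described above):

    import java.io.File;

    public class PortDiscovery {
        public static int appPort() {
            // Each file name under $PWD/.genapp/ports is a port number.
            File portsDir = new File(System.getenv("PWD"), ".genapp/ports");
            String[] entries = portsDir.list();
            if (entries == null || entries.length == 0) {
                throw new IllegalStateException("no port files in " + portsDir);
            }
            return Integer.parseInt(entries[0]); // generally just one file
        }
    }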
The hostname can be obtained via the usual means (e.g. InetAddress.getLocalHost().getHostName()): this hostname will be the private name, i.e. it will resolve to a private IP, which is good for node-to-node communication.
Public IP/hostname: perform an HTTP GET (from the server) to the following URL: http://instance-data/latest/meta-data/public-hostname (this will only return the public hostname from the server side, of course).
(see http://developer-blog.cloudbees.com/2012/11/finding-port-or-address-of-your.html)
You can then, as you say, register the appropriate port/private hostname with a DB on startup, and read it back on each node to "seed" the cluster (Akka doesn't have to know about all members, just enough seeds). I would suggest a two-phase startup: 1) register the host/port; 2) look for other members and add them as seed members to the local Akka configuration (you may need to repeat this periodically for a while as other nodes start up, to ensure the cluster is seeded enough).
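Sketched out (MemberRegistry is a hypothetical DAO over your shared DB; PortDiscovery is from the sketch above):

    import java.net.InetAddress;
    import java.util.List;

    public class ClusterBootstrap {

        // Hypothetical registry backed by the shared DB.
        interface MemberRegistry {
            void register(String host, int port);
            List<String> listSeeds(); // e.g. "akka.tcp://app@host1:2551"
        }

        static void bootstrap(MemberRegistry registry) throws Exception {
            // Phase 1: advertise this node's private hostname and app port.
            String host = InetAddress.getLocalHost().getHostName();
            int port = PortDiscovery.appPort();
            registry.register(host, port);

            // Phase 2: read the other registered members and pass them to Akka
            // as seed nodes (akka.cluster.seed-nodes in the local config).
            List<String> seeds = registry.listSeeds();
            // ... build the ActorSystem with these seeds, re-checking
            // periodically while other nodes are still starting up.
        }
    }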
From my reading of the Akka setup here: http://doc.akka.io/docs/akka/snapshot/scala/remoting.html
it looks like you can specify the port, so if possible, I would set that to the app_port environment variable; that way each node can communicate with the others via the private hostname on that port. However, HTTP traffic will also be routed to it - can Akka handle that as well, or does it need a discrete port for Akka and another for any HTTP interface?