I am trying to start a Apache Ray cluster on yarn. As per the documents, to be able to run a Ray application on a given yarn cluster, following command needs to be used
skein application submit [cluster_configuration.yaml]
Where cluster_configuration.yaml has Ray cluster's specification to be created by skein in yarn.
For a Ray cluster to work there are different port numbers which can be specified via configuration, some examples are, --node-manager-port, --object-manager-port etc.
My question is, if multiple users are trying to run their Ray applications, and if they happened to specify the same port numbers for these ports, then will this create a situation of port conflict? If one hadn't specified the manual port numbers, Ray would have tried to use available port numbers randomly, but in case users do specify explicitly , then how Ray is going to handle this situation?
Related
I'm looking into altering the architecture of a hosting service intended to scale arbitrarily.
On a given machine, the service works roughly as follows:
Start a container running Redis cluster client that joins a global cluster.
Start containers for each of the "Models" to be hosted.
Use upstream Redis cluster for managing model global state. Handle namespacing via keys themselves.
I'm wondering if it might be possible to change to something like this:
For each Model, start a container running the Model and a Redis cluster client.
Reverse proxy the Redis service using something like Nginx to be available on a certain path, e.g., <host_ip>:6397/redis-<model_name>. (Note: I can't just proxy from different ports, because in theory this is supposed to be able to scale past 65,535 models running globally.)
Join the Redis cluster by using said path.
Internalizing the Redis service to the container is an appealing idea to me because it is closer to what the hosting service is supposed to achieve. We do want to share compute; we don't want to share a KV store.
Anyways, I haven't seen anything that suggests this is possible. So, sticking with the upstream may be my only option. But, in case anyone knows otherwise, I wanted to check and see.
The scenario: Selenium is a browser automation tool that can be run in a K8s cluster, it consists of Selenium-hub (master) and selenium-nodes (workers) where the hub receives test requests and creates nodes (pods) ondemand (dynamically) to run the test-case, after execution of a test-case the runner node (pod) gets thrown away. also, Selenium supports live-preview of the test being run by the runner and a client (outside of K8s) can basically watch this live preview, there is a little change that when a client is watching the live preview of the test and it ends, another pod gets created with the same IP that the client is actually is still watching, this is a problem since the client may continue watching the run of another test because the client's software is not aware of the length of run and may still fetch the traffic with same user/pass/IP combination.
The question: is it possible to change the way Kubernetes assigns IP addresses?
let's say the first pod to be created gets IP 1.1.1.1 and the second one gets 1.1.1.2 and third 1.1.1.3, and before the fourth request the first pod dies and its IP is free then the fourth pod would be created with IP 1.1.1.1,
What I am trying to do is to tell to the Kubernetes to use previously assigned IP after some time or change the sequence of IP assignment or something similar.
Any ideas?
Technically: yes you can either configure or edit-the-code of your CNI plugin (or write one from scratch).
In practice: I know of none that work quite that way. I know Calico does allow having multiple IP pools so you could have a small one just for Selenium pods but I think it still attempts to minimize reuse. But check the docs for your CNI plugin and see what it offers.
I just investigate DC/OS, I find that DC/OS has three roles:master, slave, slave_public, I want to deploy a cluster which can host master, slave or slave_public roles on one host, but currently I can't do that.
I want to know that why can't put them on one host when designed. If I do that, could I get some suggestions?
I just have the idea. If I can't do, I'll quit using DCOS, I'll use mesos and marathon.
Is there someone has the idea with me? I look forward to the reply.
This is by design, and things are actually being worked on to re-enforce that an machine is installed with only one role because things break with more than one.
If you're trying to demo / experiment with DC/OS and you only have one machine, you can use Virtual Machines or Docker to partition that one machine into multiple machines / parts which you can install DC/OS on. dcos-vagrant and dcos-docker can help you there.
As far as installing though, the configuration for each of the three roles is incompatible with one another. The "master" role causes a whole bunch of pieces of software to be started / installed on a host (Mesos-DNS, Mesos master, marathon, exhibitor, zookeeper, 3dt, adminrouter, rexray, spartan, navstar among others) which listen on various ports. The "slave" role causes a machine to have a mesos-agent (mesos renamed mesos-slave to mesos-agent, hence the disconnect) configured and started on the agent. The mesos-agent is configured to control / most ports greater than 1024 to tasks which are launched by mesos frameworks on the agent. Several of those ports are used by services which are run on masters, resulting in odd conflicts and hard to fix bad behavior.
In the case of running the "slave" and "slave_public" on the same host, those two conflict more directly, because both of them cause mesos-agent to be run on the host, with slightly different configuration. Both the mesos-agent (the one configured with the "slave" role and the one with the "slave_public" role are configured to listen on port 5051. Only one of them can use it though, so you end up with one of the agents being non-functional.
DC/OS only supports running a node as either a master or an agent(slave). You are correct that Mesos does not have this limitation. But DC/OS is more than just a Mesos/Marathon. To enable all the additional features of DC/OS there are various components built around Mesos and Marathon. At times these components behave differently whether they are running on a master or an agent and at other times the components that exist on a master may or may not exist on an agent or vice versa. So running a master and an agent on the same node would lead to conflicts/issues.
If you are looking to run a small development setup before scaling the solution out to a bigger distributed system DC/OS Vagrant might be a good starting point.
I'm running an ElasticSearch cluster in development mode and want it to be production ready.
For that, I want to block all the unnecessary ports, one in particular is port 9200.
The problem is that I will not e able to monitor the cluster with HEAD or Marvel plugin.
I've searched around and saw that ElasticSearch recommendation is to put the entire cluster behind an application that manages the access to the cluster.
I saw some solutions (ElasticSearch HTTP basic authentication) which are insufficient for this matter.
Is there any application that can do it?
Elasticsearch actually have a product for this very purpose called Shield. You can find it here.
I want to know if it is possible to run 2 versions (5.5 and 5.10) of ActiveMQ on the same machine. I simply assume that all I need to do reconfigure the ports on one of them to something different to the other.
The reason for this is that we are using Informatica B2B which uses ActiveMQ # 5.5 with a 3rd party (Fuse) addition for its internal messaging. We would also like to run a separate JMS server on the same machine for various reasons using 5.10 or 5.11.
I have found lots of examples of creating multiple instances, but they apply to using the same installation.
If it is that simple (as just changing the ports), can they also share the same JVM or not?
You can run multiple instances on the same machine by changing the ActiveMQ configuration. You should assign each Broker a unique name and configure the transport connector to listen on different ports. You also want to ensure that they are configure with different data directory instances and so on.
You cannot run two in the same JVN however.