We're currently running Apache Ignite in a docker container but are having problems with the time server sync. Each node is reporting all the known ip addresses which later is used by a remote peer to send a time sync message over UDP.
Is there a way to specify the externally reachable ip address that the peers will use for time sync?
It's possible to set to every node a network interface it should use for all network related communications using IgniteConfiguration.setLocalHost(...) method. The time server will use the addresses specified this way for its needs as well.
However it's not critical that the time server is non workable on your side because it's used for cache CLOCK mode which is discouraged for usage.
Related
We are trying to prevent our application startups from just spinning if we cannot reach the remote cluster. From what I've read Force Server Mode states
In this case, discovery will happen as if all the nodes in topology
were server nodes.
What i want to know is:
Does this client then permanently act as a server which would run computes and store caching data?
If connection to the cluster does not happen at first, a later connection to an establish cluster cause issue with consistency? What would be the expect behavior with a Topology version mismatch? Id their potential for a split brain scenario?
No, it's still a client node, but behaves as a server on discovery protocol level. For example, it can start without any server nodes running.
Client node can never cause data inconsistency as it never stores the data. This does not depend forceServerMode flag.
I have a Redis Cluster that clients are connecting to via HAPRoxy with a Virtual IP. The Redis cluster has three nodes (with each node sharing the same server with a running sentinel instance).
My question is, when i clients gets a "MOVED" error/message from a cluster node upon sending a request, does it bypass the HAProxy the second time when it connects since it has been provided with an IP:port when the MOVEd message was issued? If not, how does the HAProxy know the second time to send it to the correct node?
I just need to understand how this works under the hood.
If you want to use HAProxy in front of Redis Cluster nodes, you will need to either:
Set up an HAProxy for each master/slave pair, and wire up something to update HAProxy when a failure happens, as well as probably intercept the topology related commands to insert the virtual IPs rather than the IPs the nodes themselves have and report via the topology commands/responses.
Customize HAProxy to teach it how to be the cluster-aware Redis client so the actual client doesn't know about cluster at all. This means teaching it the Redis protocol, storing the cluster's topology information, and selecting the node to query based on the key(s) being accessed by the consumer code.
With Redis Cluster the client must be able to access every node in the cluster. Of the two options above Option 2 is the "easier" one, but at this point I wouldn't recommend either.
Conceivably you could use the VIP as a "first place to get the topology info" IP but I suspect you'd have serious issues develop as that original IP would not be one of the ones properly being reported as a nod handling data. For that you could simply use round-robin DNS and avoid that problem, or use the built-in "here is a list of cluster IPs (or names?)" to the initial connection configuration.
Your simplest, and least likely to be problematic, route is to go "full native" and simply give full and direct access to every node in the cluster to your clients and not use HAProxy at all.
So, we have a standalone graphite node which is behind a cname and collecting all the metrics. If this goes down,it's not going to be good.So, my question is how do I not only replicate all the existing whisper data to another node, but also setup replication in place using carbon relay. What would the migration workflow look like in short? How should I configure the carbon relay? I want to do this in as transparent way as possible with minimum downtime.
The migration will require a minimal unavoidable downtime.
I would proceed by:
decrease lifetime of you cname in your dns server (it will a lower bound of the duration of your downtime)
prepare a second server with aggregators, cache and a relay pointing to both boxes (replication factor 2), but stop the relay
stop graphite on your first server (downtime start now)
change the cname to point to the second server
archive all metrics on the first server, copy to the second, extract them
start the relay (end of downtime)
at time of dns change + ttl, all your clients will have moved the second relay, all data is written on both servers
You can then start to make your setup more reliable with a relay on the first server (sharing a virtual ip for instance).
On our setup we have separated relay servers (2 of them in active/active) from aggregator+cache servers.
Redundancy in graphite is however tricky when a server is down since it won't fetch missed updates when up again, you'll have to do it manually.
I've been using ServiceStack PooledRedisClientManager with success. I'm now adding Twemproxy into the mix and have 4 Redis instances fronted with Twemproxy running on a single Ubuntu server.
This has caused problems with light load tests (100 users) connecting to Redis through ServiceStack. I've tried the original PooledRedisClientManager and BasicRedisClientManager, both are giving the error No connection could be made because the target machine actively refused it
Is there something I need to do to get these two to play nice together? This is the Twemproxy config
alpha:
listen: 0.0.0.0:12112
hash: fnv1a_64
distribution: ketama
auto_eject_hosts: true
redis: true
timeout: 400
server_retry_timeout: 30000
server_failure_limit: 3
server_connections: 1000
servers:
- 0.0.0.0:6379:1
- 0.0.0.0:6380:1
- 0.0.0.0:6381:1
- 0.0.0.0:6382:1
I can connect to each one of the Redis server instances individually, it just fails going through Twemproxy.
I haven't used twemproxy before but I would say your list of servers is wrong. I don't think you are using 0.0.0.0 correctly.
Your servers would need to be (for your local testing):
servers:
- 127.0.0.1:6379:1
- 127.0.0.1:6380:1
- 127.0.0.1:6381:1
- 127.0.0.1:6382:1
You use 0.0.0.0 on the listen command to tell twemproxy to listen on all available network interfaces on the server. This mean twemproxy will try to listen on:
the loopback address 127.0.0.1 (localhost),
on your private IP (i.e. 192.168.0.1) and
on your public IP (i.e. 134.xxx.50.34)
When you are specifying servers, the server config needs to know the actual address it should connect on. 0.0.0.0 doesn't make sense. It needs a real value. So when you come to use different Redis machines you will want to use, the private IPs of each machine like this:
servers:
- 192.168.0.10:6379:1
- 192.168.0.13:6379:1
- 192.168.0.14:6379:1
- 192.168.0.27:6379:1
Obviously your IP addresses will be different. You can use ifconfig to determine the IP on each machine. Though it may be worth using a hostname if your IPs are not statically assigned.
Update:
As you have said you are still having issues, I would make these recommendations:
Remove auto_eject_hosts: true. If you were getting some connectivity, then after time you end up with no connectivity, it's because something has caused twemproxy to think there was something wrong with the Redis hosts and reject them.
So eventually when your ServiceStack client connects to twemproxy, there will be no hosts to pass the request onto and you get the error No connection could be made because the target machine actively refused it.
Do you actually have enough RAM to stress test your local machine this way? You are running at least 4 instances of Redis, which require real memory to store the values, twemproxy consumes a large amount of memory to buffer the requests it passes to Redis, this memory pool is never released, see here for more information. Your ServiceStack app will consume memory - more so in Debug mode. You'll probably have Visual Studio or another IDE open, the stress test application, and your operating system. On top of all that there will likely be background processes and other applications you haven't closed.
A good practice is to try to run tests on isolated hardware as far as possible. If it is not possible, then the system must be monitored to check the benchmark is not impacted by some external activity.
You should read the Redis article here about benchmarking.
As you are using this in a localhost situation use the BasicRedisClientManager not the PooledRedisClientManager.
I am looking to build an Akka-based application in the cloud, for a garage startup that I'm bootstrapping; by the nature of the app, it's semi-stateful, with as much as possible cached in RAM for performance. (It'll be tolerant of being shut down and restarted periodically, but we want to mostly operate via cached information inside the Actors.)
The architecture is designed for a cluster of servers, communicating between them as necessary so that a user session on node A can query a middleware Actor on node B when appropriate. So my question is, how hard is that in CloudBees?
My understand from this page is that there is no automatic directory service to manage this sort of intra-cluster communication yet, but I can probably live with that -- worse comes to worst, I should be able to manage discovery via the DB, with each node registering itself when it comes up and opening up many-to-many communications with the others.
What I want to check, though, is that this communication is straightforward. Does each node have a reliable local IP that it can advertise for others to contact it on, that is at least stable during this run of the application? Or is there another/better way for a node to advertise its address to the rest of the nodes running this app?
(I assume that the nodes of an app all share the same DB instance.)
Any guidance here would be greatly appreciated. I'd like to choose a hosting provider soon, and keep returning to CloudBees as the most promising-looking of the options...
There are no limitations currently on instances communicating with each other - the trick is in discovering membership. There is an api that will be shortly be released that will allow you to track membership - but for now, the following may work:
To get the port, look at the file names in $PWD/.genapp/ports (as applications can have multiple ports) - (eg System.getenv("PWD") + ".genapp/ports" - list the files in that directory - generally will be just 1 - the file name is the port). There are other ways - for example the "sun.java.command" system property on JVM apps too.
The hostname can be obtained via the usual means (eg InetAddress.getLocalHost().getHostName()): this host
name will be the private name - ie it will resolve to a private IP -
good for node to node communication.
Public IP/hostname: perform a HTTP get (from the server) to the following URL:
http://instance-data/latest/meta-data/public-hostname (will only
return the public IP on the server side of course).
(see http://developer-blog.cloudbees.com/2012/11/finding-port-or-address-of-your.html)
You can then, as you say, on startup, register the appropriate port/private hostname with a DB, and then read that on each node to "seed" the cluster (akka doesn't have to know about all members - just enough seeds) I would think a 2 phase startup: 1: register host/port, 2, look for other members, add them as seed members to the local Akka configuration (may need to periodically do the same for a while, as other nodes startup - to ensure it is seeded enough)
From my reading of Akka setup here: http://doc.akka.io/docs/akka/snapshot/scala/remoting.html
It looks like you can specify the port - so if possible, I would set that to be the app_port environment variable - that means each node can communicate via the private hostname with that port. However, http traffic will also be routed to it - can akka handle this as well - or does it need to have a discrete port for akka and another for any http interface?