How to manage hundreds of connections with RabbitMQ?

I have a RabbitMQ server (a cluster), so no problem on that side. But I must connect 1,000 or 2,000 clients. Each client app must have one connection, and each client app uses multiple channels. Channels are not a problem, but connections seem to be limited (128 by default).
In such a case, how do you properly connect 2,000 clients to RabbitMQ if you can't use 2,000 connections? What are good ways to do this? Are there known patterns? (Note that the 2,000 clients must be connected all the time.)
Many thanks in advance for your help and ideas!

Related

Extra TCP connections on the RabbitMQ server after resource alarm

I have RabbitMQ Server 3.6.0 installed on Windows (I know it's time to upgrade; I've already done that on the other server node).
Heartbeats are enabled on both server and client side (heartbeat interval 60s).
I had a resource alarm (RAM limit), and after that I observed a rise in the number of TCP connections to the RMQ server.
At the moment there are 18,000 connections, while the normal amount is 6,000.
Via the management plugin I can see there are a lot of connections with 0 channels, while our "normal" connections have at least one channel.
Even restarting the RMQ server won't help: all the connections re-establish themselves.
   1. Does that mean all of them are really alive?
A similar issue was described here: https://github.com/rabbitmq/rabbitmq-server/issues/384, but as far as I can see it was fixed in v3.6.0.
   2. Do I understand correctly that before RMQ Server v3.6.0, the behavior after a resource alarm was that several TCP connections could hang on the server side per one real client auto-recovery connection?
Maybe important: we have HAProxy between the server and the clients.
   3. Could HAProxy be an explanation for these extra connections? Maybe it prevents the client from receiving a signal that the connection was closed due to the resource alarm?
Are all of them alive?
Only you can answer this, but I would ask - how is it that you are ending up with many thousands of connections? Really, you should only create one connection per logical process. So if you really have 6,000 logical processes connecting to the server, that might be a reason for that many connections, but in my opinion, you're well beyond reasonable design limits even in that case.
To check, see how many connections decrease when you kill one of your logical processes.
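To illustrate the one-connection-per-process advice, here is a minimal sketch with the RabbitMQ Java client (the broker host and queue names are made up): the process opens a single connection and gives each worker thread its own cheap channel instead of its own connection.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class SharedConnectionDemo {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbit.example.com"); // hypothetical broker address
        factory.setAutomaticRecoveryEnabled(true);

        // One TCP connection for the whole process...
        try (Connection conn = factory.newConnection()) {
            Thread[] workers = new Thread[3];
            for (int i = 0; i < workers.length; i++) {
                final int n = i;
                workers[i] = new Thread(() -> {
                    // ...and one lightweight AMQP channel per worker thread.
                    try (Channel ch = conn.createChannel()) {
                        ch.queueDeclare("work-" + n, false, false, false, null);
                        ch.basicPublish("", "work-" + n, null,
                                ("task " + n).getBytes());
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
                workers[i].start();
            }
            for (Thread w : workers) w.join();
        }
    }
}
```

Channels are multiplexed over the one TCP connection, so the broker sees a single connection per process regardless of how many workers you run.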
Do I understand correctly that before RMQ Server v3.6.0, the behavior after a resource alarm was that several TCP connections could hang on the server side per one real client auto-recovery connection?
As far as I can tell, yes. It looks like the developer in this case ran across a common problem in sockets: the detection of dropped connections. If I had a dollar for every time someone misunderstood how TCP works, I'd have more money than Bezos. So, what they found is that someone made some bad assumptions, when in fact a read or write is required to detect a dead socket, and the developer wrote code to (attempt to) handle it properly. It is important to note that this does not look like a very comprehensive fix, so if the same conceptual design problem was introduced in another part of the code, this bug might still be around in some form. Searching the bug reports might give you a more detailed answer, or ask someone on that support list.
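Incidentally, this is what AMQP heartbeats are for: they force periodic reads and writes so that a dead socket is noticed in bounded time rather than whenever the application next touches it. A minimal sketch with the RabbitMQ Java client (the broker host is made up):

```java
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class HeartbeatDemo {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbit.example.com"); // hypothetical broker address
        // Request a 60s heartbeat: client and broker exchange frames at that
        // interval, so a silently dropped TCP connection is detected within
        // roughly two intervals instead of lingering indefinitely.
        factory.setRequestedHeartbeat(60);

        try (Connection conn = factory.newConnection()) {
            System.out.println("Negotiated heartbeat: "
                    + conn.getHeartbeat() + "s");
        }
    }
}
```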
Could HAProxy be an explanation for these extra connections?
That depends. In theory, HAProxy is just a pass-through. For a connection to be recognized by the broker, it has to go through a handshake, which is a deliberate process and cannot happen inadvertently. Closing a connection also requires a handshake, which is where HAProxy might be the culprit. If HAProxy thinks the connection is dead and drops it without that process, then it could be a contributing cause. But it is not, in and of itself, making these new connections.
The RabbitMQ team monitors this mailing list and only sometimes answers questions on StackOverflow.
I recommended that this user upgrade from Erlang 18, which has known TCP connection issues -
https://groups.google.com/d/msg/rabbitmq-users/R3700QdIVJs/taDYKI6bAgAJ
I've managed to reproduce the problem: in the end it was a bug in the way our client used RMQ connections.
It created one auto-recovery connection (that part was fine), and sometimes it created a separate plain connection for "temporary" purposes.
Steps to reproduce my problem were:
   1. Reach the memory alarm in RabbitMQ (e.g. set up an easily reached RAM limit and push a lot of big messages). Connections go into the "blocking" state.
   2. Start sending a message from our client over this new "temp" connection.
   3. Ensure the connection is in the "blocked" state.
   4. Without eliminating the resource alarm, restart the RabbitMQ node.
The "temp" connection was still there, despite the fact that auto-recovery was not enabled for it! And it continued sending heartbeats, so the server didn't close it.
We will fix the client to always use one and only one connection.
Plus, of course, we will upgrade Erlang.

Active connections on web farm

I'm trying to build a simple chat with WebSockets. I'm also displaying the number of currently active users in the chat, and here is where the problems start: we use a web farm.
A user can connect through a load balancer to a server. When a new connection hits a server, it increments a counter in a SQL database and notifies the other servers in the farm through RabbitMQ.
All other servers fetch the new data and send that number back to their connected users.
If a user disconnects, the same happens: the server decrements the counter in the SQL database, and through RabbitMQ all the other servers learn about it.
But what happens when a server dies? For example, say 10 users are connected to that server. When it goes down, all those users are disconnected, but that is never updated in the database.
What's the best solution for getting the total number of active users in a web farm, and for notifying the users when this number changes?
Thanks in advance!
Oh, by the way, we're using SignalR.
I think the typical way to deal with nodes asynchronously disconnecting from a mesh is to implement a heartbeat/keep-alive mechanism. In this case the heartbeat messages would go between servers, and there must also be an accessible record of which users are connected to which server. When a server does not produce a heartbeat for a period of time, all the other servers can update their records and mark all the users associated with that server as disconnected.
It looks like you have a few options for keeping track of users (a SQL database, or having every server listen for RabbitMQ messages). As for the heartbeat, you can implement it yourself or see whether the load balancer's detection mechanism can be utilized.
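To make the heartbeat idea concrete, here is a rough, non-SignalR sketch of the pattern using the RabbitMQ Java client (the exchange name, broker host, and intervals are all made up): each server broadcasts its own ID on a fanout exchange every few seconds and marks any peer it hasn't heard from as dead.

```java
import com.rabbitmq.client.*;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ServerHeartbeat {
    static final String EXCHANGE = "server-heartbeats"; // hypothetical name
    static final long STALE_AFTER_MS = 15_000;          // three missed 5s beats

    public static void main(String[] args) throws Exception {
        String serverId = args.length > 0 ? args[0] : "web-1";
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbit.example.com");          // hypothetical host
        Connection conn = factory.newConnection();
        Channel ch = conn.createChannel();
        ch.exchangeDeclare(EXCHANGE, BuiltinExchangeType.FANOUT);

        // Every server consumes the heartbeats of every other server.
        Map<String, Long> lastSeen = new ConcurrentHashMap<>();
        String queue = ch.queueDeclare().getQueue(); // exclusive, auto-delete
        ch.queueBind(queue, EXCHANGE, "");
        ch.basicConsume(queue, true, (tag, msg) -> {
            String peer = new String(msg.getBody(), StandardCharsets.UTF_8);
            lastSeen.put(peer, System.currentTimeMillis());
        }, tag -> { });

        // Publish our own heartbeat and sweep for stale peers.
        while (true) {
            ch.basicPublish(EXCHANGE, "", null,
                    serverId.getBytes(StandardCharsets.UTF_8));
            long now = System.currentTimeMillis();
            lastSeen.forEach((peer, seen) -> {
                if (now - seen > STALE_AFTER_MS) {
                    // Here you would mark that server's users as disconnected
                    // (fix up the SQL counter) and broadcast the new total to
                    // your own connected clients.
                    System.out.println(peer + " looks dead since "
                            + Instant.ofEpochMilli(seen));
                    lastSeen.remove(peer);
                }
            });
            Thread.sleep(5_000);
        }
    }
}
```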

How is a client different from a server-peer?

The documentation says:
GemFire clients are processes that send most or all of their data requests and updates to a GemFire server system. Clients run as standalone processes, without peers of their own.
Fundamentally, all peers communicate among themselves to manage the cache. An entry made by one peer in a region goes to all other peers. Similarly, a client's cache gets updated as soon as there is a change on the server. A client is also allowed to make new entries in the region, which will be propagated to all server peers.
What then is the real difference between a client and a server peer? Based on my understanding, both have access to all data and both can do the same operations.
The major difference between a peer and a client is that the peer connects to all other members of the distributed system; it has at least two connections open at all times to each other member of the distributed system. Clients do not need connections to all servers; a single connection to a single server is enough. Thus, you can have tens of thousands of clients, but perhaps only hundreds of peers. (The number of connections that the client establishes can be configured while creating a client pool. You can also configure single-hop on the client, which enables it to connect directly to the servers against which it wishes to operate.)
The performance implication is that peers can access any data with just one network hop, whereas clients may need at most two network hops (one from client to server, and one from server to the node where the data lives).
The other differences are:
1. Clients can register interest, peers cannot.
2. Clients can register Continuous Queries, peers cannot.
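As a rough illustration of the client side, here is a sketch against the Apache Geode API (the open-sourced GemFire); the locator address, pool settings, and region name are all made up. It shows the pool configuration, single-hop flag, and register-interest call mentioned above:

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class ClientDemo {
    public static void main(String[] args) {
        // A client opens a pool of connections to the servers rather than
        // joining the distributed system as a peer.
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("locator.example.com", 10334) // hypothetical
                .setPoolMinConnections(1)
                .setPoolMaxConnections(10)
                .setPoolPRSingleHopEnabled(true)  // go straight to the server holding the data
                .setPoolSubscriptionEnabled(true) // required for register interest
                .create();

        Region<String, String> region = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
                .create("exampleRegion");

        // Client-only feature: register interest so server-side updates are
        // pushed into this client's local cache.
        region.registerInterest("ALL_KEYS");

        region.put("k1", "v1"); // propagates to the servers
        cache.close();
    }
}
```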

Objective-C application with bidirectional file transfer across internet

Basically, I'm building a Dropbox clone that will avoid cloud storage. OK, I'm not actually building it yet; I'm trying to estimate the amount of work needed.
I've been reading about different P2P options here on SO, but there are actually very few topics on centralised P2P connections and how to build them from the ground up. I'm not even sure whether it's appropriate to call it P2P at all.
From my ActionScript background, I know that it can establish a UDP connection between two different clients across the globe with a provided centralised server (RTMFP). It's highly abstracted: it doesn't even require opening ports, and clients don't know each other's IPs. So the subset of given options is quite limited.
Anyway, I need to create a server-side app and a client-side app that will try to sync files between connected clients. I've read that socket connections are used for file transfers. The questions here are:
How to pair the clients?
What should server do?
What should client do?
Thank you.
NB
Establishing connections and file syncing solutions are out of the question.

LDAP Connections

I have a very basic question about the LDAP Protocol:
Can a client stay connected for an undefined period of time, or does each authentication require opening and closing a TCP connection?
Professional-quality LDAP servers can be configured to terminate client connections after a period of time, a maximum number of operations, or other conditions; or, alternatively, to leave the client connected forever. Ask your LDAP server administrator whether client connections are terminated under any of the conditions listed, or perhaps others.
In addition to what Terry says, professional-quality LDAP client APIs use a connection pool to hide all these gory details from you, to keep connections open as long as possible, and to recover from situations where the server imposes a connection-termination rule.
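As one concrete example, the JDK's built-in JNDI LDAP provider has such pooling; a minimal sketch, with a made-up server address and credentials:

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

public class PooledLdapDemo {
    public static void main(String[] args) throws Exception {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389"); // hypothetical
        // Enable the JDK's built-in LDAP connection pooling: physical TCP
        // connections are shared and reused across contexts.
        env.put("com.sun.jndi.ldap.connect.pool", "true");
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "cn=app,dc=example,dc=com"); // hypothetical
        env.put(Context.SECURITY_CREDENTIALS, "secret");

        DirContext ctx = new InitialDirContext(env);
        // ... perform searches and other operations ...
        ctx.close(); // returns the underlying connection to the pool
    }
}
```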
LDAP servers may implement multiple limits on the server side, and LDAP client APIs also provide options to set limits on the client side. Some of the server-side limits (in the case of Oracle DSEE) are:
Size limit - the number of search result entries returned.
Time limit - the time taken to process the request.
Idle time limit - how long the connection can stay idle. (A keepalive at the load balancer can keep the connection alive.) The server access log marks connections closed because of the idle time limit.
Lookthrough limit - the number of candidate entries to look through for a given LDAP search.
Client APIs may also set their own time and size limits.
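As an illustration of those client-side limits, here is a minimal sketch using JNDI's SearchControls (the server address and base DN are made up); note that if the server enforces a lower limit of its own, the search is cut off there instead:

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class LimitedSearchDemo {
    public static void main(String[] args) throws Exception {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389"); // hypothetical
        DirContext ctx = new InitialDirContext(env);

        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        controls.setCountLimit(500);  // client-side size limit: at most 500 entries
        controls.setTimeLimit(5_000); // client-side time limit: 5 seconds, in ms

        NamingEnumeration<SearchResult> results =
                ctx.search("dc=example,dc=com", "(uid=*)", controls);
        while (results.hasMore()) {
            System.out.println(results.next().getNameInNamespace());
        }
        ctx.close();
    }
}
```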