RabbitMQ with F5 Load Balancer - rabbitmq

I'm trying to get RabbitMQ configured behind an F5 load balancer. I have a working RabbitMQ node with the default node name of rabbit@%computername%. It's set to listen on all network interfaces (0.0.0.0:5671, which is the AMQP SSL port), and it's working fine. However, all client applications that connect to it are currently using the specific node name, e.g. "%computername%". In order to take advantage of the fault tolerance of the load balancer, I want to update all my client applications to use the load-balanced name instead of the specific node name, e.g. connect using HostName = "balancedname.mycompany.com" instead of "%computername%". However, when I update my client applications to connect to the load-balanced name, the connection fails. How can I get this to work?
I'm a novice at F5, and I did notice that the pool members' addresses are IP addresses. Should these be the node names instead of the IPs? Is that even possible, seeing as the node name can be completely arbitrary and doesn't necessarily map to anything that's network-resolvable? I'm in a hosting situation where I don't have write access to the F5, so trying these things out is a bit tricky.
I haven't found very much information at all on load balancing a RabbitMQ setup. I do understand that all RabbitMQ queues only really exist on one node, and I've set up the F5 in an active-passive mode so that traffic will always route to the primary node unless it goes down.
Update 1: It seems that this issue came back to bite me here. I'm using EXTERNAL authentication using an SSL certificate, and since clients were connecting using the load balance name instead of the node name, and the load balance name was NOT used to create the certificate, it was rejecting the connection. I ended up re-generating the certificate and using the load balance name, but that wasn't enough - I also had to add an entry in the Windows hosts file to map 127.0.0.1 and ::1 to the load balance DNS address.
Update 2: Update 1 solves connection problems only for running client applications on the app server that is part of the load balancer, but remote clients don't work. Inner exception says "The certificate chain was issued by an authority that is not trusted". RabbitMQ + SSL is hard. And adding load balancing makes it even harder.
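For illustration, here is a minimal sketch of the two checks a TLS client makes that are behind both updates: the server certificate must chain to a CA the client trusts, and the name in the certificate must match the name the client connects to. The clients in the question are .NET, so Python only stands in here; the function name and the file names in the usage comment are hypothetical.

```python
import ssl


def make_client_context(ca_bundle=None, client_cert=None, client_key=None):
    """Build a TLS client context that verifies the server chain and host name."""
    # PROTOCOL_TLS_CLIENT turns on CERT_REQUIRED and check_hostname by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_bundle:
        # Trust the private or self-signed CA that issued the server certificate.
        ctx.load_verify_locations(cafile=ca_bundle)
    else:
        # Fall back to the CAs trusted by the operating system.
        ctx.load_default_certs()
    if client_cert:
        # Certificate presented by the client for certificate-based authentication.
        ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return ctx


# Hypothetical usage (needs a reachable broker and real PEM files):
# import socket
# ctx = make_client_context("cacert.pem", "client_cert.pem", "client_key.pem")
# with socket.create_connection(("balancedname.mycompany.com", 5671)) as raw:
#     # server_hostname is the name checked against the server certificate.
#     with ctx.wrap_socket(raw, server_hostname="balancedname.mycompany.com") as tls:
#         print(tls.version())
```

If the CA that signed the server certificate is not in the client's trust store, the handshake fails with an untrusted-chain error; if the certificate was issued for the node name rather than the load-balanced name, it fails with a name mismatch.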

I'm answering my own question in the hopes that it will save folks some time. In my scenario, I needed for clients to connect to a load balanced address like myrabbithost.mycompany.com, and for the F5 to direct traffic to one node as long as it's up and failover to the secondary node if it's down. I had already configured security and was authenticating to RabbitMQ using self-signed certificates. Those certificates had common names specific to each host which was the problem. In order to work with .NET, the common name on the certificate must match the server name being connected to (myrabbithost.mycompany.com in my case). I had to do the following:
Generate new server and client certificates on the RabbitMQ servers with common names of myrabbithost.mycompany.com
Generate new certificates for the clients to use while connecting in order to use SSL authentication
Still on the RabbitMQ servers, I had to concatenate the multiple cacert.pem files used for the certificate authority so that clients can authenticate to any node using a client certificate generated by any node. When I modified rabbit.config to use the "all.pem" instead of "cacert.pem", clients were able to connect, but it broke the management UI, so I modified the rabbitmq_management settings in rabbit.config to specify the host-specific cacert.pem file and it started working again.
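The concatenation step above can be sketched as follows. This is only an illustration: the helper name and the per-node file paths in the usage comment are hypothetical, and the same result can be had by appending the files by hand.

```python
from pathlib import Path

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def concat_ca_bundles(ca_files, out_file):
    """Concatenate PEM-encoded CA certificates into one bundle file."""
    parts = []
    for path in ca_files:
        text = Path(path).read_text(encoding="utf-8").strip()
        if PEM_HEADER not in text:
            raise ValueError(f"{path} does not look like a PEM certificate")
        parts.append(text)
    # A newline between and after the blocks keeps each certificate well separated.
    Path(out_file).write_text("\n".join(parts) + "\n", encoding="utf-8")
    return len(parts)


# Hypothetical usage with one CA certificate copied from each node:
# concat_ca_bundles(["node1/cacert.pem", "node2/cacert.pem"], "all.pem")
```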
In order to set up high availability, I set up a RabbitMQ cluster, but ran into some problems there as well. In addition to copying the Erlang cookie from the primary node to the secondary node at C:\Windows and C:\users\myusername, I had to kill the epmd.exe process via Task Manager, as the rabbitmqctl join_cluster command was failing with a "node down" error. The epmd.exe process survives RabbitMQ stoppages, and it can cause rabbitmqctl.bat to report spurious errors like "node down" even when the node is not down.

Related

Is Weblogic Node Manager SSL setup required while implementing SSL for Application

In Weblogic, I have more than one Machine created using Node Manager. We have been told to set up SSL for our application, which is deployed across these machines under a single Weblogic Admin Console.
For the application, we configured a certificate using a .jks file and enabled the SSL listen port.
However, we have also been told to secure the Node Manager machines across which the application is deployed. When I change the Node Manager type from Plain to SSL, I get an SSLException. My understanding is that we do not need to secure the Machines created using Node Manager and that securing the application alone is sufficient. Am I right, or is it required to secure Machines -> Node Manager as well?
If I turn on SSL in Machines -> Node Manager, what do I have to consider to avoid the SSLException? Is a Weblogic restart required after configuring this? For now I do not have UNIX access, so I can't do that at the moment.
Without securing Machines -> Node Manager I am able to run the application, but I am not able to access it using https. Only http works for the application.
Please advise on the situation.
SSL for node manager is optional as there's no application related sensitive data that flows in this layer.
You mention that even after configuring the jks you can't get the server, and hence the application, listening on https. Could you elaborate on the steps you followed? Note that this has nothing to do with Node Manager.

Taking a server from development to production

I have created a service (WCF) that acts as a backend for a DB. For now it does basic operations such as INSERT, SELECT etc. I have run it locally and now it is time to expose it to the internet and enter 'production'. Is there a best practice for doing so? Bear in mind this service will be hosted on a PC as a Windows Service (not IIS). This is the first time I am putting a Windows Service into production, so I am hazy on the details, but I think this is the main idea:
On the service: Check for 'rookie' errors such as SQL Injection. Set maximum message sizes to ones marginally higher than the largest message that should be transmitted by my service. Also upgrade self signed X.509 certificate to one issued by a CA. (Where does one store this certificate? Locally on the PC?)
On the PC: Fully patched software (OS etc) and windows firewall with a specific set of rules that allows traffic only on the ports being used (I suppose the safest way to do this is to use the windows tool Allow a program or feature through Windows Firewall ?). Furthermore an updated antivirus running.
On the Network: For the network router, port forward the respective ports being used (the base address is declared as http://localhost:8080 so I guess port 80 for HTTP and 443 for HTTPS? I am using message level Security.)
General precautions: Full message logging on the service to analyze traffic and potential attackers. Also run a Network intrusion detection system such as Snort so that I can sleep a bit better at night.
Am I missing anything obvious? Also, should I be hosting in IIS? Someone on security exchange said that I would be vulnerable to HTTP attacks if I did not put the code behind a web server. However, I have not read this anywhere else.

Load balancing for SSL server, NOT web server

I'm finding it pretty difficult to get reliable information on Google about how exactly to do load balancing for anything other than a web server. Here is my situation: I currently have a python/twisted SSL server running on one machine. This is not fast enough so I want to change this so that multiple instances of this server will run on multiple machines behind a load balancer. So suppose I have two copies of this server process: TWISTED1 and TWISTED2. TWISTED1 will run on MACHINE1 and TWISTED2 will run on MACHINE2. Both TWISTED1 and TWISTED2 are SSL server processes. A separate machine LOAD_BALANCER is used to load balance between the two machines.
Where do I put my existing SSL certificate? Do I put an identical copy on both MACHINE1 and MACHINE2? Do I also have an identical copy on LOAD_BALANCER? I do NOT want unencrypted traffic between LOAD_BALANCER and MACHINE1 or MACHINE2, and also the twisted processes are already set up as SSL servers, so it would be unnecessary work to remove SSL from the twisted process. Basically I want to set up load balancing for SSL traffic with minimal changes to the existing twisted scripts. What is the solution?
Regarding the load balancer, is it sufficient to use another machine MACHINE3 and put HAProxy on it as the load balancer, or is it better to use a hardware load balancer like Barracuda?
Note also that most of the connections to the twisted process are persistent connections.
Could you have the certs on one machine and a mount from the other machine to the machine w/ the certs? Allowing you to only have one set of ssl certs.
The problem with load balancing a TLS server is that without the HTTP "forwarded for" header, there's no way to tell where the original connection came from. That's why so much documentation focuses on load-balancing HTTP(S).
You can configure TLS termination in more or less all of the ways that you've described; it sounds like you can simply have your load balancer act as a TCP load balancer. The only reason to have your load balancer have your certificate (with the implication being that it also would have your private key) would be for it to decrypt the traffic to figure out what to do with it, then re-encrypt it to the machines. If you don't need to do that, put the certificate only on the target machines and not on the LB.
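As a rough sketch of what "act as a TCP load balancer" means, the relay below copies bytes between the client and one backend without decrypting them, so the TLS handshake still happens between the client and the twisted process and the certificate stays on the target machines. It uses asyncio from the standard library rather than HAProxy or twisted, picks backends round-robin per connection, and all function names and addresses are hypothetical.

```python
import asyncio
import itertools


async def _pipe(reader, writer):
    """Copy bytes one way until EOF, then close the destination."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass
    finally:
        writer.close()


def make_handler(backends):
    """Return a connection handler that round-robins over (host, port) backends."""
    pool = itertools.cycle(backends)

    async def handle(client_reader, client_writer):
        host, port = next(pool)
        try:
            backend_reader, backend_writer = await asyncio.open_connection(host, port)
        except OSError:
            client_writer.close()
            return
        # Bytes are relayed untouched: the proxy never sees plaintext or the key.
        await asyncio.gather(
            _pipe(client_reader, backend_writer),
            _pipe(backend_reader, client_writer),
        )

    return handle


async def serve(listen_host, listen_port, backends):
    """Listen for clients and relay each connection to one backend."""
    server = await asyncio.start_server(make_handler(backends), listen_host, listen_port)
    async with server:
        await server.serve_forever()


# Hypothetical usage:
# asyncio.run(serve("0.0.0.0", 443, [("MACHINE1", 443), ("MACHINE2", 443)]))
```

Because the balancing decision is made once per connection, persistent connections stay on the backend they first landed on for their whole lifetime.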

Connect to third-party two-way https ws from glassfish behind ssl-terminating-point

Context
I developed an application deployed on Glassfish 3.1. This application is accessed only over https, and sometimes it must connect to third-party web services located outside the customer's network. The customer has other applications inside his network; mine is only a new "service".
Topology approximation
Big-ip F5 is the SSL end point. The customer has the valid certificate on this device
IIS redirects by domain to the respective service
glassfish is the machine with the application (over, of course, a glassfish 3.1)
How it works
When a user tries to connect to _https://somedomain, the request arrives at the F5, where the SSL encryption ends; now we have a request to _http://somedomain. In the next step the F5 redirects this request to IIS, which finally redirects it to glassfish. These requests are processed successfully.
Points of interest
I have full control over the glassfish server and the OS of the VM where it is located. No other applications are or will be deployed on this server; it's a dedicated server for the app and some services it needs. Glassfish runs on a VM with a Debian distribution as its OS. This VM is provided by a VM server, but I don't know the brand, model, etc. Glassfish has the default http listeners configuration.
I don't have any more information about the network and other devices, and I can't access any configuration file of any other device. I can't modify any part of the network on my own, but I can ask for, suggest or advise a change. The network's behavior should not change.
Currently, users reach the application without problems.
The certificate used is a simple domain certificate trusted by VeriSign
The customer has no idea how to solve this.
The problem
All the third-party WS the application must access are reachable only over https and, in some cases, the authentication required is mutual (two-way), and here we find the problem. When the application connects to a WS with mutual SSL authentication, it sends the certificate targeted by the glassfish local keystore configuration. The customer wants, if possible, to use the same cert for incoming and outgoing secure connections. This cert is on the F5 and I can't add it to the glassfish keystore, because doing so would break VeriSign contract requirements. I've been looking for a solution on Google, here (stackoverflow), jita,... but I've only found solutions for incoming traffic. I understand that an SSL proxy may be required, but I haven't found any example or alternative solution for outgoing SSL connections.
What I'm asking for
I'm not a native English speaker (isn't it obvious?) and maybe I don't use the correct terms in my searches. I understand that this context can be a nightmare and hard to solve, but I will stand... The first thing I need is to know whether a solution (or solutions) to this problem exists and, if so, where or how I can find it. I've prepared different alternatives to negotiate with the customer, but I need to know the truth. I've spent tons of hours on this.
There are a couple of solutions.
1) Pay VeriSign more money for a second "license/cert". They will be happy to take your money for the "privilege". :)
2) Create a different virtual server listening on 443 which points to a pool that has your client's server address as the pool member. Then, on the virtual server, attach a serverssl profile that is configured to use the same cert you are using for the incoming connections. The F5 would then authenticate with that cert, and your app server would not need a client cert installed. Also, if they need to initiate a session to you, you would have to set up a virtual server with a clientssl profile that uses the same cert and requires a client cert to connect.
If your destinations may not be static addresses, then one or more iRules would have to be created to deal with that. This can be handled in version 10 or later code with a DNS call in the iRule that sets the node for the session to go to.

Weblogic Apache plugin and session stickiness

If two web servers are configured in between a load balancer and a weblogic cluster, will the two Apache server maintain session stickiness?
Say, for example, the load balancer forwards the first request to the 1st Apache, which in turn forwards it to the 1st WL managed instance. If the second request from the same user is forwarded by the load balancer to the second Apache, will the second Apache be able to forward it to the 1st WL managed instance, which served the first request, rather than to the second WL managed instance, which is not aware of the session information at all?
What should ideally be the behaviour of the weblogic apache plugin? The catch is I don't want to enable session replication on the wl server cluster.
According to the section "Failover, Cookies, and HTTP Sessions" of the Apache HTTP Server Plug-In:
When a request contains session information stored in a cookie or in the POST data, or encoded in a URL, the session ID contains a reference to the specific server instance in which the session was originally established (called the primary server) and a reference to an additional server where the original session is replicated (called the secondary server). A request containing a cookie attempts to connect to the primary server. If that attempt fails, the request is routed to the secondary server. If both the primary and secondary servers fail, the session is lost and the plug-in attempts to make a fresh connection to another server in the dynamic cluster list. See Figure 3-1 Connection Failover.
Note: If the POST data is larger than 64K, the plug-in will not parse the POST data to obtain the session ID. Therefore, if you store the session ID in the POST data, the plug-in cannot route the request to the correct primary or secondary server, resulting in possible loss of session data.
Figure 3-1 Connection Failover
In other words, yes, both Apache servers will be able to forward an incoming request to the "right" WebLogic instance, as the session ID contains all the required information for that. Note that there is no real need to confirm this with testing, but it would be very easy to do.
UPDATE: Answering the following comment from the OP
I think this document stands good for only one apache server. In my case I have two and the load balancer forwards the requests to both the servers in a 50:50 manner. I did test this and the weblogic plugin is not maintaining the stickiness.
I understand you are using two Apache frontends, and I'm not convinced this document applies only to a configuration with a single Apache server. As explained, the session ID contains a reference to the primary server (and to the secondary server as well), so both Apache servers should be able to deal with it. At least, this is my understanding. Actually, I've worked with a similar configuration in the past but can't remember if things were working as I think they should or if the load balancer was configured to handle stickiness too (i.e. forward to a given Apache server). I have a little doubt now...
Could you post your plugin configuration (of both Apache servers if they differ)? Could you also confirm that things work as expected when only one Apache server is up (and test this with each Apache server if their configurations differ, which shouldn't be the case though)?
When you have 2 Apache instances with a TCP load balancer in front, the stateflow diagram is not applicable anymore, because the Apache instances do not share their states.
I guess that the WebLogic plug-in maintains a state with a directional mapping [IPAddress+Port -> JVMID]. If it receives a cookie with a JVMID it does not know yet (for instance, because it has never sent a request to that server), it has no way to know which IPAddress+Port the JVMID refers to, so it will not be able to reuse it and will assign new primary/secondary servers. With 2 instances these will be the same pair (maybe swapped), but they might be different if there are strictly more than 2 instances.
I did not confirm it by running specific tests, but on paper it seems not to work in all cases.
The answer is yes. We've got a write up of this on our blog http://blog.c2b2.co.uk/2012/10/basic-clustering-with-weblogic-12c-and.html which provides step by step instructions on setting up web session failover in a cluster.
Essentially the jsessionid cookie encodes the primary and secondary weblogic servers. Mod-wl parses the cookie and routes the request to the primary server. In your case Managed Server 1. If it is down it will automatically route the request to the backup server Managed Server 2.
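A small sketch of that routing decision, assuming the cookie value has the layout sessionid!primary!secondary with "NONE" as the secondary when nothing is replicated. That layout is an assumption made for illustration, the names below are hypothetical, and this is not the plug-in's actual code.

```python
from typing import NamedTuple, Optional, Set


class SessionRoute(NamedTuple):
    session_id: str
    primary: Optional[str]
    secondary: Optional[str]


def parse_session_cookie(value: str) -> SessionRoute:
    """Split an assumed 'sessionid!primary!secondary' cookie value."""
    parts = value.split("!")
    session_id = parts[0]
    primary = parts[1] if len(parts) > 1 else None
    secondary = parts[2] if len(parts) > 2 else None
    # Assumption: "NONE" marks a missing secondary when replication is off.
    if secondary == "NONE":
        secondary = None
    return SessionRoute(session_id, primary, secondary)


def choose_target(route: SessionRoute, alive: Set[str]) -> Optional[str]:
    """Prefer the primary, fall back to the secondary, else None (new session)."""
    for candidate in (route.primary, route.secondary):
        if candidate is not None and candidate in alive:
            return candidate
    return None


# Hypothetical cookie value and server identifiers:
route = parse_session_cookie("abc123!-1111!2222")
print(choose_target(route, {"-1111", "2222"}))  # primary is up
print(choose_target(route, {"2222"}))           # primary is down
```

Nothing in this decision depends on which Apache instance receives the request, provided that instance can resolve the server identifiers in the cookie to real addresses.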
The diagram above holds true for 2 Apache servers connected to the same WL cluster. The cookie session info contains details on what WLS to connect to, and the plugin will respect that. If the primary WL server (the one it originally connected to) isn't available, then the request would be sent to the secondary server (designated as such at the time of the first request, based on the rules defined for selecting a "Preferred Replication Group"). This secondary server maintains the same session state as the primary WLS server and should be able to handle the request.
If session replication isn't set up (I think this is OFF by default), then no session would be copied to another server, and if the original/primary WL server goes down, you lose the session.
The answer is NO. As you have 2 Apache web servers, you need to implement stickiness at both the hardware and the software load balancer level in order to achieve your requirement.
This means that you already have sticky sessions implemented at the WebLogic plug-in for Apache level, but you also need source-IP-based stickiness at the hardware load balancer level. This will allow your hardware load balancer to send subsequent requests from the same user to the same Apache web server.
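To make the idea of source-IP stickiness concrete, here is a minimal sketch that maps a client address to one of the Apache servers by hashing it. Real load balancers often keep a persistence table with timeouts instead of a plain hash, so treat this as a simplification; the function name and server names are hypothetical.

```python
import hashlib


def pick_backend(client_ip: str, backends: list) -> str:
    """Map a client IP to a backend so the same IP always gets the same one."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


apaches = ["apache1", "apache2"]
# Repeated requests from one address land on the same Apache server.
print(pick_backend("203.0.113.7", apaches) == pick_backend("203.0.113.7", apaches))
```

With a plain hash, adding or removing a backend changes the mapping for many clients, which is one reason persistence tables are commonly used instead.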