RabbitMQ: handshake error when attempting to use SSL certificates - ssl

I am trying to use SSL certificates with RabbitMQ but I keep getting handshake errors with the broker.
The certificates that I have generated work fine when using the openssl 's_client' and 's_server' commands in separate terminal windows and utilizing port 8443 as detailed in the SSL Troubleshooting guide (http://www.rabbitmq.com/troubleshooting-ssl.html).
The problem appears when I attempt to connect to the RabbitMQ SSL port 5671 using the same openssl 's_client' command:
Running this:
openssl s_client -connect localhost:5671 -cert /etc/rabbitmq/ssl/client/cert.pem -key /etc/rabbitmq/ssl/client/key.pem -CAfile /etc/rabbitmq/ssl/certificate_auth/cacert.pem
Produces this:
CONNECTED(00000003)
depth=1 CN = RMQCA
verify return:1
depth=0 CN = roger.xxxxxx.com, O = server
verify return:1
139997248210760:error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure:s3_pkt.c:1256:SSL alert number 40
139997248210760:error:140790E5:SSL routines:SSL23_WRITE:ssl handshake failure:s23_lib.c:177:
---
The SSL listener starts fine as indicated in the RabbitMQ log:
=INFO REPORT==== 19-May-2014::15:45:34 ===
started TCP Listener on [::]:5672
=INFO REPORT==== 19-May-2014::15:45:34 ===
started SSL Listener on [::]:5671
When attempting to connect to port 5671 with 's_client' the error appears:
=INFO REPORT==== 19-May-2014::17:20:39 ===
accepting AMQP connection <0.3263.0> ([::1]:58538 -> [::1]:5671)
=ERROR REPORT==== 19-May-2014::17:20:39 ===
SSL: certify: ssl_handshake.erl:1346:Fatal error: handshake failure
=ERROR REPORT==== 19-May-2014::17:20:44 ===
error on AMQP connection <0.3263.0>: {ssl_upgrade_error,
{tls_alert,"handshake failure"}} (unknown POSIX error)
RabbitMQ Config file:
[
{rabbit, [
{ssl_listeners, [5671]},
{ssl_options, [{cacertfile, "/etc/rabbitmq/ssl/certificate_auth/cacert.pem"},
{certfile, "/etc/rabbitmq/ssl/server/cert.pem"},
{keyfile, "/etc/rabbitmq/ssl/server/key.pem"},
{verify, verify_peer},
{fail_if_no_peer_cert, false}]}
]}
].
RabbitMQ info:
[{pid,10375},
{running_applications,
[{rabbitmq_management,"RabbitMQ Management Console","3.2.3"},
{rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.2.3"},
{webmachine,"webmachine","1.10.3-rmq3.2.3-gite9359c7"},
{mochiweb,"MochiMedia Web Server","2.7.0-rmq3.2.3-git680dba8"},
{rabbitmq_management_agent,"RabbitMQ Management Agent","3.2.3"},
{rabbit,"RabbitMQ","3.2.3"},
{ssl,"Erlang/OTP SSL application","5.3.3"},
{public_key,"Public key infrastructure","0.21"},
{crypto,"CRYPTO version 2","3.2"},
{asn1,"The Erlang ASN1 compiler version 2.0.4","2.0.4"},
{os_mon,"CPO CXC 138 46","2.2.14"},
{inets,"INETS CXC 138 49","5.9.8"},
{mnesia,"MNESIA CXC 138 12","4.11"},
{amqp_client,"RabbitMQ AMQP Client","3.2.3"},
{xmerl,"XML parser","1.3.6"},
{sasl,"SASL CXC 138 11","2.3.4"},
{stdlib,"ERTS CXC 138 10","1.19.4"},
{kernel,"ERTS CXC 138 10","2.16.4"}]},
{os,{unix,linux}},
{erlang_version,
"Erlang R16B03-1 (erts-5.10.4) [source] [64-bit] [smp:2:2] [async-threads:30] [hipe] [kernel-poll:true]\n"},
{memory,
[{total,43812088},
{connection_procs,5616},
{queue_procs,42528},
{plugins,451248},
{other_proc,13805200},
{mnesia,72752},
{mgmt_db,10208},
{msg_index,34560},
{other_ets,1159472},
{binary,1030272},
{code,21819091},
{atom,793505},
{other_system,4587636}]},
{vm_memory_high_watermark,0.4},
{vm_memory_limit,787819724},
{disk_free_limit,50000000},
{disk_free,31267266560},
{file_descriptors,
[{total_limit,924},{total_used,4},{sockets_limit,829},{sockets_used,2}]},
{processes,[{limit,1048576},{used,215}]},
{run_queue,0},
{uptime,7893}]
...done.
Any help would be greatly appreciated
Thanks in advance.
UPDATE:
I get the following errors when trying to connect with the rabbitmqadmin utility.
Log File:
=INFO REPORT==== 20-May-2014::14:39:12 ===
accepting AMQP connection <0.16589.0> ([::1]:58922 -> [::1]:5671)
=ERROR REPORT==== 20-May-2014::14:39:12 ===
SSL: certify: ssl_handshake.erl:1346:Fatal error: handshake failure
=ERROR REPORT==== 20-May-2014::14:39:17 ===
error on AMQP connection <0.16589.0>: {ssl_upgrade_error,
{tls_alert,"handshake failure"}} (unknown POSIX error)
The rabbitmqadmin command produced the following:
*** Could not connect: [Errno 1] _ssl.c:492: error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure

I had the same problem as @user3653959, and @Sarah Messer's answer led me to the solution.
Your client certificate must have the TLS Web Client Authentication "X509v3 Extended Key Usage" attribute. Mine had only TLS Web Server Authentication due to an error in my client generation script.
To check your client certificate's capabilities, you can use this command:
openssl x509 -noout -text -in client-certificate.pem
Then look for the "X509v3 extensions:" section and the "X509v3 Extended Key Usage:" subsection.
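If you want to script that check, here is a minimal Python sketch. It only scans the text printed by the openssl x509 command above, so it assumes the usual layout where the usages appear on the line after the "X509v3 Extended Key Usage" header. The function names and the sample text are made up for illustration.

```python
def extended_key_usages(x509_text):
    """Return the usages listed under 'X509v3 Extended Key Usage' in the
    text printed by `openssl x509 -noout -text` (empty list if absent)."""
    lines = x509_text.splitlines()
    for i, line in enumerate(lines):
        if "X509v3 Extended Key Usage" in line:
            # The usages are on the next non-empty line, comma separated.
            for value_line in lines[i + 1:]:
                if value_line.strip():
                    return [usage.strip() for usage in value_line.split(",")]
    return []


def has_client_auth(x509_text):
    """True if the certificate text lists TLS Web Client Authentication."""
    return "TLS Web Client Authentication" in extended_key_usages(x509_text)


# Illustrative excerpt of a certificate that only allows server authentication.
sample = """
        X509v3 extensions:
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
"""
print(has_client_auth(sample))  # False
```

In practice you would feed it the real command output, for example captured with subprocess.run(..., capture_output=True, text=True).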
If you generate your client certificate using the example openssl.conf and client and server commands provided in the official "RabbitMQ - TLS Support" guide, it should work out of the box.
The key here is the extendedKeyUsage = 1.3.6.1.5.5.7.3.2 openssl config option in openssl.conf, as @Sarah Messer points out. This is the "TLS Web Client Authentication" capability. OpenSSL s_server does not require this capability, which is why the certificate works with s_server by default but not with RabbitMQ. For the main key usage, keyUsage = digitalSignature is enough. Also, the "Common Name" (CN) of the client certificate is not important.
Just for reference
My environment:
RabbitMQ 3.6.2
Erlang 18.2
Ubuntu 14.04.2 LTS (64-bit)
Only TLSv1.2 enabled.
The error I was seeing in my RabbitMQ log:
=ERROR REPORT==== 21-Jun-2016::13:28:21 ===
SSL: certify: ssl_handshake.erl:1492:Fatal error: handshake failure
The error I was seeing via openssl s_client:
140735165813584:error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure:s3_pkt.c:1472:SSL alert number 40
140735165813584:error:1409E0E5:SSL routines:ssl3_write_bytes:ssl handshake failure:s3_pkt.c:656:

I worked my way through similar troubles (using RabbitMQ 2.7.1 / Erlang R14B04). Here's what I've found:
The RabbitMQ plugins page and at least one other site recommend enabling the plugin rabbitmq_auth_mechanism_ssl. If rabbitmq-plugins is an invalid command on your system, this page describes how to enable it on Ubuntu. (Apparently the apt-get package doesn't have quite the expected behavior on Debian-based systems.) Your output (from rabbitmqctl report, I presume) says you don't have rabbitmq_auth_mechanism_ssl enabled.
For your rabbitmq.config, you'll need to make sure "EXTERNAL" is listed as one of the auth_mechanisms. The line's syntax is {auth_mechanisms, ['PLAIN', 'AMQPLAIN', 'EXTERNAL']} and appears as one item in the default, "rabbit" portion of the configuration.
You should also make sure the certificate your client presents has the appropriate values set for both keyUsage and extendedKeyUsage, as RabbitMQ is more strict about these than s_server. For debugging / testing purposes, you may want to be extremely permissive with these. You can set the keyUsage in your openssl config. A broadly-acceptable openssl config may have lines like this
keyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment, keyAgreement, keyCertSign, cRLSign
extendedKeyUsage = 1.3.6.1.5.5.7.3.1, 1.3.6.1.5.5.7.3.2
(I think the .2 OID, "TLS Web Client Authentication" is important for connecting to RabbitMQ, but I haven't done careful tests.)
This will produce certificates with this block near the end:
X509v3 Key Usage:
Digital Signature, Non Repudiation, Key Encipherment, Data Encipherment, Key Agreement, Certificate Sign, CRL Sign
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
There should be more output from s_client. In particular, I'm interested in the final line, which should look something like "Verify return code: 0 (ok)" If you have a non-zero / error message there, post it and pivot off that in your searches. (#19 is surprisingly common, given that it's not really an error.)
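If you are collecting s_client output from several hosts, a small Python sketch like the following can pull out that final line. It assumes the line has the form "Verify return code: N (message)" as quoted above; the function name is hypothetical.

```python
import re


def verify_return_code(s_client_output):
    """Return (code, message) from the 'Verify return code' line of
    `openssl s_client` output, or None if the line is missing."""
    match = re.search(r"Verify return code:\s*(\d+)\s*\((.*?)\)", s_client_output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(verify_return_code("    Verify return code: 0 (ok)\n"))  # (0, 'ok')
```

A result of None means the handshake output was cut short before the summary, which is itself a useful hint.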
When I got to this point and tried to make a simple pika.BlockingConnection, the handshake apparently completed just fine, but Rabbit removed EXTERNAL from the list specified in auth_mechanisms in the config. I confirmed I had rabbitmq_auth_mechanism_ssl enabled, but that by itself wasn't enough. (I discovered this by subclassing pika.credentials.ExternalCredentials, passing an instance as the "credentials" item in ConnectionParameters, and adding a print statement at the top of the subclass's response_for() method.) I fixed that by adding the following line to the rabbit portion of the config file, on the same level as ssl_listeners and ssl_cert_login_from:
{ssl_apps,[asn1,crypto,public_key,ssl]},
(I suspect newer versions of RabbitMQ turn that on by default, but my particular setup did not.)
If you've done all that and you're still having trouble, you might also try replacing "verify_peer" with "verify_none" in your RabbitMQ config. You probably don't want that in production, since it opens you up to anyone with a self-signed certificate, but it's another data point. Also, subclass the relevant things in pika and add in print statements to get more insight on what Rabbit's sending you and how your local client's interpreting it.
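For one more data point that does not depend on pika, you can attempt the bare TLS handshake with Python's standard ssl module. This is only a sketch: build_client_context and probe_handshake are hypothetical helper names, and the paths you pass in are your own. Note that the default context also checks the hostname, so connecting to localhost with a certificate issued for another name will fail for that reason alone.

```python
import socket
import ssl


def build_client_context(cafile=None, certfile=None, keyfile=None):
    """Create a verifying TLS client context, optionally with a client certificate."""
    context = ssl.create_default_context(cafile=cafile)
    if certfile is not None:
        # Present this certificate if the server asks for one.
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context


def probe_handshake(host, port, context, timeout=5.0):
    """Attempt a TLS handshake; return (protocol, cipher name) or the error text."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version(), tls.cipher()[0]
    except (ssl.SSLError, OSError) as exc:
        return "handshake failed: {}".format(exc)
```

Calling probe_handshake("localhost", 5671, build_client_context(cafile=..., certfile=..., keyfile=...)) with the same files you gave s_client should tell you whether the failure is specific to your AMQP client or already happens at the TLS layer.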

Here is the solution which worked for me:
Adding below ciphers in rabbitmq.config:
{ciphers, ["ECDHE-ECDSA-AES256-GCM-SHA384","ECDHE-RSA-AES256-GCM-SHA384",
"ECDHE-ECDSA-AES256-SHA384","ECDHE-RSA-AES256-SHA384", "ECDHE-ECDSA-DES-CBC3-SHA",
"ECDH-ECDSA-AES256-GCM-SHA384","ECDH-RSA-AES256-GCM-SHA384","ECDH-ECDSA-AES256-SHA384",
"ECDH-RSA-AES256-SHA384","DHE-DSS-AES256-GCM-SHA384","DHE-DSS-AES256-SHA256",
"AES256-GCM-SHA384","AES256-SHA256","ECDHE-ECDSA-AES128-GCM-SHA256",
"ECDHE-RSA-AES128-GCM-SHA256","ECDHE-ECDSA-AES128-SHA256","ECDHE-RSA-AES128-SHA256",
"ECDH-ECDSA-AES128-GCM-SHA256","ECDH-RSA-AES128-GCM-SHA256","ECDH-ECDSA-AES128-SHA256",
"ECDH-RSA-AES128-SHA256","DHE-DSS-AES128-GCM-SHA256","DHE-DSS-AES128-SHA256",
"AES128-GCM-SHA256","AES128-SHA256","ECDHE-ECDSA-AES256-SHA",
"ECDHE-RSA-AES256-SHA","DHE-DSS-AES256-SHA","ECDH-ECDSA-AES256-SHA",
"ECDH-RSA-AES256-SHA","AES256-SHA","ECDHE-ECDSA-AES128-SHA",
"ECDHE-RSA-AES128-SHA","DHE-DSS-AES128-SHA","ECDH-ECDSA-AES128-SHA",
"ECDH-RSA-AES128-SHA","AES128-SHA"]},
{fail_if_no_peer_cert,false}]}
]}
].

Related

IBM MQIPT SSL Handshake issue

We are connecting a Java MQ client to a customer's IBM MQ server. To do that, we have one MQIPT instance on the cloud premises and one MQIPT instance on the non-cloud premises. With SSL disabled on the non-cloud side, we are able to connect. However, once SSL is enabled on the non-cloud side, we face an SSL handshake issue. Certificates have been shared between us.
We don't have access to the non-cloud environment.
We connect to MQIPT from the Java client, and below is what we get in the MQIPT trace.
When we do not set a cipher in the MQ Java client, we get the error below.
In that case MQIPT is enabled for all ciphers.
Issuer: 'CN=********* TEST CA ****,OU=*****,O=******** AG,C=******'
12:45:13.799 27 1414-2s Processing keyType: RSA
12:45:13.800 27 1414-2s No RSA certificates in keyring
12:45:13.800 27 1414-2s Processing keyType: DSA
12:45:13.800 27 1414-2s No DSA certificates in keyring
12:45:13.800 27 1414-2s Processing keyType: EC
12:45:13.800 27 1414-2s No EC certificates in keyring
12:45:13.800 27 1414-2s WARNING: No suitable certificate to send to the remote server
12:45:13.800 27 1414-2s --------} IPTX509KeyManager.chooseClientAlias() rc=0
12:45:14.184 27 1414-2s SSLHandshakeException handshaking:com.ibm.jsse2.k.a(k.java:7)
But when we set CipherSuite in the MQ Java client, we get the following error in the MQIPT logs:
MQCPI014 Protocol eyecatcher (16030300) not recognized
MQIPT version: IBM MQ Internet Pass-Thru V9.2.0.1
The MQIPT configuration is as below:
[global]
CommandPort=1884
RemoteShutDown=true
MinConnectionThreads=5
MaxConnectionThreads=100
IdleTimeout=20
ClientAccess=true
QMgrAccess=true
HTTP=true
HTTPChunking=false
Trace=5
ConnectionLog=true
MaxLogFileSize=50
[route]
Name=Route_1
Active=true
ListenerPort=1414
Destination=mq-dmz-************
DestinationPort=********
HTTP=true
HTTPS=true
SSLClient=true
SSLClientProtocols=TLSv1.2
SSLClientKeyRing="path of key ring PFX file"
SSLClientKeyRingPW="path of password file"
HTTPServer=<Http Server name>
HTTPServerPort=443
URIName=<URI name>
SSLClientCAKeyRing="same as SSLClientKeyRing"
SSLClientCAKeyRingPW="same as SSLClientKeyRingPW"
SSLClientCipherSuites=SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384
Setup for accepting the connection from the MQ client, decrypting, and then re-encrypting and sending on to the next hop should look something like the following:
[route]
Name=Route_1
Active=true
ListenerPort=1414
Destination=mq-dmz-************
DestinationPort=********
HTTP=true
HTTPS=true
SSLClient=true
SSLClientProtocols=TLSv1.2
SSLClientKeyRing="path of key ring PFX file"
SSLClientKeyRingPW="path of password file"
HTTPServer=<Http Server name>
HTTPServerPort=443
URIName=<URI name>
SSLClientCAKeyRing="same as SSLClientKeyRing"
SSLClientCAKeyRingPW="same as SSLClientKeyRingPW"
SSLClientCipherSuites=SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384
SSLServer=true
SSLServerProtocols=TLSv1.2
SSLServerKeyRing="path of key ring PFX file"
SSLServerKeyRingPW="path of password file"
SSLServerCAKeyRing="same as SSLServerKeyRing"
SSLServerCAKeyRingPW="same as SSLServerKeyRingPW"
SSLServerCipherSuites=SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384
What you are missing is that the route is configured from the standpoint of the TLS session. On each leg of the route you are either:
TLS Server (you are receiving the inbound connection and decrypting it)
TLS Client (you are connecting out to another queue manager or MQIPT and encrypting)
To accept a TLS connection from your MQ client application you need to configure the SSLServer* equivalents to the already configured SSLClient* settings.
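The eyecatcher value in the MQCPI014 message from the question, 16030300, is consistent with this: read as raw bytes, it looks like the start of a TLS record rather than an MQ flow. Here is a small Python sketch that decodes such a prefix. The lookup tables cover only the common TLS content types and record versions, and the function name is made up.

```python
# Common TLS record content types and record-layer versions.
CONTENT_TYPES = {20: "change_cipher_spec", 21: "alert", 22: "handshake", 23: "application_data"}
VERSIONS = {(3, 0): "SSL 3.0", (3, 1): "TLS 1.0", (3, 2): "TLS 1.1", (3, 3): "TLS 1.2"}


def describe_record_prefix(hex_bytes):
    """Decode the first three bytes of a TLS record header given as a hex string."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) < 3:
        raise ValueError("need at least three bytes: content type plus version")
    content_type = CONTENT_TYPES.get(raw[0], "unknown")
    # TLS 1.3 also uses 0x0303 as its record-layer version.
    version = VERSIONS.get((raw[1], raw[2]), "unknown")
    return content_type, version


print(describe_record_prefix("16030300"))  # ('handshake', 'TLS 1.2')
```

In other words, the listener on port 1414 appears to have received a TLS handshake while expecting an unencrypted MQ flow, which is what you would see with the client set to a CipherSuite and no SSLServer* settings on the route.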

KAFKA and SSL : java.lang.OutOfMemoryError: Java heap space when using kafka-topics command on KAFKA SSL cluster

This is my first post on Stack Overflow; I hope I didn't choose the wrong section.
Context:
The Kafka heap size is configured in the following file:
/etc/systemd/system/kafka.service
With the following parameter:
Environment="KAFKA_HEAP_OPTS=-Xms6g -Xmx6g"
OS is "CentOS Linux release 7.7.1908".
Kafka is "confluent-kafka-2.12-5.3.1-1.noarch", installed from the following repository :
# Confluent REPO
[Confluent.dist]
name=Confluent repository (dist)
baseurl=http://packages.confluent.io/rpm/5.3/7
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/5.3/archive.key
enabled=1
[Confluent]
name=Confluent repository
baseurl=http://packages.confluent.io/rpm/5.3
gpgcheck=1
gpgkey=http://packages.confluent.io/rpm/5.3/archive.key
enabled=1
I activated SSL on a 3-machine Kafka cluster a few days ago, and suddenly the following command stopped working:
kafka-topics --bootstrap-server <the.fqdn.of.server>:9093 --describe --topic <TOPIC-NAME>
It returns the following error:
[2019-10-03 11:38:52,790] ERROR Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1':(org.apache.kafka.common.utils.KafkaThread)
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:112)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:651)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:572)
at org.apache.kafka.common.network.Selector.poll(Selector.java:483)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:539)
at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1152)
at java.lang.Thread.run(Thread.java:748)
In the server's log, the following line appears when I try to query it via "kafka-topics":
/var/log/kafka/server.log :
[2019-10-03 11:41:11,913] INFO [SocketServer brokerId=<ID>] Failed authentication with /<ip.of.the.server> (SSL handshake failed) (org.apache.kafka.common.network.Selector)
I was able to use this command properly BEFORE implementing SSL on the cluster. Here is the configuration I'm using.
All functionality works properly (consumers, producers...) except "kafka-topics":
# SSL Configuration
ssl.truststore.location=<truststore-path>
ssl.truststore.password=<truststore-password>
ssl.keystore.type=<keystore-type>
ssl.keystore.location=<keystore-path>
ssl.keystore.password=<keystore-password>
# Enable SSL between brokers
security.inter.broker.protocol=SSL
# Listeners
listeners=SSL://<fqdn.of.the.server>:9093
advertised.listeners=SSL://<fqdn.of.the.server>:9093
There is no problem with the certificate (it is signed by an internal CA, which I added to the truststore specified in the configuration). OpenSSL shows no errors:
openssl s_client -connect <fqdn.of.the.server>:9093 -tls1
>> Verify return code: 0 (ok)
The following command works well with SSL, thanks to the parameter "-consumer.config client-ssl.properties":
kafka-console-consumer --bootstrap-server <fqdn.of.the.server>:9093 --topic <TOPIC-NAME> -consumer.config client-ssl.properties
"client-ssl.properties" content is :
security.protocol=SSL
ssl.truststore.location=<truststore-path>
ssl.truststore.password=<truststore-password>
Right now, I'm forced to use "--zookeeper", which, according to the documentation, is deprecated:
--zookeeper <String: hosts> DEPRECATED, The connection string for
the zookeeper connection in the form
host:port. Multiple hosts can be
given to allow fail-over.
And of course, it works well:
kafka-topics --zookeeper <fqdn.of.the.server>:2181 --describe --topic <TOPIC-NAME>
Topic:<TOPIC-NAME> PartitionCount:3 ReplicationFactor:2
Configs:
Topic: <TOPIC-NAME> Partition: 0 Leader: <ID-3> Replicas: <ID-3>,<ID-1> Isr: <ID-1>,<ID-3>
Topic: <TOPIC-NAME> Partition: 1 Leader: <ID-1> Replicas: <ID-1>,<ID-2> Isr: <ID-2>,<ID-1>
Topic: <TOPIC-NAME> Partition: 2 Leader: <ID-2> Replicas: <ID-2>,<ID-3> Isr: <ID-2>,<ID-3>
So, my question is: why am I unable to use "--bootstrap-server" at the moment? Because of the "--zookeeper" deprecation, I'm worried about no longer being able to inspect my topics and their details.
I believe that kafka-topics needs an option equivalent to kafka-console-consumer's "-consumer.config".
Ask if you need any additional details.
Thanks a lot; I hope my question is clear and readable.
Blyyyn
I finally found a way to deal with this SSL error. The key is to use the following setting :
--command-config client-ssl.properties
This works with most Kafka commands, such as kafka-consumer-groups and, of course, kafka-topics. See the examples below:
kafka-consumer-groups --bootstrap-server <kafka-hostname>:<kafka-port> --group <consumer-group> --topic <topic> --reset-offsets --to-offset <offset> --execute --command-config <ssl-config>
kafka-topics --list --bootstrap-server <kafka-hostname>:<kafka-port> --command-config client-ssl.properties
<ssl-config> was "client-ssl.properties"; see the initial post for its content.
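As for why a missing SSL config shows up as an OutOfMemoryError: the stack trace in the question ends in NetworkReceive.readFrom calling ByteBuffer.allocate, and Kafka's wire protocol prefixes every message with a 4-byte size. A likely explanation is that the client, speaking plaintext to the SSL listener, read the first bytes of the broker's TLS reply as that size. The Python sketch below shows the arithmetic, assuming the reply started with a TLS 1.2 alert record header (0x15 0x03 0x03 followed by a zero length byte). Those exact bytes are an assumption, not something taken from the logs above.

```python
import struct


def implied_receive_size(first_four_bytes):
    """Interpret four bytes as a big-endian signed 32-bit size prefix."""
    (size,) = struct.unpack(">i", first_four_bytes)
    return size


# Assumed start of a TLS alert record: content type 21, version 3.3, length high byte 0.
tls_alert_prefix = bytes([0x15, 0x03, 0x03, 0x00])
size = implied_receive_size(tls_alert_prefix)
print(size, "bytes, about", size // (1024 * 1024), "MiB")  # 352518912 bytes, about 336 MiB
```

A single allocation of that size can easily exceed the heap of a command-line tool, which would explain why the failure looks like a memory problem rather than an SSL one.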
Beware: if you use an IP address for <kafka-hostname>, you'll get an error if the machine's certificate doesn't have a subject alternative name with that IP address. Try to have correct DNS resolution and use the FQDN if possible.
Hope this solution will help, cheers!
Blyyyn
Stop your brokers and run the command below (assuming you have more than 1.5 GB of RAM on your server):
export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
Then start your brokers on all 3 nodes and try again.
Note that for consumer and producer clients you need to prefix security.protocol accordingly inside your client-ssl.properties.
For Kafka Consumers:
consumer.security.protocol=SASL_SSL
For Kafka Producers:
producer.security.protocol=SASL_SSL

Kafka inter broker SSL handshake failed

I am trying to set up inter-broker SSL (not client) authentication and keep seeing the following errors:
[2019-05-17 06:33:47,151] INFO [Controller id=1004, targetBrokerId=1004] Failed authentication with /$IP (SSL handshake failed) (org.apache.kafka.common.network.Selector)
[2019-05-17 06:33:47,151] INFO [SocketServer brokerId=1004] Failed authentication with /$IP (SSL handshake failed) (org.apache.kafka.common.network.Selector)
[2019-05-17 06:33:47,151] ERROR [Controller id=1004, targetBrokerId=1004] Connection to node 1004 (/$IP:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
My server.properties is:
listeners=PLAINTEXT://$IP:9092,SSL://$IP:9093
security.inter.broker.protocol=SSL
ssl.truststore.password=$PASS
ssl.keystore.password=$PASS
ssl.key.password=$PASS
ssl.endpoint.identification.algorithm=""
ssl.keystore.location=/etc/kafka/kafka.server.keystore.jks
ssl.truststore.location=/etc/kafka/kafka.server.truststore.jks
When I run `openssl s_client -debug -connect $IP:9093 -tls1` I get back a list of certificates and `Secure Renegotiation IS supported`
Despite adding `-Djavax.net.debug=all` there's not anything in the logs which points to the problem.
Kafka version 2.2
Any ideas?
I had incorrectly set ssl.endpoint.identification.algorithm="" instead of ssl.endpoint.identification.algorithm= (with nothing after the equals sign); correcting that fixed it.
The default for this setting changed to https in Kafka 2.0, so it has to be set explicitly to an empty value to disable hostname verification.
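The reason the quoted form fails is that a .properties file does not treat quotes specially, so the value becomes the two-character string "" rather than an empty string. The Python sketch below illustrates this with a deliberately simplified line parser. It is not Java's full Properties parsing, and the function name is hypothetical.

```python
def parse_property_line(line):
    """Split 'key=value' the way a simple .properties reader would.
    Quotes are NOT special: they stay part of the value."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


quoted = parse_property_line('ssl.endpoint.identification.algorithm=""')
empty = parse_property_line("ssl.endpoint.identification.algorithm=")
print(repr(quoted[1]), len(quoted[1]))  # '""' 2
print(repr(empty[1]), len(empty[1]))    # '' 0
```

Only the second form gives the broker an empty algorithm name.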

RabbitMQ Server TLS, client alert: Fatal - Certificate Unknown when starting service

I have RabbitMQ configured to enable TLS with certificates. The key, certificate, and CA are defined in the .conf file. Upon service startup, an error is thrown. I cannot find the cause, and the logging gives no more information even at the debug level.
I get a client alert failure and am not certain of the cause.
2019-03-22 10:04:18.690 [info] <0.7.0> Server startup complete; 4 plugins started.
* rabbitmq_amqp1_0
* rabbitmq_management
* rabbitmq_management_agent
* rabbitmq_web_dispatch
2019-03-22 10:04:24.831 [debug] <0.689.0> Supervisor {<0.689.0>,rabbit_connection_sup} started rabbit_connection_helper_sup:start_link() at pid <0.690.0>
2019-03-22 10:04:24.831 [debug] <0.689.0> Supervisor {<0.689.0>,rabbit_connection_sup} started rabbit_reader:start_link(<0.690.0>, {acceptor,{0,0,0,0},5671}) at pid <0.691.0>
2019-03-22 10:04:24.909 [info] <0.688.0> TLS server: In state certify received CLIENT ALERT: Fatal - Certificate Unknown
Our certificates didn't have the correct X509v3 Extended Key Usage.
For x509 authentication, you'll need to assign the TLS Web Client Authentication usage when creating the certificate:
X509v3 Extended Key Usage:
TLS Web Client Authentication
This won't fix the issue if your certificate CA is broken and can't be verified, but for my issue, this was the resolution.

Cassandra nodes cannot see each other when internode encryption is enabled

I had set up a 6-node Cassandra cluster spanning two AWS regions / datacenters (3 in each) and everything was working fine. After getting that much working I attempted to enable internode encryption which I cannot get to work properly, despite reading innumerable documents on the subject and fiddling endlessly.
I don't see any errors or anything out of the ordinary in the logs. I do see the following line in the logs which indicates it has started the encrypted messaging service, as expected:
MessagingService.java:482 - Starting Encrypted Messaging Service on SSL port 7001
I have enabled verbose logging for SSL in cassandra-env.sh, however this does not produce any errors or additional information about SSL internode connections that I can see (update below):
JVM_OPTS="$JVM_OPTS -Djavax.net.debug=ssl"
I can connect from one node to all the others on the encrypted messaging port 7001 using nc, so there's no firewall issue.
ubuntu@ip-5-6-7-8:~$ nc -v 1.2.3.4 7001
Connection to 1.2.3.4 7001 port [tcp/afs3-callback] succeeded!
I can connect to each node locally using cqlsh (I haven't enabled client-server encryption) and can query the system keyspace, etc.
However, if I run nodetool status I see that the nodes cannot see each other. Only the node that I'm querying the cluster on is present in the list. This was not the case before internode encryption was enabled, they could all see each other just fine then.
ubuntu@ip-5-6-7-8:~$ nodetool status
Datacenter: us-east_A
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 1.2.3.4 144.75 KB 256 ? 992ae1bc-77e4-4ab1-a18f-4db62bb0ce6f 1b
My process was this:
Created a certificate authority for my cluster
Created a keystore and truststore for each node and added my CA certificate chain to both
Generated a key pair and CSR for each node, signed it with my CA, and added the resulting certificate to each node's keystore
Updated each node's configuration as shown below
Restarted all nodes
The server encryption configuration I'm using is this, with the appropriate values in the $variables.
server_encryption_options:
    internode_encryption: all
    keystore: $keystore_path
    keystore_password: $keystore_passwd
    truststore: $truststore_path
    truststore_password: $truststore_passwd
    require_client_auth: true
    protocol: TLS
    algorithm: SunX509
    store_type: JKS
    cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
If anybody could offer some insight or a direction to look in it would be greatly appreciated.
Update: Cipher Suite Agreement
Apparently SSL debug logging prints to stdout, which is not logged to Cassandra's logfiles, so I didn't see that output before. Running Cassandra in the foreground I can see a ton of SSL errors tracing out, all of which complain of handshake failure, because:
javax.net.ssl.SSLHandshakeException: no cipher suites in common
In an attempt to solve this problem I have switched to the Oracle JRE (I was being lazy and using OpenJDK before) and installed the JCE unlimited strength cryptography policy files to ensure all possible ciphers would be supported.
It didn't fix anything.
This is especially confusing given that all these nodes are exactly identical: hardware, OS vendor and version, Java vendor and version, Cassandra version, and configuration file. I cannot imagine why they cannot agree on a cipher suite under these circumstances.
The following is the full error that is traced:
*** ClientHello, TLSv1.2
RandomCookie: GMT: 1449074039 bytes = { 205, 93, 27, 38, 184, 219, 250, 8, 232, 46, 117, 84, 69, 53, 225, 16, 27, 31, 3, 7, 203, 16, 133, 156, 137, 231, 238, 39 }
Session ID: {}
Cipher Suites: [TLS_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
Compression Methods: { 0 }
***
%% Initialized: [Session-3, SSL_NULL_WITH_NULL_NULL]
%% Invalidated: [Session-3, SSL_NULL_WITH_NULL_NULL]
ACCEPT-/1.2.3.4, SEND TLSv1.2 ALERT: fatal, description = handshake_failure
ACCEPT-/1.2.3.4, WRITE: TLSv1.2 Alert, length = 2
ACCEPT-/1.2.3.4, called closeSocket()
ACCEPT-/1.2.3.4, handling exception: javax.net.ssl.SSLHandshakeException: no cipher suites in common
ACCEPT-/1.2.3.4, called close()
ACCEPT-/1.2.3.4, called closeInternal(true)
INFO 16:33:59 Waiting for gossip to settle before accepting client requests...
Allow unsafe renegotiation: false
Allow legacy hello messages: true
Is initial handshake: true
Is secure renegotiation: false
ACCEPT-/1.2.3.4, setSoTimeout(10000) called
ACCEPT-/1.2.3.4, READ: SSL v2, contentType = Handshake, translated length = 57
After a great deal more poking and prodding I've finally managed to get this to work. The problem was related to certificates and the keystore.
As a result of these problems the SSL handshake would fail either due to certificate chain problems or cipher suite agreement problems. Cassandra rather unhelpfully discards errors related to SSL and logs nothing.
In any case, I managed to get things working by doing the following:
Ensure that the CA generates node certificates with both client and server key usage attributes. Failing to include one or the other will prevent nodes from authenticating to each other properly. This presents itself as the cipher suite agreement error. If you're using OpenSSL to manage your CA, I've included the -extensions configuration I used below.
Ensure that both the root and any intermediate CA certificates you are using (if you're using an intermediary CA) are imported into both the keystore and truststore.
Ensure that the node certificate imported into the keystore includes the full trust chain from the primary certificate down to the CA root, including any intermediaries, even though you have already imported these CA certificates separately into the keystore. Failing to do this presents itself as invalid certificate chain errors.
OpenSSL CA Config
Here's my extensions section for dual-role client/server certificates. You can include this in your OpenSSL config file and reference it when signing by specifying -extensions dual_cert.
[ dual_cert ]
# Extensions for dual-role user/server certificates (`man x509v3_config`).
basicConstraints = CA:FALSE
nsCertType = client, server
nsComment = "Client/Server Dual-role Certificate"
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer:always
keyUsage = critical, nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth, serverAuth
Creating a PEM containing the full trust chain
To create a single PEM file which contains the full trust chain for your node certificate, simply cat all the certificate files in reverse order from the node certificate down to the CA root.
cat node1.crt ca-intermediate.crt ca-root.crt > node1-full-chain.crt
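If you are generating chains for many nodes, the same concatenation can be scripted. This is a minimal Python sketch of the cat command above, assuming each input file holds one PEM certificate. The function name is hypothetical, and the file names in the usage note are the ones from the example.

```python
from pathlib import Path


def write_full_chain(cert_paths, output_path):
    """Concatenate PEM certificates in the given order (node certificate
    first, CA root last) and return how many files were written."""
    blocks = []
    for path in cert_paths:
        text = Path(path).read_text()
        if "-----BEGIN CERTIFICATE-----" not in text:
            raise ValueError("{} does not look like a PEM certificate".format(path))
        # Normalise so every certificate ends with exactly one newline.
        blocks.append(text.strip() + "\n")
    Path(output_path).write_text("".join(blocks))
    return len(blocks)
```

For the example above you would call write_full_chain(["node1.crt", "ca-intermediate.crt", "ca-root.crt"], "node1-full-chain.crt"). The order of the list is what matters, so keep the node certificate first.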