Netty UDT examples on amazon ec2 - udp

I am not a networking guru so I am probably missing something simple.
I have built both the message echo client and msg echo server as runnable jar files using the netty 4.0.11 http://dl.bintray.com/netty/downloads/netty-4.0.11.Final.tar.bz2
I was able to load all of the correct dependencies using maven and the project builds and runs correctly locally and and on the server. I can run a server on my local host and connect to it from a client on the same (localhost , both on my local machine and on my amazon ec2 instance. Again it works connecting to itself (localhost) on both my machine, and my server computer.
THe problem is that I cannot connect to the server from outside the machine it is running, for example, I want to connect to my echo msg server (running on my ec2 instance) from the echo msg client running on my local computer.
I have setup the amazon security settings to allow UDP from the correct port and from the ip of my local machine. I run the Echo server on the ec2 instance and it correctly starts up:
REGISTERED
ACTIVE
DATAGRAM LISTENING bind=/0.0.0.0:1234 peer=null:0
But I just cannot connect from outside the local host, here is the error message I am getting from the client machine,
I have also turned off all firewalls (temporarily) on my local machine. Still, when i try to connect I get:
CONNECT(/70.36.197.242:1234, null)
10:11:17.000 [connect-0] DEBUG com.barchart.udt.EpollUDT - ep 1 rem [id: 0x3287e50e] DATAGRAM CONNECTING bind=/0.0.0.0:55005 peer=null:0
10:11:17.000 [connect-0] DEBUG com.barchart.udt.EpollUDT - ep 1 add [id: 0x3287e50e] DATAGRAM CONNECTING bind=/0.0.0.0:55005 peer=null:0 ERROR_WRITE
10/24/13 10:11:19 AM ===========================================================
udt.echo.message.MsgEchoClientHandler:
rate:
count = 0
mean rate = 0.00 bytes/s
1-minute rate = 0.00 bytes/s
5-minute rate = 0.00 bytes/s
15-minute rate = 0.00 bytes/s
10:11:20.017 [connect-0] WARN com.barchart.udt.nio.SelectionKeyUDT - logic error :
[id: 0x3287e50e] poll=ERROR_WRITE ready=---- inter=-C-- DATAGRAM CONNECTOR CONNECTING bind=/0.0.0.0:55005 peer=null:0
java.lang.Exception: Unexpected error report.
at com.barchart.udt.nio.SelectionKeyUDT.logError(SelectionKeyUDT.java:436) [barchart-udt-bundle-2.3.0.jar:na]
at com.barchart.udt.nio.SelectionKeyUDT.doRead(SelectionKeyUDT.java:205) [barchart-udt-bundle-2.3.0.jar:na]
at com.barchart.udt.nio.SelectorUDT.doResultsRead(SelectorUDT.java:334) [barchart-udt-bundle-2.3.0.jar:na]
at com.barchart.udt.nio.SelectorUDT.doResults(SelectorUDT.java:309) [barchart-udt-bundle-2.3.0.jar:na]
at com.barchart.udt.nio.SelectorUDT.doEpollExclusive(SelectorUDT.java:234) [barchart-udt-bundle-2.3.0.jar:na]
at com.barchart.udt.nio.SelectorUDT.doEpollEnter(SelectorUDT.java:196) [barchart-udt-bundle-2.3.0.jar:na]
at com.barchart.udt.nio.SelectorUDT.select(SelectorUDT.java:455) [barchart-udt-bundle-2.3.0.jar:na]
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:596) [netty-all-4.0.11.Final.jar:na]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:306) [netty-all-4.0.11.Final.jar:na]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101) [netty-all-4.0.11.Final.jar:na]
at java.lang.Thread.run(Unknown Source) [na:1.7.0_06]
10:11:20.018 [connect-0] DEBUG com.barchart.udt.EpollUDT - ep 1 rem [id: 0x3287e50e] DATAGRAM CONNECTING bind=/0.0.0.0:55005 peer=null:0
10:11:20.018 [connect-0] ERROR c.barchart.udt.nio.SocketChannelUDT - connect failure : [id: 0x3287e50e] DATAGRAM CONNECTING bind=/0.0.0.0:55005 peer=null:0
10:11:20.018 [connect-0] INFO i.n.handler.logging.LoggingHandler - [id: 0x53f0f817] CLOSE()
Exception in thread "main" java.io.IOException
at com.barchart.udt.nio.SocketChannelUDT.finishConnect(SocketChannelUDT.java:236)
at io.netty.channel.udt.nio.NioUdtMessageConnectorChannel.doFinishConnect(NioUdtMessageConnectorChannel.java:132)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:228)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:502)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:417)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
at java.lang.Thread.run(Unknown Source)

Nevermind, after doing some more trouble shooting I found the solution. I had to in addition to allowing the port in the amazon security settings, also allow the UDP port on the server firewall.

Related

Mulesoft - SFTP Connection reset

Trying to connect to an Azure SFTP results in a "connection reset" - same when using the "list" operation in a mule application as well as simply using the "test connection" button in the connector.
Credentials are fine and server is perfectly accessible with different FTP Clients.
Maybe you have an idea or can make more then I from the DEBUG log:
DEBUG org.mule.extension.sftp.internal.connection.SftpConnectionProvider: Connecting to host: 'xyz.blob.core.windows.net' at port: '22'
DEBUG com.jcraft.jsch: Connecting to xyz.blob.core.windows.net port 22
DEBUG com.jcraft.jsch: Connection established
DEBUG com.jcraft.jsch: Remote version string: SSH-2.0-AzureSSH_1.0.0
DEBUG com.jcraft.jsch: Local version string: SSH-2.0-JSCH-0.1.54
DEBUG com.jcraft.jsch: CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256
DEBUG com.jcraft.jsch: CheckKexes: diffie-hellman-group14-sha1,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521
DEBUG com.jcraft.jsch: CheckSignatures: ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521
DEBUG com.jcraft.jsch: SSH_MSG_KEXINIT sent
DEBUG com.jcraft.jsch: Disconnecting from xyz.blob.core.windows.net port 22
ERROR org.mule.extension.sftp.internal.connection.SftpConnectionProvider: Session.connect: java.net.SocketException: Connection reset
com.jcraft.jsch.JSchException: Session.connect: java.net.SocketException: Connection reset
The issue can happen if the preferred authentication method is missing in the SFTP configuration with the identity file specified, and the target SFTP server is only enabled for SSH key based authentication
<sftp:connection host="${sftp.host}" port="${sftp.port}" username="${sftp.username}" passphrase="${sftp.passphrase}" preferredAuthenticationMethods="#[['PUBLIC_KEY']]" identityFile="${sftp.identityfile}" connectionTimeout="${sftp.connectionTimeout}" responseTimeout="${ftp.responseTimeout}">
</sftp:connection>
There is a known issue with connection to Azure SFTP using JSCH. See this post.

Dcm4chee Connection to ldap://ldap:389 broken - reconnect error

I'm using dcm4chee docker stack with ldap and postgreSQL and have a floating error:
ldap:389; socket closed; remaining name 'cn=Devices,cn=DICOM Configuration,dc=mdw,dc=io'
2018-07-13 06:30:42,089 INFO [org.dcm4che3.conf.ldap.ReconnectDirContext] (Thread-0 (ActiveMQ-client-global-threads)) Connection to ldap://ldap:389 broken - reconnect
All three services are running on the same host. What can I do to avoid that error?
Disable the firewall. If firewall was enabled add an exception for all 3 services.

WMQ(IBM Queue) Connection timeout

I'm able connect IBM Queue directly but when u tried to connect from mule getting the below error and not able to deploy. I'm getting the below error
ERROR 2017-04-25 06:45:13,582
[main]org.mule.retry.notifiers.ConnectNotifier: Failed to connect/reconnect:
WebSphereMQConnector
{
name=WMQ2
lifecycle=initialise
this=5e7abaf7
numberOfConcurrentTransactedReceivers=4
createMultipleTransactedReceivers=true
connected=false
supportedProtocols=[wmq]
serviceOverrides=<none>
}
. Root Exception was: Connection timed out: connect. Type: class java.net.ConnectException
ERROR 2017-04-25 06:50:23,943 [main] org.mule.module.launcher.application.DefaultMuleApplication:
************************************************
Message : JMSWMQ0018: Failed to connect to queue manager 'RQACBRKB' with connection mode 'Client' and host name '172.11.11.11(6912)'.
JMS Code : JMSWMQ0018
Element : /WMQ2 # app:config.xml:14 (WMQ)
--------------------------------------------------------------------------------
Root Exception stack trace:
java.net.ConnectException: Connection timed out: connect
at java.net.TwoStacksPlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
com.ibm.mq.jmqi.JmqiException: CC=2;RC=2538;AMQ9213: A communications error for occurred [1=java.net.ConnectException[Connection timed out: connect],3=rbitbrka.apl.com] at com.ibm.mq.jmqi.remote.impl.RemoteTCPConnection.connnectUsingLocalAddress(RemoteTCPConnection.java:810) ~[?:?]
PFB connector details:
<wmq:connector name="WMQ5" hostName="${mq.host}" port="${mq.port}" queueManager="${mq.queue.manager}" channel="CLIENTS.SALES.CRM" username="${mq.user}" password="${mq.password}" transportType="CLIENT_MQ_TCPIP" specification="1.1" targetClient="JMS_COMPLIANT" validateConnections="false" doc:name="WMQ" maxRedelivery="-1">
<reconnect frequency="${mq.reconnection.period.ms}" count="${mq.reconnection.attempt}"/>
</wmq:connector>
When i telnet the ip and port getting below error:
C:\Users\111>telnet 172.11.11.11 6912
Connecting To 172.11.11.11...Could not open connection to the host, on port 6912: Connect failed
But when i ping getting responce
C:\Users\111>ping 172.11.11.11
The pertinent pieces of information from your provided error are:-
JMSWMQ0018: Failed to connect to queue manager 'RQACBRKB'
with connection mode 'Client' and host name '172.11.11.11(6912)'.
com.ibm.mq.jmqi.JmqiException: CC=2;RC=2538;
MQRC 2538 is MQRC_HOST_NOT_AVAILABLE which is explained in Knowledge Center. In there it mentions the most common reasons for this error:-
The listener has not been started on the remote system. (please check that your listener is running on port 6912 on the machine at IP address 172.11.11.11)
The connection name in the client channel definition is incorrect. (the connection name your client is using is '172.11.11.11(6912)' - is this correct?)
The network is currently unavailable.
A firewall blocking the port, or protocol-specific traffic.
The security call initializing the IBM MQ client is blocked by a security exit on the SVRCONN channel at the server.
"java.net.ConnectException: Connection timed out: connect" usually occurs when you have a config issue or you are unable to connect to the remote server. As mentioned above do you have an error on the MQ end and if not have you checked the connection properties within config. If these are correct, are you able to reach MQ from another client, such as SOAPUI?
Can you also post the connector and flow details?

ActiveMQ Master/Slave on Weblogic - vm transport issue

I am trying to configure ActiveMQ master/slave setup on a single WebLogic machine. The problem is when I start Managed Server1 it successfully connects to vm transport and everything works perfectly, but when I start Managed Server2 I am receiving the following errors in broker logs
INFO 2016-September-27 10:08:00,227 ActiveMQEndpointWorker:124 - Connection attempt already in progress, ignoring connection exception
INFO 2016-September-27 10:08:01,161 TransportConnector:260 - Connector vm://localhost started
INFO 2016-September-27 10:08:30,228 TransportConnector:291 - Connector vm://localhost stopped
INFO 2016-September-27 10:08:30,229 TransportConnector:260 - Connector vm://localhost started
WARN 2016-September-27 10:08:30,228 ActiveMQManagedConnection:385 - Connection failed: javax.jms.JMSException: peer (vm://localhost#61) stopped.
WARN 2016-September-27 10:08:30,231 TransportConnection:823 - Failed to add Connection ID:ndl-wls-300.mydomain.com-52251-1474966937425-65:1 due to java.lang.NullPointerException
ERROR 2016-September-27 10:08:30,233 ActiveMQEndpointWorker:183 - Failed to connect to broker [vm://localhost?create=false]: java.lang.NullPointerException
javax.jms.JMSException: java.lang.NullPointerException
Please help, I am stuck with this.
I still don't see the reason for the slave within the same VM. I suggest you reach out to an ActiveMQ expert consultant to validate your architecture.
However, I think I can help you move a little bit closer to this issue:
There is a fundamental miss understanding here.. the vm url is broken down like this:
vm://${brokerName}?option=value,etc
The first time you create vm://localhost?create=true.. you have created a broker
The second time you reference vm://localhost?create=false.. you have created a client connection to the first broker.
To get two brokers, you'd need two different vm://${brokerName}?create=true

GridGain network connection: Is it possible to forward a node via SSH?

I would like to SSH into a remote machine running a gridgain instance and connect to it from a local gridgain instance. Can this be done?
How is the gridgain network connection being done? As far as I could sse the node spins up and listens on the first available port on 47100-47200. But it opens some more ports too.
It seems not be sufficient to just e.g. forward 47100 on the remote machine (the remote machines gridgain port) to local 47100. Probably the communication is not just client server but symmetrical with the remote node trying to connect to my home node?
Is there documentation on the network protocol?
I tried a symetrically forwarding the
GridTcpCommunicationSpi.DFLT_PORTs (47100+) and
GridTcpDiscoverySpi.DFLT_PORTs (47500+)
ports.
The nodes are able to connect. On the local node I first get this warning:
WARN GridTcpCommunicationSpi - Connect timed out (consider increasing 'connTimeout' configuration property) [addr=/10.240.136.167:47100]
WARN GridTcpDiscoverySpi - Timed out waiting for message delivery receipt (most probably, the reason is in long GC pauses on remote node; consider tuning GC and increasing 'ackTimeout' configuration property). Will retry to send message with increased timeout. Current timeout: 5000.
WARN GridDhtPreloader - <gg-utility-sys-cache> Failed to wait for initial partition map exchange. Possible reasons are:
^-- Transactions in deadlock.
^-- Long running transactions (ignore if this is the case).
^-- Unreleased explicit locks.
WARN GridTcpDiscoverySpi - Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node. Current timeout: 5000.
This is a timeout when somehow trying to connect to connect to 10.240.136.167:47100 - which is the remote machines local IP, which is obviously impossible.
But it looks nice as I get the following:
INFO GridDiscoveryManager - Topology snapshot [ver=2, nodes=2, CPUs=6, heap=2.7GB]
On executing the following broadcast test:
grid.compute().broadcast(new GridRunnable() {
#Override
public void run() {
System.out.println("hello!");
}
});
I get this fatal error on the remote machine, whatever it may be:
[SEVERE][gridgain-#9%pub-null%][GridJobProcessor] Task was not deployed or was redeployed since task execution [taskName=nix.GoogleGridRun$Test, taskClsName=at$
at org.gridgain.grid.kernal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1732)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:654)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.access$1800(GridIoManager.java:62)
at org.gridgain.grid.kernal.managers.communication.GridIoManager$6.body(GridIoManager.java:615)
at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[19:58:02,237][SEVERE][gridgain-#11%pub-null%][GridJobProcessor] Task was not deployed or was redeployed since task execution [taskName=nix.GoogleGridRun$1, taskClsName=at.a$
For more information see:
Troubleshooting: http://bit.ly/GridGain-Troubleshooting
Documentation Center: http://bit.ly/GridGain-Documentation
class org.gridgain.grid.GridDeploymentException: Task was not deployed or was redeployed since task execution [taskName=nix.GoogleGridRun$1, taskClsName=at.ac.ait.is.infrase$
For more information see:
Troubleshooting: http://bit.ly/GridGain-Troubleshooting
Documentation Center: http://bit.ly/GridGain-Documentation
at org.gridgain.grid.kernal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1107)
at org.gridgain.grid.kernal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1732)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:654)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.access$1800(GridIoManager.java:62)
at org.gridgain.grid.kernal.managers.communication.GridIoManager$6.body(GridIoManager.java:615)
On the client side I don't see anything but:
INFO GridDeploymentLocalStore - Class locally deployed: class nix.GoogleGridRun$1
hello!
When I try to push the broadcast again via the debugger, then I get the following on the local machine and the same error message as before on the remote machine:
ERROR GridTaskWorker - Failed to obtain remote job result policy for result from GridComputeTask.result(..) method (will fail the whole task): GridJobResultImpl [job=o.g.g.kernal.processors.closure.GridClosureProcessor$10#7e89183d, sib=GridJobSiblingImpl [sesId=4c17983b841-43f8b9fa-87ae-4a20-99a1-8d36f5eb74a4, jobId=0d17983b841-ef0084a6-f6a7-4501-87a0-3c5eb7c72bca, nodeId=ef0084a6-f6a7-4501-87a0-3c5eb7c72bca, isJobDone=false], jobCtx=GridJobContextImpl [jobId=0d17983b841-ef0084a6-f6a7-4501-87a0-3c5eb7c72bca, attrs={}], node=GridTcpDiscoveryNode [id=ef0084a6-f6a7-4501-87a0-3c5eb7c72bca, addrs=[10.240.136.167, 127.0.0.1], sockAddrs=[/10.240.136.167:47500, /10.240.136.167:47500, /127.0.0.1:47500], discPort=47500, order=1, loc=false, ver=6.5.0#20140925-sha1:6dc3d773], ex=class o.g.g.GridDeploymentException: Task was not deployed or was redeployed since task execution [taskName=nix.GoogleGridRun$Test, taskClsName=nix.GoogleGridRun$Test, codeVer=0, clsLdrId=eb17983b841-43f8b9fa-87ae-4a20-99a1-8d36f5eb74a4, seqNum=1411761402302, depMode=SHARED, dep=null]
For more information see:
Troubleshooting: http://bit.ly/GridGain-Troubleshooting
Documentation Center: http://bit.ly/GridGain-Documentation
, hasRes=true, isCancelled=false, isOccupied=true]
class org.gridgain.grid.GridException: Remote job threw user exception (override or implement GridComputeTask.result(..) method if you would like to have automatic failover for this exception).
at org.gridgain.grid.compute.GridComputeTaskAdapter.result(GridComputeTaskAdapter.java:109)
at org.gridgain.grid.kernal.processors.task.GridTaskWorker$3.apply(GridTaskWorker.java:819)
at org.gridgain.grid.kernal.processors.task.GridTaskWorker$3.apply(GridTaskWorker.java:812)
at org.gridgain.grid.util.GridUtils.wrapThreadLoader(GridUtils.java:6093)
at org.gridgain.grid.kernal.processors.task.GridTaskWorker.result(GridTaskWorker.java:812)
at org.gridgain.grid.kernal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:708)
at org.gridgain.grid.kernal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:906)
at org.gridgain.grid.kernal.processors.task.GridTaskProcessor$JobMessageListener.onMessage(GridTaskProcessor.java:1138)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:654)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.access$1800(GridIoManager.java:62)
at org.gridgain.grid.kernal.managers.communication.GridIoManager$6.body(GridIoManager.java:615)
at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: class org.gridgain.grid.GridDeploymentException: Task was not deployed or was redeployed since task execution [taskName=nix.GoogleGridRun$Test, taskClsName=nix.GoogleGridRun$Test, codeVer=0, clsLdrId=eb17983b841-43f8b9fa-87ae-4a20-99a1-8d36f5eb74a4, seqNum=1411761402302, depMode=SHARED, dep=null]
For more information see:
Troubleshooting: http://bit.ly/GridGain-Troubleshooting
Documentation Center: http://bit.ly/GridGain-Documentation
at org.gridgain.grid.kernal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1107)
at org.gridgain.grid.kernal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1732)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:654)
at org.gridgain.grid.kernal.managers.communication.GridIoManager.access$1800(GridIoManager.java:62)
at org.gridgain.grid.kernal.managers.communication.GridIoManager$6.body(GridIoManager.java:615)
at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
On the local host side I have connections between the virtual and real ports
tcp6 0 0 127.0.0.1:47100 127.0.0.1:38272 VERBUNDEN 12280/java
tcp6 0 0 127.0.0.1:38272 127.0.0.1:47100 VERBUNDEN 12280/java
And some more to and from the ssh client (also java)
tcp6 45832 0 78.101.12.107:47101 146.148.119.62:51867 VERBUNDEN 12280/java
tcp6 231 0 78.101.12.107:47501 146.148.119.62:46219 CLOSE_WAIT 12280/java
tcp6 48 0 78.101.12.107:37129 146.148.119.62:22 VERBUNDEN 12280/java
tcp6 1 0 78.101.12.107:47501 146.148.119.62:44391 CLOSE_WAIT 12280/java
78.101.12.107 = local ip
146.148.119.62 = remote ip
I looked at netstat on a successful local 2 node grid I see the following connections being made:
tcp6 0 0 ::1:47501 ::1:43143 VERBUNDEN 10218/java
tcp6 0 0 ::1:47500 ::1:34708 VERBUNDEN 9496/java
tcp6 0 0 ::1:34708 ::1:47500 VERBUNDEN 10218/java
tcp6 0 0 ::1:43143 ::1:47501 VERBUNDEN 9496/java
These are between the GridTcpCommunicationSpi.DFLT_PORTs and GridTcpDiscoverySpi.DFLT_PORTs - so these should maybe be enough.
Any Ideas on what could be wrong?
Home node should be available from cluster as well. You have 2 options:
Setup VPN
Implement and configure GridAddressResolver for all nodes which will turn their local addresses to external addresses. This will require to setup port forwarding in your home network.