HTTP2 PING frames over AWS ALB (gRPC keepalive ping) - asp.net-core

I'm using an AWS Application Load Balancer (ALB) to expose ASP.NET Core gRPC services. The services run in Fargate containers and expose unsecured HTTP ports. The ALB terminates the outer TLS connection and forwards the unencrypted traffic to a target group based on the route. The gRPC application has several client-streaming endpoints, and the client can pause streaming for several minutes. I know that HTTP/2 PING frames can be used in such cases to keep alive a connection that has no data transmission for some amount of time.
The gRPC server is configured to send HTTP/2 pings every 20 seconds to keep the connection alive. I tested this approach and it works: the ping frames were sent by the server and acknowledged by the client.
But this approach fails when the ALB is in the path. During the transmission pauses I don't see any packets from the server behind the load balancer (I use Wireshark). Then, after the 1-minute idle timeout, the ALB resets the connection.
I tried client-sent HTTP/2 pings as well, but the connection still resets after 1 minute, and I have no evidence that these ping frames actually reached the server behind the ALB.
My assumption is that AWS ALB doesn't let such frames pass through it, but I didn't find any documentation that confirms this.

ALB forwards requests based on HTTP protocol semantics, not raw HTTP/2 frames. Therefore something like PING frames will only apply to one of the hops.
If you want an end-to-end ping, you could define a gRPC API that performs the ping. For server-to-client pings you would need a server-side streaming API. But it might actually be preferable to let the clients initiate the pings, to reduce the work the server has to perform. A sketch of a client-driven ping follows below.
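One way to realize such an application-level ping in gRPC-Go is to send a periodic ping message on the open stream, so the traffic crosses both hops and counts as activity for the ALB idle timeout. This is only a minimal sketch: it borrows the hypothetical Chat service and Stream method used in the gRPC-Go sample further down, and the ChatMessage type with its Ping field is a placeholder for whatever ping shape your own proto defines.

// Minimal sketch (gRPC-Go): a client-driven application-level ping.
// protocol.Chat_StreamClient and protocol.ChatMessage are hypothetical names
// matching the generated code assumed by the sample further below.
func keepStreamAlive(ctx context.Context, stream protocol.Chat_StreamClient, interval time.Duration) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            // Any message sent on the stream resets the ALB idle timer.
            // Note: gRPC-Go streams are not safe for concurrent Send calls,
            // so serialize access if other goroutines also send on this stream.
            if err := stream.Send(&protocol.ChatMessage{Ping: true}); err != nil {
                log.Printf("keepalive send failed: %v", err)
                return
            }
        }
    }
}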

The AWS support team responded to my ticket, and the short answer is that ALB does not support HTTP/2 PING frames. They suggested increasing the idle timeout on the load balancer, but that solution may not be applicable in all cases.
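For reference, the idle timeout is a load balancer attribute; raising it can be done, for example, with the AWS CLI (the ARN and the 300-second value below are placeholders):

aws elbv2 modify-load-balancer-attributes \
    --load-balancer-arn <your-alb-arn> \
    --attributes Key=idle_timeout.timeout_seconds,Value=300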
As Matthias247 already mentioned, a possible workaround is to define a gRPC API for the purpose of doing a ping.

Since ALB does not support HTTP/2 PING frames, a straightforward way to solve this is to use a custom PING message.
I think you can also open a new stream to send messages when the current stream is closed by the ALB due to the idle timeout (no messages within the idle period).
When the ALB idle timeout fires, an RST_STREAM frame with ErrCode=PROTOCOL_ERROR is sent from the ALB to the client side. The client can handle this error in both the sender and the receiver and then open a new stream for subsequent messages, reusing the HTTP/2 connection.
Here is sample code with gRPC-Go:
// Packages used below: context, io, log, time, google.golang.org/grpc,
// google.golang.org/grpc/codes, google.golang.org/grpc/keepalive and
// google.golang.org/grpc/status. ServerAddress, cred and the generated
// "protocol" package are assumed to be defined elsewhere.
conn, errD := grpc.Dial(ServerAddress,
    grpc.WithTransportCredentials(cred),
    grpc.WithConnectParams(grpc.ConnectParams{MinConnectTimeout: time.Duration(63) * time.Second}),
    grpc.WithKeepaliveParams(keepalive.ClientParameters{
        Time:                time.Second * 20,
        Timeout:             time.Second * 3,
        PermitWithoutStream: true,
    }))
if errD != nil {
    log.Fatalf("net.Connect err: %v", errD)
}
defer conn.Close()

grpcClient := protocol.NewChatClient(conn)
ctx := context.Background()
stream, errS := grpcClient.Stream(ctx, grpc.WaitForReady(true))
if errS != nil {
    log.Fatalf("get BidirectionalHello stream err: %v", errS)
}

for i := 0; i < 200; i++ {
    // protocol.ChatMessage is a placeholder for the request type of the Stream RPC.
    err := stream.Send(&protocol.ChatMessage{
        // some message
    })
    if err != nil {
        if err == io.EOF {
            // open another stream to send new messages from the sender side
            stream, errS = grpcClient.Stream(ctx, grpc.WaitForReady(true))
            if errS != nil {
                log.Fatalf("get stream err: %v", errS)
            }
        } else if s, ok := status.FromError(err); ok {
            switch s.Code() {
            case codes.OK:
                // noop
            case codes.Unavailable, codes.Canceled, codes.DeadlineExceeded:
                return
            default:
                return
            }
        }
    }

    go func() {
        for {
            res, errR := stream.Recv()
            if errR != nil {
                if errR == io.EOF {
                    log.Printf("stream recv err %+v \n", errR)
                }
                // open another stream to receive new messages on the receiver side
                stream, errS = grpcClient.Stream(ctx, grpc.WaitForReady(true))
                if errS != nil {
                    log.Fatalf("in recv to get stream err: %v", errS)
                }
                return
            }
            log.Printf("recv resp %+v", res)
        }
    }()

    // wait past the idle timeout of the ALB (60 seconds)
    time.Sleep(time.Duration(61) * time.Second)
}
To view the details of the gRPC messages, you can run it with GODEBUG=http2debug=2 go run main.go.

Related

How does multithreading affect http keep-alive connection?

package main

import (
    "io/ioutil"
    "log"
    "net/http"
    "time"
)

var (
    httpClient *http.Client
)

const (
    MaxIdleConnections int = 20
    RequestTimeout     int = 5
)

// init HTTPClient
func init() {
    httpClient = &http.Client{
        Transport: &http.Transport{
            MaxIdleConnsPerHost: MaxIdleConnections,
        },
        Timeout: time.Duration(RequestTimeout) * time.Second,
    }
}

func makeRequest() {
    var endPoint string = "https://localhost:8080/doSomething"
    req, err := http.NewRequest("GET", endPoint, nil)
    response, err := httpClient.Do(req)
    if err != nil && response == nil {
        log.Fatalf("Error sending request to API endpoint. %+v", err)
    } else {
        // Close the connection to reuse it
        defer response.Body.Close()
        body, err := ioutil.ReadAll(response.Body)
        if err != nil {
            log.Fatalf("Couldn't parse response body. %+v", err)
        }
        log.Println("Response Body:", string(body))
    }
}
I have the above code in Go. Go uses HTTP keep-alive connections; thus, from my understanding, httpClient.Do(req) will not create a new connection, since Go uses persistent connections by default.
From my understanding an HTTP persistent connection makes one request at a time, i.e. the second request can only be made after the first response. However, if multiple threads call makeRequest(), what will happen? Will httpClient.Do(req) send another request even before the previous one gets a response?
I assume the server times out any keep-alive connection made by the client. If the server were to time out, then the next time httpClient.Do(req) is called, would it establish a new connection?
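For concreteness, the concurrent case the question asks about looks like this; the driver below is a hypothetical addition that just calls the makeRequest from the snippet above (it additionally needs the sync package imported):

func main() {
    var wg sync.WaitGroup
    // Several goroutines ("threads") call makeRequest concurrently; the shared
    // httpClient decides whether to reuse an idle connection or open a new one
    // for each in-flight request.
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            makeRequest()
        }()
    }
    wg.Wait()
}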
An http.Client has a Transport to which it delegates a lot of the low-level details of making a request. You can change pretty much anything by giving your client a custom Transport. The rest of this answer largely assumes that you're using http.DefaultClient, or at least a client with http.DefaultTransport.
When making a new request, if an idle connection to the appropriate server is available, the transport will use it.
If no idle connection is available (because there never was one, or because other goroutines are using them all, or because the server closed the connection, or there was some other error) then the transport will consider making a new connection, limited by MaxConnsPerHost (default: no limit). If MaxConnsPerHost would be exceeded, then the request will block until an existing request completes and a connection becomes available. Otherwise, a new connection will be made for this request.
On completion of a request, the client will cache the connection for later use (limited by MaxIdleConns and MaxIdleConnsPerHost; DefaultTransport has a limit of 100 idle connections globally, and no limit per-host).
Idle connections will be closed after IdleConnTimeout if they aren't used to make a request; for DefaultTransport the limit is 90 seconds.
All of which means that by default, Go will make enough connections to satisfy parallelism (up to certain limits which you can adjust) but it will also reuse keep-alive connections as much as possible by maintaining a pool of idle connections for some length of time.
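As an illustration of those knobs, here is a sketch of a client with an explicitly tuned Transport (the values are arbitrary, not recommendations; it assumes net/http and time are imported):

// Each field corresponds to one of the limits described above.
tunedClient := &http.Client{
    Transport: &http.Transport{
        MaxConnsPerHost:     10,               // cap on connections per host (0 = unlimited)
        MaxIdleConns:        100,              // idle connections kept across all hosts
        MaxIdleConnsPerHost: 10,               // idle connections kept per host
        IdleConnTimeout:     90 * time.Second, // how long an idle connection is kept around
    },
    Timeout: 5 * time.Second,
}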
It will not affect the HTTP keep-alive connection. Based on your code, you are using the global httpClient, so it will not create a new connection when called from multiple threads, as you expected. Also, the response.Body is read before it is closed; if the provided Body is an io.Closer, it will be closed after the request.

Janus gateway videoroom cancels connection after 60 seconds

"peerConnection new connection state: connected"
{
"janus": "webrtcup",
"session_id": 3414770196795261,
"sender": 4530256184020316
}
{
"janus": "media",
"session_id": 3414770196795261,
"sender": 4530256184020316,
"type": "audio",
"receiving": true
}
... 1 minute passes
"peerConnection new connection state: disconnected"
{
"janus": "timeout",
"session_id": 3414770196795261
}
"peerConnection new connection state: failed"
See pastebin for the full logs.
I'm trying to join a videoroom on my Janus server. All requests seem to succeed, and my device shows a connected WebRTC status for around one minute before the connection is canceled because of a timeout.
The WebRTC connection breaking off seems to match up with the WebSocket connection to Janus' API breaking.
I tried adding a heartbeat WebSocket message every 10 seconds, but that didn't help. I'm:
joining the room
receiving my local SDP plus candidates
configuring the room with said SDP
receiving an answer from janus
accepting that answer with my WebRTC peer connection.
Not sure what goes wrong here.
I also tried setting a STUN server inside the Janus config, to no avail. Same issue.
Added the server logs to the pastebin too.
RTFM: Janus' websocket connections require a keepalive every <60s.
An important aspect to point out is related to keep-alive messages for WebSockets Janus channels. A Janus session is kept alive as long as there's no inactivity for 60 seconds: if no messages have been received in that time frame, the session is torn down by the server. A normal activity on a session is usually enough to prevent that; for a more prolonged inactivity with respect to messaging, on plain HTTP the session is usually kept alive through the regular long poll requests, which act as activity as far as the session is concerned. This aid is obviously not possible when using WebSockets, where a single channel is used both for sending requests and receiving events and responses. For this reason, an ad-hoc message for keeping alive a Janus session should be triggered on a regular basis. Link.
You need to send a 'keepalive' message with the same 'session_id' to keep the session going. Janus closes the session after 60 seconds.
Look at the implementation docs: https://janus.conf.meetecho.com/docs/rest.html
Or do it my way: I do it every 30 seconds in a Runnable posted to a Handler.
// Requires: android.os.Handler, org.json.JSONObject, org.json.JSONException
private Handler mHandler;
private Runnable fireKeepAlive = new Runnable() {
    @Override
    public void run() {
        String transactionId = getRandomStringId();
        JSONObject request = new JSONObject();
        try {
            request.put("janus", "keepalive");
            request.put("session_id", yourSessionId);
            request.put("transaction", transactionId);
        } catch (JSONException e) {
            e.printStackTrace();
        }
        myWebSocketConnection.sendTextMessage(request.toString());
        // schedule the next keepalive in 30 seconds
        mHandler.postDelayed(fireKeepAlive, 30000);
    }
};
Then in onCreate():
mHandler = new Handler();
Then call this where the WebSocket connection opens:
mHandler.post(fireKeepAlive);
Be sure to remove the callback in onDestroy():
mHandler.removeCallbacks(fireKeepAlive);

IllegalArgumentException: "Auth scheme may not be null" in CloseableHttpAsyncClient

I'm running some asynchronous GET requests using a proxy with authentication. When doing HTTPS requests, I always run into an exception after 2 successful asynchronous requests:
java.lang.IllegalArgumentException: Auth scheme may not be null
When executing the GET requests without a proxy, or using http instead of https, the exception never occurred.
Example from Apache HttpAsyncClient Examples
// Imports from org.apache.http.* (HttpHost, auth, nio client, etc.) omitted;
// 'url' is assumed to be defined elsewhere.
HttpHost proxy = new HttpHost("proxyname", 3128);

CredentialsProvider credsProvider = new BasicCredentialsProvider();
credsProvider.setCredentials(new AuthScope(proxy), new UsernamePasswordCredentials("proxyuser", "proxypass"));

CloseableHttpAsyncClient httpClient = HttpAsyncClients.custom()
        .setDefaultCredentialsProvider(credsProvider).build();
httpClient.start();

RequestConfig config = RequestConfig.custom().setProxy(proxy).build();

for (int i = 0; i < 3; i++) {
    HttpGet httpGet = new HttpGet(url);
    httpGet.setConfig(config);
    httpClient.execute(httpGet, new FutureCallback<HttpResponse>() {
        public void failed(Exception ex) {
            ex.printStackTrace(); // Exception occurs here after the 2nd iteration
        }
        public void completed(HttpResponse result) {
            // works for the first and second iteration
        }
        public void cancelled() {
        }
    });
}
httpClient.close();
If I run the code above with 'http://httpbin.org/get', there is no exception, but if I run it with 'https://httpbin.org/get', I get the following exception after 2 successful requests:
java.lang.IllegalArgumentException: Auth scheme may not be null
at org.apache.http.util.Args.notNull(Args.java:54)
at org.apache.http.impl.client.AuthenticationStrategyImpl.authSucceeded(AuthenticationStrategyImpl.java:215)
at org.apache.http.impl.client.ProxyAuthenticationStrategy.authSucceeded(ProxyAuthenticationStrategy.java:44)
at org.apache.http.impl.auth.HttpAuthenticator.isAuthenticationRequested(HttpAuthenticator.java:88)
at org.apache.http.impl.nio.client.MainClientExec.needAuthentication(MainClientExec.java:629)
at org.apache.http.impl.nio.client.MainClientExec.handleResponse(MainClientExec.java:569)
at org.apache.http.impl.nio.client.MainClientExec.responseReceived(MainClientExec.java:309)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseReceived(DefaultClientExchangeHandlerImpl.java:151)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.responseReceived(HttpAsyncRequestExecutor.java:315)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:255)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:121)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
at java.lang.Thread.run(Thread.java:748)
Note: I'm using httpasyncclient 4.1.4
If this is the exact code you have been executing, then the problem is quite apparent. Welcome to the world of event-driven programming.
Essentially what happens is the following:
The client initiates 3 message exchanges by submitting 3 requests to the client execution pipeline in a tight loop
3 message exchanges get queued up for execution
The loop exits
Client shutdown is initiated
Now the client is racing to execute 3 initiated message exchanges and to shut itself down at the same time
If one is lucky and the target server is fast enough, one might get all 3 exchanges completed before the client shuts down its i/o event processing threads
If unlucky, or when request execution is relatively slow, for instance due to the use of TLS transport security, some of the message exchanges might get terminated in the middle of the process. This is the reason you are seeing the failure with the https scheme but not with http.

Paho RabbitMQ connection failing

Here is my Paho client code:
// Create a client instance
client = new Paho.MQTT.Client('127.0.0.1', 1883, "clientId");

// set callback handlers
client.onConnectionLost = onConnectionLost;
client.onMessageArrived = onMessageArrived;

// connect the client
client.connect({onSuccess: onConnect});

// called when the client connects
function onConnect() {
    // Once a connection has been made, make a subscription and send a message.
    console.log("onConnect");
    client.subscribe("/World");
    message = new Paho.MQTT.Message("Hello");
    message.destinationName = "/World";
    client.send(message);
}

// called when the client loses its connection
function onConnectionLost(responseObject) {
    if (responseObject.errorCode !== 0) {
        console.log("onConnectionLost:" + responseObject.errorMessage);
    }
}

// called when a message arrives
function onMessageArrived(message) {
    console.log("onMessageArrived:" + message.payloadString);
}
On the RabbitMQ server everything is at default settings. When I run this code I get: WebSocket connection to 'ws://127.0.0.1:1883/mqtt' failed: Connection closed before receiving a handshake response
What am I missing?
From my personal experience with Paho MQTT JavaScript library and RabbitMQ broker on windows, here is a list of things that you need to do to be able to use MQTT from JS from within a browser:
Install the rabbitmq_web_mqtt plugin (you can find the latest binary online; copy it to "c:\Program Files\RabbitMQ Server\rabbitmq_server-3.6.2\plugins\" and enable it from the command line using "rabbitmq-plugins enable rabbitmq_web_mqtt").
Of course, the MQTT plugin also needs to be enabled on the broker
For me, the client was not working with version 3.6.1 of RabbitMQ, while it works fine with version 3.6.2 (Windows)
The port to be used for connections is 15675, NOT 1883!
Make sure to specify all 4 parameters when making an instance of Paho.MQTT.Client. If you omit one, you get a websocket connection error which may be quite misleading.
Finally, here is a code snippet which I tested and works perfectly (just makes connection):
client = new Paho.MQTT.Client("localhost", 15675, "/ws", "client-1");

// set callback handlers
client.onConnectionLost = onConnectionLost;
client.onMessageArrived = onMessageArrived;

// connect the client
client.connect({
    onSuccess: onConnect
});

// called when the client connects
function onConnect() {
    console.log("Connected");
}

// called when the client loses its connection
function onConnectionLost(responseObject) {
    if (responseObject.errorCode !== 0) {
        console.log("onConnectionLost:" + responseObject.errorMessage);
    }
}

// called when a message arrives
function onMessageArrived(message) {
    console.log("onMessageArrived:" + message.payloadString);
}
It's not clear in the question but I assume you are running the code above in a web browser.
This will be making an MQTT connection over Websockets (as shown in the error). This is different from a native MQTT-over-TCP connection.
The default pure MQTT port is 1883; Websocket support is likely to be on a different port.
You will need to configure RabbitMQ to accept MQTT over Websockets as well as pure MQTT; this pull request for RabbitMQ seems to talk about adding this capability. It mentions that this capability was only added in version 3.6.x and that the documentation is still outstanding (as of 9th Feb 2016).

Auto-discovery of IP address

I have a server and a client that communicate with each other over a UDP socket. The server opens port 10002 and listens for incoming datagrams.
For the client to get the server's IP, it sends one broadcast datagram which the server responds to. The client code responsible for finding the IP address of the server looks like this:
private IPEndPoint GetServerEP(TimeSpan timeout, UdpClient udpclient)
{
    IPEndPoint server = new IPEndPoint(IPAddress.Broadcast, 10002);
    byte[] data = GetDiscoverDatagram();

    udpclient.EnableBroadcast = true;
    udpclient.Send(data, data.Length, server);

    try
    {
        udpclient.Client.ReceiveTimeout = (int)timeout.TotalMilliseconds;
        udpclient.Receive(ref server);
    }
    catch (SocketException e)
    {
        string msg = string.Format("Server did not respond within {0} ms", timeout.TotalMilliseconds);
        throw new TimeoutException(msg, e);
    }

    return server;
}
Upon running this, I can see that the server actually receives the broadcast datagram and responds with a packet bound for the same port the client sends from. However, the client does not receive anything and times out.
What am I missing?
Stupid me (or rather, stupid firewall). The code worked, but the firewall blocked the response packet from the server. After disabling it, everything works like a charm.