SignalR Core connections not being closed and bringing down IIS - cloudflare

We have SignalR Core running in IIS and the connections are not being closed over time.
This results in a 503 error:
HTTP Error 503.2 - Service Unavailable
The serverRuntime#appConcurrentRequestLimit setting is being exceeded.
When we recycle the app pool, Current Connections drops to 0 and then climbs back to around 50 (as actual clients reconnect). Over the course of a day it can easily reach 2000. Not every connection is leaking: the number does decrease at times, but it trends upwards over time.
Latest .NET Core 3.1 is installed.

We are using Argo Tunnel which creates a tunnel from our webserver to the Cloudflare network.
It turned out to be an out-of-date cloudflared executable.
The version we were running was from 5/2019, and updating to the new version from 12/2019 fixed the problem. In fact, when we stopped the Cloudflare service, all the connections instantly dropped away.
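To tell a slow leak like the one described above apart from normal fluctuation, one option is to sample the Current Connections counter periodically and look at the overall trend. The following is a minimal sketch, assuming the samples have already been collected as (hour, connection count) pairs; the function name and the sample values are hypothetical and not taken from the original setup.

```python
def connection_trend(samples):
    """Return the least-squares slope (connections per hour) of (hour, count) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    # Ordinary least-squares slope: covariance(t, c) / variance(t).
    num = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den


# Hypothetical samples: (hours since app pool recycle, Current Connections).
samples = [(0, 50), (12, 1025), (24, 2000)]
slope = connection_trend(samples)
```

A slope that stays clearly positive over a full day, even though individual samples go up and down, would match the pattern in the question.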

Related

IIS SSL Issues Under Heavy Website Load -- Non-SSL Works Fine

My IIS 7.5 web server farm (2 x Windows 2008 R2 physical servers using Network Load Balancing) is experiencing heavy use, and SSL/TLS requests to port 443 are timing out during what appears to be the TLS negotiation (500+ Get Requests/sec with over 20K Current Connections).
Despite the heavy load, the server hardware is performing fine: less than 20% processor utilization, 75% of memory still available, and virtually no processor queuing. Bandwidth utilization is fine as well. However, during this heavy usage event, my websites stopped responding to SSL-based (https) requests and clients were unable to negotiate a TLS connection. During the same period, http requests to the same websites worked fine and the websites were very responsive (I had disabled the IIS rewrite rule from http to https). The problem may have gone away after I uninstalled my CA-issued certificate, reinstalled the same one, and restarted all web services, but I can't say for sure that this corrected it because I also stopped forcing the use of SSL.
In troubleshooting, the only thing I see is that my Windows event logs are filled with Event ID: 36887, which seems to be related to SSL but the meaning of the error is vague to me. This is the description of the error message:
"This error message indicates the computer received an SSL fatal alert message from the server ( It is not a bug in the Schannel or the application that uses Schannel). Sometimes is caused by the installation of third party web browser (other than Internet Explorer)."
There are hundreds of entries per minute corresponding to the time of the performance issues. After this occurred, I was told to enable the CAPI2 log but since the issue is not occurring now, I only see informational messages in this log.
What would cause TLS to be unable to negotiate a connection under heavy load in my network load-balanced web farm, and how can I prevent this from occurring again?

Event ID 36887, A fatal alert was received from the remote endpoint. The TLS protocol defined fatal alert code is 40

This is resulting from an outbound connection to Equifax's new TLS 1.2-enabled URL.
Background:
Servers: Windows 2012 R2, .NET 4.6.2, all TLS 1.x Enabled in Test, Stage and Production tiers per this. IIS configurations match between servers (app pools/code except tier-specific configurations/IIS settings.)
Servers are load balanced via Citrix Netscaler, but this site uses Port 80/HTTP, no HTTPS configuration.
Both tiers use the same Equifax URL, but with tier-specific credentials.
The Situation:
Prod will not communicate with their site; we get the error shown in the title.
Our stage environment has no problem communicating.
What we have done:
- Validated TLS reg settings match
- Swapped the prod web.config to the Stage server and the communication worked, so it seems unlikely that it is a web.config issue in production.
- Validated .NET versions
- Checked LSA fips reg setting (set to 0)
- Checked for wonky updates known to cause issues
We are going to set up a network trace, but for the moment we are at a bit of a loss. I would appreciate any insights as to what I might be missing.
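Before setting up a full network trace, a quick probe run from each tier can show whether the remote endpoint completes a TLS 1.2 handshake from that machine at all. The following is a minimal sketch using Python's standard ssl module; the function names are hypothetical, and the host passed to probe is whatever endpoint is being tested. Note that Python uses OpenSSL rather than Schannel, so this exercises the network path and the remote server, not the .NET or Schannel configuration.

```python
import socket
import ssl


def tls12_only_context():
    """Build a client context that offers TLS 1.2 and nothing else."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def probe(host, port=443, timeout=10.0):
    """Attempt a TLS 1.2 handshake and return a one-line result."""
    ctx = tls12_only_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return "ok: %s, cipher %s" % (tls.version(), tls.cipher()[0])
    except ssl.SSLError as exc:
        # Handshake-level failures (including fatal alerts) end up here.
        return "TLS failure: %s" % exc
    except OSError as exc:
        # DNS errors, refused connections, timeouts.
        return "network failure: %s" % exc
```

If the probe succeeds from Prod as well as from Stage, the difference is more likely in how the application or Schannel negotiates the connection than in the network between the servers and the endpoint.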
The developers had to do the following:
- Explicitly specified the use of .NET 4.6, per Microsoft recommendations.
- Updated some other .NET references in the web.config to point specifically to 4.6.2.
- Made changes to some older pieces of code to make them 4.6.2-compliant.
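For reference, the alert code in Event ID 36887 is the alert description number from the TLS specification; code 40 is handshake_failure, meaning the remote side could not agree on acceptable security parameters. A small lookup such as the sketch below can make these events easier to read. The table is a subset of the alert descriptions in RFC 5246 (TLS 1.2), and the helper name is hypothetical.

```python
# Subset of alert descriptions from RFC 5246, section 7.2 (TLS 1.2).
TLS_ALERTS = {
    0: "close_notify",
    10: "unexpected_message",
    20: "bad_record_mac",
    40: "handshake_failure",
    42: "bad_certificate",
    43: "unsupported_certificate",
    44: "certificate_revoked",
    45: "certificate_expired",
    46: "certificate_unknown",
    47: "illegal_parameter",
    48: "unknown_ca",
    49: "access_denied",
    50: "decode_error",
    51: "decrypt_error",
    70: "protocol_version",
    71: "insufficient_security",
    80: "internal_error",
    90: "user_canceled",
    100: "no_renegotiation",
}


def describe_alert(code):
    """Translate a numeric TLS alert code into its specification name."""
    return TLS_ALERTS.get(code, "unknown alert code %d" % code)
```

A handshake_failure on an outbound call to a TLS 1.2-only endpoint is consistent with the client not offering a protocol version or cipher suite that the server accepts.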

performance issue with Apache2 server and wso2esb

I am working with WSO2 ESB and have created several proxy services, which are called through an Apache 2 proxy, so every ESB request goes through Apache 2. After 10-15 days of use, performance degrades day by day, and the other servers that sit behind the same Apache 2 server slow down as well.
I tried reloading and restarting the Apache server, but performance did not change.
When I restarted the ESB server, everything went back to normal. I don't know what the ESB is holding on to that causes this performance issue.
What can be done to resolve this without restarting the ESB server?

ASP.NET Core - random 502 error on IIS server

We just launched a site that runs on ASP.NET Core 1.1, Windows 2008 R2, and IIS 7.5, with all the latest patches to 2008 and ASP.NET.
The site runs fine, but goes down with no apparent pattern. All of a sudden it starts returning a 502 response:
502 – Web server received an invalid response while acting as a gateway or proxy server
Restarting the site in IIS, or recycling the site's application pool, brings the site back up, but the problem recurs within a few hours. As a workaround, we configured IIS to recycle the app pool every 90 minutes, and that seems to keep the site up all of the time.
Any recommendations on how to troubleshoot this problem?
Thank you!
With this 502 error, IIS is saying that Kestrel (which runs behind it) returned an invalid response. Enable more logging, and inspect the logs from just before the first 502 response. Some earlier request is "breaking" your app.
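One way to follow that advice with the logs IIS already writes is to scan a W3C extended log file for the first 502 and list the requests immediately before it. The following is a minimal sketch, assuming the log contains a #Fields: directive that includes sc-status; the function name and the window size are hypothetical. IIS logs only show which requests preceded the failure, so application-level logs are still needed to see the actual exception.

```python
def requests_before_first_502(lines, window=5):
    """Return (preceding_entries, first_502_entry) from W3C extended log lines.

    Returns ([], None) when no 502 response is found.
    """
    fields = []
    history = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The directive names the space-separated columns that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or "sc-status" not in fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed or truncated lines
        entry = dict(zip(fields, values))
        if entry["sc-status"] == "502":
            return history[-window:], entry
        history.append(entry)
    return [], None
```

Called with an open log file, for example `requests_before_first_502(open(path), window=20)` where path is the log file to inspect, it returns the entries to examine first.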

Worklight: unable to access server after about 30 mins in background

My app is unable to access the server after about 30 minutes in the background on Android phones and iPhones.
I assume it is related to serverSessionTimeout. However, once this happens I cannot connect to the server any more. I tried invoking "WL.Client.connect()", but it didn't work; I always got a request timeout response.
I tested my app locally (without a DMZ) on the Worklight Studio embedded server, and it worked fine. The issue occurs only on UAT (DMZ) and PROD (DMZ).
Project architecture:
1. DMZ (IBM IMC/LMC)
2. LAN
3. Worklight 6.0
4. Production environment
5. No Load Balancer and cluster setup
My assumption:
1. It seems the DMZ kept the credential between the DMZ and the WL server and didn't refresh it when trying to connect to the WL server again after the WL session had timed out.
This issue is fixed now.
The root cause was that we sent two requests to the WL server through IMC at the same time, and only the second one completed its handshake with the WL server successfully.