HAProxy stick table fails? - wcf

I have HAProxy load balancing TCP connections. For some reason, the stick table seems to fail once in a while. Here are a few rows from my log where we can see connections to both backend servers from a single IP. The connections are made at the exact same timestamp; could it be that the stick table is not thread safe?
The expire time is only 10s for a reason: TCP connections in our software are quite sticky, so we need a short expire time for possible server maintenance procedures.
Any idea how this is possible? Why are two simultaneous connections routed to different backend servers? Our software cannot handle this situation; WCF is used for the communication.
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:12872 [11/Apr/2017:14:05:21.608] https_frontend https_backend/srv1
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:12872 [11/Apr/2017:14:05:21.608] https_frontend https_backend/srv1
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:3342 [11/Apr/2017:14:05:21.608] https_frontend https_backend/srv2
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:3342 [11/Apr/2017:14:05:21.608] https_frontend https_backend/srv2
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:22543 [11/Apr/2017:14:05:34.994] https_frontend https_backend/srv2
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:22543 [11/Apr/2017:14:05:34.994] https_frontend https_backend/srv2
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:26566 [11/Apr/2017:14:05:34.995] https_frontend https_backend/srv2
Apr 11 14:06:07 accuna haproxy[22129]: xx.xx.133.145:26566 [11/Apr/2017:14:05:34.995] https_frontend https_backend/srv2
HAProxy settings:
backend https_backend
mode tcp
option tcplog
balance leastconn
stick-table type ip size 200k expire 10s store conn_cur
tcp-request content track-sc0 src
stick on src
server srv1 10.1.4.210:443 check
server srv2 10.1.5.38:443 check

Since I ran into the same problem five years later, maybe it's not stressed enough in the docs. It turned out that my stick on rule
stick on req.payload(0,0),mqtt_field_value(connect,username) table mqtt_usernames
was evaluated too early, before the content had arrived. I adjusted it with an inspect delay in my frontend config:
tcp-request inspect-delay 10s
acl content_present req_len gt 0
tcp-request content accept if content_present
and hopefully it's correct. At least, it passes my tests.
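For context, here is a minimal sketch of how the inspect delay and the payload-based stick rule fit together in one configuration, assuming HAProxy 2.4+ for the mqtt_field_value converter. The proxy names, ports and server addresses are placeholders, not taken from the original setup:
frontend mqtt_frontend
    bind :1883
    mode tcp
    # Hold the connection (up to 10s) until some payload has been buffered
    tcp-request inspect-delay 10s
    acl content_present req_len gt 0
    tcp-request content accept if content_present
    default_backend mqtt_backend

backend mqtt_backend
    mode tcp
    # Entries are keyed by the MQTT username extracted from the CONNECT packet
    stick-table type string len 64 size 200k expire 30m
    stick on req.payload(0,0),mqtt_field_value(connect,username)
    server srv1 10.0.0.1:1883 check
    server srv2 10.0.0.2:1883 check
With the accept rule gated on req_len, the stick expression only runs once payload is actually available, which is what fixed the premature evaluation described above.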

Related

(gcloud.beta.compute.ssh) [/usr/bin/ssh] exited with return code [255]

I am trying to use SSH to connect to a Google Cloud Compute Engine instance (macOS Catalina):
gcloud beta compute ssh --zone "us-west1-b" "mac-vm" --project "mac-vm-282201"
and get the error
ssh: connect to host 34.105.11.187 port 22: Operation timed out
ERROR: (gcloud.beta.compute.ssh) [/usr/bin/ssh] exited with return code [255].
I also tried
ssh -i ~/.ssh/mac-vm-key asd61404@34.105.11.187
which also fails with
ssh: connect to host 34.105.11.187 port 22: Operation timed out
so I found this command to diagnose it:
gcloud compute ssh --zone "us-west1-b" "mac-vm" --project "mac-vm-282201" --ssh-flag="-vvv"
which returns:
OpenSSH_7.9p1, LibreSSL 2.7.3
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 48: Applying options for *
debug2: resolve_canonicalize: hostname 34.105.11.187 is address
debug2: ssh_connect_direct
debug1: Connecting to 34.105.11.187 [34.105.11.187] port 22.
I don't know how I can fix this issue.
Thanks in advance!
Here is my recent serial console output:
Jul 4 02:28:39 mac-vm google_network_daemon[684]: For info, please visit https://www.isc.org/software/dhcp/
Jul 4 02:28:39 mac-vm dhclient[684]:
Jul 4 02:28:39 mac-vm dhclient[684]: Listening on Socket/ens4
[ 19.458355] google_network_daemon[684]: Listening on Socket/ens4
Jul 4 02:28:39 mac-vm google_network_daemon[684]: Listening on Socket/ens4
Jul 4 02:28:39 mac-vm dhclient[684]: Sending on Socket/ens4
[ 19.458697] google_network_daemon[684]: Sending on Socket/ens4
Jul 4 02:28:39 mac-vm google_network_daemon[684]: Sending on Socket/ens4
Jul 4 02:28:39 mac-vm systemd[1]: Finished Wait until snapd is fully seeded.
Jul 4 02:28:39 mac-vm systemd[1]: Starting Apply the settings specified in cloud-config...
Jul 4 02:28:39 mac-vm systemd[1]: Condition check resulted in Auto import assertions from block devices being skipped.
Jul 4 02:28:39 mac-vm systemd[1]: Reached target Multi-User System.
Jul 4 02:28:39 mac-vm systemd[1]: Reached target Graphical Interface.
Jul 4 02:28:39 mac-vm systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jul 4 02:28:39 mac-vm systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Jul 4 02:28:39 mac-vm systemd[1]: Finished Update UTMP about System Runlevel Changes.
[ 20.216129] cloud-init[718]: Cloud-init v. 20.1-10-g71af48df-0ubuntu5 running 'modules:config' at Sat, 04 Jul 2020 02:28:39 +0000. Up 20.11 seconds.
Jul 4 02:28:39 mac-vm cloud-init[718]: Cloud-init v. 20.1-10-g71af48df-0ubuntu5 running 'modules:config' at Sat, 04 Jul 2020 02:28:39 +0000. Up 20.11 seconds.
Jul 4 02:28:39 mac-vm systemd[1]: Finished Apply the settings specified in cloud-config.
Jul 4 02:28:39 mac-vm systemd[1]: Starting Execute cloud user/final scripts...
Jul 4 02:28:41 mac-vm google-clock-skew: INFO Synced system time with hardware clock.
[ 20.886105] cloud-init[725]: Cloud-init v. 20.1-10-g71af48df-0ubuntu5 running 'modules:final' at Sat, 04 Jul 2020 02:28:41 +0000. Up 20.76 seconds.
[ 20.886430] cloud-init[725]: Cloud-init v. 20.1-10-g71af48df-0ubuntu5 finished at Sat, 04 Jul 2020 02:28:41 +0000. Datasource DataSourceGCE. Up 20.87 seconds
Jul 4 02:28:41 mac-vm cloud-init[725]: Cloud-init v. 20.1-10-g71af48df-0ubuntu5 running 'modules:final' at Sat, 04 Jul 2020 02:28:41 +0000. Up 20.76 seconds.
Jul 4 02:28:41 mac-vm cloud-init[725]: Cloud-init v. 20.1-10-g71af48df-0ubuntu5 finished at Sat, 04 Jul 2020 02:28:41 +0000. Datasource DataSourceGCE. Up 20.87 seconds
Jul 4 02:28:41 mac-vm systemd[1]: Finished Execute cloud user/final scripts.
Jul 4 02:28:41 mac-vm systemd[1]: Reached target Cloud-init target.
Jul 4 02:28:41 mac-vm systemd[1]: Starting Google Compute Engine Startup Scripts...
Jul 4 02:28:41 mac-vm startup-script: INFO Starting startup scripts.
Jul 4 02:28:41 mac-vm startup-script: INFO Found startup-script in metadata.
Jul 4 02:28:42 mac-vm startup-script: INFO startup-script: sudo: ufw: command not found
Jul 4 02:28:42 mac-vm startup-script: INFO startup-script: Return code 1.
Jul 4 02:28:42 mac-vm startup-script: INFO Finished running startup scripts.
Jul 4 02:28:42 mac-vm systemd[1]: google-startup-scripts.service: Succeeded.
Jul 4 02:28:42 mac-vm systemd[1]: Finished Google Compute Engine Startup Scripts.
Jul 4 02:28:42 mac-vm systemd[1]: Startup finished in 1.396s (kernel) + 20.065s (userspace) = 21.461s.
Jul 4 02:29:06 mac-vm systemd[1]: systemd-hostnamed.service: Succeeded.
Jul 4 02:43:32 mac-vm systemd[1]: Starting Cleanup of Temporary Directories...
Jul 4 02:43:32 mac-vm systemd[1]: systemd-tmpfiles-clean.service: Succeeded.
Jul 4 02:43:32 mac-vm systemd[1]: Finished Cleanup of Temporary Directories.
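An "Operation timed out" on port 22 usually means packets are dropped before they ever reach sshd, so one thing worth checking (a suggestion of mine, and default-allow-ssh is only an example rule name) is whether the project has a VPC firewall rule that allows SSH:
# List the project's firewall rules and look for one allowing tcp:22
gcloud compute firewall-rules list --project "mac-vm-282201"

# If none exists, create one
gcloud compute firewall-rules create default-allow-ssh \
    --project "mac-vm-282201" \
    --allow tcp:22 \
    --source-ranges 0.0.0.0/0
The serial console above also shows "sudo: ufw: command not found", so an in-guest firewall is unlikely to be the culprit.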

Failed s3fs mount due to Timezone skew

Apr 22 05:54:59 ubuntuserver s3fs[10143]: s3fs.cpp:set_s3fs_log_level(297): change debug level from [CRT] to [INF]
Apr 22 05:54:59 ubuntuserver s3fs[10143]: PROC(uid=0, gid=0) - MountPoint(uid=0, gid=0, mode=40755)
Apr 22 05:54:59 ubuntuserver s3fs[10145]: init v1.85(commit:381835e) with OpenSSL
Apr 22 05:54:59 ubuntuserver s3fs[10145]: check services.
Apr 22 05:54:59 ubuntuserver s3fs[10145]: check a bucket.
Apr 22 05:54:59 ubuntuserver s3fs[10145]: curl.cpp:ResetHandle(1879): The S3FS_CURLOPT_KEEP_SENDING_ON_ERROR option could not be set. For maximize performance you need to enable this option and you should use libcurl 7.51.0 or later.
Apr 22 05:54:59 ubuntuserver s3fs[10145]: URL is https://s3-us-west-2.amazonaws.com/bucketubuntuserver/
Apr 22 05:54:59 ubuntuserver s3fs[10145]: URL changed is https://bucketubuntuserver.s3-us-west-2.amazonaws.com/
Apr 22 05:55:01 ubuntuserver s3fs[10145]: curl.cpp:RequestPerform(2273): HTTP response code 403, returning EPERM. Body Text: <?xml version="1.0" encoding="UTF-8"?>#012<Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the current time is too large.</Message>
<RequestTime>Mon, 22 Apr 2019 05:54:59 GMT</RequestTime>
<ServerTime>2019-04-22T06:23:01Z</ServerTime>
<MaxAllowedSkewMilliseconds>900000</MaxAllowedSkewMilliseconds>
<RequestId>2CDB15BFC9072D0D</RequestId><HostId>grA/XIvT7zLUh9jLUxYGAs8jOtMs762CPMX+TM6GdAVvAB36/b8hH0dVOugVBWRpHX3O63V2Bv8=</HostId></Error>
Apr 22 05:55:01 ubuntuserver s3fs[10145]: curl.cpp:CheckBucket(3305): Check bucket failed, S3 response: <?xml version="1.0" encoding="UTF-8"?>#012<Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the current time is too large.</Message>
<RequestTime>Mon, 22 Apr 2019 05:54:59 GMT</RequestTime>
<ServerTime>2019-04-22T06:23:01Z</ServerTime>
<MaxAllowedSkewMilliseconds>900000</MaxAllowedSkewMilliseconds><RequestId>2CDB15BFC9072D0D</RequestId><HostId>grA/XIvT7zLUh9jLUxYGAs8jOtMs762CPMX+TM6GdAVvAB36/b8hH0dVOugVBWRpHX3O63V2Bv8=</HostId></Error>
Apr 22 05:55:01 ubuntuserver s3fs[10145]: s3fs.cpp:s3fs_check_service(3868): invalid credentials(host=https://s3-us-west-2.amazonaws.com) - result of checking service.
Apr 22 05:55:01 ubuntuserver s3fs[10145]: Pool full: destroy the oldest handler
Apr 22 05:55:01 ubuntuserver s3fs[10145]: s3fs.cpp:s3fs_exit_fuseloop(3444): Exiting FUSE event loop due to errors
Apr 22 05:55:01 ubuntuserver s3fs[10145]: destroy
I had my credentials correct, but I wasn't able to mount S3 due to the clock difference. My server uses UTC and was running 26 minutes behind. My problem was solved by fixing the NTP sync, but:
1) I want to confirm whether s3fs, or any AWS tool I use, also sends the clock information to S3. It is present in the request, but it is GMT instead of UTC. S3 seems to use UTC when comparing against servers that are properly synced to NTP.
2) Can we use any timezone, provided it is properly synced with a good NTP server?
S3 signs requests, including the client's current time, to prevent attackers from replaying requests at a later time. Thus, if your client has the incorrect time, the server will treat the request as invalid. Both the client and the server use UTC/GMT; the time zone does not matter. Configuring NTP as you did should resolve these issues.
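As a rough sketch of how to verify and fix the skew on a systemd-based Ubuntu server (the commands are generic, not specific to the machine in the question):
# Check whether the system clock is NTP-synchronized and what it reads in UTC
timedatectl status

# One-off correction against a public pool (requires the ntpdate package)
sudo ntpdate pool.ntp.org

# Keep the clock synced from now on
sudo systemctl enable --now systemd-timesyncd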

Nginx and Solr server on port 8983 but can't access Admin area

I'm running Nginx and I just installed Solr.
The service status reports everything is OK:
root@closer:~# sudo service solr status
● solr.service - LSB: Controls Apache Solr as a Service
Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
Active: active (exited) since Sat 2018-07-14 18:21:14 UTC; 1s ago
Docs: man:systemd-sysv-generator(8)
Process: 2549 ExecStop=/etc/init.d/solr stop (code=exited, status=0/SUCCESS)
Process: 2699 ExecStart=/etc/init.d/solr start (code=exited, status=0/SUCCESS)
Jul 14 18:21:08 closer solr[2699]: If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
Jul 14 18:21:08 closer solr[2699]: *** [WARN] *** Your Max Processes Limit is currently 3896.
Jul 14 18:21:08 closer solr[2699]: It should be set to 65000 to avoid operational disruption.
Jul 14 18:21:08 closer solr[2699]: If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
Jul 14 18:21:08 closer solr[2699]: Warning: Available entropy is low. As a result, use of the UUIDField, SSL, or any other features that require
Jul 14 18:21:08 closer solr[2699]: RNG might not work properly. To check for the amount of available entropy, use 'cat /proc/sys/kernel/random/entr
Jul 14 18:21:14 closer solr[2699]: [194B blob data]
Jul 14 18:21:14 closer solr[2699]: Started Solr server on port 8983 (pid=2751). Happy searching!
Jul 14 18:21:14 closer solr[2699]: [14B blob data]
Jul 14 18:21:14 closer systemd[1]: Started LSB: Controls Apache Solr as a Service.
But if I try to go to xxx.xxx.xxx.xxx:8983/solr,
I can't access the page... why?
Do I have to open port 8983 with ufw?
Do I have to start Apache?
Something else?
It worked after
sudo ufw allow 8983
I can't believe all the online guides never mentioned it.
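For completeness, a couple of generic commands to confirm the fix (assuming a ufw-based Ubuntu host):
# Confirm the firewall rule is active
sudo ufw status verbose

# Confirm Solr is listening on port 8983 and on which interface
sudo ss -ltnp | grep 8983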

IE 11 shows "Page can't be displayed" on long running backend servlet processing

I have a servlet that takes more than 6 minutes to complete an operation. The application is hosted on WebLogic 12c, which is accessed via a BIG-IP F5 load balancer and then an Apache server. Apache uses wl_proxy to communicate with WebLogic. Whenever this servlet is called, IE shows "This page can't be displayed". I turned on the wl_proxy log on the Apache server and found the following:
Exception type [READ_TIMEOUT] (no read after 300 seconds) raised at line 212 of ../nsapi/Reader.cpp
So I added the WLIOTimeoutSecs directive in wl_proxy.conf, which fixed one part of the problem. It still shows the same error exactly after 5 minutes, and this time I saw the following error in the wl_proxy log:
Fri Jul 31 12:49:05 2015 <396114383469453> created a new connection to preferred server 'xxx.x.xxx.xxx/5096' for '/getUserActivitiesReport.do?action=GENERATEREPORT', Local port:36249
Fri Jul 31 12:55:02 2015 <396114383469453> URL::parseHeaders: CompleteStatusLine set to [HTTP/1.1 200 OK]
Fri Jul 31 12:55:02 2015 <396114383469453> URL::parseHeaders: StatusLine set to [200 OK]
Fri Jul 31 12:55:02 2015 <396114383469453> parsed all headers OK
Fri Jul 31 12:55:02 2015 <396114383469453> sendResponse() : r->status = '200'
Fri Jul 31 12:55:02 2015 <396114383469453> Write to the browser failed: calling URL::close at line 680 of ap_proxy.cpp
Fri Jul 31 12:55:02 2015 <396114383469453> *******Exception type [WRITE_ERROR_TO_CLIENT] raised at line 681 of ap_proxy.cpp
Fri Jul 31 12:55:02 2015 <396114383469453> *NOT* failing over after sendResponse() exception: WRITE_ERROR_TO_CLIENT
Fri Jul 31 12:55:02 2015 <396114383469453> request [/getUserActivitiesReport.do?action=GENERATEREPORT] did NOT process successfully..................
Apache access log for this request:
xxx.xxx.xxx.xxx - - [31/Jul/2015:12:49:05 +0000] "POST /getUserActivitiesReport.do?action=GENERATEREPORT HTTP/1.1" 200 10 "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko" "PsHK9qECrbkAAA95AFgAAAAG" 80 357322233
Now why did the browser close the connection? AFAIK, IE 11 times out after 60 minutes. Also, in the IE developer tools I saw the connection marked as "Aborted".
Has anyone faced this type of issue? Any idea if there is a timeout set at the F5 level?
Thanks in advance,
Debojit
I had the same problem; the solution is to add these two parameters in the WebLogic plug-in configuration:
WLIOTimeoutSecs 14400
WLSocketTimeoutSecs 14400
Be careful with the values (14400); they are only examples.
Then restart your HTTP server.
My configuration:
<IfModule mod_weblogic.c>
<Location />
WebLogicHost 172.x.x.x
WebLogicPort 7003
SetHandler weblogic-handler
WLIOTimeoutSecs 14400
WLSocketTimeoutSecs 14400
</Location>
</IfModule>
More info: Doc ID 2554989.1 and
https://docs.oracle.com/middleware/12211/webtier/develop-plugin/PLGWL.pdf
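A minimal restart sketch, assuming a standard Apache layout with apachectl on the PATH (IHS and packaged installs may use a different control script):
# Check that the plug-in directives parse cleanly
apachectl configtest

# Apply the new timeout values without dropping in-flight requests
apachectl graceful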

What's the hard limit for apache ThreadsPerChild parameter in httpd.conf?

I'm using IBM HTTP Server, which is based on Apache. When I try to increase the ThreadsPerChild parameter above 1000, the HTTP server always starts only 1000 worker threads. Below is the related information:
Error log:
[Thu Jul 05 10:50:45 2012] [debug] mpm_winnt.c(564): Child 9040: retrieved 2 listeners from parent
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Acquired the start mutex.
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Starting 1000 worker threads.
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Starting thread to listen on port 81.
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Starting thread to listen on port 80.
httpd.conf
<IfModule mpm_winnt.c>
ThreadLimit 2048
ThreadsPerChild 2000
MaxRequestsPerChild 0
</IfModule>
IHS 7.0.0.0
OS winNT
BTW, another concern with ThreadsPerChild: does one Apache thread handle one client connection here, or can one thread take care of more than one client connection?
Please help me out.
Thanks very much.
On the limits of the ThreadsPerChild setting, quoting from IBM HTTP Server Performance Tuning:
On 64-bit Windows OS'es, each instance of IBM HTTP Server is limited to approximately 2500 ThreadsPerChild. On 32-bit Windows, this number is closer to 5000. These numbers are not exact limits, because the real limits are the sum of the fixed startup cost of memory for each thread + the maximum runtime memory usage per thread, which varies based on configuration and workload. Raising ThreadsPerChild and approaching these limits risks child process crashes when runtime memory usage puts the process address space over the 2GB or 3GB barrier.
The interesting thing to note here is that ThreadsPerChild is not the only parameter for tuning concurrent connections to IHS. You can find information about other parameters (like MaxClients) and the tuning methodology at the following link:
Tuning IBM HTTP Server to maximize the number of client connections to WebSphere Application Server
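As a sketch of what the quoted guidance implies for the httpd.conf above (the values are examples only, assuming 64-bit Windows and the roughly 2500-thread ceiling quoted earlier):
<IfModule mpm_winnt.c>
    # ThreadLimit is the hard ceiling and must be at least ThreadsPerChild
    ThreadLimit          2048
    ThreadsPerChild      2048
    # 0 = never recycle the child process
    MaxRequestsPerChild  0
</IfModule>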