I understand that VuGen's web_set_timeout function allows me to set a timeout higher than the usual value (which seems to be 120 seconds).
What I do not understand: doesn't this imply that all users would have to set their browser's HTTP POST timeout to a new, higher value? Don't I then test with a simulated/virtual user configuration that no real-world user would or could use?
Wouldn't I also need every proxy between the user and the web server to be configured with an at-least-as-high timeout for a custom browser timeout to work? Otherwise my users' transactions will fail while my load test passes.
Context: load test of a browser-based (Ajax) frontend with VuGen 9.51. The browser times out on a web server request with Error -27728: Step download timeout (120 seconds) has expired when downloading non-resource(s), and I hesitate to use web_set_timeout for the obvious reasons above.
Each browser has a different time-out value defined. This value can also be changed rather easily by users.
Have a look at http://support.microsoft.com/kb/181050 for info on IE timeouts.
In short it says:
Internet Explorer imposes a time-out limit for the server to return data.
By default, the time-out limit is as follows:
Internet Explorer 4.0 and 4.01: 5 minutes
Internet Explorer 5.x and 6.x: 60 minutes
Internet Explorer 7 and 8: 60 minutes
Internet Explorer does not wait endlessly for the server to come back with data when the server has a problem.
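That KB also describes how the limit can be changed per user in the registry. A minimal .reg sketch; per KB 181050 the ReceiveTimeout DWORD is in milliseconds, and the 8-minute value here is just an example:

Windows Registry Editor Version 5.00

; Raise IE's server-response time-out to 8 minutes (480,000 ms).
[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings]
"ReceiveTimeout"=dword:00075300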
Also, many services used today are machine-to-machine (often via SOAP requests), and these may have time-outs that are interface-specific.
The place in VuGen where this is set from the UI is Run-Time Settings | Preferences | Options; in this list the following timeouts can be set (a script-level alternative is sketched below):
HTTP-Request connect timeout (default 120 seconds)
HTTP-Request response timeout (default 120 seconds)
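If the run-time settings are not enough, the same limits can be raised from the script itself. A minimal VuGen sketch, assuming a step named slow_report and a 300-second budget (both placeholders); "STEP" is the limit behind error -27728:

Action()
{
    /* Raise the limits above the 120-second default for the slow call only. */
    web_set_timeout("STEP", "300");     /* step download timeout */
    web_set_timeout("RECEIVE", "300");  /* HTTP-request response timeout */

    web_url("slow_report",
        "URL=http://myserver/app/slowReport",  /* placeholder URL */
        "Resource=0",
        LAST);

    /* Restore the defaults so later steps still fail fast. */
    web_set_timeout("STEP", "120");
    web_set_timeout("RECEIVE", "120");

    return 0;
}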
In practice, however, if a normal web UI takes more than 5-10 seconds to respond to user clicks, the service will be considered slow by its users.
The exception here is SAP EP, where 30+ minutes of waiting for simple things is OK ... :)
Eric here,
I did a fresh install of Centreon 21.10.8, with a Central and Database server.
Soon after adding hosts and services to be monitored, I noticed that their status goes to UNKNOWN for a few seconds (about 5s to 10s) before coming back to normal, both in the top counters and in the monitoring views. This happens at random every few minutes.
However the real status of the servers is unchanged, and checks on the command line from the Central poller are OK.
--
OS: Red Hat 8
Centreon Version: Centreon 21.10.8
Browser: Firefox 106.0.1, Chrome 107.0.5304.63
Steps to reproduce:
I simply open the browser on a monitoring view and observe for a few minutes.
What I have tried:
Test network performance: I tried to see what is going on in the browser; I can see a lot of Ajax/XHR requests from the browser to the server. The execution time of these requests in the browser does not seem long (100ms to 200ms for the top counter statuses and 1s to 2s for the monitoring views). I tried the same requests via curl in the CLI on the Central server and I get the same execution times.
Modify refresh settings: I tried changing the setting Administration > Setting > Centreon web > Statistics page Refresh Interval from 15s to 47s, and Administration > Setting > Centreon web > Monitoring page Refresh Interval from 15s to 73s.
I also noticed there is a JavaScript file called vendor.2d6b7428.js that makes a large number of status requests (once every 2s) to the API right after the first status requests initiated by the web page itself. I found it on the server at /usr/share/centreon/www/static/vendor.2d6b7428.js and in the header of the Centreon web page in a statement:
<script defer="defer" src="./static/vendor.2d6b7428.js"></script>
The flapping behavior persists.
A solution was found in GitHub issue #5609; it consisted of setting the parameter Instance timeout (Configuration > Pollers > Broker configuration > Output > Instance timeout, or "instance_timeout" in /etc/centreon-broker/central-broker.json) back to its default value. The value previously set by my team was 20 seconds, which caused a race condition between the freshness verification task and the refresh interval for resource statuses, resulting in the flapping statuses.
Additional information for future readers:
The "instance_timeout" (Configuration > Pollers > Broker configuration > Output > Instance timeout) defines a freshness time for the statuses in the GUI; passed that interval, statuses are considered expired and shown as UNKNOWN until refreshed by a routine call to the API. Default value is 300s.
The "monitoring_default_refresh_interval" (Administration > Parameters > Centreon UI > Refresh Properties) defines an interval of time after which a query will be made to API to update the status of resources. Default value is 15s.
I have an Apache server with 16GB of RAM. The script cap.php returns a very small chunk of data (500B). It opens a MySQL connection and makes a simple query.
However, the response from the server is, in my opinion, too slow.
I attach a screenshot of the Developer Tools panel in Chrome.
Besides SSL and the TTFB, there is a strange delay of 300ms (Stalled).
If I try curl from the web server:
curl -w '\nLookup time:\t%{time_namelookup}\nConnect time:\t%{time_connect}\nPreXfer time:\t%{time_pretransfer}\nStartXfer time:\t%{time_starttransfer}\n\nTotal time:\t%{time_total}\n' -k -H 'miyazaki' https://127.0.0.1/ui/cap.php
Lookup time: 0.000
Connect time: 0.000
PreXfer time: 0.182
StartXfer time: 0.266
Total time: 0.266
Does anyone know what that is?
Eventually, I found that when you use SSL it really does matter to switch on the KeepAlive directive in Apache. See the picture below.
According to the Chrome documentation:
Stalled/Blocking
Time the request spent waiting before it could be sent. This time is inclusive of any time spent in proxy negotiation. Additionally, this time will include when the browser is waiting for an already established connection to become available for re-use, obeying Chrome's maximum six TCP connection per origin rule.
So this appears to be a client issue with Chrome talking to the network rather than a server config issue. As you are only making one request, I think we can rule out the TCP limit per origin (unless you have lots of other tabs using up these connections), so I would guess either limitations on your PC (network card, RAM, CPU) or infrastructure issues (e.g. you connect via a proxy and it takes time to set up that connection).
Your curl request doesn't seem to show this delay, as it has just a 0.182-second wait to send the request (easily explained by HTTPS negotiation) and then a 0.266-second total time to download (including the 0.182). This compares with 0.700 seconds when using Chrome, so I don't understand why you say "total time is similar" when to me it's clearly not.
Finally, I do not understand your follow-up answer. It looks to me like you have made a request shortly after another request, as it has skipped the whole network connection stage (including any grey stalling, blue DNS lookup, orange initial connection and purple HTTPS negotiation). So of course this is quicker. But it's not comparing like for like with the first screenshot in your question, and it is not addressing your question.
But yes, you absolutely should be using keep-alives (they are on by default in most web servers, so it usually takes extra effort to turn them off) and HTTPS session resumption (not on by default unless you explicitly add it to your HTTPS config) to benefit any additional requests sent shortly after the first. But these will not benefit the first connection of the session.
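A minimal Apache sketch of both recommendations; the values shown are common defaults rather than tuned recommendations, and the cache path is a placeholder:

# Keep-alives: usually already on; KeepAliveTimeout trades idle workers for connection reuse.
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5

# TLS session resumption via a shared session cache (mod_ssl).
SSLSessionCache shmcb:/var/run/apache2/ssl_scache(512000)
SSLSessionCacheTimeout 300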
I'm using a large app instance to run a basic Java web application (GWT + Spring). There's an expensive operation within my application (a report) which takes a long time to execute.
I've tried running it with the CloudBees SDK on my local machine with similar settings as it would have on the cloud, and it seems to function just fine. It runs in about 3-4 minutes.
On the cloud, it seems to be taking longer. The problem isn't the fact that it takes long. What happens is that CloudBees terminates the session after 5 minutes and gives me an error in my browser saying 'Unable to connect to server. Please contact your administrator'. A report which doesn't take as long runs just fine. My application has a session timeout of 30 minutes, so that isn't the problem either.
What could possibly be going wrong? Is it something to do with cloudbees?
This may be due to proxy buffering of your request through the routing layer (reverse proxy), so it most likely isn't a session timeout but the HTTP connection getting cut.
You can either set proxyBuffering=false via the bees CLI (e.g. when you deploy the app); this will ensure longer-running connections can work.
Ideally, however, you could change the app slightly to return to the browser immediately with some token which you can then poll to get completion status. Even with a connection that lasts that long, over the internet it may provide a bad experience compared with running locally.
When attempting to connect/communicate with my service, I have to wait almost exactly 20 seconds each time before the exception is fired. Since this is all going to be running on a local network, I would like to decrease that timeout period to 5 seconds. I tried decreasing the receiveTimeout on my client, but it didn't work. I looked all over my code for a 20-second timeout variable, but couldn't find any. What should I be changing?
There are different timeout settings (see http://msdn.microsoft.com/en-us/library/ms731078.aspx). They can be set, for example, in a config file (web.config or app.config); see http://msdn.microsoft.com/en-us/library/ms731343.aspx for an example. Under http://msdn.microsoft.com/en-us/library/ms731399.aspx you can choose the binding you use and set the corresponding settings.
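A minimal app.config sketch, assuming basicHttpBinding; the endpoint address and the IMyService contract are placeholders:

<system.serviceModel>
  <bindings>
    <basicHttpBinding>
      <!-- All four timeouts lowered to 5 seconds for a LAN scenario. -->
      <binding name="shortTimeouts"
               openTimeout="00:00:05"
               sendTimeout="00:00:05"
               receiveTimeout="00:00:05"
               closeTimeout="00:00:05" />
    </basicHttpBinding>
  </bindings>
  <client>
    <endpoint address="http://localhost:8080/MyService.svc"
              binding="basicHttpBinding"
              bindingConfiguration="shortTimeouts"
              contract="IMyService" />
  </client>
</system.serviceModel>

Note that openTimeout governs how long the client waits while establishing the channel, whereas receiveTimeout only applies to an already-open channel; that is likely why lowering receiveTimeout had no effect on your connect delay.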
UPDATED: You probably have the timeout set at the TCP level. Try reducing the TcpMaxConnectRetransmissions (default value 2) or TcpInitialRTT (default value 3; on NT 4.0 the parameter is named InitialRTT) parameters in the registry, reboot your computer and try your experiments one more time. You can read about where the 21 seconds comes from in http://support.microsoft.com/kb/223450, http://support.microsoft.com/kb/175523, http://support.microsoft.com/kb/170359 or http://www.boyce.us/windows/tipcontent.asp?ID=189. You can read a description of the TCP/IP default configuration values at http://support.microsoft.com/kb/314053 (for Windows XP) and http://technet.microsoft.com/en-us/library/cc739819(WS.10).aspx (for Windows Server 2003 with SP2).
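A minimal .reg sketch of that change; the value 1 is just an example, and a reboot is required for it to take effect:

Windows Registry Editor Version 5.00

; Lower TCP connect retransmissions from the default of 2 to 1, which
; shortens the ~21-second connect failure (3s + 6s + 12s) to about 9s (3s + 6s).
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"TcpMaxConnectRetransmissions"=dword:00000001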
What you may actually be seeing is the cold start of your web app. The Service Not Found exception would fire back pretty quickly unless you had hit it pretty hard and started queueing service requests beyond what WCF was configured to handle.
However, if your website was unloaded (AppDomain and worker process), it could take 20 seconds to reach the code that builds the channel to your service. So the timeout may be masking a cold start rather than a service fault.
If your website and service are in different application pools, this is magnified, because it has to cold-start the website and then cold-start the service, which happens in succession instead of simultaneously.
To somewhat alleviate this you can use a keepalive/ping service: something that constantly hits the URL to keep the AppDomain in memory and the worker process alive (if not shared). By default IIS 6 will shut down the worker process after 20 minutes of inactivity, so when the first request comes in, http.sys starts up a new worker process, which loads the framework, which loads your app, which starts the pipeline, which executes your code, which delivers to your user. :)
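One way to sketch such a ping on Windows, assuming curl (or any HTTP client) is available on a nearby machine; the URL and the 10-minute interval are placeholders chosen to stay under IIS 6's 20-minute idle shutdown:

schtasks /create /sc minute /mo 10 /tn "WcfKeepAlive" /tr "curl -s http://myserver/MyService.svc"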
How do I find the HTTP timeout set on the WebLogic 8.1 application server?
I only have WebLogic 9 and 10 available, but on those platforms you can go to the console, click on the name of your domain, then (in the "Configuration" tab) "Web Applications". There you will have 3 parameters:
Post Timeout: The amount of time this server waits between receiving chunks of data in an HTTP POST before it times out. (This is used to prevent denial-of-service attacks that attempt to overload the server with POST data.)
Maximum Post Time: The maximum time (in seconds) for reading HTTP POST data in a servlet request. A value less than 0 means unlimited.
Maximum Post Size: The maximum post size this server allows for reading HTTP POST data in a servlet request. A value less than 0 indicates an unlimited size.
However, there might be other parameters involved depending on what your problem exactly is.