During load testing (using Load UI) of a new .Net web api using Mono hosted on a medium sized Amazon server I'm receiving the following results (in chronological order over the course of about ten minutes)
5 connections per second for 60 seconds
No errors
50 connections per second for 60 seconds
No errors
100 connections per second for 60 seconds
Received 3 errors, appearing later during the run
2014-02-07 00:12:10Z Error HttpResponseExtensions Error occured while Processing Request: [IOException] Write failure Write failure|The socket has been shut down
2014-02-07 00:12:10Z Info HttpResponseExtensions Failed to write error to response: {0} Cannot be changed after headers are sent.
5 connections per second for 60 seconds
No errors
100 connections per second for 30 seconds
No errors
100 connections per second for 60 seconds
Received 1 error same as above, appearing later during the run
100 connections per second for 45 seconds
No errors
Doing some research on this, this error seems to be a standard one received when a client closed the connection. As this is only occurring during the heavier load tests, I am wondering if it is just getting to the upper limits of what the server instance can support? If not any suggestions on hunting down the source of the errors?
Related
I have a test that was alerting because it was taking extra time for an asset to load. We changed from waitForNoRequest to a pause (at Catchpoint's suggestion). That did not seem to have the expected effect of waiting for things to load. We increased the pause from 3000 to 12000 and that helped to allow the page to load and stop the alert. We noticed some more alerts, so I tried to increase the pause to something like 45000 and it would not allow me to pause for that long.
So the main question here is - what functionality does both of these different features provide? What do I gain by pausing instead of waiting, if anything?
Here's the test, data changed to protect company specific info. Step 3 is where we had some failures and we switched between pause and wait.
// Step - 1
open("https://website.com/")
waitForNoRequest("2000")
click("//*[#id=\"userid\"]")
type("//*[#id=\"userid\"]", "${username}")
setStepName("Step1-Login-")
// Step - 2
clickMouseAndWait("//*[#id=\"continue\"]")
waitForVisible("//*[#id=\"challenge-password\"]")
click("//*[#id=\"challenge-password\"]")
type("//*[#id=\"challenge-password\"]", "${password}")
setStepName("Step2-Login-creds")
// Step - 3
clickMouseAndWait("//*[#id=\"signIn\"]")
setStepName("Step3-dashboard")
waitForTitle("Dashboard")
waitForNoRequest("3000")
click("//*[#id=\"account-header-wrapper\"]")
waitForVisible("//*[#id=\"logout-link\"]")
click("//*[#id=\"logout-link\"]")
// Step - 4
clickAndWait("//*[text()=\"Sign Out\"]")
waitForTitle("Login - ")
verifyTextPresent("You have been logged out.")
setStepName("Step5-Logout")
Rachana here, I’m a member of the Technical Service Team here at Catchpoint, I’ll be happy to answer your questions.
Please find the differences below between waitForNoRequest and Pause commands:
Pause
Purpose: This command pauses the script execution for a specified amount of time, whether there are HTTP/s requests downloading or not. Time value is provided in milliseconds, it can range between 100 to 30,000 ms.
Explanation: This command is used when the agent needs to wait for a set amount of time and this is not impacted by the way the requests are loaded before proceeding to the next step or command. Only a parameter is required for this action.
WaitForNoRequest
Purpose: This commands waits for a specified amount of time, when there was no HTTP/s requests downloading. The wait time parameter can range between 1,000 to 5,000 ms.
Explanation: The only parameter for this action is a wait time. The agent will wait for that specified amount of time before moving onto the next step/command. Which will, in return, allow necessary requests more time to load after document complete.
For instance when you add waitforNoRequest(5000), initially agent waits 5000 ms after doc complete for any network activity. During that period if there is any network activity, then the agent waits another 5000 ms for the next network activity to end and the process goes on until no other request loads within the specified timeframe(5000 ms).
A pause command with 12000 ms, gives exactly 12 seconds to load the page. After 12 seconds the script execution will continue to next command no matter the page is loaded or not.
Since waitForNoRequest has a max time value of 5000 ms, you can tell the agent to wait for a gap of 5 seconds when there is no network activity. In this case, the page did not have any network activity for 3 seconds and hence proceeded to the next action. The page was not loaded completely and the script failed.
I tried to increase the pause to something like 45000 and it would not allow me to pause for that long.
We allow a maximum of 30 seconds pause time hence 45 seconds will not work.
Please reach out to our support team and we’ll be glad to connect you with our scripting SMEs and help you with any scripting needs you might have.
Probably explanation is simple - but I couldn't find answer to my question:
I am running jmeter test from one VM (worker) to another (target). On worker I have jmeter with 100 threads (100 users). On target I have API that runs on Apache. When I run "apachetop -f access_log" on target, I see only about 7 req/s.
Can someone explain me, why I don't see 100 req/s on target?
In test result in jmeter I see always 200 OK, so all request are hitting the target, and moreover target always responds. So I am not dropping any requests here. Network bandwidth between machines is 1G. What I am missing here?
Thanks,
Daddy
100 users doesn't necessarily mean 100 requests per second, even more, it is highly unlikely.
According to JMeter glossary:
Elapsed time. JMeter measures the elapsed time from just before sending the request to just after the last response has been received. JMeter does not include the time needed to render the response, nor does JMeter process any client code, for example Javascript.
Roughly, if JMeter is able to get response from server in 1 second - you will get 100 requests/second. If response time will be 2 seconds - throughput will be 50 requests/second, etc, response time 4 seconds - 25 requests/second, etc.
Also JMeter configuration matters. If you don't provide enough loops you may run into a situation where some threads already finished and some are not even started. See JMeter Test Results: Why the Actual Users Number is Lower than Expected article for more detailed explanation.
Your target load = 100 threads ( you are assuming it should generate 100 req/sec as per your plan)
Your actual load = 7 req / sec = 7*3600 / hour = 25200
Per thread throughput = 25200 / 100 threads = 252 iterations / thread / hour
Per transaction time = 3600 / 252 = 14.2 secs
This means, JMeter should be actually sending each request every 14 secs per thread. i.e., 100 requests for every 14.2 secs.
Now, analyze your JMeter summary report for the transaction timers to find out where the remaining 13.2 secs are being spent.
Possible issues are
1. High DNS resolution time (DNS issue)
2. High connection setup time (indicates load balancer issues)
3. High Request send time (indicates n/w or firewall throttling issues)
4. High request receive time (same as #3)
Now, the time that you see in Apache logs are mostly visible to JMeter as time to first byte time. I am not sure about the machine that you are running your testing. If your worker can support curl, use Curl to find the components for a single request.
echo 'request payload for POST'
| curl -X POST -H 'User-Agent: myBrowser' -H 'Content-Type: application/json' -d #- -s -w '\nDNS time:\t%{time_namelookup}\nTCP Connect time:\t%{time_connect}\nAppCon Protocol time:\t%{time_appconnect}\nRedirect time:\t%{time_redirect}\nPreXfer time:\t%{time_pretransfer}\nStartXfer time:\t%{time_starttransfer}\n\nTotal time:\t%{time_total}\n' http://mytest.test.com
If the above output indicates no such issues then the time must have been spent within JMeter. You should tune your JMeter implementation by using various options like beanshell / JSR223 etc.
I'm maintaining an antedeluvian Notes application which connects to a SAP back-end via a manually done 'Webservice'
The server is running Domino Release 7.0.4FP2 HF97.
The Webservice is not the more recently Webservice Consumer, but a large Java agent which is using Apache soap.jar (org.apache.soap). Below an example of the calling code.
private Call setupSOAPCall() {
Call call = new Call();
SOAPHTTPConnection conn = new SOAPHTTPConnection();
call.setSOAPTransport(conn);
call.setEncodingStyleURI(Constants.NS_URI_SOAP_ENC);
There has been a change in the SAP system which is now taking 8 minutes to complete (verified by SAP Team).
I'm getting an error message as follows:
[SOAPException: faultCode=SOAP-ENV:Client; msg=For input string: "906 "; targetException=java.lang.NumberFormatException: For input string: "906 "]
I found a blog article describing the error message quite closely:
https://thejavablog.wordpress.com/category/jmeter/
and I've come to the hypothesis that it is a timeout message that is returning to my Call object and that this timeout message is being incorrectly parsed, hence the NumberFormat Exception.
Looking at my logs I can see that there is a time difference of 62 seconds between my call and the response.
I recommended that the server setting in the server document, tab Internet Protocols/HTTP/Timeouts/Request timeouts be changed from 60 seconds to 600 seconds, and the http task restarted with
tell http restart
I've re-run the tests and I am getting the same error, and the time difference is still slightly more than 60 seconds, which is not what I was expecting.
I read Michael Rulnau's blog entry
http://www.mruhnau.net/2014/06/how-to-overcome-domino-webservice.html
which points to this APR
http://www-01.ibm.com/support/docview.wss?uid=swg1LO48272
but I'm not convinced that this would apply in this case, since there is no way that IBM would know that my Java agent is in fact making a Soap call.
My current hypothesis is that I have to use either the setTimeout() method on
org.apache.axis.client.Call
https://axis.apache.org/axis/java/apiDocs/org/apache/axis/client/Call.html
or on the org.apache.soap.transport.http.SOAPHTTPConnection
https://docs.oracle.com/cd/B13789_01/appdev.101/b12024/org/apache/soap/transport/http/SOAPHTTPConnection.html
and that the timeout value is an apache default, not something that is controlled by the Domino server.
I'd be grateful for any help.
I understand your approach, and I hope this is the correct one to solve your problem.
Add a debug (console write would be fine) that display the default Timeout then try to increase it to 10 min.
SOAPHTTPConnection conn = new SOAPHTTPConnection();
System.out.println("time out is :" + conn.getTimeout());
conn.setTimeout(600000);//10 min in ms
System.out.println("after setting it, time out is :" + conn.getTimeout());
call.setSOAPTransport(conn);
Now keep in mind that Dommino has also a Max LotusScript/Java execution time, check this value and (at least for a try) change it: http://www.ibm.com/support/knowledgecenter/SSKTMJ_9.0.1/admin/othr_servertasksagentmanagertab_r.html (it's version 9 help but this part should be identical)
I've since discovered that it wasn't my code generating the error; the default timeout for the apache axis SOAPHTTPConnetion is 0, i.e. no timeout.
I have a job that periodically does some work involving ServerXmlHttpRquest to perform an HTTP POST. The job runs every 60 seconds.
And normally it runs without issue. But there's about a 1 in 50,000 chance (every two or three months) that it will hang:
IXMLHttpRequest http = new ServerXmlHttpRequest();
http.open("POST", deleteUrl, false, "", "");
http.send(stuffToDelete); <---hang
When it hangs, not even the Task Scheduler (with the option enabled to kill the job if it takes longer than 3 minutes to run) can end the task. I have to connect to the remote customer's network, get on the server, and use Task Manager to kill the process.
And then its good for another month or three.
Eventually i started using Task Manager to create a process dump,
so i could analyze where the hang is. After five crash dumps (over the last 11 months or so) i get a consistent picture:
ntdll.dll!_NtWaitForMultipleObjects#20()
KERNELBASE.dll!_WaitForMultipleObjectsEx#20()
user32.dll!MsgWaitForMultipleObjectsEx()
user32.dll!_MsgWaitForMultipleObjects#20()
urlmon.dll!CTransaction::CompleteOperation(int fNested) Line 2496
urlmon.dll!CTransaction::StartEx(IUri * pIUri, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4453 C++
urlmon.dll!CTransaction::Start(const wchar_t * pwzURL, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4515 C++
msxml3.dll!URLMONRequest::send()
msxml3.dll!XMLHttp::send()
Contoso.exe!FrobImporter.TFrobImporter.DeleteFrobs Line 971
Contoso.exe!FrobImporter.TFrobImporter.ImportCore Line 1583
Contoso.exe!FrobImporter.TFrobImporter.RunImport Line 1070
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.HandleFrobImport Line 433
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.CoreExecute Line 71
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.Execute Line 84
Contoso.exe!Contoso.Contoso Line 167
kernel32.dll!#BaseThreadInitThunk#12()
ntdll.dll!__RtlUserThreadStart()
ntdll.dll!__RtlUserThreadStart#8()
So i do a ServerXmlHttpRequest.send, and it never returns. It will sit there for days (causing the system to miss financial transactions, until come Sunday night i get a call that it's broken).
It is of no help unless someone knows how to debug code, but the registers in the stalled thread at the time of the dump are:
EAX 00000030
EBX 00000000
ECX 00000000
EDX 00000000
ESI 002CAC08
EDI 00000001
EIP 732A08A7
ESP 0018F684
EBP 0018F6C8
EFL 00000000
Windows Server 2012 R2
Microsoft IIS/8.5
Default timeouts of ServerXmlHttpRequest
You can use serverXmlHttpRequest.setTimeouts(...) to configure the four classes of timeouts:
resolveTimeout: The value is applied to mapping host names (such as "www.microsoft.com") to IP addresses; the default value is infinite, meaning no timeout.
connectTimeout: A long integer. The value is applied to establishing a communication socket with the target server, with a default timeout value of 60 seconds.
sendTimeout: The value applies to sending an individual packet of request data (if any) on the communication socket to the target server. A large request sent to a server will normally be broken up into multiple packets; the send timeout applies to sending each packet individually. The default value is 30 seconds.
receiveTimeout: The value applies to receiving a packet of response data from the target server. Large responses will be broken up into multiple packets; the receive timeout applies to fetching each packet of data off the socket. The default value is 30 seconds.
The KB305053 (a server that decides to keep the connection open will cause serverXmlHttpRequest to wait for the connection to close) seems like it plausibly could be the issue. But the 30 second default timeout would have taken care of that.
Possible workaround - Add myself to a Job
The Windows Task Scheduler is unable to terminate the task; even though the option is enabled to do do.
I will look into using the Windows Job API to add my self process to a job, and use SetInformationJobObject to set a time limit on my process:
CreateJobObject
AssignProcessToJobObject
SetInformationJobObject
to limit my process to three minutes of execution time:
PerProcessUserTimeLimit
If LimitFlags specifies
JOB_OBJECT_LIMIT_PROCESS_TIME, this member is the per-process
user-mode execution time limit, in 100-nanosecond ticks. Otherwise,
this member is ignored.
The system periodically checks to determine
whether each process associated with the job has accumulated more
user-mode time than the set limit. If it has, the process is
terminated.
If the job is nested, the effective limit is the most
restrictive limit in the job chain.
Although since Task Scheduler uses Job objects to also limit a task's time, i'm not hopeful that the Job Object can limit a job either.
Edit: Job objects cannot limit a process by process time - only user time. And with a process idle waiting for an object, it will not accumulate any user time - certainly not three minutes worth.
Bonus Reading
How can a ServerXMLHTTP GET request hang? (GET, not POST)
KB305053: ServerXMLHTTP Stops Responding When You Send a POST Request (which says the timeout should expire; where mine does not)
MS Forums: oHttp.Send - Hangs (HEAD, not POST)
MS Forums: ASP to test SOAP WebService using MSXML2.ServerXMLHTTP Send hangs
CC to MS Support Forums
Consider switching to a newer, supported API.
msxml6.dll using MSXML2.ServerXMLHTTP.6.0
winhttpcom.dll using WinHttp.WinHttpRequest.5.1.
The msxml3.dll library is no longer supported and is only kept around for compatibility reasons. Plus, there were a number of security and stability improvements included with msxml4.dll (and newer) that you are missing out on.
Getting the following error from SQL Server 2008:
A fatal error occurred while reading the input stream from the
network. The session will be terminated (input error: 109, output
error: 0).
I am running Windows 7 Pro 64 BIT / 12 GIG RAM, IIS.
Under light load (1 request per second) I get the above error, not sure why.
I am using LoadUI (as this is a service based app) to execute the same request 1 per second. Which works fine. Now if I make another request from a Web App then I get the above error message, IIS Process's memory jumps (can carries on) from couple of hundred meg to 9 GIG in matter of 5 seconds.
It appears that SQL Server cannot handle the amount of requests/connections on Windows 7, I am hoping I will not see this in PROD.
Any help would be great.
Paul.