Setting a timeout on webservice consumer built with org.apache.axis.client.Call and running on Domino - apache

I'm maintaining an antedeluvian Notes application which connects to a SAP back-end via a manually done 'Webservice'
The server is running Domino Release 7.0.4FP2 HF97.
The Webservice is not the more recently Webservice Consumer, but a large Java agent which is using Apache soap.jar (org.apache.soap). Below an example of the calling code.
private Call setupSOAPCall() {
Call call = new Call();
SOAPHTTPConnection conn = new SOAPHTTPConnection();
call.setSOAPTransport(conn);
call.setEncodingStyleURI(Constants.NS_URI_SOAP_ENC);
There has been a change in the SAP system which is now taking 8 minutes to complete (verified by SAP Team).
I'm getting an error message as follows:
[SOAPException: faultCode=SOAP-ENV:Client; msg=For input string: "906 "; targetException=java.lang.NumberFormatException: For input string: "906 "]
I found a blog article describing the error message quite closely:
https://thejavablog.wordpress.com/category/jmeter/
and I've come to the hypothesis that it is a timeout message that is returning to my Call object and that this timeout message is being incorrectly parsed, hence the NumberFormat Exception.
Looking at my logs I can see that there is a time difference of 62 seconds between my call and the response.
I recommended that the server setting in the server document, tab Internet Protocols/HTTP/Timeouts/Request timeouts be changed from 60 seconds to 600 seconds, and the http task restarted with
tell http restart
I've re-run the tests and I am getting the same error, and the time difference is still slightly more than 60 seconds, which is not what I was expecting.
I read Michael Rulnau's blog entry
http://www.mruhnau.net/2014/06/how-to-overcome-domino-webservice.html
which points to this APR
http://www-01.ibm.com/support/docview.wss?uid=swg1LO48272
but I'm not convinced that this would apply in this case, since there is no way that IBM would know that my Java agent is in fact making a Soap call.
My current hypothesis is that I have to use either the setTimeout() method on
org.apache.axis.client.Call
https://axis.apache.org/axis/java/apiDocs/org/apache/axis/client/Call.html
or on the org.apache.soap.transport.http.SOAPHTTPConnection
https://docs.oracle.com/cd/B13789_01/appdev.101/b12024/org/apache/soap/transport/http/SOAPHTTPConnection.html
and that the timeout value is an apache default, not something that is controlled by the Domino server.
I'd be grateful for any help.

I understand your approach, and I hope this is the correct one to solve your problem.
Add a debug (console write would be fine) that display the default Timeout then try to increase it to 10 min.
SOAPHTTPConnection conn = new SOAPHTTPConnection();
System.out.println("time out is :" + conn.getTimeout());
conn.setTimeout(600000);//10 min in ms
System.out.println("after setting it, time out is :" + conn.getTimeout());
call.setSOAPTransport(conn);
Now keep in mind that Dommino has also a Max LotusScript/Java execution time, check this value and (at least for a try) change it: http://www.ibm.com/support/knowledgecenter/SSKTMJ_9.0.1/admin/othr_servertasksagentmanagertab_r.html (it's version 9 help but this part should be identical)

I've since discovered that it wasn't my code generating the error; the default timeout for the apache axis SOAPHTTPConnetion is 0, i.e. no timeout.

Related

ServerXmlHttpRequest hanging sometimes when doing a POST

I have a job that periodically does some work involving ServerXmlHttpRquest to perform an HTTP POST. The job runs every 60 seconds.
And normally it runs without issue. But there's about a 1 in 50,000 chance (every two or three months) that it will hang:
IXMLHttpRequest http = new ServerXmlHttpRequest();
http.open("POST", deleteUrl, false, "", "");
http.send(stuffToDelete); <---hang
When it hangs, not even the Task Scheduler (with the option enabled to kill the job if it takes longer than 3 minutes to run) can end the task. I have to connect to the remote customer's network, get on the server, and use Task Manager to kill the process.
And then its good for another month or three.
Eventually i started using Task Manager to create a process dump,
so i could analyze where the hang is. After five crash dumps (over the last 11 months or so) i get a consistent picture:
ntdll.dll!_NtWaitForMultipleObjects#20()
KERNELBASE.dll!_WaitForMultipleObjectsEx#20()
user32.dll!MsgWaitForMultipleObjectsEx()
user32.dll!_MsgWaitForMultipleObjects#20()
urlmon.dll!CTransaction::CompleteOperation(int fNested) Line 2496
urlmon.dll!CTransaction::StartEx(IUri * pIUri, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4453 C++
urlmon.dll!CTransaction::Start(const wchar_t * pwzURL, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4515 C++
msxml3.dll!URLMONRequest::send()
msxml3.dll!XMLHttp::send()
Contoso.exe!FrobImporter.TFrobImporter.DeleteFrobs Line 971
Contoso.exe!FrobImporter.TFrobImporter.ImportCore Line 1583
Contoso.exe!FrobImporter.TFrobImporter.RunImport Line 1070
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.HandleFrobImport Line 433
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.CoreExecute Line 71
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.Execute Line 84
Contoso.exe!Contoso.Contoso Line 167
kernel32.dll!#BaseThreadInitThunk#12()
ntdll.dll!__RtlUserThreadStart()
ntdll.dll!__RtlUserThreadStart#8()
So i do a ServerXmlHttpRequest.send, and it never returns. It will sit there for days (causing the system to miss financial transactions, until come Sunday night i get a call that it's broken).
It is of no help unless someone knows how to debug code, but the registers in the stalled thread at the time of the dump are:
EAX 00000030
EBX 00000000
ECX 00000000
EDX 00000000
ESI 002CAC08
EDI 00000001
EIP 732A08A7
ESP 0018F684
EBP 0018F6C8
EFL 00000000
Windows Server 2012 R2
Microsoft IIS/8.5
Default timeouts of ServerXmlHttpRequest
You can use serverXmlHttpRequest.setTimeouts(...) to configure the four classes of timeouts:
resolveTimeout: The value is applied to mapping host names (such as "www.microsoft.com") to IP addresses; the default value is infinite, meaning no timeout.
connectTimeout: A long integer. The value is applied to establishing a communication socket with the target server, with a default timeout value of 60 seconds.
sendTimeout: The value applies to sending an individual packet of request data (if any) on the communication socket to the target server. A large request sent to a server will normally be broken up into multiple packets; the send timeout applies to sending each packet individually. The default value is 30 seconds.
receiveTimeout: The value applies to receiving a packet of response data from the target server. Large responses will be broken up into multiple packets; the receive timeout applies to fetching each packet of data off the socket. The default value is 30 seconds.
The KB305053 (a server that decides to keep the connection open will cause serverXmlHttpRequest to wait for the connection to close) seems like it plausibly could be the issue. But the 30 second default timeout would have taken care of that.
Possible workaround - Add myself to a Job
The Windows Task Scheduler is unable to terminate the task; even though the option is enabled to do do.
I will look into using the Windows Job API to add my self process to a job, and use SetInformationJobObject to set a time limit on my process:
CreateJobObject
AssignProcessToJobObject
SetInformationJobObject
to limit my process to three minutes of execution time:
PerProcessUserTimeLimit
If LimitFlags specifies
JOB_OBJECT_LIMIT_PROCESS_TIME, this member is the per-process
user-mode execution time limit, in 100-nanosecond ticks. Otherwise,
this member is ignored.
The system periodically checks to determine
whether each process associated with the job has accumulated more
user-mode time than the set limit. If it has, the process is
terminated.
If the job is nested, the effective limit is the most
restrictive limit in the job chain.
Although since Task Scheduler uses Job objects to also limit a task's time, i'm not hopeful that the Job Object can limit a job either.
Edit: Job objects cannot limit a process by process time - only user time. And with a process idle waiting for an object, it will not accumulate any user time - certainly not three minutes worth.
Bonus Reading
How can a ServerXMLHTTP GET request hang? (GET, not POST)
KB305053: ServerXMLHTTP Stops Responding When You Send a POST Request (which says the timeout should expire; where mine does not)
MS Forums: oHttp.Send - Hangs (HEAD, not POST)
MS Forums: ASP to test SOAP WebService using MSXML2.ServerXMLHTTP Send hangs
CC to MS Support Forums
Consider switching to a newer, supported API.
msxml6.dll using MSXML2.ServerXMLHTTP.6.0
winhttpcom.dll using WinHttp.WinHttpRequest.5.1.
The msxml3.dll library is no longer supported and is only kept around for compatibility reasons. Plus, there were a number of security and stability improvements included with msxml4.dll (and newer) that you are missing out on.

Why does my code run 2 different ways with the exact same code?

I have an application that I have started developing that monitors websocket messages from all clients connected to the websocket server by relaying all messages received from the server to this application.
Problem
When I run my program (In visual studio I hit Start), it builds and starts up perfectly, and does most of the functionality the same everytime. However, I have a common occurance of a portion of code that will not run the same. Below is the small snippet of that code.
msg = "set name monitor"
SendMessage2(socket, msg, msg.Length)
msg = "set monitor 1"
SendMessage2(socket, msg, msg.Length)
Console.WriteLine("We are after our second SendMessage2 function")
I know that the two calls to SendMessage2 are always executed because visual studio's debug console will output the following
We are at the end of the SendMessage2 Sub
We are at the end of the SendMessage2 Sub
We are after our second SendMessage2 function
I also know when it executes correctly because my websocket server will either output one of the two blocks
Output when app runs correctly
Client 4 connected
New thread created
Connection received. Parsing headers.
Message from socket #4: "set name monitor"
Message from socket #4: "set monitor 1"
Output when app runs incorrectly
Client 4 connected
New thread created
Connection received. Parsing headers.
Message from socket #4: "set name monitor"
Notice how the second output is missing the second message from the monitor application.
What have I tried
Using a string variable to call the functions
Calling the functions using static string arguments (not using the variable msg)
SyncLocking the functions separately
SyncLocking inside the SendMessage2 function
Reordering the functions (swapping the strings to change behavior)
TL;DR
Why is it that even when I do not change my code, my program will execute two separate ways? Am I doing something incorrectly when calling my SendMessage2 Sub?
I am all out of ideas. I am willing to try any recommendation to fix this problem.
All code can be found on GitHub here
So I figured it out.
It is actually not the VB application that is messing up. Nor was my server. While debugging I was looking at the number of bytes received by my server and I noticed the following:
Client 4 connected
New thread created
Connection received. Parsing headers.
bytes read: 25
Message from socket #4: "set name monitor"
bytes read: 22
Message from socket #4: "set monitor 1"
Ok great we have 25 bytes from set name monitor and 22 bytes from set monitor 1
Client 4 connected
New thread created
Connection received. Parsing headers.
bytes read: 47
Message from socket #4: "set name monitor"
And boom. Both programs were doing their jobs, sending the correct number of bytes every time and reading the correct number. However, the VB application is sending them so quickly back to back that my server was reading all 47 bytes at a time instead of the separate 25 and 22 bytes.
Solution
I solved this problem by implementing a secondary buffer in my server to store off all bytes after the first message should multiple messages by group like this. Now I check if my secondaryBuffer is empty before reading in new bytes.
Here is a portion of the code used to solve the problem
/*Byte Check*/
for (j=0; j < bytes; j++) {
if (j == 0)
continue;
if (readBuffer[j] == '\x81' && readBuffer[j-1] == '\x00' && readBuffer[j-2] == '\x00') {
secondaryBytes = bytes - j;
printf("Potential second message attached to this message\nCopying it to the secondary buffer.\n");
memcpy(secondaryBuffer, readBuffer + j, secondaryBytes);
break;
}//END IF
}//END FOR LOOP
/**/

VBA Error - Reflection.FTP.3

I had this error pop up on exactly the 3rd line of code below. There seem to be no explanation on the Internet for this behaviour.
I'm looking at why this error came up, and fixed itself after few minutes.
Set Ftp = CreateObject("Reflection.FTP.3")
Ftp.Open "xxx.xxx.xxx.xxx", "username", "password"
Ftp.SetCurrentDirectory "DirectoryName/DirectoryName/DirectoryName"
What was the error?
Run-time error '-2147418113 (8000ffff)':
Method 'SetCurrentDirectory' of object 'IReflectionFTP' failed
More details:
Application: Excel Macro
Language : VB (VBA)
*Is this because of a coding error? *
Not likely. The macro has been long running and this came up for the first time.
*Is it because of a FTP service disruption? *
May be. But logs have a recording for every second and there seems to be no outage.
It seems to me there is a connection problem here - maybe a timeout? I assume that your three lines of code don't execute one after another (ie the SetCurrentDirectory is after some more code). This error will come up if the Ftp object doesn't have a valid connection that is logged in. Change the IP for the Open command to an invalid one and you'll see you get the same error.
Try setting the following line of code before SetCurrentDirectory command.
If FTP.Status = rcLoggedIn + rcConnected Then
Ftp.SetCurrentDirectory "DirectoryName/DirectoryName/DirectoryName"
Else
'Error handle
End If
Note, that you are late binding the object so for it to work for you, you'll need the If statement to be:
If FTP.Status = 17 Then
Also, if it is a timeout problem then I'd set the Timeout period for the session to be longer, ie FTP.TimeoutSession = 300.

How to diagnose "the operation has timed out" HttpException

I am calling 5 external servers to retrieve XML-based data for each request for a particular webpage on my IIS 6 server. Present volume is between 3-5 incoming requests per second, meaning 15-20 outgoing requests per second.
99% of the outgoing requests from my server (the client) to the external servers (the server) work OK but about 100-200 per day end up with a "The operation has timed out" exception.
This suggests I have a resource problem on my server - some shortage of sockets, ports etc or a thread lock but the problem with this theory is that the failures are entirely random - there are not a number of requests in a row that all fail - and two of the external servers account for the majority of the failures.
My question is how can I further diagnose these exceptions to determine if the problem is on my end (the client) or on the other end (the servers)?
The volume of requests precludes putting an analyzer on the wire - it would be very difficult to capture these few exceptions. I have reset CONNECTIONS and THREADS in my machine.config and the basic code looks like:
Dim hRequest As HttpWebRequest
Dim responseTime As String
Dim objWatch As New Stopwatch
Try
' calculate time it takes to process transaction
objWatch.Start()
hRequest = System.Net.WebRequest.Create(url)
' set some defaults
hRequest.Timeout = 5000
hRequest.ReadWriteTimeout = 10000
hRequest.KeepAlive = False ' to prevent open HTTP connection leak
hRequest.SendChunked = False
hRequest.AllowAutoRedirect = True
hRequest.MaximumAutomaticRedirections = 3
hRequest.Accept = "text/xml"
hRequest.Proxy = Nothing 'do not waste time searching for a proxy
hRequest.ServicePoint.Expect100Continue = False
Dim feed As New XDocument()
' use *Using* to auto close connections
Using hResponse As HttpWebResponse = DirectCast(hRequest.GetResponse(), HttpWebResponse)
Using reader As XmlReader = XmlReader.Create(hResponse.GetResponseStream())
feed = XDocument.Load(reader)
reader.Close()
End Using
hResponse.Close()
End Using
objWatch.Stop()
' Work here with returned contents in "feed" document
Return XXX' some results here
Catch ex As Exception
objWatch.Stop()
hRequest.Abort()
Return Nothing
End Try
Any suggestions?
By default, HttpWebRequest limits you to 2 connections per HTTP/1.1 server. So, if your requests take time to complete, and you have incoming requests queuing up on the server, you will run out of connection and thus get timeouts.
You should change the max outgoing connections on ServicePointManager.
ServicePointManager.DefaultConnectionLimit = 20 // or some big value.
You said that you are doing 5 outgoing request for each incoming request to the ASP page. Is that 5 different servers, or the same server?
DO you wait for the previous request to complete, before issuing the next one? Is the timeout happening while it is waiting for a connection, or during the request/response?
If the timeout is happening during the request/response then it means that the target server is under stress. The only way to find out if this is the case, is to run wireshark/netmon on one of the machines, and look at the network trace to see if the request from the app is even making it through to the server, and if it is, whether the target server is responding within the given timeout.
If this is a thread starvation issue, then one of the ways to diagnose it is to attach windbg.exe debugger to w3wp.exe process, when you start getting timeout. Then load the sos.dll debugging extension. And run the !threads command, followed by !threadpool command. It will show you how many Worker threads and completion port threads are utilized/remaining. If the #completionport threads or worker threads are low, then that will contribute to the timeout.
Alternatively, you can monitor ASP.NET and System.net perf counters. See if the ASP.NET request queue is increasing monotonically - this might indicate that your outgoing requests are not completing fast enough.
Sorry, there are no easy answers here. THere is a lot of avenues you will need to explore. If I were you, I would start off by attaching windbg.exe to w3wp when you start getting timeouts and do what I described earlier.

Finding the Uptime of a server programatically

Does anyone know of a way to programatically find the uptime of a server running Windows 2000? We have a service running on the machine written in VB.NET, that reports back to our server via a webservice.
Another way is to use the performance counters from .NET e.g.
Dim pc As PerformanceCounter = New PerformanceCounter("System", "System Up Time")
pc.NextValue() ' This returns zero for a reason I don't know
' This call to NextValue gets the correct value
Dim ts As TimeSpan = TimeSpan.FromSeconds(pc.NextValue())
So basically, the PerformanceCounter class will return the number of seconds the system has been up and from there you can do what you want.
If you have SNMP enabled, you can query the following OID: 1.3.6.1.2.1.1.3.0. This will give you the system uptime. It is defined as "The time (in hundredths of a second) since the network management portion of the system was last re-initialized."