I/O Exception: Server Key ColdFusion Issue - ssl

I’m a long-time reader here but a newbie at posting a question. Hopefully I’ll cover everything you guys need to hopefully help.
Background information:
We are running ColdFusion 10 on two servers that are load balanced (I’m not sure how they are load balanced – they are not clustered and are not using sticky sessions, this much I know). Unfortunately, I do not have access to our CF server admin at all; I have to rely on others.
I’ve implemented a punch out system that allows our users to connect to a vendor’s site to shop, then returns their items to our cart on our site. This has been working in our development servers without any issues. Everything worked well when we tested this in production as well. However, when we moved it into production last week, we started getting an error, but only when the code was running off of ONE of the load balanced servers. The error we received back from the vendor site stated that the error detail was: “I/O Exception: Server Key”. All of the research I conducted led me to believe that our CF servers needed the vendors cert (it is an https connection), so I told this to our server guy. He reinstalled the certs (he had said that they were there) and that did seem to solve the problem. I was successfully able to punch out to our vendor site from both of our load balanced servers.
We did a bit more testing (which all seemed fine) and then put it back into production this morning only to have the same issue occur. On one of the servers, this is working and on the other one it is not. My server guy tells me that the vendor certs are currently in place in the ColdFusion keystore.
Here is the cfhttp call I’m using:
<cfhttp url="#vendorURL#" method="POST" throwOnError="no" result="returnedObj">
<cfhttpparam type="XML" name="xmlPunchoutData" value="#trim(RequestPunchoutXML)#" />
</cfhttp>
Where ‘RequestPunchoutXML’ has a xml structure requesting a punch out from the vendor.
This looks possibly related: ColdFusion 10 - CFHTTP - Random peer not authenticated on SSL calls (cacerts file updated) but the error I'm getting isn't this one, though I think that they are probably related.
Questions: Any idea what is going on here? Could a badly set up load balancer be the issue here? Is it possible that the cfhttp call is starting from one of the servers and getting the response returned to the other? Could there be some reason that the certs are failing? Is this some other issue altogether that I have not yet identified? Any thoughts/ideas/suggestions would be greatly helpful.
Thanks in advance,
Janice

Related

mod_perl2 with apache 2.22 Apache2::RequestIO::print: (103) Software caused connection abort

I’m trying to get a mod_perl2 application ported to AWS. As part of the port I thought I’d move from Debian Squeeze to Wheezy with the latest stable mod_perl & Apache2 combination.
The application works right up to the point I try and write JSON responses to the client. At this point, each request is canceled on the client and on the server I get the error
Apache2::RequestIO::print: (103) Software caused connection abort
whenever I write to the client, i.e.:
$self->req->print($output);
I’ve tried tcpdumping the response to the client, and I can see it being written out, but no response is received on the client end and it just barfs chips. I can’t find any information on how to get around this.
I found quite a few people asking about this question on the net without many answers. The solution to my problem was very specific but I thought I’d post what I did anyway, it may help someone.
The client was canceling the request before the response was fully written, which was crapping out Apache::RequestIO (for reasons I still don’t know).
I couldn’t work out why I was seeing this behavior.
By using tcpdump I could see that data was being written out to the client – and it looked fine.
By inspecting the page in Chrome and looking at the network stack, I could see that my request for data was being canceled after no response was received (which was odd because the code worked fine on other servers and I could see the response was being written). Debugging was may harder because with Apache crashing out with an error in print IO I couldn’t check if the bytes written equaled the bytes of data. I wasn’t sure if something was getting stuck on the server side.
So, I changed the Content-Type of the response from application/json to text/html, so that I could query the page and just look at the actual response as text. Once I did that, I could see that the response was fine.
I started to look for other causes, and I found that in the migration to the new server, I’d missed altering some URLs in the DB to point to the new server, which meant my application was trying to get some data from the old DB.
This in turn was causing a load of timing issues, which was causing my problems. Once I fixed the config, the problems went away.

Exception while dispatching incoming RPC call : encodedRequest cannot be empty

The similar problem is described here: GWT IllegalArgumentException: encodedRequest cannot be empty
My GWT application is deployed in Tomcat6, which is linked with Apache by using Coyote/JK2 connectors. For SSO I use the mod_auth_sspi/1.0.4.
When I use IE8, pages is not displayed, but for Firefox everything OK. In Tomcat logs I see the following:
SEVERE: Exception while dispatching incoming RPC call
java.lang.IllegalArgumentException: encodedRequest cannot be empty
at com.google.gwt.user.server.rpc.RPC.decodeRequest(RPC.java:232)
at org.spring4gwt.server.SpringGwtRemoteServiceServlet.processCall(SpringGwtRemoteServiceServlet.java:32)
at com.google.gwt.user.server.rpc.RemoteServiceServlet.processPost(RemoteServiceServlet.java:248)
at com.google.gwt.user.server.rpc.AbstractRemoteServiceServlet.doPost(AbstractRemoteServiceServlet.java:62)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at gov.department.it.server.RequestInterceptorFilter.doFilter(RequestInterceptorFilter.java:90)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:311)
at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:776)
at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:705)
at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:898)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
at java.lang.Thread.run(Thread.java:619)
What have I tried so far:
1) Can't find the registry key DisableNTLMPreAuth (IMHO it's not the solution, because in my case IE 8 is actively used).
2) I have installed and configured the Native Windows Authentication Framework WAFFLE
web.xml:
...
<filter>
<filter-name>NegotiateSecurityFilter</filter-name>
<filter-class>waffle.servlet.NegotiateSecurityFilter</filter-class>
<init-param>
<param-name>waffle.servlet.spi.NegotiateSecurityFilterProvider/protocols</param-name>
<param-value>NTLM</param-value>
</init-param>
</filter>
...
<filter-mapping>
<filter-name>NegotiateSecurityFilter</filter-name>
<url-pattern>/my-app/*</url-pattern>
</filter-mapping>
...
But it did not help.
3) In worker.properties I set socket_keepalive=0, but it did not help too -
worker.ajp13.type=ajp13
worker.ajp13.host=localhost
worker.ajp13.port=8009
worker.ajp13.lbfactor=50
worker.ajp13.cachesize=10
worker.ajp13.cache_timeout=600
worker.ajp13.socket_keepalive=0
worker.ajp13.socket_timeout=300
What else can I try to do?
You have rediscovered the 7 year old bug #1 in mod_auth_sspi which has affected numerous projects, frustrated numerous developers, and caused uncountable wasted man-hours over the years. Yet it still stands unresolved because the maintainer doesn't consider it a bug. Nor has it been addressed by Microsoft for older browsers, because indications are that IE9 doesn't have this problem.
Cause
It is caused by IE trying to be 'smart' and sending a zero content-length POST (I named it 0POST to try making it an indexable term to benefit those who rediscover it in the next 7 years.) with an NTLM auth header in anticipation of being challenged by the server. IE does this when it has been authenticated before in that protection space. So it knows that it will be challenged again. Sadly mod_auth_sspi is not as smart as IE, so bad things happen on the server side when a 0POST arrives and it is let through to the apps without being challenged. Except that sometimes this can happen even for unprotected areas, if they are under an area that requires authentication.
Other browsers don't pretend to be as smart as IE and don't try to save a few bytes on the first round trip for "performance", so they don't run into this problem. Here is Microsoft's explanation of this behavior.
Horrible Workaround
In Apache httpd.conf set
SSPIPerRequestAuth On
This is equivalent to the DisableNTLMPreAuth IE client-side fix you mentioned, which is impractical for a large user group. Plus it amounts to crippling all non-Apache apps also, which may be capable of handling a 0POST. There are literally NO examples of this setting being discussed or its side effects explained on the web, so I am including this only link I found that sheds some light on it. Anyway, making one server side change seems to be the lesser of the two evils. Although now, by changing the server config, you have crippled all other innocent browsers visiting this site as well.
The problem with this workaround is that it forces EVERY request to perform an SSPI handshake which results in a lot of extra 401 traffic and can affect performance. For performance, NTLM authentication is treated as 'session-based' not 'request-based' which means that the handshake occurs only at the start of the session. When using this setting, you should also set filters to prevent your log filling up with 401s. Also note that this requires KeepAlive to be turned on.
I am not sure your setup is the same as the one described in the WAFFLE fix; were they using Apache like you? I think WAFFLE applies to Tomcat, whereas you have Apache in front, so Apache is handling authentication. You might consider using that setup instead of Apache. If you can use that setup, it may be a better option than this workaround because WAFFLE has explicitly accounted for 0POST and can handle it. The author had also discovered this gem while working with GWT like you.
Interestingly, for jcifs, a fix for this very issue was posted 9 Years ago. The author also provided an excellent explanation later:
The code in the filter examines all HTTP POST requests and determines
if they contain an NTLM type 1 message. If the request contains an
NTLM type 1 message, the filter responds with a dummy type 2 message
to entertain IE's desire to re-negotiate NTLM prior to submitting any
POST data. The browser should then respond with an NTLM type 3
message along with the post data which the filter should then allow to
chain to the rest of the web application.
A simple patch was also created for mod_auth_sspi 5 years ago, if you are interested. See diff in the author's own repo. I am not sure if I agree with that approach though. It tries to detect IE/0POST, whereas I think the right fix should be to detect if the client is requesting auth with a NTLM Type 1 header, as in the jcifs filter. (Type 1 simply means that it is the first message of the handshake)
I wonder if anyone has used alternatives to mod_auth_sspi like mod_auth_ntlm_winbind and if they don't exhibit this behavior. If you have, please leave a comment. We already know WAFFLE works, but it is not a mod_auth_sspi replacement.
One alternative is to forget NTLM and use Kerberos, (mod_auth_kerb) but many people find that too complicated to setup. IE will behave this way on any challenge-response scheme, so odds are that kerb auth could run into the same problem, since a similar 401 sequence happens in both cases. But being a different module, its possible it is capable of handling this.
Lastly, I should mention that there is yet another issue that this per-request auth workaround doesn't seem to fix. I haven't seen it discussed anywhere, but I have found that sometimes after the 0POST, the server waits for a very long time before it responds with the final 200 response with the results of the (proper) POST. This long delay happens only in the end though, NOT immediately in response to the 0POST. That goes fine, and the handshake completes, but the server doesn't respond until after a long wait which I have noticed is suspiciously close to 90 seconds, like some sort of timeout. The practical result of this is that when users log in, IE8 will sometimes hang for 90sec waiting for server response. I thought the KeepAlive might be causing it, but it is not even explicitly defined in my config, so I assume it is at the 15sec Apache default. But I am sure this is related to the 0POST, because it happens only right after a successful 0POST auth handshake. Our server is in a separate (2-way) trusted domain across a firewall, so maybe that has something to do with it.
Diverse Examples of This Issue
https://confluence.atlassian.com/display/JIRAKB/NullPointerException+when+Authenticating+from+IE
http://trac.edgewall.org/ticket/2696
http://trac.edgewall.org/ticket/4560
https://drupal.org/node/82530
http://www.webmasterworld.com/apache/3087425.htm
Why "Content-Length: 0" in POST requests?
https://jira.springsource.org/browse/SEC-1087
The most hilarious example is how IE's smartness affected Microsoft's own products! They themselves couldn't understand how to deal with IE's behavior, causing a bug in ISA Server 2006.
http://support.microsoft.com/kb/942638

Sporadic invalid_request 400 errors connecting to Shopify /admin/oauth/access_token

I am using a java raw HTTP client to connect to Shopify API (specifically, using Play Framework with the non-defualt sync driver which is actually the JDK's default driver).
My application usually manages to connect successfully and convert the temporary access token into a permanent one by calling the /admin/oauth/access_token endpoint.
However, sometimes I get this error result from the API:
Generic Error(400)
{"error":"invalid_request"}
I haven't been able to reproduce the issue with my test stores - I've tried installing a fresh store, reinstalling existing stores after uninstalling, I'm not sure why this call sometimes fail and how to debug it. The API call still continues to succeed for some stores using our application.
Some things that I am doing:
Even if the URL of the store is on a custom domain, I'm always using the https://foo.myshopfiy.com/admin/oauth/access_token URL and not the URL of the custom domain, to prevent a redirect.
I am always using an https URL and never an http one, again to prevent a redirect (we noticed a few issues with redirect with the Java HTTP client, so we aim to have zero redirects)
A thread I found about this error suggest possible problems with our SSL certificates, however I don't think this is my problem because some requests work for us, and the result of running openssl on our machine does't show any issues.
How should I proceed? Open a support ticket with Shopify?
FYI, I see that this specific problem only started yesterday on Feb 19 2013, so it might be a temporary issue.
FYI, the problem was caused by reusing a temporary access code.
Our fault - Shopify could have been more clear in their error message though.

ColdFusion SSL authentication failure

I have a simple cfhttp request (a login) going out to an SSL server:
<cfhttp url="https://www2.[domain].com/api/user/login" method="POST" port="443" >
<cfhttpparam type="formfield" name="username" value="[username]" >
<cfhttpparam type="formfield" name="password" value="[password]" >
</cfhttp>
The request fails before it begins, and the ColdFusion server says:
I/O Exception: peer not authenticated
Both development environments work smashingly. They receive the login session and then hand that to the collector process which successfully taps the remote web service for data.
After I spent a day trying to get the correct certificate into the ColdFusion stores, I had the bright idea to actually compare them to the working development environments. I looked at them (keytool -list), and they are identical.
Now that the obvious is absolved the questions I'm left with are twofold:
Is there some other certificate repository I need to check, or alternately, is there a place where I can get ColdFusion to tell me what certificate repository it needs to find the certificate IN (on the off chance it can and has been altered) or if that is even possible.
Identify and correct else could be causing this.
Are the development and production environments the same? Are they all, for example, ColdFusion 9 Standard or ColdFusion 8 Enterprise?
In my experience, this error is usually caused by one of two things:
The administrator failed to install the certificate into the cacarts repository, or they installed it into the wrong one.
ColdFusion Enterprise and ColdFusion Developer edition (for ColdFusion 8 and ColdFusion 9 both, I believe) have an issue with the built-in BSafe CryptoJ library that is installed and certain types of certificates (I have not yet been able to determine a pattern) that causes this error. There are some workarounds if this is the case.
First, I would explore the possibility that you are importing into the wrong certificate repository. It can be hard to tell which repository is being used. In your CF Admin under "Setting Summary" you should be able to find the location of the JRE that is being used. It is listed under "Java Home". Take that directory and add lib/security to the end of it and that should be the location of the cacaerts file that is being used. I say should because I have seen at least one weird situation where it was not.
I HAD the same problem and I tried everything and can't fix it. Strange is that everything worked fine then suddenly stopped working. It might be a Java update on the server causing the problem or a change of the certificate from the website the CFHTTP is trying to access.
Anyway, here is a link I setup for a "demo" of this problem:
http://www.viaromania.eu/https.cfm
As you can see, I am trying to access a HTTPS service using CFHTTP tag. And it is not working. I deleted the certificate from C:\ColdFusion9\runtime\jre\lib\security\cacerts, generated a new one from the website URL, imported back, installed "certman" under CFID/admministrator, checked the certificate, it's there... and it's listed in my test page.
If you scroll to the bottom of my test page, you'll see a similar CFHTTP to https://www.google.com and this works fine, even if there is no certificate installed on the server.
It is important to mention that the request is working just perfect on my development machine, and here I also don't have any certificate installed...
AND THIS HOW I FIXED IT
1. Updated ColdFusion 9.0.2 with this - https://helpx.adobe.com/coldfusion/kb/cumulative-hotfix-1-coldfusion-902.html
2. Installed Java JDK 1.7.0_79 from here http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
3. Changed the Java Home in ColdFusion Administrator / Server Settings → Java and JVM from "C:\ColdFusion9\runtime\lib\jre" to "C:\Program Files\Java\jdk1.7.0_79\jre"
That's it. I don't know if it uses any certificate or not. They were installed in the "C:\ColdFusion9\runtime\lib\jre\lib\security\cacerts" and not moved from there or anything.

WCF Services not working from Silverlight Application after Deploying

Okay I have seen some very similar questions here but none seem to be answered to my liking. I have created a Silverlight application that calls a couple of services to populate various comboboxes from the database. I got this working without too much trouble on my local machine.
So now I want to deploy it to our webserver. It was relatively straight forward to get ISS7 to load the Silverlight application. However, none of my services seem to be working properly, in that the comboboxes are empty. In IE I get the following error:
Message: Unhandled Error in Silverlight Application An exception occurred during the operation, making the result invalid. Check InnerException for exception details. at System.ComponentModel.AsyncCompletedEventArgs.RaiseExceptionIfNecessary()
at MyTestPage.ViewModel.MyService.GetInfoCompletedEventArgs.get_Result()
at MyTestPage.ViewModel.MainPageViewModel.b__2(Object s, GetInfoCompletedEventArgs ea)
at MyTestPage.ViewModel.MyService.MyServiceClient.OnGetInfoCompleted(Object state)
Line: 1
Char: 1
Code: 0
URI: http://www.mywebsite.com/MyTestPage.aspx
My problem is that this error only occurs when deploying on the webserver and I have no clue how to debug this problem. The error says to check the InnerException but I haven't found an answer yet (after hours of searching) that tells me how I should do this.
I have tried browsing to the services and I am able to do so using the domain name i.e. http://test.myserver.com/Services/MyService.svc. However when logged onto the server and using http://localhost:3456/Services/MyService.svc - which is the path in the ServicesReferences.ClientConfig file - It cannot be found.
Some answers here seem to suggest using a clientaccesspolicy.xml file but I don't understand why this should be necessary if the services are hosted on the same server as the application - they aren't required when debugging on my local machine. Despite my reservations I have tried adding a clientaccesspolicy.xml file to the root of the application but this still doesn't make any difference.
So I have a couple of questions:
1) How do I get access to the InnerException when I am running the application on the webserver? Is there a specific log file I can view or turn on?
2) If, for some reason, I am trying to access the service in a cross domain fashion (even though they are located on the same server) how do I configure the application so that this isn't required?
UPDATE:
Ok, I was able to get the tracing to work. I can now see the trace details on the page when it loads but it doesn't really tell me anything useful. I have also added the option to write the details to the disk. Initially this file wasn't being written and I couldn't understand why. Then I noticed that refreshing my silverlight application was not triggering a write to the log. It was only when I manually browsed to the services that the log file was updated. This seems to indicate to me that my silverlight application is not hitting the services at all (for some reason). I tried cutting out the View Model object and hitting the service directly from the xaml code behind file but this didn't make any difference either.
At this point after spending more than two days trying to figure this out, I am thinking about starting again from scratch.
For my mind it shouldn't be this difficult to deploy something that works on a development machine to a webserver.
I pretty much gave up on my initial approach. I had another go following along from this video http://www.silverlight.net/learn/videos/all/net-ria-services-intro/. It uses Domain services instead of the WCF Services and it was actually fairly straight forward to get it going on the webserver. The example is two years old now so maybe there are better ways to do this now (I am open to suggestions) but at least it worked within an hour of trying it (compared to 2.5 days and getting nowhere).