About losing HTTP Requests - apache

I have a server to which my client sends a HTTP GET request with some values. The server on its end simply stores these values to a database.
Now, I am observing that sometimes I do not observe these values in the database. One of the following could have happened:
The client never sent it
The server never received it
The server failed in writing to the database
My strongest doubt is that the reason is 2 - but I am unable to explain it completely. Since this is an HTTP request (which means there is TCP underneath) reliable delivery of the GET request should be guaranteed, right? Is it possible that even though I send a GET request to the server - it was never received by the server? If yes, what is TCP doing there?
Or, can I confidently assert that if the server is up and running and everything sent to the server is written to the database, then the absence of the details of the GET request in the database means the client never sent it?
Not sure if the details will help - but I am running a tomcat server and I am just sending a name-value pair through the get request.

There are a few things you seem to be missing. First of all, yes, if TCP finishes successfully, you pretty much have a guarantee that your message (i.e. the TCP payload) has reached the other side: TCP assures that it will take care of lost packages and the order in which packages arrive. However, this is not universially failproof, as there are still things beyond the powers of TCP (think of a physical disconnect by cutting through an ethernet cable). There is also no assertion regarding the syntactical correctness of the protocol "above." Any checks beyond delivering a bit-perfect copy is simply not TCP's concern.
So, there is a chance that the requests issued by your client are faulty or that they are indeed correct but not parsed correctly by your server. Former is striking me as more likely as latter one as Tomcat is a very mature piece of software. I think it would help tremendously if you would record and analyse some of your generated traffic through e.g. Wireshark.
You do not really mention what database you have in use. But there are some sacrificing acid-compliance in favour of increased write speeds. The nature of these databases brings it that you can never be really sure wether something actually got written to disk or is still residing in some buffer in memory. Should you happen to use such a db, this were another line of investigation.
Programmatically, I advise you take the following steps when dealing with HTTP traffic:
Has writing to the socket finishes without error?
Could a response be read from the socket?
Does the response carry a code in the 2xx range (indicating a successful operation)?
If any of these fail, you should really log something.
On a realated note, what you are doing there does not call for the GET method but for POST as you are changing application state. Consider it as a nice-to-have ;)

Without knowing the specifics, you can break it down into two parts. The HTTP request and the DB write. The client will receive a 200 OK response from the server when its GET request has been acknowledged. I've written code under Tomcat to connect to a MySQL DB using DAO. In the case of a failure an exception would be thrown and logged. Which ever method you're using, you'll want to figure out how failures are logged.

Related

How to handle the application if connection breaks in between a web service call

In several interviews I have been asked about handling of connection, web service calls, server responses and all. Even now I am not clear about many things.Could you please help me to get a better idea about the following scenarios?
What is the advantage of using NSURLSessionDataTask instead of NSURLConnection-I have an idea like data loss will not happen even if the connection breaks for NSURLSessionDataTask but not for the latter.But how it works?
If the connection breaks after sending the request to a server or while connecting to server , How can we handle the code at our end in case of NSURLConnection and NSURLSessionDataTask?-My idea is to use Reachability classes and check when it becomes online.
The data we are sending got updated at the server side. But we don't get the response from server. What can we do at our side to handle this situation?- Incrementing timeOutInterval is the only thing that we can do?
Please help me with these scenarios. Thank you very much in advance!!
That's multiple questions, really, but I'll try to answer them all briefly.
Most failure handling is the same between NSURLConnection and NSURLSession. The main advantages of the latter are support for background downloads and cancelling groups of related requests.
That said, if you're doing a large download that you think might fail, NSURLSession does provide download tasks that let you resume the download if your network connection fails, similar to what NSURLDownload used to do on OS X (never available on iOS). This only helps for downloading large files, though, not for large uploads (which require significant server-side support to resume) or other requests.
Your intuition is correct. When a connection fails, create a reachability object monitoring that particular hostname to see when it would be a good time to try the request again. Then, try the request again.
You might also display some sort of advisory UI to say that you have no Internet connection. (By advisory, I mean something that the user doesn't have to click on and that does not impact offline use of the app any more than necessary; look at the Facebook app for a great example.)
Provide a unique identifier when you make the request, and store that on the server along with the server's response until the client acknowledges receipt of the response (or purge it anyway after some reasonable number of days). When the upload finishes, the server gives you back its response if it can.
If something goes wrong, the client asks the server to resend the response associated with that unique identifier. Once your client has the data, it acknowledges receipt and the server deletes the response. If you ask the server for the response and it doesn't have one, then the upload didn't really complete.
With some additional work, this approach can make it possible to support long-running uploads more reliably. If an upload fails, ask the server how much data it got for that identifier, then tell the server that you're going to upload new data starting at the next byte. On the server side, overwrite the old data starting at that byte (just in case some data was still being written when you asked for the length).
Hope that helps.

How to keep an API idempotent while receiving multiple requests with the same id at the same time?

From a lot of articles and commercial API I saw, most people make their APIs idempotent by asking the client to provide a requestId or idempotent-key (e.g. https://www.masteringmodernpayments.com/blog/idempotent-stripe-requests) and basically store the requestId <-> response map in the storage. So if there's a request coming in which already is in this map, the application would just return the stored response.
This is all good to me but my problem is how do I handle the case where the second call coming in while the first call is still in progress?
So here is my questions
I guess the ideal behaviour would be the second call keep waiting until the first call finishes and returns the first call's response? Is this how people doing it?
if yes, how long should the second call wait for the first call to be finished?
if the second call has a wait time limit and the first call still hasn't finished, what should it tell the client? Should it just not return any responses so the client will timeout and retry again?
For wunderlist we use database constraints to make sure that no request id (which is a column in every one of our tables) is ever used twice. Since our database technology (postgres) guarantees that it would be impossible for two records to be inserted that violate this constraint, we only need to react to the potential insertion error properly. Basically, we outsource this detail to our datastore.
I would recommend, no matter how you go about this, to try not to need to coordinate in your application. If you try to know if two things are happening at once then there is a high likelihood that there would be bugs. Instead, there might be a system you already use which can make the guarantees you need.
Now, to specifically address your three questions:
For us, since we use database constraints, the database handles making things queue up and wait. This is why I personally prefer the old SQL databases - not for the SQL or relations, but because they are really good at locking and queuing. We use SQL databases as dumb disconnected tables.
This depends a lot on your system. We try to tune all of our timeouts to around 1s in each system and subsystem. We'd rather fail fast than queue up. You can measure and then look at your 99th percentile for timings and just set that as your timeout if you don't know ahead of time.
We would return a 504 http status (and appropriate response body) to the client. The reason for having a idempotent-key is so the client can retry a request - so we are never worried about timing out and letting them do just that. Again, we'd rather timeout fast and fix the problems than to let things queue up. If things queue up then even after something is fixed one has to wait a while for things to get better.
It's a bit hard to understand if the second call is from the same client with the same request token, or a different client.
Normally in the case of concurrent requests from different clients operating on the same resource, you would also want to implementing a versioning strategy alongside a request token for idempotency.
A typical version strategy in a relational database might be a version column with a trigger that auto increments the number each time a record is updated.
With this in place, all clients must specify their request token as well as the version they are updating (typical the IfMatch header is used for this and the version number is used as the value of the ETag).
On the server side, when it comes time to update the state of the resource, you first check that the version number in the database matches the supplied version in the ETag. If they do, you write the changes and the version increments. Assuming the second request was operating on the same version number as the first, it would then fail with a 412 (or 409 depending on how you interpret HTTP specifications) and the client should not retry.
If you really want to stop the second request immediately while the first request is in progress, you are going down the route of pessimistic locking, which doesn't suit REST API's that well.
In the case where you are actually talking about the client retrying with the same request token because it received a transient network error, it's almost the same case.
Both requests will be running at the same time, the second request will start because the first request still has not finished and has not recorded the request token to the database yet, but whichever one ends up finishing first will succeed and record the request token.
For the other request, it will receive a version conflict (since the first request has incremented the version) at which point it should recheck the request token database table, find it's own token in there and assume that it was a concurrent request that finished before it did and return 200.
It's seems like a lot, but if you want to cover all the weird and wonderful failure modes when your dealing with REST, idempotency and concurrency this is way to deal with it.

Twisted - succes (or failure) callback for LineReceiver sendLine

I'm still trying to master Twisted while in the midst of finishing an application that uses it.
My question is:
My application uses LineReceiver.sendLine to send messages from a Twisted TCP server.
I would like to know if the sendLine succeeded.
I gather that I need to somehow add a success (and error?) callback to sendLine but I don't know how to do this.
Thanks for any pointers / examples
You need to define "succeeded" in order to come up with an answer to this.
All sendLine does immediately (probably) is add some bytes to a send buffer. In some sense, as long as it doesn't raise an exception (eg, MemoryError because your line is too long or TypeError because your line was the number 3 instead of an actual line) it has succeeded.
That's not a very useful kind of success, though. Unfortunately, the useful kind of success is more like "the bytes were added to the send buffer, the send buffer was flushed to the socket, the peer received the bytes, and the receiving application acted on the data in a persistent way".
Nothing in LineReceiver can tell you that all those things happened. The standard solution is to add some kind of acknowledgement to your protocol: when the receiving application has acted on the data, it sends back some bytes that tell the original sender the message has been handled.
You won't get LineReceiver.sendLine to help you much here because all it really knows how to do is send some bytes in a particular format. You need a more complex protocol to handle acknowledgements.
Fortunately, Twisted comes with a few. twisted.protocols.amp is one: it offers remote method calls (complete with responses) as a basic feature. I find that AMP is suitable for a wide range of applications so it's often safe to recommend for new development. It largely supersedes the older twisted.spread (aka "PB") which also provides both remote method calls and remote object references (and is therefore more complex - in my experience, more complex than most applications need). There are also some options that are a bit more standard: for example, Twisted Web includes an HTTP implementation (HTTP, as you may know, is good at request/response style interaction).

Netty SSL mode strange behavior

I am trying to understand, why does Netty SSL mode work on strange way?
Also, the problem is following, when any SSL client(https browser, java client using ssl, also any ssl client application) connects to Netty server I get on beginning the full message, where I can recognize correctly the protocol used, but as long the channel stays connected, any following messages have strange structure, what is not happening same way with non-ssl mode.
As example on messageReceived method when the https browser connects to my server:
I have used PortUnificationServerHandler to switch protocols.. (without using nettys http handler, it is just example, because i use ssl mode for my own protocol too)
first message is ok, I get full header beginning with GET or POST
than I send response...
second message is only one byte long and contains "G" or "P" only.
third message is than the rest beginning either with ET or OST and the rest of http header and body..
here again follows my response...
fourth message is again one byte long and again contains only one byte..
fifth message again the rest... and on this way the game goes further..
here it is not important, which sub protocol is used, http or any else, after first message I get firstly one byte and on second message the rest of the request..
I wanted to build some art of proxy, get ssl data and send it unencoded on other listener, but when I do it directly without waiting for full data request, the target listener(http server as example) can not handle such data, if the target gets one byte as first only (even if the next message contains the rest), the channel gets immediately closed and request gets abandoned..
Ok, first though would be to do following, cache the first byte temporarily and wait for next message and than join those messages, and only than response, that works fine, but sometimes that is not correct approach, because the one byte is sometimes really the last message byte, and if i cache it and await wrongly next message, i can wait forever, because the https browser expects at this time some response and does not send any data more..
Now the question, is it possible to fix this problem with SSL? May be there are special settings having influence on this behavior?
I want fully joined message at once as is and not firstly first byte and than the rest..
Can you please confirm, that with newer Netty versions you have same behaving by using PortUnificationServerHandler (but without netty http handler, try some own handler.)
Is this behavior Ok so, I do not believe, it was projected so to work..
What you're experiencing is likely to be due to the countermeasures against the BEAST attack.
This isn't a problem. What seems to be the problem is that you're assuming that you're meant to read data in terms of messages/packets. This is not the case: TCP (and TLS/SSL) are meant to be used as streams of continuous data. You should keep reading data while data is available. Where to split incoming data where it's meaningful is guided by the application protocol. For HTTP, the indications are the blank line after the header and the Content-Length or chunked transfer encoding for the entity.
If you define your own protocol, you'll need a similar mechanism, whether you use plain HTTP or SSL/TLS. Assuming you don't need it only works by chance.
I had experienced this issue and found it was caused bu using JDK1.7. Moving back to JDK1.6 solved it. I did not have time to investigate further but have assumed for now that the SSLEngine implementation has changed in the JDK. I will investigate further when time permits.

WCF Server Push connectivity test. Ping()?

Using techniques as hinted at in:
http://msdn.microsoft.com/en-us/library/system.servicemodel.servicecontractattribute.callbackcontract.aspx
I am implementing a ServerPush setup for my API to get realtime notifications from a server of events (no polling). Basically, the Server has a RegisterMe() and UnregisterMe() method and the client has a callback method called Announcement(string message) that, through the CallbackContract mechanisms in WCF, the server can call. This seems to work well.
Unfortunately, in this setup, if the Server were to crash or is otherwise unavailable, the Client won't know since it is only listening for messages. Silence on the line could mean no Announcements or it could mean that the server is not available.
Since my goal is to reduce polling rather than immediacy, I don't mind adding a void Ping() method on the Server alongside RegisterMe() and UnregisterMe() that merely exists to test connectivity of to the server. Periodically testing this method would, I believe, ensure that we're still connected (and also that no Announcements have been dropped by the transport, since this is TCP)
But is the Ping() method necessary or is this connectivity test otherwise available as part of WCF by default - like serverProxy.IsStillConnected() or something. As I understand it, the channel's State would only return Faulted or Closed AFTER a failed Ping(), but not instead of it.
2) From a broader perspective, is this callback approach solid? This is not for http or ajax - the number of connected clients will be few (tens of clients, max). Are there serious problems with this approach? As this seems to be a mild risk, how can I limit a slow/malicious client from blocking the server by not processing it's callback queue fast enough? Is there a kind of timeout specific to the callback that I can set without affecting other operations?
Your approach sounds reasonable, here are some links that may or may not help (they are not quite exactly related):
Detecting Client Death in WCF Duplex Contracts
http://tomasz.janczuk.org/2009/08/performance-of-http-polling-duplex.html
Having some health check built into your application protocol makes sense.
If you are worried about malicious clients, then add authorization.
The second link I shared above has a sample pub/sub server, you might be able to use this code. A couple things to watch out for -- consider pushing notifications via async calls or on a separate thread. And set the sendTimeout on the tcp binding.
HTH
I wrote a WCF application and encountered a similar problem. My server checked clients had not 'plug pulled' by periodically sending a ping to them. The actual send method (it was asynchronous being a server) had a timeout of 30 seconds. The client simply checked it received the data every 30 seconds, while the server would catch an exception if the timeout was reached.
Authorisation was required to connect to the server (by using the built-in feature of WCF that force the connecting person to call a particular method first) so from a malicious client perspective you could easily add code to check and ban their account if they do something suspicious, while disconnecting users who do not authenticate.
As the server I wrote was asynchronous, there wasn't any way to really block it. I guess that addresses your last point, as the asynchronous send method fires off the ping (and any other sending of data) and returns immediately. In the SendEnd method it would catch the timeout exception (sometimes multiple for the client) and disconnect them, without any blocking or freezing of the server.
Hope that helps.
You could use a publisher / subscriber service similar to the one suggested by Juval:
http://msdn.microsoft.com/en-us/magazine/cc163537.aspx
This would allow you to persist the subscribers if losing the server is a typical scenario. The publish method in this example also calls each subscribers on a separate thread, so a few dead subscribers will not block others...