WCF service writes log only if client receives results - wcf

I'm working on a WCF service to help our new code interoperate with a legacy system. The process goes like this:
Client calls the service with a request for the legacy system.
Service writes the request into a database.
Legacy system services request from the DB in its own time and writes results back into the DB (updating a status flag to say results are ready).
Client retrieves results by calling a second service method, which polls the DB until the ready flag is set.
Just before returning the results, the service updates the status flag to client has results, so that the related DB rows can be deleted.
My concern is the race condition at the last step. I can see this happening:
Service updates status to client has results.
Client times out after waiting for the service to poll the DB.
Service tries to return results. Hilarity ensues.
One way to solve this would be to have three service calls instead of two: the second call retrieves results, and the last one is an explicit acknowledgement by the client that it has them. I'd like to know whether there is a way which doesn't impose this extra "protocol" burden on the client though.
I've looked briefly into using transactions in WCF, and it sounds like they might be able to do what I need. The client (optionally) starts a transaction, flows it to the service, which uses it if it's there, and commits it when done. This seems as if it implicitly does the "third call".
Does this idea have any merit? Any disadvantages that you can see? Are there any other avenues I could explore?

Using transaction flow is possible, but flowing a transaction in a polling scenario (in each poll call) is a terrible architecture. What you generally need is transaction flow for the actual read operation, where the service modifies the record and returns the data to the client. The client then commits the transaction, which also commits the changes performed by the service.
Using transactional processing places some additional requirements on your service and clients.
Another approach can be transactional MSMQ:
Client calls the service with a request for the legacy system = client sends a message to the service's queue
Service writes the request into a database = service processes the message from its queue
Legacy system services request from the DB in its own time and writes results back into the DB (updating a status flag to say results are ready).
Service polls the database and places messages into the correct client queues. Placing the message and modifying the database records run in a single transaction.
Client processes incoming message
A transactional queue allows transactional reading (the message is removed from the queue only if the transaction is committed) and writing (the message is added to the queue only if the transaction is committed). That allows the service to delete the records before the client reads the message, because the message remains in the queue until the client successfully reads it (or until it times out, and even then it can be moved to an error queue).
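The transactional read described above can be illustrated with a small in-memory sketch. This is not MSMQ or any real queueing API; `TransactionalQueue` and its methods are hypothetical names used only to show the "message is removed only on commit" behaviour.

```python
from collections import deque
from contextlib import contextmanager


class TransactionalQueue:
    """Hypothetical in-memory queue that mimics transactional reads."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    @contextmanager
    def receive(self):
        # Take the message out tentatively. If the consumer's block raises,
        # the "transaction" rolls back and the message returns to the front.
        message = self._messages.popleft()
        try:
            yield message
        except Exception:
            self._messages.appendleft(message)
            raise

    def __len__(self):
        return len(self._messages)


q = TransactionalQueue()
q.send("result-42")

try:
    with q.receive() as message:
        raise TimeoutError("client went away before processing finished")
except TimeoutError:
    pass  # rollback: the message is still queued for the next attempt

with q.receive() as message:
    delivered = message  # normal exit commits: the message is gone
```

Because the message survives a failed read, the service side is free to clean up its database rows as soon as the message has been written to the queue.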
In both cases you should think about clients who will consume the service. Transaction flowing can be interoperable but not every web service stack supports it. MSMQ is not interoperable.

Why not reduce the likelihood of the client timing out by doing this instead:
1. Client calls service with a request for the legacy system.
2. Service writes the request into a database.
3. Legacy system services request from the DB in its own time and writes results back into the DB (updating a status flag to say results are ready).
4. Client calls a service to find out whether the results are ready. NB. no polling: just returns with an immediate yes or no.
5. If the results are NOT ready, client waits a bit and then goes back to step 4.
6. If the results ARE ready, call the service to retrieve the results. The service can update the status to "Client has results" at that point.
By doing this, the client won't be waiting for a prolonged period for the service call in step 4 to return, and the chances of a timeout should be minimal.
However, you're never going to be 100% certain that the client has received the results unless the client makes a final service call to say so. (What if, for example, the client dies after making the very last request?)
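A minimal sketch of that flow, including the final acknowledgement call, might look like the following. It uses an in-memory dictionary in place of the real database, and every name in it (`RequestStore`, `Status`, `is_ready`, `get_results`, `acknowledge`) is an assumption made for illustration, not part of any WCF API.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    READY = "ready"
    DELIVERED = "delivered"  # handed to the client, receipt not yet confirmed


class RequestStore:
    """Hypothetical in-memory stand-in for the table shared with the legacy system."""

    def __init__(self):
        self._rows = {}

    def submit(self, request_id, payload):
        self._rows[request_id] = {"status": Status.PENDING, "payload": payload, "result": None}

    def legacy_complete(self, request_id, result):
        # Simulates the legacy system writing results and setting the ready flag.
        row = self._rows[request_id]
        row["result"] = result
        row["status"] = Status.READY

    def is_ready(self, request_id):
        # Returns immediately with yes or no; the service never blocks here.
        return self._rows[request_id]["status"] in (Status.READY, Status.DELIVERED)

    def get_results(self, request_id):
        # Safe to repeat after a client timeout: rows survive until acknowledged.
        row = self._rows[request_id]
        if row["status"] not in (Status.READY, Status.DELIVERED):
            raise LookupError("results not ready")
        row["status"] = Status.DELIVERED
        return row["result"]

    def acknowledge(self, request_id):
        # Explicit confirmation from the client; only now are the rows deleted.
        if self._rows[request_id]["status"] is not Status.DELIVERED:
            raise ValueError("nothing has been delivered yet")
        del self._rows[request_id]
```

The point of the extra `DELIVERED` state is that a client which times out or crashes can simply ask for the results again, since nothing is deleted until `acknowledge` is called.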

Related

Can RabbitMQ (or similar message queuing system) be used to single thread requests per user?

The issue is we have some modern web applications that are integrated with a legacy system that was never designed to support multiple concurrent requests from a single user. Basically there are certain types of requests that the legacy system can only handle one-at-a-time from a single user. It can handle multiple concurrent requests coming from different users, but for technical reasons cannot handle multiple from a single user. In these situations, the user's first request will complete successfully, but any subsequent requests from that same user that come in while the first request is still executing will fail.
Because our apps are ajax enabled, multi-tab/multi-browser friendly, and just the fact that there are multiple apps - there are certain scenarios where a user could wind up having more than one of these types of requests being sent to the legacy system at the same time.
I'm trying to determine if something like RabbitMQ could be positioned in front of the legacy system and leveraged to single-thread requests per user/IP. The thinking being that the web apps would send all requests to MQ, and they'd stack into per-user queues and pass on to the legacy system one at a time.
I don't know if there would be concerns about the potential number of queues this could create - we have a user-base of approx 4,000.
And I know we could somewhat address this in the web apps individually, but since there are multiple apps it'd be duplicating logic across them, and you'd still have the potential for two different apps to fire off concurrent requests.
Any feedback would be appreciated. Thanks-
I'm not sure a unique queue per user will work, because each queue would need a backend worker process listening on it, and those workers would have to be created dynamically.
Below is one option, though it is a potential performance bottleneck because a single backend process handles all requests sequentially. You could use multiple worker processes, but you wouldn't know whether one had completed before another, which causes a race condition if your app requires a specific sequence of actions.
You could simply put all transactions (from all users) into a single queue and have a backend process pull off of that queue and service the request. If there needs to be a response back to the user once the request was serviced, then the worker process could respond back to a separate queue with a correlationID that could be used to send the response data back to the correct user.
I've done this before with ExpressJS apps where the following flow would happen:
The user/process/ajax makes a request
Express takes the payload from the request object and sends it to a RabbitMQ queue with a unique correlationId (e.g. UUID).
Express then takes the response object and stores it in a responseStore object with the key being the correlationId
Meanwhile, a backend worker process pulls the item from the queue, does some work and then sends a message to a different response queue with the same correlationId
The ExpressJS application has a connection to the response queue and when it receives a message, it takes the correlationId from the response and looks for a response object stored with same correlationId in the responseStore. If it finds it, it takes the payload from the message and does something like response.send(payload) or response.json(payload)
To do this, you should also have a mechanism that stores the creation time of the response object in the responseStore along with the response object. Then have a separate process that will check the responseStore and clean up old response objects after a certain timeout in case there are issues with the backend process completing.
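The responseStore bookkeeping described above (store by correlationId, record the creation time, clean up stale entries) can be sketched independently of Express or RabbitMQ. The example below is in Python rather than JavaScript, and `ResponseStore`, `register`, `resolve` and `expire_stale` are hypothetical names; a callback stands in for the held HTTP response object.

```python
import time
import uuid


class ResponseStore:
    """Hypothetical store of pending responses keyed by correlation ID."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._pending = {}  # correlation_id -> (created_at, callback)

    def register(self, callback):
        # Called when a request is published to the work queue.
        correlation_id = str(uuid.uuid4())
        self._pending[correlation_id] = (self._clock(), callback)
        return correlation_id

    def resolve(self, correlation_id, payload):
        # Called when a message arrives on the response queue.
        entry = self._pending.pop(correlation_id, None)
        if entry is None:
            return False  # unknown ID, already answered, or already expired
        _, callback = entry
        callback(payload)
        return True

    def expire_stale(self):
        # Run periodically to drop entries whose worker never replied.
        now = self._clock()
        stale = [cid for cid, (created, _) in self._pending.items()
                 if now - created > self._timeout]
        for cid in stale:
            del self._pending[cid]
        return stale
```

In a real application `expire_stale` would also send a timeout response to each waiting caller before discarding its entry.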
Look here for more info on RPC with RabbitMQ:
https://www.rabbitmq.com/tutorials/tutorial-six-javascript.html
Hope this helps.

Controlling Gemfire cache updates in the background

I will be implementing a Java program that acts as a gemfire client. The program will continuously process records that it receives on its port from a remote program. Each record will be processed using the static data cached with my program. The cache may get updated behind the scenes in my program when it is changed on the gemfire server. The processing of one record may take a few seconds. I run the risk of processing half the record with the static data that was in effect before the change and the rest of the record with the static data that took effect after the change. Is there a way I can tell gemfire not to apply cache updates to the local client until I am done processing the current record?
Regards,
Yash
Consider this approach: Use a Continuous query "Select *" instead of event registration. A CQ does not update the client region like a subscription does. Make your client region LOCAL. After receiving the CQ event on the client, execute your long running process and put the value that you received from the CQ into your client region. Decoupling client and server in this way will allow your client to run long-running processes.
Alternatively: if you must have the client cache proxied with the server as an absolute requirement, then keep the interest registration AND register a CQ. Ignore the event callback from the subscription but handle your long-running process using the event callback from the CQ.
The following is from page 683 at http://gemfire.docs.pivotal.io/pdf/pivotal-gemfire-ug.pdf
CQs do not update the client region. This is in contrast to other server-to-client messaging like the updates sent to satisfy interest registration and responses to get requests from the client's Pool.
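One way to read the first suggestion is that the client, not the server, decides when a change becomes visible. The sketch below shows that idea in plain Python with no GemFire API involved: updates delivered by a listener are buffered and applied only between records. `DeferredCache`, `on_update`, `apply_pending` and `process_record` are all hypothetical names, and the listener callback is assumed to play the role of the CQ event handler.

```python
import queue


class DeferredCache:
    """Hypothetical local cache whose updates are applied only on request."""

    def __init__(self, initial):
        self._data = dict(initial)
        self._pending = queue.Queue()  # thread-safe buffer for incoming changes

    def on_update(self, key, value):
        # Would be called from the listener thread; never touches live data.
        self._pending.put((key, value))

    def apply_pending(self):
        # Drain buffered changes into the live data; returns how many applied.
        applied = 0
        while True:
            try:
                key, value = self._pending.get_nowait()
            except queue.Empty:
                return applied
            self._data[key] = value
            applied += 1

    def get(self, key):
        return self._data[key]


def process_record(cache, record, steps):
    # Changes become visible only here, at the boundary between records.
    cache.apply_pending()
    return [step(record, cache) for step in steps]
```

With this arrangement a change that arrives halfway through a record is seen by the next record, never by the second half of the current one.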

In what scenarios is a reliable session recommended?

In a few words, if I am not wrong, a session is used when I want to ensure that the packets are delivered in order, and a reliable connection is needed in order to use sessions.
But my question is: what kind of applications need that? In my case it is a simple application in which a client requests data from a service, the service gets the data from a database and sends the results to the client. The client can also request to add, modify or delete data in the database. In this case, do I need a reliable connection and sessions or not?
Thanks.
A session presumes that you want to retain some data for a period of time. As far as a session is concerned, that period is the client's lifecycle: when the client opens the proxy, both the service instance and the session are created, and when the client closes the proxy, the service instance and the session are torn down. There is one exception, where closing the proxy does not take effect right away: when you have invoked a one-way operation. The service keeps working for as long as the operation is running, even though it has already been told to dispose of the instance.
Based on the information provided, I assume the best choice would be PerCall. You do not store any data between calls, and every call can be treated independently. Additionally, set ConcurrencyMode to Multiple so that service instances can be created and run simultaneously.
Personally, I find sessions useful with MSMQ whenever I want a specific number of messages to be wrapped into a single queue message. If an error occurs, regardless of which message caused it, the whole queue message is rolled back.

WCF basicHttpBinding: Rollback when reply to client fails

I am exposing a WCF service through a basicHttpBinding that executes several operations on a database.
I want to guarantee that if the client does not receive the reply, the database operations are rolled back (without any transaction flow through WCF).
E.g. the client calls the "DoX" method, which executes on the server, but the client crashes before it finishes. The database operations should then be rolled back as soon as the reply cannot be sent to the client.
Is there any way to do that? Will the [OperationBehavior(TransactionScopeRequired=true)] attribute work in such a manner? Is there a possibility to handle communication errors on the server side?
Update 1:
It seems [OperationBehavior(TransactionScopeRequired=true)] commits the transaction before the reply is sent to the client and thus cannot be used to perform a rollback if the client does not receive the reply.
Update 2:
To state it clearly again, I do not need the transaction to interact with the client side in any way. The client should neither know of the transaction nor be able to cancel or commit it, and no transaction should flow through the binding. The only place I want the transaction to roll back is on the server side, if the transport channel cannot deliver the message to the receiving client. With TCP/IP this information should be readily available to the server (no ACK is sent back for the TCP packet sent to the client).
So a hypothetical execution flow on the server side (notice the lack of client side) should be:
Receive client request
Start transaction
Execute all logic inside the service operation
Send reply back to client
if (reply.failedToReceive) { transaction.Rollback() } // due to a failing TCP/IP transmission
There is no easy answer to this question. You are asking for a behaviour that is implemented in WS-* but want it done using basic SOAP. I think your only option, if you REALLY can't switch to wsHttpBinding or use duplex as suggested by @Trevor Pilley, is to try to mimic the behaviour of WS-Transaction in your own custom protocol based on basic SOAP.
You should be able to get some simplification over the full WS-Transaction specification because
You will probably only need to support transactions over a single service - you will not be doing a distributed transaction over several independent services
You will not need to support both short transactions (WS-AtomicTransaction) and long-running transactions (WS-BusinessActivity); atomic transactions would probably do
You would not need to support any kind of extensibility model (WS-Coordination)
You would not need to implement a discovery/metadata model that describes the protocol (e.g. like WSDL) because you would be coding the protocol behaviour directly into the client and service.
However, you would probably need elements of both WS-Coordination and WS-AtomicTransaction. This is not a simple task by any means, and it will be easy to miss something subtle that could either prevent rollbacks from happening or (just as bad) destroy the performance of your service by leaving long-duration locks all over your database due to crashed clients.
Like I say, this is a complex behaviour and if you cannot use ready-made, standardised protocols, there is no simple answer.

WCF call order in single concurrency mode

Assume a WCF service with ServiceBehavior.ConcurrencyMode = Single.
When exactly does the service start blocking for concurrent calls?
For example, say we have two clients: Slow and Fast.
At time 0 Slow starts a slow service call that includes a huge chunk of data.
At time 1 Fast makes a fast service call.
At time 2 the slow data finally arrives and the service code is executed on the server.
Assuming buffers configured in WCF to be larger than the huge chunk, which call will get executed first?
In other words, does blocking start when all call data has been received at the server side or when the client initiates the call?
Is the service blocked during the data transfer or only during code execution?
Unless you also set InstanceContextMode to Single, both calls will be executed concurrently. So suppose that you have InstanceContextMode set to Single.
I haven't tested it, but this is the behavior I would expect. Concurrency mode is a service behavior, so it takes effect once the service instance / instance context is resolved. In buffered mode that happens after the whole message is received; in streaming mode it should happen after the message headers are received. So with a buffered transport I would expect the fast client to be processed first, and with a streamed transport it depends on whether the message headers from the slow client have already been received.
But as I wrote before this is only my expectation.