ASP Net Core 3 Session (state) concurrency and integrity - asp.net-core

I have a page which requests multiple requests concurrently. So those requests are in the very same session. For accessing the session I use everywhere IHttpContextAccessor.
My problem is that regardless of the timing, some request does not see other requests already set session state, instead sees some previous state. (again in timing, the set state operation happened already, still)
As far as I know each requests has its own copy of the state, which is written back... (well "when"?) to the common "one" state. If this "when" is the delayed to when request is completely served, then the scenario what I experiencing is easily happen: The 2nd concurrent request within the session got his copy after the 1st request modified the state but before as it was finished completely.
However this all above means that in case of concurrent request serving within a session there is no way to maintain session integrity. The 2nd not seeing the already done changes by the 1st, will write back something what is not consistent with the already done 1st process change.
Am I missing something?
Is there any workaround? (with some cost of course)

First, you may know this already, but it bears point out, just in case: session state is specific to one client. What you're talking about here, then, is the same client throwing multiple concurrent requests at the same time, each of which is touching the same piece of session state. That, in general, seems like a bad design. If there's some actual application reason to have multiple concurrent requests from the same client, then what those requests do should be idempotent or at least not step on each others toes. If it's a situation where the client is just spamming the server, either due to impatience or maliciousness, it's really not your concern whether their session state becomes corrupted as a result.
Second, because of the reasons outline above, concurrency is not really a concern for sessions. There's no use case I can imagine where the client would need to send multiple simultaneous requests that each modify the same session key. If there is, please elucidate by editing your question accordingly. However, I'd still imagine it would be something you likely shouldn't be persisting in the session in the first place.
That said, the session is thread-safe in that multiple simultaneous writes/reads will not cause an exception, but no guarantee is or can be made about integrity. That's universal across all concurrency scenarios. It's on you, as the developer, to ensure data integrity, if that's a concern. You do so, by designing a concurrency strategy. That could be anything from locks/semaphores to gate access or just compensating for things happening out of band. For example, with EF, you can employ concurrency tokens in your database tables to prevent one request overwriting another. The value of the token is modified with each successful update, and the application-known value is checked against the current database value before the update is made, to ensure that it has not been modified since the application initiated the update. If it has, then an exception is thrown to give the application a chance to catch and recover by cancelling the update, getting the fresh data and modifying that, or just pushing through an overwrite. This is to elucidate that you would need to come up with some sort of similar strategy if the integrity of the session data is important.

Related

Can I send an API response before successful persistence of data?

I am currently developing a Microservice that is interacting with other microservices.
The problem now is that those interactions are really time-consuming. I already implemented concurrent calls via Uni and uses caching where useful. Now I still have some calls that still need some seconds in order to respond and now I thought of another thing, which I could do, in order to improve the performance:
Is it possible to send a response before the sucessfull persistence of data? I send requests to the other microservices where they have to persist the results of my methods. Can I already send the user the result in a first response and make a second response if the persistence process was sucessfull?
With that, the front-end could already begin working even though my API is not 100% finished.
I saw that there is a possible status-code 207 but it's rather used with streams where someone wants to split large files. Is there another possibility? Thanks in advance.
"Is it possible to send a response before the sucessfull persistence of data? Can I already send the user the result in a first response and make a second response if the persistence process was sucessfull? With that, the front-end could already begin working even though my API is not 100% finished."
You can and should, but it is a philosophy change in your API and possibly you have to consider some edge cases and techniques to deal with them.
In case of a long running API call, you can issue an "ack" response, a traditional 200 one, only the answer would just mean the operation is asynchronous and will complete in the future, something like { id:49584958, apicall:"create", status:"queued", result:true }
Then you can
poll your API with the returned ID to see if the operation that is still ongoing, has succeeded or failed.
have a SSE channel (realtime server side events) where your server can issue status messages as pending operations finish
maybe using persistent connections and keepalives, or flushing the response in the middle, you can achieve what you point out, ie. like a segmented response. I am not familiar with that approach as I normally go for the suggesions above.
But in any case, edge cases apply exactly the same: For example, what happens if then through your API a user issues calls dependent on the success of an ongoing or not even started previous command? like for example, get information about something still being persisted?
You will have to deal with these situations with mechanisms like:
Reject related operations until pending call is resolved "server side": Api could return ie. a BUSY error informing that operations are still ongoing when you want to, for example, delete something that still is being created.
Queue all operations so the server executes all them sequentially.
Allow some simulatenous operations if you find they will not collide (ie. create 2 unrelated items)

Maintain Consistency in Microservices [duplicate]

What is the best way to achieve DB consistency in microservice-based systems?
At the GOTO in Berlin, Martin Fowler was talking about microservices and one "rule" he mentioned was to keep "per-service" databases, which means that services cannot directly connect to a DB "owned" by another service.
This is super-nice and elegant but in practice it becomes a bit tricky. Suppose that you have a few services:
a frontend
an order-management service
a loyalty-program service
Now, a customer make a purchase on your frontend, which will call the order management service, which will save everything in the DB -- no problem. At this point, there will also be a call to the loyalty-program service so that it credits / debits points from your account.
Now, when everything is on the same DB / DB server it all becomes easy since you can run everything in one transaction: if the loyalty program service fails to write to the DB we can roll the whole thing back.
When we do DB operations throughout multiple services this isn't possible, as we don't rely on one connection / take advantage of running a single transaction.
What are the best patterns to keep things consistent and live a happy life?
I'm quite eager to hear your suggestions!..and thanks in advance!
This is super-nice and elegant but in practice it becomes a bit tricky
What it means "in practice" is that you need to design your microservices in such a way that the necessary business consistency is fulfilled when following the rule:
that services cannot directly connect to a DB "owned" by another service.
In other words - don't make any assumptions about their responsibilities and change the boundaries as needed until you can find a way to make that work.
Now, to your question:
What are the best patterns to keep things consistent and live a happy life?
For things that don't require immediate consistency, and updating loyalty points seems to fall in that category, you could use a reliable pub/sub pattern to dispatch events from one microservice to be processed by others. The reliable bit is that you'd want good retries, rollback, and idempotence (or transactionality) for the event processing stuff.
If you're running on .NET some examples of infrastructure that support this kind of reliability include NServiceBus and MassTransit. Full disclosure - I'm the founder of NServiceBus.
Update: Following comments regarding concerns about the loyalty points: "if balance updates are processed with delay, a customer may actually be able to order more items than they have points for".
Many people struggle with these kinds of requirements for strong consistency. The thing is that these kinds of scenarios can usually be dealt with by introducing additional rules, like if a user ends up with negative loyalty points notify them. If T goes by without the loyalty points being sorted out, notify the user that they will be charged M based on some conversion rate. This policy should be visible to customers when they use points to purchase stuff.
I don’t usually deal with microservices, and this might not be a good way of doing things, but here’s an idea:
To restate the problem, the system consists of three independent-but-communicating parts: the frontend, the order-management backend, and the loyalty-program backend. The frontend wants to make sure some state is saved in both the order-management backend and the loyalty-program backend.
One possible solution would be to implement some type of two-phase commit:
First, the frontend places a record in its own database with all the data. Call this the frontend record.
The frontend asks the order-management backend for a transaction ID, and passes it whatever data it would need to complete the action. The order-management backend stores this data in a staging area, associating with it a fresh transaction ID and returning that to the frontend.
The order-management transaction ID is stored as part of the frontend record.
The frontend asks the loyalty-program backend for a transaction ID, and passes it whatever data it would need to complete the action. The loyalty-program backend stores this data in a staging area, associating with it a fresh transaction ID and returning that to the frontend.
The loyalty-program transaction ID is stored as part of the frontend record.
The frontend tells the order-management backend to finalize the transaction associated with the transaction ID the frontend stored.
The frontend tells the loyalty-program backend to finalize the transaction associated with the transaction ID the frontend stored.
The frontend deletes its frontend record.
If this is implemented, the changes will not necessarily be atomic, but it will be eventually consistent. Let’s think of the places it could fail:
If it fails in the first step, no data will change.
If it fails in the second, third, fourth, or fifth, when the system comes back online it can scan through all frontend records, looking for records without an associated transaction ID (of either type). If it comes across any such record, it can replay beginning at step 2. (If there is a failure in step 3 or 5, there will be some abandoned records left in the backends, but it is never moved out of the staging area so it is OK.)
If it fails in the sixth, seventh, or eighth step, when the system comes back online it can look for all frontend records with both transaction IDs filled in. It can then query the backends to see the state of these transactions—committed or uncommitted. Depending on which have been committed, it can resume from the appropriate step.
I agree with what #Udi Dahan said. Just want to add to his answer.
I think you need to persist the request to the loyalty program so that if it fails it can be done at some other point. There are various ways to word/do this.
1) Make the loyalty program API failure recoverable. That is to say it can persist requests so that they do not get lost and can be recovered (re-executed) at some later point.
2) Execute the loyalty program requests asynchronously. That is to say, persist the request somewhere first then allow the service to read it from this persisted store. Only remove from the persisted store when successfully executed.
3) Do what Udi said, and place it on a good queue (pub/sub pattern to be exact). This usually requires that the subscriber do one of two things... either persist the request before removing from the queue (goto 1) --OR-- first borrow the request from the queue, then after successfully processing the request, have the request removed from the queue (this is my preference).
All three accomplish the same thing. They move the request to a persisted place where it can be worked on till successful completion. The request is never lost, and retried if necessary till a satisfactory state is reached.
I like to use the example of a relay race. Each service or piece of code must take hold and ownership of the request before allowing the previous piece of code to let go of it. Once it's handed off, the current owner must not lose the request till it gets processed or handed off to some other piece of code.
Even for distributed transactions you can get into "transaction in doubt status" if one of the participants crashes in the midst of the transaction. If you design the services as idempotent operation then life becomes a bit easier. One can write programs to fulfill business conditions without XA. Pat Helland has written excellent paper on this called "Life Beyond XA". Basically the approach is to make as minimum assumptions about remote entities as possible. He also illustrated an approach called Open Nested Transactions (http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper142.pdf) to model business processes. In this specific case, Purchase transaction would be top level flow and loyalty and order management will be next level flows. The trick is to crate granular services as idempotent services with compensation logic. So if any thing fails anywhere in the flow, individual services can compensate for it. So e.g. if order fails for some reason, loyalty can deduct the accrued point for that purchase.
Other approach is to model using eventual consistency using CALM or CRDTs. I've written a blog to highlight using CALM in real life - http://shripad-agashe.github.io/2015/08/Art-Of-Disorderly-Programming May be it will help you.

In what scenarios is recommended a reliable session?

In few words, if I am not wrong, a session is used when I want to ensure that the packages are sent in order, and to be able to use sessions is needed a reliable connection.
But my doubt what kind of applications need that? In my case is a simple application in which a client request to a service data from a database, the service get the data from the database and send to the client the results. Also the client can requeset to add, modify or delete data from database. In this case, should I need a reliable connection and sessions or not?
Thanks.
Session presumes that for some period of time you want to retain some data. Such a period of time, as far as session is concerned, refers to client's lifecycle that is when client opens up proxy, both service along with session are created, when client closes proxy service and session terminate their actions. There is exception when closing proxy does not actually perform it right away and this occures when you invoke one-way-operation. Service will keep working as long as operation performs its action despite the fact that it previously received an order to get rid of instance.
Based on provided information I assume the best choice would be PerCall. You do not store any data between calls and every single call can be perceived separately. Additionaly, leverage of ConcurrencyMode set to multiple so as to allow services being created simultaneously.
Personally, I find session useful in MSMQ, whenever I want to specific number of messages be wrapped into single queue-message. If error occures, regardless of whether which message is in charge of it, the whole queue-message is rolled back.

How to keep an API idempotent while receiving multiple requests with the same id at the same time?

From a lot of articles and commercial API I saw, most people make their APIs idempotent by asking the client to provide a requestId or idempotent-key (e.g. https://www.masteringmodernpayments.com/blog/idempotent-stripe-requests) and basically store the requestId <-> response map in the storage. So if there's a request coming in which already is in this map, the application would just return the stored response.
This is all good to me but my problem is how do I handle the case where the second call coming in while the first call is still in progress?
So here is my questions
I guess the ideal behaviour would be the second call keep waiting until the first call finishes and returns the first call's response? Is this how people doing it?
if yes, how long should the second call wait for the first call to be finished?
if the second call has a wait time limit and the first call still hasn't finished, what should it tell the client? Should it just not return any responses so the client will timeout and retry again?
For wunderlist we use database constraints to make sure that no request id (which is a column in every one of our tables) is ever used twice. Since our database technology (postgres) guarantees that it would be impossible for two records to be inserted that violate this constraint, we only need to react to the potential insertion error properly. Basically, we outsource this detail to our datastore.
I would recommend, no matter how you go about this, to try not to need to coordinate in your application. If you try to know if two things are happening at once then there is a high likelihood that there would be bugs. Instead, there might be a system you already use which can make the guarantees you need.
Now, to specifically address your three questions:
For us, since we use database constraints, the database handles making things queue up and wait. This is why I personally prefer the old SQL databases - not for the SQL or relations, but because they are really good at locking and queuing. We use SQL databases as dumb disconnected tables.
This depends a lot on your system. We try to tune all of our timeouts to around 1s in each system and subsystem. We'd rather fail fast than queue up. You can measure and then look at your 99th percentile for timings and just set that as your timeout if you don't know ahead of time.
We would return a 504 http status (and appropriate response body) to the client. The reason for having a idempotent-key is so the client can retry a request - so we are never worried about timing out and letting them do just that. Again, we'd rather timeout fast and fix the problems than to let things queue up. If things queue up then even after something is fixed one has to wait a while for things to get better.
It's a bit hard to understand if the second call is from the same client with the same request token, or a different client.
Normally in the case of concurrent requests from different clients operating on the same resource, you would also want to implementing a versioning strategy alongside a request token for idempotency.
A typical version strategy in a relational database might be a version column with a trigger that auto increments the number each time a record is updated.
With this in place, all clients must specify their request token as well as the version they are updating (typical the IfMatch header is used for this and the version number is used as the value of the ETag).
On the server side, when it comes time to update the state of the resource, you first check that the version number in the database matches the supplied version in the ETag. If they do, you write the changes and the version increments. Assuming the second request was operating on the same version number as the first, it would then fail with a 412 (or 409 depending on how you interpret HTTP specifications) and the client should not retry.
If you really want to stop the second request immediately while the first request is in progress, you are going down the route of pessimistic locking, which doesn't suit REST API's that well.
In the case where you are actually talking about the client retrying with the same request token because it received a transient network error, it's almost the same case.
Both requests will be running at the same time, the second request will start because the first request still has not finished and has not recorded the request token to the database yet, but whichever one ends up finishing first will succeed and record the request token.
For the other request, it will receive a version conflict (since the first request has incremented the version) at which point it should recheck the request token database table, find it's own token in there and assume that it was a concurrent request that finished before it did and return 200.
It's seems like a lot, but if you want to cover all the weird and wonderful failure modes when your dealing with REST, idempotency and concurrency this is way to deal with it.

Building a reliable service in WCF

I am currently designing a service (wsHttp) which should be used to return sensitive data. As soon as a client asks for this data, I get it from the database, compile a list, then delete the data from the database and return the list.
My concern is that something happens on the way back to the client (network issues, ...) I have already deleted the data from the database, but the client will never get it.
Which out of the box solution do I have here?
This is an inherent problem in the distributed computing. There is no easy solution. The question is how important it is to recover from such errors.
For example, if one deletes some records but the client gets disconnected, next time he connects he will see those records as deleted. Even if he tries to delete them again (data stayed in the UI), this will do no harm.
For banks transferring money, they have an error resolution mechanism where they match the transactions that happened between them in a second process. Conflicts will be dealt manually.
Some systems such as NServiceBus rely on MSMQ for storing messages and eventual consistency where a message destined to a client will eventually arrive whenever he is connected again.
There is no out of the box solution for this. You would need to implement some form of user/automated confirmation that the data had been recieved and only delete once this was returned.
Ed
There is an easy solution. But it doesn't come in a box.
Protocols like WS-ReliableMessaging (or equally TCP/IP) give you a layer of reliability under your messaging, but all bets are off once that layer offloads the message to the layer above.
So reliability can only be fully addressed at the absolute highest layer - the application layer, not by any lower layer down the communication stack. This makes it a first class business concern, not a purely technical concern.
The problem can be solved with a slight change to the process of deleting your sensitive data.
Instead of deleting it immediately, flag it for deletion. Then, build into the business processes that drive your service the assertion that the client must acknowledge receipt of the sensitive data. Then, when you get the acknowledgement back you can safely delete the data flagged for deletion, knowing that it has been received.
I recently wrote a blog post reasoning that reliability is a first class business concern that cannot be offloaded to a lower layer.