How to avoid DB deadlocks when multiple Kafka messages are produced for the same item? - asp.net-core

We have two different web applications; let's name them A and B.
When a user changes the analysis of an item in app A, app A does its work and produces a Kafka message.
A REST API in app B consumes the message via the Confluent HTTP sink connector.
The REST API in app B calls a SQL stored procedure that updates records inside a transaction.
When the user changes the analysis of the same item in app A repeatedly (which happens a lot), a deadlock occurs in the DB, because the stored procedure is still working on the records when another call for the same item arrives.
What is the best practice to handle this issue?
Manage some global list of the item IDs currently inside the SP and remove them when the SP finishes? Handle it in the DB? Any other suggestion?
Some relevant info:
The apps are ASP.NET Core.
They are hosted in a load-balanced environment (AWS).
Any relevant answer is appreciated.
Thanks!

Make sure the same item is always published with the same key (e.g. use the hash code of the item; see the sketch below). This ensures that all requests from app A will go to the same topic partition.
In app B make sure the procedure call is done in the consumer polling thread (don't spawn a new thread) so that all procedure calls for the same item will be guaranteed to execute sequentially.
This will resolve deadlocks at the cost of performance. For multiple items you can scale horizontally with multiple consumers (as long as you have plenty of partitions). If performance on repeated requests for the same item is too slow then you have a more complex design issue to address.
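A minimal sketch of the keyed-producer side in app A, assuming the Confluent.Kafka client; the topic name, broker address, and the item/analysisUpdate objects are assumptions, and the item ID is used as the key here:

using Confluent.Kafka;
using System.Text.Json;

var config = new ProducerConfig { BootstrapServers = "localhost:9092" };   // assumed broker address

using (var producer = new ProducerBuilder<string, string>(config).Build())
{
    // Keying by the item's ID makes every update for that item land on the
    // same partition, so the consumer sees them in order.
    await producer.ProduceAsync("item-analysis", new Message<string, string>
    {
        Key = item.Id.ToString(),                          // hypothetical item
        Value = JsonSerializer.Serialize(analysisUpdate)   // hypothetical payload
    });
}

In a real app the producer would normally be a long-lived singleton rather than created per call; it is built inline here only to keep the sketch self-contained.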

Related

Shared Elasticsearch Index

I'm working on a new implementation where I have some queries regarding the pros and cons of having a shared Database in a microservice architecture.
Context:
Service A listens to an event from Kafka and based on the parameters updates a particular table. This table is owned entirely by Service A and not shared. Some of the data in this table needs to be accessed by other services based on the value of a particular field.
My Approach:
Once the table is updated, if we know that this data might be required by some other service (by checking the value of the field), write it to an ES index. I want to keep the ES index shared across services.
The other services would read the ES index whenever required. These services would use the index only for read while Service A is the only service which writes to the index.
Also, I've added a fallback API in Service A which hits the table in case ES is down. Please check out the diagram, I've added a link to that below.
Issues:
One issue I can think of is that if ES is completely down then Service A won't be able to write to ES and hence that row update will fail. How do I handle this?
I also need help figuring out the fundamental scalability and deployment issues that can be counterproductive to a microservice architecture by introducing a shared ES index. I think I have eliminated some of the resiliency issues by adding a fallback API for the other services in case ES is down.
Please criticise my design. Design Diagram
I see three options:
Option A: Service A needs to implement something equivalent to the two-phase commit protocol where an event consumed from Kafka by Service A would not be acknowledged until both the DB and ES have acknowledged their write.
It puts a big burden on your service, which, in case one of the two sub-systems (DB and/or ES) goes down, would have to spend time retrying and would not be able to consume more events from Kafka. Events would start piling up in the topic. 2PC is hard to implement right in a distributed environment.
Option B: Service A consumes from Kafka topic A, does its thing and produces another event to another Kafka topic B (see the sketch after these options). Two other consumer groups responsible for updating the sub-systems would then consume those events from topic B: one would keep updating the DB and another would keep updating ES. Service A can do its job rapidly and not have to worry or get bogged down with updates. Each update can be retried independently by each consumer group without impacting upstream event consumption. Eventually, everything will be in sync.
Option C: It's a variation of option B, more lightweight. Service A consumes events from the Kafka topic, does its job and updates the DB as it does now. Another process (CDC, Logstash, etc.) consumes updates from the DB and updates ES asynchronously, and is also responsible for retrying if ES is down. Eventually, everything will be in sync as well.
There are other options, but these 3 are the most obvious ones to me.
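A minimal sketch of Option B's hand-off step, assuming the Confluent.Kafka client; the topic names, consumer group, broker address, and DoServiceAWork are all assumptions. The DB and ES updaters would each consume topic-b with their own consumer group:

using Confluent.Kafka;

var consumerConfig = new ConsumerConfig
{
    BootstrapServers = "localhost:9092",   // assumed broker address
    GroupId = "service-a",
    EnableAutoCommit = false
};
var producerConfig = new ProducerConfig { BootstrapServers = "localhost:9092" };

using var consumer = new ConsumerBuilder<string, string>(consumerConfig).Build();
using var producer = new ProducerBuilder<string, string>(producerConfig).Build();
consumer.Subscribe("topic-a");

while (true)
{
    var result = consumer.Consume();
    var processed = DoServiceAWork(result.Message.Value);   // hypothetical business logic

    // Hand the outcome to the DB and ES updaters via topic-b, and only then
    // acknowledge the topic-a offset, so nothing is lost if the produce fails.
    await producer.ProduceAsync("topic-b",
        new Message<string, string> { Key = result.Message.Key, Value = processed });
    consumer.Commit(result);
}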

Send notice to Azure Web role from an Azure Worker role - Best Practice

Situation
Users can upload documents; a queue message will be placed onto the queue with the document's ID. The Worker Role will pick this up, get the document, and parse it completely with Lucene. After the parsing is complete, the Lucene IndexSearcher on the Web Role should be updated.
On the Web Role I'm keeping a static Lucene IndexSearcher, because otherwise you have to create a new IndexSearcher for every search request, and this adds a lot of overhead.
What I want to do is send a notice from the Worker Role to the Web Role that it needs to update its IndexSearcher.
Possible Solutions
Make some sort of notice queue. The Web Role starts an endless task that keeps checking the notice queue. If it finds a message, it should update the IndexSearcher.
Start a WCF service on the Worker Role and connect with the Web Role. Do a callback from the Worker Role and tell the Web Role through the service that it needs to update its IndexSearcher.
Just update it on a regular interval
What would be the best solution or is there any other solution for this?
Many thanks!
If your worker roles write each finished job's details to a table using a PK of something like (DateTime.MaxValue - DateTime.UtcNow).Ticks.ToString("d19"), you will have a sorted list of the latest jobs that have been processed. Set your web role to poll the table like so:
var q = ctx.CreateQuery<LatestJobs>("jobstable")
    .Where(j => j.PartitionKey.CompareTo(LastIndexTime.GetReverseTicks()) < 0)
    .Take(1)
    .AsTableServiceQuery();

if (q.Count() > 0)
{
    // New jobs exist since the last check... re-index.
}
For worker roles that do the indexing work, this is great because they can write indiscriminately to the table without worry of conflict. For you, you also have an audit log of the jobs they are processing (assuming you put some details in there).
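For illustration, the write side might look roughly like this, assuming the same LatestJobs entity type and TableServiceContext (ctx) as the query above; documentId is hypothetical:

// Newest jobs get the smallest partition key, which keeps the table sorted
// newest-first and keeps the "anything since my last index?" query cheap.
var job = new LatestJobs
{
    PartitionKey = (DateTime.MaxValue - DateTime.UtcNow).Ticks.ToString("d19"),
    RowKey = documentId.ToString(),   // hypothetical: ID of the processed document
    // ...plus whatever audit details you want to keep about the finished job
};

ctx.AddObject("jobstable", job);
ctx.SaveChanges();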
However, you have one remaining problem: it sounds like you have one web role that updates the index. This one web role can of course poll this table on whatever frequency you choose (just track the LastIndexTime for searching later). Your issue is how to control concurrency of the web role(s) if you have more than one. Does each web role maintain its own index, or do you have one stored somewhere for all? Sorry, but I am not an expert in Lucene if that should be obvious.
Anyhow, if you have multiple instances in your WebRole and a single index that all can see, you need to prevent multiple roles from updating the index over and over. You can do this through leasing the index (if stored in blob storage).
Update based on comment:
If each WebRole instance has its own index, then you don't have to worry about leasing. That is only if they are sharing a blob resource together. So, this technique should work fine as-is and your only potential obstacle is that the polling intervals for the web roles could be slightly out of sync, causing somewhat different results until all update (depending on which instance you hit). Poll every 30 seconds on the table and that will be your max out of sync. Each web role instance simply needs to track the last time it updated and do incremental searches from that point.
Depending on upload frequency, you may find queue messages to cause you unneeded updates. For instance, if you get a dozen uploads and process them in close time proximity, you'd now have a dozen queue messages, each telling your web role to update. It would make more sense to keep a single signal (maybe a table row or SQL Azure row). You could simply set a row value to 1, signaling the need to update. When your web role detects this change, reset to 0 and start the update. Note: If using an Azure Table row, you'd need to poll for updates (and depending on traffic, you could start accumulating a large number of transactions). You could use the AppFabric Cache for this signal as well.
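A rough sketch of that single-signal idea against a SQL Azure row; the IndexSignal table, its columns, connectionString, and RebuildIndexSearcher are all assumptions:

using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))   // connectionString assumed
{
    conn.Open();

    // Read and clear the flag in one statement, so a burst of uploads that all
    // set it to 1 still results in a single re-index on the next poll.
    var cmd = new SqlCommand(
        "UPDATE IndexSignal SET NeedsUpdate = 0 " +
        "OUTPUT DELETED.NeedsUpdate WHERE Id = 1 AND NeedsUpdate = 1", conn);

    if (cmd.ExecuteScalar() != null)
    {
        RebuildIndexSearcher();   // hypothetical: reopen the static Lucene IndexSearcher
    }
}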
You could use a WCF service on an internal endpoint on your Web Role. However, you still have the burst issue (if you get, say, a dozen uploads while the webrole is updating, you don't want to then do another dozen updates).

WCF: Efficiently consuming large numbers of singleton requests via SQL job?

I'm planning to build a console app to run as part of a SQL 2005 job which will gather records from a database table, create a request object for a WCF service, pass this object to the service for processing, receive a response object, and update a log table with its data. This will be for processing at least several thousand records each time the job step executes.
The WCF service currently exposes a single method which I'd be hitting once for each record in the table, so I imagine I'd want to open a channel to the service, keep it open during processing, then close and dispose and such when complete.
Beyond maintaining the connection, how else could I keep this console app from becoming a performance bottleneck? Should I not use a console app and instead try using SQLCLR or some other means to perform this processing?
You've probably considered Service Broker...
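On the question's channel-reuse point, a minimal sketch of holding one channel open for the whole batch; IRecordProcessor, the endpoint configuration name, recordsFromTable, and the helper methods are all assumptions:

using System.ServiceModel;

var factory = new ChannelFactory<IRecordProcessor>("RecordProcessorEndpoint");
var channel = factory.CreateChannel();
try
{
    // Reuse the same channel for every record instead of opening one per call.
    foreach (var record in recordsFromTable)               // hypothetical record set
    {
        var response = channel.Process(BuildRequest(record));
        LogResponse(record, response);                      // hypothetical log-table write
    }
}
finally
{
    var client = (IClientChannel)channel;
    if (client.State == CommunicationState.Faulted)
        client.Abort();   // Close() would throw on a faulted channel
    else
        client.Close();
    factory.Close();
}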

MSMQ v Database Table

An existing process changes the status field of a booking record in a table, in response to user input.
I have another process to write, that will run asynchronously for records with a particular status. It will read the table record, perform some operations (including calls to third party web services), and update the record's status field to indicate that processing is completed (or In Error, with an error count).
This operation sounds very similar to a queue. What are the benefits and tradeoffs of using MSMQ over a SQL Table in this situation, and why should I choose one over the other?
It is our software that is adding and updating records in the table.
It is a new piece of work (a Windows Service) that will be performing the asynchronous processing. This needs to be "always up".
There are several reasons, which were discussed on the Fog Creek forum here: http://discuss.fogcreek.com/joelonsoftware5/default.asp?cmd=show&ixPost=173704&ixReplies=5
The main benefit is that MSMQ can still be used when there is intermittent connectivity between computers (using a store-and-forward mechanism on the local machine). As far as the application is concerned, it delivered the message to MSMQ, even though MSMQ will possibly deliver the message later (a sending sketch follows below).
You can only insert a record to a table when you can connect to the database.
A table approach is better when a workflow approach is required, and the process will move through various stages, and these stages need persisting in the DB.
If the rate at which booking records are created is low, I would have the second process periodically check the table for new bookings.
Unless you are already using MSMQ, introducing it just gives you an extra platform component to support.
If the database is heavily loaded, or you get a lot of lock contention with two process reading and writing to the same region of the bookings table, then consider introducing MSMQ.
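To make the store-and-forward point concrete, a minimal sketch of the sending side with System.Messaging; the queue path, machine name, and bookingId are assumptions. The local queue manager accepts the send and forwards the message once the destination is reachable:

using System.Messaging;

// A direct format name lets the local queue manager accept the send even when
// the destination machine ("bookingserver" here is an assumption) is unreachable.
using (var queue = new MessageQueue(@"FormatName:DIRECT=OS:bookingserver\private$\bookings"))
{
    queue.Send(bookingId, "Booking status changed");   // bookingId is hypothetical
}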
I also like this answer from le dorfier in the previous discussion:
I've used tables first, then refactor to a full-fledged msg queue when (and if) there's reason - which is trivial if your design is reasonable.
Thanks, folks, for all the answers. Most helpful.
With MSMQ you can also offload the work to another server very easily by changing the location of the queue to another machine rather than the DB server.
By the way, as of SQL Server 2005 there is a built-in queue in the DB. It's called SQL Server Service Broker.
See : http://msdn.microsoft.com/en-us/library/ms345108.aspx
Also see previous discussion.
If you have MSMQ expertise, it's a good option. If you know databases but not MSMQ, ask yourself if you want to become expert in another technology; whether your application is a critical one; and which you'd rather debug when there's a problem.
I have recently been investigating this myself, so I wanted to mention my findings. The location of the database in relation to your application is a big factor in deciding which option is faster.
I measured the time it took to insert 100 database entries versus logging the exact same data in a local MSMQ message. I then took the average of the results of performing this test several times.
What I found was that when the database is on the local network, inserting a row was up to 4 times faster than logging to an MSMQ.
When the database was being accessed over a decent internet connection, inserting a row into the database was up to 6 times slower than logging to an MSMQ.
So: if the database is on the local network, the DB is faster; otherwise MSMQ is.
Instead of making raw MSMQ calls, it might be easier if you implement your service as a queued COM+ component and make queued function calls from your client application. In the end, the asynchronous service still uses MSMQ in the background, but your code will be much clearer and easier to use.
I would probably go with MSMQ, or ActiveMQ, myself. I would suggest (presuming that, since you are considering MSMQ, you are using Windows with MS technology) looking into WCF, or, if you are using MS SQL 2005+, having a trigger that calls into .NET code to run your processing.
Service Broker was introduced in SQL 2005 and it is designed to be very quick at handling messages, as the process is relatively simple (I believe its roots were in triggers). If you are concerned about scalability, in SQL 2008 they released an independent processing executable to separate the processing from SQL Server (in standard Service Broker, everything is controlled by the SQL Server instances).
I would definitely consider using Service Broker over MSMQ, but this is dependent on your SQL development/DBA resources and their knowledge.
Besides Mitch's answer, some other scenarios:
1. Each of your messages has its own due date to trigger the action. This can be done through MQ as well, but in this case I prefer to store it in the DB as it is more controllable.
2. The subscriber needs to filter messages and then process a portion of them. This can be done with LINQ too; depending on how complex the filter is, the DB approach is better because I can use LINQ to EF to do complex queries easily.
3. For deployment, I want a fully automated deployment process, so the DB is a better choice for me. I am not a big fan of manual configurations.

Timer-based event triggers

I am currently working on a project with specific requirements. A brief overview of these are as follows:
Data is retrieved from external webservices
Data is stored in SQL 2005
Data is manipulated via a web GUI
The windows service that communicates with the web services has no coupling with our internal web UI, except via the database.
Communication with the web services needs to be both time-based, and triggered via user intervention on the web UI.
The current (pre-pre-production) model for web service communication triggering is via a database table that stores trigger requests generated from the manual intervention. I do not really want to have multiple trigger mechanisms, but would like to be able to populate the database table with triggers based upon the time of the call. As I see it there are two ways to accomplish this.
1) Adapt the trigger table to store two extra parameters: one being "Is this time-based or manually added?" and a nullable field to store the timing details (exact format to be determined). If it is a manually created trigger, mark it as processed when the trigger has been fired, but not if it is a timed trigger.
or
2) Create a second windows service that creates the triggers on-the-fly at timed intervals.
The second option seems like a fudge to me, but the management of option 1 could easily turn into a programming nightmare (how do you know if the last poll of the table returned the event that needs to fire, and how do you then stop it re-triggering on the next poll?).
I'd appreciate it if anyone could spare a few minutes to help me decide which route (one of these two, or possibly a third, unlisted one) to take.
Why not use a SQL Job instead of the Windows Service? You can encapsulate all of your DB "trigger" code in stored procedures. Then your UI and SQL Job can call the same stored procedures and create the triggers the same way, whether manually or at a time interval.
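As a rough sketch of the polling side's bookkeeping (so a manual trigger cannot re-fire on the next poll); the TriggerRequests table, connectionString, and CallWebServices are all assumptions:

using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))   // connectionString assumed
{
    conn.Open();

    // Claim and mark pending manual triggers in one statement; timed triggers
    // would instead track a "last fired" time rather than a Processed flag.
    var cmd = new SqlCommand(
        @"UPDATE TriggerRequests
          SET Processed = 1
          OUTPUT INSERTED.Id
          WHERE Processed = 0", conn);

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
            CallWebServices(reader.GetInt32(0));   // hypothetical: kick off the web service communication
    }
}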
The way I see it is this.
You have a Windows Service, which is playing the role of a scheduler and in it there are some classes which simply call the webservices and put the data in your databases.
So, you can use these classes directly from the WebUI as well and import the data based on the WebUI trigger.
I don't like the idea of storing a user-generated action as a flag (trigger) in the database where some service will poll it (at an interval which is not under the user's control) to execute that action.
You could even convert the whole code into an exe which you can then schedule using the Windows Scheduler. And call the same exe whenever the user triggers the action from the Web UI.
@Vaibhav
Unfortunately, the physical architecture of the solution will not allow any direct communication between the components, other than Web UI to Database, and database to service (which can then call out to the web services). I do, however, agree that re-use of the communication classes would be the ideal here - I just can't do it within the confines of our business*
*Isn't it always the way that a technically "better" solution is stymied by external factors?