SQL Server Service Broker Deadlock

I'm investigating a process that I did not build. It uses Service Broker to create a queue of contacts that then need an action taken against them.
There is then a handler that receives 10k records and passes them to a stored procedure to process.
What happens if that final step fails with a deadlock and there is no error handling? Do the records go back into the queue? If not, what would I need to do to get them to go back into the queue?

Service Broker queues can be accessed from within a transaction. So if you do something like this in your code (the below is close to pseudo-code; actual robust Service Broker code is a little beyond the scope of your question):
declare @messages table (
    queuing_order bigint,
    message_body varbinary(max));
declare @order bigint, @message varbinary(max);
begin tran;
receive top(10000) queuing_order, message_body
from dbo.yourQueue
into @messages;
while (1 = 1)
begin
    set @order = null;
    select top(1) @order = queuing_order, @message = message_body
    from @messages
    order by queuing_order;
    if (@order is null)
        break;
    exec dbo.processMessage @message;
    -- remove the processed message so the loop makes progress
    delete from @messages where queuing_order = @order;
end
commit tran;
… then you're set. What I'm saying is that as long as you're doing your receive and processing in the same transaction, any failure (deadlocks included) will rollback the transaction and put the messages back on the queue. Make sure you read up on poison message handling, though! If you get too many rollbacks, SQL will assume that there's an un-processable message and shut down the queue. That's a bad day when that happens.
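For reference, here's a minimal sketch of checking for and recovering from that shutdown, reusing dbo.yourQueue from the sketch above (the POISON_MESSAGE_HANDLING option requires SQL Server 2012 or later):
-- A queue disabled by five consecutive rollbacks shows is_receive_enabled = 0.
SELECT name, is_receive_enabled, is_poison_message_handling_enabled
FROM sys.service_queues
WHERE name = 'yourQueue';
-- After removing or fixing the bad message, re-enable the queue.
ALTER QUEUE dbo.yourQueue WITH STATUS = ON;
-- Alternatively, disable the automatic shutdown entirely and handle
-- repeated failures yourself (dead-letter table, retry counts, etc.).
ALTER QUEUE dbo.yourQueue WITH POISON_MESSAGE_HANDLING (STATUS = OFF);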

Related

Prevent two connections from reading same row

I am looking at building a database-based message queue implementation. I will essentially have a database table which will contain an autogenerated id (bigint), a message id, and message data. I will be writing a pull-based consumer which will query for the oldest record (min(id)) in the table and hand it over for processing.
Now my doubt is how to handle querying the oldest record when there are multiple consumer threads. How do I lock the first record read to the first consumer and basically not even make it visible to the next one?
One idea that I have is to add another column called 'locked by' where I will store, let's say, the thread name: select the record for update, immediately update the 'locked by' column, and then continue processing it, so that the next query will not select locked rows.
Is this a workable solution?
Edit:
Essentially, this is what I want.
Connection one queries the database table for a row, reads the first row, and locks it for update while reading.
Connection two queries the database table for a row. It should not be able to read the first row; it should read the second row, if available, and lock it for update.
Similar logic for connections 3, 4, etc.
Connection one updates the record with its identifier, processes it, and subsequently deletes the record.
TL;DR: see Remus Rusanu's Using Tables as Queues. The example DDL below is gleaned from that article.
CREATE TABLE dbo.FifoQueueTable (
    Id bigint NOT NULL IDENTITY(1,1)
        CONSTRAINT pk_FifoQueue PRIMARY KEY CLUSTERED
    ,Payload varbinary(MAX)
);
GO
CREATE PROCEDURE dbo.usp_EnqueueFifoTableMessage
    @payload varbinary(MAX)
AS
SET NOCOUNT ON;
INSERT INTO dbo.FifoQueueTable (Payload) VALUES (@payload);
GO
CREATE PROCEDURE dbo.usp_DequeueFifoTableMessage
AS
SET NOCOUNT ON;
WITH cte AS (
    SELECT TOP(1) Payload
    FROM dbo.FifoQueueTable WITH (ROWLOCK, READPAST)
    ORDER BY Id
)
DELETE FROM cte
OUTPUT deleted.Payload;
GO
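For illustration, a quick smoke test of the two procedures (the payload values are arbitrary):
-- Enqueue two messages, then dequeue them in FIFO order.
EXEC dbo.usp_EnqueueFifoTableMessage @payload = 0x0001;
EXEC dbo.usp_EnqueueFifoTableMessage @payload = 0x0002;
EXEC dbo.usp_DequeueFifoTableMessage;  -- returns 0x0001
EXEC dbo.usp_DequeueFifoTableMessage;  -- returns 0x0002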
This implementation is simple, but handling the unhappy path can be complex depending on the nature of the messages and the cause of the error.
When message loss is acceptable, one can simply use the default autocommit transaction and log errors.
In cases where messages must not be lost, the dequeue must be done in a client-initiated transaction and committed only after successful processing (or after no message was read). The transaction also ensures messages are not lost if the application or the database service crashes. A robust error-handling strategy depends on the type of error, the nature of the messages, and message-ordering implications.
For a poison message (i.e. an error in the payload that prevents the message from ever being processed successfully), one can insert the bad message into a dead-letter table for subsequent manual review and commit the transaction.
A transient error, such as a failure calling an external service, can be handled with techniques like the following (a sketch of the first two appears after this list):
Roll back the transaction, so the message stays first in the FIFO queue and is retried on the next iteration.
Requeue the failed message and commit, so the message goes to the back of the FIFO queue for retry.
Enqueue the failed message in a separate retry queue along with a retry count; the message can be moved to the dead-letter table once a retry limit is reached.
The app code can also include retry logic during message processing, but it should avoid long-running database transactions and fall back to one of the techniques above after some retry threshold.
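Where messages must not be lost, a minimal sketch of a transactional dequeue that rolls back on uncommittable errors and dead-letters the rest; the table dbo.FifoDeadLetter and the processing procedure dbo.usp_HandlePayload are illustrative assumptions, not part of the schema above:
-- Hypothetical dead-letter table for this sketch.
CREATE TABLE dbo.FifoDeadLetter (
    Id bigint NOT NULL IDENTITY(1,1) CONSTRAINT pk_FifoDeadLetter PRIMARY KEY,
    Payload varbinary(MAX),
    ErrorMessage nvarchar(4000),
    DeadLetteredAtUtc datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
DECLARE @payload varbinary(MAX);
DECLARE @msg TABLE (Payload varbinary(MAX));
BEGIN TRANSACTION;
BEGIN TRY
    -- Same READPAST dequeue as usp_DequeueFifoTableMessage, but the
    -- delete stays uncommitted until processing succeeds.
    ;WITH cte AS (
        SELECT TOP(1) Payload
        FROM dbo.FifoQueueTable WITH (ROWLOCK, READPAST)
        ORDER BY Id
    )
    DELETE FROM cte
    OUTPUT deleted.Payload INTO @msg;
    SELECT @payload = Payload FROM @msg;
    IF @payload IS NOT NULL
        EXEC dbo.usp_HandlePayload @payload;  -- hypothetical processing proc
    COMMIT TRANSACTION;  -- the message is removed only after success
END TRY
BEGIN CATCH
    IF XACT_STATE() = 1
    BEGIN
        -- Committable failure: treat as a poison message. Keep the
        -- delete, record the payload, and commit.
        INSERT INTO dbo.FifoDeadLetter (Payload, ErrorMessage)
        VALUES (@payload, ERROR_MESSAGE());
        COMMIT TRANSACTION;
    END
    ELSE IF XACT_STATE() = -1
        ROLLBACK TRANSACTION;  -- uncommittable: message returns to the queue
    -- XACT_STATE() = 0: the transaction already rolled back (e.g. as a
    -- deadlock victim) and the message is back on the queue for retry.
END CATCH;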
These same concepts can be implemented with Service Broker to get a T-SQL-only solution (internal activation), but that adds complexity when it's not a requirement (as in your case). Note that SB queues intrinsically implement the READPAST behavior but, because all messages within the same conversation group are locked together, the implication is that each message will need to be in a separate conversation.

SQL Server, Service Broker for asynchronous execution of stored procedures

I followed this tutorial:
https://gallery.technet.microsoft.com/scriptcenter/Using-Service-Broker-for-360c961a
and it is working for me.
However,
I don't understand some things:
In PROCEDURE proc_BrokerTargetActivProc we have an infinite loop: WHILE (1=1). Why? After all, when creating the queue we bind messages to this procedure: PROCEDURE_NAME = proc_BrokerTargetActivProc.
In addition, I am not sure if I correctly understand how it works:
ExecuteProcedureAsync pushes a message onto the queue with the name of the procedure to execute.
What now? How does it work that BrokerTargetActivProc will be called with exactly one message?
What about the parameter MAX_QUEUE_READERS = 5?
Thanks in advance,
Regards
You have three questions here.
Q. "Why do we have an infinite loop in the activation procedure?"
A. The idea here is that there is some non-zero cost for starting a procedure. If you have a bunch of messages on the queue, having your already executing procedure handle them is cheaper than executing the procedure for each message.
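To make that concrete, here is a minimal sketch of the usual activation-procedure shape (the queue name dbo.TaskQueue and the processing step are illustrative, not from the tutorial):
CREATE PROCEDURE dbo.usp_TaskQueueActivation
AS
SET NOCOUNT ON;
DECLARE @handle uniqueidentifier,
        @message_type sysname,
        @body varbinary(MAX);
WHILE (1 = 1)
BEGIN
    -- Keep draining while messages arrive; give up after 5 idle seconds
    -- so the activated task can end. Activation fires it again later.
    WAITFOR (
        RECEIVE TOP(1)
            @handle = conversation_handle,
            @message_type = message_type_name,
            @body = message_body
        FROM dbo.TaskQueue
    ), TIMEOUT 5000;
    IF (@@ROWCOUNT = 0)
        BREAK;
    -- ... process @body according to @message_type here ...
END
GO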
Q. How will the activation procedure be called with only one message?
A. That is an implementation detail of the way that BrokerTargetActivProc is written. Specifically, the RECEIVE TOP(1) statement. In my environment, I receive multiple messages off of the queue at once (e.g. RECEIVE TOP(1000)). That choice (and the implications of that choice) is up to you.
Q. What about parameter MAX_QUEUE_READERS = 5?
A. In order to fully appreciate this, a reading of this article is useful. It outlines when activation occurs on a service broker queue. Having MAX_QUEUE_READERS be greater than one says that you're allowing the server to have more than one process getting messages at a time. This would be useful in the case where you have a bunch of messages come in in a short period of time and you want to increase throughput by having multiple executions of your activation procedure active at once to run through those messages.
Follow-up questions from the comments:
Q: Who is calling the BrokerTargetActivProc procedure?
A: The procedure is called when activation is deemed to be necessary (see the article linked to above). As for the execution context, you set that when you configure the procedure as the activation procedure for the queue. For instance, if I wanted it to execute as foo_user, I'd do:
alter queue [TASK_QUEUE] with activation (
    procedure_name = [BrokerTargetActivProc],
    execute as 'foo_user'
);
Q: How do you pass parameters to the activation procedure?
A: You don't. The point of the activation procedure is to de-queue messages and process them. So, all of the information should be in the message (which may drive queries etc).
Q: What about error handling?
A: You have to be careful here. Anything that causes a receive statement to rollback can trigger what is called poison message handling. That said, what I do is I wrap the receive and subsequent processing of a message in a try/catch block in the activation stored procedure and in the catch, I put the message into a table for later investigation. But how you handle errors and what you do with the messages that caused them is up to you!

SQL Service Broker Internal Activation Questions

I set up internal activation for two stored procedures. One inserts one or more records, the other updates one or more records in the same table. So I have two initiator and two target queues.
It works fine in development so far, but I wonder what types of problems I might encounter when we move it to prod, where these two stored procedures are frequently called. We are already experiencing deadlock issues caused by these two stored procedures. Asynchronous execution is my main goal with this implementation.
Questions:
Is there a way to use one target queue for both stored procedures to prevent any chance of deadlocks?
Is there anything I can do to make it more reliable? For example, one execution error should not stop incoming requests to the queue.
Tips to improve scalability (high number of executions per second)?
Can I set RETRY if there is a deadlock?
Here is the partial code of the insert stored procedure:
CREATE QUEUE [RecordAddUsersQueue];
CREATE SERVICE [RecordAddUsersService] ON QUEUE [RecordAddUsersQueue];
ALTER QUEUE [AddUsersQueue] WITH ACTIVATION
    (STATUS = ON,
     MAX_QUEUE_READERS = 1, --or 10?
     PROCEDURE_NAME = usp_AddInstanceUsers,
     EXECUTE AS OWNER);
CREATE PROCEDURE [dbo].[usp_AddInstanceUsers] @UsersXml xml
AS
BEGIN
    DECLARE @Handle uniqueidentifier;
    BEGIN DIALOG CONVERSATION @Handle
        FROM SERVICE [RecordAddUsersService]
        TO SERVICE 'AddUsersService'
        ON CONTRACT [AddUsersContract]
        WITH ENCRYPTION = OFF;
    SEND ON CONVERSATION @Handle
        MESSAGE TYPE [AddUsersXML] (@UsersXml);
END
GO
CREATE PROCEDURE [dbo].[usp_SB_AddInstanceUsers]
AS
BEGIN
    DECLARE @Handle uniqueidentifier;
    DECLARE @MessageType sysname;
    DECLARE @UsersXML xml;
    WHILE (1 = 1)
    BEGIN
        BEGIN TRANSACTION;
        WAITFOR
            (RECEIVE TOP (1)
                @Handle = conversation_handle,
                @MessageType = message_type_name,
                @UsersXML = message_body
             FROM [AddUsersQueue]), TIMEOUT 5000;
        IF (@@ROWCOUNT = 0)
        BEGIN
            ROLLBACK TRANSACTION;
            BREAK;
        END
        IF (@MessageType = 'ReqAddUsersXML')
        BEGIN
            --<INSERT>....
            DECLARE @ReplyMsg nvarchar(100);
            SELECT
                @ReplyMsg = N'<ReplyMsg>Message for AddUsers Initiator service.</ReplyMsg>';
            SEND ON CONVERSATION @Handle
                MESSAGE TYPE [RepAddUsersXML] (@ReplyMsg);
        END
        ELSE IF @MessageType = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
        BEGIN
            END CONVERSATION @Handle;
        END
        ELSE IF @MessageType = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
        BEGIN
            END CONVERSATION @Handle;
        END
        COMMIT TRANSACTION;
    END
END
GO
Thank you,
Kuzey
Is there a way to use one target queue for both stored procedures to prevent any chance of deadlocks?
You can and you should. There is no reason for having two target services/queues/procedures. Send, to the same service, two different message types for the two operations you desire. The activated procedure should then execute logic for Add or logic for Update, depending on message type.
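For illustration, a sketch of that shape; these message type, contract, and procedure names are made up, not from your code:
CREATE MESSAGE TYPE [AddUsersRequest] VALIDATION = WELL_FORMED_XML;
CREATE MESSAGE TYPE [UpdateUsersRequest] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [UsersContract] (
    [AddUsersRequest] SENT BY INITIATOR,
    [UpdateUsersRequest] SENT BY INITIATOR
);
GO
-- Inside the single activated procedure, dispatch on the message type:
-- IF @MessageType = N'AddUsersRequest'
--     EXEC dbo.usp_DoAddUsers @UsersXML;      -- hypothetical
-- ELSE IF @MessageType = N'UpdateUsersRequest'
--     EXEC dbo.usp_DoUpdateUsers @UsersXML;   -- hypothetical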
Is there anything I can do to make it more reliable? like one execution error should not stop incoming requests to the queue?
SSB activation will be very reliable, that's not going to be a problem. As long as you adhere strictly to transaction boundaries (do not commit dequeue operations before processing is complete), you'll never lose a message/update.
Tips to improve scalability (high number of execution per second)?
Read Writing Service Broker Procedures and Reusing Conversations. To achieve high-throughput processing, you will have to dequeue and process in batches (TOP(1000)) into @table variables. See Exception handling and nested transactions for a pattern that can be applied to process a batch of messages. You'll also need to read and understand Conversation Group Locks.
Can I set RETRY if there is a deadlock?
No need to; SSB activation will retry for you. As you roll back, the dequeue (RECEIVE) rolls back too, making the messages available to activation again, and the procedure will automatically retry. Note that 5 rollbacks in a row will trigger the poison message trap.
MAX_QUEUE_READERS = 1, --or 10?
If 1 cannot handle the load, add more. As long as you understand proper conversation group locking, the parallel activated procedures should handle unrelated business items and never deadlock. If you encounter deadlocks between instances of activated procedure on the same queue, it means you have a flaw in the conversation group logic and you allow messages seen by SSB as uncorrelated (different groups) to modify the same database records (same business entities) and lead to deadlocks.
BTW, you must have an activated procedure on the initiator service queue as well. See How to prevent conversation endpoint leaks.

SQL Service Broker example not working

I have two databases on the same instance: one called ICMS and one called CarePay_DEV1.
When a change happens in ICMS (source), it needs to send a message to CarePay_DEV1 (destination).
I am new to Service Broker and am trying to make a message go to the queue. Once that works, I will hopefully get the data into a table in the destination, which will then be processed by .NET code. But I just want something to appear in the destination first.
So, step 1: I enable the broker on the two databases:
-- Enable Broker on CarePay
ALTER DATABASE CarePay_Dev1
SET ENABLE_BROKER;
-- Enable Broker on Source
ALTER DATABASE ICMS_TRN
SET ENABLE_BROKER;
Step 2: Create the message type on the source and destination.
-- Create Message Type on Receiver:
USE CarePay_DEV1
GO
CREATE MESSAGE TYPE [IcmsCarePayMessage]
VALIDATION=WELL_FORMED_XML;
-- Create Message Type on Sender:
USE ICMS_TRN
GO
CREATE MESSAGE TYPE [IcmsCarePayMessage]
VALIDATION=WELL_FORMED_XML;
I then create the contracts, on both databases:
-- Create Contract on Receiver:
USE CarePay_DEV1
GO
CREATE CONTRACT [IcmsCarePayContract]
    ([IcmsCarePayMessage] SENT BY INITIATOR);
-- Create Contract on Sender:
USE ICMS_TRN
GO
CREATE CONTRACT [IcmsCarePayContract]
    ([IcmsCarePayMessage] SENT BY INITIATOR);
I then create the message queues on both databases:
-- CREATE Sending Message Queue
USE ICMS_TRN
GO
CREATE QUEUE CarePayQueue
-- CREATE Receiving Message Queue
USE CarePay_Dev1
GO
CREATE QUEUE CarePayQueue
And finally, I create the services on both databases:
-- Create the message services
USE ICMS_TRN
GO
CREATE SERVICE [CarePayService]
ON QUEUE CarePayQueue
USE CarePay_DEV1
GO
CREATE SERVICE [CarePayService]
ON QUEUE CarePayQueue
Now, the queues should be ready, so then I try and send something from the source to the destination:
-- SEND THE MESSAGE!
USE ICMS_TRN
GO
DECLARE @InitDlgHandle UNIQUEIDENTIFIER
DECLARE @RequestMessage VARCHAR(1000)
BEGIN TRAN
BEGIN DIALOG @InitDlgHandle
    FROM SERVICE [CarePayService]
    TO SERVICE 'CarePayService'
    ON CONTRACT [IcmsCarePayContract]
SELECT @RequestMessage = N'<Message>The eagle has landed!</Message>';
SEND ON CONVERSATION @InitDlgHandle
    MESSAGE TYPE [IcmsCarePayMessage] (@RequestMessage)
COMMIT TRAN
I get:
Command(s) completed successfully.
But then when I try select from the destination queue, it's empty.
/****** Script for SelectTopNRows command from SSMS ******/
SELECT TOP 1000 *, casted_message_body =
CASE message_type_name WHEN 'X'
THEN CAST(message_body AS NVARCHAR(MAX))
ELSE message_body
END
FROM [CarePay_DEV1].[dbo].[CarePayQueue] WITH(NOLOCK)
Can anyone spot the issue? I can't see where I tell the destination which database to send the message to - which could be part of the issue?
I highly recommend you read Adam Machanic's Service Broker Advanced Basics Workbench, specifically the section entitled "Routing and Cross-Database Messaging".
In addition, for future troubleshooting you may want to use SSBDiagnose, or read through Remus Rusanu's numerous articles on the topic.
I think the initiator service sent the message to itself. Try changing the name of the destination (target) service.
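If that is the cause, here is a minimal sketch of the fix; the new, distinct service names are made up for illustration:
-- Give the two ends distinct service names so SQL Server does not
-- deliver the message straight back to the local service.
USE ICMS_TRN
GO
CREATE SERVICE [IcmsCarePaySenderService] ON QUEUE CarePayQueue;
GO
USE CarePay_DEV1
GO
-- The target service must list the contract to accept incoming dialogs.
CREATE SERVICE [CarePayReceiverService]
    ON QUEUE CarePayQueue ([IcmsCarePayContract]);
GO
USE ICMS_TRN
GO
DECLARE @InitDlgHandle UNIQUEIDENTIFIER;
BEGIN DIALOG @InitDlgHandle
    FROM SERVICE [IcmsCarePaySenderService]
    TO SERVICE 'CarePayReceiverService'
    ON CONTRACT [IcmsCarePayContract]
    WITH ENCRYPTION = OFF;  -- avoids needing database master keys
SEND ON CONVERSATION @InitDlgHandle
    MESSAGE TYPE [IcmsCarePayMessage] (N'<Message>The eagle has landed!</Message>');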

SQL Server Service Broker

Currently we are using Service Broker to send messages back and forth, which is working fine. But we want to group those messages by using RELATED_CONVERSATION_GROUP. We wanted to use our own database-persisted uuid as the group (RELATED_CONVERSATION_GROUP = @uuid), but even though we use the same uuid every time, the conversation_group_id comes back different each time we receive from the queue.
Do you know what is wrong with the way I am creating the dialog or the receive call? I have provided both the Service Broker creation code and the receive code below. Thanks.
Below is the "Service Broker creation" code:
CREATE PROCEDURE dbo.OnDataInserted
    @EntityType NVARCHAR(100),
    @MessageID BIGINT,
    @uuid uniqueidentifier,
    @message_body nvarchar(max)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @conversation UNIQUEIDENTIFIER;
    BEGIN DIALOG CONVERSATION @conversation
        FROM SERVICE DataInsertSndService
        TO SERVICE 'DataInsertRcvService'
        ON CONTRACT DataInsertContract
        WITH RELATED_CONVERSATION_GROUP = @uuid;
    SEND ON CONVERSATION @conversation
        MESSAGE TYPE DataInserted
        (CAST(@message_body AS varbinary(max)));
END
Below is the "Receive" code:
WHILE 0 < @@TRANCOUNT ROLLBACK;
SET NOCOUNT ON;
BEGIN TRANSACTION;
DECLARE
    @cID as uniqueidentifier,
    @conversationHandle as uniqueidentifier,
    @conversationGroupId as uniqueidentifier,
    @tempConversationGroupId as uniqueidentifier,
    @message_body VARBINARY(MAX);
RAISERROR ('Awaiting Message ...', 16, 1) WITH NOWAIT;
WAITFOR (RECEIVE TOP (1)
    @cID = SUBSTRING(CAST(message_body as nvarchar(max)), 4, 36),
    @conversationHandle = [conversation_handle],
    @conversationGroupId = [conversation_group_id],
    @message_body = message_body
FROM DataInsertRcvQueue);
RAISERROR ('Message Received', 16, 1) WITH NOWAIT;
SELECT @tempConversationGroupId = conversationGroupID
FROM ConversationGroupMapper WHERE cID = @cID;
DECLARE @temp as nvarchar(max);
SET @temp = CAST(@tempConversationGroupId as nvarchar(max));
IF @temp <> ''
BEGIN
    MOVE CONVERSATION @conversationHandle TO @tempConversationGroupId;
    RAISERROR ('Moved to Existing Conversation Group', 16, 1) WITH NOWAIT;
END
ELSE
BEGIN
    INSERT INTO ConversationGroupMapper VALUES (@cID, @conversationGroupId);
    RAISERROR ('New Conversation Group', 16, 1) WITH NOWAIT;
END
WAITFOR DELAY '000:00:10';
COMMIT;
RAISERROR ('Committed', 16, 1) WITH NOWAIT;
Elaboration
Our situation is that we need to receive items from this Service Broker queue in a loop, blocking on WAITFOR, and hand them off to another system over an unreliable network. Items received from the queue are destined for one of many connections to that remote system. If the item is not successfully delivered to the other system, the transaction for that single item should be rolled back and the item will be returned to the queue. We commit the transaction upon successful delivery, unlocking the sequence of messages to be picked up by a subsequent loop iteration.
Delays in a sequence of related items should not affect delivery of unrelated sequences. Single items are sent into the queue as soon as they are available and are forwarded immediately. Items should be forwarded single-file, though order of delivery even within a sequence is not strictly important.
From the loop that receives one message at a time, a new or existing TcpClient is selected from our list of open connections, and the message and the open connection are passed along through the chain of asynchronous IO callbacks until the transmission is complete. Then we complete the DB transaction in which we received the item from the Service Broker queue.
How can Service Broker and conversation groups be used to assist in this scenario?
Conversation groups are a local concept only, used exclusively for locking: correlated conversations belong in a group so that while you process a message on one conversation, another thread cannot process a correlated message. No information about conversation groups is exchanged by the two endpoints, so in your example all the initiator endpoints end up belonging to one conversation group, while the target endpoints are each a distinct conversation group (each group having only one conversation). The reason the system behaves like this is that conversation groups are designed to address a problem like, say, a trip-booking service: when it receives a message to 'book a trip', it has to reserve a flight, a hotel, and a car rental. It must send three messages, one to each of these services ('flights', 'hotels', 'cars'), and then the responses will come back asynchronously. When they do come back, the processing must ensure that they are not processed concurrently by separate threads, which would each try to update the 'trip' record status. In messaging, this is known as the 'message correlation' problem.
However, often conversation groups are deployed in SSB solely for performance reasons: they allow larger RECEIVE results. Target endpoints can be moved together into a group by using MOVE CONVERSATION but in practice there is a much simpler trick: reverse the direction of the conversation. Have your destination start the conversations (grouped), and the source sends its 'updates' on the conversation(s) started by the destination.
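For illustration, a minimal sketch of the reversed-direction idea; the service, contract, and message type names here are made up, and [UpdatesContract] is assumed to declare [Subscribe] SENT BY INITIATOR and [DataInserted] SENT BY TARGET:
-- On the destination: start one dialog per correlated stream, all in the
-- same conversation group, so destination-side RECEIVEs lock together.
DECLARE @group uniqueidentifier = NEWID();
DECLARE @h uniqueidentifier;
BEGIN DIALOG @h
    FROM SERVICE [DestinationService]   -- the destination is the initiator
    TO SERVICE 'SourceService'
    ON CONTRACT [UpdatesContract]
    WITH RELATED_CONVERSATION_GROUP = @group, ENCRYPTION = OFF;
-- Tell the source about the new conversation so it can reply on it.
SEND ON CONVERSATION @h MESSAGE TYPE [Subscribe];
-- On the source, RECEIVE yields the conversation_handle; the source then
-- sends its updates on that same handle, e.g.:
-- SEND ON CONVERSATION @handleFromReceive MESSAGE TYPE [DataInserted] (@body);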
Some notes:
Don't use the fire-and-forget pattern of BEGIN/SEND/END. You're making it impossible to diagnose any problem in the future; see Fire and Forget: Good for the military, but not for Service Broker conversations.
Never ever use WITH CLEANUP in production code. It is intended for administrative last-resort action like disaster recovery. If you abuse it you deny SSB any chance to properly track the message for correct retry delivery (if the message bounces on the target, for whatever reason, it will be lost forever).
SSB does not guarantee order across conversations, only within one conversation. Starting a new conversation for each INSERT event is not guaranteed to preserve, on the target, the order of the insert operations.