I am currently working on a prototype to allow a client to update a subscriber database schema so that they can eventually change the subscriber to match the new version of their application, then switch over to that database when they deploy the front-end code.
My hope was that I could issue schema changes (for example, change a column's data type) while keeping the replication stored procedures up to date to properly convert any replicated data. While the subscriber might hold locks on big tables being updated, it could then just queue up the changes from the publisher instead of causing locking issues with the still-running application. I hope I'm explaining this well enough...
Here's what I tried:
BEGIN TRANSACTION
BEGIN TRY
    UPDATE dbo.Big_Table SET some_string = REPLACE(some_string, ',', '')
    ALTER TABLE dbo.Big_Table ALTER COLUMN some_string INT

    DECLARE @sql VARCHAR(MAX)
    SET @sql = 'create procedure [dbo].[sp_MSins_dboBig_Table]
        @c1 bigint,
        @c2 varchar(20),
        @c3 varchar(30)
    as
    begin
        declare @c2_new int
        set @c2_new = cast(replace(@c2, '','', '''') as int)

        insert into [dbo].[Big_Table] (
            [my_id],
            [some_string],
            [another_string]
        )
        values (
            @c1,
            @c2_new,
            @c3
        )
    end -- '
    EXEC(@sql)

    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION
END CATCH
This specific script would change a VARCHAR column that contains numeric data into an INT while at the same time removing any commas that might be included in a number like "1,325".
The problem is that this causes blocking at the publisher as well. I've seen references to pausing replication, but none of them have concrete steps to follow (I don't have a lot of replication experience); it's typically just "turn off some jobs".
I also saw a page on switching update modes, but I think that only applies to updatable subscribers.
Any suggestions on how to handle this situation?
How do you have replication set up now if the publisher and subscriber schema don't match? This is a really "interesting" setup. In general, messing with the default procedures is going to cause a headache. Because they're system maintained, your new version could be overwritten at any point (though in practice, this wouldn't happen). If you don't have an updating subscriber, the subscriber should be treated as read-only lest you break replication. Perhaps I'm missing something, though.
Hear me out! I know this use case sounds suspect, but...
I have a stored procedure which checks a table (effectively a cache) for data for a given requested ID. If it doesn't find any data for that ID, or deems it out of date, it executes a second stored procedure which will pull data from a separate database (using dynamic SQL, source DB name is based on the requested ID) and insert it into the local table. It then selects from this table.
If the data is in the table, everything returns quickly (in milliseconds), but if it needs to be brought in from the other database, it takes about 10 seconds. We're seeing race conditions where two concurrent instances check the local cache, see something is missing, and queue up sequential ingestions of the remote data into the cache. To avoid double-insertion, the cache-populating procedure clears whatever is already there for this ID, but this just means the first instance of the procedure can end up selecting no rows because the second instance deleted the just-inserted records before re-inserting them itself.
I think I want to put a lock around the entire procedure (checking the cache, potentially populating the cache, selecting from the cache), although I'm open to other solutions. I think the overall caching approach has to remain on-demand, though: the remote databases come and go by the hundreds, and we only want to cache the ones actually requested by reporting, as needed.
BEGIN TRANSACTION;
BEGIN TRY
    -- Take out a lock intended to prevent anyone else modifying the cache while we're reading and potentially modifying it
    EXEC sp_getapplock @Resource = '[private].[cache_entries]', @LockOwner = 'Transaction', @LockMode = 'Exclusive', @LockTimeout = 120000;

    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs

    -- CALCULATIONS
    -- ... SELECT FROM [private].cache_entries

    COMMIT TRANSACTION; -- Free the lock
END TRY
BEGIN CATCH -- Ensure we release our lock on failure
    ROLLBACK TRANSACTION;
    THROW
END CATCH;
The alternative to sp_getapplock is to use locking hints within your transaction. Both are reasonable approaches. Locking hints can be complex, but they protect the target object itself rather than a single code path, so they are sometimes necessary. sp_getapplock is simple (with Transaction as the owner) and reliable.
You can do this without sp_getapplock, which tends to inhibit concurrency a lot.
The way to do this is to continue doing your checks within a transaction, but to apply a HOLDLOCK hint as well as an UPDLOCK hint.
HOLDLOCK, aka the SERIALIZABLE isolation level, will place a lock not only on the ID you specify but also on the absence of such data; in other words, it will prevent anyone else from inserting a row for that ID.
You must use both of these hints, as well as have an index that matches that SELECT; otherwise you could run into major blocking and deadlocking problems due to full table scans.
Also, you don't need a CATCH and ROLLBACK. Just use SET XACT_ABORT ON; which ensures a rollback in any event of an error.
SET XACT_ABORT ON; -- always have this set

BEGIN TRANSACTION;

DECLARE @SomeData nvarchar(100) = (
    SELECT ce.SomeColumn
    FROM [private].cache_entries ce WITH (HOLDLOCK, UPDLOCK)
    WHERE ce.SomeCondition = 1
);

IF @SomeData IS NULL
BEGIN
    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs
END

-- CALCULATIONS
-- ... SELECT FROM [private].cache_entries

COMMIT TRANSACTION; -- Free the lock
We have a transactional replication setup where the subscriber is also a publisher to a second set of subscribers. I think this is because of the slow link between the primary publisher and the subscriber. The subscriber publishes the same set of articles to multiple local subscribers.
One problem we have is when the primary publisher/subscriber setup needs to be reinitialized, we have to remove the second publisher/subscriber setup. We get errors regarding dropping of tables otherwise. They can't be dropped by the initialization process because they are being used for replication by the second setup.
Maybe this is the way it has to be done but I'm wondering if there is a better way. Looking for any tips or suggestions.
Thanks,
Kevin
Maybe. The procedure to add an article (sp_addarticle) takes a parameter @pre_creation_cmd that specifies what to do before creating the article. The default is "drop", but it can be "none" (do nothing), "delete" (deletes all data in the destination table), or "truncate" (truncates the destination table). In your case, I'd choose "delete", since you can't truncate a replicated table either.
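For reference, a rough sketch of where that parameter goes when the article is defined; the publication and article names here are made up for illustration, and @pre_creation_cmd is the only part that matters:
EXEC sp_addarticle
    @publication      = N'MyPublication',
    @article          = N'Big_Table',
    @source_owner     = N'dbo',
    @source_object    = N'Big_Table',
    @pre_creation_cmd = N'delete'; -- 'none' | 'delete' | 'truncate' | 'drop' (the default)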
But I have to say that if it were me, I wouldn't do that either. I'd make my re-init script a sqlcmd script that looks something like:
:connect $(REPEATER_INSTANCE)
use [$(REPEATER_DB)];

declare arts cursor for
    select p.name as pub, a.name as art
    from sysarticles as a
    join syspublications as p
        on a.pubid = p.pubid;

open arts;
declare @p sysname, @a sysname;
while (1 = 1)
begin
    fetch next from arts into @p, @a;
    if (@@fetch_status <> 0)
        break;
    exec sp_droparticle @publication = @p, @article = @a;
end
close arts;
deallocate arts;

:connect $(PUBLISHER)
use [$(PUBLISHER_DB)];
--whatever script you use to create the publication here
Note: that's completely untested (I don't have replication set up at home), but should be pretty close.
Lastly (and rhetorically), why are you re-initializing so often? That should be a rare event. If it's not, you may have a configuration issue (e.g. if you're constantly lagging so far behind that you exceed the distribution retention period, increase that retention).
I'm using SQL Server 2008 R2.
I have a view; let's call it view1. This view is complex and slow. It cannot be made into an indexed view because it uses left joins and various other trickery. As such, we created a stored procedure which basically:
obtains an exclusive lock
selects * into computed_view1_tmp from view1; (slow)
creates indexes on the above computed table (slow)
renames computed_view1 to computed_view1_todelete; and does the same for its indexes (assumed fast)
renames computed_view1_tmp to computed_view1; and does the same for its indexes (assumed fast)
drops the table computed_view1_todelete (slow)
releases the lock.
We run this procedure when we know we're changing the data in our web application. We then have other views, such as view2 using computed_view1 instead of view1.
Once in a while, we get:
Invalid object name 'dbo.computed_view1'. Could not use view or function 'dbo.view2' because of binding errors.
I assume this is because we're trying to access dbo.computed_view1 at the same time as it's being renamed. I assume this is a very short period, but the frequency I am seeing this error in my logs makes me wonder if something else might be at play. I'm getting the error many times per day on a site with about a dozen users active throughout the day.
In development, this procedure takes about five seconds given the amount of data in the view. Renaming is instantaneous. In production, it must be taking longer but I don't understand why. I once saw the procedure fail to obtain the exclusive lock within 90 seconds.
Any thoughts on how to fix or a better solution?
Edit: Extra notes on my locking - maybe I'm not doing this right:
BEGIN TRANSACTION

DECLARE @result int
EXEC @result = sp_getapplock @Resource = 'lock_computed_view1', @LockMode = 'Exclusive', @LockTimeout = 90

IF @result NOT IN ( 0, 1 ) -- Only successful return codes
BEGIN
    PRINT @result
    RAISERROR ( 'Lock failed to acquire...', 16, 1 )
END
ELSE
BEGIN
    -- rest of the magic
END

EXEC @result = sp_releaseapplock @Resource = 'lock_computed_view1'

COMMIT TRANSACTION
If your locking and transaction scope are right, I would expect other transactions to wait and never see the view missing. This might be a SQL Server idiosyncrasy that I don't know about.
It is often possible to do without dynamic DDL. Here are two ways to do it:
TRUNCATE the computed table and insert into it. This takes an exclusive lock automatically. No need to rename. All of this is atomic and supports rollback (see the sketch after this list).
Use a staging table with the same schema and work on that; so far, no service interruption at all. Then swap the staging table with the production table using ALTER TABLE ... SWITCH. This is quick and atomic, and it does not require Enterprise Edition.
With these approaches the problem is solved by just not renaming.
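A minimal sketch of the first approach, assuming the computed table keeps the name dbo.computed_view1 from the question:
SET XACT_ABORT ON;
BEGIN TRANSACTION;
    -- TRUNCATE takes a schema-modification lock that is held until commit,
    -- so readers of dbo.computed_view1 simply wait for the refresh to finish
    -- instead of ever seeing the object missing.
    TRUNCATE TABLE dbo.computed_view1;
    INSERT INTO dbo.computed_view1
    SELECT * FROM dbo.view1;
COMMIT TRANSACTION;
The staging-table variant is similar: populate and index a table such as dbo.computed_view1_staging outside any user-facing work, then in a short transaction truncate the production table and run ALTER TABLE dbo.computed_view1_staging SWITCH TO dbo.computed_view1 (the target must be empty and have a matching schema and indexes).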
I know that there are other questions with the exact same title as the one I posted, but each of them is very specific to the query or procedure it references.
I manage a Blackboard Learn system here for a college and have direct database access. In short, there is a stored procedure that is causing system headaches. Sometimes, when changes to the system get committed, errors are thrown into logs in the back end, identifying a stored procedure known as bbgs_cc_setStmtStatus and erroring out with "The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction."
Here is the code for the SP; however, I did not write it, as it is a stock piece of "equipment" installed by Blackboard when it populates and creates the tables for the application.
USE [BBLEARN]
GO
/****** Object: StoredProcedure [dbo].[bbgs_cc_setStmtStatus] Script Date: 09/27/2013 09:19:48 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[bbgs_cc_setStmtStatus](
    @registryKey nvarchar(255),
    @registryVal nvarchar(255),
    @registryDesc varchar(255),
    @overwrite BIT
)
AS
BEGIN
    DECLARE @message varchar(200);

    IF (0 < (SELECT count(*) FROM bbgs_cc_stmt_status WHERE registry_key = @registryKey)) BEGIN
        IF (@overwrite = 1) BEGIN
            UPDATE bbgs_cc_stmt_status SET
                registry_value = @registryVal,
                description = @registryDesc,
                dtmodified = getDate()
            WHERE registry_key = @registryKey;
        END
    END
    ELSE BEGIN
        INSERT INTO bbgs_cc_stmt_status
            (registry_key, registry_value, description) VALUES
            (@registryKey, @registryVal, @registryDesc);
    END

    SET @message = 'bbgs_cc_setStmtStatus: Saved registry key [' + @registryKey + '] as status [' + @registryVal + '].';
    EXEC dbo.bbgs_cc_log @message, 'INFORMATIONAL';
END
I'm not expecting Blackboard-specific support, but I want to know if there is anything I can check as far as SQL Server 2008 is concerned to see if there is a system setting causing this. I do have a ticket open with Blackboard but have not heard anything yet.
Here are some things I have checked:
tempdb system database:
I made the templog have an initial size of 100 MB and auto-grow by 100 MB, unrestricted, to see if this was causing the issue. It didn't seem to help. Our actual tempdb starts at 4 GB and auto-grows by a gigabyte each time it needs it. Is it normal for the space available in tempdb to be 95-98% of its actual size? For example, right now tempdb has a size of 12388.00 MB and the space available is 12286.37 MB.
Also, the log file for the main BBLEARN database had stopped growing because it had reached its maximum auto-growth. I set its initial size to 3 GB to increase its size.
I see a couple of potential errors that could be preventing the commit but without knowing more about the structure these are just guesses:
The update clause in the nested if is trying to update a column (or set of columns) that must be unique. Because the check only verifies that at least one item exists but does not limit that check to making sure only one item exists
IF (0 < (SELECT ...) ) BEGIN
vs.
IF (1 = (SELECT ...) ) BEGIN
you could be inserting non-unique values into rows that must be unique. Check to make sure there are no constraints on the attributes the update runs on (specifically look for primary key, identity, and unique constraints). Likelihood of this being the issue: low but non-zero.
The application is not passing values to all of the parameters, causing the @message string to null out and thus causing the logging method to error as it tries to add a null string. Remember that in SQL, anything + NULL = NULL, so while you're fine to insert and update NULL values, you can't log NULLs in the manner the code you provided does. Rather, to account for NULLs, you should change the setter for the message variable to the following:
SET @message = 'bbgs_cc_setStmtStatus: Saved registry key [' + COALESCE(@registryKey, '') + '] as status [' + COALESCE(@registryVal, '') + '].';
This is far more likely to be your problem based on the reported error but again, without the app code (which might be preventing null parameters from being passed) there isn't any way to know.
Also, I would note that instead of doing a
IF (0 < (SELECT count(*) ...) ) BEGIN
I would use
IF (EXISTS (SELECT 1 ...) ) BEGIN
because it is more efficient. EXISTS can stop as soon as the first matching row is found, rather than having to count every matching row and then compare that count with 0.
Start with those suggestions and, if you can come back with more information, I can help you troubleshoot more.
Maybe you could use a MERGE statement:
http://msdn.microsoft.com/fr-fr/library/bb510625%28v=sql.100%29.aspx
I think it will be more efficient.
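For illustration, a hedged sketch of the same upsert logic from bbgs_cc_setStmtStatus expressed as a single MERGE, keeping the @overwrite check in the WHEN MATCHED clause (table and parameter names taken from the procedure above):
MERGE bbgs_cc_stmt_status AS target
USING (SELECT @registryKey AS registry_key) AS source
    ON target.registry_key = source.registry_key
WHEN MATCHED AND @overwrite = 1 THEN
    UPDATE SET registry_value = @registryVal,
               description = @registryDesc,
               dtmodified = GETDATE()
WHEN NOT MATCHED THEN
    INSERT (registry_key, registry_value, description)
    VALUES (@registryKey, @registryVal, @registryDesc);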
Launch SSMS
Navigate to the database name
Right click on the database name and choose Properties > Options
Change "Delayed Durability" to Allowed
Click OK
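If you'd rather script it than click through SSMS, the equivalent T-SQL should be the following (database name taken from the question; note that DELAYED_DURABILITY only exists in SQL Server 2014 and later):
ALTER DATABASE [BBLEARN] SET DELAYED_DURABILITY = ALLOWED;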
With the help of others on SO I've knocked up a couple of tables and stored procedures this morning, as I'm far from a DB programmer.
Would someone mind casting an eye over this and telling me if it's thread-safe? I guess that's probably not the term DBAs/DB developers use but I hope you get the idea: basically, what happens if this sp is executing and another comes along at the same time? Could one interfere with the other? Is this even an issue in SQL/SPs?
CREATE PROCEDURE [dbo].[usp_NewTicketNumber]
    @ticketNumber int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO [TEST_Db42].[dbo].[TicketNumber]
        ([CreatedDateTime], [CreatedBy])
    VALUES
        (GETDATE(), SUSER_SNAME())

    SELECT @ticketNumber = IDENT_CURRENT('[dbo].[TicketNumber]');

    RETURN 0;
END
You probably do not want to be using IDENT_CURRENT - this returns the latest identity generated on the table in question, in any session and any scope. If someone else does an insert at the wrong time you will get their id instead!
If you want to get the identity generated by the insert that you just performed, then it is best to use the OUTPUT clause to retrieve it. It used to be usual to use SCOPE_IDENTITY() for this, but there are problems with that under parallel execution plans.
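A minimal sketch of that, reusing the table and output parameter from the question and assuming the identity column is called id (as in the answer further down):
DECLARE @ids TABLE (id int);

INSERT INTO [dbo].[TicketNumber] ([CreatedDateTime], [CreatedBy])
OUTPUT inserted.id INTO @ids(id) -- captures only the row inserted by this statement
VALUES (GETDATE(), SUSER_SNAME());

SELECT @ticketNumber = id FROM @ids;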
The main SQL equivalent of thread safety is when multiple statements are executed that cause unexpected or undesirable behaviour. The two main types of such behaviour I can think of are locking (in particular deadlocks) and concurrency issues.
Locking problems occur when a statement stops other statements from accessing the rows it is working with. This can affect performance and in the worst scenario two statements make changes that cannot be reconciled and a deadlock occurs, causing one statement to be terminated.
However, a simple insert like the one you have should not cause locks unless something else is involved (like database transactions).
Concurrency issues (describing them very poorly) are caused by one set of changes to database records overwriting other changes to the same records. Again, this should not be a problem when inserting a record.
The safest way to go here would probably be to use the OUTPUT clause, since there is a known bug in SCOPE_IDENTITY() under certain circumstances (multi/parallel processing).
CREATE PROCEDURE [dbo].[usp_NewTicketNumber]
AS
BEGIN
    DECLARE @NewID INT

    BEGIN TRANSACTION
    BEGIN TRY
        DECLARE @ttIdTable TABLE (ID INT)

        INSERT INTO
            [dbo].[TicketNumber] ([CreatedDateTime], [CreatedBy])
        OUTPUT inserted.id INTO @ttIdTable(ID)
        VALUES
            (GETDATE(), SUSER_SNAME())

        SET @NewID = (SELECT ID FROM @ttIdTable)

        COMMIT TRANSACTION
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION
        SET @NewID = -1
    END CATCH

    RETURN @NewID
END
This way you should be thread safe, since the output clause uses the data that the insert actually inserts, and you won't have problems across scopes or sessions.
CREATE PROCEDURE [dbo].[usp_NewTicketNumber]
    @NewID int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRY
        BEGIN TRANSACTION

        INSERT INTO
            [dbo].[TicketNumber] ([CreatedDateTime], [CreatedBy])
        VALUES
            (GETDATE(), SUSER_SNAME())

        SET @NewID = SCOPE_IDENTITY()

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;
        SET @NewID = NULL;
    END CATCH
END
I would not use RETURN for meaningful data: use either a recordset or an output parameter. RETURN would normally be used for error states (as the system stored procs do in most cases):
EXEC @rtn = dbo.uspFoo
IF @rtn <> 0
--do error stuff
You can also use the OUTPUT clause to return a recordset instead.
This is "thread safe"; that is, it can be run concurrently.
First off - why don't you just return the new ticket number instead of 0 all the time? Any particular reason for that?
Secondly, to be absolutely sure, you should wrap your INSERT and SELECT statement into a TRANSACTION so that nothing from the outside can intervene.
Thirdly, with SQL Server 2005 and up, I'd wrap my statements into a TRY....CATCH block and roll back the transaction if it fails.
Next, I would try to avoid hard-coding the database name (TEST_Db42) in my procedures whenever possible - what if you want to deploy that proc to a new database (TestDB43)?
And lastly, I'd never use a SET NOCOUNT in a stored procedure - it can cause the caller to erroneously think the stored proc failed (see my comment to gbn below - this is a potential problem if you're using ADO.NET SqlDataAdapter objects only; see the MSDN docs on how to modify ADO.NET data with SqlDataAdapter for more explanations).
So my suggestion for your stored proc would be:
CREATE PROCEDURE [dbo].[usp_NewTicketNumber]
AS
BEGIN
    DECLARE @NewID INT

    BEGIN TRANSACTION
    BEGIN TRY
        INSERT INTO
            [dbo].[TicketNumber] ([CreatedDateTime], [CreatedBy])
        VALUES
            (GETDATE(), SUSER_SNAME())

        SET @NewID = SCOPE_IDENTITY()

        COMMIT TRANSACTION
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION
        SET @NewID = -1
    END CATCH

    RETURN @NewID
END
Marc
I agree with David Hall's answer; I just want to expand a bit on why IDENT_CURRENT is absolutely the wrong thing to use in this situation.
We had a developer here who used it. The insert from the client application happened at the same time the database was importing millions of records through an automated import. The id returned to him was from one of the records my process imported. He used this id to create records in some child tables, which were now attached to the wrong record. Worse, we now have no idea how many times this happened before someone couldn't find the information that should have been in the child tables (his change had been on prod for several months). Not only could my automated import have interfered with his code, but another user inserting a record at the same time could have done the same thing. IDENT_CURRENT should never be used to return the identity of a record just inserted, as it is not limited to the process that calls it.