SQL Server concurrency

I asked two questions at once in my last thread, and the first has been answered. I decided to mark the original thread as answered and repost the second question here. Link to original thread if anyone wants it:
Handling SQL Server concurrency issues
Suppose I have a table with a field which holds foreign keys for a second table. Initially records in the first table do not have a corresponding record in the second, so I store NULL in that field. Now at some point a user runs an operation which will generate a record in the second table and have the first table link to it. If two users simultaneously try to generate the record, a single record should be created and linked to, and the other user receives a message saying the record already exists. How do I ensure that duplicates are not created in a concurrent environment?
The steps I need to carry out are:
1) Look up x number of records in table A
2) Perform some business logic that prepares a single row which is inserted into table B
3) Update the records selected in step 1) to point to the newly created record in table B
I can use scope_identity() to retrieve the primary key of the newly created record in table B, so I don't need to worry about the new record being lost due to simultaneous transactions. However I need to eliminate the possibility of concurrently executing processes resulting in a duplicate record in table B being created.

In SQL Server 2008, this can be handled with a filtered unique index:
CREATE UNIQUE INDEX ix_MyIndexName ON MyTable (FkField) WHERE FkField IS NOT NULL
This will require all non-null values be unique, and the database will enforce it for you.
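To see it in action, here is a minimal sketch (the table definition is a made-up example with just the relevant columns):
-- Hypothetical demo of the filtered unique index
CREATE TABLE MyTable (PkField INT IDENTITY PRIMARY KEY, FkField INT NULL);
CREATE UNIQUE INDEX ix_MyIndexName ON MyTable (FkField) WHERE FkField IS NOT NULL;
INSERT INTO MyTable (FkField) VALUES (NULL); -- succeeds
INSERT INTO MyTable (FkField) VALUES (NULL); -- also succeeds: NULLs are excluded from the index
INSERT INTO MyTable (FkField) VALUES (42);   -- succeeds
INSERT INTO MyTable (FkField) VALUES (42);   -- fails with a duplicate key error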

The SQL Server 2005 way of simulating a unique filtered index for constraint purposes is:
CREATE VIEW dbo.EnforceUnique
WITH SCHEMABINDING
AS
SELECT FkField
FROM dbo.TableB
WHERE FkField IS NOT NULL
GO
CREATE UNIQUE CLUSTERED INDEX ix ON dbo.EnforceUnique(FkField)
Connections that update the base table will need to have the correct SET options, but unless you are using non-default options this will be the case anyway in SQL Server 2005 (ARITHABORT used to be the problem one in 2000).

Using a computed column
ALTER TABLE MyTable ADD
    OneNonNullOnly AS ISNULL(FkField, -PkField);
CREATE UNIQUE INDEX ix_OneNullOnly ON MyTable (OneNonNullOnly);
Assumes:
FkField is numeric
no clash of FkField and -PkField values
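Under those assumptions, the values the unique index actually sees look like this (a sketch, not taken from the original answer):
-- PkField  FkField   OneNonNullOnly
--    1      NULL         -1    (unique, because PkField is unique)
--    2      NULL         -2
--    3      7             7
--    4      7             7    <- rejected as a duplicate by the unique index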

Decided to go with the following:
1) Begin transaction
2) UPDATE tableA SET foreignKey = -1 OUTPUT inserted.id INTO #tempTable
FROM (business logic)
WHERE foreignKey is null
3) If @@ROWCOUNT > 0 Then
3a) Create record in table 2.
3b) Capture ID of newly created record using scope_identity()
3c) UPDATE tableA SET foreignKey = IdOfNewRecord FROM tableA INNER JOIN #tempTable ON tableA.id = #tempTable.id
Since I write junk into the foreign key field in step 2), those rows are locked and no concurrent transaction will touch them. The first transaction is free to create the record. After the transaction is committed, the blocked transaction will execute the update query, but won't capture any of the original rows because the WHERE clause only considers NULL foreignKey fields. If no rows are updated (@@ROWCOUNT = 0), the current transaction exits without creating the record in table B and returns some sort of error message to the client (e.g. Error: Record already exists).
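Put together as a single batch, the approach looks roughly like this (a sketch with assumed tableA/tableB column names; the business-logic filter is elided):
-- Sketch of the chosen approach; schemas are assumptions, not the OP's exact tables
BEGIN TRANSACTION;

DECLARE @rows TABLE (id INT);

-- Step 2: claim the candidate rows by writing a sentinel into the FK column
UPDATE tableA
SET foreignKey = -1
OUTPUT inserted.id INTO @rows
WHERE foreignKey IS NULL; -- plus the business-logic filter

IF @@ROWCOUNT > 0
BEGIN
    -- Steps 3a/3b: create the tableB record and capture its key
    INSERT INTO tableB (someColumn) VALUES ('...');
    DECLARE @newId INT;
    SET @newId = SCOPE_IDENTITY();

    -- Step 3c: point the claimed rows at the new record
    UPDATE a
    SET a.foreignKey = @newId
    FROM tableA a
    INNER JOIN @rows r ON a.id = r.id;

    COMMIT TRANSACTION;
END
ELSE
BEGIN
    ROLLBACK TRANSACTION;
    RAISERROR('Record already exists', 16, 1);
END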

Related

GORM auto-increments primary key even if data wasn't inserted into DB [duplicate]

I'm using MySQL's AUTO_INCREMENT field and InnoDB to support transactions. I noticed that when I roll back a transaction, the AUTO_INCREMENT field is not rolled back. I found out that it was designed this way, but are there any workarounds?
It can't work that way. Consider:
Program one opens a transaction and inserts into a table FOO which has an auto-increment primary key (arbitrarily, say it gets 557 as its key value).
Program two starts, opens a transaction, and inserts into table FOO, getting 558.
Program two inserts into table BAR, which has a column that is a foreign key to FOO. So now 558 is located in both FOO and BAR.
Program two now commits.
Program three starts and generates a report from table FOO. The 558 record is printed.
After that, program one rolls back.
How does the database reclaim the 557 value? Does it go into FOO and decrement all the other primary keys greater than 557? How does it fix BAR? How does it erase the 558 printed on the report program three output?
Oracle's sequence numbers are also independent of transactions for the same reason.
If you can solve this problem in constant time, I'm sure you can make a lot of money in the database field.
Now, if you have a requirement that your auto-increment field never have gaps (for auditing purposes, say), then you cannot roll back your transactions. Instead you need to have a status flag on your records. On first insert, the record's status is "incomplete"; then you start the transaction, do your work, and update the status to "complete" (or whatever you need). Then when you commit, the record is live. If the transaction rolls back, the incomplete record is still there for auditing. This will cause you many other headaches, but it is one way to deal with audit trails.
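A rough sketch of that status-flag pattern (the table and columns here are invented for illustration):
-- Hypothetical illustration of the status-flag audit pattern
CREATE TABLE audit_records (
    id INT AUTO_INCREMENT PRIMARY KEY,
    payload VARCHAR(255),
    status ENUM('incomplete','complete') NOT NULL DEFAULT 'incomplete'
) ENGINE=InnoDB;

-- First insert: reserve the ID; the row starts out 'incomplete'
INSERT INTO audit_records (payload) VALUES ('pending work');

-- Then do the real work in a transaction and flip the flag
START TRANSACTION;
UPDATE audit_records
SET payload = 'real data', status = 'complete'
WHERE id = LAST_INSERT_ID();
COMMIT;
-- If this transaction rolls back instead, the 'incomplete' row remains for auditing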
Let me point out something very important:
You should never depend on the numeric features of autogenerated keys.
That is, other than comparing them for equality (=) or inequality (<>), you should not do anything else. No relational operators (<, >), no sorting by indexes, etc. If you need to sort by "date added", have a "date added" column.
Treat them as apples and oranges: Does it make sense to ask if an apple is the same as an orange? Yes. Does it make sense to ask if an apple is larger than an orange? No. (Actually, it does, but you get my point.)
If you stick to this rule, gaps in the continuity of autogenerated indexes will not cause problems.
I had a client who needed the ID to roll back on a table of invoices, where the numbering had to be consecutive.
My solution in MySQL was to remove the AUTO_INCREMENT, pull the latest Id from the table, add one (+1), and then insert it manually.
If the table is named "TableA" and the auto-increment column is "Id":
INSERT INTO TableA (Id, Col2, Col3, Col4, ...)
VALUES (
(SELECT Id FROM TableA t ORDER BY t.Id DESC LIMIT 1)+1,
Col2_Val, Col3_Val, Col4_Val, ...)
Why do you care if it is rolled back? AUTO_INCREMENT key fields are not supposed to have any meaning, so you really shouldn't care what value is used.
If you have information you're trying to preserve, perhaps another non-key column is needed.
I do not know of any way to do that. According to the MySQL Documentation, this is expected behavior and will happen with all innodb_autoinc_lock_mode lock modes. The specific text is:
In all lock modes (0, 1, and 2), if a transaction that generated auto-increment values rolls back, those auto-increment values are “lost.” Once a value is generated for an auto-increment column, it cannot be rolled back, whether or not the “INSERT-like” statement is completed, and whether or not the containing transaction is rolled back. Such lost values are not reused. Thus, there may be gaps in the values stored in an AUTO_INCREMENT column of a table.
If you set auto_increment to 1 after a rollback or deletion, on the next insert, MySQL will see that 1 is already used and will instead get the MAX() value and add 1 to it.
This will ensure that if the row with the last value is deleted (or the insert is rolled back), it will be reused.
To set the auto_increment to 1, do something like this:
ALTER TABLE tbl auto_increment = 1
This is not as efficient as simply continuing on with the next number because MAX() can be expensive, but if you delete/rollback infrequently and are obsessed with reusing the highest value, then this is a realistic approach.
Be aware that this does not prevent gaps from records deleted in the middle or if another insert should occur prior to you setting auto_increment back to 1.
INSERT INTO prueba(id)
VALUES (
    (SELECT IFNULL(MAX(id), 0) + 1 FROM prueba target));
The IFNULL(MAX(id), 0) handles the case where the table is empty (zero rows). The alias target is there to work around MySQL's "You can't specify target table for update in FROM clause" error, which is raised when a subquery selects from the same table being inserted into.
If you need to have the ids assigned in numerical order with no gaps, then you can't use an autoincrement column. You'll need to define a standard integer column and use a stored procedure that calculates the next number in the insert sequence and inserts the record within a transaction. If the insert fails, then the next time the procedure is called it will recalculate the next id.
Having said that, it is a bad idea to rely on ids being in some particular order with no gaps. If you need to preserve ordering, you should probably timestamp the row on insert (and potentially on update).
A concrete answer to this specific dilemma (which I also had) is the following:
1) Create a table that holds different counters for different documents (invoices, receipts, RMAs, etc.); insert a record for each of your document types and set the initial counter to 0.
2) Before creating a new document, do the following (for invoices, for example):
UPDATE document_counters SET counter = LAST_INSERT_ID(counter + 1) where type = 'invoice'
3) Get the last value that you just updated to, like so:
SELECT LAST_INSERT_ID()
or just use your PHP (or whatever) mysql_insert_id() function to get the same thing
4) Insert your new record along with the primary ID that you just got back from the DB. This will override the current auto-increment index and make sure you have no ID gaps between your records.
This whole thing needs to be wrapped inside a transaction, of course. The beauty of this method is that when you roll back a transaction, your UPDATE statement from step 2) will be rolled back and the counter will not change. Other concurrent transactions will block until the first transaction is either committed or rolled back, so they will not have access to either the old counter or the new one until all other transactions are finished first.
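Putting the steps together, it would look something like this (the invoices table and its columns are hypothetical):
-- Hypothetical end-to-end example of the counter-table pattern
CREATE TABLE document_counters (
    type VARCHAR(32) PRIMARY KEY,
    counter INT NOT NULL
) ENGINE=InnoDB;
INSERT INTO document_counters (type, counter) VALUES ('invoice', 0);

START TRANSACTION;
-- Step 2: atomically bump the counter and remember the new value
UPDATE document_counters
SET counter = LAST_INSERT_ID(counter + 1)
WHERE type = 'invoice';
-- Step 4: use the reserved number as the primary ID of the new document
INSERT INTO invoices (id, customer, total)   -- invoices is a made-up table
VALUES (LAST_INSERT_ID(), 'ACME', 100.00);
COMMIT; -- a ROLLBACK here would also undo the counter increment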
SOLUTION:
Let's use 'tbl_test' as an example table, and suppose that the field 'Id' has the AUTO_INCREMENT attribute:
CREATE TABLE tbl_test (
    Id INT NOT NULL AUTO_INCREMENT,
    Name VARCHAR(255) NULL,
    PRIMARY KEY (Id)
);
Let's suppose the table already has hundreds or thousands of rows inserted and you don't want to use AUTO_INCREMENT anymore, because every time you roll back a transaction the 'Id' field still advances the AUTO_INCREMENT value by 1.
To avoid that, you can do the following.
First, remove the AUTO_INCREMENT attribute from column 'Id' (this won't delete your inserted rows):
ALTER TABLE tbl_test MODIFY COLUMN Id int(11) NOT NULL FIRST;
Finally, we create a BEFORE INSERT trigger to generate the 'Id' value automatically. Done this way, rolling back a transaction won't affect the Id sequence.
DELIMITER $$
CREATE TRIGGER trg_tbl_test_1
BEFORE INSERT ON tbl_test
FOR EACH ROW
BEGIN
    -- Assign the next Id from the current maximum (0 if the table is empty)
    SET NEW.Id = COALESCE((SELECT MAX(Id) FROM tbl_test), 0) + 1;
END$$
DELIMITER ;
That's it! You're done!
You're welcome.
// Demo: commit odd iterations, roll back even ones, and reset auto_increment each time
$masterConn = mysql_connect("localhost", "root", '');
mysql_select_db("sample", $masterConn);
for ($i = 1; $i <= 10; $i++) {
    mysql_query("START TRANSACTION", $masterConn);
    $qry_insert = "INSERT INTO `customer` (id, `a`, `b`) VALUES (NULL, '$i', 'a')";
    mysql_query($qry_insert, $masterConn);
    if ($i % 2 == 1) mysql_query("COMMIT", $masterConn);
    else mysql_query("ROLLBACK", $masterConn);
    // Setting auto_increment back to 1 makes MySQL fall back to MAX(id) + 1,
    // so values lost to a rollback are reused
    mysql_query("ALTER TABLE customer auto_increment = 1", $masterConn);
}
echo "Done";

Adding a computed column that uses MAX

I need to create a sequential number column for record-numbering purposes.
I am OK with losing the sequence if I delete a row from the middle of the table.
For example
1
2
3
If I delete 2, I am OK with the next number being 4.
I tried to alter my table to
alter table [dbo].[mytable]
add [record_seq] as (MAX(record_seq) + 1)
but I am getting "An aggregate may not appear in a computed column expression or check constraint."
This is a bit confusing. Do I need to specify an initial value? Is there a better way?
If you're looking to allocate a sequence number even in cases where the table doesn't get a record inserted, I would handle it in the process responsible for performing those inserts. Create another table, and in it keep track of the maximum value of that sequence. Each time you want to perform an insert, first reserve the sequence number you want by updating that table. If you instead rely on selecting the max existing value, you risk multiple sessions getting the same "new" sequence number before inserting. And even if the insert fails, you will have incremented the control table, so nothing else will use the value that was reserved.
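A sketch of that control-table approach in T-SQL (the table and column names here are invented for illustration):
-- Hypothetical control table holding the last reserved sequence number
CREATE TABLE dbo.SequenceControl (LastSeq INT NOT NULL);
INSERT INTO dbo.SequenceControl (LastSeq) VALUES (0);

-- Reserve the next number and insert inside one transaction
BEGIN TRANSACTION;
DECLARE @next INT;
UPDATE dbo.SequenceControl
SET @next = LastSeq = LastSeq + 1; -- increment and capture in a single statement

INSERT INTO dbo.mytable (record_seq /*, other columns */)
VALUES (@next);
COMMIT TRANSACTION;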
It's not supported in SQL Server. You can use an identity column:
ALTER TABLE [dbo].[mytable]
ADD [record_seq] INT IDENTITY
Or use a trigger to update your seq column after insert and/or delete, along the lines of the sketch below.
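A rough sketch of that trigger alternative (it assumes mytable has an int primary key id, and it is not safe under heavy concurrency without extra locking):
-- Hypothetical sketch: number newly inserted rows after the current maximum
CREATE TRIGGER trg_mytable_seq ON dbo.mytable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE t
    SET t.record_seq = s.base + s.rn
    FROM dbo.mytable t
    JOIN (
        SELECT i.id,
               ROW_NUMBER() OVER (ORDER BY i.id) AS rn,
               (SELECT ISNULL(MAX(record_seq), 0) FROM dbo.mytable) AS base
        FROM inserted i
    ) s ON t.id = s.id;
END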

Trying to copy a table from one database to another in SQL Server 2008 R2

I am trying to copy table information from a backup dummy database to our live SQL database (an accident happened in our program, Visma Business, where someone managed to overwrite 1300 customer names), but I am having a hard time figuring out the right code for this. I've looked around, and yes, there are several similar problems, but I just can't get this to work even though I've tried different solutions.
Here is the simple code I used last time. In theory, all I need is the equivalent of MySQL's ON DUPLICATE, which would be MERGE on SQL Server? I just didn't quite know what to write to get that MERGE to work.
INSERT [F0001].[dbo].[Actor]
SELECT * FROM [FDummy].[dbo].[Actor]
The error message i get with this is:
Violation of PRIMARY KEY constraint 'PK__Actor'. Cannot insert duplicate key in object 'dbo.Actor'.
What the error message says is simply "you can't add the same value twice if an attribute has a PK constraint". If you already have all the information in your backup table, what you should do is TRUNCATE TABLE, which removes all rows from a table while the table structure and its columns, constraints, indexes, and so on remain.
After that step you should follow this answer (a sketch of the idea is below). Alternatively, I recommend a tool called Kettle, which is open source and easy to use for these kinds of data movements. That will save you a lot of work.
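A minimal sketch of that approach (it assumes no foreign keys reference Actor, otherwise TRUNCATE will fail; and if Actor has an identity column you would also need SET IDENTITY_INSERT ... ON with an explicit column list):
-- Sketch: replace the live table's contents with the backup copy
TRUNCATE TABLE [F0001].[dbo].[Actor];
INSERT INTO [F0001].[dbo].[Actor]
SELECT * FROM [FDummy].[dbo].[Actor];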
Here are things which can be the reason:
You have multiple rows in [FDummy].[dbo].[Actor] with the same data in the column which is going to be inserted into the primary key column of [F0001].[dbo].[Actor].
You have existing rows in [FDummy].[dbo].[Actor] with some value x in that column, and there is/are row(s) in [F0001].[dbo].[Actor] with the same value x in the primary key column.
-- To check the first point: if this returns rows, you have the first problem
SELECT ColumnGoingToBeMappedWithPK,
       COUNT(*)
FROM [FDummy].[dbo].[Actor]
GROUP BY ColumnGoingToBeMappedWithPK
HAVING COUNT(*) > 1

-- To check the second point: if the count is greater than 0, you have the second problem
SELECT COUNT(*)
FROM [FDummy].[dbo].[Actor] a
JOIN [F0001].[dbo].[Actor] b
  ON a.ColumnGoingToBeMappedWithPK = b.PrimaryKeyColumn
The MERGE statement will possibly be the best option for you here, unless the primary key of the Actor table is reused after a previous record is deleted (i.e. it is not autoincremented) and, say, the record with id 13 in F0001.dbo.Actor is not the same "actor" as the one in FDummy.dbo.Actor.
To use the statement with your code, it will look something like this:
begin transaction
merge [F0001].[dbo].[Actor] as t -- the destination
using [FDummy].[dbo].[Actor] as s -- the source
on (t.[PRIMARYKEY] = s.[PRIMARYKEY]) -- update with your primary keys
when matched then
update set t.columnname1 = s.columnname1,
t.columnname2 = s.columnname2,
t.columnname3 = s.columnname3
-- repeat for all your columns that you want to update
output $action,
Inserted.*,
Deleted.*;
rollback transaction -- change to commit after testing
Further reading can be done at the sources below:
MERGE (Transact-SQL)
Inserting, Updating, and Deleting Data by Using MERGE
Using MERGE in SQL Server to insert, update and delete at the same time

Optimize Delete

I have an importer system which updates columns of already existing rows in a table. Since UPDATE was taking time, I changed it to DELETE and BULK INSERT.
Here is my database setup snippet
Table: ParameterDefinition
Columns: Id, Name, Other Cols
Table: ParameterValue
Columns: Id, CustId, ParameterDefId, Value
I get the values associated with ParameterDefinition.Name from my XML source, so to import I first delete all the existing ParameterValue rows matching the ParameterDefinition.Name values passed in the XML, and finally bulk insert all the values from the XML. Here is my query:
DELETE FROM ParameterValue WHERE CustId = ? AND ParameterDefId IN (?,?...?);
For 1000 customers the above DELETE statement is called 1000 times, which is very time consuming: approximately 64 seconds.
Is there any better way to handle DELETE of 1000 customers?
Thanks,
Sheeju
Create a temporary table for the bulk-insert (ParameterValue_Import). Do the bulk-inserts to this table, then update/insert/delete based on the imported data.
INSERT INTO .. SELECT .. WHERE NOT EXISTS ( .. ) for the new rows
UPDATE .. FROM for the updates
DELETE FROM WHERE NOT EXISTS ( .. ) for the deletion
Bulk operations have better performance than standalone operations. Most DBMSs are designed to handle set based operations instead of record based ones.
Edit
To delete or update one record based on a WHERE clause which refers to only that record, the DBMS must either do a full table scan (if there is no index for the WHERE condition) or an index lookup. Only after the record is successfully identified does the DBMS proceed with the original request (update or delete). Depending on the number of records in the table and/or the size/depth of the index, this can be really expensive, and the process is repeated for each and every command in the batch. Summing up the total cost, it can be more than updating/deleting records based on another table, especially if the operations touch nearly all records in the target table.
When you are trying to delete/update several records at once (e.g. based on another table), the DBMS could do the lookups with only one table scan/index lookup and do a logical join when processing your request.
The cost of purely updating a record is the same in each case, just the total cost of lookup could be significantly different.
Furthermore, deleting and then re-inserting a record in order to update it can require more resources: when you delete a record, all related indexes are updated, and when you insert the new record, the indexes are updated once more. With an update, only the indexes related to an updated column need to be touched, and each such index is updated only once.
I am giving the exact syntax for the above idea given by @Pred.
After the bulk insert, let's say you have the data in "ParameterValue_Import".
To INSERT the records from "ParameterValue_Import" which are not in "ParameterValue":
INSERT INTO ParameterValue (CustId, ParameterDefId, Value)
SELECT CustId, ParameterDefId, Value
FROM ParameterValue_Import
WHERE NOT EXISTS (
    SELECT NULL
    FROM ParameterValue
    WHERE ParameterValue.CustId = ParameterValue_Import.CustId
      -- match on both keys, consistent with the UPDATE below; correlating on
      -- CustId alone would skip new ParameterDefIds for existing customers
      AND ParameterValue.ParameterDefId = ParameterValue_Import.ParameterDefId
);
To UPDATE the records in "ParameterValue" which are also in "ParameterValue_Import":
UPDATE ParameterValue
SET Value = ParameterValue_Import.Value
FROM ParameterValue_Import
WHERE ParameterValue.ParameterDefId = ParameterValue_Import.ParameterDefId
  AND ParameterValue.CustId = ParameterValue_Import.CustId;
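For completeness, the deletion step Pred mentioned could look something like this (a sketch; it assumes a row is identified by CustId plus ParameterDefId and that only imported customers should be purged):
-- Sketch: remove rows for imported customers that are absent from the import
DELETE pv
FROM ParameterValue pv
WHERE pv.CustId IN (SELECT CustId FROM ParameterValue_Import)
  AND NOT EXISTS (
      SELECT NULL
      FROM ParameterValue_Import i
      WHERE i.CustId = pv.CustId
        AND i.ParameterDefId = pv.ParameterDefId
  );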

Auto Increment feature of SQL Server

I have created a table named ABC. It has three columns; the column number_pk (int) is the primary key, and I have turned the auto-increment feature on for that column.
Now I have deleted two rows from that table, say Number_pk = 5 and Number_pk = 6.
If I again enter two new rows in this table, I get two new Number_pk values starting from 7 and 8.
My question is: what is the logic behind this, since I have deleted the two rows from the table? I know that a simple answer is that I have set auto-increment on for the primary key of my table. But I want to know: is there any way I can insert the two new entries starting from the last Number_pk without changing the design of my table?
And how does SQL Server manage this record, since I have deleted the rows from the database?
The logic is guaranteeing that the generated numbers are unique. An ID field does not necessarily have to have a meaning; rather, it is most often used to identify a unique record, thus making it easier to perform operations on it.
If your database is designed properly, it would not have been possible to delete those ID numbers if they were referenced by any other table in a foreign key relationship, which prevents records from being orphaned in that way.
If you absolutely want to have sequential entries, you could consider issuing a RESEED, but as suggested, it would not really give you much advantage.
The identity record is "managed" because SQL Server keeps track of which numbers have been issued, regardless of whether they are still present or not.
Should you ever want to delete all records from a table, there are two ways to do so (provided no foreign key relations exist):
DELETE FROM Table
DELETE just removes the records, but the next INSERTed value will continue where the ID numbering left off.
TRUNCATE TABLE
TRUNCATE will actually RESEED the table, guaranteeing that it starts again at the value you originally specified (most likely 1).
Although you should not do this unless there is a specific requirement.
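A quick illustration of the difference, using the ABC table from the question (the values in the comments are what I'd expect, assuming an identity seed of 1):
-- Suppose rows 1..8 have been inserted:
DELETE FROM ABC     -- table is now empty, but the next insert still gets 9
TRUNCATE TABLE ABC  -- identity is reseeded, so the next insert gets 1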
1.) Get the max id:
DECLARE @id INT
SELECT @id = MAX(Number_pk) FROM ABC
2.) And reset the identity column:
DBCC CHECKIDENT('ABC', RESEED, @id)
-- Reseeding to the current max means the next inserted row gets @id + 1.
DBCC CHECKIDENT (Transact-SQL)