I (will) have hundreds of thousands of records, inserted once and never updated, with many rows holding the same previousId. Can I guarantee a contiguous start/end index, where I insert X rows into table_c inside a transaction and write the start and end (or start and length, or end and length) into table_b, instead of having each row hold a table_b ID?
If so, how do I write the SQL? I was thinking:
begin transaction
insert XYZ rows into tbl_c
c_rowId = last_insert_rowid()
insert into table_b with data + start = c_rowId - lengthOfInsert, end = c_rowId
commit transaction
Would this work as I expect?
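For what it's worth, a minimal sketch of how this might look in SQLite (table and column names are hypothetical; last_insert_rowid() is per-connection and returns the rowid of the most recent insert):

BEGIN TRANSACTION;

INSERT INTO table_c (previousId, payload)
VALUES (42, 'row 1'), (42, 'row 2'), (42, 'row 3');

-- Inside one write transaction no other writer can interleave, so the
-- three rowids are contiguous (assuming default rowid assignment and
-- no rowid reuse from earlier deletes at the top of the table).
INSERT INTO table_b (data, start_id, end_id)
VALUES ('batch meta', last_insert_rowid() - 2, last_insert_rowid());

COMMIT;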
It seems like what you want is an autonumber column. In most DBMSs, you can define this column as part of the table definition, and it will number the rows automatically. You can use a function to get the ID of the last row inserted; in SQL Server (T-SQL), that is the SCOPE_IDENTITY() function. If the IDs have to be a certain value or range of values, you may need to assign them manually and use a locking hint to prevent another concurrent transaction from modifying your identifier information.
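For example, a small T-SQL sketch (hypothetical table; the IDENTITY property does the numbering):

CREATE TABLE tbl_c (
    Id INT IDENTITY(1,1) PRIMARY KEY,
    Payload VARCHAR(50) NOT NULL
);

INSERT INTO tbl_c (Payload) VALUES ('abc');

-- SCOPE_IDENTITY() returns the last identity value generated in the
-- current scope, so it is not fooled by triggers or other sessions.
SELECT SCOPE_IDENTITY() AS LastId;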
You can lock the whole table in SQL Server by using TABLOCKX.
so:
begin the transaction
select max(txnid) from table with (tablockx) -- now the table is locked
-- figure out how many more rows you are going to insert
lastId = maxId + numToInsert
set identity_insert table on -- only needed if the id column is an IDENTITY column
while (moreToInsert)
    insert into table (id, x, y) values (id + 1, 'xx', 'yy')
commit transaction
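A more concrete T-SQL sketch of that idea (table and column names are hypothetical; a set-based insert replaces the loop):

DECLARE @staging TABLE (x VARCHAR(10), y VARCHAR(10));
INSERT INTO @staging VALUES ('xx', 'yy'), ('xx2', 'yy2');

BEGIN TRANSACTION;

DECLARE @maxId INT;
-- TABLOCKX holds an exclusive table lock until the transaction ends
SELECT @maxId = ISNULL(MAX(txnid), 0) FROM dbo.txn_table WITH (TABLOCKX);

-- assign contiguous ids explicitly, in one statement
INSERT INTO dbo.txn_table (txnid, x, y)
SELECT @maxId + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), x, y
FROM @staging;

COMMIT TRANSACTION;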
The problem with this, though, is that it locks the whole table. An auto-incrementing column (AUTO_INCREMENT in MySQL, IDENTITY in MS SQL) might be what you really want, and it would not limit access to the whole table. (This was noted in another answer.)
Related
I'm using MySQL's AUTO_INCREMENT field and InnoDB to support transactions. I noticed that when I roll back a transaction, the AUTO_INCREMENT field is not rolled back. I found out that it was designed this way, but are there any workarounds?
It can't work that way. Consider:
Program one opens a transaction and inserts into a table FOO which has an auto-increment primary key (arbitrarily, say it gets 557 as its key value).
Program two starts, it opens a transaction and inserts into table FOO getting 558.
Program two inserts into table BAR which has a column which is a foreign key to FOO. So now the 558 is located in both FOO and BAR.
Program two now commits.
Program three starts and generates a report from table FOO. The 558 record is printed.
After that, program one rolls back.
How does the database reclaim the 557 value? Does it go into FOO and decrement all the other primary keys greater than 557? How does it fix BAR? How does it erase the 558 printed on the report program three output?
Oracle's sequence numbers are also independent of transactions for the same reason.
If you can solve this problem in constant time, I'm sure you can make a lot of money in the database field.
Now, if you have a requirement that your auto-increment field never have gaps (for auditing purposes, say), then you cannot roll back your transactions. Instead, you need a status flag on your records. On first insert, the record's status is "incomplete"; then you start the transaction, do your work, and update the status to "complete" (or whatever you need). When you commit, the record is live. If the transaction rolls back, the incomplete record is still there for auditing. This will cause you many other headaches, but it is one way to deal with audit trails.
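A minimal sketch of that status-flag pattern (hypothetical invoice table; MySQL/InnoDB syntax):

-- The id is still AUTO_INCREMENT, but rows are never rolled back;
-- they are only flipped from 'incomplete' to 'complete'.
CREATE TABLE invoice (
    id INT AUTO_INCREMENT PRIMARY KEY,
    status ENUM('incomplete', 'complete') NOT NULL DEFAULT 'incomplete',
    amount DECIMAL(14,2) NULL
) ENGINE=InnoDB;

-- Claim the id outside the business transaction; this insert commits
-- immediately, so the number is never lost.
INSERT INTO invoice (status) VALUES ('incomplete');

-- Then do the real work and mark the record live.
START TRANSACTION;
UPDATE invoice SET amount = 100.00, status = 'complete'
WHERE id = LAST_INSERT_ID();
COMMIT;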
Let me point out something very important:
You should never depend on the numeric features of autogenerated keys.
That is, other than comparing them for equality (=) or inequality (<>), you should not do anything else with them. No relational operators (<, >), no sorting by them, etc. If you need to sort by "date added", have a "date added" column.
Treat them as apples and oranges: Does it make sense to ask if an apple is the same as an orange? Yes. Does it make sense to ask if an apple is larger than an orange? No. (Actually, it does, but you get my point.)
If you stick to this rule, gaps in the continuity of autogenerated indexes will not cause problems.
I had a client who needed the ID to roll back on a table of invoices, where the numbering must be consecutive. My solution in MySQL was to remove the AUTO_INCREMENT, pull the latest Id from the table, add one (+1), and then insert it manually.
If the table is named "TableA" and the auto-increment column is "Id":
INSERT INTO TableA (Id, Col2, Col3, Col4, ...)
VALUES (
(SELECT Id FROM TableA t ORDER BY t.Id DESC LIMIT 1)+1,
Col2_Val, Col3_Val, Col4_Val, ...)
Why do you care if it is rolled back? AUTO_INCREMENT key fields are not supposed to have any meaning so you really shouldn't care what value is used.
If you have information you're trying to preserve, perhaps another non-key column is needed.
I do not know of any way to do that. According to the MySQL Documentation, this is expected behavior and will happen with all innodb_autoinc_lock_mode lock modes. The specific text is:
"In all lock modes (0, 1, and 2), if a transaction that generated auto-increment values rolls back, those auto-increment values are “lost.” Once a value is generated for an auto-increment column, it cannot be rolled back, whether or not the “INSERT-like” statement is completed, and whether or not the containing transaction is rolled back. Such lost values are not reused. Thus, there may be gaps in the values stored in an AUTO_INCREMENT column of a table."
If you set auto_increment to 1 after a rollback or deletion, then on the next insert MySQL will see that 1 is already used, take the MAX() value of the column, and add 1 to it. This ensures that if the row with the highest value is deleted (or the insert is rolled back), that value will be reused.
To set the auto_increment to 1, do something like this:
ALTER TABLE tbl auto_increment = 1
This is not as efficient as simply continuing with the next number, because MAX() can be expensive; but if you delete or roll back infrequently and really need to reuse the highest value, this is a realistic approach. Be aware that it does not prevent gaps from records deleted in the middle, nor does it help if another insert occurs before you set auto_increment back to 1.
INSERT INTO prueba(id)
VALUES (
(SELECT IFNULL( MAX( id ) , 0 )+1 FROM prueba target))
The IFNULL(MAX(id), 0) handles the case where the table is empty (zero rows). The alias (target) is needed to avoid MySQL's "You can't specify target table for update in FROM clause" error on the SELECT.
If you need the ids assigned in numerical order with no gaps, then you can't use an auto-increment column. You'll need to define a standard integer column and use a stored procedure that calculates the next number in the sequence and inserts the record within a transaction. If the insert fails, then the next time the procedure is called it will recalculate the next id.
Having said that, it is a bad idea to rely on ids being in some particular order with no gaps. If you need to preserve ordering, you should probably timestamp the row on insert (and potentially on update).
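A minimal sketch of such a procedure in MySQL (the table name gapless_tbl is hypothetical; the SELECT ... FOR UPDATE serializes concurrent callers):

DELIMITER //
CREATE PROCEDURE insert_gapless(IN p_payload VARCHAR(50))
BEGIN
    DECLARE next_id INT;
    START TRANSACTION;
    -- FOR UPDATE locks what it reads, so two callers can't get the same max
    SELECT IFNULL(MAX(id), 0) + 1 INTO next_id FROM gapless_tbl FOR UPDATE;
    INSERT INTO gapless_tbl (id, payload) VALUES (next_id, p_payload);
    COMMIT;
END //
DELIMITER ;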
Concrete answer to this specific dilemma (which I also had) is the following:
1) Create a table that holds different counters for different documents (invoices, receipts, RMAs, etc.). Insert a record for each of your document types and set the initial counter to 0.
2) Before creating a new document, do the following (for invoices, for example):
UPDATE document_counters SET counter = LAST_INSERT_ID(counter + 1) where type = 'invoice'
3) Get the last value that you just updated to, like so:
SELECT LAST_INSERT_ID()
or just use your PHP (or whatever) mysql_insert_id() function to get the same thing
4) Insert your new record using the primary ID that you just got back from the DB. This overrides the current auto-increment index and makes sure you have no ID gaps between your records.
This whole thing needs to be wrapped inside a transaction, of course. The beauty of this method is that, when you roll back a transaction, your UPDATE statement from step 2 is rolled back too, and the counter does not change. Other concurrent transactions block until the first transaction is either committed or rolled back, so they cannot see either the old counter or the new one until the first transaction finishes.
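Putting the steps together, a hedged sketch (the invoice table and its columns are made up for illustration):

-- One counter row per document type, seeded once:
-- INSERT INTO document_counters (type, counter) VALUES ('invoice', 0);

START TRANSACTION;

-- Step 2: atomically bump the counter and remember it in LAST_INSERT_ID()
UPDATE document_counters
SET counter = LAST_INSERT_ID(counter + 1)
WHERE type = 'invoice';

-- Step 4: use the reserved number as the primary key
INSERT INTO invoice (id, customer, total)
VALUES (LAST_INSERT_ID(), 'ACME', 99.95);

COMMIT;  -- a ROLLBACK here would also undo the counter bump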
SOLUTION:
Let's use 'tbl_test' as an example table, and suppose the field 'Id' has AUTO_INCREMENT attribute
CREATE TABLE tbl_test (
    Id int NOT NULL AUTO_INCREMENT,
    Name varchar(255) NULL,
    PRIMARY KEY (`Id`)
);
Let's suppose the table already has hundreds or thousands of rows inserted and you don't want to use AUTO_INCREMENT anymore, because every time you roll back a transaction the AUTO_INCREMENT value still goes up by 1.
To avoid that, you can do the following.
First, remove the AUTO_INCREMENT attribute from column 'Id' (this won't delete your inserted rows):
ALTER TABLE tbl_test MODIFY COLUMN Id int(11) NOT NULL FIRST;
Finally, we create a BEFORE INSERT trigger to generate the 'Id' value automatically. Done this way, the Id value is unaffected even if you roll back a transaction.
DELIMITER //
CREATE TRIGGER trg_tbl_test_1
BEFORE INSERT ON tbl_test
FOR EACH ROW
BEGIN
    -- COALESCE covers the empty-table case, where MAX(Id) is NULL
    SET NEW.Id = COALESCE((SELECT MAX(Id) FROM tbl_test), 0) + 1;
END //
DELIMITER ;
That's it! You're done!
You're welcome.
// Demonstration: alternately commit and roll back inserts, resetting
// auto_increment to 1 after each one so MySQL reuses the highest id.
// (Uses the old mysql_* API; the `customer` table must be InnoDB for
// the transactions to have any effect.)
$masterConn = mysql_connect("localhost", "root", '');
mysql_select_db("sample", $masterConn);

for ($i = 1; $i <= 10; $i++) {
    mysql_query("START TRANSACTION", $masterConn);
    $qry_insert = "INSERT INTO `customer` (`id`, `a`, `b`) VALUES (NULL, '$i', 'a')";
    mysql_query($qry_insert, $masterConn);

    if ($i % 2 == 1) mysql_query("COMMIT", $masterConn);   // keep odd rows
    else             mysql_query("ROLLBACK", $masterConn); // discard even rows

    // Resetting to 1 makes MySQL fall back to MAX(id) + 1 on the next insert
    mysql_query("ALTER TABLE customer auto_increment = 1", $masterConn);
}
echo "Done";
I have a bankcustomer table which looks like this:
create table bankcustomer
(
cpr char(10) primary key,
name varchar(30) not null
)
And an account table which looks like this:
create table account
(
accountnr int identity(1001,1) primary key,
accountowner char(10) foreign key references bankcustomer,
created date not null,
balance decimal(14,2) not null
)
I want to write a trigger in SQL Server that limits a bankcustomer so that a bankcustomer can have no more than 3 accounts in the account table.
I create an account for a specific bank customer by inserting a record into the account table whose accountowner value matches a cpr from the bankcustomer table.
I have this code so far:
create trigger mytrigger4
on account
for insert
as
if exists (select count(*)
from inserted
join bankcustomer on inserted.accountowner = bankcustomer.cpr
where cpr = inserted.accountowner
having count(*) > 3)
begin
rollback tran
raiserror('A customer must have a maximum of 3 accounts', 16, 1)
end
go
The problem is that I can keep creating accounts (inserting records into the account table) for a customer even though the customer already has 3 accounts, which means the code in the trigger does not work at all.
Any help would be appreciated!
Let's think about your code. First, it is apparent that you test using single-row inserts only, and that probably carries over into your SQL code generally. That's bad, because an insert (or update, delete, or merge) can affect any number of rows. While you can expect that the majority of inserts from an application are likely to be single rows, there are always situations that affect multiple rows. That assumption should always be in your mind when writing SQL code generally, and triggers specifically.
Your test is based on EXISTS, which tests for the existence of rows generated by the query inside the EXISTS clause. So look carefully at your query. First, you count but do not group; therefore, you are counting all the rows generated by the query. This is incorrect because of the assumption mentioned earlier, but let's ignore that for the moment and examine the select statement alone.
Your select statement joins inserted to the parent bankcustomer. The join is correct, but why is there a where clause? And let's sidetrack into best practices. Always, ALWAYS, give each table a useful alias and use that alias when referencing columns. Why? Because this makes it easier for others to read and understand your query. BTW, a useful alias is not a single character. Yep, writing code can be a little work.
Let's continue. Your query counts all rows in the resultset. If you insert a single row, what is the result of your join? We know that an account is associated with a single bankcustomer, so when you insert a single row into account, the join will produce A SINGLE ROW. Counting that single-row resultset will always produce a single row with the value of, tadah, 1. That can never exceed 3, so the trigger never fires. There is one way to make your trigger generate an error: insert 4 or more rows with a single statement. The error will probably not be accurate, but it will kill the transaction and display a message.
So you see that your logic is flawed. You need to count rows in the actual table (account), not inserted. But not all the rows, because that would be inefficient; you just need to consider the rows that "share" the accountowner values found in inserted. Note the plural "values": this is where your single-row assumption fails. So, how to do that? Here is one way to write your trigger. Note that the first query is included only to let you "see" what the count query is producing; it is for debugging only. Production triggers should never return a resultset in any fashion.
alter trigger mytrigger4
on account
for insert
as begin
select cust.cpr, count(*)
from bankcustomer as cust join account as acc on cust.cpr = acc.accountowner
where exists (select * from inserted as ins where ins.accountowner = cust.cpr)
group by cust.cpr;
if exists (select cust.cpr, count(*)
from bankcustomer as cust join account as acc on cust.cpr = acc.accountowner
where exists (select * from inserted as ins where ins.accountowner = cust.cpr)
group by cust.cpr
having count(*) > 3)
begin
rollback tran
raiserror('A customer must have a maximum of 3 accounts', 16, 1)
end
end;
go
I'll leave it to you to actually test it thoroughly, which includes using insert statements that insert multiple rows. And, of course, vary the test data to include customers that have no accounts, fewer than 3 accounts, exactly 3 accounts, and more than 3 accounts (because sometimes things happen and extra accounts get added despite your best efforts).
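For instance, a quick multi-row test might look like this (the cpr value is made up; assume that customer already has two accounts, so this three-row insert must be rejected):

INSERT INTO account (accountowner, created, balance)
VALUES ('0101011234', GETDATE(), 0),
       ('0101011234', GETDATE(), 0),
       ('0101011234', GETDATE(), 0);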
I have an importer system which updates columns of already existing rows in a table. Since UPDATE was taking too long, I changed it to DELETE plus BULK INSERT.
Here is my database setup snippet
Table: ParameterDefinition
Columns: Id, Name, Other Cols
Table: ParameterValue
Columns: Id, CustId, ParameterDefId, Value
I get the values associated with ParameterDefinition.Name from my XML source, so to import I first delete all the existing ParameterValue rows for all the ParameterDefinition.Name values passed in the XML, and finally do a bulk insert of all the values from the XML. Here is my query:
DELETE FROM ParameterValue WHERE CustId = ? AND ParameterDefId IN (?,?...?);
For 1000 customers the above DELETE statement is called 1000 times, which is very time consuming: approximately 64 seconds.
Is there any better way to handle the DELETE for 1000 customers?
Thanks,
Sheeju
Create a temporary table for the bulk insert (ParameterValue_Import). Do the bulk inserts into this table, then update/insert/delete based on the imported data:
INSERT INTO .. SELECT .. WHERE NOT EXISTS ( .. ) for the new rows
UPDATE .. FROM .. for the updates
DELETE .. WHERE NOT EXISTS ( .. ) for the deletions
Bulk operations have better performance than standalone operations. Most DBMSs are designed to handle set-based operations instead of record-based ones.
Edit
To delete or update one record based on a WHERE clause which identifies only that record, the DBMS must either do a full table scan (if there is no index for the where condition) or an index lookup. Only after the record is successfully identified does the DBMS proceed with the original request (the update or delete). Depending on the number of records in the table and/or the size and depth of the index, this can be really expensive, and the process is repeated for each and every command in the batch. Summing up the total cost, it can be more than updating/deleting records based on another table (especially if the operation touches nearly all records in the target table).
When you delete/update several records at once (e.g. based on another table), the DBMS can do the lookups with only one table scan/index lookup and a logical join while processing your request.
The cost of actually updating a record is the same in each case; only the total cost of the lookups differs, and it can differ significantly.
Furthermore, deleting and then inserting a record to update it can require more resources: when you delete a record, all related indexes are updated, and when you insert the new record, the indexes are updated once more; whereas with an update, only the indexes related to an updated column need to be touched (and only once).
I am giving the exact syntax for the idea above from @Pred.
After the bulk insert, let's say you have the data in ParameterValue_Import.
To insert the records from ParameterValue_Import which are not yet in ParameterValue:
INSERT INTO ParameterValue (CustId, ParameterDefId, Value)
SELECT CustId, ParameterDefId, Value
FROM ParameterValue_Import
WHERE NOT EXISTS (
    SELECT NULL
    FROM ParameterValue
    -- a row is identified by customer AND parameter definition
    WHERE ParameterValue.CustId = ParameterValue_Import.CustId
      AND ParameterValue.ParameterDefId = ParameterValue_Import.ParameterDefId
);
To update the records in ParameterValue which are also in ParameterValue_Import:
UPDATE ParameterValue
SET Value = ParameterValue_Import.Value
FROM ParameterValue_Import
WHERE ParameterValue.ParameterDefId = ParameterValue_Import.ParameterDefId
  AND ParameterValue.CustId = ParameterValue_Import.CustId;
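The outline above also mentions a delete step, which the original answer omits. A sketch along the same lines, assuming rows absent from the import should be removed for the customers in this batch:

DELETE FROM ParameterValue
WHERE EXISTS (
    -- only touch customers present in this import batch
    SELECT NULL FROM ParameterValue_Import
    WHERE ParameterValue_Import.CustId = ParameterValue.CustId
)
AND NOT EXISTS (
    SELECT NULL FROM ParameterValue_Import
    WHERE ParameterValue_Import.CustId = ParameterValue.CustId
      AND ParameterValue_Import.ParameterDefId = ParameterValue.ParameterDefId
);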
I have a problem. I need to get the last inserted rows in all tables in a Firebird db. One more thing: these rows must contain a specified column name. I read some articles about the rdb$ system tables but have little experience with them.
There is no reliable way to get the "last row inserted" unless the table has a timestamp field which stores that information (an insertion timestamp).
If the table uses an integer PK generated by a sequence (a generator in Firebird lingo), then you could query for the highest PK value, but this isn't reliable either.
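As a hedged illustration of the timestamp approach (hypothetical table; Firebird syntax):

-- Store the insertion moment on every row...
CREATE TABLE my_table (
    id INTEGER NOT NULL PRIMARY KEY,
    payload VARCHAR(50),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
);

-- ...then the most recently inserted (and visible) row is:
SELECT FIRST 1 * FROM my_table ORDER BY created_at DESC;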
There is no concept of a 'last row inserted'. Visibility and availability to other transactions depend on the time of commit, the transaction isolation specified, etc. Even the use of a generator or timestamp, as suggested by ain, does not really help because of this visibility issue.
Maybe you are better off specifying the actual problem you are trying to solve.
SELECT GEN_ID(ID_HEADER, 0) + 1 FROM RDB$DATABASE INTO :ID;
INSERT INTO INVOICE_HEADER (No, Date_of, Etc) VALUES ('122', '2013-10-20', 'Any text');
/* The ID of the INVOICE_HEADER record gets its number from the generator
   above, so now we check whether ID = GEN_ID(ID_HEADER, 0) */
IF (ID = GEN_ID(ID_HEADER, 0)) THEN
BEGIN
    INSERT INTO INVOICE_FOOTER (RELACION_ID, TEXT, Etc) VALUES (:ID, 'Text', Etc);
END
ELSE
    EXCEPTION e_id_mismatch; /* raising an exception undoes the work; there is
                                no "REVERT TRANSACTION" statement in Firebird */
That is all
I asked two questions at once in my last thread, and the first has been answered. I decided to mark the original thread as answered and repost the second question here. Link to original thread if anyone wants it:
Handling SQL Server concurrency issues
Suppose I have a table with a field which holds foreign keys for a second table. Initially records in the first table do not have a corresponding record in the second, so I store NULL in that field. Now at some point a user runs an operation which will generate a record in the second table and have the first table link to it. If two users simultaneously try to generate the record, a single record should be created and linked to, and the other user receives a message saying the record already exists. How do I ensure that duplicates are not created in a concurrent environment?
The steps I need to carry out are:
1) Look up x number of records in table A
2) Perform some business logic that prepares a single row which is inserted into table B
3) Update the records selected in step 1) to point to the newly created record in table B
I can use scope_identity() to retrieve the primary key of the newly created record in table B, so I don't need to worry about the new record being lost due to simultaneous transactions. However I need to eliminate the possibility of concurrently executing processes resulting in a duplicate record in table B being created.
In SQL Server 2008, this can be handled with a filtered unique index:
CREATE UNIQUE INDEX ix_MyIndexName ON MyTable (FkField) WHERE FkField IS NOT NULL
This will require all non-null values be unique, and the database will enforce it for you.
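With the constraint in place, the losing transaction simply gets a duplicate-key error, which you can translate into the "record already exists" message the question asks for. A hedged sketch (MyTable, Id, and the variables are hypothetical; 2601 is the duplicate-key-in-unique-index error number):

DECLARE @rowId INT = 42, @newId INT = 7;  -- hypothetical values

BEGIN TRY
    UPDATE MyTable SET FkField = @newId WHERE Id = @rowId;
END TRY
BEGIN CATCH
    -- 2601 = duplicate key in a unique index
    IF ERROR_NUMBER() = 2601
        RAISERROR('Record already exists', 16, 1);
END CATCH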
The 2005 way of simulating a unique filtered index for constraint purposes is
CREATE VIEW dbo.EnforceUnique
WITH SCHEMABINDING
AS
SELECT FkField
FROM dbo.TableB
WHERE FkField IS NOT NULL
GO
CREATE UNIQUE CLUSTERED INDEX ix ON dbo.EnforceUnique(FkField)
Connections that update the base table will need the correct SET options, but unless you are using non-default options this will be the case anyway in SQL Server 2005 (ARITHABORT used to be the problem one in 2000).
Using a computed column
ALTER TABLE MyTable ADD
OneNonNullOnly AS ISNULL(FkField, -PkField)
CREATE UNIQUE INDEX ix_OneNullOnly ON MyTable (OneNonNullOnly);
Assumes:
FkField is numeric
no clash of FkField and -PkField values
Decided to go with the following:
1) Begin transaction
2) UPDATE tableA SET foreignKey = -1 OUTPUT inserted.id INTO #tempTable
FROM (business logic)
WHERE foreignKey is null
3) If @@rowcount > 0 Then
3a) Create record in table 2.
3b) Capture ID of newly created record using scope_identity()
3c) UPDATE tableA SET foreignKey = IdOfNewRecord FROM tableA INNER JOIN #tempTable ON tableA.id = tempTable.id
Since I write junk into the foreign key field in step 2), those rows are locked and no concurrent transactions will touch them. The first transaction is free to create the record. After the transaction is committed, the blocked transaction will execute the update query, but won't capture any of the original rows because the WHERE clause only considers NULL foreignKey fields. If no rows are returned (@@rowcount = 0), the current transaction exits without creating the record in table B, and returns some sort of error message to the client (e.g. "Error: Record already exists").
Since I write junk into the foreign key field in step 2), those rows are locked and no concurrent transactions will touch them. The first transaction is free to create the record. After the transaction is committed, the blocked transaction will execute the update query, but won't capture any of the original rows due to the WHERE clause only considering NULL foreignKey fields. If no rows are returned (##rowcount = 0), the current transaction exits without creating the record in table B, and returns some sort of error message to the client. (e.g. Error: Record already exists)