Get the exact ID within thousands of concurrent transactions - SQL

I have a very busy table with thousands of log rows. Every time a new log is inserted into the database, I need to update some other tables using the ID of the new log. So I get the last ID using these two lines of code:
objcon.execute "insert into logs (member,refer) values (12,12345)"
objcon.execute "select top 1 id from logs order by id desc"
I am afraid the second line might get the ID of a more recent row, because thousands of new logs are inserted every second.
This is a simplified scenario, and I know there are built-in methods to get the ID of the most recently inserted row. But my exact question is: is there a guaranteed logical order of transactions on the server (both IIS and SQL Server), or is it possible that a newer transaction finishes before an older one, so that the second line gets the ID of a different log?

It is definitely possible that your second query will get an id from another transaction. I strongly suggest that you use SCOPE_IDENTITY(). Methods like this are provided by the DBMS for exactly this scenario: you insert a row and then select the last row from that table, but in between those two operations other connections might have inserted new rows.

Yes. Concurrent transactions can cause problems with what you are trying to do.
The right solution is the OUTPUT clause. The code looks like this:
declare @ids table (id int);
insert into logs (member, refer)
output inserted.id into @ids
values (12, 12345);
select *
from @ids;
You can find multiple discussions on the web about why OUTPUT is better. Here are some reasons:
You can return multiple columns, not just the identity.
It is session-safe and table-safe.
It handles multiple rows.
It is the same syntax for SELECT, UPDATE, and DELETE.
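For instance, a minimal sketch of the first point, capturing the inserted member and refer values alongside the identity (their int types are an assumption, not stated in the question):
declare @ids table (id int, member int, refer int);
insert into logs (member, refer)
output inserted.id, inserted.member, inserted.refer into @ids
values (12, 12345);
-- id, member and refer for the new row, all captured atomically
select * from @ids;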

If you don't specify a WHERE clause on the SELECT query, you would need to execute both queries in a transaction under the SNAPSHOT isolation level and run the SELECT before committing the changes. That way, only changes made by the current transaction are visible to it.
It would be better to use SCOPE_IDENTITY(), which returns the last identity value generated within the current scope on the current connection. This differs from @@IDENTITY in that the value is not affected by triggers that might also generate identity values.
objcon.execute "insert into logs (member,refer) values (12,12345)"
objcon.execute "select SCOPE_IDENTITY() AS id;"

Related

Which Job inserted this record / row into the table?

I have dozens of different SQL Jobs calling different Sprocs, which insert rows into a common table.
Is there any way, given a row in the table, to retrieve the job which triggered the insert?
Input: Row ID, TableName, DBName
Output: Job ID which inserted Row
Not generally, as far as I'm aware. You could have the insert queries include that data themselves. Alternatively, you could get it from a log, matching on the primary key or another unique key, provided your inserts are unique. You might be able to turn on some SQL Server equivalent of the general log, but that is devastating to high-volume performance, and you would still have to pull the information out of a log file. I recommend you consider whether you can diagnose your components from their own logs in addition to their effects in the database.
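As a sketch of the first option, each job's stored procedure could stamp the rows it inserts. The table, column, and job names here (CommonTable, InsertedByJob, 'NightlyLoadJob') are hypothetical, not from your schema:
-- one-time schema change on the common table
ALTER TABLE CommonTable ADD InsertedByJob sysname NULL;
-- inside each stored procedure, identify the caller explicitly
INSERT INTO CommonTable (SomeColumn, InsertedByJob)
VALUES (@SomeValue, 'NightlyLoadJob');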

SELECT, check and INSERT if condition is met

I have the following tables: Tour, Customer, TourCustomerXRef. Assuming every Tour has a capacity Cap. Now a request arrives at the backend for a booking.
What I need to do is:
SELECT and count() all of the entries in TourCustomerXRef where the tourid=123
In the program code (or the query?): If count() < Cap
True: INSERT into TourCustomerXRef, return success
False: return an error
However, the backend API might be called concurrently. This could result in the SELECT statement returning a count under the maximum capacity both times, so the INSERT would be executed multiple times even though there is just one place left.
What would be the best way to prevent this case? SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, or REPEATABLE READ?
I'm worried that the application logic (even though it's just a simple if) could hurt performance, since read and write locks would keep the API from executing queries that just want to select the data without inserting anything, right?
(Using mariaDB)
You can try using the locking clause FOR UPDATE in the SELECT (MySQL/MariaDB example; note that FOR UPDATE goes at the end of the statement, after the WHERE clause):
START TRANSACTION;
SELECT * FROM TourCustomerXRef WHERE ... FOR UPDATE;
INSERT ...;
COMMIT;
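A fuller sketch of the whole booking flow under that pattern (the column names id, cap, tourid, and customerid are assumptions about the schema, not taken from the question). Locking the parent Tour row serializes concurrent bookings for the same tour while leaving other tours unaffected:
START TRANSACTION;
-- lock the tour row; concurrent bookings for tour 123 queue up here
SELECT cap INTO @cap FROM Tour WHERE id = 123 FOR UPDATE;
-- count existing bookings while the tour row is held
SELECT COUNT(*) INTO @booked FROM TourCustomerXRef WHERE tourid = 123;
-- insert only if a place is still free
INSERT INTO TourCustomerXRef (tourid, customerid)
SELECT 123, 456 FROM DUAL WHERE @booked < @cap;
COMMIT;
The application can check ROW_COUNT() after the INSERT: 1 means the booking succeeded, 0 means the tour was full.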

How to truncate and insert new rows into a table so that a select query never gets empty results

I have a requirement where a table holds the state of certain things.
This table is truncated and new status data is inserted into it every second.
The problem is that if a select query executes between the delete and the following insert, the user gets an empty table back.
I think SQL transactions would not help here, but I am not sure.
Also, if the select query executes between the delete and the insert, it shouldn't return an error because it's blocked by a database lock; it should just wait until the delete + insert operation has finished.
What would be the best way to implement such a system?
How should I form the "delete + insert" query and the "select" query?
Thank you in advance.
--------additional information
This table would be the result of several heavy queries and will be refreshed every second, so that applications do not have to run those heavy queries themselves; instead, they get the required information from this table.
So: a truncate and insert every second, and multiple selects at random times.
Don't truncate the table. Instead, insert the new status using an identity primary key or the date as the primary key. Then do:
select top 1 date
from table
order by date desc
or
select max(date)
from table
(These should have basically the same execution plan.)
Then you insert the new row; when the insert is done, the data is immediately available.
You can then delete older rows at your leisure.
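A minimal sketch of that cycle, assuming SQL Server and hypothetical names (StatusTable, status_date, payload):
-- writer, once per second: just insert a fresh status row
INSERT INTO StatusTable (status_date, payload) VALUES (GETDATE(), @payload);
-- readers: always see the latest complete row, never an empty table
SELECT TOP 1 * FROM StatusTable ORDER BY status_date DESC;
-- cleanup, at leisure: remove everything but the newest row
DELETE FROM StatusTable
WHERE status_date < (SELECT MAX(status_date) FROM StatusTable);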
From your description, this table always contains only one row: the last status. The contents change about every second, apparently 24 hours a day.
Rather than changing the data with a truncate/insert pair of operations, why not just update the one row? One operation, no race condition, no locking conflicts at all.
There is even a way to do that without changing any existing code:
Rename the table
Create a view which shows the row from the renamed table. Name it the original table name.
Create an "instead of insert" trigger on the view. The trigger performs an update to the table rather than an insert. This could be performed with a merge statement which will work if the table should ever happen to be empty.
Oops, I was wrong. You would still have to change the code to remove the truncate statement. It will not work against the view but it will throw an exception. Unfortunately, you can't intercept the truncate with a trigger and simply ignore it.
Then when the insert is executed, a truncate is no longer necessary and the insert converted into an update (or merge). One operation.
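A sketch of that setup, assuming SQL Server and hypothetical names (the real table becomes StatusData; the view keeps the original name Status):
EXEC sp_rename 'Status', 'StatusData';
GO
CREATE VIEW Status AS SELECT status_date, payload FROM StatusData;
GO
CREATE TRIGGER trg_Status_Insert ON Status
INSTEAD OF INSERT
AS
BEGIN
    -- update the single row if it exists, insert it otherwise
    MERGE StatusData AS target
    USING inserted AS source
        ON 1 = 1                     -- the table holds at most one row
    WHEN MATCHED THEN
        UPDATE SET status_date = source.status_date, payload = source.payload
    WHEN NOT MATCHED THEN
        INSERT (status_date, payload) VALUES (source.status_date, source.payload);
END;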

Capture log of fetched table rows

I am using DB2 (if you have a solution for another database, I am still interested), and am trying to identify every row that is fetched from a specific table. The solution needs to be at the database level, because I do not have access to the actual SELECT statements that cause the fetches. At a minimum, I would like to capture one or more column values into a log/table for every row that is fetched from the specific table.
Here's an example:
Table1 structure
CustNo (primary key)
CustName
Table 1 (two rows)
12345, Joe's Crab Shack
98765, Morton's The Steakhouse
Process
1) Before select, log file is empty
2) Execute: SELECT CustName from Table1 where CustNo=12345
3) After select, log file contains:
LogFile1
---------
12345
4) Execute: SELECT * from Table1
5) After select, log file contains:
LogFile1
---------
12345
12345
98765
Thank you for any advice/recommendations....
If you're willing to call a stored procedure to log this info, you might simply want to add a *READ trigger. It's rarely a good idea to have some function run whenever any record is read from a file, but a *READ trigger is probably the most efficient way to do it.
ADDPFTRG FILE(X) TRGTIME(*AFTER) TRGEVENT(*READ) PGM(Y)
Use that form of the command to add your "read-only" trigger program (Y) to a file (X). Program Y should do something fast, like pushing the relevant data items onto a data queue. Then have multiple batch instances of a program that pulls entries off the queue and writes them to a log file. You really don't want a read-only trigger doing any more work than necessary, and database I/O should be off the list.
Expect performance to suffer some.
You can see the operations on the database via db2audit, but you cannot get the values used; capturing the values for audit or logging could compromise sensitive data.
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.admin.cmd.doc/doc/r0002072.html
Actually, if you log only the ID of every row read from a given table, you are effectively copying the ID column into the log table over and over. At the same time, it gives no context, because the order in which the log rows are inserted is not the same order in which the source rows are stored or retrieved.
You have to rethink your logging strategy, because just inserting the 'fetched' ID is not enough. You also have to record some context information, like who (user), when (date), and from where (machine), in order to make the data usable.
Another thing you can do is wrap the SELECT in a stored procedure and insert the ID values into the log table before returning the open cursor to the caller.
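A minimal DB2 SQL PL sketch of that last idea (the ReadLog table and GetCustomer procedure are hypothetical, built around the Table1 example above):
CREATE TABLE ReadLog (
    CustNo  INTEGER,
    ReadBy  VARCHAR(128),
    ReadAt  TIMESTAMP
);

CREATE PROCEDURE GetCustomer (IN p_custno INTEGER)
LANGUAGE SQL
DYNAMIC RESULT SETS 1
BEGIN
    DECLARE c1 CURSOR WITH RETURN FOR
        SELECT CustNo, CustName FROM Table1 WHERE CustNo = p_custno;

    -- log the fetch before handing the cursor back to the caller
    INSERT INTO ReadLog (CustNo, ReadBy, ReadAt)
        VALUES (p_custno, SESSION_USER, CURRENT TIMESTAMP);

    OPEN c1;
END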

SQL Table locking for concurrency [duplicate]

This question already has answers here:
Only inserting a row if it's not already there
(7 answers)
Closed 9 years ago.
I'm trying to make sure that one and only one row gets inserted into a table, but I'm running into issues where multiple processes are bumping into each other and I get more than one row. Here are the details (probably more detail than is needed, sorry):
There's a table called Areas that holds a hierarchy of "areas". Each "area" may have pending "orders" in the Orders table. Since it's a hierarchy, multiple "areas" can be grouped under a parent "area".
I have a stored procedure called FindNextOrder that, given an area, finds the next pending order (which could be in a child area) and "activates" it. "Activating" it means inserting the OrderID into the QueueActive table. The business rule is that an area can only have one active order at a time.
So my stored procedure has a statement like this:
IF EXISTS (SELECT 1 FROM QueueActive WHERE <Order exists for the given area>) RETURN
...
INSERT INTO QueueActive <Order data for the given area>
My problem is that every once in a while, two different processes call this stored procedure at almost the same time. When each one checks for an existing row, each comes back with zero. Because of that, both processes run the insert statement and I end up with TWO active orders instead of just one.
How do I prevent this? Oh, and I happen to be using SQL Server 2012 Express but I need a solution that works in SQL Server 2000, 2005, and 2008 as well.
I already did a search for exclusively locking a table and found this answer but my attempt to implement this failed.
I would use some query hints on your SELECT statement. The trouble arises because your procedure only takes out shared locks, so other processes can slip in.
Tag WITH (ROWLOCK, XLOCK, READPAST) onto your SELECT.
ROWLOCK ensures that you are only locking the row.
XLOCK takes out an exclusive lock on the row, that way no one else can read it.
READPAST allows the query to skip over any locked rows and keep working instead of waiting.
The last one is optional and depends upon your concurrency requirements.
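A minimal sketch of how the hints slot into the procedure (AreaID and OrderID are hypothetical stand-ins for the elided predicate and columns). HOLDLOCK is added here because XLOCK alone has nothing to hold when no matching row exists yet, and READPAST is left out since, as noted, it is optional:
BEGIN TRANSACTION;

IF NOT EXISTS (
    SELECT 1 FROM QueueActive WITH (ROWLOCK, XLOCK, HOLDLOCK)
    WHERE AreaID = @AreaID
)
    -- only one caller at a time can get past the check
    INSERT INTO QueueActive (AreaID, OrderID) VALUES (@AreaID, @OrderID);

COMMIT;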
Further reading:
SQL Server ROWLOCK over a SELECT if not exists INSERT transaction
http://technet.microsoft.com/en-us/library/ms187373.aspx
Have you tried creating a trigger that rolls back the second transaction if there is already one active order in the table?