Retrieving ids of batch inserted rows in SQLite

I'm using SQLite's last_insert_rowid() to grab the last inserted row ID following a batch insert. Is there any risk of race conditions that could cause this value to not return the last id of the batch insert? For example, is it possible that in between the completion of the insert and the calling of last_insert_rowid() some other process may have written to the table again?

last_insert_rowid() is connection-scoped, so there is a risk only when multiple threads share the same connection; note that SQLite's serialized threading mode makes the individual API calls thread-safe, but it does not make the INSERT plus the following last_insert_rowid() call atomic as a pair.

last_insert_rowid() returns information about the last insert done in this specific connection; it cannot return a value written by some other process.
To ensure that the returned value corresponds to the current state of the database, take advantage of SQLite's ACID guarantees (here: atomicity): wrap the batch inserts, the last_insert_rowid() call, and whatever you're doing with the ID inside a single transaction.
In any case, the return value of last_insert_rowid() changes only when an insert is done through this connection, so you should not access the same connection from multiple threads - or, if you really want to do so, manually serialize entire transactions.
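A minimal sketch of the single-transaction pattern described above, run on one connection (the table and column names are placeholders):

BEGIN TRANSACTION;
INSERT INTO items (name) VALUES ('a'), ('b'), ('c');  -- batch insert
SELECT last_insert_rowid();  -- rowid of the last row inserted on this connection, i.e. 'c'
COMMIT;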

Related

Understanding locks and query status in Snowflake (multiple updates to a single table)

While using the Python connector for Snowflake with queries of the form
UPDATE X.TABLEY SET STATUS = %(status)s, STATUS_DETAILS = %(status_details)s WHERE ID = %(entry_id)s
sometimes I get the following message:
(snowflake.connector.errors.ProgrammingError) 000625 (57014): Statement 'X' has locked table 'XX' in transaction 1588294931722 and this lock has not yet been released.
and soon after that
Your statement 'X' was aborted because the number of waiters for this lock exceeds the 20 statements limit
This usually happens when multiple queries are trying to update a single table. What I don't understand is that when I look at the query history in Snowflake, it says the query finished successfully (Succeeded status), but in reality the UPDATE never happened, because the table was not altered.
So according to https://community.snowflake.com/s/article/how-to-resolve-blocked-queries I used
SELECT SYSTEM$ABORT_TRANSACTION(<transaction_id>);
to release the lock, but still nothing happened, and even with the Succeeded status the query seems not to have executed at all. So my question is: how does this really work, and how can a lock be released without losing the execution of the query? (Also, what happens to the other 20+ queries that are queued because of the lock? Sometimes it seems that when the lock is released, the next one takes the lock and has to be aborted as well.)
I would appreciate it if you could help me. Thanks!
Not sure if Sergio got an answer to this. The problem in this case is not with the table. Based on my experience with Snowflake, below is my understanding.
In Snowflake, every table operation also involves a change in the meta table that keeps track of micro-partitions and their min/max values. This meta table supports only 20 concurrent DML statements by default. So if a table is continuously being updated and hit at the same partition, there is a chance this limit will be exceeded. In that case, we should look at redesigning the table update/insert logic. In one of our use cases, we increased the limit to 50 after speaking to the Snowflake support team.
UPDATE, DELETE, and MERGE cannot run concurrently on a single table; they will be serialized, as only one can take a lock on a table at a time. The others will queue up in the "blocked" state until it is their turn to take the lock. There is a limit on the number of queries that can be waiting on a single lock.
If you see an update finish successfully but don't see the updated data in the table, then you are most likely not COMMITting your transactions. Make sure you run COMMIT after an update so that the new data is committed to the table and the lock is released.
Alternatively, you can make sure AUTOCOMMIT is enabled so that DML will commit automatically after completion. You can enable it with ALTER SESSION SET AUTOCOMMIT=TRUE; in any sessions that are going to run an UPDATE.
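For illustration, a minimal sketch of both options, reusing the UPDATE from the question (the literal values stand in for the bind parameters):

-- Option 1: explicit transaction control
BEGIN;
UPDATE X.TABLEY SET STATUS = 'DONE', STATUS_DETAILS = 'processed' WHERE ID = 42;
COMMIT;  -- releases the table lock and makes the change visible to other sessions

-- Option 2: let every DML statement commit on completion
ALTER SESSION SET AUTOCOMMIT = TRUE;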

Do @@Identity And Scope_Identity Always Return The Record Just Added?

I understand the differences between @@IDENTITY and SCOPE_IDENTITY, I think, but I'm struggling to find out exactly how they're generated.
All the documentation tells me that these functions return the ID of the last record added to a table. But if I have a stored procedure containing an INSERT statement, that procedure is part of a heavily used database, and it could be executed by multiple users at the same time, then if two users insert a record into the same table fractions of a second apart, is it possible that calling @@IDENTITY or SCOPE_IDENTITY from the stored procedure right after the INSERT statement could actually return the ID of a record inserted by the other user?
I think the answer is that SCOPE_IDENTITY would avoid this because, as the name suggests, it gets the identity of the last record added from within the scope of the call to SCOPE_IDENTITY (in this case, from within the same Stored Procedure), but since I'm not entirely sure what the definition of the scope is, I don't know if I'm right in thinking this.
Both @@identity and scope_identity() will return the id of a record created by the same user.
The @@identity function returns the id created in the same session. The session is the database connection, so that is normally the same thing as the user.
The scope_identity() function returns the id created in the same session and the same scope. The scope is the current query or the current stored procedure.
So the difference between them shows up if, for example, you call a procedure from another procedure: if the called procedure inserts a record, using @@identity after the call will return that id, but scope_identity() will not.
Way back before I knew better I ran a query like
select max(id) from table
to get the ID of a record I just inserted. When you use something like this in a production environment where you have multiple users adding records concurrently, bad things happen.
You are implying that @@Identity and scope_identity() work the same way as the query above. That's not the case. They both return the value of identity columns generated via inserts WITHIN THE CURRENT USER'S SESSION ONLY! Scope_Identity() is useful if you have tables with triggers on them and the trigger logic does its own inserts. In those cases @@Identity would return the identity value generated within the trigger, which is probably not what you want. It's for that reason I almost always prefer Scope_Identity() to @@Identity.
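A short T-SQL sketch of the difference (the table, columns, and the trigger scenario are hypothetical):

INSERT INTO dbo.Orders (CustomerName) VALUES ('Alice');
SELECT SCOPE_IDENTITY() AS NewOrderId;   -- identity generated by the INSERT above, in this scope and session
SELECT @@IDENTITY AS LastIdInSession;    -- last identity in this session; could come from an insert done by a trigger on dbo.Orders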

Oracle SQL and autonumbering: how to retrieve an OUT parameter - outside the procedure

At an earlier stage, our system was provided with tables that hold the last used autonumber (instead of using sequences). We are now redoing the client solution for the system, and need to 'reinvent' how to fetch the next record number - by SQL.
The client application is made in FileMaker; the database still resides in Oracle. The challenge is to update the last used autonumber AND supply it to the new record initiated in the client - in one operation.
A SELECT statement can retrieve the last used number.
An UPDATE statement can increment the last used number.
A function selecting and returning the number is not allowed to contain update statements.
A procedure may do the update, and may retain the new value returning it into an OUT parameter, but does not return the new value to the client - unless the client in some way can read the OUT parameter from the procedure (I do not think it reads DBMS_OUTPUT).
If the procedure proceeds to do an INSERT on the table where the client is preparing an INSERT, the inserts will not be identical, as far as I can see.
So - is there a syntax that will make the OUT value accessible to the client as the result of an SQL statement that includes a procedure call (perhaps making the OUT parameter in some way refer to the recnr field of the client's new record), or is this altogether a blind alley?
Regarding syntax: you need to wrap your PL/SQL procedure with the OUT parameter in a function (you can use an overloaded function with the same name in the same package) and return the OUT value as the function result.
Regarding design: I do not recommend using a home-made mechanism to replace sequences. Sequences are a much better optimised and more reliable solution.
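A minimal sketch of the wrapping suggested above; the existing procedure name (get_next_recnr) and its OUT parameter are assumptions for illustration:

CREATE OR REPLACE FUNCTION get_next_recnr_fn RETURN NUMBER IS
  l_recnr NUMBER;
BEGIN
  get_next_recnr(p_recnr => l_recnr);  -- assumed existing procedure: updates the counter table and returns the new value via OUT
  RETURN l_recnr;
END;
/

Note that if the client has to invoke this from a plain SELECT rather than from a PL/SQL block, the UPDATE performed inside would raise ORA-14551, so the counter update would then have to run in an autonomous transaction.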

Call commit on autoCommit=false connection for SELECT statements JDBC?

I have a webapp written in Java on Tomcat; all connections are autoCommit=false by default. Now, if I run only SELECT statements in a transaction, do I still need to call commit(), or is it sufficient just to close the connection?
For what it's worth: I am on Oracle 11.2.
There is a similar question, but it does not actually give an answer for this case.
It is sufficient to close the connection, no need to call commit or rollback.
But according to the documentation for Connection.close(), it is recommended to call either commit or rollback before closing.
Select statements do not disturb the underlying model or the data contained within the model. It is safe to close the connection without calling any commands related to transactions (like commit).
Actually, strike that. I had not considered adjacent selects made against the model in my first answer. Say you execute select id from users where age > 20 and follow it up with select id from users where age = 20; any update made between these queries (for example, one that changes a user's age from 25 to 20) could cause the same id to show up in both result sets. To guarantee consistent results you would need to wrap both selects in the same transaction, at an isolation level that gives a consistent read across statements, and then commit().
So yes, it makes sense to commit your selects.
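As a sketch of that idea on Oracle (the asker is on 11.2), a read-only transaction gives both queries a single consistent snapshot; the users table is the hypothetical one from the example above:

SET TRANSACTION READ ONLY;
SELECT id FROM users WHERE age > 20;
SELECT id FROM users WHERE age = 20;
COMMIT;  -- ends the read-only transaction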

Local Temporary table in Oracle 10 (for the scope of Stored Procedure)

I am new to Oracle. I need to process a large amount of data in a stored procedure. I am considering using temporary tables. I am using connection pooling and the application is multi-threaded.
Is there a way to create temporary tables such that a separate table instance is created for every call to the stored procedure, so that data from multiple stored procedure calls does not get mixed up?
You say you are new to Oracle. I'm guessing you are used to SQL Server, where it is quite common to use temporary tables. Oracle works differently so it is less common, because it is less necessary.
Bear in mind that using a temporary table imposes the following overheads:
read data to populate the temporary table
write the temporary table data to file
read the data back from the temporary table as your process starts
Most of that activity is useless in terms of helping you get stuff done. A better idea is to see if you can do everything in a single action, preferably pure SQL.
Incidentally, your mention of connection pooling raises another issue. A process munging large amounts of data is not a good candidate for running in OLTP mode. You really should consider initiating a background (i.e. asynchronous) process, probably a database job, to run your stored procedure. This is especially true if you want to run this job on a regular basis, because you can use DBMS_SCHEDULER to automate the management of such things.
If you're using transaction-level (rather than session-level) temporary tables, then this may already do what you want - as long as each call contains only a single transaction. (You don't quite provide enough detail to make it clear whether this is the case or not.)
So, to be clear: as long as each call contains only a single transaction, it won't matter that you're using a connection pool, since the data will be cleared out of the temporary table after each COMMIT or ROLLBACK anyway.
(Another option would be to create a uniquely named temporary table in each call using EXECUTE IMMEDIATE. Not sure how performant this would be though.)
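A sketch of the transaction-level temporary table described above (the table and column names are placeholders):

CREATE GLOBAL TEMPORARY TABLE proc_scratch (
  id      NUMBER,
  payload VARCHAR2(4000)
) ON COMMIT DELETE ROWS;  -- contents are cleared at every COMMIT or ROLLBACK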
In Oracle, it's almost never necessary to create objects at runtime.
Global Temporary Tables are quite possibly the best solution for your problem; however, since you haven't said exactly why you need a temp table, I'd suggest you first check whether one is necessary at all - half the time you can do with one SQL statement what you might have thought would require multiple queries.
That said, I have used global temp tables in the past quite successfully in applications that needed to maintain a separate "space" in the table for multiple contexts within the same session; this is done by adding an additional ID column (e.g. "CALL_ID") that is initially set to 1, with subsequent calls to the procedure incrementing this ID. The ID needs to be remembered somewhere, e.g. in a package-level variable declared in the package body. For example:
PACKAGE BODY gtt_ex IS
  last_call_id INTEGER;  -- package state: last CALL_ID handed out in this session

  PROCEDURE myproc IS
    l_call_id INTEGER;
  BEGIN
    -- allocate a new CALL_ID for this invocation
    last_call_id := NVL(last_call_id, 0) + 1;
    l_call_id := last_call_id;
    INSERT INTO my_gtt VALUES (l_call_id, ...);
    ...
    SELECT ... FROM my_gtt WHERE call_id = l_call_id;
  END;
END;
You'll find GTTs perform very well even with high concurrency, certainly better than using ordinary tables. Best practice is to design your application so that it never needs to delete the rows from the temp table - since the GTT is automatically cleared when the session ends.
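For reference, the session-level GTT assumed by the snippet above might be declared like this (columns other than call_id are placeholders):

CREATE GLOBAL TEMPORARY TABLE my_gtt (
  call_id NUMBER,
  payload VARCHAR2(4000)  -- stands in for the real data columns
) ON COMMIT PRESERVE ROWS;  -- rows survive COMMIT and are cleared when the session ends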
I used a global temporary table recently and it behaved in a very unwanted manner.
I was using the temp table to format some complex data in a procedure call and, once the data was formatted, pass it to the front end (ASP.NET).
On the first call to the procedure I would get the proper data, but any subsequent call would give me the data from the last procedure call in addition to the current call.
I investigated online and found the ON COMMIT DELETE ROWS option.
I thought that would fix the problem. Guess what? When I used the ON COMMIT DELETE ROWS option, I always got 0 rows back from the database, so I had to go back to the original approach of ON COMMIT PRESERVE ROWS, which preserves the rows even after committing the transaction. That option clears rows from the temp table only after the session is terminated.
Then I found this post and learned about the column used to track the CALL_ID of a session.
I implemented that solution and it still didn't fix the problem.
Then I wrote the following statement in my procedure before starting any processing:
DELETE FROM Temp_table;
The statement above did the trick. My front end was using connection pooling; after each procedure call it committed the transaction but still kept the connection in the pool, and the subsequent request used the same connection, so the database session was not terminated after every call.
Deleting the rows from the temp table before starting any processing made it work.
It drove me nuts till I found this solution.