How to get next autoincrement value in sqlite? [duplicate]

I have a table Messages with columns ID (primary key, autoincrement) and Content (text).
I have a table Users with columns username (primary key, text) and Hash.
A message is sent by one Sender (user) to many recipients (users), and a recipient (user) can have many messages.
I created a table Messages_Recipients with two columns: MessageID (referring to the ID column of the Messages table) and Recipient (referring to the username column in the Users table). This table represents the many-to-many relation between recipients and messages.
So, the question I have is this. The ID of a new message will be created after it has been stored in the database. But how can I hold a reference to the message row I just added in order to retrieve this new MessageID? I can always search the database for the last row added, of course, but that could return a different row in a multithreaded environment.
EDIT: As I understand it for SQLite you can use the SELECT last_insert_rowid(). But how do I call this statement from ADO.Net?
My Persistence code (messages and messagesRecipients are DataTables):
public void Persist(Message message)
{
    pm_databaseDataSet.MessagesRow messagerow;
    messagerow = messages.AddMessagesRow(message.Sender,
                                         message.TimeSent.ToFileTime(),
                                         message.Content,
                                         message.TimeCreated.ToFileTime());
    UpdateMessages();
    var x = messagerow; // I hoped the messagerow would hold a
                        // reference to the new row in the Messages table, but it does not.
    foreach (var recipient in message.Recipients)
    {
        var row = messagesRecipients.NewMessages_RecipientsRow();
        row.Recipient = recipient;
        //row.MessageID = How do I find this??
        messagesRecipients.AddMessages_RecipientsRow(row);
        UpdateMessagesRecipients(); // method not shown
    }
}

private void UpdateMessages()
{
    messagesAdapter.Update(messages);
    messagesAdapter.Fill(messages);
}

One other option is to look at the system table sqlite_sequence. Your SQLite database will have that table automatically if you created any table with an AUTOINCREMENT primary key. This table is how SQLite keeps track of the autoincrement field so that it never reuses a primary key value, even after you delete some rows or an insert fails (read more about this here: http://www.sqlite.org/autoinc.html).
So with this table there is the added benefit that you can find out your newly inserted item's primary key even after you have inserted something else (in other tables, of course!). After making sure that your insert was successful (otherwise you will read a stale number), you simply need to do:
select seq from sqlite_sequence where name = 'table_name';

With SQL Server you'd SELECT SCOPE_IDENTITY() to get the last identity value for the current process.
With SQLite, it looks like for an autoincrement you would do
SELECT last_insert_rowid()
immediately after your insert.
http://www.mail-archive.com/sqlite-users#sqlite.org/msg09429.html
In answer to your comment: to get this value from ADO.NET you would use the System.Data.SQLite provider, with code like the following (note that SQL Server's SqlConnection/SqlCommand won't work against a SQLite database):
using (var conn = new SQLiteConnection(connString))
{
    // last_insert_rowid() is per-connection, so it must run on the same
    // connection that performed the INSERT.
    string sql = "SELECT last_insert_rowid()";
    var cmd = new SQLiteCommand(sql, conn);
    conn.Open();
    long lastID = (long)cmd.ExecuteScalar(); // SQLite returns a 64-bit integer
}

I've had issues with using SELECT last_insert_rowid() in a multithreaded environment. If another thread inserts on the same connection into another table that has an autoincrement column, last_insert_rowid will return the autoincrement value from that other insert.
Here's where they state that in the doco:
If a separate thread performs a new INSERT on the same database connection while the sqlite3_last_insert_rowid() function is running and thus changes the last insert rowid, then the value returned by sqlite3_last_insert_rowid() is unpredictable and might not equal either the old or the new last insert rowid.
That's from sqlite.org doco

According to Android Sqlite get last insert row id there is another query:
SELECT rowid from your_table_name order by ROWID DESC limit 1

Sample code for @polyglot's solution:
SQLiteCommand sql_cmd = conn.CreateCommand(); // conn is an open SQLiteConnection
sql_cmd.CommandText = "select seq from sqlite_sequence where name = 'myTable';";
int newId = Convert.ToInt32(sql_cmd.ExecuteScalar());

sqlite3_last_insert_rowid() is unsafe in a multithreaded environment (and is documented as such by SQLite).
However, the good news is that you can play the odds; see below.
ID reservation is NOT implemented in SQLite. You can also sidestep the internal PK by using your own UNIQUE primary key, if there is something in your data that is always distinct.
Note:
See whether the RETURNING clause solves your issue:
https://www.sqlite.org/lang_returning.html
As this is only available in recent versions of SQLite and may have some overhead, consider relying on the fact that it is really bad luck if another insertion lands in between your two requests to SQLite (a sketch of RETURNING follows after these notes).
Also, if you absolutely need to fetch SQLite's internal PK, consider whether you can instead design your own predictable PK:
https://sqlite.org/withoutrowid.html
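For what it's worth, a minimal sketch of the RETURNING approach, assuming SQLite 3.35 or newer and column names loosely taken from the question's Messages table:
INSERT INTO Messages (Sender, Content)
VALUES ('alice', 'hello')
RETURNING ID;
-- the INSERT itself hands back the new ID, so no separate lookup is needed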
If you need a traditional AUTOINCREMENT PK, then yes, there is a small risk that the id you fetch belongs to another insertion. A small but unacceptable risk.
A workaround is to call sqlite3_last_insert_rowid() twice:
#1 BEFORE my insert, then #2 AFTER my insert,
as in:
sqlite3_int64 IdLast = sqlite3_last_insert_rowid(m_db);  // before (this id is already used)
const int rc = sqlite3_exec(m_db, sql, NULL, NULL, &m_zErrMsg);
sqlite3_int64 IdEnd = sqlite3_last_insert_rowid(m_db);   // after the insertion, most probably the right one
In the vast majority of cases IdEnd == IdLast + 1. This is the "happy path", and you can rely on IdEnd being the ID you are looking for.
Otherwise you need to do an extra SELECT, where you can use criteria based on the range IdLast to IdEnd (any additional criteria in the WHERE clause are good to add, if available).
Use ROWID (which is an SQLite keyword) to SELECT the id range that is relevant:
"SELECT my_pk_id FROM Symbols WHERE ROWID > %lld AND ROWID <= %lld;", IdLast, IdEnd);
// notice the > in ROWID > %lld, as we already know that IdLast is NOT the one we are looking for
As the second call to sqlite3_last_insert_rowid() is done right after the INSERT, this SELECT generally returns only 2 or 3 rows at most.
Then search the SELECT results for the data you inserted to find the proper id.
Performance improvement: as the call to sqlite3_last_insert_rowid() is much faster than the INSERT (even if a mutex can occasionally make this wrong, it is statistically true), bet on IdEnd being the right one and walk the SELECT results from the end. In nearly every case we tested, the last row did contain the ID we were looking for.
Performance improvement: if you have an additional UNIQUE key, then add it to the WHERE clause to get only one row.
I experimented with 3 threads doing heavy insertions; it worked as expected. Preparation + DB handling take the vast majority of CPU cycles, and the result is that the odds of a mixed-up ID (the situation where IdEnd > IdLast + 1) are in the range of 1 in 1000 insertions.
So the penalty of an additional SELECT to resolve this is rather low.
In other words, the benefit of using sqlite3_last_insert_rowid() is great for the vast majority of insertions and, with some care, it can even be used safely in a multithreaded setting.
Caveat: the situation is slightly more awkward in transactional mode.
Also, SQLite does not explicitly guarantee that IDs will be contiguous and increasing (unless AUTOINCREMENT is used). At least I did not find documentation about that, and looking at the SQLite source code does not settle it.

The simplest method would be using:
SELECT MAX(id) FROM yourTableName LIMIT 1;
If you are trying to grab this last id in order to affect another table, for example (if an invoice is added, THEN add the ItemsList rows under that invoice ID), in this case use something like:
var cmd_result = cmd.ExecuteNonQuery(); // returns the number of affected rows
Then use cmd_result to determine whether the previous query executed successfully, something like: if (cmd_result > 0), followed by your query SELECT MAX(id) FROM yourTableName LIMIT 1; just to make sure that you are not targeting the wrong row id in case the previous command did not add any rows.
In fact the cmd_result > 0 condition is a very necessary thing in case anything fails, especially if you are developing a serious application; you don't want your users waking up to find random items added to their invoice.

I recently came up with a solution to this problem that sacrifices some performance overhead to ensure you get the correct last inserted ID.
Let's say you have a table people. Add a column called random_bigint:
create table people (
id int primary key,
name text,
random_bigint int not null
);
Add a unique index on random_bigint:
create unique index people_random_bigint_idx
ON people(random_bigint);
In your application, generate a random bigint whenever you insert a record. I guess there is a trivial possibility that a collision will occur, so you should handle that error.
My app is in Go and the code that generates a random bigint looks like this:
import (
    "crypto/rand"
    "math/big"
)

// RandomPositiveBigInt returns a cryptographically random int64 in [0, 2^63-1).
func RandomPositiveBigInt() (int64, error) {
    nBig, err := rand.Int(rand.Reader, big.NewInt(9223372036854775807))
    if err != nil {
        return 0, err
    }
    return nBig.Int64(), nil
}
After you've inserted the record, query the table with a where filter on the random bigint value:
select id from people where random_bigint = <put random bigint here>
The unique index will add a small amount of overhead on the insertion. The id lookup, while very fast because of the index, will also add a little overhead.
However, this method will guarantee a correct last inserted ID.

Related

Locking row of a referenced table postgres

I have a table A which is referenced by a table B. That is to say, A's schema looks like this:
Table A
(
  id int,
  name varchar
)
While Table B's schema is:
Table B
(
id int,
a_id int,
val int
)
I have a piece of code that creates a record in table B. But in race conditions, say two parallel transactions, a condition in that block fails to hold, and as a result two records are created in table B instead of one.
The transaction block looks very similar to this (in Rails):
ActiveRecord::Base.transaction do
  # a here is an ActiveRecord object of model A
  b = B.new(a_id: a.id, val: value) # value is negative
  raise ActiveRecord::Rollback unless b.save
  # This method calculates the sum of the val column over all B records
  # associated with a, i.e. find all records from B where b.a_id = a.id
  # and sum the val column.
  sum = calculateSum(a)
  # The condition below fails under race conditions
  raise ActiveRecord::Rollback if sum <= 0
end
One solution to this would be to keep a centralized hash of locks, keyed by A's id, and before entering the block (in my application) wait for that lock to be released. This solution would definitely work, but I was wondering whether Postgres already provides a better solution.
Edit: there is no constraint that an A should have only one B record; an A can have many B's. It's just that the block of code I mentioned has a check that fails in the case of two parallel transactions.
The most general solution to concurrency issues like this is to put your whole block within a SERIALIZABLE transaction. Put simply, this guarantees that your transactions behave as if they had exclusive access to the database. The main downside is that you may trigger a serialisation failure at any point, even from a simple SELECT, and you should be prepared to retry the transaction if this happens. There is an example on the wiki which appears to be very similar to your case, which should give you a better idea of how these transactions behave in practice.
Other than that, I think you'll need to explicitly lock something. One possibility would be to lock the whole record in A via a SELECT FOR UPDATE statement, which will block competing processes in your application, as well as anything else trying to insert a referencing row in B. The drawback here is that you might block (or be blocked by) some unrelated operation, like an insert in a different referencing table, or an update of A itself.
A better approach might be to take out an advisory lock on A.id. This is basically equivalent to your centralised hash, but these locks have the advantage of being managed by Postgres, and automatically released on commit/rollback. The caveat is that, because you're taking out locks on arbitrary integers, you want to be sure that you don't collide with some other process which happens to be locking that same integer for some unrelated reason.
You can handle this by using the two-argument version of pg_advisory_xact_lock(), and using one of the inputs to identify the type of lock. Rather than maintaining a bunch of lock type constants somewhere on the client side, I find that a useful strategy is to wrap the call for each lock type in its own function, and use that function's oid as the type identifier, e.g.:
CREATE FUNCTION lock_A_for_insert_into_B(a_id int) RETURNS void LANGUAGE sql AS $$
  SELECT pg_advisory_xact_lock('lock_A_for_insert_into_B(int)'::regprocedure::int, a_id)
$$;
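For illustration, a rough sketch of how the Rails block might then use this, written as plain SQL against the question's table B (the literal ids and values are made up):
BEGIN;
-- Take the advisory lock for this particular A row; a concurrent transaction
-- calling this with the same a_id blocks here until we commit or roll back.
SELECT lock_A_for_insert_into_B(42);
INSERT INTO b (id, a_id, val) VALUES (1001, 42, -5);
-- Re-check the invariant while still holding the lock.
SELECT sum(val) FROM b WHERE a_id = 42;
-- If the sum is <= 0, issue ROLLBACK instead of COMMIT; the advisory lock is
-- released automatically either way, since it is transaction-scoped.
COMMIT;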
If I understand your dilemma, try executing inside a BEGIN...COMMIT block. For most operations, this takes the place of a lock. If the instructions fail, the db is unchanged. It is particularly useful for operations where multiple tables must change simultaneously.
You have a condition that will block? That's not how databases work. You don't do anything; they do it. Why is your app conditionally doing anything? The database ensures integrity, it'll be fine. A centralized hash of locks? I'm not sure what you're doing, but you're so far down the wrong rabbit hole that it's going to take a lot of cleverness to get you out.
You gotta backtrack. Fast.
CREATE TEMP TABLE a ( id_a int PRIMARY KEY, name text );
CREATE TEMP TABLE b ( id_b int PRIMARY KEY, id_a int REFERENCES a, val int );
WITH ti AS (
  INSERT INTO a (id_a, name) VALUES (2, 'foo')
  RETURNING id_a
)
INSERT INTO b (id_b, id_a, val)
SELECT 1, ti.id_a, 42
FROM ti;
Result:
TABLE a;
id_a | name
------+------
2 | foo
(1 row)
test=# TABLE b;
id_b | id_a | val
------+------+-----
1 | 2 | 42

SQL Oracle, inserting into tables without sequence

I am working with an Oracle 11g Release 2 (11.2) instance.
I'd like to know what I am exposing myself to by inserting rows into tables while generating the primary key values myself.
I would SELECT max(pk) FROM sometable;
and then use the next hundred values, for example, for my next 100 inserts.
Is this playing with fire?
The context is: I have a big number of inserts to do, split across several tables linked by foreign keys. I am trying to get good performance, and not use PL/SQL.
[EDIT] Here is a code sample that looks like what I'm dealing with:
QString query1 = "INSERT INTO table1 (pk1_id, val) VALUES (pk1_seq.nextval, ?)";
sqlQuery->prepare(query1);
sqlQuery->addBindValue(vec_of_values);
sqlQuery->execBatch();

QString query2 = "INSERT INTO table2 (pk2_id, another_val, pk1_pk1_id) VALUES (pk2_seq.nextval, ?, ?)";
sqlQuery->prepare(query2);
sqlQuery->addBindValue(vec_of_values);
// How do I get the primary keys (hundreds of them)
// from the first insert??
sqlQuery->addBindValue(vec_of_pk1);
sqlQuery->execBatch();
You are exposing yourself to slower performance, errors in your logic, and extra code to maintain. Oracle sequences are optimized for your specific purpose. For high DML operations you may also cache sequences:
ALTER SEQUENCE customers_seq CACHE 100;
Create a sequence for the master table(s)
Insert into the master table using your_sequence.nextval
Inserts into child (dependent) tables are done using your_sequence.currval
create table parent (id integer primary key not null);
create table child (id integer primary key not null, pid integer not null references parent(id));
create sequence parent_seq;
create sequence child_seq;
insert into parent (id) values (parent_seq.nextval);
insert into child (id, pid) values (child_seq.nextval, parent_seq.currval);
commit;
To explain why max(id) will not work reliably, consider the following scenario:
Transaction 1 retrieves max(id) + 1 (yields, say 42)
Transaction 1 insert a new row with id = 42
Transaction 2 retrieves max(id) + 1 (also yields 42, because transaction 1 is not yet committed)
Transaction 1 commits
Transaction 2 inserts a new row with id = 42
Transaction 2 tries to commit and gets a unique key violation
Now think about what happens when you have a lot of transactions doing this. You'll get a lot of errors. Additionally your inserts will be slower and slower, because the cost of calculating max(id) will increase with the size of the table.
Sequences are the only sane (i.e. correct, fast and scalable) way out of this problem.
Edit
If you are stuck with yet another ORM which can't cope with this kind of strategy (which is supported by nearly all DBMSs nowadays - even SQL Server has sequences now), then you should be able to do the following in your client code:
Retrieve the next PK value using select parent_seq.nextval from dual into a variable in your programming language (this is a fast, scalable and correct way to retrieve the PK value).
If you can run a select max(id) you can also run a select parent_seq.nextval from dual. In both cases just use the value obtained from that select statement.
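To make that concrete, a small sketch of the client-side flow (the sequence and table names are the ones assumed in the example above):
-- Fetch the next primary key value into a client-side variable:
SELECT parent_seq.nextval FROM dual;
-- Then use the fetched value explicitly in both inserts:
INSERT INTO parent (id)      VALUES (:fetched_id);
INSERT INTO child  (id, pid) VALUES (child_seq.nextval, :fetched_id);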

SQL - renumbering a sequential column to be sequential again after deletion

I've researched and realize I have a unique situation.
First off, I am not allowed to post images to the board yet since I'm a new user, so see the appropriate links below.
I have multiple tables where a column (not always the identifier column) is sequentially numbered and shouldn't have any breaks in the numbering. My goal is to make sure this stays true.
Down and Dirty
We have an 'Event' table from which we randomly select a percentage of the rows and insert them into a table 'Results'. The "ID" column from 'Results' is passed to a bunch of delete queries.
This more or less ensures that there are missing rows in several tables.
My problem:
Figuring out an SQL query that will renumber the column I specify. I prefer not to drop the column.
Example delete query:
delete ItemVoid
from ItemTicket
join ItemVoid
  on ItemTicket.item_ticket_id = ItemVoid.item_ticket_id
where ItemTicket.ID in (select ID from Results)
Example Tables Before:
Example Tables After:
As you can see, 2 rows were deleted from both tables based on the ID column. So now I have to figure out how to renumber the item_ticket_id and the item_void_id columns so that the higher numbers decrease to fill the missing values, the next highest ones decrease, and so on. Problem #2: if an item_ticket_id changes in order to keep ItemTicket sequential, then that change has to be propagated to ItemVoid's item_ticket_id.
I appreciate any advice you can give on this.
(answering an old question as it's the first search result when I was looking this up)
(MS T-SQL)
Resequencing an ID column (not an Identity one) that has gaps can be performed using only a simple CTE with a row_number() to generate the new sequence.
The UPDATE works via the CTE 'virtual table' without any extra problems, actually updating the underlying original table.
Don't worry about the ID fields clashing during the update. If you wonder what happens when IDs are set to values that already exist, it doesn't suffer from that problem - the original sequence is changed to the new sequence in one go.
WITH NewSequence AS
(
    SELECT ID,
           ROW_NUMBER() OVER (ORDER BY ID) as ID_New
    FROM YourTable
)
UPDATE NewSequence SET ID = ID_New;
Since you are looking for advice on this, my advice is you need to redesign this as I see a big flaw in your design.
Instead of deleting the records and then going through the hassle of renumbering the remaining records, use a bit flag that marks the records as Inactive. Then when you are querying the records, just include a WHERE clause to only include the records that are active:
SELECT *
FROM yourTable
WHERE Inactive = 0
Then you never have to worry about re-numbering the records. This also gives you the ability to go back and see the records that would have been deleted and you do not lose the history.
If you really want to delete the records and renumber them then you can perform this task the following way:
create a new table
Insert your original data into your new table using the new numbers
drop your old table
rename your new table with the corrected numbers
As you can see there would be a lot of steps involved in re-numbering the records. You are creating much more work this way when you could just perform an UPDATE of the bit flag.
You would change your DELETE query to something similar to this:
UPDATE ItemVoid
SET Inactive = 1
FROM ItemVoid
JOIN ItemTicket
  on ItemVoid.item_ticket_id = ItemTicket.item_ticket_id
WHERE ItemTicket.ID IN (select ID from Results)
The bit flag is much easier and that would be the method that I would recommend.
The function that you are looking for is a window function. In standard SQL (SQL Server, and MySQL 8.0 onwards), the function is row_number(). You use it as follows:
select row_number() over (order by <col>)
from <table>
In order to use this in your case, you would delete the rows from the table, then use a with statement to recalculate the row numbers, and then assign them using an update. For transactional integrity, you might wrap the delete and update into a single transaction.
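As a hedged sketch of that flow against the tables in the question (T-SQL; table and column names are taken from the question, and it assumes any foreign key between the tables is either absent or temporarily disabled):
BEGIN TRANSACTION;

-- 1. Delete as before (rows referenced by Results).
DELETE iv
FROM ItemVoid iv
JOIN ItemTicket it ON it.item_ticket_id = iv.item_ticket_id
WHERE it.ID IN (SELECT ID FROM Results);

-- 2. Propagate the new numbering to the child table first, using a
--    mapping from old item_ticket_id values to their new row numbers.
WITH Renumbered AS (
    SELECT item_ticket_id,
           ROW_NUMBER() OVER (ORDER BY item_ticket_id) AS new_id
    FROM ItemTicket
)
UPDATE iv
SET item_ticket_id = r.new_id
FROM ItemVoid iv
JOIN Renumbered r ON r.item_ticket_id = iv.item_ticket_id;

-- 3. Then renumber the parent itself through the same kind of CTE.
WITH Renumbered AS (
    SELECT item_ticket_id,
           ROW_NUMBER() OVER (ORDER BY item_ticket_id) AS new_id
    FROM ItemTicket
)
UPDATE Renumbered SET item_ticket_id = new_id;

COMMIT;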
Oracle supports similar functionality, but the syntax is a bit different. Oracle calls these functions analytic functions and they support a richer set of operations on them.
I would strongly caution you from using cursors, since these have lousy performance. Of course, this will not work on an identity column, since such a column cannot be modified.

Which is faster to find repetitions?

I have a table with one column and I want to check whether a value is repeated among the 10,000 existing rows.
I think I have two choices:
Make a query using SELECT statement, like this :
Var = Query('SELECT * FROM Table WHERE Field1 = "VALUE"');
if (Var <> null)
    MessageBox("This value exists in the table");
Set my column as Primary Key and use INSERT statement, like this:
try {
    Var = Query('INSERT INTO Table(Field1) VALUES("VALUE")');
}
catch {
    MessageBox("This value exists in the table");
}
Which is faster?
There's no general answer here, it depends upon how your schema is set up. In most (perhaps all?) relational databases, making a field a Primary Key will automatically create an index on that field. And doing the uniqueness check against an index is pretty much as fast as you can get in this case.
But, you can index your field without declaring it your table's Primary Key. And if you do that the SELECT command will be just as fast as the INSERT plus catch method. More broadly, you can only have one Primary Key per table, so making the field the Primary Key is not a very robust solution. It will break as soon as you have multiple fields that you want to enforce uniqueness on (unless you make a compound primary key across both fields...but I digress, and that doesn't enforce per-column uniqueness anyways).
So I would recommend creating an index on your field/column, and then using the SELECT method to see if the value already exists. Alternatively, you can index the field and stipulate that it should be unique, without making it your Primary Key, and use the INSERT plus catch approach.
I'd recommend using select count:
Var = Query('SELECT count(*) FROM Table WHERE Field1="VALUE"');
if (Var > 0) MessageBox("This value exists in the table");
The second approach is not so nice, IMO, and it's probably going to be much slower.
And BTW, it should read exists instead of exist :-)
If you want to insert a value, if it doesn't exist, then something like the following would be most appropriate (although different SQL dialects may apply):
INSERT INTO Table(Column)
SELECT 'New Value' WHERE NOT EXISTS (SELECT * FROM Table where Column = 'New Value')
And then checking whether 0 or 1 rows were affected.
Note that where I'm saying most appropriate, I'm not making a performance evaluation. I'm talking about the code that most clearly expresses the intent. Usually, this will be good enough. Only rarely should you move away from the code that most clearly expresses your intent for something less clear that performs 0.5% better...
You should also beware of your first format (SELECT * FROM Table...) as a general style. If you were querying a large, wide table, performing such a select may cause a lot of I/O in the database to retrieve all column values, just for your code to then ignore all of those values. SELECT * within an EXISTS clause, on the other hand, is treated specially by most database engines and will not retrieve actual row/column data.
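For instance, a small sketch of the EXISTS form of the check, reusing the table and column names from the question:
SELECT CASE WHEN EXISTS (SELECT * FROM Table WHERE Field1 = 'VALUE')
            THEN 1 ELSE 0 END AS already_there;
-- the engine can stop scanning at the first match instead of materializing rows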

Using a database table as a queue

I want to use a database table as a queue. I want to insert in it and take elements from it in the inserted order (FIFO). My main consideration is performance because I have thousands of these transactions each second. So I want to use a SQL query that gives me the first element without searching the whole table. I do not remove a row when I read it.
Does SELECT TOP 1 ..... help here?
Should I use any special indexes?
I'd use an IDENTITY field as the primary key to provide the uniquely incrementing ID for each queued item, and stick a clustered index on it. This would represent the order in which the items were queued.
To keep the items in the queue table while you process them, you'd need a "status" field to indicate the current status of a particular item (e.g. 0 = waiting, 1 = being processed, 2 = processed). This is needed to prevent an item being processed twice.
When processing items in the queue, you'd need to find the next item in the table NOT currently being processed. This would need to be in such a way so as to prevent multiple processes picking up the same item to process at the same time as demonstrated below. Note the table hints UPDLOCK and READPAST which you should be aware of when implementing queues.
e.g. within a sproc, something like this:
DECLARE @NextID INTEGER

BEGIN TRANSACTION

-- Find the next queued item that is waiting to be processed
SELECT TOP 1 @NextID = ID
FROM MyQueueTable WITH (UPDLOCK, READPAST)
WHERE Status = 0
ORDER BY ID ASC

-- If we've found one, mark it as being processed
IF @NextID IS NOT NULL
    UPDATE MyQueueTable SET Status = 1 WHERE ID = @NextID

COMMIT TRANSACTION

-- If we've got an item from the queue, return it to whatever is going to process it
IF @NextID IS NOT NULL
    SELECT * FROM MyQueueTable WHERE ID = @NextID
If processing an item fails, do you want to be able to try it again later? If so, you'll need to either reset the status back to 0 or something. That will require more thought.
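For the retry case, a tiny sketch using the same assumed table and column names as above:
-- Put a failed item back so it will be picked up again on the next pass.
UPDATE MyQueueTable SET Status = 0 WHERE ID = @FailedId;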
Alternatively, don't use a database table as a queue, but something like MSMQ - just thought I'd throw that in the mix!
If you do not remove your processed rows, then you are going to need some sort of flag that indicates that a row has already been processed.
Put an index on that flag, and on the column you are going to order by.
Partition your table over that flag, so the dequeued transactions are not clogging up your queries.
If you really do get 1,000 messages every second, that would result in 86,400,000 rows a day. You might want to think of some way to clean up old rows.
Everything depends on your database engine/implementation.
For me, simple queues on tables with the following columns:
id / task / priority / date_added
usually work.
I used priority and task to group tasks, and in the case of a duplicated task I chose the one with the higher priority.
And don't worry - for modern databases "thousands" is nothing special.
This will not be any trouble at all as long as you use something to keep track of the datetime of the insert. See here for the mysql options. The question is whether you only ever need the absolute most recently submitted item or whether you need to iterate. If you need to iterate, then what you need to do is grab a chunk with an ORDER BY statement, loop through, and remember the last datetime so that you can use that when you grab your next chunk.
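A hedged sketch of that chunked iteration, with made-up table and column names (MySQL-style LIMIT):
SELECT id, payload, created_at
FROM queue_table
WHERE created_at > ?   -- bind the last datetime seen in the previous chunk
ORDER BY created_at
LIMIT 100;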
Perhaps adding a LIMIT 1 to your select statement would help ... forcing the return after a single match...
Since you don't delete the records from the table, you need to have a composite index on (processed, id), where processed is the column that indicates if the current record had been processed.
The best thing would be creating a partitioned table for your records and make the PROCESSED field the partitioning key. This way, you can keep three or more local indexes.
However, if you always process the records in id order, and have only two states, updating the record would mean just taking the record from the first leaf of the index and appending it to the last leaf
The currently processed record would always have the least id of all unprocessed records and the greatest id of all processed records.
Create a clustered index over a date (or autoincrement) column. This will keep the rows in the table roughly in index order and allow fast index-based access when you ORDER BY the indexed column. Using TOP X (or LIMIT X, depending on your RDBMS) will then only retrieve the first X items from the index.
Performance warning: you should always review the execution plans of your queries (on real data) to verify that the optimizer doesn't do unexpected things. Also try to benchmark your queries (again on real data) to be able to make informed decisions.
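To illustrate, a small sketch of that setup in SQL Server syntax (the table and column names are only examples):
-- Keep the table physically ordered by when items were queued.
CREATE CLUSTERED INDEX IX_QueueTable_QueuedAt ON QueueTable (QueuedAt);

-- Pull the oldest unprocessed items straight off the front of the index.
SELECT TOP (10) *
FROM QueueTable
WHERE Processed = 0
ORDER BY QueuedAt;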
I had the same general question of "how do I turn a table into a queue" and couldn't find the answer I wanted anywhere.
Here is what I came up with for Node/SQLite/better-sqlite3.
Basically just modify the inner WHERE and ORDER BY clauses for your use case.
const crypto = require("crypto");
// `status` is an application-level constants object, e.g.
// { INSTRUCTION_INPROGRESS: "inprogress", INSTRUCTION_COMPLETE: "complete" }

module.exports.pickBatchInstructions = (db, batchSize) => {
  const buf = crypto.randomBytes(8); // Create a unique batch identifier

  const q_pickBatch = `
    UPDATE instructions
    SET
      status = '${status.INSTRUCTION_INPROGRESS}',
      run_id = '${buf.toString("hex")}',
      mdate = datetime(datetime(), 'localtime')
    WHERE id IN (
      SELECT id
      FROM instructions
      WHERE status is not '${status.INSTRUCTION_COMPLETE}'
        and run_id is null
      ORDER BY length(targetpath), id
      LIMIT ${batchSize}
    );
  `;
  db.prepare(q_pickBatch).run(); // Change the status and set the run id

  const q_getInstructions = `
    SELECT *
    FROM instructions
    WHERE run_id = '${buf.toString("hex")}'
  `;
  const rows = db.prepare(q_getInstructions).all(); // Get all rows with this batch id
  return rows;
};
A very easy solution for this, in order not to need transactions, locks, etc., is to use the change tracking mechanism (not change data capture). It utilizes versioning for each added/updated/removed row, so you can track what changes happened after a specific version.
So, you persist the last version and query for the new changes.
If a query fails, you can always go back and query data from the last version.
Also, if you don't want to get all changes with one query, you can get the top n ordered by version and store the greatest version you received, to use in the next query.
See, for example, Using Change Tracking in SQL Server 2008.
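As a rough, hedged sketch of that flow in SQL Server (the database, table, and variable names are assumptions):
-- Enable change tracking once per database and per tracked table.
ALTER DATABASE MyDb
    SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
ALTER TABLE dbo.QueueTable ENABLE CHANGE_TRACKING;

-- Remember the version you have synced up to...
SELECT CHANGE_TRACKING_CURRENT_VERSION();

-- ...and later ask only for what changed after that version.
SELECT ct.ID, ct.SYS_CHANGE_OPERATION
FROM CHANGETABLE(CHANGES dbo.QueueTable, @last_synced_version) AS ct;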