Find gaps in auto incremented values - sql

Imagine having a table as the one below:
create table test (
id int auto_increment,
some int,
columns int
)
And then this table gets used a lot. Rows are inserted and deleted, and over time gaps appear in the values that were once auto-incremented. As an example, if at some point I run the following query:
select top 10 id from test
I might get something like
3
4
6
7
9
10
13
14
18
19
How do I design a query that returns the missing values 1,2,5,8 etc?

The easiest way is to get ranges of missing values:
select (id + 1) as firstmissing, (nextid - 1) as lastmissing
from (select t.id, lead(id) over (order by id) as nextid
from test t
) t
where nextid is not null and nextid <> id + 1;
Note this uses the lead() function, which is available in SQL Server 2012+. You can do something similar with apply or a subquery in earlier versions. Here is an example:
select (id + 1) as firstmissing, (nextid - 1) as lastmissing
from (select t.id, tt.id as nextid
from test t cross apply
(select top 1 id
from test t2
where t2.id > t.id
order by id
) tt
) t
where nextid is not null and nextid <> id + 1;
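To try the gap-range idea without a SQL Server instance, here is a minimal sketch that runs the same logic against an in-memory SQLite database from Python. SQLite, the function name find_gap_ranges, and the correlated MIN() subquery (standing in for lead() or cross apply) are assumptions made for this illustration, not part of the answer above.

```python
import sqlite3

def find_gap_ranges(ids):
    """Return (firstmissing, lastmissing) pairs between consecutive ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO test (id) VALUES (?)", [(i,) for i in ids])
    rows = conn.execute("""
        SELECT g.id + 1 AS firstmissing, g.nextid - 1 AS lastmissing
        FROM (SELECT t.id,
                     -- next higher id; plays the role of lead(id) over (order by id)
                     (SELECT MIN(t2.id) FROM test t2 WHERE t2.id > t.id) AS nextid
              FROM test t) AS g
        WHERE g.nextid IS NOT NULL AND g.nextid <> g.id + 1
        ORDER BY firstmissing
    """).fetchall()
    conn.close()
    return rows

# Sample ids taken from the question
gaps = find_gap_ranges([3, 4, 6, 7, 9, 10, 13, 14, 18, 19])
print(gaps)
```

Note that this approach only reports gaps between existing rows, so the values 1 and 2 below the smallest id are not listed.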

A simple way is to use a recursive CTE:
;WITH cte
AS (SELECT 1 id
UNION ALL
SELECT id + 1 id from cte
WHERE id < (SELECT Max(id)
FROM tablename))
SELECT *
FROM cte
WHERE id NOT IN(SELECT id
FROM tablename)
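Note: this starts from 1. To start from the minimum value in your table instead, replace
"SELECT 1 id" with "SELECT Min(id) id FROM tablename".
In SQL Server, a recursive CTE stops with an error after 100 levels of recursion by default, so for larger id ranges append OPTION (MAXRECURSION 0) to the final SELECT.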
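The recursive CTE can be exercised the same way. The sketch below assumes SQLite (which spells it WITH RECURSIVE) driven from Python; the function name list_missing_ids and the table name test are chosen for this illustration only.

```python
import sqlite3

def list_missing_ids(ids):
    """List every individual id from 1 up to MAX(id) that is absent."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO test (id) VALUES (?)", [(i,) for i in ids])
    rows = conn.execute("""
        WITH RECURSIVE cte(id) AS (
            SELECT 1
            UNION ALL
            -- keep counting until the highest id in the table is reached
            SELECT id + 1 FROM cte WHERE id < (SELECT MAX(id) FROM test)
        )
        SELECT id FROM cte
        WHERE id NOT IN (SELECT id FROM test)
        ORDER BY id
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]

# Sample ids taken from the question
missing = list_missing_ids([3, 4, 6, 7, 9, 10, 13, 14, 18, 19])
print(missing)
```

Unlike the range query, this one generates a row per candidate number, so it also reports the leading gap (1 and 2) but does more work on sparse tables.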

Why does it matter? I'm not trying to be snarky, but this question is usually asked in the context of "I want to fill in the gaps" or "I want to compress my id values to be contiguous". In either case, the answer is "don't do it". In your example, there was at some point a row with id = 5. If you're going to do either of the above, you'll be assigning a different, unrelated set of business data that id. If there's anything that references the id external to your database, now you've just invented a problem that you didn't have before. The id should be treated as immutable and arbitrary for all intents and purposes. If you really require it to be gapless, don't use identity and never do a hard delete (i.e. if you need to deactivate a row, you need a column which says whether it's active or not).

Related

How to round robin by UUID in SQL database?

I have a list of agents that I want to assign tasks to using round robin.
agents table:
id
uuid1
uuid2
uuid3
uuid4
How to get rows of the table above in a round robin fashion?
Desired outcome:
uuid1 -> uuid2 -> uuid3 -> uuid4 -> uuid1 (repeat)
I tried ordering uuids then selecting the next one based on the previous uuid
SELECT id FROM agents ORDER BY id; -- When there is no previous
SELECT id FROM agents WHERE id > 'uuid1' ORDER BY id; -- After the first query
But I don't know how to wrap around when I reach the last uuid (when uuid4 has been retrieved and uuid1 must be selected again).
select nxt
from (
select
-- for every id, find the next item, if not exists, use the earliest (min)
id, coalesce(lead(id) over (order by id),(select min(id) from Tbl)) as nxt
from Tbl
) as Agents
where id = 'uuid4'
I don't have PostgreSQL handy right now, but this works under SQL Server and I don't see an obvious reason why it shouldn't under any other DBMS:
COALESCE(
(SELECT TOP 1 id FROM agents WHERE @current_id < id ORDER BY id),
(SELECT TOP 1 id FROM agents ORDER BY id)
)
So it simply tries to get the next id (the first argument of COALESCE), and if there is no next id gets the first id (the second argument of COALESCE).
Here is a full T-SQL demo, something similar can probably be done under PostgreSQL...
CREATE TABLE agents (
id uniqueidentifier PRIMARY KEY
);
INSERT INTO agents VALUES (NEWID()), (NEWID()), (NEWID()), (NEWID());
DECLARE @current_id uniqueidentifier;
DECLARE @i int = 0;
WHILE @i < 10 BEGIN
SET @current_id = COALESCE(
(SELECT TOP 1 id FROM agents WHERE @current_id < id ORDER BY id),
(SELECT TOP 1 id FROM agents ORDER BY id)
);
PRINT CONCAT('@current_id = ', @current_id);
SET @i = @i + 1;
END
You can combine two queries where the second is only run if the first one didn't return anything:
with next_agent as (
SELECT id
FROM agents
WHERE id > $1 -- the last ID retrieved or NULL if it's the first
ORDER BY id
limit 1
)
select *
from next_agent
union all
(
select *
from agents
where not exists (select * from next_agent)
order by id
limit 1
)
So if $1 is null (first call) or 'uuid4', the next_agent CTE will not return anything. In that case the second part of the UNION in the outer query runs and picks the "first" row.
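The wrap-around behaviour described in the answers above can be checked with a small sketch. It assumes SQLite driven from Python, with LIMIT 1 in place of TOP 1; the helper name next_agent and the literal ids uuid1..uuid4 are placeholders for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agents (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO agents (id) VALUES (?)",
                 [("uuid1",), ("uuid2",), ("uuid3",), ("uuid4",)])

def next_agent(conn, current_id):
    """Return the id after current_id, wrapping to the first id at the end.

    Pass None on the first call: 'id > NULL' matches nothing, so the
    second COALESCE argument (the smallest id) is used.
    """
    row = conn.execute("""
        SELECT COALESCE(
            (SELECT id FROM agents WHERE id > ? ORDER BY id LIMIT 1),
            (SELECT id FROM agents ORDER BY id LIMIT 1)
        )
    """, (current_id,)).fetchone()
    return row[0]

# Walk the ring for six steps, starting with no previous agent
sequence = []
current = None
for _ in range(6):
    current = next_agent(conn, current)
    sequence.append(current)
print(sequence)
```

The caller has to remember the last id handed out between calls; where that state lives (a table, application memory) is outside the scope of this sketch.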

How to use a special while loop in tsql, do while numeric

I'm loading some quite nasty data through Azure data factory
This is how the data looks after being loaded, consisting of 2 parts:
1. Metadata of a test
2. Actual measurements of the test -> the measurement is numeric
Imagine I have about 10 such 'packages' of 1. Metadata + 2. Measurements
What I would like it to be / what I'm looking for is the following:
The number column with 1,2,.... is what I'm looking for!
Imagine my screenshot could go no further but this goes along until id=10
I guess a while loop is necessary here...
Query before:
SELECT Field1 FROM Input
Query after:
SELECT GeneratedId, Field1 FROM Input
Thanks a lot in advance!
EDIT: added a hint:
Here is a solution; it requires SQL Server 2012 or later.
Start by getting an Id column on your data. If you can do this before the script runs, that would be even better, but if not, try something like this...
CREATE TABLE #InputTable (
Id INT IDENTITY(1, 1),
TestData NVARCHAR(MAX) )
INSERT INTO #InputTable (TestData)
SELECT Field1 FROM Input
Now create a query to get the GeneratedId of each package as well as the Id where they start and end. You can do this by getting all the records LIKE 'title%' since that is the first record of each package, then using ROW_NUMBER, Id, and LEAD for the GeneratedId, StartId, and EndId respectively.
SELECT
GeneratedId = ROW_NUMBER() OVER(ORDER BY (Id)),
StartId = Id,
EndId = LEAD(Id) OVER (ORDER BY (Id))
FROM #InputTable
WHERE TestData LIKE 'title%'
Lastly, join this to the input in order to get all the records, with the correct GeneratedId.
SELECT
package.GeneratedId, i.TestData
FROM (
SELECT
GeneratedId = ROW_NUMBER() OVER(ORDER BY (Id)),
StartId = Id,
EndId = LEAD(Id) OVER (ORDER BY (Id))
FROM #InputTable
WHERE TestData LIKE 'title%' ) package
INNER JOIN #InputTable i
ON i.Id >= package.StartId
AND (package.EndId IS NULL OR i.Id < package.EndId)
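The grouping logic can be tried on toy data with the sketch below. It assumes SQLite driven from Python and swaps in a different but equivalent technique: instead of ROW_NUMBER and LEAD, each row's GeneratedId is the count of 'title%' rows at or before it. The sample lines and the name number_packages are invented for this illustration.

```python
import sqlite3

def number_packages(lines):
    """Tag each input line with the running number of its package."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE input_table (id INTEGER PRIMARY KEY, test_data TEXT)")
    # Rows are inserted one by one, so id follows the original line order
    conn.executemany("INSERT INTO input_table (test_data) VALUES (?)",
                     [(s,) for s in lines])
    rows = conn.execute("""
        SELECT (SELECT COUNT(*) FROM input_table t
                WHERE t.id <= i.id
                  AND t.test_data LIKE 'title%') AS generated_id,
               i.test_data
        FROM input_table i
        ORDER BY i.id
    """).fetchall()
    conn.close()
    return rows

# Hypothetical input: two packages of metadata followed by measurements
sample = ["title A", "operator X", "1.5", "2.5",
          "title B", "operator Y", "3.5"]
numbered = number_packages(sample)
for generated_id, test_data in numbered:
    print(generated_id, test_data)
```

One difference from the join above: rows that appear before the first 'title%' row are kept here with GeneratedId 0, whereas the inner join drops them.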

SQL Get the row number of the inserted row

I am trying to get the row number of an inserted record so I can use it for a select statement. What I am trying to accomplish is insert a person into one table, get that row number and then select something from another table where the row numbers match. Here is what I got so far:
INSERT INTO TableA Values ('Person')
Select timeToken
From
(
Select
Row_Number() Over (Order By tokenOrder) As RowNum
, *
From TableB WHERE taken = false
) t2
Where RowNum = (Row Number of Inserted Item)
How do I get the row number of the inserted item? I want to compare ids as some records might have been deleted so they would not match.
TABLEA Data (primary key is id)
id name
3 John
12 Steve
TABLEB Data (primary key is id)
id timeToken tokenOrder taken
2 1:00am 1 false
3 2:00am 2 false
5 3:00am 3 true
6 4:00am 4 false
My expected result: when I insert a person, the select would return 4:00am.
I am doing this in a stored procedure.
It is an error to think that rows have numbers unless an ORDER BY clause is included.
The only way to find a row after you have inserted it is to search for it. Presumably your table has a primary key; use that to search for it.
Try this; it may help:
Declare @TableA_PK BIGINT
INSERT INTO TableA Values ('Person')
SET @TableA_PK=SCOPE_IDENTITY()
Select timeToken
From
(
Select
Row_Number() Over (Order By tokenOrder) As RowNum
, *
From TableB WHERE taken = false
) t2
Where RowNum =@TableA_PK
SCOPE_IDENTITY() returns the last identity value inserted in the current scope, which is the primary key of the row just inserted here. It can be stored in a variable and reused later.
By the sounds of it, you are trying to do something like what is described here: SQL Server - Return value after INSERT
Basically:
INSERT INTO TableA (Person)
OUTPUT Inserted.ID
VALUES('bob');
Adding a foreign key constraint on table B (referencing the primary key of table A) is a good idea, since you won't be able to delete records from table A without deleting them from table B. It also helps when comparing the records by ID.
Try this
declare @rowNum int;
INSERT INTO TableA Values ('Person')
SET @rowNum =SCOPE_IDENTITY()
select * from TableA where id = @rowNum
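The answers above return the new row's id, not its row number, and with deleted rows the two differ. A minimal sketch of bridging that gap, assuming SQLite driven from Python: cursor.lastrowid plays the role of SCOPE_IDENTITY(), the row number is derived by counting rows with an id at or below the new one, and LIMIT/OFFSET stands in for the Row_Number() filter. The function name insert_and_pick_token is hypothetical, and taken is stored as 0/1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, timeToken TEXT,
                         tokenOrder INTEGER, taken INTEGER);
    -- sample data from the question
    INSERT INTO TableA (id, name) VALUES (3, 'John'), (12, 'Steve');
    INSERT INTO TableB VALUES (2, '1:00am', 1, 0), (3, '2:00am', 2, 0),
                              (5, '3:00am', 3, 1), (6, '4:00am', 4, 0);
""")

def insert_and_pick_token(conn, name):
    """Insert a person, then fetch the untaken token at the same position."""
    cur = conn.execute("INSERT INTO TableA (name) VALUES (?)", (name,))
    new_id = cur.lastrowid  # id generated for the row just inserted
    # Position of the new row when TableA is ordered by id
    (row_num,) = conn.execute(
        "SELECT COUNT(*) FROM TableA WHERE id <= ?", (new_id,)).fetchone()
    # row_num-th untaken token, ordered by tokenOrder
    row = conn.execute("""
        SELECT timeToken FROM TableB
        WHERE taken = 0
        ORDER BY tokenOrder
        LIMIT 1 OFFSET ?
    """, (row_num - 1,)).fetchone()
    return new_id, row_num, (row[0] if row else None)

new_id, row_num, token = insert_and_pick_token(conn, "Person")
print(new_id, row_num, token)
```

With the question's data the new person is the third row in TableA, and the third untaken token is 4:00am, which matches the expected result stated in the question.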

Select column values from DB for which the subsequent row does not have a specified value

I have a table, say MyTable, with two columns, Id and Data, and the following records in it:
Id Data
----------
1. ABCDE00
2. DEFGH11
3. CCCCC21
4. AAAAA00
5. BBBBB10
6. vvvvv00
7. xxxxx88
Now I want all the records whose Data ends with the string 00 and whose subsequent row does not have Data ending with 11.
So my output using this condition should be like this:
1. AAAAA00
2. vvvvv00
Any help would be appreciated.
This answer makes some assumptions:
You have a column specifying the ordering. Let me call it id.
By "subsequent row" you mean the row with the next highest id.
You are using SQL Server 2012+.
In that case, lead() does what you want:
select t.*
from (select t.*, lead(data) over (order by id) as next_data
from t
) t
where data like '%00' and (next_data not like '%11' or next_data is null);
Earlier versions of SQL Server have alternative methods for calculating next_data.
If you are not using SQL Server 2012 or later, you can try this:
declare @t table(id int identity(1,1),col1 varchar(100))
insert into @t values
('ABCDE00')
,('DEFGH11')
,('CCCCC21')
,('AAAAA00')
,('BBBBB10')
,('vvvvv00')
,('xxxxx88')
;With CTE as
(
-- CHARINDEX(...) = 1 on the reversed string means col1 ends with '00'
select *,case when CHARINDEX('00',reverse(col1))=1 then 1 end
End00 from @t
)
,CTE1 as
(
-- rows that follow a '00' row and do not themselves end with '11'
select a.id,a.col1 from cte A
where exists
(select id from cte b where a.id=b.id+1 and b.end00 is not null)
and CHARINDEX('11',reverse(a.col1))<>1
)
select a.id,a.col1 from cte A
where exists
(select id from cte1 b where a.id=b.id-1 )

How do I get first unused ID in the table?

I have to write a query in which I need to allocate an ID (unique key) for a particular record, where the ID is not being used, has not been generated, and does not exist in the database.
In short, I need to generate an id for a particular record and show it on print screen.
E. g.:
ID Name
1 abc
2 def
5 ghi
So it should return ID=3, the next ID that has not been generated yet. After generating the id, I will store this data back in the database table.
This is not homework: I am doing a project, and I have a requirement where I need to write this query, so I need some help to achieve this.
So please guide me how to make this query, or how to achieve this.
Thanks.
I am not able to add comments, so I am writing my comments here.
I am using MySQL as the database.
My steps would be like this:
1) Retrieve an id from the database table that is not being used.
2) As there are a number of users (website-based project), I want to avoid concurrency problems: if one ID is generated for one user, the database should be locked until that user receives the id and stores the record for it. After that, another user can retrieve the next ID that does not exist. (Major requirement.)
How can I achieve all of this in MySQL? Also, I suppose Quassnoi's answer is worthwhile, but it's not working in MySQL, so please explain the query a bit, as it is new to me. Will this query work in MySQL?
I named your table unused.
SELECT id
FROM (
SELECT 1 AS id
) q1
WHERE NOT EXISTS
(
SELECT 1
FROM unused
WHERE id = 1
)
UNION ALL
SELECT *
FROM (
SELECT id + 1
FROM unused t
WHERE NOT EXISTS
(
SELECT 1
FROM unused ti
WHERE ti.id = t.id + 1
)
ORDER BY
id
LIMIT 1
) q2
ORDER BY
id
LIMIT 1
This query consists of two parts.
The first part:
SELECT *
FROM (
SELECT 1 AS id
) q
WHERE NOT EXISTS
(
SELECT 1
FROM unused
WHERE id = 1
)
selects 1 if there is no entry in the table with this id.
The second part:
SELECT *
FROM (
SELECT id + 1
FROM unused t
WHERE NOT EXISTS
(
SELECT 1
FROM unused ti
WHERE ti.id = t.id + 1
)
ORDER BY
id
LIMIT 1
) q2
finds the first id in the table for which there is no next id, and selects the value right after it (id + 1).
The resulting query selects the least of these two values.
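The two-part query can be tried out with the sketch below. It assumes SQLite driven from Python and takes MIN() over both candidate sets instead of using ORDER BY ... LIMIT 1, which gives the same "least of the two values". The function name first_unused_id is hypothetical; the table name unused follows the answer.

```python
import sqlite3

FIRST_UNUSED_SQL = """
    SELECT MIN(candidate) FROM (
        -- part 1: the value 1, if no row uses it
        SELECT 1 AS candidate
        WHERE NOT EXISTS (SELECT 1 FROM unused WHERE id = 1)
        UNION ALL
        -- part 2: id + 1 for every id that has no direct successor
        SELECT t.id + 1
        FROM unused t
        WHERE NOT EXISTS (SELECT 1 FROM unused ti WHERE ti.id = t.id + 1)
    ) AS c
"""

def first_unused_id(ids):
    """Return the smallest positive id not present in the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE unused (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO unused (id) VALUES (?)", [(i,) for i in ids])
    (result,) = conn.execute(FIRST_UNUSED_SQL).fetchone()
    conn.close()
    return result

# Ids from the question: 1, 2 and 5 are taken
print(first_unused_id([1, 2, 5]))
```

This only computes the candidate id. It does nothing about the concurrency requirement from the question: two sessions running it at the same time would get the same value unless the lookup and the insert are serialized.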
Depends on what you mean by "next id" and how it's generated.
If you're using a sequence or identity in the database to generate the id, it's possible that the "next id" is not 3 or 4 but 6 in the case you've presented. You have no way of knowing whether or not there were values with id of 3 or 4 that were subsequently deleted. Sequences and identities don't necessarily try to reclaim gaps; once they're gone you don't reuse them.
So the right thing to do is to create a sequence or identity column in your database that's automatically incremented when you do an INSERT, then SELECT the generated value.
The correct way is to use an identity column for the primary key. Don't try to look at the rows already inserted, and pick an unused value. The Id column should hold a number large enough that your application will never run out of valid new (higher) values.
In your description, if you are skipping values that you are trying to use later, then you are probably giving some meaning to the values. Please reconsider. You probably should only use this field as a lookup (reference) value from another table.
Let the database engine assign the next higher value for your ID. If you have more than one process running concurrently, you will need to use LAST_INSERT_ID() function to determine the ID that the database generated for your row. You can use LAST_INSERT_ID() function within the same transaction before you commit.
Second best (but not good!) is to use the max value of the index field plus one. You would have to do a table lock to manage the concurrency issues.
/*
This is a query script I wrote to illustrate my method, and it was created to solve a Real World problem where we have multiple machines at multiple stores creating transfer transactions in their own databases,
that are then synced to other databases on the store (this happens often, so getting the Nth free entry for the Nth machine should work) where the transferid is the PK and then those are synced daily to a MainFrame where the maximum size of the key (which is the TransactionID and StoreID) is limited.
*/
--- table variable declarations
/* list of used transaction ids (this is just for testing, it will be the view or table you are reading the transaction ids from when implemented)*/
DECLARE @SampleTransferIDSourceTable TABLE(TransferID INT)
/* Here we insert the used transaction numbers*/
DECLARE @WorkTable TABLE (WorkTableID INT IDENTITY (1,1), TransferID INT)
/*this is the same table as above with an extra column to help us identify the blocks of unused row numbers (modifying a table variable is not a good idea)*/
DECLARE @WorkTable2 TABLE (WorkTableID INT , TransferID INT, diff int)
--- Machine ID declared
DECLARE @MachineID INT
-- MachineID set
SET @MachineID = 5
-- put in some rows with different sized blocks of missing rows.
-- comment out the inserts after two to the bottom to see how it handles no gaps or make
-- the @MachineID very large to do the same.
-- comment out early rows to test how it handles starting gaps.
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 1 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 2 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 4 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 5 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 6 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 9 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 10 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 20 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 21 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 24 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 25 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 30 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 31 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 33 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 39 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 40 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 50 )
-- copy the transaction ids into a table with an identity column.
-- When implemented add where clause before the order by to limit to the local StoreID
-- Zero row added so that it will find gaps before the lowest used row.
INSERT @WorkTable (TransferID)
SELECT 0
INSERT @WorkTable (TransferID)
SELECT TransferID FROM @SampleTransferIDSourceTable ORDER BY TransferID
-- copy that table to the new table with the diff column
INSERT @WorkTable2
SELECT WorkTableID,TransferID,TransferID - WorkTableID
FROM @WorkTable
--- gives us the (MachineID)th unused ID or the (MachineID)th id beyond the highest id used.
IF EXISTS (
SELECT Top 1
GapStart.TransferID + @MachineID - (GapStart.diff + 1)
FROM @WorkTable2 GapStart
INNER JOIN @WorkTable2 GapEnd
ON GapStart.WorkTableID = GapEnd.WorkTableID - 1
AND GapStart.diff < GapEnd.diff
AND gapEnd.diff >= (@MachineID - 1)
ORDER BY GapStart.TransferID
)
SELECT Top 1
GapStart.TransferID + @MachineID - (GapStart.diff + 1)
FROM @WorkTable2 GapStart
INNER JOIN @WorkTable2 GapEnd
ON GapStart.WorkTableID = GapEnd.WorkTableID - 1
AND GapStart.diff < GapEnd.diff
AND gapEnd.diff >= (@MachineID - 1)
ORDER BY GapStart.TransferID
ELSE
SELECT MAX(TransferID) + @MachineID FROM @SampleTransferIDSourceTable
This should work under MySQL (LIMIT is used here because TOP is SQL Server syntax):
SELECT
T1.ID + 1 AS FREE_ID
FROM TABLE1 T1
LEFT JOIN TABLE2 T2 ON T2.ID = T1.ID + 1
WHERE T2.ID IS NULL
LIMIT 100
Are you allowed to have a utility table? If so, I would create a table like so:
CREATE TABLE number_helper (
n INT NOT NULL
,PRIMARY KEY(n)
);
Fill it with all positive 32 bit integers (assuming the id you need to generate is a positive 32 bit integer)
Then you can select like so:
SELECT MIN(h.n) as nextID
FROM number_helper h
LEFT JOIN my_table t ON t.ID = h.n
WHERE t.ID IS NULL
Haven't actually tested this but it should work.