Joining a table on itself with aggregation - SQL

I am trying to (efficiently) fetch rows from the connections table where the startdate is the latest within each cpid, for selected cpids.
Here's an example of the data in the connections table with rows I want marked with <<<
connid  cpid  startdate
1       20    7/17/16
2       20    8/23/16
3       20    9/12/16   <<<
4       30    6/17/16
5       30    8/23/16   <<<
6       40    2/24/16
7       40    3/17/16
8       40    5/18/16   <<<
etc...
This query returns the latest startdate and cpid, but I'm not sure how to join it with itself to get the result that I need:
select cpid, max(startdate)
from connections
where cpid in (
20,
30,
40
)
group by cpid
The result I'm looking for is as follows:
connid  cpid  startdate
3       20    9/12/16
5       30    8/23/16
8       40    5/18/16
Any help would be appreciated!

Something like this:
WITH Numbered AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY cpid ORDER BY startdate DESC) AS Nr
          ,*
    FROM connections
)
SELECT *
FROM Numbered
WHERE Nr = 1;
The ROW_NUMBER() function adds a running number to each row. PARTITION BY restarts the numbering for each group, and ORDER BY defines the order used for the numbering. With DESC the latest startdate comes first, hence Nr=1.
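The question also filters to cpids 20, 30 and 40; assuming that filter is still wanted, a minimal variant pushes it into the CTE so the numbering only runs over the selected groups:
WITH Numbered AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY cpid ORDER BY startdate DESC) AS Nr
          ,*
    FROM connections
    WHERE cpid IN (20, 30, 40)
)
SELECT *
FROM Numbered
WHERE Nr = 1;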
UPDATE: old-fashioned...
If you need this on systems other than SQL Server you might go the old-fashioned way:
SET DATEFORMAT mdy;
DECLARE @tbl TABLE(connid INT, cpid INT, startdate DATE);
INSERT INTO @tbl VALUES
 (1,20,'7/17/16')
,(2,20,'8/23/16')
,(3,20,'9/12/16')
,(4,30,'6/17/16')
,(5,30,'8/23/16')
,(6,40,'2/24/16')
,(7,40,'3/17/16')
,(8,40,'5/18/16');
SELECT *
FROM @tbl AS tbl
WHERE tbl.startdate IN (SELECT MAX(x.startdate) FROM @tbl AS x WHERE x.cpid = tbl.cpid)
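Note the difference in tie handling: this IN-based query returns every row tied for the latest startdate within a cpid, while the ROW_NUMBER() version always returns exactly one row per cpid.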

Here's an alternative solution using CROSS APPLY rather than a CTE. The execution plan is different; once you try it in your environment it could be more efficient.
IF OBJECT_ID('tempdb..#connections') IS NOT NULL DROP TABLE #connections;
CREATE TABLE #connections(connid INT, cpid INT, startdate DATETIME)
INSERT INTO #connections(connid, cpid, startdate)
VALUES
 (1,20,'20160717')
,(2,20,'20160823')
,(3,20,'20160912')
,(4,30,'20160617')
,(5,30,'20160823')
,(6,40,'20160224')
,(7,40,'20160317')
,(8,40,'20160518')
SELECT
c.connid,c.cpid,c.startdate
FROM
#connections c
CROSS APPLY (
SELECT
cpid
,MAX(startdate) startdate
FROM
#connections
GROUP BY
cpid
) a
WHERE
c.cpid = a.cpid
AND c.startdate = a.startdate
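For comparison, CROSS APPLY is more often written correlated, which moves the per-group filter into the applied subquery instead of the outer WHERE; a sketch of that form against the same temp table:
SELECT c.connid, c.cpid, c.startdate
FROM #connections c
CROSS APPLY (
    SELECT MAX(x.startdate) AS startdate
    FROM #connections x
    WHERE x.cpid = c.cpid
) a
WHERE c.startdate = a.startdate;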


Get the rows ONLY where the time difference between the current and the following row is under 5 minutes

T-SQL query: accounts with transactions done at different times.
I need help in figuring out a way to pull only the records where the txn_time difference between the current and next row is under 5 minutes.
The txn_time is sorted.
Looking at the attached image, only rows 1, 2, 3, 6, 7, 8 should be shown, since the time difference between each of these rows is under 5 minutes.
Any ideas would be helpful.
Sample Data:
rowno  txn_Date_Time  txn_time          accountNo
1      2017-10-31     11:50:47.0000000  98989898
2      2017-10-31     11:52:23.0000000  98989898
3      2017-10-31     11:52:23.0000000  98989898
4      2017-10-31     11:59:03.0000000  98989898
5      2017-10-31     12:05:13.0000000  98989898
6      2017-10-31     12:41:06.0000000  98989898
7      2017-10-31     12:42:44.0000000  98989898
8      2017-10-31     12:44:02.0000000  98989898
9      2017-10-31     15:23:19.0000000  98989898
10     2017-10-31     16:19:17.0000000  98989898
In SQL Server 2012+ it is much more efficient to use the LEAD and LAG functions instead of a self-join.
WITH
CTE
AS
(
SELECT
rowno
,txn_Date_Time
,txn_time
,accountNo
,LEAD(txn_time) OVER (PARTITION BY accountNo ORDER BY txn_time, rowno) AS next_time
,LAG(txn_time) OVER (PARTITION BY accountNo ORDER BY txn_time, rowno) AS prev_time
FROM T
)
SELECT
rowno
,txn_Date_Time
,txn_time
,accountNo
FROM CTE
WHERE
DATEDIFF(second, prev_time, txn_time) < 5 * 60
OR
DATEDIFF(second, txn_time, next_time) < 5 * 60
ORDER BY txn_time, rowno;
Because you're using SQL 2012 you can use window offset functions such as LAG and LEAD. @Vladimir beat me to it; he and I put together similar solutions.
To keep things interesting I'll demonstrate how to optimize your query so that neither LAG nor LEAD causes SQL Server to require a sort to satisfy your query. The type of index I'm creating is referred to as a POC (Partitioning, Ordering, Covering) index, which is discussed here.
For simplicity I'm using a single datetime column for txn_date_time. I'll create two identical tables and run my solution against them. The second table will have a POC index on it.
Sample data
-- sample data
if object_id('tempdb..#table') is not null drop table #table;
if object_id('tempdb..#table2') is not null drop table #table2;
go
create table #table
(
rowno int identity,
txn_date_time datetime,
accountNo int
);
create table #table2
(
rowno int identity,
txn_date_time datetime,
accountNo int
);
-- populate #table
declare @dt varchar(9) = '20171031 ', @acn int = 98989898;
insert #table (txn_date_time, accountNo)
values
(@dt+'11:50:47',@acn), (@dt+'11:52:23', @acn), (@dt+'11:52:23',@acn),
(@dt+'11:59:03',@acn), (@dt+'12:05:13', @acn), (@dt+'12:41:06',@acn),
(@dt+'12:42:44',@acn), (@dt+'12:44:02', @acn), (@dt+'15:23:19',@acn),(@dt+'16:19:17',@acn);
-- populate #table2
insert #table2 (txn_date_time, accountNo)
select txn_date_time, accountNo from #table;
-- create unique clustered index on #table2
create unique clustered index uq_cl_table2 on #table2(txn_date_time, rowno);
GO
Run the same query against both tables, keeping in mind that the second table has the POC index on it.
-- #table
select rowno, txn_date_time, accountNo
from
(
select rowno, txn_date_time, accountNo,
nextDt = datediff(minute, txn_date_time, lead(txn_date_time, 1) over (order by txn_date_time)),
prevDt = datediff(minute, lag(txn_date_time, 1) over (order by txn_date_time), txn_date_time)
from #table
) fixedDates
where nextDt <= 5 or prevDt <= 5;
-- #table2
select rowno, txn_date_time, accountNo
from
(
select rowno, txn_date_time, accountNo,
nextDt = datediff(minute, txn_date_time, lead(txn_date_time, 1) over (order by txn_date_time)),
prevDt = datediff(minute, lag(txn_date_time, 1) over (order by txn_date_time), txn_date_time)
from #table2
) fixedDates
where nextDt <= 5 or prevDt <= 5;
Note the execution plans. Adding the POC index removed the sort and made the query four times more efficient.
Try a self join to attach the next row, then union a second query that self joins to the previous row:
SELECT
t1.rowno
,t1.txn_Date_time
,t1.txn_time
,t1.accountNo
FROM [table] t1
JOIN [table] t2
ON t2.rowno = t1.rowno + 1
WHERE DATEDIFF(MINUTE, t1.txn_time, t2.txn_time) < 5
UNION
SELECT
t1.rowno
,t1.txn_Date_time
,t1.txn_time
,t1.accountNo
FROM [table] t1
JOIN [table] t2
ON t2.rowno = t1.rowno - 1
WHERE DATEDIFF(MINUTE, t2.txn_time, t1.txn_time) < 5

T-SQL Combine Ranges Based On Value

I am using SQL Server 2012 and have been struggling with this query for hours. I am trying to aggregate mile post ranges based off the value in the Value column. The results should have unique segments with the highest value from the Value field for each segment. Here's an example:
Mile_Marker_Start | Mile_Marker_End | Value
0                 | 100             | 5
50                | 150             | 6
100               | 200             | 10
75                | 300             | 9
150               | 200             | 7
And here's the result I'm looking for:
Mile_Marker_Start | Mile_Marker_End | Value
0                 | 50              | 5
50                | 75              | 6
75                | 100             | 9
100               | 200             | 10
200               | 300             | 9
As you can see, the row with a value of 9 got split into 2 rows because Value 10 was bigger. Also, the row with Value 7 does not display because Value 10 was bigger. Can this be done without using a cursor? Any help would be much appreciated.
Thanks
I believe the following now does what you need. I'd recommend running all the parts separately so you can see what they do and how they work.
DECLARE #input AS TABLE
(Mile_Marker_Start int, Mile_Marker_End int, Value int)
INSERT INTO #input VALUES
(0,100,5), (50,150,6), (100,200,10), (75,300,9), (150,200,7)
DECLARE #staging as table
(Mile_Marker int)
INSERT INTO #staging
SELECT Mile_Marker_Start from #input
UNION -- this will remove duplicates
SELECT Mile_Marker_End from #input
; -- we need semi-colon for the following CTE
-- this CTE gets the right values, but the rows aren't "collapsed"
WITH all_markers AS
(
SELECT
groups.Mile_Marker_Start,
groups.Mile_Marker_End,
max(i3.Value) Value
FROM
(
SELECT
s1.Mile_Marker Mile_Marker_Start,
min(s2.Mile_Marker) Mile_Marker_End
FROM
#staging s1
JOIN #staging s2 ON
s1.Mile_Marker < s2.Mile_Marker
GROUP BY
s1.Mile_Marker
) as groups
JOIN #input i3 ON
i3.Mile_Marker_Start < groups.Mile_Marker_End AND
i3.Mile_Marker_End > groups.Mile_Marker_Start
GROUP BY
groups.Mile_Marker_Start,
groups.Mile_Marker_End
)
SELECT
MIN(collapse.Mile_Marker_Start) as Mile_Marker_Start,
MAX(collapse.Mile_Marker_End) as Mile_Marker_End,
collapse.Value
FROM
(-- Subquery gets IDs for the groups we're collapsing together
SELECT
am.*,
ROW_NUMBER() OVER (ORDER BY am.Mile_Marker_Start) - ROW_NUMBER() OVER (PARTITION BY am.Value ORDER BY am.Mile_Marker_Start) GroupID
FROM
all_markers am
) AS COLLAPSE
GROUP BY
collapse.GroupID,
collapse.Value
ORDER BY
MIN(collapse.Mile_Marker_Start)
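The GroupID expression is the classic gaps-and-islands trick: the difference between an overall ROW_NUMBER() and a per-Value ROW_NUMBER() stays constant within each run of consecutive rows sharing a Value, so grouping on it collapses each run. To see the intermediate numbers, you can replace the final SELECT above with something like:
SELECT
    am.Mile_Marker_Start,
    am.Value,
    ROW_NUMBER() OVER (ORDER BY am.Mile_Marker_Start) AS rn_overall,
    ROW_NUMBER() OVER (PARTITION BY am.Value ORDER BY am.Mile_Marker_Start) AS rn_per_value
FROM all_markers am
ORDER BY am.Mile_Marker_Start;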
Since you are on 2012 you could maybe use LEAD. Here is my code, but as noted on your question by @stevelovell, we need clarification on how you are getting your result table.
--test data
declare #tablename TABLE
(
Mile_Marker_Start int,
Mile_Marker_End int,
Value int
);
insert into #tablename
values(0,100, 5),
(50,150, 6),
(100,200,10),
(75,300, 9),
(150,200, 7);
--query
select *
from #tablename
order by Mile_Marker_Start
select Mile_Marker_Start,
case when lead(mile_marker_start) over(order by mile_marker_start) < Mile_Marker_End THEN
lead(mile_marker_start) over(order by mile_marker_start)
ELSE
Mile_marker_end
END
AS MILE_MARKER_END,
Value
from #tablename
order by Mile_Marker_Start
Once you update your notes I will come back and update my answer.
Update: I wasn't able to get LEAD and the other windowing functions to work with your requirements, given the way you need to move up and down the table comparing current and calculated values...

Get rows for the last 10 dates

I have a scenario in a Postgres 9.3 database where I have to get the last 10 dates when books were sold. Consider below example:
Store            Book
----------       ----------------------
Id  Name         Id  Name  Sid  Count  Date
1   ABC          1   XYZ   1    20     11/11/2015
2   DEF          2   JHG   1    10     11/11/2015
                 3   UYH   1    10     15/11/2015
                 4   TRE   1    50     17/11/2015
There is currently no UNIQUE constraint on (name, sid, date) in table book, but we have a service in place that inserts only one count per day.
I have to get results based on store.id. When I pass the ID, the report should be generated with bookname, sold date, and the count of sold copies.
Desired output:
BookName 11/11/2015 15/11/2015 17/11/2015
XYZ 20 -- --
JHG 10 -- --
UYH -- 10 --
TRE -- -- 50
This looks unsuspicious, but it's a hell of a question.
Assumptions
Your counts are integer.
All columns in table book are defined NOT NULL.
The composite (name, sid, date) is unique in table book. You should have a UNIQUE constraint, preferably (for performance) with columns in this order:
UNIQUE(sid, date, name)
This provides the index needed for performance automatically. (Else create one.) See:
Multicolumn index and performance
Is a composite index also good for queries on the first field?
crosstab() queries
To get top performance and short query strings (especially if you run this query often) I suggest the additional module tablefunc providing various crosstab() functions. Basic instructions:
PostgreSQL Crosstab Query
Basic queries
You need to get these right first.
The last 10 days:
SELECT DISTINCT date
FROM book
WHERE sid = 1
ORDER BY date DESC
LIMIT 10;
Numbers for last 10 days using the window function dense_rank():
SELECT *
FROM (
SELECT name
, dense_rank() OVER (ORDER BY date DESC) AS date_rnk
, count
FROM book
WHERE sid = 1
) sub
WHERE date_rnk < 11
ORDER BY name, date_rnk DESC;
(Not including actual dates in this query.)
Column names for output columns (for full solution):
SELECT 'bookname, "' || string_agg(to_char(date, 'DD/MM/YYYY'), '", "' ORDER BY date) || '"'
FROM (
SELECT DISTINCT date
FROM book
WHERE sid = 1
ORDER BY date DESC
LIMIT 10
) sub;
Simple result with static column names
This may be good enough for you - but we don't see actual dates in the result:
SELECT * FROM crosstab(
'SELECT *
FROM (
SELECT name
, dense_rank() OVER (ORDER BY date DESC) AS date_rnk
, count
FROM book
WHERE sid = 1
) sub
WHERE date_rnk < 11
ORDER BY name, date_rnk DESC'
, 'SELECT generate_series(10, 1, -1)'
) AS (bookname text
, date1 int, date2 int, date3 int, date4 int, date5 int
, date6 int, date7 int, date8 int, date9 int, date10 int);
For repeated use I suggest you create this (very fast) generic C function for 10 integer columns once, to simplify things a bit:
CREATE OR REPLACE FUNCTION crosstab_int10(text, text)
RETURNS TABLE (bookname text
, date1 int, date2 int, date3 int, date4 int, date5 int
, date6 int, date7 int, date8 int, date9 int, date10 int)
LANGUAGE C STABLE STRICT AS
'$libdir/tablefunc','crosstab_hash';
Details in this related answer:
Dynamically generate columns for crosstab in PostgreSQL
Then your call becomes:
SELECT * FROM crosstab(
'SELECT *
FROM (
SELECT name
, dense_rank() OVER (ORDER BY date DESC) AS date_rnk
, count
FROM book
WHERE sid = 1
) sub
WHERE date_rnk < 11
ORDER BY name, date_rnk DESC'
, 'SELECT generate_series(10, 1, -1)'
); -- no column definition list required!
Full solution with dynamic column names
Your actual question is more complicated, you also want dynamic column names.
For a given table, the resulting query could look like this then:
SELECT * FROM crosstab_int10(
'SELECT *
FROM (
SELECT name
, dense_rank() OVER (ORDER BY date DESC) AS date_rnk
, count
FROM book
WHERE sid = 1
) sub
WHERE date_rnk < 11
ORDER BY name, date_rnk DESC'
, 'SELECT generate_series(10, 1, -1)'
) AS t(bookname
, "04/11/2015", "05/11/2015", "06/11/2015", "07/11/2015", "08/11/2015"
, "09/11/2015", "10/11/2015", "11/11/2015", "15/11/2015", "17/11/2015");
The difficulty is to distill dynamic column names. Either assemble the query string by hand, or (much rather) let this function do it for you:
CREATE OR REPLACE FUNCTION f_generate_date10_sql(_sid int = 1)
RETURNS text
LANGUAGE sql AS
$func$
SELECT format(
$$SELECT * FROM crosstab_int10(
'SELECT *
FROM (
SELECT name
, dense_rank() OVER (ORDER BY date DESC) AS date_rnk
, count
FROM book
WHERE sid = %1$s
) sub
WHERE date_rnk < 11
ORDER BY name, date_rnk DESC'
, 'SELECT generate_series(10, 1, -1)'
) AS ct(bookname, "$$
|| string_agg(to_char(date, 'DD/MM/YYYY'), '", "' ORDER BY date) || '")'
, _sid)
FROM (
SELECT DISTINCT date
FROM book
WHERE sid = _sid
ORDER BY date DESC
LIMIT 10
) sub
$func$;
Call:
SELECT f_generate_date10_sql(1);
This generates the desired query, which you execute in turn.
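If you call it from psql with a client of version 9.6 or newer, \gexec will execute each result row as a statement, so you can generate and run the query in one step:
SELECT f_generate_date10_sql(1)\gexec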

Drop rows identified within moving time window

I have a dataset of hospitalisations ('spells') - 1 row per spell. I want to drop any spells recorded within a week after another (there could be multiple), the rationale being that they're likely symptomatic of the same underlying cause. Here is some play data:
create table hif_user.rzb_recurse_src (
patid integer not null,
eventdate integer not null,
type smallint not null
);
insert into hif_user.rzb_recurse_src values (1,1,1);
insert into hif_user.rzb_recurse_src values (1,3,2);
insert into hif_user.rzb_recurse_src values (1,5,2);
insert into hif_user.rzb_recurse_src values (1,9,2);
insert into hif_user.rzb_recurse_src values (1,14,2);
insert into hif_user.rzb_recurse_src values (2,1,1);
insert into hif_user.rzb_recurse_src values (2,5,1);
insert into hif_user.rzb_recurse_src values (2,19,2);
Only spells of type 2 - within a week after any other - are to be dropped. Type 1 spells are to remain.
For patient 1, dates 1 & 9 should be kept. For patient 2, all rows should remain.
The issue is with patient 1. Spell date 9 is identified for dropping as it is close to spell date 5; however, as spell date 5 is close to spell date 1 it should be dropped, therefore allowing spell date 9 to live...
So, it seems a recursive problem. However, I've not used recursive programming in SQL before and I'm struggling to really picture how to do it. Can anyone help? I should add that I'm using Teradata which has more restrictions than most with recursive SQL (only UNION ALL sets allowed I believe).
This is cursor-style logic: check one row after the other to see if it fits your rules. Recursion is the easiest (maybe the only) way to solve your problem.
To get a decent performance you need a Volatile Table to facilitate this row-by-row processing:
CREATE VOLATILE TABLE vt (patid, eventdate, exac_type, rn) AS
(
SELECT r.*
,ROW_NUMBER() -- needed to facilitate the join
OVER (PARTITION BY patid ORDER BY eventdate) AS rn
FROM hif_user.rzb_recurse_src AS r
) WITH DATA ON COMMIT PRESERVE ROWS;
WITH RECURSIVE cte (patid, eventdate, exac_type, rn, startdate) AS
(
SELECT vt.*
,eventdate AS startdate
FROM vt
WHERE rn = 1 -- start with the first row
UNION ALL
SELECT vt.*
-- check if type = 1 or more than 7 days from the last eventdate
,CASE WHEN vt.eventdate > cte.startdate + 7
OR vt.exac_type = 1
THEN vt.eventdate -- new start date
ELSE cte.startdate -- keep old date
END
FROM vt JOIN cte
ON vt.patid = cte.patid
AND vt.rn = cte.rn + 1 -- proceed to next row
)
SELECT *
FROM cte
WHERE eventdate - startdate = 0 -- only new start days
order by patid, eventdate
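Against the play data, the rows surviving the eventdate - startdate = 0 filter match the expectation in the question:
patid  eventdate
1      1
1      9
2      1
2      5
2      19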
I think the key to solving this is getting the first date more than 7 days from the current date and then doing a recursive subquery:
with recursive rrs as (
select rrs.*,
(select min(rrs2.eventdate)
from hif_user.rzb_recurse_src rrs2
where rrs2.patid = rrs.patid and
rrs2.eventdate > rrs.eventdate + 7
) as eventdate7
from hif_user.rzb_recurse_src rrs
),
cte as (
select patid, min(eventdate) as eventdate, min(eventdate7) as eventdate7
from hif_user.rzb_recurse_src rrs
group by patid
union all
select cte.patid, cte.eventdate7, rrs.eventdate7
from cte join
hif_user.rzb_recurse_src rrs
on rrs.patid = cte.patid and
rrs.eventdate = cte.eventdate7
)
select cte.patid, cte.eventdate
from cte;
If you want additional columns, then join in the original table at the last step.
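For example, a hypothetical final step pulling the type column back in (names as in the play data):
select cte.patid, cte.eventdate, rrs.type
from cte join
     hif_user.rzb_recurse_src rrs
     on rrs.patid = cte.patid and
        rrs.eventdate = cte.eventdate;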

How do I get first unused ID in the table?

I have to write a query wherein I need to allocate an ID (unique key) for a particular record, one which is not being used / has not been generated / does not exist in the database.
In short, I need to generate an id for a particular record and show it on print screen.
E.g.:
ID Name
1 abc
2 def
5 ghi
So, the thing is that it should return ID=3 as the next immediate ID which has not been generated yet, and after generating this ID, I will store the data back in the database table.
It's not homework: I am doing a project, and I have a requirement where I need to write this query, so I need some help to achieve this.
So please guide me how to make this query, or how to achieve this.
Thanks.
I am not able to add comments, so that's why I am writing my comments here.
I am using MySQL as the database.
My steps would be like this:
1) Retrieve the ID from the database table which is not being used.
2) As there are a number of users (website-based project), I want no concurrency problems to happen: if one ID is generated for one user, it should lock the database until that same user receives the ID and stores the record for that ID. After that, another user can retrieve whichever ID does not yet exist (major requirement).
How can I achieve all this in MySQL? Also, I suppose Quassnoi's answer will be worth trying, but it's not working in MySQL, so please explain the query a bit, as it is new to me, and say whether it will work in MySQL.
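One way to get that serialization in MySQL is an application-level named lock with GET_LOCK()/RELEASE_LOCK(); this is only a sketch, and the lock name 'id_alloc' is made up for illustration:
-- take the application-level lock; wait up to 10 seconds for it
SELECT GET_LOCK('id_alloc', 10);

-- ... run the gap-finding query here, then INSERT the row with the chosen id ...

-- release the lock so the next user can allocate an id
SELECT RELEASE_LOCK('id_alloc');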
I named your table unused.
SELECT id
FROM (
SELECT 1 AS id
) q1
WHERE NOT EXISTS
(
SELECT 1
FROM unused
WHERE id = 1
)
UNION ALL
SELECT *
FROM (
SELECT id + 1
FROM unused t
WHERE NOT EXISTS
(
SELECT 1
FROM unused ti
WHERE ti.id = t.id + 1
)
ORDER BY
id
LIMIT 1
) q2
ORDER BY
id
LIMIT 1
This query consists of two parts.
The first part:
SELECT *
FROM (
SELECT 1 AS id
) q
WHERE NOT EXISTS
(
SELECT 1
FROM unused
WHERE id = 1
)
selects a 1 if there is no entry in the table with this id.
The second part:
SELECT *
FROM (
SELECT id + 1
FROM unused t
WHERE NOT EXISTS
(
SELECT 1
FROM unused ti
WHERE ti.id = t.id + 1
)
ORDER BY
id
LIMIT 1
) q2
selects the first id in the table for which there is no next id.
The resulting query selects the least of these two values.
Depends on what you mean by "next id" and how it's generated.
If you're using a sequence or identity in the database to generate the id, it's possible that the "next id" is not 3 or 4 but 6 in the case you've presented. You have no way of knowing whether or not there were values with id of 3 or 4 that were subsequently deleted. Sequences and identities don't necessarily try to reclaim gaps; once they're gone you don't reuse them.
So the right thing to do is to create a sequence or identity column in your database that's automatically incremented when you do an INSERT, then SELECT the generated value.
The correct way is to use an identity column for the primary key. Don't try to look at the rows already inserted, and pick an unused value. The Id column should hold a number large enough that your application will never run out of valid new (higher) values.
In your description, if you are skipping values that you are trying to use later, then you are probably giving some meaning to the values. Please reconsider. You should probably only use this field as a lookup (reference) value from another table.
Let the database engine assign the next higher value for your ID. If you have more than one process running concurrently, you will need to use LAST_INSERT_ID() function to determine the ID that the database generated for your row. You can use LAST_INSERT_ID() function within the same transaction before you commit.
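In MySQL that pattern looks roughly like this (a sketch; the records table and its columns are made up for illustration):
CREATE TABLE records (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

INSERT INTO records (name) VALUES ('abc');

-- returns the id generated by this connection's last insert
SELECT LAST_INSERT_ID();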
Second best (but not good!) is to use the max value of the index field plus one. You would have to do a table lock to manage the concurrency issues.
/*
This is a query script I wrote to illustrate my method. It was created to solve
a real-world problem where we have multiple machines at multiple stores creating
transfer transactions in their own databases, which are then synced to the other
databases at the store (this happens often, so getting the Nth free entry for the
Nth machine should work), where the transferid is the PK. Those rows are then
synced daily to a mainframe where the maximum size of the key (the TransactionID
plus StoreID) is limited.
*/
--- table variable declarations
/* list of used transaction ids (this is just for testing, it will be the view or table you are reading the transaction ids from when implemented)*/
DECLARE @SampleTransferIDSourceTable TABLE(TransferID INT)
/* Here we insert the used transaction numbers*/
DECLARE @WorkTable TABLE (WorkTableID INT IDENTITY (1,1), TransferID INT)
/*this is the same table as above with an extra column to help us identify the blocks of unused row numbers (modifying a table variable is not a good idea)*/
DECLARE @WorkTable2 TABLE (WorkTableID INT, TransferID INT, diff INT)
--- Machine ID declared
DECLARE @MachineID INT
-- MachineID set
SET @MachineID = 5
-- put in some rows with different-sized blocks of missing rows.
-- comment out the inserts after the first two to see how it handles no gaps, or make
-- @MachineID very large to do the same.
-- comment out early rows to test how it handles starting gaps.
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 1 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 2 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 4 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 5 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 6 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 9 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 10 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 20 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 21 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 24 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 25 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 30 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 31 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 33 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 39 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 40 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 50 )
-- copy the transaction ids into a table with an identity column.
-- When implemented, add a where clause before the order by to limit to the local StoreID.
-- Zero row added so that it will find gaps before the lowest used row.
INSERT @WorkTable (TransferID)
SELECT 0
INSERT @WorkTable (TransferID)
SELECT TransferID FROM @SampleTransferIDSourceTable ORDER BY TransferID
-- copy that table to the new table with the diff column
INSERT @WorkTable2
SELECT WorkTableID, TransferID, TransferID - WorkTableID
FROM @WorkTable
--- gives us the (MachineID)th unused ID or the (MachineID)th id beyond the highest id used.
IF EXISTS (
SELECT TOP 1
GapStart.TransferID + @MachineID - (GapStart.diff + 1)
FROM @WorkTable2 GapStart
INNER JOIN @WorkTable2 GapEnd
ON GapStart.WorkTableID = GapEnd.WorkTableID - 1
AND GapStart.diff < GapEnd.diff
AND GapEnd.diff >= (@MachineID - 1)
ORDER BY GapStart.TransferID
)
SELECT TOP 1
GapStart.TransferID + @MachineID - (GapStart.diff + 1)
FROM @WorkTable2 GapStart
INNER JOIN @WorkTable2 GapEnd
ON GapStart.WorkTableID = GapEnd.WorkTableID - 1
AND GapStart.diff < GapEnd.diff
AND GapEnd.diff >= (@MachineID - 1)
ORDER BY GapStart.TransferID
ELSE
SELECT MAX(TransferID) + @MachineID FROM @SampleTransferIDSourceTable
Should work under MySQL, once TOP (SQL Server syntax) is swapped for LIMIT; note the self join should be against the same table:
SELECT T1.ID + 1 AS FREE_ID
FROM TABLE1 T1
LEFT JOIN TABLE1 T2 ON T2.ID = T1.ID + 1
WHERE T2.ID IS NULL
ORDER BY T1.ID
LIMIT 100
Are you allowed to have a utility table? If so, I would create a table like so:
CREATE TABLE number_helper (
n INT NOT NULL
,PRIMARY KEY(n)
);
Fill it with all positive 32-bit integers (assuming the ID you need to generate is a positive 32-bit integer).
Then you can select like so:
SELECT MIN(h.n) AS nextID
FROM number_helper h
LEFT JOIN my_table t ON t.ID = h.n
WHERE t.ID IS NULL
Haven't actually tested this but it should work.
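One caveat: filling the table with all positive 32-bit integers means roughly 2.1 billion rows, so in practice you would seed only a range large enough for your IDs. A sketch for MySQL 8+ using a recursive CTE (the 100000 cap is an arbitrary choice for illustration):
-- raise the default recursion limit of 1001 rows
SET SESSION cte_max_recursion_depth = 100000;

INSERT INTO number_helper (n)
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 100000
)
SELECT n FROM seq;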