Sybase DB tables do not have a concept of self-updating row numbers. However, for one of the modules, I require a row number corresponding to each row in the database, such that max(column) would always tell me the number of rows in the table.
I thought I'd introduce an int column and keep updating it to track the row number. However, I'm having problems updating this column in the case of deletes. What SQL should I use in a delete trigger to update this column?
You can easily assign a unique number to each row by using an identity column. The identity can be a numeric or an integer (in ASE 12+).
This will almost do what you require. There are certain circumstances in which you will get a gap in the identity sequence (these are called "identity gaps"; the best discussion of them is here). Also, deletes will cause gaps in the sequence, as you've identified.
Why do you need to use max(column) to get the number of rows in the table, when you could just use count(*)? If you're trying to get the last row from the table, then you can do:
select * from table where column = (select max(column) from table)
Regarding the delete trigger to update a manually managed column: I think this would be a potential source of deadlocks and many performance issues. Imagine you have 1 million rows in your table and you delete row 1; that's 999,999 rows you now have to update to subtract 1 from the id.
Delete trigger
CREATE TRIGGER myTable_del_trig ON myTable FOR DELETE
AS
-- for each remaining row, subtract the number of just-deleted rows
-- whose id was lower than its own
update myTable
set id = id - (select count(*) from deleted d where d.id < t.id)
from myTable t
To avoid locking problems
You could add an extra table (which joins to your primary table) like this:
CREATE TABLE rowCounter
(id int, -- foreign key to main table
rownum int)
... and use the rownum field from this table.
If you put the delete trigger on this table then you would hugely reduce the potential for locking problems.
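A hedged sketch of that trigger, mirroring the one above but touching only the narrow rowCounter table:
CREATE TRIGGER rowCounter_del_trig ON rowCounter FOR DELETE
AS
-- close the gaps left by deleted rows without locking the main table
update rowCounter
set rownum = rownum - (select count(*) from deleted d
                       where d.rownum < rc.rownum)
from rowCounter rc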
Approximate solution?
Does the table need to keep its rownumbers up to date all the time?
If not, you could have a job which runs every minute or so, which checks for gaps in the rownum, and does an update.
Question: do the rownumbers have to reflect the order in which rows were inserted?
If not, you could do far fewer updates by only touching the most recent rows, "moving" them into the gaps.
Leave a comment if you would like me to post any SQL for these ideas.
I'm not sure why you would want to do this. You could experiment with using temporary tables and "select into" with an Identity column like below.
create table test
(
col1 int,
col2 varchar(3)
)

insert into test values (100, "abc")
insert into test values (111, "def")
insert into test values (222, "ghi")
insert into test values (300, "jkl")
insert into test values (400, "mno")

-- snapshot with a contiguous rank generated by identity()
select rank = identity(10), col1 into #t1 from test
select * from #t1

delete from test where col2 = "ghi"

-- regenerate after the delete; the rank sequence stays gap-free
select rank = identity(10), col1 into #t2 from test
select * from #t2

drop table test
drop table #t1
drop table #t2
This would give you a dynamic id (of sorts).
Related
Hi, I'm running into the following problem on SQLite3.
I have a simple table
CREATE TABLE TestTable (id INT, cnt INT);
There are some rows already in the table.
I have some data I want to be inserted into the table: {(id0, cnt0), (id1, cnt1)...}
I want to insert data into the table and, on id conflict, update TestTable.cnt = TestTable.cnt + value.cnt
(value.cnt is cnt0, cnt1, ..., basically my data to be inserted).
*** But the problem is, there is no primary or unique constraint on id, and I am not allowed to change that!
What I currently have:
In my program I loop through all the values and run:
UPDATE TestTable SET cnt = cnt + value.cnt WHERE id = value.id;
if (sqlite3_changes() == 0)
    INSERT INTO TestTable (id, cnt) VALUES (value.id, value.cnt);
But the problem is, with a very large dataset, doing 2 queries for each data entry takes too long. I'm trying to bundle multiple entries together into one call.
Please let me know if you have questions about my description, thank you for helping!
If you are able to create temporary tables, then do the following. Although I don't show it here, I suggest wrapping all this in a transaction. This technique will likely increase efficiency even if you are also able to add a temporary unique index. (In that case you could use an UPSERT with source data in the temporary table.)
CREATE TEMP TABLE data(id INT, cnt INT);
Now insert the new data into the temporary table, whether by using the host-language data libraries or crafting an insert statement similar to
INSERT INTO data (id, cnt)
VALUES (1, 100),
(2, 200),
(5, 400),
(7, 500);
Now update all existing rows using a single UPDATE statement. SQLite does not have a convenient syntax for joining tables and/or providing a source query for an UPDATE statement. However, one can use nested subqueries to provide similar convenience:
UPDATE TestTable AS tt
SET cnt = cnt + ifnull((SELECT cnt FROM data WHERE data.id == tt.id), 0)
WHERE tt.id IN (SELECT id FROM data);
Note that the two nested queries are independent of each other. In fact, one could eliminate the WHERE clause altogether and get the same results for this simple case. The WHERE clause simply makes the statement more efficient by only attempting to update matching ids. The other subquery, in the SET clause, also matches on id, but alone it would still allow updates of rows that don't have a match, defaulting to a null value which is converted to 0 (by the ifnull() function) for a no-op. By the way, without the ifnull() function the sum would be null and would overwrite non-null values.
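As an aside, a hedged note: SQLite 3.33.0 and later do support an UPDATE ... FROM syntax, which would express the same update more directly, something like:
-- Requires SQLite 3.33.0+; equivalent to the nested-subquery UPDATE above.
-- Rows in TestTable without a match in data are simply left untouched.
UPDATE TestTable
SET cnt = TestTable.cnt + data.cnt
FROM data
WHERE TestTable.id = data.id;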
Finally, insert only rows with non-existing id values:
INSERT INTO TestTable (id, cnt)
SELECT data.id, data.cnt
FROM data LEFT JOIN TestTable
ON data.id == TestTable.id
WHERE TestTable.id IS NULL;
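For completeness, here is a rough sketch of the UPSERT variant mentioned at the top. It only applies if you can create a (temporary) unique index on id, which the question rules out for the real table, and it needs SQLite 3.24+. The WHERE true is required to disambiguate the ON CONFLICT clause after a SELECT:
-- Assumes existing id values are already unique, else the index creation fails.
CREATE UNIQUE INDEX IF NOT EXISTS idx_TestTable_id ON TestTable(id);

INSERT INTO TestTable (id, cnt)
SELECT id, cnt FROM data WHERE true
ON CONFLICT(id) DO UPDATE SET cnt = cnt + excluded.cnt;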
I am inserting data into two tables. Within each insert there is an OUTPUT to a #temp table, each with an identity column. The select that generates the data for the insert has the same ORDER BY for each insert. Later on I join the two #temp tables by the identity column. What I would expect is that the identity column numbers would line up, as the ORDER BY is specified on both sides when inserting. Every once in a while it appears those numbers don't match up, and the only thing I can think of is that perhaps the OUTPUT isn't always honoring the ORDER BY in the select statements when writing the OUTPUT data to the temp tables.
CREATE TABLE #TempTable
(
RowNumber Integer IDENTITY (1,1) NOT NULL,
TableID Integer,
CONSTRAINT PK_TableID PRIMARY KEY NONCLUSTERED (RowNumber)
)

INSERT INTO Table
(column1,column2,column3,etc)
OUTPUT
INSERTED.ID
INTO #TempTable
(TableID)
SELECT
column1,column2,column3,etc
FROM
OtherTable
ORDER BY
SourceFlag,
StoreID,
storenumber,
EstablishDate,
TableID
What I would expect is that the statements would insert, for example, 25 rows in both statements in the same order, 1 through 25. Then I should be able to join based on the row number (1 = 1, 25 = 25, etc.) in order to get the matching data. What I think is happening is that somehow the order is getting messed up, so that row #1 from the first insert really matches, say, row #14 from the second, and when I later join 1 on 1 I get mismatched data.
Apparently, it doesn't:
However, SQL Server does not guarantee the order in which rows are
processed and returned by DML statements using the OUTPUT clause.
You need to identify a natural key in your data and then reference it to match the newly inserted rows with the OUTPUT resultset.
Alternatively, you can replace the INSERT with MERGE; in this case, you will be able to catch the newly created identity values for your records in the OUTPUT clause.
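A hedged sketch of that MERGE trick (the table and column names are illustrative): unlike INSERT's OUTPUT clause, MERGE's OUTPUT clause may reference source columns, so you can capture a natural key alongside the new identity:
-- assumes a mapping table like: CREATE TABLE #Map (NewID int, SourceTableID int)
MERGE INTO Table AS t
USING (SELECT column1, column2, TableID FROM OtherTable) AS s
ON 1 = 0                      -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (column1, column2)
    VALUES (s.column1, s.column2)
OUTPUT inserted.ID, s.TableID INTO #Map (NewID, SourceTableID);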
If I have a table with an auto-incrementing ID column, I'd like to be able to insert a row into that table, and get the ID of the row I just created. I know that generally, StackOverflow questions need some sort of code that was attempted or research effort, but I'm not sure where to begin with Snowflake. I've dug through their documentation and I've found nothing for this.
The best I could do so far is try result_scan() and last_query_id(), but these don't give me any relevant information about the row that was inserted, just confirmation that a row was inserted.
I believe what I'm asking for is along the lines of MS SQL Server's SCOPE_IDENTITY() function.
Is there a Snowflake equivalent function for MS SQL Server's SCOPE_IDENTITY()?
EDIT: for the sake of having code in here:
CREATE TABLE my_db..my_table
(
ROWID INT IDENTITY(1,1),
some_number INT,
a_time TIMESTAMP_LTZ(9),
b_time TIMESTAMP_LTZ(9),
more_data VARCHAR(10)
);
INSERT INTO my_db..my_table
(
some_number,
a_time,
more_data
)
VALUES
(1, my_time_value, some_data);
I want to get to that auto-increment ROWID for this row I just inserted.
NOTE: The answer below may not be 100% correct in some very rare cases; see the UPDATE section below.
Original answer
Snowflake does not provide the equivalent of SCOPE_IDENTITY today.
However, you can exploit Snowflake's time travel to retrieve the maximum value of a column right after a given statement is executed.
Here's an example:
create or replace table x(rid int identity, num int);
insert into x(num) values(7);
insert into x(num) values(9);
-- you can insert rows in a separate transaction now to test it
select max(rid) from x AT(statement=>last_query_id());
----------+
MAX(RID) |
----------+
2 |
----------+
You can also save the last_query_id() into a variable if you want to access it later, e.g.
insert into x(num) values(5);
set qid = last_query_id();
...
select max(rid) from x AT(statement=>$qid);
Note: it will usually be correct, but if the user e.g. inserts a large value into rid manually, it might influence the result of this query.
UPDATE
Note: I realized the code above might, in rare cases, generate an incorrect answer.
Since the execution order of the various phases of a query in a distributed system like Snowflake can be non-deterministic, and Snowflake allows concurrent INSERT statements, the following might happen:
Two queries, Q1 and Q2, each doing a simple single-row INSERT, start at roughly the same time
Q1 starts, is a bit ahead
Q2 starts
Q1 creates a row with value 1 from the IDENTITY column
Q2 creates a row with value 2 from the IDENTITY column
Q2 gets ahead of Q1 - this is the key part
Q2 commits, is marked as finished at time T2
Q1 commits, is marked as finished at time T1
Note that T1 is later than T2. Now, when we try to do SELECT ... AT(statement=>Q1), we will see the state as of T1, including all changes from statements before it, hence including the value 2 from Q2. Which is not what we want.
The way around it could be to add a unique identifier to each INSERT (e.g. from a separate SEQUENCE object), and then use a MAX.
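A hedged sketch of that workaround, reusing the table x from above plus an extra tag column (this assumes a session variable can be set from a scalar subquery; if not, run SELECT insert_tag.nextval separately and substitute the value by hand):
create or replace sequence insert_tag;
create or replace table x(rid int identity, tag int, num int);

-- reserve a unique tag for this statement, then stamp the inserted row(s) with it
set mytag = (select insert_tag.nextval);
insert into x(tag, num) select $mytag, 5;

-- no time travel needed: take the MAX over exactly this statement's rows
select max(rid) from x where tag = $mytag;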
Sorry. Distributed transactions are hard :)
If I have a table with an auto-incrementing ID column, I'd like to be
able to insert a row into that table, and get the ID of the row I just
created.
FWIW, here's a slight variation of the current accepted answer (using Snowflake's 'Time Travel' feature) that gives any column values "of the row I just created." It applies to auto-incrementing sequences and more generally to any column configured with a default (e.g. CURRENT_TIMESTAMP() or UUID_STRING()). Further, I believe it avoids any inconsistencies associated with a second query utilizing MAX().
Assuming this table setup:
CREATE TABLE my_db.my_table
(
ROWID INT IDENTITY(1,1),
some_number INT,
a_time TIMESTAMP_LTZ(9),
b_time TIMESTAMP_LTZ(9),
more_data VARCHAR(10)
);
Make sure change tracking (change_tracking) is enabled for this table, since the CHANGES clause below depends on it:
ALTER TABLE my_db.my_table SET change_tracking = true;
Perform the INSERT per usual:
INSERT INTO my_db.my_table
(
some_number,
a_time,
more_data
)
VALUES
(1, my_time_value, some_data);
Use the CHANGES clause, with BEFORE(statement ...) and END(statement ...) both set to LAST_QUERY_ID(), to SELECT the row(s) added to my_table as the precise result of the previous INSERT statement, with the column values that existed the moment the row(s) were added, including any defaults:
SET insertQueryId=LAST_QUERY_ID();
SELECT
ROWID,
some_number,
a_time,
b_time,
more_data
FROM my_db.my_table
CHANGES(information => default)
BEFORE(statement => $insertQueryId)
END(statement => $insertQueryId);
For more information on the CHANGES, BEFORE, and END clauses, see the Snowflake documentation here.
I want to insert a row into a SQL Server table at a specific position. For example, my table has 100 rows and I want to insert a new row at position 9. But the ID column, which is the PK for the table, already has a row with ID 9. How can I insert a row at this position so that all the rows after it shift to the next position?
Relational tables have no 'position'. As an optimization, an index will sort rows by the specified key; if you wish to insert a row at a specific rank in the key order, insert it with a key that sorts in that rank position. In your case you'll have to update all rows with an ID greater than 8, incrementing each ID by 1, and then insert the row with ID 9:
UPDATE table SET ID += 1 WHERE ID >= 9;
INSERT INTO table (ID, ...) VALUES (9, ...);
Needless to say, there cannot possibly be any sane reason for doing something like that. If you truly had such a requirement, you would use a composite key with two (or more) parts. Such a key would allow you to insert subkeys so that it sorts in the desired order. But much more likely your problem can be solved exclusively by specifying a correct ORDER BY, without messing with the physical order of the rows.
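A hedged sketch of the composite-key idea (all names illustrative): a (major, minor) key lets a new row sort "between" existing rows without renumbering anything:
CREATE TABLE t
(
    major int NOT NULL,   -- coarse position, e.g. the old ID
    minor int NOT NULL,   -- tie-breaker for rows inserted in between
    data varchar(50),
    PRIMARY KEY (major, minor)
);

INSERT INTO t VALUES (9, 0, 'original row 9');
-- sorts after (9, 0) and before (10, 0), with no UPDATE of other rows:
INSERT INTO t VALUES (9, 1, 'inserted between 9 and 10');

SELECT * FROM t ORDER BY major, minor;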
Another way to look at it is to reconsider what a primary key means: the identifier of an entity, which does not change during that entity's lifetime. Then your question can be rephrased in a way that makes the fallacy in it more obvious:
I want to change the content of the entity with ID 9 to some new
value. The old values of the entity 9 should be moved to the content
of entity with ID 10. The old content of entity with ID 10 should be
moved to the entity with ID 11... and so on and so forth. The old
content of the entity with the highest ID should be inserted as a new
entity.
Usually you do not want to use primary keys this way. A better approach would be to create another column, called 'position' or similar, in which you can keep track of your own ordering system.
To perform the shifting you could run a query like this:
UPDATE table SET id = id + 1 WHERE id >= 9
This will not work if your column uses auto-increment (identity) functionality.
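A minimal sketch of the separate 'position' column approach (names illustrative):
CREATE TABLE t
(
    id int IDENTITY(1,1) PRIMARY KEY,  -- stable identifier, never shifted
    position int NOT NULL,             -- your own ordering, safe to renumber
    data varchar(50)
);

-- make room at position 9, then insert there; id values are untouched
UPDATE t SET position = position + 1 WHERE position >= 9;
INSERT INTO t (position, data) VALUES (9, 'new row');

SELECT * FROM t ORDER BY position;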
No, you can't control where the new row is inserted. Actually, you don't need to: use the ORDER BY clause on your SELECT statements to order the results the way you need.
DECLARE @duplicateTable4 TABLE (id int, data VARCHAR(20))
INSERT INTO @duplicateTable4 VALUES (1,'not duplicate row')
INSERT INTO @duplicateTable4 VALUES (2,'duplicate row')
INSERT INTO @duplicateTable4 VALUES (3,'duplicate rows')
INSERT INTO @duplicateTable4 VALUES (4,'second duplicate row')
INSERT INTO @duplicateTable4 VALUES (5,'second duplicate rows')

DECLARE @duplicateTable5 TABLE (id int, data VARCHAR(20))
-- copy the original rows aside, then empty the main table
insert into @duplicateTable5 select * from @duplicateTable4
delete from @duplicateTable4

declare @i int, @cnt int
set @i = 1
set @cnt = (select count(*) from @duplicateTable5)

-- re-insert row by row, slipping the new row in at the desired position
while (@i <= @cnt)
begin
    if @i = 1
    begin
        insert into @duplicateTable4(id, data) select 11, 'indian'
        insert into @duplicateTable4(id, data) select id, data from @duplicateTable5 where id = @i
    end
    else
        insert into @duplicateTable4(id, data) select id, data from @duplicateTable5 where id = @i
    set @i = @i + 1
end

select * from @duplicateTable4
This kind of violates the purpose of a relational table, but if you need it, it's not really that hard to do (a sketch follows the steps below).
1) Use ROW_NUMBER() OVER (ORDER BY NameOfColumnToSort ASC) AS Row to make a column for the row numbers in your table.
2) From here you can copy (using SELECT columnsYouNeed INTO) the before and after portions of the table into two separate tables (based on which row number you want to insert your values after), using WHERE Row < ## and WHERE Row >= ## respectively.
3) Next you drop the original table using DROP TABLE.
4) Then you use UNION ALL to combine the before table, the row you want to insert (as a single explicitly defined SELECT statement), and the after table: two UNION ALLs across three SELECT clauses. Wrap the whole thing in a SELECT INTO naming your original table.
5) Last, you DROP TABLE the two tables you made.
This is similar to how an ALTER TABLE works.
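A hedged sketch of those steps, assuming a table t(id, data) and inserting after row number 8 (all names are illustrative):
-- 1) number the rows
SELECT id, data, ROW_NUMBER() OVER (ORDER BY id ASC) AS Row
INTO #numbered
FROM t;

-- 2) split into before/after portions
SELECT id, data INTO #before FROM #numbered WHERE Row < 9;
SELECT id, data INTO #after  FROM #numbered WHERE Row >= 9;

-- 3) drop the original
DROP TABLE t;

-- 4) stitch the pieces back together with the new row in the middle
SELECT id, data
INTO t
FROM (SELECT id, data FROM #before
      UNION ALL
      SELECT 9 AS id, 'new row' AS data
      UNION ALL
      SELECT id, data FROM #after) AS u;

-- 5) clean up
DROP TABLE #numbered;
DROP TABLE #before;
DROP TABLE #after;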
INSERT INTO customers
(customer_id, last_name, first_name)
SELECT employee_number AS customer_id, last_name, first_name
FROM employees
WHERE employee_number < 1003;
For more reference: https://www.techonthenet.com/sql/insert.php
I have a web application that creates printable forms; these forms have a unique number on them. The problem is that I have 2 forms for which separate number ranges need to be used:
Form1 - numbered 2000000-2999999
Form2 - numbered 3000000-3999999
dbo.test2 - is my form information table
Tsel - is my autoinc table for the 3000000 series numbers
Tadv - is my autoinc table for the 2000000 series numbers
What I have done is create 2 tables with just an autoinc column (one for the 2000000-series numbers and one for the 3000000-series numbers). I then created a trigger that adds a record to the corresponding table, reads back the autoinc number, and stores it in the table that holds the form information, so each form gets the just-created autoinc number for the right series.
Although it does work, I'm concerned that the numbers will get messed up under load.
I'm not sure that @@IDENTITY will always return the right value when many people are using the system. (I cannot have duplicates, and I need to use the numbering scheme shown above.)
See code below.
**** TRIGGER ****
CREATE TRIGGER MAKEANID2 ON dbo.test2
AFTER INSERT
AS
SET NOCOUNT ON
declare @someid int
declare @someid2 int
declare @startfrom int
declare @test1 varchar(10)

select @someid = @@IDENTITY
select @test1 = (select name1 from test2 where sysid = @someid)

if @test1 = 'select'
begin
    insert into Tsel default values
    select @someid2 = @@IDENTITY
end

if @test1 = 'adv'
begin
    insert into Tadv default values
    select @someid2 = @@IDENTITY
end

update test2
set name2 = (@someid2) where sysid = @someid

SET NOCOUNT OFF
The best way to keep the two IDs in sync is to create a persisted computed column based on the actual identity column: Col1 is the identity column and Col2 is the persisted computed column, the result of a formula based on Col1. You can then even create indexes on computed columns.
test this out:
CREATE TABLE YourTable
(Col1 int not null identity(2000000,1)
,Col2 AS (Col1-2000000+3000000) PERSISTED
,Col3 varchar(5)
)
GO
insert into YourTable (col3) values ('a')
insert into YourTable (col3) SELECT 'b' UNION SELECT 'c'
SELECT * FROM YourTable
OUTPUT:
Col1 Col2 Col3
----------- ----------- -----
2000000 3000000 a
2000001 3000001 b
2000002 3000002 c
(3 row(s) affected)
EDIT: After the OP's comments, I'm still not 100% sure what you are after.
I never used SQL Server 2000 (we skipped that version), and I don't really want to look up how to do everything in that version; it is so limited without the OUTPUT clause, ROW_NUMBER(), CTEs, etc.
I can think of three methods:
1) You could just create a sequence table with 2 rows, one for A and one for B. Each time you need to insert a row, look up, increment, and save the value for the sequence type you need, then insert with that value. For example, if you are inserting a type "A" row, do this:
INSERT INTO test2
(col1, col2, col3,...)
SELECT
ISNULL(MAX(NextSeq),0)+1, col2, col3,...
FROM YourSequenceTable WITH (UPDLOCK, HOLDLOCK)
WHERE SequenceType='A'
UPDATE YourSequenceTable
SET NextSeq=ISNULL(NextSeq,0)+1
WHERE SequenceType='A'
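Note: for the UPDLOCK/HOLDLOCK hints to keep concurrent callers serialized across both statements, they presumably need to run inside one transaction, e.g.:
BEGIN TRAN
    -- the INSERT ... SELECT and the UPDATE from above go here, so the
    -- lock on YourSequenceTable is held until COMMIT
COMMIT TRAN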
2) Change your table structure to just save the data in Tsel or Tadv, and have a trigger insert into a third, common table where you can have your additional "common" identity. The common table would look like:
CommonTable
ID int not null identity(1,1) primary key
TselID int null FK to Tsel.PK
TadvID int null FK to Tadv.PK
3) If you need a single table, try this, which is a real hack. Change your Tsel and Tadv tables to contain all the necessary columns. From the application, INSERT INTO Tsel when the value is 'select', and have a trigger grab that identity value, INSERT it into test2, and then remove the data from Tsel. Likewise, when the value is 'adv', INSERT INTO Tadv and have a trigger on that table insert the data into test2 and remove the data from Tadv. You need all the data columns in Tsel and Tadv so the trigger can copy the values to test2, but the trigger removes the rows from there (the identity stays sequential even though the original rows are removed).
your Tsel trigger would look like:
CREATE Trigger MAKEANID2_Tsel ON dbo.Tsel
AFTER INSERT
AS
--copy data from Tsel into test2; test2 can still have its own identity value
INSERT INTO test2
(PK, col1, col2, col3,...)
SELECT
col0, col1, col2, col3,....
FROM INSERTED
--remove rows from Tsel, which were just copied and not needed anymore.
DELETE Tsel
WHERE PK IN (SELECT PK FROM INSERTED)
GO
You are right to worry about @@IDENTITY; it is not a recommended piece of code. If someone else adds a different trigger that inserts an identity and that one fires first, that is the value you will get.
But you have much bigger problems. Your trigger is designed to work on only one record at a time. This is a very, very bad thing to do with a trigger. Triggers operate on sets of data and must ALWAYS (even if you think there will never be more than one record inserted at a time) be set up to handle sets of data, not one record. Further, you don't need to ask for the identity: you have the identities of all records inserted in the batch in a pseudo-table available in triggers called inserted.
Now, reading one of your comments, you say you can't have any missing values at all. In that case you cannot under any circumstance use an identity column, as it will have gaps if any transaction is rolled back. You will have to write your own process to create the numbers based on the last number, and look out for race conditions.
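To illustrate the set-based point, here is a rough, untested sketch of how the 'select' branch of that trigger might be rewritten to handle multi-row inserts, borrowing the sequence-table idea from the previous answer (treat it as a starting point, not a drop-in fix; it avoids ROW_NUMBER() for SQL Server 2000 compatibility):
CREATE TRIGGER MAKEANID2_set ON dbo.test2
AFTER INSERT
AS
SET NOCOUNT ON

DECLARE @cnt int, @base int

-- reserve one number per 'select' row in this batch, atomically
SELECT @cnt = COUNT(*) FROM inserted WHERE name1 = 'select'
IF @cnt > 0
BEGIN
    UPDATE YourSequenceTable
    SET @base = NextSeq = ISNULL(NextSeq, 0) + @cnt
    WHERE SequenceType = 'select'

    -- hand the reserved range out across the batch; the ranking by
    -- sysid is arbitrary but unique
    UPDATE t
    SET name2 = @base - x.rnk + 1
    FROM test2 t
    JOIN (SELECT i.sysid,
                 rnk = (SELECT COUNT(*) FROM inserted i2
                        WHERE i2.name1 = 'select'
                          AND i2.sysid <= i.sysid)
          FROM inserted i
          WHERE i.name1 = 'select') x
      ON t.sysid = x.sysid
END
-- a matching block would handle the 'adv' rows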