I have a table for logging that needs a log ID but I can't use an identity column because the log ID is part of a combo key.
create table StuffLogs
(
StuffID int,
LogID int,
Note varchar(255)
)
There is a combo key for StuffID & LogID.
I want to build an insert trigger that computes the next LogID when inserting log records. I can do it for one record at a time (see below for how LogID is computed), but that's not really efficient, and I'm hoping there's a way to do this without cursors.
select @NextLogID = isnull(max(LogID),0)+1
from StuffLogs where StuffID = (select StuffID from inserted)
The net result should allow me to insert any number of records into StuffLogs with the LogID column auto computed.
StuffID LogID Note
123 1 foo
123 2 bar
456 1 boo
789 1 hoo
Inserting another record using StuffID: 123, Note: bop will result in the following record:
StuffID LogID Note
123 3 bop
Unless there is a rigid business reason that requires each LogID to be a sequence starting from 1 for each distinct StuffID, then just use an identity. With an identity, you'll still be able to order rows properly with StuffID+LogID, but you'll not have the insert issues of trying to do it manually (concurrency, deadlocks, locking/blocking, slow inserts, etc.).
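A minimal sketch of that suggestion, assuming the per-StuffID number is only needed when reading the log. The table name StuffLogsById and the column alias LogSeq are hypothetical; the identity supplies the insert order and ROW_NUMBER derives the 1-based sequence at query time.

```sql
-- Hypothetical variant of the table: LogID is a plain identity column
CREATE TABLE StuffLogsById
(
    LogID   int IDENTITY(1,1) NOT NULL,
    StuffID int NOT NULL,
    Note    varchar(255) NULL,
    CONSTRAINT PK_StuffLogsById PRIMARY KEY (StuffID, LogID)
);

INSERT INTO StuffLogsById (StuffID, Note)
VALUES (123, 'foo'), (123, 'bar'), (456, 'boo'), (789, 'hoo');

-- Derive the per-StuffID sequence when reading, ordered by the identity
SELECT StuffID,
       ROW_NUMBER() OVER (PARTITION BY StuffID ORDER BY LogID) AS LogSeq,
       Note
FROM StuffLogsById
ORDER BY StuffID, LogSeq;
```

Note that a derived sequence shifts if rows are deleted, so this only fits when the number does not have to be stable.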
Make sure the LogId has a default value of NULL, so that it need not be supplied during insert statements, like it was an identity column.
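CREATE TRIGGER trg_StuffLogs_Insert ON dbo.StuffLogs
INSTEAD OF INSERT
AS
-- inserted is read-only, so insert the numbered rows instead of updating it
INSERT INTO dbo.StuffLogs (StuffID, LogID, Note)
SELECT i.StuffID, ISNULL((SELECT MAX(s.LogID) FROM dbo.StuffLogs s WHERE s.StuffID = i.StuffID), 0) + ROW_NUMBER() OVER (PARTITION BY i.StuffID ORDER BY (SELECT NULL)), i.Note
FROM inserted i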
Select Row_Number() Over( Partition By i.StuffID Order By (Select Null) ) + IsNull(MaxValue.Id, 0)
From inserted As i
Outer Apply ( Select Max(LogId) As Id From StuffLogs As s Where s.StuffID = i.StuffID ) As MaxValue
You would need to thoroughly test this and ensure that if two connections were inserting into the table at the same time that you do not get collisions on LogId.
I've got table:
ID (identity, PK), TaskNr, OfferNr
I want to run an INSERT IGNORE statement, but sadly it doesn't work on MSSQL; there is an IGNORE_DUP_KEY index option instead. However, I need to check for duplicates using the TaskNr column. Is there any way to do that?
Edit:
Sample data:
ID (identity, PK), TaskNr, OfferNr
1 BP1234 XAS
2 BD123 JFRT
3 1122AH JDA33
4 22345_a MD_3
Trying to do:
insert ignore into Sample_table (TaskNr, OfferNr) values ('BP1234', 'DFD')
Should ignore that row and go to next value of insert statement. ID is autoincremented but unique value should be checked using TaskNr column.
SQL Server does not support insert ignore. That is MySQL functionality.
You can do what you want as:
insert into Sample_table (TaskNr, OfferNr)
select x.TaskNr, x.OfferNr
from (select 'BP1234' as TaskNr, 'DFD' as OfferNr) x
where not exists (select 1
from Sample_Table st
where st.TaskNr = x.TaskNr
);
You can try two options:
insert into ... where not exists ()
t-sql merge statement (https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql)
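A sketch of the second option, assuming the Sample_table from the question and two illustrative rows (the second row, 'ZZ999', is made up). Rows whose TaskNr already exists are skipped; the rest are inserted.

```sql
-- Insert only the rows whose TaskNr is not already present
MERGE Sample_table AS target
USING (VALUES ('BP1234', 'DFD'),
              ('ZZ999',  'NEW1')) AS src (TaskNr, OfferNr)
    ON target.TaskNr = src.TaskNr
WHEN NOT MATCHED BY TARGET THEN
    INSERT (TaskNr, OfferNr)
    VALUES (src.TaskNr, src.OfferNr);
```

This does not deduplicate within the source itself: two source rows with the same new TaskNr would both be inserted. Under concurrent writers, a unique index on TaskNr is still a sensible backstop.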
In my project I need to create a script that inserts data with an auto-generated value for the primary key and then reuses that value as a foreign key in other tables.
I'm trying to use the WITH statement in order to keep that value.
For instance, I'm trying to do this:
WITH tmp as (SELECT ID FROM (INSERT INTO A ... VALUES ...))
INSERT INTO B ... VALUES tmp.ID ...
But I can't make it work.
Is it at least possible, or am I completely wrong?
Thank you
Yes, it is possible, if your Db2 server version supports the syntax.
For example:
create table xemp(id bigint generated always as identity, other_stuff varchar(20));
create table othertab(xemp_id bigint);
SELECT id FROM FINAL TABLE
(INSERT INTO xemp(other_stuff)
values ('a'), ('b'), ('c'), ('d')
) ;
The above snippet of code gives the result below:
ID
--------------------
1
2
3
4
4 record(s) selected.
If you want to re-use the ID to populate another table:
with tmp1(id) as ( SELECT id FROM new TABLE (INSERT INTO xemp(other_stuff) values ('a1'), ('b1'), ('c1'), ('d1') ) tmp3 )
, tmp2 as (select * from new table (insert into othertab(xemp_id) select id from tmp1 ) tmp4 )
select * from othertab;
As per my understanding
You will have to create an auto-increment field with the sequence object (this object generates a number sequence).
You can CREATE SEQUENCE to achieve the auto increment value :
CREATE SEQUENCE seq_person
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 10
I have a large (5 million row, 300+ column) csv file I need to import into a staging table in SQL Server, then run a script to split each row up and insert data into the relevant tables in a normalized db. The format of the source table looks something like this:
(fName, lName, licenseNumber1, licenseIssuer1, licenseNumber2, licenseIssuer2..., specialtyName1, specialtyState1, specialtyName2, specialtyState2..., identifier1, identifier2...)
There are 50 licenseNumber/licenseIssuer columns, 15 specialtyName/specialtyState columns, and 15 identifier columns. There is always at least one of each of those, but the remaining 49 or 14 could be null. The first identifier is unique, but is not used as the primary key of the Person in our schema.
My database schema looks like this
People(ID int Identity(1,1))
Names(ID int, personID int, lName varchar, fName varchar)
Licenses(ID int, personID int, number varchar, issuer varchar)
Specialties(ID int, personID int, name varchar, state varchar)
Identifiers(ID int, personID int, value)
The database will already be populated with some People before adding the new ones from the csv.
What is the best way to approach this?
I have tried iterating over the staging table one row at a time with select top 1:
WHILE EXISTS (Select top 1 * from staging)
BEGIN
INSERT INTO People Default Values
SET @LastInsertedID = SCOPE_IDENTITY() -- might use the output clause to get this instead
INSERT INTO Names (personID, lName, fName)
SELECT top 1 @LastInsertedID, lName, fName from staging
INSERT INTO Licenses(personID, number, issuer)
SELECT top 1 @LastInsertedID, licenseNumber1, licenseIssuer1 from staging
IF (select top 1 licenseNumber2 from staging) is not null
BEGIN
INSERT INTO Licenses(personID, number, issuer)
SELECT top 1 @LastInsertedID, licenseNumber2, licenseIssuer2 from staging
END
-- Repeat the above 49 times, etc...
DELETE TOP (1) from staging
END
One problem with this approach is that it is prohibitively slow, so I refactored it to use a cursor. This works and is significantly faster, but has me declaring 300+ variables for Fetch INTO.
Is there a set-based approach that would work here? That would be preferable, as I understand that cursors are frowned upon, but I'm not sure how to get the identity from the INSERT into the People table for use as a foreign key in the others without going row-by-row from the staging table.
Also, how could I avoid copy and pasting the insert into the Licenses table? With a cursor approach I could try:
FETCH INTO ...@LicenseNumber1, @LicenseIssuer1, @LicenseNumber2, @LicenseIssuer2...
INSERT INTO #LicenseTemp (number, issuer) Values
(@LicenseNumber1, @LicenseIssuer1),
(@LicenseNumber2, @LicenseIssuer2),
... Repeat 48 more times...
.
.
.
INSERT INTO Licenses(personID, number, issuer)
SELECT @LastInsertedID, number, issuer
FROM #LicenseTemp
WHERE number is not null
There still seems to be some redundant copy and pasting there, though.
To summarize the questions, I'm looking for idiomatic approaches to:
Break up one large staging table into a set of normalized tables, retrieving the Primary Key/identity from one table and using it as the foreign key in the others
Insert multiple rows into the normalized tables that come from many repeated columns in the staging table with less boilerplate/copy and paste (Licenses and Specialties above)
Short of discrete answers, I'd also be very happy with pointers towards resources and references that could assist me in figuring this out.
Ok, I'm not an SQL Server expert, but here's the "strategy" I would suggest.
Calculate the personId on the staging table
As @Shnugo suggested before me, calculating the personId in the staging table will ease the next steps.
Use a sequence for the personID
From SQL Server 2012 you can define sequences. If you use it for every person insert, you'll never risk an overlapping of IDs. If you have (as it seems) personId that were loaded before the sequence you can create the sequence with the first free personID as starting value
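A rough sketch of this step, under stated assumptions: the sequence name dbo.PersonIdSeq and the start value 100001 are hypothetical, and the start value stands in for the first free personID. START WITH takes a constant, so the real value has to be looked up in People beforehand.

```sql
-- Assumption: 100001 is a placeholder for MAX(ID) + 1 from the People table
CREATE SEQUENCE dbo.PersonIdSeq AS int
    START WITH 100001
    INCREMENT BY 1;
GO

-- Add the column to the staging table (separate batch so the UPDATE compiles)
ALTER TABLE staging ADD personId int NULL;
GO

-- Each staging row draws its own value from the sequence
UPDATE staging
SET personId = NEXT VALUE FOR dbo.PersonIdSeq;
```

GO is the batch separator used by SSMS and sqlcmd. If People.ID stays an identity column, inserting these precomputed values would additionally need SET IDENTITY_INSERT or a non-identity column.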
Create a numbers table
Create a utility table holding the numbers from 1 to n (n needs to be at least 50; you can look at this question for some implementations)
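One possible way to build such a table, as a sketch: the table name numbers and column n match the queries used later in this answer, while using sys.all_objects as a row source is just a convenient assumption (any table with at least 50 rows would do).

```sql
-- Utility table holding the integers 1..50
CREATE TABLE dbo.numbers (n int NOT NULL PRIMARY KEY);

INSERT INTO dbo.numbers (n)
SELECT TOP (50) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;
```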
Use set logic to do the insert
I'd avoid cursors and row-by-row logic: you are right that it is better to limit the number of accesses to the table, but I'd say you should strive to limit it to one access per target table.
You could proceed like this:
People:
INSERT INTO People (personID)
SELECT personId from staging;
Names:
INSERT INTO Names (personID, lName, fName)
SELECT personId, lName, fName from staging;
Licenses:
here we'll need the Number table
INSERT INTO Licenses (personId, number, issuer)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then licenseNumber1
when 2 then licenseNumber2
...
when 50 then licenseNumber50
end as licenseNumber,
case nbrs.n
when 1 then licenseIssuer1
when 2 then licenseIssuer2
...
when 50 then licenseIssuer50
end as licenseIssuer
from staging
cross join
(select n from numbers where n>=1 and n<=50) nbrs
) AS x WHERE licenseNumber is not null;
Specialties:
INSERT INTO Specialties(personId, name, state)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then specialtyName1
when 2 then specialtyName2
...
when 15 then specialtyName15
end as specialtyName,
case nbrs.n
when 1 then specialtyState1
when 2 then specialtyState2
...
when 15 then specialtyState15
end as specialtyState
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) AS x WHERE specialtyName is not null;
Identifiers:
INSERT INTO Identifiers(personId, value)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then identifier1
when 2 then identifier2
...
when 15 then identifier15
end as value
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) AS x WHERE value is not null;
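As an alternative to the numbers table and CASE expressions, the repeated columns can also be unpivoted with CROSS APPLY over a VALUES list. This is a sketch, not part of the original suggestion; it assumes the staging table already carries the personId column, and only the first two pairs are written out.

```sql
-- Turn each licenseNumberN/licenseIssuerN pair into its own row
INSERT INTO Licenses (personID, number, issuer)
SELECT s.personId, v.number, v.issuer
FROM staging AS s
CROSS APPLY (VALUES
    (s.licenseNumber1, s.licenseIssuer1),
    (s.licenseNumber2, s.licenseIssuer2)
    -- ... continue with one pair per column, up to 50
) AS v (number, issuer)
WHERE v.number IS NOT NULL;
```

The same shape works for Specialties and Identifiers, and the staging table is still read only once per target table.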
Hope it helps.
You say: but the staging table could be modified
I would
add a PersonID INT NOT NULL column and fill it with DENSE_RANK() OVER(ORDER BY fname,lname)
add an index to this PersonID
use this ID in combination with GROUP BY to fill your People table
do the same with your names table
And then use this ID for a set-based insert into your three side tables
Do it like this
SELECT AllTogether.PersonID, AllTogether.TheValue
FROM
(
SELECT PersonID,SomeValue1 AS TheValue FROM StagingTable
UNION ALL SELECT PersonID,SomeValue2 FROM StagingTable
UNION ALL ...
) AS AllTogether
WHERE AllTogether.TheValue IS NOT NULL
UPDATE
You say: might cause a conflict with IDs that already exist in the People table
You did not say anything about existing People...
Is there any sure and unique mark to identify them? Use a simple
UPDATE StagingTable SET PersonID=xyz WHERE ...
to set existing PersonIDs into your staging table and then use something like
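WITH NewPeople AS ( -- window functions are not allowed directly in SET
SELECT PersonID, DENSE_RANK() OVER(...) + MaxExistingID AS NewPersonID
FROM StagingTable
WHERE PersonID IS NULL
)
UPDATE NewPeople SET PersonID = NewPersonID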
to set new IDs for PersonIDs still being NULL.
I have a table (in SQL Server) like:
UserId CompanyId UserName Position
1 1 John 1
2 2 Adam 1
3 2 Nick 2
4 1 Mark 2
5 3 Jack 1
UserId is the PK with autoincrement. CompanyID is a FK. Position is just a sequential counter for the users (per company).
When a new user record is inserted (via LINQ) for a company, I want the Position to be incremented as well. Currently I get the MAX+1 of the Position for the given CompanyId and then assign it to the new record. The problem is that concurrent insert operations often result in identical Position values.
I tried incrementing the position in an insert trigger for uniqueness, but LINQ doesn't reflect the updated value automatically.
How can I go about fixing this through LINQ-to-SQL or directly as a TSQL query?
Any help is appreciated. Thanks!
I can tell you about TSQL query.
You can use
IDENT_CURRENT, which returns the last identity value generated for a given table by any session.
If that is not what you want, then try SCOPE_IDENTITY.
My 2 cents:
begin transaction
declare @v int
select @v = isnull((select max(field2)+1 from t1), 1)
insert into t1 (field2) values (@v)
commit transaction
-- just checking
select * from t1
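On its own, a transaction at the default isolation level does not stop two sessions from reading the same MAX. One common way to serialize the read is to add locking hints; the sketch below reuses the t1 and field2 names from the snippet above and is a variation, not the original answer's code.

```sql
BEGIN TRANSACTION;

DECLARE @v int;

-- UPDLOCK + HOLDLOCK keep the range read here locked until COMMIT,
-- so a second session running the same code waits instead of reading the same MAX
SELECT @v = ISNULL(MAX(field2), 0) + 1
FROM t1 WITH (UPDLOCK, HOLDLOCK);

INSERT INTO t1 (field2) VALUES (@v);

COMMIT TRANSACTION;
```

For the per-company Position, the SELECT would add a WHERE clause on CompanyId, and an index on that column helps keep the locked range small.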
You can add a unique constraint on CompanyID and Position to prevent duplicates.
create table Users
(
UserId int identity primary key,
CompanyId int not null,
UserName char(10) not null,
Position int not null,
unique(CompanyId, Position)
)
Instead of duplicates you get an exception, and in your code you can handle it, for example with a retry.
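The retry could also live in T-SQL. A minimal sketch, assuming the Users table defined above and SQL Server 2012 or later (for THROW); the new user's values and the limit of 5 attempts are illustrative.

```sql
DECLARE @CompanyId int = 2,
        @UserName  char(10) = 'Kate',
        @attempt   int = 0,
        @done      bit = 0;

WHILE @done = 0
BEGIN
    BEGIN TRY
        -- Compute MAX + 1 and insert in a single statement
        INSERT INTO Users (CompanyId, UserName, Position)
        SELECT @CompanyId, @UserName, ISNULL(MAX(Position), 0) + 1
        FROM Users
        WHERE CompanyId = @CompanyId;

        SET @done = 1;
    END TRY
    BEGIN CATCH
        SET @attempt += 1;
        -- 2627 and 2601 are the duplicate-key errors; rethrow anything else,
        -- or give up after the fifth failed attempt
        IF ERROR_NUMBER() NOT IN (2627, 2601) OR @attempt >= 5
            THROW;
    END CATCH
END;
```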
I'll begin by admitting that my problem is most likely the result of bad design since I can't find anything about this elsewhere. That said, let's get dirty.
I have an Activities table and an ActivitySegments table. The activities table looks something like:
activityid (ident) | actdate (datetime) | actduration (datetime) | ticketnumber (numeric) |
ActivitySegments looks something like
segmentid (ident) | ticketid (numeric) | activityid (numeric) | startdate | starttime | enddate | endtime
This is a time tracking function of an intranet. The "old way" of doing things is just using the activity table. They want to be able to track individual segments of work throughout the day with a start/stop mechanism and have them roll up into records in the activities table. The use case is the user should be able to select any/all segments that they've worked on that day and have them be grouped by ticketid and inserted into the activity table. I have that working. I'm sending a string of comma separated values that correspond to segmentids to a sproc that puts them in a temp table. So I have the above two tables and a temp table with one column of relevant segmentids. Can't they all just get along?
What I need is to take these passed activitysegment Ids, group them by ticket number and sum the duration worked on each ticket (I already have the sql for that). Then insert this dataset into the activities table BUT also get the new activityid @@identity and update the ActivitySegments table with the appropriate value.
In procedural programming I'd for loop the insert, get the @@identity and do something else to figure out which segmentids went into creating that activityid. I'm pretty sure I'm thinking about this all wrong, but the deadline approaches and I've been staring at SQL management studio for two days, wasted sheets of paper and burned through way too many cigarettes. I see SQL for Smarties in my near future, until then, can someone help me?
try this approach:
declare @x table (tableID int not null primary key identity (1,1), datavalue varchar(10) null)
INSERT INTO @x values ('one')
INSERT INTO @x values ('aaaa')
INSERT INTO @x values ('cccc')
declare @y table (tableID int not null primary key , datavalue varchar(10) null)
declare @count int ---------------FROM HERE, see comment
set @count=5;
WITH hier(cnt) AS
(
SELECT 1 AS cnt
UNION ALL
SELECT cnt + 1
FROM hier
WHERE cnt < @count
) -----------------------To HERE, see comment
INSERT INTO @x
(datavalue)
OUTPUT INSERTED.tableID, INSERTED.datavalue
INTO @y
SELECT
'value='+CONVERT(varchar(5),h.cnt)
FROM hier h
ORDER BY cnt DESC
select '@x',* from @x --table you just inserted into
select '@y',* from @y --captured data, including identity
here is output of the SELECTs
tableID datavalue
---- ----------- ----------
@x 1 one
@x 2 aaaa
@x 3 cccc
@x 4 value=5
@x 5 value=4
@x 6 value=3
@x 7 value=2
@x 8 value=1
(8 row(s) affected)
tableID datavalue
---- ----------- ----------
@y 4 value=5
@y 5 value=4
@y 6 value=3
@y 7 value=2
@y 8 value=1
The "FROM HERE" - "TO HERE" is just a fancy way to create a table to join to, you can use your own table to join to there...
use @y to process your updates, update from and join it in...
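Applied to the question's scenario, the same OUTPUT technique might look like the sketch below. The table variables are simplified stand-ins for Activities and ActivitySegments, and the minutes column is a hypothetical replacement for the asker's own duration SQL, which is not shown in the question.

```sql
-- Simplified, hypothetical stand-ins for the real tables
DECLARE @Activities TABLE (activityid int IDENTITY(1,1) PRIMARY KEY,
                           ticketnumber int NOT NULL, minutes int NOT NULL);
DECLARE @Segments   TABLE (segmentid int PRIMARY KEY, ticketid int NOT NULL,
                           minutes int NOT NULL, activityid int NULL);
DECLARE @New        TABLE (activityid int NOT NULL, ticketnumber int NOT NULL);

INSERT INTO @Segments (segmentid, ticketid, minutes)
VALUES (1, 100, 30), (2, 100, 15), (3, 200, 45);

-- One activity per ticket; capture each new identity with its ticket number
INSERT INTO @Activities (ticketnumber, minutes)
OUTPUT inserted.activityid, inserted.ticketnumber
INTO @New (activityid, ticketnumber)
SELECT ticketid, SUM(minutes)
FROM @Segments
GROUP BY ticketid;

-- Write the captured activityid back to the segments it was built from
UPDATE s
SET activityid = n.activityid
FROM @Segments AS s
JOIN @New AS n ON n.ticketnumber = s.ticketid;

SELECT segmentid, ticketid, activityid FROM @Segments;
```

In the real procedure, the update would also join to the temp table of passed segmentids so that only the selected segments are touched.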
It can be tricky to implement the case of (1) do an insert of ranges and (2) use the identity values generated by them.
One approach is to have some kind of tracking column on the table that generates the Id. So for example add a TransactionGUID (uniqueidentifier) or something to the table that generates the identity you want to capture. When you do insert the rowset to this table you specify a given GUID and can then harvest the set of identity values after the insert completes.
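A small sketch of the tracking-column idea, using a table variable as a hypothetical stand-in for the real table; the column names other than TransactionGUID are illustrative.

```sql
-- Hypothetical stand-in for the table that generates the identity
DECLARE @Activities TABLE
(
    activityid      int IDENTITY(1,1) PRIMARY KEY,
    ticketnumber    int NOT NULL,
    TransactionGUID uniqueidentifier NOT NULL
);

-- One GUID tags every row inserted by this batch
DECLARE @batch uniqueidentifier = NEWID();

INSERT INTO @Activities (ticketnumber, TransactionGUID)
VALUES (100, @batch), (200, @batch);

-- Harvest the identity values generated by this batch only
SELECT activityid, ticketnumber
FROM @Activities
WHERE TransactionGUID = @batch;
```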
The other common approach is just to do it iteratively like you mentioned.
Probably there is a better way to architect what you want to do, but if you must use your current approach (and if I understand correctly what it is you are doing) then adding the TransactionGUID may be the easiest fix.