Increment Record Position in Database (via LINQ or TSQL) - sql

I have a table (in SQL Server) like:
UserId  CompanyId  UserName  Position
1       1          John      1
2       2          Adam      1
3       2          Nick      2
4       1          Mark      2
5       3          Jack      1
UserId is the PK with autoincrement. CompanyID is a FK. Position is just a sequential counter for the users (per company).
When a new user record is inserted (via LINQ) for a company, I want the Position to be incremented as well. Currently I get the MAX+1 of the Position for the given CompanyId and then assign it to the new record. The problem is that concurrent insert operations often result in identical Position values.
I tried incrementing the position in an insert trigger for uniqueness, but LINQ doesn't reflect the updated value automatically.
How can I go about fixing this through LINQ-to-SQL or directly as a TSQL query?
Any help is appreciated. Thanks!

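I can only speak to the T-SQL side.
You can use IDENT_CURRENT('table_name'), which returns the last identity value generated for that table by any session.
If that is not what you want, then try SCOPE_IDENTITY(), which returns the last identity value generated in your own session and scope.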

My 2 cents:
begin transaction
declare @v int
select @v = isnull((select max(field2)+1 from t1), 1)
insert into t1 (field2) values (@v)
commit transaction
-- just checking
select * from t1
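A transaction alone does not stop two sessions from reading the same MAX at the same time. A minimal sketch of one way to close that gap, assuming a Users table shaped like the one in the question (the table name and the @CompanyId / @UserName inputs are hypothetical, and DECLARE with an initializer needs SQL Server 2008 or later):

```sql
-- Hypothetical inputs for the new user.
DECLARE @CompanyId int = 2, @UserName char(10) = 'Kate';

BEGIN TRANSACTION;

-- UPDLOCK + HOLDLOCK keep the range read for this company locked until commit,
-- so a concurrent insert for the same company waits instead of reading the same MAX.
INSERT INTO Users (CompanyId, UserName, Position)
SELECT @CompanyId, @UserName, ISNULL(MAX(Position), 0) + 1
FROM Users WITH (UPDLOCK, HOLDLOCK)
WHERE CompanyId = @CompanyId;

COMMIT TRANSACTION;
```

This serializes inserts per company, so it trades some throughput for uniqueness. An index on (CompanyId, Position) should keep the locked range small.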

You can add a unique constraint on CompanyID and Position to prevent duplicates.
create table Users
(
UserId int identity primary key,
CompanyId int not null,
UserName char(10) not null,
Position int not null,
unique(CompanyId, Position)
)
Instead of a duplicate, the insert now raises an exception, which your code can handle, for example by retrying with a fresh Position.
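The retry can also live on the T-SQL side. A rough sketch, assuming the Users table above with its unique constraint; the inputs and the limit of 5 attempts are made up for illustration:

```sql
-- Hypothetical inputs for the new user.
DECLARE @CompanyId int = 2, @UserName char(10) = 'Kate';
DECLARE @attempt int = 0, @done bit = 0;

WHILE @done = 0
BEGIN
    BEGIN TRY
        INSERT INTO Users (CompanyId, UserName, Position)
        SELECT @CompanyId, @UserName, ISNULL(MAX(Position), 0) + 1
        FROM Users
        WHERE CompanyId = @CompanyId;

        SET @done = 1;  -- insert succeeded, leave the loop
    END TRY
    BEGIN CATCH
        SET @attempt = @attempt + 1;

        -- 2627 = unique constraint violation, 2601 = duplicate key in a unique index.
        -- Anything else, or too many attempts, is reported and the loop ends.
        IF ERROR_NUMBER() NOT IN (2627, 2601) OR @attempt >= 5
        BEGIN
            DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
            RAISERROR(@msg, 16, 1);
            BREAK;
        END
    END CATCH
END
```

Each retry recomputes MAX(Position) + 1, so a collision with a concurrent insert simply moves this row to the next free position.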

Related

Split one large, denormalized table into a normalized database

I have a large (5 million row, 300+ column) csv file I need to import into a staging table in SQL Server, then run a script to split each row up and insert data into the relevant tables in a normalized db. The format of the source table looks something like this:
(fName, lName, licenseNumber1, licenseIssuer1, licenseNumber2, licenseIssuer2..., specialtyName1, specialtyState1, specialtyName2, specialtyState2..., identifier1, identifier2...)
There are 50 licenseNumber/licenseIssuer columns, 15 specialtyName/specialtyState columns, and 15 identifier columns. There is always at least one of each of those, but the remaining 49 or 14 could be null. The first identifier is unique, but is not used as the primary key of the Person in our schema.
My database schema looks like this
People(ID int Identity(1,1))
Names(ID int, personID int, lName varchar, fName varchar)
Licenses(ID int, personID int, number varchar, issuer varchar)
Specialties(ID int, personID int, name varchar, state varchar)
Identifiers(ID int, personID int, value)
The database will already be populated with some People before adding the new ones from the csv.
What is the best way to approach this?
I have tried iterating over the staging table one row at a time with select top 1:
WHILE EXISTS (Select top 1 * from staging)
BEGIN
    INSERT INTO People Default Values
    SET @LastInsertedID = SCOPE_IDENTITY() -- might use the output clause to get this instead
    INSERT INTO Names (personID, lName, fName)
    SELECT top 1 @LastInsertedID, lName, fName from staging
    INSERT INTO Licenses(personID, number, issuer)
    SELECT top 1 @LastInsertedID, licenseNumber1, licenseIssuer1 from staging
    IF (select top 1 licenseNumber2 from staging) is not null
    BEGIN
        INSERT INTO Licenses(personID, number, issuer)
        SELECT top 1 @LastInsertedID, licenseNumber2, licenseIssuer2 from staging
    END
    -- Repeat the above 49 times, etc...
    DELETE top (1) from staging
END
One problem with this approach is that it is prohibitively slow, so I refactored it to use a cursor. This works and is significantly faster, but has me declaring 300+ variables for Fetch INTO.
Is there a set-based approach that would work here? That would be preferable, as I understand that cursors are frowned upon, but I'm not sure how to get the identity from the INSERT into the People table for use as a foreign key in the others without going row-by-row from the staging table.
Also, how could I avoid copy and pasting the insert into the Licenses table? With a cursor approach I could try:
FETCH INTO ...@LicenseNumber1, @LicenseIssuer1, @LicenseNumber2, @LicenseIssuer2...
INSERT INTO #LicenseTemp (number, issuer) Values
(@LicenseNumber1, @LicenseIssuer1),
(@LicenseNumber2, @LicenseIssuer2),
... Repeat 48 more times...
.
.
.
INSERT INTO Licenses(personID, number, issuer)
SELECT @LastInsertedID, number, issuer
FROM #LicenseTemp
WHERE number is not null
There still seems to be some redundant copy and pasting there, though.
To summarize the questions, I'm looking for idiomatic approaches to:
Break up one large staging table into a set of normalized tables, retrieving the Primary Key/identity from one table and using it as the foreign key in the others
Insert multiple rows into the normalized tables that come from many repeated columns in the staging table with less boilerplate/copy and paste (Licenses and Specialties above)
Short of discrete answers, I'd also be very happy with pointers towards resources and references that could assist me in figuring this out.
Ok, I'm not an SQL Server expert, but here's the "strategy" I would suggest.
Calculate the personId on the staging table
As @Shnugo suggested before me, calculating the personId in the staging table will make the next steps easier.
Use a sequence for the personID
From SQL Server 2012 on you can define sequences. If you draw every new personID from the sequence, the IDs can never overlap. If (as it seems) some personIDs were loaded before the sequence existed, create the sequence with the first free personID as its starting value.
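A minimal sketch of this step. The sequence name PersonIdSeq and a nullable int column staging.personId are assumptions, not part of the question. CREATE SEQUENCE only accepts a constant for START WITH, hence the dynamic SQL:

```sql
-- First free ID after the People rows that already exist.
DECLARE @start int = (SELECT ISNULL(MAX(ID), 0) + 1 FROM People);

-- START WITH must be a literal, so the statement is built as a string.
DECLARE @sql nvarchar(200) =
    N'CREATE SEQUENCE dbo.PersonIdSeq AS int START WITH '
    + CAST(@start AS nvarchar(12)) + N' INCREMENT BY 1;';
EXEC sys.sp_executesql @sql;

-- Give every staging row its own ID from the sequence.
UPDATE staging
SET personId = NEXT VALUE FOR dbo.PersonIdSeq;
```

Because People.ID is declared as an identity column in the question, inserting these precomputed values would additionally need SET IDENTITY_INSERT People ON, or an ID column that is not an identity.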
Create a numbers table
Create a utility table holding the numbers from 1 to n (n needs to be at least 50 here; there are several well-known ways to build such a table).
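One possible way to build it, as a sketch. The table is called numbers with a column n to match the queries in this answer, and sys.all_objects serves only as a convenient source of at least 50 rows:

```sql
CREATE TABLE dbo.numbers (n int NOT NULL PRIMARY KEY);

-- ROW_NUMBER over any sufficiently large row source yields 1, 2, 3, ...
INSERT INTO dbo.numbers (n)
SELECT TOP (50) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;
```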
Use set logic to do the insert
I'd avoid cursors and row-by-row logic. You are right that it is better to limit the number of accesses to the staging table, and I'd say you should aim for a single pass over it per target table.
You could proceed like this:
People:
INSERT INTO People (personID)
SELECT personId from staging;
Names:
INSERT INTO Names (personID, lName, fName)
SELECT personId, lName, fName from staging;
Licenses:
Here we need the numbers table:
INSERT INTO Licenses (personId, number, issuer)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then licenseNumber1
when 2 then licenseNumber2
...
when 50 then licenseNumber50
end as licenseNumber,
case nbrs.n
when 1 then licenseIssuer1
when 2 then licenseIssuer2
...
when 50 then licenseIssuer50
end as licenseIssuer
from staging
cross join
(select n from numbers where n>=1 and n<=50) nbrs
) AS unpivoted WHERE licenseNumber is not null;
Specialties:
INSERT INTO Specialties(personId, name, state)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then specialtyName1
when 2 then specialtyName2
...
when 15 then specialtyName15
end as specialtyName,
case nbrs.n
when 1 then specialtyState1
when 2 then specialtyState2
...
when 15 then specialtyState15
end as specialtyState
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) AS unpivoted WHERE specialtyName is not null;
Identifiers:
INSERT INTO Identifiers(personId, value)
SELECT * FROM (
SELECT personId,
case nbrs.n
when 1 then identifier1
when 2 then identifier2
...
when 15 then identifier15
end as value
from staging
cross join
(select n from numbers where n>=1 and n<=15) nbrs
) AS unpivoted WHERE value is not null;
Hope it helps.
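For comparison, the same unpivoting can be written without a numbers table by using CROSS APPLY with a VALUES list (SQL Server 2008 or later). This is a different technique from the CASE approach above, shown only as a sketch. It assumes staging already has a personId column, and only the first two license pairs are written out:

```sql
INSERT INTO Licenses (personID, number, issuer)
SELECT s.personId, v.number, v.issuer
FROM staging AS s
CROSS APPLY (VALUES
    -- one row per number/issuer column pair; the remaining pairs follow the same pattern
    (s.licenseNumber1, s.licenseIssuer1),
    (s.licenseNumber2, s.licenseIssuer2)
) AS v (number, issuer)
WHERE v.number IS NOT NULL;
```

Each pair is still listed once, but as a single line rather than as two CASE branches.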
You say: but the staging table could be modified
I would
add a PersonID INT NOT NULL column and fill it with DENSE_RANK() OVER(ORDER BY fname,lname)
add an index to this PersonID
use this ID in combination with GROUP BY to fill your People table
do the same with your names table
And then use this ID for a set-based insert into your three side tables
Do it like this
SELECT AllTogether.PersonID, AllTogether.TheValue
FROM
(
SELECT PersonID,SomeValue1 AS TheValue FROM StagingTable
UNION ALL SELECT PersonID,SomeValue2 FROM StagingTable
UNION ALL ...
) AS AllTogether
WHERE AllTogether.TheValue IS NOT NULL
UPDATE
You say: might cause a conflict with IDs that already exist in the People table
You did not say anything about the existing People...
Is there any sure and unique mark to identify them? Use a simple
UPDATE StagingTable SET PersonID=xyz WHERE ...
to set existing PersonIDs into your staging table and then use something like
UPDATE StagingTable
SET PersonID=DENSE_RANK() OVER(...) + MaxExistingID
WHERE PersonID IS NULL
to set new IDs for PersonIDs still being NULL.
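If the identity values should keep coming from People itself, another commonly used technique (not the one described in the answers above) is MERGE with an OUTPUT clause, which can return a source column next to the generated identity. A sketch for SQL Server 2008 or later; @map is a hypothetical table variable and the varchar(100) type of identifier1 is a guess:

```sql
-- identifier1 is stated to be unique in the question, so it can tie staging rows to new IDs.
DECLARE @map TABLE (identifier1 varchar(100) NOT NULL PRIMARY KEY, personID int NOT NULL);

MERGE INTO People AS tgt
USING (SELECT identifier1 FROM staging) AS src
    ON 1 = 0                      -- never matches, so every staging row is inserted
WHEN NOT MATCHED THEN
    INSERT DEFAULT VALUES         -- People only has its identity column
OUTPUT src.identifier1, inserted.ID INTO @map (identifier1, personID);

-- The map then drives set-based inserts into the child tables, for example:
INSERT INTO Names (personID, lName, fName)
SELECT m.personID, s.lName, s.fName
FROM staging AS s
JOIN @map AS m ON m.identifier1 = s.identifier1;
```

A plain INSERT ... OUTPUT cannot reference source columns, which is why MERGE is used here. For 5 million rows a temp table with an index would likely serve better than a table variable.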

Get all missing values between two limits in SQL table column

I am trying to assign ID numbers to records that are being inserted into an SQL Server 2005 database table. Since these records can be deleted, I would like these records to be assigned the first available ID in the table. For example, if I have the table below, I would like the next record to be entered at ID 4 as it is the first available.
| ID | Data |
| 1 | ... |
| 2 | ... |
| 3 | ... |
| 5 | ... |
The way that I would prefer this to be done is to build up a list of available IDs via an SQL query. From there, I can do all the checks within the code of my application.
So, in summary, I would like an SQL query that retrieves all available ID's between 1 and 99999 from a specific table column.
First build a table of all N IDs.
declare @allPossibleIds table (id integer)
declare @currentId integer
select @currentId = 1
while @currentId < 1000000
begin
insert into @allPossibleIds
select @currentId
select @currentId = @currentId+1
end
Then, left join that table to your real table. You can select MIN if you want, or you could limit your allPossibleIDs to be less than the max table id
select a.id
from @allPossibleIds a
left outer join YourTable t
on a.id = t.Id
where t.id is null
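The row-by-row loop that fills the ID list can be slow. A set-based sketch that generates the candidates 1 to 99999 on the fly and returns every missing one; YourTable and Id are the placeholder names used above, and the CTE syntax works on SQL Server 2005:

```sql
WITH digits (d) AS (
    SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
    UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
candidates (id) AS (
    -- five decimal digits combined give every number from 0 to 99999
    SELECT d1.d + 10 * d2.d + 100 * d3.d + 1000 * d4.d + 10000 * d5.d
    FROM digits d1
    CROSS JOIN digits d2
    CROSS JOIN digits d3
    CROSS JOIN digits d4
    CROSS JOIN digits d5
)
SELECT c.id
FROM candidates c
LEFT OUTER JOIN YourTable t ON t.Id = c.id
WHERE c.id BETWEEN 1 AND 99999
  AND t.Id IS NULL       -- keep only candidates with no matching row
ORDER BY c.id;
```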
Don't go for an identity column.
Let me give you an easy option while I work on a proper one.
Store the integers from 1 to 999999 in a table, say Insert_sequence.
Write a stored procedure for the insertion.
In it you can easily find the minimum value that is present in Insert_sequence but not in
your main table, store this value in a variable, and insert the row with the ID from that variable.
Regards
Ashutosh Arya
You could also loop through the keys, and when you hit an unused one, select it and exit the loop.
DECLARE @intStart INT, @loop bit
SET @intStart = 1
SET @loop = 1
WHILE (@loop = 1)
BEGIN
IF NOT EXISTS(SELECT [Key] FROM [Table] Where [Key] = @intStart)
BEGIN
SELECT @intStart as 'FreeKey'
SET @loop = 0
END
SET @intStart = @intStart + 1
END
GO
From there you can use the key as you please. Adding an @intStop variable to bound the loop would be no problem.
Why do you need a table of 1..999999? All the information you need is in your source table. Here is a query which gives you the smallest ID that can be inserted into a gap.
It works for all combinations:
(2,3,4,5) - > 1
(1,2,3,5) - > 4
(1,2,3,4) - > 5
SQLFiddle demo
select min(t1.id)+1 from
(
select id from t
union
select 0
)
t1
left join t as t2 on t1.id=t2.id-1
where t2.id is null
Many people use an auto-incrementing integer or long value for the Primary Key of their tables, and it is often called ID or MyEntityID or something similar. This column, since it's just an auto-incrementing integer, often has nothing to do with the data being stored itself.
These types of "primary keys" are called surrogate keys. They have no meaning. Many people like these types of IDs to be sequential because it is "aesthetically pleasing", but this is a waste of time and resources. The database couldn't care less about which IDs are being used and which are not.
I would highly suggest you forget trying to do this and just leave the ID column auto-increment. You should also create an index on your table that is made up of those (subset of) columns that can uniquely identify each record in the table (and even consider using this index as your primary key index). In rare cases where you would need to use all columns to accomplish that, an auto-incrementing primary key ID is extremely useful, because it may not be performant to create an index over all columns in the table. Even so, the database engine couldn't care less about this ID (e.g. which ones are in use, are not in use, etc.).
Also consider that a 32-bit int column has roughly 4.29 billion possible values (about 2.1 billion if you start at 1 and only count upward). It is quite unlikely that you'll exhaust the supply of integer-based IDs in any short amount of time, which further bolsters the argument for why this sort of thing is a waste of time and resources.

Simulating an identity column within an insert trigger

I have a table for logging that needs a log ID but I can't use an identity column because the log ID is part of a combo key.
create table StuffLogs
(
StuffID int,
LogID int,
Note varchar(255)
)
There is a combo key for StuffID & LogID.
I want to build an insert trigger that computes the next LogID when inserting log records. I can do it for one record at a time (see below to see how LogID is computed), but that's not really effective, and I'm hoping there's a way to do this without cursors.
select @NextLogID = isnull(max(LogID),0)+1
from StuffLogs where StuffID = (select StuffID from inserted)
The net result should allow me to insert any number of records into StuffLogs with the LogID column auto computed.
StuffID LogID Note
123 1 foo
123 2 bar
456 1 boo
789 1 hoo
Inserting another record using StuffID: 123, Note: bop will result in the following record:
StuffID LogID Note
123 3 bop
Unless there is a rigid business reason that requires each LogID to be a sequence starting from 1 for each distinct StuffID, then just use an identity. With an identity, you'll still be able to order rows properly with StuffID+LogID, but you'll not have the insert issues of trying to do it manually (concurrency, deadlocks, locking/blocking, slow inserts, etc.).
Make sure the LogId has a default value of NULL, so that it need not be supplied during insert statements, like it was an identity column.
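CREATE TRIGGER trg_StuffLogs_Insert ON dbo.StuffLogs
INSTEAD OF INSERT
AS
-- "inserted" is read-only, so insert the rows with a computed LogID instead of updating it
INSERT INTO dbo.StuffLogs (StuffID, LogID, Note)
SELECT i.StuffID, ISNULL((SELECT MAX(s.LogID) FROM dbo.StuffLogs s WHERE s.StuffID = i.StuffID), 0)
    + ROW_NUMBER() OVER (PARTITION BY i.StuffID ORDER BY (SELECT NULL)), i.Note
FROM inserted i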
Select Row_Number() Over( Order By LogId ) + IsNull(MaxValue.Id, 0)
From inserted
Cross Join ( Select Max(LogId) As Id From StuffLogs ) As MaxValue
You would need to thoroughly test this and ensure that if two connections were inserting into the table at the same time that you do not get collisions on LogId.

prevent duplicate invoice no. against particular ID

I am unsure how to prevent a duplicate InvoiceNo for a given CompanyId. I have just built an invoice project with fields CompanyId and InvNo. Neither has a unique key, because both CompanyID and InvoiceNo must be allowed to repeat, as shown below:
CompanyID InvNo
1 1
2 1
1 2
3 1
4 1
1 3
Now I want to raise an error (raiserror) when an InvoiceNo is duplicated for a particular CompanyId. How do I implement this? Important: if I create a unique index, duplicate records will not be allowed at all, and they must be allowed except within the same CompanyId.
What you are looking for is a unique constraint composed of both CompanyId and InvNo. This will let you create only one InvoiceNo = 1 for CompanyId = 1 and will automatically raise an error if you try to insert a duplicate. It will also let you insert InvoiceNo = 1 for CompanyId = 2, thereby (hopefully) satisfying your requirements.
This is how I would do it in SQL Server
ALTER TABLE YourTableName
ADD CONSTRAINT InvoiceIdMustBeUniquePerCompany UNIQUE (CompanyId, InvNo)
Your question is not all that straightforward, but assuming you're asking what I think you're asking, it goes something like this...
You need a table called NextInvoiceNumber, comprising CompanyID and NextInvoiceNo. Creating a new Company should create a new NextInvoiceNumber row, so perhaps use an insert trigger on your Companies table for that...
Write a function to get the next invoice number for a specific company, and then increment the value in the NextInvoiceNumber table (all inside a common transaction).
So, in pseudo code, something like
function GetNextInvoiceNo(CompanyIDCode){
begin transaction;
result = Select NextInvoiceNo from NextInvoiceNumber where CompanyID = CompanyIDCode;
update NextInvoiceNumber set NextInvoiceNo = NextInvoiceNo + 1 where CompanyID = CompanyIDCode;
commit transaction;
return result;
}
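This routine ideally belongs on your database server. In SQL Server it has to be a stored procedure rather than a UDF, because a user-defined function cannot modify table data.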
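A sketch of the pseudo code above as a T-SQL stored procedure. The procedure name and parameter names are made up, and the NextInvoiceNumber table is assumed to exist as described. A single UPDATE both reads and increments the counter, so no separate SELECT is needed:

```sql
CREATE PROCEDURE dbo.GetNextInvoiceNo
    @CompanyID int,
    @InvoiceNo int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    -- "@var = column, column = expression" assigns the pre-update value to @var,
    -- so the caller receives the number that was current before the increment.
    UPDATE dbo.NextInvoiceNumber
    SET @InvoiceNo = NextInvoiceNo,
        NextInvoiceNo = NextInvoiceNo + 1
    WHERE CompanyID = @CompanyID;
END
GO

-- Usage: fetch the next invoice number for company 1.
DECLARE @no int;
EXEC dbo.GetNextInvoiceNo @CompanyID = 1, @InvoiceNo = @no OUTPUT;
SELECT @no AS NextInvoiceNo;
```

Because the read and the increment happen in one statement, two concurrent callers cannot receive the same number.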
From the sample data it appears that the combination of CompanyID + InvNo is unique. If that's true you can create a key on those two fields and upon an attempt to insert a duplicate InvNo that has already been used for a particular CompanyID an error would be thrown.
create table Invoice
(
CompanyID int,
InvNo int,
Primary Key(CompanyID, InvNo)
)

Is it possible to INSERT SELECT a collection of aggregate values, then Update another table based on the @@IDENTITY values of the inserts you just made?

I'll begin by admitting that my problem is most likely the result of bad design since I can't find anything about this elsewhere. That said, let's get dirty.
I have an Activities table and an ActivitySegments table. The activities table looks something like:
activityid (ident) | actdate (datetime) | actduration (datetime) | ticketnumber (numeric) |
ActivitySegments looks something like
segmentid (ident) | ticketid (numeric) | activityid (numeric) | startdate | starttime | enddate | endtime
This is a time tracking function of an intranet. The "old way" of doing things is just using the activity table. They want to be able to track individual segments of work throughout the day with a start/stop mechanism and have them roll up into records in the activities table. The use case is the user should be able to select any/all segments that they've worked on that day and have them be grouped by ticketid and inserted into the activity table. I have that working. I'm sending a string of comma separated values that correspond to segmentids to a sproc that puts them in a temp table. So I have the above two tables and a temp table with one column of relevant segmentids. Can't they all just get along?
What I need is to take these passed activitysegment IDs, group them by ticket number and sum the duration worked on each ticket (I already have the SQL for that). Then insert this dataset into the activities table BUT also get the new activityid (@@identity) and update the activitysegments table with the appropriate value.
In procedural programming I'd for loop the insert, get the @@identity and do something else to figure out which segmentids went into creating that activityid. I'm pretty sure I'm thinking about this all wrong, but the deadline approaches and I've been staring at SQL management studio for two days, wasted sheets of paper and burned through way too many cigarettes. I see SQL for Smarties in my near future, until then, can someone help me?
try this approach:
declare @x table (tableID int not null primary key identity (1,1), datavalue varchar(10) null)
INSERT INTO @x values ('one')
INSERT INTO @x values ('aaaa')
INSERT INTO @x values ('cccc')
declare @y table (tableID int not null primary key , datavalue varchar(10) null)
declare @count int ---------------FROM HERE, see comment
set @count=5;
WITH hier(cnt) AS
(
SELECT 1 AS cnt
UNION ALL
SELECT cnt + 1
FROM hier
WHERE cnt < @count
) -----------------------To HERE, see comment
INSERT INTO @x
(datavalue)
OUTPUT INSERTED.tableID, INSERTED.datavalue
INTO @y
SELECT
'value='+CONVERT(varchar(5),h.cnt)
FROM hier h
ORDER BY cnt DESC
select '@x',* from @x --table you just inserted into
select '@y',* from @y --captured data, including identity
here is output of the SELECTs
     tableID     datavalue
---- ----------- ----------
@x   1           one
@x   2           aaaa
@x   3           cccc
@x   4           value=5
@x   5           value=4
@x   6           value=3
@x   7           value=2
@x   8           value=1
(8 row(s) affected)
     tableID     datavalue
---- ----------- ----------
@y   4           value=5
@y   5           value=4
@y   6           value=3
@y   7           value=2
@y   8           value=1
The "FROM HERE" to "TO HERE" part is just a fancy way to create a set of rows to insert; you can use your own table there instead.
Use @y to drive your updates: join it in an UPDATE ... FROM.
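Applied to the activities scenario, the same OUTPUT ... INTO pattern could look roughly like this. The table variables stand in for the real tables, and the minutes column is a simplified, hypothetical stand-in for the duration logic the question says already exists. The multi-row VALUES syntax needs SQL Server 2008 or later:

```sql
DECLARE @Activities TABLE (activityid int IDENTITY(1,1) PRIMARY KEY, ticketnumber int, minutes int);
DECLARE @Segments   TABLE (segmentid int PRIMARY KEY, ticketid int, activityid int NULL, minutes int);
DECLARE @Selected   TABLE (segmentid int PRIMARY KEY);   -- segment IDs passed in by the user
DECLARE @New        TABLE (activityid int, ticketnumber int);

INSERT INTO @Segments (segmentid, ticketid, minutes) VALUES (1, 100, 30), (2, 100, 15), (3, 200, 60);
INSERT INTO @Selected (segmentid) VALUES (1), (2), (3);

-- One activity row per ticket; OUTPUT captures each new identity with its ticket number.
INSERT INTO @Activities (ticketnumber, minutes)
OUTPUT inserted.activityid, inserted.ticketnumber INTO @New (activityid, ticketnumber)
SELECT s.ticketid, SUM(s.minutes)
FROM @Segments s
JOIN @Selected sel ON sel.segmentid = s.segmentid
GROUP BY s.ticketid;

-- Stamp the selected segments with the activity created for their ticket.
UPDATE s
SET activityid = n.activityid
FROM @Segments s
JOIN @Selected sel ON sel.segmentid = s.segmentid
JOIN @New n ON n.ticketnumber = s.ticketid;

SELECT * FROM @Segments;
```

Since the insert groups by ticket, the ticket number alone is enough to match each new activityid back to the segments that produced it.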
It can be tricky to implement the case of (1) do an insert of ranges and (2) use the identity values generated by them.
One approach is to have some kind of tracking column on the table that generates the Id. So for example add a TransactionGUID (uniqueidentifier) or something to the table that generates the identity you want to capture. When you do insert the rowset to this table you specify a given GUID and can then harvest the set of identity values after the insert completes.
The other common approach is just to do it iteratively like you mentioned.
Probably there is a better way to architect what you want to do, but if you must use your current approach (and if I understand correctly what it is you are doing) then adding the TransactionGUID may be the easiest fix.