How can I auto-generate complex identifiers and ensure uniqueness?

I've been handed a very specific business requirement, and I'm having trouble translating it to work within our database.
We have multiple departments, who all deal with what we call files. Each file has a unique identifier, but how this identifier is assigned depends on the department that the file is associated with.
Departments ABC, DEF and GHI need the system to provide them with the next identifier following a pattern... while departments JKL and MNO need to be able to specify their own identifiers.
This is further compounded by the fact that the generated file numbers are semi complex. Their file numbers follow the pattern:
[a-z]{3,4}\d{2}-\d{4}
(A 3 or 4 letter prefix that corresponds to the department with two digits corresponding to the year, a dash followed by a four digit generated number - i.e. ABC13-0001)
The part before the dash is easy to generate... the file table has a mandatory foreign key reference to a department table that has the prefix column, and I can grab the year just as easily. It's the part after the dash that I can't seem to work out.
It's a four digit identifier. Each department needs the next generated ID to be as sequential to their department as possible (I say as sequential as possible, as I'm aware that even identity specifications can leave gaps). On top of that, we need to reset it back to 0001 each year. All of that, while ensuring that there are no duplicates, which is the most important part of all of this.
So, we only have one file table that is used by all of these departments. As such, to be able to handle JKL and MNO, I have the FileNumber field set to varchar(12) with a unique constraint. They can type in whatever they need, so long as it's unique. That just leaves me with how to generate the unique file numbers for the other departments.
My first instinct is to give the file table a surrogate identity primary key to make sure that, even if something goes wrong with the generated file number, that each record is guaranteed a unique identifier.
Then I would create a single row table that has two columns per department, one for a number and one for a date. The number would be the last used number for the 4 digit identifier suffix for a given department (as an int), and the date would be when it was assigned. Storing the date would let me check if the year has changed since the last id was pulled so that I can assign 1, instead of lastid + 1.
Then an insert trigger on the file table would generate the file id using:
The foreign key to the department table to get the department prefix
CONVERT(VARCHAR, YEAR( GETDATE() ) % 100) to pull the current two digit year
A select to the above described utility table to get the last id + 1 (or resetting to 1 if the current year differs from the year of the last updated date).
The trigger would finally update the utility table with the last used id and date.
In theory, I think that would work... and the unique constraint on the file id column would prevent an insert where the generated file id already exists. But it feels so brittle, and I can foresee that unique constraint being a double edged sword that would possibly prevent a department from creating a new file should the trigger fail to update the utility table. After all, if it doesn't, then the next time a file number is generated, it would try to use the same generated id, and fail. There must be a better way.
My other thought was to have a table per department with just an identity integer column and non-null date field with a default of (getdate())... and have the trigger on the file table insert a new row, and use that id. It would also be responsible for deleting all rows in the given department id table and reset the identity come the new year. It feels more secure to me, but then I have 5 utility tables (1 per department that auto-generate ids) with up to 9999 records, just to generate an id.
Am I missing something simple? Am I going about this the right way? I just can't help but think that there must be an easier way. Am I right to try to make the SQL server responsible for this, or should I try doing this in the desktop application that I will build on top of this database?

So, I think it's possible you're overcomplicating this. It's also possible I'm misunderstanding the requirements. I don't believe you need separate tables or anything, you just need to check the existing IDs and increment them. SQLFiddle doesn't seem to be working right now, but here's a method I came up with in my local DB:
create table dept_ids (
id varchar(12) primary key
)
insert into dept_ids values('ABC13-0001')
insert into dept_ids values('DEF13-0001')
insert into dept_ids values('GHI13-0001')
insert into dept_ids values('JKL13-0001')
insert into dept_ids values('ABC13-0002')
declare
@dept varchar(4)
, @year varchar(2)
, @prefix varchar(8)
, @new_seq int
set @dept = 'ABC'
select @year = right(cast(DATEPART(yy, getdate()) as varchar(4)), 2)
set @prefix = @dept + @year + '-'
select @new_seq = isnull(right(max(id), 4), 0) + 1 from dept_ids where id like @prefix + '%'
select new_id = @prefix + right(replicate('0', 4) + cast(@new_seq as varchar(4)), 4)
This handles the different departments of course, as well as the year scenario by using getdate() and an ISNULL check. For the departments that can enter their own sequence values, you can just add a null check for that parameter and skip generating this value if it's present.
If I'm oversimplifying, feel free to let me know and I'll adjust.
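The null check mentioned above could be sketched roughly as follows. This is only an illustration under a few assumptions: @supplied_id is a hypothetical parameter holding whatever a JKL or MNO user typed in (NULL for the other departments), inline DECLARE initialization needs SQL Server 2008 or later, and the UPDLOCK, HOLDLOCK hints are an addition that is not part of the answer, meant to stop two sessions from reading the same maximum before either has inserted.

```sql
BEGIN TRANSACTION;

DECLARE @supplied_id varchar(12) = NULL;  -- hypothetical: user-typed file number, or NULL
DECLARE @dept varchar(4) = 'ABC';
DECLARE @prefix varchar(8) = @dept + RIGHT(CAST(YEAR(GETDATE()) AS varchar(4)), 2) + '-';
DECLARE @new_id varchar(12);

IF @supplied_id IS NOT NULL
    -- JKL / MNO: use the identifier exactly as supplied
    SET @new_id = @supplied_id;
ELSE
    -- ABC / DEF / GHI: next number for this department and year (0001 if none yet)
    SELECT @new_id = @prefix
         + RIGHT('0000' + CAST(ISNULL(MAX(CAST(RIGHT(id, 4) AS int)), 0) + 1 AS varchar(4)), 4)
    FROM dept_ids WITH (UPDLOCK, HOLDLOCK)
    WHERE id LIKE @prefix + '%';

-- The primary key on id still rejects a duplicate if anything slips through
INSERT INTO dept_ids (id) VALUES (@new_id);

COMMIT TRANSACTION;
```

Because the year is part of the prefix, the first file of a new year finds no matching rows and starts again at 0001 without any explicit reset.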

Related

SQL Server - Reset custom identity column

I have a table with two columns ID and SubDetail. I have created custom identity like 2018XX for the ID columns with 2018 as current year and XX as a number and it will start from "01". I have searched for some topic about "Reset identity field - SQL Server" but It only showed how to reset only number identity field. I'm confused that is there anyway to reset custom identity field.
Here is my SQL query code:
CREATE PROCEDURE spAddSubWareHouse
(@SubDetail VARCHAR(50))
AS
BEGIN
DECLARE @maxDetailID INT
SELECT @maxDetailID = MAX(CAST(RIGHT(ID, 2) AS INT))
FROM SubWareHouse
IF (@maxDetailID IS NULL)
SET @maxDetailID = 1
ELSE
SET @maxDetailID += 1
--insert
INSERT INTO SubWareHouse
VALUES (CAST(YEAR(GETDATE()) AS VARCHAR) + RIGHT('00' + CAST(@maxDetailID AS VARCHAR), 2), @SubDetail)
END

Don't bother doing this. Simply have a numeric identity column defined as:
id int identity(1, 1) primary key
You can then store the year as a separate column, because that appears to be information that you want.
Why do it this way? There are many reasons. First, identity is built into the database. So, identity columns are guaranteed to be unique even when multiple applications are doing inserts at the same time.
Second, integers are more efficient (slightly) for foreign key references.
Third, the locking required to do what you really want is very cumbersome and could significantly slow down your system. Plus, any implementation to prevent duplicates could have a bug -- so why not use the built-in mechanisms?
If you need to get the sequence number for a given year, it is easy enough using row_number():
select row_number() over (partition by year order by id) as year_sequence_number
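A fuller sketch of that idea, assuming a hypothetical redefinition of SubWareHouse with an identity id and a separate [year] column (the column names and the display_id alias are illustrative, not from the question):

```sql
-- Hypothetical table layout following the advice above
CREATE TABLE SubWareHouse (
    id        INT IDENTITY(1, 1) PRIMARY KEY,
    [year]    INT NOT NULL DEFAULT (YEAR(GETDATE())),
    SubDetail VARCHAR(50)
);

-- Derive the 2018XX-style value at query time instead of storing it
SELECT id,
       [year],
       SubDetail,
       CAST([year] AS VARCHAR(4))
       + RIGHT('00' + CAST(ROW_NUMBER() OVER (PARTITION BY [year] ORDER BY id) AS VARCHAR(10)), 2)
         AS display_id
FROM SubWareHouse;
```

The numbering restarts at 01 for each year on its own, since ROW_NUMBER() counts within each [year] partition. Two caveats: the value is computed when queried, so deleting a row shifts the numbers of later rows in the same year, and the two-digit padding only holds up to 99 rows per year.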

Adding Row in existing table (SQL Server 2005)

I want to add another row in my existing table and I'm a bit hesitant if I'm doing the right thing because it might skew the database. I have my script below and would like to hear your thoughts about it.
I want to add another row for 'Jane' in the table, which will be 'SKATING" in the ACT column.
Table: [Emp_table].[ACT].[LIST_EMP]
My script is:
INSERT INTO [Emp_table].[ACT].[LIST_EMP]
([ENTITY],[TYPE],[EMP_COD],[DATE],[LINE_NO],[ACT],[NAME])
VALUES
('REG','EMP','45233','2016-06-20 00:00:00:00','2','SKATING','JANE')
Will this do the trick?

Your statement looks ok. If the database has a problem with it (for example, due to a foreign key constraint violation), it will reject the statement.
If any of the fields in your table are numeric (and not varchar or char), just remove the quotes around the corresponding field. For example, if emp_cod and line_no are int, insert the following values instead:
('REG','EMP',45233,'2016-06-20 00:00:00:00',2,'SKATING','JANE')

Inserting records into a database has always been the most common reason why I've lost so much of the hair on my head!
SQL is great when it comes to SELECTs or even UPDATEs, but when it comes to INSERTs it's like someone from another planet came into the SQL standards committee and managed to get their way of doing it into the final SQL standard!
If your table does not have a primary key that is generated automatically on every insert, then you have to write the code to avoid duplicates yourself.
Start by writing a normal SELECT to check that the record(s) you're going to add don't already exist. But as Robert implied, your table may not have a primary key because it looks like a LOG table to me. So insert away!
If it does require a unique record every time, then I strongly suggest you create a primary key for the table, either an auto-generated one or a combination of your existing columns.
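As a sketch of the "combination of your existing columns" option, assuming the first five columns together identify a row and are all declared NOT NULL (the constraint name PK_LIST_EMP is made up for illustration):

```sql
-- Composite primary key over the columns assumed to identify a row.
-- This fails if the table already contains duplicates or NULLs in these columns.
ALTER TABLE [Emp_table].[ACT].[LIST_EMP]
ADD CONSTRAINT PK_LIST_EMP
    PRIMARY KEY ([ENTITY], [TYPE], [EMP_COD], [DATE], [LINE_NO]);
```

With a key like this in place, the database itself rejects a second row with the same five values, so the application no longer has to rely on a prior SELECT alone.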
Assuming the first five combined columns make a unique key, this select will determine if your data you're inserting does not already exist...
SELECT COUNT(*) AS FoundRec FROM [Emp_table].[ACT].[LIST_EMP]
WHERE [ENTITY] = wsEntity AND [TYPE] = wsType AND [EMP_COD] = wsEmpCod AND [DATE] = wsDate AND [LINE_NO] = wsLineno
You will have to replace the wsXXX names with direct values, or DECLARE them as variables earlier in your script.
If you run this alone and receive a value of 1 or more, then the data already exists in your table, at least for those first 5 columns. A true duplicate test would require you to test EVERY column in your table, but it should give you an idea.
INSERT ... VALUES does not accept a WHERE clause, so to do it all as one statement, use INSERT ... SELECT with a NOT EXISTS check:
INSERT INTO [Emp_table].[ACT].[LIST_EMP]
([ENTITY],[TYPE],[EMP_COD],[DATE],[LINE_NO],[ACT],[NAME])
SELECT 'REG','EMP','45233','2016-06-20 00:00:00:00','2','SKATING','JANE'
WHERE NOT EXISTS (SELECT 1 FROM [Emp_table].[ACT].[LIST_EMP]
WHERE [ENTITY] = 'REG' AND [TYPE] = 'EMP' AND
[EMP_COD] = '45233' AND [DATE] = '2016-06-20 00:00:00:00' AND
[LINE_NO] = '2')
Here the wsXXX placeholders have already been replaced with the values being inserted.
I hope that made sense.

Create column from other columns in Database

I have a table name: test
ID | Prefix | ACCID
ID's type is INTEGER which is selected from ID_SEQ
Prefix's type is VARCHAR(6)
ACCID is the combination of Prefix + ID
I want to auto-create ACCID when I insert the ID and Prefix value such as
INSERT INTO TEST (PREFIX) VALUES ('A01407V');
and the database store the ACCID as 'A01407V000001'
I create the sequence as
CREATE SEQUENCE ID_SEQ AS INT MAXVALUE 999999 CYCLE;
How to implement SQL statement to produce this result?
Thank you for all solutions and suggestions.
Ps. I use Apache Derby as my SQL Server

As documented in the manual, Derby supports generated columns (since Version 10.5)
The real problem is the formatting of a number with leading zeros as Derby has no function for that.
If you really, really think you need to store a value that can always be determined by the values already stored in the table, you can use something like this:
create table test
(
id integer,
prefix varchar(6),
accid generated always as (prefix||substr('000000', 1, 6 - length(rtrim(char(id))))||rtrim(char(id)))
);
The expression substr('000000', 1, 6 - length(rtrim(char(id))))||rtrim(char(id)) is just a complicated way to format the ID with leading zeros.
I would highly recommend to not store this value though. It is much cleaner to create a view that shows this value if you do need access to this in SQL.
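A view along those lines might look like this. It reuses the padding expression from the generated column above; the view name test_with_accid is made up for illustration, and it assumes the test table stores only id and prefix:

```sql
-- Derby: expose ACCID through a view instead of storing it
create view test_with_accid as
select id,
       prefix,
       prefix || substr('000000', 1, 6 - length(rtrim(char(id)))) || rtrim(char(id)) as accid
from test;
```

Queries that need the formatted value can then select from test_with_accid, while the base table keeps only the data that cannot be derived.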

You can use a computed column.
A computed column is based on other columns in the table. Its data can either be physically stored or not; either way, the table keeps the value up to date automatically.
Syntax (SQL Server):
columnname AS expression [PERSISTED]
--PERSISTED will make it physically saved, otherwise it will be calculated every time.
We can create indexes on computed columns.
Add the following to the table's CREATE script:
ACCID AS Prefix + RIGHT('000000' + CAST(ID AS VARCHAR(6)), 6) PERSISTED

SQL Server Unique Composite Key of Two Field With Second Field Auto-Increment

I have the following problem, I want to have Composite Primary Key like:
PRIMARY KEY (`base`, `id`);
for which when I insert a base the id to be auto-incremented based on the previous id for the same base
Example:
base id
A 1
A 2
B 1
C 1
Is there a way when I say:
INSERT INTO table(base) VALUES ('A')
to insert a new record with id 3 because that is the next id for base 'A'?
The resulting table should be:
base id
A 1
A 2
B 1
C 1
A 3
Is it possible to do this entirely in the DB? If done programmatically, it could cause race conditions.
EDIT
The base currently represents a company, the id represents invoice number. There should be auto-incrementing invoice numbers for each company but there could be cases where two companies have invoices with the same number. Users logged with a company should be able to sort, filter and search by those invoice numbers.

Ever since someone posted a similar question, I've been pondering this. The first problem is that DBs don't provide "partitionable" sequences (that would restart/remember based on different keys). The second is that the SEQUENCE objects that are provided are geared around fast access, and can't be rolled back (i.e., you will get gaps). This essentially rules out using a built-in utility... meaning we have to roll our own.
The first thing we're going to need is a table to store our sequence numbers. This can be fairly simple:
CREATE TABLE Invoice_Sequence (base CHAR(1) PRIMARY KEY CLUSTERED,
invoiceNumber INTEGER);
In reality the base column should be a foreign-key reference to whatever table/id defines the business(es)/entities you're issuing invoices for. In this table, you want entries to be unique per issued-entity.
Next, you want a stored proc that will take a key (base) and spit out the next number in the sequence (invoiceNumber). The set of keys necessary will vary (ie, some invoice numbers must contain the year or full date of issue), but the base form for this situation is as follows:
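CREATE PROCEDURE Next_Invoice_Number @baseKey CHAR(1),
@invoiceNumber INTEGER OUTPUT
AS
-- MERGE's OUTPUT clause cannot assign to a variable directly, so capture it in a table variable
DECLARE @issued TABLE (invoiceNumber INTEGER);
MERGE INTO Invoice_Sequence AS Stored
USING (VALUES (@baseKey)) AS Incoming(base)
ON Incoming.base = Stored.base
WHEN MATCHED THEN UPDATE SET Stored.invoiceNumber = Stored.invoiceNumber + 1
-- First invoice for this base: start the sequence at 1 rather than NULL
WHEN NOT MATCHED BY TARGET THEN INSERT (base, invoiceNumber) VALUES (Incoming.base, 1)
OUTPUT INSERTED.invoiceNumber INTO @issued;
SELECT @invoiceNumber = invoiceNumber FROM @issued;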
Note that:
You must run this in a serialized transaction
The transaction must be the same one that's inserting into the destination (invoice) table.
That's right, you'll still get blocking per-business when issuing invoice numbers. You can't avoid this if invoice numbers must be sequential, with no gaps - until the row is actually committed, it might be rolled back, meaning that the invoice number wouldn't have been issued.
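Calling the procedure by hand, with no trigger on Invoice, might look like the sketch below. It assumes an Invoice table with at least base and invoiceNumber columns, as used elsewhere in this answer; the point is that the sequence update and the invoice insert share one serializable transaction.

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

DECLARE @invoiceNumber INTEGER;

-- Takes (and holds) the lock on the Invoice_Sequence row for base 'A'
EXEC Next_Invoice_Number @baseKey = 'A', @invoiceNumber = @invoiceNumber OUTPUT;

-- Same transaction: if this insert fails and rolls back, the number is released too
INSERT INTO Invoice (base, invoiceNumber)
VALUES ('A', @invoiceNumber);

COMMIT TRANSACTION;
```

Other sessions issuing invoices for base 'A' wait at the EXEC until this transaction commits or rolls back, which is the per-business blocking described above.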
Now, since you don't want to have to remember to call the procedure for the entry, wrap it up in a trigger:
CREATE TRIGGER Populate_Invoice_Number ON Invoice INSTEAD OF INSERT
AS
BEGIN
DECLARE @base CHAR(1), @invoiceNumber INTEGER
-- Handles single-row inserts only; a multi-row insert would need a loop over Inserted
SELECT @base = base FROM Inserted
EXEC Next_Invoice_Number @base, @invoiceNumber OUTPUT
INSERT INTO Invoice (base, invoiceNumber)
VALUES (@base, @invoiceNumber)
END
(obviously, you have more columns, including others that should be auto-populated - you'll need to fill them in)
...which you can then use by simply saying:
INSERT INTO Invoice (base) VALUES('A');
So what have we done? Mostly, all this work was about shrinking the number of rows locked by a transaction. Until this INSERT is committed, there are only two rows locked:
The row in Invoice_Sequence maintaining the sequence number
The row in Invoice for the new invoice.
All other rows for a particular base are free - they can be updated or queried at will (deleting information out of this kind of system tends to make accountants nervous). You probably need to decide what should happen when queries would normally include the pending invoice...

You can use a trigger on insert and assign the next value by taking max(id) filtered on "base", which is "A" in this case.
That gives max(id) as 2; then increment it (max(id) + 1) and write the new value to the "id" field before the row is inserted. Note that SQL Server has no BEFORE INSERT trigger; an INSTEAD OF INSERT trigger serves the same purpose.
I think this may help you:
MSSQL Triggers: http://msdn.microsoft.com/en-in/library/ms189799.aspx

Test Table
CREATE TABLE MyTable
( base CHAR(1),
id INT
)
GO
Trigger Definition
CREATE TRIGGER dbo.tr_Populate_ID
ON dbo.MyTable
INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO MyTable (base,id)
SELECT i.base, ISNULL(MAX(mt.id),0) +1 AS NextValue
FROM inserted i left join MyTable mt
on i.base = mt.base
GROUP BY i.base
END
Test
Execute the following statement multiple times and you will see the next values available in that group will be assigned to ID.
INSERT INTO MyTable (base) VALUES
('A'),
('B'),
('C')
GO
SELECT * FROM MyTable
GO
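One caveat: the trigger reads MAX(id) and inserts without taking any extra locks, so two concurrent inserts for the same base could compute the same next value. As a safety net (an addition, not part of the answer above; the constraint name is made up), a unique constraint makes the second insert fail instead of silently storing a duplicate:

```sql
-- Reject duplicate (base, id) pairs even if two sessions race in the trigger
ALTER TABLE MyTable
ADD CONSTRAINT UQ_MyTable_base_id UNIQUE (base, id);
GO
```

The caller that loses the race gets a constraint violation and can simply retry the insert.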

Generating Random ID's in Microsoft SQL Server

I want to create a table in sql server and fill it up with data (people's info) every person should have a unique ID different than the auto incremented ID's by sql server
For example i need the ID for the first person inserted like this: 2016xxxx
how to fix the 2016 and randomly generate the numbers after that to be filled instead of xxxx
should i use a regular expression ?

You can also create a computed column like below
CREATE TABLE tableName
(
PkAutoId INT PRIMARY KEY IDENTITY(1,1),
PersonUniqueNo AS (CAST(DATEPART(YEAR,GETDATE()) AS VARCHAR) + RIGHT(RIGHT(CAST(RAND() AS VARCHAR),4) + CAST(PkAutoId AS VARCHAR),4))
)
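The computed column "PersonUniqueNo" is an 8-digit number made up of the current year followed by 4 digits concatenated from a random number and the primary key ID, for a total length of 8 as asked. Note that the column is not persisted, so RAND() and GETDATE() are re-evaluated each time it is queried; the value will not stay the same between queries, and uniqueness is not guaranteed.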

You could create a function that would get the next value for you and use that instead of an AUTO_INCREMENT field.
I wouldn't recommend it, though. You shouldn't format the data like that before inserting it. That sort of thing should be done on the way out, preferably by the front-end code. Or you can just write a query and create a view ...
However if you must do that here is the complete answer with the code:
Is there a way to insert an auto-incremental primary id with a prefix in mysql database?
