Generate unique random integer in database column - sql

My application creates coupons that each need a unique barcode number. The number must be a positive integer between 6 and 12 digits long, and because each one identifies a single coupon, it must be unique. I can't simply increment the barcode numbers by 1, because that would make it easy for attackers to guess other coupon barcodes.
If I have a coupon db table, how can I generate this random barcode number and guarantee uniqueness?

This will give you a random non-negative number of up to 12 digits, with very few collisions.
select abs(convert(bigint, convert(varbinary(max), newid())) % 1000000000000)
You need to test for collisions and retry when one occurs, as well as discard numbers that end up with fewer than 6 digits.
EDIT
To use the lowest lengths first, you won't be able to use a truly random number generator. This is because once you get to 95% of the 6-digit range, the collisions would be so high that the program spends all its time trying and retrying to get a unique number that hasn't been used yet. Once you get to only one number remaining, the program can wait forever and never "generate" that number. So, to fulfil "lowest lengths first" you would actually have to generate ALL numbers into a table and then row number (order by len(num), newid()) them randomly, then sequentially draw them out.
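A minimal sketch of the "generate everything up front, then draw sequentially" idea, written in Python rather than T-SQL so it can be run anywhere. The function name `build_pool` and the tiny 1 to 2 digit demo range are assumptions for illustration; the real 6-digit block alone would hold 900,000 values.

```python
import random

def build_pool(min_digits, max_digits, rng):
    """Return every number with min_digits..max_digits digits,
    shortest lengths first, shuffled randomly within each length."""
    pool = []
    for digits in range(min_digits, max_digits + 1):
        # All numbers with exactly `digits` digits (no leading zeros).
        block = list(range(10 ** (digits - 1), 10 ** digits))
        rng.shuffle(block)
        pool.extend(block)
    return pool

# Demo with a tiny range so the whole pool fits comfortably in memory.
pool = build_pool(1, 2, random.Random())
draw = iter(pool)          # drawing sequentially never collides
first_code = next(draw)    # always a 1-digit number in this demo
```

In a database, the same pool would be a table with a sequence column, and "drawing" would mean taking the lowest unused sequence value.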
To 0-pad to 12 digits, use
select right('000000000000'
+right(-convert(bigint, convert(varbinary(max), newid())),12),12)

Might sound lame, but depending on the # of values you're going to need, you could just put a unique constraint on the column, and update each row with a random number (with the 6-12 digits) and loop until it doesn't fail. 12 digits is a lot of values, so you're probably not going to get many collisions.
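The loop described above might look like the following sketch. It uses Python's standard `sqlite3` module as a stand-in for the real database; the `coupon` table layout, the `insert_coupon` name, and the retry limit are assumptions for illustration, not part of the original answer.

```python
import random
import sqlite3

def insert_coupon(conn, rng, max_attempts=20):
    """Insert a coupon with a random 6-12 digit barcode, retrying on collision."""
    for _ in range(max_attempts):
        barcode = rng.randint(100_000, 999_999_999_999)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO coupon (barcode) VALUES (?)", (barcode,))
            return barcode
        except sqlite3.IntegrityError:
            continue  # the unique constraint rejected it; draw another number
    raise RuntimeError("no free barcode found")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coupon (id INTEGER PRIMARY KEY, barcode INTEGER NOT NULL UNIQUE)"
)
# SystemRandom draws from the OS entropy source, which suits hard-to-guess codes.
codes = [insert_coupon(conn, random.SystemRandom()) for _ in range(1000)]
```

Because the draw is uniform over the whole range, almost all barcodes will be 11 or 12 digits long; the unique constraint, not the generator, is what guarantees uniqueness.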

Related

Generate a progressive number when new records are inserted (some records need to have the same number)

The title may be a little confusing, so let me explain the problem. I have a pipeline that loads new records daily. These records contain sales, and the key is <date, location, ticket, line>. The data is loaded into a Redshift table and then exposed through a view that is read by another system. That system has a limit: its ticket column is a varchar(10), but the ticket is a 30-character string. If the system takes only the first 10 characters, it will generate duplicates. The ticket number can be a "fake" number; it doesn't matter if it differs from the real one. So I'm thinking of adding a new column to the Redshift table that contains a progressive number. The problem is that I cannot use an identity column, because records belonging to the same ticket must have the same "progressive number". I will then expose this new column (ticket_id) instead of the original one.
That is what I want:
day         location  ticket    line  amount  ticket_id
12/12/2020  67        123...GH  1     10      1
12/12/2020  67        123...GH  2     5       1
12/12/2020  67        123...GH  3     23      1
12/12/2020  23        123...GB  1     13      2
12/12/2020  23        123...GB  2     45      2
...         ...       ...       ...   ...     ...
12/12/2020  78        123...AG  5     100     153
The next day, when new data is loaded, I want to start with ticket_id 154, and so on.
Each row has a column that specifies the instant at which it was inserted. Rows inserted on the same day have the same insert_time.
My solution is:
First, insert the records with ticket_id computed as a dense_rank. But each time I load new records (so each day) the ticket_id starts from one again, so...
...then update the rows just inserted, setting ticket_id = ticket_id + the maximum ticket_id found among rows where insert_time != max(insert_time).
Do you think there is a better solution? It would be very nice if a hash function existed that takes <day, location, ticket> as input and returns a value of at most 10 characters.
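To make the two-step idea concrete, here is a small Python sketch of the numbering logic only (not Redshift SQL). The function name `assign_ticket_ids` and the dictionary row format are assumptions for illustration, and ids are handed out in order of first appearance rather than in the sorted order a real dense_rank would use.

```python
def assign_ticket_ids(new_rows, current_max):
    """Give each distinct (day, location, ticket) the next free id after current_max.

    Rows that share the same key receive the same ticket_id.
    """
    ids = {}
    numbered = []
    for row in new_rows:
        key = (row["day"], row["location"], row["ticket"])
        if key not in ids:
            # Continue the numbering from the previous load's maximum.
            ids[key] = current_max + len(ids) + 1
        numbered.append({**row, "ticket_id": ids[key]})
    return numbered

# Hypothetical second-day load, continuing after ticket_id 153.
demo = assign_ticket_ids(
    [
        {"day": "13/12/2020", "location": 67, "ticket": "AAA", "line": 1},
        {"day": "13/12/2020", "location": 67, "ticket": "AAA", "line": 2},
        {"day": "13/12/2020", "location": 23, "ticket": "BBB", "line": 1},
    ],
    current_max=153,
)
```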
So from the comments it sounds like you cannot add a dimension table to look up the number or 10-character string that identifies each ticket, as this would be a data model change. A dimension table is still likely the best and most accurate way to do this.
You asked about a hash function to do this, and there are several. But first let's talk about hashes: they take strings of varying length and make a signature out of them. Since this process can significantly reduce the number of characters, there is a possibility that two different strings will generate the same hash. The longer the hash value, the lower the odds of such a collision, but the odds are never zero. Since you can only have 10 characters, that limit sets the odds of a hash collision.
The md5() function on Redshift will take a string and make a 32-character string (base-16 characters) out of it. md5(day::text || location || ticket::text) will make such a hash out of the columns you mentioned. This process can make 16^32 possible different strings, which is a big number.
But you only want a string of 10 characters. The good news is that hash functions like md5() spread the differences between strings across the whole output, so you can just pick any 10 characters to use. Doing this will reduce the number of unique values to 16^10, or about 1.1 trillion: still a big number, but if you have billions of rows you could see a collision. One way to improve this would be to base64-encode the md5() output and then truncate to 10 characters. Doing this will require a UDF but would increase the number of possible hashes to 1.1E18, a million times larger. If you want the output to be an integer, you can convert hex strings to integers with strtol(), but a 10-digit number only has 10 billion possible values.
So if you are sure you want to use a hash this is quite possible. Just remember what a hash does.
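As a rough illustration of the two truncation options, here is a Python sketch using the standard `hashlib` and `base64` modules. The function names and the `|` separator between fields are assumptions added for the example; the base64 variant encodes the raw 16-byte digest (not the hex text), which is what gives 64 possible values per character.

```python
import base64
import hashlib

def ticket_hash_hex(day, location, ticket):
    """First 10 hex characters of the MD5 hash: 16^10 possible values."""
    raw = f"{day}|{location}|{ticket}".encode("utf-8")
    return hashlib.md5(raw).hexdigest()[:10]

def ticket_hash_b64(day, location, ticket):
    """First 10 base64 characters of the raw MD5 digest: 64^10 possible values."""
    raw = f"{day}|{location}|{ticket}".encode("utf-8")
    return base64.b64encode(hashlib.md5(raw).digest()).decode("ascii")[:10]

hex_id = ticket_hash_hex("2020-12-12", 67, "123GH")
b64_id = ticket_hash_b64("2020-12-12", 67, "123GH")
```

Both functions are deterministic, so all lines of the same ticket get the same value without any lookup, but neither can rule out collisions.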

SQLite3 Order by highest/lowest numerical value

I am trying to do a query in SQLite3 to order a column by numerical value. Instead of getting the rows ordered by the numerical value of the column, the rows are ordered alphabetically by the first digit's numerical value.
For example in the query below 110 appears before 2 because the first digit (1) is less than two. However the entire number 110 is greater than 2 and I need that to appear after 2.
sqlite> SELECT digit,text FROM test ORDER BY digit;
1|one
110|One Hundred Ten
2|TWO
3|Three
sqlite>
Is there a way to make 110 appear after 2?
It seems like digit is stored as a string, not as a number. You need to convert it to a number to get the proper ordering. A simple approach uses:
SELECT digit, text
FROM test
ORDER BY digit + 0
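The difference is easy to reproduce. The sketch below rebuilds the question's table in an in-memory SQLite database through Python's `sqlite3` module, assuming the digit column was declared as TEXT; the explicit CAST variant is an alternative added for comparison, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE test (digit TEXT, "text" TEXT)')
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("1", "one"), ("110", "One Hundred Ten"), ("2", "TWO"), ("3", "Three")],
)

def digits(order_by):
    """Return the digit column in the order produced by the given ORDER BY."""
    query = f"SELECT digit FROM test ORDER BY {order_by}"
    return [row[0] for row in conn.execute(query)]

as_text = digits("digit")                      # string comparison
as_number = digits("digit + 0")                # arithmetic forces a numeric value
as_cast = digits("CAST(digit AS INTEGER)")     # explicit conversion
```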

CHECK constraint for string to contain a certain amount of digits as well as certain digits. (Oracle SQL)

I have a column, number, where I need a length constraint (say 11 digits) as well as a check for certain digits. Let us say the first four digits need to be '1234' and the fifth must be in the range 6-9. I am using a varchar type, so I also need to assert that every character is a digit. With some research, here is what I have been able to come up with:
CHECK (REGEXP_LIKE(number, '^1234\d{7}$'))
In this way I have been able to check the number of digits (11), the first 4 starting numbers and number values. However, I cannot fit the fifth number which needs to be between 6 and 9 into this expression.
Thanks in advance
Try this.
CHECK (REGEXP_LIKE(number, '^1234[6-9]\d{6}$'))
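The pattern can be sanity-checked outside the database. This sketch uses Python's `re` module rather than Oracle's regex engine, so it only approximates REGEXP_LIKE; the `is_valid` helper is a name made up for the example.

```python
import re

# 4 fixed digits + 1 digit in 6-9 + 6 more digits = 11 digits in total.
# re.ASCII keeps \d limited to 0-9, as a barcode-style column would expect.
PATTERN = re.compile(r"^1234[6-9]\d{6}$", re.ASCII)

def is_valid(value):
    """True when value is exactly 11 digits, starts with 1234, fifth digit 6-9."""
    return PATTERN.fullmatch(value) is not None

checks = {
    "12346000000": is_valid("12346000000"),    # fifth digit 6: accepted
    "12345000000": is_valid("12345000000"),    # fifth digit 5: rejected
    "1234600000": is_valid("1234600000"),      # only 10 digits: rejected
    "1234600000a": is_valid("1234600000a"),    # non-digit: rejected
}
```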

Extract Number from VARCHAR

I have a [Comment] column of type VARCHAR(255) in a table that I'm trying to extract numbers from. The numbers will always be 12 digits, but aren't usually in the same place. Some of them will also have more than one 12 digit number, which is fine, but I only need the first.
I've tried using PATINDEX('%[0-9]%',[Comment]), but I can't figure out how to set a requirement of 12 digits.
An example of the data I'm working with is below:
Combined 4 items for $73.05 with same claim no. 123456789012 as is exceeding financial limits
Consolidated remaining amount of claim numbers, 123456789013, 123456789014, 123456789015, 123456789016 due to financial limits
You can just use 12 [0-9]'s in a row:
PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%',[Comment])
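For comparison, the same "first run of 12 digits" search can be sketched in Python with the `re` module, using the two sample comments from the question. The helper name `first_12_digits` is an assumption; like the PATINDEX pattern, it would also match the first 12 digits of a longer digit run.

```python
import re

def first_12_digits(comment):
    """Return the first run of 12 consecutive digits in comment, or None."""
    match = re.search(r"[0-9]{12}", comment)
    return match.group(0) if match else None

single = first_12_digits(
    "Combined 4 items for $73.05 with same claim no. 123456789012 "
    "as is exceeding financial limits"
)
several = first_12_digits(
    "Consolidated remaining amount of claim numbers, 123456789013, "
    "123456789014, 123456789015, 123456789016 due to financial limits"
)
missing = first_12_digits("No claim number here, only $73.05")
```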

How to get Next 4 digit number from Table (Contains 3,4, 5 & 6, digit long numbers)

I found a good method of getting the next 4-digit number:
How to find next free unique 4-digit number
But in my case I want to get the next available 4- or 5-digit number.
This will change depending on the user's request. These numbers are not key ID columns, but they are essential to the tagging structure for business purposes.
Currently I use a table adapter query, but how would I write the query itself?
I suppose I could do a long iterative loop through all values until I see a 4-digit one.
But I'm trying to think of something more efficient.
Function GetNextAvailableNumber(NumofDigits) as Long
'SQL Code Here ----
'Query Number Table
Return Long
End Function
Here's my current SQL:
'This Queries my View
SELECT MIN([Number]) AS Expr1
FROM LineNumbersNotUsed
'This is my View SQL
SELECT Numbers.Number
FROM Numbers
WHERE (((Exists (Select * From LineList Where LineList.LineNum = Numbers.Number))=False))
ORDER BY Numbers.Number;
Numbers is the list of all available numbers from 0 to 99999, basically what's available to use.
LineList is my final master table where I keep the long and all the other relevant business information.
Hopefully this makes sense.
Gosh you guys are so tough on new guys.
I accidentally hit the enter key, and the question posted and I instantly get -3 votes.
Give a new guy a break will you! Please.
I apologize in advance in case I overlooked something in your question. Using your design, won't a query like this return the next unused 4 digit number?
SELECT MIN([Number]) AS next_number
FROM LineNumbersNotUsed
WHERE
[Number] > 999
AND [Number] < 10000;
This approach is not adequate with multiple concurrent users, but you didn't indicate that is an issue for you.
The question you linked to explains that what you need is a table with 2 fields:
Number  InUse
0000    No
0001    No
0002    Yes
0003    No
0005    Yes
Whenever a number is used/released, the table must be updated to set InUse to Yes/No.
Maybe I'm missing something, but from your explanation, and the SQL code you show us, it seems that you only have a table with a single field containing all numbers from 0 to 100000.
If that's the case, I don't see the usefulness of that table at all.
If I were you, and if I understand your need correctly, what you want is something like this:
First of all, create the table as above, with all running numbers from 0 to 100000, and a field for confirming if that number is used or not.
Initialise the InUse field with all the numbers already taken in your LineList table, something like:
UPDATE Numbers SET InUse = True
WHERE Numbers.Number IN (SELECT LineNum FROM LineList)
Write a function ReserveNumber(NumOfDigits as Integer) As Long to find and reserve a 4-digit or 5-digit free number following this logical sequence:
Depending on NumOfDigits (4 or 5) get the result of one of the queries as LowestNumber:
SELECT Min(Number) FROM Numbers WHERE Number < 10000 AND NOT InUse
SELECT Min(Number) FROM Numbers WHERE Number >= 10000 AND NOT InUse
Reserve that particular number to ensure it's not going to be used again:
UPDATE Numbers SET InUse = True WHERE Number = #LowestNumber
Return LowestNumber
Notes: the logic above is a bit naive, as it supposes that no two users will attempt to get the lowest number at the same time. There is, however, a risk that this may happen one day.
To remove that risk, you can, for instance, add a TakenBy column to the Numbers table and set it to the current username. Then, after you have reserved the number, read it again to ensure that TakenBy was really updated by the current client. If not, just try again.
There are lots of ways to do this. You can try to fiddle around with table locks as well, but whatever your solution, make sure you test it.
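One way to sketch the reserve-and-verify idea is shown below, using Python's `sqlite3` module in place of the Access backend. Instead of a TakenBy column, it makes the UPDATE conditional on the number still being free and checks the affected row count, which is a swapped-in variant of the same "check that you really got it" step. The `reserve_number` name, the retry limit, and the reading of "4-digit" as 1000 to 9999 are assumptions; if leading zeros count, the lower bound would be 0.

```python
import sqlite3

def reserve_number(conn, num_digits, max_attempts=5):
    """Reserve and return the lowest free number with num_digits digits, or None."""
    low, high = 10 ** (num_digits - 1), 10 ** num_digits
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT MIN(Number) FROM Numbers "
            "WHERE Number >= ? AND Number < ? AND InUse = 0",
            (low, high),
        ).fetchone()
        if row[0] is None:
            return None  # nothing free in this range
        with conn:  # commit the reservation as one transaction
            cur = conn.execute(
                "UPDATE Numbers SET InUse = 1 WHERE Number = ? AND InUse = 0",
                (row[0],),
            )
        if cur.rowcount == 1:
            return row[0]  # the update hit the row, so the number is ours
        # Another client took it between SELECT and UPDATE; try again.
    raise RuntimeError("could not reserve a number")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Numbers (Number INTEGER PRIMARY KEY, InUse INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO Numbers (Number) VALUES (?)", ((n,) for n in range(100000)))
# Pretend LineList already uses 1000 and 1001.
conn.execute("UPDATE Numbers SET InUse = 1 WHERE Number IN (1000, 1001)")
conn.commit()

first = reserve_number(conn, 4)
second = reserve_number(conn, 4)
five_digit = reserve_number(conn, 5)
```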