I am trying to update multiple rows with random 9-digit numbers using the following code.
UPDATE SGT_EMPLOYER
SET SSN = (CONVERT(NUMERIC(10,0),RAND() * 899999999) + 100000000)
WHERE EMPLOYER_ACCOUNT_ID = 123456789;
Expected result: the query should update 300 rows with 300 random 9 digit numbers.
Actual: the query updates 300 rows with the same number, because the RAND() function is executed only once.
Please help. Thank You.
As you already figured out yourself, RAND is a run-time constant function in SQL Server. It means that it is called once per statement and the generated value is used for each affected row.
There are other functions that are called for each row. Often people use NEWID usually together with CHECKSUM as a substitute for a random number, but I would not recommend it because the distribution of such random numbers is likely to be poor.
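The difference between a run-time constant and a per-row function can be seen with a small query. This is only an illustrative sketch; the three-row VALUES list and the column aliases are made up for the demonstration:

```sql
-- RAND() is evaluated once per statement, NEWID() once per row.
SELECT v.n,
       RAND()             AS rand_value,   -- expected to repeat in every row
       CHECKSUM(NEWID())  AS newid_value   -- expected to differ from row to row
FROM (VALUES (1), (2), (3)) AS v(n);
```

Running it a few times should show rand_value changing between executions but never between rows of the same execution.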
There is a good function specifically designed to generate random numbers: CRYPT_GEN_RANDOM. It is available since at least SQL Server 2008.
It generates a given number of random bytes.
In your case it would be convenient to have a random number as a float value in the range of [0;1], same as the value returned by RAND.
So, CRYPT_GEN_RANDOM(4) generates 4 random bytes as varbinary.
Convert them to int, divide by the maximum value of an unsigned 32-bit integer (4294967295), and add 0.5 to shift the range from [-0.5;+0.5] to [0;1]:
(CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5)
Your query becomes:
UPDATE SGT_EMPLOYER
SET SSN =
CONVERT(NUMERIC(10,0),
(CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5) * 899999999.0 + 100000000.0)
WHERE EMPLOYER_ACCOUNT_ID = 123456789;
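Before running the UPDATE, the expression can be sanity-checked on throwaway rows. The sketch below assumes sys.all_objects is an acceptable row source (it is used only because it reliably contains a few thousand rows); the aliases are hypothetical:

```sql
-- Generate one candidate value per row and summarize the results.
SELECT MIN(t.ssn)            AS smallest,
       MAX(t.ssn)            AS largest,
       COUNT(DISTINCT t.ssn) AS distinct_values,
       COUNT(*)              AS total_rows
FROM (
    SELECT CONVERT(NUMERIC(10,0),
           (CAST(CRYPT_GEN_RANDOM(4) AS int) / 4294967295.0 + 0.5)
           * 899999999.0 + 100000000.0) AS ssn
    FROM sys.all_objects
) AS t;
```

If the expression behaves as described, smallest and largest should both lie between 100000000 and 999999999, and distinct_values should be close to total_rows.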
Yes, the RAND() expression is executed only once, before the rows are updated, not once per updated row.
You can use a Stored Procedure to update every row with (CONVERT(NUMERIC(10,0),RAND() * 899999999) + 100000000).
Sean Lange is 100% correct. However, if you want to quickly mask your SSN, perhaps the following using HashBytes() may help.
Example
Declare @Table table (SSN varchar(25))
Insert into @Table values
('070-99-12345'),
('123-45-67890')
Select SSN
,AsInt = abs(cast(HashBytes('MD5', SSN) as int))
From @Table
Returns
SSN AsInt
070-99-12345 508860145
123-45-67890 843256257
Related
Say I have a table named 'Parts'. I am looking to create a SQL query that compares the first X characters of two of the fields, let's call them 'PartNum1' and 'PartNum2'. For example, I would like to return all records from 'Parts' where the first 6 characters of 'PartNum1' equals the first 6 characters of 'PartNum2'.
Parts
PartNum1 PartNum2
12345678 12345600
12388888 12345000
12000000 14500000
The query would return only row 1, since that is the only row where the first 6 characters match. This is MS SQL Server 2017, in case that makes a difference.
If they are strings, use left():
left(partnum1, 6) = left(partnum2, 6)
This would be appropriate in a where, on, or case expression. Note that using left() would generally prevent the use of indexes. If this is for a join and you care about performance, you might want to include a computed column with the first six characters.
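A minimal sketch of the computed-column idea, assuming the part numbers are stored as strings. The temp table #Parts and the column names Prefix1 and Prefix2 are hypothetical, chosen only for this illustration:

```sql
-- Persisted computed columns hold the first six characters of each part number.
CREATE TABLE #Parts (
    PartNum1 varchar(20) NOT NULL,
    PartNum2 varchar(20) NOT NULL,
    Prefix1  AS LEFT(PartNum1, 6) PERSISTED,
    Prefix2  AS LEFT(PartNum2, 6) PERSISTED
);

INSERT INTO #Parts (PartNum1, PartNum2) VALUES
    ('12345678', '12345600'),
    ('12388888', '12345000'),
    ('12000000', '14500000');

-- The comparison now uses plain columns instead of function calls.
SELECT PartNum1, PartNum2
FROM #Parts
WHERE Prefix1 = Prefix2;

DROP TABLE #Parts;
```

For a join between two tables, an index on the prefix column of each side would be the natural next step; whether it pays off depends on the data.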
You can try something like this. I am assuming the datatype is integer. You can set the size of the varchar based on the length of the fields.
select *
from Parts
WHERE SUBSTRING(CAST(PartNum1 AS VARCHAR(max)), 1,6) = SUBSTRING(CAST(PartNum2 AS VARCHAR(max)), 1,6)
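You can use simple integer division and check whether the quotients match for those part numbers. Dividing by 100 drops the last two digits, so this compares the first 6 digits only when both values have exactly 8 digits.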
DECLARE @table table(partnum int, partnum2 int)
insert into @table values
(12345678, 12345600)
,(12388888, 12345000)
,(12000000, 14500000);
select * from @table where partnum/100 = partnum2/100
partnum partnum2
12345678 12345600
I need to anonymize a variable in SQL data (VAR NAME = "ArId").
The variable contains 10 numbers + 1 letter + 2 numbers. I need to randomize the 10 first numbers and then keep the letter + the last two numbers.
I have tried the rand() function, but this randomizes the whole value.
SELECT TOP 1000 *
FROM [XXXXXXXXXXX].[XXXXXXXXXX].[XXXXX.TEST]
I have only loaded the data.
EDIT (from "answer"):
I have tried: UPDATE someTable
SET someColumn = CONCAT(CAST(RAND() * 10000000000 as BIGINT), RIGHT(someColumn, 3))
However, as I am totally new to SQL, I don't know how to make this work. I put someColumn = the new column name for the variable I am creating, and RIGHT(someColumn) = the column I am changing. When I do that, I get the message that the RIGHT function requires 2 arguments.
Example for Zohar: I have a variable containing, for example, 1724981628R01. For all values in this variable I would like to randomize the first 10 digits and keep the last three characters (R01). How can I do that?
A couple of things. First, your conversion to a bigint does not guarantee that the result has the right number of characters, because leading zeros are lost.
Second, rand() is constant for all rows of the query. Try this version:
UPDATE someTable
SET someColumn = CONCAT(
        -- scale the [0,1) value up to 10 digits; FLOOR keeps it below 10000000000
        FORMAT(FLOOR(RAND(CHECKSUM(NEWID())) * 10000000000), '0000000000'),
        RIGHT(someColumn, 3)
    );
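The padding point can be checked in isolation. This sketch uses the literal 42 as a stand-in for a small random number and the sample value 1724981628R01 from the question; the aliases are hypothetical:

```sql
SELECT CAST(CAST(42 AS bigint) AS varchar(20))     AS unpadded,  -- '42'
       FORMAT(CAST(42 AS bigint), '0000000000')    AS padded,    -- '0000000042'
       CONCAT(FORMAT(CAST(42 AS bigint), '0000000000'),
              RIGHT('1724981628R01', 3))           AS combined;  -- '0000000042R01'
```

FORMAT and CONCAT require SQL Server 2012 or later.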
I have a unique ID that I am generating program-side in the format CCYYMMDDxxxx, where xxxx is a 4 digit string that will auto increment, starting from 0001.
To calculate the next element, I have written part of a query which gets those 4 digits from the string using substring.
DECLARE @number int, @nextstring varchar(4)
SET @number = (SELECT CONVERT(int, SUBSTRING(Payment_ID, 9, 4), 103) FROM Orders)
I need to be able to increment it by 1, but keep it in 4-digit format. I came across the 'right' keyword, but I don't know how many 0's I'll need to put in front of it.
Is there a nice way to do this without a bunch of IF's? Of course, I could calculate the length and put the respective number of 0's at the start, but that doesn't account for 9, 99, and 999.
I really think that an identity column is the best way to handle this . . . then assign the sequential number afterwards.
But, if you want to do this, you need to left pad the number. Here is a method to get the next id based on values in the table:
SELECT (LEFT(MAX(payment_id), 8) +
RIGHT('0000' + CAST(CAST(RIGHT(MAX(payment_id), 4) as int) + 1 as VARCHAR(255)), 4)
)
FROM Orders;
This does not verify that the id is long enough. Let me repeat: I think it is much better to use an identity column as the id and then construct whatever attributes you want (such as the number within a day) when you need that information.
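The left-padding step handles the 9, 99, and 999 cases without any IFs. A small sketch on made-up IDs (the date part 20190517 and the aliases are hypothetical):

```sql
-- Prepend '0000', then keep only the last four characters.
SELECT v.payment_id,
       LEFT(v.payment_id, 8)
       + RIGHT('0000' + CAST(CAST(RIGHT(v.payment_id, 4) AS int) + 1 AS varchar(10)), 4) AS next_id
FROM (VALUES ('201905170009'),
             ('201905170099'),
             ('201905170999')) AS v(payment_id);
-- 201905170009 -> 201905170010
-- 201905170099 -> 201905170100
-- 201905170999 -> 201905171000
```

Note that a suffix of 9999 would wrap around to 0000 with this expression, which is one more reason to consider the identity-column approach.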
Within my table 'SERVICE_TICKET' are two columns, namely 'Defect_Description' and 'Defect_Description_Code'.
I'd like to populate the second column with random numbers between 1000000 and 9999999 (7-digit numbers). However, the random number should be the same for equal values within the first column. So, for example, if 'Defect_Description' = 'microphone for hands-free device', then 'Defect_Description_Code' should always equal the same arbitrary number, e.g. '8374917'.
I came up with the following expression, but this creates a different number for each row with that 'Defect_Description'. What do I need to change in order to get the same number for each of these?
UPDATE dbo.SERVICE_TICKET
SET Defect_Description_Code =
CASE Defect_Description
WHEN 'microphone for hands-free device' THEN (ABS(CHECKSUM(NewId())) % 1111111 + 9999999)
ELSE '-'
END
I think you want to avoid newid() in this case. I would recommend simply using Defect_Description itself.
The following query also fixes the logic to get the 7 digit number:
UPDATE dbo.SERVICE_TICKET
SET Defect_Description_Code = ABS(CHECKSUM(Defect_Description)) % 9000000 + 1000000;
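Because CHECKSUM depends only on its input, equal descriptions map to equal codes. The sketch below shows this on made-up rows; 'cracked display' is a hypothetical second description:

```sql
-- The first two rows share a description, so they should share a code.
SELECT d.Defect_Description,
       ABS(CHECKSUM(d.Defect_Description)) % 9000000 + 1000000 AS Defect_Description_Code
FROM (VALUES ('microphone for hands-free device'),
             ('microphone for hands-free device'),
             ('cracked display')) AS d(Defect_Description);
```

The codes are deterministic rather than random, and two different descriptions could in principle land on the same code, so it is worth checking for collisions if uniqueness matters.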
declare @fieldForceCounter as int
declare @SaleDate as dateTime
declare @RandomNoSeed as decimal
set @fieldForceCounter = 1
set @SaleDate = '1 Jan 2009'
set @RandomNoSeed = 0.0
WHILE @fieldForceCounter <= 3
BEGIN
while @SaleDate <= '1 Dec 2009'
begin
INSERT INTO MonthlySales(FFCode, SaleDate, SaleValue) VALUES(@fieldForceCounter, @SaleDate, RAND(@RandomNoSeed))
set @saleDate = @saleDate + 1
set @RandomNoSeed = Rand(@RandomNoSeed) + 1
end
set @SaleDate = '1 Jan 2009'
set @fieldForceCounter = @fieldForceCounter + 1
END
GO
This T-SQL command was supposed to insert random values in the 'SaleValue'-column in the 'MonthlySales'-table.
But it is inserting '1' every time .
What can be the problem?
Two problems:
Firstly, the rand() function returns a number between 0 and 1.
Secondly, when rand() is called multiple times in the same query (e.g. for multiple rows in an update statement), it usually returns the same number (which I suspect your algorithm above is trying to solve, by splitting it into multiple calls)
My favourite way around the second problem is to use a function that's guaranteed to return a unique value each time, like newid(), convert it to varbinary, and use it as the seed :)
Edit: after some testing, it seems you'll need to try using a different datatype for @RandomNoSeed; float behaves somewhat differently from decimal, but still approaches a fixed value, so I'd recommend avoiding the use of @RandomNoSeed altogether, and simply use:
INSERT INTO MonthlySales(FFCode, SaleDate, SaleValue)
VALUES(@fieldForceCounter, @SaleDate, RAND(convert(varbinary,newid())))
You have major issues here...
Decimal issues
The default precision/scale for decimal is 18,0. So you aren't storing any decimal part.
So you are only using RAND(0) for 1st iteration and RAND(1) for all subsequent iterations, which is 0.943597390424144 and 0.713591993212924
I can't recall how rounding/truncation applies, and I don't know what datatype SaleValue is, but rounding would give "1" every time.
Now, if you fix this and declare decimal correctly...
Seeding issues
RAND takes an integer seed. Seeding with 1.0001 or 1.3 or 1.999 gives the same value (0.713591993212924).
So, "Rand(1.713591993212924) + 1" = "RAND(1) + 1" = "1.713591993212924" for every subsequent iteration.
Back to square one...
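Both effects can be observed directly. This is a small sketch with hypothetical variable names; it only prints values so you can compare them yourself:

```sql
-- A decimal declared without precision/scale keeps no fractional digits.
DECLARE @seed_default decimal        = 0.7;
DECLARE @seed_scaled  decimal(10, 4) = 0.7;
SELECT @seed_default AS default_decimal,
       @seed_scaled  AS scaled_decimal;

-- RAND takes an integer seed, so these three calls are expected to agree.
SELECT RAND(1)     AS seed_1,
       RAND(1.3)   AS seed_1_3,
       RAND(1.999) AS seed_1_999;
```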
To fix
Get rid of @RandomNoSeed
Either: Generate a random integer value using CHECKSUM(NEWID())
Or: generate a random float value using RAND() * CHECKSUM(NEWID()) (Don't care about seed now)
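Put together, the loop from the question could look roughly like this without the seed variable. It is a sketch under assumptions: the table variable @MonthlySales stands in for the real MonthlySales table, and SaleValue is assumed to be a float column:

```sql
DECLARE @MonthlySales table (FFCode int, SaleDate datetime, SaleValue float);
DECLARE @fieldForceCounter int = 1;
DECLARE @SaleDate datetime;

WHILE @fieldForceCounter <= 3
BEGIN
    SET @SaleDate = '20090101';
    WHILE @SaleDate <= '20091201'
    BEGIN
        -- A fresh seed from NEWID() on every insert, no seed bookkeeping.
        INSERT INTO @MonthlySales (FFCode, SaleDate, SaleValue)
        VALUES (@fieldForceCounter, @SaleDate, RAND(CHECKSUM(NEWID())));
        SET @SaleDate = DATEADD(day, 1, @SaleDate);
    END;
    SET @fieldForceCounter = @fieldForceCounter + 1;
END;

-- If the values vary as intended, the two counts should be nearly equal.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT SaleValue) AS distinct_values
FROM @MonthlySales;
```

The values stay between 0 and 1, so multiply them by whatever range the sale values should cover.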
Just a guess, but often rand functions generate a number from 0-1. Try multiplying your random number by 10.