SQL change a string using a pattern - sql

I need to do something special in SQL, I don't know if a standard function exists, I actually don't know what to search... ! So any advice would be appreciated.
Here is my problem:
I have a data which is a number: 7000000
And I have a "formatting pattern": ****5**
My goal is to merge both: result for this example is: 7000500
(a star means to keep the original value, and a number means to change it)
another example:
7894321
*0**9*1
-------
7094921
(I use SQL Server)

This task can be performed in any programming language with basic for-loop and and some internal functions to find substring and replace
Here is how it's done in SQL SERVER (given that the string and the format is of same length)
Create your own function
CREATE FUNCTION dbo.Formatter
(
#str NVARCHAR(MAX),
#format NVARCHAR(MAX)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
DECLARE #i int = 1, #len int = LEN(#str)
-- Iterates over over each char in the FORMAT and replace the original string if applies
WHILE #i <= #len
BEGIN
IF SUBSTRING(#format, #i, 1) <> '*'
SET #str = STUFF(#str, #i, 1, SUBSTRING(#format, #i, 1))
SET #i = #i + 1
END
RETURN #str
END
USE your function in your SELECTs, e.g.
DECLARE #str VARCHAR(MAX) = '7894321'
DECLARE #format VARCHAR(MAX) = '*0**9*1'
PRINT('Format: ' + #format)
PRINT('Old string: ' + #str)
PRINT('New string: ' + dbo.Formatter(#str, #format))
Result:
Format: *0**9*1
Old string: 7894321
New string: 7094921

I would split the string into it's individual characters, use a CASE expression to determine what character should be retained, and then remerge. I'm going to assume you're on a recent version of SQL Server, and thus have access to STRING_AGG; if not, you'll want to use the "old" FOR XML PATH method. I also assume a string of length 10 of less (if it's more, then just increase the size of the tally).
DECLARE #Format varchar(10) = '****5**';
WITH Tally AS(
SELECT V.I
FROM (VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10))V(I))
SELECT STRING_AGG(CASE SSf.C WHEN '*' THEN SSc.C ELSE SSf.C END,'') WITHIN GROUP (ORDER BY T.I) AS NewString
FROM (VALUES(1,'7894321'),(2,'7000000'))YT(SomeID,YourColumn) --This would be your table
JOIN Tally T ON LEN(YT.YourColumn) >= T.I
CROSS APPLY (VALUES(SUBSTRING(YT.YourColumn,T.I,1)))SSc(C)
CROSS APPLY (VALUES(SUBSTRING(#Format,T.I,1)))SSf(C)
GROUP BY YT.SomeID;
db<>fiddle

If your value is an int and your format is an int also, then you can do this:
DECLARE #formatNum int = CAST(REPLACE(#format, '*', '0') AS int);
SELECT f.Formatted
FROM Table
CROSS APPLY (
SELECT Formatted = SUM(CASE
WHEN ((#FormatNum / Base) % 10) > 0
THEN ((#FormatNum / Base) % 10) * Base
ELSE ((t.MyNumber / Base) % 10) * Base END
FROM (VALUES
(10),(100),(1000),(10000),(100000),(1000000),(10000000),(100000000),(1000000000)
) v(Base)
) f
The / is integer division, % is modulo, so dividing a number by Base and taking the modulo 10 will give you exactly one digit.

Related

Count numeric chars in string

Using tsql I want to count a numeric chars in string. For example i've got 'kick0my234ass' string and i wanna count how many (4 in that example) numbers are in that string. I can't use regex, just plain tslq.
You COULD do this I suppose:
declare #c varchar(30)
set #c = 'kick0my234ass'
select #c, len(replace(#c,' ','')) - len(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(#c,'0',''),'1',''),'2',''),'3',''),'4',''),'5',''),'6',''),'7',''),'8',''),'9',''),' ',''))
You'll first have to split the character string in its individual characters, evaluate which are numeric, and finally count those that are. This will do the trick:
DECLARE #test TABLE (Example NVARCHAR(255))
INSERT #test
VALUES ('kick0my234ass')
SELECT COUNT(1)
FROM #test AS T
INNER JOIN master..spt_values v
ON v.type = 'P'
AND v.number < len(T.Example)
WHERE SUBSTRING(T.Example, v.number + 1, 1) LIKE '[0-9]'
You could try this solution with regular expressions (if you'd allow them):
it uses recursive CTE, at every recursive step, one digit is removed from given string and the condition is to stop, when there are no digits in string. The rows are also numbered with consecutive ids, so the last id is the amount of removed digits from string.
declare #str varchar(100) = 'kick0my123ass';
with cte as (
select 1 [id], stuff(#str,PATINDEX('%[0-9]%', #str),1,'') [col]
union all
select [id] + 1, stuff([col],PATINDEX('%[0-9]%', [col]),1,'') from cte
where col like '%[0-9]%'
)
--this will give you number of digits in string
select top 1 id from cte order by id desc
Use a WHILE loop to each each character is a numeric or not.
Query
declare #text as varchar(max) = 'kick0my234ass';
declare #len as int;
select #len = len(#text);
if(#len > 0)
begin
declare #i as int = 1;
declare #count as int = 0;
while(#i <= #len)
begin
if(substring(#text, #i, 1) like '[0-9]')
set #count += 1;
set #i += 1;
end
print 'Count of Numerics in ' + #text + ' : ' + cast(#count as varchar(100));
end
else
print 'Empty string';
If simplicity & performance are important I suggest a purely set-based solution. Grab a copy of DigitsOnlyEE which will remove all non-numeric characters. Then use LEN against the output.
DECLARE #string varchar(100) = '123xxx45ff678';
SELECT string = #string, digitsOnly, DigitCount = LEN(digitsOnly)
FROM dbo.DigitsOnlyEE(#string);
Results
string digitsOnly DigitCount
------------------ ----------- ------------
123xxx45ff678 12345678 8
using a Tally Table created by an rCTE:
CREATE TABLE #Sample (S varchar(100));
INSERT INTO #Sample
VALUES ('kick0my234 ass');
GO
WITH Tally AS(
SELECT 1 AS N
UNION ALL
SELECT N + 1
FROM Tally
WHERE N + 1 <= 100)
SELECT S.S, SUM(CASE WHEN SUBSTRING(S,T.N, 1) LIKE '[0-9]' THEN 1 ELSE 0 END) AS Numbers
FROM #Sample S
JOIN Tally T ON LEN(S.S) >= T.N
GROUP BY S.S;
For future reference, also post your owns attempts please. We aren't here (really) to do your work for you.

Change characters but keep length

I am migrating sensitive data to a database, and I need to hide details of the text. We would like to keep the volume and length of the text, but change the meaning.
For example:
"James has been well received, and should be helped when ever he finds it hard to speak"
should change to:
"jhdfy dfw aslk dfe kjdfkjd, kjf kjdsf df iotryy erhr lsdj jf ytwe it kjdf tr kjsdd"
Is there a way to update all rows, set the column text to this random type text? Really only want to change charactors (a-z, A-Z), and keep the rest.
One option is to use a bunch of nested replaces . . . but that would probably hit on the maximum number of nested functions.
You could write a painful query using outer apply:
select
from t outer apply
(select replace(t.col, 'a', 'z') as col1) outer apply
(select replace(col1, 'b', 'y') ) outer apply
. . .
However, you might want to write your own function. In other databases, this is called translate() (after the Unix command). If you Google SQL Server translate, I think you'll find examples on the web.
One way is to split the string character by character and replace each row with a random string. And then concatenate them back to get the desired output
DECLARE #str VARCHAR(MAX) = 'James has been well received, and should be helped when ever he finds it hard to speak'
;WITH Cte(orig, random) AS(
SELECT
SUBSTRING(t.a, v.number + 1, 1),
CASE
WHEN SUBSTRING(t.a, v.number + 1, 1) LIKE '[a-z]'
THEN CHAR(ABS(CHECKSUM(NEWID())) % 25 + 97)
ELSE SUBSTRING(t.a, v.number + 1, 1)
END
FROM (SELECT #str) t(a)
CROSS JOIN master..spt_values v
WHERE
v.number < LEN(t.a)
AND v.type = 'P'
)
SELECT
OrignalString = #str,
RandomString = (
SELECT '' + random
FROM Cte FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'
)
TRY IT HERE
OK this is possible using a user defined function (UDF) and a view.
SQL Server does not allow random number generation in a UDF but does allow it in a view. Ref: http://blog.sqlauthority.com/2012/11/20/sql-server-using-rand-in-user-defined-functions-udf/
So here is the solution
CREATE VIEW [dbo].[rndView]
AS
SELECT RAND() rndResult
GO
CREATE FUNCTION [dbo].[RandFn]()
RETURNS float
AS
BEGIN
DECLARE #rndValue float
SELECT #rndValue = rndResult
FROM rndView
RETURN #rndValue
END
GO
CREATE FUNCTION [dbo].[randomstring] ( #stringToParse VARCHAR(MAX))
RETURNS
varchar(max)
AS
BEGIN
/*
A = 65
Z = 90
a = 97
z = 112
declare #stringToParse VARCHAR(MAX) = 'James has been well received, and should be helped when ever he finds it hard to speak'
Select [dbo].[randomstring] ( #stringToParse )
go
Update SpecialTable
Set SpecialString = [dbo].[randomstring] (SpecialString)
go
*/
declare #StringToreturn varchar(max) = ''
declare #charCounter int = 1
declare #len int = len(#stringToParse)
declare #thisRand int
declare #UpperA int = 65
declare #UpperZ int = 90
declare #LowerA int = 97
declare #LowerZ int = 112
declare #thisChar char(1)
declare #Random_Number float
declare #randomChar char(1)
WHILE #charCounter < #len
BEGIN
SELECT #thisChar = SUBSTRING(#stringToParse, #charCounter, 1)
set #randomChar = #thisChar
--print #randomChar
SELECT #Random_Number = dbo.RandFn()
--print #Random_Number
--only swap if a-z or A-Z
if ASCII(#thisChar) >= #UpperA and ASCII(#thisChar) <= #UpperZ begin
--upper case
set #thisRand = #UpperA + (#Random_Number * convert(float, (#UpperZ-#UpperA)))
set #randomChar = CHAR(#thisRand)
--print #thisRand
end
if ASCII(#thisChar) >= #LowerA and ASCII(#thisChar) <= #LowerZ begin
--upper case
set #thisRand = #LowerA + (#Random_Number * convert(float, (#LowerZ-#LowerA)))
set #randomChar = CHAR(#thisRand)
end
--print #thisRand
--print #randomChar
set #StringToreturn = #StringToreturn + #randomChar
SET #charCounter = #charCounter + 1
END
--Select * from #returnList
return #StringToreturn
END
GO

Query to get only numbers from a string

I have data like this:
string 1: 003Preliminary Examination Plan
string 2: Coordination005
string 3: Balance1000sheet
The output I expect is
string 1: 003
string 2: 005
string 3: 1000
And I want to implement it in SQL.
First create this UDF
CREATE FUNCTION dbo.udf_GetNumeric
(
#strAlphaNumeric VARCHAR(256)
)
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE #intAlpha INT
SET #intAlpha = PATINDEX('%[^0-9]%', #strAlphaNumeric)
BEGIN
WHILE #intAlpha > 0
BEGIN
SET #strAlphaNumeric = STUFF(#strAlphaNumeric, #intAlpha, 1, '' )
SET #intAlpha = PATINDEX('%[^0-9]%', #strAlphaNumeric )
END
END
RETURN ISNULL(#strAlphaNumeric,0)
END
GO
Now use the function as
SELECT dbo.udf_GetNumeric(column_name)
from table_name
SQL FIDDLE
I hope this solved your problem.
Reference
Try this one -
Query:
DECLARE #temp TABLE
(
string NVARCHAR(50)
)
INSERT INTO #temp (string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')
SELECT LEFT(subsrt, PATINDEX('%[^0-9]%', subsrt + 't') - 1)
FROM (
SELECT subsrt = SUBSTRING(string, pos, LEN(string))
FROM (
SELECT string, pos = PATINDEX('%[0-9]%', string)
FROM #temp
) d
) t
Output:
----------
003
005
1000
Query:
DECLARE #temp TABLE
(
string NVARCHAR(50)
)
INSERT INTO #temp (string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')
SELECT SUBSTRING(string, PATINDEX('%[0-9]%', string), PATINDEX('%[0-9][^0-9]%', string + 't') - PATINDEX('%[0-9]%',
string) + 1) AS Number
FROM #temp
Please try:
declare #var nvarchar(max)='Balance1000sheet'
SELECT LEFT(Val,PATINDEX('%[^0-9]%', Val+'a')-1) from(
SELECT SUBSTRING(#var, PATINDEX('%[0-9]%', #var), LEN(#var)) Val
)x
Getting only numbers from a string can be done in a one-liner.
Try this :
SUBSTRING('your-string-here', PATINDEX('%[0-9]%', 'your-string-here'), LEN('your-string-here'))
NB: Only works for the first int in the string, ex: abc123vfg34 returns 123.
I found this approach works about 3x faster than the top voted answer. Create the following function, dbo.GetNumbers:
CREATE FUNCTION dbo.GetNumbers(#String VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN;
WITH
Numbers
AS (
--Step 1.
--Get a column of numbers to represent
--every character position in the #String.
SELECT 1 AS Number
UNION ALL
SELECT Number + 1
FROM Numbers
WHERE Number < LEN(#String)
)
,Characters
AS (
SELECT Character
FROM Numbers
CROSS APPLY (
--Step 2.
--Use the column of numbers generated above
--to tell substring which character to extract.
SELECT SUBSTRING(#String, Number, 1) AS Character
) AS c
)
--Step 3.
--Pattern match to return only numbers from the CTE
--and use STRING_AGG to rebuild it into a single string.
SELECT #String = STRING_AGG(Character,'')
FROM Characters
WHERE Character LIKE '[0-9]'
--allows going past the default maximum of 100 loops in the CTE
OPTION (MAXRECURSION 8000)
RETURN #String
END
GO
Testing
Testing for purpose:
SELECT dbo.GetNumbers(InputString) AS Numbers
FROM ( VALUES
('003Preliminary Examination Plan') --output: 003
,('Coordination005') --output: 005
,('Balance1000sheet') --output: 1000
,('(111) 222-3333') --output: 1112223333
,('1.38hello#f00.b4r#\-6') --output: 1380046
) testData(InputString)
Testing for performance:
Start off setting up the test data...
--Add table to hold test data
CREATE TABLE dbo.NumTest (String VARCHAR(8000))
--Make an 8000 character string with mix of numbers and letters
DECLARE #Num VARCHAR(8000) = REPLICATE('12tf56se',800)
--Add this to the test table 500 times
DECLARE #n INT = 0
WHILE #n < 500
BEGIN
INSERT INTO dbo.NumTest VALUES (#Num)
SET #n = #n +1
END
Now testing the dbo.GetNumbers function:
SELECT dbo.GetNumbers(NumTest.String) AS Numbers
FROM dbo.NumTest -- Time to complete: 1 min 7s
Then testing the UDF from the top voted answer on the same data.
SELECT dbo.udf_GetNumeric(NumTest.String)
FROM dbo.NumTest -- Time to complete: 3 mins 12s
Inspiration for dbo.GetNumbers
Decimals
If you need it to handle decimals, you can use either of the following approaches, I found no noticeable performance differences between them.
change '[0-9]' to '[0-9.]'
change Character LIKE '[0-9]' to ISNUMERIC(Character) = 1 (SQL treats a single decimal point as "numeric")
Bonus
You can easily adapt this to differing requirements by swapping out WHERE Character LIKE '[0-9]' with the following options:
WHERE Letter LIKE '[a-zA-Z]' --Get only letters
WHERE Letter LIKE '[0-9a-zA-Z]' --Remove non-alphanumeric
WHERE Letter LIKE '[^0-9a-zA-Z]' --Get only non-alphanumeric
With the previous queries I get these results:
'AAAA1234BBBB3333' >>>> Output: 1234
'-çã+0!\aº1234' >>>> Output: 0
The code below returns All numeric chars:
1st output: 12343333
2nd output: 01234
declare #StringAlphaNum varchar(255)
declare #Character varchar
declare #SizeStringAlfaNumerica int
declare #CountCharacter int
set #StringAlphaNum = 'AAAA1234BBBB3333'
set #SizeStringAlfaNumerica = len(#StringAlphaNum)
set #CountCharacter = 1
while isnumeric(#StringAlphaNum) = 0
begin
while #CountCharacter < #SizeStringAlfaNumerica
begin
if substring(#StringAlphaNum,#CountCharacter,1) not like '[0-9]%'
begin
set #Character = substring(#StringAlphaNum,#CountCharacter,1)
set #StringAlphaNum = replace(#StringAlphaNum, #Character, '')
end
set #CountCharacter = #CountCharacter + 1
end
set #CountCharacter = 0
end
select #StringAlphaNum
declare #puvodni nvarchar(20)
set #puvodni = N'abc1d8e8ttr987avc'
WHILE PATINDEX('%[^0-9]%', #puvodni) > 0 SET #puvodni = REPLACE(#puvodni, SUBSTRING(#puvodni, PATINDEX('%[^0-9]%', #puvodni), 1), '' )
SELECT #puvodni
A solution for SQL Server 2017 and later, using TRANSLATE:
DECLARE #T table (string varchar(50) NOT NULL);
INSERT #T
(string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet');
SELECT
result =
REPLACE(
TRANSLATE(
T.string COLLATE Latin1_General_CI_AI,
'abcdefghijklmnopqrstuvwxyz',
SPACE(26)),
SPACE(1),
SPACE(0))
FROM #T AS T;
Output:
result
003
005
1000
The code works by:
Replacing characters a-z (ignoring case & accents) with a space
Replacing spaces with an empty string.
The string supplied to TRANSLATE can be expanded to include additional characters.
I did not have rights to create functions but had text like
["blahblah012345679"]
And needed to extract the numbers out of the middle
Note this assumes the numbers are grouped together and not at the start and end of the string.
select substring(column_name,patindex('%[0-9]%', column_name),patindex('%[0-9][^0-9]%', column_name)-patindex('%[0-9]%', column_name)+1)
from table name
Although this is an old thread its the first in google search, I came up with a different answer than what came before. This will allow you to pass your criteria for what to keep within a string, whatever that criteria might be. You can put it in a function to call over and over again if you want.
declare #String VARCHAR(MAX) = '-123. a 456-78(90)'
declare #MatchExpression VARCHAR(255) = '%[0-9]%'
declare #return varchar(max)
WHILE PatIndex(#MatchExpression, #String) > 0
begin
set #return = CONCAT(#return, SUBSTRING(#string,patindex(#matchexpression, #string),1))
SET #String = Stuff(#String, PatIndex(#MatchExpression, #String), 1, '')
end
select (#return)
This UDF will work for all types of strings:
CREATE FUNCTION udf_getNumbersFromString (#string varchar(max))
RETURNS varchar(max)
AS
BEGIN
WHILE #String like '%[^0-9]%'
SET #String = REPLACE(#String, SUBSTRING(#String, PATINDEX('%[^0-9]%', #String), 1), '')
RETURN #String
END
Just a little modification to #Epsicron 's answer
SELECT SUBSTRING(string, PATINDEX('%[0-9]%', string), PATINDEX('%[0-9][^0-9]%', string + 't') - PATINDEX('%[0-9]%',
string) + 1) AS Number
FROM (values ('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')) as a(string)
no need for a temporary variable
Firstly find out the number's starting length then reverse the string to find out the first position again(which will give you end position of number from the end). Now if you deduct 1 from both number and deduct it from string whole length you'll get only number length. Now get the number using SUBSTRING
declare #fieldName nvarchar(100)='AAAA1221.121BBBB'
declare #lenSt int=(select PATINDEX('%[0-9]%', #fieldName)-1)
declare #lenEnd int=(select PATINDEX('%[0-9]%', REVERSE(#fieldName))-1)
select SUBSTRING(#fieldName, PATINDEX('%[0-9]%', #fieldName), (LEN(#fieldName) - #lenSt -#lenEnd))
T-SQL function to read all the integers from text and return the one at the indicated index, starting from left or right, also using a starting search term (optional):
create or alter function dbo.udf_number_from_text(
#text nvarchar(max),
#search_term nvarchar(1000) = N'',
#number_position tinyint = 1,
#rtl bit = 0
) returns int
as
begin
declare #result int = 0;
declare #search_term_index int = 0;
if #text is null or len(#text) = 0 goto exit_label;
set #text = trim(#text);
if len(#text) = len(#search_term) goto exit_label;
if len(#search_term) > 0
begin
set #search_term_index = charindex(#search_term, #text);
if #search_term_index = 0 goto exit_label;
end;
if #search_term_index > 0
if #rtl = 0
set #text = trim(right(#text, len(#text) - #search_term_index - len(#search_term) + 1));
else
set #text = trim(left(#text, #search_term_index - 1));
if len(#text) = 0 goto exit_label;
declare #patt_number nvarchar(10) = '%[0-9]%';
declare #patt_not_number nvarchar(10) = '%[^0-9]%';
declare #number_start int = 1;
declare #number_end int;
declare #found_numbers table (id int identity(1,1), val int);
while #number_start > 0
begin
set #number_start = patindex(#patt_number, #text);
if #number_start > 0
begin
if #number_start = len(#text)
begin
insert into #found_numbers(val)
select cast(substring(#text, #number_start, 1) as int);
break;
end;
else
begin
set #text = right(#text, len(#text) - #number_start + 1);
set #number_end = patindex(#patt_not_number, #text);
if #number_end = 0
begin
insert into #found_numbers(val)
select cast(#text as int);
break;
end;
else
begin
insert into #found_numbers(val)
select cast(left(#text, #number_end - 1) as int);
if #number_end = len(#text)
break;
else
begin
set #text = trim(right(#text, len(#text) - #number_end));
if len(#text) = 0 break;
end;
end;
end;
end;
end;
if #rtl = 0
select #result = coalesce(a.val, 0)
from (select row_number() over (order by m.id asc) as c_row, m.val
from #found_numbers as m) as a
where a.c_row = #number_position;
else
select #result = coalesce(a.val, 0)
from (select row_number() over (order by m.id desc) as c_row, m.val
from #found_numbers as m) as a
where a.c_row = #number_position;
exit_label:
return #result;
end;
Example:
select dbo.udf_number_from text(N'Text text 10 text, 25 term', N'term',2,1);
returns 10;
This is one of the simplest and easiest one. This will work on the entire String for multiple occurences as well.
CREATE FUNCTION dbo.fn_GetNumbers(#strInput NVARCHAR(500))
RETURNS NVARCHAR(500)
AS
BEGIN
DECLARE #strOut NVARCHAR(500) = '', #intCounter INT = 1
WHILE #intCounter <= LEN(#strInput)
BEGIN
SELECT #strOut = #strOut + CASE WHEN SUBSTRING(#strInput, #intCounter, 1) LIKE '[0-9]' THEN SUBSTRING(#strInput, #intCounter, 1) ELSE '' END
SET #intCounter = #intCounter + 1
END
RETURN #strOut
END
Following a solution using a single common table expression (CTE).
DECLARE #s AS TABLE (id int PRIMARY KEY, value nvarchar(max));
INSERT INTO #s
VALUES
(1, N'003Preliminary Examination Plan'),
(2, N'Coordination005'),
(3, N'Balance1000sheet');
SELECT * FROM #s ORDER BY id;
WITH t AS (
SELECT
id,
1 AS i,
SUBSTRING(value, 1, 1) AS c
FROM
#s
WHERE
LEN(value) > 0
UNION ALL
SELECT
t.id,
t.i + 1 AS i,
SUBSTRING(s.value, t.i + 1, 1) AS c
FROM
t
JOIN #s AS s ON t.id = s.id
WHERE
t.i < LEN(s.value)
)
SELECT
id,
STRING_AGG(c, N'') WITHIN GROUP (ORDER BY i ASC) AS value
FROM
t
WHERE
c LIKE '[0-9]'
GROUP BY
id
ORDER BY
id;
DECLARE #index NVARCHAR(20);
SET #index = 'abd565klaf12';
WHILE PATINDEX('%[0-9]%', #index) != 0
BEGIN
SET #index = REPLACE(#index, SUBSTRING(#index, PATINDEX('%[0-9]%', #index), 1), '');
END
SELECT #index;
One can replace [0-9] with [a-z] if numbers only are wanted with desired castings using the CAST function.
If we use the User Define Function, the query speed will be greatly reduced. This code extracts the number from the string....
SELECT
Reverse(substring(Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) )))) , patindex('%[0-9]%', Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) )))) ), len(Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) ))))) )) NumberValue
FROM dbo.TableName
CREATE OR REPLACE FUNCTION count_letters_and_numbers(input_string TEXT)
RETURNS TABLE (letters INT, numbers INT) AS $$
BEGIN
RETURN QUERY SELECT
sum(CASE WHEN input_string ~ '[A-Za-z]' THEN 1 ELSE 0 END) as letters,
sum(CASE WHEN input_string ~ '[0-9]' THEN 1 ELSE 0 END) as numbers
FROM unnest(string_to_array(input_string, '')) as input_string;
END;
$$ LANGUAGE plpgsql;
For the hell of it...
This solution is different to all earlier solutions, viz:
There is no need to create a function
There is no need to use pattern matching
There is no need for a temporary table
This solution uses a recursive common table expression (CTE)
But first - note the question does not specify where such strings are stored. In my solution below, I create a CTE as a quick and dirty way to put these strings into some kind of "source table".
Note also - this solution uses a recursive common table expression (CTE) - so don't get confused by the usage of two CTEs here. The first is simply to make the data avaliable to the solution - but it is only the second CTE that is required in order to solve this problem. You can adapt the code to make this second CTE query your existing table, view, etc.
Lastly - my coding is verbose, trying to use column and CTE names that explain what is going on and you might be able to simplify this solution a little. I've added in a few pseudo phone numbers with some (expected and atypical, as the case may be) formatting for the fun of it.
with SOURCE_TABLE as (
select '003Preliminary Examination Plan' as numberString
union all select 'Coordination005' as numberString
union all select 'Balance1000sheet' as numberString
union all select '1300 456 678' as numberString
union all select '(012) 995 8322 ' as numberString
union all select '073263 6122,' as numberString
),
FIRST_CHAR_PROCESSED as (
select
len(numberString) as currentStringLength,
isNull(cast(try_cast(replace(left(numberString, 1),' ','z') as tinyint) as nvarchar),'') as firstCharAsNumeric,
cast(isNull(cast(try_cast(nullIf(left(numberString, 1),'') as tinyint) as nvarchar),'') as nvarchar(4000)) as newString,
cast(substring(numberString,2,len(numberString)) as nvarchar) as remainingString
from SOURCE_TABLE
union all
select
len(remainingString) as currentStringLength,
cast(try_cast(replace(left(remainingString, 1),' ','z') as tinyint) as nvarchar) as firstCharAsNumeric,
cast(isNull(newString,'') as nvarchar(3999)) + isNull(cast(try_cast(nullIf(left(remainingString, 1),'') as tinyint) as nvarchar(1)),'') as newString,
substring(remainingString,2,len(remainingString)) as remainingString
from FIRST_CHAR_PROCESSED fcp2
where fcp2.currentStringLength > 1
)
select
newString
,* -- comment this out when required
from FIRST_CHAR_PROCESSED
where currentStringLength = 1
So what's going on here?
Basically in our CTE we are selecting the first character and using try_cast (see docs) to cast it to a tinyint (which is a large enough data type for a single-digit numeral). Note that the type-casting rules in SQL Server say that an empty string (or a space, for that matter) will resolve to zero, so the nullif is added to force spaces and empty strings to resolve to null (see discussion) (otherwise our result would include a zero character any time a space is encountered in the source data).
The CTE also returns everything after the first character - and that becomes the input to our recursive call on the CTE; in other words: now let's process the next character.
Lastly, the field newString in the CTE is generated (in the second SELECT) via concatenation. With recursive CTEs the data type must match between the two SELECT statements for any given column - including the column size. Because we know we are adding (at most) a single character, we are casting that character to nvarchar(1) and we are casting the newString (so far) as nvarchar(3999). Concatenated, the result will be nvarchar(4000) - which matches the type casting we carry out in the first SELECT.
If you run this query and exclude the WHERE clause, you'll get a sense of what's going on - but the rows may be in a strange order. (You won't necessarily see all rows relating to a single input value grouped together - but you should still be able to follow).
Hope it's an interesting option that may help a few people wanting a strictly expression-based solution.
In Oracle
You can get what you want using this:
SUBSTR('ABCD1234EFGH',REGEXP_INSTR ('ABCD1234EFGH', '[[:digit:]]'),REGEXP_COUNT ('ABCD1234EFGH', '[[:digit:]]'))
Sample Query:
SELECT SUBSTR('003Preliminary Examination Plan ',REGEXP_INSTR ('003Preliminary Examination Plan ', '[[:digit:]]'),REGEXP_COUNT ('003Preliminary Examination Plan ', '[[:digit:]]')) SAMPLE1,
SUBSTR('Coordination005',REGEXP_INSTR ('Coordination005', '[[:digit:]]'),REGEXP_COUNT ('Coordination005', '[[:digit:]]')) SAMPLE2,
SUBSTR('Balance1000sheet',REGEXP_INSTR ('Balance1000sheet', '[[:digit:]]'),REGEXP_COUNT ('Balance1000sheet', '[[:digit:]]')) SAMPLE3 FROM DUAL
If you are using Postgres and you have data like '2000 - some sample text' then try substring and position combination, otherwise if in your scenario there is no delimiter, you need to write regex:
SUBSTRING(Column_name from 0 for POSITION('-' in column_name) - 1) as
number_column_name

How to extract numbers from a string using TSQL

I have a string:
#string='TEST RESULTS\TEST 1\RESULT 1
The string/text remains the same except for the numbers
need the 1 from TEST
need 1 from RESULT
to be used in a query like:
SET #sql = "SELECT *
FROM TABLE
WHERE test = (expression FOR CASE 1 resulting IN INT 1)
AND result = (expression FOR CASE 2 resulting IN INT 1)"
Looks like you already have a solution that met your needs but I have a little trick that I use to extract numbers from strings that I thought might benefit someone. It takes advantage of the FOR XML statement and avoids explicit loops. It makes a good inline table function or simple scalar. Do with it what you will :)
DECLARE #String varchar(255) = 'This1 Is2 my3 Test4 For Number5 Extr#ct10n';
SELECT
CAST((
SELECT CASE --// skips alpha. make sure comparison is done on upper case
WHEN ( ASCII(UPPER(SUBSTRING(#String, Number, 1))) BETWEEN 48 AND 57 )
THEN SUBSTRING(#String, Number, 1)
ELSE ''END
FROM
(
SELECT TOP 255 --// east way to get a list of numbers
--// change value as needed.
ROW_NUMBER() OVER ( ORDER BY ( SELECT 1 ) ) AS Number
FROM master.sys.all_columns a
CROSS JOIN master.sys.all_columns b
) AS n
WHERE Number <= LEN(#String)
--// use xml path to pivot the results to a row
FOR XML PATH('') ) AS varchar(255)) AS Result
Result ==> 1234510
You can script an sql function which can used through your search queries.
Here is the sample code.
CREATE FUNCTION udf_extractInteger(#string VARCHAR(2000))
RETURNS VARCHAR(2000)
AS
BEGIN
DECLARE #count int
DECLARE #intNumbers VARCHAR(1000)
SET #count = 0
SET #intNumbers = ''
WHILE #count <= LEN(#string)
BEGIN
IF SUBSTRING(#string, #count, 1)>='0' and SUBSTRING (#string, #count, 1) <='9'
BEGIN
SET #intNumbers = #intNumbers + SUBSTRING (#string, #count, 1)
END
SET #count = #count + 1
END
RETURN #intNumbers
END
GO
QUERY :
SELECT dbo.udf_extractInteger('hello 123 world456') As output
OUTPUT:
123456
Referred from : http://www.ittutorials.in/source/sql/sql-function-to-extract-only-numbers-from-string.aspx
Since you have stable text and only 2 elements, you can make good use of replace and parsename:
declare #string varchar(100) = 'TEST RESULTS\TEST 1\RESULT 2'
select cast(parsename(replace(replace(#string, 'TEST RESULTS\TEST ', ''), '\RESULT ', '.'), 2) as int) as Test
, cast(parsename(replace(replace(#string, 'TEST RESULTS\TEST ', ''), '\RESULT ', '.'), 1) as int) as Result
/*
Test Result
----------- -----------
1 2
*/
The replace portion does assume the same text and spacing always, and sets up for parsename with the period.
This method uses SUBSTRING, PARSENAME, and PATINDEX:
SELECT
SUBSTRING(PARSENAME(c,2), PATINDEX('%[0-9]%',PARSENAME(c,2)), LEN(c)) Test,
SUBSTRING(PARSENAME(c,1), PATINDEX('%[0-9]%',PARSENAME(c,1)), LEN(c)) Result
FROM ( SELECT REPLACE(#val, '\', '.') c) t
Use PARSENAME to split the string. The text of the string won't matter -- it will just need to contain the 2 back slashes to parse to 3 elements. Use PATINDEX with a regular expression to replace non-numeric values from the result. This would need adjusting if the text in front of the number ever contained numbers.
If needed, CAST/CONVERT the results to int or the appropriate data type.
Here is some sample Fiddle.
Good luck.

Generating random strings with T-SQL

If you wanted to generate a pseudorandom alphanumeric string using T-SQL, how would you do it? How would you exclude characters like dollar signs, dashes, and slashes from it?
Using a guid
SELECT #randomString = CONVERT(varchar(255), NEWID())
very short ...
Similar to the first example, but with more flexibility:
-- min_length = 8, max_length = 12
SET #Length = RAND() * 5 + 8
-- SET #Length = RAND() * (max_length - min_length + 1) + min_length
-- define allowable character explicitly - easy to read this way an easy to
-- omit easily confused chars like l (ell) and 1 (one) or 0 (zero) and O (oh)
SET #CharPool =
'abcdefghijkmnopqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ23456789.,-_!$##%^&*'
SET #PoolLength = Len(#CharPool)
SET #LoopCount = 0
SET #RandomString = ''
WHILE (#LoopCount < #Length) BEGIN
SELECT #RandomString = #RandomString +
SUBSTRING(#Charpool, CONVERT(int, RAND() * #PoolLength) + 1, 1)
SELECT #LoopCount = #LoopCount + 1
END
I forgot to mention one of the other features that makes this more flexible. By repeating blocks of characters in #CharPool, you can increase the weighting on certain characters so that they are more likely to be chosen.
When generating random data, specially for test, it is very useful to make the data random, but reproducible. The secret is to use explicit seeds for the random function, so that when the test is run again with the same seed, it produces again exactly the same strings. Here is a simplified example of a function that generates object names in a reproducible manner:
alter procedure usp_generateIdentifier
#minLen int = 1
, #maxLen int = 256
, #seed int output
, #string varchar(8000) output
as
begin
set nocount on;
declare #length int;
declare #alpha varchar(8000)
, #digit varchar(8000)
, #specials varchar(8000)
, #first varchar(8000)
declare #step bigint = rand(#seed) * 2147483647;
select #alpha = 'qwertyuiopasdfghjklzxcvbnm'
, #digit = '1234567890'
, #specials = '_## '
select #first = #alpha + '_#';
set #seed = (rand((#seed+#step)%2147483647)*2147483647);
select #length = #minLen + rand(#seed) * (#maxLen-#minLen)
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
declare #dice int;
select #dice = rand(#seed) * len(#first),
#seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = substring(#first, #dice, 1);
while 0 < #length
begin
select #dice = rand(#seed) * 100
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
if (#dice < 10) -- 10% special chars
begin
select #dice = rand(#seed) * len(#specials)+1
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = #string + substring(#specials, #dice, 1);
end
else if (#dice < 10+10) -- 10% digits
begin
select #dice = rand(#seed) * len(#digit)+1
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = #string + substring(#digit, #dice, 1);
end
else -- rest 80% alpha
begin
declare #preseed int = #seed;
select #dice = rand(#seed) * len(#alpha)+1
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = #string + substring(#alpha, #dice, 1);
end
select #length = #length - 1;
end
end
go
When running the tests the caller generates a random seed it associates with the test run (saves it in the results table), then passed along the seed, similar to this:
declare #seed int;
declare #string varchar(256);
select #seed = 1234; -- saved start seed
exec usp_generateIdentifier
#seed = #seed output
, #string = #string output;
print #string;
exec usp_generateIdentifier
#seed = #seed output
, #string = #string output;
print #string;
exec usp_generateIdentifier
#seed = #seed output
, #string = #string output;
print #string;
Update 2016-02-17: See the comments bellow, the original procedure had an issue in the way it advanced the random seed. I updated the code, and also fixed the mentioned off-by-one issue.
Use the following code to return a short string:
SELECT SUBSTRING(CONVERT(varchar(40), NEWID()),0,9)
If you are running SQL Server 2008 or greater, you could use the new cryptographic function crypt_gen_random() and then use base64 encoding to make it a string. This will work for up to 8000 characters.
declare #BinaryData varbinary(max)
, #CharacterData varchar(max)
, #Length int = 2048
set #BinaryData=crypt_gen_random (#Length)
set #CharacterData=cast('' as xml).value('xs:base64Binary(sql:variable("#BinaryData"))', 'varchar(max)')
print #CharacterData
I'm not expert in T-SQL, but the simpliest way I've already used it's like that:
select char((rand()*25 + 65))+char((rand()*25 + 65))
This generates two char (A-Z, in ascii 65-90).
select left(NEWID(),5)
This will return the 5 left most characters of the guid string
Example run
------------
11C89
9DB02
For one random letter, you can use:
select substring('ABCDEFGHIJKLMNOPQRSTUVWXYZ',
(abs(checksum(newid())) % 26)+1, 1)
An important difference between using newid() versus rand() is that if you return multiple rows, newid() is calculated separately for each row, while rand() is calculated once for the whole query.
Here is a random alpha numeric generator
print left(replace(newid(),'-',''),#length) //--#length is the length of random Num.
There are a lot of good answers but so far none of them allow a customizable character pool and work as a default value for a column. I wanted to be able to do something like this:
alter table MY_TABLE add MY_COLUMN char(20) not null
default dbo.GenerateToken(crypt_gen_random(20))
So I came up with this. Beware of the hard-coded number 32 if you modify it.
-- Converts a varbinary of length N into a varchar of length N.
-- Recommend passing in the result of CRYPT_GEN_RANDOM(N).
create function GenerateToken(#randomBytes varbinary(max))
returns varchar(max) as begin
-- Limit to 32 chars to get an even distribution (because 32 divides 256) with easy math.
declare #allowedChars char(32);
set #allowedChars = 'abcdefghijklmnopqrstuvwxyz012345';
declare #oneByte tinyint;
declare #oneChar char(1);
declare #index int;
declare #token varchar(max);
set #index = 0;
set #token = '';
while #index < datalength(#randomBytes)
begin
-- Get next byte, use it to index into #allowedChars, and append to #token.
-- Note: substring is 1-based.
set #index = #index + 1;
select #oneByte = convert(tinyint, substring(#randomBytes, #index, 1));
select #oneChar = substring(#allowedChars, 1 + (#oneByte % 32), 1); -- 32 is the number of #allowedChars
select #token = #token + #oneChar;
end
return #token;
end
This worked for me: I needed to generate just three random alphanumeric characters for an ID, but it could work for any length up to 15 or so.
declare #DesiredLength as int = 3;
select substring(replace(newID(),'-',''),cast(RAND()*(31-#DesiredLength) as int),#DesiredLength);
Another simple solution with the complete alphabet:
SELECT LEFT(REPLACE(REPLACE((SELECT CRYPT_GEN_RANDOM(16) FOR XML PATH(''), BINARY BASE64),'+',''),'/',''),16);
Replace the two 16's with the desired length.
Sample result:
pzyMATe3jJwN1XkB
I realize that this is an old question with many fine answers. However when I found this I also found a more recent article on TechNet by Saeid Hasani
T-SQL: How to Generate Random Passwords
While the solution focuses on passwords it applies to the general case. Saeid works through various considerations to arrive at a solution. It is very instructive.
A script containing all the code blocks form the article is separately available via the TechNet Gallery, but I would definitely start at the article.
For SQL Server 2016 and later, here is a really simple and relatively efficient expression to generate cryptographically random strings of a given byte length:
--Generates 36 bytes (48 characters) of base64 encoded random data
select r from OpenJson((select Crypt_Gen_Random(36) r for json path))
with (r varchar(max))
Note that the byte length is not the same as the encoded size; use the following from this article to convert:
Bytes = 3 * (LengthInCharacters / 4) - Padding
I came across this blog post first, then came up with the following stored procedure for this that I'm using on a current project (sorry for the weird formatting):
CREATE PROCEDURE [dbo].[SpGenerateRandomString]
#sLength tinyint = 10,
#randomString varchar(50) OUTPUT
AS
BEGIN
SET NOCOUNT ON
DECLARE #counter tinyint
DECLARE #nextChar char(1)
SET #counter = 1
SET #randomString = ”
WHILE #counter <= #sLength
BEGIN
SELECT #nextChar = CHAR(48 + CONVERT(INT, (122-48+1)*RAND()))
IF ASCII(#nextChar) not in (58,59,60,61,62,63,64,91,92,93,94,95,96)
BEGIN
SELECT #randomString = #randomString + #nextChar
SET #counter = #counter + 1
END
END
END
I did this in SQL 2000 by creating a table that had characters I wanted to use, creating a view that selects characters from that table ordering by newid(), and then selecting the top 1 character from that view.
CREATE VIEW dbo.vwCodeCharRandom
AS
SELECT TOP 100 PERCENT
CodeChar
FROM dbo.tblCharacter
ORDER BY
NEWID()
...
SELECT TOP 1 CodeChar FROM dbo.vwCodeCharRandom
Then you can simply pull characters from the view and concatenate them as needed.
EDIT: Inspired by Stephan's response...
select top 1 RandomChar from tblRandomCharacters order by newid()
No need for a view (in fact I'm not sure why I did that - the code's from several years back). You can still specify the characters you want to use in the table.
I use this procedure that I developed simply stipluate the charaters you want to be able to display in the input variables, you can define the length too.
Hope this formats well, I am new to stack overflow.
IF EXISTS (SELECT * FROM sys.objects WHERE type = 'P' AND object_id = OBJECT_ID(N'GenerateARandomString'))
DROP PROCEDURE GenerateARandomString
GO
CREATE PROCEDURE GenerateARandomString
(
#DESIREDLENGTH INTEGER = 100,
#NUMBERS VARCHAR(50)
= '0123456789',
#ALPHABET VARCHAR(100)
='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz',
#SPECIALS VARCHAR(50)
= '_=+-$£%^&*()"!#~#:',
#RANDOMSTRING VARCHAR(8000) OUT
)
AS
BEGIN
-- Author David Riley
-- Version 1.0
-- You could alter to one big string .e.e numebrs , alpha special etc
-- added for more felxibility in case I want to extend i.e put logic in for 3 numbers, 2 pecials 3 numbers etc
-- for now just randomly pick one of them
DECLARE #SWAP VARCHAR(8000); -- Will be used as a tempoary buffer
DECLARE #SELECTOR INTEGER = 0;
DECLARE #CURRENTLENGHT INTEGER = 0;
WHILE #CURRENTLENGHT < #DESIREDLENGTH
BEGIN
-- Do we want a number, special character or Alphabet Randonly decide?
SET #SELECTOR = CAST(ABS(CHECKSUM(NEWID())) % 3 AS INTEGER); -- Always three 1 number , 2 alphaBET , 3 special;
IF #SELECTOR = 0
BEGIN
SET #SELECTOR = 3
END;
-- SET SWAP VARIABLE AS DESIRED
SELECT #SWAP = CASE WHEN #SELECTOR = 1 THEN #NUMBERS WHEN #SELECTOR = 2 THEN #ALPHABET ELSE #SPECIALS END;
-- MAKE THE SELECTION
SET #SELECTOR = CAST(ABS(CHECKSUM(NEWID())) % LEN(#SWAP) AS INTEGER);
IF #SELECTOR = 0
BEGIN
SET #SELECTOR = LEN(#SWAP)
END;
SET #RANDOMSTRING = ISNULL(#RANDOMSTRING,'') + SUBSTRING(#SWAP,#SELECTOR,1);
SET #CURRENTLENGHT = LEN(#RANDOMSTRING);
END;
END;
GO
DECLARE #RANDOMSTRING VARCHAR(8000)
EXEC GenerateARandomString #RANDOMSTRING = #RANDOMSTRING OUT
SELECT #RANDOMSTRING
Sometimes we need a lot of random things: love, kindness, vacation, etc.
I have collected a few random generators over the years, and these are from Pinal Dave and a stackoverflow answer I found once. Refs below.
--Adapted from Pinal Dave; http://blog.sqlauthority.com/2007/04/29/sql-server-random-number-generator-script-sql-query/
SELECT
ABS( CAST( NEWID() AS BINARY( 6)) %1000) + 1 AS RandomInt
, CAST( (ABS( CAST( NEWID() AS BINARY( 6)) %1000) + 1)/7.0123 AS NUMERIC( 15,4)) AS RandomNumeric
, DATEADD( DAY, -1*(ABS( CAST( NEWID() AS BINARY( 6)) %1000) + 1), GETDATE()) AS RandomDate
--This line from http://stackoverflow.com/questions/15038311/sql-password-generator-8-characters-upper-and-lower-and-include-a-number
, CAST((ABS(CHECKSUM(NEWID()))%10) AS VARCHAR(1)) + CHAR(ASCII('a')+(ABS(CHECKSUM(NEWID()))%25)) + CHAR(ASCII('A')+(ABS(CHECKSUM(NEWID()))%25)) + LEFT(NEWID(),5) AS RandomChar
, ABS(CHECKSUM(NEWID()))%50000+1 AS RandomID
This will produce a string 96 characters in length, from the Base64 range (uppers, lowers, numbers, + and /). Adding 3 "NEWID()" will increase the length by 32, with no Base64 padding (=).
SELECT
CAST(
CONVERT(NVARCHAR(MAX),
CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
,2)
AS XML).value('xs:base64Binary(xs:hexBinary(.))', 'VARCHAR(MAX)') AS StringValue
If you are applying this to a set, make sure to introduce something from that set so that the NEWID() is recomputed, otherwise you'll get the same value each time:
SELECT
U.UserName
, LEFT(PseudoRandom.StringValue, LEN(U.Pwd)) AS FauxPwd
FROM Users U
CROSS APPLY (
SELECT
CAST(
CONVERT(NVARCHAR(MAX),
CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), NEWID())
+CONVERT(VARBINARY(8), U.UserID) -- Causes a recomute of all NEWID() calls
,2)
AS XML).value('xs:base64Binary(xs:hexBinary(.))', 'VARCHAR(MAX)') AS StringValue
) PseudoRandom
CREATE OR ALTER PROC USP_GENERATE_RANDOM_CHARACTER ( #NO_OF_CHARS INT, #RANDOM_CHAR VARCHAR(40) OUTPUT)
AS
BEGIN
SELECT #RANDOM_CHAR = SUBSTRING (REPLACE(CONVERT(VARCHAR(40), NEWID()), '-',''), 1, #NO_OF_CHARS)
END
/*
USAGE:
DECLARE #OUT VARCHAR(40)
EXEC USP_GENERATE_RANDOM_CHARACTER 13,#RANDOM_CHAR = #OUT OUTPUT
SELECT #OUT
*/
Small modification of Remus Rusanu code -thanks for sharing
This generate a random string and can be used without the seed value
declare #minLen int = 1, #maxLen int = 612, #string varchar(8000);
declare #length int;
declare #seed INT
declare #alpha varchar(8000)
, #digit varchar(8000)
, #specials varchar(8000)
, #first varchar(8000)
declare #step bigint = rand() * 2147483647;
select #alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
, #digit = '1234567890'
, #specials = '_##-/\ '
select #first = #alpha + '_#';
set #seed = (rand(#step)*2147483647);
select #length = #minLen + rand(#seed) * (#maxLen-#minLen)
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
declare #dice int;
select #dice = rand(#seed) * len(#first),
#seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = substring(#first, #dice, 1);
while 0 < #length
begin
select #dice = rand(#seed) * 100
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
if (#dice < 10) -- 10% special chars
begin
select #dice = rand(#seed) * len(#specials)+1
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = #string + substring(#specials, #dice, 1);
end
else if (#dice < 10+10) -- 10% digits
begin
select #dice = rand(#seed) * len(#digit)+1
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = #string + substring(#digit, #dice, 1);
end
else -- rest 80% alpha
begin
declare #preseed int = #seed;
select #dice = rand(#seed) * len(#alpha)+1
, #seed = (rand((#seed+#step)%2147483647)*2147483647);
select #string = #string + substring(#alpha, #dice, 1);
end
select #length = #length - 1
end
SELECT #string
Heres something based on New Id.
with list as
(
select 1 as id,newid() as val
union all
select id + 1,NEWID()
from list
where id + 1 < 10
)
select ID,val from list
option (maxrecursion 0)
I thought I'd share, or give back to the community ...
It's ASCII based, and the solution is not perfect but it works quite well.
Enjoy,
Goran B.
/*
-- predictable masking of ascii chars within a given decimal range
-- purpose:
-- i needed an alternative to hashing alg. or uniqueidentifier functions
-- because i wanted to be able to revert to original char set if possible ("if", the operative word)
-- notes: wrap below in a scalar function if desired (i.e. recommended)
-- by goran biljetina (2014-02-25)
*/
declare
#length int
,#position int
,#maskedString varchar(500)
,#inpString varchar(500)
,#offsetAsciiUp1 smallint
,#offsetAsciiDown1 smallint
,#ipOffset smallint
,#asciiHiBound smallint
,#asciiLoBound smallint
set #ipOffset=null
set #offsetAsciiUp1=1
set #offsetAsciiDown1=-1
set #asciiHiBound=126 --> up to and NOT including
set #asciiLoBound=31 --> up from and NOT including
SET #inpString = '{"config":"some string value", "boolAttr": true}'
SET #length = LEN(#inpString)
SET #position = 1
SET #maskedString = ''
--> MASK:
---------
WHILE (#position < #length+1) BEGIN
SELECT #maskedString = #maskedString +
ISNULL(
CASE
WHEN ASCII(SUBSTRING(#inpString,#position,1))>#asciiLoBound AND ASCII(SUBSTRING(#inpString,#position,1))<#asciiHiBound
THEN
CHAR(ASCII(SUBSTRING(#inpString,#position,1))+
(case when #ipOffset is null then
case when ASCII(SUBSTRING(#inpString,#position,1))%2=0 then #offsetAsciiUp1 else #offsetAsciiDown1 end
else #ipOffset end))
WHEN ASCII(SUBSTRING(#inpString,#position,1))<=#asciiLoBound
THEN '('+CONVERT(varchar,ASCII(SUBSTRING(#Inpstring,#position,1))+1000)+')' --> wrap for decode
WHEN ASCII(SUBSTRING(#inpString,#position,1))>=#asciiHiBound
THEN '('+CONVERT(varchar,ASCII(SUBSTRING(#inpString,#position,1))+1000)+')' --> wrap for decode
END
,'')
SELECT #position = #position + 1
END
select #MaskedString
SET #inpString = #maskedString
SET #length = LEN(#inpString)
SET #position = 1
SET #maskedString = ''
--> UNMASK (Limited to within ascii lo-hi bound):
-------------------------------------------------
WHILE (#position < #length+1) BEGIN
SELECT #maskedString = #maskedString +
ISNULL(
CASE
WHEN ASCII(SUBSTRING(#inpString,#position,1))>#asciiLoBound AND ASCII(SUBSTRING(#inpString,#position,1))<#asciiHiBound
THEN
CHAR(ASCII(SUBSTRING(#inpString,#position,1))+
(case when #ipOffset is null then
case when ASCII(SUBSTRING(#inpString,#position,1))%2=1 then #offsetAsciiDown1 else #offsetAsciiUp1 end
else #ipOffset*(-1) end))
ELSE ''
END
,'')
SELECT #position = #position + 1
END
select #maskedString
This uses rand with a seed like one of the other answers, but it is not necessary to provide a seed on every call. Providing it on the first call is sufficient.
This is my modified code.
IF EXISTS (SELECT * FROM sys.objects WHERE type = 'P' AND object_id = OBJECT_ID(N'usp_generateIdentifier'))
DROP PROCEDURE usp_generateIdentifier
GO
create procedure usp_generateIdentifier
#minLen int = 1
, #maxLen int = 256
, #seed int output
, #string varchar(8000) output
as
begin
set nocount on;
declare #length int;
declare #alpha varchar(8000)
, #digit varchar(8000)
, #specials varchar(8000)
, #first varchar(8000)
select #alpha = 'qwertyuiopasdfghjklzxcvbnm'
, #digit = '1234567890'
, #specials = '_##$&'
select #first = #alpha + '_#';
-- Establish our rand seed and store a new seed for next time
set #seed = (rand(#seed)*2147483647);
select #length = #minLen + rand() * (#maxLen-#minLen);
--print #length
declare #dice int;
select #dice = rand() * len(#first);
select #string = substring(#first, #dice, 1);
while 0 < #length
begin
select #dice = rand() * 100;
if (#dice < 10) -- 10% special chars
begin
select #dice = rand() * len(#specials)+1;
select #string = #string + substring(#specials, #dice, 1);
end
else if (#dice < 10+10) -- 10% digits
begin
select #dice = rand() * len(#digit)+1;
select #string = #string + substring(#digit, #dice, 1);
end
else -- rest 80% alpha
begin
select #dice = rand() * len(#alpha)+1;
select #string = #string + substring(#alpha, #dice, 1);
end
select #length = #length - 1;
end
end
go
In SQL Server 2012+ we could concatenate the binaries of some (G)UIDs and then do a base64 conversion on the result.
SELECT
textLen.textLen
, left((
select CAST(newid() as varbinary(max)) + CAST(newid() as varbinary(max))
where textLen.textLen is not null /*force evaluation for each outer query row*/
FOR XML PATH(''), BINARY BASE64
),textLen.textLen) as randomText
FROM ( values (2),(4),(48) ) as textLen(textLen) --define lengths here
;
If you need longer strings (or you see = characters in the result) you need to add more + CAST(newid() as varbinary(max)) in the sub select.
Here's one I came up with today (because I didn't like any of the existing answers enough).
This one generates a temp table of random strings, is based off of newid(), but also supports a custom character set (so more than just 0-9 & A-F), custom length (up to 255, limit is hard-coded, but can be changed), and a custom number of random records.
Here's the source code (hopefully the comments help):
/**
* First, we're going to define the random parameters for this
* snippet. Changing these variables will alter the entire
* outcome of this script. Try not to break everything.
*
* #var {int} count The number of random values to generate.
* #var {int} length The length of each random value.
* #var {char(62)} charset The characters that may appear within a random value.
*/
-- Define the parameters
declare #count int = 10
declare #length int = 60
declare #charset char(62) = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
/**
* We're going to define our random table to be twice the maximum
* length (255 * 2 = 510). It's twice because we will be using
* the newid() method, which produces hex guids. More later.
*/
-- Create the random table
declare #random table (
value nvarchar(510)
)
/**
* We'll use two characters from newid() to make one character in
* the random value. Each newid() provides us 32 hex characters,
* so we'll have to make multiple calls depending on length.
*/
-- Determine how many "newid()" calls we'll need per random value
declare #iterations int = ceiling(#length * 2 / 32.0)
/**
* Before we start making multiple calls to "newid", we need to
* start with an initial value. Since we know that we need at
* least one call, we will go ahead and satisfy the count.
*/
-- Iterate up to the count
declare #i int = 0 while #i < #count begin set #i = #i + 1
-- Insert a new set of 32 hex characters for each record, limiting to #length * 2
insert into #random
select substring(replace(newid(), '-', ''), 1, #length * 2)
end
-- Now fill the remaining the remaining length using a series of update clauses
set #i = 0 while #i < #iterations begin set #i = #i + 1
-- Append to the original value, limit #length * 2
update #random
set value = substring(value + replace(newid(), '-', ''), 1, #length * 2)
end
/**
* Now that we have our base random values, we can convert them
* into the final random values. We'll do this by taking two
* hex characters, and mapping then to one charset value.
*/
-- Convert the base random values to charset random values
set #i = 0 while #i < #length begin set #i = #i + 1
/**
* Explaining what's actually going on here is a bit complex. I'll
* do my best to break it down step by step. Hopefully you'll be
* able to follow along. If not, then wise up and come back.
*/
-- Perform the update
update #random
set value =
/**
* Everything we're doing here is in a loop. The #i variable marks
* what character of the final result we're assigning. We will
* start off by taking everything we've already done first.
*/
-- Take the part of the string up to the current index
substring(value, 1, #i - 1) +
/**
* Now we're going to convert the two hex values after the index,
* and convert them to a single charset value. We can do this
* with a bit of math and conversions, so function away!
*/
-- Replace the current two hex values with one charset value
substring(#charset, convert(int, convert(varbinary(1), substring(value, #i, 2), 2)) * (len(#charset) - 1) / 255 + 1, 1) +
-- (1) -------------------------------------------------------^^^^^^^^^^^^^^^^^^^^^^^-----------------------------------------
-- (2) ---------------------------------^^^^^^^^^^^^^^^^^^^^^^11111111111111111111111^^^^-------------------------------------
-- (3) --------------------^^^^^^^^^^^^^2222222222222222222222222222222222222222222222222^------------------------------------
-- (4) --------------------333333333333333333333333333333333333333333333333333333333333333---^^^^^^^^^^^^^^^^^^^^^^^^^--------
-- (5) --------------------333333333333333333333333333333333333333333333333333333333333333^^^4444444444444444444444444--------
-- (6) --------------------5555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555^^^^----
-- (7) ^^^^^^^^^^^^^^^^^^^^66666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666^^^^
/**
* (1) - Determine the two hex characters that we'll be converting (ex: 0F, AB, 3C, etc.)
* (2) - Convert those two hex characters to a a proper hexadecimal (ex: 0x0F, 0xAB, 0x3C, etc.)
* (3) - Convert the hexadecimals to integers (ex: 15, 171, 60)
* (4) - Determine the conversion ratio between the length of #charset and the range of hexadecimals (255)
* (5) - Multiply the integer from (3) with the conversion ratio from (4) to get a value between 0 and (len(#charset) - 1)
* (6) - Add 1 to the offset from (5) to get a value between 1 and len(#charset), since strings start at 1 in SQL
* (7) - Use the offset from (6) and grab a single character from #subset
*/
/**
* All that is left is to add in everything we have left to do.
* We will eventually process the entire string, but we will
* take things one step at a time. Round and round we go!
*/
-- Append everything we have left to do
substring(value, 2 + #i, len(value))
end
-- Select the results
select value
from #random
It's not a stored procedure, but it wouldn't be that hard to turn it into one. It's also not horrendously slow (it took me ~0.3 seconds to generate 1,000 results of length 60, which is more than I'll ever personally need), which was one of my initial concerns from all of the string mutation I'm doing.
The main takeaway here is that I'm not trying to create my own random number generator, and my character set isn't limited. I'm simply using the random generator that SQL has (I know there's rand(), but that's not great for table results). Hopefully this approach marries the two kinds of answers here, from overly simple (i.e. just newid()) and overly complex (i.e. custom random number algorithm).
It's also short (minus the comments), and easy to understand (at least for me), which is always a plus in my book.
However, this method cannot be seeded, so it's going to be truly random each time, and you won't be able to replicate the same set of data with any means of reliability. The OP didn't list that as a requirement, but I know that some people look for that sort of thing.
I know I'm late to the party here, but hopefully someone will find this useful.
Based on various helpful responses in this article I landed with a combination of a couple options I liked.
DECLARE #UserId BIGINT = 12345 -- a uniqueId in my system
SELECT LOWER(REPLACE(NEWID(),'-','')) + CONVERT(VARCHAR, #UserId)
Below are the way to Generate 4 Or 8 Characters Long Random Alphanumeric String in SQL
select LEFT(CONVERT(VARCHAR(36),NEWID()),4)+RIGHT(CONVERT(VARCHAR(36),NEWID()),4)
SELECT RIGHT(REPLACE(CONVERT(VARCHAR(36),NEWID()),'-',''),8)
So I liked a lot of the answers above, but I was looking for something that was a little more random in nature. I also wanted a way to explicitly call out excluded characters. Below is my solution using a view that calls the CRYPT_GEN_RANDOM to get a cryptographic random number. In my example, I only chose a random number that was 8 bytes. Please note, you can increase this size and also utilize the seed parameter of the function if you want. Here is the link to the documentation: https://learn.microsoft.com/en-us/sql/t-sql/functions/crypt-gen-random-transact-sql
CREATE VIEW [dbo].[VW_CRYPT_GEN_RANDOM_8]
AS
SELECT CRYPT_GEN_RANDOM(8) as [value];
The reason for creating the view is because CRYPT_GEN_RANDOM cannot be called directly from a function.
From there, I created a scalar function that accepts a length and a string parameter that can contain a comma delimited string of excluded characters.
CREATE FUNCTION [dbo].[fn_GenerateRandomString]
(
#length INT,
#excludedCharacters VARCHAR(200) --Comma delimited string of excluded characters
)
RETURNS VARCHAR(Max)
BEGIN
DECLARE #returnValue VARCHAR(Max) = ''
, #asciiValue INT
, #currentCharacter CHAR;
--Optional concept, you can add default excluded characters
SET #excludedCharacters = CONCAT(#excludedCharacters,',^,*,(,),-,_,=,+,[,{,],},\,|,;,:,'',",<,.,>,/,`,~');
--Table of excluded characters
DECLARE #excludedCharactersTable table([asciiValue] INT);
--Insert comma
INSERT INTO #excludedCharactersTable SELECT 44;
--Stores the ascii value of the excluded characters in the table
INSERT INTO #excludedCharactersTable
SELECT ASCII(TRIM(value))
FROM STRING_SPLIT(#excludedCharacters, ',')
WHERE LEN(TRIM(value)) = 1;
--Keep looping until the return string is filled
WHILE(LEN(#returnValue) < #length)
BEGIN
--Get a truly random integer values from 33-126
SET #asciiValue = (SELECT TOP 1 (ABS(CONVERT(INT, [value])) % 94) + 33 FROM [dbo].[VW_CRYPT_GEN_RANDOM_8]);
--If the random integer value is not in the excluded characters table then append to the return string
IF(NOT EXISTS(SELECT *
FROM #excludedCharactersTable
WHERE [asciiValue] = #asciiValue))
BEGIN
SET #returnValue = #returnValue + CHAR(#asciiValue);
END
END
RETURN(#returnValue);
END
Below is an example of the how to call the function.
SELECT [dbo].[fn_GenerateRandomString](8,'!,#,#,$,%,&,?');
May this will be an answer to create random lower & upper characters. It's using a while loop which can be controlled to generate the string with a specific length.
--random string generator
declare #maxLength int = 5; --max length
declare #bigstr varchar(10) --uppercase character
declare #smallstr varchar(10) --lower character
declare #i int = 1;
While #i <= #maxLength
begin
set #bigstr = concat(#bigstr, char((rand()*26 + 65)));
set #smallstr = concat(#smallstr, char((rand()*26 + 96)));
set #i = len(#bigstr)
end
--select query
select #bigstr, #smallstr