PATINDEX to detect letters and six numbers - sql

SQL Server 2012.
table a
Id name number
1 py/ut/455656/ip null
2 py/ut/jl/op null
3 py/utr//grt null
I want to retrieve the numbers
Id name number
1 py/ut/455656/ip 455656
2 py/ut/jl/op null
3 py/utr//grt null
Here is the SQL script:
update table a
set number=SUBSTRING(name,PATINDEX('py/u/[0-9]',name)+6,6)
I need to retrieve the number after py/ut/ and before the next /. The script works well if there is a number, but for the second row it returns jl/op.
The number always has six digits.

Check this:
declare @Number nvarchar(20)='py/ut/455656/ip'
Declare @intAlpha int
SET @intAlpha = PATINDEX('%[^0-9]%', @Number )
BEGIN
-- Remove non-digit characters one at a time until none remain
WHILE @intAlpha > 0
BEGIN
SET @Number = STUFF(@Number , @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @Number )
END
END
select @Number

Simply add a WHERE clause (note the t in py/ut/ and the % wildcards, which PATINDEX needs in order to match):
update a
set number = SUBSTRING(name, PATINDEX('%py/ut/[0-9]%', name) + 6, 6)
where name like '%py/ut/[0-9]%'

DECLARE @a TABLE([name] NVARCHAR(MAX), number INT NULL)
INSERT @a([name]) VALUES ('py/ut/455656/ip'), ('py/ut/jl/op'), ('py/utr//grt')
UPDATE @a
SET number = SUBSTRING([name], PATINDEX('%/[0-9][0-9][0-9][0-9][0-9][0-9]/%', [name]) + 1, 6)
WHERE [name] LIKE '%/[0-9][0-9][0-9][0-9][0-9][0-9]/%'
SELECT * FROM @a
Make the pattern more or less specific to taste.
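One way to vary the pattern without typing each [0-9] by hand is to build it with REPLICATE. This is only a sketch: the variable name @pattern is made up for illustration, and the sample values are the three names from the question.

```sql
-- Sketch: build the six-digit pattern instead of spelling it out.
-- @pattern is a hypothetical name; the sample values come from the question.
DECLARE @pattern varchar(100) = '%/' + REPLICATE('[0-9]', 6) + '/%';

SELECT v.name,
       PATINDEX(@pattern, v.name) AS pos  -- 0 when there is no six-digit segment
FROM (VALUES ('py/ut/455656/ip'), ('py/ut/jl/op'), ('py/utr//grt')) AS v(name);
```

Changing the 6 in REPLICATE changes how many consecutive digits the pattern requires between the two slashes.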

PATINDEX works like the LIKE operator, so the pattern you're using is actually returning 0 for both rows and is just coincidentally working for the value that has 6 numbers starting after the py/ut/ part. You need to add a wildcard to the pattern you're passing into PATINDEX and a WHERE clause to the UPDATE statement.
Try something like this:
-- Length of the path prefix 'py/ut/'; assumes it is constant
DECLARE @lenPrefix int
SET @lenPrefix = 6
-- Number of digits to extract
DECLARE @lenNumber int
SET @lenNumber = 6
UPDATE a
SET number = SUBSTRING(name, PATINDEX('py/ut/[0-9]%', name) + @lenPrefix, @lenNumber)
WHERE
PATINDEX('py/ut/[0-9]%', name) > 0
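To see the effect of the wildcard in isolation, the two patterns can be compared on a single value. This is a minimal sketch; @name and the column aliases are assumptions for illustration, and the literal is the first row from the question.

```sql
-- Sketch: PATINDEX follows LIKE semantics, so a pattern without % must match the whole string.
DECLARE @name varchar(50) = 'py/ut/455656/ip';

SELECT
    PATINDEX('py/ut/[0-9]',  @name) AS no_wildcard,    -- 0: the pattern does not cover the whole value
    PATINDEX('py/ut/[0-9]%', @name) AS with_wildcard;  -- 1: the match starts at the first character
```

With the wildcard in place the result is 1, so adding the prefix length of 6 lands on position 7, the first digit.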

If the name field contains only one number, this script should work for you.
What I did:
I used PATINDEX() to find the position where the number starts.
I used PATINDEX() again on REVERSE(name) to find the end point.
I used LEN() to find the total length of the field.
Finally, I used SUBSTRING() to capture the number, starting at the start position, with length = total length - (start offset) - (end offset).
Check it:
--DROP TABLE #A
--GO
CREATE TABLE #A
(
id int
,name VARCHAR(100)
);
INSERT INTO #A
VALUES (1, 'py/ut/455656/ip')
, (2, 'py/ut/jl/op ')
, (3, 'py/utr//grt ')
SELECT
id
,name
/*
,PATINDEX('%[0-9]%', name) - 1 --starting point
,PATINDEX('%[0-9]%', REVERSE(name)) - 1 --reverse starting point
*/
,CASE WHEN (PATINDEX('%[0-9]%', name) - 1)>0
THEN SUBSTRING(name
,PATINDEX('%[0-9]%', name),
LEN(NAME) - (PATINDEX('%[0-9]%', name) - 1) - (PATINDEX('%[0-9]%', REVERSE(name)) - 1)
)
ELSE null END Number
FROM #A

Related

Count numeric chars in string

Using T-SQL, I want to count the numeric characters in a string. For example, given the string 'kick0my234ass', I want to count how many digits it contains (4 in this example). I can't use regex, just plain T-SQL.
You COULD do this I suppose:
declare @c varchar(30)
set @c = 'kick0my234ass'
select @c, len(replace(@c,' ','')) - len(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(@c,'0',''),'1',''),'2',''),'3',''),'4',''),'5',''),'6',''),'7',''),'8',''),'9',''),' ',''))
You'll first have to split the string into its individual characters, evaluate which of them are numeric, and finally count those. This will do the trick:
DECLARE @test TABLE (Example NVARCHAR(255))
INSERT @test
VALUES ('kick0my234ass')
SELECT COUNT(1)
FROM @test AS T
INNER JOIN master..spt_values v
ON v.type = 'P'
AND v.number < len(T.Example)
WHERE SUBSTRING(T.Example, v.number + 1, 1) LIKE '[0-9]'
You could try this solution with LIKE-style patterns (if you allow them):
It uses a recursive CTE. At every recursive step, one digit is removed from the given string, and the recursion stops when there are no digits left in the string. The rows are numbered with consecutive ids, so the last id is the number of digits removed from the string.
declare @str varchar(100) = 'kick0my123ass';
with cte as (
select 1 [id], stuff(@str,PATINDEX('%[0-9]%', @str),1,'') [col]
union all
select [id] + 1, stuff([col],PATINDEX('%[0-9]%', [col]),1,'') from cte
where col like '%[0-9]%'
)
--this will give you number of digits in string
--(note: assumes at least one digit; the anchor row has id 1 even when there are none)
select top 1 id from cte order by id desc
Use a WHILE loop to check whether each character is numeric or not.
Query
declare @text as varchar(max) = 'kick0my234ass';
declare @len as int;
select @len = len(@text);
if(@len > 0)
begin
declare @i as int = 1;
declare @count as int = 0;
while(@i <= @len)
begin
if(substring(@text, @i, 1) like '[0-9]')
set @count += 1;
set @i += 1;
end
print 'Count of Numerics in ' + @text + ' : ' + cast(@count as varchar(100));
end
else
print 'Empty string';
If simplicity & performance are important I suggest a purely set-based solution. Grab a copy of DigitsOnlyEE which will remove all non-numeric characters. Then use LEN against the output.
DECLARE @string varchar(100) = '123xxx45ff678';
SELECT string = @string, digitsOnly, DigitCount = LEN(digitsOnly)
FROM dbo.DigitsOnlyEE(@string);
Results
string digitsOnly DigitCount
------------------ ----------- ------------
123xxx45ff678 12345678 8
Using a tally table created by an rCTE:
CREATE TABLE #Sample (S varchar(100));
INSERT INTO #Sample
VALUES ('kick0my234 ass');
GO
WITH Tally AS(
SELECT 1 AS N
UNION ALL
SELECT N + 1
FROM Tally
WHERE N + 1 <= 100)
SELECT S.S, SUM(CASE WHEN SUBSTRING(S,T.N, 1) LIKE '[0-9]' THEN 1 ELSE 0 END) AS Numbers
FROM #Sample S
JOIN Tally T ON LEN(S.S) >= T.N
GROUP BY S.S;
For future reference, please also post your own attempts. We aren't (really) here to do your work for you.

How can I select rows from database where column Name contains exactly 5 digits

I have a database table with a number of rows containing information about books. I need to select the books whose Name contains EXACTLY 5 digits.
I tried this query:
SELECT * FROM books WHERE Name LIKE "*#*#*#*#*#*"
But this query also returns books whose names contain more than 5 digits.
For example, I have these rows (names of books):
To Kill a Mockingbird 2
1984 2
The Lord of the Rings (The Lord of the Rings, #1-3)
The Chronicles of Narnia (Chronicles of Narnia, #1,2,3,4,5,6,7)
And the query I need must return the 2nd item but not the 4th.
If you need to select all books with a certain number of digits you can use a LIKE clause checking for multiple digit ranges.
SELECT *
FROM books
WHERE Name LIKE '*[0-9]*[0-9]*[0-9]*[0-9]*[0-9]*'
If the name column is numeric, then you can use this query:
SELECT * FROM table1 WHERE ISNUMERIC(column1) = 1 and LEN(column1) = 5
If the name column is alphanumeric, you can create a function that returns only the numeric characters of a string:
CREATE FUNCTION dbo.GetNumbers
(@str VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE @intAlpha INT
SET @intAlpha = PATINDEX('%[^0-9]%', @str)
BEGIN
WHILE @intAlpha > 0
BEGIN
SET @str = STUFF(@str, @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @str )
END
END
RETURN ISNULL(@str,0)
END
GO
Once the function has been created, you can use this query:
SELECT * FROM table1 WHERE LEN(dbo.GetNumbers(column1)) = 5
So, it turned out to be very easy:
SELECT *
FROM books
WHERE Name LIKE "*#*#*#*#*#*" AND Name NOT LIKE "*#*#*#*#*#*#*"
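The query above uses Access-style wildcards (* and #). On SQL Server, the same "at least five digits but not six" idea could be sketched with bracket ranges, as below. The table variable @books is an assumption for illustration; its rows are the four sample names from the question.

```sql
-- Sketch: T-SQL version of "five digits somewhere, but not six".
-- @books is a hypothetical table variable holding the question's sample rows.
DECLARE @books TABLE (Name nvarchar(200));
INSERT @books (Name) VALUES
    (N'To Kill a Mockingbird 2'),
    (N'1984 2'),
    (N'The Lord of the Rings (The Lord of the Rings, #1-3)'),
    (N'The Chronicles of Narnia (Chronicles of Narnia, #1,2,3,4,5,6,7)');

SELECT Name
FROM @books
WHERE Name LIKE     '%' + REPLICATE('[0-9]%', 5)   -- at least five digits
  AND Name NOT LIKE '%' + REPLICATE('[0-9]%', 6);  -- but fewer than six
```

Of the four sample rows, only '1984 2' contains exactly five digits, so it is the only row that satisfies both conditions.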
Try following statement with your table name and column name:
select yt.*, regexp_substr(<book_name>, '[[:digit:]]+',1, 1)
from <YOUR_TABLE> yt
where length(regexp_substr(<book_name>, '[[:digit:]]+',1, 1) ) = 5
;

Shuffling numbers based on the numbers from the row

Let's say each row holds a 12-digit number.
AccountNumber
=============
136854775807
293910210121
763781239182
Is it possible to shuffle the numbers of a single row solely based on the numbers of that row? e.g. 136854775807 would become 573145887067
I have created a user-defined function to shuffle the numbers.
I take each character and store it in a table variable along with a random number, then concatenate the characters in ascending order of the random number.
The RAND function cannot be used directly inside a user-defined function, so I created a view that returns a random number.
View : random_num
create view dbo.[random_num]
as
select floor(rand()* 12) as [rnd];
It's not necessary that the random number should be between 0 and 12. We can give a larger number instead of 12.
User-defined function : fn_shuffle
create function dbo.[fn_shuffle](
@acc varchar(12)
)
returns varchar(12)
as begin
declare @tbl as table([a] varchar(1), [b] int);
declare @i as int = 1;
declare @l as int;
set @l = (select len(@acc));
-- pair every character with a random sort key
while(@i <= @l)
begin
insert into @tbl([a], [b])
select substring(@acc, @i, 1), [rnd] from [random_num]
set @i += 1;
end
declare @res as varchar(12);
-- concatenate the characters in random-key order
select @res = stuff((
select '' + [a]
from @tbl
order by [b], [a]
for xml path('')
)
, 1, 0, ''
);
return @res;
end
Then, you would be able to use the function like below.
select [acc_no],
dbo.[fn_shuffle]([acc_no]) as [shuffled]
from dbo.[your_table_name];
I don't really see the utility, but you can. Here is one way:
select t.accountnumber, x.shuffled
from t cross apply
(select '' + digit -- unnamed column, so FOR XML PATH('') emits no element tags
from (values (substring(accountnumber, 1, 1)),
(substring(accountnumber, 2, 1)),
. . .
(substring(accountnumber, 12, 1))
) v(digit)
order by newid()
for xml path ('')
) x(shuffled);

Matching strings in TSQL

I have a table with some columns containing strings, let's say nvarchar. The user passes a string to a function that searches for it in its assigned column. I want to check whether that string is present in the database, but the match does not necessarily have to be 100%.
Let's say for example:
The user passed the string Johnathon and the string John is present in the database.
So, basically, I want to get the number of characters that matched. In this particular case of John and Johnathon, it should be 4 matched and 5 unmatched.
Can I please get some directions to approach this problem?
Edit: What I am guessing is I can do the percentage match thing once I have retrieved the best matching string from the column. So, likewise, if we ignore the number of matched and unmatched characters and focus on retrieving the matched string from database, that should work.
For example, as Johnathon was passed by the user and John is present in the database, I definitely cannot use the LIKE operator here; I need a piece of code that searches for the best-matching string in the column and returns it.
You can do it this way:
SELECT Name, LEN(Name) AS Equals, (LEN('Johnathon') - LEN(Name)) AS NotEquals
FROM TableName
WHERE 'Johnathon' LIKE '%' +Name +'%'
Or if you want to compare both ways then:
DECLARE @parameter NVARCHAR(MAX) = N'Johnathon'
SELECT Name,
CASE WHEN LEN(Name) > LEN(@parameter) THEN LEN(@parameter) ELSE LEN(Name) END AS Equals,
CASE WHEN LEN(Name) > LEN(@parameter) THEN LEN(Name) - LEN(@parameter) ELSE LEN(@parameter) - LEN(Name) END AS NotEquals
FROM TableName
WHERE Name LIKE '%' + @parameter + '%' OR @parameter LIKE '%' + Name + '%'
The Levenshtein distance mentioned by @DeadlyJesus might suit you, but an alternative would be just to count matching characters from the start of the two strings. A simple user-defined function could do this.
create function dbo.MatchStart(@input1 nvarchar(100), @input2 nvarchar(100)) returns int as
begin
declare @i int
set @i = 1
if (@input1 is not null and @input2 is not null)
begin
while (1 = 1)
begin
if (@i > len(@input1) or @i > len(@input2))
break
if (substring(@input1, @i, 1) <> substring(@input2, @i, 1))
break;
set @i = @i + 1
end
end
return @i - 1
end
go
declare @testTable table (text1 nvarchar(100))
declare @userInput nvarchar(100)
insert @testTable values
(null),
(''),
('John'),
('Johnathan'),
('JohXXX'),
('Fred'),
('JxOxHxN')
set @userInput = 'Johnathan'
select text1, dbo.MatchStart(text1, @userInput) as result from @testTable
You can try this approach:-
IF EXISTS(SELECT * FROM TAB_NAME WHERE COL LIKE '%JOHN%')
SELECT LEN('JOHN') AS MATCHED, (LEN(COL) - LEN('JOHN')) AS UNMATCHED
FROM TAB_NAME;
I think this approach can solve your problem.

Query to get only numbers from a string

I have data like this:
string 1: 003Preliminary Examination Plan
string 2: Coordination005
string 3: Balance1000sheet
The output I expect is
string 1: 003
string 2: 005
string 3: 1000
And I want to implement it in SQL.
First create this UDF
CREATE FUNCTION dbo.udf_GetNumeric
(
@strAlphaNumeric VARCHAR(256)
)
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE @intAlpha INT
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
BEGIN
WHILE @intAlpha > 0
BEGIN
SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
END
END
RETURN ISNULL(@strAlphaNumeric,0)
END
GO
Now use the function as
SELECT dbo.udf_GetNumeric(column_name)
from table_name
I hope this solved your problem.
Try this one -
Query:
DECLARE @temp TABLE
(
string NVARCHAR(50)
)
INSERT INTO @temp (string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')
SELECT LEFT(subsrt, PATINDEX('%[^0-9]%', subsrt + 't') - 1)
FROM (
SELECT subsrt = SUBSTRING(string, pos, LEN(string))
FROM (
SELECT string, pos = PATINDEX('%[0-9]%', string)
FROM @temp
) d
) t
Output:
----------
003
005
1000
Query:
DECLARE @temp TABLE
(
string NVARCHAR(50)
)
INSERT INTO @temp (string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')
SELECT SUBSTRING(string, PATINDEX('%[0-9]%', string), PATINDEX('%[0-9][^0-9]%', string + 't') - PATINDEX('%[0-9]%', string) + 1) AS Number
FROM @temp
Please try:
declare @var nvarchar(max)='Balance1000sheet'
SELECT LEFT(Val,PATINDEX('%[^0-9]%', Val+'a')-1) from(
SELECT SUBSTRING(@var, PATINDEX('%[0-9]%', @var), LEN(@var)) Val
)x
Getting the number from a string can be done in a one-liner.
Try this:
SUBSTRING('your-string-here', PATINDEX('%[0-9]%', 'your-string-here'), LEN('your-string-here'))
NB: This returns everything from the first digit onward, so it yields just the number only when the digits come last in the string; e.g. abc123 returns 123, while abc123vfg34 returns 123vfg34.
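To check what the one-liner returns on a mixed value, and to trim that result down to the first run of digits, something like the following can be used. It is only a sketch: @s and the column aliases are made up for illustration, and the LEFT/PATINDEX trimming step is the one used in the earlier answers.

```sql
-- Sketch: @s is a hypothetical sample value.
DECLARE @s varchar(50) = 'abc123vfg34';

SELECT
    -- everything from the first digit to the end of the string
    t.tail AS from_first_digit,
    -- cut the tail at its first non-digit; the appended 'a' guarantees one exists
    LEFT(t.tail, PATINDEX('%[^0-9]%', t.tail + 'a') - 1) AS first_number
FROM (SELECT SUBSTRING(@s, PATINDEX('%[0-9]%', @s), LEN(@s)) AS tail) AS t;
```

For 'abc123vfg34' the first column is 123vfg34 and the second is 123.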
I found this approach works about 3x faster than the top voted answer. Create the following function, dbo.GetNumbers:
CREATE FUNCTION dbo.GetNumbers(@String VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN;
WITH
Numbers
AS (
--Step 1.
--Get a column of numbers to represent
--every character position in the @String.
SELECT 1 AS Number
UNION ALL
SELECT Number + 1
FROM Numbers
WHERE Number < LEN(@String)
)
,Characters
AS (
SELECT Character
FROM Numbers
CROSS APPLY (
--Step 2.
--Use the column of numbers generated above
--to tell substring which character to extract.
SELECT SUBSTRING(@String, Number, 1) AS Character
) AS c
)
--Step 3.
--Pattern match to return only numbers from the CTE
--and use STRING_AGG to rebuild it into a single string.
SELECT @String = STRING_AGG(Character,'')
FROM Characters
WHERE Character LIKE '[0-9]'
--allows going past the default maximum of 100 loops in the CTE
OPTION (MAXRECURSION 8000)
RETURN @String
END
GO
Testing
Testing for purpose:
SELECT dbo.GetNumbers(InputString) AS Numbers
FROM ( VALUES
('003Preliminary Examination Plan') --output: 003
,('Coordination005') --output: 005
,('Balance1000sheet') --output: 1000
,('(111) 222-3333') --output: 1112223333
,('1.38hello#f00.b4r#\-6') --output: 1380046
) testData(InputString)
Testing for performance:
Start off setting up the test data...
--Add table to hold test data
CREATE TABLE dbo.NumTest (String VARCHAR(8000))
--Make a long string with a mix of numbers and letters
--(8 characters x 800 repetitions = 6400 characters)
DECLARE @Num VARCHAR(8000) = REPLICATE('12tf56se',800)
--Add this to the test table 500 times
DECLARE @n INT = 0
WHILE @n < 500
BEGIN
INSERT INTO dbo.NumTest VALUES (@Num)
SET @n = @n +1
END
Now testing the dbo.GetNumbers function:
SELECT dbo.GetNumbers(NumTest.String) AS Numbers
FROM dbo.NumTest -- Time to complete: 1 min 7s
Then testing the UDF from the top voted answer on the same data.
SELECT dbo.udf_GetNumeric(NumTest.String)
FROM dbo.NumTest -- Time to complete: 3 mins 12s
Decimals
If you need it to handle decimals, you can use either of the following approaches, I found no noticeable performance differences between them.
change '[0-9]' to '[0-9.]'
change Character LIKE '[0-9]' to ISNUMERIC(Character) = 1 (SQL treats a single decimal point as "numeric")
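The two options are not fully interchangeable: ISNUMERIC is documented to return 1 for some characters that are not digits, such as the plus sign, the minus sign and currency symbols. A small comparison, as a sketch (the column aliases and the sample characters are chosen for illustration):

```sql
-- Sketch: compare the two filters on single characters.
SELECT c.ch,
       CASE WHEN c.ch LIKE '[0-9.]' THEN 1 ELSE 0 END AS like_match,
       ISNUMERIC(c.ch)                                AS isnumeric_match
FROM (VALUES ('7'), ('.'), ('+'), ('-'), ('$'), ('x')) AS c(ch);
```

The LIKE filter keeps only '7' and '.', whereas ISNUMERIC also accepts '+', '-' and '$', so the ISNUMERIC variant may let sign and currency characters through.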
Bonus
You can easily adapt this to differing requirements by swapping out WHERE Character LIKE '[0-9]' with the following options:
WHERE Character LIKE '[a-zA-Z]' --Get only letters
WHERE Character LIKE '[0-9a-zA-Z]' --Remove non-alphanumeric
WHERE Character LIKE '[^0-9a-zA-Z]' --Get only non-alphanumeric
With the previous queries I get these results:
'AAAA1234BBBB3333' >>>> Output: 1234
'-çã+0!\aº1234' >>>> Output: 0
The code below returns all numeric characters:
1st output: 12343333
2nd output: 01234
declare @StringAlphaNum varchar(255)
declare @Character varchar
declare @SizeStringAlfaNumerica int
declare @CountCharacter int
set @StringAlphaNum = 'AAAA1234BBBB3333'
set @SizeStringAlfaNumerica = len(@StringAlphaNum)
set @CountCharacter = 1
while isnumeric(@StringAlphaNum) = 0
begin
while @CountCharacter < @SizeStringAlfaNumerica
begin
if substring(@StringAlphaNum,@CountCharacter,1) not like '[0-9]%'
begin
set @Character = substring(@StringAlphaNum,@CountCharacter,1)
set @StringAlphaNum = replace(@StringAlphaNum, @Character, '')
end
set @CountCharacter = @CountCharacter + 1
end
set @CountCharacter = 0
end
select @StringAlphaNum
declare @puvodni nvarchar(20)
set @puvodni = N'abc1d8e8ttr987avc'
WHILE PATINDEX('%[^0-9]%', @puvodni) > 0 SET @puvodni = REPLACE(@puvodni, SUBSTRING(@puvodni, PATINDEX('%[^0-9]%', @puvodni), 1), '' )
SELECT @puvodni
A solution for SQL Server 2017 and later, using TRANSLATE:
DECLARE @T table (string varchar(50) NOT NULL);
INSERT @T
(string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet');
SELECT
result =
REPLACE(
TRANSLATE(
T.string COLLATE Latin1_General_CI_AI,
'abcdefghijklmnopqrstuvwxyz',
SPACE(26)),
SPACE(1),
SPACE(0))
FROM @T AS T;
Output:
result
003
005
1000
The code works by:
Replacing characters a-z (ignoring case & accents) with a space
Replacing spaces with an empty string.
The string supplied to TRANSLATE can be expanded to include additional characters.
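As a sketch of that expansion, the character list can be kept in a variable so that the replacement string always has the same length. The variable name @strip and the phone-number sample are assumptions for illustration; TRANSLATE itself requires SQL Server 2017 or later, as noted above.

```sql
-- Sketch: @strip is a hypothetical variable listing every character to remove.
DECLARE @strip varchar(50) = 'abcdefghijklmnopqrstuvwxyz()-.,';

SELECT result =
    REPLACE(
        TRANSLATE(
            '(111) 222-3333' COLLATE Latin1_General_CI_AI,
            @strip,
            SPACE(LEN(@strip))),  -- TRANSLATE needs both lists to be the same length
        SPACE(1),
        SPACE(0));
```

For the sample value this leaves only the digits, 1112223333. Characters that are not listed in @strip are kept unchanged.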
I did not have rights to create functions but had text like
["blahblah012345679"]
and needed to extract the numbers out of the middle.
Note that this assumes the numbers are grouped together and are not at the start or end of the string.
select substring(column_name,patindex('%[0-9]%', column_name),patindex('%[0-9][^0-9]%', column_name)-patindex('%[0-9]%', column_name)+1)
from table_name
Although this is an old thread, it's the first result in a Google search, and I came up with a different answer from the ones above. It lets you pass your own criteria for what to keep within a string, whatever those criteria might be. You can put it in a function to call over and over again if you want.
declare @String VARCHAR(MAX) = '-123. a 456-78(90)'
declare @MatchExpression VARCHAR(255) = '%[0-9]%'
declare @return varchar(max)
WHILE PatIndex(@MatchExpression, @String) > 0
begin
set @return = CONCAT(@return, SUBSTRING(@String, PatIndex(@MatchExpression, @String), 1))
SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
end
select (@return)
This UDF will work for all types of strings:
CREATE FUNCTION udf_getNumbersFromString (@string varchar(max))
RETURNS varchar(max)
AS
BEGIN
WHILE @string like '%[^0-9]%'
SET @string = REPLACE(@string, SUBSTRING(@string, PATINDEX('%[^0-9]%', @string), 1), '')
RETURN @string
END
Just a little modification to @Epsicron's answer:
SELECT SUBSTRING(string, PATINDEX('%[0-9]%', string), PATINDEX('%[0-9][^0-9]%', string + 't') - PATINDEX('%[0-9]%', string) + 1) AS Number
FROM (values ('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')) as a(string)
No need for a temporary variable.
First find the number's starting position, then reverse the string and find the first digit again (which gives the number's end position counted from the end). Subtract 1 from both positions and deduct the results from the string's total length to get the length of the number. Then extract the number using SUBSTRING.
declare @fieldName nvarchar(100)='AAAA1221.121BBBB'
declare @lenSt int=(select PATINDEX('%[0-9]%', @fieldName)-1)
declare @lenEnd int=(select PATINDEX('%[0-9]%', REVERSE(@fieldName))-1)
select SUBSTRING(@fieldName, PATINDEX('%[0-9]%', @fieldName), (LEN(@fieldName) - @lenSt - @lenEnd))
T-SQL function to read all the integers from text and return the one at the indicated index, starting from left or right, also using a starting search term (optional):
create or alter function dbo.udf_number_from_text(
@text nvarchar(max),
@search_term nvarchar(1000) = N'',
@number_position tinyint = 1,
@rtl bit = 0
) returns int
as
begin
declare @result int = 0;
declare @search_term_index int = 0;
if @text is null or len(@text) = 0 goto exit_label;
set @text = trim(@text);
if len(@text) = len(@search_term) goto exit_label;
if len(@search_term) > 0
begin
set @search_term_index = charindex(@search_term, @text);
if @search_term_index = 0 goto exit_label;
end;
if @search_term_index > 0
if @rtl = 0
set @text = trim(right(@text, len(@text) - @search_term_index - len(@search_term) + 1));
else
set @text = trim(left(@text, @search_term_index - 1));
if len(@text) = 0 goto exit_label;
declare @patt_number nvarchar(10) = '%[0-9]%';
declare @patt_not_number nvarchar(10) = '%[^0-9]%';
declare @number_start int = 1;
declare @number_end int;
declare @found_numbers table (id int identity(1,1), val int);
while @number_start > 0
begin
set @number_start = patindex(@patt_number, @text);
if @number_start > 0
begin
if @number_start = len(@text)
begin
insert into @found_numbers(val)
select cast(substring(@text, @number_start, 1) as int);
break;
end;
else
begin
set @text = right(@text, len(@text) - @number_start + 1);
set @number_end = patindex(@patt_not_number, @text);
if @number_end = 0
begin
insert into @found_numbers(val)
select cast(@text as int);
break;
end;
else
begin
insert into @found_numbers(val)
select cast(left(@text, @number_end - 1) as int);
if @number_end = len(@text)
break;
else
begin
set @text = trim(right(@text, len(@text) - @number_end));
if len(@text) = 0 break;
end;
end;
end;
end;
end;
if @rtl = 0
select @result = coalesce(a.val, 0)
from (select row_number() over (order by m.id asc) as c_row, m.val
from @found_numbers as m) as a
where a.c_row = @number_position;
else
select @result = coalesce(a.val, 0)
from (select row_number() over (order by m.id desc) as c_row, m.val
from @found_numbers as m) as a
where a.c_row = @number_position;
exit_label:
return @result;
end;
Example:
select dbo.udf_number_from_text(N'Text text 10 text, 25 term', N'term',2,1);
returns 10;
This is one of the simplest approaches. It works on the entire string, including multiple occurrences of digits.
CREATE FUNCTION dbo.fn_GetNumbers(@strInput NVARCHAR(500))
RETURNS NVARCHAR(500)
AS
BEGIN
DECLARE @strOut NVARCHAR(500) = '', @intCounter INT = 1
WHILE @intCounter <= LEN(@strInput)
BEGIN
SELECT @strOut = @strOut + CASE WHEN SUBSTRING(@strInput, @intCounter, 1) LIKE '[0-9]' THEN SUBSTRING(@strInput, @intCounter, 1) ELSE '' END
SET @intCounter = @intCounter + 1
END
RETURN @strOut
END
The following is a solution using a single common table expression (CTE).
DECLARE @s AS TABLE (id int PRIMARY KEY, value nvarchar(max));
INSERT INTO @s
VALUES
(1, N'003Preliminary Examination Plan'),
(2, N'Coordination005'),
(3, N'Balance1000sheet');
SELECT * FROM @s ORDER BY id;
WITH t AS (
SELECT
id,
1 AS i,
SUBSTRING(value, 1, 1) AS c
FROM
@s
WHERE
LEN(value) > 0
UNION ALL
SELECT
t.id,
t.i + 1 AS i,
SUBSTRING(s.value, t.i + 1, 1) AS c
FROM
t
JOIN @s AS s ON t.id = s.id
WHERE
t.i < LEN(s.value)
)
SELECT
id,
STRING_AGG(c, N'') WITHIN GROUP (ORDER BY i ASC) AS value
FROM
t
WHERE
c LIKE '[0-9]'
GROUP BY
id
ORDER BY
id;
DECLARE @index NVARCHAR(20);
SET @index = 'abd565klaf12';
WHILE PATINDEX('%[0-9]%', @index) != 0
BEGIN
SET @index = REPLACE(@index, SUBSTRING(@index, PATINDEX('%[0-9]%', @index), 1), '');
END
SELECT @index;
As written, this removes the digits and keeps the letters. Replace [0-9] with [a-z] if only the numbers are wanted, and cast the result as desired using the CAST function.
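A sketch of that [a-z] variant with the cast applied is shown below. It assumes a case-insensitive collation (so [a-z] also covers upper-case letters) and a value made up of letters and digits only; any other character would be left in place and could make the CAST fail. The alias digits_as_int is made up for illustration.

```sql
-- Sketch: strip the letters instead of the digits, then cast what is left.
-- Assumes the value contains only letters and digits.
DECLARE @index nvarchar(20) = N'abd565klaf12';

WHILE PATINDEX('%[a-z]%', @index) != 0
BEGIN
    -- remove every occurrence of the first letter found
    SET @index = REPLACE(@index, SUBSTRING(@index, PATINDEX('%[a-z]%', @index), 1), '');
END

SELECT CAST(@index AS int) AS digits_as_int;
```

For 'abd565klaf12' the loop leaves 56512. Using [^0-9] instead of [a-z] would also remove punctuation and spaces.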
Using a user-defined function greatly reduces query speed. This code extracts the number from the string without one:
SELECT
Reverse(substring(Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) )))) , patindex('%[0-9]%', Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) )))) ), len(Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) ))))) )) NumberValue
FROM dbo.TableName
-- PostgreSQL: count the letters and digits in a string
CREATE OR REPLACE FUNCTION count_letters_and_numbers(input_string TEXT)
RETURNS TABLE (letters INT, numbers INT) AS $$
BEGIN
RETURN QUERY SELECT
-- sum() yields bigint, so cast to match the declared INT columns
coalesce(sum(CASE WHEN ch ~ '[A-Za-z]' THEN 1 ELSE 0 END), 0)::int,
coalesce(sum(CASE WHEN ch ~ '[0-9]' THEN 1 ELSE 0 END), 0)::int
-- a NULL delimiter splits the string into single characters
FROM unnest(string_to_array(input_string, NULL)) AS t(ch);
END;
$$ LANGUAGE plpgsql;
For the hell of it...
This solution is different to all earlier solutions, viz:
There is no need to create a function
There is no need to use pattern matching
There is no need for a temporary table
This solution uses a recursive common table expression (CTE)
But first - note the question does not specify where such strings are stored. In my solution below, I create a CTE as a quick and dirty way to put these strings into some kind of "source table".
Note also - this solution uses a recursive common table expression (CTE) - so don't get confused by the usage of two CTEs here. The first is simply to make the data available to the solution - but it is only the second CTE that is required in order to solve this problem. You can adapt the code to make this second CTE query your existing table, view, etc.
Lastly - my coding is verbose, trying to use column and CTE names that explain what is going on and you might be able to simplify this solution a little. I've added in a few pseudo phone numbers with some (expected and atypical, as the case may be) formatting for the fun of it.
with SOURCE_TABLE as (
select '003Preliminary Examination Plan' as numberString
union all select 'Coordination005' as numberString
union all select 'Balance1000sheet' as numberString
union all select '1300 456 678' as numberString
union all select '(012) 995 8322 ' as numberString
union all select '073263 6122,' as numberString
),
FIRST_CHAR_PROCESSED as (
select
len(numberString) as currentStringLength,
isNull(cast(try_cast(replace(left(numberString, 1),' ','z') as tinyint) as nvarchar),'') as firstCharAsNumeric,
cast(isNull(cast(try_cast(nullIf(left(numberString, 1),'') as tinyint) as nvarchar),'') as nvarchar(4000)) as newString,
cast(substring(numberString,2,len(numberString)) as nvarchar) as remainingString
from SOURCE_TABLE
union all
select
len(remainingString) as currentStringLength,
cast(try_cast(replace(left(remainingString, 1),' ','z') as tinyint) as nvarchar) as firstCharAsNumeric,
cast(isNull(newString,'') as nvarchar(3999)) + isNull(cast(try_cast(nullIf(left(remainingString, 1),'') as tinyint) as nvarchar(1)),'') as newString,
substring(remainingString,2,len(remainingString)) as remainingString
from FIRST_CHAR_PROCESSED fcp2
where fcp2.currentStringLength > 1
)
select
newString
,* -- comment this out when required
from FIRST_CHAR_PROCESSED
where currentStringLength = 1
So what's going on here?
Basically in our CTE we are selecting the first character and using try_cast (see docs) to cast it to a tinyint (which is a large enough data type for a single-digit numeral). Note that the type-casting rules in SQL Server say that an empty string (or a space, for that matter) will resolve to zero, so the nullif is added to force spaces and empty strings to resolve to null (see discussion) (otherwise our result would include a zero character any time a space is encountered in the source data).
The CTE also returns everything after the first character - and that becomes the input to our recursive call on the CTE; in other words: now let's process the next character.
Lastly, the field newString in the CTE is generated (in the second SELECT) via concatenation. With recursive CTEs the data type must match between the two SELECT statements for any given column - including the column size. Because we know we are adding (at most) a single character, we are casting that character to nvarchar(1) and we are casting the newString (so far) as nvarchar(3999). Concatenated, the result will be nvarchar(4000) - which matches the type casting we carry out in the first SELECT.
If you run this query and exclude the WHERE clause, you'll get a sense of what's going on - but the rows may be in a strange order. (You won't necessarily see all rows relating to a single input value grouped together - but you should still be able to follow).
Hope it's an interesting option that may help a few people wanting a strictly expression-based solution.
In Oracle
You can get what you want using this:
SUBSTR('ABCD1234EFGH',REGEXP_INSTR ('ABCD1234EFGH', '[[:digit:]]'),REGEXP_COUNT ('ABCD1234EFGH', '[[:digit:]]'))
Sample Query:
SELECT SUBSTR('003Preliminary Examination Plan ',REGEXP_INSTR ('003Preliminary Examination Plan ', '[[:digit:]]'),REGEXP_COUNT ('003Preliminary Examination Plan ', '[[:digit:]]')) SAMPLE1,
SUBSTR('Coordination005',REGEXP_INSTR ('Coordination005', '[[:digit:]]'),REGEXP_COUNT ('Coordination005', '[[:digit:]]')) SAMPLE2,
SUBSTR('Balance1000sheet',REGEXP_INSTR ('Balance1000sheet', '[[:digit:]]'),REGEXP_COUNT ('Balance1000sheet', '[[:digit:]]')) SAMPLE3 FROM DUAL
If you are using Postgres and you have data like '2000 - some sample text', try a combination of substring and position; otherwise, if your data has no delimiter, you need to write a regex:
SUBSTRING(column_name from 0 for POSITION('-' in column_name) - 1) as number_column_name