I have a large database in which I want to do a partial string search. The user will enter characters such as JoeBloggs.
For argument's sake, suppose the database holds the name Joe 23 Blo Ggs 4. I want to remove everything in the name other than A-Z.
I already use REPLACE(Name, ' ','') to remove spaces and UPPER() to capitalize the name.
Is there a more efficient way, perhaps using a regex, to remove anything other than A-Z? I cannot change the values in the database.
1st option -
You can nest REPLACE() functions up to 32 levels deep. It runs fast.
REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (@str, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '')
2nd option - do the reverse of the approach described in "Removing nonnumerical data out of a number + SQL".
3rd option - if you want to use regex, see
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=27205
This one works for me
CREATE Function [dbo].[RemoveNumericCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare @NumRange as varchar(50) = '%[0-9]%'
While PatIndex(@NumRange, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@NumRange, @Temp), 1, '')
Return @Temp
End
and you can use it like so
SELECT dbo.[RemoveNumericCharacters](Name) FROM TARGET_TABLE
Try the expression below, where val is your string or column name. It keeps everything up to the last letter, so it only strips trailing non-letter characters.
CASE WHEN PATINDEX('%[a-z]%', REVERSE(val)) > 0
THEN LEFT(val, LEN(val) - PATINDEX('%[a-z]%', REVERSE(val)) + 1)
ELSE '' END
I just started relearning SQL, and I read this somewhere:
SELECT TRIM('0123456789' FROM customername )
You can strip spaces or other symbols the same way by adding them to the first parameter inside the quotes. Note that TRIM (SQL Server 2017 and later) removes these characters only from the start and end of the string, not from the middle.
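A quick check of that TRIM behaviour may be useful. The sketch below assumes SQL Server 2017 or later and uses a made-up literal in place of customername; the digits at both ends are stripped, while the ones in the middle survive.

```sql
-- Hypothetical sample value; TRIM strips the listed characters from both ends only.
SELECT TRIM('0123456789' FROM '12Joe34Bloggs56') AS trimmed;
-- trimmed: Joe34Bloggs
```

So TRIM alone is not enough when digits can appear inside the name.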
One more approach, using a recursive CTE:
declare @string varchar(100)
set @string ='te165st1230004616161616'
;With cte
as
(
select @string as string,0 as n
union all
select cast(replace(string,n,'') as varchar(100)),n+1
from cte
where n<9
)
select top 1 string from cte
order by n desc
**Output:**
test
Quoting part of @Jatin's answer with some modifications,
use this in your WHERE clause:
SELECT * FROM .... etc.
Where
REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (Name, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '') = P_SEARCH_KEY
DECLARE @input varchar(50)= 'Joe 23 Blo Ggs 4'
SELECT TRANSLATE(@input,'0123456789','          ') As output1,
REPLACE
(
TRANSLATE(@input,'0123456789','          ') /*10 spaces here*/
, ' '
,''
) as output2
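Because TRANSLATE needs its second and third arguments to be the same length, counting literal spaces by hand is easy to get wrong. A minimal sketch of the same idea, assuming SQL Server 2017 or later, builds the replacement string with REPLICATE instead; the variable names @digits and cleaned are only illustrative.

```sql
DECLARE @input  varchar(50) = 'Joe 23 Blo Ggs 4';
DECLARE @digits varchar(10) = '0123456789';

-- Map every digit to a space, then delete all spaces in one REPLACE.
SELECT REPLACE(
           TRANSLATE(@input, @digits, REPLICATE(' ', LEN(@digits))),
           ' ', ''
       ) AS cleaned;
-- cleaned: JoeBloGgs
```

Other unwanted symbols can be handled the same way by appending them to @digits, since the replacement string grows with it.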
Entire SELECT returns string: $P#M from Python. Multiple spaces can be removed using this approach.
SELECT
REPLACE
(
TRANSLATE
(
'$P#M 1244 from93 Python', -- Replace each char from this string
'123456789', -- that is met in this set with
'000000000' -- the corresponding char from this set.
),
'0', -- Now replace all '0'
'' -- with nothing (void) (delete it).
) AS CLEARED;
Remove everything from the first digit onward (was adequate for my use case):
LEFT(field,PATINDEX('%[0-9]%',field+'0')-1)
Remove trailing digits:
LEFT(field,len(field)+1-PATINDEX('%[^0-9]%',reverse('0'+field)))
Not tested, but you can do something like this:
Create Function dbo.AlphasOnly(@s as varchar(max)) Returns varchar(max) As
Begin
Declare @Pos int = 1
Declare @Ret varchar(max) = null
If @s Is Not Null
Begin
Set @Ret = ''
While @Pos <= Len(@s)
Begin
If SubString(@s, @Pos, 1) Like '[A-Za-z]'
Begin
Set @Ret = @Ret + SubString(@s, @Pos, 1)
End
Set @Pos = @Pos + 1
End
End
Return @Ret
End
The key is to use this as a computed column and index it. It doesn't really matter how fast you make this function if the database has to execute it against every row in your large table every time you run the query.
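A rough sketch of what that could look like follows. The names dbo.People, NameAlpha and IX_People_NameAlpha are hypothetical, and it assumes dbo.AlphasOnly was created WITH SCHEMABINDING so that SQL Server can treat it as deterministic. The CAST is there because a varchar(max) column cannot be an index key.

```sql
-- Hypothetical table and column names; adjust the length to your data.
ALTER TABLE dbo.People
    ADD NameAlpha AS CAST(UPPER(dbo.AlphasOnly(Name)) AS varchar(200)) PERSISTED;
GO

CREATE INDEX IX_People_NameAlpha ON dbo.People (NameAlpha);
GO

-- The search now compares against the precomputed, indexed value.
DECLARE @search varchar(200) = 'JOEBLOGGS';
SELECT * FROM dbo.People WHERE NameAlpha = @search;
```

This adds a column but leaves the stored Name values untouched; the function runs when rows are written rather than on every search.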
Related
I have this data result as an nvarchar that I need to compare using IN conditions.
How can I turn this nvarchar '1,2,3,4,5' into '1', '2', '3', '4', '5'?
Use this:
REPLACE('1,2,3,4,5', ',', ''',''')
If the above does not work, then use this:
'''' + REPLACE('1,2,3,4,5', ',', ''',''') + ''''
Don't use IN directly on the string. Instead, split it:
where col in (select s.value from string_split('1,2,3,4,5', ',') s)
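The reason splitting works where REPLACE does not is that IN compares against a list of values, while a quoted, comma-separated string is still a single value. A small self-contained sketch, assuming SQL Server 2016 or later (compatibility level 130) and a made-up table variable @t:

```sql
DECLARE @ids nvarchar(100) = N'1,2,3,4,5';

-- Hypothetical sample data.
DECLARE @t TABLE (col int);
INSERT INTO @t (col) VALUES (2), (5), (9);

-- STRING_SPLIT returns one row per element, so IN sees five separate values.
SELECT t.col
FROM @t AS t
WHERE t.col IN (SELECT TRY_CAST(s.value AS int)
                FROM STRING_SPLIT(@ids, N',') AS s);
```

With this sample data the query should return the rows 2 and 5. TRY_CAST turns any non-numeric element into NULL instead of raising a conversion error.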
SQL: How to find the count of words in the following example?
declare @s varchar(55) = 'How to find the count of words in this string ?'
Subquestions:
How to count spaces?
How to count double/triple/... spaces as one? (See the answer by Gordon Linoff.)
How to avoid counting of special characters? Example: 'Please , don't count this comma'
Is it possible without the string_split function (because it's available only since SQL SERVER 2016)?
Summary with the best solutions HERE
Thanks to Gordon Linoff's answer.
SELECT replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',',')
OutPut
-------
How,to,find,the,count,of,words,in,this,string?
SELECT replace(replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',','),',','')
OutPut
------
Howtofindthecountofwordsinthisstring?
Now take the difference between the lengths of the two outputs and add 1 for the last word, as below.
declare @s varchar(55) = 'How to find the count of words in this string?'
SELECT len(replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',','))
-len(replace(replace(replace(replace(replace(@s,' ','<>'),'><',''),'<>',' '),' ',','),',',''))
+ 1 AS WORD_COUNT
WORD_COUNT
----------
10
http://sqlfiddle.com/#!18/06c1d/5
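The length-difference idea can be written so that the space-collapsing expression appears only once. This is a sketch under a few assumptions: the input contains no '<' or '>' characters, it is not empty, and the derived-table alias c(collapsed) is purely illustrative.

```sql
DECLARE @s varchar(55) = 'How  to find   the count of words in this string?';

-- Collapse runs of spaces once, then count the remaining single spaces.
SELECT LEN(c.collapsed) - LEN(REPLACE(c.collapsed, ' ', '')) + 1 AS WORD_COUNT
FROM (VALUES (
        REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(@s)), ' ', '<>'), '><', ''), '<>', ' ')
     )) AS c(collapsed);
-- WORD_COUNT: 10
```

Trimming first matters because a leading or trailing space would otherwise be counted as a word boundary.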
One method uses a recursive CTE:
declare @s varchar(55) = 'How to find the count of words in this string ?';
with cte as (
select convert(varchar(max), '') as word,
convert(varchar(max), ltrim(@s)) as rest
union all
select left(rest, patindex('%[ ]%', rest + ' ') - 1),
ltrim(stuff(rest, 1, patindex('%[ ]%', rest + ' '), ''))
from cte
where rest <> ''
)
select count(*)
from cte
where word not in ('', '?', ',')
--OPTION (MAXRECURSION 1000); -- use if number of words >99
;
First, collapse every run of two or more spaces into a single space.
declare @str varchar(500) = 'dvdv sdd dfxdfd dfd'
select Replace(Replace(Replace( @str,' ',']['), '[]', ''), '][', ' ')
This removes the unnecessary spaces between the words.
After that you can use string_split (SQL Server 2016 and above). It returns one row per word, so the row count minus 1 is the total number of spaces.
select count(value) - 1 from string_split( @str, ' ')
The final query looks like this:
declare @str varchar(500) = 'dvdv sdd dfxdfd dfd'
select count(value) - 1 from string_split( Replace(Replace(Replace( @str,' ',']['), '[]', ''), '][', ' '), ' ')
For a word count only, if your SQL Server version supports STRING_SPLIT, you can use this simple script:
DECLARE @s VARCHAR(55) = 'How to find the count of words in this string ?'
SELECT
COUNT(
IIF(
LTRIM(value)='',
NULL,
1
)
)
FROM STRING_SPLIT(@s, ' ')
WHERE value LIKE '%[0-9A-Za-z]%'
Using string_split (available only since SQL SERVER 2016):
declare @string varchar(55) = 'How to find the count of words in this string ?';
select count(*) WordCount from string_split(@string,' ') where value like '%[0-9A-Za-z]%'
The same idea is used in the following answers:
https://stackoverflow.com/a/57783421/6165594
https://stackoverflow.com/a/57783743/6165594
Without using string_split:
declare @string varchar(55) = 'How to find the count of words in this string ?';
;with space as
( -- returns space positions in a string
select cast( 0 as int) idx union all
select cast(charindex(' ', @string, idx+1) as int) from space
where charindex(' ', @string, idx+1)>0
)
select count(*) WordCount from space
where substring(@string,idx+1,charindex(' ',@string+' ',idx+1)-idx-1) like '%[0-9A-Za-z]%'
OPTION (MAXRECURSION 0);
The same idea is used in the following answer:
https://stackoverflow.com/a/57787850/6165594
As Inline Function:
ALTER FUNCTION dbo.WordCount
(
@string NVARCHAR(MAX)
, @WordPattern NVARCHAR(MAX) = '%[0-9A-Za-z]%'
)
/*
Call Example:
1) Word count for single string:
select * from WordCount(N'How to find the count of words in this string ? ', default)
2) Word count for set of strings:
select *
from (
select 'How to find the count of words in this string ? ' as string union all
select 'How many words in 2nd example?'
) x
cross apply WordCount(x.string, default)
Limitations:
If the string contains more than 100 spaces, the function fails with the error:
Msg 530, Level 16, State 1, Line 45
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
NB! OPTION (MAXRECURSION 0); -- doesn't work within an inline function
*/
RETURNS TABLE AS RETURN
(
with space as
( -- returns space positions in a string
select cast( 0 as int) idx union all
select cast(charindex(' ', @string, idx+1) as int) from space
where charindex(' ', @string, idx+1)>0
)
select count(*) WordCount from space
where substring(@string,idx+1,charindex(' ',@string+' ',idx+1)-idx-1) like @WordPattern
-- OPTION (MAXRECURSION 0); -- doesn't work within an inline function
);
go
For example: if the SQL column value is sa,123k, the output should be the first three letters, i.e. sak.
Numbers and any special characters need to be eliminated so that only three letters remain. How do we do this?
You can use recursive CTEs for this purpose:
with t as (
select 'sa,123k' as str
),
cte as (
select str, left(str, 1) as c, stuff(str, 1, 1, '') as rest, 1 as lev,
convert(varchar(max), (case when left(str, 1) like '[a-zA-Z]' then left(str, 1) else '' end)) as chars
from t
union all
select str, left(rest, 1) as c, stuff(rest, 1, 1, '') as rest, lev + 1,
convert(varchar(max), (case when left(rest, 1) like '[a-zA-Z]' then chars + left(rest, 1) else chars end))
from cte
where rest > '' and len(chars) < 3
)
select str, max(chars)
from cte
where len(chars) <= 3
group by str;
This might help:
DECLARE @VAR VARCHAR(100)= 'sa,1235JSKL', @RESULT VARCHAR(MAX)=''
SELECT @RESULT = @RESULT+
CASE WHEN RESULT LIKE '[a-zA-Z]' THEN RESULT ELSE '' END
FROM (
SELECT NUMBER, SUBSTRING(@VAR,NUMBER,1) AS RESULT
FROM MASTER..spt_values
WHERE TYPE = 'P' AND NUMBER BETWEEN 1 AND LEN(@VAR)
)A
ORDER BY NUMBER
SELECT SUBSTRING(@RESULT,1,3)
If you want to apply this to a table column, you need to create a scalar function with the same logic. Plenty of articles explain how to create a scalar function.
You can use this function, written by G Mastros, to do this.
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp nvarchar(MAX))
Returns nvarchar(MAX)
AS
Begin
Declare @KeepValues as nvarchar(MAX)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
Then simply call the function like this:
SELECT LEFT(dbo.RemoveNonAlphaCharacters(colName), 3)
FROM TableName
Reference: G Mastros answer on "How to strip all non-alphabetic characters from string in SQL Server" question.
Well, this is ugly, but you could replace all the characters you don't like.
In your example, this would be:
SELECT REPLACE (REPLACE (REPLACE (REPLACE ('sa,123k', '1', ''), '2', ''), '3', ''), ',', '')
Obviously, this needs a lot of replaces if you need all numbers and other sorts of characters replaced.
Edited, based on your comment:
SELECT REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE ('123456gh,.!879k', '1', ''), '2', ''), '3', ''), ',', ''), '4', ''), '5', ''), '6', ''), '.', ''), '!', ''), '7', ''), '8', ''), '9', '')
How Can I separate characters and number from a word using SQL Server query?
Example word: AB12C34DE
The Output is like:
col1
-----
ABCDE
col2
-----
1234
Please try this.
DECLARE @Numstring varchar(100)
SET @Numstring = 'AB12C34DE'
WHILE PATINDEX('%[^0-9]%',@Numstring) <> 0
SET @Numstring = STUFF(@Numstring,PATINDEX('%[^0-9]%',@Numstring),1,'')
SELECT @Numstring As Number
DECLARE @Charstring varchar(100)
SET @Charstring = 'AB12C34DE'
WHILE PATINDEX('%[^A-Z]%',@Charstring) <> 0
SET @Charstring = STUFF(@Charstring,PATINDEX('%[^A-Z]%',@Charstring),1,'')
SELECT @Charstring As Character
Try it like this:
DECLARE @word VARCHAR(100)='AB12C34DE';
WITH Tally(Nmbr) AS
(
SELECT TOP(LEN(@word)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values
)
,Separated AS
(
SELECT CASE WHEN OneChar LIKE '[0-9]' THEN 1 ELSE 0 END AS IsDigit
,OneChar
,Nmbr
FROM Tally
CROSS APPLY(SELECT SUBSTRING(@word,Nmbr,1)) A(OneChar)
)
SELECT (SELECT OneChar AS [*] FROM Separated WHERE IsDigit=1 ORDER BY Nmbr FOR XML PATH(''),TYPE).value('.','nvarchar(max)') AS AllNumbers
,(SELECT OneChar AS [*] FROM Separated WHERE IsDigit=0 ORDER BY Nmbr FOR XML PATH(''),TYPE).value('.','nvarchar(max)') AS AllCharacters;
Some explanation
The idea uses a tally table (a list of numbers). You might use an existing physical numbers table...
The first CTE "Tally" will create a derived list of numbers (1,2,3, ...), one for each character.
The second CTE will read each character one-by-one and mark it as digit or not.
The final query will re-concatenate the list of characters
Since you are using SQL Server 2012, you can't use TRANSLATE, which would simplify this.
One way is to use REPLACE as follows. If you want, you can convert it to a user-defined function so that you don't have to write the same thing again and again.
DECLARE @TABLE TABLE(VAL VARCHAR(100))
INSERT INTO @TABLE SELECT 'AB12C34DE'
SELECT REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (VAL, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '') COL1,
REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (VAL, 'A', ''),
'B', ''),
'C', ''),
'D', ''),
'E', ''),
'F', ''),
'G', ''),
'H', ''),
'I', ''),
'J', '') COL2
--ADD OTHER CHARACTERS
FROM @TABLE
This seems like a good place to use a recursive CTE:
with cte as (
select v.str, convert(varchar(max), '') as digits, convert(varchar(max), '') as chars, 1 as lev
from (values ('AB12C34DE')) v(str)
union all
select stuff(str, 1, 1, ''),
(case when left(str, 1) like '[0-9]' then digits + left(str, 1) else digits end),
(case when left(str, 1) like '[a-zA-Z]' then chars + left(str, 1) else chars end),
lev + 1
from cte
where str > ''
)
select top (1) with ties cte.*
from cte
order by row_number() over (order by lev desc);
As the values() clause suggests, this will work on columns in a table as well as constants.
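To make the table case concrete, here is a sketch of the same recursive CTE run over a column. The table variable @t and its id column are hypothetical; the id is carried through the recursion so that the last (complete) row can be picked per source row. The explicit conversions to varchar(max) keep the anchor and recursive column types identical.

```sql
-- Hypothetical sample table.
DECLARE @t TABLE (id int, val varchar(100));
INSERT INTO @t (id, val) VALUES (1, 'AB12C34DE'), (2, 'X9Y8');

WITH cte AS (
    SELECT id,
           CONVERT(varchar(max), val) AS rest,
           CONVERT(varchar(max), '')  AS digits,
           CONVERT(varchar(max), '')  AS chars,
           1 AS lev
    FROM @t
    UNION ALL
    -- Consume one character per step and append it to the matching bucket.
    SELECT id,
           CONVERT(varchar(max), STUFF(rest, 1, 1, '')),
           CONVERT(varchar(max), CASE WHEN LEFT(rest, 1) LIKE '[0-9]'
                                      THEN digits + LEFT(rest, 1) ELSE digits END),
           CONVERT(varchar(max), CASE WHEN LEFT(rest, 1) LIKE '[a-zA-Z]'
                                      THEN chars + LEFT(rest, 1) ELSE chars END),
           lev + 1
    FROM cte
    WHERE rest > ''
)
SELECT TOP (1) WITH TIES id, digits, chars
FROM cte
ORDER BY ROW_NUMBER() OVER (PARTITION BY id ORDER BY lev DESC);
```

For the two sample rows this should give 1234 / ABCDE and 98 / XY. Values longer than about 100 characters would hit the default recursion limit, so an OPTION (MAXRECURSION n) hint would be needed for those.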
I have a column of alphanumeric IDs let's call it [IDS].
The id's are meant to be numbers only, but some of them have stray characters.
For example:
[IDS]
- 012345A
- 23456789AF
- 789789
I want to turn these into numbers only - so the output would be:
[IDS]
012345
23456789
789789
I want to write some code that will search the column for all and any letters in the alphabet (A-Z) and remove them so I can extract the numeric value.
I know I could do a replace(replace(replace(....etc but for all 26 letters in the alphabet this isn't ideal.
I am now trying to solve it using a "declare @" but these seem to be designed for specific strings and I want the whole column to be searched and replaced.
Using Microsoft SQL Server.
Thanks
CREATE TABLE #Table11
([IDS] varchar(10))
;
INSERT INTO #Table11
([IDS])
VALUES
('012345A'),
('23456789AF'),
('789789')
;
SELECT SUBSTRING([IDS], PATINDEX('%[0-9]%', [IDS]), PATINDEX('%[0-9][^0-9]%', [IDS] + 't') - PATINDEX('%[0-9]%',
[IDS]) + 1) AS IDS
FROM #Table11
output
IDS
012345
23456789
789789
Gotta throw this ugly beast in here...
SELECT REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE(REPLACE (
REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE(REPLACE (
REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE(
REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE
(IDS, 'A', ''), 'B', ''), 'C', ''), 'D', ''), 'E', ''), 'F', ''), 'G', '')
, 'H', ''), 'I', ''), 'J', ''), 'K', ''), 'L', ''), 'M', '')
, 'N', ''), 'O', ''), 'P', ''), 'Q', ''), 'R', ''), 'S', '')
, 'T', ''), 'U', ''), 'V', ''), 'W', ''), 'X', ''), 'Y', '')
, 'Z', '')
FROM #Table11
You may create a function:
CREATE FUNCTION getNumber(@string VARCHAR(1500))
RETURNS VARCHAR(1500)
AS
BEGIN
DECLARE @count int
DECLARE @intNumbers VARCHAR(1500)
SET @count = 1 -- SUBSTRING positions are 1-based
SET @intNumbers = ''
WHILE @count <= LEN(@string)
BEGIN
IF SUBSTRING(@string, @count, 1)>='0' and SUBSTRING (@string, @count, 1) <='9'
BEGIN
SET @intNumbers = @intNumbers + SUBSTRING (@string, @count, 1)
END
SET @count = @count + 1
END
RETURN @intNumbers
END
GO
and then call it :
SELECT dbo.getNumber('23456789AF') As "Number"
Number
23456789
First create this UDF:
CREATE FUNCTION dbo.udf_GetNumeric
(@strAlphaNumeric VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE @intAlpha INT
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
BEGIN
WHILE @intAlpha > 0
BEGIN
SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
END
END
RETURN ISNULL(@strAlphaNumeric,0)
END
GO
Now use the function as
SELECT dbo.udf_GetNumeric(column_name)
from table_name
I hope this solved your problem.