For example, if the SQL column value is sa,123k, the output should be the first three letters, i.e. sak.
Digits and any special characters need to be eliminated so that only the first three letters remain. How do we do this?
You can use recursive CTEs for this purpose:
with t as (
select 'sa,123k' as str
),
cte as (
select str, left(str, 1) as c, stuff(str, 1, 1, '') as rest, 1 as lev,
convert(varchar(max), (case when left(str, 1) like '[a-zA-Z]' then left(str, 1) else '' end)) as chars
from t
union all
select str, left(rest, 1) as c, stuff(rest, 1, 1, '') as rest, lev + 1,
convert(varchar(max), (case when left(rest, 1) like '[a-zA-Z]' then chars + left(rest, 1) else chars end))
from cte
where rest > '' and len(chars) < 3
)
select str, max(chars)
from cte
where len(chars) <= 3
group by str;
Here is a db<>fiddle.
This might help
DECLARE @VAR VARCHAR(100)= 'sa,1235JSKL', @RESULT VARCHAR(MAX)=''
SELECT @RESULT = @RESULT+
CASE WHEN RESULT LIKE '[a-zA-Z]' THEN RESULT ELSE '' END
FROM (
SELECT NUMBER, SUBSTRING(@VAR,NUMBER,1) AS RESULT
FROM MASTER..spt_values
WHERE TYPE = 'P' AND NUMBER BETWEEN 1 AND LEN(@VAR)
)A
ORDER BY NUMBER
SELECT SUBSTRING(@RESULT,1,3)
If you want to apply this to a table column, you need to create a scalar function with the same logic. Many articles on creating scalar functions can be found by searching online.
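A minimal sketch of such a scalar function, assuming a plain character-by-character loop is acceptable. The name dbo.FirstLetters and its parameters are hypothetical, not taken from the answer above:

```sql
-- Hypothetical helper: returns the first @n letters of @s, skipping everything else.
CREATE FUNCTION dbo.FirstLetters (@s VARCHAR(8000), @n INT)
RETURNS VARCHAR(8000)
AS
BEGIN
    DECLARE @pos INT = 1, @out VARCHAR(8000) = '';
    -- Stop as soon as @n letters have been collected.
    WHILE @pos <= LEN(@s) AND LEN(@out) < @n
    BEGIN
        IF SUBSTRING(@s, @pos, 1) LIKE '[a-zA-Z]'
            SET @out = @out + SUBSTRING(@s, @pos, 1);
        SET @pos = @pos + 1;
    END;
    RETURN @out;
END;
GO
-- Usage on a constant; replace the literal with a column name to apply it per row.
SELECT dbo.FirstLetters('sa,123k', 3) AS FirstThree;  -- sak
```

Note that a NULL input skips the loop and returns an empty string; adjust this if NULL should be preserved.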
You can use this function which is written by G Mastros to do this.
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp nvarchar(MAX))
Returns nvarchar(MAX)
AS
Begin
Declare @KeepValues as nvarchar(MAX)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
Then simply call the function like this:
SELECT LEFT(dbo.RemoveNonAlphaCharacters(colName), 3)
FROM TableName
Reference: G Mastros answer on "How to strip all non-alphabetic characters from string in SQL Server" question.
Well, this is ugly, but you could replace all the characters you don't like.
In your example, this would be:
SELECT REPLACE (REPLACE (REPLACE (REPLACE ('sa,123k', '1', ''), '2', ''), '3', ''), ',', '')
Obviously, this needs a lot of replaces if you need all numbers and other sorts of characters replaced.
Edited, based on your comment:
SELECT REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE ('123456gh,.!879k', '1', ''), '2', ''), '3', ''), ',', ''), '4', ''), '5', ''), '6', ''), '.', ''), '!', ''), '7', ''), '8', ''), '9', '')
Related
How Can I separate characters and number from a word using SQL Server query?
Example word: AB12C34DE
The Output is like:
col1
-----
ABCDE
col2
-----
1234
Please try this.
DECLARE @Numstring varchar(100)
SET @Numstring = 'AB12C34DE'
WHILE PATINDEX('%[^0-9]%',@Numstring) <> 0
SET @Numstring = STUFF(@Numstring,PATINDEX('%[^0-9]%',@Numstring),1,'')
SELECT @Numstring As Number
DECLARE @Charstring varchar(100)
SET @Charstring = 'AB12C34DE'
WHILE PATINDEX('%[^A-Z]%',@Charstring) <> 0
SET @Charstring = STUFF(@Charstring,PATINDEX('%[^A-Z]%',@Charstring),1,'')
SELECT @Charstring As Character
Try it like this:
DECLARE @word VARCHAR(100)='AB12C34DE';
WITH Tally(Nmbr) AS
(
SELECT TOP(LEN(@word)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values
)
,Separated AS
(
SELECT CASE WHEN OneChar LIKE '[0-9]' THEN 1 ELSE 0 END AS IsDigit
,OneChar
,Nmbr
FROM Tally
CROSS APPLY(SELECT SUBSTRING(@word,Nmbr,1)) A(OneChar)
)
SELECT (SELECT OneChar AS [*] FROM Separated WHERE IsDigit=1 ORDER BY Nmbr FOR XML PATH(''),TYPE).value('.','nvarchar(max)') AS AllNumbers
,(SELECT OneChar AS [*] FROM Separated WHERE IsDigit=0 ORDER BY Nmbr FOR XML PATH(''),TYPE).value('.','nvarchar(max)') AS AllCharacters;
Some explanation
The idea uses a tally table (a list of numbers). You might use an existing physical numbers table...
The first CTE "Tally" will create a derived list of numbers (1,2,3, ...), one for each character.
The second CTE will read each character one-by-one and mark it as digit or not.
The final query will re-concatenate the list of characters
Since you are using SQL Server 2012, you can't use TRANSLATE (introduced in SQL Server 2017), which would simplify this.
One way is to use nested REPLACE calls, as follows. If you want, you can wrap this in a user-defined function so that you don't have to write the same thing again and again.
DECLARE @TABLE TABLE(VAL VARCHAR(100))
INSERT INTO @TABLE SELECT 'AB12C34DE'
SELECT REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (VAL, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '') COL1,
REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (VAL, 'A', ''),
'B', ''),
'C', ''),
'D', ''),
'E', ''),
'F', ''),
'J', ''),
'G', ''),
'H', ''),
'I', '') COL2
--ADD OTHER CHARACTERS
FROM @TABLE
This seems like a good place to use a recursive CTE:
with cte as (
select v.str, convert(varchar(max), '') as digits, convert(varchar(max), '') as chars, 1 as lev
from (values ('AB12C34DE')) v(str)
union all
select stuff(str, 1, 1, ''),
(case when left(str, 1) like '[0-9]' then digits + left(str, 1) else digits end),
(case when left(str, 1) like '[a-zA-Z]' then chars + left(str, 1) else chars end),
lev + 1
from cte
where str > ''
)
select top (1) with ties cte.*
from cte
order by row_number() over (order by lev desc);
As the values() clause suggests, this will work on columns in a table as well as constants.
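A sketch of the same recursive CTE run against a table instead of a constant. The table variable @t and its columns id and val are assumptions made for illustration; the explicit conversions to varchar(max) are there so the anchor and recursive members have matching types:

```sql
-- Hypothetical sample data standing in for a real table.
DECLARE @t TABLE (id INT, val VARCHAR(100));
INSERT INTO @t (id, val) VALUES (1, 'AB12C34DE'), (2, 'x9y8');

WITH cte AS (
    SELECT id,
           CONVERT(VARCHAR(MAX), val) AS rest,
           CONVERT(VARCHAR(MAX), '')  AS digits,
           CONVERT(VARCHAR(MAX), '')  AS chars,
           1 AS lev
    FROM @t
    UNION ALL
    -- Peel off one character per step and append it to the matching bucket.
    SELECT id,
           CONVERT(VARCHAR(MAX), STUFF(rest, 1, 1, '')),
           CASE WHEN LEFT(rest, 1) LIKE '[0-9]'    THEN digits + LEFT(rest, 1) ELSE digits END,
           CASE WHEN LEFT(rest, 1) LIKE '[a-zA-Z]' THEN chars  + LEFT(rest, 1) ELSE chars  END,
           lev + 1
    FROM cte
    WHERE rest > ''
)
-- Keep only the last (fully processed) row for each id.
SELECT TOP (1) WITH TIES id, chars, digits
FROM cte
ORDER BY ROW_NUMBER() OVER (PARTITION BY id ORDER BY lev DESC);
```

Values longer than about 100 characters would exceed the default recursion limit, so OPTION (MAXRECURSION 0) may be needed on the final SELECT.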
I have a column of alphanumeric IDs; let's call it [IDS].
The IDs are meant to be numbers only, but some of them have stray characters.
For example:
[IDS]
- 012345A
- 23456789AF
- 789789
I want to turn these into numbers only - so the output would be:
[IDS]
012345
23456789
789789
I want to write some code that will search the column for any letters of the alphabet (A-Z) and remove them so I can extract the numeric value.
I know I could do a replace(replace(replace(... etc., but for all 26 letters of the alphabet this isn't ideal.
I am now trying to solve it using a "DECLARE @" variable, but these seem to be designed for specific strings, and I want the whole column to be searched and replaced.
Using Microsoft SQL Server.
Thanks
CREATE TABLE #Table11
([IDS] varchar(10))
;
INSERT INTO #Table11
([IDS])
VALUES
('012345A'),
('23456789AF'),
('789789')
;
SELECT SUBSTRING([IDS], PATINDEX('%[0-9]%', [IDS]), PATINDEX('%[0-9][^0-9]%', [IDS] + 't') - PATINDEX('%[0-9]%',
[IDS]) + 1) AS IDS
FROM #Table11
output
IDS
012345
23456789
789789
Gotta throw this ugly beast in here...
SELECT REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE(REPLACE (
REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE(REPLACE (
REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE(
REPLACE (REPLACE (REPLACE (REPLACE (REPLACE (REPLACE
(IDS, 'A', ''), 'B', ''), 'C', ''), 'D', ''), 'E', ''), 'F', ''), 'G', '')
, 'H', ''), 'I', ''), 'J', ''), 'K', ''), 'L', ''), 'M', '')
, 'N', ''), 'O', ''), 'P', ''), 'Q', ''), 'R', ''), 'S', '')
, 'T', ''), 'U', ''), 'V', ''), 'W', ''), 'X', ''), 'Y', '')
, 'Z', '')
FROM #Table11
You may create a function :
CREATE FUNCTION dbo.getNumber(@string VARCHAR(1500))
RETURNS VARCHAR(1500)
AS
BEGIN
DECLARE @count int
DECLARE @intNumbers VARCHAR(1500)
SET @count = 1 -- SUBSTRING positions are 1-based
SET @intNumbers = ''
WHILE @count <= LEN(@string)
BEGIN
IF SUBSTRING(@string, @count, 1)>='0' and SUBSTRING (@string, @count, 1) <='9'
BEGIN
SET @intNumbers = @intNumbers + SUBSTRING (@string, @count, 1)
END
SET @count = @count + 1
END
RETURN @intNumbers
END
GO
and then call it :
SELECT dbo.getNumber('23456789AF') As "Number"
Number
23456789
First create this UDF
CREATE FUNCTION dbo.udf_GetNumeric
(@strAlphaNumeric VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE @intAlpha INT
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
BEGIN
WHILE @intAlpha > 0
BEGIN
SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
END
END
RETURN ISNULL(@strAlphaNumeric,0)
END
GO
Now use the function as
SELECT dbo.udf_GetNumeric(column_name)
from table_name
I hope this solved your problem.
I have data as below
98-45.3A-22
104-44.0A-23
00983-29.1-22
01757-42.5A-22
04968-37.3A2-23
I am looking for output as below in SQL Server:
00098-BA45.3A-IN-22
00104-BA44.0A-IN-23
00983-BA29.1-IN-22
01757-BA42.5A-IN-22
04968-BA37.3A2-IN-23
I split the value into parts to cope with tricky data templates. This should work even when the tail is not a dash followed by two digits:
WITH Src AS
(
SELECT * FROM (VALUES
('98-45.3A-22'),
('104-44.0A-23'),
('00983-29.1-22'),
('01757-42.5A-22'),
('04968-37.3A2-23')
) T(X)
), Parts AS
(
SELECT *,
RIGHT('00000'+SUBSTRING(X, 1, CHARINDEX('-',X, 1)-1),5) Front,
'BA'+SUBSTRING(X, CHARINDEX('-',X, 1)+1, 2) BA,
SUBSTRING(X, PATINDEX('%.%',X), LEN(X)-CHARINDEX('-', REVERSE(X), 1)-PATINDEX('%.%',X)+1) P,
SUBSTRING(X, LEN(X)-CHARINDEX('-', REVERSE(X), 1)+1, LEN(X)) En
FROM Src
)
SELECT Front+'-'+BA+P+'-IN'+En
FROM Parts
It returns:
00098-BA45.3A-IN-22
00104-BA44.0A-IN-23
00983-BA29.1-IN-22
01757-BA42.5A-IN-22
04968-BA37.3A2-IN-23
Try this,
DECLARE @String VARCHAR(100) = '98-45.3A-22'
SELECT ISNULL(REPLICATE('0',6 - CHARINDEX('-',@String)),'') -- Add leading Zeros
+ STUFF(
STUFF(@String,CHARINDEX('-',@String),1,'-BA'), -- Add 'BA'
CHARINDEX('-',@String,CHARINDEX('-',@String)+1)+2, -- 2 additional for the character 'BA'
1,'-IN-') -- Add 'IN', keeping the hyphen before the tail
What if the number before the first hyphen is longer than expected and the extra leading zeros need to be removed? The following handles both cases:
DECLARE @String VARCHAR(100) = '0000098-45.3A-22'
SELECT CASE WHEN CHARINDEX('-',@String) <= 6
THEN ISNULL(REPLICATE('0',6 - CHARINDEX('-',@String)),'') -- Add leading Zeros
+ STUFF(
STUFF( @String,CHARINDEX('-',@String),1,'-BA'), -- Add 'BA'
CHARINDEX('-',@String,CHARINDEX('-',@String)+1)+2, -- 2 additional for the character 'BA'
1,'-IN-') -- Add 'IN', keeping the hyphen before the tail
ELSE STUFF(
STUFF(
STUFF(@String,CHARINDEX('-',@String),1,'-BA'), -- Add 'BA'
CHARINDEX('-',@String,CHARINDEX('-',@String)+1)+2, -- 2 additional for the character 'BA'
1,'-IN-'), -- Add 'IN', keeping the hyphen before the tail
1, CHARINDEX('-',@String) - 6, '' -- remove extra leading Zeros
)
END
Making assumptions that the format is consistent (e.g. always ends with "-" + 2 characters....)
DECLARE @Data TABLE (Col1 VARCHAR(100))
INSERT @Data ( Col1 )
SELECT Col1
FROM (
VALUES ('98-45.3A-22'), ('104-44.0A-23'),
('00983-29.1-22'), ('01757-42.5A-22'),
('04968-37.3A2-23')
) x (Col1)
SELECT RIGHT('0000' + LEFT(Col1, CHARINDEX('-', Col1) - 1), 5)
+ '-BA' + SUBSTRING(Col1, CHARINDEX('-', Col1) + 1, CHARINDEX('.', Col1) - CHARINDEX('-', Col1))
+ SUBSTRING(Col1, CHARINDEX('.', Col1) + 1, LEN(Col1) - CHARINDEX('.', Col1) - 3)
+ '-IN-' + RIGHT(Col1, 2)
FROM @Data
It's not ideal IMO to do this string manipulation all the time in SQL. You could shift it out to your presentation layer, or store the pre-formatted value in the db to save the cost of this every time.
Use REPLICATE and CHARINDEX:
REPLICATE: repeats the given string the number of times specified in the function.
CHARINDEX: returns the position of the first occurrence of a substring.
Declare @Data AS VARCHAR(50)='98-45.3A-22'
SELECT REPLICATE('0',6-CHARINDEX('-',@Data)) + @Data
SELECT
SUBSTRING
(
(REPLICATE('0',6-CHARINDEX('-',@Data)) +@Data)
,0
,6
)
+'-'+'BA'+ CAST('<x>' + REPLACE(@Data,'-','</x><x>') + '</x>' AS XML).value('/x[2]','varchar(max)')
+'-'+ 'IN'+ '-' + CAST('<x>' + REPLACE(@Data,'-','</x><x>') + '</x>' AS XML).value('/x[3]','varchar(max)')
Another way is to use PARSENAME(), as in this query:
WITH t AS (
SELECT
PARSENAME(REPLACE(REPLACE(s, '.', '###'), '-', '.'), 3) AS p1,
REPLACE(PARSENAME(REPLACE(REPLACE(s, '.', '###'), '-', '.'), 2), '###', '.') AS p2,
PARSENAME(REPLACE(REPLACE(s, '.', '###'), '-', '.'), 1) AS p3
FROM yourTable)
SELECT RIGHT('00000' + p1, 5) + '-BA' + p2 + '-IN-' + p3
FROM t;
I have searched but not found any examples for my particular problem.
I am trying to strip some unwanted text from a column containing department names. I am trying to combine 2 queries to do this.
This first query strips all characters after the colon in the name:
SELECT
CASE WHEN CHARINDEX(':', DB.Table.DEPT)>0
THEN
LEFT(DB.Table.DEPT, CHARINDEX(':', DB.Table.DEPT)-1)
ELSE
DB.Table.DEPT
END
FROM
DB.Table
The second query strips the prefix from the name:
SELECT
REPLACE(
REPLACE(
REPLACE (DB.Table.DEPT,'[NA1] ','')
,'[NA2] ', '')
,'[NA3] ', '')
FROM
DB.Table
Both of these work great independent of each other, but when I try to combine them it fails.
SELECT
CASE WHEN CHARINDEX(':', DB.Table.DEPT)>0
THEN
LEFT(DB.Table.DEPT, CHARINDEX(':', DB.Table.DEPT)-1)
ELSE
DB.Table.DEPT
END
FROM
(SELECT
REPLACE(
REPLACE(
REPLACE (DB.Table.DEPT,'[NA1] ','')
,'[NA2] ', '')
,'[NA3] ', '')
FROM
DB.Table)
I could really use some guidance with this.
Thanks in advance.
Your query is syntactically incorrect, because you need an alias for the subquery and for the expression result:
SELECT (CASE WHEN CHARINDEX(':', DEPT)>0
THEN LEFT(DEPT, CHARINDEX(':', DEPT)-1)
ELSE DEPT
END)
FROM (SELECT REPLACE(REPLACE(REPLACE(t.DEPT,'[NA1] ',''
), '[NA2] ', ''
), '[NA3] ', ''
) as DEPT
FROM DB.Table t
) t;
EDIT:
To see both the original and new department:
SELECT (CASE WHEN CHARINDEX(':', new_DEPT) > 0
THEN LEFT(new_DEPT, CHARINDEX(':', new_DEPT)-1)
ELSE new_DEPT
END),
Orig_DEPT
FROM (SELECT REPLACE(REPLACE(REPLACE(t.DEPT,'[NA1] ',''
), '[NA2] ', ''
), '[NA3] ', ''
) as new_DEPT,
t.DEPT as orig_DEPT
FROM DB.Table t
) t
You should always name your subqueries.
Try this:
SELECT
CASE WHEN CHARINDEX(':', x.DEPT)>0
THEN
LEFT(x.DEPT, CHARINDEX(':', x.DEPT)-1)
ELSE
x.DEPT
END AS DEPT
FROM
(
SELECT
REPLACE(REPLACE(REPLACE (DEPT,'[NA1] ','') ,'[NA2] ', ''),'[NA3] ', '') AS DEPT
FROM
DB.Table
) x
I have a large database in which I want to do a partial string search. The user will enter characters such as JoeBloggs.
For argument's sake, suppose the database holds the name Joe 23 Blo Ggs 4. I want to remove everything in the name other than A-Z.
I have the REPLACE(Name, ' ','') function to remove spaces and the UPPER() function to capitalize the name.
Is there a more efficient way, perhaps using regex, to strip anything other than A-Z? I cannot change the values in the database.
1st option -
You can nest REPLACE() functions up to 32 levels deep. It runs fast.
REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (@str, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '')
2nd option --
do the reverse of -
Removing nonnumerical data out of a number + SQL
3rd option - if you want to use regex
then
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=27205
This one works for me
CREATE Function [dbo].[RemoveNumericCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare @NumRange as varchar(50) = '%[0-9]%'
While PatIndex(@NumRange, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@NumRange, @Temp), 1, '')
Return @Temp
End
and you can use it like so
SELECT dbo.[RemoveNumericCharacters](Name) FROM TARGET_TABLE
Try the expression below, where val is your string or column name.
CASE WHEN PATINDEX('%[a-z]%', REVERSE(val)) > 1
THEN LEFT(val, LEN(val) - PATINDEX('%[a-z]%', REVERSE(val)) + 1)
ELSE '' END
I just started relearning SQL...and I read this somewhere:
SELECT TRIM('0123456789' FROM customername )
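You can also use this to remove spaces or other symbols: just include them in the first parameter, within the quotes ''. Note that TRIM (available from SQL Server 2017) only strips the listed characters from the start and end of the string, not from the middle.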
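A quick check of what TRIM does with a character list, assuming SQL Server 2017 or later. The sample string is made up for illustration:

```sql
-- TRIM with a character list removes those characters from both ends only.
SELECT TRIM('0123456789' FROM '12Joe 23 Bloggs4') AS trimmed;
-- The leading 12 and trailing 4 are removed; the 23 in the middle stays:
-- 'Joe 23 Bloggs'
```

So TRIM on its own does not cover the case where digits appear inside the name.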
One more approach using Recursive CTE..
declare @string varchar(100)
set @string ='te165st1230004616161616'
;With cte
as
(
select @string as string,0 as n
union all
select cast(replace(string,n,'') as varchar(100)),n+1
from cte
where n<10 -- n runs 0..9 so that every digit, including 9, is replaced
)
select top 1 string from cte
order by n desc
Output:
test
Quoting part of @Jatin's answer with some modifications,
use this in your WHERE clause:
SELECT * FROM .... etc.
Where
REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE (Name, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '') = P_SEARCH_KEY
DECLARE @input varchar(50)= 'Joe 23 Blo Ggs 4'
SELECT TRANSLATE(@input,'0123456789','          ') As output1, /*10 spaces here*/
REPLACE
(
TRANSLATE(@input,'0123456789','          ') /*10 spaces here*/
, ' '
,''
) as output2
Entire SELECT returns string: $P#M from Python. Multiple spaces can be removed using this approach.
SELECT
REPLACE
(
TRANSLATE
(
'$P#M 1244 from93 Python', -- Replace each char from this string
'123456789', -- that is met in this set with
'000000000' -- the corresponding char from this set.
),
'0', -- Now replace all '0'
'' -- with nothing (void) (delete it).
) AS CLEARED;
Remove everything after first digit (was adequate for my use case):
LEFT(field,PATINDEX('%[0-9]%',field+'0')-1)
Remove trailing digits:
LEFT(field,len(field)+1-PATINDEX('%[^0-9]%',reverse('0'+field)))
Not tested, but you can do something like this:
Create Function dbo.AlphasOnly(@s as varchar(max)) Returns varchar(max) As
Begin
Declare @Pos int = 1
Declare @Ret varchar(max) = null
If @s Is Not Null
Begin
Set @Ret = ''
While @Pos <= Len(@s)
Begin
If SubString(@s, @Pos, 1) Like '[A-Za-z]'
Begin
Set @Ret = @Ret + SubString(@s, @Pos, 1)
End
Set @Pos = @Pos + 1
End
End
Return @Ret
End
The key is to use this as a computed column and index it. It doesn't really matter how fast you make this function if the database has to execute it against every row in your large table every time you run the query.
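A rough sketch of the computed-column idea, not tested against a real schema. The table dbo.People, the function name dbo.AlphasOnlyBound, and the index name are all hypothetical. It assumes the function is created WITH SCHEMABINDING so SQL Server can treat it as deterministic, and it uses a bounded varchar(200) instead of varchar(max) because a varchar(max) column cannot be an index key:

```sql
-- Hypothetical table holding the names to search.
CREATE TABLE dbo.People (
    Id   INT IDENTITY(1,1) PRIMARY KEY,
    Name VARCHAR(200) NOT NULL
);
GO
-- Same loop as the function above, but schema-bound and with a bounded return type.
CREATE FUNCTION dbo.AlphasOnlyBound (@s VARCHAR(200))
RETURNS VARCHAR(200)
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @pos INT = 1, @ret VARCHAR(200) = '';
    WHILE @pos <= LEN(@s)
    BEGIN
        IF SUBSTRING(@s, @pos, 1) LIKE '[A-Za-z]'
            SET @ret = @ret + SUBSTRING(@s, @pos, 1);
        SET @pos = @pos + 1;
    END;
    RETURN @ret;
END;
GO
-- The function runs once per row when the row is written, not on every search.
ALTER TABLE dbo.People
    ADD NameAlpha AS dbo.AlphasOnlyBound(Name) PERSISTED;
GO
CREATE INDEX IX_People_NameAlpha ON dbo.People (NameAlpha);
GO
-- The search can now seek on the index instead of calling the function per row.
SELECT Id, Name
FROM dbo.People
WHERE NameAlpha = 'JoeBloggs';
```

With the stored values left untouched in Name, this fits the constraint that the original data cannot be changed, at the cost of one extra column and index.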