sql server 2008
I have data in a column that looks something like this:
"Brake pad kit, disc brake"
/Brake disk (sold separately).
"The belt pulley, crankshaft"
Fuel Pump
The special characters are the double quote ("), the space, and the slash (/).
I want to remove any special character or space present at the beginning or end of the string.
Is this possible in SQL? I'm not sure.
Please share your thoughts.
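Here is one way to do it using string functions. Note that it strips at most one non-letter character from each end, after trimming spaces:
DECLARE @str VARCHAR(200) = '"The belt pulley, crankshaft"'
-- Inner query: drop the first character if it is not a letter.
-- Outer query: reverse the string, drop the (now first) last character if it is not a letter, then reverse back.
SELECT Reverse(CASE
                 WHEN LEFT(Reverse(scd_str), 1) LIKE '[A-Z]' OR LEFT(Reverse(scd_str), 1) LIKE '[a-z]' THEN Reverse(scd_str)
                 ELSE Substring(Reverse(scd_str), 2, Len(Reverse(scd_str)))
               END)
FROM   (SELECT CASE
                 WHEN LEFT(string, 1) LIKE '[A-Z]' OR LEFT(string, 1) LIKE '[a-z]' THEN string
                 ELSE Substring(string, 2, Len(string))
               END AS Scd_Str
        FROM   (SELECT Rtrim(Ltrim(@str)) AS string) A) B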
Result : The belt pulley, crankshaft
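If a value can carry more than one unwanted character at either end, a PATINDEX-based variant may be easier to extend. The following is only a sketch under a few assumptions: the column is VARCHAR (so DATALENGTH counts characters), the set of characters to strip is exactly the three from the question (double quote, space, slash), and the names raw_value, first_keep and last_keep_from_end are made up for illustration.

```sql
-- Strip any run of ", space or / from both ends of each value.
-- first_keep:         position of the first character that is NOT in the strip set.
-- last_keep_from_end: the same position, counted from the end of the string.
SELECT t.raw_value,
       ISNULL(SUBSTRING(t.raw_value,
                        p.first_keep,
                        DATALENGTH(t.raw_value) - p.last_keep_from_end - p.first_keep + 2),
              '') AS trimmed          -- '' when the value holds nothing but strip characters
FROM   (VALUES ('"Brake pad kit, disc brake"'),
               ('/Brake disk (sold separately).'),
               ('  "The belt pulley, crankshaft" '),
               ('Fuel Pump')) AS t(raw_value)
CROSS APPLY (SELECT NULLIF(PATINDEX('%[^" /]%', t.raw_value), 0) AS first_keep,
                    PATINDEX('%[^" /]%', REVERSE(t.raw_value))   AS last_keep_from_end) AS p;
```

To strip other characters, add them inside the brackets of both patterns. The multi-row VALUES constructor and CROSS APPLY are both available in SQL Server 2008.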
Using SQL Server, I have a column with numeric and Roman numerals at the end. How do I remove the numeric alone without specifying the position?
Job_Title
Data Analyst 2
Manager 50
Robotics 1615
Software Engineer
DATA ENGINEER III
I tried using this query:
SELECT
CASE
WHEN PATINDEX('%[0-9 ]%', job_title) > 0
THEN RTRIM(SUBSTRING(Job_title, 1, PATINDEX('%[0-9 ]%', job_title) - 1))
ELSE JOB_TITLE
END
FROM
my_table
WHERE
PATINDEX('%[0-9]%', JOB_TITLE) <> 0
But the result I'm getting is:
Job_Title
Data
Manager
Robotics
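Use the TRANSLATE function (available in SQL Server 2017 and later) like this:
SELECT TRANSLATE(Job_title, '0123456789', '          ') AS JOB_TITLE
from my_table
The second and third arguments must be the same length, so each of the ten digits is mapped to a space. You can then wrap the result in RTRIM to remove the trailing spaces.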
You should remove the space character from the bracketed pattern. The new code would be:
SELECT case when patindex('%[0-9]%', job_title) > 0 then
rtrim(substring(Job_title,1, patindex('%[0-9]%', job_title) - 1))
else
JOB_TITLE
end
from my_table
WHERE PATINDEX('%[0-9]%',JOB_TITLE) <>0
I think you're trying to remove numbers from the end of a job title, not exclude rows from the results. So, as others have mentioned, you need to remove the space from inside the brackets of the pattern and put it in front of the brackets, to say that the number is separated from the text before it by a space. But I think you also need to remove the wildcard character from the right side of the pattern so that the number has to be at the end of the job title, like this:
SELECT case when patindex('% [0-9]', job_title) > 0 then
rtrim(substring(Job_title,1, patindex('% [0-9]', job_title) - 1))
else
JOB_TITLE
end
from my_table
But, you also mention roman numerals... and... that's tougher if it's possible for a job title to end in something like " X" where it means "X" and not "10". If that's not possible, you should just be able to do [0-9IVXivx] to replace all the bracketed segments.
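One thing to watch: a pattern like '% [0-9]' only matches a title that ends in a space followed by a single digit, so "Manager 50" would not be caught. A possible way around that, sketched here with made-up alias names (sp, tail, Cleaned_Title) and the sample titles from the question, is to isolate the last space-separated token and strip it only when it consists entirely of digits:

```sql
-- sp:   position of the last space, counted from the end (NULL when there is no space).
-- tail: the token after the last space (NULL when there is no space).
SELECT t.Job_Title,
       CASE
         WHEN x.tail LIKE '[0-9]%' AND x.tail NOT LIKE '%[^0-9]%'   -- non-empty, digits only
           THEN RTRIM(LEFT(t.Job_Title, LEN(t.Job_Title) - s.sp))
         ELSE t.Job_Title                                           -- also taken when tail is NULL
       END AS Cleaned_Title
FROM   (VALUES ('Data Analyst 2'),
               ('Manager 50'),
               ('Robotics 1615'),
               ('Software Engineer'),
               ('DATA ENGINEER III')) AS t(Job_Title)
CROSS APPLY (SELECT NULLIF(CHARINDEX(' ', REVERSE(t.Job_Title)), 0) AS sp) AS s
CROSS APPLY (SELECT RIGHT(t.Job_Title, s.sp - 1) AS tail) AS x;
```

Roman numerals such as "III" are left alone because the token is not purely numeric. The sketch assumes the titles have no trailing spaces.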
I am trying to parse out a last name field that may have two last names that are separated by either a blank space ' ' or a hyphen '-' or it may only have one name.
Here is what I'm using to do that:
select top 1000
BENE_FIRST_NAME,
BENE_LAST_NAME,
FirstNm =
case
when BENE_FIRST_NAME like '% %' then
left(BENE_FIRST_NAME, CHARINDEX(' ', BENE_FIRST_NAME))
when BENE_FIRST_NAME like '%-%' then
left(BENE_FIRST_NAME, CHARINDEX('-', BENE_FIRST_NAME))
else BENE_FIRST_NAME
end,
LastNm =
case
when BENE_LAST_NAME like '% %' then
right(BENE_LAST_NAME, CHARINDEX(' ', BENE_LAST_NAME))
when BENE_LAST_NAME like '%-%' then
right(BENE_LAST_NAME, CHARINDEX('-', BENE_LAST_NAME))
else BENE_LAST_NAME
end,
CharIndxDash = CHARINDEX('-', BENE_LAST_NAME),
CharIndxSpace = CHARINDEX(' ', BENE_LAST_NAME)
from xMIUR_Elig_Raw_v3
Here are some results:
BENE_FIRST_NAME   BENE_LAST_NAME       FirstNm    LastNm      CharIndxDash   CharIndxSpace
JUANA             PEREZ-MARTINEZ       JUANA      RTINEZ      6              0
EMILIANO          PICENO ESPINOZA      EMILIANO   SPINOZA     0              7
JULIAN            NIETO-CARRENO        JULIAN     ARRENO      6              0
EMILY             SALMERON TERRIQUEZ   EMILY      TERRIQUEZ   0              9
The CHARINDEX seems to be selecting the correct position but it is not bringing in all of the CHARs to the right of that position. Sometimes it works like in the last record. But sometimes it is off by 1. And sometimes it is off by 2. Any ideas?
If you need to select the part of a last name after the space/hyphen, you need to take the right part of the string with length = total_length - separator_position:
...
LastNm =
case
when BENE_LAST_NAME like '% %' then
right(BENE_LAST_NAME, LEN(BENE_LAST_NAME) - CHARINDEX(' ', BENE_LAST_NAME))
when BENE_LAST_NAME like '%-%' then
right(BENE_LAST_NAME, LEN(BENE_LAST_NAME) -CHARINDEX('-', BENE_LAST_NAME))
else BENE_LAST_NAME
end,
...
Your last name logic doesn't make sense.
RIGHT takes N chars from the right of the string.
CHARINDEX gives the position of a char counted from the left of the string.
You can't use it to find a position from the left and then take that number of chars from the right of the string.
Here's a name:
JOHN MALKOVICH
The space is at 5. If you take 5 chars from the right, you get OVICH. The shorter the name before the space and the longer the name after the space, the fewer chars you get from the last name
Perhaps you mean to put a LEN in there so you take the string length minus the index of the space. You can also use the index in a call to SUBSTRING as the start position, and tell SQL Server to take 9999 chars (or any number longer than the remaining string); it will take everything up to the end of the string:
SUBSTRING(name, CHARINDEX(' ', name)+1, 9999)
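To see the difference side by side, here is a small self-contained comparison. It uses the JOHN MALKOVICH example plus one last name from the question; the alias names (full_name, space_pos and so on) are just for illustration.

```sql
-- Compare the original expression with the two corrected forms.
SELECT n.full_name,
       CHARINDEX(' ', n.full_name)                                  AS space_pos,
       RIGHT(n.full_name, CHARINDEX(' ', n.full_name))              AS right_by_position,  -- the bug
       RIGHT(n.full_name, LEN(n.full_name)
                          - CHARINDEX(' ', n.full_name))            AS right_by_length,
       SUBSTRING(n.full_name, CHARINDEX(' ', n.full_name) + 1, 9999) AS substring_from_pos
FROM   (VALUES ('JOHN MALKOVICH'),
               ('PICENO ESPINOZA')) AS n(full_name);
-- JOHN MALKOVICH:  space_pos 5, right_by_position 'OVICH',   the other two 'MALKOVICH'
-- PICENO ESPINOZA: space_pos 7, right_by_position 'SPINOZA', the other two 'ESPINOZA'
```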
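I think you can simplify your code a lot. Consider the query below, which uses different but representative sample data:
with data (name) as
(select 'first-last' union select 'first last' union select 'firstlast'),
data_prepped (name, indx) as
-- indx is the separator position (assumes at most one space or hyphen), or len(name)+1 when there is none
(select name, coalesce(nullif(charindex(' ', name) + charindex('-', name), 0), len(name) + 1)
from data)
select name,
left(name, indx - 1) as part1,
-- everything after the separator; empty when there is no separator
substring(name, indx + 1, len(name)) as part2
from data_prepped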
I am in the process of loading a bunch of tables into SQL Server and converting them from varchar to specific data types (int, date, etc.). One frustration is how many different ways there are to break the conversion from string to numeric (int, decimal, etc) and that there is not an easy diagnostic tool to find the offending rows (besides ISNUMERIC() which doesn't work all the time).
Here is my list of ways to break the conversion that won't get caught by ISNUMERIC().
The string contains scientific notation (e.g. 3.55E-10)
The string contains a blank ('')
The string contains a non-alphanumeric symbol ('$', '-', ',')
Here's what I'm currently using to compensate:
SELECT
CASE
WHEN [MyColumn] IN ('','-') THEN NULL -- deals with blanks
WHEN [MyColumn] LIKE '%E%' THEN CONVERT(DECIMAL(20, 4), CONVERT(FLOAT(53), [MyColumn])) -- deals with scientific notation
ELSE CAST(REPLACE(REPLACE([MyColumn] , '$', ''), '-', '') AS DECIMAL(20, 4))
END [MyColumn] -- deals with special characters
FROM
MyTable
Does anyone else have others? Or good ways to diagnose?
Don't use ISNUMERIC(). If you are on 2012+ then you could use TRY_CAST or TRY_CONVERT.
If you are on older versions, you could use some syntax like this:
SELECT *
FROM #TableA
WHERE ColA NOT LIKE '%[^0-9]%'
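For the 2012+ route, a diagnostic query could look something like the sketch below: it returns the rows whose value is present but cannot be converted to the target type. The sample values and the alias raw_value are made up; in practice you would select from your own table and column and use your own target type.

```sql
-- Requires SQL Server 2012 or later (TRY_CONVERT).
-- Lists the values that would make a plain CONVERT to DECIMAL(20, 4) fail.
SELECT v.raw_value
FROM   (VALUES ('123.45'),
               ('3.55E-10'),
               (''),
               ('$100'),
               ('1,000')) AS v(raw_value)
WHERE  v.raw_value IS NOT NULL
       AND TRY_CONVERT(DECIMAL(20, 4), v.raw_value) IS NULL;
```

TRY_CONVERT returns NULL instead of raising an error, so the IS NOT NULL check is there to separate genuinely missing values from failed conversions.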
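You can try NOT LIKE '%[^0-9]%' instead of ISNUMERIC(). It is true only when the string contains no character other than a digit, so an empty string has to be excluded separately: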
SELECT col, CASE WHEN col NOT LIKE '%[^0-9]%' and col<>''
THEN 1
ELSE 0
END
FROM T
You can use NOT LIKE to exclude anything that isn't a digit... and REPLACE for commas and periods. Naturally, you can add other nested REPLACE functions for values you want to accept.
declare @var varchar(64) = '55,5646'
SELECT
CASE
WHEN replace(replace(@var,'.',''),',','') NOT LIKE '%[^0-9]%'
THEN 1
ELSE 0
END
This allows you to accept decimals for your decimal / numeric / float conversions.
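Stripping the separators before the check also lets through strings such as '5.5.5' or a lone '.'. If that matters, a slightly stricter test can be sketched with three LIKE conditions. This assumes a period is the only separator you want to accept; the sample values and the alias looks_decimal are invented for illustration.

```sql
-- 1 only when the value has at least one digit, nothing but digits and periods,
-- and at most one period.
SELECT s.val,
       CASE
         WHEN s.val LIKE '%[0-9]%'          -- at least one digit
              AND s.val NOT LIKE '%[^0-9.]%' -- no character other than digits and '.'
              AND s.val NOT LIKE '%.%.%'     -- no second period
           THEN 1
         ELSE 0
       END AS looks_decimal
FROM   (VALUES ('55.5646'),
               ('5.5.5'),
               ('.'),
               ('12a'),
               ('')) AS s(val);
-- 55.5646 -> 1; 5.5.5, '.', 12a and the empty string -> 0
```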
How can I check if string is in AAABBB or AABBAA or BBBAAA or BABABA or ABABAB or XXXXXA or AXXXXX format in SQL server. (where 'A' is any character from A-Z... AAA means all characters are same... same for 'B')
I want to validate this string with such patterns. I tried using regular expressions. I tried many regular expressions but failed to get the result.
I tried something like this:
Select CASE WHEN 'AAABBB' LIKE '%[^a-zA-Z0-9]%' THEN 'Valid' ELSE 'Invalid' END
(Above regular expression is just for demo I need something there to validate my string). For every string I will be needing separate regular expression
I can validate the string by comparing each character with the others, but that would increase the size of my query. I need something short and simple.
You can actually do this . . . painfully. Note that patterns such as ABABAB and BABABA are the same -- because the letters are interchangeable.
select v.str,
(case when len(str) <> 6 then NULL
when rep = '###%%%' then 'AAABBB'
when rep = '##%%##' then 'AABBAA'
when rep = '#%#%#%' then 'ABABAB'
when str like '%' + a then 'XXXXXA'
when str like a + '%' then 'AXXXXX'
end) as match_pattern
from (values ('ABABAB')) v(str) cross apply
(values (left(v.str, 1), left(replace(v.str, left(v.str, 1), ''), 1))) v2 (a, b) cross apply
(values (replace(replace(v.str, a, '#'), b, '%'))) v3(rep);
The idea is to find the values of "a" and "b" in the string. "a" is the first letter in the string. "b" is the first letter that differs from "a". The rest is just replacement and checking against patterns. '#' and '%' are used as placeholders simply because they don't conflict with any characters already in the string.
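To make the intermediate steps visible, the same CROSS APPLY chain can be run on a few sample strings, selecting a, b and rep directly. The sample strings are invented; the expressions are the ones from the query above.

```sql
-- Show the derived letters and the placeholder string for each input.
SELECT v.str, v2.a, v2.b, v3.rep
FROM   (VALUES ('AAABBB'),
               ('QQZZQQ'),
               ('XYXYXY')) AS v(str)
CROSS APPLY (VALUES (LEFT(v.str, 1),
                     LEFT(REPLACE(v.str, LEFT(v.str, 1), ''), 1))) AS v2(a, b)
CROSS APPLY (VALUES (REPLACE(REPLACE(v.str, v2.a, '#'), v2.b, '%'))) AS v3(rep);
-- AAABBB -> a = A, b = B, rep = ###%%%
-- QQZZQQ -> a = Q, b = Z, rep = ##%%##
-- XYXYXY -> a = X, b = Y, rep = #%#%#%
```

Because every string is reduced to the same two placeholders, one comparison per pattern is enough regardless of which letters are actually used.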
First: these are not regular expressions. SQL Server lets you match a single character against a specified range, but that is not remotely close to what is known as regular expressions.
You did not specify what the X represents. I will take it to mean a numerical character, but you can simply adapt the following to what you meant with it. The next query assumes a table mytable with a column str that needs its values to be tested:
select str,
coalesce(
case
when upper(str) like '[A-Z][A-Z][A-Z][A-Z][A-Z][A-Z]' then
case str
when replicate(substring(str,1,2), 3) then 'Valid ABABAB'
when replicate(substring(str,1,1), 2)
+ replicate(substring(str,3,1), 2)
+ replicate(substring(str,1,1), 2) then 'Valid AABBAA'
when replicate(substring(str,1,1), 3)
+ replicate(substring(str,4,1), 3) then 'Valid AAABBB'
end
when str like '[A-Za-z][0-9][0-9][0-9][0-9][0-9]' then 'Valid AXXXXX'
when str like '[0-9][0-9][0-9][0-9][0-9][A-Za-z]' then 'Valid XXXXXA'
end,
'Invalid') as validation
from mytable
There is a boundary case, where A and B happen to be the same character. In that case we actually have AAAAAA. The query above allows this. If this should not be allowed then just exclude that case with an extra condition in the outer case:
select str,
coalesce(
case
when upper(str) like '[A-Z][A-Z][A-Z][A-Z][A-Z][A-Z]'
and str <> replicate(substring(str,1,1), 6) then
case str
when replicate(substring(str,1,2), 3) then 'Valid ABABAB'
when replicate(substring(str,1,1), 2)
+ replicate(substring(str,3,1), 2)
+ replicate(substring(str,1,1), 2) then 'Valid AABBAA'
when replicate(substring(str,1,1), 3)
+ replicate(substring(str,4,1), 3) then 'Valid AAABBB'
end
when str like '[A-Za-z][0-9][0-9][0-9][0-9][0-9]' then 'Valid AXXXXX'
when str like '[0-9][0-9][0-9][0-9][0-9][A-Za-z]' then 'Valid XXXXXA'
end,
'Invalid') as validation
from mytable
I am working with a database of products, trying to extract the product color from a combined ID/color code column where the color code is always the string following the last hyphen in the column. The issue is that the number of hyphens, product ID, and color code can all be different.
Here are four examples:
ABC123-001
BCD45678-0165
S-XYZ999-M2235
A-S-ABC123-001
The color codes in this case would be 001, 0165, M2235, and 001. What would be the best way to select these into their own column?
I think the following does what you want:
select right(col, charindex('-', reverse(col)) - 1)
In the event that you might have no hyphens in the value, then use a case:
select (case when col like '%-%'
then right(col, charindex('-', reverse(col)) - 1)
else col
end)
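Here is the same idea run against the four examples from the question, plus one value without a hyphen. As an alternative to the CASE, this sketch prepends a hyphen before reversing, so CHARINDEX always finds one; when the value has no hyphen, the whole value comes back. The alias color_code and the 'NOHYPHEN' row are made up for illustration, and the values are assumed to have no trailing spaces.

```sql
-- REVERSE('-' + col) always ends with a hyphen, so CHARINDEX never returns 0
-- and RIGHT never receives a negative length.
SELECT c.col,
       RIGHT(c.col, CHARINDEX('-', REVERSE('-' + c.col)) - 1) AS color_code
FROM   (VALUES ('ABC123-001'),
               ('BCD45678-0165'),
               ('S-XYZ999-M2235'),
               ('A-S-ABC123-001'),
               ('NOHYPHEN')) AS c(col);
-- 001, 0165, M2235, 001, NOHYPHEN
```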
It is a good idea to check whether the hyphen exists in the string, to avoid the following error:
Invalid length parameter passed to the right function.
SELECT CASE WHEN Col like '%-%' THEN RIGHT(Col,CHARINDEX('-',REVERSE(Col))-1) ELSE '' END AS ColName