I am trying to parse out a last name field that may have two last names that are separated by either a blank space ' ' or a hyphen '-' or it may only have one name.
Here is what I'm using to do that:
select top 1000
BENE_FIRST_NAME,
BENE_LAST_NAME,
FirstNm =
case
when BENE_FIRST_NAME like '% %' then
left(BENE_FIRST_NAME, CHARINDEX(' ', BENE_FIRST_NAME))
when BENE_FIRST_NAME like '%-%' then
left(BENE_FIRST_NAME, CHARINDEX('-', BENE_FIRST_NAME))
else BENE_FIRST_NAME
end,
LastNm =
case
when BENE_LAST_NAME like '% %' then
right(BENE_LAST_NAME, CHARINDEX(' ', BENE_LAST_NAME))
when BENE_LAST_NAME like '%-%' then
right(BENE_LAST_NAME, CHARINDEX('-', BENE_LAST_NAME))
else BENE_LAST_NAME
end,
CharIndxDash = CHARINDEX('-', BENE_LAST_NAME),
CharIndxSpace = CHARINDEX(' ', BENE_LAST_NAME)
from xMIUR_Elig_Raw_v3
Here are some results:
BENE_FIRST_NAME  BENE_LAST_NAME      FirstNm   LastNm     CharIndxDash  CharIndxSpace
JUANA            PEREZ-MARTINEZ      JUANA     RTINEZ     6             0
EMILIANO         PICENO ESPINOZA     EMILIANO  SPINOZA    0             7
JULIAN           NIETO-CARRENO       JULIAN    ARRENO     6             0
EMILY            SALMERON TERRIQUEZ  EMILY     TERRIQUEZ  0             9
The CHARINDEX seems to be selecting the correct position but it is not bringing in all of the CHARs to the right of that position. Sometimes it works like in the last record. But sometimes it is off by 1. And sometimes it is off by 2. Any ideas?
If you need the part of the last name after the space or hyphen, take the right part of the string with length = total_length - delimiter_position:
...
LastNm =
case
when BENE_LAST_NAME like '% %' then
right(BENE_LAST_NAME, LEN(BENE_LAST_NAME) - CHARINDEX(' ', BENE_LAST_NAME))
when BENE_LAST_NAME like '%-%' then
right(BENE_LAST_NAME, LEN(BENE_LAST_NAME) -CHARINDEX('-', BENE_LAST_NAME))
else BENE_LAST_NAME
end,
...
Your last name logic doesn't make sense.
RIGHT takes N chars from the right of the string.
CHARINDEX gives the position of a char counted from the left of the string.
You can't find a position from the left and then take that number of chars from the right of the string.
Here's a name:
JOHN MALKOVICH
The space is at 5. If you take 5 chars from the right, you get OVICH. The shorter the name before the space and the longer the name after it, the fewer chars of the last name you get.
Perhaps you meant to put a LEN in there, so that you take the string length minus the index of the space. You can also pass the index to SUBSTRING as the start position and tell SQL Server to take 9999 chars (or any number longer than the remaining string); it will take everything up to the end of the string:
SUBSTRING(name, CHARINDEX(' ', name)+1, 9999)
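To see the difference on the example name, here is a minimal sketch. The variable @name and the column aliases are made up for illustration; only the name itself comes from the text above.

```sql
-- Illustrative only: @name holds the example name from the answer.
DECLARE @name varchar(50) = 'JOHN MALKOVICH';

SELECT
    CHARINDEX(' ', @name)                             AS SpacePos,      -- 5
    RIGHT(@name, CHARINDEX(' ', @name))               AS WrongLast,     -- 'OVICH' (5 chars from the right)
    RIGHT(@name, LEN(@name) - CHARINDEX(' ', @name))  AS LastViaRight,  -- 'MALKOVICH' (14 - 5 = 9 chars)
    SUBSTRING(@name, CHARINDEX(' ', @name) + 1, 9999) AS LastViaSubstr; -- 'MALKOVICH'
```

Both corrected expressions give the same result; the SUBSTRING form just avoids the extra LEN call.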
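I think you can simplify your code a lot. Consider the query below, which uses different but representative sample data:
with data (name) as
(select 'first-last' union select 'first last' union select 'firstlast'),
data_prepped (name, indx) as
-- position of the delimiter (assumes at most one space or hyphen), or len + 1 when there is none
(select name, coalesce(nullif(charindex(' ', name) + charindex('-', name), 0), len(name) + 1)
from data)
select name,
left(name, indx - 1) as part1, -- everything before the delimiter
substring(name, indx + 1, len(name)) as part2 -- everything after it; empty when there is no delimiter
from data_prepped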
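Adding the two CHARINDEX results only works when a name contains at most one of the two delimiters. If a name might contain both, one alternative (a sketch, not part of the answer above) is to let PATINDEX find the first space or hyphen in a single call. The sample names are taken from the question plus one made-up single name; the hyphen is placed first inside the brackets so that it is read as a literal character rather than a range.

```sql
-- Sketch: split on the first space or hyphen, whichever comes first.
WITH data (name) AS (
    SELECT 'PEREZ-MARTINEZ'  UNION ALL
    SELECT 'PICENO ESPINOZA' UNION ALL
    SELECT 'SELVAM'          -- no delimiter at all
)
SELECT name,
       -- appending ' ' guarantees a match, so LEFT never receives a negative length
       LEFT(name, PATINDEX('%[- ]%', name + ' ') - 1)                  AS part1,
       -- a start position past the end of the string simply yields ''
       SUBSTRING(name, PATINDEX('%[- ]%', name + ' ') + 1, LEN(name)) AS part2
FROM data;
```

For the three sample rows this should return PEREZ / MARTINEZ, PICENO / ESPINOZA, and SELVAM with an empty part2.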
Related
Using SQL Server, I have a column with numeric and Roman numerals at the end. How do I remove the numeric alone without specifying the position?
Job_Title
Data Analyst 2
Manager 50
Robotics 1615
Software Engineer
DATA ENGINEER III
I tried using this query:
SELECT
CASE
WHEN PATINDEX('%[0-9 ]%', job_title) > 0
THEN RTRIM(SUBSTRING(Job_title, 1, PATINDEX('%[0-9 ]%', job_title) - 1))
ELSE JOB_TITLE
END
FROM
my_table
WHERE
PATINDEX('%[0-9]%', JOB_TITLE) <> 0
But the result I'm getting is:
Job_Title
Data
Manager
Robotics
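Use the TRANSLATE function (SQL Server 2017 and later). The list of characters to replace and the list of replacements must have the same length, so the ten digits map to ten spaces:
SELECT TRANSLATE(Job_title, '0123456789', REPLICATE(' ', 10)) AS JOB_TITLE
from my_table
Wrap the result in RTRIM to remove the trailing spaces this leaves behind.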
You should remove the space character from the pattern (PATINDEX takes a LIKE-style pattern, not a regular expression). The new code would be:
SELECT case when patindex('%[0-9]%', job_title) > 0 then
rtrim(substring(Job_title,1, patindex('%[0-9]%', job_title) - 1))
else
JOB_TITLE
end
from my_table
WHERE PATINDEX('%[0-9]%',JOB_TITLE) <>0
I think you're trying to remove numbers from the end of a job title, not to exclude rows. So, as others have mentioned, you need to take the space out of the brackets and put it in front of them, which says the number is separated from whatever precedes it by a space. I think you also need to drop the wildcard from the right-hand side of the pattern so that the number has to be at the end of the job title, and drop the WHERE clause so that titles without a number are still returned. Note that a single [0-9] matches exactly one trailing digit; for longer numbers such as 50, repeat the bracket for each digit. Like this:
SELECT case when patindex('% [0-9]', job_title) > 0 then
rtrim(substring(Job_title,1, patindex('% [0-9]', job_title) - 1))
else
JOB_TITLE
end
from my_table
But you also mention Roman numerals, and that's tougher if a job title can end in something like " X" where it means the letter "X" and not "10". If that can't happen, you should be able to use [0-9IVXivx] in place of each bracketed segment.
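If the trailing number can have any number of digits, one possible sketch is to count the digits from the end of the string with REVERSE. The titles CTE only stands in for my_table and uses the sample values from the question. This assumes the data has no trailing spaces, and it strips trailing digits even when no space precedes them.

```sql
-- Sketch: strip a run of digits from the end of each title, whatever its length.
WITH titles (Job_Title) AS (
    SELECT 'Data Analyst 2'    UNION ALL
    SELECT 'Manager 50'        UNION ALL
    SELECT 'Robotics 1615'     UNION ALL
    SELECT 'Software Engineer' UNION ALL
    SELECT 'DATA ENGINEER III'
)
SELECT Job_Title,
       RTRIM(LEFT(Job_Title,
                  -- position of the first non-digit in the reversed string, minus 1,
                  -- is the number of trailing digits; 'x' guarantees a match
                  LEN(Job_Title) - PATINDEX('%[^0-9]%', REVERSE(Job_Title) + 'x') + 1
       )) AS Cleaned
FROM titles;
```

With the sample rows this should give Data Analyst, Manager and Robotics, and leave Software Engineer and DATA ENGINEER III unchanged.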
Using MSSQL Server 2012
I have a column name LocationName with the following data in it.
LocationName
C1-Highland
C687-I-10 & 51st
C74-Bossier
C0716-South Broadway & Cluff
Las Vegas
I want to find only the records that match a pattern like {CXXXX}, where XXXX can be any digits 0-9.
SELECT
CASE
WHEN LocationName like 'C%' THEN SUBSTRING(LocationName, 0, charindex('-', LocationName, 0))
ELSE 'Unknown'+ '-' + LocationName
END AS storebusinesskey,*
FROM [DBO].[Store]
The problem with this code is that a location name that starts with C but doesn't match the {CXXXX} pattern is also returned, which I don't want.
You can use [] character classes in the LIKE pattern to match what you want.
You said that XXXX can be any number between 0-9.
If you want the first 4 characters after the C to be digits 0-9, you need to repeat [0-9] 4 times:
SELECT * FROM [DBO].[Store]
WHERE LocationName like '[C]'+ REPLICATE('[0-9]', 4) + '%'
If you just want the first character after the C to be a digit 0-9, you can simply do the following:
SELECT * FROM [DBO].[Store]
WHERE LocationName like '[C][0-9]%'
If you want four or more digits, you can use:
SELECT (CASE WHEN LocationName like 'C[0-9][0-9][0-9][0-9]%'
THEN SUBSTRING(LocationName, 1, charindex('-', LocationName + '-') - 1) -- text before the first hyphen
ELSE 'Unknown'+ '-' + LocationName
END) AS storebusinesskey,*
FROM [DBO].[Store];
Note: This logic is a little strange because the WHEN condition is not guaranteeing that there is a hyphen.
If you want exactly four digits -- and no more -- then you need to tweak the logic a bit:
SELECT (CASE WHEN LocationName + 'x' like 'C[0-9][0-9][0-9][0-9][^0-9]%'
THEN SUBSTRING(LocationName, 1, charindex('-', LocationName + '-') - 1) -- text before the first hyphen
ELSE 'Unknown'+ '-' + LocationName
END) AS storebusinesskey,*
FROM [DBO].[Store];
This checks that the following character is not a digit. The + 'x' simply ensures that the check works if the C#### value is at the end of the string.
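To compare how the patterns in these answers behave, here is a small sketch that runs them against the sample values from the question. The store CTE and the column aliases are made up for illustration; it assumes a case-insensitive collation, where 'C' also matches 'c'.

```sql
-- Sketch: evaluate each LIKE pattern against the question's sample data.
WITH store (LocationName) AS (
    SELECT 'C1-Highland'                  UNION ALL
    SELECT 'C687-I-10 & 51st'             UNION ALL
    SELECT 'C74-Bossier'                  UNION ALL
    SELECT 'C0716-South Broadway & Cluff' UNION ALL
    SELECT 'Las Vegas'
)
SELECT LocationName,
       -- C followed by at least one digit
       CASE WHEN LocationName LIKE 'C[0-9]%' THEN 1 ELSE 0 END AS StartsWithCDigit,
       -- C followed by four or more digits
       CASE WHEN LocationName LIKE 'C[0-9][0-9][0-9][0-9]%' THEN 1 ELSE 0 END AS FourOrMoreDigits,
       -- C followed by exactly four digits, then a non-digit or the end of the string
       CASE WHEN LocationName + 'x' LIKE 'C[0-9][0-9][0-9][0-9][^0-9]%' THEN 1 ELSE 0 END AS ExactlyFourDigits
FROM store;
```

With this data, all four C rows match the first pattern, only C0716-South Broadway & Cluff matches the other two, and Las Vegas matches none.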
Hi, I'm having problems changing the FirstWordLength output so that it shows the number of characters in each first word. This is my current query:
SELECT InvoiceLineItemDescription,
LEFT(InvoiceLineItemDescription,
CASE
WHEN charindex(' ', InvoiceLineItemDescription) = 0 THEN LEN(InvoiceLineItemDescription)
ELSE charindex(' ', InvoiceLineItemDescription) - 1 END)
AS FirstWordLength
FROM InvoiceLineItems
ORDER BY FirstWordLength desc;
It should look something like this:
InvoiceLineItemDescription FirstWordLength
citi bank 4
You can get the first word length using charindex():
SELECT InvoiceLineItemDescription,
CHARINDEX(' ', InvoiceLineItemDescription + ' ') - 1 as FirstWordLength
FROM InvoiceLineItems
ORDER BY FirstWordLength desc;
As in your question, this assumes that only spaces are used for delimiting words. You can use PATINDEX() to support more separator characters.
The case in your code should work, but you are using it to extract the first word, rather than just the length.
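As a rough sketch of the PATINDEX variant mentioned above, the query below treats a space, hyphen or comma as the end of the first word. The items CTE stands in for InvoiceLineItems, and apart from 'citi bank' the sample values are made up; the set of separators is also just an assumption.

```sql
-- Sketch: first-word length with several possible separators.
WITH items (InvoiceLineItemDescription) AS (
    SELECT 'citi bank'      UNION ALL
    SELECT 'freight-charge' UNION ALL   -- hypothetical value
    SELECT 'services'                   -- hypothetical value, no separator
)
SELECT InvoiceLineItemDescription,
       -- the hyphen comes first inside [] so it is a literal, not a range;
       -- appending ' ' guarantees a match for single-word descriptions
       PATINDEX('%[- ,]%', InvoiceLineItemDescription + ' ') - 1 AS FirstWordLength
FROM items
ORDER BY FirstWordLength DESC;
```

This should return 8 for services, 7 for freight-charge and 4 for citi bank.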
In a SQL query, I need to select the values shown below from my column.
Result has to be the text after the first space ' ' and before the first '('
Source Column
create Table Test_Table (Column1 Varchar(50))
Insert into Test_Table Values
('0636 KAVITHI (LOC)'),
('0638 SRI KRISHNA (NAT)'),
('0639 SELVAM'),
('0643 GOOD SERVICE (LOC)'),
('0644 FINA CARE EVENT (LOC)')
I need to get the string found between the first ' ' and the '('
Expected Result
KAVITHI
SRI KRISHNA
SELVAM
GOOD SERVICE
FINA CARE EVENT
Another approach without using an OUTER APPLY.
SELECT CASE WHEN Column1 LIKE '%(%'
THEN SUBSTRING(RIGHT(Column1,LEN(Column1)-CHARINDEX(' ',Column1)),0,
CHARINDEX('(',RIGHT(Column1,LEN(Column1)-CHARINDEX(' ',Column1)),0))
ELSE RIGHT(Column1,LEN(Column1)-CHARINDEX(' ',Column1))
END AS Trimmed
FROM Test_Table
OUTPUT
Trimmed
KAVITHI
SRI KRISHNA
SELVAM
GOOD SERVICE
FINA CARE EVENT
SQL Fiddle: http://sqlfiddle.com/#!3/69dd1/20/0
CHARINDEX() can be used to find the position of specific characters.
OUTER APPLY can be used to find the position of the space and brace characters, and store them in a place that you can re-use them.
SUBSTRING() can be used to find the text between the space and the brace.
EDIT: Added CASE to cope with values that contain no (.
SELECT
RTRIM(SUBSTRING(
test_table.column1, -- the field we're searching
stats.idx_space + 1, -- starting from the character after the first space
CASE
WHEN stats.idx_brace > stats.idx_space
THEN stats.idx_brace - 1 -- stop just before the brace
ELSE stats.idx_eos
END
-
stats.idx_space -- for as many characters as there are between the space and the brace
)) -- RTRIM drops the space that precedes the brace
FROM
test_table
OUTER APPLY
(
SELECT
CHARINDEX(' ', test_table.column1) AS idx_space, -- position of the first space
CHARINDEX('(', test_table.column1) AS idx_brace, -- position of the first brace
LEN(test_table.column1) AS idx_eos -- position of the end-of-string
)
AS stats
EDIT: A single "line", as requested.
Do note that forcing this as a single line does make this harder to read, maintain and adapt. One of APPLY's strongest use-cases is to maintain DRY (Don't Repeat Yourself) principles.
This query repeats several parts several times:
- find the first space repeated 3 times
- find the first brace repeated 2 times
SELECT
RTRIM(SUBSTRING(
test_table.column1,
CHARINDEX(' ', test_table.column1) + 1,
CASE
WHEN CHARINDEX('(', test_table.column1) > CHARINDEX(' ', test_table.column1)
THEN CHARINDEX('(', test_table.column1) - 1
ELSE LEN(test_table.column1)
END
-
CHARINDEX(' ', test_table.column1)
))
FROM
test_table
I'm looking for the simplest way to split up first and last name while trimming out the middle initial. The current layout of the field is [Last Name], [First Name] [MI]. Also, the middle initial is not always there. My current code is below; I'm just not sure how to trim the middle initial from the first name without writing a case statement.
SELECT SUBSTRING(h.Name, CHARINDEX(',', h.Name, 0) + 2, LEN(h.Name) - CHARINDEX(',', h.Name, 0)) as FirstName
,SUBSTRING(h.Name, 0, CHARINDEX(',', h.Name, 0)) as LastName
FROM Members h
I have made some assumptions below:
1 - First names are always longer than one character.
2 - Middle initial will always be preceded by a space.
3 - The data is trimmed.
This code will return NULL if any of the above are not true. If your data is not trimmed, you can use RTRIM on all instances of @n below to mitigate.
declare @n as varchar(50)
set @n = 'Smith, John A'
select @n,
case
when SUBSTRING(@n, LEN(@n) - 1, 1) = ' '
then SUBSTRING(@n, LEN(@n), 1)
end
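Building on the same assumptions, a possible sketch for getting the first and last name in one pass is below. It does use a CASE, which the question hoped to avoid, but only once. The members CTE, the r alias and the sample rows are hypothetical, and an initial followed by a period (such as "A.") is not handled.

```sql
-- Sketch: split '[Last Name], [First Name] [MI]' and drop an optional one-letter initial.
WITH members (Name) AS (
    SELECT 'Smith, John A' UNION ALL
    SELECT 'Doe, Jane'
)
SELECT m.Name,
       -- appending ',' keeps the length non-negative when a row has no comma
       LEFT(m.Name, CHARINDEX(',', m.Name + ',') - 1) AS LastName,
       CASE WHEN r.rest LIKE '% _'                      -- ends in a space plus one character
            THEN RTRIM(LEFT(r.rest, LEN(r.rest) - 2))   -- drop the space and the initial
            ELSE r.rest
       END AS FirstName
FROM members AS m
-- compute "everything after the comma" once so it is not repeated in the SELECT list
CROSS APPLY (SELECT LTRIM(SUBSTRING(m.Name, CHARINDEX(',', m.Name + ',') + 1, LEN(m.Name)))) AS r (rest);
```

For the two sample rows this should return Smith / John and Doe / Jane.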
What are the business rules of this system? Will it always be:
last name , first name space a possible middle initial
What other permutations can exist?
Will it always be space letter . ? Because then you could always take the right three characters, look for a space and a period, then remove the set of three.
select REPLACE(firstName+ISNULL(middleName+' ','')+ISNULL( lastName +' ',''),'  ',' ') as 'name' from Contacts