Split name into multiple parts in SELECT statement - sql

I cannot seem to find an existing post on splitting a string into the parts I require. I have a database field in SQL Server that contains the "LastName FirstName MI" (no commas just spaces delimiting each part of a person's name). I have the following SQL to get the FirstName and Last, but cannot figure out how to get the Middle Initial or Middle Name.
Ex. Doe John B
SELECT
RTRIM(LEFT([PATIENT_NAME], CHARINDEX(' ', [PATIENT_NAME]))) AS LastName,
SUBSTRING([PATIENT_NAME], CHARINDEX(' ', [PATIENT_NAME]) + 1, LEN([PATIENT_NAME])) AS FirstName
FROM
Clients
Results in:
FirstName = John B
LastName = Doe
How to just return the first name without the middle initial and get the 'B' as middle name from this string in this SELECT statement?

You can either take the rightmost character, or reverse the string and then take the first character.
SELECT RIGHT(LTRIM(RTRIM([Patient_Name])), 1) AS Middle_Initial
SELECT LEFT(REVERSE(LTRIM(RTRIM([Patient_Name]))), 1) AS Middle_Initial
As for removing MI from your firstname string, I would either find the length of the string and take the left N-2 chars or I would charindex the space and then take that many chars. To put it all together:
DECLARE @name VARCHAR(100) = 'Smith David M '
--Clean the string of leading/trailing whitespace
SELECT LTRIM(RTRIM(@name)) AS name_cleaned
--Find the first space to parse out the last name
SELECT CHARINDEX(' ', @name) AS first_space
--Select all chars before the first space
SELECT LEFT(LTRIM(RTRIM(@name)), CHARINDEX(' ', @name)-1) AS last_name
--Find the next space, use the starting location as the previous space and add 1
SELECT CHARINDEX(' ', @name, CHARINDEX(' ', @name)+1) AS second_space
--Select all chars between the spaces (the length excludes the second space itself)
SELECT SUBSTRING(@name, CHARINDEX(' ', @name)+1, CHARINDEX(' ', @name, CHARINDEX(' ', @name)+1) - CHARINDEX(' ', @name) - 1) AS first_name
--Select the right most char for middle initial
SELECT RIGHT(LTRIM(RTRIM(@name)), 1) AS middle

You can REPLACE the space characters with period characters (.) and use PARSENAME().
Note that this would work for all 3 parts of the name, not just the middle initial.
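A minimal sketch of that idea, assuming the names contain no periods and have at most three space-separated parts. The table variable @names and its sample rows are made up for illustration. PARSENAME numbers its parts from the right, so a two-part name needs its parts shifted; counting the spaces is one way to handle that.

```sql
-- Hypothetical sample data, for illustration only
DECLARE @names TABLE (PATIENT_NAME varchar(250));
INSERT INTO @names VALUES ('Doe John B'), ('Doe Jill');

SELECT
    PATIENT_NAME,
    -- PARSENAME counts parts from the right: part 1 is the last word
    CASE WHEN x.Spaces = 2 THEN PARSENAME(x.Dotted, 3)
         ELSE PARSENAME(x.Dotted, 2) END AS LastName,
    CASE WHEN x.Spaces = 2 THEN PARSENAME(x.Dotted, 2)
         ELSE PARSENAME(x.Dotted, 1) END AS FirstName,
    -- No ELSE branch: two-part names get NULL for the middle part
    CASE WHEN x.Spaces = 2 THEN PARSENAME(x.Dotted, 1) END AS Middle
FROM @names
CROSS APPLY (
    SELECT REPLACE(PATIENT_NAME, ' ', '.') AS Dotted,
           LEN(PATIENT_NAME) - LEN(REPLACE(PATIENT_NAME, ' ', '')) AS Spaces
) AS x;
```

Without the CASE expressions, 'Doe Jill' would put 'Doe' in part 2 and 'Jill' in part 1, so the last name would land in the FirstName column.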

Use the CHARINDEX of the first space to work out the length of the LastName substring, then use it again to work out the start position of the FirstName substring. The trick for the middle name is to give CHARINDEX a start position, here LEN minus the LastName CHARINDEX, so that it skips the first space. This gives you the position of the second space, which is where the middle name begins.
See the example below :
DECLARE @tb TABLE (PATIENT_NAME varchar(250));
INSERT INTO @tb VALUES
('Doe John B')
DECLARE
@LastName INT
, @Middle INT
SELECT
@LastName = CHARINDEX(' ', PATIENT_NAME)
, @Middle = CHARINDEX(' ', PATIENT_NAME, LEN(PATIENT_NAME) - CHARINDEX(' ', PATIENT_NAME))
FROM @tb
-- The +1 / -1 adjustments keep the delimiting spaces out of the results
SELECT
SUBSTRING(PATIENT_NAME, 1, @LastName - 1) LastName
, SUBSTRING(PATIENT_NAME, @LastName + 1, @Middle - @LastName - 1) FirstName
, SUBSTRING(PATIENT_NAME, @Middle + 1, LEN(PATIENT_NAME) - @Middle) Middle
FROM @tb
I have declared some variables to make things more readable, but you can do it without them.
LEFT and RIGHT are certainly easier ways to take the last name and middle name, along with helper functions such as REVERSE and TRIM, but I would prefer PARSENAME as a simpler and cleaner approach.
Here is an example:
SELECT
PARSENAME(REPLACE(PATIENT_NAME,' ','.'),3) LastName
, PARSENAME(REPLACE(PATIENT_NAME,' ','.'),2) FirstName
, PARSENAME(REPLACE(PATIENT_NAME,' ','.'),1) Middle

Since the number of elements you must extract from your string is fixed(3) you can use XML based split:
DECLARE @clients TABLE (PATIENT_NAME nvarchar(max));
INSERT INTO @clients VALUES
(' Doe John B ')
,(' Doe Jane C ')
,(' Doe Jill ')
;WITH Splitted
AS (
SELECT PATIENT_NAME as ORIGINAL_PATIENT_NAME
,REPLACE(REPLACE(REPLACE(ltrim(rtrim(PATIENT_NAME)),' ','<>'),'><',''),'<>',' ') as PATIENT_NAME
,CAST('<x>' + REPLACE(REPLACE(REPLACE(REPLACE(ltrim(rtrim(PATIENT_NAME)),' ','<>'),'><',''),'<>',' '), ' ', '</x><x>') + '</x>' AS XML) AS Parts
FROM @clients
)
SELECT
ORIGINAL_PATIENT_NAME
,PATIENT_NAME
,Parts.value(N'/x[1]', 'nvarchar(max)') AS LAST_NAME
,Parts.value(N'/x[2]', 'nvarchar(max)') AS FIRST_NAME
,Parts.value(N'/x[3]', 'nvarchar(max)') AS MIDDLE_NAME
FROM Splitted
This works even with irregularly spaced names.

Related

Splitting a Full Name into First and Last Name

I have a list of customers whose names are given as full names.
I want to create a function that takes the full name as parameter and returns the first and last name separately. If this is not possible I can have two separate functions one that returns the first name and the other that returns the last name. The full name list contains names that have a maximum of three words.
What I want is this:
When a full name is composed of two words, the first one should be the first name and the second one should be the last name.
When a full name is composed of three words, the first and middle words should be the first name while the third word should be the last name.
Example:-
**Full Name**
John Paul White
Peter Smith
Ann Marie Brown
Jack Black
Sam Olaf Turner
Result:-
**First Name**   **Last Name**
John Paul        White
Peter            Smith
Ann Marie        Brown
Jack             Black
Sam Olaf         Turner
I have searched and found solutions that do not work as intended, and would like some advice.
Keeping it short and simple
DECLARE @t TABLE(Fullname varchar(40))
INSERT @t VALUES('John Paul White'),('Peter Smith'),('Thomas')
SELECT
LEFT(Fullname, LEN(Fullname) - CHARINDEX(' ', REVERSE(FullName))) FirstName,
STUFF(RIGHT(FullName, CHARINDEX(' ', REVERSE(FullName))),1,1,'') LastName
FROM
@t
Result:
FirstName   LastName
John Paul   White
Peter       Smith
Thomas      NULL
If you are certain that your names will only ever be two or three words, with single spaces, then we can rely on the base string functions to extract the first and last name components.
SELECT
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col, 1,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) - 1)
ELSE SUBSTRING(col, 1, CHARINDEX(' ', col) - 1)
END AS first,
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) + 1,
LEN(col) - CHARINDEX(' ', col, CHARINDEX(' ', col) + 1))
ELSE SUBSTRING(col,
CHARINDEX(' ', col) + 1,
LEN(col) - CHARINDEX(' ', col))
END AS last
FROM yourTable;
Yuck, but it seems to work. My feeling is that you should fix your data model at some point. A more ideal place to scrub your name data would be outside the database, e.g. in Java. Or, better yet, fix the source of your data such that you record proper first and last names from the very beginning.
Demo here:
Rextester
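Since the query above relies on every name having two or three words separated by single spaces, it may be worth checking that assumption first. A rough sketch follows; the table variable @yourTable and its rows are hypothetical stand-ins for your own data.

```sql
-- Hypothetical sample data; point the query at your own table instead
DECLARE @yourTable TABLE (col varchar(100));
INSERT INTO @yourTable VALUES
    ('John Paul White'), ('Peter  Smith'), ('Thomas'), (' Ann Marie Brown');

-- List the rows that break the "two or three words, single spaces" assumption
SELECT col
FROM @yourTable
WHERE col LIKE ' %'      -- leading space
   OR col LIKE '% '      -- trailing space
   OR col LIKE '%  %'    -- two consecutive spaces
   OR LEN(col) - LEN(REPLACE(col, ' ', '')) NOT IN (1, 2);  -- not 2 or 3 words
```

With these sample rows, every row except 'John Paul White' should be listed. Any row this returns needs cleaning before the CASE expressions above can be trusted.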
Another option (just for fun) is to use a little XML in concert with a CROSS APPLY
Example
Select FirstName = ltrim(reverse(concat(Pos2,' ',Pos3,' ',Pos4,' ',Pos5)))
,LastName = reverse(Pos1)
From YourTable A
Cross Apply (
Select Pos1 = xDim.value('/x[1]','varchar(max)')
,Pos2 = xDim.value('/x[2]','varchar(max)')
,Pos3 = xDim.value('/x[3]','varchar(max)')
,Pos4 = xDim.value('/x[4]','varchar(max)')
,Pos5 = xDim.value('/x[5]','varchar(max)')
From (Select Cast('<x>' + replace(reverse(A.[Full Name]),' ','</x><x>')+'</x>' as xml) as xDim) XMLData
) B
Returns
FirstName                LastName
John Paul                White
Peter                    Smith
Ann Marie                Brown
Jack                     Black
Sam Olaf                 Turner
                         Cher
Sally Anne Bella Donna   Baxter
You're trying to do two things at once...I won't solve for you, but here's the direction I'd take:
1) Check this out for string splitting: https://ole.michelsen.dk/blog/split-string-to-table-using-transact-sql.html. This will allow you to parse the name into a temp table and you can perform your logic on it to create names based on your rules
2) Create this as a table-valued function so that you can return a single row of parsed FirstName, LastName from your parameter. That way you can join to it and include in your results
Have you tried using the PARSENAME function?
Another way to split a full name into its first name and last name is the PARSENAME string function, as the following script shows:
DECLARE @FullName VARCHAR(100)
SET @FullName = 'John White Doe'
SELECT CONCAT(PARSENAME(REPLACE(@FullName, ' ', '.'), 3),' ',PARSENAME(REPLACE(@FullName, ' ', '.'), 2)) AS [FirstName],
PARSENAME(REPLACE(@FullName, ' ', '.'), 1) AS [LastName]
Make it a table-valued function.
see here for an example
And this is the code you need to create your function. Basically, you just need to split off the last name.
IF OBJECT_ID(N'dbo.ufnParseName', N'TF') IS NOT NULL
DROP FUNCTION dbo.ufnParseName;
GO
CREATE FUNCTION dbo.ufnParseName(@FullName VARCHAR(300))
RETURNS @retParseName TABLE
(
-- Columns returned by the function
FirstName nvarchar(150) NULL,
LastName nvarchar(50) NULL
)
AS
-- Returns the full name split into first name and last name.
BEGIN
DECLARE
@FirstName nvarchar(250),
@LastName nvarchar(250);
-- The last name is everything after the last space
SELECT @LastName = RTRIM(RIGHT(@FullName, CHARINDEX(' ', REVERSE(@FullName)) - 1));
-- The first name is whatever precedes the last name; LEFT avoids removing
-- other occurrences of the same text, which REPLACE would do
SELECT @FirstName = LTRIM(RTRIM(LEFT(@FullName, LEN(@FullName) - LEN(@LastName))));
INSERT @retParseName
SELECT @FirstName, @LastName;
RETURN;
END
You can run as SELECT * FROM dbo.ufnParseName('M J K');
Why a table-valued function?
It lets you avoid duplicating the parsing logic across your SQL queries and keeps them DRY.
You can try the below query. It is written as per your requirement and it only handles full_name with 2 or 3 parts in it.
;WITH cte AS(
SELECT full_name, (LEN(full_name) - LEN(REPLACE(full_name, ' ', '')) + 1) AS size FROM #temp
)
SELECT FirstName =
CASE
WHEN size=3 THEN PARSENAME(REPLACE(full_name, ' ', '.'), 3) + ' ' + PARSENAME(REPLACE(full_name, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(full_name, ' ', '.'), 2)
END,
PARSENAME(REPLACE(full_name, ' ', '.'), 1) AS LastName
FROM cte

How to Specify Trim Chars in SQL TRIM

I have a table Employee in which some values start with ", ". I need to remove the comma and white-space at the beginning of the name in a SELECT query using LTRIM() in SQL Server.
My Table : Employee
CREATE TABLE Employee
(
PersonID int,
ContactName varchar(255),
Address varchar(255),
City varchar(255)
);
INSERT INTO Employee(PersonID, ContactName, Address, City)
VALUES ('1001',', B. Bala','21, Car Street','Bangalore');
SELECT PersonID, ContactName, Address, City FROM Employee
Here the ContactName Column has a value ", B. Bala". I need to remove the comma and white-space at the beginning of the name.
Alas, SQL Server's LTRIM() has historically not supported the ANSI standard functionality of specifying the characters to trim.
In this case, you can use:
(case when ContactName like ', %' then stuff(ContactName, 1, 2, '')
else ContactName
end)
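If you are on SQL Server 2017 or later, TRIM accepts a list of characters to strip, which gives another way to sketch the same thing. This assumes it is acceptable to strip those characters from the end of the value as well, since TRIM works on both ends. The variable below is only for illustration.

```sql
-- Assumes SQL Server 2017 or later
-- TRIM removes the listed characters (comma and space) from BOTH ends
DECLARE @ContactName varchar(255) = ', B. Bala';
SELECT TRIM(', ' FROM @ContactName) AS ContactName;
```

The CASE/STUFF version above only touches the leading ", ", so it remains the safer choice when trailing commas or spaces must be preserved.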
You could potentially use PATINDEX() in order to get this done.
DECLARE @Text VARCHAR(50) = ', Well Crap';
SELECT STUFF(@Text, 1, PATINDEX('%[A-Za-z]%', @Text) - 1, '');
This would output Well Crap. PATINDEX() finds the first letter in the string, and STUFF() cuts everything before it.
It works fine even if there's no leading rubbish:
DECLARE @Text VARCHAR(50) = 'Mister Roboto';
SELECT STUFF(@Text, 1, PATINDEX('%[A-Za-z]%', @Text) - 1, '');
This outputs Mister Roboto.
If there are no letters at all, say ContactName is , 9132124, :::, this would output NULL. If you'd like a blank result instead, you can use COALESCE():
DECLARE @Text VARCHAR(50) = ', 9132124, :::';
SELECT COALESCE(STUFF(@Text, 1, PATINDEX('%[A-Za-z]%', @Text) - 1, ''), '');
This will output an empty string.
You could also use REPLACE, though note that it replaces every occurrence in the string, not just the leading one.
e.g.
REPLACE( ' ,Your String with space comma', ' ,', '')
UPDATE dbo.Employee
SET
dbo.Employee.ContactName = replace(LEFT(ContactName, 2),', ','')
+ SUBSTRING (ContactName, 3, len(contactname))
where LEFT(ContactName, 2)=', '
This will only update rows where the first two characters are ', '.

How to get middle portion from Sql server table data?

I am trying to get the first name from an employee table, where full_name looks like this: Dow, Mike P.
I tried the syntax below to get the first name, but the result includes the middle initial. How can I remove the middle initial from the first name, if there is one? Not all names contain a middle initial.
-- query--
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
len(Employee_First_Name)) AS FirstName
---> remove middle initial from right side from employee
-- result
Full_name     Firstname
Dow,Mike P.   Mike P.
--few example for Full_name data---
smith,joe j. --->joe (need result as)
smith,alan ---->alan (need result as)
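Instead of specifying LEN, use CHARINDEX again with a start position just past the comma, so that it finds the space that follows the first name. SUBSTRING's third argument is a length, so subtract the start position from it:
select Employee_First_Name as full_name,
LTRIM(SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 2) - CHARINDEX(',', Employee_First_Name) - 1)) AS FirstName
One thing to note: the second CHARINDEX returns 0 if there is no such space, which would make the length negative. In that case, you would want to use something like the following:
select Employee_First_Name as full_name,
LTRIM(SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
IIF(CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 2) = 0, LEN(Employee_First_Name), CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 2) - CHARINDEX(',', Employee_First_Name) - 1))) AS FirstName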
This removes the portion before the comma, then takes that string and removes everything after the first space.
WITH cte AS (
SELECT *
FROM (VALUES('smith,joe j.'),('smith,alan'),('joe smith')) t(fullname)
)
SELECT
SUBSTRING(
LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname))),
0,
COALESCE(NULLIF(CHARINDEX(' ',LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname)))),0),LEN(fullname)))
FROM cte
output
------
joe
alan
joe
To be honest, this is most easily expressed using multiple levels of logic. One way is using outer apply:
select ttt.firstname
from t outer apply
(select substring(t.full_name, charindex(', ', t.full_name) + 2, len(t.full_name)) as firstmi
) tt outer apply
(select (case when tt.firstmi like '% %'
then left(tt.firstmi, charindex(' ', tt.firstmi) - 1)
else tt.firstmi
end) as firstname
) as ttt
If you want to put this all in one complicated statement, I would suggest a computed column:
alter table t
add firstname as (stuff((case when full_name like '%, % %.'
then left(full_name,
charindex(' ', full_name, charindex(', ', full_name) + 2) - 1
)
else full_name
end),
1,
charindex(', ', full_name) + 1,
''))
If format of this full_name field is the same for all rows, you may utilize power of SQL FTS word breaker for this task:
SELECT N'Dow, Mike P.' AS full_name INTO #t
SELECT display_term FROM #t
CROSS APPLY sys.dm_fts_parser(N'"' + full_name + N'"', 1033, NULL, 1) p
WHERE occurrence = 2
DROP TABLE #t

deleting second comma in data

OK, so I have a table called PEOPLE that has a name column. The name column holds a name, but it's a total mess. For some reason it's not stored as last, first middle. Instead it looks like last,first,middle: the last and first names (and the middle name, if there is one) are separated by commas, so there are two commas if the person has a middle name.
example:
smith,steve
smith,steve,j
smith,ryan,tom
I'd like the second comma taken away (for parsing reasons) and a space put after the first comma, so the above would come out looking like:
smith, steve
smith, steve j
smith, ryan tom
Ultimately I'd like to be able to parse the names into first, middle, and last name fields, but that's for another post :_0. I appreciate any help.
thank you.
Drop table T1;
Create table T1(Name varchar(100));
Insert T1 Values
('smith,steve'),
('smith,steve,j'),
('smith,ryan,tom');
UPDATE T1
SET Name=
CASE CHARINDEX(',',name, CHARINDEX(',',name)+1) WHEN
0 THEN Name
ELSE
LEFT(name,CHARINDEX(',',name, CHARINDEX(',',name)+1)-1)+' ' +
RIGHT(name,LEN(Name)-CHARINDEX(',',name, CHARINDEX(',',name)+1))
END
Select * from T1
This seems to work. Not the most concise but avoids cursors.
DECLARE @people TABLE (name varchar(50))
INSERT INTO @people
SELECT 'smith,steve'
UNION
SELECT 'smith,steve,j'
UNION
SELECT 'smith,ryan,tom'
UNION
SELECT 'commaless'
SELECT name,
CASE
WHEN CHARINDEX(',',name) > 0 THEN
CASE
WHEN CHARINDEX(',',name,CHARINDEX(',',name) + 1) > 0 THEN
STUFF(STUFF(name, CHARINDEX(',',name,CHARINDEX(',',name) + 1), 1, ' '),CHARINDEX(',',name),1,', ')
ELSE
STUFF(name,CHARINDEX(',',name),1,', ')
END
ELSE name
END AS name2
FROM @people
Using a table function to split apart the names with a delimiter and for XML Path to stitch them back together, we can get what you're looking for! Hope this helps!
Declare @People table(FullName varchar(200))
Insert Into @People Values ('smith,steve')
Insert Into @People Values ('smith,steve,j')
Insert Into @People Values ('smith,ryan,tom')
Insert Into @People Values ('smith,john,joseph Jr')
Select p.*,stuff(fn.FullName,1,2,'') as ModifiedFullName
From @People p
Cross Apply (
select
Case When np.posID<=2 Then ', ' Else ' ' End+np.Val
From @People n
Cross Apply Custom.SplitValues(n.FullName,',') np
Where n.FullName=p.FullName
For XML Path('')
) fn(FullName)
Output:
ModifiedFullName
smith, steve
smith, steve j
smith, ryan tom
smith, john joseph Jr
SplitValues table function definition:
/*
This Function takes a delimited list of values and returns a table containing
each individual value and its position.
*/
CREATE FUNCTION [Custom].[SplitValues]
(
@List varchar(max)
, @Delimiter varchar(1)
)
RETURNS
@ValuesTable table
(
posID int
,val varchar(1000)
)
AS
BEGIN
WITH Cte AS
(
SELECT CAST('<v>' + REPLACE(@List, @Delimiter, '</v><v>') + '</v>' AS XML) AS val
)
INSERT @ValuesTable (posID,val)
SELECT row_number() over(Order By x) as posID, RTRIM(LTRIM(Split.x.value('.', 'VARCHAR(1000)'))) AS val
FROM Cte
CROSS APPLY val.nodes('/v') Split(x)
RETURN
END
GO
String manipulation in SQLServer, outside of writing your own User Defined Function, is limited but you can use the PARSENAME function for your purposes here. It takes a string, splits it on the period character, and returns the segment you specify.
Try this:
DECLARE @name VARCHAR(100) = 'smith,ryan,tom'
SELECT REVERSE(PARSENAME(REPLACE(REVERSE(@name), ',', '.'), 1)) + ', ' +
REVERSE(PARSENAME(REPLACE(REVERSE(@name), ',', '.'), 2)) +
COALESCE(' ' + REVERSE(PARSENAME(REPLACE(REVERSE(@name), ',', '.'), 3)), '')
Result: smith, ryan tom
If you set @name to 'smith,steve' instead, you'll get:
Result: smith, steve
Segment 1 actually gives you the last segment, segment 2 the second to last, and so on. Hence I've used REVERSE to get the order you want. In the case of 'smith,steve', segment 3 will be null, hence the COALESCE to add an empty string if that is the case. The REPLACE of course changes the commas to periods so that the split will work.
Note that this is a bit of a hack. PARSENAME will not work if there are more than four parts and this will fail if the name happens to contain a period. However if your data conforms to these limitations, hopefully it provides you with a solution.
Caveat: it sounds like your data may be inconsistently formatted. In that case, applying any automated treatment to it is going to be risky. However, you could try:
UPDATE people SET name = REPLACE(name, ',', ' ')
UPDATE people SET name = LEFT(name, CHARINDEX(' ', name)-1)+ ', '
+ RIGHT(name, LEN(name) - CHARINDEX(' ', name))
That'll work for the three examples you give. What it will do to the rest of your set is another question.
Here's an example with CHARINDEX() and SUBSTRING
WITH yourTable
AS
(
SELECT names
FROM
(
VALUES ('smith,steve'),('smith,steve,j'),('smith,ryan,tom')
) A(names)
)
SELECT names AS old,
CASE
WHEN comma > 0
THEN SUBSTRING(spaced_names,0,comma + 1) --before the comma
+ SUBSTRING(spaced_names,comma + 2,1000) --after the comma
ELSE spaced_names
END AS new
FROM yourTable
CROSS APPLY(SELECT CHARINDEX(',',names,CHARINDEX(',',names) + 1),REPLACE(names,',',', ')) AS CA(comma,spaced_names)

SQL- Get the substring after first space and second space in separate columns

I have a column FullName containing FirstName, MiddleName and LastName.
For example:
FullName: Marilyn Kean Kirkland
I want to have 3 separate columns for FirstName, MiddleName and LastName from the FullName by taking a substring from it.
I am pulling the FirstName by using the code:
substring(c.LegalName, 1, CHARINDEX(' ', c.LegalName)) as 'First Name'
I am wondering how can I pull just the middle name which comes after first space and before second space?
Also, I want to pull the last name which comes after the second space?
SQL Server doesn't have very good string manipulation functions. This is easier with subqueries:
select firstname,
stuff(reverse(stuff(reverse(legalname), 1, len(lastname) + 1, '')),
1, len(firstname) + 1, '')
from (select legalname,
left(legalname, charindex(' ', legalname) - 1) as firstname,
right(legalname, charindex(' ', reverse(legalname)) - 1) as lastname
. . .
) c
However, I would be very careful, because not all people have three part names. And others have suffixes (JR, SR) and other complications.
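To illustrate that caveat, here is a rough sketch that tolerates one-word and two-word names by returning NULL for the missing parts. It assumes single spaces and no leading or trailing spaces; the table variable @people and its rows are hypothetical. Suffixes such as JR are not handled, and with four or more words everything between the first and last space ends up in MiddleName.

```sql
-- Hypothetical sample data, for illustration only
DECLARE @people TABLE (LegalName varchar(100));
INSERT INTO @people VALUES ('Marilyn Kean Kirkland'), ('J Smith'), ('Cher');

SELECT
    LegalName,
    -- No space at all: treat the whole value as the first name
    CASE WHEN s1.p = 0 THEN LegalName
         ELSE LEFT(LegalName, s1.p - 1) END AS FirstName,
    -- A middle name exists only when the first and last spaces differ
    CASE WHEN s2.p > s1.p
         THEN SUBSTRING(LegalName, s1.p + 1, s2.p - s1.p - 1) END AS MiddleName,
    -- Everything after the last space, when there is one
    CASE WHEN s1.p > 0
         THEN SUBSTRING(LegalName, s2.p + 1, LEN(LegalName)) END AS LastName
FROM @people
-- s1.p: position of the first space (0 if none)
CROSS APPLY (SELECT CHARINDEX(' ', LegalName) AS p) AS s1
-- s2.p: position of the last space (0 if none)
CROSS APPLY (
    SELECT CASE WHEN s1.p = 0 THEN 0
                ELSE LEN(LegalName) - CHARINDEX(' ', REVERSE(LegalName)) + 1
           END AS p
) AS s2;
```

Computing both space positions once in CROSS APPLY keeps the three column expressions short and makes the edge cases explicit.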
You can try something like this.
;WITH c AS
(
SELECT 'Marilyn Kean Kirkland' AS legalname
UNION ALL SELECT 'J Smith' AS legalname
)
SELECT
RTRIM(SUBSTRING(legalname,1,space1)) firstname,
LTRIM(SUBSTRING(legalname,space1,space2 - space1 + 1)) middlename,
LTRIM(SUBSTRING(legalname,space2 + 1,totallength - space2)) lastname
FROM
(
SELECT
legalname,
CHARINDEX(' ',legalname) space1,
LEN(legalname) - CHARINDEX(' ',REVERSE(legalname)) space2,
LEN(legalname) as totallength
FROM c
)c
GO
As other users have noted above, these scripts will only work for users who have two-part or three-part names.