Extract last name, first name and suffix into separate columns - sql

I was wondering if someone could provide me an easy way to extract the names into different columns as below. There is a comma after the Last Name and space between First Name, Middle Initial, and Suffix. Greatly appreciate it.
Stored Data:
Name
Walker,James M JR
Smith,Jack P
Smith,Whitney
Required result:
LastName FirstName Suffix
Walker James JR
Smith Jack
Smith Whitney
Tried Code:
select top 5 Name,
LEFT(Name, CHARINDEX(',', Name) - 1) AS LastName,
right(Name, len(Name) - CHARINDEX(',', Name)) as FirstName
I'm just having a problem separating First Name from Middle Initial and Suffix, and then getting the Suffix after the last space from the right.

You really should store these parts of the name in separate columns (first normal form) to avoid such parsing.
You can put all the logic into one huge call of nested functions, but it is quite handy to separate them into single calls using CROSS APPLY.
The parsing is straightforward:
1) find the position of the comma
2) split the string into the part before the comma (LastName) and the part after it (AfterComma)
3) find the position of the first space in AfterComma
4) split the string into two parts again - this gives FirstName and the rest (AfterSpace)
5) find the position of the space in AfterSpace
6) split the string into two parts again - this gives MidInitial and Suffix.
The query also checks the result of each CHARINDEX call - it returns 0 if the string is not found.
Obviously, if the string value is not in the expected format, you'll get an incorrect result.
DECLARE @T TABLE (Name varchar(8000));
INSERT INTO @T (Name) VALUES
('Walker'),
('Walker,James M JR'),
('Smith,Jack P'),
('Smith,Whitney');
SELECT
Name
,LastName
,AfterComma
,FirstName
,AfterSpace
,MidInitial
,Suffix
FROM
@T
CROSS APPLY (SELECT CHARINDEX(',', Name) AS CommaPosition) AS CA_CP
CROSS APPLY (SELECT CASE WHEN CommaPosition > 0 THEN
LEFT(Name, CommaPosition - 1) ELSE Name END AS LastName) AS CA_LN
CROSS APPLY (SELECT CASE WHEN CommaPosition > 0 THEN
SUBSTRING(Name, CommaPosition + 1, 8000) ELSE '' END AS AfterComma) AS CA_AC
CROSS APPLY (SELECT CHARINDEX(' ', AfterComma) AS SpacePosition) AS CA_SP
CROSS APPLY (SELECT CASE WHEN SpacePosition > 0 THEN
LEFT(AfterComma, SpacePosition - 1) ELSE AfterComma END AS FirstName) AS CA_FN
CROSS APPLY (SELECT CASE WHEN SpacePosition > 0 THEN
SUBSTRING(AfterComma, SpacePosition + 1, 8000) ELSE '' END AS AfterSpace) AS CA_AS
CROSS APPLY (SELECT CHARINDEX(' ', AfterSpace) AS Space2Position) AS CA_S2P
CROSS APPLY (SELECT CASE WHEN Space2Position > 0 THEN
LEFT(AfterSpace, Space2Position - 1) ELSE AfterSpace END AS MidInitial) AS CA_MI
CROSS APPLY (SELECT CASE WHEN Space2Position > 0 THEN
SUBSTRING(AfterSpace, Space2Position + 1, 8000) ELSE '' END AS Suffix) AS CA_S
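Result:
Name              | LastName | AfterComma | FirstName | AfterSpace | MidInitial | Suffix
------------------+----------+------------+-----------+------------+------------+-------
Walker            | Walker   |            |           |            |            |
Walker,James M JR | Walker   | James M JR | James     | M JR       | M          | JR
Smith,Jack P      | Smith    | Jack P     | Jack      | P          | P          |
Smith,Whitney     | Smith    | Whitney    | Whitney   |            |            |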

Related

Splitting a Full Name into First and Last Name

I have a list of customer whose name is given as a full name.
I want to create a function that takes the full name as parameter and returns the first and last name separately. If this is not possible I can have two separate functions one that returns the first name and the other that returns the last name. The full name list contains names that have a maximum of three words.
What I want is this:-
When a full name is composed of two words, the first one should be the first name and the second one should be the last name.
When a full name is composed of three words, the first and middle words should be the first name while the third word should be the last name.
Example:-
**Full Name**
John Paul White
Peter Smith
Ann Marie Brown
Jack Black
Sam Olaf Turner
Result:-
**First Name** | **Last Name**
John Paul      | White
Peter          | Smith
Ann Marie      | Brown
Jack           | Black
Sam Olaf       | Turner
I have searched and found solutions that are not working as intended and would like some advice.
Keeping it short and simple
DECLARE @t TABLE(Fullname varchar(40))
INSERT @t VALUES('John Paul White'),('Peter Smith'),('Thomas')
SELECT
LEFT(Fullname, LEN(Fullname) - CHARINDEX(' ', REVERSE(FullName))) FirstName,
STUFF(RIGHT(FullName, CHARINDEX(' ', REVERSE(FullName))),1,1,'') LastName
FROM
@t
Result:
FirstName | LastName
----------+---------
John Paul | White
Peter     | Smith
Thomas    | NULL
If you are certain that your names will only ever be two or three words, with single spaces, then we can rely on the base string functions to extract the first and last name components.
SELECT
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col, 1,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) - 1)
ELSE SUBSTRING(col, 1, CHARINDEX(' ', col) - 1)
END AS first,
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) + 1,
LEN(col) - CHARINDEX(' ', col, CHARINDEX(' ', col) + 1))
ELSE SUBSTRING(col,
CHARINDEX(' ', col) + 1,
LEN(col) - CHARINDEX(' ', col))
END AS last
FROM yourTable;
Yuck, but it seems to work. My feeling is that you should fix your data model at some point. A more ideal place to scrub your name data would be outside the database, e.g. in Java. Or, better yet, fix the source of your data such that you record proper first and last names from the very beginning.
Another option (just for fun) is to use a little XML in concert with a CROSS APPLY
Example
Select FirstName = ltrim(reverse(concat(Pos2,' ',Pos3,' ',Pos4,' ',Pos5)))
,LastName = reverse(Pos1)
From YourTable A
Cross Apply (
Select Pos1 = xDim.value('/x[1]','varchar(max)')
,Pos2 = xDim.value('/x[2]','varchar(max)')
,Pos3 = xDim.value('/x[3]','varchar(max)')
,Pos4 = xDim.value('/x[4]','varchar(max)')
,Pos5 = xDim.value('/x[5]','varchar(max)')
From (Select Cast('<x>' + replace(reverse(A.[Full Name]),' ','</x><x>')+'</x>' as xml) as xDim) XMLData
) B
Returns
FirstName              | LastName
-----------------------+---------
John Paul              | White
Peter                  | Smith
Ann Marie              | Brown
Jack                   | Black
Sam Olaf               | Turner
                       | Cher
Sally Anne Bella Donna | Baxter
You're trying to do two things at once...I won't solve for you, but here's the direction I'd take:
1) Check this out for string splitting: https://ole.michelsen.dk/blog/split-string-to-table-using-transact-sql.html. This will allow you to parse the name into a temp table and you can perform your logic on it to create names based on your rules
2) Create this as a table-valued function so that you can return a single row of parsed FirstName, LastName from your parameter. That way you can join to it and include in your results
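For what it's worth, here is a minimal sketch of how point 2 could look as an inline table-valued function. The function name dbo.ufnSplitFullName and the @Customers sample table are made up for illustration, and the rule it applies (everything before the last space is the first name) is just the one from the question, assuming single spaces between words.

```sql
-- Hypothetical function name; a sketch, not a definitive implementation.
CREATE FUNCTION dbo.ufnSplitFullName (@FullName varchar(300))
RETURNS TABLE
AS
RETURN
    SELECT
        -- Everything before the last space; the whole string if there is no space
        CASE WHEN p.LastSpace > 0
             THEN LEFT(t.Trimmed, LEN(t.Trimmed) - p.LastSpace)
             ELSE t.Trimmed END AS FirstName,
        -- Everything after the last space; NULL if there is no space
        CASE WHEN p.LastSpace > 0
             THEN RIGHT(t.Trimmed, p.LastSpace - 1)
             ELSE NULL END AS LastName
    FROM (SELECT LTRIM(RTRIM(@FullName)) AS Trimmed) AS t
    -- Position of the last space, counted from the right-hand end
    CROSS APPLY (SELECT CHARINDEX(' ', REVERSE(t.Trimmed)) AS LastSpace) AS p;
GO

-- Hypothetical sample data to show the join via CROSS APPLY
DECLARE @Customers TABLE (FullName varchar(300));
INSERT INTO @Customers (FullName)
VALUES ('John Paul White'), ('Peter Smith'), ('Thomas');

SELECT c.FullName, n.FirstName, n.LastName
FROM @Customers AS c
CROSS APPLY dbo.ufnSplitFullName(c.FullName) AS n;
```

With this sample data, 'John Paul White' comes back as 'John Paul' / 'White', and the single-word 'Thomas' comes back as the first name with a NULL last name.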
Have you tried using the PARSENAME function?
Another way to split a full name into its first name and last name is the PARSENAME string function, as can be seen from the following script:
DECLARE @FullName VARCHAR(100)
SET @FullName = 'John White Doe'
SELECT CONCAT(PARSENAME(REPLACE(@FullName, ' ', '.'), 3),' ',PARSENAME(REPLACE(@FullName, ' ', '.'), 2)) AS [FirstName],
PARSENAME(REPLACE(@FullName, ' ', '.'), 1) AS [LastName]
This returns FirstName = 'John White' and LastName = 'Doe'.
Make it a table-valued function.
And this is the code you need to create your function. Basically you just need to split off the last name.
IF OBJECT_ID(N'dbo.ufnParseName', N'TF') IS NOT NULL
DROP FUNCTION dbo.ufnParseName;
GO
CREATE FUNCTION dbo.ufnParseName(@FullName VARCHAR(300))
RETURNS @retParseName TABLE
(
-- Columns returned by the function
FirstName nvarchar(150) NULL,
LastName nvarchar(50) NULL
)
AS
-- Returns the full name split into first name and last name.
BEGIN
DECLARE
@FirstName nvarchar(250),
@LastName nvarchar(250);
-- The last name is everything after the final space
SELECT @LastName = RTRIM(RIGHT(@FullName, CHARINDEX(' ', REVERSE(@FullName)) - 1));
-- The first name is what remains once the last name is removed
SELECT @FirstName = LTRIM(RTRIM(Replace(@FullName, @LastName, '')))
INSERT @retParseName
SELECT @FirstName, @LastName;
RETURN;
END
You can run it as SELECT * FROM dbo.ufnParseName('M J K');
Why a table-valued function?
You can get rid of the duplication in your SQL queries and keep them DRY.
You can try the below query. It is written as per your requirement and it only handles full_name with 2 or 3 parts in it.
;WITH cte AS(
SELECT full_name, (LEN(full_name) - LEN(REPLACE(full_name, ' ', '')) + 1) AS size FROM #temp
)
SELECT FirstName =
CASE
WHEN size=3 THEN PARSENAME(REPLACE(full_name, ' ', '.'), 3) + ' ' + PARSENAME(REPLACE(full_name, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(full_name, ' ', '.'), 2)
END,
PARSENAME(REPLACE(full_name, ' ', '.'), 1) AS LastName
FROM cte

deleting second comma in data

OK, so I have a table called PEOPLE that has a name column. The name column holds a name, but it's a total mess. For some reason it's not formatted as last, first middle. Instead it's stored as last,first,middle: last and first (and middle, if there is one) are separated by commas, so there are two commas if the person has a middle name.
example:
smith,steve
smith,steve,j
smith,ryan,tom
I'd like the second comma taken away (for parsing reasons) and a space put after the existing first comma, so the above would come out looking like:
smith, steve
smith, steve j
smith, ryan tom
Ultimately I'd like to be able to parse the names into first, middle, and last name fields, but that's for another post. I appreciate any help.
thank you.
Drop table T1;
Create table T1(Name varchar(100));
Insert T1 Values
('smith,steve'),
('smith,steve,j'),
('smith,ryan,tom');
UPDATE T1
SET Name=
CASE CHARINDEX(',',name, CHARINDEX(',',name)+1) WHEN
0 THEN Name
ELSE
LEFT(name,CHARINDEX(',',name, CHARINDEX(',',name)+1)-1)+' ' +
RIGHT(name,LEN(Name)-CHARINDEX(',',name, CHARINDEX(',',name)+1))
END
Select * from T1
This seems to work. Not the most concise but avoids cursors.
DECLARE @people TABLE (name varchar(50))
INSERT INTO @people
SELECT 'smith,steve'
UNION
SELECT 'smith,steve,j'
UNION
SELECT 'smith,ryan,tom'
UNION
SELECT 'commaless'
SELECT name,
CASE
WHEN CHARINDEX(',',name) > 0 THEN
CASE
WHEN CHARINDEX(',',name,CHARINDEX(',',name) + 1) > 0 THEN
STUFF(STUFF(name, CHARINDEX(',',name,CHARINDEX(',',name) + 1), 1, ' '),CHARINDEX(',',name),1,', ')
ELSE
STUFF(name,CHARINDEX(',',name),1,', ')
END
ELSE name
END AS name2
FROM @people
Using a table function to split apart the names with a delimiter and for XML Path to stitch them back together, we can get what you're looking for! Hope this helps!
Declare @People table(FullName varchar(200))
Insert Into @People Values ('smith,steve')
Insert Into @People Values ('smith,steve,j')
Insert Into @People Values ('smith,ryan,tom')
Insert Into @People Values ('smith,john,joseph Jr')
Select p.*,stuff(fn.FullName,1,2,'') as ModifiedFullName
From @People p
Cross Apply (
select
Case When np.posID<=2 Then ', ' Else ' ' End+np.Val
From @People n
Cross Apply Custom.SplitValues(n.FullName,',') np
Where n.FullName=p.FullName
For XML Path('')
) fn(FullName)
Output:
ModifiedFullName
smith, steve
smith, steve j
smith, ryan tom
smith, john joseph Jr
SplitValues table function definition:
/*
This Function takes a delimited list of values and returns a table containing
each individual value and its position.
*/
CREATE FUNCTION [Custom].[SplitValues]
(
@List varchar(max)
, @Delimiter varchar(1)
)
RETURNS
@ValuesTable table
(
posID int
,val varchar(1000)
)
AS
BEGIN
WITH Cte AS
(
SELECT CAST('<v>' + REPLACE(@List, @Delimiter, '</v><v>') + '</v>' AS XML) AS val
)
INSERT @ValuesTable (posID,val)
SELECT row_number() over(Order By x) as posID, RTRIM(LTRIM(Split.x.value('.', 'VARCHAR(1000)'))) AS val
FROM Cte
CROSS APPLY val.nodes('/v') Split(x)
RETURN
END
GO
String manipulation in SQL Server, outside of writing your own user-defined function, is limited, but you can use the PARSENAME function for your purposes here. It takes a string, splits it on the period character, and returns the segment you specify.
Try this:
DECLARE @name VARCHAR(100) = 'smith,ryan,tom'
SELECT REVERSE(PARSENAME(REPLACE(REVERSE(@name), ',', '.'), 1)) + ', ' +
REVERSE(PARSENAME(REPLACE(REVERSE(@name), ',', '.'), 2)) +
COALESCE(' ' + REVERSE(PARSENAME(REPLACE(REVERSE(@name), ',', '.'), 3)), '')
Result: smith, ryan tom
If you set @name to 'smith,steve' instead, you'll get:
Result: smith, steve
Segment 1 actually gives you the last segment, segment 2 the second to last, and so on. Hence I've used REVERSE to get the order you want. In the case of 'smith,steve', segment 3 will be NULL, hence the COALESCE to add an empty string if that is the case. The REPLACE of course changes the commas to periods so that the split will work.
Note that this is a bit of a hack. PARSENAME will not work if there are more than four parts and this will fail if the name happens to contain a period. However if your data conforms to these limitations, hopefully it provides you with a solution.
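To make those two limitations concrete, here is a small sketch. The sample strings are made up purely for illustration, and the final check is only one possible way to flag rows that PARSENAME cannot handle.

```sql
-- Hypothetical sample values, chosen only to show the two limitations.
SELECT
    PARSENAME(REPLACE('a,b,c,d',   ',', '.'), 1) AS FourParts,       -- 'd'
    PARSENAME(REPLACE('a,b,c,d,e', ',', '.'), 1) AS FiveParts,       -- NULL: more than four parts
    PARSENAME(REPLACE('st.john,mary', ',', '.'), 3) AS PeriodInName; -- 'st', not 'st.john'

-- One possible pre-check: flag values that contain a period
-- or more than three commas (i.e. more than four parts).
DECLARE @name varchar(100) = 'st.john,mary';
SELECT CASE
           WHEN @name LIKE '%.%'
             OR LEN(@name) - LEN(REPLACE(@name, ',', '')) > 3
           THEN 'needs another approach'
           ELSE 'safe for PARSENAME'
       END AS Verdict;
```

Running a check like this over the whole column first tells you how many rows fall outside what the PARSENAME trick can handle.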
Caveat: it sounds like your data may be inconsistently formatted. In that case, applying any automated treatment to it is going to be risky. However, you could try:
UPDATE people SET name = REPLACE(name, ',', ' ')
UPDATE people SET name = LEFT(name, CHARINDEX(' ', name)-1)+ ', '
+ RIGHT(name, LEN(name) - CHARINDEX(' ', name))
That'll work for the three examples you give. What it will do to the rest of your set is another question.
Here's an example with CHARINDEX() and SUBSTRING
WITH yourTable
AS
(
SELECT names
FROM
(
VALUES ('smith,steve'),('smith,steve,j'),('smith,ryan,tom')
) A(names)
)
SELECT names AS old,
CASE
WHEN comma > 0
THEN SUBSTRING(spaced_names,0,comma + 1) --before the comma
+ SUBSTRING(spaced_names,comma + 2,1000) --after the comma
ELSE spaced_names
END AS new
FROM yourTable
CROSS APPLY(SELECT CHARINDEX(',',names,CHARINDEX(',',names) + 1),REPLACE(names,',',', ')) AS CA(comma,spaced_names)

How to format the order of first/last name and remove prefix and nickname

I have a need to retrieve a hierarchy of managers and the column which stores the manager names for a given person are formatted like this Smith, Mr. William (Bill). I want this output to simply be William Smith. So far I have put this together:
SELECT DISTINCT RIGHT(u.manager, LEN(u.manager)-(1+CHARINDEX(', ', u.manager))) + ' ' +
LEFT(u.manager, CHARINDEX(', ', u.manager) - 1) as ManagerName
FROM Users u
The current result from that query using my example above is Mr. William (Bill) Smith. This CHARINDEX and SUBSTRING stuff always gives me a lot of trouble so I am not really sure what the easiest way to do this is. This is also a one-off, so I am not sure a function would be useful here.
SELECT
SUBSTRING(manager,0,CHARINDEX(',', manager)) as surname,
-- length of the first name = position of ' (' minus the position where the first name starts
SUBSTRING(manager,CHARINDEX('. ', manager)+2, CHARINDEX(' (', manager)-CHARINDEX('. ', manager)-2) as name,
CONCAT(SUBSTRING(manager,CHARINDEX('. ', manager)+2, CHARINDEX(' (', manager)-CHARINDEX('. ', manager)-2),
' ',
SUBSTRING(manager,0,CHARINDEX(',', manager))) as [name surname]
FROM
Users
Result:
+---------+---------+---------------+
| surname | name    | name surname  |
+---------+---------+---------------+
| Smith   | William | William Smith |
+---------+---------+---------------+
I took your query and modified a little bit:
SELECT
--- This is the tricky part: CHARINDEX finds the first '(' in the first name,
--- and LEFT keeps only the characters before it (the trailing space that remains separates it from the last name)
CONCAT (
LEFT(t.FirstName, LEN(t.FirstName) - (LEN(t.FirstName) - CHARINDEX('(', t.FirstName) + 1))
,t.LastName
)
FROM (
--basically separating your above syntax to two columns
SELECT RIGHT('Smith, Mr. William (Bill)', LEN('Smith, Mr. William (Bill)') - CHARINDEX('.', 'Smith, Mr. William (Bill)') - 1) AS FirstName
,LEFT('Smith, Mr. William (Bill)', CHARINDEX(', ', 'Smith, Mr. William (Bill)') - 1) AS LastName
) t
Here is the query that should work with your table name and column:
SELECT
---Use case when statement to determine if there are any instances of '(' in the first name
CONCAT (
CASE
WHEN CHARINDEX('(', t.FirstName) > 0
THEN LEFT(t.FirstName, LEN(t.FirstName) - (LEN(t.FirstName) - CHARINDEX('(', t.FirstName) + 1))
ELSE t.FirstName + ' '
END
,t.LastName
)
FROM (
SELECT
RIGHT(u.manager, LEN(u.manager) - CHARINDEX('.', u.manager) - 1) AS FirstName
,LEFT(u.manager, CHARINDEX(', ', u.manager) - 1) AS LastName from Users u
) t
SELECT RIGHT(NameStripped, LEN(NameStripped) - (1 + CHARINDEX(', ', NameStripped))) + ' ' + LEFT(NameStripped, CHARINDEX(', ', NameStripped) - 1) AS ManagerName --Your original code
FROM (
SELECT replace(replace(
LEFT(u.manager, CHARINDEX('(', u.manager) - 2) --Get rid of nickname
, 'Mr. ', ''), 'Ms. ', '') AS NameStripped --Get rid of Mr/Ms (including the space that follows)
from MyTable u) a
This should work - I used the code you posted, but added a subquery to remove the nicknames and prefixes.
Note that you may need to adjust this if a) you have more prefix options than this (in which case you could add additional replaces) and/or b) not everyone in your database has a nickname (in which case you'll want to wrap that part in a case statement, most likely).
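As a rough sketch of point b), the nickname removal can be wrapped in a CASE so that rows without a "(nickname)" part pass through unchanged. The @Users table variable and its second row are made-up sample data, and the sketch still assumes every value contains ', ' after the surname.

```sql
-- Hypothetical sample data; the second row has no nickname.
DECLARE @Users TABLE (manager varchar(100));
INSERT INTO @Users (manager) VALUES
    ('Smith, Mr. William (Bill)'),
    ('Jones, Ms. Mary');

SELECT
    u.manager,
    -- Same reordering as the original query, applied to the stripped name
    RIGHT(s.NameStripped, LEN(s.NameStripped) - (1 + CHARINDEX(', ', s.NameStripped)))
        + ' ' + LEFT(s.NameStripped, CHARINDEX(', ', s.NameStripped) - 1) AS ManagerName
FROM @Users AS u
-- Cut off the nickname only when an opening parenthesis is present
CROSS APPLY (SELECT CASE WHEN CHARINDEX(' (', u.manager) > 0
                         THEN LEFT(u.manager, CHARINDEX(' (', u.manager) - 1)
                         ELSE u.manager
                    END AS NoNickname) AS n
-- Remove the prefixes; add further REPLACE calls for other titles
CROSS APPLY (SELECT REPLACE(REPLACE(n.NoNickname, 'Mr. ', ''), 'Ms. ', '') AS NameStripped) AS s;
```

For the two sample rows this gives William Smith and Mary Jones.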

Separating column into two separate columns

I have a column called name. It contains entries such as:
Smith, John
Smith, Joe
One whole entry
I need them separated into two columns as this:
LastName | FirstName
---------------------------
Smith | John
Smith | Joe
One whole entry |
I'm using this query:
SELECT left(name, CHARINDEX(', ', name)) as LastName FROM LookUps
I've tried the query above, but it leaves the trailing comma in the result (e.g. Smith,). I need it to remove this trailing comma, but also display the full value for those entries without a comma.
Any help would be appreciated. Thanks..
select
case when charindex(',',name) > 0
then left(name, charindex(',',name)-1 )
else name end,
case when charindex(',',name) > 0
then ltrim(substring(name, charindex(',',name)+1, len(name) ))
else null end
from yourtable
Another way:
select
left(name, charindex(',', name + ',', 1) - 1) as lastname,
ltrim(substring(name, charindex(',', name + ',', 1) + 1, len(name))) as firstname
from yourtable
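The trick in that second version is the appended comma: searching name + ',' means CHARINDEX always finds a comma, so no CASE is needed. Here is a small sketch with the sample values from the question. The @LookUps table variable is a stand-in for the real table, and the CommaPos column is added only to show what CHARINDEX returns.

```sql
-- Stand-in for the real LookUps table, filled with the question's sample values.
DECLARE @LookUps TABLE (name varchar(100));
INSERT INTO @LookUps (name) VALUES
    ('Smith, John'),
    ('Smith, Joe'),
    ('One whole entry');

SELECT
    name,
    -- For 'One whole entry' this is LEN(name) + 1, because only the appended comma matches
    CHARINDEX(',', name + ',') AS CommaPos,
    LEFT(name, CHARINDEX(',', name + ',') - 1) AS LastName,
    LTRIM(SUBSTRING(name, CHARINDEX(',', name + ',') + 1, LEN(name))) AS FirstName
FROM @LookUps;
```

Note one difference from the CASE version: for an entry without a comma, FirstName comes back as an empty string rather than NULL.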

Teradata String Manipulation (Second Space)

I'm having great difficulty solving this seemingly easy task:
Purpose:
Create a query that eliminates the middle Initial
Example
Name
Smith, John A
Jane, Mary S
I would like an output such as this:
Name
Smith, John
Jane, Mary
Any tips on how to do this with Teradata SQL
I believe I solved the issue, albeit in a very poor way:
SELECT SUBSTR('SMITH, JOHN A', 0, (POSITION(' ' IN 'SMITH, JOHN A') + (POSITION(' ' IN SUBSTR('SMITH, JOHN A',(POSITION(' ' IN 'SMITH, JOHN A'))+ 1,50)))))
select a,
substr(a,1,index(a,' '))|| substr(trim(substr(a,index(a,' '))),1,index(trim(substr(a,index(a,' '))),' ')),
substr(trim(substr(a,index(a,' '))),index(trim(substr(a,index(a,' '))),' ')) last_name
from a
The challenge is making sure your names are consistently formatted. (Last_Name, Given_Name Middle_Initial) If they are then you may be able to solve this with recursive SQL. The following SQL would take Given_Name Last_Name and return Last_Name. You may be able to tweak it to accomplish your specific task. (My sample data was not consistently formatted so I was stuck trying to find the second (or third) occurrence of a white space character.)
WITH RECURSIVE cte (FullName, DelimPosition, RecursionLevel, Element, Remainder) AS
(
SELECT FullName
, 0 AS DelimPosition_
, 0
, CAST('' AS VARCHAR(128))
, FullName
FROM MyDatabase.Persons
UNION ALL
SELECT FullName
, CASE WHEN POSITION(' ' IN Remainder) > 0
THEN POSITION(' ' IN Remainder)
ELSE CHARACTER_LENGTH(Remainder)
END DelimPosition_
, RecursionLevel + 1
, SUBSTRING(Remainder FROM 0 FOR DelimPosition_ + 1)
, SUBSTRING(Remainder FROM DelimPosition_ + 1)
FROM cte
WHERE DelimPosition_ > 1
AND RecursionLevel < 3 -- Set max depth
)
SELECT FullName
, CASE WHEN POSITION('&' IN Element) = 0
THEN Element
ELSE NULL
END AS LastName
FROM cte c
WHERE RecursionLevel > 2
ORDER BY FullName;
Another option would be to implement a UDF that returns the rightmost n characters of a string (e.g. RIGHT(FullName, n)).
If the formatting is not consistent then we have to look at other less graceful options.
Hope this helps.