Separate fullname into first and last, and remove 'junk' - sql

Wasn't sure of the best way to word this. So I have a column with names, as below:
SalesPerson_Name
----------------
Undefined - 0
Sam Brett-sbrett
Kelly Roberts-kroberts
Michael Paramore-mparamore
Alivia Lawler-alawler
Ryan Hooker-rhooker
Heather Alford-halford
Cassandra Blegen-cblegen
JD Holland-jholland
Vendor Accounts-VENDOR
Other Accounts-OTHER
Getting the names separated is easy enough with PARSENAME and REPLACE functions, but where I'm running into a pickle is with getting rid of the 'junk' at the end:
SELECT SalesPerson_Key
,SalesPerson_Name
,CASE
WHEN PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 2) IS NULL
THEN PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1)
ELSE PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 2)
END AS FirstName
,CASE
WHEN PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 2) IS NULL
THEN NULL
ELSE PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1)
END AS LastName
FROM Salesperson
RESULTS FOR LASTNAME COLUMN:
LastName
--------
0
Brett-sbrett
Roberts-kroberts
Paramore-mparamore
Lawler-alawler
Hooker-rhooker
Alford-halford
Blegen-cblegen
Holland-jholland
Accounts-VENDOR
Accounts-OTHER
Specifically, I want to get rid of the text (userid) at the end of the last name. If the names were the same length, I could just use a RIGHT function, but they vary in length. Ideas?

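select left(PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1), CHARINDEX('-', PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1) + '-') - 1)
from Salesperson
This finds the position of the hyphen in the last-name part with CHARINDEX and keeps only the text to the left of it. Appending '-' before searching keeps the expression from failing on values that contain no hyphen.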

If you just want to remove the last word (username) you can use a query like this
select
rtrim(
substring(
SalesPerson_Name,
1,
charindex('-',SalesPerson_Name,1)-1
)
)
from Salesperson
The CHARINDEX function returns the position of the first occurrence of the character(s) you are looking for, so the SUBSTRING call keeps everything before the hyphen.
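Putting the two steps together (strip the userid, then split what remains), a minimal sketch might look like the following. The table variable, its sample rows, and the CleanName alias are made up for illustration, and the sketch assumes the first hyphen always starts the 'junk', so a hyphenated surname would be cut short.

```sql
-- Sketch only: @Salesperson and its rows are illustrative, not the real table.
DECLARE @Salesperson TABLE (SalesPerson_Name varchar(100));
INSERT @Salesperson VALUES ('Sam Brett-sbrett'), ('JD Holland-jholland'), ('Undefined - 0');

SELECT s.SalesPerson_Name,
       -- Text before the first space; appending ' ' guards single-word names.
       LEFT(c.CleanName, CHARINDEX(' ', c.CleanName + ' ') - 1) AS FirstName,
       -- Text after the first space, or NULL when there is none.
       NULLIF(STUFF(c.CleanName, 1, CHARINDEX(' ', c.CleanName + ' '), ''), '') AS LastName
FROM @Salesperson AS s
CROSS APPLY (
    -- Keep everything before the first hyphen; appending '-' guards rows without one.
    SELECT RTRIM(LEFT(s.SalesPerson_Name, CHARINDEX('-', s.SalesPerson_Name + '-') - 1)) AS CleanName
) AS c;
```

Computing the cleaned name once in CROSS APPLY avoids repeating the hyphen logic in both output columns.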

Consider whether hyphen is followed by a space or not, and split depending on these two cases
with Salesperson( SalesPerson_Name ) as
(
select 'Undefined - 0' union all
select 'Sam Brett-sbrett' union all
select 'Kelly Roberts-kroberts' union all
select 'Michael Paramore-mparamore' union all
select 'Alivia Lawler-alawler'
)
select case when substring(SalesPerson_Name,charindex(' ',SalesPerson_Name)+1,1) = '-' then
substring(SalesPerson_Name,charindex(' ',SalesPerson_Name)+3,len(SalesPerson_Name))
else
substring(SalesPerson_Name,charindex(' ',SalesPerson_Name)+1,len(SalesPerson_Name))
end as last_name
from Salesperson s;
last_name
------------------
0
Brett-sbrett
Roberts-kroberts
Paramore-mparamore
Lawler-alawler

Related

SQL to Take Last Name, First Name and MI and just return First and Last Name

I have searched Stack Overflow and am getting close to what I need, but I can't figure out why, when a person has just Last Name, First Name, the first name ends up as the Middle Initial. I only need the Last Name and First Name, displayed as First and Last name.
Sample Data:
EMP_ID  EMP_NAME
1234    JONES, JAMES R
5687    SMITH, BILL
What I want to end up with is:
EMP_ID  EMP_NAME_FULL
1234    JAMES JONES
5687    BILL SMITH
I am working with this code and once I can figure out how to resolve getting the first name to work, I planned to combine the First and Last Name substring/Parsing to one name.
SELECT DISTINCT
EMP_ID
,EMP_NAME
,SUBSTRING(EMP_NAME, 1, CHARINDEX(',', EMP_NAME) - 1) AS LASTNAME
,CASE WHEN PARSENAME(REPLACE(EMP_NAME, ',', '.'),1) LIKE '% %' THEN PARSENAME(REPLACE(PARSENAME(REPLACE(EMP_NAME, ',', '.'),1), ' ', '.'),2) ELSE PARSENAME(REPLACE(EMP_NAME, ',', '.'),1) END FIRSTNAME
,CASE WHEN PARSENAME(REPLACE(EMP_NAME, ' ', '.'),1) LIKE '%,%' THEN NULL ELSE PARSENAME(REPLACE(EMP_NAME, ' ', '.'),1) END MI
FROM EMP_TABLE
Try this:
[MYSQL]
select emp_name,
SUBSTRING_INDEX(SUBSTRING_INDEX(emp_name, ', ', -1), ' ', 1),
SUBSTRING_INDEX(emp_name, ',', 1)
from EMP_TABLE;
[SQL Server (SSMS)]
select emp_name,
CASE
WHEN CHARINDEX(' ', SUBSTRING(emp_name, CHARINDEX(', ',emp_name)+2, LEN(emp_name))) > 0
THEN LEFT(SUBSTRING(emp_name, CHARINDEX(', ',emp_name)+2, LEN(emp_name)), CHARINDEX(' ', SUBSTRING(emp_name, CHARINDEX(', ',emp_name)+2, LEN(emp_name))) - 1)
ELSE SUBSTRING(emp_name, CHARINDEX(', ',emp_name)+2, LEN(emp_name))
END AS firstname,
SUBSTRING(emp_name, 1, CHARINDEX(',',emp_name) - 1) AS lastname
from EMP_TABLE;
I couldn't test it in SQL Server (SSMS), but CHARINDEX does the same job as INSTR. The only difference is the argument order: INSTR takes the string to search first and then the substring to look for, while CHARINDEX takes them the other way round.
I tried the INSTR version with your data and it worked.
Then I converted every INSTR into CHARINDEX and swapped the parameters.
[DEMO]
with tmp as
(
select 'JONES, JAMES R' as emp_name from dual
union all
select 'SMITH, BILL' as emp_name from dual
)
select emp_name,
CASE
WHEN INSTR(SUBSTRING(emp_name, INSTR(emp_name,', ')+2),' ') > 0
THEN SUBSTRING(SUBSTRING(emp_name, INSTR(emp_name,', ')+2), 1, INSTR(SUBSTRING(emp_name, INSTR(emp_name,', ')+2),' ') - 1)
ELSE
SUBSTRING(emp_name, INSTR(emp_name,', ')+2)
END AS firstname,
SUBSTRING(emp_name, 1, INSTR(emp_name,',') - 1) AS lastname
from tmp;
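Since the question ultimately wants a single FIRST LAST column, here is a hedged T-SQL sketch that builds it directly. The VALUES list stands in for EMP_TABLE, the alias rest is illustrative, and the sketch assumes every EMP_NAME contains a comma.

```sql
-- Sketch only: the inline VALUES rows replace EMP_TABLE for demonstration.
SELECT e.EMP_ID,
       CONCAT(
           -- First word after the comma; appending ' ' covers names with no middle initial.
           LEFT(r.rest, CHARINDEX(' ', r.rest + ' ') - 1),
           ' ',
           -- Everything before the comma is the last name.
           LEFT(e.EMP_NAME, CHARINDEX(',', e.EMP_NAME) - 1)
       ) AS EMP_NAME_FULL
FROM (VALUES (1234, 'JONES, JAMES R'), (5687, 'SMITH, BILL')) AS e(EMP_ID, EMP_NAME)
CROSS APPLY (
    -- Part after the comma, with the leading space removed.
    SELECT LTRIM(SUBSTRING(e.EMP_NAME, CHARINDEX(',', e.EMP_NAME) + 1, LEN(e.EMP_NAME))) AS rest
) AS r;
```

For the two sample rows this should produce JAMES JONES and BILL SMITH, matching the desired output in the question.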

What MySQL LIKE wildcard should I use to return someone's name without their title/prefix name?

I want to return name of customers and order them, but their titles must be excluded in ordering.
SELECT name
FROM customers
WHERE name LIKE ...
ORDER BY name
By 'their titles' I mean prefixes such as Dr., Sn., Lady, Sir, Mr., and Mrs.
Possible solution to your problem.
In Oracle:
regexp_replace(user_name, '^(MISS|MS\.|MS|MRS\.|MRS|MR\.|MR)\s*', '') as user_name
Also you can use REPLACE () function like:
REPLACE (user_name, 'MISS', '') as user_name
If you have a column structure like (mr | mrs | other) / space / username you can try this:
with users(user_name) as
(select 'mr user name1' from dual union all
select 'miss username2 ' from dual union all
select 'other username 3' from dual )
select substr(user_name,instr(user_name,' ')+1) real_username from users
Output:
REAL_USERNAME
----------------
user name1
username2
username 3
In MSSQL:
DECLARE @str VARCHAR(500)='Mr Sam'
SELECT Title,
first_name,
Substring(NAME, CASE
WHEN Charindex(' ', NAME) = 0 THEN 1
ELSE Charindex(' ', NAME)
END, Len(NAME)) last_name
FROM (SELECT CASE
WHEN LEFT(@str, Charindex(' ', @str)) IN( 'Mr', 'Mrs', 'Miss' ) THEN LEFT(@str, Charindex(' ', @str))
ELSE ''
END AS Title,
CASE
WHEN LEFT(@str, Charindex(' ', @str)) IN ( 'Mr', 'Mrs', 'Miss' ) THEN LEFT(Stuff(@str, 1, Charindex(' ', @str), ''), Charindex(' ', Stuff(@str, 1, Charindex(' ', @str), '')))
ELSE LEFT(@str, Charindex(' ', @str))
END AS first_name,
CASE
WHEN LEFT(@str, Charindex(' ', @str)) IN ( 'Mr', 'Mrs', 'Miss' ) THEN Stuff(@str, 1, Charindex(' ', @str), '')
ELSE @str
END NAME) a
Technically, using a % before the name will allow any prefix, which would get you the desired result (e.g. WHERE name LIKE ('%' + @name)). However, this is not recommended on larger data sets, because a pattern with a leading wildcard cannot use an index and you will see significant performance issues.
You need to be more specific. First of all: Which SQL "flavor" are you using? Postgres? Oracle? MySql? This is very important because each engine has different functions. Every time you ask an SQL question in SO, be sure to include at the very least a tag mentioning which DBMS you're using.
Now, what do you mean by "I shouldn't return them". Do you mean you should not return records which have a prefix, or do you mean you need to return the records but without the prefix? (So if you have Dr. Henry Gutierrez, do you exclude him from the result? Or have it output as Henry Gutierrez?)
This is also a good place for another tip: Always write an expected output in your questions. "If X is Y then I expect the output to be this:"
If you need to exclude them entirely, you can use a REGEXP match (Once again, I cannot list a specific function because I have no idea which type of SQL you're using) Something like WHERE REGEXP(UPPER(COL)) NOT ('^(MS|MR)\s+.*$')
If it's the 2nd case, that's going to be much harder because you'd need to get a substring which excludes the prefix, but the prefixes all have different sizes so you can't just write a "one size fits all"
In general, it's a bad normalization practice to have prefixes in your SQL database. You should have a column called PREFIX and another column for the name itself.
EDIT: Based on your answer. This is the closest you can get to achieving what you want.
SELECT NAME
FROM (SELECT NAME,
CASE WHEN NAME LIKE 'MR %' THEN SUBSTRING(NAME, 4)
WHEN NAME LIKE 'MRS %' THEN SUBSTRING(NAME, 5)
ELSE NAME END AS NAME2
FROM YOUR_TABLE ORDER BY NAME2) AS SUBQ
I am 99% sure this will not return the results ordered, though, because an ORDER BY inside a subquery may be ignored by the outer query. You can also just run what's in the subquery as the main query instead, but this will output 2 columns.
***My reply is in pseudocode. It is not exactly MySql syntax, you will need to check MySql documentation for the actual substring function implementation it has and for the CASE syntax.
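One way to avoid both the subquery and the extra column is to put the prefix-stripping expression straight into ORDER BY. The following is a sketch in MySQL syntax, using the customers table and name column from the question. It assumes each title is written exactly as listed there and is followed by a single space.

```sql
-- Sketch only: sorts by the name with a leading title skipped, but still returns the full name.
SELECT name
FROM customers
ORDER BY
  CASE
    WHEN name LIKE 'Mrs. %' THEN SUBSTRING(name, 6)
    WHEN name LIKE 'Lady %' THEN SUBSTRING(name, 6)
    WHEN name LIKE 'Dr. %'  THEN SUBSTRING(name, 5)
    WHEN name LIKE 'Sn. %'  THEN SUBSTRING(name, 5)
    WHEN name LIKE 'Mr. %'  THEN SUBSTRING(name, 5)
    WHEN name LIKE 'Sir %'  THEN SUBSTRING(name, 5)
    ELSE name
  END;
```

Because the CASE expression lives only in ORDER BY, the result set keeps a single name column with the titles intact.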

Splitting a Full Name into First and Last Name

I have a list of customer whose name is given as a full name.
I want to create a function that takes the full name as parameter and returns the first and last name separately. If this is not possible I can have two separate functions one that returns the first name and the other that returns the last name. The full name list contains names that have a maximum of three words.
What I want is this:-
When a full name is composed of two words. The first one should be
the name and the second one should be the last name.
When a full name is composed of three words. The first and middle words should be the first name while the third word should be the last name.
Example:-
**Full Name**
John Paul White
Peter Smith
Ann Marie Brown
Jack Black
Sam Olaf Turner
Result:-
**First Name    Last Name**
John Paul       White
Peter           Smith
Ann Marie       Brown
Jack            Black
Sam Olaf        Turner
I have searched and found solutions that do not work as intended, and I would like some advice.
Keeping it short and simple
DECLARE @t TABLE(Fullname varchar(40))
INSERT @t VALUES('John Paul White'),('Peter Smith'),('Thomas')
SELECT
LEFT(Fullname, LEN(Fullname) - CHARINDEX(' ', REVERSE(FullName))) FirstName,
STUFF(RIGHT(FullName, CHARINDEX(' ', REVERSE(FullName))),1,1,'') LastName
FROM
@t
Result:
FirstName   LastName
John Paul   White
Peter       Smith
Thomas      NULL
If you are certain that your names will only ever be two or three words, with single spaces, then we can rely on the base string functions to extract the first and last name components.
SELECT
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col, 1,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) - 1)
ELSE SUBSTRING(col, 1, CHARINDEX(' ', col) - 1)
END AS first,
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) + 1,
LEN(col) - CHARINDEX(' ', col, CHARINDEX(' ', col)))
ELSE SUBSTRING(col,
CHARINDEX(' ', col) + 1,
LEN(col) - CHARINDEX(' ', col))
END AS last
FROM yourTable;
Yuck, but it seems to work. My feeling is that you should fix your data model at some point. A more ideal place to scrub your name data would be outside the database, e.g. in Java. Or, better yet, fix the source of your data such that you record proper first and last names from the very beginning.
Demo here:
Rextester
Another option (just for fun) is to use a little XML in concert with a CROSS APPLY
Example
Select FirstName = ltrim(reverse(concat(Pos2,' ',Pos3,' ',Pos4,' ',Pos5)))
,LastName = reverse(Pos1)
From YourTable A
Cross Apply (
Select Pos1 = xDim.value('/x[1]','varchar(max)')
,Pos2 = xDim.value('/x[2]','varchar(max)')
,Pos3 = xDim.value('/x[3]','varchar(max)')
,Pos4 = xDim.value('/x[4]','varchar(max)')
,Pos5 = xDim.value('/x[5]','varchar(max)')
From (Select Cast('<x>' + replace(reverse(A.[Full Name]),' ','</x><x>')+'</x>' as xml) as xDim) XMLData
) B
Returns
FirstName                LastName
John Paul                White
Peter                    Smith
Ann Marie                Brown
Jack                     Black
Sam Olaf                 Turner
                         Cher
Sally Anne Bella Donna   Baxter
You're trying to do two things at once...I won't solve for you, but here's the direction I'd take:
1) Check this out for string splitting: https://ole.michelsen.dk/blog/split-string-to-table-using-transact-sql.html. This will allow you to parse the name into a temp table and you can perform your logic on it to create names based on your rules
2) Create this as a table-valued function so that you can return a single row of parsed FirstName, LastName from your parameter. That way you can join to it and include in your results
Have you tried using the PARSENAME function?
The last method in splitting a full name into its corresponding first name and last name is the use of the PARSENAME string function, as can be seen from the following script:
DECLARE @FullName VARCHAR(100)
SET @FullName = 'John White Doe'
SELECT CONCAT(PARSENAME(REPLACE(@FullName, ' ', '.'), 3),' ',PARSENAME(REPLACE(@FullName, ' ', '.'), 2)) AS [FirstName],
PARSENAME(REPLACE(@FullName, ' ', '.'), 1) AS [LastName]
For more information, Goto this Site
Make it a table-valued function.
see here for an example
And this is the code you need to create your function. Basically you just need to split your LastName
IF OBJECT_ID(N'dbo.ufnParseName', N'TF') IS NOT NULL
DROP FUNCTION dbo.ufnParseName;
GO
CREATE FUNCTION dbo.ufnParseName(@FullName VARCHAR(300))
RETURNS @retParseName TABLE
(
-- Columns returned by the function
FirstName nvarchar(150) NULL,
LastName nvarchar(50) NULL
)
AS
-- Returns the first and last name parsed from the full name.
BEGIN
DECLARE
@FirstName nvarchar(250),
@LastName nvarchar(250);
-- The last name is the text after the final space; the first name is whatever remains
SELECT @LastName = RTRIM(RIGHT(@FullName, CHARINDEX(' ', REVERSE(@FullName)) - 1));
SELECT @FirstName = LTRIM(RTRIM(Replace(@FullName, @LastName, '')));
INSERT @retParseName
SELECT @FirstName, @LastName;
RETURN;
END
You can run as SELECT * FROM dbo.ufnParseName('M J K');
Why Table-Valued-Function
You can get rid of the duplication in your SQL queries and keep the code DRY
You can try the below query. It is written as per your requirement and it only handles full_name with 2 or 3 parts in it.
;WITH cte AS(
SELECT full_name, (LEN(full_name) - LEN(REPLACE(full_name, ' ', '')) + 1) AS size FROM #temp
)
SELECT FirstName =
CASE
WHEN size=3 THEN PARSENAME(REPLACE(full_name, ' ', '.'), 3) + ' ' + PARSENAME(REPLACE(full_name, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(full_name, ' ', '.'), 2)
END,
PARSENAME(REPLACE(full_name, ' ', '.'), 1) AS LastName
FROM cte

Parsing Name Field in SQL

I am trying to separate a name field into the appropriate fields. The name field is not consistently the same. It can show up as Doe III,John w or Doe,John, or Doe III,John, or Doe,John W or it may be lacking the suffix and or middle initial. Any ideas would be greatly appreciated.
SELECT (
CASE LEN(REPLACE(FirstName, ' ', ''))
WHEN LEN(FirstName + ' ') - 1
THEN PARSENAME(REPLACE(FirstName, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(FirstName, ' ', '.'), 3)
END
) AS LastName
,(
CASE LEN(REPLACE(FirstName, ' ', ''))
WHEN LEN(FirstName + ',') - 1
THEN NULL
ELSE PARSENAME(REPLACE(FirstName, ' ', '.'), 2)
END
) AS Suffix
,PARSENAME(REPLACE(FirstName, ' ', '.'), 1) AS FirstName
FROM Trusts.dbo.tblMember
I need the name regardless of the format, as stated above, to parse into the appropriate fields of LastName,Suffix,FirstName,MiddleInitial, regardless of whether it has a suffix or a middle initial
If the given 4 names are the only type of cases, then you can use something like below.
Note: I used a CTE, tbl2, to compute comma_pos, first_space and second_space separately so that the main query is easier to follow. You can instead inline these values in the main query by replacing each one with its corresponding expression from the CTE, i.e. replace comma_pos with charindex(',',name) and so on.
Also I am assuming that there are no leading/trailing or extra whitespaces or any junk character in name column. If you have, then sanitize your data first before proceeding.
Rexter Sample
with tbl2 as (
select tbl.*,
charindex(',',name) as comma_pos,
charindex(' ',name,1) first_space,
charindex(' ',name,charindex(' ',name,1)+1) second_space
from tbl)
select tbl2.name
,case when second_space <> 0
then substring(name,comma_pos+1,second_space-comma_pos-1)
when first_space > comma_pos
then substring(name,comma_pos+1,first_space-comma_pos-1)
else substring(name,comma_pos+1,len(name)-comma_pos)
end as first_name
,case when second_space <> 0
then substring(name,second_space+1,len(name)-second_space)
when first_space > comma_pos
then substring(name,first_space+1,len(name)-first_space)
end as middle_name
,case when first_space=0 or first_space>comma_pos
then substring(name,1,comma_pos-1)
else substring(name,1,first_space-1)
end as last_name
,case when first_space=0 or first_space>comma_pos
then null
else substring(name,first_space+1,comma_pos-first_space-1)
end as suffix
from tbl2;
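To try the query above, a table along these lines can be assumed. The table name tbl and column name come from the answer; the column type and the rows (one per format mentioned in the question) are illustrative.

```sql
-- Hypothetical sample table; the answer only assumes a table tbl with a name column.
CREATE TABLE tbl (name varchar(100));

INSERT INTO tbl (name)
VALUES ('Doe III,John W'),  -- suffix and middle initial
       ('Doe,John'),        -- neither
       ('Doe III,John'),    -- suffix only
       ('Doe,John W');      -- middle initial only
```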

How to get middle portion from Sql server table data?

I am trying to get the first name from an employee table, where full_name looks like this: Dow, Mike P.
I tried to get the first name using the syntax below, but the result includes the middle initial. How can I remove the middle initial from the first name, if there is one? Not every name contains a middle initial.
-- query--
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
len(Employee_First_Name)) AS FirstName
---> remove middle initial from right side from employee
-- result
Full_name     Firstname
Dow,Mike P.   Mike P.
--few example for Full_name data---
smith,joe j. --->joe (need result as)
smith,alan ---->alan (need result as)
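Instead of passing LEN as the length, use CHARINDEX again to find the first space after the comma, and take only the characters up to it. This assumes there is no space directly after the comma, as in your examples.
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 1) - CHARINDEX(',', Employee_First_Name) - 1) AS FirstName
One thing to note: the second CHARINDEX returns 0 if there is no space after the comma, which makes the length negative and raises an error. In that case, you would want to use something like the following:
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
IIF(CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 1) = 0, LEN(Employee_First_Name), CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 1) - CHARINDEX(',', Employee_First_Name) - 1)) AS FirstName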
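A self-contained way to check the idea of stopping at the first space after the comma is to run it against the sample names from the question. This is only a sketch: the VALUES list, the aliases e and p, and the column names comma_pos and space_pos are illustrative, and it assumes no space follows the comma.

```sql
-- Sketch only: sample rows are taken from the question, not from a real table.
SELECT e.full_name,
       SUBSTRING(
           e.full_name,
           p.comma_pos + 1,
           -- No space after the comma means there is no middle initial: take the rest.
           IIF(p.space_pos = 0, LEN(e.full_name), p.space_pos - p.comma_pos - 1)
       ) AS FirstName
FROM (VALUES ('smith,joe j.'), ('smith,alan'), ('Dow,Mike P.')) AS e(full_name)
CROSS APPLY (
    SELECT CHARINDEX(',', e.full_name) AS comma_pos,
           CHARINDEX(' ', e.full_name, CHARINDEX(',', e.full_name) + 1) AS space_pos
) AS p;
```

Naming the two positions once in CROSS APPLY keeps the SUBSTRING call readable compared with repeating the CHARINDEX calls inline.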
This removes the portion before the comma, then takes the remaining string and removes everything after the first space.
WITH cte AS (
SELECT *
FROM (VALUES('smith,joe j.'),('smith,alan'),('joe smith')) t(fullname)
)
SELECT
SUBSTRING(
LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname))),
0,
COALESCE(NULLIF(CHARINDEX(' ',LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname)))),0),LEN(fullname)))
FROM cte
output
------
joe
alan
joe
To be honest, this is most easily expressed using multiple levels of logic. One way is using outer apply:
select ttt.firstname
from t outer apply
(select substring(t.full_name, charindex(', ', t.full_name) + 2, len(t.full_name)) as firstmi
) tt outer apply
(select (case when tt.firstmi like '% %'
then left(tt.firstmi, charindex(' ', tt.firstmi) - 1)
else tt.firstmi
end) as firstname
) as ttt
If you want to put this all in one complicated statement, I would suggest a computed column:
alter table t
add firstname as (stuff((case when full_name like '%, % %.'
then left(full_name,
charindex(' ', full_name, charindex(', ', full_name) + 2) - 1
)
else full_name
end),
1,
charindex(', ', full_name) + 1,
''))
If the format of the full_name field is the same for all rows, you can use the SQL Server full-text search (FTS) word breaker for this task:
SELECT N'Dow, Mike P.' AS full_name INTO #t
SELECT display_term FROM #t
CROSS APPLY sys.dm_fts_parser(N'"' + full_name + N'"', 1033, NULL, 1) p
WHERE occurrence = 2
DROP TABLE #t