Teradata String Manipulation (Second Space) - sql

I'm having great difficulty solving this seemingly easy task:
Purpose:
Create a query that eliminates the middle Initial
Example
Name
Smith, John A
Jane, Mary S
I would like an output such as this:
Name
Smith, John
Jane, Mary
Any tips on how to do this with Teradata SQL?
I believe I solved the issue, albeit in a very poor way:
SELECT SUBSTR('SMITH, JOHN A', 0, (POSITION(' ' IN 'SMITH, JOHN A') + (POSITION(' ' IN SUBSTR('SMITH, JOHN A',(POSITION(' ' IN 'SMITH, JOHN A'))+ 1,50)))))
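If the release supports it, a shorter route is to ask for the second space directly. The sketch below assumes Teradata 14.0 or later, where INSTR takes a start position and an occurrence number; MyDatabase.Persons and FullName are placeholder names for the real table and column.

```sql
-- Assumption: Teradata 14.0+ (INSTR with start position and occurrence).
-- MyDatabase.Persons and FullName are placeholder names.
SELECT FullName
     , CASE
         WHEN INSTR(FullName, ' ', 1, 2) > 0                      -- a second space exists
         THEN SUBSTR(FullName, 1, INSTR(FullName, ' ', 1, 2) - 1) -- keep the text before it
         ELSE FullName                                            -- no middle initial present
       END AS NameNoInitial
FROM MyDatabase.Persons;
```

For 'Smith, John A' the second space is at position 12, so the first 11 characters, 'Smith, John', are kept. A value with no middle initial, such as 'Smith, John', passes through unchanged.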

select a,
substr(a,1,index(a,' '))|| substr(trim(substr(a,index(a,' '))),1,index(trim(substr(a,index(a,' '))),' ')),
substr(trim(substr(a,index(a,' '))),index(trim(substr(a,index(a,' '))),' ')) last_name
from a

The challenge is making sure your names are consistently formatted. (Last_Name, Given_Name Middle_Initial) If they are then you may be able to solve this with recursive SQL. The following SQL would take Given_Name Last_Name and return Last_Name. You may be able to tweak it to accomplish your specific task. (My sample data was not consistently formatted so I was stuck trying to find the second (or third) occurrence of a white space character.)
WITH RECURSIVE cte (FullName, DelimPosition, RecursionLevel, Element, Remainder) AS
(
SELECT FullName
, 0 AS DelimPosition_
, 0
, CAST('' AS VARCHAR(128))
, FullName
FROM MyDatabase.Persons
UNION ALL
SELECT FullName
, CASE WHEN POSITION(' ' IN Remainder) > 0
THEN POSITION(' ' IN Remainder)
ELSE CHARACTER_LENGTH(Remainder)
END DelimPosition_
, RecursionLevel + 1
, SUBSTRING(Remainder FROM 0 FOR DelimPosition_ + 1)
, SUBSTRING(Remainder FROM DelimPosition_ + 1)
FROM cte
WHERE DelimPosition_ > 1
AND RecursionLevel < 3 -- Set max depth
)
SELECT FullName
, CASE WHEN POSITION('&' IN Element) = 0
THEN Element
ELSE NULL
END AS LastName
FROM cte c
WHERE RecursionLevel > 2
ORDER BY FullName;
Another option would be to implement a UDF that returns the rightmost n characters of a string (e.g. RIGHT(FullName, n)).
If the formatting is not consistent then we have to look at other less graceful options.
Hope this helps.

Related

Separate fullname into first and last, and remove 'junk'

Wasn't sure of the best way to word this. So I have a column with names, as below:
SalesPerson_Name
----------------
Undefined - 0
Sam Brett-sbrett
Kelly Roberts-kroberts
Michael Paramore-mparamore
Alivia Lawler-alawler
Ryan Hooker-rhooker
Heather Alford-halford
Cassandra Blegen-cblegen
JD Holland-jholland
Vendor Accounts-VENDOR
Other Accounts-OTHER
Getting the names separated is easy enough with PARSENAME and REPLACE functions, but where I'm running into a pickle is with getting rid of the 'junk' at the end:
SELECT SalesPerson_Key
,SalesPerson_Name
,CASE
WHEN PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 2) IS NULL
THEN PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1)
ELSE PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 2)
END AS FirstName
,CASE
WHEN PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 2) IS NULL
THEN NULL
ELSE PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1)
END AS LastName
FROM Salesperson
RESULTS FOR LASTNAME COLUMN:
LastName
--------
0
Brett-sbrett
Roberts-kroberts
Paramore-mparamore
Lawler-alawler
Hooker-rhooker
Alford-halford
Blegen-cblegen
Holland-jholland
Accounts-VENDOR
Accounts-OTHER
Specifically, I want to get rid of the text (userid) at the end of the last name. If the names were the same length, I could just use a RIGHT function, but they vary in length. Ideas?
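select left(PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1), CHARINDEX('-', PARSENAME(REPLACE(SalesPerson_Name, ' ', '.'), 1) + '-') - 1) as LastName
from Salesperson
This finds the position of the hyphen inside the extracted last name with CHARINDEX and keeps only the text to its left. Appending '-' before searching keeps the expression valid when a value has no hyphen.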
If you just want to remove the last word (username) you can use a query like this
select
rtrim(
substring(
SalesPerson_Name,
1,
charindex('-',SalesPerson_Name,1)-1
)
)
from Salesperson
The charindex function locates the occurrence of the character/s you are looking for.
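One edge case: CHARINDEX returns 0 when there is no hyphen, which makes the SUBSTRING length negative and raises an error, and a hyphenated surname would be cut at its first hyphen. A possible variation, sketched here with inline sample rows rather than the real Salesperson table, searches the reversed string so that only the text after the last hyphen is dropped:

```sql
-- Sketch only: the sample rows are inline; swap the VALUES list for the real table.
SELECT s.SalesPerson_Name,
       RTRIM(LEFT(s.SalesPerson_Name,
                  LEN(s.SalesPerson_Name)
                  - CHARINDEX('-', REVERSE(s.SalesPerson_Name)))) AS CleanName
FROM (VALUES ('Sam Brett-sbrett'),
             ('Undefined - 0'),
             ('Ann Smith-Jones-asmithjones'),   -- made-up hyphenated surname
             ('No Userid')) AS s(SalesPerson_Name);
```

The four rows come back as 'Sam Brett', 'Undefined', 'Ann Smith-Jones' and 'No Userid'. When there is no hyphen, CHARINDEX returns 0 and LEFT keeps the whole string.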
Consider whether hyphen is followed by a space or not, and split depending on these two cases
with Salesperson( SalesPerson_Name ) as
(
select 'Undefined - 0' union all
select 'Sam Brett-sbrett' union all
select 'Kelly Roberts-kroberts' union all
select 'Michael Paramore-mparamore' union all
select 'Alivia Lawler-alawler'
)
select case when substring(SalesPerson_Name,charindex(' ',SalesPerson_Name)+1,1) = '-' then
substring(SalesPerson_Name,charindex(' ',SalesPerson_Name)+3,len(SalesPerson_Name))
else
substring(SalesPerson_Name,charindex(' ',SalesPerson_Name)+1,len(SalesPerson_Name))
end as last_name
from Salesperson s;
last_name
------------------
0
Brett-sbrett
Roberts-kroberts
Paramore-mparamore
Lawler-alawler

Splitting a Full Name into First and Last Name

I have a list of customers whose names are given as full names.
I want to create a function that takes the full name as parameter and returns the first and last name separately. If this is not possible I can have two separate functions one that returns the first name and the other that returns the last name. The full name list contains names that have a maximum of three words.
What I want is this:-
When a full name is composed of two words, the first one should be the first name and the second one should be the last name.
When a full name is composed of three words, the first and middle words should be the first name while the third word should be the last name.
Example:-
**Full Name**
John Paul White
Peter Smith
Ann Marie Brown
Jack Black
Sam Olaf Turner
Result:-
**First Name**    **Last Name**
John Paul         White
Peter             Smith
Ann Marie         Brown
Jack              Black
Sam Olaf          Turner
I have searched and found solutions that do not work as intended, and I would like some advice.
Keeping it short and simple
DECLARE @t TABLE(Fullname varchar(40))
INSERT @t VALUES('John Paul White'),('Peter Smith'),('Thomas')
SELECT
LEFT(Fullname, LEN(Fullname) - CHARINDEX(' ', REVERSE(FullName))) FirstName,
STUFF(RIGHT(FullName, CHARINDEX(' ', REVERSE(FullName))),1,1,'') LastName
FROM
@t
Result:
FirstName   LastName
John Paul   White
Peter       Smith
Thomas      NULL
If you are certain that your names will only ever be two or three words, with single spaces, then we can rely on the base string functions to extract the first and last name components.
SELECT
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col, 1,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) - 1)
ELSE SUBSTRING(col, 1, CHARINDEX(' ', col) - 1)
END AS first,
CASE WHEN LEN(col) = LEN(REPLACE(col, ' ', '')) + 2
THEN SUBSTRING(col,
CHARINDEX(' ', col, CHARINDEX(' ', col) + 1) + 1,
LEN(col) - CHARINDEX(' ', col, CHARINDEX(' ', col)))
ELSE SUBSTRING(col,
CHARINDEX(' ', col) + 1,
LEN(col) - CHARINDEX(' ', col))
END AS last
FROM yourTable;
Yuck, but it seems to work. My feeling is that you should fix your data model at some point. A more ideal place to scrub your name data would be outside the database, e.g. in Java. Or, better yet, fix the source of your data such that you record proper first and last names from the very beginning.
Another option (just for fun) is to use a little XML in concert with a CROSS APPLY
Example
Select FirstName = ltrim(reverse(concat(Pos2,' ',Pos3,' ',Pos4,' ',Pos5)))
,LastName = reverse(Pos1)
From YourTable A
Cross Apply (
Select Pos1 = xDim.value('/x[1]','varchar(max)')
,Pos2 = xDim.value('/x[2]','varchar(max)')
,Pos3 = xDim.value('/x[3]','varchar(max)')
,Pos4 = xDim.value('/x[4]','varchar(max)')
,Pos5 = xDim.value('/x[5]','varchar(max)')
From (Select Cast('<x>' + replace(reverse(A.[Full Name]),' ','</x><x>')+'</x>' as xml) as xDim) XMLData
) B
Returns
FirstName                LastName
John Paul                White
Peter                    Smith
Ann Marie                Brown
Jack                     Black
Sam Olaf                 Turner
                         Cher
Sally Anne Bella Donna   Baxter
You're trying to do two things at once...I won't solve for you, but here's the direction I'd take:
1) Check this out for string splitting: https://ole.michelsen.dk/blog/split-string-to-table-using-transact-sql.html. This will allow you to parse the name into a temp table and you can perform your logic on it to create names based on your rules
2) Create this as a table-valued function so that you can return a single row of parsed FirstName, LastName from your parameter. That way you can join to it and include in your results
Have you tried using the PARSENAME function?
Another method of splitting a full name into its first name and last name is the PARSENAME string function, as can be seen in the following script:
DECLARE @FullName VARCHAR(100)
SET @FullName = 'John White Doe'
SELECT CONCAT(PARSENAME(REPLACE(@FullName, ' ', '.'), 3),' ',PARSENAME(REPLACE(@FullName, ' ', '.'), 2)) AS [FirstName],
PARSENAME(REPLACE(@FullName, ' ', '.'), 1) AS [LastName]
Make it a table-valued function.
And this is the code you need to create your function. Basically you just need to split your LastName
IF OBJECT_ID(N'dbo.ufnParseName', N'TF') IS NOT NULL
DROP FUNCTION dbo.ufnParseName;
GO
CREATE FUNCTION dbo.ufnParseName(@FullName VARCHAR(300))
RETURNS @retParseName TABLE
(
-- Columns returned by the function
FirstName nvarchar(150) NULL,
LastName nvarchar(50) NULL
)
AS
-- Returns the full name split into first name and last name.
BEGIN
DECLARE
@FirstName nvarchar(250),
@LastName nvarchar(250);
-- The last name is everything after the last space
SELECT @LastName = RTRIM(RIGHT(@FullName, CHARINDEX(' ', REVERSE(@FullName)) - 1));
-- The first name is whatever remains once the last name is removed
SELECT @FirstName = LTRIM(RTRIM(Replace(@FullName, @LastName, '')));
INSERT @retParseName
SELECT @FirstName, @LastName;
RETURN;
END
You can run it as SELECT * FROM dbo.ufnParseName('M J K');
Why a table-valued function?
You can get rid of the duplication in your SQL queries and keep them DRY.
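To apply the function to every row rather than to a single literal, it can be joined with CROSS APPLY. A minimal sketch, assuming the function above has been created and that the names live in a table called Customers with a FullName column (both names are made up for illustration):

```sql
-- Customers and FullName are hypothetical names for the source table and column.
SELECT c.FullName,
       p.FirstName,
       p.LastName
FROM Customers AS c
CROSS APPLY dbo.ufnParseName(c.FullName) AS p;  -- one parsed row per customer
```

Single-word names would need a guard inside the function first, since CHARINDEX returns 0 when there is no space and RIGHT then receives a negative length.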
You can try the below query. It is written as per your requirement and it only handles full_name with 2 or 3 parts in it.
;WITH cte AS(
SELECT full_name, (LEN(full_name) - LEN(REPLACE(full_name, ' ', '')) + 1) AS size FROM #temp
)
SELECT FirstName =
CASE
WHEN size=3 THEN PARSENAME(REPLACE(full_name, ' ', '.'), 3) + ' ' + PARSENAME(REPLACE(full_name, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(full_name, ' ', '.'), 2)
END,
PARSENAME(REPLACE(full_name, ' ', '.'), 1) AS LastName
FROM cte

How to get middle portion from Sql server table data?

I am trying to get First name from employee table, in employee table full_name is like this: Dow, Mike P.
I tried to get the first name using the syntax below, but the result includes the middle initial. How can I remove the middle initial from the first name, if there is one? Not all names contain a middle initial.
-- query--
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
len(Employee_First_Name)) AS FirstName
---> remove middle initial from right side from employee
-- result
Full_name      Firstname
Dow,Mike P.    Mike P.
--few example for Full_name data---
smith,joe j. --->joe (need result as)
smith,alan ---->alan (need result as)
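Instead of passing LEN as the length, use CHARINDEX again to find the space that ends the first name. The third argument of CHARINDEX is the position to start searching from, not an occurrence number, so start the search just after the comma and subtract the comma position to get a length.
select Employee_First_Name as full_name,
LTRIM(SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 2) - CHARINDEX(',', Employee_First_Name) - 1)) AS FirstName
from employee
One thing to note: the second CHARINDEX returns 0 if there is no space after the first name, which makes the length negative. In that case, you would want to use something like the following:
select Employee_First_Name as full_name,
LTRIM(SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
IIF(CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 2) = 0, LEN(Employee_First_Name), CHARINDEX(' ', Employee_First_Name, CHARINDEX(',', Employee_First_Name) + 2) - CHARINDEX(',', Employee_First_Name) - 1))) AS FirstName
from employee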
This removes the portion before the comma.. then uses that string and removes everything after space.
WITH cte AS (
SELECT *
FROM (VALUES('smith,joe j.'),('smith,alan'),('joe smith')) t(fullname)
)
SELECT
SUBSTRING(
LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname))),
0,
COALESCE(NULLIF(CHARINDEX(' ',LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname)))),0),LEN(fullname)))
FROM cte
output
------
joe
alan
joe
To be honest, this is most easily expressed using multiple levels of logic. One way is using outer apply:
select ttt.firstname
from t outer apply
(select substring(t.full_name, charindex(', ', t.full_name) + 2, len(t.full_name)) as firstmi
) tt outer apply
(select (case when tt.firstmi like '% %'
then left(tt.firstmi, charindex(' ', tt.firstmi) - 1)
else tt.firstmi
end) as firstname
) as ttt
If you want to put this all in one complicated statement, I would suggest a computed column:
alter table t
add firstname as (stuff((case when full_name like '%, % %.'
then left(full_name,
charindex(' ', full_name, charindex(', ', full_name) + 2) - 1
)
else full_name
end),
1,
charindex(', ', full_name) + 1,
''));
If the format of the full_name field is the same for all rows, you can use the SQL Server full-text search (FTS) word breaker for this task:
SELECT N'Dow, Mike P.' AS full_name INTO #t
SELECT display_term FROM #t
CROSS APPLY sys.dm_fts_parser(N'"' + full_name + N'"', 1033, NULL, 1) p
WHERE occurrence = 2
DROP TABLE #t

Extract last name, first name and suffix into separate columns

I was wondering if someone could provide me an easy way to extract the names into different columns as below. There is a comma after the Last Name and space between First Name, Middle Initial, and Suffix. Greatly appreciate it.
Stored Data:
Name
Walker,James M JR
Smith,Jack P
Smith,Whitney
Required result:
LastName FirstName Suffix
Walker James JR
Smith Jack
Smith Whitney
Tried Code:
select top 5 Name,
LEFT(Name, CHARINDEX(',', Name) - 1) AS LastName,
right(Name, len(Name) - CHARINDEX(',', Name)) as FirstName
I'm just having a problem separating the First Name from the Middle Initial and Suffix, and then getting the Suffix after the last space from the right.
You really should store these parts of the name in separate columns (first normal form) to avoid such parsing.
You can put all the logic into one huge call of nested functions, but it is quite handy to separate them into single calls using CROSS APPLY.
The parsing is straight-forward:
find position of comma
split the string into part before comma (LastName) and part AfterComma
find position of first space in the second part AfterComma
split the string into two parts again - this gives FirstName and the rest (AfterSpace)
find position of space in AfterSpace
split the string into two parts again - this gives Initial and Suffix.
The query also checks results of CHARINDEX - it returns 0 if the string is not found.
Obviously, if the string value is not in the expected format, you'll get incorrect result.
DECLARE @T TABLE (Name varchar(8000));
INSERT INTO @T (Name) VALUES
('Walker'),
('Walker,James M JR'),
('Smith,Jack P'),
('Smith,Whitney');
SELECT
Name
,LastName
,AfterComma
,FirstName
,AfterSpace
,MidInitial
,Suffix
FROM
@T
CROSS APPLY (SELECT CHARINDEX(',', Name) AS CommaPosition) AS CA_CP
CROSS APPLY (SELECT CASE WHEN CommaPosition > 0 THEN
LEFT(Name, CommaPosition - 1) ELSE Name END AS LastName) AS CA_LN
CROSS APPLY (SELECT CASE WHEN CommaPosition > 0 THEN
SUBSTRING(Name, CommaPosition + 1, 8000) ELSE '' END AS AfterComma) AS CA_AC
CROSS APPLY (SELECT CHARINDEX(' ', AfterComma) AS SpacePosition) AS CA_SP
CROSS APPLY (SELECT CASE WHEN SpacePosition > 0 THEN
LEFT(AfterComma, SpacePosition - 1) ELSE AfterComma END AS FirstName) AS CA_FN
CROSS APPLY (SELECT CASE WHEN SpacePosition > 0 THEN
SUBSTRING(AfterComma, SpacePosition + 1, 8000) ELSE '' END AS AfterSpace) AS CA_AS
CROSS APPLY (SELECT CHARINDEX(' ', AfterSpace) AS Space2Position) AS CA_S2P
CROSS APPLY (SELECT CASE WHEN Space2Position > 0 THEN
LEFT(AfterSpace, Space2Position - 1) ELSE AfterSpace END AS MidInitial) AS CA_MI
CROSS APPLY (SELECT CASE WHEN Space2Position > 0 THEN
SUBSTRING(AfterSpace, Space2Position + 1, 8000) ELSE '' END AS Suffix) AS CA_S
result
Name                LastName  AfterComma  FirstName  AfterSpace  MidInitial  Suffix
Walker              Walker
Walker,James M JR   Walker    James M JR  James      M JR        M           JR
Smith,Jack P        Smith     Jack P      Jack       P           P
Smith,Whitney       Smith     Whitney     Whitney

How to format the order of first/last name and remove prefix and nickname

I have a need to retrieve a hierarchy of managers and the column which stores the manager names for a given person are formatted like this Smith, Mr. William (Bill). I want this output to simply be William Smith. So far I have put this together:
SELECT DISTINCT RIGHT(u.manager, LEN(u.manager)-(1+CHARINDEX(', ', u.manager))) + ' ' +
LEFT(u.manager, CHARINDEX(', ', u.manager) - 1) as ManagerName
FROM Users u
The current result from that query using my example above is Mr. William (Bill) Smith. This CHARINDEX and SUBSTRING stuff always gives me a lot of trouble so I am not really sure what the easiest way to do this is. This is also a one-off, so I am not sure a function would be useful here.
SELECT
SUBSTRING(manager,0,CHARINDEX(',', manager)) as surname,
SUBSTRING(manager,CHARINDEX('. ', manager)+2, CHARINDEX(' (', manager)-CHARINDEX('. ', manager)-2) as name,
CONCAT(SUBSTRING(manager,CHARINDEX('. ', manager)+2, CHARINDEX(' (', manager)-CHARINDEX('. ', manager)-2),
' ',
SUBSTRING(manager,0,CHARINDEX(',', manager))) as 'name surname'
FROM
Users
Result:
+---------+---------+---------------+
| surname | name    | name surname  |
+---------+---------+---------------+
| Smith   | William | William Smith |
+---------+---------+---------------+
I took your query and modified a little bit:
SELECT
---this is the tricky part: the inner part finds the first instance of the '(' parenthesis
--and keeps only the part of the first name to its left
CONCAT (
LEFT(t.FirstName, LEN(t.FirstName) - (LEN(t.FirstName) - CHARINDEX('(', t.FirstName) + 1))
,t.LastName
)
FROM (
--basically separating your above syntax to two columns
SELECT RIGHT('Smith, Mr. William (Bill)', LEN('Smith, Mr. William (Bill)') - CHARINDEX('.', 'Smith, Mr. William (Bill)') - 1) AS FirstName
,LEFT('Smith, Mr. William (Bill)', CHARINDEX(', ', 'Smith, Mr. William (Bill)') - 1) AS LastName
) t
Here is the query that should work with your table name and column:
SELECT
---Use case when statement to determine if there are any instances of '(' in the first name
CONCAT (
CASE
WHEN CHARINDEX('(', t.FirstName) > 0
THEN LEFT(t.FirstName, LEN(t.FirstName) - (LEN(t.FirstName) - CHARINDEX('(', t.FirstName) + 1))
ELSE t.FirstName + ' '
END
,t.LastName
)
FROM (
SELECT
RIGHT(u.manager, LEN(u.manager) - CHARINDEX('.', u.manager) - 1) AS FirstName
,LEFT(u.manager, CHARINDEX(', ', u.manager) - 1) AS LastName from Users u
) t
SELECT RIGHT(NameStripped, LEN(NameStripped) - (1 + CHARINDEX(', ', NameStripped))) + ' ' + LEFT(NameStripped, CHARINDEX(', ', NameStripped) - 1) AS ManagerName --Your original code
FROM (
SELECT replace(replace(
LEFT(u.manager, CHARINDEX('(', u.manager) - 2) --Get rid of nickname
, 'Mr. ', ''), 'Ms. ', '') AS NameStripped --Get rid of Mr/Ms
from MyTable u) a
This should work - I used the code you posted, but added a subquery to remove the nicknames and prefixes.
Note that you may need to adjust this if a) you have more prefix options than this (in which case you could add additional replaces) and/or b) not everyone in your database has a nickname (in which case you'll want to wrap that part in a case statement, most likely).
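For case b), one way to write that guard is sketched below. It assumes every value contains ', ' after the surname and that the only prefixes are 'Mr. ' and 'Ms. '; the two sample rows are inline so the query runs without the real table, and the second row is a made-up name with no nickname.

```sql
-- Sketch only: replace the VALUES list with the real Users table.
SELECT u.manager,
       LTRIM(SUBSTRING(p.NoPrefix, CHARINDEX(', ', p.NoPrefix) + 2, LEN(p.NoPrefix)))
       + ' ' + LEFT(p.NoPrefix, CHARINDEX(', ', p.NoPrefix) - 1) AS ManagerName
FROM (VALUES ('Smith, Mr. William (Bill)'),
             ('Jones, Ms. Anna')) AS u(manager)
-- Step 1: cut at '(' only when a nickname is present
CROSS APPLY (SELECT CASE WHEN CHARINDEX('(', u.manager) > 0
                         THEN RTRIM(LEFT(u.manager, CHARINDEX('(', u.manager) - 1))
                         ELSE u.manager
                    END AS NoNick) AS n
-- Step 2: strip the known prefixes
CROSS APPLY (SELECT REPLACE(REPLACE(n.NoNick, 'Mr. ', ''), 'Ms. ', '') AS NoPrefix) AS p;
```

The two rows come back as 'William Smith' and 'Anna Jones'. Splitting the work into CROSS APPLY steps keeps each CHARINDEX call readable and makes it easy to add further REPLACE calls for other prefixes.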