Oracle 10G regexp for Name - sql

I am trying to write a regexp_replace to create a "Friendly" name for some employees. They are currently stored as FIRST <POSSIBLE MIDDLE INITIAL> LAST <POSSIBLE SUFFIX> <MULTIPLE WHITESPACE> SITE_ID
For example,
JOHN SMITH ABC
JOHN Q SMITH ABC
JOHN Q SMITH III ABC
I am trying to write a regex so that I will end up with:
Smith, John
Smith, John Q
Smith III, John Q
The ABC "Site ID" doesn't need to be included in my output.
This is what I tried with little success:
regexp_replace(
employee_name,
'^(\S+)\s(\S+)\s(\S+)',
'\3, \1 \2'
)
Also, I am using Oracle 10G. Any help would be greatly appreciated!

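If your names don't show the problems ruakh points out, i.e., there are no single-letter first names or surnames and no Hispanic names, you can try this regexp:
^(\S+)(\s\S)?\s(\S+)(\s\S+)?\s\s+\S+$
The replacement should be:
\3\4, \1\2
Each optional group captures its own leading space, so the middle initial and the suffix stay separated from the names next to them.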
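To see how the groups line up, here is a minimal Python sketch of the same rearrangement. It uses a variant of the pattern in which each optional group captures its own leading space, and it assumes at least two whitespace characters before the site ID. The function name `friendly_name` is made up for illustration. Capitalization is left untouched, since the mixed-case output asked for in the question (e.g. `Smith III`) would need a separate step.

```python
import re

# FIRST, optional " M", LAST, optional " SUFFIX", 2+ whitespace, SITE_ID
NAME_RE = re.compile(r"^(\S+)(\s\S)?\s(\S+)(\s\S+)?\s\s+\S+$")

def friendly_name(raw: str) -> str:
    """Rearrange 'FIRST [M] LAST [SUFFIX]   SITE' as 'LAST [SUFFIX], FIRST [M]'."""
    # Optional groups that did not take part in the match expand to "".
    return NAME_RE.sub(r"\3\4, \1\2", raw)

for raw in ("JOHN SMITH   ABC", "JOHN Q SMITH   ABC", "JOHN Q SMITH III   ABC"):
    print(friendly_name(raw))
```

A string that does not match the pattern, for example one with only a single space before the site ID, is returned unchanged.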

Related

Teradata - Parsing data using strtok function - identify bad data

I need help identifying the bad data (PERSON_NAME) that makes the SELECT query below fail: it is unable to parse FIRST_NAME and LAST_NAME.
SELECT
PERSON_NAME
,Trim(OReplace(PERSON_NAME,StrTok(PERSON_NAME,' ',1),'')) AS AGENT_LAST_NAME
,StrTok(PERSON_NAME,' ',1) AS AGENT_FIRST_NAME
FROM TABLENAME
WHERE CAST(RECORD_START_TS AS DATE) = '2022-11-01';
*** Failure 6706 The string contains an untranslatable character.
The expected output should look like this:
PERSON_NAME | AGENT_LAST_NAME | AGENT_FIRST_NAME
Abraham Gomezfitzgerald | Gomezfitzgerald | Abraham
Adam Tregoning | Tregoning | Adam
Ajiel Marino | Marino | Ajiel
Alexander Ford III | Ford III | Alexander
Fernanda Garvey Hernandez | Garvey Hernandez | Fernanda
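One way to hunt for the offending rows outside the database is to export the PERSON_NAME values and scan them for unusual characters. The Python sketch below assumes that the failure comes from characters that cannot be translated to a Latin character set, and it uses Python's `latin-1` codec as a rough stand-in for that set. Both points are assumptions, not something the question confirms, and the sample value with the replacement character is invented for the demonstration.

```python
def untranslatable_chars(name: str) -> list:
    """Return the characters of `name` that cannot be encoded as Latin-1.

    Latin-1 is only an approximation of the database's character set (assumption).
    """
    bad = []
    for ch in name:
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            bad.append(ch)
    return bad

# Hypothetical sample: the second name carries a U+FFFD replacement character.
names = ["Adam Tregoning", "Ajiel Marino\ufffd"]
for n in names:
    bad = untranslatable_chars(n)
    if bad:
        print(repr(n), [f"U+{ord(c):04X}" for c in bad])
```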

Remove middle name from full name in SAS

I need a way to remove the middle names from my full name in SAS.
Example:
Name= MARY ANN SMITH
Name= JERRY J SMITH
Output wanted:
Name2= MARY SMITH
Name2= JERRY SMITH
Any ideas how I can do this?
If you have actual names of real people then the problem is much harder than you implied. Some people have first or last (or both) names that are more than one word. What about people that only have one name?
Anyway SCAN() can do what you want.
name2=catx(' ',scan(name,1,' '),scan(name,-1,' '));
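For readers who want to try the first-word/last-word idea outside SAS, here is a rough Python sketch of the same logic. The function name `drop_middle` is hypothetical, and returning a one-word name unchanged is an assumption about the desired behaviour, not something the question specifies.

```python
def drop_middle(name: str) -> str:
    """Keep only the first and last space-separated words of a name."""
    words = name.split()
    if len(words) < 2:
        # Assumption: a one-word (or empty) name is returned as-is.
        return " ".join(words)
    return f"{words[0]} {words[-1]}"

for name in ("MARY ANN SMITH", "JERRY J SMITH"):
    print(drop_middle(name))
```

As the answer warns, this treats every word between the first and the last as a middle name, so a two-word first or last name loses one of its words.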

Extract the first word before a space using SQL

Let's say I have a column in my users table called name:
name
----
Brian
Jill Johnson
Sarah
Steven Smith
I want to write a PostgreSQL query to fetch just the first name. This is what I have so far:
select substr(name, 0, position(' ' IN name) | length(name) + 1) as first_name from users;
name
----
Brian
Jill
Sarah
Steven
This appears to work, but I feel like there must be an easier way.
Although I don't follow the logic, your code looks like Postgres. If so, use split_part():
select split_part(name, ' ', 1)
from users;
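The behaviour relied on here is that a value without any space comes back whole. A small Python sketch of the same "first field" logic, with the hypothetical helper name `first_word`, shows this on the sample rows:

```python
def first_word(name: str) -> str:
    """Return everything before the first space, or the whole string if there is none."""
    # Splitting on an explicit separator always yields at least one element.
    return name.split(" ", 1)[0]

for name in ("Brian", "Jill Johnson", "Sarah", "Steven Smith"):
    print(first_word(name))
```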

How replace only the next word after search string?

How can I find and replace only the next word after a search string with a SELECT statement?
For example:
"The user Mr Smith helped me a lot" --> Output: "The user Mr X helped me a lot"
The search string is "Mr" and there are many different last names (the replacement is needed for data protection reasons).
Thank you :)
I'm not sure you are asking the correct question.
With your "replace next word" approach you will run into issues.
By the looks of your example, this is a free text input from an end user (assuming VARCHAR(MAX)) and it is hard to predict what variation end users could type.
You could therefore have search items:
- Mr Smith
- Mr. Smith
- Mister Smith
- Mistar Smith (spelling error intentional for example)
- Mrs Smith
- Miss Smith
- Mr & Mrs Smith (2 people)
- Mr. + Mrs & Miss Smith (Family of 3 with the end users using/not using punctuation and using different AND symbols)
- Mrs&MrSmith (End user didn't want Spaces for the entity)
- Dr. Smith (or Doc or Doc. or Doct or Doct. or Doctor or misspelled Docter)
- Prof. Smith (Pro. or Professor or Proffessor or Profeser or Pr)
- Father Smith (Ftr.)
- Rev Smith (rev
- Earl Smith
- Sir Smith
- Dame Smith
- Lady Smith
- Chancellor Smith
etc. (Note: these are just English titles, what if there is a German Herr, a French Monsieur or a title from any other language in the text?)
You also have the issue of what "the next word" means: what about double-barrelled names, names joined by hyphens, or names containing apostrophes (e.g. Mr Smith Carroll / Mr Smith-Carroll / Mr O'Carroll)?
At what point do you want the next word to finish? The next space? The next non-surname? Do you have a list of all surnames to check this against?
To be completely sure that no name slips through unreplaced, you really need to encrypt the database.
Make it protocol going forward not to allow the use of actual names in free text boxes, i.e. have the end users type "Mr X" in the text box from now on, but encryption seems to be your best/safest option.
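The pitfalls listed above are easy to reproduce. The Python sketch below applies a naive "replace the word after Mr" rule with a regular expression; the pattern and the function name `mask_after_mr` are illustrative assumptions, not a recommended solution. It handles the example sentence but leaks part of a two-word surname and ignores other titles.

```python
import re

# "Mr" or "Mr." as a whole word, then whitespace, then one run of non-space characters.
TITLE_RE = re.compile(r"\b(Mr\.?)\s+\S+")

def mask_after_mr(text: str) -> str:
    """Replace the single word that follows 'Mr' or 'Mr.' with 'X'."""
    return TITLE_RE.sub(r"\1 X", text)

samples = (
    "The user Mr Smith helped me a lot",  # works as intended
    "Mr Smith Carroll called",            # second surname word is left in place
    "Mrs Smith called",                   # title not covered, nothing replaced
)
for s in samples:
    print(mask_after_mr(s))
```

Every extra title, spelling variant, or multi-word surname needs another special case, which is why the answer argues against relying on text replacement for data protection.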
Use the STUFF() function of MSSQL.
Below is a query that extracts the required result.
Note: I am using 'addressstreet' as the column and 'BOX' as the search string; replace them with your own column and with 'Mr'.
WITH CTE AS (
    SELECT
        SUBSTRING(addressstreet, 1, (CHARINDEX(' BOX ', addressstreet + ' ') - 1)) AS TEST0,
        REPLACE(SUBSTRING([addressstreet], CHARINDEX('BOX ', [addressstreet]), LEN([addressstreet])), 'BOX', '') AS test,
        addressstreet
    FROM [dbo].[Vendor]
    WHERE addressstreet LIKE '% BOX %'
),
CTE2 AS (
    SELECT TEST0, LTRIM(STUFF(TEST, 1, CHARINDEX(' ', TEST), '')) AS TEST1, ADDRESSSTREET
    FROM CTE
)
SELECT TEST0 + ' Mr X ' + LTRIM(STUFF(TEST1, 1, CHARINDEX(' ', TEST1), '')) AS RequiredOutput, ADDRESSSTREET
FROM CTE2
This query is not optimized for performance, but it achieves the objective.
Output:
RequiredOutput | AddressStreet
P.O. Mr X Street Plaza | P.O. BOX 32109 Street Plaza

SQL: How to find exact matches of words within a text

Please bear with me, I'm new to Access and SQL.
What I'm trying to do is to write a SQL query to filter through two tables - one contains words that are split into two columns and the other contains text. Essentially, what I want is a new table that gives me all of the exact matches of the two columns of words with the column of text.
Here's an analogous database to simulate what I want as a result:
Table A:
FirstName | LastName
John | Doe
Jane | Doe
Josh | Smith
James | Jones
David | Johnson
Table B:
FullName
Jake Davidson
Mike Peters
Jason James
John Michael Smith
Query Result:
FirstName | LastName | FullName
John | Doe | John Michael Smith
Josh | Smith | John Michael Smith
James | Jones | Jason James
(notice that the David - Davidson match didn't come up. i.e. I'd like exact matches only)
So help me fill in the blanks:
SELECT TableA.FirstName,TableA.LastName, TableB.FullName
FROM TableA,TableB
WHERE TableB.FullName LIKE (has an exact match with TableA.FirstName--not sure what to put )
UNION
SELECT TableA.FirstName,TableA.LastName, TableB.FullName
FROM TableA,TableB
WHERE TableB.FullName LIKE (has an exact match with TableA.LastName--not sure what to put)
;
This depends on what you want to do with FullNames that have more than two names, like "John Jacob Smith". Assuming you want to ignore the middle word(s), try:
SELECT f.FirstName, l.LastName, b.FullName
FROM (TableB AS b
INNER JOIN TableA AS f
ON f.FirstName = Mid(b.FullName, 1, InStr(b.FullName, " ") - 1))
INNER JOIN TableA AS l
ON l.LastName = Mid(b.FullName, InStrRev(b.FullName, " ") + 1)
Here is an approach that compares each FullName to both Firstname and LastName:
select a.Firstname, a.LastName, b.FullName
from tableA as a inner join
tableB as b
on instr(' '&b.FullName&' ', ' '&a.FirstName&' ') > 0 and
instr(' '&b.FullName&' ', ' '&a.Lastname&' ') > 0
It assumes that the delimiter for names is a space (as in your example). The comparison attaches a space to the beginning and end of FullName and then looks for a space-padded first name and last name.
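The padding trick can be tried out on the sample data from the question. The sketch below is in Python rather than Access SQL, and the helper name `has_word` is hypothetical. It pairs rows when either name matches, because the expected result in the question lists rows where only one of the two names appears; requiring both would mean replacing `or` with `and`. Python's `in` is case-sensitive, which may differ from how InStr compares text in Access.

```python
def has_word(full_name: str, word: str) -> bool:
    """True if `word` occurs in `full_name` as a whole space-delimited word."""
    return f" {word} " in f" {full_name} "

table_a = [("John", "Doe"), ("Jane", "Doe"), ("Josh", "Smith"),
           ("James", "Jones"), ("David", "Johnson")]
table_b = ["Jake Davidson", "Mike Peters", "Jason James", "John Michael Smith"]

matches = [
    (first, last, full)
    for first, last in table_a
    for full in table_b
    if has_word(full, first) or has_word(full, last)
]

for row in matches:
    print(row)
```

Because both sides are padded with spaces, "David" does not match inside "Davidson".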