SQL: How to find exact matches of words within a text

Please bear with me, I'm new to Access and SQL.
What I'm trying to do is to write a SQL query to filter through two tables - one contains words that are split into two columns and the other contains text. Essentially, what I want is a new table that gives me all of the exact matches of the two columns of words with the column of text.
Here's an analogous database to simulate what I want as a result:
Table A:
FirstName:  LastName:
John        Doe
Jane        Doe
Josh        Smith
James       Jones
David       Johnson
Table B:
FullName:
Jake Davidson
Mike Peters
Jason James
John Michael Smith
Query Result:
FirstName:  LastName:  FullName:
John        Doe        John Michael Smith
Josh        Smith      John Michael Smith
James       Jones      Jason James
(notice that the David - Davidson match didn't come up. i.e. I'd like exact matches only)
So help me fill in the blanks:
SELECT TableA.FirstName,TableA.LastName, TableB.FullName
FROM TableA,TableB
WHERE TableB.FullName LIKE (has an exact match with TableA.FirstName--not sure what to put )
UNION
SELECT TableA.FirstName,TableA.LastName, TableB.FullName
FROM TableA,TableB
WHERE TableB.FullName LIKE (has an exact match with TableA.LastName--not sure what to put)
;

This will depend on what you want to do with FullNames that have more than two names, like "John Jacob Smith". Assuming you want to ignore the middle word[s], try matching the first word against FirstName and the last word against LastName (tableA is joined twice, once for each):
Select f.firstname, l.lastname, b.fullname
from (tableB b
Inner Join tableA f
On f.firstname = Mid(b.fullname, 1, InStr(b.fullname, " ")-1))
Inner Join tableA l
On l.lastname = Mid(b.fullname, InStrRev(b.fullname, " ")+1)

Here is an approach that compares each FullName to both Firstname and LastName:
select a.Firstname, a.LastName, b.FullName
from tableA as a inner join
tableB as b
on instr(' '&b.FullName&' ', ' '&a.FirstName&' ') > 0 and
instr(' '&b.FullName&' ', ' '&a.Lastname&' ') > 0
It assumes that the delimiter for names is a space (as in your example). The comparison attaches a space to the beginning and end of FullName and then looks for a space-padded first name and last name.
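The space-padding idea is easy to check outside Access. The following is a minimal Python sketch, not the Access query itself: the helper name `contains_word` and the in-memory lists are hypothetical, and it accepts a match on either name (as in the question's UNION) so that it reproduces the expected result from the question. Change `or` to `and` to require both names.

```python
def contains_word(text, word):
    """Return True if `word` occurs in `text` as a whole space-delimited word."""
    # Pad both sides with a space so "David" cannot match inside "Davidson".
    return f" {word} " in f" {text} "

# Sample rows copied from the question.
table_a = [("John", "Doe"), ("Jane", "Doe"), ("Josh", "Smith"),
           ("James", "Jones"), ("David", "Johnson")]
table_b = ["Jake Davidson", "Mike Peters", "Jason James", "John Michael Smith"]

matches = [
    (first, last, full)
    for first, last in table_a
    for full in table_b
    if contains_word(full, first) or contains_word(full, last)
]

for row in matches:
    print(*row)
```

Like the SQL version, this only treats a single space as a delimiter; punctuation or tabs next to a name would need extra handling.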

Related

Extract the first word before a space using SQL

Let's say I have a column in my users table called name:
name
----
Brian
Jill Johnson
Sarah
Steven Smith
I want to write a PostgreSQL query to fetch just the first name. This is what I have so far:
select substr(name, 0, position(' ' IN name) | length(name) + 1) as first_name from users;
name
----
Brian
Jill
Sarah
Steven
This appears to work but I feel like there must be an easier way.
Although I don't follow the logic, your code looks like Postgres. If so, use split_part():
select split_part(name, ' ', 1)
from users;
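To check the expected result outside the database, the effect of taking the first space-delimited field can be mimicked in Python. This is only a sketch of the behaviour on the sample rows; `first_word` is a hypothetical helper, not part of any library.

```python
def first_word(name):
    # Everything before the first space; the whole string if there is no space.
    return name.split(" ", 1)[0]

# Sample rows copied from the question.
names = ["Brian", "Jill Johnson", "Sarah", "Steven Smith"]
first_names = [first_word(n) for n in names]
print(first_names)
```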

How replace only the next word after search string?

How can I find and replace only the next word after a search string with a select statement?
For example:
"The user Mr Smith helped me a lot" --> Output: "The user Mr X helped me a lot"
The search string is "Mr" and there are many different last names (data protection reasons).
Thank you :)
I'm not sure you are asking the correct question.
With your "replace next word" approach you will run into issues.
By the looks of your example, this is a free text input from an end user (assuming VARCHAR(MAX)), and it is hard to predict what variations end users could type.
You could therefore have search items such as:
Mr Smith
Mr. Smith
Mister Smith
Mistar Smith (spelling error intentional for example)
Mrs Smith
Miss Smith
Mr & Mrs Smith (2 people)
Mr. + Mrs & Miss Smith (Family of 3 with the end users using/not using punctuation and using different AND symbols)
Mrs&MrSmith (End user didn't want Spaces for the entity)
Dr. Smith (or Doc or Doc. or Doct or Doct. or Doctor or misspelled Docter)
Prof. Smith (Pro. or Professor or Proffessor or Profeser or Pr)
Father Smith (Ftr.)
Rev Smith (rev
Earl Smith
Sir Smith
Dame Smith
Lady Smith
Chancellor Smith
etc. (Note: these are just English titles, what if there is a German Herr, a French Monsieur or a title from any other language in the text?)
Also you have the issue of "the next word", what about double barrel names, names that are broken by hyphens or even those containing apostrophes (e.g. Mr Smith Carroll / Mr Smith-Carroll / Mr O'Carroll)?
At what point do you want the next word to finish? The next space? The next non-surname? Do you have a list of all surnames to check this against?
To be 100% sure that no name accidentally slips through unreplaced, you really need to encrypt the db.
Make it protocol going forward not to allow the use of actual names in free text boxes, i.e. have the end users type "Mr X" in the text box from now on, but encryption seems to be your best/safest option.
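The kind of leak described above is easy to demonstrate. The following Python sketch applies a deliberately naive "replace the word after Mr" rule; the function name `mask_after_mr` and the rule itself are assumptions made for illustration, not a recommended solution.

```python
import re

def mask_after_mr(text):
    # Naive rule: replace the run of word characters that follows "Mr"
    # and some whitespace. Anything that deviates from that shape is missed.
    return re.sub(r"\bMr\s+\w+", "Mr X", text)

samples = [
    "The user Mr Smith helped me a lot",   # the case from the question
    "The user Mr. Smith helped me a lot",  # period after the title
    "The user Mr O'Carroll helped me",     # apostrophe in the surname
    "The user Mr Smith-Carroll helped me", # hyphenated surname
]
for s in samples:
    print(mask_after_mr(s))
```

Only the first sample is fully masked. The second is left untouched, and the last two keep part of the surname, which is exactly the risk with free text.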
Use the STUFF() function of MSSQL.
Below is a query which would help you to extract the required result.
Note: I am using 'addressstreet' as the column, which you can replace with your own column, and 'BOX' as the search string, which you can replace with 'Mr'.
WITH CTE AS (
    -- TEST0: text before the search string; TEST: text after it
    SELECT SUBSTRING(addressstreet, 1, CHARINDEX(' BOX ', addressstreet + ' ') - 1) AS TEST0,
           REPLACE(SUBSTRING([addressstreet], CHARINDEX('BOX ', [addressstreet]), LEN([addressstreet])), 'BOX', '') AS TEST,
           addressstreet
    FROM [dbo].[Vendor]
    WHERE addressstreet LIKE '% BOX %'
),
CTE2 AS (
    SELECT TEST0, LTRIM(STUFF(TEST, 1, CHARINDEX(' ', TEST), '')) AS TEST1, addressstreet
    FROM CTE
)
-- drop the word that followed the search string and splice in the replacement
SELECT TEST0 + ' Mr X ' + LTRIM(STUFF(TEST1, 1, CHARINDEX(' ', TEST1), '')) AS RequiredOutput, addressstreet
FROM CTE2
This query is not optimal for performance, but it achieves the objective.
Output:
RequiredOutput AddressStreet
P.O. Mr X Street Plaza P.O. BOX 32109 Street Plaza
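The steps of that query (take the text before the search string, take the text after it, drop the first word of the remainder, splice in the replacement) can be sketched in Python to make the logic easier to follow. The function name `replace_next_word` is hypothetical, and like the query it only recognises the search string when it is surrounded by single spaces.

```python
def replace_next_word(text, search, replacement):
    """Replace `search` and the word after it with `replacement`."""
    marker = f" {search} "
    pos = text.find(marker)
    if pos == -1:
        return text  # search string not present as a separate word
    before = text[:pos]
    after = text[pos + len(marker):].lstrip()
    # Drop the first word of the remainder (the word to be hidden).
    _, _, rest = after.partition(" ")
    return f"{before} {replacement} {rest.lstrip()}".rstrip()

print(replace_next_word("P.O. BOX 32109 Street Plaza", "BOX", "Mr X"))
print(replace_next_word("The user Mr Smith helped me a lot", "Mr", "Mr X"))
```

The same caveats about titles with punctuation and multi-part surnames apply to this approach.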

VBA Access SQL - field within LIKE operator

Can I use a table column within a Like operator? I've created an example,
TableA
Names                    Location
Albert Smith Senior      Aberdeen
John Lee                 London
Michael Rogers Junior    Newcastle
Mary Roberts             Edinburgh
TableB
Names
Albert Smith
John Lee
Michael Rogers
I want to do a query such as:
SELECT TableA.Location
into NewTable
FROM TableA
WHERE TableA.Names Like '*[TableB.Names]*';
In this case, there would be no match for Mary Roberts, Edinburgh but the first three locations would be returned.
Is it possible to put a column into a like statement?
If not does anyone have any ideas how I could do this?
Hope you can help
PS I can't use an actual asterisk since this is removed and the text italicised, also I have read about using % instead but this has not worked for me.
You can join the two tables and use LIKE within the JOIN clause:
SELECT TableA.Location
into NewTable
FROM TableA
INNER JOIN TableB ON TableA.Names LIKE TableB.Names & '*';
Honestly, I had no idea that you can do this in Access before I tried it just now :-)
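To see which rows a join on "TableA.Names starts with TableB.Names" should return, here is a small Python sketch over the sample data from the question. It is an illustration of the matching rule only: `str.startswith` is case-sensitive, whereas Access LIKE normally is not, and the list names are made up for the example.

```python
# Sample rows copied from the question.
table_a = [
    ("Albert Smith Senior", "Aberdeen"),
    ("John Lee", "London"),
    ("Michael Rogers Junior", "Newcastle"),
    ("Mary Roberts", "Edinburgh"),
]
table_b = ["Albert Smith", "John Lee", "Michael Rogers"]

# One output row per (TableA row, TableB row) pair that matches,
# which mirrors what an inner join produces.
locations = [
    location
    for name, location in table_a
    for prefix in table_b
    if name.startswith(prefix)
]
print(locations)
```

Because this mirrors a join, a TableA row that matched several TableB names would appear once per match.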

Oracle 10G regexp for Name

I am trying to write a regexp_replace to create a "Friendly" name for some employees. They are currently stored as FIRST <POSSIBLE MIDDLE INITIAL> LAST <POSSIBLE SUFFIX> <MULTIPLE WHITESPACE> SITE_ID
For example,
JOHN SMITH ABC
JOHN Q SMITH ABC
JOHN Q SMITH III ABC
I am trying to write a regex so that I will end up with:
Smith, John
Smith, John Q
Smith III, John Q
The ABC "Site ID" doesn't need to be included in my output.
This is what I tried with little success:
regexp_replace(
employee_name,
'^(\S+)\s(\S+)\s(\S+)',
'\3, \1 \2'
)
Also, I am using Oracle 10G. Any help would be greatly appreciated!
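If your names don't show the problems ruakh points out, i.e., there are no single-letter names or surnames and no Hispanic names, you can try this regexp (the optional middle initial is captured together with its leading space, so the replacement can concatenate the groups directly):
^(\S+)(\s\S)?\s(\S+)(\s\S+)?\s\s+\S+$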
The replacement should be:
\3\4, \1\2
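A pattern along these lines can be tried out quickly in Python before porting it to Oracle. The sketch below uses Python's `re` module, not Oracle's regexp engine, so support for `\s` and `\S` on Oracle 10g still needs to be checked separately. It captures the optional middle initial together with its leading space, and the sample inputs assume at least two whitespace characters before the site ID, as the pattern requires. `friendly_name` is a hypothetical helper.

```python
import re

# Groups: 1 = first name, 2 = optional " Q", 3 = last name, 4 = optional " III".
PATTERN = re.compile(r"^(\S+)(\s\S)?\s(\S+)(\s\S+)?\s\s+\S+$")

def friendly_name(raw):
    # Groups that did not participate in the match are substituted as empty strings.
    return PATTERN.sub(r"\3\4, \1\2", raw)

for raw in ["JOHN SMITH   ABC", "JOHN Q SMITH   ABC", "JOHN Q SMITH III   ABC"]:
    print(friendly_name(raw))
```

The result keeps the original upper case; converting to "Smith, John" would need a separate step. A name followed by only a single space before the site ID does not match and is returned unchanged.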

T-SQL Query Problem

I have a column in my database called FullName, and I want to split that name into FirstName and LastName:
Here is an Example:
FullName
Sam Peter
I want this to be
FirstName LastName
--------------------
Sam Peter
But the problem is that some of the rows in the database have full names like this:
FullName
--------
Sam George Jack Peter
Sam Adam Peter
I want this to be
FirstName         LastName
---------         --------
Sam George Jack   Peter
Sam Adam          Peter
How do I write a T-SQL query for this?
Thanks in Advance for all the help
There's a very thorough name parsing routine described in this answer. It handles your situation, along with much trickier cases like "Mr. Martin J Van Buren III".
String manipulation in SQL Server is notoriously weak.
Your best bet is to do it in your application layer.
For your example with more than 2 names, how do you know which fields those additional names go into? Are you guaranteed they will always have only one last name?
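Found this on the net (did not test it). The version I found used FINDSTRING, which is an SSIS expression function; in T-SQL the equivalent is CHARINDEX, and the LTRIM removes the space that is picked up along with the last word:
LTRIM(REVERSE(SUBSTRING(REVERSE([FullName]),1,CHARINDEX(' ',REVERSE([FullName])))))
You can test it with
SELECT LTRIM(REVERSE(SUBSTRING(REVERSE([FullName]),1,CHARINDEX(' ',REVERSE([FullName])))))
FROM Table
If it works you can then
UPDATE Table
SET LastName = LTRIM(REVERSE(SUBSTRING(REVERSE([FullName]),1,CHARINDEX(' ',REVERSE([FullName])))))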
I leave the exercise for the first name to you.
Are you just splitting at the last space? If so this should work:
select 'Sam George Jack Peter' as FullName
into #names
union select 'Sam Adam Peter'
select LEFT(FullName,LEN(FullName)-CHARINDEX(' ',REVERSE(FullName))) as FirstName
,RIGHT(FullName,CHARINDEX(' ',REVERSE(FullName))-1) as LastName
from #names
EDIT:
To handle names with no spaces and put the FullName as LastName
select 'Sam George Jack Peter' as FullName
into #names
union select 'Sam Adam Peter'
union select 'Peter'
select CASE
WHEN CHARINDEX(' ',FullName) = 0 THEN ''
ELSE LEFT(FullName,LEN(FullName)-CHARINDEX(' ',REVERSE(FullName)))
END as FirstName
,CASE
WHEN CHARINDEX(' ',FullName) = 0 THEN FullName
ELSE LTRIM(RIGHT(FullName,CHARINDEX(' ',REVERSE(FullName))))
END as LastName
from #names
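If the split ends up in the application layer, as suggested earlier, the same "split at the last space" rule is short to express. A minimal Python sketch, assuming the last word is always the last name; `split_full_name` is a hypothetical helper.

```python
def split_full_name(full_name):
    """Split at the last space: everything before it is FirstName, the rest is LastName."""
    # rpartition returns ('', '', text) when there is no space, so a
    # single-word name ends up entirely in LastName, as in the EDIT above.
    first, _, last = full_name.strip().rpartition(" ")
    return first, last

for name in ["Sam Peter", "Sam George Jack Peter", "Sam Adam Peter", "Peter"]:
    print(split_full_name(name))
```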