Using LIKE clause when formats are different - sql

I was given a patient list with names and am trying to match it against a list already in our database, but I am having trouble because of the format of the name field in the patient list. The list comes from a web form, so people can enter their names however they want, and it does not match up well.
WEBFORM_NAME     PATIENT_NAME
JOHN SMITH       SMITH,JOHN L
SHANNON BROWN    BROWN,SHANNON MARIE
Is there a way to use a LIKE clause in an instance like this? All I really need is the LIKE clause to find the first name because I have joined on phone number and email address already. My issue is when households have the same phone number and email address (spouses for example) I just want to return the right person in the household.

Not sure if all you need is the first name, but if so, here are expressions that extract it from each column:
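-- Appending ' ' guards against names that have no space after the first name.
SELECT LEFT(WEBFORM_NAME, CHARINDEX(' ', WEBFORM_NAME + ' ') - 1) AS FirstName1,
       SUBSTRING(PATIENT_NAME, CHARINDEX(',', PATIENT_NAME) + 1,
                 -- length runs from just after the comma up to (not including) the next space
                 CHARINDEX(' ', PATIENT_NAME + ' ', CHARINDEX(',', PATIENT_NAME) + 1) - CHARINDEX(',', PATIENT_NAME) - 1) AS FirstName2
FROM yourTable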

The assumption here seems to be that the webform (where the user types the name manually) uses the format <First Name> [<optional middle name(s)>] <Last Name>, whereas the data stored in the table has the form <Last Name>,<First Name> [<optional middle name(s)>]. It's not an exact science, but since other criteria (email, phone, etc.) have already been matched, the following is a reasonable best-effort approach:
select *
from webform w, patient p
where
  -- extract just the last name (the final word) and match it at the start of p.name
  regexp_like(p.name,
    '^' ||
    regexp_extract(w.name,
      '([^[:space:],]+[[:space:],]+)*([^[:space:],]+)', 1, 2))
  and -- extract the first name and match it after the comma
  regexp_like(p.name,
    ',[[:space:]]*' ||
    regexp_extract(w.name, '(^[^[:space:],]+)'))
Since the webform is free-form user input, it's hard to handle abbreviated middle names and other variations, so the query above matches on first name and last name only, which, in addition to the matching you are already doing, should help.
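
The same first-name/last-name comparison can be sketched outside the database. The following is a minimal illustration in Python rather than SQL, assuming the two formats described above; the function names are hypothetical and it ignores compound surnames, suffixes and a space after the comma in the last-name part only insofar as they are stripped, so treat it as a starting point, not a finished matcher.

```python
def parse_webform_name(name):
    """Return (first, last) from 'First [Middle ...] Last'."""
    parts = name.replace(",", " ").upper().split()
    if not parts:
        return ("", "")
    return (parts[0], parts[-1])


def parse_patient_name(name):
    """Return (first, last) from 'Last,First [Middle ...]'."""
    last, _, rest = name.upper().partition(",")
    given = rest.split()
    first = given[0] if given else ""
    return (first, last.strip())


def same_person(webform_name, patient_name):
    """True when first and last names agree, ignoring middle names and case."""
    return parse_webform_name(webform_name) == parse_patient_name(patient_name)


print(same_person("JOHN SMITH", "SMITH,JOHN L"))            # True
print(same_person("SHANNON BROWN", "BROWN,SHANNON MARIE"))  # True
print(same_person("JANE SMITH", "SMITH,JOHN L"))            # False
```

Comparing parsed (first, last) pairs instead of pattern-matching the raw strings keeps middle names and initials out of the comparison entirely.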

Related

How to get the specified string in a sentence

I have a column called Note which contains the patient name and phone number, and I want to fetch the patient name only. I tried using the CHARINDEX function, but the result includes the word "Phone" at the end of the name. How can I remove those trailing characters, or otherwise change the query to fetch only the name from the column?
Input:
Note
Patient Name: John Mathews Phone Number: 1234567890
Required Output: John Mathews
Currently, I am using the following SQL query to get the output as:
SELECT SUBSTRING(Note, CHARINDEX(':', Note)
, CHARINDEX('Phone',Note) - CHARINDEX('Name', Note))
The output I have received is :
Output: John Mathews Phone
I want to remove the Phone part. I tried various methods but was unable to find a proper solution.
Can someone show me what to change in this function, without using another substring to find the length and remove it from the end?
Based on your sample data and description, the patient name always begins at position 15. That makes this rather simple:
select substring(v.note, 15, charindex('Phone Number:', v.note) - 16)
from yourTable v
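
The fixed start position 15 relies on every row beginning with exactly "Patient Name: ". As a hedged illustration of the index arithmetic (written in Python, not SQL, with a hypothetical function name), the sketch below locates both labels instead of hard-coding the offset. It assumes both labels are present; `str.index` raises `ValueError` when one is missing.

```python
def extract_patient_name(note, start_marker="Patient Name:", end_marker="Phone Number:"):
    """Return the text between the two markers, with surrounding spaces removed."""
    start = note.index(start_marker) + len(start_marker)  # first character after the label
    end = note.index(end_marker, start)                   # where the next label begins
    return note[start:end].strip()


note = "Patient Name: John Mathews Phone Number: 1234567890"
print(extract_patient_name(note))  # John Mathews
```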

SQL query to get the first letter of each word and separate it by a dot and a space

I have never really used SQL much, but recent changes due to working from home are forcing me to gain some knowledge of it. I have been doing fine so far, but I am now running into a problem that I can't seem to find a solution for.
I have an Excel sheet that pulls customer information through a SQL query which is run by VBA code.
What I first needed to do is to get a full name from a customer and input this into the sheet. This works fine. I am using the following query for this:
Select concat(concat(Customer_First_Names,' '), Customer_Last_Name) FROM CustomerInformationTable where Customer_Number = &&1
This gives me the full name of a customer and spaces in between the first and last name and in between the names (the full first names are already spaced in between in the table).
Now I have another request: to retrieve not the full first names and last name of a customer, but their initials and the last name.
For example:
Barack Hussein Obama
Would become
B. H. Obama
I need to do two things for this:
1. Change my query to retrieve only the initial of each first name. As I said, all full first names (even if a customer has more than one) are located in the column Customer_First_Names.
2. Add a dot and a space after each initial.
How would I go about this?
I have been thinking about using SUBSTRING, but I am struggling with how to do this if there is more than one first name.
So this is not going to work:
Select concat(substr(Customer_First_Names, 1, 1), '. ') from CustomerInformationTable where Customer_Number = &&1
My apologies if this has already been asked on the board; I looked but did not find a suitable solution.
Assuming you don't want to see 2 dots after someone who has just one first name (like J.. Smith), then here's a solution that works in postgres. Not sure what your db is, so you may need to adjust as needed.
The 'with' query is splitting apart the first names, limiting to two.
The 'case' statement then checks if the person has a second first name. If not, then only the first initial is provided and followed by a dot. Otherwise, both first initials are followed by a dot. Final results, all initials and names are separated by a space (like T. R. Smith).
So, a table looking like this:
cid  first           last
1    JAKE            SMITH
2    TERREL HOWARD   WILLIAMS
3    PHOEBE M        KATES
Will produce the following results with the query below.
cid cust_name
1 J. SMITH
2 T. H. WILLIAMS
3 P. M. KATES
with first_names as (
    -- split the first names apart, keeping at most two
    select distinct customer_number,
           split_part(customer_first_names, ' ', 1) as first1,
           split_part(customer_first_names, ' ', 2) as first2
    from CustomerInformationTable
)
select distinct a.customer_number,  -- qualified: both tables have customer_number
       case
           when fn.first2 = '' then substring(fn.first1, 1, 1) || '.'
           else substring(fn.first1, 1, 1) || '. ' || substring(fn.first2, 1, 1) || '.'
       end
       || ' ' || a.customer_last_name as cust_name
from CustomerInformationTable a
join first_names fn on fn.customer_number = a.customer_number
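
The query above stops at two first names. For comparison, here is a small sketch of the same formatting rule for any number of first names, written in Python rather than SQL since the target database is not known; the function name is hypothetical.

```python
def initials_and_last_name(first_names, last_name):
    """Turn ('Barack Hussein', 'Obama') into 'B. H. Obama'."""
    # split() with no argument ignores repeated or trailing spaces
    initials = " ".join(name[0] + "." for name in first_names.split())
    # strip() covers the case where there are no first names at all
    return (initials + " " + last_name).strip()


print(initials_and_last_name("Barack Hussein", "Obama"))   # B. H. Obama
print(initials_and_last_name("JAKE", "SMITH"))             # J. SMITH
print(initials_and_last_name("TERREL HOWARD", "WILLIAMS")) # T. H. WILLIAMS
```

Because each initial carries its own dot and the pieces are joined with single spaces, a customer with one first name never ends up with a doubled dot.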

CASE ReGex with substring

I'm writing a SQL query where I take the first letter of two names (first name and last name) to create an initials column. The data is unstructured to a certain extent (I can't show it for GDPR reasons), and where there is a company name it is stored only in the surname column.
I'm trying to use a regex to detect when the existing initials column is a single letter (i.e. not a person's initials), and otherwise to run an expression that I wrote, which works on its own.
CAST(CASE
WHEN [DATA_TABLE].[INITIALS] = '\d' THEN (CONCAT(substring([DATA_TABLE].[FIRSTNAMES],1,1),substring([DATA_TABLE].[SURNAME],1,1)) AS char) AS INITIALS
ELSE [DATA_TABLE].[INITIALS]
end as char) as INITIALS,
An example of the data format:
First name   Last name            Initials
John         smith                JS
             Electrical company   E
Sam          Craig                SC
I want the rows where the name is only in the surname column (company names) to remain as they are, with no change (i.e. the \d regex case). The others should become the first letter of the first name, substring (1,1), followed by the first letter of the last name, also substring (1,1).

Get following letter from given

I have a table with company names. Some companies have different locations and different legal names but they should be reported under the same Group Code. The Code is made up using the first five letters.
Company          GroupCode
DEEZER FRANCE    DEEZE
DEEZER SPAIN     DEEZE
DEEZER ALGERIA   DEEZE
So far so good. Now I’m adding a different company which starts with the same letters but should get a new Group Code.
A new Code should be assigned if the company name does not contain a word which is part of a company name already having a GroupCode. In this Case DEEZER is the key word which determines association with GroupCode DEEZE
Rule is that the code should then use the first four letters + the fifth letter next in the alphabet. If this code also exists then use the first four letters + the fifth letter next but one in the alphabet. The required result would look like:
Company          GroupCode   Status
DEEZER FRANCE    DEEZE       EXISTING
DEEZER SPAIN     DEEZE       EXISTING
DEEZER ALGERIA   DEEZE       EXISTING
DEEZEMBER        DEEZF       CREATED
DEEZEMAL         DEEZG       CREATED
So what I need to figure out is the next "unused" letter. How can I achieve this with SQL Server 2008 R2?
Try this:
;with cte as (
    -- highest existing group code sharing the first four letters
    select max(groupcode) as maxcode
    from yourtable
    where left(groupcode, 4) = left(@companyname, 4)
)
insert into yourtable (company, groupcode, [status])
select @companyname,
       case when maxcode is null then left(@companyname, 5)  -- no clash: use the first five letters
            else left(maxcode, 4) + char(ascii(right(maxcode, 1)) + 1)  -- letter after the highest code
       end,
       'created'
from cte
Assumption: Your input is taking the company name as a parameter from somewhere, presumably the front end.
The idea is to use ascii function to get the ASCII code of the last letter, increment it by 1 and go back to the corresponding character using char function.
Be warned, however, that this is definitely not the best solution. For instance, I have not implemented bounds checking to ensure range between A and Z. In fact, I would suggest that you handle this in application code rather than at DB level.
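
If this does move to application code, the missing bounds check could look roughly like the sketch below. It is written in Python purely as an illustration: the function name is hypothetical, `existing_codes` stands in for the GroupCode values already in the table, and it assumes the caller has already decided that the company does not belong to an existing group.

```python
import string


def next_group_code(company, existing_codes):
    """Return an unused five-letter group code for a new company name."""
    name = company.upper()
    if len(name) < 5 or name[4] not in string.ascii_uppercase:
        raise ValueError("company name needs a letter in position five")
    start = string.ascii_uppercase.index(name[4])
    # Try the fifth letter itself, then each following letter, stopping at Z.
    for letter in string.ascii_uppercase[start:]:
        code = name[:4] + letter
        if code not in existing_codes:
            return code
    raise ValueError("no unused letter left for prefix " + name[:4])


codes = {"DEEZE"}
first = next_group_code("DEEZEMBER", codes)
codes.add(first)
second = next_group_code("DEEZEMAL", codes)
print(first, second)  # DEEZF DEEZG
```

Raising an error once Z is used up makes the limit explicit instead of silently producing a non-letter character, which is what incrementing the ASCII code alone would do.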

How do you query only part of the data in the row of a column - Microsoft SQL Server

I have a column called NAME with 2000 rows filled with people's full names, e.g. ANN SMITH. How do I write a query that lists all the people whose first name is ANN? There are about 20 different people whose first name is ANN, each with a different surname.
I tried
and (NAME = 'ANN')
but it returned zero results.
I have to enter the full name, and (NAME = 'ANN SMITH'), to get any result.
I just want to list all the people whose first name is ANN.
Try in your where clause:
Where Name like 'ANN %'
Should work mate.
ANN% will find all results where ANN is first then anything after.
%ANN% will find the 3 letters ANN in any part of that rows field.
Hope it helps
Also, a name is usually separated into first name and second name columns.
This saves having to use wildcards in your SQL and gives you slightly more normalized data.
SELECT NAME
FROM NAMES
WHERE NAME LIKE 'ANN %'
This should wildcard select anything that begins with 'ANN' followed by a space.
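
To see how the three patterns mentioned above differ, here is a small self-contained check. It runs the LIKE queries through Python's built-in sqlite3 module, which is only a stand-in for SQL Server, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NAMES (NAME TEXT)")
conn.executemany(
    "INSERT INTO NAMES (NAME) VALUES (?)",
    [("ANN SMITH",), ("ANN BROWN",), ("ANNA JONES",), ("JOANN LEE",)],
)


def names_like(pattern):
    """Return all names matching the LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT NAME FROM NAMES WHERE NAME LIKE ? ORDER BY NAME", (pattern,)
    )
    return [row[0] for row in rows]


print(names_like("ANN %"))   # ['ANN BROWN', 'ANN SMITH']
print(names_like("ANN%"))    # ['ANN BROWN', 'ANN SMITH', 'ANNA JONES']
print(names_like("%ANN%"))   # ['ANN BROWN', 'ANN SMITH', 'ANNA JONES', 'JOANN LEE']
```

The trailing space in 'ANN %' is what keeps ANNA out of the results.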