Parsing and Comparing FullNames on a Join between two tables - sql

I want to compare two strings from two different tables, each containing a person's full name in the format "Blow, Joe". One table may have the full name like that while the other table has the same user as "Blow, Joseph", so I want to grab the first two characters of both the first and last name and see if they match. If they do, I want to update the record. I am not sure what I am doing wrong: I was getting an out of range error, and now I am getting incorrect syntax near 'SUBSTRING', which I am looking into now. Does anyone know of a good way to achieve what I am trying to accomplish?
This is what I currently have:
SELECT *
FROM EmployeeMaster e
JOIN EmployeeDivisions d ON SUBSTRING(REPLACE(RTRIM(LTRIM(LEFT(e.FullName,CHARINDEX(',',e.FullName) - 1))),' ',''),1,3) LIKE SUBSTRING(REPLACE(RTRIM(LTRIM(LEFT(d.Name,CHARINDEX(',',d.Name) - 1))),' ',''),1,3)
SUBSTRING(REPLACE(RTRIM(LTRIM(SUBSTRING(e.FullName,CHARINDEX(',',e.FullName) + 1, LEN(e.FullName)))),' ',''),1,3) LIKE SUBSTRING(REPLACE(RTRIM(LTRIM(SUBSTRING(d.Name,CHARINDEX(',',d.Name) + 1, LEN(d.Name)))),' ',''),1,3)

I guess I don't have to point out that this check might match names that are very different. In your example, "Blow, Joseph" would match not only "Blow, Joe" but also "Black, John", and so on...
Maybe you should at least extend the check to include the complete surname together with part of the given name.
But... if you still want to compare the first two letters in the word before the comma, and the first two letters in the word after the comma then use this:
SELECT *
FROM EmployeeMaster e
JOIN EmployeeDivisions d ON
(
SUBSTRING(REPLACE(RTRIM(LTRIM(LEFT(e.FullName,CHARINDEX(',',e.FullName) - 1))),' ',''),1,2)
=
SUBSTRING(REPLACE(RTRIM(LTRIM(LEFT(d.Name,CHARINDEX(',',d.Name) - 1))),' ',''),1,2)
)
AND
(
SUBSTRING(REPLACE(RTRIM(LTRIM(SUBSTRING(e.FullName,CHARINDEX(',',e.FullName) + 1, LEN(e.FullName)))),' ',''),1,2)
=
SUBSTRING(REPLACE(RTRIM(LTRIM(SUBSTRING(d.Name,CHARINDEX(',',d.Name) + 1, LEN(d.Name)))),' ',''),1,2)
)
You might be able to reduce the complexity of the join to this:
LEFT(LTRIM(e.FullName),CHARINDEX(',',e.FullName)-1)
=
LEFT(LTRIM(d.Name),CHARINDEX(',',d.Name)-1)
AND
SUBSTRING(e.FullName,CHARINDEX(',',e.FullName) + 1, 3)
=
SUBSTRING(d.Name,CHARINDEX(',',d.Name) + 1, 3)
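To make the matching rule easier to reason about, here is a minimal Python sketch of the same comparison (not T-SQL). The helper name `name_key` and the lowercasing are my own assumptions; the lowercasing only imitates a case-insensitive collation. The sketch also returns None when there is no comma. In SQL Server, CHARINDEX returns 0 in that case, so `CHARINDEX(...) - 1` becomes a negative length for LEFT, which could plausibly be what produced the out of range error.

```python
def name_key(full_name, n=2):
    """Return (surname prefix, given-name prefix), or None if there is no comma.

    Hypothetical helper mirroring the join condition: split on the first
    comma, remove spaces, keep the first n characters of each part.
    """
    surname, sep, given = full_name.partition(",")
    if not sep:
        # No comma: the SQL version would call LEFT(..., -1) here.
        return None
    surname = surname.replace(" ", "")
    given = given.replace(" ", "")
    # Lowercasing is an assumption that imitates a case-insensitive collation.
    return (surname[:n].lower(), given[:n].lower())


same_person = name_key("Blow, Joe") == name_key("Blow, Joseph")
false_positive = name_key("Blow, Joseph") == name_key("Black, John")
no_comma = name_key("Cher")
```

Both `same_person` and `false_positive` come out true, which is the weakness of a two-letter prefix match mentioned above.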

Related

Check if string is found in one of multiple columns in SQL

I want to search a string in multiple columns to check if it exists in any.
I found a solution for it here
The answer by Thorsten is short, but it is a solution for MySQL, not for SQL Server.
So I would like to apply a similar query in SQL Server.
Here is the query suggested by Thorsten.
Select *
from tblClients
WHERE name || surname LIKE '%john%'
I tried it as
/* This returns nothing */
Select *
from Items
Where ISNULL(Code, '') + ISNULL(Code1, '') = '6922896068701';
Go
/* This generates error Msg 102, Level 15, State 1, Line 3
Incorrect syntax near '|'.
I also used this one in MySQL, but it does not return the exact match.
*/
Select *
from Items
WHERE Code || Code1 = '6922896068701';
Go
/* This generates error Msg 4145, Level 15, State 1, Line 5
An expression of non-boolean type specified in a context where a condition is expected, near 'Or'. */
Select *
from Items
WHERE Code Or Code1 = '6922896068701';
Go
Is it really possible in SQL Server?
Note: The answer by J__ in the linked question above works accurately, but I want to enter the comparison string only once for all the columns I search, as in Thorsten's answer.
Actually I think that separate logical checks in the WHERE clause for each column is the way to go here. If you can't do that for some reason, consider using a WHERE IN (...) clause:
SELECT *
FROM Items
WHERE '6922896068701' IN (Code, Code1);
If instead you want LIKE logic, then it gets tricky. If you knew that the matching codes would always consist of numbers/letters, then you could try:
SELECT *
FROM Items
WHERE ',' + Code + ',' + Code1 + ',' LIKE '%,6922896068701,%';
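As a rough illustration of why the IN form behaves differently from comparing the concatenated columns, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows are made up, and SQLite concatenates with `||` and uses COALESCE where T-SQL would use `+` and ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (Id INTEGER, Code TEXT, Code1 TEXT)")
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO Items VALUES (?, ?, ?)",
    [
        (1, "6922896068701", "111"),   # match in Code
        (2, "222", "6922896068701"),   # match in Code1
        (3, "333", None),              # no match, NULL second column
        (4, "69228", "96068701"),      # no match, but the halves join up
    ],
)
target = "6922896068701"

# The value is written once and compared against each column separately.
in_ids = [row[0] for row in conn.execute(
    "SELECT Id FROM Items WHERE ? IN (Code, Code1) ORDER BY Id", (target,))]

# Comparing the concatenation tests the glued-together string instead.
concat_ids = [row[0] for row in conn.execute(
    "SELECT Id FROM Items "
    "WHERE COALESCE(Code, '') || COALESCE(Code1, '') = ? ORDER BY Id",
    (target,))]
```

With this data `in_ids` holds rows 1 and 2, while `concat_ids` holds only row 4. That is probably why the ISNULL(Code, '') + ISNULL(Code1, '') attempt in the question returned nothing.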
I would recommend doing the two comparisons separately:
WHERE name LIKE '%john%' OR
surname LIKE '%john%'
Unless you specifically want to find times when the names are combined, such as "Maryjoh" "Needlebaum" or whatever.
It is generally better to focus on one column at a time, because that helps the optimizer.
For MS SQL this may work:
Select *
from Items
WHERE Code = '6922896068701' Or Code1 = '6922896068701'

Substring in Left Join condition

I want to do substring within the join condition, but it is not working.
SELECT
IF (ps.shop = 'NL',TopCat.Parent_Title, CategoryUID.Parent_Title) as Parent_Title,
IF (ps.shop = 'NL',TopCat.Sub_Title_1, CategoryUID.Sub_Title_1) as Sub_Title_1,
IF (ps.shop = 'NL',TopCat.Sub_Title_2, CategoryUID.Sub_Title_2) as Sub_Title_2,
ps.ean, ps.product_resource_id
FROM `xxlhoreca-bi.PriceSearch.XXL_PriceComparison` ps
LEFT JOIN
`xxlhoreca-bi.DataImport.TopCategories` topCat
ON
ps.product_resource_id = topCat.product_resource_id
LEFT JOIN
`DataImport.CategoryUID` CategoryUID
ON
SAFE_CAST(SUBSTR('DataImport.CategoryMappingWithLocalID.Reporting_ID', 4) AS INT64) = CategoryUID.Category_ID
GROUP BY
1, 2, 3, 4, 5
Is there any way around how I can write substring within LEFT JOIN condition?
I need to change the substring part, but I have not been able to achieve it. Any helps would be really appreciated!
Thanks in advance!
You are on roughly the right track.
I am going to make a few assumptions here, so bear with me, but I think they are educated guesses.
I think this DataImport.CategoryMappingWithLocalID.Reporting_ID is a field (Reporting_ID) from a table (CategoryMappingWithLocalID) you have in your dataset (DataImport).
What you are trying to achieve is to get the categories that are included in your CategoryMappingWithLocalID.
You are trying to get a substring from the Reporting_ID field because the ID you want starts at its 4th character.
Because SUBSTR requires a string, you are trying to turn that dataset.table.field reference into a string by putting it in single quotes, which leads me to think it might actually be a numeric field in the original table.
Now, the solution.
You need to use the table in your query if you want to use it in your JOIN ON clause. Therefore, you need to add an extra JOIN there.
You are on the right track with the SUBSTR part, but what you need to use is CAST(field AS STRING) to convert your numeric value into a string.
Put those two things together in your query and you are ready to go my friend.
JOIN `DataImport.CategoryMappingWithLocalID` AS category_mapping
ON
SAFE_CAST(SUBSTR(CAST(category_mapping.Reporting_ID AS STRING), 4) AS INT64) = CategoryUID.Category_ID

How can I compare two columns for similarity in SQL Server?

I have a column called 'message' that includes several pieces of data such as fund_no, detail, and keywords. This column is in a table called 'trackemails'.
I have another table, called 'sendemails', that has a column called 'Fund_no'.
I want to retrieve all rows from the 'trackemails' table whose 'message' column contains the 'Fund_no' value from the 'sendemails' table.
I think if I wanted to check for equality, I would write this code:
select
case when t.message = s.fund_no then 1 else 0 end
from trackemails t, sendemails s
But I want something like the code below:
select
case when t.message LIKE s.fund_no then 1 else 0 end
from trackemails t, sendemails s
I would appreciate any advice on how to do this.
SELECT *
FROM trackemails tr
INNER JOIN sendemails se on tr.Message like '%' + se.Fund_No + '%'
Check the SQL CHARINDEX() function. It finds a string inside another string and returns the position of the match as an int (0 if there is no match). For example:
SELECT CHARINDEX('ha','Elham')
-- Returns: 3
And as you need (note that the string to find is the first argument):
SELECT t.*
FROM trackemails t
WHERE EXISTS (SELECT 1
FROM sendemails s
WHERE CHARINDEX(s.Fund_No, t.Message) > 0)
If you want something more powerful, you can use the Fuzzy Lookup component in SSIS (SSDT). This component adds a new column to the output showing the percentage of similarity between the values in the two columns.
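For anyone who wants to try the "contains" join on sample data, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The rows are invented. SQLite concatenates with `||` instead of `+`, and its instr(haystack, needle) takes its arguments in the opposite order from CHARINDEX(needle, haystack).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trackemails (id INTEGER, message TEXT);
    CREATE TABLE sendemails (fund_no TEXT);
    -- Hypothetical sample rows
    INSERT INTO trackemails VALUES
        (1, 'fund 1042 detail: renewal'),
        (2, 'keywords only, no fund'),
        (3, 'ref 2210 overdue');
    INSERT INTO sendemails VALUES ('1042'), ('2210'), ('9999');
""")

# LIKE with wildcards on both sides: the message contains the fund number.
like_rows = conn.execute("""
    SELECT tr.id, se.fund_no
    FROM trackemails tr
    JOIN sendemails se ON tr.message LIKE '%' || se.fund_no || '%'
    ORDER BY tr.id
""").fetchall()

# Position-based check: instr returns 0 when the fund number is absent.
instr_rows = conn.execute("""
    SELECT tr.id, se.fund_no
    FROM trackemails tr
    JOIN sendemails se ON instr(tr.message, se.fund_no) > 0
    ORDER BY tr.id
""").fetchall()
```

Both queries pair message 1 with fund 1042 and message 3 with fund 2210. Neither form can use an ordinary index on the message column, so this is likely to be slow on large tables.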

decoding a text string to use in a join

I'm trying to extract the number from a text string and join it to another table. Here's what I have so far:
SELECT sect.id,
sect.section_number,
sect.expression,
p.abbreviation
FROM sections sect
JOIN period p ON SUBSTR(sect.expression, 1, (INSTR(sect.expression,'(')-1)) = p.period_number
AND p.schoolid = 73253
AND p.year_id = 20
JOIN courses c ON sect.course_number = c.course_number
WHERE sect.schoolid = 73253
AND sect.termid >= 2000
I read some other threads and figured out how to strip out the number (which always comes before the left parenthesis). The problem is that this only accounts for two of the three styles of data that live in the sect.expression column-
9(A) - check
10(A) - check
but not
5-6(A)
5-6(A) would kick back an Oracle 01722 invalid number error.
Is there a way I could modify the substr... line so that for the 5-6(A) data type it would grab the first number (the 5) and join off of that?
It's worth mentioning that I only have read rights to this table so any solution that depends on creating some kind of helper table/column won't work.
Thanks!
You can use REGEXP_REPLACE
1) If you want to extract only numbers:
JOIN period p ON REGEXP_REPLACE(sect.expression, '[^0-9]', '') = p.period_number
2) If you want to match with the digits in the start of the string and ignore the ones that appear later:
JOIN period p ON REGEXP_REPLACE(sect.expression, '^(\d+)(.*)', '\1') = p.period_number
Being Oracle 10g, you could use a regex instead:
JOIN period p ON REGEXP_SUBSTR(sect.expression, '^\d+', 1, 1) = p.period_number
Admittedly, the regex I provided needs work - it will get the first number at the start of the string. If you need a more complicated regex, I recommend this site: http://www.regular-expressions.info/tutorial.html
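To check what the two patterns do with the three styles of data from the question, here is a quick Python sketch (Python's re module rather than Oracle's regex functions, so treat it as an approximation). The function names are made up for illustration.

```python
import re


def leading_period(expression):
    """Digits at the very start of the string, like REGEXP_SUBSTR(expr, '^\\d+')."""
    match = re.match(r"\d+", expression)  # re.match anchors at the start
    return int(match.group()) if match else None


def all_digits(expression):
    """Strip every non-digit, like REGEXP_REPLACE(expr, '[^0-9]', '')."""
    return re.sub(r"[^0-9]", "", expression)


samples = ["9(A)", "10(A)", "5-6(A)"]
leading = [leading_period(s) for s in samples]
stripped = [all_digits(s) for s in samples]
```

The leading-digits version yields 9, 10 and 5, which is what the question asks for. Stripping all non-digits turns 5-6(A) into 56, which would join to the wrong period.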

Is it possible to use LIKE and IN for a WHERE statment?

I have a list of place names and would like to match them to records in a SQL database. The problem is that the properties have reference numbers after their name, e.g. 'Ballymena P-4sdf5g'.
Is it possible to use IN and LIKE together to match records?
WHERE dbo.[Places].[Name] IN LIKE('Ballymena%','Banger%')
No, but you can use OR instead:
WHERE (dbo.[Places].[Name] LIKE 'Ballymena%' OR
dbo.[Places].[Name] LIKE 'Banger%')
It's a common misconception that for the construct
b IN (x, y, z)
that (x, y, z) represents a set. It does not.
Rather, it is merely syntactic sugar for
(b = x OR b = y OR b = z)
SQL has but one data structure: the table. If you want to query search text values as a set then put them into a table. Then you can JOIN your search text table to your Places table using LIKE in the JOIN condition e.g.
WITH Places (Name)
AS
(
SELECT Name
FROM (
VALUES ('Ballymeade Country Club'),
('Ballymena Candles'),
('Bangers & Mash Cafe'),
('Bangebis')
) AS Places (Name)
),
SearchText (search_text)
AS
(
SELECT search_text
FROM (
VALUES ('Ballymena'),
('Banger')
) AS SearchText (search_text)
)
SELECT *
FROM Places AS P1
LEFT OUTER JOIN SearchText AS S1
ON P1.Name LIKE S1.search_text + '%';
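Here is a runnable sketch of the same search-table join, using Python's built-in sqlite3 module as a stand-in for SQL Server. The data is the same as above. The only dialect changes are `||` for concatenation and putting VALUES directly inside each CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
rows = conn.execute("""
    WITH Places (Name) AS (
        VALUES ('Ballymeade Country Club'),
               ('Ballymena Candles'),
               ('Bangers & Mash Cafe'),
               ('Bangebis')
    ),
    SearchText (search_text) AS (
        VALUES ('Ballymena'),
               ('Banger')
    )
    SELECT P1.Name, S1.search_text
    FROM Places AS P1
    LEFT OUTER JOIN SearchText AS S1
        ON P1.Name LIKE S1.search_text || '%'
    ORDER BY P1.Name
""").fetchall()

# Map each place to the search text it matched (None when nothing matched).
matches = dict(rows)
```

Because it is a LEFT OUTER JOIN, places with no matching prefix still appear, with a NULL (None in Python) search text. Switching to an INNER JOIN would keep only the matching places.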
Well, a simple solution would be to use a regular expression. I'm not sure how it's done in your SQL dialect, but probably something similar to this (PostgreSQL-style; SIMILAR TO has to match the whole string, hence the trailing %):
WHERE dbo.[Places].[Name] SIMILAR TO '(Banger|Ballymena)%';
or (Oracle-style):
WHERE REGEXP_LIKE(dbo.[Places].[Name], '^(Banger|Ballymena)');
At least one of them should work, depending on the database.
You could use OR:
WHERE
dbo.[Places].[Name] LIKE 'Ballymena%'
OR dbo.[Places].[Name] LIKE 'Banger%'
or split the string at the space, if Places.Name is always in the same format (note that CHARINDEX takes the string to find as its first argument):
WHERE SUBSTRING(dbo.[Places].[Name], 1, CHARINDEX(' ', dbo.[Places].[Name]))
IN ('Ballymena', 'Banger')
This might decrease performance: the database may be able to use an index with LIKE (your chances are better when the wildcard is at the end), but most probably not when using SUBSTRING.