Comparing non identical fields in two different tables - sql

I am trying to compare names in 2 different tables.
In Table1 the field is called Name1 and has values like Lynn Smith.
In Table2, the field is called Name2 and has values like Lynn Smith (Extra).
How can I compare the two name values, ignoring the text in the brackets?
I want to write a query that returns some other fields for the rows where the main name is the same.

One method would use like:
select . . .
from t1 join
t2
on t2.name2 like t1.name1 + ' (%)';
However, this is probably not efficient. If you want performance, you can extract the name into a separate column in the second table and create an index on it:
alter table t2 add name_cleaned as
(left(name2, charindex(' (', name2 + ' (') - 1));
create index idx_t2_name_cleaned on t2(name_cleaned);
Then you can phrase the query as:
select . . .
from t1 join
t2
on t2.name_cleaned = t1.name1;
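The cleaned-column approach can be tried end to end with a small sketch. This one uses Python's built-in sqlite3 module rather than SQL Server, so substr and instr stand in for left and charindex, and a plain column filled by an UPDATE stands in for the computed column; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (name1 TEXT);
    CREATE TABLE t2 (name2 TEXT, name_cleaned TEXT);
    INSERT INTO t1 VALUES ('Lynn Smith'), ('Pat Jones');
    INSERT INTO t2 (name2) VALUES
        ('Lynn Smith (Extra)'), ('Pat Jones'), ('Sam Lee (Other)');
    -- Keep everything before the first ' ('. Appending ' (' guarantees
    -- a match even when the name has no bracketed suffix.
    UPDATE t2
    SET name_cleaned = substr(name2, 1, instr(name2 || ' (', ' (') - 1);
    CREATE INDEX idx_t2_name_cleaned ON t2(name_cleaned);
""")

# Join on the cleaned name instead of a LIKE pattern.
rows = conn.execute("""
    SELECT t1.name1, t2.name2
    FROM t1 JOIN t2 ON t2.name_cleaned = t1.name1
    ORDER BY t1.name1
""").fetchall()
print(rows)
```

With these sample rows the join returns the Lynn Smith and Pat Jones pairs, and Sam Lee (Other) has no partner in t1.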

One way to do this is to compare the names directly after cleaning up one side.
Unlike Gordon's answer, I'd do this with another table containing the data to compare from Table2.
SELECT Table2Id, Name2, CAST(NULL AS VARCHAR(255)) AS cleanedName INTO NewTable FROM Table2 -- 255 is a placeholder length; match it to Name2
Now we update the cleanedName column to strip off the extra information from the Name2 column, like below. You may also create an index on this table.
UPDATE NewTable
SET cleanedName = LEFT(Name2, CHARINDEX(' (', Name2 + ' (') - 1)
Now drop and re-create the index on the cleanedName column and then compare it with the Table1.Name1 column.

If all the values in Table2's Name2 column have a space between the end of the name and the opening bracket, then you could use this:
SELECT SUBSTRING('Lynn Smith (Extra)',1,PATINDEX('%(%','Lynn Smith (Extra)')-2)
If you were to replace 'Lynn Smith (Extra)' with the column name (unquoted, so that it is read as a column and not as a string constant):
SELECT SUBSTRING(name2,1,PATINDEX('%(%',name2)-2) FROM Table2
then it would show a list of the values in name2 without the text in the brackets, in other words, in the same format as the names in name1 on Table1.
SUBSTRING and PATINDEX are String functions.
SUBSTRING asks for three 'arguments': (1) expression (2) start and (3) length.
(1) As you can see above the first argument can be (amongst other things)
either a constant - 'Lynn Smith (Extra)' or a column - name2
(2) the start of the result you want so, in this example, the first (or left)
character in the string in the column or constant is signified by the number 1.
(3) how many characters do you want to see in the result? In this example I have used PATINDEX to create a number (see below).
PATINDEX asks for two arguments: (1) %pattern% and (2) expression
(1) is the character or group of characters (shape or 'pattern') you are looking
to locate, the reason for the wildcard characters %% either side of the
pattern is because there may be characters either side of the pattern
(2) is (amongst other things) the constant or column that contains the pattern
from argument 1.
Whilst SUBSTRING returns character data (part of the string), PATINDEX returns a number: the position of the first character of the first match, counting from the left of the expression (or 0 if the pattern is not found).
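The arithmetic behind the -2 may be easier to follow outside SQL. The sketch below imitates the two functions in Python; patindex_char and substring are hypothetical helper names, and they only cover the single-character pattern and the positive start and length used above, not the full behaviour of the SQL Server functions.

```python
def patindex_char(ch, expression):
    """1-based position of the first `ch` in `expression`, or 0 if absent.

    Stands in for PATINDEX('%(%', expression) with a one-character pattern.
    """
    return expression.find(ch) + 1  # str.find returns -1 when not found


def substring(expression, start, length):
    """Imitate SUBSTRING(expression, start, length) for start >= 1, length >= 0."""
    return expression[start - 1 : start - 1 + length]


value = "Lynn Smith (Extra)"
pos = patindex_char("(", value)          # the bracket is the 12th character
cleaned = substring(value, 1, pos - 2)   # drop the bracket and the space before it
print(pos, repr(cleaned))
```

Subtracting 1 would keep the trailing space, which is why the answer relies on every value having exactly one space before the bracket.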

Related

How to Extract only numbers from the String without using function in SQL

Table contains data as below
Table Name is REGISTER
Column Name is EXAM_CODE
Values like ('S6TJ','S7','S26','S24')
I want answer like below
Result set - > (6,7,26,24)
Please suggest a solution, since regexp_replace is not a recognized built-in function name in SQL.
The complexity of the answer depends on two things: the RDBMS used and whether the numbers in the EXAM_CODE are contiguous.
I have assumed that the RDBMS is SQL Server and the numbers in EXAM_CODE are always contiguous. If not, please advise and I can revise the answer.
The following SQL shows a way of accomplishing the above using PATINDEX:
CREATE TABLE #REGISTER (EXAM_CODE VARCHAR(10));
INSERT INTO #REGISTER VALUES ('S6TJ'),('S7'),('S26'),('S24');
SELECT LEFT(EXAM_CODE, PATINDEX('%[^0-9]%', EXAM_CODE) - 1)
FROM (
SELECT RIGHT(EXAM_CODE, LEN(EXAM_CODE) - PATINDEX('%[0-9]%', EXAM_CODE) + 1) + 'A' AS EXAM_CODE
FROM #REGISTER
) a
DROP TABLE #REGISTER
This outputs:
6
7
26
24
PATINDEX matches a specified pattern against a string (or returns 0 if there is no match).
Using this, the inner query fetches the part of the string that starts at the first digit. The outer query then strips any text that follows the number.
Note: The character A is appended to the result of the inner query in order to ensure that the PATINDEX check in the outer query will make a match. Otherwise, PATINDEX would return 0 and an error would occur.
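The same two steps can be traced in Python as a rough sketch. first_number is a hypothetical name, and the code assumes, like the answer, that the digits in each code are contiguous.

```python
DIGITS = "0123456789"


def first_number(code):
    """Return the first run of digits in `code`, or '' if there is none."""
    # Inner step: keep the string from the first digit onwards.
    start = next((i for i, c in enumerate(code) if c in DIGITS), None)
    if start is None:
        return ""
    # Append a sentinel so a non-digit is always found, as the SQL does with 'A'.
    tail = code[start:] + "A"
    # Outer step: cut at the first non-digit.
    end = next(i for i, c in enumerate(tail) if c not in DIGITS)
    return tail[:end]


codes = ["S6TJ", "S7", "S26", "S24"]
numbers = [first_number(c) for c in codes]
print(numbers)
```

The sentinel plays the same role as the appended A: without it, a code ending in a digit, such as S7, would have no non-digit to cut at.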

How to restrict entire FTS5 Query to a single column?

Currently, I'm trying to execute an FTS5 query via libsqlite, and need to restrict the query to a specific column. In FTS4, this was possible by doing:
SELECT foo, bar FROM tableName WHERE columnName MATCH ?
and then binding the search string to the statement. However, with FTS5, the LHS of the MATCH operator must be the FTS table name itself, and the column name must be a part of the query:
SELECT foo, bar FROM tableName WHERE tableName MATCH 'columnName:' || ?
This works when the bound string is a single phrase. However, consider the search text pizza is great. The query then becomes:
SELECT foo, bar FROM tableName WHERE tableName MATCH 'columnName:pizza is great';
Only pizza is restricted to columnName, but the rest of the phrase is matched against all columns.
How can I work around this?
The documentation says:
A single phrase … may be restricted to matching text within a specified column of the FTS table by prefixing it with the column name followed by a colon character.
So the column name applies only to a single phrase.
If you have three phrases, you need to specify the column name three times:
tableName MATCH 'columnName:pizza columnName:is columnName:great'
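Since the search text is bound at run time, the repeated prefix has to be built in application code before binding. A minimal sketch in Python follows; restrict_to_column is a hypothetical helper, and it assumes the search text consists of plain whitespace-separated words (anything containing quotes or FTS5 operators would need escaping first).

```python
def restrict_to_column(column, search_text):
    """Prefix every whitespace-separated word with `column:`."""
    return " ".join(f"{column}:{token}" for token in search_text.split())


match_expr = restrict_to_column("columnName", "pizza is great")
print(match_expr)

# The result is then bound as the whole MATCH argument, for example:
#   SELECT foo, bar FROM tableName WHERE tableName MATCH ?
```

Binding the complete expression keeps the SQL text constant, so the statement can still be prepared once and reused.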

SQL Validating a string variable

Using SQL scripts, I need to validate comma-separated values. How should I validate the string variable?
Validation should trim each value on both the left and the right, and there should not be any special characters such as a comma or period after the last value.
create table #test
(col varchar(100))
insert into #test values
('1,2'),
('1,2,'),
('1,'),
('1,2,3,4,5')
select * from #test
In the above query, for the second value - Expected Result is 1,2
In the above query, for the Third value - Expected Result is 1
You can update your table to fix "offensive" values.
update #test
set col = substring(col, 1, len(col) - 1)
where col not like '%[0-9]'
This removes the last character from every value that doesn't end with a digit.
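To see the effect of that update on the sample rows, here is a sketch using Python's built-in sqlite3 module. SQLite's LIKE has no [0-9] character class, so GLOB is swapped in for the pattern test, and substr and length replace substring and len; the table is named test because SQLite has no # temp-table prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?)",
    [("1,2",), ("1,2,",), ("1,",), ("1,2,3,4,5",)],
)

# Drop the last character of every value that does not end in a digit.
conn.execute(
    "UPDATE test SET col = substr(col, 1, length(col) - 1) "
    "WHERE col NOT GLOB '*[0-9]'"
)

fixed = [row[0] for row in conn.execute("SELECT col FROM test ORDER BY rowid")]
print(fixed)
```

Each pass removes only one character, so a value with several trailing separators would need the update to be run repeatedly.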
You can use a check constraint. You seem to want something like this:
alter table t add constraint chk_name check
(name like '%,%' and
name not like '%,%,%' and
name not like '%[^a-zA-Z,]%'
)
SQL Server doesn't have support for regular expressions. This implements the rules:
Name has to have a comma
Name does not have two commas
Name consists only of alphabetic characters and a comma
You may find that you need slightly more flexibility, but this handles the cases in your question.

Oracle SQL - Joining list of values to a field with those values concatenated

The title is a bit confusing, so I'll explain with an example what I'm trying to do.
I have a field called "modifier". This is a field with concatenated values for each individual. For example, the value in one row could be:
*26,50,4 *
and the value in the next row
*4 *
And the table (Table A) would look something like this:
Key Modifier
1 *26,50,4 *
2 *4 *
3 *1,2,3,4 *
The asterisks are always going to be in the same position (here, 1 and 26) with an uncertain number of numbers in between, separated by commas.
What I'd like to do is "join" this "modifier" field to another table (Table B) with a list of possible values for that modifier. e.g., that table could look like this:
ID MOD
1 26
2 3
3 50
4 78
If a value in A.modifier appears in B.mod, I want to keep that row in Table A. Otherwise, leave it out. (I use the term "join" loosely because I'm not sure that's what I need here.)
Is this possible? How would I do it?
Thanks in advance!
edit 1: I realize I can use regular expressions and do a bunch of or statements that search for the comma-separated values in the MOD list, but is there a better way?
One way to do it is using TRIM, string concatenations and LIKE.
SELECT *
FROM tableA a
WHERE EXISTS(
SELECT 1 FROM tableB b
WHERE
','|| trim( trim( BOTH '*' FROM a.Modifier )) ||','
LIKE '%,'|| b.mod || ',%'
);
Demo --> http://www.sqlfiddle.com/#!4/1caa8/10
This query might still be slow for huge tables (it always performs full scans of tables or indexes); however, it should be faster than using regular expressions or parsing comma-separated lists into individual values.
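The delimiter-wrapping trick can be checked against the sample data with a sketch in Python's built-in sqlite3 module. This is SQLite and not Oracle: the two-argument trim(modifier, '* ') replaces the nested TRIM calls, and the key column is renamed a_key here only to avoid a keyword clash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (a_key INTEGER, modifier TEXT);
    CREATE TABLE tableB (id INTEGER, mod INTEGER);
    INSERT INTO tableA VALUES (1, '*26,50,4 *'), (2, '*4 *'), (3, '*1,2,3,4 *');
    INSERT INTO tableB VALUES (1, 26), (2, 3), (3, 50), (4, 78);
""")

# Wrap both sides in commas so that 4 cannot match inside 24 or 40.
kept = [
    row[0]
    for row in conn.execute("""
        SELECT a.a_key
        FROM tableA a
        WHERE EXISTS (
            SELECT 1 FROM tableB b
            WHERE ',' || trim(a.modifier, '* ') || ','
                  LIKE '%,' || b.mod || ',%'
        )
        ORDER BY a.a_key
    """)
]
print(kept)
```

Row 2 is dropped because its only value, 4, does not appear in the list of modifiers.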

How to delete a common word from large number of datas in a Postgres table

I have a table in Postgres with more than 1000 names. Most of the names start with SHRI or SMT. I want to delete SHRI and SMT from the names and keep only the original name. How can I do that without any database function?
I'll step you through the logic:
Select left(name,3) from table
This select statement will bring back the first 3 chars of a column (the 'left' three). If we are looking for SMT in the first three chars, we can move it to the where statement
select * from table where left(name,3) = 'SMT'
Now from here you have a few choices that can be used. I'm going to keep to the left/right style, though replace could likely be used. We want the chars to the right of the SMT, but we don't know how long each string is to pick out those chars. So we use length() to determine that.
select right(name,length(name)-3) from table where left(name,3) = 'SMT'
I hope my syntax is right there; I'm lacking a Postgres environment to test it. The logic is "all the chars on the right of the string except the first 3" (the minus 3 excludes the 3 chars on the left; change this to 4 if you want to drop the first 4 chars, as for SHRI).
You can then change this to an update statement (set name = right(name,length(name)-3) ) to update the table, or you can just use the select statement when you need the name without the SMT, but leave the SMT in the actual data.
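The left/right logic can be sketched in Python to check it on a few names. strip_title is a hypothetical helper and the sample names are invented; unlike the SQL above, it assumes the prefix is followed by a space, so that a name that merely begins with the same letters is left alone.

```python
def strip_title(name, prefixes=("SHRI", "SMT")):
    """Remove a leading SHRI or SMT (followed by a space) from `name`."""
    for prefix in prefixes:
        if name.startswith(prefix + " "):
            # Same idea as right(name, length(name) - n): keep what follows
            # the prefix, then drop the separating space.
            return name[len(prefix):].lstrip()
    return name


names = ["SHRI RAM KUMAR", "SMT SITA DEVI", "SHRINIVAS RAO"]
cleaned = [strip_title(n) for n in names]
print(cleaned)
```

Without the space check, the third name would lose its first four letters, which is worth keeping in mind before turning the select into an update.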