Using Contains how to search for more than one value - sql

Here is what I am trying to do:
select * from Person where CONTAINS((fName,sName),'John AND Doe')
So I am trying to search for John Doe, but I get an error saying I cannot use AND here (the query just shows my train of thought). How can I search for John in fName and Doe in sName? I don't want to use CONTAINS twice like this:
SELECT * FROM Person
WHERE CONTAINS((fName), 'John')
AND CONTAINS((sName), 'Doe');
Since we can have
(fName,sName)
but I cannot use
'John','Doe'/'John' AND 'Doe'

Your statement
SELECT * FROM Person
WHERE CONTAINS((fName), 'John')
AND CONTAINS((sName), 'Doe');
can't compile on Oracle, because there CONTAINS returns a number (a relevance score), not a boolean.
On Oracle, it fails with this kind of error:
ORA-00920: invalid relational operator
00920. 00000 - "invalid relational operator"
(the cause of the relational operator error is not the presence of AND; it's that you are trying to AND two numbers)
What do you intend to do? If you want to select the rows in the Person table whose fName column contains the substring John and whose sName column contains the substring Doe, you can use the LIKE operator, which uses % as a wildcard.
SELECT * FROM Person
WHERE fName LIKE '%John%'
and sName LIKE '%Doe%'
I don't use CONTAINS much, but if you want to use it, according to the documentation you should compare its result to 0, something like
SELECT * FROM Person
WHERE CONTAINS(fName, 'John') > 0
and CONTAINS(sName,'Doe') > 0
If you really don't want to use the AND operator (for whatever obscure reason, like proving a SQL injection filter is bad), you can use this trick: compare the concatenation of the two columns against a single LIKE pattern, like so
SELECT * FROM Person
WHERE fName || sName like '%John%Doe%';
but this last trick will also match the row where fName is John Doe and sName is Jean-Michel :)
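The false positive from the concatenation trick is easy to reproduce. The following is a minimal sketch using Python's standard-library sqlite3 module as a stand-in database (an assumption: the answer above targets Oracle, but `||` and `LIKE` behave the same way for this purpose). The sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (fName TEXT, sName TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [("John", "Doe"), ("John Doe", "Jean-Michel"), ("Jane", "Doe")],
)

# One LIKE per column: only the row that really is "John" / "Doe" qualifies.
two_predicates = conn.execute(
    "SELECT fName, sName FROM Person "
    "WHERE fName LIKE '%John%' AND sName LIKE '%Doe%' ORDER BY rowid"
).fetchall()

# Single LIKE on the concatenation: also matches "John Doe" / "Jean-Michel",
# because both words appear inside fName alone.
concatenated = conn.execute(
    "SELECT fName, sName FROM Person "
    "WHERE fName || sName LIKE '%John%Doe%' ORDER BY rowid"
).fetchall()

print(two_predicates)
print(concatenated)
```

The second query returns one more row than the first, which is the mismatch described above.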

This should work:
SELECT * FROM dbo.Person
WHERE CONTAINS((fName), 'John')
AND CONTAINS((sName), 'Doe');
But in your case you don't want to use CONTAINS twice, so alternatively you could add a new computed column with a full-text index on it.
Add a column like this:
ALTER TABLE dbo.Person ADD computedCol AS fName + ' ' + sName
And create the full text index:
CREATE FULLTEXT INDEX ON Person(computedCol LANGUAGE 1033)
KEY INDEX pk_Person_yourPrimaryKeyName
Then you can do this:
SELECT * FROM dbo.Person
WHERE CONTAINS(*, 'John AND Doe')
Hope it works.

Related

How can I bring a register only if it doesn't contains a string - SQL?

My query is something like this:
SELECT * FROM table WHERE name LIKE "%Mary%"
But now I want to return all the rows that contain "Mary" as long as they do not also contain "Jane". I do not want "Mary Jane", "Jane Mary", or any variation (e.g. "Mary Smith Jane").
I really don't know how to.
EDIT:
I'm not sure if I can only use a "not like" because I'm already using "not like" in the same query for other reasons.
In fact:
SELECT * FROM table WHERE name NOT LIKE "%John%"
AND name NOT LIKE '%Charlie%'
AND name LIKE '%Mary%'
Just add that to the WHERE clause:
WHERE name LIKE '%Mary%' AND
name NOT LIKE '%Mary Jane%'
Or, if you mean that the exact match is not what you want:
WHERE name LIKE '%Mary%' AND
name <> 'Mary Jane'
SELECT *
FROM your_table
WHERE name LIKE '%Mary%'
and name <> 'Mary Jane'
To retrieve records that contain Mary in the string, but not Jane, keep your clause for LIKE '%Mary%' and add a clause for NOT LIKE '%Jane%'.
You can group these clauses together with parentheses to isolate them from the other clauses in the query.
SELECT *
FROM table
WHERE
(name LIKE '%Mary%'
AND name NOT LIKE '%Jane%')
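To see how the grouped condition combines with the existing NOT LIKE filters, and how it differs from the exact-match `<>` variant, here is a small sketch using Python's sqlite3 module as a stand-in database. The table name `people` and the sample names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("Mary Smith",), ("Mary Jane",), ("Jane Mary",),
     ("Mary Smith Jane",), ("John Mary",)],
)

# Existing exclusions plus the grouped "Mary but not Jane" condition.
no_jane = [row[0] for row in conn.execute(
    "SELECT name FROM people "
    "WHERE name NOT LIKE '%John%' "
    "AND name NOT LIKE '%Charlie%' "
    "AND (name LIKE '%Mary%' AND name NOT LIKE '%Jane%') "
    "ORDER BY rowid"
)]

# The <> variant only excludes the exact string 'Mary Jane'.
not_exact = [row[0] for row in conn.execute(
    "SELECT name FROM people "
    "WHERE name LIKE '%Mary%' AND name <> 'Mary Jane' ORDER BY rowid"
)]

print(no_jane)
print(not_exact)
```

Only the NOT LIKE '%Jane%' version drops "Jane Mary" and "Mary Smith Jane" as the question asks.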

selecting with a column being one of two possible values

I have a table called "people" with a column named "name". I would like to select all rows where the name is "bob" or "john". I have tried the following and many variants of it, none of which work. How can I do this correctly?
select * from people where name is bob or john;
Thanks
To compare a column with a value you need to use =, not IS, and string literals must be quoted:
select *
from people
where name = 'bob'
or name = 'john';
Alternatively you can use the IN operator.
select *
from people
where name IN ('bob','john');
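Note that whether string comparison is case-sensitive depends on the database and its collation. Where it is case-sensitive (the default in PostgreSQL and Oracle, for example), the above will not return rows where the name is Bob or John.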
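A short sketch of the IN form, and of one common way to make the match ignore case, using Python's sqlite3 module. This assumes SQLite, whose default text comparison is case-sensitive; other databases and collations may behave differently. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("bob",), ("Bob",), ("john",), ("JOHN",), ("alice",)],
)

# Plain IN: with SQLite's default collation only exact-case values match.
exact = [row[0] for row in conn.execute(
    "SELECT name FROM people WHERE name IN ('bob', 'john') ORDER BY rowid"
)]

# Folding the column to lower case first makes the comparison case-insensitive.
folded = [row[0] for row in conn.execute(
    "SELECT name FROM people WHERE lower(name) IN ('bob', 'john') ORDER BY rowid"
)]

print(exact)
print(folded)
```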

MySQL SELECT query string matching

Normally, when querying a database with SELECT, it's common to want to find the records that match a given search string.
For example:
SELECT * FROM customers WHERE name LIKE '%Bob Smith%';
That query should give me all records where 'Bob Smith' appears anywhere in the name field.
What I'd like to do is the opposite.
Instead of finding all the records that have 'Bob Smith' in the name field, I want to find all the records where the name field is in 'Robert Bob Smith III, PhD.', a string argument to the query.
Just turn the LIKE around
SELECT * FROM customers
WHERE 'Robert Bob Smith III, PhD.' LIKE CONCAT('%',name,'%')
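The reversed LIKE can be tried out with Python's sqlite3 module. This is only a sketch: SQLite stands in for MySQL here, so the pattern is built with `||` rather than CONCAT, and the customer rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Bob Smith",), ("Robert",), ("Smith III",), ("Alice Jones",), ("Bobby",)],
)

search_text = "Robert Bob Smith III, PhD."

# The argument is on the left of LIKE and the column supplies the pattern,
# so a row matches when its name occurs somewhere inside search_text.
matches = [row[0] for row in conn.execute(
    "SELECT name FROM customers WHERE ? LIKE '%' || name || '%' ORDER BY rowid",
    (search_text,),
)]

print(matches)
```

One caveat of this approach is that a stored name containing `%` or `_` would itself act as a wildcard.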
You can use regular expressions like this:
SELECT * FROM pet WHERE name REGEXP 'Bob|Smith';
Incorrect:
SELECT * FROM customers WHERE name LIKE '%Bob Smith%';
Instead:
select count(*)
from rearp.customers c
where c.name LIKE '%Bob smith.8%';
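select count(*) returns only the number of matching rows (a total), not the rows themselves
c is a table alias for rearp.customers, so c.name refers to that table's name column
LIKE does the pattern matching
Note that the .8 in the pattern is matched literally: in MySQL, LIKE treats only % and _ as wildcards, so this pattern matches only names that contain the text 'Bob smith.8'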

How can I compare two name strings that are formatted differently in SQL Server?

What would be the best approach for comparing the following set of strings in SQL Server?
Purpose: The main purpose of this stored procedure is to compare a set of input customer names to the names that exist in the Customer database for specific accounts. If there is a difference between the input name and the name in the database, this should trigger an update of the Customer database with the new name information.
Conditions:
Format of Input: FirstName [MiddleName] LastName
Format of Value in Database: LastName, FirstName MiddleName
The complication arises when names like this are presented,
Example:
Input: Dr. John A. Mc Donald
Database: Mc Donald, Dr. John A.
For last names that consist of 2 or more parts, what logic would have to be put in place to ensure that the last name in the input is compared to the last name in the database, and likewise for the first name and middle name?
I've thought about breaking the database values up into a temp HASH table since I know that everything before the ',' in the database is the last name. I could then check to see if the input contains the lastname and split out the FirstName [MiddleName] from it to perform another comparison to the database for the values that come after the ','.
There is a second part to this however. In the event that the input name has a completely New last name (i.e. if the name in the database is Mary Smith but the updated input name is now Mary Mc Donald). In this case comparing the database value of the last name before the ',' to the input name will result in no match which is correct, but at this point how does the code know where the last name even begins in the input value? How does it know that her Middle name isn't MC and her last name Donald?
Has anyone had to deal with a similar problem like this before? What solutions did you end up going with?
I greatly appreciate your input and ideas.
Thank you.
Realistically, it's extremely computationally difficult (if not impossible) to know if a name like "Mary Jane Evelyn Scott" is first-middle-last1-last2, first1-first2-middle-last, first1-first2-last1-last2, or some other combination... and that's not even getting into cultural considerations...
So personally, I would suggest a change in the data structure (and, correspondingly, the application's input fields). Instead of a single string for name, break it into several fields, e.g.:
FullName{
title, //i.e. Dr., Professor, etc.
firstName, //or given name
middleName, //doesn't exist in all countries!
lastName, //or surname
qualifiers //i.e. Sr., Jr., fils, D.D.S., PE, Ph.D., etc.
}
Then the user could choose that their first name is "Mary", their middle name is "Jane Evelyn", and their last name is "Scott".
UPDATE
Based on your comments, if you must do this entirely in SQL, I'd do something like the following:
1. Build a table of all possible combinations of "lastname, firstname [middlename]" given an input string "firstname [middlename] lastname".
2. Run a query based on the join of your original data and all possible orderings.
So, step 1. would take the string "Dr. John A. Mc Donald" and create the table of values:
'Donald, Dr. John A. Mc'
'Mc Donald, Dr. John A.'
'A. Mc Donald, Dr. John'
'John A. Mc Donald, Dr.'
Then step 2. would search for all occurrences of any of those strings in the database.
Assuming MSSQL 2005 or later, step 1. can be achieved using some recursive CTE, and a modification of a method I've used to split CSV strings (found here) (SQL isn't the ideal language for this form of string manipulation...):
declare @str varchar(200)
set @str = 'Dr. John A. Mc Donald'
--Create a numbers table
select [Number] = identity(int)
into #Numbers
from sysobjects s1
cross join sysobjects s2
create unique clustered index Number_ind on #Numbers(Number) with IGNORE_DUP_KEY
;with nameParts as (
--Split the name string at the spaces.
select [ord] = row_number() over(order by Number),
[part] = substring(fn1, Number, charindex(' ', fn1+' ', Number) - Number)
from (select @str fn1) s
join #Numbers n on substring(' '+fn1, Number, 1) = ' '
where Number<=Len(fn1)+1
),
lastNames as (
--Build all possible lastName strings.
select [firstOrd]=ord, [lastOrd]=ord, [lastName]=cast(part as varchar(max))
from nameParts
where ord!=1 --remove the case where the whole string is the last name
UNION ALL
select firstOrd, p.ord, l.lastName+' '+p.part
from lastNames l
join nameParts p on l.lastOrd+1=p.ord
),
firstNames as (
--Build all possible firstName strings.
select [firstOrd]=ord, [lastOrd]=ord, [firstName]=cast(part as varchar(max))
from nameParts
where ord!=(select max(ord) from nameParts) --remove the case where the whole string is the first name
UNION ALL
select p.ord, f.lastOrd, p.part+' '+f.firstName
from firstNames f
join nameParts p on f.firstOrd-1 = p.ord
)
--Combine for all possible name strings.
select ln.lastName+', '+fn.firstName
from firstNames fn
join lastNames ln on fn.lastOrd+1=ln.firstOrd
where fn.firstOrd=1
and ln.lastOrd = (select max(ord) from nameParts)
drop table #Numbers
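For comparison, the same enumeration of candidate strings is much shorter outside SQL. The following Python sketch is not part of the T-SQL approach above; the function name `candidate_db_names` is hypothetical, and it assumes name parts are separated by whitespace.

```python
def candidate_db_names(input_name):
    """Return every 'LastName, FirstName' string obtainable by splitting the
    input 'FirstName [MiddleName] LastName' at each word boundary."""
    parts = input_name.split()
    candidates = []
    # i is the index of the first word that belongs to the last name.
    for i in range(1, len(parts)):
        first = " ".join(parts[:i])
        last = " ".join(parts[i:])
        candidates.append(f"{last}, {first}")
    return candidates


candidates = candidate_db_names("Dr. John A. Mc Donald")
for candidate in candidates:
    print(candidate)
```

The resulting list can then be compared against the database column, for example with an IN list or a join on a temporary table.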
Having had my share of terrible experiences with data from third parties, I can say it is almost guaranteed that the input data will contain lots of garbage that does not follow the specified format.
When trying to match multipart string data like in your case, I preprocessed both the input and our data into something I called a "normalized string" using the following method:
1. strip all characters that are not letters, digits, or spaces (leaving language-specific chars like "č" intact)
2. compact spaces (replace multiple spaces with a single one)
3. lower case
4. split into words
5. remove duplicates
6. sort alphabetically
7. join back into a string separated by dashes
Using your sample data, this function would produce:
Dr. John A. Mc Donald -> a-donald-dr-john-mc
Mc Donald, Dr. John A. -> a-donald-dr-john-mc
Unfortunately it's not 100% bulletproof; there are cases where degenerate inputs produce invalid matches.
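The normalization described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the original implementation: the function name `normalize_name` is hypothetical, and "strip" is interpreted as removing every character that is not a letter, digit, or whitespace.

```python
def normalize_name(name):
    """Build an order-independent key from a multipart name string."""
    # Drop punctuation; str.isalnum is Unicode-aware, so letters like "č" survive.
    cleaned = "".join(ch for ch in name if ch.isalnum() or ch.isspace())
    # lower() folds case; split() with no argument also compacts repeated spaces.
    words = cleaned.lower().split()
    # set() removes duplicates, sorted() fixes the order, "-" joins the words.
    return "-".join(sorted(set(words)))


key_input = normalize_name("Dr. John A. Mc Donald")
key_db = normalize_name("Mc Donald, Dr. John A.")
print(key_input)
print(key_db)
```

Both formats collapse to the same key, so the comparison no longer depends on where the last name sits. The same property is why unrelated names built from the same words would also collide.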
Your name field is badly designed in the database. Redesign it and get rid of it. If you have a first name, middle name, last name, prefix and suffix structure, you can have a computed field that has the structure you are using. But a single name string is a very poor way to store data, and your first priority should be to stop using it.
Since you have a common customer Id why aren't you matching on that instead of name?

Get words from sentence - SQL

Suppose I have a description column that contains
Column Description
------------------
I live in USA
I work as engineer
I have an other table containing the list of countries, since USA (country name) is mentioned in first row, I need that row.
In the second case there is no country name, so I don't need that row.
Can you please clarify
You may want to try something like the following:
SELECT cd.*
FROM column_description cd
JOIN countries c ON (INSTR(cd.description, c.country_name) > 0);
If you are using SQL Server, you should be able to use the CHARINDEX() function instead of INSTR(), which is available for MySQL and Oracle. You can also use LIKE as other answers have suggested.
Test case:
CREATE TABLE column_description (description varchar(100));
CREATE TABLE countries (country_name varchar(100));
INSERT INTO column_description VALUES ('I live in USA');
INSERT INTO column_description VALUES ('I work as engineer');
INSERT INTO countries VALUES ('USA');
Result:
+---------------+
| description |
+---------------+
| I live in USA |
+---------------+
1 row in set (0.01 sec)
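One detail worth checking in the join condition is that INSTR is 1-based: it returns 0 when the substring is absent and 1 when the match starts at the first character, so comparing against 0 is what catches every match. A minimal sketch with Python's sqlite3 module, assuming SQLite's `instr()` as a stand-in for the MySQL/Oracle function; the row 'USA is where I live' is an extra made-up test case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE column_description (description TEXT)")
conn.execute("CREATE TABLE countries (country_name TEXT)")
conn.executemany(
    "INSERT INTO column_description VALUES (?)",
    [("I live in USA",), ("I work as engineer",), ("USA is where I live",)],
)
conn.execute("INSERT INTO countries VALUES ('USA')")

# Position of the country name inside each description (0 means not found).
positions = conn.execute(
    "SELECT cd.description, instr(cd.description, c.country_name) "
    "FROM column_description cd CROSS JOIN countries c ORDER BY cd.rowid"
).fetchall()

# Join that keeps every description mentioning a country, wherever it occurs.
matches = [row[0] for row in conn.execute(
    "SELECT cd.description FROM column_description cd "
    "JOIN countries c ON instr(cd.description, c.country_name) > 0 "
    "ORDER BY cd.rowid"
)]

print(positions)
print(matches)
```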
It is a really bad idea to join on arbitrary text like this. It will be very slow and may not even work, but give it a shot:
select t1.description, c.*
from myTable t1
left join countries c on t1.description like CONCAT('%',c.countryCode,'%')
It's not entirely clear from your post, but I think you are asking to return all the rows whose description contains a country name? If that's the case, you can use the SQL LIKE operator inside an EXISTS subquery, like the following.
select
d.column_description
from
description_table d
where exists (
select 1
from country c
where d.column_description like concat('%', c.country_name, '%')
)
If not, I think your only other choice is Dan's post.
Enjoy !
All the suggestions so far seem to match partial words e.g. 'I AM USAIN BOLT' would match the country 'USA'. The question implies that matching should be done on whole words.
If the text consisted entirely of alphanumeric characters and each word was separated by a space character, you could use something like this
SELECT D1.description, C1.country_name
FROM Descriptions AS D1
LEFT OUTER JOIN Countries AS C1
ON ' ' + D1.description + ' '
LIKE '%' + ' ' + C1.country_name + ' ' + '%'
However, 'sentence' implies punctuation e.g. the above would fail to match 'I work in USA, Goa and Iran.' You need to delimit words before you can start matching them. Happily, there are already solutions to this problem e.g. full text search in SQL Server and the like. Why reinvent the wheel?
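To illustrate what whole-word matching means here, the sketch below uses Python regular expressions with word boundaries. It is a stand-in for illustration only, not the full-text search feature recommended above; the function name `countries_mentioned` and the country list are assumptions.

```python
import re


def countries_mentioned(text, countries):
    """Return the countries that occur in text as whole words."""
    found = []
    for country in countries:
        # \b matches between a word character and a non-word character, so
        # punctuation after a name is fine but "USAIN" does not match "USA".
        pattern = r"\b" + re.escape(country) + r"\b"
        if re.search(pattern, text):
            found.append(country)
    return found


countries = ["USA", "Goa", "Iran"]
partial_word = countries_mentioned("I AM USAIN BOLT", countries)
with_punctuation = countries_mentioned("I work in USA, Goa and Iran.", countries)
print(partial_word)
print(with_punctuation)
```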
Another problem is that a single country can go by many names, e.g. my country can legitimately be referred to as 'Britain', 'UK', 'GB' (according to my stackoverflow profile), 'England' (if you ask my kids) and 'The United Kingdom of Great Britain and Northern Ireland' (the latter is what it says on my passport and no, it won't fit in your NVARCHAR(50) column ;) to name but a few.