Full text search on multiple columns sql server - sql

I have the following table
Id Author
1 Alexander Mccall Smith
2 Ernest Hemingway
3 Giacomo Leopardi
4 Henry David Thoreau
5 Mary Higgins Clark
6 Rabindranath Tagore
7 Thomas Pynchon
8 Zora Neale Hurston
9 William S. Burroughs
10 Virginia Woolf
11 William tell
I want to search the Author by putting first few characters of the first and last name.
eg: Search Text: Will tel
Then the search result show the following result
William tell
eg: Search Text: will Burrou
Then the search result show the following result
William S. Burroughs
eg: Search Text: Will
Then the search result show the following result
William S. Burroughs
William tell
What is the efficient way to achieve this in sql server ?

As you mentioned this can be achieved using Full Text Search. You have to create the FTS catalog and then index on the table and column(s). You stated in the title 'Columns' but I only see one table column in your example so I will create the queries using that.
-- example 1 searching on Will and Tel
SELECT Id, Author
FROM Authors
WHERE CONTAINS(Author, '"Will*" AND "tel*"')
-- example 2 searching on Will and Burrou
SELECT Id, Author
FROM Authors
WHERE CONTAINS(Author, '"will*" AND "Burrou*"')
-- example 3 searching on Will
SELECT Id, Author
FROM Authors
WHERE CONTAINS(Author, '"will*"')
For further reference see
The Contains clause which searches for precise or fuzzy matches.
Article Query with Full-Text Search.

Less efficient than #Igor's answer as the table size grows, but you can also use the Like statement.
The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.
-- example 1 searching on Will and Tel
SELECT Id, Author
FROM Authors
WHERE Author Like('Will%Tel%')
-- example 2 searching on Will and Burrou
SELECT Id, Author
FROM Authors
WHERE Author Like('Will%Burrou%')
-- example 3 searching on Will
SELECT Id, Author
FROM Authors
WHERE Author Like('Will%')
Cons: It is slower than the contains statement.You need to include the % sign after any other keyword you're looking to search for.
Pros: Can be faster than contains statement in cases of smaller(<1000) row tables.

Related

SQL subquery as part of LIKE search

From this Question we learned to use a subquery to find information once-removed.
Subquery we learned :
SELECT * FROM papers WHERE writer_id IN ( SELECT id FROM writers WHERE boss_id = 4 );
Now, I need to search a table, both in column values that table, and in column values related by id on another table.
Here are the same tables, but col values contain more text for our "searching" reference...
writers :
id
name
boss_id
1
John Jonno
2
2
Bill Bosworth
2
3
Andy Seaside
4
4
Hank Little
4
5
Alex Crisp
4
The writers have papers they write...
papers :
id
title
writer_id
1
Boston
1
2
Chicago
4
3
Cisco
3
4
Seattle
2
5
North
5
I can use this to search only the names on writers...
Search only writers.name : (Not what I want to do)
SELECT * FROM writers WHERE LOWER(name) LIKE LOWER('%is%');
Output for above search : (Not what I want to do)
id
name
boss_id
5
Alex Crisp
4
I want to return cols from writers (not papers), but searching text both in writers.name and the writers.id-associated papers.title.
For example, if I searched "is", I would get both:
Alex Crisp (for 'is' in the name 'Crisp')
Andy Seaside (because Andy wrote a paper with 'is' in the title 'Cisco')
Output for "is" search :
id
title
writer_id
2
Chicago
4
4
Seattle
2
Here's what I have that doesn't work:
SELECT * FROM papers WHERE LOWER(title) LIKE LOWER('%is%') OR writer_id ( writers=writer_id WHERE LOWER(name) LIKE LOWER('%$is%') );
The best way to express this criteria is by using a correlated query with exists:
select *
from writers w
where Lower(w.name) like '%is%'
or exists (
select * from papers p
where p.writer_id = w.id and Lower(p.title) like '%is%'
);
Note you don't need to use lower on the string you are providing, and you should only use lower if your collation truly is case-sensitive as using the function makes the search predicate unsargable.
Since you want to return cols from writers (not papers) you should select them first, and use stuff from papers in the criteria
select *
from writers w
where
w.name like '%is%'
or
w.id in (select p.writer_id
paper p
where p.title like '%is%'
)
You can add your LOWER functions (my sql environment is not case-sensitive, so I didn't need them)

SQL WHERE column values into capital letters

Let's say I have the following entries in my database:
Id
Name
12
John Doe
13
Mary anne
13
little joe
14
John doe
In my program I have a string variable that is always capitalized, for example:
myCapString = "JOHN DOE"
Is there a way to retrieve the rows in the table by using a WHERE on the name column with the values capitalized and then matching myCapString?
In this case the query would return two entries, one with id=12, and one with id=14
A solution is NOT to change the actual values in the table.
A general solution in Postgres would be to capitalize the Name column and then do a comparison against an all-caps string literal, e.g.
SELECT *
FROM yourTable
WHERE UPPER(Name) = 'JOHN DOE';
If you need to implement this is Knex, you will need to figure out how to uppercase a column. This might require using a raw query.

Merge SQL Rows in Subquery

I am trying to work with two tables on BigQuery. From table1 I want to find the accession ID of all records that are "World", and then from each of those accession numbers I want to create a column with every name in a separate row. Unfortunately, when I run this:
Select name
From `table2`
Where acc IN (Select acc
From `table1`
WHERE source = 'World')
Instead of getting something like this:
Acc1
Acc2
Acc3
Jeff
Jeff
Ted
Chris
Ted
Blake
Rob
Jack
Jack
I get something more like this:
row
name
1
Jeff
2
Chris
3
Rob
4
Jack
5
Jeff
6
Jack
7
Ted
8
Blake
Ultimately, I am hoping to download the data and somehow use python or something to take each name and count the number of times it shows up with each other name at a given accession number, and furthermore measure the degree to which each pairing is also found with third names in any given column, i.e. the degree to which they share a cohort. So I need to preserve the groupings which exist with each accession number, but I am struggling to find info on how one might do this.
Could anybody point me in the right direct for this, or otherwise is the way I am going about this wise if that is my end goal?
Thanks!
This is not a direct answer to the question you asked. In general, it is easier to handle multiple rows rather than multiple columns.
So, I would recommend that you put each acc value in a separate row and then list the names as an array:
select t2.acc, array_agg(t2.name order by t2.name) as names
from `table2` t2
where t2.acc in (Select t1.acc
From `table1` t1
where t1.source = 'World'
)
group by t2.acc;
Otherwise, you are going to have a challenge just naming the columns in your result set.

Exact String match using oracle CONTAINS for multiple values

I need to get exact name string match from multiple values using Contains function.
I used below query to get data which matches exactly JOHN SMITH OR MIKE DAVID but query is fetching all data which has JOHN SMITH, JOHN, SMITH, MIKE, DAVID, MIKE DAVID, JOHN..SMITH, JOHN/SMITH,....
where contains(names,'{JOHN SMITH} OR {MIKE DAVID}')>0
Note - I don't want to use multiple like in OR conditions.We need to pass around 200 to 300 values (names) to do match pattern.
Can anyone let me know how to get exact match from multiple values using CONTAINS?
Thanks
Anand
I don't use Oracle but i liked this question and i will try to answer it based on what i read on Oracle's documentation, regarding CONTAINS.
so i am currently reading about Stored Query Expression (SQE),
Stored Query Expression (SQE)
Use the SQE operator to call a stored query expression created with
the CTX_QUERY.STORE_SQE procedure.
Stored query expressions can be used for creating predefined bins for
organizing and categorizing documents or to perform iterative queries,
in which an initial query is refined using one or more additional
queries.
So perhaps you could pass all the names you want to query in such a way. For example:
begin
ctx_query.store_sqe('mynames', 'JOHN SMITH OR MIKE DAVID OR GEORGE HARRIS OR JOHN DOE');
end;
And then call it:
SELECT SCORE(1), nameid FROM mytable
WHERE CONTAINS(names, 'sqe(mynames)', 1)> 0
ORDER BY SCORE(1);
Hope this was helpful.

Populating column for Oracle Text search from 2 tables

I am investigating the benefits of Oracle Text search, and currently am looking at collecting search text data from multiple (related) tables and storing the data in the smaller table in a 1-to-many relationship.
Consider these 2 simple tables, house and inhabitants, and there are NEVER any uninhabited houses:
HOUSE
ID Address Search_Text
1 44 Some Road
2 31 Letsby Avenue
3 18 Moon Crescent
INHABITANT
ID House Name Nickname
1 1 Jane Doe Janey
2 1 John Doe JD
3 2 Jo Smythe Smithy
4 2 Percy Plum PC
5 3 Apollo Lander Moony
I want to to write SQL that updates the HOUSE.Search_Text column with text from INHABITANT. Now because this is a 1-to-many, the SQL needs to collate the data in INHABITANT for each matching row in house, and then combine the data (comma separated) and update the Search_Text field.
Once done, the Oracle Text search index on HOUSE.Search_Text will return me HOUSEs that match the search criteria, and I can look up INHABITANTs accordingly.
Of course, this is a very simplified example, I want to pick up data from many columns and Full Text Search across fields in both tables.
With the help of a colleague we've got:
select id, ADDRESS||'; '||Names||'; '||Nicknames as Search_Text
from house left join(
SELECT distinct house_id,
LISTAGG(NAME, ', ') WITHIN GROUP (ORDER BY NAME) OVER (PARTITION BY house_id) as Names,
LISTAGG(NICKNAME, ', ') WITHIN GROUP (ORDER BY NICKNAME) OVER (PARTITION BY house_id) as Nicknames
FROM INHABITANT)
i on house.id = i.house_id;
which returns:
1 44 Some Road; Jane Doe, John Doe; JD, Janey
2 31 Letsby Avenue; Jo Smythe, Percy Plum; PC, Smithy
3 18 Moon Crescent; Apollo Lander; Moony
Some questions:
Is this an efficient query to return this data? I'm slightly
concerned about the distinct.
Is this the right way to use Oracle Text search across multiple text fields?
How to update House.Search_Text with the results above? I think I need a correlated subquery, but can't quite work it out.
Would it be more efficient to create a new table containing House_ID and Search_Text only, rather than update House?