How to implement a Keyword Search in MySQL? - sql

I am new to SQL programming.
I have a table job where the fields are id, position, category, location, salary range, description, refno.
I want to implement a keyword search from the front end. The keyword can reside in any of the fields of the above table.
This is the query I have tried, but it returns many duplicate rows:
SELECT
a.*,
b.catname
FROM
job a,
category b
WHERE
a.catid = b.catid AND
a.jobsalrange = '15001-20000' AND
a.jobloc = 'Berkshire' AND
a.jobpos LIKE '%sales%' OR
a.jobloc LIKE '%sales%' OR
a.jobsal LIKE '%sales%' OR
a.jobref LIKE '%sales%' OR
a.jobemail LIKE '%sales%' OR
a.jobsalrange LIKE '%sales%' OR
b.catname LIKE '%sales%'

For a single keyword on VARCHAR fields you can use LIKE:
SELECT id, category, location
FROM table
WHERE
(
category LIKE '%keyword%'
OR location LIKE '%keyword%'
)
For a description you're usually better off adding a full-text index and doing a Full-Text Search (MyISAM only before MySQL 5.6; InnoDB supports FULLTEXT indexes from 5.6 onward):
SELECT id, description
FROM table
WHERE MATCH (description) AGAINST('keyword1 keyword2')
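The parentheses around the LIKE conditions matter. In SQL, AND binds more tightly than OR, so in the question's query the join condition and the two equality filters apply only to the first LIKE; every other OR branch matches each job row against every category row, which is the likely source of the duplicates. A minimal sketch of the grouped version, using only the table and column names from the question (the explicit JOIN syntax is a stylistic choice, not a requirement):

```sql
-- Join condition and fixed filters apply to every row;
-- the keyword may match any one of the grouped columns.
SELECT a.*, b.catname
FROM job a
JOIN category b ON a.catid = b.catid
WHERE a.jobsalrange = '15001-20000'
  AND a.jobloc = 'Berkshire'
  AND (
       a.jobpos      LIKE '%sales%'
    OR a.jobloc      LIKE '%sales%'
    OR a.jobsal      LIKE '%sales%'
    OR a.jobref      LIKE '%sales%'
    OR a.jobemail    LIKE '%sales%'
    OR a.jobsalrange LIKE '%sales%'
    OR b.catname     LIKE '%sales%'
  );
```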

SELECT
*
FROM
yourtable
WHERE
id LIKE '%keyword%'
OR position LIKE '%keyword%'
OR category LIKE '%keyword%'
OR location LIKE '%keyword%'
OR description LIKE '%keyword%'
OR refno LIKE '%keyword%';

Ideally, have a keyword table containing the fields:
Keyword
Id
Count (possibly)
with an index on Keyword. Create an insert/update/delete trigger on the other table so that, when a row is changed, every keyword is extracted and put into (or replaced in) this table.
You'll also need a table of words to not count as keywords (if, and, so, but, ...).
In this way, you'll get the best speed for queries wanting to look for the keywords and you can implement (relatively easily) more complex queries such as "contains Java and RCA1802".
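A rough sketch of what such a keyword table and an "all of these keywords" query could look like. The table name job_keyword, its column names and the sample rows are made up for illustration, and the rows are assumed to be maintained by the trigger (or by application code):

```sql
-- Hypothetical keyword table: one row per (keyword, job) pair.
-- The primary key starts with keyword, so it doubles as the keyword index.
CREATE TABLE job_keyword (
    keyword VARCHAR(64) NOT NULL,
    job_id  INT         NOT NULL,
    cnt     INT         NOT NULL DEFAULT 1,
    PRIMARY KEY (keyword, job_id)
);

-- Sample rows, as the trigger might have extracted them.
INSERT INTO job_keyword (keyword, job_id, cnt) VALUES
    ('java', 1, 2), ('rca1802', 1, 1),
    ('java', 2, 1), ('sales', 3, 4);

-- Jobs that contain both 'java' and 'rca1802':
-- a job qualifies only if it has a row for each of the two keywords.
SELECT job_id
FROM job_keyword
WHERE keyword IN ('java', 'rca1802')
GROUP BY job_id
HAVING COUNT(*) = 2;
-- returns job_id 1
```

Keywords are stored lower-cased here so that lookups are plain equality matches on the index.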
"LIKE" queries will work but they won't scale as well.

Personally, I wouldn't use the LIKE string comparison on the ID field or any other numeric field. It doesn't make sense for a search for ID# "216" to return 16216, 21651, 3216087, 5321668..., and so on and so forth; likewise with salary.
Also, if you want to use prepared statements to prevent SQL injections, you would use a query string like:
SELECT * FROM job WHERE `position` LIKE CONCAT('%', ? ,'%') OR ...
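For illustration, the same idea written with MySQL's server-side prepared statements. Most applications would do this through their client library instead; the variable name @kw and the choice of the description column for the second condition are assumptions:

```sql
-- The keyword is passed as a parameter, never concatenated into the SQL text.
PREPARE search_stmt FROM
    'SELECT * FROM job
     WHERE `position` LIKE CONCAT(''%'', ?, ''%'')
        OR description LIKE CONCAT(''%'', ?, ''%'')';

SET @kw = 'sales';
EXECUTE search_stmt USING @kw, @kw;
DEALLOCATE PREPARE search_stmt;
```

Note that % and _ inside the keyword still act as LIKE wildcards; escape them if they should be matched literally.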

I will explain the method I usually prefer:
First of all, keep in mind that this method sacrifices storage in order to gain computation speed.
Second, you need to have the right to edit the table structure.
1) Add a field (I usually call it "digest") where you store all the data from the row.
The field will look like:
"n-n1-n2-n3-n4-n5-n6-n7-n8-n9" etc., where each n is a single word.
I achieve this using a regular expression that replaces " " with "-".
This field is the result of all the row's data "digested" into one single string.
2) Use the LIKE operator with '%keyword%' on the digest field:
SELECT * FROM table WHERE digest LIKE '%keyword%'
You can even build a query with a little loop so you can search for multiple keywords at the same time, like this:
SELECT * FROM table WHERE
digest LIKE '%keyword1%' AND
digest LIKE '%keyword2%' AND
digest LIKE '%keyword3%' ...
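One possible way to fill such a digest column directly in MySQL, assuming the job table and field names from the question. This is only a sketch: it uses REPLACE rather than a regular expression, and the column must be refreshed whenever a row changes (for example from a trigger or from application code):

```sql
-- Add the digest column, then build it from the searchable fields.
ALTER TABLE job ADD COLUMN digest TEXT;

-- CONCAT_WS joins the fields with spaces and skips NULL values;
-- REPLACE then turns every space into '-'.
UPDATE job
SET digest = REPLACE(
        CONCAT_WS(' ', `position`, category, location, description, refno),
        ' ', '-');
```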

You can find another, simpler option in a thread here: Match Against.., with more detailed help in 11.9.2. Boolean Full-Text Searches
This is just in case someone needs a more compact option. It requires creating a FULLTEXT index on the table, which can be accomplished easily.
Information on how to create indexes (MySQL): MySQL FULLTEXT Indexing and Searching
A FULLTEXT index can list more than one column. With an index named search, the resulting SQL statement looks like this:
SELECT *, MATCH (`column`) AGAINST ('+keyword1* +keyword2* +keyword3*' IN BOOLEAN MODE) AS relevance
FROM `documents` USE INDEX (search)
WHERE MATCH (`column`) AGAINST ('+keyword1* +keyword2* +keyword3*' IN BOOLEAN MODE)
ORDER BY relevance DESC;
I tried with multiple columns, with no luck. Even though multiple columns are allowed in an index, the column list in MATCH() has to match the column list of a FULLTEXT index exactly, so you still need an index for each column you want to search on its own.
Depending on your criteria, you can use either option.
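A short sketch of the multi-column case. The column names title and body are hypothetical; the point is that MATCH() lists exactly the columns the FULLTEXT index was created on:

```sql
-- Create one FULLTEXT index, named search, over two (hypothetical) columns.
ALTER TABLE documents ADD FULLTEXT INDEX search (title, body);

-- MATCH() must name the same column list as the index.
SELECT *,
       MATCH (title, body)
           AGAINST ('+keyword1* +keyword2*' IN BOOLEAN MODE) AS relevance
FROM documents
WHERE MATCH (title, body)
          AGAINST ('+keyword1* +keyword2*' IN BOOLEAN MODE)
ORDER BY relevance DESC;
```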

I know this is a bit late, but this is what I did in our application. I hope it helps someone. Each column needs its own LIKE, and '%keyword%' already covers the '%keyword' and 'keyword%' cases:
SELECT * FROM `landmarks`
WHERE `landmark_name` LIKE '%keyword%'
OR `landmark_description` LIKE '%keyword%'
OR `landmark_address` LIKE '%keyword%'

Related

Is there a way to express AND in SIMILAR TO ignoring order of matches?

I have a Redshift table column that contains 1 to many hashtags (e.g. #a, #b, etc.). I want to write a query that finds rows where all tags from a given set exist (e.g. #a and #b) while not picking up other rows that have some but not all of the tags (e.g. only #a or only #b).
I can see how to do this with multiple LIKE conditions (e.g. field LIKE '%#a%' AND field LIKE '%#b%'), but I would really like to do it with a single statement. I can see how to do this with SIMILAR TO, but not in a way that ignores ordering. The following would work, but only if I include all possible combinations of ordering.
SELECT * FROM table WHERE field SIMILAR TO '(%#a%)(%#b%)|(%#b%)(%#a%)'
This works, but having to list all combinations of the tags I'm looking for would be a royal pain and prone to error. Is there a way to express 'AND' in SIMILAR TO (or another function) in Redshift that ignores order?
Make sure to capture the whole tag in any position and not match on incomplete tags (this assumes the tags are stored back to back, as in #a#b):
SELECT *
FROM table
WHERE (field LIKE '%#a#%' OR field LIKE '%#a') AND
(field LIKE '%#b#%' OR field LIKE '%#b')
This avoids matching data such as #ac#b
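A small self-contained check of the whole-tag idea. It assumes tags are stored back to back with no separator (as in '#a#b'); the table name posts and the sample rows are made up. Appending a trailing '#' to the column lets a single pattern per tag cover the start, middle and end positions:

```sql
CREATE TABLE posts (id INT, tags VARCHAR(100));

INSERT INTO posts VALUES
    (1, '#a#b'),    -- both tags
    (2, '#ac#b'),   -- '#ac' must not count as '#a'
    (3, '#b#c#a'),  -- both tags, different order
    (4, '#a');      -- only one of the tags

-- With the extra '#', every complete tag is followed by '#'.
SELECT id
FROM posts
WHERE tags || '#' LIKE '%#a#%'
  AND tags || '#' LIKE '%#b#%';
-- returns ids 1 and 3
```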
Use AND and LIKE:
SELECT t.*
FROM table t
WHERE field LIKE '%#a%' AND
field LIKE '%#b%';

Why do CONTAINS and LIKE return different results?

I have the following query. There are two possible columns that may hold the value I'm looking for, let's call them FieldA and FieldB.
If I execute this:
SELECT COUNT(1)
FROM Table
WHERE CONTAINS(Table.*, 'string')
I get back "0".
However, if I execute this:
SELECT COUNT(1)
FROM TABLE
WHERE FieldA LIKE '%string%' OR FieldB LIKE '%string%'
I get back something like 9000. I then checked, and there are rows that have the word string in either FieldA or FieldB.
Why does this happen? I recall that CONTAINS uses a full-text index, but I also recall that LIKE does the same, so if the problem was that the indexes are outdated, then it should fail for both of them, right?
Thanks
I believe that CONTAINS and full-text searching only match whole words, so you won't get the same matches as with LIKE '%string%'. If you want a trailing wildcard in CONTAINS, you must write it as a quoted prefix term:
SELECT COUNT(1) FROM Table WHERE CONTAINS(Table.*, '"string*"')
However, a leading wildcard is not possible. You have to store a reversed copy of the text and then search that with a trailing wildcard:
SELECT COUNT(1) FROM Table WHERE CONTAINS(Table.*, '"gnirts*"')
https://learn.microsoft.com/en-us/previous-versions/office/developer/sharepoint-2010/ms552152(v=office.14)
How do you get leading wildcard full-text searches to work in SQL Server?
So in the example in the question, doing a CONTAINS(Table.*, 'string') is not the same as doing LIKE '%string%' and would not have the same results.

Improve the performance of a query containing the upper and nvl functions

INSERT INTO tab2 NOLOGGING
SELECT
ID,
ORG_NAME
FROM tab3
WHERE (( upper(NVL(org_name,company_given)) LIKE '%MSOFT%'
OR upper(NVL(org_name,company_given)) LIKE 'M SOFT'
OR upper(NVL(org_name,company_given)) LIKE '%MISOFT%'
OR upper(NVL(org_name,company_given)) LIKE 'MSN %'
OR upper(NVL(org_name,company_given)) LIKE '%N APP%'
OR upper(NVL(org_name,company_given)) LIKE '%NAPP%'
OR upper(NVL(org_name,company_given)) LIKE '%NAPPE%'
OR upper(NVL(org_name,company_given)) LIKE '%NAPPS%'
OR upper(NVL(org_name,company_given)) LIKE '%NEK%APPLIANCE%'
The above code is taking too much time. Table tab3 is very large.
The conditions above are generated dynamically. Are there any alternatives to nvl?
The line below
OR upper(NVL(org_name,company_given)) LIKE 'M SOFT'
could be replaced with
OR (org_name is not null and upper(org_name) LIKE 'M SOFT')
OR (org_name is null and upper(company_given) LIKE 'M SOFT')
Not sure it's faster.
You can also try evaluating the expression only once, in a subquery:
SELECT *
FROM (
SELECT
ID,
ORG_NAME,
upper(NVL(org_name,company_given)) as name_for_filter
FROM tab3)
WHERE name_for_filter LIKE '%MSOFT%'
OR name_for_filter LIKE 'M SOFT'
...
The best way would be to introduce a name_for_filter column in the table and fill it once with a trigger. The column could then be used for the filtering.
This query is going to execute a full table scan of your table. You say that table is huge, so it's going to take a long time.
A normal index won't help because there are two columns in play. Even a function-based index like this ...
create index fbi3 on tab3( upper(NVL(org_name, company_given) ))
... won't help because indexes are useless against a like filter with a wildcard at the front, and you have those:
LIKE '%NEK%APPLIANCE%'
If this is a one-time exercise I would suggest you swallow the time and wait for the statement to finish. But let's assume you want to do this kind of query often. If so, it's worth building infrastructure to support it.
1) Add a new column for the search criteria: a column pre-populated with the result of the function calls. For 11g or higher make this a virtual column:
alter table tab3 add search_name as ( upper(NVL(org_name, company_given)));
If using an older version of the database you will have to build a normal column and populate it with triggers.
2) Build a Text index on the search_name column. As it is short you can use a CTXCAT index, which will be maintained transactionally.
3) Rewrite the query to use catsearch() syntax instead of the like operator.
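As a rough sketch of the index and query, assuming Oracle Text is installed and the search_name column already exists. The index name tab3_search_cat is made up, and whether a CTXCAT index can be built on a virtual column in a given release should be verified before relying on it:

```sql
-- CTXCAT index on the prepared search column.
CREATE INDEX tab3_search_cat ON tab3 (search_name)
    INDEXTYPE IS CTXSYS.CTXCAT;

-- CATSEARCH replaces the LIKE filter; the third argument
-- (a structured-query clause) is not needed here.
SELECT id, org_name
FROM tab3
WHERE CATSEARCH(search_name, 'MSOFT', NULL) > 0;
```

CATSEARCH matches words rather than arbitrary substrings, so patterns such as '%NAPP%' do not translate one-to-one and the search terms need to be reviewed.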
As already suggested, it's probably best to create a prepared search column. You could even remove the spaces to avoid a search for both 'N APP' and 'NAPP' for example (but that could lead to false positives in some cases).
On top of that you can remove the check for %NAPPE% and %NAPPS% because you already include records containing %NAPP%
It should be faster when using:
pseudocode:
'MSN %'
or ('%SOFT%' and ('M SOFT' or '%MSOFT%' or '%MISOFT%'))
or ('%APP%' and ('%N APP%' or '%NAPP%' or '%NEK%APPLIANCE%'))
If SOFT or APP is not found there is no need to check for the others containing the same word - the and will avoid that if the first part is already false.
If this is just an example and those parameters are variable, you could write some code to optimize those search terms (unless the database server already does that).
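Written out as SQL, the pseudocode above might look like the following sketch. It reuses the name_for_filter subquery from the earlier answer; note that SQL does not guarantee the order in which predicates are evaluated, so the short-circuit benefit depends on the optimizer:

```sql
SELECT id, org_name
FROM (
    SELECT id,
           org_name,
           UPPER(NVL(org_name, company_given)) AS name_for_filter
    FROM tab3
)
WHERE name_for_filter LIKE 'MSN %'
   -- cheap pre-check on the shared fragment, then the specific patterns
   OR (name_for_filter LIKE '%SOFT%'
       AND (name_for_filter LIKE 'M SOFT'
         OR name_for_filter LIKE '%MSOFT%'
         OR name_for_filter LIKE '%MISOFT%'))
   OR (name_for_filter LIKE '%APP%'
       AND (name_for_filter LIKE '%N APP%'
         OR name_for_filter LIKE '%NAPP%'
         OR name_for_filter LIKE '%NEK%APPLIANCE%'));
```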

How to search a given text in a table in sql server

I want to search a text in a table without knowing its attributes.
Example: I have a table Customer, and I want to find a record which contains 'mohit' in any field without knowing the column name.
You are looking for Full Text Indexing
Example using the Contains
select ColumnName from TableName
Where Contains(Col1,'mohit') OR contains(col2,'mohit')
NOTE - You can convert the above full-text query into a dynamic query using the column names obtained from the sys.Columns query.
Also check below
FIX: Full-Text Search Queries with CONTAINS Clause Search Across Columns
You can list all column names with the query below:
Select Name From sys.Columns Where Object_Id =
(Select Object_Id from sys.Tables Where Name = 'TableName')
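A sketch of the dynamic query mentioned in the note. It uses LIKE rather than CONTAINS so that it does not depend on a full-text index, and it assumes SQL Server 2017 or later for STRING_AGG; the table name and search term are the ones from the question:

```sql
DECLARE @table   SYSNAME       = N'Customer';
DECLARE @pattern NVARCHAR(102) = N'%' + N'mohit' + N'%';
DECLARE @where   NVARCHAR(MAX);
DECLARE @sql     NVARCHAR(MAX);

-- One "[col] LIKE @pattern" predicate per character column, joined with OR.
SELECT @where = STRING_AGG(
           CAST(QUOTENAME(c.name) + N' LIKE @pattern' AS NVARCHAR(MAX)),
           N' OR ')
FROM sys.columns AS c
JOIN sys.types   AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(@table)
  AND t.name IN (N'char', N'varchar', N'nchar', N'nvarchar');

-- Run the generated statement; the search value stays a parameter.
IF @where IS NOT NULL
BEGIN
    SET @sql = N'SELECT * FROM ' + QUOTENAME(@table) + N' WHERE ' + @where;
    EXEC sp_executesql @sql, N'@pattern NVARCHAR(102)', @pattern = @pattern;
END;
```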
Double-wildcard LIKE statements will not speed up the query.
If you want to search the whole table, you need to know its structure. Assuming the table has the fields id, name, age, and address, your SQL query would look like this:
SELECT * FROM `Customer`
WHERE `id` LIKE '%mohit%'
OR `name` LIKE '%mohit%'
OR `age` LIKE '%mohit%'
OR `address` LIKE '%mohit%';
Mohit, I'm glad you devised the solution by yourself.
Anyway, whenever you face an unknown table or database again, I think the code snippet I just posted here will be very welcome.
Ah, one more thing: the answers given did not address your problem, did they?

How does phpmyadmin implement "search" feature?

In phpMyAdmin, there is a search in database feature by which I can input a word and find it in any table(s).
How can I implement it with an SQL statement? I know the LIKE operator, but its syntax is:
WHERE column_name LIKE pattern
How can I search in all columns? And how do I specify whether it's an exact keyword or a regular expression?
SELECT * FROM your_table_name WHERE your_column_name LIKE 'search_box_text';
Where search_box_text is what you enter in the search. It will also say in the result page what kind of query it made. The same query with regular expressions is:
SELECT * FROM your_table_name WHERE your_column_name REGEXP 'search_box_text';
Remember that the wildcard in MySQL is %, e.g. LIKE '%partial_search_text%'.
If you want to search in multiple columns, you can check which columns are in table with:
DESCRIBE your_table_name;
Or if you already know your columns:
SELECT * FROM your_table_name
WHERE your_column_1 LIKE '%search%'
OR your_column_2 LIKE '%search%'
OR your_column_3 LIKE '%search%';
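To search every text column of every table in the current database, one approach is to generate the individual LIKE queries from information_schema and then run them. This is only a sketch of that idea, not a description of phpMyAdmin's actual implementation; the search word is a placeholder:

```sql
-- Produces one SELECT statement per text column in the current database.
SELECT CONCAT('SELECT * FROM `', table_name,
              '` WHERE `', column_name,
              '` LIKE ''%search%'';') AS generated_query
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND data_type IN ('char', 'varchar', 'tinytext',
                    'text', 'mediumtext', 'longtext');
```

Each generated row is a complete statement that can be executed as-is; replacing LIKE with REGEXP in the template gives the regular-expression variant.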