Optimizing like '%text%' condition without removing it - sql

I need to optimize query although I don't have access to application source code.
Simplified query goes like this:
select from eu.myview where name like '%text%'
Until now I was solving this by generating full text index and replacing like with contains condition. That worked fine. However, in this case I don't have access to application source code and I can't remove like condition. I have access to database and I can change eu.myview source.
Is there anything I can do to optimize this query?

No.
The leading % invalidates use of index seeks so every row must be examined... whether via an index or table scan

Related

How to overcome ( SQL Like operator ) performance issues?

I have Users table contains about 500,000 rows of users data
The Full Name of the User is stored in 4 columns each have type nvarchar(50)
I have a computed column called UserFullName that's equal to the combination of the 4 columns
I have a Stored Procedure searching in Users table by name using like operatior as below
Select *
From Users
Where UserFullName like N'%'+#FullName+'%'
I have a performance issue while executing this SP .. it take long time :(
Is there any way to overcome the lack of performance of using Like operator ?
Not while still using the like operator in that fashion. The % at the start means your search needs to read every row and look for a match. If you really need that kind of search you should look into using a full text index.
Make sure your computed column is indexed, that way it won't have to compute the values each time you SELECT
Also, depending on your indexing, using PATINDEX might be quicker, but really you should use a fulltext index for this kind of thing:
http://msdn.microsoft.com/en-us/library/ms187317.aspx
If you are using index it will be good.
So you can give the column an id or something, like this:
Alter tablename add unique index(id)
Take a look at that article http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning .
It easily describes how LIKE works in terms of performance.
Likely, you are facing such issues because your whole table shoould be traversed because of first % symbol.
You should try creating a list of substrings(in a separate table for example) representing k-mers and search for them without preceeding %. Also, an index for such column would help. Please read more about KMer here https://en.m.wikipedia.org/wiki/K-mer .
This will not break an index and will be more efficient to search.

PostgreSQL only returns result when adding additional AND clause

I have a query like this:
SELECT boroughs.name
FROM boroughs, uniroads
WHERE uniroads.normalizedName='6 AVENUE'
AND st_intersects(boroughs.geometry, uniroads.way)
AND boroughs.name='Brooklyn'
0 results
But when I run it, it returns no results. However, I'm able to find the specific row in the table I'd like it to return, and when I add a clause asking for that particular row, it works fine:
SELECT boroughs.name
FROM boroughs, uniroads
WHERE uniroads.normalizedName='6 AVENUE'
AND st_intersects(boroughs.geometry, uniroads.way)
AND boroughs.name='Brooklyn'
AND uniroads.osm_id='23334071'
1 result
I'm using Postgres 9.2.2.0 with PostGIS through Postgres.app.
A guess.
uniroads.osm_id looks like a Key, therefore it is most likely to be indexed.
AND uniroads.osm_id='23334071' clause is causing (another, perhaps?) index to be used, thus this might mean (some?) originally used indexes are corrupted.
Maybe the following might help?
REINDEX TABLE boroughs;
REINDEX TABLE uniroads;
In any case, EXPLAIN ANALYZE wanted for both queries, as well as complete definitions of tables involved.

where like over varchar(500)

I have a query which slows down immensely when i add an addition where part
which essentially is just a like lookup on a varchar(500) field
where...
and (xxxxx.yyyy like '% blahblah %')
I've been racking my head but pretty much the query slows down terribly when I add this in.
I'm wondering if anyone has suggestions in terms of changing field type, index setup, or index hints or something that might assist.
any help appreciated.
sql 2000 enterprise.
HERE IS SOME ADDITIONAL INFO:
oops. as some background unfortunately I do need (in the case of the like statement) to have the % at the front.
There is business logic behind that which I can't avoid.
I have since created a full text catalogue on the field which is causing me problems
and converted the search to use the contains syntax.
Unfortunately although this has increased performance on occasion it appears to be slow (slower) for new word searchs.
So if i have apple.. apple appears to be faster the subsequent times but not for new searches of orange (for example).
So i don't think i can go with that (unless you can suggest some tinkering to make that more consistent).
Additional info:
the table contains only around 60k records
the field i'm trying to filter is a varchar(500)
sql 2000 on windows server 2003
The query i'm using is definitely convoluted
Sorry i've had to replace proprietary stuff.. but should give you and indication of the query:
SELECT TOP 99 AAAAAAAA.Item_ID, AAAAAAAA.CatID, AAAAAAAA.PID, AAAAAAAA.Description,
AAAAAAAA.Retail, AAAAAAAA.Pack, AAAAAAAA.CatID, AAAAAAAA.Code, BBBBBBBB.blahblah_PictureFile AS PictureFile,
AAAAAAAA.CL1, AAAAAAAA.CL1, AAAAAAAA.CL2, AAAAAAAA.CL3
FROM CCCCCCC INNER JOIN DDDDDDDD ON CCCCCCC.CID = DDDDDDDD.CID
INNER JOIN AAAAAAAA ON DDDDDDDD.CID = AAAAAAAA.CatID LEFT OUTER JOIN BBBBBBBB
ON AAAAAAAA.PID = BBBBBBBB.Product_ID INNER JOIN EEEEEEE ON AAAAAAAA.BID = EEEEEEE.ID
WHERE
(CCCCCCC.TID = 654321) AND (DDDDDDDD.In_Use = 1) AND (AAAAAAAA.Unused = 0)
AND (DDDDDDDD.Expiry > '10-11-2010 09:23:38') AND
(
(AAAAAAAA.Code = 'red pen') OR
(
(my_search_description LIKE '% red %') AND (my_search_description LIKE '% nose %')
AND (DDDDDDDD.CID IN (63,153,165,305,32,33))
)
)
AND (DDDDDDDD.CID IN (20,32,33,63,64,65,153,165,232,277,294,297,300,304,305,313,348,443,445,446,447,454,472,479,481,486,489,498))
ORDER BY AAAAAAAA.f_search_priority DESC, DDDDDDDD.Priority DESC, AAAAAAAA.Description ASC
You can see throwing in the my_search_description filter also includes a dddd.cid filter (business logic).
This is the part which is slowing things down (from a 1.5-2 second load of my pages down to a 6-8 second load (ow ow ow))
It might be my lack of understanding of how to have the full text search catelogue working.
Am very impressed by the answers so if anyone has any tips I'd be most greatful.
If you haven't already, enable full text indexing.
Unfortunately, using the LIKE clause on a query really does slow things down. Full Text Indexing is really the only way that I know of to speed things up (at the cost of storage space, of course).
Here's a link to an overview of Full-Text Search in SQL Server which will show you how to configure things and change your queries to take advantage of the full-text indexes.
More details would certainly help, but...
Full-text indexing can certainly be useful (depending on the more details about the table and your query). Full Text indexing requires a good bit of extra work both in setup and querying, but it's the only way to try to do the sort of search you seek efficiently.
The problem with LIKE that starts with a Wildcard is that SQL server has to do a complete table scan to find matching records - not only does it have to scan every row, but it has to read the contents of the char-based field you are querying.
With or without a full-text index, one thing can possibly help: Can you narrow the range of rows being searched, so at least SQL doesn't need to scan the whole table, but just some subset of it?
The '% blahblah %' is a problem for improving performance. Putting the wildcard at the beginning tells SQL Server that the string can begin with any legal character, so it must scan the entire index. Your best bet if you must have this filter is to focus on your other filters for improvement.
Using LIKE with a wildcard at the beginning of the search pattern forces the server to scan every row. It's unable to use any indexes. Indexes work from left to right, and since there is no constant on the left, no index is used.
From your WHERE clause, it looks like you're trying to find rows where a specific word exists in an entry. If you're searching for a whole word, then full text indexing may be a solution for you.
Full text indexing creates an index entry for each word that's contained in the specified column. You can then quickly find rows that contain a specific word.
As other posters have correctly pointed out, the use of the wildcard character % within the LIKE expression is resulting in a query plan being produced that uses a SCAN operation. A scan operation touches every row in the table or index, dependant on the type of scan operation being performed.
So the question really then becomes, do you actually need to search for the given text string anywhere within the column in question?
If not, great, problem solved but if it is essential to your business logic then you have two routes of optimization.
Really go to town on increasing the overall selectivity of your query by focusing your optimization efforts on the remaining search arguments.
Implement a Full Text Indexing Solution.
I don't think this is a valid answer, but I'd like to throw it out there for some more experienced posters comments...are these equivlent?
where (xxxxx.yyyy like '% blahblah %')
vs
where patindex(%blahbalh%, xxxx.yyyy) > 0
As far as I know, that's equivlent from a database logic standpoint as it's forcing the same scan. Guess it couldn't hurt to try?

Placing index columns on the left of a mysql WHERE statement?

I was curious since i read it in a doc. Does writing
select * from CONTACTS where id = ‘098’ and name like ‘Tom%’;
speed up the query as oppose to
select * from CONTACTS where name like ‘Tom%’ and id = ‘098’;
The first has an indexed column on the left side. Does it actually speed things up or is it superstition?
Using php and mysql
Check the query plans with explain. They should be exactly the same.
This is purely superstition. I see no reason that either query would differ in speed. If it was an OR query rather than an AND query however, then I could see that having it on the left may spped things up.
interesting question, i tried this once. query plans are the same (using EXPLAIN).
but considering short-circuit-evaluation i was wondering too why there is no difference (or does mysql fully evaluate boolean statements?)
You may be mis-remembering or mis-reading something else, regarding which side the wildcards are on a string literal in a Like predicate. Putting the wildcard on the right (as in yr example), allows the query engine to use any indices that might exist on the table column you are searching (in this case - name). But if you put the wildcard on the left,
select * from CONTACTS where name like ‘%Tom’ and id = ‘098’;
then the engine cannot use any existing index and must do a complete table scan.

What is the best way to do a wildcard search in sql server 2005?

So I have a stored procedure that accepts a product code like 1234567890. I want to facilitate a wildcard search option for those products. (i.e. 123456*) and have it return all those products that match. What is the best way to do this?
I have in the past used something like below:
SELECT #product_code = REPLACE(#product_code, '*', '%')
and then do a LIKE search on the product_code field, but i feel like it can be improved.
What your doing already is about the best you can do.
One optimization you might try is to ensure there's an index on the columns you're allowing this on. SQL Server will still need to do a full scan for the wildcard search, but it'll be only over the specific index rather than the full table.
As always, checking the query plan before and after any changes is a great idea.
A couple of random ideas
It depends, but you might like to consider:
Always look for a substring by default. e.g. if the user enters "1234", you search for:
WHERE product like "%1234%"
Allow users full control. i.e. simply take their input and pass it to the LIKE clause. This means that they can come up with their own custom searches. This will only be useful if your users are interested in learning.
WHERE product like #input