Difference between NOT LIKE and '[^string]' - sql

Is there any difference between the 2 syntaxes?
SELECT [dbo].[Employees].[FirstName]
FROM [dbo].[Employees]
WHERE [dbo].[Employees].[FirstName] NOT LIKE '[a-c]%';
SELECT [dbo].[Employees].[FirstName]
FROM [dbo].[Employees]
WHERE [dbo].[Employees].[FirstName] LIKE '[^a-c]%';

Yes. NOT LIKE '[a-c]%' is true for an empty string, while LIKE '[^a-c]%' is not, because [^a-c] has to match exactly one character.
For more complex patterns there is almost always a difference, especially once you start exploiting double negatives. For example, NOT LIKE '%[^0-9]%' (no character that is not a digit) has no simple positive equivalent.
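The empty-string difference can be reproduced with a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in, because SQLite's LIKE has no bracket classes; its GLOB operator is used instead, with * playing the role of %. The table and sample rows are made up for illustration, and SQL Server's behavior should be checked separately.

```python
import sqlite3

# Assumption: SQLite GLOB stands in for T-SQL LIKE with bracket classes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (FirstName TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?)",
    [("alice",), ("dave",), ("",), (None,)],
)

# Negating the whole match: true for anything that does not start with a-c,
# including the empty string. NULL is excluded in both queries.
not_like = [r[0] for r in conn.execute(
    "SELECT FirstName FROM Employees "
    "WHERE FirstName NOT GLOB '[a-c]*' ORDER BY FirstName")]

# Negated character class: requires a first character outside a-c,
# so the empty string cannot match.
negated_class = [r[0] for r in conn.execute(
    "SELECT FirstName FROM Employees "
    "WHERE FirstName GLOB '[^a-c]*' ORDER BY FirstName")]

# Double negative: "contains no non-digit character", i.e. digits only.
digits_only = conn.execute(
    "SELECT '12345' NOT GLOB '*[^0-9]*', '12a45' NOT GLOB '*[^0-9]*'"
).fetchone()

print(not_like)       # ['', 'dave']
print(negated_class)  # ['dave']
print(digits_only)    # (1, 0)
```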

Related

SQL: LIKE with OR vs IN

Is there any performance difference between the following?
NAME LIKE '%EXPRESSION_1%'
OR NAME LIKE '%EXPRESSION_2%'
...
OR NAME LIKE '%EXPRESSION_N%'
VS
NAME IN (ACTUAL_VALUE_1,ACTUAL_VALUE_2,.., ACTUAL_VALUE_N)
The IN version is potentially much, much faster.
The two versions do not do the same thing. But, if either meets your needs, the IN version can take advantage of an index on NAME. The LIKE version cannot, because the pattern starts with a wildcard.
You could write this as:
WHERE NAME LIKE 'EXPRESSION_%'
If this meets your needs, it can also take advantage of an index on NAME.
You could simply try
NAME LIKE '%EXPRESSION_%'
As far as performance is concerned, IN is generally faster than a chain of OR ... LIKE conditions. You can confirm this by comparing the execution plans of the two queries.
As noted above, the two queries you show are not equivalent.
The first query:
NAME LIKE '%EXPRESSION_1%'
OR NAME LIKE '%EXPRESSION_2%'
...
OR NAME LIKE '%EXPRESSION_N%'
matches rows containing values such as
EXPRESSION_123
XXXEXPRESSION_1234
EXPRESSION_2323
EXPRESSION_2......
whereas the second query matches only rows whose value is exactly one of
ACTUAL_VALUE_1,ACTUAL_VALUE_2.....
If the expression varies with a parameter, build the pattern in a variable:
declare @Expression1 varchar(50), @Expression2 varchar(52)
set @Expression2 = '%' + @Expression1 + '%'
NAME LIKE @Expression2
That way, whatever value arrives in @Expression1 is handled automatically.
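The two points made in this thread (different results, different index use) can be checked with a rough sketch. It assumes SQLite via Python's sqlite3 module rather than the asker's DBMS; the items table, the idx_items_name index and the sample rows are hypothetical, and query plan text varies between engines and versions.

```python
import sqlite3

# Assumption: SQLite is only a stand-in; table, index and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.execute("CREATE INDEX idx_items_name ON items (name)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("EXPRESSION_1",), ("XXXEXPRESSION_1234",), ("OTHER",)],
)

like_sql = "SELECT name FROM items WHERE name LIKE '%EXPRESSION_1%'"
in_sql = "SELECT name FROM items WHERE name IN ('EXPRESSION_1', 'EXPRESSION_2')"

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# LIKE with a leading wildcard does substring matching; IN matches whole values.
like_rows = sorted(r[0] for r in conn.execute(like_sql))
in_rows = sorted(r[0] for r in conn.execute(in_sql))
print(like_rows)  # ['EXPRESSION_1', 'XXXEXPRESSION_1234']
print(in_rows)    # ['EXPRESSION_1']

# The plans show whether the engine seeks into the index or scans every row.
like_plan = plan(like_sql)
in_plan = plan(in_sql)
print(like_plan)
print(in_plan)
```

In SQLite the IN query is expected to report an index search and the leading-wildcard LIKE a scan, which mirrors the argument above; other engines present their plans differently.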

How do I declare priority in SQL statements?

My question seems to be quite simple, but I'm worried the answer might actually be somewhat complex. I am trying to perform a simple Select query that behaves like the following.
Here is the code:
SELECT * FROM tbl_tbl WHERE tbl_tbl.colA LIKE '%foo%' OR tbl_tbl.colA LIKE '%oof%' AND
tbl_tbl.colB LIKE '%bar%' OR tbl_tbl.colB LIKE '%rab%'
So I am just searching for 4 strings (2 in each column), and if I find one in each pair, I want to show that entire entry.
Mathematically, it makes quite a bit of sense to me.
I want to do (This OR That) AND (One OR Another) where any combination of This/One, This/Another, etc. passes the expression.
Pretty simple right?
How do I tell SQL to work right (you know, like that obscure way in my mind)?
Currently, I'm getting entries out of my table where only 1 of the column disciplines match, and that's not giving me the specificity of the priority I am looking for.
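AND binds more tightly than OR, so without parentheses your condition is evaluated as A OR (B AND C) OR D. You would express the intended grouping using parentheses in the where clause: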
SELECT *
FROM tbl_tbl t
WHERE (t.colA LIKE '%foo%' OR t.colA LIKE '%oof%') AND
(t.colB LIKE '%bar%' OR t.colB LIKE '%bar%');
Do note that this is based on your example in the question. The second clause of the AND has two conditions that are the same. I assume this is a typo in the question, but not knowing the right pattern, I've left it in the answer.
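To see the effect of the parentheses, here is a small sketch. It assumes SQLite via Python's sqlite3 module, uses the '%rab%' pattern from the question for the fourth condition, and the four sample rows are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stand-in with made-up rows; AND binds tighter than OR here
# just as in standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_tbl (colA TEXT, colB TEXT)")
conn.executemany(
    "INSERT INTO tbl_tbl VALUES (?, ?)",
    [("foo", "zzz"), ("oof", "bar"), ("zzz", "rab"), ("foo", "rab")],
)

# Parsed as: A OR (B AND C) OR D
unparenthesized = conn.execute(
    "SELECT colA, colB FROM tbl_tbl "
    "WHERE colA LIKE '%foo%' OR colA LIKE '%oof%' AND "
    "colB LIKE '%bar%' OR colB LIKE '%rab%' "
    "ORDER BY colA, colB"
).fetchall()

# Explicit grouping: (A OR B) AND (C OR D)
parenthesized = conn.execute(
    "SELECT colA, colB FROM tbl_tbl "
    "WHERE (colA LIKE '%foo%' OR colA LIKE '%oof%') AND "
    "(colB LIKE '%bar%' OR colB LIKE '%rab%') "
    "ORDER BY colA, colB"
).fetchall()

print(unparenthesized)
# [('foo', 'rab'), ('foo', 'zzz'), ('oof', 'bar'), ('zzz', 'rab')]
print(parenthesized)
# [('foo', 'rab'), ('oof', 'bar')]
```

The unparenthesized form returns rows where only one column matches, which is the symptom described in the question.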

Is this possible with the SELECT LIKE specification?

Looking at this link: SQL SELECT LIKE
What if you were searching for a name that starts with H and ends with dinger?
Would I use:
SELECT NAME LIKE
'H_dinger'
'H...dinger' or
'H%dinger' ?
I'll assume H_dinger would think there is only 1 character in between, but I don't know what it is -- so I'm searching for it.
H...dinger isn't valid.
And H%dinger seems like it would check it all, but on the site, that isn't even listed?
You would use %, the variable-length wildcard: it matches any sequence of zero or more characters, whereas _ matches exactly one character.
But you need to get the syntax right, such as with:
select NAME from TABLE where NAME like 'H%dinger'
Keep in mind that queries using % may be a performance issue (depending on how it's used and the DBMS engine). It can prevent the efficient use of indexes to speed up queries. It probably won't matter for small tables but it's something to keep in mind if you ever need to scale.
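A quick sketch of the two wildcards side by side, assuming SQLite via Python's sqlite3 module and a hypothetical people table with invented names:

```python
import sqlite3

# Assumption: SQLite stand-in; the table and names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("Hdinger",), ("Hedinger",), ("Heidinger",), ("Schrodinger",)],
)

# _ matches exactly one character between 'H' and 'dinger'.
underscore = [r[0] for r in conn.execute(
    "SELECT name FROM people WHERE name LIKE 'H_dinger' ORDER BY name")]

# % matches zero or more characters between 'H' and 'dinger'.
percent = [r[0] for r in conn.execute(
    "SELECT name FROM people WHERE name LIKE 'H%dinger' ORDER BY name")]

print(underscore)  # ['Hedinger']
print(percent)     # ['Hdinger', 'Hedinger', 'Heidinger']
```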

Sybase Multiple Substrings Search

I need to get data back from a text field. The input is not all going to be pretty...some of the users don't spell well or consistently. I need to look for a variety of misspellings as well as alternative terms.
I am working with Sybase ASE and am wondering if the AND statement is getting unwieldy and may not be optimal? Here is one attempt:
AND (entry_txt like 'fight' OR
entry_txt like 'confron%' OR
entry_txt like 'aggres%' OR
entry_txt like 'grab' OR
entry_txt like 'push' OR
entry_txt like 'strike' OR
entry_txt like 'hit' OR
entry_txt like 'assa%')
It will get longer as I add some new requirements for additional terms as well as some proprietary names and 8-9 more variations therein! Is there a more efficient way to do this or is that it?
I have also read that LIKE should be used for partial string comparison and IN for values from a set. How about values from a set of partial strings? Could I, or should I, use IN here, and would that help performance?
I am searching thousands of docs so there is a lot of data to have to go through.
Yes: for the terms that have no %, you can use IN; for the others you still need OR with LIKE.
It would look something like this:
AND (entry_txt in ('fight', 'grab', 'push', 'strike', 'hit')
OR entry_txt like 'confron%'
OR entry_txt like 'aggres%'
OR entry_txt like 'assa%')
The pattern on the right-hand side of LIKE can itself be an expression, such as another column in a table or a variable.
So you could create a table with one varchar column called "like_expr" or something like that.
Then put all the above expressions into it, including the ones without % in, because they'll just degenerate to an equality operation.
In terms of efficiency, if entry_txt is indexed then the index can be used. I would expect Sybase to find it easier to join to the like_expr table than to evaluate a long chain of ORs, but both should use the index; that is a separate issue.
create table abe (a varchar(20))
insert abe values ('hello')
create table abe2 (l varchar(20))
insert abe2 values ('h%')
select * from abe a where exists (select 1 from abe2 where a.a like l)
a
hello
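The pattern-table idea can be tried out with a sketch like the following. It assumes SQLite via Python's sqlite3 module instead of Sybase ASE; the entries and like_exprs tables, their columns and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stand-in for Sybase ASE; all names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (entry_txt TEXT)")
conn.execute("CREATE TABLE like_exprs (like_expr TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?)",
    [("fight",), ("a fight broke out",), ("confrontation",),
     ("aggressive",), ("hello",)],
)
# Patterns without % behave like equality tests; the others are prefix matches.
conn.executemany(
    "INSERT INTO like_exprs VALUES (?)",
    [("fight",), ("confron%",), ("aggres%",)],
)

# One EXISTS against the pattern table replaces the chain of OR ... LIKE terms.
matches = [r[0] for r in conn.execute(
    "SELECT e.entry_txt FROM entries e "
    "WHERE EXISTS (SELECT 1 FROM like_exprs p "
    "              WHERE e.entry_txt LIKE p.like_expr) "
    "ORDER BY e.entry_txt")]

print(matches)  # ['aggressive', 'confrontation', 'fight']
```

Note that 'a fight broke out' is not returned: a pattern without wildcards only matches the whole value, so searching inside free text would need patterns such as '%fight%'. New search terms then become inserts into the pattern table rather than query changes.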

For an Oracle NUMBER datatype, LIKE operator vs BETWEEN..AND operator

Assume mytable is an Oracle table and it has a field called id. The datatype of id is NUMBER(8). Compare the following queries:
select * from mytable where id like '715%'
and
select * from mytable where id between 71500000 and 71599999
I would think the second is more efficient, since a number comparison should require fewer machine instructions than a string comparison. Please confirm or correct this, and add any further comments on either operator.
UPDATE: I forgot to mention 1 important piece of info. id in this case must be an 8-digit number.
If you only want values between 71500000 and 71599999 then yes, the second one is much more efficient. The first one would also return values such as 7150-7159, 71500-71599 and so on. You would either need to sift through unnecessary results or write a couple more lines of code to filter them out. The second option is definitely more efficient for what you seem to want to do.
It seems like the execution plan on the second query is more efficient.
The first query is doing a full table scan of the id's, whereas the second query is not.
[Images not reproduced: test data; execution plan of the first query; execution plan of the second query]
I don't like the idea of using LIKE with a numeric column.
Also, it may not give the results you are looking for.
If you have a value of 715000000, it will show up in the query result, even though it is larger than 71599999.
Also, I do not like between on principle.
If a thing is between two other things, it should not include those two other things. But this is just a personal annoyance.
I prefer to use >= and <=. This avoids confusion when I read the query. In addition, sometimes I have to change the query to something like >= a and < c. If I had started with the between operator, I would have to rewrite it when I don't want the range to be inclusive.
Harv
In addition to the other points raised, using LIKE in the manner you suggest would cause Oracle to not use any indexes on the ID column due to the implicit conversion of the data from number to character, resulting in a full table scan when using LIKE versus an index range scan when using BETWEEN. Assuming, of course, you have an index on ID. Even if you don't, however, Oracle will have to do the type conversion on each value it scans in the LIKE case, which it won't have to do in the other.
You can use a math function instead; otherwise you have to use the to_char function in order to use LIKE, and that will cause performance problems.
select * from mytable where floor(id /100000) = 715
or
select * from mytable where floor(id /100000) = TO_NUMBER('715') -- the '715' can be a parameter
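The three approaches discussed in this thread can be compared on a few boundary values. This is only a sketch under assumptions: it uses SQLite via Python's sqlite3 module rather than Oracle, the sample ids are invented (some deliberately outside the 8-digit range), and SQLite's integer division is used in place of floor(id / 100000).

```python
import sqlite3

# Assumption: SQLite stand-in for Oracle; ids are made-up boundary cases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [(7150,), (71500000,), (71512345,), (71599999,), (71600000,), (715000000,)],
)

def ids(where):
    """Return the ids selected by the given WHERE clause, in ascending order."""
    sql = "SELECT id FROM mytable WHERE " + where + " ORDER BY id"
    return [r[0] for r in conn.execute(sql)]

# String prefix match: the number is implicitly converted to text,
# so 4-digit and 9-digit values starting with 715 also qualify.
by_like = ids("id LIKE '715%'")

# Numeric range: only the intended 8-digit block.
by_between = ids("id BETWEEN 71500000 AND 71599999")

# Arithmetic alternative; integer division plays the role of floor() here.
by_division = ids("id / 100000 = 715")

print(by_like)      # [7150, 71500000, 71512345, 71599999, 715000000]
print(by_between)   # [71500000, 71512345, 71599999]
print(by_division)  # [71500000, 71512345, 71599999]
```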