How do the SQL "IS" and "=" operators differ? - sql

I am building some prepared statements that use parametrized values. As an example:
SELECT * FROM "Foo" WHERE "Bar"=#param
Sometimes #param might be NULL. In such cases, I want the query to return records where Bar is NULL, but the above query will not do that. I have learned that I can use the IS operator for this. In other words:
SELECT * FROM "Foo" WHERE "Bar" IS #param
Aside from the differing treatment of NULL, are there any other ways in which the above two statements will behave differently? What if #param is not NULL, but is instead, let's say, 5? Is using the IS operator in that case a safe (and sane) thing to do? Is there some other approach I should be taking?

You want records from Foo where Bar = #param, or if #param is null, where Bar is null. Some of the proposed solutions will give you null records with nonnull #param, which does not sound like your requirement.
Select * from Foo where (#param is null and Bar is null) or (Bar = #param)
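As a rough illustration of that predicate, here is a minimal sketch using Python's sqlite3 module. SQLite, the in-memory table, the sample rows, and the helper name find_ids are all assumptions made for the demo and are not from the question; other databases use different parameter syntax.

```python
import sqlite3

# Hypothetical in-memory stand-in for the "Foo" table (SQLite chosen only for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo (id INTEGER PRIMARY KEY, Bar INTEGER)")
conn.executemany("INSERT INTO Foo (id, Bar) VALUES (?, ?)",
                 [(1, 5), (2, None), (3, 7)])

# The same named parameter is referenced twice: once for the NULL test,
# once for the ordinary equality test.
QUERY = """
SELECT id FROM Foo
WHERE (:param IS NULL AND Bar IS NULL) OR (Bar = :param)
ORDER BY id
"""

def find_ids(param):
    """Return ids of rows whose Bar matches param, with NULL matching only NULL."""
    rows = conn.execute(QUERY, {"param": param}).fetchall()
    return [row[0] for row in rows]

ids_for_five = find_ids(5)     # rows where Bar = 5
ids_for_null = find_ids(None)  # rows where Bar IS NULL
```

With a non-null parameter the first branch is never true, so NULL rows are excluded; with a NULL parameter the second branch is never true, so only NULL rows come back.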
The question doesn't say whether this is Oracle, SQL Server, or another RDBMS, which matters because each implements slightly different helper functions. SQL Server's ISNULL(first, second) is like Oracle's NVL(first, second). I like COALESCE() for its general applicability.
The IS comparison is only for null comparisons.
If you are using SQL Server and if you really need a different 3VL logic truth table to solve your problem (that is, if you have a specific need for "NULL=NULL" to be "true" at some point in time, and also recognize that this is deprecated and barring your reasons, not a good idea in general), within your code block you can use the directive
SET ANSI_NULLS OFF
Here's the BOL on it:
http://msdn.microsoft.com/en-us/library/ms188048.aspx

You may be thinking about this incorrectly. If you're talking about SQL Server, for example (since that's what I have to hand), your second example will result in a syntax error. The value on the right-hand side of IS cannot be 5.
To explain, consider MSDN's explanation of these two operators in T-SQL (note that asking about "SQL" and about "SQL Server" are not necessarily the same).
Equals (=) operator
IS NULL operator
Notice something important, there. There is no such thing as the "IS" operator in T-SQL. There is specifically the <expression> IS [NOT] NULL operator, which compares a single expression to NULL.
That's not the same thing as the = operator, which compares two expressions to each other, and has certain behavior when one or both of the expressions happens to be NULL!
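The different treatment of NULL by the two operators can be observed directly. This is a small sketch run against SQLite through Python's sqlite3 module, which is an assumption made for the demo: T-SQL does not accept a bare comparison in a SELECT list like this, so it illustrates the three-valued logic rather than SQL Server syntax. The helper name scalar is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a one-row, one-column query and return its single value."""
    return conn.execute(sql).fetchone()[0]

# "=" yields NULL (unknown) as soon as either operand is NULL.
eq_null_null = scalar("SELECT NULL = NULL")   # None
eq_five_null = scalar("SELECT 5 = NULL")      # None

# "IS NULL" always yields a definite true/false answer.
null_is_null = scalar("SELECT NULL IS NULL")  # 1
five_is_null = scalar("SELECT 5 IS NULL")     # 0
```

Because a WHERE clause keeps only rows whose predicate is true, a comparison that evaluates to NULL filters the row out just as a false one would.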

Edit:
(Update from OP: This doesn't do what I want. If #param is 5, then I want to see only records where Bar is 5. I want to see records where Bar is NULL if, and only if, #param is NULL. I apologize if my question didn't make that clear.)
In that case, I think you should try something like this:
SELECT * FROM Foo WHERE Bar=#param OR (Bar IS NULL AND #param IS NULL)
Previous post:
Why not simply use OR ?
SELECT * FROM "Foo" WHERE "Bar"=#param OR "Bar" IS NULL
In SQL Server, you can use ISNULL:
SELECT * FROM "Foo" WHERE ISNULL("Bar",#param)=#param
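As noted in another answer, this form also returns NULL rows when the parameter is not NULL. The sketch below shows that behavior under some assumptions: it runs on SQLite via Python's sqlite3, uses COALESCE as a stand-in for SQL Server's two-argument ISNULL, and the table contents and the helper name find_ids are made up for the demo.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only to make the example runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo (id INTEGER PRIMARY KEY, Bar INTEGER)")
conn.executemany("INSERT INTO Foo (id, Bar) VALUES (?, ?)",
                 [(1, 5), (2, None), (3, 7)])

# COALESCE(Bar, :param) replaces a NULL Bar with the parameter itself,
# so a NULL row always "equals" any non-NULL parameter.
QUERY = """
SELECT id FROM Foo
WHERE COALESCE(Bar, :param) = :param
ORDER BY id
"""

def find_ids(param):
    """Return ids matched by the COALESCE-based predicate."""
    return [row[0] for row in conn.execute(QUERY, {"param": param})]

ids_for_five = find_ids(5)     # includes the NULL row as well as Bar = 5
ids_for_null = find_ids(None)  # the comparison is NULL for every row
```

So the helper-function form matches the NULL row for a parameter of 5 and matches nothing for a NULL parameter, which is the opposite of what the updated question asks for.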

I don't know what version of SQL you are using, but IS makes no sense in the context you just described. I get a syntax error if I try to use it the way you described. Why would you want to use it over = anyway? The = operator is the common usage and the one software maintainers would expect to find.

What specific database are you using?
If you're doing searches based on null (or not null), using IS is the way to go. I cannot provide a technical reason but I use this syntax all the time.
SELECT * FROM Table WHERE Field IS NULL
SELECT * FROM Table WHERE Field IS NOT NULL

Related

JPA query parameter IN or IS NULL

I have an issue with this simple query :
@Query("SELECT c FROM Cat c WHERE c.id IN (:idCat) OR :idCat IS NULL")
List<Cat> getAllCatWithOrWithoutId(@Param("idCat") List<String> idCat);
It should filter by the ids in a list or, if no id is supplied, select all cats in the table (idCat is optional).
It seems to work with an "=" operator instead of IN, but when I run the query as written I get the error message "invalid relational operator".
Even if I try with a native query.
I tried to replace idCat value by single value (it worked), or by null (it worked too), but not when I put several values.
Is it something wrong in syntax or is it simply impossible with an IN statement?
When you provide a filled list, your second part of the query (OR :idCat IS NULL) is translated to a list of values to compare to "IS NULL".
As stated on JSR-338, chapter 4.6.11, "[a] null comparison expression tests whether or not the single-valued path expression or input parameter is a NULL value", so it is not expected to support a filled list as an argument. To check lists, there is the "IS [NOT] EMPTY" expression, but it expects a collection_valued_path_expression and it is not your case.
I have sometimes run into strange behavior in some persistence providers, which supported this kind of comparison (against the spec) if you surround the parameter with parentheses, but, again, you cannot rely on it in future versions.
The best approach in your case would be to define two different JPQL queries or methods, one for each of your desired scenarios.
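The "two queries" idea can be sketched outside JPA as well. The following is not JPA code: it is a Python sqlite3 analogue, with a hypothetical cat table and a hypothetical get_cats function, that picks one of two statements depending on whether a list was supplied. Treating an empty list as "match nothing" is an assumption of this sketch.

```python
import sqlite3

# Hypothetical table and rows, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cat (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO cat (id, name) VALUES (?, ?)",
                 [("a", "Tom"), ("b", "Felix"), ("c", "Garfield")])

def get_cats(id_list=None):
    """Return cat ids: all of them when id_list is None, else only the listed ones."""
    if id_list is None:
        # Scenario 1: no filter supplied, so use the unfiltered statement.
        sql, args = "SELECT id FROM cat ORDER BY id", []
    elif not id_list:
        # Assumption: an empty list means "match nothing".
        return []
    else:
        # Scenario 2: one placeholder per value, so the list is never
        # compared to NULL as a whole.
        placeholders = ", ".join("?" for _ in id_list)
        sql = f"SELECT id FROM cat WHERE id IN ({placeholders}) ORDER BY id"
        args = list(id_list)
    return [row[0] for row in conn.execute(sql, args)]

all_cats = get_cats()
some_cats = get_cats(["a", "c"])
```

The decision about the optional filter is made in application code, so neither statement needs an "IS NULL" test on a multi-valued parameter.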

SQL isnull error

I'm running a select query that uses isnull, and I've tried it on two servers that are identical to each other. (They both use the same procedure, DLL, and return page; only the IP differs.)
SELECT * FROM
ITEM_TEST
WHERE ITEM_NAME = isnull(#ITEM_TESTE, ITEM_NAME)
The operation is working without problems in one of the servers, returning all options when the #ITEM_TESTE is NULL, while in the other, it returns ONLY the ones that are NOT NULL.
I'm using a sybase-based-application (version 12.5) called SQLdbx (version 3.14)
In case it isn't clear: #ITEM_TESTE is an optional variable supplied by the user, so it can be null, while ITEM_NAME holds a string and is also optional. ITEM_TEST is a table with more than 10 columns; I'm simplifying it here. This search, however, wants all possible results even if ITEM_NAME is unknown, while using other variables to narrow down the search. (I thought about creating a search with an IF condition that excluded ITEM_NAME, and it worked, but it made the search "laggy" due to performance issues.)
EDIT
Changed the names of the variables to make it less confusing (both had the same name) and added an explanation for easier understanding.
Also, due to copyright issues, I can't post the exact code here.
This is your where clause:
WHERE ITEM_TESTE = isnull(#ITEM_TESTE, ITEM_TESTE)
This where clause will never be true when ITEM_TESTE is NULL, because NULL = NULL evaluates to not true in the SQL world.
Presumably, you want:
WHERE (ITEM_TEST = #ITEM_TESTE OR #ITEM_TESTE IS NULL)
The way it was explained to me and has forever stuck after all of these years, is that NULL is not nothing, it is unknown, so you cannot use an equality check to verify two things you know nothing about are equally nothing. IS is checking that they are in the same unknown state, which has nothing to do with a value.
So as the others have said = NULL will never work, because = implies value comparison.
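The two WHERE clauses can be compared side by side. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 rather than Sybase, COALESCE stands in for isnull, and the table contents and the helper name run are invented for the demo. It does not try to explain why the two servers in the question behave differently.

```python
import sqlite3

# Hypothetical sample rows; row 2 has an unknown (NULL) ITEM_NAME.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ITEM_TEST (id INTEGER PRIMARY KEY, ITEM_NAME TEXT)")
conn.executemany("INSERT INTO ITEM_TEST (id, ITEM_NAME) VALUES (?, ?)",
                 [(1, "bolt"), (2, None), (3, "nut")])

# Original shape: falls back to comparing the column with itself,
# which is NULL = NULL (not true) for rows with a NULL ITEM_NAME.
WITH_FALLBACK = """
SELECT id FROM ITEM_TEST
WHERE ITEM_NAME = COALESCE(:p, ITEM_NAME)
ORDER BY id
"""

# Suggested shape: a NULL parameter switches the filter off entirely.
WITH_OR = """
SELECT id FROM ITEM_TEST
WHERE (ITEM_NAME = :p OR :p IS NULL)
ORDER BY id
"""

def run(sql, p):
    """Execute one of the two queries with the optional parameter p."""
    return [row[0] for row in conn.execute(sql, {"p": p})]

fallback_no_filter = run(WITH_FALLBACK, None)  # misses the NULL row
or_no_filter = run(WITH_OR, None)              # returns every row
```

With a non-null parameter such as "bolt", both queries return the same single row; they differ only when the parameter is NULL.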

SQL CASE statements on Informix - Can you set more than one field in the END section of a case block?

Using IBM Informix Dynamic Server Version 10.00.FC9
I'm looking to set multiple field values with one CASE block. Is this possible? Do I have to re-evaluate the same conditions for each field set?
I was thinking of something along these lines:
SELECT CASE WHEN p.id = 9238 THEN ('string',3) END (varchar_field, int_field);
Where the THEN section would define an 'array' of fields similar to the syntax of
INSERT INTO table (field1,field2) values (value1,value2)
Also, can it be done with a CASE block of an UPDATE statement?
UPDATE TABLE SET (field1,field2) = CASE WHEN p.id=9238 THEN (value1,value2) END;
Normally, I'd ask for the version of Informix that you're using, but it probably doesn't matter much this time. The simple answer is 'No'.
A more complex answer might discuss using a row type constructor, but that probably isn't what you want on the output. And, given the foregoing, then the UPDATE isn't going to work (and would require an extra level of parentheses if it was going to).
No, a CASE statement resolves to an expression (see IBM Informix Guide to SQL: Syntax CASE Expressions) and can be used in places where an expression is permitted. An expression is a single value.
from http://en.wikipedia.org/wiki/Expression_%28programming%29
An expression in a programming
language is a combination of explicit
values, constants, variables,
operators, and functions that are
interpreted according to the
particular rules of precedence and of
association for a particular
programming language, which computes
and then produces (returns, in a
stateful environment) another value.
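Since each CASE expression yields a single value, one workaround is to repeat the condition once per column. The sketch below shows that on SQLite through Python's sqlite3, which is an assumption for the demo and not Informix; the table name p_table and its rows are hypothetical, while the column names and the id 9238 follow the question.

```python
import sqlite3

# Hypothetical table mirroring the column names used in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE p_table (id INTEGER PRIMARY KEY, varchar_field TEXT, int_field INTEGER)"
)
conn.executemany("INSERT INTO p_table VALUES (?, ?, ?)",
                 [(9238, "old", 0), (1, "keep", 1)])

# One CASE expression per target column; the ELSE branch leaves
# non-matching rows unchanged by assigning the column to itself.
conn.execute("""
UPDATE p_table
SET varchar_field = CASE WHEN id = 9238 THEN 'string' ELSE varchar_field END,
    int_field     = CASE WHEN id = 9238 THEN 3        ELSE int_field     END
""")

rows_after = conn.execute(
    "SELECT id, varchar_field, int_field FROM p_table ORDER BY id"
).fetchall()
```

When the condition is the same for every column, moving it into a WHERE clause (UPDATE ... SET ... WHERE id = 9238) avoids the repetition altogether.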
Found an easy way to do it located here:
how to have listview row colour to change based on data in the row
The solution was just adding the case statement to my SQL statement. It made my life much easier.

How can I select rows that are null using bound queries in Perl's DBI?

I want to be able to pass something into an SQL query to determine if I want to select only the ones where a certain column is null. If I was just building a query string instead of using bound variables, I'd do something like:
if ($search_undeleted_only)
{
$sqlString .= " AND deleted_on IS NULL";
}
but I want to use bound queries. Would this be the best way?
my $stmt = $dbh->prepare(...
"AND (? = 0 OR deleted_on IS NULL) ");
$stmt->execute($search_undeleted_only);
Yes; a related trick, if you have X potential filters and some of them are optional, is to have the template say " AND ( ?=-1 OR some_field = ? ) ", and to create a special function that wraps the execute call and binds all the second ?s. (In this case, -1 is a special value meaning 'ignore this filter'.)
Update from Paul Tomblin: I edited the answer to include a suggestion from the comments.
So you're relying on short-circuiting semantics of boolean expressions to invoke your IS NULL condition? That seems to work.
One interesting point is that a constant expression like 1 = 0 that did not have parameters should be factored out by the query optimizer. In this case, since the optimizer doesn't know if the expression is a constant true or false until execute time, that means it can't factor it out. It must evaluate the expression for every row.
So one can assume this adds a minor cost to the query, relative to what it would cost if you had used a non-parameterized constant expression.
Then combining with OR with the IS NULL expression may also have implications for the optimizer. It might decide it can't benefit from an index on deleted_on, whereas in a simpler expression it would have. This depends on the RDBMS implementation you're using, and the distribution of values in your database.
I think that's a reasonable approach. It follows the normal filter pattern nicely and should give good performance.
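The bound-flag pattern from the question can be tried out in a few lines. This sketch uses Python's sqlite3 instead of Perl's DBI, which is an assumption made for the demo; the items table, its rows, and the function name search are hypothetical.

```python
import sqlite3

# Hypothetical table: deleted_on is NULL for rows that have not been deleted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, deleted_on TEXT)")
conn.executemany("INSERT INTO items (id, deleted_on) VALUES (?, ?)",
                 [(1, None), (2, "2020-01-01"), (3, None)])

# The statement text never changes; the bound flag decides whether
# the IS NULL condition has any effect.
QUERY = """
SELECT id FROM items
WHERE 1 = 1
  AND (? = 0 OR deleted_on IS NULL)
ORDER BY id
"""

def search(undeleted_only):
    """Return ids, restricted to undeleted rows when undeleted_only is truthy."""
    flag = 1 if undeleted_only else 0
    return [row[0] for row in conn.execute(QUERY, (flag,))]

undeleted = search(True)    # flag = 1, so deleted_on IS NULL must hold
everything = search(False)  # flag = 0, so the first branch is true for every row
```

The result does not depend on short-circuit evaluation: when the flag is 0 the OR is true whichever side is evaluated first.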

SQL Server: Is SELECTing a literal value faster than SELECTing a field? [duplicate]

This question already has answers here:
Subquery using Exists 1 or Exists *
(6 answers)
Closed 7 years ago.
I've seen some people use EXISTS (SELECT 1 FROM ...) rather than EXISTS (SELECT id FROM ...) as an optimization--rather than looking up and returning a value, SQL Server can simply return the literal it was given.
Is SELECT(1) always faster? Would Selecting a value from the table require work that Selecting a literal would avoid?
In SQL Server, it does not make a difference whether you use SELECT 1 or SELECT * within EXISTS. You are not actually returning the contents of the rows, but rather testing whether the set determined by the WHERE clause is non-empty. Try running the queries side by side with SET STATISTICS IO ON and you can prove that the approaches are equivalent. Personally I prefer SELECT * within EXISTS.
For google's sake, I'll update this question with the same answer as this one (Subquery using Exists 1 or Exists *) since (currently) an incorrect answer is marked as accepted. Note the SQL standard actually says that EXISTS via * is identical to a constant.
No. This has been covered a bazillion times. SQL Server is smart and knows it is being used for an EXISTS, and returns NO DATA to the system.
Quoth Microsoft:
http://technet.microsoft.com/en-us/library/ms189259.aspx?ppud=4
The select list of a subquery
introduced by EXISTS almost always
consists of an asterisk (*). There is
no reason to list column names because
you are just testing whether rows that
meet the conditions specified in the
subquery exist.
Also, don't believe me? Try running the following:
SELECT whatever
FROM yourtable
WHERE EXISTS( SELECT 1/0
FROM someothertable
WHERE a_valid_clause )
If it was actually doing something with the SELECT list, it would throw a div by zero error. It doesn't.
EDIT: Note, the SQL Standard actually talks about this.
ANSI SQL 1992 Standard, pg 191 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
3) Case:
a) If the <select list> "*" is simply contained in a <subquery> that is immediately contained in an <exists predicate>, then the <select list> is equivalent to a <value expression> that is an arbitrary <literal>.
When you use SELECT 1, you clearly show (to whoever is reading your code later) that you are testing whether the record exists. Even if there is no performance gain (which is to be discussed), there is gain in code readability and maintainability.
Yes, because when you select a literal it does not need to read from disk (or even from cache).
doesn't matter what you select in an exists clause. most people do select *, then sql server automatically picks the best index
As someone pointed out sql server ignores the column selection list in EXISTS so it doesn't matter. I personally tend to use "SELECT null ..." to indicate that the value is not used at all.
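That the select list inside EXISTS does not affect the result can be checked quickly. The sketch below does so on SQLite through Python's sqlite3, which is an assumption for the demo: it only compares results and says nothing about SQL Server's execution plans or performance. The customers and orders tables and the helper name customers_with_orders are hypothetical.

```python
import sqlite3

# Hypothetical tables: customers 1 and 3 have orders, customer 2 has none.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO customers (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO orders (id, customer_id) VALUES (?, ?)",
                 [(10, 1), (11, 3)])

def customers_with_orders(select_list):
    """Run the EXISTS query with the given select list and return matching ids."""
    # select_list is one of a few fixed strings below, never user input.
    sql = f"""
    SELECT c.id FROM customers c
    WHERE EXISTS (SELECT {select_list} FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.id
    """
    return [row[0] for row in conn.execute(sql)]

# Same rows come back whichever select list the subquery uses.
results = {s: customers_with_orders(s) for s in ("1", "*", "NULL")}
```

All three variants test only whether the subquery produces at least one row, so the choice between them is a matter of readability in this sketch.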
If you look at the execution plan for
select COUNT(1) from master..spt_values
and look at the stream aggregate you will see that it calculates
Scalar Operator(Count(*))
So the 1 actually gets converted to *
However I have read somewhere in the "Inside SQL Server" series of books that * might incur a very slight overhead for checking column permissions. Unfortunately the book didn't go into any more detail than that as I recall.
Select 1 should be better to use in your example. Select * gets all the metadata associated with the objects before runtime, which adds overhead during the compilation of the query, though you may not see differences in your execution plan when running both types of queries.