I am trying to work out how SQL queries are run and have hit a bit of a stumbling block.
If a where clause akin to the below is used:
A OR B AND C
This could mean either of the below
(A OR B) AND C
or
A OR (B AND C)
In the majority of cases the results will be the same, but if only A is true for a given row (B and C are both false), the first variant would exclude that row and the second would return it. SQL does in fact return the row, so it appears to use the second interpretation.
Does anyone know (or have links to) any insight that will help me understand how queries are built?
According to MSDN, the order of precedence, from highest to lowest, is the following:
~ (Bitwise NOT)
* (Multiplication), / (Division), % (Modulo)
+ (Positive), - (Negative), + (Addition), + (Concatenation), - (Subtraction), & (Bitwise AND), ^ (Bitwise Exclusive OR), | (Bitwise OR)
=, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
NOT
AND
ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
= (Assignment)
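To see the AND-before-OR rule from the list above in action, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for SQL Server, since both give AND a higher precedence than OR. The table t and its columns a, b, c are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
# A single row where only A is true (1 = true, 0 = false).
conn.execute("INSERT INTO t VALUES (1, 0, 0)")


def count(where):
    """Return how many rows satisfy the given WHERE clause."""
    return conn.execute(f"SELECT COUNT(*) FROM t WHERE {where}").fetchone()[0]


unparenthesised = count("a = 1 OR b = 1 AND c = 1")
or_first = count("(a = 1 OR b = 1) AND c = 1")
and_first = count("a = 1 OR (b = 1 AND c = 1)")

print(unparenthesised, or_first, and_first)  # prints: 1 0 1
```

The unparenthesised form matches A OR (B AND C), which is what the question observed.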
Knowing (from the documentation) that AND has a higher precedence than OR, you should aim to write predicates for WHERE clauses in conjunctive normal form ("a series of AND clauses").
If the intention is
( A OR B ) AND C
then write it thus and all is good.
However, if the intention is
A OR ( B AND C )
then I suggest you apply the distributive rewrite law that results in conjunctive normal form i.e.
( P AND Q ) OR R <=> ( P OR R ) AND ( Q OR R )
In your case:
A OR ( B AND C ) <=> ( A OR B ) AND ( A OR C )
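If you want to convince yourself that the rewrite is safe, including when NULLs are involved, a brute-force check is easy to write. This is only a sketch: it models SQL's three-valued logic by ordering FALSE < UNKNOWN < TRUE, so that AND behaves like min and OR like max. The helper names are my own.

```python
from itertools import product

# Model of SQL three-valued logic: FALSE < UNKNOWN < TRUE,
# with AND as the minimum and OR as the maximum of its operands.
FALSE, UNKNOWN, TRUE = 0, 1, 2


def sql_and(x, y):
    return min(x, y)


def sql_or(x, y):
    return max(x, y)


def original(a, b, c):
    # A OR ( B AND C )
    return sql_or(a, sql_and(b, c))


def cnf(a, b, c):
    # ( A OR B ) AND ( A OR C )
    return sql_and(sql_or(a, b), sql_or(a, c))


combos = list(product([FALSE, UNKNOWN, TRUE], repeat=3))
equivalent = all(original(*v) == cnf(*v) for v in combos)
print(len(combos), equivalent)  # prints: 27 True
```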
AND and OR have different precedence.
See Precedence Level
For SQL Server (which is your tag), the precedence is documented at http://msdn.microsoft.com/en-us/library/ms190276.aspx but..
If you're worried about the exact result set, you should indeed group the conditions explicitly with parentheses.
Related
So I have inherited some SQL code that leaves me a bit uneasy:
FROM table AS x LEFT JOIN table AS y ON y.date_1 <= x.date AND y.date_2 >= x.date
And I am more accustomed to something like:
FROM table AS x LEFT JOIN table AS y ON x.date BETWEEN y.date_1 AND y.date_2
I didn't see a difference in the execution plan. Is one more preferred or optimal compared to the other?
In my opinion, you are right that there is no difference. The two forms differ only in whether you use comparison operators or BETWEEN: comparison operators compare two values, while BETWEEN tests whether a value falls within a given range (inclusive of both endpoints).
For your example, BETWEEN is preferable since it is shorter and less complex.
I also found an answered question here.
Note: the inequality operator (!=) is also one of the comparison operators.
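A quick way to check the equivalence yourself is to run both join conditions against the same data. The sketch below assumes SQLite via Python's sqlite3 module, uses two invented tables (events and ranges) instead of the self-join from the question, and stores dates as ISO strings so that they compare correctly as text.

```python
import sqlite3

# Assumption: SQLite as the engine; tables, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, d TEXT)")
conn.execute("CREATE TABLE ranges (id INTEGER, date_1 TEXT, date_2 TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2020-01-01"), (2, "2020-01-15"), (3, "2020-01-31"), (4, "2020-03-01")],
)
conn.executemany(
    "INSERT INTO ranges VALUES (?, ?, ?)",
    [(10, "2020-01-01", "2020-01-31"), (11, "2020-01-10", "2020-01-20")],
)

# Same LEFT JOIN, two spellings of the range condition.
comparison = conn.execute(
    "SELECT x.id, y.id FROM events AS x LEFT JOIN ranges AS y "
    "ON y.date_1 <= x.d AND y.date_2 >= x.d ORDER BY x.id, y.id"
).fetchall()
between = conn.execute(
    "SELECT x.id, y.id FROM events AS x LEFT JOIN ranges AS y "
    "ON x.d BETWEEN y.date_1 AND y.date_2 ORDER BY x.id, y.id"
).fetchall()

print(comparison == between)  # prints: True
print(between)
```

Both endpoints are matched in either spelling, and the event with no matching range still appears once with a NULL on the right-hand side.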
I have 2 tables A_ORDERS and A_ORDERS_LOG. Columns of interest in table A_ORDERS are PEN and SKL, and Columns in table A_ORDERS_LOG that are of interest are Status1 and Status2 as well as DATE (just for ordering my result). PEN is unique.
This is what I have
select
l.pen,status1,status2,date,a.skl
from A_ORDERS_LOG l
join A_ORDERS a on l.PEN=a.PEN
where a.skl='XY'
and status1='75' or status1='13'
and status2='13' or status2='11'
order by l.id,l.date;
I get all the rows for all the other skl's there are in the table A_ORDERS, but I only want rows regarding skl=XY from A_ORDERS. The thing is that PENs travel through different status and I want all the rows for each PEN where it went through possible status from 75 to 13 and then from 13 to 11.
When I put it like this I get what I want.
and (status1='75' or status1='13')
and (status2='13' or status2='11');
Why is that?
Thanks in advance :)
This happens because, when an expression contains multiple operators, operator precedence determines the order in which they are evaluated. That order can significantly affect the result.
The AND operator has higher precedence than the OR operator.
You want the OR conditions (on status1 and on status2) to be evaluated first. When you put them in parentheses, each OR expression is evaluated before the AND. Without the parentheses, the AND expressions are evaluated first.
Your query:
select l.pen,
status1,
status2,
date,
a.skl
from A_ORDERS_LOG l
join A_ORDERS a
on l.PEN=a.PEN
where a.skl='XY'
and status1='75'
or status1='13'
and status2='13'
or status2='11'
order by l.id,l.date;
AND has a higher operator precedence than OR, so your query is interpreted as:
where ( a.skl='XY' and status1='75' )
or ( status1='13' and status2='13' )
or status2='11'
What you want is to use brackets to enforce the precedence you require:
where a.skl='XY'
and ( status1='75' or status1='13' )
and ( status2='13' or status2='11' )
or you could use IN (and get rid of the ORs):
where a.skl='XY'
and status1 IN ( '75', '13' )
and status2 IN ( '13', '11' )
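Here is a small sketch that runs the three WHERE clauses side by side. It assumes SQLite via Python's sqlite3 module rather than your actual database, the sample rows are invented, and the DATE column is left out to keep it short.

```python
import sqlite3

# Assumption: SQLite as a stand-in engine; all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A_ORDERS (PEN INTEGER PRIMARY KEY, SKL TEXT)")
conn.execute(
    "CREATE TABLE A_ORDERS_LOG (id INTEGER, PEN INTEGER, status1 TEXT, status2 TEXT)"
)
conn.executemany("INSERT INTO A_ORDERS VALUES (?, ?)", [(1, "XY"), (2, "ZZ")])
conn.executemany(
    "INSERT INTO A_ORDERS_LOG VALUES (?, ?, ?, ?)",
    [
        (1, 1, "75", "13"),  # XY, wanted
        (2, 1, "13", "11"),  # XY, wanted
        (3, 2, "13", "13"),  # ZZ, not wanted
        (4, 2, "50", "11"),  # ZZ, not wanted
        (5, 1, "10", "20"),  # XY, but other statuses
    ],
)


def log_ids(where):
    """Return the log ids selected by the given WHERE clause."""
    sql = (
        "SELECT l.id FROM A_ORDERS_LOG l JOIN A_ORDERS a ON l.PEN = a.PEN "
        f"WHERE {where} ORDER BY l.id"
    )
    return [row[0] for row in conn.execute(sql)]


no_parens = log_ids(
    "a.skl='XY' and status1='75' or status1='13' and status2='13' or status2='11'"
)
with_parens = log_ids(
    "a.skl='XY' and (status1='75' or status1='13') and (status2='13' or status2='11')"
)
with_in = log_ids(
    "a.skl='XY' and status1 IN ('75', '13') and status2 IN ('13', '11')"
)

print(no_parens)    # prints: [1, 2, 3, 4]
print(with_parens)  # prints: [1, 2]
print(with_in)      # prints: [1, 2]
```

Without parentheses the ZZ rows leak in through the trailing OR branches, which matches the behaviour described in the question.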
This is because of the operator precedence. It's like in math. In
2 + 3 * 4 + 5
* has precedence over +, i.e. the above expression is equivalent to
2 + (3 * 4) + 5
If you want to perform the addition first, you must use parentheses
(2 + 3) * (4 + 5)
In SQL AND has precedence over OR. Therefore
a.skl='XY' AND status1='75' OR status1='13' AND status2='13' OR status2='11'
is equivalent to
(a.skl='XY' AND status1='75') OR (status1='13' AND status2='13') OR status2='11'
You want
a.skl='XY' AND (status1='75' OR status1='13') AND (status2='13' OR status2='11')
and must therefore change the precedence with parentheses. Parentheses always have the highest precedence.
See: Operator precedence rules (Oracle Docs).
A colleague of mine who is generally well-versed in SQL told me that the order of operands in a > or = expression could determine whether or not the expression was sargable. In particular, I had a query whose CASE expression included:
CASE
when (select count(i.id)
from inventory i
inner join orders o on o.idinventory = i.idInventory
where o.idOrder = #order) > 1 THEN 2
ELSE 1
I was told to reverse the order of the operands, giving the equivalent
CASE
when 1 < (select count(i.id)
from inventory i
inner join orders o on o.idinventory = i.idInventory
where o.idOrder = #order) THEN 2
ELSE 1
for sargability concerns. I found no difference in query plans, though ultimately I made the change for the sake of sticking to team coding standards. Is what my co-worker said true in some cases? Does the order of operands in an expression have potential impact on its execution time? This doesn't mesh with how I understand sargability to work.
For Postgres, the answer is definitely: "No." (sql-server was added later.)
The query planner can flip around left and right operands of an operator as long as a COMMUTATOR is defined, which is the case for all instances of < and >. (Operators are actually defined by the operator itself and their accepted operands.) And the query planner will do so to make an expression "sargable". Related answer with detailed explanation:
Can PostgreSQL index array columns?
It's different for other operators without COMMUTATOR. Example for ~~ (LIKE):
LATERAL JOIN not using trigram index
If you're talking about the most popular modern databases like Microsoft SQL Server, Oracle, Postgres, MySQL and Teradata, the answer is definitely NO.
What is a SARGable query?
A SARGable (Search ARGument-able) query is one whose predicates let the database use an index to narrow down the number of rows it has to process in order to return the expected result. For example:
Consider this query:
select * from table where column1 <> 'some_value';
Obviously, using an index in this case is useless, because a database most certainly would have to look through all rows in a table to give you expected rows.
But what if we change the operator?
select * from table where column1 = 'some_value';
In this case an index can give good performance and return expected rows almost in a flash.
SARGable operators are: =, <, >, <=, >=, LIKE (without a leading %), BETWEEN
Operators that are often non-SARGable are: <>, IN, OR (although many optimizers can still use an index for IN and OR)
Now, back to your case.
Your problem is simple. You have X and you have Y. X > Y or Y < X - in both cases you have to determine the values of both variables, so this switching gives you nothing.
P.S. Of course, I concede, there could be databases with very poor optimizers where this kind of switching could play a role. But, as I said before, in modern databases you should not worry about it.
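If you want to check this on your own system, compare the plans for both operand orders. The sketch below does that with SQLite through Python's sqlite3 module, which is an assumption on my part (it is not one of the databases named above), and the table t and index idx_t_x are hypothetical. On other engines you would use their own EXPLAIN facility instead.

```python
import sqlite3

# Assumption: SQLite as the engine; table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("CREATE INDEX idx_t_x ON t (x)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


column_left = "SELECT x FROM t WHERE x > 90"
column_right = "SELECT x FROM t WHERE 90 < x"

# Print both plans so they can be compared by eye as well.
print(plan(column_left))
print(plan(column_right))

same_plan = plan(column_left) == plan(column_right)
same_rows = sorted(conn.execute(column_left).fetchall()) == sorted(
    conn.execute(column_right).fetchall()
)
print(same_plan, same_rows)
```

The exact plan text depends on the SQLite version, so the sketch only compares the two plans with each other rather than against a fixed string.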
I would like to write SQL code (using PROC SQL in SAS) using the logical expression
A and (B or C)
where A is of the form (A_1 AND A_2 AND A_3 ... A_n); in other words, it is long.
Since the AND operator is evaluated first in SQL code, I can't write it as
A AND (B or C)
because the parentheses do not have any effect, and I would get A AND B OR C.
My question is: do I have to write it as
(A and B) or (A and C)
This would require writing the long (logical) expression A two times.
First, this should work if B and C have only one clause:
A and (B or C)
But, we are not currently suffering a shortage of parentheses in the world, so you can use more:
( A ) and ( ( B ) or ( C ) )
Just wrap each logic condition (no matter how long) in parentheses.
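To show that the parentheses really are honoured, here is a minimal sketch that evaluates the three forms over every true/false combination. It assumes SQLite via Python's sqlite3 module rather than SAS PROC SQL, and the table t with columns a, b, c is made up.

```python
import sqlite3
from itertools import product

# Assumption: SQLite instead of SAS PROC SQL; table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
# One row per true/false combination of a, b and c (8 rows).
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", list(product([0, 1], repeat=3)))


def rows(where):
    """Return the rows matching the WHERE clause, in a stable order."""
    sql = f"SELECT a, b, c FROM t WHERE {where} ORDER BY a, b, c"
    return conn.execute(sql).fetchall()


grouped = rows("a = 1 AND (b = 1 OR c = 1)")
expanded = rows("(a = 1 AND b = 1) OR (a = 1 AND c = 1)")
ungrouped = rows("a = 1 AND b = 1 OR c = 1")

print(grouped == expanded)   # prints: True
print(grouped == ungrouped)  # prints: False
print(len(grouped), len(ungrouped))  # prints: 3 5
```

So writing A only once inside A and (B or C) gives the same rows as the expanded form, and the version without parentheses is the one that differs.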
I’ve read that the logical operator AND has a higher order of precedence than the logical operator IN, but that doesn’t make sense: if that were true, wouldn’t the AND condition in the following statement get evaluated before the IN condition (and thus before the IN operator could check whether the Released field equals any of the values specified within the parentheses)?
SELECT Song, Released, Rating
FROM Songs
WHERE
Released IN (1967, 1977, 1987)
AND
SongName = 'WTTJ'
thanx
EDIT:
Egrunin and ig0774, I’ve checked it and unless I totally misunderstood your posts, it seems that
WHERE x > 0 AND x < 10 OR special_case = 1
is indeed the same as
WHERE (x > 0 AND x < 10) OR special_case = 1
Namely, I ran the following three queries
SELECT *
FROM Songs
WHERE AvailableOnCD='N' AND Released > 2000 OR Released = 1989
SELECT *
FROM Songs
WHERE (AvailableOnCD='N' AND Released > 2000) OR Released = 1989
SELECT *
FROM Songs
WHERE AvailableOnCD='N' AND (Released > 2000 OR Released = 1989)
and as it turns out the following two queries produce the same result:
SELECT *
FROM Songs
WHERE AvailableOnCD='N' AND Released > 2000 OR Released = 1989
SELECT *
FROM Songs
WHERE (AvailableOnCD='N' AND Released > 2000) OR Released = 1989
while
SELECT *
FROM Songs
WHERE AvailableOnCD='N' AND (Released > 2000 OR Released = 1989)
gives a different result
I'm going to assume you're using SQL Server, as in SQL Server AND has a higher order of precedence than IN. So, yes, the AND is evaluated first, but the rule for evaluating AND, is to check the expression on the left (in your sample, the IN part) and, if that is true, the expression on the right. In short, the AND clause is evaluated first, but the IN clause is evaluated as part of the AND evaluation.
It may be simpler to understand the order of precedence here as referring to how the statement is parsed, rather than how it is executed (even if MS's documentation equivocates on this).
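As a rough illustration of the parsing view: whichever way the documentation ranks IN and AND, the list after IN belongs to the IN expression, so the unparenthesised query and the explicitly grouped one select the same rows. The sketch assumes SQLite via Python's sqlite3 module instead of SQL Server, and the Songs rows are invented.

```python
import sqlite3

# Assumption: SQLite as a stand-in for SQL Server; sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Songs (SongName TEXT, Released INTEGER)")
conn.executemany(
    "INSERT INTO Songs VALUES (?, ?)",
    [("WTTJ", 1987), ("WTTJ", 1990), ("Other", 1977), ("Other", 2001)],
)


def select(where):
    """Return the rows matching the WHERE clause, in a stable order."""
    sql = f"SELECT SongName, Released FROM Songs WHERE {where} ORDER BY Released"
    return conn.execute(sql).fetchall()


as_written = select("Released IN (1967, 1977, 1987) AND SongName = 'WTTJ'")
explicit = select("(Released IN (1967, 1977, 1987)) AND (SongName = 'WTTJ')")

print(as_written)              # prints: [('WTTJ', 1987)]
print(as_written == explicit)  # prints: True
```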
Edit in response to comment from the OP:
I suspect that classifying IN as a logical operator may be specific to SQL Server. I've never read the ISO standard, but I would note that the MySQL and Oracle docs define IN as a comparison operator, Postgres as a subquery expression, and Sybase itself as a "list operator". In my view, Sybase is the nearest to the mark here since the expression a IN (...) asks whether the value of attribute a is an element of the list of items between the parentheses.
That said, I might imagine the reason that SQL Server chose to classify IN as a logical operator is two-fold:
IN and the like do not have the type restrictions of the SQL Server comparison operators: =, !=, etc. cannot apply to text, ntext or image types, whereas IN and other subset operators can be used against any type (except, in strict ISO SQL, NULL).
The result of an IN, etc. operation is a boolean value, just like the other "logical operators".
Again, to my mind, this is not a sensible classification, but it is what Microsoft chose. Maybe someone else has further insight into why they may have so decided?
Call me a n00b, but I always use parentheses in nontrivial compound conditions.
SELECT Song, Released, Rating
FROM Songs
WHERE
(Released IN (1967, 1977, 1987))
AND
SongName = 'WTTJ'
Edited (Corrected, the point remains the same.)
Just yesterday I got caught by this. Started with working code:
WHERE x < 0 or x > 10
Changed it in haste:
WHERE x < 0 or x > 10 AND special_case = 1
Broke, because this is what I wanted:
WHERE (x < 0 or x > 10) AND special_case = 1
But this is what I got:
WHERE x < 0 or (x > 10 AND special_case = 1)
In MySQL at least, AND has a lower precedence than IN. See http://dev.mysql.com/doc/refman/5.0/en/operator-precedence.html
I think of IN as a comparison operator, whereas AND is a logical operator. So it's a bit apples and oranges, since the comparison operator must be evaluated first to see if the condition is true, then the logical operator is used to combine the conditions.