SQL Parentheses use in an OR clause

I was wondering whether anyone knows why we use the parentheses in this SQL.
The exercise reads as follows:
Name, location and department of the service of the employees whose name starts with A or B. (A rough translation from French.)
I answered the following way:
SELECT service.nom_serv, localite.ville, localite.departemen
FROM service, localite, employe
WHERE service.code_loc=localite.code_loc
AND employe.service=service.code_serv
AND ((employe.nom LIKE 'A%') OR (employe.nom LIKE 'B%'))
Basically, for the last AND in the WHERE clause, couldn't I simply do without the parentheses and still have the SQL select employees whose name starts with either an A or a B? What difference does positioning the parentheses that way make? And why is there a double use of parentheses? Or is it to prioritize the OR in the last clause, since an AND precedes it?

Take a look at Operator Precedence in SQL Server (you've not specified which RDBMS, but I'd imagine it's the same for all of them). What this means is that, without parentheses, ANDs bind more tightly than ORs. [1]
So in your specific case, without the parentheses, the conditions are:
employe.service=service.code_serv AND employe.nom LIKE 'A%'
OR
employe.nom LIKE 'B%'
[1] "Bind more tightly" rather than "are evaluated before": evaluation order is deliberately not specified in SQL, allowing many more possible re-orderings than languages that guarantee left-to-right or precedence-ordered evaluation.

You use it to specify grouping of the clause, not priority. SQL does not allow you to specify priority as the optimizer will create the best priority for you.
AND ()
Will take both of the OR conditions in one statement. So if either is true then the AND is true as well. The inner parentheses are not necessary, but help in visualizing the separation.
Without the outer parentheses it would allow anything with the final clause as true as well.

There are extra parentheses. The rule in math is to add parentheses to clarify the logic. In this case, if you remove all of the parentheses you'll get the wrong answer. What you have is a AND ((b) OR (c)). Removing all of the parentheses would take it from (a AND b) OR (a AND c) to (a AND b) OR c, which is incorrect.
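To see the difference concretely, here is a minimal sketch using Python's built-in sqlite3 module. The tables are a cut-down, hypothetical version of the question's schema (the localite table is left out, and the sample rows are invented); the only point is to compare the two WHERE clauses.

```python
import sqlite3

# Hypothetical miniature of the question's schema; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE service (code_serv INTEGER, nom_serv TEXT);
    CREATE TABLE employe (nom TEXT, service INTEGER);
    INSERT INTO service VALUES (1, 'Ventes'), (2, 'Achats');
    INSERT INTO employe VALUES ('Alice', 1), ('Bernard', 2), ('Claude', 1);
""")

# OR grouped in parentheses: the join condition applies to both name tests.
with_parens = conn.execute("""
    SELECT employe.nom, service.nom_serv
    FROM service, employe
    WHERE employe.service = service.code_serv
      AND (employe.nom LIKE 'A%' OR employe.nom LIKE 'B%')
    ORDER BY employe.nom, service.nom_serv
""").fetchall()

# No parentheses: parsed as (join AND name LIKE 'A%') OR name LIKE 'B%',
# so every 'B' employee is paired with every service row.
without_parens = conn.execute("""
    SELECT employe.nom, service.nom_serv
    FROM service, employe
    WHERE employe.service = service.code_serv
      AND employe.nom LIKE 'A%'
       OR employe.nom LIKE 'B%'
    ORDER BY employe.nom, service.nom_serv
""").fetchall()

print(with_parens)
print(without_parens)
```

With this sample data the unparenthesized query returns an extra row pairing Bernard with a service he does not belong to, because the join condition no longer guards the 'B%' branch.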

Related

And/or SQL statement

I'm struggling to get a SQL statement to run.
I need to have an and / or statement which gives me:
Where Condition 1 is true
OR
Where both Condition 2 AND Condition 3 are true. (not only one of them)
Appreciate some ideas :)
You can group them with parentheses; you also only need to write 'WHERE' once. Example:
WHERE
{condition_1} or ({condition_2} and {condition_3})
Edit: You don't technically require the parentheses, because AND has a higher precedence than OR, but they make it much easier to read and see at a glance exactly what you're trying to do.
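The precedence claim can be checked exhaustively. The sketch below (Python with the standard sqlite3 module, so it reflects SQLite's parser; other engines are assumed to follow the same AND-before-OR rule) evaluates all eight combinations of three boolean conditions.

```python
import itertools
import sqlite3

conn = sqlite3.connect(":memory:")

rows = []
for c1, c2, c3 in itertools.product([0, 1], repeat=3):
    # bare:    c1 OR c2 AND c3      (no parentheses)
    # grouped: c1 OR (c2 AND c3)    (the form recommended above)
    # other:   (c1 OR c2) AND c3    (the grouping you do NOT get by default)
    bare, grouped, other = conn.execute(
        "SELECT ? OR ? AND ?, ? OR (? AND ?), (? OR ?) AND ?",
        (c1, c2, c3) * 3,
    ).fetchone()
    rows.append((c1, c2, c3, bare, grouped, other))

# The bare form always agrees with the explicitly grouped form.
always_same = all(bare == grouped for _, _, _, bare, grouped, _ in rows)

# Inputs where the bare form differs from (c1 OR c2) AND c3.
differs = [(c1, c2, c3) for c1, c2, c3, bare, _, other in rows if bare != other]

print(always_same, differs)
```

The bare and grouped forms agree on every input, so the parentheses there are purely for the reader; the other grouping differs whenever Condition 1 is true and Condition 3 is false.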

Decode not working

I can't seem to figure out why this won't work - can someone please help? This is part of a larger query, so I don't want to have to update the one that already exists - just wanna add to it -
SELECT INNERPART.*,
SUBSTR(status_remday, 1,1) AS COMPLETE,
-- this line shows if it is completed or not
DECODE(SUBSTR(status_remday, 1,1),'Y','Complete','N','Incomplete', null) AS qualCompleted,
-- need this to show if the curriculum is complete or not, in its own row. will eventually have about 10 or more qual_ids
decode(INNERPART.qualID,'ENG_CURR_SAFETY CERT', qualCompleted) as SAFETY
FROM (Innerpart)
The problem is that the SQL syntax (the Oracle dialect, anyway) doesn't allow you to define an alias in a SELECT clause and then reference the same alias in the same SELECT clause (even if it's later in the clause).
You define qualCompleted as a DECODE, and then you reference qualCompleted in the second DECODE. That won't work.
If you don't want to define qualCompleted at one level and then wrap everything within an outer SELECT where you can reference that name, your other option is to use the first DECODE, as is (not by alias) in the second DECODE.
This:
decode(INNERPART.qualID,'ENG_CURR_SAFETY CERT', qualCompleted) as SAFETY
should instead be written as
decode(INNERPART.qualID,'ENG_CURR_SAFETY CERT',
DECODE(SUBSTR(status_remday, 1,1),'Y','Complete','N','Incomplete', null) )
as SAFETY
One more thing: by default, DECODE returns null if the first parameter is not matched in DECODE. So you don't actually need to give the last parameter (null) in your definition of qualCompleted.
EDIT: here is what the Oracle documentation says about column aliases.
Link: https://docs.oracle.com/database/121/SQLRF/statements_10002.htm#i2080424
c_alias Specify an alias for the column expression. Oracle Database will use this alias in the column heading of the result set.
The AS keyword is optional. The alias effectively renames the select
list item for the duration of the query. The alias can be used in
the order_by_clause but not other clauses in the query.
This means a few things. An alias like the qualCompleted you created cannot be used in the same query in the WHERE clause, GROUP BY, etc. - and not even in the SELECT clause where it was created. It can ONLY be used in the ORDER BY clause of the same query. Any other use must be in a surrounding, "outer" query. It also does mean, though, that you can use it in ORDER BY, if needed.
In your case, if you ONLY created qualCompleted so that you can reference it in another DECODE, and had no other use for it, then you don't even need to define it at all (since it doesn't help anyway); just define SAFETY directly as a nested call to DECODE.
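For the other option mentioned above (define the alias at one level and reference it from an outer SELECT), here is a rough sketch run on SQLite through Python's sqlite3 module. SQLite has no DECODE, so a simple CASE expression stands in for it; the innerpart table and its three sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the question's inner query result.
    CREATE TABLE innerpart (qualID TEXT, status_remday TEXT);
    INSERT INTO innerpart VALUES
        ('ENG_CURR_SAFETY CERT', 'Y10'),
        ('ENG_CURR_SAFETY CERT', 'N03'),
        ('SOME_OTHER_QUAL',      'Y01');
""")

rows = conn.execute("""
    SELECT q.qualID,
           q.qualCompleted,
           -- The alias is visible here because it was defined one level down.
           CASE q.qualID
                WHEN 'ENG_CURR_SAFETY CERT' THEN q.qualCompleted
           END AS SAFETY
    FROM (
        SELECT qualID,
               status_remday,
               -- CASE used in place of Oracle's DECODE; no ELSE means NULL.
               CASE SUBSTR(status_remday, 1, 1)
                    WHEN 'Y' THEN 'Complete'
                    WHEN 'N' THEN 'Incomplete'
               END AS qualCompleted
        FROM innerpart
    ) q
    ORDER BY q.status_remday
""").fetchall()

for row in rows:
    print(row)
```

The inner query names the expression once, and the outer query can then reuse qualCompleted freely; rows for other qualifications get NULL in SAFETY, matching DECODE's default when nothing matches.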

Why is SQL strict about clause order?

This works:
SELECT * FROM users ORDER BY id LIMIT 5
This doesn't - throws a syntax error:
SELECT * FROM users LIMIT 5 ORDER BY id
SQL seems to be too strict about clause order.
Does it have a good reason to be that strict?
P.S. SELECT and FROM specify the source of the data and I agree that this should have a specific position in the query. The other clauses, though, just "play" with that data - they have a relationship with the source of the data, but not with each other so the fact that they should be ordered in a particular way doesn't seem very intuitive to me.
Hugh Darwen theorizes that it was fashionable for languages to be this way in the 1960s:
Do you take SELECT-FROM-WHERE for
granted, or do you, like me, find it
rather curious that the System R team
should have spurned the normal way of
writing expressions of arbitrary
complexity in favour of something
utterly idiosyncratic and, one might
say, rather dictatorial...?
The fact is that in the 1960s various
scripting languages (as we tend to
call such things these days) had come
about for the purposes of report
generation, especially ad hoc report
generation. We had one such language
in the prerelational DBMS called
Terminal Business System (TBS) that I
worked on for IBM from 1969-77. Our
language required the user to specify
the required report in a series of
steps that had to be given in the
prescribed order...
A somewhat similar but much more
sophisticated report generator was
later developed by IBM in the US, as
part of a product called (prosaically,
as was IBM's style in those days)
Generalized Information System
(GIS)... when I first looked at SQL,
my immediate reaction was "Oh no!
Son of GIS? Please not that!" I
might have been quite wrong about
this. The similarity I perceived
might have been illusory and even if
it was not, I have no firm evidence
that anybody in the System R team was
familiar with GIS. The fact remains
that the general style of a fixed
order of actions was the order of the
day at the time. I postulate that
SQL's SELECT-FROM-WHERE arose out of
this fashion.
From HAVING a Blunderful Time
When you write in English or another language, you also follow a specific grammar.
You never put the verb at the end of the sentence in English, but you do in German.
In SQL it's the same: there is a syntax you have to respect.
You also don't write a query like this: FROM user SELECT * ORDER BY xy WHERE a=b
Because this is MySQL's syntax:
http://dev.mysql.com/doc/refman/5.0/en/select.html
The parser is built for speed. If you allow inconsistent syntax, the parser takes more time to figure out what needs to be done. This can have a significant impact on performance.
Also, it makes it easier for humans to read.
LIMIT aside (it's TOP in SQL Server, for example), there is a logical processing order for SQL statements, and the syntax order largely reflects it:
FROM
WHERE
GROUP BY
CUBE/ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP/LIMIT
If you substitute LIMIT with TOP then it is quite obvious.
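The two queries from the question can be tried against SQLite with Python's sqlite3 module. This is only a sketch: the users table and its ids are made up, and the exact error text is SQLite's, not MySQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER)")
# Insert ids out of order so that sorting visibly matters.
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [(i,) for i in (9, 3, 7, 1, 8, 2, 6)],
)

# Accepted clause order: rows are sorted first, then the first five are kept.
ordered_then_limited = [
    row[0] for row in conn.execute("SELECT id FROM users ORDER BY id LIMIT 5")
]

# Swapped clause order: rejected by the parser before any row is read.
try:
    conn.execute("SELECT id FROM users LIMIT 5 ORDER BY id")
    swapped_error = None
except sqlite3.OperationalError as exc:
    swapped_error = str(exc)

print(ordered_then_limited)
print(swapped_error)
```

The accepted query returns the five smallest ids, which shows the limit being applied to the already sorted rows; the swapped query never runs at all.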
First and foremost, let me tell you:
"Every language has its set of rules & regulations."
We need to follow these rules. Breaking any of the rules (e.g. the syntax) of any command will lead to an error.
Use the ORDER BY clause to specify the order in which cells on the left-hand side of the rule are to be evaluated. The expr must resolve to a dimension or measure column. If the ORDER BY clause is not specified, the order defaults to the order of the columns as specified in the DIMENSION BY clause.
Check out here for more :
Order By Clause (a)
Order By Clause (b)

Can scalar functions be applied before filtering when executing a SQL Statement?

I suppose I have always naively assumed that scalar functions in the select part of a SQL query will only get applied to the rows that meet all the criteria of the where clause.
Today I was debugging some code from a vendor and had that assumption challenged. The only reason I can think of for this code failing is that the Substring() function is being called on data that should have been filtered out by the WHERE clause. It appears that the substring call is applied before the filtering happens, so the query fails.
Here is an example of what I mean. Let's say we have two tables, each with 2 columns and having 2 rows and 1 row respectively. The first column in each is just an id. NAME is just a string, and NAME_LENGTH tells us how many characters are in the name with the same ID. Note that only names with more than one character have a corresponding row in the LONG_NAMES table.
NAMES: ID, NAME
1, "Peter"
2, "X"
LONG_NAMES: ID, NAME_LENGTH
1, 5
If I want a query to print each name with the last 3 letters cut off, I might first try something like this (assuming SQL Server syntax for now):
SELECT substring(NAME,1,len(NAME)-3)
FROM NAMES;
I would soon find out that this would give me an error, because when it reaches "X" it will try using a negative number for the length in the substring call, and it will fail.
The way my vendor decided to solve this was by filtering out rows where the strings were too short for the len - 3 query to work. He did it by joining to another table:
SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3)
FROM NAMES
INNER JOIN LONG_NAMES
ON NAMES.ID = LONG_NAMES.ID;
At first glance, this query looks like it might work. The join condition will eliminate any rows that have NAME fields short enough for the substring call to fail.
However, from what I can observe, SQL Server will sometimes try to calculate the substring expression for everything in the table, and then apply the join to filter out rows. Is this supposed to happen this way? Is there a documented order of operations where I can find out when certain things will happen? Is it specific to a particular database engine or part of the SQL standard? If I decided to include some predicate on my NAMES table to filter out short names (like len(NAME) > 3), could SQL Server also choose to apply that after trying to apply the substring? If so, then it seems the only safe way to do a substring would be to wrap it in a "case when" construct in the select?
Martin gave this link that pretty much explains what is going on: the query optimizer has free rein to reorder things however it likes. I am including this as an answer so I can accept something. Martin, if you create an answer with your link in it I will gladly accept that instead of this one.
I do want to leave my question here because I think it is a tricky one to search for, and my particular phrasing of the issue may be easier for someone else to find in the future.
TSQL divide by zero encountered despite no columns containing 0
EDIT: As more responses have come in, I am again confused. It does not seem clear yet when exactly the optimizer is allowed to evaluate things in the select clause. I guess I'll have to go find the SQL standard myself and see if I can make sense of it.
Joe Celko, who helped write early SQL standards, has posted something similar to this several times in various USENET newsgroups. (I'm skipping over the clauses that don't apply to your SELECT statement.) He usually said something like "This is how statements are supposed to act like they work". In other words, SQL implementations should behave exactly as if they did these steps, without actually being required to do each of these steps.
1. Build a working table from all of the table constructors in the FROM clause.
2. Remove from the working table those rows that do not satisfy the WHERE clause.
3. Construct the expressions in the SELECT clause against the working table.
So, following this, no SQL dbms should act like it evaluates functions in the SELECT clause before it acts like it applies the WHERE clause.
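The three conceptual steps can be modelled in a few lines of plain Python, using the NAMES and LONG_NAMES rows from the question. This is only an "as if" model, not how any engine is implemented, and the substring helper is a hypothetical stand-in that rejects negative lengths the way the question describes.

```python
# Sample rows from the question.
names = [(1, "Peter"), (2, "X")]   # (ID, NAME)
long_names = [(1, 5)]              # (ID, NAME_LENGTH)


def substring(text, start, length):
    """Hypothetical stand-in for a substring function that rejects negative lengths."""
    if length < 0:
        raise ValueError("invalid length passed to substring")
    return text[start - 1:start - 1 + length]


# Step 1: build the working table from the FROM clause (inner join on ID).
working = [
    (n_id, name, length)
    for (n_id, name) in names
    for (ln_id, length) in long_names
    if n_id == ln_id
]

# Step 2: apply the WHERE clause (the vendor's query has none, so keep all rows).
filtered = list(working)

# Step 3: only now evaluate the SELECT expressions, against the surviving rows.
result = [substring(name, 1, len(name) - 3) for (_, name, _) in filtered]

# For contrast: evaluating the expression over NAMES before the join fails on "X".
try:
    eager = [substring(name, 1, len(name) - 3) for (_, name) in names]
    eager_error = None
except ValueError as exc:
    eager_error = str(exc)

print(result, eager_error)
```

Following the steps in order, the expression never sees "X"; the failure described in the question corresponds to the contrast case, where the expression is computed before the join has removed the short name.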
In a recent posting, Joe expands the steps to include CTEs.
CJ Date and Hugh Darwen say essentially the same thing in chapter 11 ("Table Expressions") of their book A Guide to the SQL Standard. They also note that this chapter corresponds to the "Query Specification" section (sections?) in the SQL standards.
You are thinking about something called the query execution plan. It's based on query optimization rules, indexes, temporary buffers and execution-time statistics. If you are using SQL Server Management Studio, the toolbar above the query editor lets you display the estimated execution plan, which shows how your query will be rearranged to gain some speed. So if you have just used your NAMES table and it is in the buffer, the engine might first process that data and then join it with the other table.
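The "case when" guard raised in the question can be sketched as follows. It is run on SQLite through Python's sqlite3 module, so it uses SQLite's length and substr rather than T-SQL's len and substring; also note that SQLite does not raise an error for a negative length, so this only illustrates the shape of the guard, not the original failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NAMES (ID INTEGER, NAME TEXT);
    INSERT INTO NAMES VALUES (1, 'Peter'), (2, 'X');
""")

# The length check lives inside the expression itself, so the result does not
# depend on when the engine chooses to apply joins or WHERE predicates.
guarded = conn.execute("""
    SELECT ID,
           CASE WHEN length(NAME) > 3
                THEN substr(NAME, 1, length(NAME) - 3)
           END AS TRIMMED
    FROM NAMES
    ORDER BY ID
""").fetchall()

print(guarded)
```

Names that are too short simply come back as NULL, and the caller can filter them out afterwards if they are not wanted.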

SQL CASE statements on Informix - Can you set more than one field in the END section of a case block?

Using IBM Informix Dynamic Server Version 10.00.FC9
I'm looking to set multiple field values with one CASE block. Is this possible? Do I have to re-evaluate the same conditions for each field set?
I was thinking of something along these lines:
SELECT CASE WHEN p.id = 9238 THEN ('string',3) END (varchar_field, int_field);
Where the THEN section would define an 'array' of fields similar to the syntax of
INSERT INTO table (field1,field2) values (value1,value2)
Also, can it be done with a CASE block of an UPDATE statement?
UPDATE TABLE SET (field1,field2) = CASE WHEN p.id=9238 THEN (value1,value2) END;
Normally, I'd ask for the version of Informix that you're using, but it probably doesn't matter much this time. The simple answer is 'No'.
A more complex answer might discuss using a row type constructor, but that probably isn't what you want on the output. And, given the foregoing, then the UPDATE isn't going to work (and would require an extra level of parentheses if it was going to).
No, a CASE statement resolves to an expression (see IBM Informix Guide to SQL: Syntax CASE Expressions) and can be used in places where an expression is permitted. An expression is a single value.
from http://en.wikipedia.org/wiki/Expression_%28programming%29
An expression in a programming
language is a combination of explicit
values, constants, variables,
operators, and functions that are
interpreted according to the
particular rules of precedence and of
association for a particular
programming language, which computes
and then produces (returns, in a
stateful environment) another value.
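Since each CASE yields a single value, the usual workaround is one CASE per output column, repeating the condition. A small sketch, run on SQLite via Python's sqlite3 module rather than on Informix (the table p and its two rows are invented; the column names come from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table standing in for the question's "p".
    CREATE TABLE p (id INTEGER);
    INSERT INTO p VALUES (9238), (1);
""")

rows = conn.execute("""
    SELECT id,
           -- Same condition repeated: one CASE expression per column.
           CASE WHEN id = 9238 THEN 'string' END AS varchar_field,
           CASE WHEN id = 9238 THEN 3        END AS int_field
    FROM p
    ORDER BY id
""").fetchall()

print(rows)
```

The condition is written twice, but each expression stays a plain single-valued CASE, which is what the syntax allows.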
Found an easy way to do it located here:
how to have listview row colour to change based on data in the row
The solution was just adding the case statement to my SQL statement. It just made my life much easier.