sql logical A and (B or C) - sql

I would like to write SQL code (using PROC SQL in SAS) for the logical expression
A and (B or C)
where A is of the form (A_1 AND A_2 AND A_3 ... AND A_n), so in other words it is long.
Since the AND operator is evaluated first in SQL, I think I can't write it as
A AND (B or C)
because the parentheses would not have any effect and I would get A AND B OR C.
My question is, do I have to write it as
(A and B) or (A and C)
This would require writing the long (logical) expression A twice.

First, parentheses do override the default operator precedence, so this should work as written if B and C each have only one clause:
A and (B or C)
But, we are not currently suffering a shortage of parentheses in the world, so you can use more:
( A ) and ( ( B ) or ( C ) )
Just wrap each logic condition (no matter how long) in parentheses.
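
To see the effect of the parentheses for yourself, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in (an assumption on my part; I have not run it under PROC SQL), and the table t with flag columns a, b, c is hypothetical.

```python
import sqlite3

# Hypothetical table: each row says whether conditions A, B, C hold (1) or not (0).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 0),  # A and B hold
        (2, 1, 0, 1),  # A and C hold
        (3, 0, 0, 1),  # only C holds
        (4, 1, 0, 0),  # only A holds
    ],
)

def ids(where):
    """Return the ids of the rows matching the given WHERE clause."""
    sql = f"SELECT id FROM t WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

grouped = ids("(a = 1) AND ((b = 1) OR (c = 1))")         # A and (B or C)
ungrouped = ids("a = 1 AND b = 1 OR c = 1")               # parsed as (A and B) or C
expanded = ids("(a = 1 AND b = 1) OR (a = 1 AND c = 1)")  # (A and B) or (A and C)

print(grouped, ungrouped, expanded)
```

The grouped and expanded forms return the same rows, while the version without parentheses also picks up row 3, where only C holds. So A only has to be written once.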

Related

Can I multiply the output of a SQL query from two separate tables within the same query?

I am taking two values (A, B) from similar but different tables. E.g. A is the count(*) of Table R, but B is a complex calculation based off a slightly adapted table (we can call it S).
So I did this:
SELECT
(SELECT count(*)*60 FROM R) AS A,
[calculation for B] AS B
FROM R
WHERE
[modification to R to get S]
Not sure if this was the smartest way to do it (probably was not, I'm a new user).
Now I want to do some multiplications:
A*B
B-(A*0.75)
B-(A*0.8)
B-(A*0.85)
etc.
Is there a way to do this within the same query?
Thanks.
The simplest way is to wrap your original query in a derived table and compute from its columns:
SELECT A*B p1, B-(A*0.75) p2, B-(A*0.8) p3, B-(A*0.85) p4, ...
FROM (
-- your original query producing columns A, B ...
) t
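
A runnable sketch of that pattern, using Python's sqlite3 module. The contents of R and the SUM(val) expression standing in for the "[calculation for B]" part are made up for illustration, since the original calculation is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (val INTEGER)")  # hypothetical contents of R
conn.executemany("INSERT INTO r VALUES (?)", [(10,), (20,), (30,)])

query = """
SELECT a * b          AS p1,
       b - (a * 0.75) AS p2
FROM (
    -- inner query: produces columns a and b exactly once
    SELECT (SELECT COUNT(*) * 60 FROM r) AS a,
           SUM(val)                      AS b   -- placeholder for the real B
    FROM r
) AS t
"""

p1, p2 = conn.execute(query).fetchone()
print(p1, p2)
```

The outer SELECT can refer to a and b by name as often as needed, because by then they are ordinary columns of the derived table t.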

Adding a "calculated column" to BigQuery query without repeating the calculations

I want to reuse the values of calculated columns in a new third column.
For example, this query works:
select
countif(cond1) as A,
countif(cond2) as B,
countif(cond1)/countif(cond2) as prct_pass
From
Where
Group By
But when I try to use A,B instead of repeating the countif, it doesn't work because A and B are invalid:
select
countif(cond1) as A,
countif(cond2) as B,
A/B as prct_pass
From
Where
Group By
Can I somehow make the more readable second version work?
Is the first one inefficient?
You should construct a subquery (i.e. a double select) like
SELECT A, B, A/B as prct_pass
FROM
(
SELECT countif(cond1) as A,
countif(cond2) as B
FROM <yourtable>
)
The same amount of data will be processed in both queries.
The subquery version calls countif() only twice; if that step takes a long time, doing 2 calls instead of 4 should indeed be more efficient.
Looking at an example using bigquery public datasets:
SELECT
countif(homeFinalRuns>3) as A,
countif(awayFinalRuns>3) as B,
countif(homeFinalRuns>3)/countif(awayFinalRuns>3) as division
FROM `bigquery-public-data.baseball.games_post_wide`
or
SELECT A, B, A/B as division FROM
(
SELECT countif(homeFinalRuns>3) as A,
countif(awayFinalRuns>3) as B
FROM `bigquery-public-data.baseball.games_post_wide`
)
we can see that doing it all in one query (without a subquery) is actually slightly faster. (I ran the queries 6 times for different values of the inequality; the single query was faster 5 times and slower once.)
In any case, the efficiency will depend on how taxing it is to compute the condition on your particular dataset.

SQL Order of precedence of query expressions

I am trying to work out how SQL queries are run and have hit a bit of a stumbling block.
If a where clause akin to the below is used:
A OR B AND C
This could mean either of the below
(A OR B) AND C
or
A OR (B AND C)
In the majority of cases the results will be the same, but if the set to be queried contains solely {A}, the first variant would return an empty result set and the second would return {A}. SQL does in fact return the 1 result.
Does anyone know (or have links to) any insight that will help me understand how queries are built?
Ketchup
According to MSDN, the order of precedence, from highest to lowest, is:
~ (Bitwise NOT)
* (Multiply), / (Division), % (Modulo)
+ (Positive), - (Negative), + (Add), + (Concatenate), - (Subtract), & (Bitwise AND), ^ (Bitwise Exclusive OR), | (Bitwise OR)
=, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
NOT
AND
ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
= (Assignment)
Knowing (from the documentation) that AND has a higher precedence than OR, you should aim to write predicates for WHERE clauses in conjunctive normal form ("a series of AND clauses").
If the intention is
( A OR B ) AND C
then write it thus and all is good.
However, if the intention is
A OR ( B AND C )
then I suggest you apply the distributive rewrite law that results in conjunctive normal form i.e.
( P AND Q ) OR R <=> ( P OR R ) AND ( Q OR R )
In your case:
A OR ( B AND C ) <=> ( A OR B ) AND ( A OR C )
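
If you want to convince yourself of both the precedence and the rewrite, a small truth-table check helps. This is only a sketch: it uses SQLite through Python's sqlite3 module (my assumption; SQLite also ranks AND above OR), treats 0/1 as false/true, and ignores NULLs entirely.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")

def evaluate(expr, a, b, c):
    """Let SQLite evaluate a boolean expression over the parameters :a, :b, :c."""
    row = conn.execute(f"SELECT {expr}", {"a": a, "b": b, "c": c}).fetchone()
    return row[0]

rows = []
for a, b, c in product((0, 1), repeat=3):
    bare = evaluate(":a OR :b AND :c", a, b, c)            # no parentheses
    or_first = evaluate("(:a OR :b) AND :c", a, b, c)
    and_first = evaluate(":a OR (:b AND :c)", a, b, c)
    cnf = evaluate("(:a OR :b) AND (:a OR :c)", a, b, c)   # distributive rewrite
    rows.append((a, b, c, bare, or_first, and_first, cnf))

# The bare expression behaves like A OR (B AND C) on every input.
matches_and_first = all(r[3] == r[5] for r in rows)
# The conjunctive normal form is equivalent to A OR (B AND C).
cnf_equivalent = all(r[5] == r[6] for r in rows)
# Inputs where the two possible readings disagree.
differs = [(r[0], r[1], r[2]) for r in rows if r[4] != r[5]]

print(matches_and_first, cnf_equivalent, differs)
```

The two readings only disagree when A is true and C is false, which is the "set contains solely {A}" case from the question.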
AND and OR have different precedence.
See Precedence Level
For SQL Server (which is your tag), the precedence is documented here: http://msdn.microsoft.com/en-us/library/ms190276.aspx
But if you're worried about the exact result set you will get, you should indeed start grouping your conditions with parentheses.

How to select only those rows which have more than one of given fields with values

Is there some elegant way to do that, without a big WHERE with lots of AND and OR? For example there are 4 columns: A, B, C, D. For each row the columns have random integer values. I need to select only those rows which have more than one column with a non-zero value. For example (1,2,3,4) and (3,4,0,0) should get selected, however (0,0,7,0) should not be selected (there are no rows that have zeros only).
PS. I know how this looks but the funny thing is that this is not exam or something, it's a real query which I need to use in a real app :D
SELECT *
FROM mytable
WHERE (0, 0, 0) NOT IN ((a, b, c), (a, b, d), (a, c, d), (b, c, d))
This, I believe, is the shortest way, though not necessarily the most efficient.
There. No WHERE, no OR and no AND:
SELECT
IF(`column1` != 0,1,0) +
IF(`column2` != 0,1,0) +
IF(`column3` != 0,1,0) +
IF(`column4` != 0,1,0) AS `results_sum`
FROM `table`
HAVING
`results_sum` > 1
Try
select *
from mytable t
where ( abs(sign(A))
+ abs(sign(B))
+ abs(sign(C))
+ abs(sign(D))
) > 1
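
For a quick check of the "count the non-zero columns" idea against the rows from the question, here is a sketch using Python's sqlite3 module. It relies on SQLite returning 0 or 1 from a comparison, which is an assumption that does not hold in every database; elsewhere, a CASE expression or the abs(sign(...)) trick would take its place. The table name mytable is borrowed from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [(1, 2, 3, 4), (3, 4, 0, 0), (0, 0, 7, 0)],  # sample rows from the question
)

query = """
SELECT a, b, c, d
FROM mytable
-- each comparison yields 1 or 0 in SQLite, so the sum counts non-zero columns
WHERE (a <> 0) + (b <> 0) + (c <> 0) + (d <> 0) > 1
ORDER BY a
"""

selected = conn.execute(query).fetchall()
print(selected)
```

Only (0, 0, 7, 0) is filtered out, since it has a single non-zero column.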

How to "default" a column in a SELECT query

Say I have a database table T with 4 fields, A, B, C, and D. A, B, and C are the primary key. For any combination of [A, B], there is always a row where C == spaces. There may or may not be other rows where C != spaces. I have a query that gets all rows where [A, B] == [in_a, in_b], and also where C == in_c if such a row exists, or C == spaces if the in_c row doesn't exist. So, if there is a row that matches the particular C value, I want that one, otherwise I want the spaces one. It is very important that if there is a matching C row, that I not be returned the spaces one along with it.
I have a working query, but it's not very fast. This is executing on DB2 for z/OS. I have full control over these tables, so I can define new indexes if needed. The only index on the table right now is [A, B, C], the primary key. This SQL is kind of messy, and I feel there's a better way to accomplish this task. What can I do to make this query faster?
The query I have now is:
SELECT A, B, C, D FROM T
WHERE A = :IN_A AND B = :IN_B AND
(C = :IN_C
OR (NOT EXISTS(
SELECT B FROM T WHERE
A = :IN_A AND B = :IN_B AND C = :IN_C))
AND C = " ");
Caveat emptor, as I am not familiar with DB2 SQL...
You could try using an ORDER BY clause to sort the matching rows such that a row with c = spaces is last in the sorted set, then retrieve just the first row of the set. Something like:
select first
A, B, C, D
from T
where A = :IN_A
and B = :IN_B
order by C desc;
This assumes that the FIRST and ORDER BY DESC clauses do what I expect them to.
This will work on DB2 LUW; I'm not sure whether the ORDER BY clause works the same way on DB2 for z/OS:
select
a, b, c, d
from t
where a = :IN_A
and b = :IN_B
and c in (:IN_C,' ')
order by
case c when ' ' then 2 else 1 end
fetch first 1 row only
Make sure that the ' ' value matches the actual value of the column.
Good luck,
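
The ORDER BY CASE idea can be tried out with the sketch below. It runs on SQLite via Python's sqlite3 module, so it uses LIMIT 1 where DB2 would use FETCH FIRST 1 ROW ONLY; that substitution, the sample rows, and the lookup helper are all assumptions for illustration and say nothing about performance on z/OS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (a TEXT, b TEXT, c TEXT, d TEXT, PRIMARY KEY (a, b, c))"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("k1", "x", " ", "default for x"),    # the always-present spaces row
        ("k1", "x", "c1", "specific for x"),  # a row with a real C value
        ("k1", "y", " ", "default for y"),    # only the spaces row exists
    ],
)

QUERY = """
SELECT a, b, c, d
FROM t
WHERE a = :in_a
  AND b = :in_b
  AND c IN (:in_c, ' ')
-- rank the specific row ahead of the spaces row, then keep only the first
ORDER BY CASE c WHEN ' ' THEN 2 ELSE 1 END
LIMIT 1
"""

def lookup(in_a, in_b, in_c):
    """Return the row matching in_c if it exists, otherwise the spaces row."""
    params = {"in_a": in_a, "in_b": in_b, "in_c": in_c}
    return conn.execute(QUERY, params).fetchone()

specific = lookup("k1", "x", "c1")
fallback = lookup("k1", "y", "c1")
print(specific, fallback)
```

Because the WHERE clause matches at most two rows per [A, B] pair, the sort is cheap, and the existing primary-key index on [A, B, C] covers the lookup.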
Why not start up the index advisor and read its advice? (Or is this only on DB2 for i/OS?)
We use the advisor for our very big production environment and it gives great advice. Having said that, it's always good to start with a good statement.