SQL Case inside WHEN - sql

Maybe this is silly, but can we write a CASE inside another CASE's WHEN?
The code below works for me, but I am not sure if it is correct.
SELECT
    (SUM(CASE
             WHEN (
                 CASE
                     WHEN r.status < b.status
                     THEN r.status
                     ELSE b.status
                 END
             ) = '4'
             THEN 1
             ELSE 0
         END)
    ) AS WORKED
FROM
    tbl1 r, tbl2 b
All the examples of nested CASE expressions I can find put the inner CASE inside a THEN, so I am not sure if this is good practice. Is there a better way to get the same results?

Yes, you can. MSDN also notes that SQL Server allows at most 10 levels of nesting in CASE expressions. Oddly enough, a search for Oracle turned up nothing about a similar limitation, which is probably important to note.
Of course, you can also just use more WHEN clauses (up to 255 in Oracle), but that only works if you do not need to nest your logic (for example, to compare two different columns' values).
Sources:
https://msdn.microsoft.com/en-us/library/ms181765.aspx
http://www.techonthenet.com/oracle/functions/case.php
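As a quick check that a CASE nested inside a WHEN condition behaves as expected, here is a minimal sketch. It runs the question's query through Python's built-in sqlite3 module as a stand-in for the asker's database. The sample rows are made up, and the statuses are stored as single-character text that is never NULL, which is an assumption.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (status TEXT);
    CREATE TABLE tbl2 (status TEXT);
    INSERT INTO tbl1 VALUES ('3'), ('4'), ('5');
    INSERT INTO tbl2 VALUES ('4'), ('6');
""")

# CASE nested inside the WHEN condition, as in the question.
worked = conn.execute("""
    SELECT SUM(CASE
                 WHEN (CASE WHEN r.status < b.status
                            THEN r.status
                            ELSE b.status
                       END) = '4'
                 THEN 1
                 ELSE 0
               END) AS worked
    FROM tbl1 r, tbl2 b
""").fetchone()[0]

# Cross-check in plain Python: count pairs whose smaller status is '4'.
expected = sum(1 for r in ("3", "4", "5") for b in ("4", "6")
               if min(r, b) == "4")

print(worked, expected)
```

The inner CASE simply picks the smaller of the two statuses, so the outer CASE counts the row pairs whose smaller status is '4'.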

Related

How to add a value to this SQL statement in DB2?

I need to add 'P06' to the CASE where the subquery selects RPCODE. I'm still learning SQL and am no expert at subqueries, so I'm not exactly sure how to add a value to the statement.
My first solution was just to add OR 'P06' after 'P01', but that doesn't seem right.
CASE WHEN (SELECT RPCODE FROM AGQA.QAB2010
WHERE INDATE || INTIME = ( SELECT MAX(INDATE||INTIME) FROM AGQA.QAB2010 WHERE RTAG IN (SELECT TAG FROM TAGDATA) )
AND RTAG IN (SELECT TAG FROM TAGDATA) ORDER BY RPDATE DESC, SER DESC FETCH FIRST 1 ROW ONLY) = 'P01' THEN 'N' ELSE 'C' END
ELSE 'R' END, 'S' ) AS TTYPE
Right now, when the RPCODE is 'P01', the TTYPE shows as 'N'. I need to add 'P06' so that the TTYPE will show as 'N' for RPCODE 'P06' as well.
As Rob Wilson commented...
Change the = 'P01' to IN ('P01', 'P06')
However, while the statement may work for you, performance over a dataset of any decent size is probably going to suck.
The number of sub-selects and the FETCH FIRST ROW are red flags to my eye.
With a background in RPG development on Db2 for i, the statement looks like many I've seen from RPG programmers used to working with data 1 record at a time rather than working with sets of data.
But the same "row by agonizing row" (RBAR as coined by Jeff Moden of SqlServerCentral.com) processing can be seen in SQL from developers on any platform and from any background.
Unfortunately, moving to a set-based process isn't a quick fix for non-trivial statements. The complete statement and detailed information about the data and the table design are needed.
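The IN change suggested above can be sketched as follows. This is a minimal illustration using Python's sqlite3 module rather than Db2. The table qab2010 is a cut-down, hypothetical stand-in for AGQA.QAB2010, and LIMIT 1 replaces FETCH FIRST 1 ROW ONLY because that is what SQLite accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for AGQA.QAB2010 with only the columns needed here.
conn.execute("CREATE TABLE qab2010 (rtag TEXT, rpcode TEXT)")
conn.executemany("INSERT INTO qab2010 VALUES (?, ?)",
                 [("T1", "P01"), ("T2", "P06"), ("T3", "P09")])

# The scalar subquery result is tested against a list of codes with IN,
# instead of being compared to a single code with '='.
rows = conn.execute("""
    SELECT q.rtag,
           CASE WHEN (SELECT q2.rpcode
                      FROM qab2010 q2
                      WHERE q2.rtag = q.rtag
                      LIMIT 1) IN ('P01', 'P06')
                THEN 'N'
                ELSE 'C'
           END AS ttype
    FROM qab2010 q
    ORDER BY q.rtag
""").fetchall()

print(rows)
```

Writing = 'P01' OR 'P06' does not do the same thing, because the OR then has the bare string 'P06' as its second operand instead of a second comparison against RPCODE.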

Is this statement quicker than the previous?

I am going through some old code and wondering whether to change the logic of this CASE expression:
CASE WHEN ClaimNo.ClaimNo IS NULL THEN '0'
WHEN ClaimNo.ClaimNo = 1 THEN '1'
WHEN ClaimNo.ClaimNo = 2 THEN '2'
WHEN ClaimNo.ClaimNo = 3 THEN '3'
WHEN ClaimNo.ClaimNo = 4 THEN '4'
ELSE '5+'
END AS ClaimNo ,
If I changed it to:
CASE WHEN ClaimNo.ClaimNo >= 5 THEN '5+'
ELSE COALESCE(ClaimNo.ClaimNo,0) END 'ClaimNo' ,
Would the statement technically be quicker? It's obviously a lot shorter, and it appears that it wouldn't evaluate as many comparisons to obtain the same result.
These are not the same! A CASE expression returns a single type, and in this case you want that type to be a string (because '5+' is a string). However, mixing strings and integers in the THEN and ELSE results will result in a type conversion error.
Which is faster depends on the distribution of the data. If most of the data consists of 5 or more, then the second method would be faster . . . and work if written as:
(CASE WHEN ClaimNo.ClaimNo >= 5 THEN '5+'
ELSE CAST(COALESCE(ClaimNo.ClaimNo, 0) as VARCHAR(255))
END) as ClaimNo,
In fact, there is only one comparison, so from the perspective of doing the comparisons it will be faster.
The next question is whether the conversion from a number to a string is faster than the multiple comparisons with each value listed separately. Let me be honest: I do not know. And I have been concerned about query performance for a long time.
Why don't I know? Such micro-optimizations generally have basically no impact in the real world. You should use the version of the logic that works; readability and maintainability are also important. Of course performance is an issue, but the bit fiddling techniques that are important in other languages often have no place in SQL which is designed to handle much larger quantities of data, spread across multiple processors and disks.
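To check that the short form with the explicit cast produces the same labels as the original, here is a small sketch using Python's sqlite3 module. It assumes ClaimNo is either NULL or a positive integer (for 0 or negative values the two expressions would disagree), and it uses CAST(... AS TEXT), which is SQLite's spelling of the VARCHAR cast. The claims table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: NULL plus positive claim numbers only (an assumption).
conn.execute("CREATE TABLE claims (claim_no INTEGER)")
conn.executemany("INSERT INTO claims VALUES (?)",
                 [(None,), (1,), (2,), (3,), (4,), (5,), (7,)])

rows = conn.execute("""
    SELECT CASE WHEN claim_no IS NULL THEN '0'
                WHEN claim_no = 1 THEN '1'
                WHEN claim_no = 2 THEN '2'
                WHEN claim_no = 3 THEN '3'
                WHEN claim_no = 4 THEN '4'
                ELSE '5+'
           END AS long_form,
           CASE WHEN claim_no >= 5 THEN '5+'
                ELSE CAST(COALESCE(claim_no, 0) AS TEXT)
           END AS short_form
    FROM claims
    ORDER BY rowid
""").fetchall()

# Both columns should carry the same string label on every row.
same = all(long_form == short_form for long_form, short_form in rows)
print(rows, same)
```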

sql server group by with an alias

I am new to SQL Server, transitioning from MySQL.
I have a complicated CASE expression, with 6 WHENs and an ELSE, that I would like to group on, and it is likely to get larger. To be able to run it, I need to copy the expression into the GROUP BY each time there is a modification. In MySQL I would just group by the column number. Is there any workaround for this? It is making the code very ugly.
Is there going to be a performance penalty in creating a subquery for my CASE and then just grouping on the result field? It seems like trying to make the code more elegant will cause the query to use more resources.
Thanks
Below is a field I am grouping on. As I make a modification to the field for more edge cases, then I need to change code in up to 3 places. Makes for some very ugly code, and I need no extra help doing that myself.
dz_code = case
when isnull(dz.dz_code,'N/A') in ('GAB', 'MAB', 'N/A') and dc.howdidyouhear = 'Television' then 'Television'
when isnull(dz.dz_code,'N/A') in ('GAB', 'MAB', 'N/A') and dc.howdidyouhear in ('Other', 'N/A') then 'Other'
WHEN dz.dz_code = 'irs,irs' THEN 'irs'
when dz.dz_code like '%SDE%' THEN 'SDE'
when dz.dz_code like 'referral,' then REPLACE(dz.dz_code, 'referral','')
when charindex(',',dz.dz_code) = 4 then left(dz.dz_code,3)
else
dz.dz_code
END,
Maybe you can wrap the query in a subquery and use the alias in the select and the group by. It looks a little bulky in this example, but if you've got more complex case switches, or more than one of them, then this solution will probably be much smaller and more readable.
select
    CaseField
from
    (select
        case when 1 = 2 then 3
             else 4
        end as CaseField
     from
        YourTable t) c
group by
    CaseField
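A slightly fuller sketch of the same wrapping idea, run through Python's sqlite3 module with a made-up leads table and a shortened version of the question's CASE. SQLite itself would accept the alias directly in GROUP BY, so it is only a convenient way to run the derived-table form that SQL Server needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, loosely modelled on the question's dz_code.
conn.execute("CREATE TABLE leads (dz_code TEXT)")
conn.executemany("INSERT INTO leads VALUES (?)",
                 [("irs,irs",), ("xSDEy",), ("SDE",), ("GAB",), ("irs,irs",)])

# The CASE is written once, inside the derived table; the outer query
# selects and groups by its alias.
rows = conn.execute("""
    SELECT c.bucket, COUNT(*) AS n
    FROM (SELECT CASE
                   WHEN dz_code = 'irs,irs' THEN 'irs'
                   WHEN dz_code LIKE '%SDE%' THEN 'SDE'
                   ELSE dz_code
                 END AS bucket
          FROM leads) c
    GROUP BY c.bucket
    ORDER BY c.bucket
""").fetchall()

print(rows)
```

Any new edge case then only needs to be added in one place, the inner CASE.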

Is it possible to use the result of a subquery in a case statement of the same outer query?

I am writing a search routine with a ranking algorithm and would like to get this in one pass.
My Ideal query would be something like this....
select *, (select top 1 wordposition
from wordpositions
where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502
) as WordPos,
case when WordPos<11 then 1 else case when WordPos<50 then 2 else case when WordPos<100 then 3 else 4 end end end as rank
from items
Is it possible to use WordPos in a case right there? It's generating an error for me: Invalid column name 'WordPos'.
I know I can repeat the subquery for each case, but I think it would actually re-run the subquery each time, wouldn't it?
For example:
select *,
    case when (select top 1 wordposition from wordpositions where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502)<11 then 1
    else case when (select top 1 wordposition from wordpositions where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502)<50 then 2
    else case when (select top 1 wordposition from wordpositions where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502)<100 then 3
    else 4 end end end as rank
from items
That works....but is it really re-running the identical query each time?
It's hard to tell from the tests as the first time it runs it's slow but subsequent runs are quick....it's caching...so would that mean that the first time it ran it for the first row, the subsequent three times it would get the result from cache?
Just curious what the best way to do this would be...
Thank you!
Ryan
You can do this using a subquery. I will stick with your SQL Server syntax, even though the question is tagged mysql:
select i.*,
(case when WordPos < 11 then 1
when WordPos < 50 then 2
when WordPos < 100 then 3
else 4
end) as rank
from (select i.*,
(select top 1 wp.wordposition
from wordpositions wp
where recordid=i.pk_itemid and wordid=79588 and nextwordid=64502
) as WordPos
from items i
) i;
This also simplifies the case statement. You do not need nested case statements to handle multiple conditions, just multiple WHEN clauses.
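The derived-table approach can be tried end to end with a small sketch using Python's sqlite3 module. The rows are made up, LIMIT 1 stands in for TOP 1 because SQLite has no TOP, and the output column is named item_rank here purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows; item 4 has no matching word position.
conn.executescript("""
    CREATE TABLE items (pk_itemid INTEGER PRIMARY KEY);
    CREATE TABLE wordpositions (recordid INTEGER, wordid INTEGER,
                                nextwordid INTEGER, wordposition INTEGER);
    INSERT INTO items VALUES (1), (2), (3), (4);
    INSERT INTO wordpositions VALUES
        (1, 79588, 64502, 5),
        (2, 79588, 64502, 40),
        (3, 79588, 64502, 75);
""")

# The inner SELECT introduces WordPos once; the outer SELECT can then
# refer to it by name in a flat CASE with several WHEN clauses.
rows = conn.execute("""
    SELECT i.pk_itemid, i.WordPos,
           CASE WHEN i.WordPos < 11 THEN 1
                WHEN i.WordPos < 50 THEN 2
                WHEN i.WordPos < 100 THEN 3
                ELSE 4
           END AS item_rank
    FROM (SELECT it.pk_itemid,
                 (SELECT wp.wordposition
                  FROM wordpositions wp
                  WHERE wp.recordid = it.pk_itemid
                    AND wp.wordid = 79588
                    AND wp.nextwordid = 64502
                  LIMIT 1) AS WordPos
          FROM items it) i
    ORDER BY i.pk_itemid
""").fetchall()

print(rows)
```

An item with no matching row gets a NULL WordPos, so every comparison is unknown and it falls through to the ELSE branch.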
No. Identifiers introduced in the output clause (the fact that it comes from a sub-query is irrelevant) cannot be used within the same SELECT statement.
Here are some solutions:
1. Rewrite the query using a JOIN (see note 1 below). This will eliminate the issue entirely and fits well with relational algebra (RA).
2. Wrap the entire SELECT with the sub-query within another SELECT with the case. The outer select can access identifiers introduced by the inner SELECT's output clause.
3. Use a CTE (if SQL Server). This is similar to #2 in that it allows an identifier to be introduced.
While "re-writing" the sub-query for each case is very messy it should still result in an equivalent plan - but view the query profile! - as the results of the query are non-volatile. As such the equivalent sub-queries can be safely moved by the query planner which should move the sub-query/sub-queries to a JOIN to avoid any "re-running" in the first place.
Note 1: Here is a conversion to use a JOIN, which is my preferred method. (I find that if a query can't be written in terms of a JOIN "easily" then it might be asking for the wrong thing or otherwise be showing issues with schema design.)
select
wp.wordposition as WordPos,
case wp.wordposition .. as Rank
from items i
left join wordpositions wp
on wp.recordid = i.pk_itemid
where wp.wordid = 79588
and wp.nextwordid = 64502
I've made assumptions about the multiplicity here (i.e. that wordid is unique) which should be verified. If this multiplicity is not valid and not correctable otherwise (and you're indeed using SQL Server), then I'd recommend using ROW_NUMBER() and a CTE.
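The ROW_NUMBER() plus CTE fallback mentioned above might look roughly like this. It is a sketch run through Python's sqlite3 module (window functions need SQLite 3.25 or later), not SQL Server. The sample rows are made up, and picking the smallest wordposition per item is an assumption about which row "top 1" was meant to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: item 1 has two candidate positions, item 3 has none.
conn.executescript("""
    CREATE TABLE items (pk_itemid INTEGER PRIMARY KEY);
    CREATE TABLE wordpositions (recordid INTEGER, wordid INTEGER,
                                nextwordid INTEGER, wordposition INTEGER);
    INSERT INTO items VALUES (1), (2), (3);
    INSERT INTO wordpositions VALUES
        (1, 79588, 64502, 30),
        (1, 79588, 64502, 5),
        (2, 79588, 64502, 60);
""")

# The CTE numbers the candidate rows per item; joining on rn = 1 keeps
# exactly one row per item even when the multiplicity assumption fails.
rows = conn.execute("""
    WITH first_pos AS (
        SELECT recordid, wordposition,
               ROW_NUMBER() OVER (PARTITION BY recordid
                                  ORDER BY wordposition) AS rn
        FROM wordpositions
        WHERE wordid = 79588 AND nextwordid = 64502
    )
    SELECT i.pk_itemid,
           fp.wordposition AS WordPos,
           CASE WHEN fp.wordposition < 11 THEN 1
                WHEN fp.wordposition < 50 THEN 2
                WHEN fp.wordposition < 100 THEN 3
                ELSE 4
           END AS item_rank
    FROM items i
    LEFT JOIN first_pos fp
           ON fp.recordid = i.pk_itemid AND fp.rn = 1
    ORDER BY i.pk_itemid
""").fetchall()

print(rows)
```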

SQL and logical operators and null checks

I've got a vague, possibly cargo-cult memory from years of working with SQL Server that when you've got a possibly-null column, it's not safe to write "WHERE" clause predicates like:
... WHERE the_column IS NULL OR the_column < 10 ...
It had something to do with the fact that SQL rules don't stipulate short-circuiting (and in fact that's kind-of a bad idea possibly for query optimization reasons), and thus the "<" comparison (or whatever) could be evaluated even if the column value is null. Now, exactly why that'd be a terrible thing, I don't know, but I recall being sternly warned by some documentation to always code that as a "CASE" clause:
... WHERE 1 = CASE WHEN the_column IS NULL THEN 1 WHEN the_column < 10 THEN 1 ELSE 0 END ...
(the goofy "1 = " part is because SQL Server doesn't/didn't have first-class booleans, or at least I thought it didn't.)
So my questions here are:
Is that really true for SQL Server (or perhaps back-rev SQL Server 2000 or 2005) or am I just nuts?
If so, does the same caveat apply to PostgreSQL? (8.4 if it matters)
What exactly is the issue? Does it have to do with how indexes work or something?
My grounding in SQL is pretty weak.
I don't know SQL Server so I can't speak to that.
Given an expression a L b for some logical operator L, there is no guarantee that a will be evaluated before or after b or even that both a and b will be evaluated:
Expression Evaluation Rules
The order of evaluation of subexpressions is not defined. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order.
Furthermore, if the result of an expression can be determined by evaluating only some parts of it, then other subexpressions might not be evaluated at all.
[...]
Note that this is not the same as the left-to-right "short-circuiting" of Boolean operators that is found in some programming languages.
As a consequence, it is unwise to use functions with side effects as part of complex expressions. It is particularly dangerous to rely on side effects or evaluation order in WHERE and HAVING clauses, since those clauses are extensively reprocessed as part of developing an execution plan.
As far as an expression of the form:
the_column IS NULL OR the_column < 10
is concerned, there's nothing to worry about since NULL < n is NULL for all n, even NULL < NULL evaluates to NULL; furthermore, NULL isn't true so
null is null or null < 10
is just a complicated way of saying true or null and that's true regardless of which sub-expression is evaluated first.
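The three-valued logic described here is easy to check. The sketch below uses Python's sqlite3 module rather than PostgreSQL or SQL Server, so it illustrates the standard NULL rules, not either engine's evaluation strategy. SQLite reports true as 1 and NULL comes back as Python's None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with anything, including NULL, yields NULL (None in Python).
null_lt_ten = conn.execute("SELECT NULL < 10").fetchone()[0]
null_lt_null = conn.execute("SELECT NULL < NULL").fetchone()[0]

# "true OR null" is true whichever operand is written (or evaluated) first.
left_first, right_first = conn.execute(
    "SELECT (NULL IS NULL OR NULL < 10), (NULL < 10 OR NULL IS NULL)"
).fetchone()

print(null_lt_ten, null_lt_null, left_first, right_first)
```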
The whole "use a CASE" advice sounds mostly like cargo-cult SQL to me. However, like most cargo-cultism, there is a kernel of truth buried under the cargo; just below my first excerpt from the PostgreSQL manual, you will find this:
When it is essential to force evaluation order, a CASE construct (see Section 9.16) can be used. For example, this is an untrustworthy way of trying to avoid division by zero in a WHERE clause:
SELECT ... WHERE x > 0 AND y/x > 1.5;
But this is safe:
SELECT ... WHERE CASE WHEN x > 0 THEN y/x > 1.5 ELSE false END;
So, if you need to guard against a condition that will raise an exception or have other side effects, then you should use a CASE to control the order of evaluation as a CASE is evaluated in order:
Each condition is an expression that returns a boolean result. If the condition's result is true, the value of the CASE expression is the result that follows the condition, and the remainder of the CASE expression is not processed. If the condition's result is not true, any subsequent WHEN clauses are examined in the same manner.
So given this:
case when A then Ra
when B then Rb
when C then Rc
...
A is guaranteed to be evaluated before B, B before C, etc. and evaluation stops as soon as one of the conditions evaluates to a true value.
In summary, a CASE short-circuits but neither AND nor OR is guaranteed to short-circuit, so you only need to use a CASE when you need to protect against side effects.
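The guarding behaviour of CASE can be observed with a small sketch. It uses Python's sqlite3 module, where plain division by zero yields NULL instead of an error, so a hypothetical user-defined function checked_div (registered below, not part of any database) plays the role of the expression that would blow up. The table and its rows are made up.

```python
import sqlite3

calls = []  # records every (y, x) pair the function is actually called with


def checked_div(y, x):
    """Hypothetical stand-in for an expression that errors on bad input."""
    calls.append((y, x))
    if x == 0:
        raise ZeroDivisionError("x is zero")
    return y / x


conn = sqlite3.connect(":memory:")
conn.create_function("checked_div", 2, checked_div)
conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(0, 5), (2, 4), (4, 5)])

# The THEN branch is only evaluated for rows where x > 0, so the function
# is never called with x = 0 and the query completes without an error.
rows = conn.execute("""
    SELECT x, y FROM t
    WHERE CASE WHEN x > 0 THEN checked_div(y, x) > 1.5 ELSE 0 END
    ORDER BY x
""").fetchall()

print(rows, sorted(calls))
```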
Instead of
the_column IS NULL OR the_column < 10
I'd do
isnull(the_column,0) < 10
or for the first example
WHERE 1 = CASE WHEN isnull(the_column,0) < 10 THEN 1 ELSE 0 END ...
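A quick way to confirm that these spellings select the same rows is sketched below with Python's sqlite3 module. SQLite has no two-argument isnull, so COALESCE is used in its place (an assumption that the two behave alike here), and the table t and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: one NULL, one value below 10, one above.
conn.execute("CREATE TABLE t (the_column INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(None,), (5,), (15,)])

predicates = {
    "or": "the_column IS NULL OR the_column < 10",
    "coalesce": "COALESCE(the_column, 0) < 10",
    "case": "1 = CASE WHEN COALESCE(the_column, 0) < 10 THEN 1 ELSE 0 END",
}

# Run the same SELECT once per predicate and collect the matching rows.
results = {
    name: conn.execute(
        f"SELECT the_column FROM t WHERE {pred} ORDER BY rowid"
    ).fetchall()
    for name, pred in predicates.items()
}

print(results)
```

The substitution only matches the OR form because the replacement value 0 is itself below the threshold of 10.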
I've never heard of such a problem, and this bit of SQL Server 2000 documentation uses WHERE advance < $5000 OR advance IS NULL in an example, so it must not have been a very stern rule. My only concern with OR is that it has lower precedence than AND, so you might accidentally write something like WHERE the_column IS NULL OR the_column < 10 AND the_other_column > 20 when that's not what you mean; but the usual solution is parentheses rather than a big CASE expression.
I think that in some RDBMSes (Oracle's B-tree indexes, for example), indices don't include null values, so an index on the_column might not be terribly useful for this query; but even where nulls are indexed, I don't see why a big CASE expression would be any more index-friendly.
(Of course, it's hard to prove a negative, and maybe someone else will know what you're referring to?)
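The precedence pitfall mentioned above is easy to reproduce. This sketch uses Python's sqlite3 module with a made-up two-column table, where a and b stand in for the_column and the_other_column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows chosen so the two readings of the predicate differ.
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(None, 5), (5, 30), (5, 5), (15, 30)])

# AND binds tighter than OR, so this reads as: a IS NULL OR (a < 10 AND b > 20)
unparenthesized = conn.execute("""
    SELECT a, b FROM t
    WHERE a IS NULL OR a < 10 AND b > 20
    ORDER BY rowid
""").fetchall()

# Parentheses force the other reading: (a IS NULL OR a < 10) AND b > 20
parenthesized = conn.execute("""
    SELECT a, b FROM t
    WHERE (a IS NULL OR a < 10) AND b > 20
    ORDER BY rowid
""").fetchall()

print(unparenthesized, parenthesized)
```

The row with a NULL in a and a small b is kept by the first query but dropped by the second.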
Well, I've repeatedly written queries like the first example since about forever (heck, I've written query generators that generate queries like that), and I've never had a problem.
I think you may be remembering some admonishment somebody gave you sometime against writing funky join conditions that use OR. In your first example, the conditions joined by the OR restrict the same one column of the same table, which is OK. If your second condition was a join condition (i.e., it restricted columns from two different tables), then you could get into bad situations where the query planner just has no choice but to use a Cartesian join (bad, bad, bad!!!).
I don't think your CASE function is really doing anything there, except perhaps hamper your query planner's attempts at finding a good execution plan for the query.
But more generally, just write the straightforward query first and see how it performs for realistic data. No need to worry about a problem that might not even exist!
Nulls can be confusing. The "... WHERE 1 = CASE ..." form is useful if you are trying to pass either a NULL or a value as a parameter, e.g. "WHERE the_column = #parameter". This post may be helpful: Passing Null using OLEDB.
Another example where CASE is useful is when using date functions on varchar columns. Adding an ISDATE check before using, say, convert(datetime, colA) might not work, and when colA has non-date data the query can error out.