I have a query that's supposed to run like this -
If(var = xyz)
SELECT col1, col2
ELSE IF(var = zyx)
SELECT col2, col3
ELSE
SELECT col7,col8
FROM
.
.
.
How do I achieve this in T-SQL without writing separate queries for each clause? Currently I'm running it as
IF (var = xyz) {
Query1
}
ELSE IF (var = zyx) {
Query2
}
ELSE {
Query3
}
That's a lot of redundant code just to select different columns depending on a value.
Any alternatives?
You are looking for the CASE expression
http://msdn.microsoft.com/en-us/library/ms181765.aspx
Example copied from MSDN:
USE AdventureWorks;
GO
SELECT ProductNumber, Category =
CASE ProductLine
WHEN 'R' THEN 'Road'
WHEN 'M' THEN 'Mountain'
WHEN 'T' THEN 'Touring'
WHEN 'S' THEN 'Other sale items'
ELSE 'Not for sale'
END,
Name
FROM Production.Product
ORDER BY ProductNumber;
GO
Just a note here that you may actually be better off having 3 separate SELECTs, for reasons of optimization. If you have one single SELECT then the generated plan has to project all of the columns col1, col2, col3, col7, col8 etc., even though, depending on the runtime value of @var, only some of them are needed. This may result in plans that do unnecessary clustered index lookups because the non-clustered index doesn't cover all the columns projected by the SELECT.
On the other hand, 3 separate SELECTs, each projecting only the columns it needs, may benefit from non-clustered indexes that cover just the projected columns in each case.
Of course this depends on the actual schema of your data model and the exact queries, but this is just a heads-up so you don't bring the imperative mind frame of procedural programming into the declarative world of SQL.
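To illustrate (with a hypothetical table, key column, and index names, none of which come from the question), each branch projects only the columns it needs, so a narrow covering index can satisfy it without key lookups:
CREATE TABLE dbo.MyTable (KeyCol int, col1 int, col2 int, col3 int, col7 int, col8 int);
-- Hypothetical covering indexes, one per column set actually projected
CREATE NONCLUSTERED INDEX IX_MyTable_xyz ON dbo.MyTable (KeyCol) INCLUDE (col1, col2);
CREATE NONCLUSTERED INDEX IX_MyTable_zyx ON dbo.MyTable (KeyCol) INCLUDE (col2, col3);
CREATE NONCLUSTERED INDEX IX_MyTable_other ON dbo.MyTable (KeyCol) INCLUDE (col7, col8);
DECLARE @var varchar(10) = 'xyz', @key int = 1;   -- assumed types and sample values
IF (@var = 'xyz')
    SELECT col1, col2 FROM dbo.MyTable WHERE KeyCol = @key;
ELSE IF (@var = 'zyx')
    SELECT col2, col3 FROM dbo.MyTable WHERE KeyCol = @key;
ELSE
    SELECT col7, col8 FROM dbo.MyTable WHERE KeyCol = @key;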
Try something like
SELECT
CASE var
WHEN xyz THEN col1
WHEN zyx THEN col2
ELSE col7
END AS col1,
...
In other words, use a conditional expression to select the value, then rename the column.
Alternatively, you could build up some sort of dynamic SQL hack to share the query tail; I've done this with iBatis before.
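A rough sketch of that dynamic SQL idea in T-SQL (the table name, filter column, and data types are assumptions for illustration): only the column list varies, and the shared tail is written once.
DECLARE @var varchar(10) = 'xyz';      -- value being tested (assumed type)
DECLARE @someValue int = 1000;         -- hypothetical filter value for the shared tail
DECLARE @cols nvarchar(100) =
    CASE @var WHEN 'xyz' THEN N'col1, col2'
              WHEN 'zyx' THEN N'col2, col3'
              ELSE N'col7, col8' END;
-- dbo.MyTable is assumed to exist with all of these columns
DECLARE @sql nvarchar(max) =
    N'SELECT ' + @cols + N' FROM dbo.MyTable WHERE SomeCol > @p;';
EXEC sp_executesql @sql, N'@p int', @p = @someValue;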
Simple CASE expression:
CASE input_expression
WHEN when_expression THEN result_expression [ ...n ]
[ ELSE else_result_expression ]
END
Searched CASE expression:
CASE
WHEN Boolean_expression THEN result_expression [ ...n ]
[ ELSE else_result_expression ]
END
Reference: http://msdn.microsoft.com/en-us/library/ms181765.aspx
CASE is the answer, but you will need a separate CASE expression for each column you want returned. As long as the WHERE clause is the same, there won't be much benefit in separating it out into multiple queries.
Example:
SELECT
CASE @var
WHEN 'xyz' THEN col1
WHEN 'zyx' THEN col2
ELSE col7
END,
CASE @var
WHEN 'xyz' THEN col2
WHEN 'zyx' THEN col3
ELSE col8
END
FROM Table
...
The most obvious solutions are already listed. Depending on where the query sits (i.e. in application code) you can't always use IF statements, and the inline CASE expressions can get painful when lots of columns become conditional.
Assuming Col1, Col3 and Col7 are the same type, and likewise Col2, Col4 and Col8, you can do this:
SELECT Col1, Col2 FROM tbl WHERE @Var LIKE 'xyz'
UNION ALL
SELECT Col3, Col4 FROM tbl WHERE @Var LIKE 'zyx'
UNION ALL
SELECT Col7, Col8 FROM tbl WHERE @Var NOT LIKE 'xyz' AND @Var NOT LIKE 'zyx'
As this is a single statement, there are several performance benefits with regard to plan caching. Also, the Query Optimiser will quickly eliminate those branches where @Var doesn't match the appropriate value, without touching the storage engine.
I am developing a data validation framework where I have the requirement of checking that table fields have at least one non-null value, i.e. they shouldn't be completely empty, with every value null.
For a particular column, I can easily check using
select count(distinct column_name) from table_name;
If it's greater than 0, I can tell that the column is not empty. I already have a list of columns, so I could execute this query in a loop for every column, but that would mean a lot of requests and is not the ideal way.
What is the better way of doing this? I am using Microsoft SQL Server.
I would not recommend using count(distinct) because it incurs overhead for removing duplicate values. You can just use count().
You can construct the query for counts using a query like this:
select count(col1) as col1_cnt, count(col2) as col2_cnt, . . .
from t;
If you have a list of columns you can do this as dynamic SQL. Something like this:
declare @sql nvarchar(max);
select @sql = concat('select ',
                     string_agg(concat('count(', quotename(s.value), ') as cnt_', s.value), ', '),
                     ' from t')
from string_split(@list, ',') s;   -- @list is the comma-separated list of column names
exec sp_executesql @sql;
This might not quite work if your columns have special characters in them, but it illustrates the idea.
You should probably use exists, since you don't really need a count of anything.
You don't indicate how you want to consume the results of multiple checks, but one thing you could do is use concat to return a list of the columns meeting your criteria:
The following sample table has 5 columns, 3 of which have a value on at least 1 row.
create table t (col1 int, col2 int, col3 int, col4 int, col5 int)
insert into t select null,null,null,null,null
insert into t select null,2,null,null,null
insert into t select null,null,null,null,5
insert into t select null,null,null,null,6
insert into t select null,4,null,null,null
insert into t select null,6,7,null,null
You can have each case expression return the column's name and concatenate the results. Only the columns that have a non-null value are included, because concat_ws ignores the NULLs returned by the case expressions.
select Concat_ws(', ',
case when exists (select * from t where col1 is not null) then 'col1' end,
case when exists (select * from t where col2 is not null) then 'col2' end,
case when exists (select * from t where col3 is not null) then 'col3' end,
case when exists (select * from t where col4 is not null) then 'col4' end,
case when exists (select * from t where col5 is not null) then 'col5' end)
Result:
col2, col3, col5
I asked a similar question about a decade ago. The best way of doing this in my opinion would meet the following criteria.
Combine the requests for multiple columns together so they can all be calculated in a single scan.
If the scan encounters a non-null value in every column under consideration, allow it to exit early without reading the rest of the table/index, as reading subsequent rows won't change the result.
This is quite a difficult combination to get in practice.
The following might give you the desired behaviour
SELECT DISTINCT TOP 2 ColumnWithoutNull
FROM YourTable
CROSS APPLY (VALUES(CASE WHEN b IS NOT NULL THEN 'b' END),
(CASE WHEN c IS NOT NULL THEN 'c' END)) V(ColumnWithoutNull)
WHERE ColumnWithoutNull IS NOT NULL
OPTION ( HASH GROUP, MAXDOP 1, FAST 1)
Whether this short-circuits depends on the shape of the plan you get.
Hash match usually reads all of its build input first, meaning that no short-circuiting of the scan will happen. If the optimiser gives you the hash aggregate in "flow distinct" mode, however, query execution can potentially stop as soon as TOP receives its first two rows, signalling that a NOT NULL value has been found in both columns.
But there is no hint to request that mode for the hash aggregate, so you are dependent on the whims of the optimiser as to whether you get it in practice. The various hints added to the query above are an attempt to point it in that direction, however.
I would like to write a query
Select col1, col2
from table
where col1 = 'blah' or 'blah2' or 'blah3'
and col2 = 'blah' or 'blah2' or 'blah3'
I am used to writing them like this for a SINGLE option
select
col1, col2
from
table
where
col1 = :col1 and col2 = :col2
Parameters.AddWithValue(":col1", 'blah')
Parameters.AddWithValue(":col2", 'blah')
Now I want to add several options with OR between them, and obviously the above code won't work. The SQL is for SQLite. Can anyone suggest how I could do this? I may potentially have more than 3 different values for each parameter. I have tried searching but the answer is elusive.
You still have to use complete expressions, i.e., you need to write col1 = or col2 = every time.
Alternatively, use IN:
SELECT ... WHERE col1 IN (:c11, :c12, :c13) AND col2 IN (:c21, :c22, :c23);
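For instance, with the literal values from the question (the table name mytable is assumed), the expanded query would be:
SELECT col1, col2
FROM mytable
WHERE col1 IN ('blah', 'blah2', 'blah3')
  AND col2 IN ('blah', 'blah2', 'blah3');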
I am trying to write a PL/SQL procedure which will have the SQL query to get the results. But the requirement is that the order by can be dynamic and is mainly for sorting the columns in the screen. I am passing 2 parameters to this procedure - in_sort_column and in_sort_order.
The requirement is such that on text columns the sorting is in ASC and for numbers it is DESC.
My query looks something like this, without the in_sort_order parameter -
SELECT col1, col2, col3 from table1 where col1 > 1000
ORDER BY decode(in_sort_column,'col1', col1, 'col2', col2, 'col3', col3);
I am not able to figure out how to use the in_sort_order parameter in this case. Can someone who has done this before help out?
Thanks
When doing a dynamic sort, I recommend using separate clauses:
order by (case when in_sort_column = 'col1' then col1 end),
(case when in_sort_column = 'col2' then col2 end),
(case when in_sort_column = 'col3' then col3 end)
This guarantees that you will not have an unexpected problem with type conversion if the columns are of different types. Note that CASE returns NULL when no ELSE clause is given.
Since the requirement is based on data type, you could just negate the numeric columns in your decode; if col1 is numeric and the others are text then:
ORDER BY decode(in_sort_column, 'col1', -col1, 'col2', col2, 'col3', col3);
But this is going to attempt to convert the text columns to numbers. You can swap the decode around to avoid that, but then you do an implicit conversion of your numeric column to a string, and your numbers will be sorted alphabetically - so 2 comes after 10, for example.
So Gordon Linoff's use of case is better, and you can still negate the col1 value with that to make the numbers effectively sort descending.
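If you also need to honour the in_sort_order parameter explicitly, rather than deriving the direction from the data type, one possible sketch (illustrative only, not tested against your schema) pairs each column with each direction:
SELECT col1, col2, col3
FROM table1
WHERE col1 > 1000
ORDER BY
  CASE WHEN in_sort_column = 'col1' AND in_sort_order = 'ASC'  THEN col1 END ASC,
  CASE WHEN in_sort_column = 'col1' AND in_sort_order = 'DESC' THEN col1 END DESC,
  CASE WHEN in_sort_column = 'col2' AND in_sort_order = 'ASC'  THEN col2 END ASC,
  CASE WHEN in_sort_column = 'col2' AND in_sort_order = 'DESC' THEN col2 END DESC,
  CASE WHEN in_sort_column = 'col3' AND in_sort_order = 'ASC'  THEN col3 END ASC,
  CASE WHEN in_sort_column = 'col3' AND in_sort_order = 'DESC' THEN col3 END DESC;
As before, each CASE returns NULL for every row except in the branch that applies, so only the selected column and direction affect the ordering.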
In a WHERE clause, using a condition like Table.Column = @Param OR @Param IS NULL does not use the INDEX on Column.
Is this true, and if so, how do I write this kind of query so that it does use the INDEX?
Query Example
SELECT Col1, Col2 ...
FROM Table
WHERE (Col1 = @col OR @col IS NULL)
AND (Col2 = @col2 OR @col2 IS NULL)
AND (Col3 = @col3 OR @col3 IS NULL)
Any help.
Unfortunately, the generation of execution plans does not behave as you might expect.
For that single query, a single plan is created. In creating that plan the indexes to use are selected and fixed. It doesn't matter what parameters you provide; the same plan, same indexes, etc., are always used.
The optimiser has tried to find the best plan that can fit all eventualities, but by the nature of this type of query there isn't one, a characteristic borne out by the plan you have not using an index at all.
The solution is to use dynamic SQL. This feels untidy, but if you use parameterised queries with sp_executesql, it can actually be quite structured, and very performant.
Here is a link to a very useful article on the subject: dynamic search
It's very in-depth, but it is a very robust approach to this problem.
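As a rough illustration of that approach (the table name and parameter data types are assumptions, not from the question), only the predicates whose parameters are non-NULL are appended, so each combination compiles to its own plan and can use the matching index:
DECLARE @col int = 5, @col2 int = NULL, @col3 int = NULL;   -- assumed types and sample values
DECLARE @sql nvarchar(max) = N'SELECT Col1, Col2 FROM dbo.MyTable WHERE 1 = 1';
IF @col  IS NOT NULL SET @sql += N' AND Col1 = @col';
IF @col2 IS NOT NULL SET @sql += N' AND Col2 = @col2';
IF @col3 IS NOT NULL SET @sql += N' AND Col3 = @col3';
EXEC sp_executesql @sql,
     N'@col int, @col2 int, @col3 int',
     @col = @col, @col2 = @col2, @col3 = @col3;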
SELECT Col1, Col2 ...
FROM Table
WHERE EXISTS(
SELECT Col1, Col2, Col3
INTERSECT
SELECT #col, #col2, #col3)
Intuitively, this seems like it should perform very badly, but SQL Server's query optimiser knows how to give INTERSECT special treatment, and internally translates it to (pseudo-SQL)
SELECT Col1, Col2 ...
FROM Table
WHERE (Col1, Col2, Col3) IS (@col, @col2, @col3)
as you can see in the query plan. If you have indices on these columns, they can and do get used.
I originally picked this up from Paul White's Undocumented Query Plans: Equality Comparisons blog post, which may be an interesting further read.
Why not try this:
SELECT Col1, Col2 ...
FROM Table
WHERE Col1 = IsNull(@col, Col1)
AND Col2 = IsNull(@col2, Col2)
AND Col3 = IsNull(@col3, Col3)
About your question:
Does your query analyzer say it doesn't use the index on columns 1, 2 and 3? Did you make an index over all 3 columns? Then it should use it regardless of the OR ... IS NULL parts.
Try to have an index on all WHERE clause columns and use the more structured query given below:
SELECT Col1, Col2 ...
FROM Table
WHERE Col1 = COALESCE(@col, Col1)
AND Col2 = COALESCE(@col2, Col2)
AND Col3 = COALESCE(@col3, Col3)
The COALESCE() function returns its first non-null argument, so if @col is NULL the condition becomes Col1 = Col1.
I wish to do something like this:
DECLARE @IgnoreNulls = 1;
SELECT Col1, Col2
FROM tblSimpleTable
IF @IgnoreNulls
BEGIN
WHERE Col2 IS NOT NULL
END
ORDER BY Col1 DESC;
The idea is to, in a very PHP/ASP.NET-ish kinda way, only filter NULLs if the user wishes to. Is this possible in T-SQL? Or do we need one large IF block like so:
IF @IgnoreNulls
BEGIN
SELECT Col1, Col2
FROM tblSimpleTable
WHERE Col2 IS NOT NULL
ORDER BY Col1 DESC;
END
ELSE
BEGIN
SELECT Col1, Col2
FROM tblSimpleTable
ORDER BY Col1 DESC;
END
You can do that this way:
SELECT Col1, Col2
FROM tblSimpleTable
WHERE ( @IgnoreNulls != 1 OR Col2 IS NOT NULL )
ORDER BY Col1 DESC
Dynamically changing searches based on the given parameters is a complicated subject, and doing it one way over another, even with only a very slight difference, can have massive performance implications. The key is to get a good query execution plan that uses an index; don't worry about compact code or about repeating yourself.
Read this and consider all the methods. Your best method will depend on your parameters, your data, your schema, and your actual usage:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
In general (unless the table is small) the best approach is to separate out the cases and do something like you have in your question.
IF (@IgnoreNulls = 1)
BEGIN
SELECT Col1, Col2
FROM tblSimpleTable
WHERE Col2 IS NOT NULL
ORDER BY Col1 DESC;
END
ELSE
BEGIN
SELECT Col1, Col2
FROM tblSimpleTable
ORDER BY Col1 DESC;
END
This is less likely to cause you problems with sub-optimal query plans being cached.