Best way to write union query when dealing with NULL and Empty String values - sql

I have to write a query that performs a union between two tables with similar data. The results need to be distinct. The problem I have is that some fields that should be the same are not when it comes to empty values. Some are indicated as null, and some have empty string values. My question is, is there a better way to perform the following query? (without fixing the actual data to ensure proper defaults are set, etc) Will using the Case When be a big performance hit?
Select
Case When Column1 = '' Then NULL Else Column1 End as [Column1],
Case When Column2 = '' Then NULL Else Column2 End as [Column2]
From TableA
UNION ALL
Select
Case When Column1 = '' Then NULL Else Column1 End as [Column1],
Case When Column2 = '' Then NULL Else Column2 End as [Column2]
From TableB

I don't think it would make any difference in performance, but NULLIF is another way to write this and, IMHO, looks a little cleaner.
Select
NULLIF(Column1, '') as [Column1],
NULLIF(Column2, '') as [Column2]
From TableA
UNION
Select
NULLIF(Column1, '') as [Column1],
NULLIF(Column2, '') as [Column2]
From TableB

Use UNION to remove duplicates; it's slower than UNION ALL, but the extra work is exactly what removes the duplicates:
SELECT CASE
           WHEN LEN(LTRIM(RTRIM(column1))) = 0 THEN NULL
           ELSE column1
       END AS column1,
       CASE
           WHEN LEN(LTRIM(RTRIM(column2))) = 0 THEN NULL
           ELSE column2
       END AS column2
FROM TableA
UNION
SELECT CASE
           WHEN LEN(LTRIM(RTRIM(column1))) = 0 THEN NULL
           ELSE column1
       END,
       CASE
           WHEN LEN(LTRIM(RTRIM(column2))) = 0 THEN NULL
           ELSE column2
       END
FROM TableB
I changed the logic to return NULL if the column value contains any number of spaces and no actual content.
CASE expressions are ANSI standard, and more customizable than the NULLIF shorthand.

A Case should perform fine, but IsNull is more natural in this situation. And if you're searching for distinct rows, doing a union instead of a union all will accomplish that (thanks to Jeffrey L Whitledge for pointing this out):
select IsNull(col1, '')
, IsNull(col2, '')
from TableA
union
select IsNull(col1, '')
, IsNull(col2, '')
from TableB

You can keep your manipulation operations separate from the union by doing whatever manipulation you want (substituting NULL for the empty string) in a separate view per table, then unioning the views.
You shouldn't have to apply the same manipulation to both sets, though.
If both sets need identical treatment, union them first, then apply the manipulation to the resulting, unioned set once.
That's half as much manipulation code to support.
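A minimal sketch of that union-first approach, reusing the NULLIF cleanup from the earlier answer (table and column names are taken from the question):
-- Normalize once, after combining the two sets; DISTINCT then treats '' and NULL rows as one
SELECT DISTINCT
    NULLIF(Column1, '') AS Column1,
    NULLIF(Column2, '') AS Column2
FROM (
    SELECT Column1, Column2 FROM TableA
    UNION ALL
    SELECT Column1, Column2 FROM TableB
) AS Combined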

Related

Create custom column based off of other columns in SQL query (SQL Server)

I'm having a hard time finding the correct syntax to do the following:
SELECT
ColumnA,
ColumnB,
ColumnC,
(if Column1 IS Null and Column2 IS NOT NULL) Then 'Pending' Else '' AS ColumnD
I've tried IF/ELSE, IIF(), but I can't seem to get these queries to work.
Use a CASE WHEN expression:
SELECT ColumnA,ColumnB,ColumnC,
case when Column1 IS Null and Column2 IS NOT NULL Then 'Pending' Else '' end AS ColumnD
from yourtable

Case when statement in SQL

I am using the following query. In this query I want to apply the WHERE clause based on a passed parameter. The issue is that the WHERE clause needs to behave like 'if parameterVal = 'I' then NULL else NOT NULL'.
I've built a query like this
SELECT * FROM MASTER
WHERE
Column1 IS (CASE WHEN :Filter = 'I' THEN 'NULL' ELSE 'NOT NULL' END)
but it's not working. Help me solve this.
UPDATE
Updating the question to explain it more clearly.
I have one table, MASTER. I am passing one parameter to the query, Filter (indicated by :Filter in the query).
Now when the Filter parameter's value is 'I', it should return the following result.
SELECT * FROM MASTER WHERE Column1 IS NULL
but if the passed argument is not equal to 'I', then
SELECT * FROM MASTER WHERE Column1 IS NOT NULL
SELECT * FROM MASTER
WHERE (:Filter = 'I' AND Column1 IS NULL)
   OR (:Filter <> 'I' AND Column1 IS NOT NULL)
If you really insist on using a CASE, the SELECT could be rewritten as:
SELECT *
FROM MASTER
WHERE CASE
          WHEN COLUMN1 IS NULL AND :Filter = 'I' THEN 1
          WHEN COLUMN1 IS NOT NULL AND :Filter <> 'I' THEN 1
          ELSE 0
      END = 1
Frankly, though, I think that this is very difficult to interpret, and I suggest that @MAli's version is better.
Your CASE has an assignment, not an equality check.

Faster way of doing multiple checks on one dataset

Is there a better way to rewrite the following:
SELECT DISTINCT Column1, 'Testing #1'
FROM MyTable
WHERE Column2 IS NOT NULL AND Column3='Value'
UNION ALL
SELECT DISTINCT Column1, 'Testing #2'
FROM MyTable
WHERE Column3 IS NULL AND Column2='Test Value'
UNION ALL
SELECT DISTINCT Column1, 'Testing #3'
FROM MyTable
Where ....
I have about 35 UNION ALL statements that all query the same table. I was wondering if there's an easier/faster way to do this.
Yes, you can rewrite it with a CASE expression like this:
SELECT Column1,
CASE WHEN Column2 IS NOT NULL AND Column3='Value' THEN 'Testing #1'
WHEN Column3 IS NULL AND Column2='Test Value' THEN 'Testing #2'
ELSE 'Testing #3' END as customcol
FROM MyTable
EDIT: OK, I am making this edit because, according to your comment, there are two issues we need to address. (I am leaving the original answer as it is in case it might help somebody.)
1) Result set should be filtered and there should be no else part.
This is actually achievable with this solution, since ELSE is optional and the data can be filtered with a WHERE clause at the end (see the sketch at the end of this answer).
2) Being able to select the same row multiple times with different Testing # values if it matches the criteria.
This, however, is not achievable with my previous solution, so I thought of a different one. Hope it fits your case. Here it is:
S1 - Create a new table with Testing # values(Testing #1, Testing #2, Testing #3 etc.). Let's say this table is named Testing.
S2 - JOIN your main table (MyTable) with Testing table which contains Testing # values. So now you have every possible combination of real-data and testing values.
S3 - Filter the results you don't want to appear with a where clause.
S4 - Filter the real-data <-> testing combinations with an addition to where clause.
The end query should look something like this:
SELECT M.Column1, T.TestingValue
FROM MyTable M
INNER JOIN Testing T ON 1=1
WHERE
(
(M.Column2 IS NOT NULL AND M.Column3='Value' AND T.TestingValue='Testing #1') OR
(M.Column3 IS NULL AND M.Column2='Test Value' AND T.TestingValue='Testing #2') OR
<conditions for other testing values>
)
AND
<other conditions>
I think this should work and produce the results you want. But since I don't have the data, I am not able to run any benchmarks against the union-based solution, so I don't have any scientific evidence to claim this is faster, but it is an option. You can test both and use the better one.
It might be a little late but hope this solves your problem.
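For completeness, here is a minimal sketch of the point-1 idea from the edit above (omit the ELSE, then filter with a WHERE clause at the end); the column names and test values are taken from the question:
SELECT Column1, customcol
FROM (
    SELECT Column1,
           CASE WHEN Column2 IS NOT NULL AND Column3 = 'Value' THEN 'Testing #1'
                WHEN Column3 IS NULL AND Column2 = 'Test Value' THEN 'Testing #2'
                -- no ELSE: rows matching none of the tests come back as NULL
           END AS customcol
    FROM MyTable
) t
WHERE customcol IS NOT NULL  -- filter the unmatched rows out at the end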
You can do this in one statement, but you want a different column for each test:
select column1,
       (case when column2 is not null and column3 = 'Value' then 1 else 0
        end) as Test1,
       (case when column3 is null and column2 = 'Test Value' then 1 else 0
        end) as Test2,
       . . .
from t;
Because you only want cases where things fail, you can put this in a subquery and test for any failure:
select *
from (select column1,
             (case when column2 is not null and column3 = 'Value' then 1 else 0
              end) as Test1,
             (case when column3 is null and column2 = 'Test Value' then 1 else 0
              end) as Test2,
             . . .
      from t
     ) t
where test1 + test2 + . . . > 0

Is it possible to return multiple columns using 1 case statement?

I have 4 CASE statements that use exactly the same CASE criteria, but they all have different THEN/ELSE results.
Is it possible to do this all in one, or do I need to separate these all out and copy and paste the code multiple times?
,CASE WHEN lm.Id IN ('1','2','3') THEN lm.name ELSE lm.Desc END AS [Column1]
,CASE WHEN lm.Id IN ('1','2','3') THEN '3' ELSE '1' END AS [Column2]
,CASE WHEN lm.Id IN ('1','2','3') THEN 'True' ELSE 'False' END AS [Column3]
Is it possible to do this with less code?
I don't think this is possible in strict SQL. Some DB engines may support it as an extension. You could probably accomplish functionally the same thing through some other mechanism, though... possibly with a JOIN, or a UNION.
I suggest UNIONing your result sets. It won't get you fewer lines of code, but it may be more readable:
SELECT [name], '3', 'True'
From Mytable WHERE ID IN ('1','2','3')
UNION
SELECT [desc], '1', 'False'
From Mytable WHERE ID NOT IN ('1','2','3')
Why don't you try updating the table using WHERE? In the SELECT statement from your question you can declare Column1, Column2 and Column3 as NULL, and then change the values with two UPDATE statements.
With "only" three columns depending on the same CASE condition, the code below doesn't save much typing (or execution time..?), but it comes in handy when you have more than three...
UPDATE MyTable
SET Column1 = name,
    Column2 = '3',
    Column3 = 'True'
WHERE Id IN ('1','2','3')

UPDATE MyTable
SET Column1 = [Desc],
    Column2 = '1',
    Column3 = 'False'
WHERE Id NOT IN ('1','2','3')
For the example you give, I would not try to make any change. If your test ( WHEN ... THEN ) involved a lot more calculation, or if it was repeated a lot more often, you could consider setting up a subquery to evaluate it. But with only a small amount of repetition, why bother? The code you have is easy to read, and not expensive to execute.
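If the test were that expensive, a minimal sketch of the subquery idea might look like this (MyTable, the IsMatch flag, and the derived-table alias are illustrative; the column names follow the question):
SELECT CASE WHEN t.IsMatch = 1 THEN t.name ELSE t.[Desc] END AS [Column1],
       CASE WHEN t.IsMatch = 1 THEN '3' ELSE '1' END AS [Column2],
       CASE WHEN t.IsMatch = 1 THEN 'True' ELSE 'False' END AS [Column3]
FROM (
    SELECT lm.*,
           CASE WHEN lm.Id IN ('1','2','3') THEN 1 ELSE 0 END AS IsMatch  -- the shared test, evaluated once
    FROM MyTable lm
) t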

SQL Server 2005/2008 Group By statement with parameters without using dynamic SQL?

Is there a way to write a SQL Server stored procedure that is strongly typed (i.e. returns a known result set of columns) but has a dynamic GROUP BY clause?
Something like:
SELECT SUM( Column0 ) FROM Table1
GROUP BY @MyVar
I tried the following also:
SELECT SUM( Column0 ) FROM Table1
GROUP BY CASE @MyVar WHEN 'Column1' THEN Column1 ELSE Column2 END
The second statement works only in scenarios where the db types of Column1 and Column2 are the same. If they are not the SQL engine throws an error saying something similar to: "Conversion failed when converting the nvarchar value 'SYSTEM' to data type [The Different TYPE]."
What can I do to achieve strong result set and yet have some dynamic part - i.e. the grouper in my case? This will be exposed to LINQ.
EDIT:
It seems you can do it, but you should NOT! It is absolute overkill. Testing showed execution plans thousands of times slower, and it will only get slower with bigger result sets.
You can group on a constant, which might be useful:
SELECT
SUM(Column0),
CASE @MyVar WHEN 'Column1' THEN Column1 ELSE '' END AS MyGrouping
FROM
Table1
GROUP BY
CASE @MyVar WHEN 'Column1' THEN Column1 ELSE '' END
Edit: For datatype mismatch and multiple values and this allows you to group on both columns...
SELECT
SUM(Column0),
CASE @MyVar WHEN 'Column1' THEN Column1 ELSE NULL END AS Column1,
CASE @MyVar WHEN 'Column2' THEN Column2 ELSE NULL END AS Column2
FROM
Table1
GROUP BY
CASE @MyVar WHEN 'Column1' THEN Column1 ELSE NULL END,
CASE @MyVar WHEN 'Column2' THEN Column2 ELSE NULL END
You are about to shoot yourself in the foot and are asking for a bigger bullet.
The only sensible approach to this is to separate the IF into a T-SQL flow control statement:
IF (0 = @MyVar)
    SELECT SUM(Column0) FROM Table1 GROUP BY Column1;
ELSE IF (1 = @MyVar)
    SELECT SUM(Column0) FROM Table1 GROUP BY Column2;
ELSE IF (2 = @MyVar)
    SELECT SUM(Column0) FROM Table1 GROUP BY Column3;
The last thing you want from the query optimizer is to generate a plan that has to GROUP BY a condition that is determined by a variable.