Eliminate division by zero using an SQL Server CASE statement

I am trying to eliminate this division by zero error using CASE in my T-SQL SELECT statement. For some reason, I keep getting the error. Here is my logic.
SELECT
CASE
WHEN tb1.wpf = 0.000 THEN '0.000'
ELSE SUM(tb2.weight/tb1.wpf)
END AS Average
FROM Table1 tb1, table2 tb2
GROUP BY tb1.wpf
I did not include joins and all my logic to keep my question specific to this case. How can I get rid of this error?

As written, the CASE wraps the aggregate instead of guarding each row, so the division still runs on rows where wpf is zero. Move the CASE inside so it is evaluated per row, then aggregate the guarded values:
SELECT SUM(Average) FROM (
SELECT
CASE
WHEN tb1.wpf = 0 THEN 0
ELSE tb2.weight / tb1.wpf
END AS Average
FROM Table1 tb1, table2 tb2
) a
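To see the pattern end to end, here is a minimal, self-contained sketch in Python with SQLite (the table and column names mirror the question, but the data is made up): the CASE runs per row, so the row with a zero divisor contributes 0 instead of raising an error.

```python
import sqlite3

# Minimal sketch of the accepted pattern: put the CASE inside the
# aggregate so the guard runs per row, then SUM the guarded values.
# Table contents are illustrative, not from the question.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (weight REAL, wpf REAL)")
con.executemany("INSERT INTO t VALUES (?, ?)",
                [(10.0, 2.0), (9.0, 3.0), (5.0, 0.0)])  # one zero divisor

row = con.execute("""
    SELECT SUM(CASE WHEN wpf = 0 THEN 0
                    ELSE weight / wpf END) AS total
    FROM t
""").fetchone()
print(row[0])  # 10/2 + 9/3 + 0 = 8.0
```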

Related

Why using COALESCE or CASE keeps returning null

I have the following SQL query:
(SELECT ROUND(SUM(NBTOSUM)/1000000,1) FROM MyTable t2 WHERE t2.ELEMNAME IN ('A','B','C'))
Which works fine.
But where there are no 'A','B','C' rows, the result of the select is (null).
So to handle it, I did the following:
(SELECT COALESCE(ROUND(SUM(NBTOSUM)/1000000,1),0) FROM MyTable t2 WHERE t2.ELEMNAME IN ('A','B','C'))
And I also tried:
(SELECT
CASE
WHEN SUM(NBTOSUM)/1000000 IS NULL THEN 0
ELSE ROUND(SUM(NBTOSUM)/1000000,1)
END
FROM MyTable t2 WHERE t2.ELEMNAME IN ('A','B','C'))
But both keep returning null
What am I doing wrong?
Move the WHERE restrictions to the CASE expression as well:
SELECT ROUND(SUM(CASE WHEN t2.ELEMNAME IN ('A','B','C')
THEN NBTOSUM ELSE 0 END) / 1000000, 1)
FROM MyTable t2;
Note that this trick solves the null problem and also avoids the need for an ugly COALESCE() call.
Your code should work as the SUM aggregation function will generate a single row of output regardless of whether the number of input rows is zero or non-zero. If there are no input rows or the values are all NULL then the output of the SUM will be NULL and then COALESCE would work.
Since you claim it does not, that suggests something else is going on in your query that you have not shared in the question.
You have parentheses around your statement, suggesting that you are using it as part of a larger statement. If so, you can try moving the COALESCE to the outer query:
SELECT COALESCE(
(
SELECT ROUND(SUM(NBTOSUM)/1000000,1)
FROM MyTable
WHERE ELEMNAME IN ('A','B','C')
),
0
)
FROM your_outer_query;
That might fix the problem if you are somehow correlating to an outer query but your question makes no mention of that.
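The behavior is easy to reproduce in a minimal sketch (Python with SQLite; the table name follows the question, the data is made up): when no rows match, the bare scalar subquery comes back NULL, and wrapping the whole subquery in COALESCE maps that NULL to 0.

```python
import sqlite3

# Sketch of the outer-COALESCE fix: SUM over zero matching rows
# returns NULL, so the scalar subquery is NULL; COALESCE around the
# whole subquery turns that into 0.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MyTable (ELEMNAME TEXT, NBTOSUM REAL)")
con.execute("INSERT INTO MyTable VALUES ('X', 5000000)")  # no A/B/C rows

bare = con.execute("""
    SELECT (SELECT ROUND(SUM(NBTOSUM)/1000000, 1)
            FROM MyTable WHERE ELEMNAME IN ('A','B','C'))
""").fetchone()[0]
fixed = con.execute("""
    SELECT COALESCE((SELECT ROUND(SUM(NBTOSUM)/1000000, 1)
                     FROM MyTable WHERE ELEMNAME IN ('A','B','C')), 0)
""").fetchone()[0]
print(bare, fixed)  # None 0
```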

Any way to use the IN operator in the SELECT statement? If not, why?

This may come off as a feature request more than anything, but it would be nice if SQL allowed use of the IN operator in a select statement such as the one below. I want to create new_variable in table1 based on the ID variable in table2, hence the case statement.
select ID,
case when ID in (select ID
from table2)
then 1
else 0
end as new_variable
from table1
I understand that SQL will give me an error if I run this, but why is that the case? It doesn't seem obvious to me why SQL developers couldn't enable the IN operator to be used outside of the WHERE clause.
Side note: I'm currently using a left join to avoid this issue, so I am not hung up on this.
select a.ID,
case when ifnull(b.ID, 0) = 0 then 0
else 1
end as variable_name
from table1 as a
left join (select ID from table2) as b
on a.ID = b.ID
SQL definitely supports this:
select ID,
(case when ID in (select ID from table2)
then 1 else 0
end) as new_variable
from table1
Note that there is a comma after id.
This is standard SQL. If your database doesn't support it, it is a feature request (and one that all or almost all databases support).
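A quick check that IN does work inside a CASE in the select list (sketched in Python with SQLite; the table names follow the question, the sample data is made up):

```python
import sqlite3

# IN inside a CASE expression in the SELECT list is standard SQL.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (ID INTEGER)")
con.execute("CREATE TABLE table2 (ID INTEGER)")
con.executemany("INSERT INTO table1 VALUES (?)", [(1,), (2,), (3,)])
con.execute("INSERT INTO table2 VALUES (2)")

rows = con.execute("""
    SELECT ID,
           CASE WHEN ID IN (SELECT ID FROM table2)
                THEN 1 ELSE 0 END AS new_variable
    FROM table1
    ORDER BY ID
""").fetchall()
print(rows)  # [(1, 0), (2, 1), (3, 0)]
```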

How to force AVG to include zeros

I am averaging values from a column in a subquery that includes zeros. The average seems to be ignoring zero values and giving me an inflated result.
I ran the subquery alone and I can see zeros in column F (UWDays), the one I am trying to average. I tried the same query but replaced avg(mm.UWdays) with avg(NULLIF(mm.days,0)) and again got the same values as the original pull.
SELECT mm.month, mm.FlagA, mm.FlagB, mm.FlagC, avg(mm.UWdays) AS UWDays
FROM (
select date_trunc('Month', dates_table.month) as month,
customer_table.customer_id,
Case WHEN (table1.attribute1 LIKE '%Yes%' AND table2.attribute2 LIKE '%Yes%' and table3.attribute3 NOT LIKE '%Yes%') THEN 1 ELSE 0 END AS FlagA,
Case WHEN (table1.attribute1 LIKE '%Yes%' AND table2.attribute2 LIKE '%Yes%' AND table3.attribute3 LIKE '%Yes%') THEN 1 ELSE 0 END AS FlagB,
Case WHEN (table1.attribute1 LIKE '%Yes%' AND table2.attribute2 LIKE '%No%' AND table3.attribute3 LIKE '%Yes%') THEN 1 ELSE 0 END AS FlagC,
CASE WHEN (min(table1.date) is null AND max(table1.date) is null) THEN 0 ELSE count(table1.date) end AS UWDays
FROM customer_table cross join dates_table
left outer join table1 ON customer_table.customer_id= table1.customer_id
left outer join table2 on customer_table.customer_id= table2.customer_id
left outer join table3 on customer_table.customer_id= table3.customer_id
group by 1,2,3,4,5
order by 2,1) mm
GROUP BY 1,2,3,4
AVG() does not exclude zeroes. However, it does ignore NULL values, so perhaps that is what you mean -- particularly because your query has LEFT JOINs which would tend to generate NULL values.
You can treat NULL values as 0 using COALESCE():
avg(coalesce(mm.UWdays, 0))
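The difference is easy to demonstrate (Python with SQLite; data made up): AVG skips the NULL row entirely, while after COALESCE the former NULL counts as a zero and pulls the average down.

```python
import sqlite3

# AVG ignores NULLs (common after LEFT JOINs) but does count zeros;
# COALESCE turns NULLs into zeros so they are included in the average.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (UWdays INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(4,), (2,), (None,)])

ignoring = con.execute("SELECT AVG(UWdays) FROM t").fetchone()[0]
counting = con.execute("SELECT AVG(COALESCE(UWdays, 0)) FROM t").fetchone()[0]
print(ignoring, counting)  # 3.0 2.0
```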

Using a self join/inner join to compute a column that adds all the above rows' data, in SQL Server

I am working with SQL server, the query details are as below:
I need to compute a calculated column whose value is the sum of all the rows above it. I can't use correlated queries, LAG, or LEAD, as they are not supported.
I tried using self-join/inner join/left outer join, the only problem I am facing is due to group by clause of other columns the result is not coming as expected.
For example, the sample data, the expected output (a running total), and the output I am actually getting (wrong because of the GROUP BY clause applied on Column4) were shown as screenshots.
Is there some alternative of GROUP BY clause or other alternatives?
In SQL Server 2012+, you would simply use cumulative sum:
select d.*, sum(column3) over (order by column2)
from data d;
For versions earlier than SQL Server 2012, use a correlated subquery:
SELECT Column1,Column2,Column3,
( SELECT SUM(y.Column3)
FROM Data y
WHERE y.Column1 = x.Column1
AND y.Column2 <= x.Column2
) AS [Column5(Summation of Column3)]
FROM Data x
ORDER BY 1 ,2 ,3;
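The correlated-subquery version can be sketched like this (Python with SQLite; the Data table and column names follow the answer, the rows are made up). Each row's subquery sums every row at or before it within the same Column1 group:

```python
import sqlite3

# Pre-2012 running-total technique: a correlated subquery sums every
# row at or before the current one.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Data (Column1 TEXT, Column2 INTEGER, Column3 INTEGER)")
con.executemany("INSERT INTO Data VALUES (?, ?, ?)",
                [('a', 1, 10), ('a', 2, 20), ('a', 3, 5)])

rows = con.execute("""
    SELECT Column1, Column2, Column3,
           (SELECT SUM(y.Column3) FROM Data y
             WHERE y.Column1 = x.Column1
               AND y.Column2 <= x.Column2) AS RunningTotal
    FROM Data x
    ORDER BY 1, 2
""").fetchall()
print(rows)  # running totals: 10, 30, 35
```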
It is resolved!!
Earlier my query was like:
SELECT a.*,
CASE WHEN col4 != 'A' THEN SUM(col3) ELSE 0 END
FROM a
GROUP BY col4, ...
This GROUP BY was causing the issue, as it was not giving the expected output.
The thing I changed in the query is:
SELECT a.*,
SUM(CASE WHEN col4 != 'A' THEN col3 ELSE 0 END)
FROM a
GROUP BY (CASE WHEN col4 != 'A' THEN col3 ELSE 0 END)
Now it is giving the result as expected.
Thank you everyone for the help!!

Joining two datasets with subqueries

I am attempting to join two large datasets using BigQuery. They have a common field; however, the field has a different name in each dataset.
I want to count number of rows and sum the results of my case logic for both table1 and table2.
I believe that I have errors resulting from subquery (subselect?) and syntax errors. I have tried to apply precedent from similar posts but I still seem to be missing something. Any assistance in getting this sorted is greatly appreciated.
SELECT
table1.field1,
table1.field2,
(
SELECT COUNT (*)
FROM table1) AS table1_total,
sum(case when table1.mutually_exclusive_metric1 = "Y" then 1 else 0 end) AS t1_pass_1,
sum(case when table1.mutually_exclusive_metric1 = "Y" AND table1.mutually_exclusive_metric2 IS null OR table1.mutually_exclusive_metric3 = 'Y' then 1 else 0 end) AS t1_pass_2,
sum(case when table1.mutually_exclusive_metric3 ="Y" AND table1.mutually_exclusive_metric2 ="Y" AND table1.mutually_exclusive_metric3 ="Y" then 1 else 0 end) AS t1_pass_3,
(
SELECT COUNT (*)
FROM table2) AS table2_total,
sum(case when table2.metric1 IS true then 1 else 0 end) AS t2_pass_1,
sum(case when table2.metric2 IS true then 1 else 0 end) AS t2_pass_2,
(
SELECT COUNT (*)
FROM dataset1.table1 JOIN EACH dataset2.table2 ON common_field_table1 = common_field_table2) AS overlap
FROM
dataset1.table1,
dataset2.table2
WHERE
XYZ
Thanks in advance!
Sho. Let's take this one step at a time:
1) Using * is not explicit, and being explicit is good. Additionally, mixing explicit selects with * will duplicate columns under auto-renames: table1.field will come back as table1_field. Unless you are just playing around, don't use *.
2) You never joined. A query with a join looks like this (note order of WHERE and GROUP statements, note naming of each):
SELECT
t1.field1 AS field1,
t2.field2 AS field2
FROM dataset1.table1 AS t1
JOIN dataset2.table2 AS t2
ON t1.field1 = t2.field1
WHERE t1.field1 = "some value"
GROUP BY field1, field2
where t1.field1 and t2.field1 contain corresponding values. You wouldn't repeat those in the select.
3) Use whitespace to make your code easier to read. It helps everyone involved, including you.
4) Your subselects are pretty useless. A subselect is used instead of creating a new table; for example, to group or filter out data from an existing table:
SELECT
subselect.field1 AS ssf1,
subselect.max_f1 AS ss_max_f1
FROM (
SELECT
t1.field1 AS field1,
MAX(t1.field1) AS max_f1
FROM dataset1.table1 AS t1
GROUP BY field1
) AS subselect
The subselect is practically a new table that you select from. Treat it logically like it happens first, and you take the results from that and use it in your main select.
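Putting points 2 and 4 together, here is a minimal sketch (Python with SQLite; all table names, fields, and data are illustrative) that joins on the differently-named common fields and computes the conditional counts in one pass:

```python
import sqlite3

# Join on the common field (the names can differ; the ON clause maps
# them) and aggregate conditional flags over the joined rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (common_field_table1 INTEGER, metric1 TEXT)")
con.execute("CREATE TABLE table2 (common_field_table2 INTEGER, metric2 INTEGER)")
con.executemany("INSERT INTO table1 VALUES (?, ?)",
                [(1, 'Y'), (2, 'N'), (3, 'Y')])
con.executemany("INSERT INTO table2 VALUES (?, ?)",
                [(1, 1), (3, 0)])

row = con.execute("""
    SELECT COUNT(*) AS overlap,
           SUM(CASE WHEN t1.metric1 = 'Y' THEN 1 ELSE 0 END) AS t1_pass,
           SUM(CASE WHEN t2.metric2 = 1 THEN 1 ELSE 0 END) AS t2_pass
    FROM table1 AS t1
    JOIN table2 AS t2
      ON t1.common_field_table1 = t2.common_field_table2
""").fetchone()
print(row)  # (2, 2, 1): 2 overlapping rows, 2 t1 passes, 1 t2 pass
```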
5) This was a terrible question. It didn't even look like you tried to figure things out one step at a time.