Why does a CASE WHEN give me a different result to a WHERE with the same condition? - sql

I'm trying to use a CASE WHEN and a PIVOT to filter some data, but I get a total count of 0. When I use the same condition in a WHERE clause, I get a result that's more what I expected.
SELECT * FROM(SELECT FYEAR,
CASE WHEN (DIAG_3_01 IN ('E10','E11','E12','E13','E14','O24') OR DIAG_4_01 IN ('E232','N251','P702')) AND (OPERTN_3_01 IN ('N26','P22','X07','X09','X10','X11') OR OPERTN_4_01 IN ('Q011','X215','X216','X273','X121')) THEN 'a'
ELSE 'Other' END AS 'Procs',
(FCE)
FROM database
) AS a
PIVOT(COUNT(FCE) FOR [Procs] IN ([a])) AS p;
So this results in a table with column name a and a row value of 0, whereas this code results in a total of about 4000:
SELECT COUNT(FCE)
FROM database
WHERE (DIAG_3_01 IN ('E10','E11','E12','E13','E14','O24') OR DIAG_4_01 IN ('E232','N251','P702')) AND (OPERTN_3_01 IN ('N26','P22','X07','X09','X10','X11') OR OPERTN_4_01 IN ('Q011','X215','X216','X273','X121'));
Unfortunately I can't share the database contents, but I would appreciate any insight as to why this might be happening.

The CASE expression is like an if-statement: for each row it returns the result of the first WHEN whose condition is met. You can check multiple conditions with further WHEN branches, and if none of them is met, the query returns whatever is in the ELSE branch.
The WHERE clause just filters the result, meaning it tells the query which rows to keep and which to ignore.
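To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The episodes table, its columns and its rows are invented for illustration, and the condition is much shorter than the one in the question. It only shows the two mechanisms side by side; it does not reproduce the PIVOT or explain the 0 the asker saw.

```python
import sqlite3

# Hypothetical table and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE episodes (fce INTEGER, diag TEXT)")
conn.executemany(
    "INSERT INTO episodes VALUES (?, ?)",
    [(1, "E10"), (1, "E11"), (1, "Z99"), (1, None)],
)

# CASE labels every row; no rows are removed, so 'Other' rows are still there.
labelled = conn.execute(
    "SELECT CASE WHEN diag IN ('E10', 'E11') THEN 'a' ELSE 'Other' END AS procs, "
    "COUNT(fce) FROM episodes GROUP BY procs ORDER BY procs"
).fetchall()

# WHERE removes the non-matching rows before anything is counted.
filtered = conn.execute(
    "SELECT COUNT(fce) FROM episodes WHERE diag IN ('E10', 'E11')"
).fetchone()[0]

print(labelled)
print(filtered)
```

In this sketch the count for label 'a' and the WHERE count agree, which is what you would normally expect when the two conditions really are identical.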

Related

Display 0 where condition doesn't match

I need to write a query that displays 0 when a source system hasn't sent any data for today. If I write a simple query with count(*), it returns no rows. I am using a CASE expression, but I have not managed to display the actual count.
select SRC_SYS,
case
when ( select count (*) from table1 where SRC_SYS ='A')='0' then '0'
else count(1) end as Count
This works fine when there is no data, but when I have data it's not displaying the actual count.
"when I have data its not displaying actual count"
No, indeed it won't; it will display "1" because the dataset you're count(1)ing has only one row.
Just do the count from the actual table; it'll work fine whether there are 0 rows or more because the select list only contains constants or aggregates:
select 'A' as SRC_SYS, COUNT(*)
from table1
where SRC_SYS ='A'
Or if your front end is using parameters:
select @pwhatever as SRC_SYS, COUNT(*)
from table1
where SRC_SYS = @pwhatever
See this fiddle and change the where 1=0 to where 1=1 to see the count change
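As a rough check of that behaviour, here is a small sketch with Python's sqlite3 module. The one-column schema for table1 and the count_for helper are assumptions made up for this example. It relies on the fact that an aggregate query without GROUP BY returns exactly one row even when nothing matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (src_sys TEXT)")  # hypothetical schema


def count_for(src_sys):
    # No GROUP BY: the aggregate query yields one row even with zero matches.
    return conn.execute(
        "SELECT ? AS src_sys, COUNT(*) FROM table1 WHERE table1.src_sys = ?",
        (src_sys, src_sys),
    ).fetchone()


empty_result = count_for("A")  # table is still empty here
conn.executemany("INSERT INTO table1 VALUES (?)", [("A",), ("A",), ("B",)])
loaded_result = count_for("A")  # two 'A' rows now exist

print(empty_result, loaded_result)
```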

Compare value in each row to average of the column (SQL)

I am trying to create a view where the score for each row is compared to the average for that column, so that I can easily identify records by their rough "grade". Simplified code:
select recordID,
case when table.ColumnA>avg(all table.ColumnA) then 'Hard' else 'Easy' end as Difficulty
from table
group by recordID, ColumnA
I've tried various combinations of this and the CASE expression keeps defaulting to the ELSE branch. On investigation, every calculated difference comes out as 0, because the row value and the average value are treated as the same.
I have a feeling the answer has something to do with Rollup, either on this table or the source table, but the syntax required is beyond me.
Anyone?
You want a window function:
select recordID,
(case when table.ColumnA > avg(table.ColumnA) over ()
then 'Hard' else 'Easy'
end) as Difficulty
from table;
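Here is a small sketch of why the grouped version always lands on 'Easy' while the window version does not. It uses Python's sqlite3 module and assumes the bundled SQLite is 3.25 or newer, which is needed for window functions. The records table and its three rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (record_id INTEGER, column_a REAL)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)", [(1, 10.0), (2, 20.0), (3, 60.0)]
)

# Grouping by the row's own values: every group holds one row,
# so AVG(column_a) is just that row's value and '>' is never true.
grouped = conn.execute(
    "SELECT record_id, "
    "CASE WHEN column_a > AVG(column_a) THEN 'Hard' ELSE 'Easy' END "
    "FROM records GROUP BY record_id, column_a ORDER BY record_id"
).fetchall()

# AVG(...) OVER () averages across all rows without collapsing them.
windowed = conn.execute(
    "SELECT record_id, "
    "CASE WHEN column_a > AVG(column_a) OVER () THEN 'Hard' ELSE 'Easy' END "
    "FROM records ORDER BY record_id"
).fetchall()

print(grouped)
print(windowed)
```

With these rows the overall average is 30, so only record 3 is labelled 'Hard' in the windowed result.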

Specifying a column value in an aggregate function vs using a WHERE clause

I have a database people that looks like this:
I wanted to count the occurrences of state='CA'.
My first attempt was:
SELECT COUNT(state='CA')
FROM people
;
This returned 1 row with a value of 1000, so I thought that there were 1000 people from CA in the database.
This turns out to be incorrect. I know that there are 127, which I can verify with the query
SELECT COUNT(*)
FROM people
WHERE state='CA'
;
which returns 1 row with a value of 127.
I understand how the second query works. However, I do not understand what is wrong with the first one. What is it returning?
If you want to see what's going on, run the query:
select state='CA' from people;
You will see that you will get one result for each row in people, with the value 0 or 1 (or True/False). What you've selected is whether state='CA' for each row, and there will be just as many of those results as there are rows.
You can't restrict a COUNT by putting a comparison inside it; you have to do that via the WHERE clause, as in your second example.
COUNT is not SUM. Your first query does not return the number of rows where the condition is true; it returns the number of rows where the expression is not null, whether it is true or false.
If you want a filtered count, you must use a WHERE condition (as in your second query); otherwise, use an IF or a CASE expression inside the SUM() function, e.g.:
Select sum(case
when state='CA' then 1 else 0
end) as my_result from People;
Or, if you want to use COUNT, return null rather than 0 inside the count:
Select count(case
when state='CA' then 1 else null
end) as my_result from People;
Try this:
Select count(case when state='CA' then 1 else null end) as xyz from People;
The first query will work if you put a CASE WHEN inside the aggregate.
For example, the query below returns the count of CA:
SELECT sum( case when state='CA' then 1 else 0 end)
FROM people
In the first query, state='CA' is not a filter: it is an expression that SELECT evaluates for each of the 1000 rows. SELECT does not reduce the number of rows returned; it only decides what value is computed for each row.
In the second query, the WHERE clause filters the rows first, and only then does COUNT run over the rows that remain.
A query is evaluated in a logical sequence: FROM first, then WHERE, GROUP BY, HAVING, SELECT, and ORDER BY last.
To answer the actual question - why do you get 1000? I'm guessing that there are 1000 rows in your database, or at least 1000 where state is not null. Count will return the number of rows where the thing inside the () is not null and as one of your comments says, the part inside your () will return either true or false, neither of which is null, so will count them all. Your second example is of course the right way to do it.
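The variants discussed in these answers can be compared side by side. This is a minimal sketch with Python's sqlite3 module; the four-row people table and the scalar helper are invented for the example. Other databases may type the boolean differently, but the non-null counting rule is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "CA"), ("Bob", "NY"), ("Cy", "CA"), ("Dee", None)],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Counts every row where the comparison is not null (true OR false).
count_expr = scalar("SELECT COUNT(state = 'CA') FROM people")
# Filters first, then counts.
count_where = scalar("SELECT COUNT(*) FROM people WHERE state = 'CA'")
# Adds 1 for matches and 0 for everything else.
sum_case = scalar(
    "SELECT SUM(CASE WHEN state = 'CA' THEN 1 ELSE 0 END) FROM people"
)
# Non-matches become null, and COUNT skips nulls.
count_case = scalar(
    "SELECT COUNT(CASE WHEN state = 'CA' THEN 1 ELSE NULL END) FROM people"
)

print(count_expr, count_where, sum_case, count_case)
```

Only the row with a null state is skipped by COUNT(state = 'CA'), which is why the first query in the question returned the number of rows with a non-null state rather than the number of CA rows.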

How to Sum two columns meeting certain conditions

What I want to do is basically merge the two highlighted pieces of code, so that the SUM formula is applied only to the items matching the LIKE criteria (currently under WHERE), while I am still able to pull GameDescriptions that do not match the LIKE criteria. Hope that makes sense...
I think you just need to replace that part of the WHERE statement with a case statement in the SUM, like this:
SELECT
SUM(CASE WHEN GameDescription LIKE '5R25L%' THEN NetRevenue
ELSE 0 END) / COUNT(DISTINCT AccountingDate)
AS 'ES Created TheoWPU'
FROM Prime.dbo.PivotData
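A quick way to sanity-check this pattern is to run it on a few made-up rows. The sketch below uses Python's sqlite3 module; the pivot_data table, its lower-case column names and the sample values are stand-ins, not the real Prime.dbo.PivotData schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pivot_data "
    "(game_description TEXT, accounting_date TEXT, net_revenue REAL)"
)
conn.executemany(
    "INSERT INTO pivot_data VALUES (?, ?, ?)",
    [
        ("5R25L Blazing", "2020-01-01", 100.0),
        ("5R25L Blazing", "2020-01-02", 50.0),
        ("Other Game", "2020-01-01", 70.0),
    ],
)

# Only rows matching the LIKE pattern contribute revenue to the SUM;
# non-matching rows stay in the result set and contribute 0.
per_day = conn.execute(
    "SELECT SUM(CASE WHEN game_description LIKE '5R25L%' "
    "THEN net_revenue ELSE 0 END) / COUNT(DISTINCT accounting_date) "
    "FROM pivot_data"
).fetchone()[0]

print(per_day)
```

Note that, with no WHERE clause, COUNT(DISTINCT accounting_date) counts every date in the table, including dates on which no matching game has a row.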

What is "Select -1", and how is it different from "Select 1"?

I have the following query that is part of a common table expression. I don't understand the function of the "Select -1" statement. It is obviously different than the "Select 1" that is used in "EXISTS" statements. Any ideas?
select days_old,
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
group by days_old
union all
select -1, -- Selecting the -1 here
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
where days_old between 1 and 7
It's just selecting the number "minus one" for each row returned, just like "select 1" will select the number "one" for each row returned.
There is nothing special about the "select 1" syntax used in EXISTS statements, by the way; it's just selecting an arbitrary value, because EXISTS requires a record to be returned and a record needs data; the number 1 is sufficient.
Why you would do this, I have no idea.
When you have a UNION, each part of the union must contain the same columns. From what I can see, the first statement gives you one line for each days_old value, with some stats for each. The second part of the union gives you a summary of all the records that are a week old or less. Since the days_old column is not relevant there, they put in a fake value as a placeholder in order to do the union. Of course, this is just a guess based on reading thousands of queries through the years. To be sure, I would need to actually run the code.
Since you say this is a CTE, to really understand why this is is happening, you may need to look at the data it generates and how that data is used in the next query that uses the CTE. That might answer your question.
What you have asked is basically about a business rule unique to your company. The true answer should lie in any requirements documents for the original creation of the code. You should go look for them and read them. We can make guesses based on our own experience but only people in your company can answer the why question here.
If you can't find the documentation, then you need to talk (Yes directly talk, preferably in person) to the Stakeholders who use the data and find out what their needs were. Only do this after running the code and analyzing the results to better understand the meaning of the data returned.
Based on your query, all the records with days_old between 1 and 7 are aggregated into a single row whose first column is -1. That is all select -1 does; there is nothing special here. In an EXISTS subquery there is no difference between select -1 and select 1 either: EXISTS only checks whether any row comes back.
Back to your query: the two SELECTs joined by UNION ALL return the same four columns. My guess is that the task is to get the per-days_old results and then append one combined row covering everything with days_old between 1 and 7.
It is just grouping logic.
Your query returns aggregated data (counts and rounded percentages) grouped by the days_old column, plus one more group for the rows where days_old is between 1 and 7.
So -1 is just the label for that additional group. It cannot be 1, because days_old=1 is already a valid group.
The result will look like this:
row1: days_old=1 count(*)=2 ...
row2: days_old=3 count(*)=5 ...
row3: days_old=9 count(*)=6 ...
row4: days_old=-1 count(*)=7
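If it helps, the placeholder idea can be reproduced on a toy table. This sketch uses Python's sqlite3 module. The bar table and its rows are invented, the percentage column is left out to keep it short, and an ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (days_old INTEGER, express_cd TEXT)")
conn.executemany(
    "INSERT INTO bar VALUES (?, ?)",
    [(1, "X"), (1, None), (3, "X"), (9, "X")],
)

# First SELECT: one row per days_old value.
# Second SELECT: one summary row for days 1 to 7, labelled with the
# placeholder -1, which cannot collide with a real days_old value.
rows = conn.execute(
    "SELECT days_old, COUNT(express_cd), COUNT(*) FROM bar GROUP BY days_old "
    "UNION ALL "
    "SELECT -1, COUNT(express_cd), COUNT(*) FROM bar "
    "WHERE days_old BETWEEN 1 AND 7 "
    "ORDER BY 1"
).fetchall()

for row in rows:
    print(row)
```

The -1 row here totals the days_old=1 and days_old=3 groups, in the same way that row4 above totals row1 and row2.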