PostgreSQL - Handling empty query result

I am quite new to SQL and am currently working on some survey results with PostgreSQL. I need to calculate the percentage of each option on a 5-point scale for all survey questions. I have a table with respondentid, questionid and the question response value. The demographic info needed for filtering a datacut is retrieved from another table. The query output is then passed to a result table. The query texts for specific datacuts are all generated by a VBA script.
It works OK in general, but there is one problematic case: when there are no respondents for a specific cut, I receive an empty table as the query result. If the respondent count is greater than 0 but lower than the calculation threshold (5 respondents), I get a table full of NULLs, which is OK. For 0 respondents I get 0 rows as the result, nothing is passed to the result table, and that causes some displacement in the final table. I am able to track such cuts, as I also calculate the respondent number for the overall datacut and store it in another table. But is there anything I can do at this point, such as somehow generating a table full of NULLs that could be inserted into the result table when needed?
Thanks in advance, and sorry for the clumsiness in the code.
WITH ItemScores AS (
SELECT
rsp.questionid,
CASE WHEN SUM(CASE WHEN rsp.respvalue >= 0 THEN 1 ELSE 0 END) < 5 THEN
NULL
ELSE
ROUND(SUM(CASE WHEN rsp.respvalue = 5 THEN 1 ELSE 0 END)/CAST(SUM(CASE
WHEN rsp.respvalue >= 0 THEN 1 ELSE 0 END) AS DECIMAL),2)
END AS "5spercentage",
-- ... and so on for frequencies of 1s, 2s, 3s and 4s
SUM(CASE WHEN rsp.respvalue >= 0 THEN 1 ELSE 0 END) AS QuestionTotalAnswers
FROM (
some filtering applied here [...]
) AS rsp
GROUP BY rsp.questionid
ORDER BY rsp.questionid
)
INSERT INTO results_items SELECT * FROM ItemScores;

If you want to ensure that every questionid appears in the output, build the plain list of questionid values first and then left join it to the CTE that does the aggregations and calculations. The list is always generated, and any question with no matching aggregate row gets NULLs in the calculated columns. Note that the list has to come from a source that the datacut filter does not empty out.
A sketch of the concept:
with calcs as (
select questionid, sum(respvalue) as sum_per_question
from rsp
group by questionid)
select distinct rsp.questionid, calcs.sum_per_question
from rsp
left join calcs on rsp.questionid = calcs.questionid
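As a minimal sketch of that idea, the snippet below runs the left-join pattern against an in-memory SQLite database from Python rather than PostgreSQL. The `questions` master table and the `responses` table are hypothetical names introduced for illustration; no responses are inserted, which simulates a datacut with 0 respondents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "questions" is an assumed master list of question ids (not in the original post);
# "responses" stands in for the already-filtered datacut.
conn.executescript("""
    CREATE TABLE questions (questionid INTEGER PRIMARY KEY);
    CREATE TABLE responses (respondentid INTEGER, questionid INTEGER, respvalue INTEGER);
    INSERT INTO questions VALUES (1), (2);
""")

# The aggregation CTE yields no rows for an empty cut, but the LEFT JOIN
# from the master list still emits one row per question, padded with NULLs.
rows = conn.execute("""
    WITH calcs AS (
        SELECT questionid, COUNT(*) AS answers, AVG(respvalue) AS avg_value
        FROM responses
        GROUP BY questionid
    )
    SELECT q.questionid, calcs.answers, calcs.avg_value
    FROM questions AS q
    LEFT JOIN calcs ON calcs.questionid = q.questionid
    ORDER BY q.questionid
""").fetchall()

print(rows)  # [(1, None, None), (2, None, None)]
```

The same shape should carry over to PostgreSQL, provided the question list is taken from somewhere the datacut filter cannot empty.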

Related

Return a 0 if no rows are found in Microsoft SQL Server

I need your help with this query.
My table CSO_EMP_ORG_DPM_VIE has a column with different keys. Column name is EXT_KEY.
When I receive the same key number in EXT_KEY, I want the SQL code to count the duplicates using this query:
select EXT_KEY
from CSO_EMP_ORG_DPM_VIE
group by EXT_KEY
having count(*) > 1
This is working so far, but when there are no duplicate keys (numbers) in the column, I want it to return 0 rather than nothing.
My expected result: when two keys are the same, I want to get a 1. When no keys are the same, I want to get a 0. Right now I get no result at all, as in the screenshot.
How can I fix this SQL query accordingly?
Thank you in advance.
Use a CASE expression like this:
SELECT EXT_KEY,
CASE WHEN COUNT(*) > 1 THEN 1 ELSE 0 END flag
FROM CSO_EMP_ORG_DPM_VIE
GROUP BY EXT_KEY
or if you want 1 result for the table:
SELECT CASE WHEN COUNT(EXT_KEY) > COUNT(DISTINCT EXT_KEY) THEN 1 ELSE 0 END flag
FROM CSO_EMP_ORG_DPM_VIE
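To illustrate why the single-row variant always returns something, here is a small sketch that runs it against an in-memory SQLite database from Python (an assumption made for testability; the question is about SQL Server). The helper name `has_duplicates` is hypothetical. Because an aggregate without GROUP BY always produces exactly one row, the flag is 0 even when the table is empty.

```python
import sqlite3

def has_duplicates(keys):
    """Return 1 if any EXT_KEY value occurs more than once, else 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CSO_EMP_ORG_DPM_VIE (EXT_KEY INTEGER)")
    conn.executemany(
        "INSERT INTO CSO_EMP_ORG_DPM_VIE VALUES (?)", [(k,) for k in keys]
    )
    # No GROUP BY: the aggregate query returns one row regardless of the data.
    (flag,) = conn.execute("""
        SELECT CASE WHEN COUNT(EXT_KEY) > COUNT(DISTINCT EXT_KEY)
                    THEN 1 ELSE 0 END AS flag
        FROM CSO_EMP_ORG_DPM_VIE
    """).fetchone()
    conn.close()
    return flag

print(has_duplicates([1, 2, 3]))  # 0
print(has_duplicates([1, 2, 2]))  # 1
print(has_duplicates([]))         # 0
```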
It's not blindingly obvious as to what you are asking for. To that end, this query gives a 1/0 result based on having a count greater than 0 for each key...
SELECT
p.EXT_KEY,
EXT_KEY_RESULT = ISNULL((SELECT 1
FROM CSO_EMP_ORG_DPM_VIE c
WHERE c.EXT_KEY = p.EXT_KEY
HAVING COUNT(EXT_KEY) > 0), 0)
FROM
CSO_EMP_ORG_DPM_VIE p
Alternatively, if you are looking to count each of the keys, you could try...
SELECT EXT_KEY, COUNT(EXT_KEY)
FROM CSO_EMP_ORG_DPM_VIE
GROUP BY EXT_KEY
Be deliberate about what you pass to the COUNT aggregate: COUNT(EXT_KEY) counts only the rows where EXT_KEY is not NULL, whereas COUNT(*) counts every row.
You really need to give us an expected result for your requirements and be very clear about your expectations.
SELECT CASE WHEN COUNT(EXT_KEY) > 0 THEN 1 ELSE 0 END AS dupes
FROM CSO_EMP_ORG_DPM_VIE
PLEASE NOTE: Credit here to forpas for providing a smoother answer which I have borrowed.

Conditional COUNT within CASE statement

Table 1 is a list of clients, what membership they have, what service they used, and the date the service was used.
Table 2 is just Table 1 grouped by month and membership type, with a count of the service sessions.
What I am trying to do is count membership sessions only for particular service types. This is what I have so far. It returns an error saying 'Service_Type' is not in an aggregate function or GROUP BY clause; when I put 'Service_Type' in the GROUP BY, the query has no errors but the SESSIONS column is all NULL.
SELECT
DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1)AS 'Draft_Date',
Membership,
CASE
WHEN Membership = 5 AND Service_Type = 'A' THEN COUNT(*)
WHEN Membership = 2 AND Service_Type IN ('J','C')
END AS'SESSIONS'
FROM Table1 t1
GROUP BY DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1),Membership
The case statement will include all memberships and service types but I think this is enough for my example. Any help would be greatly appreciated! I've been on this for days.
Table 1
Table 2
You were nearly there! I've made a few changes:
SELECT
DATEFROMPARTS(YEAR(t1.Date), MONTH(t1.Date),1) AS Draft_Date,
Membership,
COUNT(CASE WHEN t1.Membership = 5 AND t1.Service_Type = 'A' THEN 1 END) as m5stA,
COUNT(CASE WHEN t1.Membership = 2 AND t1.Service_Type IN ('J','C') THEN 1 END) as m2stJC
FROM Table1 t1
GROUP BY YEAR(t1.Date), MONTH(t1.Date), Membership
Changes:
Avoid using apostrophes to alias column names; use ANSI-standard double quotes (") if you must.
When doing a conditional count, put the COUNT outside the CASE WHEN, and have the CASE WHEN return something when the condition is met (any non-null value will do: I used 1, but it could also have been 'x'). Don't add an ELSE: CASE WHEN returns NULL if there is no ELSE and the condition is not met, and NULLs are not counted (you could also write ELSE NULL, though it's redundant).
Qualify all your column names, always. This helps keep the query working when more tables are added in future, or even if new columns with the same names are added to existing tables.
You forgot a THEN in the second WHEN.
You don't necessarily need to GROUP BY the output of DATEFROMPARTS. When a deterministic function is used (one that always produces the same output from the same inputs), the db is smart enough to know that grouping on the inputs is also fine.
By the way, your example data didn't contain any rows that would make the COUNT return 1 or more, but I'm sure you will have other conditional counts that work out (it just made it harder to test).
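The conditional-count pattern can be tried out with a small sketch like the one below. It uses an in-memory SQLite database from Python as a stand-in for SQL Server, so the DATEFROMPARTS grouping is left out and only Membership is grouped on; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Membership INTEGER, Service_Type TEXT)")
# Made-up sample rows, not the asker's data.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(5, "A"), (5, "A"), (5, "B"), (2, "J"), (2, "C"), (2, "A")],
)

# CASE without ELSE yields NULL when the condition fails, and COUNT skips NULLs,
# so each column counts only the rows that match its condition.
rows = conn.execute("""
    SELECT t1.Membership,
           COUNT(CASE WHEN t1.Membership = 5 AND t1.Service_Type = 'A' THEN 1 END) AS m5stA,
           COUNT(CASE WHEN t1.Membership = 2 AND t1.Service_Type IN ('J','C') THEN 1 END) AS m2stJC
    FROM Table1 t1
    GROUP BY t1.Membership
    ORDER BY t1.Membership
""").fetchall()

print(rows)  # [(2, 0, 2), (5, 2, 0)]
```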
Use SUM:
SELECT DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1) AS Draft_Date, Membership,
SUM(CASE WHEN Membership = 5 AND Service_Type = 'A' THEN 1 ELSE 0 END),
SUM(CASE WHEN Membership = 2 AND Service_Type IN ('J','C') THEN 1 ELSE 0 END)
FROM Table1 t1
GROUP BY DATEFROMPARTS(YEAR(t1.Date),MONTH(t1.Date),1), Membership

Return NULL instead of 0 when using COUNT(column) SQL Server

I have a query which runs fine and does two types of work, COUNT and SUM.
Something like
select
id,
Count (contracts) as countcontracts,
count(something1),
count(something1),
count(something1),
sum(cost) as sumCost
from
table
group by
id
My problem is: if there is no contract for a given ID, it returns 0 for COUNT and NULL for SUM. I want to see NULL instead of 0.
I was thinking about case when Count (contracts) = 0 then null else Count (contracts) end, but I don't want to do it this way because I have more than 12 count positions in the query and it is processing a large number of records, so I think it may slow down query performance.
Is there any other way to replace 0 with NULL?
Try this:
select NULLIF ( Count(something) , 0)
Here are three methods:
1. (case when count(contracts) > 0 then count(contracts) end) as countcontracts
2. sum(case when contracts is not null then 1 end) as countcontracts
3. nullif(count(contracts), 0)
All three of these require writing more complicated expressions. However, this really isn't that difficult. Just copy the line multiple times, and change the name of the variable on each one. Or, take the current query, put it into a spreadsheet and use spreadsheet functions to make the transformation. Then copy the function down. (Spreadsheets are really good code generators for repeated lines of code.)
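For reference, the NULLIF variant behaves as sketched below. This runs on an in-memory SQLite database from Python instead of SQL Server, and the table name `contracts_demo` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts_demo (id INTEGER, contracts TEXT, cost INTEGER)")
# id 1 has two contracts; id 2 has a row but no contract and no cost.
conn.executemany(
    "INSERT INTO contracts_demo VALUES (?, ?, ?)",
    [(1, "c1", 10), (1, "c2", 5), (2, None, None)],
)

# NULLIF(x, 0) returns NULL when x equals 0 and x otherwise,
# so a zero count lines up with the NULL that SUM already returns.
rows = conn.execute("""
    SELECT id,
           NULLIF(COUNT(contracts), 0) AS countcontracts,
           SUM(cost) AS sumCost
    FROM contracts_demo
    GROUP BY id
    ORDER BY id
""").fetchall()

print(rows)  # [(1, 2, 15), (2, None, None)]
```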

Filter on a nested aggregate SUM function not working

I have these two tables (the names have been pluralized for the sake of the example):
Table Locations:
idlocation varchar(12)
name varchar(50)
Table Answers:
idlocation varchar(6)
question_number varchar(3)
answer_text1 varchar(300)
answer_text2 varchar(300)
This table can hold answers for multiple locations, according to a list of numbered questions that repeats for each of them.
What I am trying to do is add up the values in the answer_text1 and answer_text2 columns for each location available in the Locations table, but only for a specific question, and then output a value based on the result (1 or 0).
The query goes as follows using a nested table Answers to perform the SUM operation:
select
l.idlocation,
'RESULT' = (
case when (
select
sum(cast(isnull(c.answer_text1,0) as int)) +
sum(cast(isnull(c.answer_text2,0) as int))
from Answers c
where b.idlocation=c.idlocation and c.question_number='05'
) > 0 then
1
else
0
end
)
from Locations l, Answers b
where l.idlocation=b.idlocation and b.question_number='05'
In the Answers table I sometimes save a date string in the answer_text2 field, but for a different question number.
When I run the query I get the following error:
Conversion failed when converting the varchar value '27/12/2013' to data type int
I do have the value '27/12/2013' in the answer_text2 field, but for a different question. So it seems that the question_number filter in the nested select statement is ignored after b.idlocation=c.idlocation and that more questions are being added up, hence the error.
Update
Following Steve's suggested solution, I ended up guarding the cast inside my SUM so that non-numeric char/varchar values are skipped, with a small variation:
Every possible non-INT string value has a length greater than 2 ('00' to '99' for my question numbers), so I use this filter to decide when to apply the cast.
'RESULT' =
case when (
select sum(
case when len(c.answer_text1) <= 2 then
cast(isnull(c.answer_text1,'0') as int)
else
0
end
) +
sum(
case when len(c.answer_text2) <= 2 then
cast(isnull(c.answer_text2,'0') as int)
else
0
end
)
from Answers c
where c.idlocation=b.idlocation and c.question_number='05'
) > 0
then
1
else
0
end
This is an unfortunate result of how the SQL Server query processor/optimizer works. In my opinion, the SQL standard prohibits the calculation of SELECT list expressions before the rows that will contribute to the result set have been identified, but the optimizer considers plans that violate this prohibition.
What you're observing is an error in the evaluation of a SELECT list item on a row that is not in the result set of your query. While this shouldn't happen, it does, and it's somewhat understandable, because to protect against it in every situation would exclude many efficient query plans from consideration. The vast majority of SELECT expressions will never raise an error, regardless of data.
What you can do is try to protect against this with an additional CASE expression. To protect against strings with the '/' character, for example:
... SUM(CASE WHEN c.answer_text1 IS NOT NULL and c.answer_text1 NOT LIKE '%/%' THEN CAST(c.answer_text1 as int) ELSE 0 END)...
If you're using SQL Server 2012 or later, you have a better option, TRY_CONVERT:
...SUM(COALESCE(TRY_CONVERT(int,c.answer_text1),0))...
In your particular case, the overall database design is flawed, because numeric information should be stored in number-type columns. (This, of course, may not be your fault.) So redesign is an option, putting integer answers in integer-type columns and non-integer answer_text elsewhere. A compromise, if you can't redesign the tables, that I think will work, is to add a persisted computed column with value TRY_CONVERT(int,c.answer_text1) (or its best equivalent, based on what you know about the actual data in the table - perhaps the integer value of only columns containing no non-digit character and having length less than 9).
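The guarding CASE expression can be sketched as follows. One loud assumption: this runs on SQLite from Python, and SQLite casts text to integers leniently instead of raising a conversion error the way SQL Server does, so the example only shows what the guard returns for each kind of value. The table name `answers_demo` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers_demo (answer_text1 TEXT)")
# A numeric string, a date string and a NULL, inserted in this order.
conn.executemany(
    "INSERT INTO answers_demo VALUES (?)",
    [("4",), ("27/12/2013",), (None,)],
)

# The CAST is only reached for non-NULL values without a '/' character;
# everything else contributes 0 to the sum.
guard = """
    CASE WHEN answer_text1 IS NOT NULL AND answer_text1 NOT LIKE '%/%'
         THEN CAST(answer_text1 AS INTEGER) ELSE 0 END
"""
per_row = conn.execute(
    f"SELECT answer_text1, {guard} FROM answers_demo ORDER BY rowid"
).fetchall()
(total,) = conn.execute(f"SELECT SUM({guard}) FROM answers_demo").fetchone()

print(per_row)  # [('4', 4), ('27/12/2013', 0), (None, 0)]
print(total)    # 4
```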
Your query appears correct enough, which means you have a Question 05 record with a datetime in either the answer_text1 or answer_text2 field.
Give this a shot to figure out which row has a date:
select *
from Answers
where question_number='05'
and (isdate(answer_text1) = 1 or isdate(answer_text2) = 1)
Furthermore, you could filter out any rows that have dates in them
where isdate(c.answer_text1) = 0
and isdate(c.answer_text2) = 0
and ...
Another option similar in nature to Steve's excellent answer is to filter your Answers table with a subquery like so:
select
l.idlocation,
'RESULT' = (
case when (
select
sum(cast(isnull(c.answer_text1,0) as int)) +
sum(cast(isnull(c.answer_text2,0) as int))
from (select answer_text1, answer_text2, idlocation from Answers where question_number ='05') c
where b.idlocation=c.idlocation
) > 0 then
1
else
0
end
)
from Locations l, Answers b
where l.idlocation=b.idlocation and b.question_number='05'
More generally, though, you could write the query with a join and a GROUP BY instead:
select locations.idlocation,
case when sum(case when isnumeric(answer_text1) = 1 then cast(answer_text1 as int) else 0 end)
+ sum(case when isnumeric(answer_text2) = 1 then cast(answer_text2 as int) else 0 end) > 0
then 1 else 0 end as RESULT
from locations
inner join answers on answers.idlocation = locations.idlocation
where answers.question_number ='05'
group by locations.idlocation
This would produce the same result.

SQL query to add or subtract values based on another field

I need to calculate the net total of a column-- sounds simple. The problem is that some of the values should be negative, as are marked in a separate column. For example, the table below would yield a result of (4+3-5+2-2 = 2). I've tried doing this with subqueries in the select clause, but it seems unnecessarily complex and difficult to expand when I start adding in analysis for other parts of my table. Any help is much appreciated!
Sign Value
Pos 4
Pos 3
Neg 5
Pos 2
Neg 2
Using a CASE statement should work in most versions of sql:
SELECT SUM( CASE
WHEN t.Sign = 'Pos' THEN t.Value
ELSE t.Value * -1
END
) AS Total
FROM YourTable AS t
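As a quick check of the CASE-inside-SUM approach, the sketch below loads the five rows from the question into an in-memory SQLite database (chosen here only so the query can be run from Python) and computes the net total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Sign TEXT, Value INTEGER)")
# The sample rows from the question.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [("Pos", 4), ("Pos", 3), ("Neg", 5), ("Pos", 2), ("Neg", 2)],
)

# Each row contributes +Value or -Value depending on its Sign column.
(total,) = conn.execute("""
    SELECT SUM(CASE WHEN t.Sign = 'Pos' THEN t.Value
                    ELSE t.Value * -1 END) AS Total
    FROM YourTable AS t
""").fetchone()

print(total)  # 2, i.e. 4 + 3 - 5 + 2 - 2
```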
Try this:
SELECT SUM(IF(sign = 'Pos', Value, Value * (-1))) as total FROM table
I am summing values from a single field in a table based on values from another field in the same table, using Oracle 11g as the database and SQL Developer as the user interface.
This works:
SELECT COUNTRY_ID, SUM(
CASE
WHEN ACCOUNT IN 'PTBI' THEN AMOUNT
WHEN ACCOUNT IN 'MLS_ENT' THEN AMOUNT
WHEN ACCOUNT IN 'VAL_ALLOW' THEN AMOUNT
WHEN ACCOUNT IN 'RSC_DEV' THEN AMOUNT * -1
END) AS TI
FROM SAMP_TAX_F4
GROUP BY COUNTRY_ID;
select a = sum(Value) where Sign like 'pos'
select b = sum(Value) where Sign like 'neg'
select total = a-b
This is a bit SQL-agnostic, since you didn't say which db you are using, but it should be easy to adapt it to any db out there.