I have a table and I need to calculate two aggregate functions with different conditions in one statement. How can I do this?
Pseudocode below:
SELECT count(CoumntA) *< 0*, count(CoumntA) * > 0*
FROM dbo.TableA
This is the same idea as tombom's answer, but with SQL Server syntax:
SELECT
SUM(CASE WHEN CoumntA < 0 THEN 1 ELSE 0 END) AS LessThanZero,
SUM(CASE WHEN CoumntA > 0 THEN 1 ELSE 0 END) AS GreaterThanZero
FROM TableA
As @tombom demonstrated, this can be done as a single query. But that doesn't mean it should be.
SELECT
SUM(CASE WHEN CoumntA < 0 THEN 1 ELSE 0 END) AS less_than_zero,
SUM(CASE WHEN CoumntA > 0 THEN 1 ELSE 0 END) AS greater_than_zero
FROM
TableA
This is not so good when both of the following are true:
- There is an index on CoumntA
- Most values (50% or more feels about right) are exactly zero
In that case, two queries will be faster, because each query can use the index to home in on just the section to be counted, so only the relevant records are read.
The single query above, however, scans the whole table every time. Only once, but always the whole table. That is worth it when you're counting most of the records. In your case it looks like you're counting most or all of them, so this is probably a good way of doing it.
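To make the trade-off above concrete, here is a minimal sketch that runs both variants and checks that they agree. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption, chosen only so the example is runnable); the index name and sample data are made up. It shows that the results match, not which plan is faster on your data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (CoumntA INTEGER)")
conn.execute("CREATE INDEX ix_TableA_CoumntA ON TableA (CoumntA)")  # hypothetical index name

# Made-up data: mostly zeros, plus two negative values and one positive value.
rows = [(0,)] * 6 + [(-3,), (-1,), (5,)]
conn.executemany("INSERT INTO TableA VALUES (?)", rows)

# Variant 1: one pass over the table, two conditional sums.
single_scan = conn.execute(
    """
    SELECT SUM(CASE WHEN CoumntA < 0 THEN 1 ELSE 0 END) AS less_than_zero,
           SUM(CASE WHEN CoumntA > 0 THEN 1 ELSE 0 END) AS greater_than_zero
    FROM TableA
    """
).fetchone()

# Variant 2: two separate queries, each of which can use the index on CoumntA.
less_than_zero = conn.execute(
    "SELECT COUNT(*) FROM TableA WHERE CoumntA < 0"
).fetchone()[0]
greater_than_zero = conn.execute(
    "SELECT COUNT(*) FROM TableA WHERE CoumntA > 0"
).fetchone()[0]

print(single_scan, (less_than_zero, greater_than_zero))
```

Comparing the actual execution plans on the real table is the only reliable way to pick between the two.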
It is possible to do this in one select statement.
The way I've done it before is like this:
SELECT SUM(CASE WHEN ColumnA < 0 THEN 1 END) AS LessThanZero,
SUM(CASE WHEN ColumnA > 0 THEN 1 END) AS GreaterThanZero
FROM dbo.TableA
This is the correct MS SQL syntax and I believe this is a very efficient way of doing it.
Don't forget you are not covering the case when ColumnA = 0!
select '< 0' as filter, COUNT(0) as cnt from TableA where [condition 1]
union
select '> 0' as filter, COUNT(0) as cnt from TableA where [condition 2]
Be sure that condition 1 and condition 2 partition the original set of records; otherwise the same records could be counted in both groups.
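A small sketch of the partition point, again using Python's sqlite3 as a stand-in database (an assumption). The helper name `counts`, the `bucket` alias and the sample rows are hypothetical. It uses UNION ALL, which keeps both rows without a de-duplication step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE TableA (CoumntA INTEGER)")
conn.executemany(
    "INSERT INTO TableA VALUES (?)", [(-2,), (-1,), (0,), (3,), (4,)]
)

def counts(cond1, cond2):
    """Run the two-branch query with the given WHERE conditions (hypothetical helper)."""
    sql = f"""
        SELECT '< 0' AS bucket, COUNT(*) AS cnt FROM TableA WHERE {cond1}
        UNION ALL
        SELECT '> 0' AS bucket, COUNT(*) AS cnt FROM TableA WHERE {cond2}
    """
    return conn.execute(sql).fetchall()

# Disjoint conditions: every row lands in at most one group.
disjoint = counts("CoumntA < 0", "CoumntA > 0")

# Overlapping conditions: the row with value 0 is counted in both groups.
overlapping = counts("CoumntA <= 0", "CoumntA >= 0")

print(disjoint, overlapping)
```

With the overlapping conditions the two counts add up to more than the number of rows in the table, which is the symptom to watch for.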
For SQL Server, one way would be:
SELECT COUNT(CASE WHEN CoumntA<0 THEN 1 ELSE NULL END),
COUNT(CASE WHEN CoumntA>0 THEN 1 ELSE NULL END)
FROM dbo.TableA
SELECT
SUM(IF(CoumntA < 0, 1, 0)) AS lowerThanZero,
SUM(IF(CoumntA > 0, 1, 0)) AS greaterThanZero
FROM
TableA
Is it clear what's happening? Ask, if you have any more questions.
A shorter form would be
SELECT
SUM(CoumntA < 0) AS lowerThanZero,
SUM(CoumntA > 0) AS greaterThanZero
FROM
TableA
This is possible because in MySQL a true condition evaluates to 1 and a false condition evaluates to 0.
EDIT: okay, okay, sorry, don't know why I thought it's about MySQL here.
See the other answers about correct syntax.
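For what it's worth, the shorthand only works in engines where a comparison yields a number. SQLite happens to behave like MySQL here, so a quick sketch with Python's sqlite3 (a stand-in, not SQL Server) can show that it matches the portable CASE form. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite, used only as a stand-in (assumption)
conn.execute("CREATE TABLE TableA (CoumntA INTEGER)")
conn.executemany(
    "INSERT INTO TableA VALUES (?)", [(-2,), (0,), (3,), (9,), (None,)]
)

# Shorthand: a comparison evaluates to 1, 0 or NULL, and SUM skips the NULLs.
shorthand = conn.execute(
    "SELECT SUM(CoumntA < 0), SUM(CoumntA > 0) FROM TableA"
).fetchone()

# Portable form: works in engines without numeric booleans, such as SQL Server.
portable = conn.execute(
    """
    SELECT SUM(CASE WHEN CoumntA < 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN CoumntA > 0 THEN 1 ELSE 0 END)
    FROM TableA
    """
).fetchone()

print(shorthand, portable)
```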
I have a line of SQL which produces a count of purchases variable
count(distinct case when t.transaction_sub_type =1 then t.transaction_date end) as COUNTPUR,
I need to modify this so I can produce a 0/1 flag variable, which flags if a customer is a repeat purchaser. So, when a customer's purchases are greater than 1 then flag as 1 else flag as 0.
case when COUNTPUR>1 then 1 else 0 end as FLAG_REPEATPURCHASER
I need to combine these two case statements into one. I have been experimenting with different versions of the syntax, but I can't seem to nail it down. Below is one of the experiments that does not work.
max(case when (count(distinct case when t.transaction_sub_type =1 then t.transaction_date end))>1 then 1 else 0 end) as FLAG_REPEATPURCHASER,
Thanks in advance for your assistance.
You can use a case expression with conditional aggregation:
(case when count(distinct case when t.transaction_sub_type = 1 then t.transaction_date end) > 1
then 1 else 0
end) as FLAG_REPEATPURCHASER
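Here is a runnable sketch of that expression in context, using Python's sqlite3 as a stand-in database. The table name `transactions`, the `customer_id` column and the sample rows are assumptions; only `transaction_sub_type` and `transaction_date` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
# Hypothetical table layout; customer_id is assumed as the grouping key.
conn.execute(
    "CREATE TABLE transactions ("
    "customer_id INTEGER, transaction_sub_type INTEGER, transaction_date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, 1, "2020-01-01"),
        (1, 1, "2020-01-01"),  # same date again, not a new distinct purchase date
        (1, 2, "2020-02-01"),  # different sub type, ignored by the CASE
        (2, 1, "2020-01-05"),
        (2, 1, "2020-03-09"),
    ],
)

# The aggregate sits inside the outer CASE, so no nested aggregate is needed.
flags = conn.execute(
    """
    SELECT t.customer_id,
           COUNT(DISTINCT CASE WHEN t.transaction_sub_type = 1
                               THEN t.transaction_date END) AS COUNTPUR,
           CASE WHEN COUNT(DISTINCT CASE WHEN t.transaction_sub_type = 1
                                         THEN t.transaction_date END) > 1
                THEN 1 ELSE 0
           END AS FLAG_REPEATPURCHASER
    FROM transactions t
    GROUP BY t.customer_id
    ORDER BY t.customer_id
    """
).fetchall()

print(flags)
```

The attempt in the question failed because it wrapped one aggregate (count) inside another (max); comparing the count directly avoids that.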
I need to translate SAS code (PROC SQL) to (Postgres) SQL, especially the calculated keyword in SAS, which allows a variable defined in the query to be re-used directly in the same query to compute another variable:
SELECT
id,
sum( case
when (sales > 0) then 1
when (sales = 0) then 0
else -1
end) as pre_freq,
(case
when calculated pre_freq > 0 then calculated pre_freq
else 1
end) as freq
FROM my_table
GROUP BY id
This is not possible (AFAIK) in SQL, so I need to break down each step of the computation.
I was wondering what the best option is, knowing that, from my understanding, it is better to have more computations and fewer table scans, i.e. to do as much computation as possible during a single scan instead of running multiple table scans with small computation steps.
In this particular example I could use:
SELECT
id
, greatest(1, sum( case
when (sales > 0) then 1
when (sales = 0) then 0
else -1
end)) as freq
FROM
my_table
GROUP BY id
or:
SELECT
id
, (case when sum(case
when (sales > 0) then 1
when (sales < 0) then -1
else 0
end) > 0 then sum(case
when (sales > 0) then 1
when (sales < 0) then -1
else 0
end) else 1 end) as freq
FROM
my_table
GROUP BY id
... which is starting to be hard to read...
Is there any way to define a variable for a snippet of SQL code that will be repeated?
More generally than this illustration, what is the best (most efficient) approach?
calculated is a nice feature of proc sql. However, you cannot re-use aliases in databases in general (this is not a Postgres-specific limitation). A simple method is to use a subquery or CTE:
select id, pre_freq,
(case when pre_freq > 0 then pre_freq
else 1
end) as freq
from (select id,
sum(case when (sales > 0) then 1
when (sales = 0) then 0
else -1
end) as pre_freq
from my_table t
group by id
) t;
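The same idea written as a CTE, as a runnable sketch. It uses Python's sqlite3 rather than Postgres (an assumption, for the sake of a self-contained example); the CTE name `per_id` and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for Postgres (assumption)
conn.execute("CREATE TABLE my_table (id INTEGER, sales REAL)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1, 10.0), (1, 5.0), (1, -2.0), (1, 3.0),  # pre_freq = 1 + 1 - 1 + 1 = 2
        (2, -4.0), (2, 0.0),                       # pre_freq = -1 + 0 = -1
    ],
)

# pre_freq is computed once in the CTE and then re-used by name outside it.
result = conn.execute(
    """
    WITH per_id AS (
        SELECT id,
               SUM(CASE WHEN sales > 0 THEN 1
                        WHEN sales = 0 THEN 0
                        ELSE -1
                   END) AS pre_freq
        FROM my_table
        GROUP BY id
    )
    SELECT id,
           pre_freq,
           CASE WHEN pre_freq > 0 THEN pre_freq ELSE 1 END AS freq
    FROM per_id
    ORDER BY id
    """
).fetchall()

print(result)
```

The table is still read only once; the outer select works on the already aggregated rows.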
However, the simplest solution is to use sign():
select id, sum(sign(sales)) as pre_freq,
greatest(sum(sign(sales)), 1) as freq
from my_table t
group by id;
Note: This is slightly different. It basically ignores NULL values. If you really need to treat NULL as -1, then use coalesce().
SELECT round(COUNT(dmd_1wk),2) AS NBR_ITEMS_1WK
FROM table;
The field dmd_1wk has a lot of zeros in it. How do I count the non-zero values?
It sounds like you just need to add a WHERE clause:
SELECT
round(COUNT(dmd_1wk),2) AS NBR_ITEMS_1WK
FROM table
WHERE dmd_1wk <> 0;
If you want the count of both non-zero and zero values, then you can use something like:
SELECT
round(COUNT(case when dmd_1wk <> 0 then dmd_1wk end),2) AS NBR_ITEMS_1WK_NonZero,
round(COUNT(case when dmd_1wk = 0 then dmd_1wk end),2) AS NBR_ITEMS_1WK_Zero
FROM table;
Method 1: Case Statement. This may be useful if you need to continue to process all rows (which a where clause would prevent). The CASE has to return NULL for the zero rows, because COUNT counts every non-NULL value, including 0.
SELECT count(case when dmd_1wk <> 0 then 1 end) as NonZeroCount FROM MyTable
Method 2: Where Clause.
SELECT
count(1) as NonZeroCount
FROM
MyTable
WHERE
dmd_1wk <> 0
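One pitfall worth checking with the CASE method: COUNT counts any non-NULL value, so a CASE that returns 0 for the zero rows still counts them. The sketch below uses Python's sqlite3 as a stand-in database (an assumption) with made-up rows, and the helper name `scalar` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE MyTable (dmd_1wk REAL)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?)", [(0,), (0,), (2.5,), (None,), (7,)]
)

def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]

# Pitfall: the CASE never yields NULL, so COUNT counts every row.
counts_every_row = scalar(
    "SELECT COUNT(CASE WHEN dmd_1wk = 0 THEN 0 ELSE 1 END) FROM MyTable"
)

# Correct CASE form: no ELSE, so zero and NULL rows become NULL and are skipped.
via_case = scalar(
    "SELECT COUNT(CASE WHEN dmd_1wk <> 0 THEN 1 END) FROM MyTable"
)

# WHERE form: the filter removes zero and NULL rows before counting.
via_where = scalar("SELECT COUNT(1) FROM MyTable WHERE dmd_1wk <> 0")

print(counts_every_row, via_case, via_where)
```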
I'd like to offer another solution using NULLIF since COUNT won't count NULL values:
SELECT round(COUNT(NULLIF(dmd_1wk,0)),2) AS NBR_ITEMS_1WK
FROM table;
Good luck.
Methinks bluefeet's answer is probably what you are really looking for, as it sounds like you just want to count non-zeros; but this will get you a count of zero and non-zero items if that's not the case:
SELECT
ROUND(SUM(CASE WHEN NVL(dmd_1wk, 0) = 0 THEN 1 ELSE 0 END), 2) AS "Zeros",
ROUND(SUM(CASE WHEN NVL(dmd_1wk, 0) != 0 THEN 1 ELSE 0 END), 2) AS "NonZeros"
FROM table
Although there is no point in rounding a whole number, I've included your original ROUNDs as I'm guessing you're using it for formatting, but you might want to use:
TO_CHAR(SUM(...), '999.00')
as that's the intended function for formatting numbers.
You can filter them.
SELECT round(COUNT(dmd_1wk),2) AS NBR_ITEMS_1WK
FROM table
WHERE dmd_1wk <> 0;
I'm calculating the change in pain between day 1 and day 2.
There are two fields, Pain_Admit_Comfort and Pain_48_Hr_Comfort; the options in each are Yes/No.
I need to find everyone that had pain on Admit and is More Comfortable 2 days later.
This is the query. The first two statements return correct numbers. I can't figure out how to divide using the same statements as numerator and denominator.
select
(select COUNT (PAIN_48_HR_COMFORT_C)
FROM CASES WHERE PAIN_48_HR_COMFORT_C='Yes') as Forty_Eight_Hours,
(SELECT COUNT (PAIN_ADMIT_COMFORT_C)
FROM CASES WHERE PAIN_ADMIT_COMFORT_C='YES') as Admit_Uncomfort_Yes,
((select COUNT (PAIN_48_HR_COMFORT_C)
FROM CASES WHERE PAIN_48_HR_COMFORT_C='Yes')
/
(SELECT COUNT (PAIN_ADMIT_COMFORT_C)
FROM CASES WHERE PAIN_ADMIT_COMFORT_C='YES')) AS Percent_Changed
from CASES
Thanks
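I don't spot any immediate problems with your statement apart from the division: COUNT returns an integer, so dividing one count by another is integer division in SQL Server. The following statement casts to FLOAT to avoid that, should return the correct results and is perhaps a bit easier to read.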
SELECT feh.Forty_Eight_Hours
, auy.Admit_Uncomfort_Yes
, Percent_Changed = CAST(feh.Forty_Eight_Hours AS FLOAT) / auy.Admit_Uncomfort_Yes
FROM (
SELECT Forty_Eight_Hours = COUNT(PAIN_48_HR_COMFORT_C)
FROM CASES
WHERE PAIN_48_HR_COMFORT_C = 'Yes'
) feh
CROSS APPLY (
SELECT Admit_Uncomfort_Yes = COUNT (PAIN_ADMIT_COMFORT_C)
FROM CASES
WHERE PAIN_ADMIT_COMFORT_C = 'Yes'
) auy
Your query, and the other answers, are very inefficient (multiple selects).
What you want is called a "pivot", and the most efficient way of coding it using just one select over the table (your query uses 4) is as follows:
select
sum(case when PAIN_48_HR_COMFORT_C = 'Yes' then 1 else 0 end) as Forty_Eight_Hours,
sum(case when PAIN_ADMIT_COMFORT_C = 'Yes' then 1 else 0 end) as Admit_Uncomfort_Yes,
sum(case when PAIN_ADMIT_COMFORT_C = 'Yes' AND PAIN_48_HR_COMFORT_C = 'NO' then 1 else 0 end) as Improved_pain
FROM CASES
I'm not sure what the columns mean - you may need to change a 'YES' to 'NO' etc to get the "has"/"has not" pain correct.
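Here is a sketch of the single-select approach extended with the ratio the question asks for, using Python's sqlite3 as a stand-in for SQL Server (an assumption). The sample rows are made up, and the NULLIF guard against an empty denominator is my addition, not part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for SQL Server (assumption)
conn.execute(
    "CREATE TABLE CASES (PAIN_ADMIT_COMFORT_C TEXT, PAIN_48_HR_COMFORT_C TEXT)"
)
# Note: SQLite compares text case-sensitively by default, so 'Yes' is used throughout.
conn.executemany(
    "INSERT INTO CASES VALUES (?, ?)",
    [("Yes", "Yes"), ("Yes", "No"), ("Yes", "Yes"), ("Yes", "No"), ("No", "Yes")],
)

# One pass over CASES: both counts and their ratio come from the same scan.
# CAST(... AS REAL) avoids integer division; NULLIF turns a zero denominator
# into NULL instead of raising a divide-by-zero error.
row = conn.execute(
    """
    SELECT SUM(CASE WHEN PAIN_48_HR_COMFORT_C = 'Yes' THEN 1 ELSE 0 END)
               AS Forty_Eight_Hours,
           SUM(CASE WHEN PAIN_ADMIT_COMFORT_C = 'Yes' THEN 1 ELSE 0 END)
               AS Admit_Uncomfort_Yes,
           CAST(SUM(CASE WHEN PAIN_48_HR_COMFORT_C = 'Yes' THEN 1 ELSE 0 END) AS REAL)
               / NULLIF(SUM(CASE WHEN PAIN_ADMIT_COMFORT_C = 'Yes' THEN 1 ELSE 0 END), 0)
               AS Percent_Changed
    FROM CASES
    """
).fetchone()

print(row)
```

Multiply the ratio by 100 if an actual percentage is wanted.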
I am trying to do aggregations in case statement. I found 2 ways to do it. Can anyone say what the difference between the 2 is?
(CASE WHEN Event = 5 THEN count(*) ELSE 0 END ) Follow_Count
GROUP BY Event;
SUM(CASE Event WHEN 5 THEN 1 ELSE 0 END) AS Follow_Count
Your case 1 will produce a row for each event in the table (from your group by). Your case 2 will just return 1 row.
Is there a reason that you wouldn't just write:
select count(*)
from my_table
where event = 5;
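To see the difference between the two forms from the question side by side, here is a small sketch using Python's sqlite3 as a stand-in database (an assumption) with made-up events. The table name my_table is taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE my_table (Event INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?)", [(5,), (5,), (3,), (7,), (5,)])

# Form 1: aggregate inside the CASE, grouped by Event -> one row per event value.
per_event = conn.execute(
    """
    SELECT Event,
           CASE WHEN Event = 5 THEN COUNT(*) ELSE 0 END AS Follow_Count
    FROM my_table
    GROUP BY Event
    ORDER BY Event
    """
).fetchall()

# Form 2: CASE inside the aggregate, no GROUP BY -> a single row.
single_row = conn.execute(
    "SELECT SUM(CASE Event WHEN 5 THEN 1 ELSE 0 END) AS Follow_Count FROM my_table"
).fetchall()

print(per_event, single_row)
```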
Better would be:
count(CASE Event WHEN 5 THEN 1 END) AS Follow_Count
Because:
1) COUNT uses its own built-in counter, so nothing has to be summed,
2) the ELSE branch is not needed (COUNT does not count NULLs).
Regards,
Sayan M.
There is no significant difference. You can decide for yourself which is better by comparing their execution plans.