Why do the parentheses make a difference in this SQL query - sql

The objective:
Find the percentage of high elevation airports (elevation >= 2000) by
state from the airports table.
In the query, alias the percentage column as
percentage_high_elevation_airports.
Could someone explain why the following 2 SQL statements give different results:
Correct result:
SELECT state,
100.0 * sum(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / count(*) as percentage_high_elevation_airports
FROM airports
GROUP BY state;
sample result:
MS 0.0
MT 100.0
NC 11.1111111111111
ND 10.0
and wrong result:
select
state,
100.0 * (sum(case when elevation >= 2000 then 1 else 0 end)/count(*)) as percentage_high_elevation_airports
from airports
group by 1;
sample result:
MS 0.0
MT 100.0
NC 0.0
ND 0.0
The only difference is the additional pair of parentheses around the division.

I would write this as:
SELECT state,
AVG(CASE WHEN elevation >= 2000 THEN 100.0 ELSE 0 END) as percentage_high_elevation_airports
FROM airports
GROUP BY state;
The issue is integer arithmetic. Some databases do integer division and return an integer, so 1/2 is 0 rather than 0.5. Some databases also apply this to avg() (although even some that do integer division still return non-integer averages).
I should note that this is database-specific.

Your question is not about another/better solution to your query
but about the wrong results you get with the use of parentheses, right?
Because:
sum(case when elevation >= 2000 then 1 else 0 end)
evaluates to an integer
and count(*) is by definition an integer.
The division between them is an integer division, which truncates any decimal digits.
So you get 0 instead of 0.5 or 0.05.
To avoid situations like this, multiply by a real number such as 100.0 first, as your first query does, and then divide.
Or you could do this:
sum(case when elevation >= 2000 then 1.0 else 0.0 end)
which results in a sum that is a floating point number.
In any case make sure that at least one of the operands of the division is a real number.
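As a minimal sketch of the point above, the two expressions can be compared side by side. This assumes SQLite (via Python's standard sqlite3 module) as a stand-in, since it also truncates integer division; the sample rows are made up to match the NC and ND figures in the question (1 of 9 and 1 of 10 airports at high elevation).

```python
import sqlite3

# Hypothetical sample data: NC has 9 airports (1 high), ND has 10 (1 high).
rows = [("NC", 2500)] + [("NC", 100)] * 8 + [("ND", 2100)] + [("ND", 500)] * 9

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airports (state TEXT, elevation INTEGER)")
conn.executemany("INSERT INTO airports VALUES (?, ?)", rows)

# 100.0 * sum(...) is evaluated first, so the division is real-valued.
multiply_first = conn.execute("""
    SELECT state,
           100.0 * sum(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / count(*)
    FROM airports GROUP BY state ORDER BY state
""").fetchall()

# The parentheses force integer / integer first, which truncates to 0.
divide_first = conn.execute("""
    SELECT state,
           100.0 * (sum(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / count(*))
    FROM airports GROUP BY state ORDER BY state
""").fetchall()

print(multiply_first)
print(divide_first)
```

Other engines may behave differently (some return a decimal from integer division), so it is worth checking on the database actually in use.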

Try the query below. You need to change the placement of your parentheses:
select
state,
(100.0 * sum(case when elevation >= 2000 then 1 else 0 end)) / count(*) as percentage_high_elevation_airports
from airports
group by 1

Related

Calculate rate based on condition using postgresql

I want to find the rate of negative and zero profits from a column. I tried to do it using an aggregate and a subquery, but neither seems to work, as both methods return 0.
The code is as follows
SELECT
COUNT(CASE WHEN profit < 0 THEN 1
END) AS negative_profits,
COUNT(CASE WHEN profit < 0 THEN 1
END) / COUNT(profit),
COUNT(CASE WHEN profit = 0 THEN 1
END) AS zero_profits,
COUNT(CASE WHEN profit = 0 THEN 1
END) / COUNT(profit)
FROM sales;
SELECT (SELECT COUNT(*)
FROM sales
WHERE profit <= 0)/COUNT(profit) AS n_negative_profit
FROM sales;
Both queries return 0.
Avoid integer division, which truncates (like Adrian pointed out).
Also, simplify with an aggregate FILTER expression:
SELECT count(*) FILTER (WHERE profit <= 0)::float8
/ count(profit) AS n_negative_profit
FROM sales;
If profit is defined NOT NULL, or to divide by the total count either way, optimize further:
SELECT count(*) FILTER (WHERE profit <= 0)::float8
/ count(*) AS n_negative_profit
FROM sales;
See:
Aggregate columns with additional (distinct) filters
Because you are doing integer division. Per the docs, Math operators/functions:
numeric_type / numeric_type → numeric_type
Division (for integral types, division truncates the result towards zero)
So:
select 2/5;
0
You need to make one of the numbers float or numeric:
select 2/5::numeric;
0.40000000000000000000
and to make it cleaner round:
select round(2/5::numeric, 2);
0.40
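The same effect can be reproduced on a small table. This is only a sketch: it uses SQLite through Python's sqlite3 module rather than PostgreSQL, the five profit values are invented, and CAST(... AS REAL) stands in for the ::numeric and ::float8 casts shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (profit INTEGER)")
# Hypothetical data: 2 of the 5 rows have profit <= 0.
conn.executemany("INSERT INTO sales VALUES (?)", [(-3,), (0,), (4,), (7,), (9,)])

# integer / integer: the rate is truncated to 0
truncated = conn.execute(
    "SELECT COUNT(CASE WHEN profit <= 0 THEN 1 END) / COUNT(*) FROM sales"
).fetchone()[0]

# casting one operand makes the division real-valued
rate = conn.execute(
    "SELECT ROUND(CAST(COUNT(CASE WHEN profit <= 0 THEN 1 END) AS REAL)"
    " / COUNT(*), 2) FROM sales"
).fetchone()[0]

print(truncated, rate)
```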

SQL using SUM in CASE in SUM

I had this query select
sum(CASE WHEN kpi.average >= temp.average THEN 1 ELSE 0 END) AS recordOrder,
which worked fine, but I had to change it to this
sum(CASE WHEN sum(kpi.averageUse) / sum(kpi.averageTotal) >= temp.average THEN 1 ELSE 0 END) AS recordOrder,
These queries have to get the number of rows where some value (average) is greater than the average from the TEMP table. In the second query I have more accurate data (a weighted average),
but I am getting error
1111 invalid use of group function
Any ideas how to write SUM in CASE in SUM?
Thanks!
This code is nonsensical because it nests sum() inside sum(), which is not allowed:
sum(CASE WHEN sum(kpi.averageUse) / sum(kpi.averageTotal) >= temp.average THEN 1 ELSE 0 END) AS recordOrder,
Without seeing your larger query, it is not possible to know what you really intend. But I would speculate that you don't want the internal sum()s:
sum(CASE WHEN (kpi.averageUse / kpi.averageTotal) >= temp.average THEN 1 ELSE 0 END) AS recordOrder,
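If the weighted average per group really is what is wanted, one possible alternative is to compute the inner sums in a derived table first and apply the outer sum to its result. The sketch below is an assumption about the intent, not the asker's actual query: the grouping column item_id, the table name temp_avg, and the data are all hypothetical, and it runs on SQLite through Python's sqlite3 module rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE kpi (item_id INTEGER, averageUse REAL, averageTotal REAL);
    CREATE TABLE temp_avg (item_id INTEGER, average REAL);
    -- item 1: 80 / 100 = 0.8, item 2: 20 / 100 = 0.2
    INSERT INTO kpi VALUES (1, 30, 50), (1, 50, 50), (2, 10, 40), (2, 10, 60);
    INSERT INTO temp_avg VALUES (1, 0.5), (2, 0.5);
""")

# Inner query: one weighted average per item (first level of aggregation).
# Outer query: count the items whose weighted average reaches the threshold.
record_order = conn.execute("""
    SELECT SUM(CASE WHEN w.weighted >= t.average THEN 1 ELSE 0 END) AS recordOrder
    FROM (SELECT item_id,
                 SUM(averageUse) / SUM(averageTotal) AS weighted
          FROM kpi
          GROUP BY item_id) AS w
    JOIN temp_avg AS t ON t.item_id = w.item_id
""").fetchone()[0]

print(record_order)
```

Splitting the work into two levels means each SELECT contains only one layer of aggregates, which avoids the nesting that the error complains about.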

SQL Query SUM and Divide by Distinct Date Count

I need assistance. I am looking for the sum of a data field and then want to divide it by the number of distinct dates in that field.
SUM(CASE WHEN dateResolved IS NOT NULL
THEN 1 ELSE 0
END) / DISTINCT(dateResolved) AvgPerDay
If there are 32 dates in dateResolved, with 5 distinct dates, I want it to return 6.4.
By default it does integer division, so you need:
SUM(CASE WHEN dateResolved IS NOT NULL
THEN 1 ELSE 0
END) * 1.0 / COUNT(DISTINCT dateResolved) AvgPerDay
However, a simple COUNT would also work:
COUNT(dateResolved) * 1.0 / COUNT(DISTINCT dateResolved) AvgPerDay
COUNT(dateResolved) will ignore null values.
I would do this as:
SUM(CASE WHEN dateResolved IS NOT NULL
THEN 1.0 ELSE 0
END) / COUNT(DISTINCT dateResolved) as AvgPerDay
But this is more simply phrased as:
COUNT(dateResolved) * 1.0 / COUNT(DISTINCT dateResolved) as AvgPerDay
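The 32 dates and 5 distinct dates from the question can be checked with a small sketch. It assumes SQLite (via Python's sqlite3 module) and a hypothetical table named tickets; the dates themselves are invented, and three NULL rows are added to show that COUNT(dateResolved) skips them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (dateResolved TEXT)")

# Hypothetical data: 32 resolved rows over 5 distinct dates, plus 3 unresolved.
dates = ["2024-01-0%d" % (i % 5 + 1) for i in range(32)]
conn.executemany(
    "INSERT INTO tickets VALUES (?)",
    [(d,) for d in dates] + [(None,)] * 3,
)

int_div, avg_per_day = conn.execute("""
    SELECT COUNT(dateResolved) / COUNT(DISTINCT dateResolved),
           COUNT(dateResolved) * 1.0 / COUNT(DISTINCT dateResolved)
    FROM tickets
""").fetchone()

print(int_div, avg_per_day)
```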

SQL, querying sum of positive results, absolute value

I have the following query which returns a total dollar amount.
select sum(cast(dollars as dec)) from financials
This includes positive and negative values.
I would like 2 separate things:
How can I just query the positive dollar amounts? ie. I have 3 records, 10 , -5 , 10. result would be 20.
I want an absolute value as a sum. ie. I have 3 records, 10, -5, 10. the result would be 25.
thanks.
FOR 1) Use conditional SUM()
SELECT SUM( CASE WHEN dollars > 0 then dollars ELSE 0 END) as positive_sum,
SUM( CASE WHEN dollars < 0 then dollars ELSE 0 END) as negative_sum
FROM financials
FOR 2) use ABS()
SELECT SUM( ABS( dollars ) )
FROM financials
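Using the three records from the question (10, -5, 10), both queries can be tried in a short sketch. It assumes SQLite via Python's sqlite3 module and that dollars is stored as an integer, so the cast from the original query is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE financials (dollars INTEGER)")
conn.executemany("INSERT INTO financials VALUES (?)", [(10,), (-5,), (10,)])

positive_sum, absolute_sum = conn.execute("""
    SELECT SUM(CASE WHEN dollars > 0 THEN dollars ELSE 0 END),  -- positives only
           SUM(ABS(dollars))                                    -- magnitudes
    FROM financials
""").fetchone()

print(positive_sum, absolute_sum)
```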
Please try below queries. Thanks.
1) select sum(cast(dollars as dec))
from financials
where dollars > 0;
2) select sum(cast(abs(dollars) as dec))
from financials;
There are two questions here. The solutions are as follows:
1.
select sum(dollars) from financials where dollars > 0
2.
select sum(case when dollars > 0 then dollars else 0 end) + sum(case when dollars < 0 then -1 * dollars else 0 end) from financials

T-SQL Sum Case Confusion

I am currently doing a SUM CASE to work out a percentage; however, the entire column returns zeros or ones (integers) and I don't know why. I have written the SQL in parts to break it out and ensure the underlying data is correct, which it is, but when I add the last part to do the percentage it fails. Am I missing something?
SELECT
SUPPLIERCODE,
(SUM(CASE WHEN ISNULL(DATESUBMITTED,0) - ISNULL(FAILDATE,0) <15 THEN 1 ELSE 0 END)) AS ACCEPTABLE,
COUNT(ID) AS TOTALSUBMITTED,
(SUM(CASE WHEN ISNULL(DATESUBMITTED,0) - ISNULL(FAILDATE,0) <15 THEN 1 ELSE 0 END)/COUNT(ID))
FROM SUPPLIERDATA
GROUP BY SUPPLIERCODE
For example here's some of the data returned:
SUPPLIERCODE ACCEPTABLE TOTALSUBMITTED Column1
HBFDE2 1018 1045 0
DTETY1 4 4 1
SWYTR2 579 736 0
VFTEQ3 2104 2438 0
I know I could leave the other columns and use an excel calculation but I'd rather not... Any help would be well received. Thanks
SELECT
SUPPLIERCODE,
(SUM(CASE WHEN ISNULL(DATESUBMITTED,0) - ISNULL(FAILDATE,0) <15 THEN 1 ELSE 0 END)) AS ACCEPTABLE,
COUNT(ID) AS TOTALSUBMITTED,
(SUM(CASE WHEN ISNULL(DATESUBMITTED,0) - ISNULL(FAILDATE,0) <15 THEN 1 ELSE 0 END)*1.0/COUNT(ID))
FROM SUPPLIERDATA
GROUP BY SUPPLIERCODE
You need to convert the result to a non-integer type. This is easily done by multiplying by 1.0.
This is due to the fact that SQL Server is treating your values as INTs for the purpose of division.
Try the following and you will see the answer 0:
PRINT 1018 / 1045
In order to allow your operation to work correctly, you need to convert your values to FLOATs, like so:
PRINT CAST(1018 AS FLOAT) / 1045
This will produce the answer 0.974163 as expected.
A simple change to your statement to introduce a cast to FLOAT will sort your problem:
SELECT
SUPPLIERCODE,
(SUM(CASE WHEN ISNULL(DATESUBMITTED,0) - ISNULL(FAILDATE,0) <15 THEN 1 ELSE 0 END)) AS ACCEPTABLE,
COUNT(ID) AS TOTALSUBMITTED,
(CAST(SUM(CASE WHEN ISNULL(DATESUBMITTED,0) - ISNULL(FAILDATE,0) <15 THEN 1 ELSE 0 END) AS FLOAT) / COUNT(ID))
FROM SUPPLIERDATA
GROUP BY SUPPLIERCODE
All you have to do is avoid integer division by giving your database engine a hint.
In SQL Server, you would use:
SELECT
SUPPLIERCODE,
(SUM(CASE WHEN ISNULL(DATESUBMITTED, 0) - ISNULL(FAILDATE, 0) < 15 THEN 1 ELSE 0 END)) AS ACCEPTABLE,
COUNT(ID) AS TOTALSUBMITTED,
(SUM(CASE WHEN ISNULL(DATESUBMITTED, 0) - ISNULL(FAILDATE, 0) < 15 THEN 1 ELSE 0 END) / (COUNT(ID) * 1.0))
FROM
SUPPLIERDATA
GROUP BY
SUPPLIERCODE
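As a rough check of the answers above, the grouped ratio can be reproduced with two of the suppliers from the question's sample output. This sketch assumes SQLite via Python's sqlite3 module in place of SQL Server, and it replaces the date arithmetic with a hypothetical 0/1 column named ACCEPTABLE so that the counts (1018 of 1045, and 4 of 4) are easy to set up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SUPPLIERDATA (SUPPLIERCODE TEXT, ACCEPTABLE INTEGER)")

# (code, acceptable count, total submitted) taken from the sample output.
rows = []
for code, ok, total in [("HBFDE2", 1018, 1045), ("DTETY1", 4, 4)]:
    rows += [(code, 1)] * ok + [(code, 0)] * (total - ok)
conn.executemany("INSERT INTO SUPPLIERDATA VALUES (?, ?)", rows)

results = conn.execute("""
    SELECT SUPPLIERCODE,
           SUM(ACCEPTABLE) / COUNT(*),        -- integer division: 0 or 1
           SUM(ACCEPTABLE) * 1.0 / COUNT(*)   -- real-valued ratio
    FROM SUPPLIERDATA
    GROUP BY SUPPLIERCODE
    ORDER BY SUPPLIERCODE
""").fetchall()

for row in results:
    print(row)
```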