SQL Query SUM and Divide by Distinct Date Count - sql

I need assistance. I am looking for the sum of a data field and then want to divide it by the number of distinct dates in that field.
SUM(CASE WHEN dateResolved IS NOT NULL
THEN 1 ELSE 0
END) / DISTINCT(dateResolved) AvgPerDay
If there are 32 dates in dateResolved, with 5 distinct dates, I want it to return 6.4.

By default it does integer division, so you need:
SUM(CASE WHEN dateResolved IS NOT NULL
THEN 1 ELSE 0
END) * 1.0 / COUNT(DISTINCT dateResolved) AvgPerDay
However, a simple COUNT would also work:
COUNT(dateResolved) * 1.0 / COUNT(DISTINCT dateResolved) AvgPerDay
COUNT(dateResolved) will ignore null values.
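To check the numbers from the question, here is a minimal sketch using Python's built-in sqlite3 module. The tickets table and its rows are made up for illustration (32 resolved rows over 5 distinct dates, plus 3 NULLs), and SQLite is only a stand-in: other databases may handle integer division differently.

```python
import sqlite3

# Hypothetical data: 32 resolved rows over 5 distinct dates, plus 3 unresolved (NULL) rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (dateResolved TEXT)")
per_day = {
    "2024-01-01": 7,
    "2024-01-02": 7,
    "2024-01-03": 6,
    "2024-01-04": 6,
    "2024-01-05": 6,
}
rows = [(day,) for day, n in per_day.items() for _ in range(n)] + [(None,)] * 3
conn.executemany("INSERT INTO tickets VALUES (?)", rows)

int_avg, real_avg = conn.execute(
    """
    SELECT COUNT(dateResolved) / COUNT(DISTINCT dateResolved),
           COUNT(dateResolved) * 1.0 / COUNT(DISTINCT dateResolved)
    FROM tickets
    """
).fetchone()

print(int_avg)   # 6   (integer division truncates; the NULL rows are not counted)
print(real_avg)  # 6.4 (multiplying by 1.0 forces real division)
```

The three NULL rows change neither result, because COUNT(dateResolved) and COUNT(DISTINCT dateResolved) both skip NULLs.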

I would do this as:
SUM(CASE WHEN dateResolved IS NOT NULL
THEN 1.0 ELSE 0
END) / COUNT(DISTINCT dateResolved) as AvgPerDay
But this is more simply phrased as:
COUNT(dateResolved) * 1.0 / COUNT(DISTINCT dateResolved) as AvgPerDay

Related

SQL - Why is my percentage aggregation blank?

I'm trying to get the percentage of issues that are marked as closed, but for some reason it's coming out as 0 for all entries.
Any idea what I could be doing wrong?
SELECT
CASE
WHEN COUNT(IF(progress = 'CLOSED', id)) = 0 THEN 0
ELSE 1.0 * (COUNT(IF(progress = 'CLOSED', id)) / COUNT(*))
END as pct_closed,
assigned_date
FROM table
WHERE assigned_date >= YYYY-MM-DD
GROUP BY 2
Try this:
SELECT AVG(CASE WHEN progress = 'CLOSED' THEN 1.0 ELSE 0 END) as closed_ratio,
assigned_date
FROM table
WHERE assigned_date >= ? -- date format should be YYYY-MM-DD
GROUP BY assigned_date;
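A quick way to try the AVG version is with Python's sqlite3 module. This is only a sketch: the issues table name and its rows are assumptions for illustration (the question's table is literally called "table"), and the date filter relies on ISO YYYY-MM-DD strings comparing correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, not taken from the question.
conn.execute("CREATE TABLE issues (id INTEGER, progress TEXT, assigned_date TEXT)")
conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?)",
    [
        (1, "CLOSED", "2024-01-01"),
        (2, "OPEN", "2024-01-01"),
        (3, "OPEN", "2024-01-01"),
        (4, "OPEN", "2024-01-01"),
        (5, "CLOSED", "2024-01-02"),
        (6, "OPEN", "2024-01-02"),
        (7, "CLOSED", "2023-12-31"),  # excluded by the WHERE clause below
    ],
)

closed_ratio = conn.execute(
    """
    SELECT assigned_date,
           AVG(CASE WHEN progress = 'CLOSED' THEN 1.0 ELSE 0 END) AS closed_ratio
    FROM issues
    WHERE assigned_date >= ?
    GROUP BY assigned_date
    ORDER BY assigned_date
    """,
    ("2024-01-01",),  # bound parameter in YYYY-MM-DD format
).fetchall()

print(closed_ratio)  # [('2024-01-01', 0.25), ('2024-01-02', 0.5)]
```

Averaging 1.0/0 flags gives the ratio directly, so there is no separate division step where integer truncation could happen.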

SQL - How to get the percentage of 2 sums

I am trying to get the CTR from uploads to clicks. How would I be able to get this percentage?
I am currently using the code below and I receive 0 as a result.
SELECT geo_country, [created_at:date:aggregation] AS day,
SUM(case when name = 'adclick' then 1 else 0 end) as clicks,
SUM(case when name = 'camera_upload_image' then 1 else 0 end) as uploads,
CONVERT(DECIMAL(10,2), clicks / NULLIF(uploads, 0)) As CTR
FROM events
What you're running into is integer division.
If you take 2 integers
clicks = 3
uploads = 2
Then clicks / uploads equals 1 - not 1.5 as you expect:
http://sqlfiddle.com/#!18/0ed4a9/8/0
As you can see from that fiddle, the way around this is to ensure that your values are cast/converted to floating point numbers before doing the division.
Convert your results to decimal also:
SELECT geo_country, [created_at:date:aggregation] AS day,
SUM(case when name = 'adclick' then 1 else 0 end) as clicks,
SUM(case when name = 'camera_upload_image' then 1 else 0 end) as uploads,
CONVERT(DECIMAL(10,2), CONVERT(DECIMAL(10,2),clicks) / NULLIF(CONVERT(DECIMAL(10,2),uploads), 0)) As CTR
FROM events
Use a subquery or CTE:
SELECT c.*, clicks * 1.0 / NULLIF(uploads, 0) as ratio
FROM (SELECT geo_country, [created_at:date:aggregation] AS day,
SUM(case when name = 'adclick' then 1 else 0 end) as clicks,
SUM(case when name = 'camera_upload_image' then 1 else 0 end) as uploads
FROM events
GROUP BY geo_country, [created_at:date:aggregation]
) c;
The * 1.0 is to avoid integer division issues (i.e. 1/2 = 0 rather than 0.5). The NULLIF() is to avoid division by zero.
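The subquery approach can be tried out with Python's sqlite3 module. This is a rough sketch under assumptions: a plain day column stands in for the [created_at:date:aggregation] expression, the rows are invented, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "day" is a stand-in column; the sample rows are hypothetical.
conn.execute("CREATE TABLE events (geo_country TEXT, day TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("US", "2024-01-01", "adclick")] * 3
    + [("US", "2024-01-01", "camera_upload_image")] * 2
    + [("DE", "2024-01-01", "adclick")],  # DE has no uploads at all
)

ctr = conn.execute(
    """
    SELECT c.geo_country,
           c.clicks / NULLIF(c.uploads, 0)       AS int_ratio,
           c.clicks * 1.0 / NULLIF(c.uploads, 0) AS ratio
    FROM (SELECT geo_country, day,
                 SUM(CASE WHEN name = 'adclick' THEN 1 ELSE 0 END) AS clicks,
                 SUM(CASE WHEN name = 'camera_upload_image' THEN 1 ELSE 0 END) AS uploads
          FROM events
          GROUP BY geo_country, day) c
    ORDER BY c.geo_country
    """
).fetchall()

print(ctr)  # [('DE', None, None), ('US', 1, 1.5)]
```

For US, 3 clicks over 2 uploads is 1 with integer division and 1.5 once one operand is real. For DE, NULLIF turns the zero divisor into NULL, so the ratio comes back as NULL (None in Python) instead of raising or returning a misleading number.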

Why do the parentheses make a difference in this SQL query

The objective:
Find the percentage of high elevation airports (elevation >= 2000) by
state from the airports table.
In the query, alias the percentage column as
percentage_high_elevation_airports.
Could someone explain why the following 2 SQL statements give different results:
Correct result:
SELECT state,
100.0 * sum(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / count(*) as percentage_high_elevation_airports
FROM airports
GROUP BY state;
sample result:
MS 0.0
MT 100.0
NC 11.1111111111111
ND 10.0
and wrong result:
select
state,
100.0 * (sum(case when elevation >= 2000 then 1 else 0 end)/count(*)) as percentage_high_elevation_airports
from airports
group by 1;
sample result:
MS 0.0
MT 100.0
NC 0.0
ND 0.0
The only difference is the additional () around the division of the sum by the count.
I would write this as:
SELECT state,
AVG(CASE WHEN elevation >= 2000 THEN 100.0 ELSE 0 END) as percentage_high_elevation_airports
FROM airports
GROUP BY state;
The issue is integer arithmetic. Some databases do integer division and return an integer, so 1/2 is 0 rather than 0.5. Some databases also apply this to avg() (although even some that do integer division compute non-integer averages).
I should note that this is database-specific.
Your question is not about another/better solution to your query
but about the wrong results you get with the use of parentheses, right?
Because:
sum(case when elevation >= 2000 then 1 else 0 end)
results in an integer
and count(*) is by definition an integer.
The division between them is an integer division truncating any decimal digits.
So you get 0 instead of 0.5 or 0.05.
To avoid situations like this, you can multiply by a real number first (100.0, as in your first query) and then divide.
Or you could do this:
sum(case when elevation >= 2000 then 1.0 else 0.0 end)
which results in a sum that is a floating point number.
In any case make sure that at least one of the operands of the division is a real number.
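The effect of the parentheses can be reproduced with Python's sqlite3 module. This is a sketch with invented airports rows (NC has 1 high-elevation airport out of 9, MT has 2 out of 2), and SQLite is just one example of a database that does integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample: NC has 1 of 9 airports at or above 2000, MT has 2 of 2.
conn.execute("CREATE TABLE airports (state TEXT, elevation INTEGER)")
conn.executemany(
    "INSERT INTO airports VALUES (?, ?)",
    [("NC", 2500)] + [("NC", 100)] * 8 + [("MT", 3000), ("MT", 4500)],
)

pct = conn.execute(
    """
    SELECT state,
           100.0 * (SUM(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / COUNT(*)) AS grouped,
           100.0 * SUM(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / COUNT(*)   AS ungrouped,
           AVG(CASE WHEN elevation >= 2000 THEN 100.0 ELSE 0 END)                  AS averaged
    FROM airports
    GROUP BY state
    ORDER BY state
    """
).fetchall()

for state, grouped, ungrouped, averaged in pct:
    # grouped divides two integers first (1 / 9 is 0), then multiplies by 100.0
    # ungrouped multiplies by 100.0 first, so the division is already real
    print(state, grouped, round(ungrouped, 2), round(averaged, 2))
```

With this data NC comes out as 0.0 in the parenthesized form and about 11.11 in the other two, while MT is 100.0 everywhere because 2 / 2 happens to divide evenly.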
Try the query below. You need to change the placement of your parentheses:
select
state,
(100.0 * sum(case when elevation >= 2000 then 1 else 0 end)) / count(*) as percentage_high_elevation_airports
from airports
group by 1

How do I get percentage amount of categorical variables per day using SQL?

I've been stuck on this. My end goal is to get the % negative, % positive, and % neutral for the overall data, and also grouped by date (daily) as well as by category. Thank you.
Just use window functions:
select mlsentimentzone,
(count(*) * 1.0 / sum(count(*)) over ()) as ratio
from t
group by mlsentimentzone;
Or, if you want this by date, use conditional aggregation:
select date,
avg(case when mlsentimentzone = 'negative' then 1.0 else 0.0 end) as negative,
avg(case when mlsentimentzone = 'neutral' then 1.0 else 0.0 end) as neutral,
avg(case when mlsentimentzone = 'positive' then 1.0 else 0.0 end) as positive
from t
group by date
order by date;
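Both queries can be tried with Python's sqlite3 module. This is a sketch on four invented rows, and it assumes the bundled SQLite is version 3.25 or newer, since older versions have no window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for table t.
conn.execute("CREATE TABLE t (date TEXT, mlsentimentzone TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2024-01-01", "negative"),
        ("2024-01-01", "positive"),
        ("2024-01-02", "negative"),
        ("2024-01-02", "neutral"),
    ],
)

# Overall share per category: group count divided by the windowed total of all group counts.
overall = conn.execute(
    """
    SELECT mlsentimentzone,
           COUNT(*) * 1.0 / SUM(COUNT(*)) OVER () AS ratio
    FROM t
    GROUP BY mlsentimentzone
    ORDER BY mlsentimentzone
    """
).fetchall()

# Daily share per category: average of 1.0/0.0 flags within each date.
daily = conn.execute(
    """
    SELECT date,
           AVG(CASE WHEN mlsentimentzone = 'negative' THEN 1.0 ELSE 0.0 END) AS negative,
           AVG(CASE WHEN mlsentimentzone = 'neutral'  THEN 1.0 ELSE 0.0 END) AS neutral,
           AVG(CASE WHEN mlsentimentzone = 'positive' THEN 1.0 ELSE 0.0 END) AS positive
    FROM t
    GROUP BY date
    ORDER BY date
    """
).fetchall()

print(overall)  # [('negative', 0.5), ('neutral', 0.25), ('positive', 0.25)]
print(daily)    # [('2024-01-01', 0.5, 0.0, 0.5), ('2024-01-02', 0.5, 0.5, 0.0)]
```

Multiply the ratios by 100 if you want percentages rather than fractions.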

SQL percentage with rows same table with different where condition

I want to do a query like:
select
count(asterisk) where acción='a'/count(asterisk) where acción='b' * 100
from
same_table
grouped by day
but I don't want to use a subquery. Is it possible with joins?
I'm not sure the syntax is correct, but you can use something like this:
SELECT day,
SUM(CASE WHEN "acción" = 'a' THEN 1 ELSE 0 END) AS SUM_A,
SUM(CASE WHEN "acción" = 'b' THEN 1 ELSE 0 END) AS SUM_B,
SUM(CASE WHEN "acción" = 'a' THEN 1 ELSE 0 END) * 100.0 / NULLIF(SUM(CASE WHEN "acción" = 'b' THEN 1 ELSE 0 END), 0) AS result
FROM your_table
GROUP BY day
The concept is to sum only the values that you need, using a CASE expression per condition, instead of counting rows.
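Here is a sketch of the single-pass idea in Python's sqlite3 module, with no subquery and no join. The rows are invented, and two details are my own additions rather than part of the answer: multiplying by 100.0 before dividing (to avoid integer division) and NULLIF on the divisor (to avoid dividing by zero on days with no 'b' rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: day 1 has 3 'a' and 4 'b'; day 2 has a single 'a' and no 'b'.
conn.execute('CREATE TABLE your_table (day TEXT, "acción" TEXT)')
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("2024-01-01", "a")] * 3 + [("2024-01-01", "b")] * 4 + [("2024-01-02", "a")],
)

result = conn.execute(
    """
    SELECT day,
           SUM(CASE WHEN "acción" = 'a' THEN 1 ELSE 0 END) AS sum_a,
           SUM(CASE WHEN "acción" = 'b' THEN 1 ELSE 0 END) AS sum_b,
           SUM(CASE WHEN "acción" = 'a' THEN 1 ELSE 0 END) * 100.0
             / NULLIF(SUM(CASE WHEN "acción" = 'b' THEN 1 ELSE 0 END), 0) AS result
    FROM your_table
    GROUP BY day
    ORDER BY day
    """
).fetchall()

print(result)  # [('2024-01-01', 3, 4, 75.0), ('2024-01-02', 1, 0, None)]
```

Each conditional SUM scans the same grouped rows once, which is why neither a join nor a subquery is needed.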