BigQuery throws "division by zero: 0 / 0" error - sql

I am calculating a percentage from the frequency of a column value in BigQuery. However, some of the counts might be zero, in which case the query fails with this error:
(division by zero: 0 / 0)
How can I apply something like IFERROR(x/y, null) here, so that the query returns NULL instead of an error?
SELECT
User_ID,
ROUND(SUM(CASE WHEN Name LIKE '%MIKE%' THEN 1 ELSE 0 END) / COUNT(Name) * 100 ,1) AS Percentage_of_MIKE,
FROM
table
GROUP BY
User_ID
TRIED:
ROUND(SAFE_DIVIDE(SUM(CASE WHEN Name LIKE '%MIKE%' THEN 1 ELSE 0 END) / COUNT(Name) * 100 ,1)) AS Percentage_of_MIKE,

You can use the SAFE_DIVIDE function in such cases. It takes the numerator and the denominator as two separate arguments and returns NULL when the denominator is zero.
For example:
ROUND(SAFE_DIVIDE(SUM(CASE WHEN Name LIKE '%MIKE%' THEN 1 ELSE 0 END) * 100, COUNT(Name)), 1) AS Percentage_of_MIKE

I tend to use NULLIF() for this purpose, because I like using the division operator for division:
SELECT User_ID,
ROUND(COUNTIF(Name LIKE '%MIKE%') * 100 / NULLIF(COUNT(Name), 0), 1) AS Percentage_of_MIKE
FROM table
GROUP BY User_ID;
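To make the NULL behaviour of the two answers above concrete, here is a minimal Python sketch that models it outside of any database. The helper names `nullif`, `divide` and `safe_divide` are hypothetical stand-ins written for this illustration, with `None` playing the role of SQL NULL; they are not BigQuery APIs.

```python
from typing import Optional

Number = Optional[float]  # None stands in for SQL NULL


def nullif(value: Number, other: float) -> Number:
    """Model of NULLIF(value, other): NULL when the two are equal."""
    return None if value == other else value


def divide(numerator: Number, denominator: Number) -> Number:
    """Model of the / operator: a NULL operand gives NULL; zero still raises."""
    if numerator is None or denominator is None:
        return None
    return numerator / denominator


def safe_divide(numerator: Number, denominator: Number) -> Number:
    """Model of SAFE_DIVIDE: NULL instead of an error for a zero denominator."""
    if denominator == 0:
        return None
    return divide(numerator, denominator)


# A group with 1 match out of 4 names, and a group with no non-NULL names.
with_names = divide(1 * 100, nullif(4, 0))
without_names = divide(0 * 100, nullif(0, 0))
print(with_names, without_names, safe_divide(0 * 100, 0))
```

In this model, NULLIF turns the zero denominator into NULL before the division happens, so the plain division operator never sees a zero.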

This error indicates that you have User_IDs for which every Name is NULL. COUNT(Name) counts only the non-null values of Name, so for those users the denominator of your division is 0 and you get the division by zero error.
A simple way to avoid this is to use AVG():
ROUND(AVG(CASE
WHEN Name LIKE '%MIKE%' THEN 1.0
WHEN Name IS NOT NULL THEN 0
END) * 100, 1) AS Percentage_of_MIKE
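The effect of an all-NULL group can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for BigQuery, which is an assumption made only so the example runs locally; the table `t` and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (User_ID INTEGER, Name TEXT)")
# Hypothetical data: user 1 has two names, user 2 has only NULL names.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "MIKE"), (1, "ANNA"), (2, None), (2, None)],
)

rows = conn.execute(
    """
    SELECT User_ID,
           COUNT(Name) AS non_null_names,
           ROUND(AVG(CASE
                       WHEN Name LIKE '%MIKE%' THEN 1.0
                       WHEN Name IS NOT NULL THEN 0
                     END) * 100, 1) AS Percentage_of_MIKE
    FROM t
    GROUP BY User_ID
    ORDER BY User_ID
    """
).fetchall()

for row in rows:
    print(row)
```

For user 2 the CASE expression yields NULL on every row, so AVG() has nothing to average and returns NULL rather than dividing by zero.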

Related

Select percentage of another column in postgresql

I would like to select, grouped by avfamily, the amount of records that have livingofftheland value equalling true and return it as the perc value.
Essentially column 3 divided by column 2 times 100.
select
avclassfamily,
count(distinct(malware_id)) as cc,
sum(case when livingofftheland = 'true' then 1 else 0 end),
(100.0 * (sum(case when livingofftheland = 'true' then 1 else 0 end) / (count(*)) ) ) as perc
from malwarehashesandstrings
group by avclassfamily having count(*) > 5000
order by perc desc;
Probably quite simple, but my brain's drawing a blank here.
"select, grouped by avfamily, the amount of records that have livingofftheland value equalling true and return it as the perc value."
You could simply use avg() for this:
select
avclassfamily,
count(distinct(malware_id)) as cc,
avg(livingofftheland::int) * 100 as perc
from malwarehashesandstrings
group by avclassfamily
having count(*) > 5000
order by perc desc
livingofftheland::int turns the boolean value into 0 (false) or 1 (true). The average of this value gives you the ratio of records that satisfy the condition in the group, as a decimal number between 0 and 1, which you can then multiply by 100.
I would express this as:
select avclassfamily,
count(distinct malware_id) as cc,
count(*) filter (where livingofftheland = 'true'),
( count(*) filter (where livingofftheland = 'true') * 100.0 /
count(distinct malware_id)
) as perc
from malwarehashesandstrings
group by avclassfamily
having count(*) > 5000
order by perc desc;
Note that this replaces the conditional aggregation with filter, a SQL standard construct that Postgres supports. It also puts the 100.0 right next to the /, just to be sure Postgres doesn't decide to do integer division.

Find percentage of rows that meet a condition without rounding

I'm trying to find the percentage of rows that meet a specific condition. My query is close but the answer is always rounded to the nearest whole number.
For instance this query below is returning 6 but it should return 6.25. Meaning 6.25% of the rows meet that condition. How would I do this?
select sum(case when name like 'H%' then 1 else 0 end) * 100 / count(*)
from category
Just add a decimal point:
select sum(case when name like 'H%' then 1.0 else 0 end) * 100 / count(*)
from category;
Postgres does integer division.
You can also express this using avg():
select avg(case when name like 'H%' then 100.0 else 0 end)
from category;
The decimal point is not strictly needed here. Although Postgres does integer division of integers, avg() of integers returns a numeric value with decimal places.
And this can be phrased more simply (assuming that name is not NULL):
select avg( (name like 'H%')::int ) * 100
from category;
You need a decimal value; otherwise, since both the numerator and denominator are integers, Postgres does integer division.
But here, I would just use avg(), for which you don't need a decimal value:
select avg(case when name like 'H%' then 1 else 0 end) * 100
from category
You're performing integer division, which discards everything to the right of the decimal point. If you use a decimal literal such as 100.0, you should be OK:
select sum(case when name like 'H%' then 1 else 0 end) * 100.0 / count(*)
from category
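The difference between the variants above can be reproduced with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for Postgres (SQLite also truncates integer division) and a hypothetical `category` table with 16 rows, one of which starts with 'H'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (name TEXT)")
# 16 hypothetical rows, exactly one of which starts with 'H'.
names = ["Hats"] + [f"Item{i}" for i in range(15)]
conn.executemany("INSERT INTO category VALUES (?)", [(n,) for n in names])


def scalar(sql: str):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Integer operands only: the division truncates.
integer_pct = scalar(
    "select sum(case when name like 'H%' then 1 else 0 end) * 100 / count(*) "
    "from category"
)
# A decimal literal forces non-integer arithmetic.
decimal_pct = scalar(
    "select sum(case when name like 'H%' then 1 else 0 end) * 100.0 / count(*) "
    "from category"
)
# avg() does not truncate, even over integer inputs.
avg_pct = scalar(
    "select avg(case when name like 'H%' then 1 else 0 end) * 100 from category"
)

print(integer_pct, decimal_pct, avg_pct)
```

Only the first query loses the fractional part; the other two keep it.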

sql: percentage of a type in a column

I'm trying to get percentage of missed calls for each user, so I used the following sql query:
select distinct a__contact
, count (case when a__type = 'missed'
then 1 else 0 end) / count(*) * 100
as "percentage of missed calls"
from table
group by 1
However, for each user I got 100, which does not seem to be the correct output at all. Could someone help me identify the error in my query? Thank you so much!
Here is a simpler way to express the logic you want:
select a__contact,
avg(case when a__type = 'missed' then 100.0 else 0 end) as percentage_missed_calls
from table
group by 1;
Your version is failing because you are using count() in the numerator. You really intend sum(). count() counts the number of non-NULL values and both "1" and "0" are non-NULL.
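The count() versus sum() difference can be seen side by side in a small sketch. It assumes Python's built-in sqlite3 module as the engine and a hypothetical `calls` table in place of the asker's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (a__contact TEXT, a__type TEXT)")
# Hypothetical data: alice missed 1 of 4 calls, bob missed 2 of 2.
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("alice", "missed"), ("alice", "answered"),
        ("alice", "answered"), ("alice", "answered"),
        ("bob", "missed"), ("bob", "missed"),
    ],
)

rows = conn.execute(
    """
    select a__contact,
           -- counts every row, because both 1 and 0 are non-NULL
           count(case when a__type = 'missed' then 1 else 0 end) as counted,
           -- adds up only the rows flagged with 1
           sum(case when a__type = 'missed' then 1 else 0 end) as summed,
           count(*) as total,
           avg(case when a__type = 'missed' then 100.0 else 0 end)
               as percentage_missed_calls
    from calls
    group by a__contact
    order by a__contact
    """
).fetchall()

for row in rows:
    print(row)
```

Here `counted` always equals `total`, which is why the original query reports 100 for every user.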

Select NULL if a calculated column value is negative in SQL

I have the following piece of SQL below. The second line (commented out) is my addition: it checks whether the calculation returns a negative value, in which case it should select NULL. This CASE sits within a block of several other CASE expressions. Since my approach means running the same calculation twice, is there a better or more efficient way to select NULL when the calculated value is negative, rather than doing two similar calculations?
Thanks
CASE
WHEN M.ALPHA = 'B' OR T.CT IN (0.001, 0.002) THEN NULL
-- WHEN ((M.VAL / NULLIF (M.VAL2, 0)) / (NULLIF (T.VAL, 0) / T.VAL2)) < 0 THEN NULL
ELSE (M.VAL / NULLIF (M.VAL2, 0)) / (NULLIF (T.VAL, 0) / T.VAL2)
END As WORLD
You could move the calculation to a subquery. For example:
select case
when CalculatedColumn > 42 then 'Hot'
when CalculatedColumn < 42 then 'Cold'
else 'Answer'
end as Description
from (
select 2 * col1 + 3 as CalculatedColumn
from YourTable
) SubQuery
Sometimes it's clearer to define the subquery in a with clause:
; with SubQuery as
(
select 2 * col1 + 3 as CalculatedColumn
from YourTable
)
select case
when CalculatedColumn > 42 then 'Hot'
when CalculatedColumn < 42 then 'Cold'
else 'Answer'
end as Description
from SubQuery
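Applied to a negative-value check, the same idea might look like the sketch below. It assumes Python's built-in sqlite3 module as the engine, and the `readings` table with a simplified `val / nullif(val2, 0)` expression is a hypothetical stand-in for the longer calculation in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, val REAL, val2 REAL)")
# Hypothetical data: a positive ratio, a negative ratio, and a zero divisor.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, 10.0, 4.0), (2, -6.0, 3.0), (3, 5.0, 0.0)],
)

rows = conn.execute(
    """
    select id,
           -- the ratio is computed once in the subquery and reused here
           case when ratio < 0 then null else ratio end as world
    from (
        select id, val / nullif(val2, 0) as ratio
        from readings
    ) as computed
    order by id
    """
).fetchall()

for row in rows:
    print(row)
```

The outer CASE refers to `ratio` by name, so the division is written only once.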

SQL: Having trouble with query that gets percentages using aggregate functions

I'm not an expert in SQL by any means, and am having a hard time getting the data I need from a query. I'm working with a single table, Journal_Entry, that has a number of columns. One column is Status_ID, which is a foreign key to a Status table with three values "Green", "Yellow", and "Red". Also, a journal entry is logged against a particular User (User_ID).
I'm trying to get the number of journal entries logged for each Status, as a percentage of the total number of journal entries logged by a particular user. So far I've got the following for a Status of 1, which is green (and I know this doesn't work):
SELECT CAST((SELECT COUNT(Journal_Entry_ID)
FROM Journal_Entry
WHERE Status_ID = 1 AND User_ID = 3 /
SELECT COUNT(Journal_Entry_ID)
FROM Journal_Entry AND User_ID = 3)) AS FLOAT * 100
I need to continue the query for the other two status IDs, 2 and 3, and ideally would like to end up selecting three columns as percentages, one for each Status: "Green_Percent", "Yellow_Percent", and "Red_Percent".
This is probably the most disjointed question I've ever asked, so I apologize for any lack of clarity. I'll be happy to clarify as necessary. Also, I'm using SQL Server 2005.
Thanks very much.
Use:
SELECT je.statusid,
COUNT(*) AS num,
(COUNT(*) / (SELECT COUNT(*)+.0
FROM JOURNAL_ENTRY) ) * 100
FROM JOURNAL_ENTRY je
GROUP BY je.statusid
Then it's a matter of formatting the precision you want:
CAST(((COUNT(*) / (SELECT COUNT(*)+.0 FROM JOURNAL_ENTRY)) * 100)
AS DECIMAL(5,2))
...will give two decimal places; DECIMAL(5,2) leaves room for a value of 100.00. Cast the result to INT if you don't want any decimal places.
You could use a CTE to minimize the duplication:
WITH cte AS (
SELECT je.*
FROM JOURNAL_ENTRY je
WHERE je.user_id = 3)
SELECT c.statusid,
COUNT(*) AS num,
(COUNT(*) / (SELECT COUNT(*)+.0
FROM cte) ) * 100
FROM cte c
GROUP BY c.statusid
This should work:
SELECT
user_id,
(CAST(SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS DECIMAL(10, 4))/COUNT(*)) * 100 AS pct_green,
(CAST(SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS DECIMAL(10, 4))/COUNT(*)) * 100 AS pct_yellow,
(CAST(SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS DECIMAL(10, 4))/COUNT(*)) * 100 AS pct_red
FROM
Journal_Entry
WHERE
user_id = 1
GROUP BY
user_id
If you don't need the user_id returned then you could get rid of that and the GROUP BY clause as long as you're only ever returning data for one user (or you want the aggregates for all users in the WHERE clause). If you want it for each user then you can keep the GROUP BY and simply get rid of the WHERE clause.
DECLARE @JournalEntry TABLE
( StatusID INT
);
INSERT INTO @JournalEntry (StatusID) VALUES
(1), (1),(1),(1),(1),(1),(1)
,(2), (2),(2),(2),(2),(2),(2)
,(3), (3),(3),(3),(3),(3),(3);
SELECT
CAST(SUM(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) AS DECIMAL) / CAST(COUNT(*) AS DECIMAL) Green
,CAST(SUM(CASE WHEN StatusID = 2 THEN 1 ELSE 0 END) AS DECIMAL) / CAST(COUNT(*) AS DECIMAL) Yellow
,CAST(SUM(CASE WHEN StatusID = 3 THEN 1 ELSE 0 END) AS DECIMAL) / CAST(COUNT(*) AS DECIMAL) Red
FROM @JournalEntry;
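The one-column-per-status approach can be tried end to end with a short sketch. It assumes Python's built-in sqlite3 module rather than SQL Server 2005, uses `100.0 *` in place of the CAST to force non-integer division, and the rows in `Journal_Entry` are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Journal_Entry (User_ID INTEGER, Status_ID INTEGER)")
# Hypothetical data: user 3 has 2 green, 1 yellow and 1 red entry;
# user 4 has a single red entry.
conn.executemany(
    "INSERT INTO Journal_Entry VALUES (?, ?)",
    [(3, 1), (3, 1), (3, 2), (3, 3), (4, 3)],
)

rows = conn.execute(
    """
    SELECT User_ID,
           100.0 * SUM(CASE WHEN Status_ID = 1 THEN 1 ELSE 0 END) / COUNT(*)
               AS Green_Percent,
           100.0 * SUM(CASE WHEN Status_ID = 2 THEN 1 ELSE 0 END) / COUNT(*)
               AS Yellow_Percent,
           100.0 * SUM(CASE WHEN Status_ID = 3 THEN 1 ELSE 0 END) / COUNT(*)
               AS Red_Percent
    FROM Journal_Entry
    GROUP BY User_ID
    ORDER BY User_ID
    """
).fetchall()

for row in rows:
    print(row)
```

Because the query groups by User_ID, each user's three percentages are computed against that user's own total.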