SQL Percentage of Occurrences - sql

I'm working on some SQL code as part of my university work. The data is fictitious, just to be clear. I'm trying to count the occurrences of 1 and 0 in the SQL table Fact_Stream; the value is stored in the Free_Stream column/attribute as a Boolean/bit value.
As calculations can't be made on bit values (at least not in the way I'm trying), I've converted the value to an integer. The table contains information on a streaming company's streams: a 1 indicates the stream was free of charge, a 0 indicates the stream was paid for. My code:
SELECT Fact_Stream.Free_Stream, ((CAST(Free_Stream AS INT)) / COUNT(*) * 100) As 'Percentage of Streams'
FROM Fact_Stream
GROUP BY Free_Stream
The result/output is nearly where I want it to be, but it doesn't display the percentage correctly.
Using MS SQL Management Studio | MS SQL Server 2012 (I believe)

The percentage should be based on all rows, so you need to divide the count per 1/0 by a count of all rows. The easiest way to get this is utilizing a Windowed Aggregate Function:
SELECT Fact_Stream.Free_Stream,
100.0 * COUNT(*) -- count per bit
/ SUM(COUNT(*)) OVER () -- sum of those counts = count of all rows
As "Percentage of Streams"
FROM Fact_Stream
GROUP BY Free_Stream
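To try the windowed version without a SQL Server instance, here is a minimal sketch that runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in (it needs version 3.25 or later for window functions), and the helper name free_stream_percentages and the sample flags are made up for illustration.

```python
import sqlite3


def free_stream_percentages(flags):
    """Return {flag: percentage of all rows} for a list of 0/1 flags."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Fact_Stream (Free_Stream INTEGER)")
    conn.executemany(
        "INSERT INTO Fact_Stream VALUES (?)", [(f,) for f in flags]
    )
    rows = conn.execute(
        """
        SELECT Free_Stream,
               100.0 * COUNT(*)            -- count per flag value
               / SUM(COUNT(*)) OVER ()     -- sum of those counts = all rows
               AS pct
        FROM Fact_Stream
        GROUP BY Free_Stream
        ORDER BY Free_Stream
        """
    ).fetchall()
    conn.close()
    return dict(rows)


# Hypothetical sample: three free streams and one paid stream.
print(free_stream_percentages([1, 1, 1, 0]))
```

The window is applied after grouping, so SUM(COUNT(*)) OVER () adds up the per-group counts and the table is only scanned once.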

Both the divisor and the dividend are INTs, so the result is also an INT. Make one of them a decimal (notice that I multiply by 100.0 before dividing). You should also divide the count of rows in each group by the total count of rows in the table:
select Free_Stream,
count(*) * 100.0 / (select count(*) from Fact_Stream) as 'Percentage of Streams'
from Fact_Stream
group by Free_Stream

Your equation is dividing the identifier (1 or 0) by the number of streams for each one, instead of dividing the count of free or paid by the total count. One way to do this is to get the total count first, then use it in your query:
declare @totalcount real;
select @totalcount = count(*) from Fact_Stream;
SELECT Fact_Stream.Free_Stream,
(Cast(Count(*) as real) / @totalcount)*100 AS 'Percentage of Streams'
FROM Fact_Stream
group by Fact_Stream.Free_Stream

Related

SQL: Performing window function and finding percent

I have a table that includes the columns Date, Gender, Age Group and Number of Fans. I need to show the split of page fans across age groups in %.
So far, I have been able to limit the data to the newest data (the most recent entry is 2018-10-06), but I have been unable to perform what I assume is needed: a window function to group the genders (M, F, U) together and then find the percent per age group. I greatly appreciate any help. Here is as far as I have gotten with success:
SELECT *
FROM fanspergenderage
WHERE fanspergenderage.date >= '2018-10-16'
GROUP BY fanspergenderage.gender, fanspergenderage.agegroup;
I need to show the split of page fans across age groups in %.
I interpret this as the proportion of all fans in each age group. You seem to be asking for something like this:
SELECT f.agegroup,
COUNT(*) as num_fans,
COUNT(*) * 1.0 / SUM(COUNT(*)) OVER () as ratio
FROM fanspergenderage f
WHERE f.date >= '2018-10-16'
GROUP BY f.agegroup;
The * 1.0 is because some databases do integer division.
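The integer-division point is easy to check. This is a small sketch using SQLite through Python's sqlite3 module, chosen only because SQLite happens to be one of the databases that truncates integer division; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 3 / 4 uses two integers; 3 * 1.0 / 4 promotes the numerator to a real first.
int_ratio, real_ratio = conn.execute("SELECT 3 / 4, 3 * 1.0 / 4").fetchone()
conn.close()

print(int_ratio, real_ratio)
```

Without the * 1.0 every ratio below 1 collapses to 0, which is why the multiplication has to happen before the division rather than after it.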

Is there a way to find percentage of non-zero vs zero values in one column?

I'm supposed to find the percentage of people having received aid.
I'm assuming the best way to do this is to find the number of rows that received 0 aid and the number of rows that have a value greater than 0, create two variables for those, and divide accordingly to find the percentage. It's been a while since I've worked with SQL, so this is challenging me.
select
rprawrd_aidy_code as year,
sum(rprawrd_accept_amt)
from
rprawrd
where
rprawrd_aidy_code = '1819'
group by
rprawrd_aidy_code
This only gives me a total of the amount of aid provided for the year in question. I need to figure out the total rows that received aid vs the total that didn't.
If the only output you need from your script is that ratio, there are a few ways to go about this one:
WITH cte (awrd) AS(
SELECT
CASE WHEN rprawrd_accept_amt > 0 THEN 1.0
ELSE 0.0
END awrd
FROM rprawrd
WHERE rprawrd_aidy_code = '1819'
)
SELECT SUM(awrd)/COUNT(awrd)
FROM cte
This will get you the fraction of people who received an award (multiply by 100 for a percentage), but if you need to know the amounts as well you'll have to approach it differently.
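The same flag-and-divide idea also works without a CTE. Below is a rough sketch against an in-memory SQLite table via Python's sqlite3 module; the table and column names come from the question, while the helper name share_with_aid and the sample amounts are invented for the example.

```python
import sqlite3


def share_with_aid(amounts, aid_year="1819"):
    """Percentage of rows for aid_year whose accepted amount is above zero."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rprawrd (rprawrd_aidy_code TEXT, rprawrd_accept_amt REAL)"
    )
    conn.executemany(
        "INSERT INTO rprawrd VALUES (?, ?)", [(aid_year, a) for a in amounts]
    )
    (pct,) = conn.execute(
        """
        SELECT 100.0
               * SUM(CASE WHEN rprawrd_accept_amt > 0 THEN 1 ELSE 0 END)
               / COUNT(*)
        FROM rprawrd
        WHERE rprawrd_aidy_code = ?
        """,
        (aid_year,),
    ).fetchone()
    conn.close()
    return pct


# Hypothetical data: two rows with no aid, two rows with aid.
print(share_with_aid([0, 0, 500, 1200]))
```

Note that this counts rows, not people. If one person can have several award rows, the rows would need to be grouped per person first.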

Min function in postgresql

I am trying to find the division with the lowest population density. To do so, I did the following:
SELECT P.edname, MIN((P.total_area*1000)/P.total2011) AS "Lowest population density"
FROM eds_census2011 P
GROUP BY P.edname
HAVING COUNT (*)> 1
total_area is multiplied by 1000 (so it is in square metres) and divided by the total population.
I want only one record, displaying the division (edname) and the calculated population density (MIN((P.total_area*1000)/P.total2011)). Instead I get all the records, and they are not even sorted.
The problem is that I have to group by edname; if I leave out the GROUP BY and HAVING lines I get an error. Any help is greatly appreciated!
Try
SELECT edname, (total_area*1000/total2011) density
FROM eds_census2011
WHERE (total_area*1000/total2011) = (SELECT MIN(total_area*1000/total2011) FROM eds_census2011)
A 'Return only one row' rule could be easily enforced by using LIMIT 1 if it's really necessary
Without subquery:
SELECT p.edname, min((p.total_area * 1000)/p.total2011) AS lowest_pop
FROM eds_census2011 p
GROUP BY p.edname
HAVING COUNT (*) > 1
ORDER BY 2
LIMIT 1;
This one returns only 1 row (if any qualify), even if multiple rows have equally low density.
If you just want the lowest density, period, this can be much simpler:
SELECT edname, (total_area * 1000)/total2011 AS lowest_pop
FROM eds_census2011
ORDER BY 2
LIMIT 1;
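For anyone who wants to see the ORDER BY plus LIMIT 1 form in action, here is a small sketch run against SQLite through Python's sqlite3 module instead of PostgreSQL. The helper name lowest_density, the column types and the three sample rows are assumptions made up for the example; the expression is the one from the question.

```python
import sqlite3


def lowest_density(rows):
    """Return (edname, value) for the row with the smallest computed value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eds_census2011 "
        "(edname TEXT, total_area REAL, total2011 INTEGER)"
    )
    conn.executemany("INSERT INTO eds_census2011 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT edname, (total_area * 1000) / total2011 AS lowest_pop
        FROM eds_census2011
        ORDER BY lowest_pop      -- smallest value first
        LIMIT 1                  -- keep only that row
        """
    ).fetchone()
    conn.close()
    return result


# Hypothetical divisions: (name, area, population).
sample = [("A", 2.0, 1000), ("B", 1.0, 4000), ("C", 5.0, 500)]
print(lowest_density(sample))
```

If total_area were stored as an integer, the division would truncate in both SQLite and PostgreSQL, so multiplying by 1000.0 would be the safer choice there.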

Total Count in Grouped TSQL Query

I have a performance-heavy query that filters out many unwanted records based on data in other tables etc.
I am averaging a column, and also returning the count for each average group. This is all working fine.
However, I would also like to include the percentage of the TOTAL count.
Is there any way of getting this total count without rerunning the whole query, or increasing the performance load significantly?
I would also prefer if I didn't need to completely restructure the sub query (e.g. by getting the total count outside of it), but can do if necessary.
SELECT
data.EquipmentId,
AVG(MeasureValue) AS AverageValue,
COUNT(data.*) AS BinCount,
COUNT(data.*)/ ???TotalCount??? AS BinCountPercentage
FROM
(SELECT * FROM MultipleTablesWithJoins) data
GROUP BY data.EquipmentId
See Window functions.
SELECT
data.EquipmentId,
AVG(MeasureValue) AS AverageValue,
COUNT(*) AS BinCount,
COUNT(*)/ cast (cnt as float) AS BinCountPercentage
FROM
(SELECT *,
-- Here is total count of records
count(*) over() cnt
FROM MultipleTablesWithJoins) data
GROUP BY data.EquipmentId, cnt
EDIT: forgot to actually divide the numbers.
Another approach:
with data as
(
SELECT * FROM MultipleTablesWithJoins
)
,grand as
(
select count(*) as cnt from data
)
SELECT
data.EquipmentId,
AVG(MeasureValue) AS AverageValue,
COUNT(*) AS BinCount,
COUNT(*) * 1.0 / grand.cnt AS BinCountPercentage
FROM data cross join grand
GROUP BY data.EquipmentId, grand.cnt
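A third variant, offered only as a sketch, applies the window directly to the grouped counts, so the total never has to be carried through the GROUP BY. It is run here against SQLite via Python's sqlite3 module rather than SQL Server, and the table named data stands in for the expensive subquery; the helper name bin_stats and the sample measurements are made up.

```python
import sqlite3


def bin_stats(measurements):
    """Per-equipment average, row count and share of the total row count."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for (SELECT * FROM MultipleTablesWithJoins) data
    conn.execute("CREATE TABLE data (EquipmentId INTEGER, MeasureValue REAL)")
    conn.executemany("INSERT INTO data VALUES (?, ?)", measurements)
    rows = conn.execute(
        """
        SELECT EquipmentId,
               AVG(MeasureValue) AS AverageValue,
               COUNT(*) AS BinCount,
               COUNT(*) * 1.0 / SUM(COUNT(*)) OVER () AS BinCountPercentage
        FROM data
        GROUP BY EquipmentId
        ORDER BY EquipmentId
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical readings: three for equipment 1, one for equipment 2.
print(bin_stats([(1, 10.0), (1, 20.0), (1, 30.0), (2, 40.0)]))
```

Because the window runs over the already grouped rows, it sums a handful of counts instead of repeating the total on every detail row.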

How to select count as a percentage over the total in Oracle using any Oracle function?

I have an SQL statement that calculates, as a percentage of the total number of rows, the active packages whose end date is null. I am currently doing this using (x/y) * 100:
SELECT (SELECT COUNT(*)
FROM packages
WHERE end_dt IS NULL) / (SELECT COUNT(*)
FROM packages) * 100
FROM DUAL;
I wonder if there is a way to make use of any Oracle function to express this more easily?
There's no functionality I'm aware of, but you could simplify the query to be:
SELECT SUM(CASE WHEN p.end_dt IS NULL THEN 1 ELSE 0 END) / COUNT(*) * 100
FROM PACKAGES p
So, basically the formula is
COUNT(NULL-valued "end_dt") / COUNT(*) * 100
Now, COUNT(NULL-valued "end_dt") is syntactically wrong, but it can be represented as COUNT(*) - COUNT(end_dt). So, the formula can be like this:
(COUNT(*) - COUNT(end_dt)) / COUNT(*) * 100
If we just simplify it a little, we'll get this:
SELECT (1 - COUNT(end_dt) * 1.0 / COUNT(*)) * 100 AS Percent
FROM packages
The * 1.0 bit converts the integer result of COUNT to a non-integer value to make the division non-integer too.
The above sentence and the corresponding part of the script turned out to be complete rubbish. Unlike some other database servers, Oracle does not perform integer division, even if both operands are integers. This doc page contains no hint of such behaviour of the division operator.
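The COUNT(*) - COUNT(end_dt) identity can be checked quickly. This sketch uses SQLite through Python's sqlite3 module, not Oracle, and SQLite does truncate integer division, so the * 1.0 is actually required here. The helper name null_end_dt_stats and the sample dates are invented for illustration.

```python
import sqlite3


def null_end_dt_stats(end_dates):
    """Return (nulls via COUNT difference, nulls via CASE, percentage NULL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE packages (end_dt TEXT)")
    conn.executemany(
        "INSERT INTO packages VALUES (?)", [(d,) for d in end_dates]
    )
    row = conn.execute(
        """
        SELECT COUNT(*) - COUNT(end_dt),                          -- COUNT(col) skips NULLs
               SUM(CASE WHEN end_dt IS NULL THEN 1 ELSE 0 END),   -- explicit flag
               (1 - COUNT(end_dt) * 1.0 / COUNT(*)) * 100
        FROM packages
        """
    ).fetchone()
    conn.close()
    return row


# Hypothetical packages: two still active (NULL end date), two ended.
print(null_end_dt_stats([None, "2020-01-31", None, "2021-06-30"]))
```

Both ways of counting the NULL rows agree because COUNT(end_dt) ignores NULLs while COUNT(*) counts every row.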
The original post is a little long in the tooth but this should work, using the function "ratio_to_report" that's been available since Oracle 8i:
SELECT
NVL2(END_DT, 'NOT NULL', 'NULL') END_DT,
RATIO_TO_REPORT(COUNT(*)) OVER () AS PCT_TOTAL
FROM
PACKAGES
GROUP BY
NVL2(END_DT, 'NOT NULL', 'NULL');