Count average with multiple conditions - sql

I'm trying to write a query that categorizes the average percentage for specific data per month.
Here's how my dataset presents itself:
Date        Name   Group  Percent
2022-01-21  name1  gr1    5.2
2022-01-22  name1  gr1    6.1
2022-01-26  name1  gr1    4.9
2022-02-01  name1  gr1    3.2
2022-02-03  name1  gr1    8.1
2022-01-22  name2  gr1    36.1
2022-01-25  name2  gr1    32.1
2022-02-10  name2  gr1    35.8
...         ...    ...    ...
And here's what I want to obtain with my query (based on what I showed of the table):
Month  <=25%  25<_<=50%  50<_<=75%  75<_<=100%
01     1      1          0          0
02     1      1          0          0
...    ...    ...        ...        ...
The result needs to:
Be ordered by month
Have the average use for each name counted and categorized
So far I know how to get the average of the Percent value per Name:
SELECT Name,
AVG(Percent)
from `table`
where Group = 'gr1'
group by Name
and how to count iterations of Percent in the categories created for the query:
SELECT EXTRACT(MONTH FROM Date) as Month,
COUNT(CASE WHEN Percent <= 25 AND Group = 'gr1' THEN Name END) `_25`,
COUNT(CASE WHEN Percent > 25 AND Percent <= 50 AND Group = 'gr1' THEN Name END) `_50`,
COUNT(CASE WHEN Percent > 50 AND Percent <= 75 AND Group = 'gr1' THEN Name END) `_75`,
COUNT(CASE WHEN Percent > 75 AND Percent <= 100 AND Group = 'gr1' THEN Name END) `_100`,
FROM `table`
GROUP BY Month
ORDER BY Month
but this counts every row for every name, whereas I want to categorize the average of those values per name.
I've been struggling to figure out how to combine the two queries, or how to write a new one that does what I need.
I'm working with the BigQuery service from Google Cloud.

This query produces the result you need, based on your example. It combines your two queries using a subquery: the subquery computes the AVG grouped by Name, Month and Group, and the outer query does the COUNT and the categorization.
SELECT
Month,
COUNT(CASE
WHEN avg <= 25 THEN Name
END) AS _25,
COUNT(CASE
WHEN avg > 25
AND avg <= 50 THEN Name
END) AS _50,
COUNT(CASE
WHEN avg > 50
AND avg <= 75 THEN Name
END) AS _75,
COUNT(CASE
WHEN avg > 75
AND avg <= 100 THEN Name
END) AS _100
FROM
(
-- one row per month and name, holding that name's average
SELECT
EXTRACT(MONTH from Date) AS Month,
Name,
AVG(Percent) AS avg
FROM
table1
-- filter rows before aggregating; Group is a reserved word, so it is quoted
WHERE `Group` = 'gr1'
GROUP BY Month, Name, `Group`
) AS namegr
GROUP BY Month
ORDER BY Month
This is the result:
Month  _25  _50  _75  _100
1      1    1    0    0
2      1    1    0    0
See also this fiddle (note that it runs on MySQL, not BigQuery): http://sqlfiddle.com/#!9/16c5882/9
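The two-step idea (average per name and month first, then count names per bucket) can be tried locally. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in for BigQuery; the table name usage_log is hypothetical, and strftime replaces EXTRACT because SQLite has no EXTRACT function.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (an assumption for
# illustration); usage_log is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE usage_log (Date TEXT, Name TEXT, "Group" TEXT, Percent REAL)')
conn.executemany(
    "INSERT INTO usage_log VALUES (?, ?, ?, ?)",
    [
        ("2022-01-21", "name1", "gr1", 5.2),
        ("2022-01-22", "name1", "gr1", 6.1),
        ("2022-01-26", "name1", "gr1", 4.9),
        ("2022-02-01", "name1", "gr1", 3.2),
        ("2022-02-03", "name1", "gr1", 8.1),
        ("2022-01-22", "name2", "gr1", 36.1),
        ("2022-01-25", "name2", "gr1", 32.1),
        ("2022-02-10", "name2", "gr1", 35.8),
    ],
)

# Inner query: one average per (month, name).
# Outer query: count how many names fall into each bucket per month.
bucket_rows = conn.execute(
    """
    SELECT Month,
           COUNT(CASE WHEN avg_pct <= 25 THEN Name END)                  AS _25,
           COUNT(CASE WHEN avg_pct > 25 AND avg_pct <= 50 THEN Name END) AS _50,
           COUNT(CASE WHEN avg_pct > 50 AND avg_pct <= 75 THEN Name END) AS _75,
           COUNT(CASE WHEN avg_pct > 75 AND avg_pct <= 100 THEN Name END) AS _100
    FROM (
        SELECT CAST(strftime('%m', Date) AS INTEGER) AS Month,
               Name,
               AVG(Percent) AS avg_pct
        FROM usage_log
        WHERE "Group" = 'gr1'
        GROUP BY Month, Name
    ) AS per_name
    GROUP BY Month
    ORDER BY Month
    """
).fetchall()

print(bucket_rows)  # [(1, 1, 1, 0, 0), (2, 1, 1, 0, 0)]
```

COUNT skips NULLs, so a CASE without an ELSE branch counts only the names whose average lands in that bucket.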

You can use this query to group by month and by name:
SELECT CONCAT(EXTRACT(MONTH FROM Date), ', ', Name) AS DateAndName,
CASE
WHEN AVG(Percent) <= 25 THEN '1'
ELSE '0'
END AS '<=25%',
CASE
WHEN AVG(Percent) > 25 AND AVG(Percent) <= 50 THEN '1'
ELSE '0'
END AS '25<_<=50%',
CASE
WHEN AVG(Percent) > 50 AND AVG(Percent) <= 75 THEN '1'
ELSE '0'
END AS '50<_<=75%',
CASE
WHEN AVG(Percent) > 75 AND AVG(Percent) <= 100 THEN '1'
ELSE '0'
END AS '75<_<=100%'
from DataTable /*change to your table name*/
group by EXTRACT(MONTH FROM Date), Name
order by DateAndName
It gives the following result:
DateAndName  <=25%  25<_<=50%  50<_<=75%  75<_<=100%
1, name1     1      0          0          0
1, name2     0      1          0          0
2, name1     1      0          0          0
2, name2     0      1          0          0

Related

Calculating Percentages in Postgres

I'm completely new to PostgreSQL. I have the following table called my_table:
a b c date
1 0 good 2019-05-02
0 1 good 2019-05-02
1 1 bad 2019-05-02
1 1 good 2019-05-02
1 0 bad 2019-05-01
0 1 good 2019-05-01
1 1 bad 2019-05-01
0 0 bad 2019-05-01
I want to calculate the percentage of 'good' from column c for each date. I know how to get the number of 'good':
SELECT COUNT(c), date FROM my_table WHERE c != 'bad' GROUP BY date;
That returns:
count date
3 2019-05-02
1 2019-05-01
My goal is to get this:
date perc_good
2019-05-02 75
2019-05-01 25
So I tried the following:
SELECT date,
(SELECT COUNT(c)
FROM my_table
WHERE c != 'bad'
GROUP BY date) / COUNT(c) * 100 as perc_good
FROM my_table
GROUP BY date;
And I get an error saying
more than one row returned by a subquery used as an expression.
I found this answer but not sure how to or if it applies to my case:
Calculating percentage in PostgreSql
How do I go about calculating the percentage for multiple rows?
avg() is convenient for this purpose:
select date,
avg( (c = 'good')::int ) * 100 as percent_good
from my_table
group by date
order by date;
How does this work? c = 'good' is a boolean expression. The ::int converts it to a number, with 1 for true and 0 for false. The average is then the average of a bunch of 1s and 0s -- and is the ratio of the true values.
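The "average of 1s and 0s" idea can be checked against the question's sample rows. This is a small sketch using Python's sqlite3 module rather than PostgreSQL (an assumption made so it runs anywhere); in SQLite a comparison already evaluates to 1 or 0, so the ::int cast is not needed there.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (a INTEGER, b INTEGER, c TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 0, "good", "2019-05-02"),
        (0, 1, "good", "2019-05-02"),
        (1, 1, "bad", "2019-05-02"),
        (1, 1, "good", "2019-05-02"),
        (1, 0, "bad", "2019-05-01"),
        (0, 1, "good", "2019-05-01"),
        (1, 1, "bad", "2019-05-01"),
        (0, 0, "bad", "2019-05-01"),
    ],
)

# (c = 'good') is 1 for matching rows and 0 otherwise, so its average
# is the fraction of 'good' rows in each date group.
percent_good = conn.execute(
    """
    SELECT date, AVG(c = 'good') * 100 AS percent_good
    FROM my_table
    GROUP BY date
    ORDER BY date
    """
).fetchall()

print(percent_good)  # [('2019-05-01', 25.0), ('2019-05-02', 75.0)]
```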
For this case you need to use conditional AVG():
SELECT
date,
100 * avg(case when c = 'good' then 1 else 0 end) perc_good
FROM my_table
GROUP BY date;
See the demo.
You could use a conditional sum to get the number of good values and a count for the total. Multiply by 100.0 before dividing: sum() and count() both return integers, and integer division would truncate the ratio to 0.
Below is an exhaustive code sample:
select date
, count(c) total
, sum(case when c='good' then 1 else 0 end) total_good
, sum(case when c='bad' then 1 else 0 end) total_bad
, 100.0 * sum(case when c='good' then 1 else 0 end) / count(c) perc_good
, 100.0 * sum(case when c='bad' then 1 else 0 end) / count(c) perc_bad
from my_table
group by date
and for your result
select date
, 100.0 * sum(case when c='good' then 1 else 0 end) / count(c) perc_good
from my_table
group by date
or, as suggested by a_horse_with_no_name, using count(*) filter()
select date
, 100.0 * count(*) filter(where c='good') / count(*) perc_good
from my_table
group by date
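One thing to watch with the sum/count form is integer division: PostgreSQL truncates when both operands are integers, and so does SQLite. The sketch below uses Python's sqlite3 module as a stand-in (an assumption for portability) to compare dividing first against scaling by 100.0 first, on the question's sample rows.

```python
import sqlite3

# SQLite stand-in; only the two columns needed for the ratio are kept.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (c TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("good", "2019-05-02"), ("good", "2019-05-02"),
        ("bad", "2019-05-02"), ("good", "2019-05-02"),
        ("bad", "2019-05-01"), ("good", "2019-05-01"),
        ("bad", "2019-05-01"), ("bad", "2019-05-01"),
    ],
)

# Integer / integer truncates: 3 / 4 is 0, so the percentage collapses to 0.
truncated = conn.execute(
    """
    SELECT date,
           (SUM(CASE WHEN c = 'good' THEN 1 ELSE 0 END) / COUNT(c)) * 100
    FROM my_table GROUP BY date ORDER BY date
    """
).fetchall()

# Multiplying by 100.0 first makes the whole expression floating point.
scaled = conn.execute(
    """
    SELECT date,
           100.0 * SUM(CASE WHEN c = 'good' THEN 1 ELSE 0 END) / COUNT(c)
    FROM my_table GROUP BY date ORDER BY date
    """
).fetchall()

print(truncated)  # [('2019-05-01', 0), ('2019-05-02', 0)]
print(scaled)     # [('2019-05-01', 25.0), ('2019-05-02', 75.0)]
```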

Group by datepart and find total count of individual values of each record

This is the table structure:
ID Score Valid CreatedDate
1 A 1 2018-02-19 23:33:10.297
2 C 0 2018-02-19 23:32:40.700
3 B 1 2018-02-19 23:32:30.247
4 A 1 2018-02-19 23:31:37.153
5 B 0 2018-02-19 23:25:08.667
...
I need to find the total count of each Score value and each Valid value per month.
The final result should look like this:
Month A B C D E Valid(1) NotValid(0)
January 123 343 1021 98 12 1287 480
February 516 421 321 441 421 987 672
...
This is what I tried:
SELECT DATEPART(year, CreatedDate) as Ay,
(select count(*) from TableResults where Score='A') as 'A',
(select count(*) from TableResults where Score='B') as 'B',
...
FROM TableResults
group by DATEPART(MONTH, CreatedDate)
but I couldn't figure out how to count the occurrences of each score in each month.
Use conditional aggregation.
SELECT DATEPART(year, CreatedDate) as YR
, DATEPART(month, CreatedDate) MO
, sum(Case when score = 'A' then 1 else 0 end) as A
, sum(Case when score = 'B' then 1 else 0 end) as B
, sum(Case when score = 'C' then 1 else 0 end) as C
, sum(Case when score = 'D' then 1 else 0 end) as D
, sum(Case when score = 'E' then 1 else 0 end) as E
, sum(case when valid = 1 then 1 else 0 end) as Valid
, sum(case when valid = 0 then 1 else 0 end) as NotValid
FROM TableResults
GROUP BY DATEPART(MONTH, CreatedDate), DATEPART(year, CreatedDate)
I'm not a big fan of subqueries in the SELECT list; I find they tend to cause performance problems in the long run. Since we're aggregating here anyway, I just applied the conditional logic to every column.
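To see conditional aggregation on the five sample rows from the question, here is a minimal sketch using Python's sqlite3 module in place of SQL Server (an assumption so it runs without a server); strftime stands in for DATEPART.

```python
import sqlite3

# SQLite stand-in for the SQL Server table; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableResults (ID INTEGER, Score TEXT, Valid INTEGER, CreatedDate TEXT)"
)
conn.executemany(
    "INSERT INTO TableResults VALUES (?, ?, ?, ?)",
    [
        (1, "A", 1, "2018-02-19 23:33:10.297"),
        (2, "C", 0, "2018-02-19 23:32:40.700"),
        (3, "B", 1, "2018-02-19 23:32:30.247"),
        (4, "A", 1, "2018-02-19 23:31:37.153"),
        (5, "B", 0, "2018-02-19 23:25:08.667"),
    ],
)

# Each SUM(CASE ...) adds 1 for rows matching its condition, so one pass
# over the table yields every per-month counter at once.
monthly = conn.execute(
    """
    SELECT CAST(strftime('%Y', CreatedDate) AS INTEGER) AS YR,
           CAST(strftime('%m', CreatedDate) AS INTEGER) AS MO,
           SUM(CASE WHEN Score = 'A' THEN 1 ELSE 0 END) AS A,
           SUM(CASE WHEN Score = 'B' THEN 1 ELSE 0 END) AS B,
           SUM(CASE WHEN Score = 'C' THEN 1 ELSE 0 END) AS C,
           SUM(CASE WHEN Score = 'D' THEN 1 ELSE 0 END) AS D,
           SUM(CASE WHEN Score = 'E' THEN 1 ELSE 0 END) AS E,
           SUM(CASE WHEN Valid = 1 THEN 1 ELSE 0 END) AS Valid,
           SUM(CASE WHEN Valid = 0 THEN 1 ELSE 0 END) AS NotValid
    FROM TableResults
    GROUP BY YR, MO
    ORDER BY YR, MO
    """
).fetchall()

print(monthly)  # [(2018, 2, 2, 2, 1, 0, 0, 3, 2)]
```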

count row with total

From tbl Department, I am trying to write a stored procedure to display output as shown below where I can find the count from each row based on following conditions:
Total = 100
Total >100 and <=200
Total >200 and <=300
Total >300
set #select = 'select count(*) as Orders, sum (tbl.Expenses) as Total from tbl group by tbl.Department'
So, how can I dynamically get the output for 4 conditions as shown above based on my #select statement.
I think you just want conditional aggregation:
select sum(case when total = 100 then 1 else 0 end) as condition1,
sum(case when total > 100 and total <= 200 then 1 else 0 end) as condition2,
sum(case when total > 200 and total <= 300 then 1 else 0 end) as condition3,
sum(case when total > 300 then 1 else 0 end) as condition4
from department d;
Hope this helps:
Select total,
Sum (Case when total = 100 then NumberOfOrders_Created end ) As "Condition1",
Sum (Case when total > 100 and total <= 200 then NumberOfOrders_Created end ) As "Condition2",
Sum (Case when total > 200 and total <= 300 then NumberOfOrders_Created end ) As "Condition3",
Sum (Case when total > 300 then NumberOfOrders_Created end ) As "Condition4"
From Department
where
Orders_date between '2013-01-01 00:00:00' and '2013-12-31 00:00:00'
group by total
order by total Desc

How to get count of the nonzero values while calculating average

I have a table mytable with the following structure and data (Oracle 11g)
Job_name job_execution_time(JET in seconds) Run_Date records_processed
A1 0 7/1/2013 0
A1 0 7/5/2013 0
A1 3 7/12/2013 5
A1 9 7/22/2013 14
A1 0 8/1/2013 0
A1 15 8/16/2013 20
A2 0 8/15/2013 0
A2 0 8/17/2013 0
A2 10 9/15/2013 25
A2 45 9/17/2013 70
I am trying to get the average(ignoring '0' values) of the (JET) column for each job for that specific month. Also I need to get a count of the non-zero values which I am using for my average calculation.
For example:
For job A1 for the month of July, the average of the JET column will be (9+3)/2 = 6 and the count of the nonzero values used for the calculation of this average would be 2.
I got the average value using the following code but have problem getting the count.
select job_name , to_char(Run_Date, 'Month') Mon ,
nvl(avg(nullif(job_execution_time,0)), 0) Average_secs
from mytable
group by job_name, to_char(Run_Date, 'Month')
How can I get the count of the nonzero values which are used for the calculation of every average? I tried the following for count but does not work.
count(nullif(job_execution_time, 0)) count_nonzeros
sum(CASE nvl(job_execution_time, 0) !=0 THEN 1 ELSE 0 END) AS "Count_NonZeros"
Thanks.
You can use SUM and CASE statement
select job_name , to_char(Run_Date, 'Month') Mon ,
nvl(avg(nullif(job_execution_time,0)), 0) Average_secs,
sum(case when job_execution_time !=0 then 1 else 0 end) counts
from mytable group by job_name, to_char(Run_Date, 'Month');
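The two counting forms can be compared side by side on the sample data. The sketch below uses Python's sqlite3 module instead of Oracle (an assumption; COALESCE and strftime replace nvl and to_char, and the dates are rewritten in ISO format). In SQLite at least, COUNT(NULLIF(...)) and SUM(CASE ...) give the same counts, because COUNT ignores the NULLs that NULLIF produces.

```python
import sqlite3

# SQLite stand-in for the Oracle table; dates converted to ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (job_name TEXT, job_execution_time INTEGER, "
    "run_date TEXT, records_processed INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("A1", 0, "2013-07-01", 0),
        ("A1", 0, "2013-07-05", 0),
        ("A1", 3, "2013-07-12", 5),
        ("A1", 9, "2013-07-22", 14),
        ("A1", 0, "2013-08-01", 0),
        ("A1", 15, "2013-08-16", 20),
        ("A2", 0, "2013-08-15", 0),
        ("A2", 0, "2013-08-17", 0),
        ("A2", 10, "2013-09-15", 25),
        ("A2", 45, "2013-09-17", 70),
    ],
)

# NULLIF turns zeros into NULL, which AVG and COUNT both skip.
job_stats = conn.execute(
    """
    SELECT job_name,
           strftime('%Y-%m', run_date) AS mon,
           COALESCE(AVG(NULLIF(job_execution_time, 0)), 0) AS average_secs,
           COUNT(NULLIF(job_execution_time, 0)) AS count_nullif,
           SUM(CASE WHEN job_execution_time != 0 THEN 1 ELSE 0 END) AS count_case
    FROM mytable
    GROUP BY job_name, mon
    ORDER BY job_name, mon
    """
).fetchall()

for row in job_stats:
    print(row)
# ('A1', '2013-07', 6.0, 2, 2)
# ('A1', '2013-08', 15.0, 1, 1)
# ('A2', '2013-08', 0, 0, 0)
# ('A2', '2013-09', 27.5, 2, 2)
```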

Fetch data in MS SQL 2008

I have three tables which are like:
table1
id,
created_Date
table2
id
district_ID
status_ID
table3
district_ID
district_Name
Now I need the records in the following format:
Srno District_name <10 days >10 and <20 days >20 days
1 xxx 12 15 20
2 yyy 8 0 2
The days are counted relative to the current date.
For example, if the created date is 10-08-2013 and the current date is 13-08-2013, the date difference is 3.
So what should my query be? Any suggestions will be appreciated.
Thank you
table1
id created_Date
1 2013-07-12 13:32:10.957
2 2013-07-12 13:32:10.957
3 2013-08-01 10:00:10.957
4 2013-08-10 13:32:10.957
5 2013-08-10 14:32:10.957
table2
id district_ID status_id
1 1 3
2 2 3
3 2 7
4 3 4
5 4 3
table3
district_ID district_Name
1 xxx
2 yyy
3 zzz
4 aaa
5 bbb
I would have a look at using DATEDIFF and CASE.
DATEDIFF (Transact-SQL)
Returns the count (signed integer) of the specified datepart
boundaries crossed between the specified startdate and enddate.
Something like
SELECT District_name,
SUM(
CASE
WHEN DATEDIFF(day,created_Date, getdate()) < 10
THEN 1
ELSE 0
END
) [<10 days],
SUM(
CASE
WHEN DATEDIFF(day,created_Date, getdate()) >= 10 AND DATEDIFF(day,created_Date, getdate()) < 20
THEN 1
ELSE 0
END
) [>10 and <20 days],
SUM(
CASE
WHEN DATEDIFF(day,created_Date, getdate()) >= 20
THEN 1
ELSE 0
END
) [>20 days]
FROM Your_Tables_Here
GROUP BY District_name
;with cte as (
select t3.district_Name, datediff(day, t1.created_Date, getdate()) as diff
from table1 as t1
inner join table2 as t2 on t2.id = t1.id
inner join table3 as t3 on t3.district_id = t2.district_id
)
select
district_Name,
sum(case when diff < 10 then 1 else 0 end) as [<10 days],
sum(case when diff >= 10 and diff < 20 then 1 else 0 end) as [>=10 and < 20 days],
sum(case when diff >= 20 then 1 else 0 end) as [>= 20 days]
from cte
group by district_Name
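The CTE approach can be exercised on the sample tables. This is a rough sketch using Python's sqlite3 module instead of SQL Server (an assumption); the day difference is computed with julianday on the date parts, and the "current date" is passed in as a fixed parameter (2013-08-13, taken from the question's example) so the result is reproducible. The column aliases are illustrative.

```python
import sqlite3

# SQLite stand-in for the three SQL Server tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER, created_Date TEXT);
    CREATE TABLE table2 (id INTEGER, district_ID INTEGER, status_ID INTEGER);
    CREATE TABLE table3 (district_ID INTEGER, district_Name TEXT);
    INSERT INTO table1 VALUES
        (1, '2013-07-12 13:32:10.957'), (2, '2013-07-12 13:32:10.957'),
        (3, '2013-08-01 10:00:10.957'), (4, '2013-08-10 13:32:10.957'),
        (5, '2013-08-10 14:32:10.957');
    INSERT INTO table2 VALUES (1, 1, 3), (2, 2, 3), (3, 2, 7), (4, 3, 4), (5, 4, 3);
    INSERT INTO table3 VALUES (1, 'xxx'), (2, 'yyy'), (3, 'zzz'), (4, 'aaa'), (5, 'bbb');
    """
)

# The CTE computes the whole-day difference once per record; the outer
# query then buckets those differences per district.
district_buckets = conn.execute(
    """
    WITH cte AS (
        SELECT t3.district_Name,
               CAST(julianday(date(:today)) - julianday(date(t1.created_Date))
                    AS INTEGER) AS diff
        FROM table1 AS t1
        INNER JOIN table2 AS t2 ON t2.id = t1.id
        INNER JOIN table3 AS t3 ON t3.district_ID = t2.district_ID
    )
    SELECT district_Name,
           SUM(CASE WHEN diff < 10 THEN 1 ELSE 0 END) AS under_10,
           SUM(CASE WHEN diff >= 10 AND diff < 20 THEN 1 ELSE 0 END) AS from_10_to_19,
           SUM(CASE WHEN diff >= 20 THEN 1 ELSE 0 END) AS at_least_20
    FROM cte
    GROUP BY district_Name
    ORDER BY district_Name
    """,
    {"today": "2013-08-13"},
).fetchall()

print(district_buckets)
# [('aaa', 1, 0, 0), ('xxx', 0, 0, 1), ('yyy', 0, 1, 1), ('zzz', 1, 0, 0)]
```

District bbb has no matching records, so the inner joins leave it out; a LEFT JOIN starting from table3 would be needed to list it with zeros.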