Calculating Percentages in Postgres - sql

I'm completely new to PostgreSQL. I have the following table called my_table:
a b c date
1 0 good 2019-05-02
0 1 good 2019-05-02
1 1 bad 2019-05-02
1 1 good 2019-05-02
1 0 bad 2019-05-01
0 1 good 2019-05-01
1 1 bad 2019-05-01
0 0 bad 2019-05-01
I want to calculate the percentage of 'good' from column c for each date. I know how to get the number of 'good':
SELECT COUNT(c), date FROM my_table WHERE c != 'bad' GROUP BY date;
That returns:
count date
3 2019-05-02
1 2019-05-01
My goal is to get this:
date perc_good
2019-05-02 75
2019-05-01 25
So I tried the following:
SELECT date,
(SELECT COUNT(c)
FROM my_table
WHERE c != 'bad'
GROUP BY date) / COUNT(c) * 100 as perc_good
FROM my_table
GROUP BY date;
And I get an error saying
more than one row returned by a subquery used as an expression.
I found this answer but not sure how to or if it applies to my case:
Calculating percentage in PostgreSql
How do I go about calculating the percentage for multiple rows?

avg() is convenient for this purpose:
select date,
avg( (c = 'good')::int ) * 100 as percent_good
from my_table
group by date
order by date;
How does this work? c = 'good' is a boolean expression. The ::int converts it to a number, with 1 for true and 0 for false. The average of those 1s and 0s is then the fraction of rows for which the expression is true.
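To see the averaging trick end to end, here is a minimal sketch that loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Postgres here, and because it has no ::int cast the flag is written as a CASE expression; the function name percent_good is made up for the example.

```python
import sqlite3

# Rows copied from the question's my_table.
ROWS = [
    (1, 0, "good", "2019-05-02"),
    (0, 1, "good", "2019-05-02"),
    (1, 1, "bad", "2019-05-02"),
    (1, 1, "good", "2019-05-02"),
    (1, 0, "bad", "2019-05-01"),
    (0, 1, "good", "2019-05-01"),
    (1, 1, "bad", "2019-05-01"),
    (0, 0, "bad", "2019-05-01"),
]


def percent_good(rows):
    """Return [(date, percent_good), ...] by averaging a 0/1 flag per date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (a INTEGER, b INTEGER, c TEXT, date TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT date,
               100.0 * AVG(CASE WHEN c = 'good' THEN 1 ELSE 0 END) AS percent_good
        FROM my_table
        GROUP BY date
        ORDER BY date
        """
    ).fetchall()
    conn.close()
    return result


for day, pct in percent_good(ROWS):
    print(day, pct)
```

Each 'good' row contributes 1 and each 'bad' row contributes 0, so the per-date average is the share of 'good' rows.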

For this case you need to use conditional AVG():
SELECT
date,
100 * avg(case when c = 'good' then 1 else 0 end) perc_good
FROM my_table
GROUP BY date;

You could use a conditional sum to get the number of good rows and count for the total.
Below is an exhaustive code sample:
select date
, count(c) total
, sum(case when c='good' then 1 else 0 end) total_good
, sum(case when c='bad' then 1 else 0 end) total_bad
, 100.0 * sum(case when c='good' then 1 else 0 end) / count(c) perc_good -- 100.0 avoids integer division
, 100.0 * sum(case when c='bad' then 1 else 0 end) / count(c) perc_bad
from my_table
group by date
and for your result
select date
, 100.0 * sum(case when c='good' then 1 else 0 end) / count(c) perc_good -- 100.0 avoids integer division
from my_table
group by date
or as suggested by a_horse_with_no_name using count(*) filter()
select date
, 100.0 * (count(*) filter(where c='good')) / count(*) perc_good -- 100.0 avoids integer division
from my_table
group by date
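One thing to watch with the sum/count form: when both operands are integers, the division truncates, so the ratio becomes 0 before it is multiplied by 100. The sketch below illustrates this with an in-memory SQLite database via Python's sqlite3 module (an assumption for the demo; Postgres truncates integer division in the same way). The function name compare_ratios is hypothetical.

```python
import sqlite3


def compare_ratios():
    """Return (truncated, exact) percentage results for the same data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (c TEXT, date TEXT)")
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?)",
        [("good", "2019-05-02")] * 3
        + [("bad", "2019-05-02")]
        + [("good", "2019-05-01")]
        + [("bad", "2019-05-01")] * 3,
    )
    # Integer / integer truncates, so 3 / 4 becomes 0 before the * 100.
    truncated = conn.execute(
        """
        SELECT date,
               (SUM(CASE WHEN c = 'good' THEN 1 ELSE 0 END) / COUNT(c)) * 100
        FROM my_table GROUP BY date ORDER BY date
        """
    ).fetchall()
    # Multiplying by 100.0 first makes the whole expression floating point.
    exact = conn.execute(
        """
        SELECT date,
               100.0 * SUM(CASE WHEN c = 'good' THEN 1 ELSE 0 END) / COUNT(c)
        FROM my_table GROUP BY date ORDER BY date
        """
    ).fetchall()
    conn.close()
    return truncated, exact


truncated, exact = compare_ratios()
print("truncated:", truncated)
print("exact:", exact)
```

Putting the 100.0 factor first (or casting one operand to numeric) is enough to keep the fractional part.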

Related

Count average with multiple conditions

I'm trying to create a query which allows to categorize the average percentage for specific data per month.
Here's how my dataset presents itself:
Date | Name | Group | Percent
2022-01-21 | name1 | gr1 | 5.2
2022-01-22 | name1 | gr1 | 6.1
2022-01-26 | name1 | gr1 | 4.9
2022-02-01 | name1 | gr1 | 3.2
2022-02-03 | name1 | gr1 | 8.1
2022-01-22 | name2 | gr1 | 36.1
2022-01-25 | name2 | gr1 | 32.1
2022-02-10 | name2 | gr1 | 35.8
... | ... | ... | ...
And here's what I want to obtain with my query (based on what I showed of the table):
Month | <=25% | 25<_<=50% | 50<_<=75% | 75<_<=100%
01 | 1 | 1 | 0 | 0
02 | 1 | 1 | 0 | 0
... | ... | ... | ... | ...
The result needs to:
Be ordered by month
Have the average use for each name counted and categorized
So far I know how to get the average of the Percent value per Name:
SELECT Name,
AVG(Percent)
from `table`
where Group = 'gr1'
group by Name
and how to count iterations of Percent in the categories created for the query:
SELECT EXTRACT(MONTH FROM Date) as Month,
COUNT(CASE WHEN Percent <= 25 AND Group = 'gr1' THEN Name END) `_25`,
COUNT(CASE WHEN Percent > 25 AND Percent <= 50 AND Group = 'gr1' THEN Name END) `_50`,
COUNT(CASE WHEN Percent > 50 AND Percent <= 75 AND Group = 'gr1' THEN Name END) `_75`,
COUNT(CASE WHEN Percent > 75 AND Percent <= 100 AND Group = 'gr1' THEN Name END) `_100`,
FROM `table`
GROUP BY Month
ORDER BY Month
but this counts all iterations of every name where I want the average of those values.
I've been struggling to figure out how to combine the two queries or to create a new one that answers my need.
I'm working with the BigQuery service from Google Cloud
This query produces the needed result, based on your example. It combines your two queries using a subquery: the subquery calculates the AVG per Name and Month for the group, and the outer query does the COUNT and the "categorization".
SELECT
Month,
COUNT(CASE
WHEN avg <= 25 THEN Name
END) AS _25,
COUNT(CASE
WHEN avg > 25
AND avg <= 50 THEN Name
END) AS _50,
COUNT(CASE
WHEN avg > 50
AND avg <= 75 THEN Name
END) AS _75,
COUNT(CASE
WHEN avg > 75
AND avg <= 100 THEN Name
END) AS _100
FROM
(
SELECT
EXTRACT(MONTH from Date) AS Month,
Name,
AVG(Percent) AS avg
FROM
table1
WHERE `Group` = 'gr1' -- Group is a reserved word, so quote it; filter rows before aggregating
GROUP BY Month, Name
) AS namegr
GROUP BY Month
ORDER BY Month
This is the result:
Month | _25 | _50 | _75 | _100
1 | 1 | 1 | 0 | 0
2 | 1 | 1 | 0 | 0
See also Fiddle (BUT on MySql) - http://sqlfiddle.com/#!9/16c5882/9
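The two-level aggregation can also be checked locally. The following is a rough sketch using Python's sqlite3 module with the sample rows from the question; SQLite stands in for BigQuery, so strftime('%m', ...) replaces EXTRACT(MONTH FROM ...), and the table name readings, the column names day and grp, and the function bucket_counts are all made up for the example.

```python
import sqlite3

# Sample rows from the question (Date, Name, Group, Percent).
ROWS = [
    ("2022-01-21", "name1", "gr1", 5.2),
    ("2022-01-22", "name1", "gr1", 6.1),
    ("2022-01-26", "name1", "gr1", 4.9),
    ("2022-02-01", "name1", "gr1", 3.2),
    ("2022-02-03", "name1", "gr1", 8.1),
    ("2022-01-22", "name2", "gr1", 36.1),
    ("2022-01-25", "name2", "gr1", 32.1),
    ("2022-02-10", "name2", "gr1", 35.8),
]


def bucket_counts(rows, group="gr1"):
    """Average Percent per (month, name), then count names per bucket per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (day TEXT, name TEXT, grp TEXT, percent REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT month,
               COUNT(CASE WHEN avg_pct <= 25 THEN name END) AS up_to_25,
               COUNT(CASE WHEN avg_pct > 25 AND avg_pct <= 50 THEN name END) AS up_to_50,
               COUNT(CASE WHEN avg_pct > 50 AND avg_pct <= 75 THEN name END) AS up_to_75,
               COUNT(CASE WHEN avg_pct > 75 AND avg_pct <= 100 THEN name END) AS up_to_100
        FROM (
            -- Inner level: one average per name and month.
            SELECT strftime('%m', day) AS month, name, AVG(percent) AS avg_pct
            FROM readings
            WHERE grp = ?
            GROUP BY strftime('%m', day), name
        ) AS per_name
        GROUP BY month
        ORDER BY month
        """,
        (group,),
    ).fetchall()
    conn.close()
    return result


for row in bucket_counts(ROWS):
    print(row)
```

The inner query reduces each name to a single average per month, so the outer COUNT sees one row per name instead of one row per reading.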
You can use this query to Group By Month and each Name
SELECT CONCAT(EXTRACT(MONTH FROM Date), ', ', Name) AS DateAndName,
CASE
WHEN AVG(Percent) <= 25 THEN '1'
ELSE '0'
END AS '<=25%',
CASE
WHEN AVG(Percent) > 25 AND AVG(Percent) <= 50 THEN '1'
ELSE '0'
END AS '25<_<=50%',
CASE
WHEN AVG(Percent) > 50 AND AVG(Percent) <= 75 THEN '1'
ELSE '0'
END AS '50<_<=75%',
CASE
WHEN AVG(Percent) > 75 AND AVG(Percent) <= 100 THEN '1'
ELSE '0'
END AS '75<_<=100%'
from DataTable /*change to your table name*/
group by EXTRACT(MONTH FROM Date), Name
order by DateAndName
It gives the following result:
DateAndName | <=25% | 25<_<=50% | 50<_<=75% | 75<_<=100%
1, name1 | 1 | 0 | 0 | 0
1, name2 | 0 | 1 | 0 | 0
2, name1 | 1 | 0 | 0 | 0
2, name2 | 0 | 1 | 0 | 0

SQL: calculate different columns with events and dates

I have these columns:
user_id (xxxx)
order_id (xxxx)
order_date (2020-07-01)
I would like to have per user_id the following calculated columns:
ordered at least 1 time or more, between 2020-07-01 to 2020-12-31 (6m)
ordered at least 3 times or more, between 2020-07-01 to 2020-12-31 (6m)
ordered at least 1 time or more, between 2020-07-01 to 2020-09-30 (3m)
ordered at least 1 time or more, between 2020-07-01 to 2020-08-31 (1m)
The result value could be e.g. "ordered" vs "not ordered" to populate the columns.
I'm using redshift
You can use the group by and conditional aggregation as follows:
select user_id,
case when
count(case when order_date between xxxx1 and yyyy1 then 1 end) >= 1 -- at least 1 order
and count(case when order_date between xxxx2 and yyyy2 then 1 end) >= 3 -- at least 3 orders
and count(case when order_date between xxxx3 and yyyy3 then 1 end) >= 1
and count(case when order_date between xxxx4 and yyyy4 then 1 end) >= 1
then 'Yes' else 'No' end as res_
from your_table -- where ... -- use where condition to restrict the result if required
group by user_id
Replace xxxxn and yyyyn with the start and end dates of each range.
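If one column per condition is wanted, as the question describes, each conditional count can be wrapped in its own CASE. Below is a minimal sketch of that variant, run against an in-memory SQLite database through Python's sqlite3 module as a stand-in for Redshift. The sample orders, the table name orders, the output column names and the function order_flags are all hypothetical; the date ranges are the ones listed in the question.

```python
import sqlite3

# Hypothetical orders: user 1 orders three times in the window,
# user 2 once in December, user 3 only after the window.
ORDERS = [
    (1, 101, "2020-07-15"),
    (1, 102, "2020-08-20"),
    (1, 103, "2020-11-05"),
    (2, 201, "2020-12-01"),
    (3, 301, "2021-01-10"),
]


def order_flags(orders):
    """Return one row per user with an 'ordered' / 'not ordered' flag per window."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (user_id INTEGER, order_id INTEGER, order_date TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    result = conn.execute(
        """
        SELECT user_id,
               CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-12-31' THEN 1 END) >= 1
                    THEN 'ordered' ELSE 'not ordered' END AS at_least_1_6m,
               CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-12-31' THEN 1 END) >= 3
                    THEN 'ordered' ELSE 'not ordered' END AS at_least_3_6m,
               CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-09-30' THEN 1 END) >= 1
                    THEN 'ordered' ELSE 'not ordered' END AS at_least_1_3m,
               CASE WHEN COUNT(CASE WHEN order_date BETWEEN '2020-07-01' AND '2020-08-31' THEN 1 END) >= 1
                    THEN 'ordered' ELSE 'not ordered' END AS at_least_1_1m
        FROM orders
        GROUP BY user_id
        ORDER BY user_id
        """
    ).fetchall()
    conn.close()
    return result


for row in order_flags(ORDERS):
    print(row)
```

Because the date filter lives inside each COUNT rather than in a WHERE clause, users with no orders in a window still get a row with 'not ordered'.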

how to sum two column within single case statement

The query below returns 2 rows, but actually I need only one;
select Datename(month, m.CreatedDate) as [Ay], sum(case when h.Cinsiyet=1 then 1 else 0 end) as [Group1], sum(case when h.Cinsiyet=2 then 1 else 0 end) as [Group2] from Muayene.Muayene m with(nolock)
join Ortak.Hasta h with(nolock) on m.HastaTc = h.HastaTc
group by h.Cinsiyet, Datename(month, m.CreatedDate)
result:
MonthName Group1 Group2
April 4500 0
April 0 9000
Expected Result:
MonthName Group1 Group2
April 4500 9000
I know I can do it by wrapping the query in another select statement that groups by month and sums these results, but that's not efficient and looks like dirty code.
Is there a trick to get the expected result without adding another sum statement?
Fix the GROUP BY: remove h.Cinsiyet from it, so that each month produces a single row:
select Datename(month, m.CreatedDate) as [Ay],
sum(case when h.Cinsiyet = 1 then 1 else 0 end) as [Group1],
sum(case when h.Cinsiyet = 2 then 1 else 0 end) as [Group2]
from Muayene.Muayene m join
Ortak.Hasta h
on m.HastaTc = h.HastaTc
group by Datename(month, m.CreatedDate);

How to count rows matching a filter in aggregate operations

Sorry for the imprecise title, this is the best I managed to come up with.
I have a table like this:
date | type | qty
2018-03-21 03:30:00 | A | 3
2018-03-22 03:30:00 | A | 3
2018-03-22 04:57:00 | A | 1
2018-03-22 05:18:00 | B | 3
I do some aggregations on this table, e.g. sum of qty over day or over month.
In the same query I need to count how many rows are of type B, while retrieving the total qty on that day.
So,
select sum(qty), date_trunc('day', date) ... group by date_trunc('day', date);
Now, what I need to do next is to count how many rows are of type B. So the expected result is
day | Bcount | totqty
2018-03-21 | 0 | 3
2018-03-22 | 1 | 7
I thought to use partitions but I'm not sure how to use them in this specific case.
Edit: thank you all, guys, for your answers. This was soooooooo easy 🙄
Since the 9.4 release, the CASE WHEN clauses in these aggregate functions can be replaced by the new FILTER clause. Use the query below:
select date_trunc('day', date) AS Day,
Count(TYPE) filter (where Type = 'B') AS BCount,
Sum(qty) AS TotalQty
FROM Table1 group by date_trunc('day', date);
For Demo Follow the link:
http://sqlfiddle.com/#!17/a5203/14
Before the Postgres 9.4 release, if you wanted to count only some of the records in an aggregate function, you had to use CASE WHEN.
Like this:
SELECT date_trunc('day', date) AS Day,
SUM(CASE WHEN TYPE = 'B' THEN 1 ELSE 0 END) AS BCount,
Sum(qty) AS TotalQty
FROM Table1 group by date_trunc('day', date);
Use a case expression to do conditional aggregation:
select ...
sum(case when type = 'B' then 1 else 0 end) as Bcount
...
select date_trunc('day', date) ,sum(qty),
SUM (CASE WHEN type = 'B' THEN 1 ELSE 0 END) AS Bcount
FROM Table1
group by date_trunc('day', date);
Demo
http://sqlfiddle.com/#!17/a5203/13
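As a quick local check of the conditional aggregation, here is a sketch that runs the CASE-based variant on the question's four rows using Python's sqlite3 module. SQLite stands in for Postgres, so date(...) is used in place of date_trunc('day', ...); the column name ts and the function daily_summary are made up for the example.

```python
import sqlite3

# The four rows from the question (timestamp, type, qty).
ROWS = [
    ("2018-03-21 03:30:00", "A", 3),
    ("2018-03-22 03:30:00", "A", 3),
    ("2018-03-22 04:57:00", "A", 1),
    ("2018-03-22 05:18:00", "B", 3),
]


def daily_summary(rows):
    """Return [(day, bcount, totqty), ...] with one row per calendar day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (ts TEXT, type TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT date(ts) AS day,
               SUM(CASE WHEN type = 'B' THEN 1 ELSE 0 END) AS bcount,
               SUM(qty) AS totqty
        FROM table1
        GROUP BY date(ts)
        ORDER BY day
        """
    ).fetchall()
    conn.close()
    return result


for row in daily_summary(ROWS):
    print(row)
```

The type filter sits inside the aggregate, so SUM(qty) still covers every row of the day while the B counter only sees matching rows.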

How to show different dates data (from the same table) as columns in Oracle

I'm sorry if the title wasn't too clear, but the following explanation will be more accurate.
I have the following view:
DATE USER CONDITION
20140101 1 A
20140101 2 B
20140101 3 C
20140108 1 C
20140108 3 B
20140108 2 C
What I need to do is present how many users were in each condition this week and 7 days before today.
Output should be like this:
Condition Today Last_Week (Today-7)
A 0 1
B 1 1
C 2 1
How can I do this in Oracle? I will need to do this for 4 weeks, so it'll be Today-7, Today-14 and Today-21.
I've tried this with group by, but I get the "week2" values as rows. Then I tried something like Select conditions, (select count(users) from MyView where DATE='Today') FROM MyView (based on something that actually works), but it doesn't work for me.
Achieved this with a little modification of the accepted answer:
select condition,
count(case when to_date(xdate) = to_date(sysdate) then 1 end) to_day,
count(case when to_date(xdate) = to_date(sysdate-7) then 1 end) last_7_days
from my_table
group by condition
select condition, count(case when to_date(xdate) = to_date(sysdate) then 1 end) to_day,
count(case when to_date(xdate) < to_date(sysdate) then 1 end) last_7_days
from my_table
where to_date(xdate) >= to_date(sysdate) - 7
group by condition
select condition
, sum
( case
when date between trunc(sysdate) - 7 and trunc(sysdate) - 1
then 1
else 0
end
)
last_week
, sum
( case
when date between trunc(sysdate) and trunc(sysdate + 1)
then 1
else 0
end
)
this_week
from table
group
by condition
By using a conditional count (written as a sum) and grouping on condition, you can pick out all the desired dates. Note that trunc(sysdate) drops the time portion, so it returns the start of the current day.
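The same pivot can be tried outside Oracle. The sketch below uses Python's sqlite3 module with the question's six rows and a fixed reference date in place of sysdate, so the output is reproducible. Several things are assumptions made for the demo: dates are stored as ISO text (2014-01-08 rather than 20140108), SQLite's date(..., '-7 days') replaces Oracle's date arithmetic, and the names my_view, xdate, user_id and condition_counts are illustrative.

```python
import sqlite3

# Rows from the question, with dates rewritten in ISO format.
ROWS = [
    ("2014-01-01", 1, "A"),
    ("2014-01-01", 2, "B"),
    ("2014-01-01", 3, "C"),
    ("2014-01-08", 1, "C"),
    ("2014-01-08", 3, "B"),
    ("2014-01-08", 2, "C"),
]


def condition_counts(rows, today="2014-01-08"):
    """Count users per condition for `today` and for the day 7 days earlier."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_view (xdate TEXT, user_id INTEGER, condition TEXT)")
    conn.executemany("INSERT INTO my_view VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT condition,
               COUNT(CASE WHEN xdate = date(:today) THEN 1 END) AS today,
               COUNT(CASE WHEN xdate = date(:today, '-7 days') THEN 1 END) AS last_week
        FROM my_view
        GROUP BY condition
        ORDER BY condition
        """,
        {"today": today},
    ).fetchall()
    conn.close()
    return result


for row in condition_counts(ROWS):
    print(row)
```

Further weeks would follow the same pattern, with one more conditional COUNT per offset ('-14 days', '-21 days').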