Aggregation within a certain time - sql

I need to list the hotels in the UK that started with a rating of 2 and have since increased their rating by at least 3 points.
Month | Hotel | Rating | Region |
---------------------------------------
01-Jan-19 | A | 1 | US |
01-Feb-19 | B | 2 | UK |
01-Mar-19 | C | 3 | EU |
01-Apr-19 | A | 1 | US |
01-May-19 | B | 4 | UK |
01-Jun-19 | C | 3 | EU |
01-Jul-19 | A | 1 | US |
01-Aug-19 | B | 5 | UK |
01-Sep-19 | C | 4 | EU |
For this data, the query should return Hotel B only.

It sounds like you want the first and last entries. One method uses conditional aggregation. I am going to assume that month is really a date or number and not a string:
select t.hotel
from (select t.*,
row_number() over (partition by hotel order by month asc) as seqnum_asc,
row_number() over (partition by hotel order by month desc) as seqnum_desc
from t
) t
group by t.hotel
having max(rating) filter (where seqnum_desc = 1) >= max(rating) filter (where seqnum_asc = 1) + 3;
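As a rough, runnable sketch of the first-versus-last comparison, the snippet below loads the sample rows into an in-memory SQLite database and keeps hotels whose latest rating is at least 3 above their earliest one. It assumes ISO date strings for month, adds a region = 'UK' filter taken from the question, and uses CASE instead of FILTER so it also runs on older SQLite builds; none of that is part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (month TEXT, hotel TEXT, rating INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2019-01-01", "A", 1, "US"), ("2019-02-01", "B", 2, "UK"),
        ("2019-03-01", "C", 3, "EU"), ("2019-04-01", "A", 1, "US"),
        ("2019-05-01", "B", 4, "UK"), ("2019-06-01", "C", 3, "EU"),
        ("2019-07-01", "A", 1, "US"), ("2019-08-01", "B", 5, "UK"),
        ("2019-09-01", "C", 4, "EU"),
    ],
)

query = """
SELECT hotel
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY hotel ORDER BY month ASC)  AS seqnum_asc,
           ROW_NUMBER() OVER (PARTITION BY hotel ORDER BY month DESC) AS seqnum_desc
    FROM t
    WHERE region = 'UK'          -- assumption: filter taken from the question
) AS ranked
GROUP BY hotel
-- seqnum_desc = 1 marks the latest row, seqnum_asc = 1 the earliest row
HAVING MAX(CASE WHEN seqnum_desc = 1 THEN rating END)
       >= MAX(CASE WHEN seqnum_asc = 1 THEN rating END) + 3
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Unlike the MIN/MAX variant, this compares the ratings in time order, so a hotel that dropped from 5 to 2 would not qualify.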

This also works
I have tried it
Select "Hotel"
From T
Where "Region" = 'UK'
Group by "Hotel"
Having
Min ("Rating") = 2
And
Max ("Rating") >= 5
The Link to test:
https://www.db-fiddle.com/f/6TVgrC5WRjqdyPwdGSvWGN/8


How to add records for each user based on another existing row in BigQuery?

Posting here in case someone with more knowledge than me may be able to help with some direction.
I have a table like this:
| Row | date |user id | score |
-----------------------------------
| 1 | 20201120 | 1 | 26 |
-----------------------------------
| 2 | 20201121 | 1 | 14 |
-----------------------------------
| 3 | 20201125 | 1 | 0 |
-----------------------------------
| 4 | 20201114 | 2 | 32 |
-----------------------------------
| 5 | 20201116 | 2 | 0 |
-----------------------------------
| 6 | 20201120 | 2 | 23 |
-----------------------------------
However, from this I need a record for each user for each day. If a day is missing for a user, the last recorded score should be carried forward, so I would end up with something like this:
| Row | date |user id | score |
-----------------------------------
| 1 | 20201120 | 1 | 26 |
-----------------------------------
| 2 | 20201121 | 1 | 14 |
-----------------------------------
| 3 | 20201122 | 1 | 14 |
-----------------------------------
| 4 | 20201123 | 1 | 14 |
-----------------------------------
| 5 | 20201124 | 1 | 14 |
-----------------------------------
| 6 | 20201125 | 1 | 0 |
-----------------------------------
| 7 | 20201114 | 2 | 32 |
-----------------------------------
| 8 | 20201115 | 2 | 32 |
-----------------------------------
| 9 | 20201116 | 2 | 0 |
-----------------------------------
| 10 | 20201117 | 2 | 0 |
-----------------------------------
| 11 | 20201118 | 2 | 0 |
-----------------------------------
| 12 | 20201119 | 2 | 0 |
-----------------------------------
| 13 | 20201120 | 2 | 23 |
-----------------------------------
I'm trying to do this in BigQuery using Standard SQL. I have an idea of how to keep the same score across the following empty dates, but I really don't know how to add new rows for the missing dates for each user. Also, keep in mind that this example only has 2 users, but my data has more than 1500.
My end goal is to show something like the average score per day. For background, because of our logic, if the score wasn't recorded on a specific day, the user still has the last recorded score, which is why I need a score for every user every day.
I'd really appreciate any help I could get! I've been trying different options without success.
Below is for BigQuery Standard SQL
#standardSQL
select date, user_id,
last_value(score ignore nulls) over(partition by user_id order by date) as score
from (
select user_id, format_date('%Y%m%d', day) date,
from (
select user_id, min(parse_date('%Y%m%d', date)) min_date, max(parse_date('%Y%m%d', date)) max_date
from `project.dataset.table`
group by user_id
) a, unnest(generate_date_array(min_date, max_date)) day
)
left join `project.dataset.table` b
using(date, user_id)
-- order by user_id, date
If applied to the sample data from your question, the output contains one row per user per day, as in your expected result.
One option uses generate_date_array() to create the series of dates for each user, then brings in the table with a left join (this assumes date is stored as a DATE):
select d.date, d.user_id,
last_value(t.score ignore nulls) over(partition by d.user_id order by d.date) as score
from (
select r.user_id, dt as date
from (
-- first and last recorded date per user
select user_id, min(date) as min_date, max(date) as max_date
from mytable
group by user_id
) r
cross join unnest(generate_date_array(r.min_date, r.max_date, interval 1 day)) as dt
) d
left join mytable t on t.user_id = d.user_id and t.date = d.date
I think the most efficient method is to use generate_date_array() but in a very particular way:
with t as (
select t.*,
date_add(lead(date) over (partition by user_id order by date), interval -1 day) as next_date
from t
)
select row_number() over (order by t.user_id, dte) as id,
t.user_id, dte, t.score
from t cross join
unnest(generate_date_array(date,
coalesce(next_date, date),
interval 1 day
)
) dte;
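The idea of expanding each row up to the day before the user's next row can be tried locally. The sketch below is an SQLite analogue, not BigQuery code: SQLite has no generate_date_array(), so a recursive CTE steps one day at a time instead. The table name scores, the column name day, and the ISO date strings are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (day TEXT, user_id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("2020-11-20", 1, 26), ("2020-11-21", 1, 14), ("2020-11-25", 1, 0),
        ("2020-11-14", 2, 32), ("2020-11-16", 2, 0), ("2020-11-20", 2, 23),
    ],
)

query = """
WITH RECURSIVE spans AS (
    -- attach to each row the date of the same user's next recorded score
    SELECT user_id, day, score,
           LEAD(day) OVER (PARTITION BY user_id ORDER BY day) AS next_day
    FROM scores
),
filled(user_id, day, score, next_day) AS (
    SELECT user_id, day, score, next_day FROM spans
    UNION ALL
    -- repeat the score one day at a time, stopping before the next record
    SELECT user_id, date(day, '+1 day'), score, next_day
    FROM filled
    WHERE next_day IS NOT NULL AND date(day, '+1 day') < next_day
)
SELECT day, user_id, score
FROM filled
ORDER BY user_id, day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this yields 13 rows, one per user per day between each user's first and last record, which lines up with the expected table in the question.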

Combine PARTITION BY and GROUP BY

I have a (mssql) table like this:
+----+----------+---------+--------+--------+
| id | username | date | scoreA | scoreB |
+----+----------+---------+--------+--------+
| 1 | jim | 01/2020 | 100 | 0 |
| 2 | max | 01/2020 | 0 | 200 |
| 3 | jim | 01/2020 | 0 | 150 |
| 4 | max | 02/2020 | 150 | 0 |
| 5 | jim | 02/2020 | 0 | 300 |
| 6 | lee | 02/2020 | 100 | 0 |
| 7 | max | 02/2020 | 0 | 200 |
+----+----------+---------+--------+--------+
What I need is the best "combined" score per date. (By "combined" score I mean the sum of a user's best scoreA and best scoreB on that date.)
The result should look like this:
+----------+---------+--------------------------------------------+
| username | date | combined_score (max(scoreA) + max(scoreB)) |
+----------+---------+--------------------------------------------+
| jim | 01/2020 | 250 |
| max | 02/2020 | 350 |
+----------+---------+--------------------------------------------+
This is how far I got:
I can group the scores by user like this:
SELECT
username, (max(scoreA) + max(scoreB)) AS combined_score
FROM score_table
GROUP BY username
ORDER BY combined_score DESC
And I can get the best score per date with PARTITION BY like this:
SELECT *
FROM
(SELECT t.*, row_number() OVER (PARTITION BY date ORDER BY scoreA DESC) rn
FROM score_table t) as tmp
WHERE tmp.rn = 1
ORDER BY date
Is there a proper way to combine these statements and get the result I need? Thank you!
Btw. Don't care about possible ties!
You can combine window functions and aggregation functions like this:
SELECT s.*
FROM (SELECT username, date, (max(scoreA) + max(scoreB)) AS combined_score,
ROW_NUMBER() OVER (PARTITION BY date ORDER BY max(scoreA) + max(scoreB) DESC) as seqnum
FROM score_table
GROUP BY username, date
) s
WHERE seqnum = 1
ORDER BY combined_score DESC;
Note that date needs to be part of the GROUP BY.
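To check the grouped-then-ranked pattern against the sample table, here is a small sketch that runs it in an in-memory SQLite database rather than SQL Server. It keeps only the top row per date (seqnum = 1) and sorts by date for readability; both choices are assumptions based on the expected result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score_table (id INTEGER, username TEXT, date TEXT, "
    "scoreA INTEGER, scoreB INTEGER)"
)
conn.executemany(
    "INSERT INTO score_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "jim", "01/2020", 100, 0), (2, "max", "01/2020", 0, 200),
        (3, "jim", "01/2020", 0, 150), (4, "max", "02/2020", 150, 0),
        (5, "jim", "02/2020", 0, 300), (6, "lee", "02/2020", 100, 0),
        (7, "max", "02/2020", 0, 200),
    ],
)

query = """
SELECT username, date, combined_score
FROM (
    -- aggregate per user and date, then rank those aggregates within each date
    SELECT username, date,
           MAX(scoreA) + MAX(scoreB) AS combined_score,
           ROW_NUMBER() OVER (
               PARTITION BY date
               ORDER BY MAX(scoreA) + MAX(scoreB) DESC
           ) AS seqnum
    FROM score_table
    GROUP BY username, date
) AS s
WHERE seqnum = 1
ORDER BY date
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The window function is evaluated after the GROUP BY, which is why it can order by the aggregate expressions directly.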

eSQL multiple join but with conditions

I have three tables, as shown below:
MERCHANDISE
+-----------+-----------+---------------+
| MERCH_NUM | MERCH_DIV | MERCH_SUB_DIV |
+-----------+-----------+---------------+
| 1 | car | awd |
| 1 | car | awd |
| 2 | bike | 1kcc |
| 3 | cycle | hybrid |
| 3 | cycle | city |
| 4 | moped | fixie |
+-----------+-----------+---------------+
PRIORITY
+----------+-----------+---------+---------+------------+------------+---------------+
| CUST_NUM | SALES_NUM | DOC_NUM | BALANCE | PRIORITY_1 | PRIORITY_2 | PRIORITY_CODE |
+----------+-----------+---------+---------+------------+------------+---------------+
| 90 | 1000 | 10 | 23 | 1 | 6 | NO |
| 91 | 1001 | 20 | 32 | 3 | 7 | PRI |
| 92 | 1002 | 30 | 11 | 2 | 8 | LATE |
| 93 | 1003 | 40 | 22 | 5 | 9 | 1MON |
+----------+-----------+---------+---------+------------+------------+---------------+
ORDER
+----------+-----------+---------+---------+-----------+-----------+
| CUST_NUM | SALES_NUM | DOC_NUM | COUNTRY | MERCH_NUM | MERCH_DIV |
+----------+-----------+---------+---------+-----------+-----------+
| 90 | 1000 | 10 | INDIA | 1 | car |
| 91 | 1001 | 20 | CHINA | 2 | bike |
| 92 | 1002 | 30 | USA | 3 | cycle |
| 93 | 1003 | 40 | UK | 4 | moped |
+----------+-----------+---------+---------+-----------+-----------+
I want to join the result of left-joining the last two tables with the first one, such that the MERCH_SUB_DIV 'awd' appears only once for each unique combination of MERCH_NUM and MERCH_DIV.
The code I came up with is below, but I'm not sure how to eliminate the duplicate row just for 'awd':
select
ROW#, MERCH.MERCH_NUMBER, ORDPRI.MERCH_NUMBER, ORDPRI.CUST_NUM,
BALANCE, SALES_NUM, ITEM_NUM, RANK, PRIORITY_1
from (
select
ROW_NUMBER() OVER(
PARTITION BY ORD.DOC_NUM, ORD.ITEM_NUM
ORDER BY ORD.DOC_NUM, ORD.ITEM_NUM ASC
) AS Row#,
ORD.CUST_NUM, PRI.CUST_NUM, ORD.MERCH_NUM, ORD.MERCH_DIV, PRI.BALANCE,
pri.DOC_NUM, pri.SALES_NUM, pri.PRIORITY_1, pri.PRIORITY_2
from ORDER as ORD
left join PRIORITY as PRI on ORD.DOC_NUM = PRI.DOC_NUM
and ORD.SALES_NUMBER = PRI.SALES_NUM
where country_name in ('USA', 'INDIA')
) as ORDPRI
left join MERCHANDISE as MERCH on ORDPRI.DIV = MERCH.DIV
and ORDPRI.MERCH_NUM = MERCH.MERCH_NUM
You have to use the DISTINCT keyword to get unique values, but if your PRIORITY and ORDER tables contain different values for the same MERCH_NUM, the final result will still repeat that MERCH_NUM.
SELECT DISTINCT M.MERCH_NUMBER, O.MERCH_NUMBER, O.CUST_NUM, BALANCE, SALES_NUM,ITEM_NUM,RANK,PRIORITY_1
FROM priority_table P
LEFT JOIN order_table O ON P.CUST_NUM = O.CUST_NUM AND P.SALES_NUM=O.SALES_NUM AND P.DOC_NUM = O.DOC_NUM
LEFT JOIN merchandise_table M ON M.MERCH_NUM = O.MERCH_NUM
A workaround is to add a new ROW_NUMBER() in the outermost query, partitioned by MERCH_SUB_DIV plus all the columns in the final list, and then filter the final results on that new row number. The following pseudocode might help:
select
-- All expected columns in final result except the newRow#
ROW#, MERCH_NUM, CUST_NUM,
BALANCE, SALES_NUM, PRIORITY_1
from (
select
ROW#,
-- the new row number includes all column you want to show in final result
row_number() over ( PARTITION BY MERCH.MERCH_SUB_DIV ,
MERCH.MERCH_NUM, ORDPRI.MERCH_NUM, ORDPRI.CUST_NUM,
BALANCE, SALES_NUM, PRIORITY_1
order by (select 1 )) as newRow# ,
MERCH.MERCH_NUM, ORDPRI.CUST_NUM,
BALANCE, SALES_NUM, PRIORITY_1
from (
-- main query goes here
select
ROW_NUMBER() OVER(
PARTITION BY ORD.DOC_NUM --, ORD.ITEM_NUM
ORDER BY ORD.DOC_NUM ASC --, ORD.ITEM_NUM
) AS Row#,
ORD.CUST_NUM, ORD.MERCH_NUM, ORD.MERCH_DIV as DIV, PRI.BALANCE,
pri.DOC_NUM, pri.SALES_NUM, pri.PRIORITY_1, pri.PRIORITY_2
from #ORDER as ORD
left join #PRIORITY as PRI on ORD.DOC_NUM = PRI.DOC_NUM
and ORD.SALES_NUMBER = PRI.SALES_NUM
where country_name in ('USA', 'INDIA')
) as ORDPRI
left join #MERCHANDISE as MERCH on ORDPRI.DIV = MERCH.DIV
and ORDPRI.MERCH_NUM = MERCH.MERCH_NUM
) as T
-- final filter to get distinct values
where newRow# = 1
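The row-number filter can be seen in isolation on the MERCHANDISE sample alone. The sketch below is a simplified illustration in SQLite, not the full three-table query: it partitions by every column that defines a duplicate and keeps the first row of each partition. The alias new_row and the lowercase table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE merchandise (merch_num INTEGER, merch_div TEXT, merch_sub_div TEXT)"
)
conn.executemany(
    "INSERT INTO merchandise VALUES (?, ?, ?)",
    [
        (1, "car", "awd"), (1, "car", "awd"), (2, "bike", "1kcc"),
        (3, "cycle", "hybrid"), (3, "cycle", "city"), (4, "moped", "fixie"),
    ],
)

query = """
SELECT merch_num, merch_div, merch_sub_div
FROM (
    SELECT m.*,
           -- rows that agree on all three columns share a partition,
           -- so exact duplicates are numbered 1, 2, ...
           ROW_NUMBER() OVER (
               PARTITION BY merch_num, merch_div, merch_sub_div
               ORDER BY merch_num
           ) AS new_row
    FROM merchandise AS m
) AS numbered
WHERE new_row = 1
ORDER BY merch_num, merch_sub_div
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The duplicated (1, car, awd) row collapses to one, while the two distinct sub-divisions of merch_num 3 are both kept.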

select max value for each carID

I have three columns (carID, clientID, numClients). The first identifies a car, the second identifies a client, and the third shows how many times each client rented that car.
I need to get the maximum value of numClients for each carID.
I did this:
SELECT carID, clientID,
COUNT(*) AS numClients
FROM RENT R
JOIN DETAILS_OF_RENT D ON d.rentID = r.ID
GROUP BY carID, clientID
ORDER BY carID, clientID;
So the table I get is something like this:
+---------+----------+------------+
| carID | clientID | numClients |
+---------+----------+------------+
| 0765BBC | C02 | 1 |
| 0765BBC | C05 | 1 |
| 0765BBC | C07 | 1 |
| 0765BBC | C13 | 1 |
| 0765BBC | C14 | 1 |
| 1234XPQ | C01 | 1 |
| 1234XPQ | C02 | 1 |
| 1234XPQ | C07 | 1 |
| 1234XPQ | C09 | 2 |
| 1234XPQ | C11 | 1 |
| 1523BBD | c07 | 1 |
| 1523BBD | c09 | 2 |
+---------+----------+------------+
My output should be 1234XPQ and 1523BBD since they were rented by the same client 2 times.
So I need the carIDs of the cars that were rented by the same client the most times, but I don't know how to select these rows from the table above.
You appear to want something like this:
SELECT rd.*
FROM (SELECT rd.*, DENSE_RANK() OVER (ORDER BY client_cnt DESC) as seqnum
FROM (SELECT carID, clientID, COUNT(*) OVER (PARTITION BY clientId) as client_cnt
FROM RENT R JOIN
DETAILS_OF_RENT D
ON d.rentID = r.ID
) rd
) rd
WHERE seqnum = 1;
I don't think aggregation is needed. The innermost subquery adds a column with the total number of rentals for each client. The middle subquery ranks the rows by that count, so the largest count gets seqnum 1. The outer query then keeps only the rows with the largest count.
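For comparison, here is a variation that does use aggregation: it counts rentals per (carID, clientID) pair, as in the question's own query, and then keeps the pairs tied for the highest count with DENSE_RANK(). This is a sketch in SQLite, not the query above; the single rentals table is a hypothetical stand-in for the RENT and DETAILS_OF_RENT join, and only part of the sample data is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# hypothetical flattened table: one row per rental
conn.execute("CREATE TABLE rentals (carID TEXT, clientID TEXT)")
conn.executemany(
    "INSERT INTO rentals VALUES (?, ?)",
    [
        ("0765BBC", "C02"), ("0765BBC", "C05"), ("0765BBC", "C07"),
        ("1234XPQ", "C01"), ("1234XPQ", "C09"), ("1234XPQ", "C09"),
        ("1523BBD", "C07"), ("1523BBD", "C09"), ("1523BBD", "C09"),
    ],
)

query = """
SELECT carID, clientID, numClients
FROM (
    SELECT carID, clientID,
           COUNT(*) AS numClients,
           -- rank the per-pair counts; ties for the top count all get rank 1
           DENSE_RANK() OVER (ORDER BY COUNT(*) DESC) AS seqnum
    FROM rentals
    GROUP BY carID, clientID
) AS counted
WHERE seqnum = 1
ORDER BY carID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Adding PARTITION BY carID to the DENSE_RANK() would instead return the most frequent client for every car, which is the other reading of the question.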

Oracle SQL newbie - Add new column that gets occurrence and computations

This post is an enhanced version of my previous post.
Please note: this is not a duplicate post or thread.
I have 3 tables:
1. REQUIRED_AUDITS (Independent table)
2. SCORE_ENTRY (SCORE_ENTRY is One to Many relationship with ERROR table)
3. ERROR
Below are the dummy data and table structure:
REQUIRED_AUDITS TABLE
+-------+------+----------+---------------+-----------------+------------+----------------+---------+
| ID | VP | Director | Manager | Employee | Req_audits | Audit_eligible | Quarter |
+-------+------+----------+---------------+-----------------+------------+----------------+---------+
| 10001 | John | King | susan#com.com | jake#com.com | 2 | Y | FY18Q1 |
| 10002 | John | King | susan#com.com | beth#com.com | 4 | Y | FY18Q1 |
| 10003 | John | Maria | tony#com.com | david#com.com | 6 | N | FY18Q1 |
| 10004 | John | Maria | adam#com.com | william#com.com | 3 | Y | FY18Q1 |
| 10005 | John | Smith | alex#com.com | rose#com.com | 6 | Y | FY18Q1 |
+-------+------+----------+---------------+-----------------+------------+----------------+---------+
SCORE_ENTRY TABLE
+----------------+------+----------+---------------+-----------------+-------+---------+
| SCORE_ENTRY_ID | VP | Director | Manager | Employee | Score | Quarter |
+----------------+------+----------+---------------+-----------------+-------+---------+
| 1 | John | King | susan#com.com | jake#com.com | 100 | FY18Q1 |
| 2 | John | King | susan#com.com | jake#com.com | 90 | FY18Q1 |
| 3 | John | King | susan#com.com | beth#com.com | 98.45 | FY18Q1 |
| 4 | John | King | susan#com.com | beth#com.com | 95 | FY18Q1 |
| 5 | John | King | susan#com.com | beth#com.com | 100 | FY18Q1 |
| 6 | John | King | susan#com.com | beth#com.com | 100 | FY18Q1 |
| 7 | John | Maria | adam#com.com | william#com.com | 99 | FY18Q1 |
| 8 | John | Maria | adam#com.com | william#com.com | 98.1 | FY18Q1 |
| 9 | John | Smith | alex#com.com | rose#com.com | 96 | FY18Q1 |
| 10 | John | Smith | alex#com.com | rose#com.com | 100 | FY18Q1 |
+----------------+------+----------+---------------+-----------------+-------+---------+
ERROR TABLE
+----------+-----------------------------+----------------+
| ERROR_ID | ERROR | SCORE_ENTRY_ID |
+----------+-----------------------------+----------------+
| 10 | Words Missing | 2 |
| 11 | Incorrect document attached | 2 |
| 12 | No results | 3 |
| 13 | Value incorrect | 4 |
| 14 | Words Missing | 4 |
| 15 | No files attached | 4 |
| 16 | Document read error | 7 |
| 17 | Garbage text | 8 |
| 18 | No results | 8 |
| 19 | Value incorrect | 9 |
| 20 | No files attached | 9 |
+----------+-----------------------------+----------------+
I have a query that gives the output below:
+----------+---------------+------------------+------------------+------------------+
| | | Director Summary | | |
+----------+---------------+------------------+------------------+------------------+
| Director | Manager | Audits Required | Audits Performed | Percent Complete |
| King | susan#com.com | 6 | 6 | 100% |
| Maria | adam#com.com | 3 | 2 | 67% |
| Smith | alex#com.com | 6 | 2 | 33% |
+----------+---------------+------------------+------------------+------------------+
Now I would like to add a column showing the number of scores that have an error associated with them, divided by the total count of scores.
It is not the total count of errors divided by the count of scores. Instead, each score with one or more errors counts once, and that count is divided by the count of scores. Please see the example below:
Considering
Director:King
Manager:susan#com.com
From SCORE_ENTRY TABLE and ERROR table,
King has 6 entries in the SCORE_ENTRY table
and 6 entries in the ERROR table.
Instead of the 6 entries in the ERROR table, I would like to count error occurrences, i.e. the 3 score entries that have errors.
Formula to calculate Quality:
Quality = (1 - (count of scores with at least one error / total count of scores)) * 100
For King:
Quality = (1 - 3/6) * 100
Quality = 50
Please note: it's not (1 - 6/6) * 100
For Maria:
Quality = (1 - 2/2) * 100
Quality = 0
Below is the new output I need with new column called Quality:
+----------+---------------+---------+------------------+------------------+------------------+
| | | | Director Summary | | |
+----------+---------------+---------+------------------+------------------+------------------+
| Director | Manager | Quality | Audits Required | Audits Performed | Percent Complete |
| King | susan#com.com | 50% | 6 | 6 | 100% |
| Maria | adam#com.com | 0% | 3 | 2 | 67% |
| Smith | alex#com.com | 50% | 6 | 2 | 33% |
+----------+---------------+---------+------------------+------------------+------------------+
Below is the query I have (thanks to @Kaushik Nayak, @APC and others); I need to add the new column to it:
WITH aud(manager_email, director, quarter, total_audits_required)
AS (SELECT manager_email,
director,
quarter,
SUM (CASE
WHEN audit_eligible = 'Y' THEN required_audits
END)
FROM required_audits
GROUP BY manager_email,
director,
quarter), --Total_audits
scores(manager_email, director, quarter, audits_completed)
AS (SELECT manager_email,
director,
quarter,
Count (score)
FROM oq_score_entry s
GROUP BY manager_email,
director,
quarter) --Audits_Performed
SELECT a.director,
a.manager_email manager,
a.total_audits_required,
s.audits_completed,
Round(( ( s.audits_completed ) / ( a.total_audits_required ) * 100 ), 2)
percentage_complete,
a.quarter
FROM aud a
left outer join scores s
ON a.manager_email = s.manager_email
WHERE ( :P4_MANAGER_EMAIL = a.manager_email
OR :P4_MANAGER_EMAIL IS NULL )
AND ( :P4_DIRECTOR = a.director
OR :P4_DIRECTOR IS NULL )
AND ( :P4_QUARTER = a.quarter
OR :P4_QUARTER IS NULL )
ORDER BY a.total_audits_required DESC nulls last
Please let me know if it's confusing or if you need more details. I am open to any suggestions and feedback.
Appreciate any help.
Thanks,
Richa
Update:
Well my first guess has been wrong, and I hope now I'm getting it right.
According to your and shawnt00's comments, you need to compute the count of score entries that have corresponding entries in ERROR table, and use it in quality calculation.
This count you get with the expression:
COUNT ((select max(1) from "ERROR" o where o.score_entry_id=s.score_entry_id)) AS error_occurences
max(1) returns 1 when there is an entry in "ERROR" and NULL otherwise. COUNT skips nulls.
I hope this is clear.
Quality is computed as
(1 - error_occurences/audits_completed)*100%
Below is the full script, where manager_email is renamed to manager and oq_score_entry is renamed to score_entry.
This matches your schema. I also removed the unnecessary WITH column mapping, since it just complicates things in this case.
WITH aud AS (SELECT manager, director, quarter, SUM (CASE
WHEN audit_eligible = 'Y' THEN req_audits
END) total_audits_required
FROM required_audits
GROUP BY manager, director, quarter), --Total_audits
scores AS (
SELECT manager, director, quarter,
Count (score) audits_completed,
COUNT ((select max(1) from "ERROR" o where o.score_entry_id=s.score_entry_id)
) error_occurences -- ** Added **
FROM score_entry s
GROUP BY manager, director, quarter
) --Audits_Performed
SELECT a.director,
a.manager manager,
a.total_audits_required,
s.audits_completed,
Round(( 1 - ( s.error_occurences ) / ( s.audits_completed )) * 100, 2) quality, -- ** Added **
Round(( ( s.audits_completed ) / ( a.total_audits_required ) * 100 ), 2)
percentage_complete,
a.quarter
FROM aud a
left outer join scores s ON a.manager = s.manager
WHERE ( :P4_manager = a.manager
OR :P4_manager IS NULL )
AND ( :P4_DIRECTOR = a.director
OR :P4_DIRECTOR IS NULL )
AND ( :P4_QUARTER = a.quarter
OR :P4_QUARTER IS NULL )
ORDER BY a.total_audits_required DESC nulls last
About total_errors:
To add this column you can either use a technique similar to the one used before in scores:
scores AS (
SELECT manager, director, quarter,
count (score) audits_completed,
count ((select max(1) from "ERROR" o where o.score_entry_id=s.score_entry_id )
) error_occurences,
sum ( ( SELECT count(*) from "ERROR" o where o.score_entry_id=s.score_entry_id )
) total_errors -- summing error counts for matched score_entry_ids
FROM score_entry s
GROUP BY manager, director, quarter
)
Or you can rewrite the scores CTE joining score_entry and error, and that would require using DISTINCT on score_entry fields to avoid duplication of rows:
scores AS (
SELECT manager, director, quarter,
count(DISTINCT s.score_entry_id) audits_completed,
count(DISTINCT e.score_entry_id ) error_occurences, -- counting distinct score_entry_ids present in Error
count(e.score_entry_id) total_errors -- counting total rows in Error
FROM score_entry s
LEFT JOIN "ERROR" e ON s.score_entry_id=e.score_entry_id
GROUP BY manager, director, quarter
)
The latter approach is a bit less maintainable, since it requires care to avoid unwanted duplication.
Yet another (and maybe the most proper) way is to make a separate (third) CTE, but I don't think the query is complex enough to warrant this.
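The join-based counting can be checked against the sample data with a small sketch. It runs in an in-memory SQLite database rather than Oracle, groups by director only to stay short, and computes quality as (1 - error_occurrences / audits_completed) * 100. The trimmed-down tables and the column alias error_occurrences are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score_entry (score_entry_id INTEGER, director TEXT)")
conn.execute('CREATE TABLE "error" (error_id INTEGER, score_entry_id INTEGER)')
conn.executemany(
    "INSERT INTO score_entry VALUES (?, ?)",
    [(1, "King"), (2, "King"), (3, "King"), (4, "King"), (5, "King"),
     (6, "King"), (7, "Maria"), (8, "Maria"), (9, "Smith"), (10, "Smith")],
)
conn.executemany(
    'INSERT INTO "error" VALUES (?, ?)',
    [(10, 2), (11, 2), (12, 3), (13, 4), (14, 4), (15, 4),
     (16, 7), (17, 8), (18, 8), (19, 9), (20, 9)],
)

query = """
SELECT s.director,
       COUNT(DISTINCT s.score_entry_id) AS audits_completed,
       COUNT(DISTINCT e.score_entry_id) AS error_occurrences,  -- scores with >= 1 error
       COUNT(e.score_entry_id)          AS total_errors,       -- all error rows
       ROUND((1 - COUNT(DISTINCT e.score_entry_id) * 1.0
                  / COUNT(DISTINCT s.score_entry_id)) * 100, 2) AS quality
FROM score_entry AS s
LEFT JOIN "error" AS e ON e.score_entry_id = s.score_entry_id
GROUP BY s.director
ORDER BY s.director
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The DISTINCT on s.score_entry_id matters here: the join repeats a score row once per error, so a plain COUNT(*) would overstate audits_completed.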
Original answer:
I might be wrong, but it seems to me that by "count of each occurrence of error" you are trying to describe COUNT(DISTINCT expr), that is, counting unique occurrences of error for each (manager_email, director, quarter).
If so, change the query a bit:
WITH aud(manager_email, director, quarter, total_audits_required)
AS (SELECT manager_email,
director,
quarter,
SUM (CASE
WHEN audit_eligible = 'Y' THEN required_audits
END)
FROM required_audits
GROUP BY manager_email,
director,
quarter), --Total_audits
scores(manager_email, director, quarter, audits_completed, distinct_errors)
AS (SELECT manager_email,
director,
quarter,
Count (score),
COUNT (DISTINCT o.error_id) -- ** Added **
FROM oq_score_entry s join error o on o.score_entry_id=s.score_entry_id
GROUP BY manager_email,
director,
quarter) --Audits_Performed
SELECT a.director,
a.manager_email manager,
a.total_audits_required,
s.audits_completed,
Round(( ( s.distinct_errors ) / ( s.audits_completed ) * 100 ), 2) quality, -- ** Added **
Round(( ( s.audits_completed ) / ( a.total_audits_required ) * 100 ), 2)
percentage_complete,
a.quarter
FROM aud a
left outer join scores s
ON a.manager_email = s.manager_email
WHERE ( :P4_MANAGER_EMAIL = a.manager_email
OR :P4_MANAGER_EMAIL IS NULL )
AND ( :P4_DIRECTOR = a.director
OR :P4_DIRECTOR IS NULL )
AND ( :P4_QUARTER = a.quarter
OR :P4_QUARTER IS NULL )
ORDER BY a.total_audits_required DESC nulls last
The join on your main query will need to include director and quarter once you have more data.
I suppose the easiest way to fix this is to follow the structure you've got and add another table expression joining it to the rest of your results in the same way as the original two.
select manager_email, director, quarter,
100.0 - 100.0 * count (distinct e.score_entry_id) / count (*) as quality
from score_entry se left outer join error e
on e.score_entry_id = se.score_entry_id
group by manager_email, director, quarter
What would have made most of your explanation unnecessary is to have simply said that you want the number of scores that have an error associated with them. It was difficult to draw that out from the information you provided.