I have inherited two tables, where the data for one is in hours, and the data for the other is in days.
One table holds planned resource use; the other holds the actual hours spent.
Internal_Resources
| PeopleName | NoOfDays | TaskNo |
|------------|----------|--------|
| Fred | 1 | 100 |
| Bob | 3 | 100 |
| Mary | 2 | 201 |
| Albert | 10 | 100 |
TimeSheetEntries
| UserName | PaidHours | TaskNumber |
|----------|-----------|------------|
| Fred | 7 | 100 |
| Fred | 14 | 100 |
| Fred | 7 | 100 |
| Bob | 7 | 100 |
| Bob | 21 | 100 |
| Mary | 7 | 201 |
| Mary | 14 | 100 |
What I need is a comparison of time planned vs time spent.
| name | PlannedDays | ActualDays |
|--------|-------------|------------|
| Albert | 10 | NULL |
| Bob | 3 | 4.00 |
| Fred | 1 | 4.00 |
| Mary | NULL | 2.00 |
I've cobbled together something that almost does the trick:
SELECT
    UserName,
    ( SELECT NoOfDays FROM Internal_Resources AS r
      WHERE r.PeopleName = e.UserName AND r.TaskNo = ? ) AS PlannedDays,
    SUM( Round( PaidHours / 7, 2 ) ) AS ActualDays
FROM TimeSheetEntries e
WHERE TaskNumber = ?
GROUP BY UserName
Which for task 100 gives me back something like:
| UserName | PlannedDays | ActualDays |
|----------|-------------|------------|
| Bob | 3 | 4 |
| Fred | 1 | 4 |
| Mary | 0 | 2 |
but lazy Albert doesn't feature! I'd like:
| UserName | PlannedDays | ActualDays |
|----------|-------------|------------|
| Albert | 10 | 0 |
| Bob | 3 | 4 |
| Fred | 1 | 4 |
| Mary | 0 | 2 |
I've tried using variations on
SELECT * FROM ( SELECT ... ) AS plan
INNER JOIN ( [second-query] ) AS actual
ON plan.PeopleName = actual.UserName
What should I be doing? I suspect I need to squeeze a cross-join in there somewhere, but I'm getting nowhere...
( This is going to be run inside a FileMaker ExecuteSQL() call, so I need pretty vanilla SQL... And no, I don't have control over the column or table names :-( )
EDIT:
To be clear, I need the result set to include both users who have planned days but haven't worked on the task, and users who have worked on the task without having any planned days...
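( In a dialect that supports it, this is essentially a FULL OUTER JOIN of the per-task plan against the per-task totals - something like the sketch below - but as far as I can tell that isn't available to me in FileMaker's ExecuteSQL, hence the question: )
SELECT COALESCE( p.PeopleName, a.UserName ) AS name,
       p.NoOfDays AS PlannedDays,
       a.ActualDays
FROM ( SELECT PeopleName, NoOfDays
       FROM Internal_Resources WHERE TaskNo = 100 ) AS p
FULL OUTER JOIN
     ( SELECT UserName, SUM( Round( PaidHours / 7, 2 ) ) AS ActualDays
       FROM TimeSheetEntries WHERE TaskNumber = 100
       GROUP BY UserName ) AS a
ON p.PeopleName = a.UserName
ORDER BY name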
EDIT 2:
I can kind of get what I want manually, but can't see how to combine the statements below:
SELECT people.name FROM
( SELECT PeopleName as name FROM Internal_Resources WHERE TaskNo = 100
UNION
SELECT DISTINCT UserName as name FROM TimeSheetEntries WHERE TaskNumber = 100
ORDER BY Name) AS people
gets me:
+--------+
| name |
+--------+
| Albert |
| Bob |
| Fred |
| Mary |
+--------+
and:
( SELECT PeopleName AS name, NoOfDays AS PlannedDays
FROM Internal_Resources WHERE TaskNo = 100 ) AS planned
gets me:
+--------+-------------+
| name | PlannedDays |
+--------+-------------+
| Fred | 1 |
| Bob | 3 |
| Albert | 10 |
+--------+-------------+
and finally,
( SELECT UserName AS name, SUM( Round( PaidHours / 7, 2 ) ) AS ActualDays
FROM TimeSheetEntries
WHERE TaskNumber = 100 GROUP BY UserName ) AS actual
gets me:
+------+------------+
| name | ActualDays |
+------+------------+
| Bob | 4.00 |
| Fred | 4.00 |
| Mary | 2.00 |
+------+------------+
Now all (All! ha!) I want is to combine these into this:
+--------+-------------+------------+
| name | PlannedDays | ActualDays |
+--------+-------------+------------+
| Albert | 10 | NULL |
| Bob | 3 | 4.00 |
| Fred | 1 | 4.00 |
| Mary | NULL | 2.00 |
+--------+-------------+------------+
EDIT 3:
I've tried combining it with something along the lines of:
SELECT people.name, PlannedDays, ActualDays
FROM ( SELECT PeopleName as name FROM Internal_Resources WHERE TaskNo = 100
UNION
SELECT DISTINCT UserName as name FROM TimeSheetEntries WHERE TaskNumber = 100
ORDER BY Name) AS people
LEFT JOIN ( SELECT PeopleName AS name, NoOfDays AS PlannedDays FROM Internal_Resources WHERE TaskNo = 100 ) AS actual,
ON people.name = actual.name
LEFT JOIN ( SELECT UserName AS name, SUM( Round( PaidHours / 7, 2 ) ) AS ActualDays FROM TimeSheetEntries WHERE TaskNumber = 100 GROUP BY UserName ) AS planned
ON people.name = planned.name;
but the syntax is clearly wonky.
Okay - this works:
SELECT people.name, COALESCE(PlannedDays, 0) as planned, COALESCE(ActualDays, 0) as actual
FROM ( SELECT PeopleName as name FROM Internal_Resources WHERE TaskNo = 100
UNION
SELECT DISTINCT UserName as name FROM TimeSheetEntries WHERE TaskNumber = 100
ORDER BY Name) AS people
LEFT JOIN ( SELECT PeopleName AS name, NoOfDays AS PlannedDays FROM Internal_Resources WHERE TaskNo = 100 ) AS ir
ON people.name = ir.name
LEFT JOIN ( SELECT UserName AS name, SUM( Round( PaidHours / 7, 2 ) ) AS ActualDays FROM TimeSheetEntries WHERE TaskNumber = 100 GROUP BY UserName ) AS ts
ON people.name = ts.name;
Giving:
+--------+---------+--------+
| name | planned | actual |
+--------+---------+--------+
| Albert | 10 | 0.00 |
| Bob | 3 | 4.00 |
| Fred | 1 | 4.00 |
| Mary | 0 | 2.00 |
+--------+---------+--------+
I thought there must be an easier way, and this looks simpler:
SELECT name, SUM(x) AS planned, SUM(y) AS actual
FROM (
SELECT PeopleName AS name, NoOfDays AS x, 0 AS y
FROM Internal_Resources WHERE TaskNo = 100
UNION
SELECT UserName AS name, 0 AS x, SUM( PaidHours / 7 ) AS y
FROM TimeSheetEntries WHERE TaskNumber = 100 GROUP BY UserName) AS source
GROUP BY name;
But frustratingly - both work in MySQL and both FAIL in FileMaker's cut-down SQL version - SELECTing from a derived table doesn't appear to be supported.
Finally - the trick to getting it to work in FileMaker SQL: subqueries are supported for IN and NOT IN... so a union of three queries - people who have planned days and have done some work, people who have done unplanned work, and people who have planned days but have done no work:
SELECT PeopleName as name, NoOfDays as planned, Sum( PaidHours / 7 ) as actual
FROM Internal_Resources
JOIN TimeSheetEntries
ON PeopleName = UserName
WHERE TaskNumber = 100 AND TaskNo = 100 GROUP BY PeopleName
UNION
SELECT UserName as name, 0 as planned, Sum( PaidHours / 7 ) as actual
FROM TimeSheetEntries
WHERE TaskNumber = 100
AND UserName NOT IN (
SELECT PeopleName FROM Internal_Resources WHERE TaskNo = 100
)
GROUP BY UserName
UNION
SELECT PeopleName as name, NoOfDays as planned, 0 as actual
FROM Internal_Resources WHERE TaskNo = 100
AND PeopleName NOT IN (
SELECT PeopleName as name
FROM Internal_Resources JOIN TimeSheetEntries
ON PeopleName = UserName
WHERE TaskNumber = 100 AND TaskNo = 100
GROUP BY PeopleName
)
ORDER BY name;
Hope this helps someone.
Doesn't FileMaker support LEFT OUTER JOINs?
SELECT
PeopleName,
NoOfDays AS PlannedDays,
ROUND(SUM(PaidHours) / 7, 2) AS ActualDays
FROM
Internal_Resources AS planned
-- left join should not discard Albert's record from Internal_Resources
LEFT JOIN TimeSheetEntries AS actual
ON planned.PeopleName = actual.UserName
AND planned.TaskNo = actual.TaskNumber
WHERE
planned.TaskNo = ?
GROUP BY PeopleName, NoOfDays
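As written this only returns people who appear in Internal_Resources, so someone like Mary (time logged, no plan) still won't show up. If you need her too, you could UNION on a mirrored query along these lines (just a sketch, untested in FileMaker):
SELECT
    UserName AS PeopleName,
    0 AS PlannedDays,
    ROUND(SUM(PaidHours) / 7, 2) AS ActualDays
FROM
    TimeSheetEntries
WHERE
    TaskNumber = ?
    AND UserName NOT IN ( SELECT PeopleName FROM Internal_Resources WHERE TaskNo = ? )
GROUP BY UserName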
Invert the logic to read from Internal_Resources in the outer query:
SELECT ir.PeopleName, NoOfDays as PlannedDays,
    (SELECT SUM ( Round( PaidHours / 7 , 2 ))
        FROM TimeSheetEntries e
        WHERE e.TaskNumber = ? AND ir.PeopleName = e.UserName
    ) as ActualDays
FROM Internal_Resources ir
WHERE ir.TaskNo = ?
GROUP BY ir.PeopleName, NoOfDays;
Related
I have a table like this:
--------------------------------------------
| Job | Class | Employee | PayType | Hours |
| 212 | A     | John     | 1       | 20    |
| 212 | A     | John     | 2       | 10    |
| 911 | C     | Rebekah  | 1       | 15    |
| 911 | C     | Rebekah  | 2       | 10    |
--------------------------------------------
I want to convert this table so I can get the following output:
------------------------------------
| Job | Class | Employee | OT | ST |
| 212 | A | John | 20 | 10 |
| 911 | C | Rebekah | 15 | 10 |
------------------------------------
Here I've set 1 for OT and 2 for ST
You can use conditional aggregation:
select
job,
class,
employee,
sum(case when paytype = 1 then hours else 0 end) ot,
sum(case when paytype = 2 then hours else 0 end) st
from mytable
group by
job,
class,
employee
Using PIVOT (SQL Server syntax):
select
Job,
Class,
Employee,
[1] as OT,
[2] as ST from
(
select * from mytable
) as t
pivot
(
sum([Hours])
for paytype in([1],[2])
) as pvt;
I have four tables:
mls_category
points_matrix
mls_entry
bonus_points
My first table (mls_category) is like below:
*--------------------------------*
| cat_no | store_id | cat_value |
*--------------------------------*
| 10 | 101 | 1 |
| 11 | 101 | 4 |
*--------------------------------*
My second table (points_matrix) is like below:
*----------------------------------------------------*
| pm_no | store_id | value_per_point | maxpoint |
*----------------------------------------------------*
| 1 | 101 | 1 | 10 |
| 2 | 101 | 2 | 50 |
| 3 | 101 | 3 | 80 |
*----------------------------------------------------*
My third table (mls_entry) is like below:
*-------------------------------------------*
| user_id | category | distance | status |
*-------------------------------------------*
| 1 | 10 | 20 | approved |
| 1 | 10 | 30 | approved |
| 1 | 11 | 40 | approved |
*-------------------------------------------*
My fourth table (bonus_points) is like below:
*--------------------------------------------*
| user_id | store_id | bonus_points | type |
*--------------------------------------------*
| 1 | 101 | 200 | fixed |
| 2 | 102 | 300 | fixed |
| 1 | 103 | 4 | per |
*--------------------------------------------*
Now, I want to add the bonus points value to the sum of total distance, according to the store_id, user_id and type.
I am using the following code to get total distance:
SELECT MIN(b.value_per_point) * d.total_distance FROM points_matrix b
JOIN
(
SELECT store_id, sum(t1.totald/c.cat_value) as total_distance FROM mls_category c
JOIN
(
SELECT SUM(distance) totald, user_id, category FROM mls_entry
WHERE user_id= 1 AND status = 'approved' GROUP BY user_id, category
) t1 ON c.cat_no = t1.category
) d ON b.store_id = d.store_id AND b.maxpoint >= d.total_distance
The above code calculates the value correctly; now I want to JOIN my fourth table.
It currently gives me 60*3 = 180 as the total value. Instead, I want (60+200)*3 = 780 for user 1 and store_id 101, where the bonus type is 'fixed'.
I think your query will look like the one below:
SELECT Max(b.value_per_point)*( max(d.total_distance)+max(bonus_points)) FROM points_matrix b
JOIN
(
SELECT store_id, sum(t1.totald/c.cat_value) as total_distance FROM mls_category c
JOIN
(
SELECT SUM(distance) totald, user_id, category FROM mls_entry
WHERE user_id= 1 AND status = 'approved' GROUP BY user_id, category
) t1 ON c.cat_no = t1.category group by store_id
) d ON b.store_id = d.store_id inner join bonus_points bp on bp.store_id=d.store_id
DEMO fiddle
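A note on the type column, which the query above doesn't use: to reproduce the (60 + 200) * 3 = 780 figure from the question, one option is to add only the 'fixed' bonus rows for the matching store and user to the distance before multiplying. A sketch along those lines (assuming that is the intended behaviour for 'fixed'; the 'per' case isn't spelled out in the question):
SELECT MIN(b.value_per_point)
       * ( d.total_distance
           + COALESCE( ( SELECT SUM(bp.bonus_points)
                         FROM bonus_points bp
                         WHERE bp.store_id = d.store_id
                           AND bp.user_id = 1
                           AND bp.type = 'fixed' ), 0 ) ) AS total_value
FROM points_matrix b
JOIN
(
    SELECT store_id, SUM(t1.totald / c.cat_value) AS total_distance FROM mls_category c
    JOIN
    (
        SELECT SUM(distance) totald, user_id, category FROM mls_entry
        WHERE user_id = 1 AND status = 'approved' GROUP BY user_id, category
    ) t1 ON c.cat_no = t1.category
    GROUP BY store_id
) d ON b.store_id = d.store_id AND b.maxpoint >= d.total_distance
GROUP BY d.store_id, d.total_distance
On the sample data this gives 3 * (60 + 200) = 780.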
We have a table like this:
+----+-------+-----------------+----------------+-----------------+----------------+-----------------+
| ID | Name | RecievedService | FirstZoneTeeth | SecondZoneTeeth | ThirdZoneTeeth | FourthZoneTeeth |
+----+-------+-----------------+----------------+-----------------+----------------+-----------------+
| 1 | John | SomeService1 | 13 | | 4 | |
+----+-------+-----------------+----------------+-----------------+----------------+-----------------+
| 2 | John | SomeService1 | 34 | | | |
+----+-------+-----------------+----------------+-----------------+----------------+-----------------+
| 3 | Steve | SomeService3 | | | | 2 |
+----+-------+-----------------+----------------+-----------------+----------------+-----------------+
| 4 | Steve | SomeService4 | | | | 12 |
+----+-------+-----------------+----------------+-----------------+----------------+-----------------+
Every digit in the zone columns is a tooth (dental notation), and it means "John" has received "SomeService1" twice for tooth #3.
+----+------+-----------------+----------------+-----------------+----------------+-----------------+
| ID | Name | RecievedService | FirstZoneTeeth | SecondZoneTeeth | ThirdZoneTeeth | FourthZoneTeeth |
+----+------+-----------------+----------------+-----------------+----------------+-----------------+
| 1 | John | SomeService1 | 13 | | 4 | |
+----+------+-----------------+----------------+-----------------+----------------+-----------------+
| 2 | John | SomeService1 | 34 | | | |
+----+------+-----------------+----------------+-----------------+----------------+-----------------+
Note that Steve has received services twice for tooth #2 (4th zone), but the services are not the same.
I had written some code that finds duplicate rows (checking only the patient and the received service, using a GROUP BY clause), but I need to check the zones too.
I've tried this:
select ROW_NUMBER() over(order by vv.ID_sick) as RowNum,
bb.Radif,
bb.VCount as 'Count',
vv.ID_sick 'ID_Sick',
vv.ID_service 'ID_Service',
sick.FNamesick + ' ' + sick.LNamesick as 'Sick',
serv.NameService as 'Service',
vv.Mab_Service as 'MabService',
vv.Mab_daryafti as 'MabDaryafti',
vv.datevisit as 'DateVisit',
vv.Zone1,
vv.Zone2,
vv.Zone3,
vv.Zone4,
vv.ID_dentist as 'ID_Dentist',
dent.FNamedentist + ' ' + dent.LNamedentist as 'Dentist',
vv.id_do as 'ID_Do',
do.FNamedentist + ' ' + do.LNamedentist as 'Do'
from visiting vv inner join (
select ROW_NUMBER() OVER(ORDER BY a.ID_sick ASC) AS Radif,
count(a.ID_sick) as VCount,
a.ID_sick,
a.ID_service
from visiting a
group by a.ID_sick, a.ID_service, a.Zone1, a.Zone2, a.Zone3, a.Zone4
having count(a.ID_sick)>1)bb
on vv.ID_sick = bb.ID_sick and vv.ID_service = bb.ID_service
left join InfoSick sick on vv.ID_sick = sick.IDsick
left join infoService serv on vv.ID_service = serv.IDService
left join Infodentist dent on vv.ID_dentist = dent.IDdentist
left join infodentist do on vv.id_do = do.IDdentist
order by bb.ID_sick, bb.ID_service,vv.datevisit
But this code only returns rows where all the teeth are repeated. What I want is rows where even one tooth repeats...
How can I implement it?
I need to check the characters in the zones.
Note: the zone columns' datatype is varchar.
This is a bad data model for what you are trying to do. By storing the teeth as a varchar, you have essentially decided that you are not interested in single teeth, but only in the group of teeth. Now, however, you are trying to investigate single teeth.
You'd want a datamodel like this:
service
+------------+--------+-----------------+
| service_id | Name | RecievedService |
+------------+--------+-----------------+
| 1 | John | SomeService1 |
+------------+--------+-----------------+
| 3 | Steve | SomeService3 |
+------------+--------+-----------------+
| 4 | Steve | SomeService4 |
+------------+--------+-----------------+
service_detail
+------------+------+-------+
| service_id | zone | tooth |
+------------+------+-------+
| 1 | 1 | 1 |
| 1 | 1 | 3 |
| 1 | 3 | 4 |
+------------+------+-------+
| 1 | 1 | 3 |
| 1 | 1 | 4 |
+------------+------+-------+
| 3 | 4 | 2 |
+------------+------+-------+
| 4 | 4 | 1 |
| 4 | 4 | 2 |
+------------+------+-------+
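With a model like that, the duplicate check collapses to a plain GROUP BY / HAVING over the detail rows - a sketch against the two tables above:
select s.service_id, s.name, s.recievedservice, d.zone, d.tooth
from service s
join service_detail d on d.service_id = s.service_id
group by s.service_id, s.name, s.recievedservice, d.zone, d.tooth
having count(*) > 1;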
What you can do with the given data model is to create such a table on the fly, using a recursive query and string manipulation:
with unpivoted(service_id, name, zone, teeth) as
(
select recievedservice, name, 1, firstzoneteeth
from mytable where len(firstzoneteeth) > 0
union all
select recievedservice, name, 2, secondzoneteeth
from mytable where len(secondzoneteeth) > 0
union all
select recievedservice, name, 3, thirdzoneteeth
from mytable where len(thirdzoneteeth) > 0
union all
select recievedservice, name, 4, fourthzoneteeth
from mytable where len(fourthzoneteeth) > 0
)
, service_details(service_id, name, zone, tooth, teeth) as
(
select
service_id, name, zone, substring(teeth, 1, 1), substring(teeth, 2, 10000)
from unpivoted
union all
select
service_id, name, zone, substring(teeth, 1, 1), substring(teeth, 2, 10000)
from service_details
where len(teeth) > 0
)
, duplicates(service_id, name) as
(
select distinct service_id, name
from service_details
group by service_id, name, zone, tooth
having count(*) > 1
)
select m.*
from mytable m
join duplicates d on d.service_id = m.recievedservice and d.name = m.name;
A lot of work and a rather slow query due to a bad data model, but still feasible.
Rextester demo: http://rextester.com/JVWK49901
This post is an enhanced version of my previous post here.
Please note: this is not a duplicate post or thread.
I have 3 tables:
1. REQUIRED_AUDITS (Independent table)
2. SCORE_ENTRY (has a one-to-many relationship with the ERROR table)
3. ERROR
Below are the dummy data and table structure:
REQUIRED_AUDITS TABLE
+-------+------+----------+---------------+-----------------+------------+----------------+---------+
| ID | VP | Director | Manager | Employee | Req_audits | Audit_eligible | Quarter |
+-------+------+----------+---------------+-----------------+------------+----------------+---------+
| 10001 | John | King | susan#com.com | jake#com.com | 2 | Y | FY18Q1 |
| 10002 | John | King | susan#com.com | beth#com.com | 4 | Y | FY18Q1 |
| 10003 | John | Maria | tony#com.com | david#com.com | 6 | N | FY18Q1 |
| 10004 | John | Maria | adam#com.com | william#com.com | 3 | Y | FY18Q1 |
| 10005 | John | Smith | alex#com.com | rose#com.com | 6 | Y | FY18Q1 |
+-------+------+----------+---------------+-----------------+------------+----------------+---------+
SCORE_ENTRY TABLE
+----------------+------+----------+---------------+-----------------+-------+---------+
| SCORE_ENTRY_ID | VP | Director | Manager | Employee | Score | Quarter |
+----------------+------+----------+---------------+-----------------+-------+---------+
| 1 | John | King | susan#com.com | jake#com.com | 100 | FY18Q1 |
| 2 | John | King | susan#com.com | jake#com.com | 90 | FY18Q1 |
| 3 | John | King | susan#com.com | beth#com.com | 98.45 | FY18Q1 |
| 4 | John | King | susan#com.com | beth#com.com | 95 | FY18Q1 |
| 5 | John | King | susan#com.com | beth#com.com | 100 | FY18Q1 |
| 6 | John | King | susan#com.com | beth#com.com | 100 | FY18Q1 |
| 7 | John | Maria | adam#com.com | william#com.com | 99 | FY18Q1 |
| 8 | John | Maria | adam#com.com | william#com.com | 98.1 | FY18Q1 |
| 9 | John | Smith | alex#com.com | rose#com.com | 96 | FY18Q1 |
| 10 | John | Smith | alex#com.com | rose#com.com | 100 | FY18Q1 |
+----------------+------+----------+---------------+-----------------+-------+---------+
ERROR TABLE
+----------+-----------------------------+----------------+
| ERROR_ID | ERROR | SCORE_ENTRY_ID |
+----------+-----------------------------+----------------+
| 10 | Words Missing | 2 |
| 11 | Incorrect document attached | 2 |
| 12 | No results | 3 |
| 13 | Value incorrect | 4 |
| 14 | Words Missing | 4 |
| 15 | No files attached | 4 |
| 16 | Document read error | 7 |
| 17 | Garbage text | 8 |
| 18 | No results | 8 |
| 19 | Value incorrect | 9 |
| 20 | No files attached | 9 |
+----------+-----------------------------+----------------+
I have a query that gives the output below:
+----------+---------------+------------------+------------------+------------------+
| | | Director Summary | | |
+----------+---------------+------------------+------------------+------------------+
| Director | Manager | Audits Required | Audits Performed | Percent Complete |
| King | susan#com.com | 6 | 6 | 100% |
| Maria | adam#com.com | 3 | 2 | 67% |
| Smith | alex#com.com | 6 | 2 | 33% |
+----------+---------------+------------------+------------------+------------------+
Now I would like to add a column for the number of scores that have an error associated with them, divided by the total count of scores.
It's not the total count of errors divided by the count of scores; it's the count of score entries that have at least one error, divided by the count of scores. Please see the example below:
Considering
Director:King
Manager:susan#com.com
From the SCORE_ENTRY and ERROR tables:
King has 6 entries in the SCORE_ENTRY table
and 6 entries in the ERROR table.
Instead of counting those 6 ERROR entries, I want the number of score entries that have at least one error, i.e., 3.
Formula to calculate Quality:
Quality = (1 - (error occurrences / total scores)) * 100
For King:
Quality = (1 - 3/6) * 100
Quality = 50
Please note: it's not (1 - 6/6) * 100.
For Maria:
Quality = (1 - 2/2) * 100
Quality = 0
Below is the new output I need, with a new column called Quality:
+----------+---------------+---------+------------------+------------------+------------------+
| | | | Director Summary | | |
+----------+---------------+---------+------------------+------------------+------------------+
| Director | Manager | Quality | Audits Required | Audits Performed | Percent Complete |
| King | susan#com.com | 50% | 6 | 6 | 100% |
| Maria | adam#com.com | 0% | 3 | 2 | 67% |
| Smith | alex#com.com | 50% | 6 | 2 | 33% |
+----------+---------------+---------+------------------+------------------+------------------+
Below is the query I have (thanks to @Kaushik Nayak, @APC and others); I need to add the new column to it:
WITH aud(manager_email, director, quarter, total_audits_required)
AS (SELECT manager_email,
director,
quarter,
SUM (CASE
WHEN audit_eligible = 'Y' THEN required_audits
END)
FROM required_audits
GROUP BY manager_email,
director,
quarter), --Total_audits
scores(manager_email, director, quarter, audits_completed)
AS (SELECT manager_email,
director,
quarter,
Count (score)
FROM oq_score_entry s
GROUP BY manager_email,
director,
quarter) --Audits_Performed
SELECT a.director,
a.manager_email manager,
a.total_audits_required,
s.audits_completed,
Round(( ( s.audits_completed ) / ( a.total_audits_required ) * 100 ), 2)
percentage_complete,
a.quarter
FROM aud a
left outer join scores s
ON a.manager_email = s.manager_email
WHERE ( :P4_MANAGER_EMAIL = a.manager_email
OR :P4_MANAGER_EMAIL IS NULL )
AND ( :P4_DIRECTOR = a.director
OR :P4_DIRECTOR IS NULL )
AND ( :P4_QUARTER = a.quarter
OR :P4_QUARTER IS NULL )
ORDER BY a.total_audits_required DESC nulls last
Please let me know if it's confusing or if you need more details. I'm open to any suggestions and feedback.
Appreciate any help.
Thanks,
Richa
Update:
Well, my first guess was wrong, and I hope I'm getting it right now.
According to your and shawnt00's comments, you need to compute the count of score entries that have corresponding entries in the ERROR table, and use it in the quality calculation.
You get this count with the expression:
COUNT ((select max(1) from "ERROR" o where o.score_entry_id=s.score_entry_id)) AS error_occurences
max(1) returns 1 when there is at least one matching entry in "ERROR" and NULL otherwise; COUNT skips NULLs. For susan#com.com's six score entries (IDs 1-6), the subquery yields NULL, 1, 1, 1, NULL, NULL, so the count is 3.
I hope this is clear.
Quality is computed as
(1 - error_occurences / audits_completed) * 100
Below is the full script, with manager_email renamed to manager and oq_score_entry renamed to score_entry, in accordance with your schema.
I also removed the unnecessary WITH column mapping; it just complicates things in this case.
WITH aud AS (SELECT manager, director, quarter, SUM (CASE
WHEN audit_eligible = 'Y' THEN req_audits
END) total_audits_required
FROM required_audits
GROUP BY manager, director, quarter), --Total_audits
scores AS (
SELECT manager, director, quarter,
Count (score) audits_completed,
COUNT ((select max(1) from "ERROR" o where o.score_entry_id=s.score_entry_id)
) error_occurences -- ** Added **
FROM score_entry s
GROUP BY manager, director, quarter
) --Audits_Performed
SELECT a.director,
a.manager manager,
a.total_audits_required,
s.audits_completed,
Round(( 1 - ( s.error_occurences ) / ( s.audits_completed )) * 100, 2) quality, -- ** Added **
Round(( ( s.audits_completed ) / ( a.total_audits_required ) * 100 ), 2)
percentage_complete,
a.quarter
FROM aud a
left outer join scores s ON a.manager = s.manager
WHERE ( :P4_manager = a.manager
OR :P4_manager IS NULL )
AND ( :P4_DIRECTOR = a.director
OR :P4_DIRECTOR IS NULL )
AND ( :P4_QUARTER = a.quarter
OR :P4_QUARTER IS NULL )
ORDER BY a.total_audits_required DESC nulls last
About total_errors:
To add this column you can either use a technique similar to the one used before in scores:
scores AS (
SELECT manager, director, quarter,
count (score) audits_completed,
count ((select max(1) from "ERROR" o where o.score_entry_id=s.score_entry_id )
) error_occurences,
sum ( ( SELECT count(*) from "ERROR" o where o.score_entry_id=s.score_entry_id )
) total_errors -- summing error counts for matched score_entry_ids
FROM score_entry s
GROUP BY manager, director, quarter
)
Or you can rewrite the scores CTE joining score_entry and error, and that would require using DISTINCT on score_entry fields to avoid duplication of rows:
scores AS (
SELECT manager, director, quarter,
count(DISTINCT s.score_entry_id) audits_completed,
count(DISTINCT e.score_entry_id ) error_occurences, -- counting distinct score_entry_ids present in Error
count(e.score_entry_id) total_errors -- counting total rows in Error
FROM score_entry s
LEFT JOIN "ERROR" e ON s.score_entry_id=e.score_entry_id
GROUP BY manager, director, quarter
)
The latter approach is a bit less maintainable, since it requires care to avoid unwanted duplication.
Yet another (and maybe the most proper) way is to make a separate (third) CTE, but I don't think the query is complex enough to warrant this.
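If you ever do want to go that way, the body of that third CTE might look something like this (a sketch, not tested), LEFT OUTER JOINed to the rest on manager (plus director and quarter once you have more data):
SELECT s.manager, s.director, s.quarter,
       COUNT(DISTINCT e.score_entry_id) AS error_occurences,
       COUNT(e.score_entry_id) AS total_errors
FROM score_entry s
JOIN "ERROR" e ON e.score_entry_id = s.score_entry_id
GROUP BY s.manager, s.director, s.quarter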
Original answer:
I might be wrong, but it seems to me that by "count of each occurrence of error" you are trying to describe COUNT(DISTINCT expr), that is, counting unique occurrences of error for each (manager_email, director, quarter).
If so, change the query a bit:
WITH aud(manager_email, director, quarter, total_audits_required)
AS (SELECT manager_email,
director,
quarter,
SUM (CASE
WHEN audit_eligible = 'Y' THEN required_audits
END)
FROM required_audits
GROUP BY manager_email,
director,
quarter), --Total_audits
scores(manager_email, director, quarter, audits_completed, distinct_errors)
AS (SELECT manager_email,
director,
quarter,
Count (score),
COUNT (DISTINCT o.error_id) -- ** Added **
FROM oq_score_entry s join error o on o.score_entry_id=s.score_entry_id
GROUP BY manager_email,
director,
quarter) --Audits_Performed
SELECT a.director,
a.manager_email manager,
a.total_audits_required,
s.audits_completed,
Round(( ( s.distinct_errors ) / ( s.audits_completed ) * 100 ), 2) quality, -- ** Added **
Round(( ( s.audits_completed ) / ( a.total_audits_required ) * 100 ), 2)
percentage_complete,
a.quarter
FROM aud a
left outer join scores s
ON a.manager_email = s.manager_email
WHERE ( :P4_MANAGER_EMAIL = a.manager_email
OR :P4_MANAGER_EMAIL IS NULL )
AND ( :P4_DIRECTOR = a.director
OR :P4_DIRECTOR IS NULL )
AND ( :P4_QUARTER = a.quarter
OR :P4_QUARTER IS NULL )
ORDER BY a.total_audits_required DESC nulls last
The join on your main query will need to include director and quarter once you have more data.
I suppose the easiest way to fix this is to follow the structure you've got and add another table expression joining it to the rest of your results in the same way as the original two.
select manager_email, director, quarter,
100.0 - 100.0 * count (distinct e.score_entry_id) / count (*) as quality
from score_entry se left outer join error e
on e.score_entry_id = se.score_entry_id
group by manager_email, director, quarter
What would have made most of your explanation unnecessary is to have simply said that you want the number of scores that have an error associated with them. It was difficult to draw that out from the information you provided.
Firstly, sorry about the question title; I'm not up on statistics parlance, or on whatever kind of join difficulty this is.
I have a query* with which I essentially generate three things: a random_sex, a random_first, and a random_last. I'm now trying to join on these values.
random_sex | random_first | random_last
------------+------------------+------------------
male | 47.7101715711225 | 24.3833348881337
male | 72.8463141907472 | 28.3560050522089
female | 72.8617294209544 | 33.3203859277759
male | 39.3406164890062 | 26.3352867371729
female | 28.6855500966031 | 65.8870893270099
female | 35.5960198949557 | 83.1188118207422
male | 11.5711074977927 | 10.544433838184
male | 15.6900786811765 | 18.7324617852545
male | 24.9860797089245 | 8.98265511383023
female | 80.4563122882508 | 35.594445341751
(10 rows)
Essentially the census data sits in a table like this...
name | freq | cumfreq | rank | name_type
------------+-------+---------+------+-----------
SMITH | 1.006 | 1.006 | 1 | LAST
JOHNSON | 0.81 | 1.816 | 2 | LAST
WILLIAMS | 0.699 | 2.515 | 3 | LAST
JONES | 0.621 | 3.136 | 4 | LAST
BROWN | 0.621 | 3.757 | 5 | LAST
DAVIS | 0.48 | 4.237 | 6 | LAST
MILLER | 0.424 | 4.66 | 7 | LAST
WILSON | 0.339 | 5 | 8 | LAST
MOORE | 0.312 | 5.312 | 9 | LAST
TAYLOR | 0.311 | 5.623 | 10 | LAST
ANDERSON | 0.311 | 5.934 | 11 | LAST
THOMAS | 0.311 | 6.245 | 12 | LAST
JACKSON | 0.31 | 6.554 | 13 | LAST
WHITE | 0.279 | 6.834 | 14 | LAST
HARRIS | 0.275 | 7.109 | 15 | LAST
MARTIN | 0.273 | 7.382 | 16 | LAST
THOMPSON | 0.269 | 7.651 | 17 | LAST
GARCIA | 0.254 | 7.905 | 18 | LAST
MARTINEZ | 0.234 | 8.14 | 19 | LAST
And, in this case..
random_sex | random_first | random_last
male | 47.7101715711225 | 24.3833348881337
I want it to be joined like this (procedurally):
=# select * from census.names where cumfreq > 47.7101715711225 AND name_type = 'MALE_FIRST' order by cumfreq asc limit 1;
name | freq | cumfreq | rank | name_type
--------+-------+---------+------+------------
SILVER | 0.009 | 47.717 | 1424 | MALE_FIRST
=# select * from census.names where cumfreq > 24.3833348881337 AND name_type = 'LAST' order by cumfreq asc limit 1;
name | freq | cumfreq | rank | name_type
--------+-------+---------+------+-----------
HARPER | 0.054 | 24.408 | 185 | LAST
So this gent's name would be Silver Harper. I've never met one in my life, but they do exist.
I'd like to return "Silver" "Harper" in the above query rather than random numbers. How can I make it work like this?
FOOTNOTE
*: Just to keep it simple:
SELECT
CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
, RANDOM() * 90.020 AS random_first -- dataset is 90% of most popular
, RANDOM() * 90.483 AS random_last
FROM generate_series(1,10,1);
I actually don't know much about statistics either, but I think this is what you want.
Let's name the derived table that returns the random columns RANDOMS:
WITH RANDOMS AS
(
SELECT
CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS random_sex
, RANDOM() * 90.020 AS random_first
, RANDOM() * 90.483 AS random_last
FROM generate_series(1,10,1)
)
SELECT (
SELECT A.NAME
FROM census.names A
WHERE A.cumfreq > R.random_first
AND A.name_type = 'MALE_FIRST'
order by A.cumfreq asc limit 1
),
(
SELECT A.NAME
FROM census.names A
WHERE A.cumfreq > R.random_last
AND A.name_type = 'LAST'
order by A.cumfreq asc limit 1
) AS NAME
FROM RANDOMS R ;
Correlated sub-queries?
SELECT
*
FROM
yourRandomTable
INNER JOIN
census.names AS first_name
ON first_name.cumfreq = (SELECT MIN(cumfreq)
                         FROM census.names
                         WHERE cumfreq > yourRandomTable.random_first
                         AND name_type = upper(yourRandomTable.random_sex) || '_FIRST')
AND first_name.name_type = upper(yourRandomTable.random_sex) || '_FIRST'
INNER JOIN
    census.names AS last_name
ON last_name.cumfreq = (SELECT MIN(cumfreq)
                        FROM census.names
                        WHERE cumfreq > yourRandomTable.random_last
                        AND name_type = 'LAST')
AND last_name.name_type = 'LAST'
You can vary this pattern quite a lot. Exactly how you choose to do it depends on how you have set up your indexes.
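For example, if census.names doesn't already have one, a composite index on (name_type, cumfreq) should let each of those lookups run as a small range scan (an assumption about your schema; adjust the index name to taste):
CREATE INDEX names_name_type_cumfreq_idx ON census.names (name_type, cumfreq);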
EXPLAIN ANALYZE SELECT
r.sex
, r.detail
, COALESCE(
(SELECT name FROM census.names AS mf WHERE r.sex = 'male' AND mf.name_type = 'MALE_FIRST' AND mf.cumfreq > r.first ORDER BY cumfreq LIMIT 1)
, (SELECT name FROM census.names AS ff WHERE r.sex = 'female' AND ff.name_type = 'FEMALE_FIRST' AND ff.cumfreq > r.first ORDER BY cumfreq LIMIT 1)
) AS first
, (SELECT name FROM census.names AS l WHERE l.name_type = 'LAST' AND l.cumfreq > r.last ORDER BY cumfreq LIMIT 1) AS last
FROM (
SELECT
RANDOM() * 90.020 AS first
, RANDOM() * 90.483 AS last
, CASE WHEN RANDOM() > 0.5 THEN 'male' ELSE 'female' END AS sex
FROM generate_series(1,10,1)
) AS r;
This is actually what I ended up going with.