SQL Query (incorrect use of joins?)

I'm trying to write an (Oracle) SQL query that, given an "agent_id", would give me a list of questions that agent has answered during an assessment, as well as an average score over all of the times that agent has answered those questions.
Note: I tried to design the query such that it would support multiple employees (so we can query at the store level), hence the "IN" condition in the where clause.
Here's what I have so far:
select question.question_id as questionId,
((sum(answer.answer) / count(answer.answer)) * 100) as avgScore
from SPMADMIN.SPM_QC_ASSESSMENT_ANSWER answer
join SPMADMIN.SPM_QC_QUESTION question
on answer.question_id = question.question_id
join SPMADMIN.SPM_QC_ASSESSMENT assessment
on answer.assessment_id = assessment.assessment_id
join SPMADMIN.SPM_QC_SUB_GROUP_TYPE sub_group
on question.sub_group_type_id = sub_group.sub_group_id
join SPMADMIN.SPM_QC_GROUP_TYPE theGroup
on sub_group.group_id = theGroup.group_id
where question.question_id in (select distinct question2.question_id
from SPMADMIN.SPM_QC_QUESTION question2
)
and question.bool_yn_active_flag = 'Y'
and assessment.agent_id in (?)
and answer.answer is not null
order by theGroup.page_order asc,
sub_group.group_order asc,
question.sub_group_order asc
Basically I would want to see:
|questionId|avgScore|
| 1 | 100 |
| 2 | 50 |
| 3 | 75 |
Such that every question that employee has ever answered is in the list of question indexes with their average score over all of the times they've answered it.
When I run it as is, I get an "ORA-00937: not a single-group group function" error. No combination of "group by" clauses I've tried has helped in the least.
When I run it removing the question.question_id as questionId, part of the select, it runs fine, but it shows their average score over all questions. I need it broken down by question.
Any help or pointers would be greatly appreciated.

When you have an aggregate function in the SELECT list (SUM and COUNT are aggregate functions), then any other columns in the SELECT list need to be in a GROUP BY clause. For example:
SELECT fi, COUNT(fo)
FROM fum
GROUP BY fi
The COUNT(fo) expression is an aggregate; the fi column is a non-aggregate. If you were to add another non-aggregate to the SELECT list, it would also need to be included in the GROUP BY. For example:
SELECT TRUNC(fee), fi, COUNT(fo)
FROM fum
GROUP BY TRUNC(fee), fi
To be a little more precise, rather than say "columns in the SELECT list", we should actually say "all non-aggregate expressions in the SELECT list" will need to be included in the GROUP BY clause.

It's not your joins but your use of GROUP BY.
When you use a GROUP BY in SQL, the things you GROUP BY are the things which define the groups. Everything else in your SELECT has to be inside aggregates which operate over the group.
You can also do aggregates over the entire set without a GROUP BY, but then every column will need to be within an aggregate function.
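Applied to the query in the question, that could look something like the sketch below. It is untested against the real schema and makes two assumptions: each question belongs to exactly one sub-group and one group (so adding the ordering columns to the GROUP BY does not split a question into several rows), and the IN (select distinct question2.question_id ...) filter can be dropped because it matches every question.

```sql
-- Sketch only: one row per question, averaged over all matching answers.
SELECT question.question_id AS questionId,
       ((SUM(answer.answer) / COUNT(answer.answer)) * 100) AS avgScore
FROM SPMADMIN.SPM_QC_ASSESSMENT_ANSWER answer
JOIN SPMADMIN.SPM_QC_QUESTION question
  ON answer.question_id = question.question_id
JOIN SPMADMIN.SPM_QC_ASSESSMENT assessment
  ON answer.assessment_id = assessment.assessment_id
JOIN SPMADMIN.SPM_QC_SUB_GROUP_TYPE sub_group
  ON question.sub_group_type_id = sub_group.sub_group_id
JOIN SPMADMIN.SPM_QC_GROUP_TYPE theGroup
  ON sub_group.group_id = theGroup.group_id
WHERE question.bool_yn_active_flag = 'Y'
  AND assessment.agent_id IN (?)
  AND answer.answer IS NOT NULL
-- Every non-aggregate expression used in SELECT or ORDER BY is listed here.
GROUP BY question.question_id,
         theGroup.page_order,
         sub_group.group_order,
         question.sub_group_order
ORDER BY theGroup.page_order ASC,
         sub_group.group_order ASC,
         question.sub_group_order ASC
```

The ordering columns have to appear in the GROUP BY (or be wrapped in an aggregate such as MIN) because Oracle applies the same rule to ORDER BY expressions in a grouped query.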

Related

Why can't I get GROUP BY to work in my LEFT JOIN in Access

I'm trying to populate a combobox from two tables in Access (2007-2016 file format).
I have two tables:
tblSurveyStatus
|SurveyID|SurveyStatus|
| 1 | Y |
| 2 | N |
| 3 | N/A |
tblWorkOrder
|WONumber|SurveyedID|
| WO2101 | 1 |
| WO2102 | 1 |
| WO2103 | 2 |
| WO2104 | 3 |
| WO2105 | 2 |
| WO2106 | {Empty} |
| WO2107 | {Empty} |
Desired Result:
|WONumber (this col will get hidden)|SurveyStatus|
| WO2101 | Y |
| WO2103 | N |
| WO2104 | N/A |
This query works in the datasource for the combobox without using GROUP BY:
SELECT SurveyedID, SurveyStatus
FROM [tblWorkOrders] a
LEFT JOIN (
SELECT SurveyID, SurveyStatus
FROM [tblSurveyStatus]
) b
ON a.SurveyedID = b.SurveyID
ORDER BY b.SurveyID
The problem with this query is that it returns duplicates (Y,Y,N,N,N/A).
So I introduced the GROUP BY like this:
SELECT SurveyedID, SurveyStatus
FROM [tblWorkOrders] a
LEFT JOIN (
SELECT SurveyID, SurveyStatus
FROM [tblSurveyStatus]
) b
ON a.SurveyedID = b.SurveyID
GROUP BY a.SurveyStatus ORDER BY b.SurveyID
This causes the error message "Your query does not include the specified expression 'SurveyedID' as part of an aggregate function." So I wrapped SurveyedID in MIN(), and the error moved to the next field. I kept adding MIN() to the SQL until it finally ran, but now I get an input box asking for "SurveyStatus" and then another one asking for "SurveyID".
I have spent three solid days researching, reading threads on this website and many others without success. I am not a programmer but I kind of understand the basics. My programming basically comes from finding snippets of code and altering them for my use. Please Help!
Your SELECT list has two columns, SurveyedID and SurveyStatus, but your GROUP BY names only SurveyStatus, which is not valid. Every non-aggregate column in the SELECT list must also appear in the GROUP BY. That is why wrapping those columns in MIN() cleared the error: it turned the non-aggregate columns into aggregates.
Adding SurveyedID to your GROUP BY clause should also resolve the issue.
And if your only goal is to avoid duplicates, just put DISTINCT before the column list in the SELECT.
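As a rough sketch of both suggestions (untested in Access): the table is assumed to be named tblWorkOrder as in the listing, although the queries in the question call it tblWorkOrders, and the aliases and the FirstWONumber column name are made up for illustration. The second statement uses an INNER JOIN so that work orders with an empty SurveyedID drop out, which seems to be what the desired result shows.

```sql
-- Option 1: DISTINCT removes the repeated (SurveyedID, SurveyStatus) pairs.
SELECT DISTINCT a.SurveyedID, b.SurveyStatus
FROM tblWorkOrder AS a
INNER JOIN tblSurveyStatus AS b
ON a.SurveyedID = b.SurveyID
ORDER BY a.SurveyedID;

-- Option 2: GROUP BY with one work order number per status.
-- Every selected column is either grouped or inside an aggregate.
SELECT MIN(a.WONumber) AS FirstWONumber, b.SurveyStatus
FROM tblWorkOrder AS a
INNER JOIN tblSurveyStatus AS b
ON a.SurveyedID = b.SurveyID
GROUP BY b.SurveyID, b.SurveyStatus
ORDER BY b.SurveyID;
```

The parameter prompts in the question are most likely Access failing to resolve a name: SurveyStatus lives in the subquery aliased b, not in a, so a.SurveyStatus is treated as an unknown parameter.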

Make a WHERE clause affect a single column in a query

As the title says, I am looking to make a WHERE clause affect a single column and am having issues. My query is as follows:
Select exp, COUNT(grade), COUNT(exp)
FROM table
WHERE grade = 100
GROUP BY exp;
Essentially I want output that groups by exp and gives a full count of everyone with that exp, with a second column showing only how many of those people got perfect scores. The problem is that the current WHERE also affects COUNT(exp). I'm a beginner at SQL, so sorry if this is simple, and thanks for any help.
You want conditional aggregation, which in Postgres uses filter:
SELECT exp, COUNT(*),
COUNT(*) filter (where grade = 100)
FROM table
GROUP BY exp;
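If the database does not support FILTER, the same conditional aggregation can be written with a CASE expression. A minimal sketch, where results is a hypothetical table name standing in for the table placeholder in the question, and the column aliases are also made up:

```sql
-- Count everyone per exp, and separately count only the perfect scores.
SELECT exp,
       COUNT(*) AS total_people,
       SUM(CASE WHEN grade = 100 THEN 1 ELSE 0 END) AS perfect_scores
FROM results
GROUP BY exp;
```

Because there is no WHERE clause, every row still contributes to COUNT(*); only the CASE decides whether a row adds to the second column.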

SQL Nested Query with distinct count

I have a dilemma, and I'm hoping someone will be able to help me out. I am working on some made-up problems using data from an old textbook of mine. This isn't a question from the book, but the data is; I just wanted to see if I could still work in SQL. So here goes. When this code is executed,
SELECT COUNT(code_description) "Number of Different Crimes", last, first,
code_description
FROM
(
SELECT criminal_id, last, first, crime_code, code_description
FROM criminals
JOIN crimes USING (criminal_id)
JOIN crime_charges USING (crime_id)
JOIN crime_codes USING (crime_code)
ORDER BY criminal_id
)
WHERE criminal_id = 1020
GROUP BY last, first, code_description;
I am provided with these results:
Number of Different Crimes LAST FIRST CODE_DESCRIPTION
1 Phelps Sam Agg Assault
1 Phelps Sam Drug Offense
Ultimately, I would like the number of different crimes to be 2 on each line, since this criminal has two unique crimes charged to him. I would like it to be displayed something like:
Number of Different Crimes LAST FIRST CODE_DESCRIPTION
2 Phelps Sam Agg Assault
2 Phelps Sam Drug Offense
Not to push my luck, but I would also like to replace the following line:
WHERE criminal_id = 1020
with something a little more elegant that matches any criminal with more than one crime type associated with them. In this data set, Sam Phelps is the only one.
As @sgeddes said in a comment, you can use an analytic count, which doesn't need a subquery if you're specifying the criminal ID:
SELECT COUNT(code_description) OVER (PARTITION BY first, last) AS "Number of Different Crimes",
last, first, code_description
FROM criminals
JOIN crimes USING (criminal_id)
JOIN crime_charges USING (crime_id)
JOIN crime_codes USING (crime_code)
WHERE criminal_id = 1020;
If you want to look for anyone with multiple crimes then you do need a subquery so you can filter on the analytic result:
SELECT charge_count AS "Number of Different Crimes",
last, first, code_description
FROM (
SELECT COUNT(DISTINCT code_description) OVER (PARTITION BY first, last) AS charge_count,
criminal_id, last, first, code_description
FROM criminals
JOIN crimes USING (criminal_id)
JOIN crime_charges USING (crime_id)
JOIN crime_codes USING (crime_code)
)
WHERE charge_count > 1
ORDER BY criminal_id, code_description;
SQL Fiddle demo.
If the charges are spread across multiple crimes but duplicated, then the distinct count still works, but you might want to add a DISTINCT to the overall result set (unless you want to show other crime-specific info); otherwise you get duplicate rows.
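A sketch of that DISTINCT variant, assuming the same tables as above. The ORDER BY is changed to columns that appear in the select list, since Oracle does not allow ordering a SELECT DISTINCT by an expression that is not selected:

```sql
-- One row per criminal and crime type, even if the same charge
-- appears on several crimes.
SELECT DISTINCT charge_count AS "Number of Different Crimes",
       last, first, code_description
FROM (
    SELECT COUNT(DISTINCT code_description)
               OVER (PARTITION BY first, last) AS charge_count,
           last, first, code_description
    FROM criminals
    JOIN crimes USING (criminal_id)
    JOIN crime_charges USING (crime_id)
    JOIN crime_codes USING (crime_code)
)
WHERE charge_count > 1
ORDER BY last, first, code_description;
```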

Nested subquery in Access alias causing "enter parameter value"

I'm using Access (I normally use SQL Server) for a little job, and I'm getting "enter parameter value" for Night.NightId in the statement below that has a subquery within a subquery. I expect it would work if I wasn't nesting it two levels deep, but I can't think of a way around it (query ideas welcome).
The scenario is pretty simple, there's a Night table with a one-to-many relationship to a Score table - each night normally has 10 scores. Each score has a bit field IsDouble which is normally true for two of the scores.
I want to list all of the nights, with a number next to each representing how many of the top 2 scores were marked IsDouble (would be 0, 1 or 2).
Here's the SQL, I've tried lots of combinations of adding aliases to the column and the tables, but I've taken them out for simplicity below:
select Night.*
,
( select sum(IIF(IsDouble,1,0)) from
(SELECT top 2 * from Score where NightId=Night.NightId order by Score desc, IsDouble asc, ID)
) as TopTwoMarkedAsDoubles
from Night
This is a bit of speculation. However, some databases have issues with correlation conditions in multiply nested subqueries. MS Access might have this problem.
If so, you can solve this by using aggregation with a where clause that chooses the top two values:
select s.nightid,
sum(IIF(IsDouble, 1, 0)) as TopTwoMarkedAsDoubles
from Score as s
where s.id in (select top 2 s2.id
from score as s2
where s2.nightid = s.nightid
order by s2.score desc, s2.IsDouble asc, s2.id
)
group by s.nightid;
If this works, it is a simple matter to join Night back in to get the additional columns.
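Joining Night back in might look like the sketch below. Like the query above it is speculative and untested in Access; the aliases n and t are made up for illustration.

```sql
-- All nights, with the count of top-two scores marked IsDouble.
SELECT n.*, t.TopTwoMarkedAsDoubles
FROM Night AS n
LEFT JOIN (
    SELECT s.NightId,
           SUM(IIF(s.IsDouble, 1, 0)) AS TopTwoMarkedAsDoubles
    FROM Score AS s
    WHERE s.ID IN (SELECT TOP 2 s2.ID
                   FROM Score AS s2
                   WHERE s2.NightId = s.NightId
                   ORDER BY s2.Score DESC, s2.IsDouble ASC, s2.ID)
    GROUP BY s.NightId
) AS t
ON n.NightId = t.NightId;
```

The LEFT JOIN keeps nights that have no scores at all; for those rows TopTwoMarkedAsDoubles comes back as Null rather than 0.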
Your subquery can only see one level above it, so Night.NightId is unknown to it, which is why you are being prompted to enter a value. You can use a GROUP BY to get the value you want for each NightId, then correlate that back to the original Night table.
Select *
From Night
left join (
Select N.NightId
, sum(IIF(S.IsDouble,1,0)) as [Number of Doubles]
from Night N
inner join Score S
on N.NightId = S.NightId
group by N.NightId) NightsWithScores
on Night.NightId = NightsWithScores.NightId
Because of the IIF(S.IsDouble,1,0), I don't see the point in using TOP.

Oracle Group by issue

I have the query below. The problem is that the subquery for the last column, productdescr, returns two distinct records, so the query fails. I need to add one more column to the WHERE clause of that select so that it returns one record. The issue is that the column I need to add should not be part of the GROUP BY clause.
SELECT product_billing_id,
billing_ele,
SUM(round(summary_net_amt_excl_gst/100)) gross,
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele) productdescr
FROM bil.bill_sum aa
WHERE file_id = 38613 --1=1
AND line_type = 'D'
AND (product_billing_id, billing_ele) IN (SELECT DISTINCT
product_billing_id,
billing_ele
FROM bil.bill_l2 )
AND trans_type_desc <> 'Change'
GROUP BY product_billing_id, billing_ele
I want to modify the select statement as shown below, adding a new filter to its WHERE clause so that it returns one record.
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele
AND (rate_structure_start_date <= TO_DATE(aa.p_effective_date,'yyyymmdd')
AND rate_structure_end_date > TO_DATE(aa.p_effective_date,'yyyymmdd'))
) productdescr
The aa.p_effective_date should not be a part of GROUP BY clause. How can I do it? Oracle is the Database.
So there are multiple RES.tariff records for a given product_billing_id/billing_ele, differentiated by the start/end dates.
You want the description for the record that encompasses the 'p_effective_date' from bil.bill_sum. The kicker is that you can't (or don't want to) include that in the group by. That suggests you've got multiple rows in bil.bill_sum with different effective dates.
The issue is what you want to happen when you are summarising multiple rows with different dates: which of those dates do you want to use to get the description?
If it doesn't matter, simply use MIN(aa.p_effective_date), or MAX.
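One way to wire that in, as an untested sketch: aggregate bil.bill_sum in an inline view first, carrying MAX(p_effective_date) along as a single date per group, and then look up the description from the outer query. The alias s and the column name effective_date are made up for illustration, and MAX on the raw value assumes p_effective_date is stored as a 'yyyymmdd' string, so that the largest string is also the latest date.

```sql
SELECT s.product_billing_id,
       s.billing_ele,
       s.gross,
       (SELECT DISTINCT t.description
        FROM RES.tariff_nt t
        WHERE t.product_billing_id = s.product_billing_id
          AND t.billing_ele = s.billing_ele
          AND t.rate_structure_start_date <= s.effective_date
          AND t.rate_structure_end_date > s.effective_date) productdescr
FROM (
    -- The grouping is unchanged; the date is aggregated, not grouped.
    SELECT product_billing_id,
           billing_ele,
           SUM(ROUND(summary_net_amt_excl_gst / 100)) gross,
           TO_DATE(MAX(p_effective_date), 'yyyymmdd') effective_date
    FROM bil.bill_sum
    WHERE file_id = 38613
      AND line_type = 'D'
      AND (product_billing_id, billing_ele) IN (SELECT product_billing_id,
                                                       billing_ele
                                                FROM bil.bill_l2)
      AND trans_type_desc <> 'Change'
    GROUP BY product_billing_id, billing_ele
) s;
```

Swap MAX for MIN if the earliest effective date is the one that should pick the description.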
Have you looked into Oracle's analytic functions? This is a good link: Analytical Functions by Example.