Why can't I get GROUP BY to work in my LEFT JOIN in Access - sql

I'm trying to populate a combobox from two tables in Access (2007-2016 file format).
I have two tables:
tblSurveyStatus
SurveyID  SurveyStatus
1         Y
2         N
3         N/A
tblWorkOrder
WONumber  SurveyedID
WO2101    1
WO2102    1
WO2103    2
WO2104    3
WO2105    2
WO2106    {Empty}
WO2107    {Empty}
Desired Result:
WONumber (this col will get hidden)  SurveyStatus
WO2101                               Y
WO2103                               N
WO2104                               N/A
This query works in the datasource for the combobox without using GROUP BY:
SELECT SurveyedID, SurveyStatus
FROM [tblWorkOrders] a
LEFT JOIN (
SELECT SurveyID, SurveyStatus
FROM [tblSurveyStatus]
) b
ON a.SurveyedID = b.SurveyID
ORDER BY b.SurveyID
The problem with this query is that it returns duplicates (Y,Y,N,N,N/A).
So I introduced the GROUP BY like this:
SELECT SurveyedID, SurveyStatus
FROM [tblWorkOrders] a
LEFT JOIN (
SELECT SurveyID, SurveyStatus
FROM [tblSurveyStatus]
) b
ON a.SurveyedID = b.SurveyID
GROUP BY a.SurveyStatus ORDER BY b.SurveyID
This causes the error message "Your query does not include the specified expression 'SurveyedID' as part of an aggregate function." So I wrapped it in MIN(SurveyedID), and the error moved to the next field. I kept adding MIN() in the SQL until it finally ran, but now I get an input box asking for "SurveyStatus" and then another one asking for "SurveyID".
I have spent three solid days researching, reading threads on this website and many others without success. I am not a programmer but I kind of understand the basics. My programming basically comes from finding snippets of code and altering them for my use. Please Help!

Your SELECT list uses two columns, SurveyedID and SurveyStatus, but your GROUP BY uses only SurveyStatus, which is not valid. Every non-aggregate column in the SELECT list must also appear in the GROUP BY. That is why wrapping those columns in MIN() made the error go away: it turned the non-aggregate columns into aggregate ones.
Adding SurveyedID to your GROUP BY clause as well should resolve the issue.
Also, if your sole aim is to avoid duplicates, just put DISTINCT before the list of columns in the SELECT.
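Here is a rough sketch of both suggestions, run against an in-memory SQLite database rather than Access (that substitution is an assumption made only so the example can be executed; Access SQL differs in places). It uses the sample rows from the question, and an INNER JOIN so the work orders with an empty SurveyedID drop out. The second query groups by the status columns and takes MIN(WONumber), which is one way to get the "Desired Result" table.

```python
import sqlite3

# In-memory SQLite stand-in for the Access tables (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblSurveyStatus (SurveyID INTEGER, SurveyStatus TEXT);
    CREATE TABLE tblWorkOrder (WONumber TEXT, SurveyedID INTEGER);
    INSERT INTO tblSurveyStatus VALUES (1, 'Y'), (2, 'N'), (3, 'N/A');
    INSERT INTO tblWorkOrder VALUES
        ('WO2101', 1), ('WO2102', 1), ('WO2103', 2), ('WO2104', 3),
        ('WO2105', 2), ('WO2106', NULL), ('WO2107', NULL);
""")

# Option 1: DISTINCT removes the duplicate (SurveyedID, SurveyStatus) pairs.
distinct_rows = conn.execute("""
    SELECT DISTINCT a.SurveyedID, b.SurveyStatus
    FROM tblWorkOrder a
    INNER JOIN tblSurveyStatus b ON a.SurveyedID = b.SurveyID
    ORDER BY a.SurveyedID
""").fetchall()

# Option 2: every non-aggregate column is in GROUP BY; WONumber is aggregated.
grouped_rows = conn.execute("""
    SELECT MIN(a.WONumber) AS WONumber, b.SurveyStatus
    FROM tblWorkOrder a
    INNER JOIN tblSurveyStatus b ON a.SurveyedID = b.SurveyID
    GROUP BY b.SurveyID, b.SurveyStatus
    ORDER BY b.SurveyID
""").fetchall()

print(distinct_rows)  # [(1, 'Y'), (2, 'N'), (3, 'N/A')]
print(grouped_rows)   # [('WO2101', 'Y'), ('WO2103', 'N'), ('WO2104', 'N/A')]
```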

Related

How to select multiple values with the same ids and put them in one row, while maintaining the id to value connection?

I have a processknowledgeentry table that has the following data:
pke_id prc_id knw_id
1 1 2
2 1 4
3 2 4
The column knw_id references another table called knowledge, which also has its own id column. I want to be able to select all knw_id values with the same prc_id, and have them retain its nature as an id (so that it remains referenceable to the knowledge table).
Desired result:
prc_id knw_ids
1 [2, 4]
My code is shown below. (It also selects a Process Name from another table called process by inner joining the prc_ids. That part works correctly at least.)
SELECT * FROM (
SELECT
p.prc_name,
(SELECT knw_id
FROM processknowledgeentry
GROUP BY knw_id
HAVING COUNT(*) > 1)
FROM processknowledgeentry pke
INNER JOIN process p
ON pke.prc_id=p.prc_id
WHERE pke.prc_id = %s) as temp
I get the error: "CardinalityViolation: more than one row returned by a subquery used as an expression", and I understand why the error exists, so I want to know how to work around it. I'm also not sure if my logic is correct.
Would appreciate any assistance, thank you!
It seems you need STRING_AGG() (the counterpart of the GROUP_CONCAT() that some other DBMSs have), which takes a string-typed value as its first argument, together with a HAVING clause that keeps only the prc_id values occurring more than once, such as:
SELECT p.prc_id, STRING_AGG(knw_id::TEXT,',') AS knw_ids
FROM processknowledgeentry pke
JOIN process p
ON pke.prc_id = p.prc_id
-- WHERE pke.prc_id = %s
GROUP BY p.prc_id
HAVING COUNT(pke.prc_id) > 1
In this case, a WHERE clause isn't needed.
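A small runnable sketch of the same idea, assuming SQLite in place of PostgreSQL (so GROUP_CONCAT stands in for STRING_AGG, and no ::TEXT cast is needed). The aggregated string is split back into integers afterwards so the ids stay usable as references to the knowledge table.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processknowledgeentry (pke_id INTEGER, prc_id INTEGER, knw_id INTEGER);
    INSERT INTO processknowledgeentry VALUES (1, 1, 2), (2, 1, 4), (3, 2, 4);
""")

rows = conn.execute("""
    SELECT prc_id, GROUP_CONCAT(knw_id, ',') AS knw_ids
    FROM processknowledgeentry
    GROUP BY prc_id
    HAVING COUNT(*) > 1
""").fetchall()

# Split each aggregated string back into integer ids; sort because the
# concatenation order inside a group is not guaranteed.
knw_ids_by_process = {
    prc_id: sorted(int(k) for k in ids.split(","))
    for prc_id, ids in rows
}
print(knw_ids_by_process)  # {1: [2, 4]}
```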

Transposing and summing the top 5 results in Teradata SQL Assistant

I have a query that I converted from Access and is currently working correctly in Teradata SQL Assistant. The data pulled is just a standard table full of all of the data I need.
What I am wondering is: can something be added to this query that will sum up all of the Exposure values and then show only the top 5 Divisions, ordered from greatest to smallest sum? I would also like to transpose the data so that my Topics are the leftmost column.
Here is the working code, details omitted.
SELECT
A.AS_OF_DT
, B.DIVISION
, B.CLASS
, Sum(A.BALANCE/1000000) AS "Bal in MMs"
, Sum(A.EXPOSURE/1000000) AS "Exp in MMs"
, Sum(CASE WHEN A.STATUS = 'NACC' THEN (B.BALANCE/1000000) ELSE 0 END) AS "NPL Bal as MMs"
FROM DB.TABLE1 A LEFT JOIN DB.TABLE2 B ON A.NAICS = B.NAICS_CD
WHERE A.AS_OF_DT= '2017-03-31'
GROUP BY
A.AS_OF_DT,
B.DIVISION,
B.CLASS
ORDER BY SUM (A.EXPOSURE/1000000) DESC
Essentially I want the columns to be the following:
DIVISION|DATE|
Below DIVISION would only be the Top 5 DIVISIONS summarized by EXPOSURE (under DATE)
I can try and clarify if needed. Just let me know.
Thanks!
The end result is to have a data paste I can throw into Excel without the manual work of transposing the data in Excel, along with writing formulas to rummage through the 1000's of results of the base query to summarize the individual Divisions and then pick the top 5 each month.
Thanks!
Shill
To get the top 5 for each division, you can use QUALIFY.
Add this to the end of your query:
QUALIFY ROW_NUMBER() OVER (PARTITION BY A.AS_OF_DT, B.DIVISION ORDER BY SUM(A.EXPOSURE/1000000) DESC) <= 5
For your other questions, SQL Assistant isn't much of a presentation tool; it won't do what you are asking for.
If your query already works,
try replacing:
SELECT
with:
SELECT TOP 10
(on line 1)
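For what it's worth, the ranking idea can be tried out without Teradata. The sketch below assumes SQLite 3.25 or newer (for window functions) and a made-up exposures table; SQLite has no QUALIFY, so the ROW_NUMBER() filter goes in an outer query instead. It sums exposure per division and keeps the five largest totals for each date.

```python
import sqlite3

# Hypothetical, simplified table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exposures (as_of_dt TEXT, division TEXT, exposure REAL)")
conn.executemany(
    "INSERT INTO exposures VALUES ('2017-03-31', ?, ?)",
    [("A", 10), ("A", 5), ("B", 40), ("C", 1),
     ("D", 20), ("E", 30), ("F", 7), ("G", 2)],
)

top5 = conn.execute("""
    WITH totals AS (
        -- one row per date and division, with the summed exposure
        SELECT as_of_dt, division, SUM(exposure) AS total_exposure
        FROM exposures
        GROUP BY as_of_dt, division
    ),
    ranked AS (
        -- rank divisions within each date, largest total first
        SELECT as_of_dt, division, total_exposure,
               ROW_NUMBER() OVER (PARTITION BY as_of_dt
                                  ORDER BY total_exposure DESC) AS rn
        FROM totals
    )
    SELECT division, total_exposure
    FROM ranked
    WHERE rn <= 5
    ORDER BY as_of_dt, rn
""").fetchall()

print(top5)
```

With the sample rows, divisions C and G have the two smallest totals and are the ones left out.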

Oracle SQL - Comparing AVG functions in WHERE

I'm trying to write a few Oracle SQL scripts for an assignment. I've managed to get all of it to work, except for one part. To summarize, I have to display data from 2 tables if the average of 1 column in table A is greater than the average of another column in table B. I realize you cannot include AVG functions in a WHERE clause or HAVING clause since it seems unable to properly access the data (from what I've read). When I exclude this clause, the script executes properly, so I'm confident there are no other errors.
I've tried writing it as follows but the error I get is ORA-00936: missing expression and it is just before the > sign. I thought this may be due to improper bracket placing but none of my attempts resolved this. Here is my attempt:
SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
FROM promos l
INNER JOIN sales r
ON r.promo_id = l.promo_id
GROUP BY l.l_category
HAVING (SELECT AVG(l.l_cost) OVER (PARTITION BY l.l_cost)) >
(SELECT AVG(r.r_sold) OVER (PARTITION BY r.r_sold));
I tried doing this without the OVER (PARTITION BY ...) as well as putting it into a WHERE clause but it didn't resolve the error. I'm pretty sure I need to put it into a SELECT statement somehow but I'm at a loss.
You do not need to use the OVER clause when applying the aggregate functions in the HAVING clause. Just use the aggregate functions on their own.
SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
FROM promos l
INNER JOIN sales r
ON r.promo_id = l.promo_id
GROUP BY l.l_category
HAVING AVG(l.l_cost) > AVG(r.r_sold);
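To see that two plain aggregates can be compared directly in HAVING, here is a minimal sketch using SQLite instead of Oracle (an assumption, purely so it runs anywhere) with invented sample rows for promos and sales.

```python
import sqlite3

# In-memory SQLite stand-in for the Oracle tables; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE promos (promo_id INTEGER, l_category TEXT, l_cost REAL);
    CREATE TABLE sales (promo_id INTEGER, r_sold REAL);
    INSERT INTO promos VALUES (1, 'print', 100), (2, 'print', 50), (3, 'web', 2);
    INSERT INTO sales VALUES (1, 10), (2, 20), (3, 30), (3, 50);
""")

rows = conn.execute("""
    SELECT l.l_category, SUM(r.r_sold), AVG(l.l_cost)
    FROM promos l
    INNER JOIN sales r ON r.promo_id = l.promo_id
    GROUP BY l.l_category
    HAVING AVG(l.l_cost) > AVG(r.r_sold)
""").fetchall()

# 'print' has average cost 75 vs average sold 15, so it is kept;
# 'web' has average cost 2 vs average sold 40, so it is filtered out.
print(rows)  # [('print', 30.0, 75.0)]
```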

SQL Query (incorrect use of joins?)

I'm trying to write an (Oracle) SQL query that, given an "agent_id", would give me a list of questions that agent has answered during an assessment, as well as an average score over all of the times that agent has answered those questions.
Note: I tried to design the query such that it would support multiple employees (so we can query at the store level), hence the "IN" condition in the where clause.
Here's what I have so far:
select question.question_id as questionId,
((sum(answer.answer) / count(answer.answer)) * 100) as avgScore
from SPMADMIN.SPM_QC_ASSESSMENT_ANSWER answer
join SPMADMIN.SPM_QC_QUESTION question
on answer.question_id = question.question_id
join SPMADMIN.SPM_QC_ASSESSMENT assessment
on answer.assessment_id = assessment.assessment_id
join SPMADMIN.SPM_QC_SUB_GROUP_TYPE sub_group
on question.sub_group_type_id = sub_group.sub_group_id
join SPMADMIN.SPM_QC_GROUP_TYPE theGroup
on sub_group.group_id = theGroup.group_id
where question.question_id in (select distinct question2.question_id
from SPMADMIN.SPM_QC_QUESTION question2
)
and question.bool_yn_active_flag = 'Y'
and assessment.agent_id in (?)
and answer.answer is not null
order by theGroup.page_order asc,
sub_group.group_order asc,
question.sub_group_order asc
Basically I would want to see:
|questionId|avgScore|
| 1 | 100 |
| 2 | 50 |
| 3 | 75 |
Such that every question that employee has ever answered is in the list of question indexes with their average score over all of the times they've answered it.
When I run it as is, I'm given an "ORA-00937: not a single-group group function" error. No combination of "group by" clauses I've added has helped in the least.
When I run it removing the question.question_id as questionId, part of the select, it runs fine, but it shows their average score over all questions. I need it broken down by question.
Any help or pointers would be greatly appreciated.
When you have an aggregate function in the SELECT list (SUM and COUNT are aggregate functions), then any other columns in the SELECT list need to be in a GROUP BY clause. For example:
SELECT fi, COUNT(fo)
FROM fum
GROUP BY fi
The COUNT(fo) expression is an aggregate, the fi column is a non-aggregate. If you were to add another non-aggregate to the SELECT list, it would also need to be included in the GROUP BY. For example
SELECT TRUNC(fee), fi, COUNT(fo)
FROM fum
GROUP BY TRUNC(fee), fi
To be a little more precise, rather than say "columns in the SELECT list", we should actually say "all non-aggregate expressions in the SELECT list" will need to be included in the GROUP BY clause.
It's not your joins but your use of GROUP BY.
When you use a GROUP BY in SQL, the things you GROUP BY are the things which define the groups. Everything else you have in your SELECT has to be in aggregates which operate over the group.
You can also do aggregates over the entire set without a GROUP BY, but then every column will need to be within an aggregate function.
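Both cases can be sketched with a cut-down, hypothetical answers table in SQLite (the real schema has more tables and joins; this is only an illustration). Grouping by question_id gives one average per question, while dropping the GROUP BY and the non-aggregate column gives a single average over everything.

```python
import sqlite3

# Hypothetical single-table version of the schema, with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers (agent_id INTEGER, question_id INTEGER, answer REAL)")
conn.executemany(
    "INSERT INTO answers VALUES (?, ?, ?)",
    [(7, 1, 1), (7, 1, 1), (7, 2, 1), (7, 2, 0),
     (7, 3, 1), (7, 3, 1), (7, 3, 1), (7, 3, 0), (8, 1, 0)],
)

# question_id is a non-aggregate expression, so it goes in GROUP BY.
per_question = conn.execute("""
    SELECT question_id, AVG(answer) * 100 AS avgScore
    FROM answers
    WHERE agent_id IN (?) AND answer IS NOT NULL
    GROUP BY question_id
    ORDER BY question_id
""", (7,)).fetchall()

# No GROUP BY: only aggregates in the SELECT list, one row for the whole set.
overall = conn.execute(
    "SELECT AVG(answer) * 100 FROM answers WHERE agent_id IN (?)", (7,)
).fetchone()[0]

print(per_question)  # [(1, 100.0), (2, 50.0), (3, 75.0)]
print(overall)       # 75.0
```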

Group by SQL statement

So I got this statement, which works fine:
SELECT MAX(patient_history_date_bio) AS med_date, medication_name
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But, I would like to have the corresponding medication_dose also. So I type this up
SELECT MAX(patient_history_date_bio) AS med_date, medication_name, medication_dose
FROM biological
WHERE (patient_id = 12)
GROUP BY medication_name
But it gives me an error saying:
"Column 'biological.medication_dose' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
So I try adding medication_dose to the GROUP BY clause, but then it gives me extra rows that I don't want.
I would like to get the latest row for each medication in my table. (The latest row is determined by the max function, getting the latest date).
How do I fix this problem?
Use:
SELECT b.medication_name,
b.patient_history_date_bio AS med_date,
b.medication_dose
FROM BIOLOGICAL b
JOIN (SELECT y.medication_name,
MAX(y.patient_history_date_bio) AS max_date
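FROM BIOLOGICAL y
WHERE y.patient_id = ? -- same patient as the outer filter, so another patient's later date cannot hide this patient's latest row
GROUP BY y.medication_name) x ON x.medication_name = b.medication_name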
AND x.max_date = b.patient_history_date_bio
WHERE b.patient_id = ?
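A runnable sketch of this join-back pattern, assuming SQLite and invented sample rows. One assumption worth flagging: here the subquery is filtered by patient as well as the outer query, so that a later date belonging to a different patient (patient 13 below) cannot stop patient 12's latest row from matching.

```python
import sqlite3

# In-memory SQLite stand-in; rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE biological (
        patient_id INTEGER, medication_name TEXT,
        medication_dose TEXT, patient_history_date_bio TEXT)
""")
conn.executemany(
    "INSERT INTO biological VALUES (?, ?, ?, ?)",
    [(12, "aspirin", "100mg", "2010-01-01"),
     (12, "aspirin", "50mg", "2010-03-01"),
     (12, "ibuprofen", "200mg", "2010-02-01"),
     (13, "aspirin", "75mg", "2010-06-01")],
)

latest = conn.execute("""
    SELECT b.medication_name, b.patient_history_date_bio AS med_date, b.medication_dose
    FROM biological b
    JOIN (SELECT y.medication_name,
                 MAX(y.patient_history_date_bio) AS max_date
          FROM biological y
          WHERE y.patient_id = :pid          -- per-patient maximum (assumption)
          GROUP BY y.medication_name) x
      ON x.medication_name = b.medication_name
     AND x.max_date = b.patient_history_date_bio
    WHERE b.patient_id = :pid
    ORDER BY b.medication_name
""", {"pid": 12}).fetchall()

print(latest)
# [('aspirin', '2010-03-01', '50mg'), ('ibuprofen', '2010-02-01', '200mg')]
```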
If you really have to, as one quick workaround, you can apply an aggregate function to your medication_dose such as MAX(medication_dose).
However, note that this is normally an indication that you are either building the query incorrectly, or that you need to refactor/normalize your database schema. In your case, it looks like you are tackling the query incorrectly. The correct approach should be the one suggested by OMG Ponies in another answer.
You may be interested in checking out the following interesting article which describes the reasons behind this error:
But WHY Must That Column Be Contained in an Aggregate Function or the GROUP BY clause?
You need to put max(medication_dose) in your select. Group by returns a result set that contains distinct values for fields in your group by clause, so apparently you have multiple records that have the same medication_name, but different doses, so you are getting two results.
By putting in max(medication_dose) it will return the maximum dose value for each medication_name. You can use any aggregate function on dose (max, min, avg, sum, etc.)
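One caveat with the aggregate approach is that MAX(medication_dose) is computed independently of the MAX date, so the two values can come from different rows. A tiny sketch, assuming SQLite and two invented rows where the dose was lowered over time:

```python
import sqlite3

# In-memory SQLite stand-in; the two rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE biological (
        patient_id INTEGER, medication_name TEXT,
        medication_dose INTEGER, patient_history_date_bio TEXT)
""")
conn.executemany(
    "INSERT INTO biological VALUES (?, ?, ?, ?)",
    [(12, "aspirin", 100, "2010-01-01"),   # older row, higher dose
     (12, "aspirin", 50, "2010-03-01")],   # latest row, lower dose
)

mixed = conn.execute("""
    SELECT medication_name,
           MAX(patient_history_date_bio) AS med_date,
           MAX(medication_dose) AS max_dose
    FROM biological
    WHERE patient_id = 12
    GROUP BY medication_name
""").fetchall()

# The date comes from the latest row but the dose comes from the older one.
print(mixed)  # [('aspirin', '2010-03-01', 100)]
```

So this only answers "what was the latest dose?" when the dose never decreases; otherwise the join back to the latest row is the safer route.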