Calculated Field Using Group By Subquery - sql

I have one table that holds Lessons with their attributes, such as DepId and FieldId. I also have two other tables, for Lesson Departments and Lesson Fields. I need to calculate the percentages of the lessons' weekly times based on DepId and FieldId. My query is as follows:
Select a.FieldName, b.DepName, sum(LessonWeeklyTime), ((sum(LessonWeeklyTime))*100)/(select LessonDep, sum(LessonWeeklyTime)
From Lessons
Group By LessonDep)
from Lessons l, Departments b, Fields a
Where l.LessonDep=b.DepId
And l.LessonField=a.FieldId
Group By b.DepName,a.FieldName
But I am getting error:
Msg 116, Level 16, State 1, Line 2
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
I'd appreciate any help.

Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax. It is the 21st Century, after all.
You should be doing this with window functions:
select f.FieldName, d.DepName, sum(LessonWeeklyTime),
sum(LessonWeeklyTime) * 100.0 /
sum(sum(LessonWeeklyTime)) over (partition by d.depName)
from Lessons l join
Departments d
on l.LessonDep = d.DepId join
Fields f
on l.LessonField = f.FieldId
Group By d.DepName, f.FieldName

As the error message states, the subquery may return only one column. It must also return only one row, so correlate it with the outer department instead of grouping it. Try it this way:
Select a.FieldName, b.DepName, sum(l.LessonWeeklyTime), (sum(l.LessonWeeklyTime)*100.0)/(select sum(l2.LessonWeeklyTime)
From Lessons l2
Where l2.LessonDep=b.DepId)
from Lessons l, Departments b, Fields a
Where l.LessonDep=b.DepId
And l.LessonField=a.FieldId
Group By b.DepId,b.DepName,a.FieldName

Check this:
Select a.FieldName, b.DepName, sum(l.LessonWeeklyTime), (sum(l.LessonWeeklyTime)*100.0)/ max(c.LessonWeeklyTime)
from Lessons l, Departments b, Fields a ,(select LessonDep, sum(LessonWeeklyTime) LessonWeeklyTime
From Lessons
Group By LessonDep) c
Where l.LessonDep=b.DepId
And l.LessonField=a.FieldId
And l.LessonDep=c.LessonDep
Group By b.DepName,a.FieldName

This gets the time by necessary codes from 1 table and uses a window function for the percentage. It groups/calculates by the IDs rather than descriptions to improve performance and reduce chance of errors on poor labelling/nulls:
WITH CTE AS( SELECT LESSONDEP, LESSONFIELD, SUM(LESSONWEEKLYTIME) AS [TIME]
FROM [LESSONS]
GROUP BY LESSONDEP, LESSONFIELD)
SELECT F.FIELDNAME, D.DEPNAME, L.[TIME], (L.[TIME]*100.0) / SUM(L.[TIME]) OVER (PARTITION BY L.LESSONDEP) AS [PERCENTAGE]
FROM CTE L
INNER JOIN DEPARTMENTS D ON L.LESSONDEP = D.DEPID
INNER JOIN FIELDS F ON L.LESSONFIELD = F.FIELDID
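The CTE-plus-window-function approach can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the sample departments, fields, and lesson times are invented for illustration; only the table and column names come from the question. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory database with invented sample data (assumption: not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Departments (DepId INTEGER PRIMARY KEY, DepName TEXT);
CREATE TABLE Fields (FieldId INTEGER PRIMARY KEY, FieldName TEXT);
CREATE TABLE Lessons (LessonDep INTEGER, LessonField INTEGER, LessonWeeklyTime INTEGER);
INSERT INTO Departments VALUES (1, 'Science'), (2, 'Arts');
INSERT INTO Fields VALUES (10, 'Physics'), (20, 'Chemistry'), (30, 'Music');
INSERT INTO Lessons VALUES (1, 10, 3), (1, 10, 3), (1, 20, 2), (2, 30, 4);
""")

# Aggregate by the IDs first, then divide each total by its department total.
rows = conn.execute("""
WITH totals AS (
    SELECT LessonDep, LessonField, SUM(LessonWeeklyTime) AS weekly_time
    FROM Lessons
    GROUP BY LessonDep, LessonField
)
SELECT d.DepName, f.FieldName, t.weekly_time,
       t.weekly_time * 100.0
           / SUM(t.weekly_time) OVER (PARTITION BY t.LessonDep) AS percentage
FROM totals t
JOIN Departments d ON t.LessonDep = d.DepId
JOIN Fields f ON t.LessonField = f.FieldId
ORDER BY d.DepName, f.FieldName
""").fetchall()

for row in rows:
    print(row)
```

With this sample data, Physics accounts for 6 of the 8 weekly hours in Science, so its percentage is 75.0.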

Related

SQL Not a GROUP BY expression

I'm still new to SQL.
I've got a query to count the number of students that attend a certain lecture and I've been trying to group the records by the lectureid so I don't have 10 records for the same lecture.
SELECT ATTENDANCESHEET.LECTUREID,TOPIC, (
SELECT COUNT(STUDENTID) AS ATTENDANCE
FROM ATTENDANCESHEET
WHERE ATTENDANCESHEET.STUDENTID = LECTURE.STUDENTID
)
FROM ATTENDANCESHEET,LECTURE
WHERE ATTENDANCESHEET.LECTUREID = LECTURE.LECTUREID
GROUP BY ATTENDANCESHEET.LECTUREID;
I'm getting the error "not a GROUP BY expression". Can someone help me, please?
The error is because you have a correlated query. The correlation clause (the where in the subquery) is using a column from the outer query that is not aggregated. In addition, you have a column topic that is not in the group by.
I believe the query you want is more simply written as:
select a.lectureid, count(*) as attendance
from attendancesheet a
group by a.lectureid;
I notice that you have topic in the select. That is also an issue. Perhaps you want:
select l.lectureid, l.topic, count(*) as attendance
from attendancesheet a join
lecture l
on a.lectureid = l.lectureid
group by l.lectureid, l.topic;
Or, if you have studentid in lecture, perhaps:
select l.lectureid, l.topic, count(*) as attendance
from lecture l
group by l.lectureid, l.topic;
EDIT:
The data structure doesn't make sense to me, but perhaps you need both keys for the join:
select l.lectureid, l.topic, count(*) as attendance
from attendancesheet a join
lecture l
on a.lectureid = l.lectureid and a.studentid = l.studentid
group by l.lectureid, l.topic;
To fix the GROUP BY error without knowing the expected result, add the missing columns to the GROUP BY:
SELECT ATTENDANCESHEET.LECTUREID,TOPIC, (
SELECT COUNT(STUDENTID) AS ATTENDANCE
FROM ATTENDANCESHEET
WHERE ATTENDANCESHEET.STUDENTID = LECTURE.STUDENTID
)
FROM ATTENDANCESHEET,LECTURE
WHERE ATTENDANCESHEET.LECTUREID = LECTURE.LECTUREID
GROUP BY ATTENDANCESHEET.LECTUREID,TOPIC,LECTURE.STUDENTID; -- added the topic and studentid from lecture table
But I think what you're trying to do is:
SELECT ATTENDANCESHEET.LECTUREID,TOPIC, count(LECTURE.STUDENTID) cntstudent
FROM ATTENDANCESHEET,LECTURE
WHERE ATTENDANCESHEET.LECTUREID = LECTURE.LECTUREID
GROUP BY ATTENDANCESHEET.LECTUREID,TOPIC
Try adding TOPIC to the group by :)
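The attendance count with TOPIC in the GROUP BY can be checked with a small sketch in Python's sqlite3 module. It assumes LECTURE holds one row per lecture (no studentid column), and the sample rows are invented. Note that SQLite tolerates ungrouped columns in the select list, so it will not reproduce the "not a GROUP BY expression" error; the sketch only shows the corrected query.

```python
import sqlite3

# Invented sample data; assumes one row per lecture in the lecture table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lecture (lectureid INTEGER PRIMARY KEY, topic TEXT);
CREATE TABLE attendancesheet (lectureid INTEGER, studentid INTEGER);
INSERT INTO lecture VALUES (1, 'Joins'), (2, 'Grouping');
INSERT INTO attendancesheet VALUES (1, 100), (1, 101), (1, 102), (2, 100);
""")

# Every non-aggregated column in the select list also appears in GROUP BY.
attendance = conn.execute("""
SELECT l.lectureid, l.topic, COUNT(*) AS attendance
FROM attendancesheet a
JOIN lecture l ON a.lectureid = l.lectureid
GROUP BY l.lectureid, l.topic
ORDER BY l.lectureid
""").fetchall()

print(attendance)
```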

SUM a column count from two tables

I have this simple unioned query in SQL Server 2014 where I am getting counts of rows from each table, and then trying to add a TOTAL row at the bottom that will SUM the counts from both tables. I believe the problem is that the LEFT OUTER JOIN in the last union seems to sum only the totals from the first table.
SELECT A.TEST_CODE, B.DIVISION, COUNT(*)
FROM ALL_USERS B, SIGMA_TEST A
WHERE B.DOMID = A.DOMID
GROUP BY A.TEST_CODE, B.DIVISION
UNION
SELECT E.TEST_CODE, F.DIVISION, COUNT(*)
FROM BETA_TEST E, ALL_USERS F
WHERE E.DOMID = F.DOMID
GROUP BY E.TEST_CODE, F.DIVISION
UNION
SELECT 'TOTAL', '', COUNT(*)
FROM (SIGMA_TEST A LEFT OUTER JOIN BETA_TEST E ON A.DOMID
= E.DOMID )
Here is a sample of the results I am getting:
I would expect the TOTAL row to display a result of 6 (2+1+3=6)
I would like to avoid using a Common Table Expression (CTE) if possible. Thanks in advance!
Since you are counting users with matching DOMIDs in the first two statements, the final statement also needs to include the ALL_USERS table. The final statement should be:
SELECT 'TOTAL', '', COUNT(*)
FROM ALL_USERS G LEFT OUTER JOIN
SIGMA_TEST H ON G.DOMID = H.DOMID
LEFT OUTER JOIN BETA_TEST I ON I.DOMID = G.DOMID
WHERE (H.TEST_CODE IS NOT NULL OR I.TEST_CODE IS NOT NULL)
I would consider doing a UNION ALL first then COUNT:
SELECT COALESCE(TEST_CODE, 'TOTAL'),
DIVISION,
COUNT(*)
FROM (
SELECT A.TEST_CODE, B.DIVISION
FROM ALL_USERS B
INNER JOIN SIGMA_TEST A ON B.DOMID = A.DOMID
UNION ALL
SELECT E.TEST_CODE, F.DIVISION
FROM BETA_TEST E
INNER JOIN ALL_USERS F ON E.DOMID = F.DOMID ) AS T
GROUP BY GROUPING SETS ((TEST_CODE, DIVISION ), ())
Using GROUPING SETS you can easily get the total, so there is no need to add a third subquery.
Note: I assume you want just one count per (TEST_CODE, DIVISION). Otherwise you have to also group on the source table as well, as in @Gareth's answer.
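To see what GROUPING SETS ((TEST_CODE, DIVISION), ()) computes, here is a rough sketch in Python's sqlite3 module. SQLite has no GROUPING SETS, so the sketch spells out the two grouping sets as two aggregations over the same UNION ALL and stacks them; this is an emulation, not the SQL Server syntax. The sample rows are invented so that the per-group counts are 2, 1, and 3.

```python
import sqlite3

# Invented sample data (assumption): four users, three rows in each test table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ALL_USERS (DOMID TEXT PRIMARY KEY, DIVISION TEXT);
CREATE TABLE SIGMA_TEST (DOMID TEXT, TEST_CODE TEXT);
CREATE TABLE BETA_TEST (DOMID TEXT, TEST_CODE TEXT);
INSERT INTO ALL_USERS VALUES ('u1', 'East'), ('u2', 'East'), ('u3', 'West'), ('u4', 'East');
INSERT INTO SIGMA_TEST VALUES ('u1', 'S1'), ('u2', 'S1'), ('u3', 'S1');
INSERT INTO BETA_TEST VALUES ('u1', 'B1'), ('u2', 'B1'), ('u4', 'B1');
""")

# Rows from both test tables, each matched to its user's division.
combined = """
    SELECT a.TEST_CODE, b.DIVISION
    FROM ALL_USERS b
    JOIN SIGMA_TEST a ON b.DOMID = a.DOMID
    UNION ALL
    SELECT e.TEST_CODE, f.DIVISION
    FROM BETA_TEST e
    JOIN ALL_USERS f ON e.DOMID = f.DOMID
"""

# First SELECT = grouping set (TEST_CODE, DIVISION); second = the empty set ().
query = f"""
SELECT TEST_CODE, DIVISION, COUNT(*) FROM ({combined}) AS t
GROUP BY TEST_CODE, DIVISION
UNION ALL
SELECT 'TOTAL', '', COUNT(*) FROM ({combined}) AS t
"""

rows = sorted(conn.execute(query).fetchall())
for row in rows:
    print(row)
```

The TOTAL row counts the combined rows once, so it equals the sum of the per-group counts (6 here).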
I think you can achieve this with a single query. It seems your test tables have similar structures, so you can union them together and join to ALL_USERS. Finally, you can use GROUPING SETS to get the total:
SELECT ISNULL(T.TEST_CODE, 'TOTAL') AS TEST_CODE,
ISNULL(U.DIVISION, '') AS DIVISION,
COUNT(*)
FROM ALL_USERS AS U
INNER JOIN
( SELECT DOMID, TEST_CODE, 'SIGMA' AS SOURCETABLE
FROM SIGMA_TEST
UNION ALL
SELECT DOMID, TEST_CODE, 'BETA' AS SOURCETABLE
FROM BETA_TEST
) AS T
ON T.DOMID = U.DOMID
GROUP BY GROUPING SETS ((T.TEST_CODE, U.DIVISION, T.SOURCETABLE), ());
As an aside, the implicit join syntax you are using was replaced over a quarter of a century ago in ANSI 92. It is not wrong, but there seems to be little reason to continue to use it, especially when you are mixing and matching with explicit outer joins and implicit inner joins. Anyone else that might read your SQL will certainly appreciate consistency.

How to use WITH clause and select clause

Question: write a query to display the customer number, first name, and last name of the customers whose total loan amount is the maximum and who have taken loans from at least 2 bank branches.
I have tried the following query but I'm getting this error
Msg 8120, Level 16, State 1, Line 7
Column 'customer.fname' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Code:
with l as
(
select custid, sum(loan_amount) as tot
from loan
group by custid
having count(bid) >= 2
)
select
concat(c.fname, c.ltname) as name,
max(l.tot)
from
customer as c, l
where
l.custid = c.custid
You need to have a GROUP BY to select both aggregated and non-aggregated data, so you need to decide how you want the data grouped. You could do either
SELECT CONCAT(c.fname,c.ltname) as name, MAX(l.tot)
FROM customer AS c
INNER JOIN l ON l.custid=c.custid
GROUP BY c.fname,c.ltname
or
SELECT CONCAT(c.fname,c.ltname) as name, MAX(l.tot)
FROM customer AS c
INNER JOIN l ON l.custid=c.custid
GROUP BY concat(c.fname,c.ltname)
Please note the following:
I converted the "old" join syntax to the more acceptable INNER JOIN syntax
You probably want a space between the first and last name if you're displaying the results.
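Putting the WITH clause and the grouped SELECT together, a minimal sketch in Python's sqlite3 module looks like this. The sample customers and loans are invented, and the `||` operator stands in for CONCAT because older SQLite versions lack that function. As written, the grouped query returns one row per customer who qualifies, not only the single customer with the maximum total.

```python
import sqlite3

# Invented sample data (assumption): Bob has only one loan, so the HAVING drops him.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (custid INTEGER PRIMARY KEY, fname TEXT, ltname TEXT);
CREATE TABLE loan (custid INTEGER, bid TEXT, loan_amount INTEGER);
INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
INSERT INTO loan VALUES (1, 'B1', 100), (1, 'B2', 200), (2, 'B1', 500),
                        (3, 'B1', 50), (3, 'B3', 60);
""")

# The CTE keeps customers with at least two loans; the outer query groups by name.
result = conn.execute("""
WITH l AS (
    SELECT custid, SUM(loan_amount) AS tot
    FROM loan
    GROUP BY custid
    HAVING COUNT(bid) >= 2
)
SELECT c.fname || ' ' || c.ltname AS name, MAX(l.tot) AS max_tot
FROM customer AS c
INNER JOIN l ON l.custid = c.custid
GROUP BY c.fname, c.ltname
ORDER BY name
""").fetchall()

print(result)
```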

Using group by and having clause

Using the following schema:
Supplier (sid, name, status, city)
Part (pid, name, color, weight, city)
Project (jid, name, city)
Supplies (sid, pid, jid, quantity)
1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.
2. Get supplier numbers and names for suppliers of the same part to at least two different projects.
These were my answers:
1.
SELECT s.sid, s.name
FROM Supplier s, Supplies su, Project pr
WHERE s.sid = su.sid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid) >= 2
2.
SELECT s.sid, s.name
FROM Suppliers s, Supplies su, Project pr, Part p
WHERE s.sid = su.sid AND su.pid = p.pid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid)>=2
Can anyone confirm whether I wrote this correctly? I'm a little confused as to how the Group By and Having clauses work.
The semantics of Having
To better understand having, you need to see it from a theoretical point of view.
A group by is a query that takes a table and summarizes it into another table. You summarize the original table by grouping the original table into subsets (based upon the attributes that you specify in the group by). Each of these groups will yield one tuple.
The Having is simply equivalent to a WHERE clause after the group by has executed and before the select part of the query is computed.
Let's say your query is:
select a, b, count(*)
from Table
where c > 100
group by a, b
having count(*) > 10;
The evaluation of this query can be seen as the following steps:
1. Perform the WHERE, eliminating rows that do not satisfy it.
2. Group the table into subsets based upon the values of a and b (each tuple in each subset has the same values of a and b).
3. Eliminate subsets that do not satisfy the HAVING condition.
4. Process each subset, outputting the values as indicated in the SELECT part of the query. This creates one output tuple per subset left after step 3.
You can extend this to any complex query, where Table can be any complex query that returns a table (a cross product, a join, a UNION, etc).
In fact, having is syntactic sugar and does not extend the power of SQL. Any given query:
SELECT list
FROM table
GROUP BY attrList
HAVING condition;
can be rewritten as:
SELECT list from (
SELECT listatt
FROM table
GROUP BY attrList) as Name
WHERE condition;
The listatt is a list that includes the GROUP BY attributes and the expressions used in list and condition. It might be necessary to name some expressions in this list (with AS). For instance, the example query above can be rewritten as:
select a, b, count
from (select a, b, count(*) as count
from Table
where c > 100
group by a, b) as someName
where count > 10;
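The equivalence between HAVING and a WHERE over a grouped subquery can be checked with a small sketch in Python's sqlite3 module. The table `t`, its rows, and the threshold of 1 (instead of 10, to keep the sample small) are invented for illustration.

```python
import sqlite3

# Invented sample table with columns a, b, c as in the example query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (a TEXT, b TEXT, c INTEGER);
INSERT INTO t VALUES ('x', 'p', 150), ('x', 'p', 200), ('x', 'p', 50),
                     ('y', 'q', 300), ('y', 'r', 120), ('y', 'r', 130);
""")

# Form 1: filter the groups with HAVING.
with_having = conn.execute("""
SELECT a, b, COUNT(*)
FROM t
WHERE c > 100
GROUP BY a, b
HAVING COUNT(*) > 1
ORDER BY a, b
""").fetchall()

# Form 2: name the aggregate in a subquery, then filter it with WHERE.
with_subquery = conn.execute("""
SELECT a, b, cnt
FROM (SELECT a, b, COUNT(*) AS cnt
      FROM t
      WHERE c > 100
      GROUP BY a, b) AS grouped
WHERE cnt > 1
ORDER BY a, b
""").fetchall()

print(with_having)
print(with_subquery)
```

Both queries return the same groups: the row with c = 50 is removed by the WHERE before grouping, and the single-row group ('y', 'q') is removed after grouping.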
The solution you need
Your solution seems to be correct:
SELECT s.sid, s.name
FROM Supplier s, Supplies su, Project pr
WHERE s.sid = su.sid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid) >= 2
You join the three tables, then use sid as a grouping attribute (name is functionally dependent on it, so it has no impact on the number of groups, but you must include it; otherwise it cannot be part of the select part of the statement). Then you remove the groups that do not satisfy your condition: the number of distinct pr.jid values must be >= 2, which is what you wanted originally.
Best solution to your problem
I personally prefer a simpler cleaner solution:
You only need to group Supplies (sid, pid, jid, quantity) by sid to
find the sid of those that supply at least two projects.
Then join the result to the Supplier table to get the supplier name.
SELECT sid, name from
(SELECT sid from supplies
GROUP BY sid
HAVING count(DISTINCT jid) >= 2
) AS T1
NATURAL JOIN
Supplier;
It will likely also be faster to execute, because the join is done only for the suppliers that qualify, not for every row.
--dmg
Because aggregate functions such as count(), min(), and sum() cannot be used in a WHERE clause, the HAVING clause was introduced in SQL to overcome this problem. For an example of the HAVING clause, see this link:
http://www.sqlfundamental.com/having-clause.php
First of all, you should use the JOIN syntax rather than FROM table1, table2, and you should always limit the grouping to as few fields as you need.
Although I haven't tested it, your first query seems fine to me, but it could be rewritten as:
SELECT s.sid, s.name
FROM
Supplier s
INNER JOIN (
SELECT su.sid
FROM Supplies su
GROUP BY su.sid
HAVING COUNT(DISTINCT su.jid) > 1
) g
ON g.sid = s.sid
Or simplified as:
SELECT sid, name
FROM Supplier s
WHERE (
SELECT COUNT(DISTINCT su.jid)
FROM Supplies su
WHERE su.sid = s.sid
) > 1
However, your second query seems wrong to me, because you should also GROUP BY pid.
SELECT s.sid, s.name
FROM
Supplier s
INNER JOIN (
SELECT DISTINCT su.sid
FROM Supplies su
GROUP BY su.sid, su.pid
HAVING COUNT(DISTINCT su.jid) > 1
) g
ON g.sid = s.sid
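As you may have noticed in the query above, I used the INNER JOIN syntax to perform the filtering; however, it can also be written with a correlated EXISTS subquery (a scalar comparison would fail here, because the grouped subquery can return more than one row):
SELECT s.sid, s.name
FROM Supplier s
WHERE EXISTS (
SELECT 1
FROM Supplies su
WHERE su.sid = s.sid
GROUP BY su.sid, su.pid
HAVING COUNT(DISTINCT su.jid) > 1
)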
What type of SQL database are you using (MSSQL, Oracle, etc.)?
I believe what you have written is correct.
You could also write the first query like this:
SELECT s.sid, s.name
FROM Supplier s
WHERE (SELECT COUNT(DISTINCT pr.jid)
FROM Supplies su, Project pr
WHERE su.sid = s.sid
AND pr.jid = su.jid) >= 2
It's a little more readable, and less mind-bending than trying to do it with GROUP BY. Performance may differ though.
1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.
SELECT S.SID, S.NAME
FROM SUPPLIES SP
JOIN SUPPLIER S
ON SP.SID = S.SID
WHERE PID IN
(SELECT PID FROM SUPPLIES GROUP BY PID HAVING COUNT(DISTINCT JID) >= 2)
I am not clear about your second question.
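The difference between the two exercises comes down to the grouping columns: grouping Supplies by sid counts projects per supplier, while grouping by sid and pid counts projects per supplier and part. A minimal sketch in Python's sqlite3 module, with invented suppliers and shipments, shows the two results side by side. The IN form used here is one possible way to avoid duplicate supplier rows; it is not taken from any of the answers.

```python
import sqlite3

# Invented sample data (assumption): Acme ships one part to two projects,
# Bolt ships two different parts to two projects, Core ships to one project.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Supplier (sid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Supplies (sid INTEGER, pid TEXT, jid TEXT, quantity INTEGER);
INSERT INTO Supplier VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
INSERT INTO Supplies VALUES (1, 'p1', 'j1', 10), (1, 'p1', 'j2', 5),
                            (2, 'p1', 'j1', 1), (2, 'p2', 'j2', 1),
                            (3, 'p3', 'j1', 2);
""")

# Exercise 1: at least two different projects per supplier.
any_part = conn.execute("""
SELECT s.sid, s.name
FROM Supplier s
WHERE s.sid IN (SELECT sid FROM Supplies
                GROUP BY sid
                HAVING COUNT(DISTINCT jid) >= 2)
ORDER BY s.sid
""").fetchall()

# Exercise 2: the same part to at least two different projects.
same_part = conn.execute("""
SELECT s.sid, s.name
FROM Supplier s
WHERE s.sid IN (SELECT sid FROM Supplies
                GROUP BY sid, pid
                HAVING COUNT(DISTINCT jid) >= 2)
ORDER BY s.sid
""").fetchall()

print(any_part)
print(same_part)
```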

SQL aggregate query error

I have 3 tables like this
player(id,name,age,teamid)
team(id,name,sponsor,totalplayer,totalchampion,boss,joindate)
playerdetail(id,playerid,position,number,allstar,joindate)
I want to select the team info, including name, sponsor, totalplayer, totalchampion, and boss,
plus the average age of the players and the number of all-star players.
I wrote the T-SQL below:
SELECT T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE,
AVG(P.AGE) AS AverageAge,COUNT(D.ALLSTAR) As AllStarPlayer
FROM Team T,Player P,PlayerDetail D
WHERE T.ID=P.TID AND P.ID=D.PID
but it doesn't work, the error message is
'Column 'Team.Name' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.'
Who can help me?
Thx in advance!
Add
GROUP BY
T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE
In most RDBMSs (except MySQL, which will guess for you), a column must be either aggregated (COUNT, AVG) or listed in the GROUP BY.
Also, you should use explicit JOINs.
This is clearer, less ambiguous, and makes it harder to bollix your code:
SELECT
T.NAME, T.SPONSOR, T.TOTALPLAYER, T.TOTALCHAMPION, T.BOSS, T.JOINDATE,
AVG(P.AGE) AS AverageAge,
COUNT(D.ALLSTAR) As AllStarPlayer
FROM
Team T
JOIN
Player P ON T.ID=P.TID
JOIN
PlayerDetail D ON P.ID=D.PID
GROUP BY
T.NAME, T.SPONSOR, T.TOTALPLAYER, T.TOTALCHAMPION, T.BOSS, T.JOINDATE;
Given that you want this data per team, and team.ID uniquely identifies team, I suggest the following:
SELECT max(T.NAME) As TeamName,
max(T.SPONSOR) As Sponsor,
max(T.TOTALPLAYER) As TotalPlayers,
max(T.TOTALCHAMPION) As TotalChampions,
max(T.BOSS) As Boss,
max(T.JOINDATE) As JoinDate,
AVG(P.AGE) AS AverageAge,
COUNT(D.PID) As AllStarPlayer
FROM Team T
join Player P on T.ID=P.TID
left join PlayerDetail D on P.ID=D.PID and D.ALLSTAR = 'Y'
group by T.ID
Use:
SELECT T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE,
AVG(P.AGE) AS AverageAge,COUNT(D.ALLSTAR) As AllStarPlayer
FROM Team T
JOIN Player P ON T.ID = P.TEAMID
JOIN PlayerDetail D ON P.ID = D.PLAYERID
GROUP BY T.NAME,T.SPONSOR,T.TOTALPLAYER,T.TOTALCHAMPION,T.BOSS,T.JOINDATE
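A cut-down version of the per-team aggregate can be run with Python's sqlite3 module. The sketch below trims the tables to a few columns, uses invented rows, and assumes, as one of the answers does, that ALLSTAR holds 'Y' for all-star players. It uses the teamid and playerid column names from the schema in the question.

```python
import sqlite3

# Invented sample data; the 'Y'/'N' encoding of allstar is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team (id INTEGER PRIMARY KEY, name TEXT, sponsor TEXT);
CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, teamid INTEGER);
CREATE TABLE playerdetail (id INTEGER PRIMARY KEY, playerid INTEGER, allstar TEXT);
INSERT INTO team VALUES (1, 'Lions', 'Acme'), (2, 'Bears', 'Bolt');
INSERT INTO player VALUES (1, 'A', 20, 1), (2, 'B', 30, 1), (3, 'C', 28, 2);
INSERT INTO playerdetail VALUES (1, 1, 'Y'), (2, 2, 'N'), (3, 3, 'N');
""")

# Every non-aggregated team column is listed in GROUP BY. The LEFT JOIN keeps
# players who are not all-stars, so they still count toward the average age.
teams = conn.execute("""
SELECT t.name, t.sponsor,
       AVG(p.age) AS average_age,
       COUNT(d.playerid) AS allstar_players
FROM team t
JOIN player p ON t.id = p.teamid
LEFT JOIN playerdetail d ON p.id = d.playerid AND d.allstar = 'Y'
GROUP BY t.id, t.name, t.sponsor
ORDER BY t.id
""").fetchall()

print(teams)
```

COUNT(d.playerid) skips the NULLs produced by the LEFT JOIN, so only all-star rows are counted.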