Query including TOP and AVG - sql

This is the last problem I have to deal with in my application, and I hope someone can help because I'm clueless. I did my research and cannot find a proper solution.
I have a 'University Administration' application. I need to make a report that draws on a few tables.
The problem is an SQL query I have to finish. The query needs to list the best 'n' students, and the criterion for a student to be 'best' is the grade average.
I have 3 columns (students.stID & examines.grades). I need to get the average of my 'examines.grades' column, sort the table from the highest average grade to the lowest, and keep only the 'n' best averages.
The user enters the filter number and, as I said, the app needs to show the 'n' best averages.
The problem is my SQL knowledge (not MySQL specifically, but T-SQL). This is what I've done with my SQL query so far, but the problem lies in the "SELECT TOP": when I press my button, the app takes the average only from the TOP 'n' rows selected.
SELECT TOP(@topParam) student.ID, AVG(examines.grades)
FROM examines INNER JOIN
student ON examines.stID = student.stID
WHERE (examines.grades > 1)
For example:
StudentID Grade
1 2
2 5
1 5
2 2
2 4
2 2
Expected output:
StudentID Grade_Average
1 3.5
2 3.25

Being impatient, I think this is what you are looking for. You didn't specify which SQL Server version you're using, though.
DECLARE @topParam INT = 3; -- Default
DECLARE @student TABLE (StudentID INT); -- Just for testing purpose
DECLARE @examines TABLE (StudentID INT, Grades INT);
INSERT INTO @student (StudentID) VALUES (1), (2);
INSERT INTO @examines (StudentID, Grades)
VALUES (1, 2), (2, 5), (1, 5), (2, 2), (2, 4), (2, 2);
SELECT DISTINCT TOP(@topParam) s.StudentID, AVG(CAST(e.grades AS FLOAT)) OVER (PARTITION BY s.StudentID) AS AvgGrade
FROM @examines AS e
INNER JOIN @student AS s
ON e.StudentID = s.StudentID
WHERE e.grades > 1
ORDER BY AvgGrade DESC;
If you'll provide some basic data, I'll adapt query for your needs.
Result:
StudentID AvgGrade
--------------------
1 3.500000
2 3.250000
Quick explanation:
The query computes each student's average grade with a windowed AVG, removes the resulting duplicate rows with DISTINCT, and sorts by that average. Another tip: you could use the WITH TIES option in the TOP clause to return more students when several of them tie for the last (here, 3rd) position.
If you'd like to make a procedure, as I suggested in the comments, use this snippet:
CREATE PROCEDURE dbo.GetTopStudents
(
@topParam INT = 3
)
AS
BEGIN
BEGIN TRY
SELECT DISTINCT TOP(@topParam) s.StudentID, AVG(CAST(e.grades AS FLOAT)) OVER (PARTITION BY s.StudentID) AS AvgGrade
FROM examines AS e
INNER JOIN student AS s
ON e.StudentID = s.StudentID
WHERE e.grades > 1
ORDER BY AvgGrade DESC;
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER(), ERROR_MESSAGE();
END CATCH
END
And later call it like this. It's a good way to encapsulate your logic.
EXEC dbo.GetTopStudents @topParam = 3;

You should use a GROUP BY clause to compute the average grade for each student.ID (if examines.grades is an integer type, cast it to a floating-point type first), and an ORDER BY clause together with TOP to limit the output to the n students with the highest averages:
select top(@topParam) student.ID
, avg(cast(examines.grades as float)) as avg_grade
from examines
join student on examines.stID = student.stID
where (examines.grades > 1)
group by student.ID
order by avg_grade desc

There's no need for a windowed aggregate, which returns duplicate rows that you then have to remove again with DISTINCT. It's a simple aggregation, and your original query was already quite close:
SELECT TOP(@topParam) student.ID, AVG(CAST(examines.grades AS FLOAT)) as AvgGrade
FROM examines INNER JOIN
student ON examines.stID = student.stID
WHERE (examines.grades > 1)
group by student.ID
order by AvgGrade DESC
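To see the GROUP BY version run end to end, here is a minimal sketch using Python's built-in sqlite3 module and the sample data from the question. SQLite is only a stand-in for SQL Server here, so `LIMIT ?` takes the place of `TOP(@topParam)`; the function name `top_students` and the column layout of the `student` table are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student  (stID INTEGER PRIMARY KEY);
    CREATE TABLE examines (stID INTEGER, grades INTEGER);
    INSERT INTO student  VALUES (1), (2);
    INSERT INTO examines VALUES (1, 2), (2, 5), (1, 5), (2, 2), (2, 4), (2, 2);
""")

def top_students(n):
    # Aggregate per student first, sort by the average, then keep n rows.
    # LIMIT ? plays the role of T-SQL's TOP(@topParam) in this sketch.
    return conn.execute("""
        SELECT s.stID, AVG(CAST(e.grades AS REAL)) AS avg_grade
        FROM examines AS e
        JOIN student AS s ON e.stID = s.stID
        WHERE e.grades > 1
        GROUP BY s.stID
        ORDER BY avg_grade DESC
        LIMIT ?
    """, (n,)).fetchall()

top_two = top_students(2)
print(top_two)  # [(1, 3.5), (2, 3.25)]
```

Because the averaging happens inside GROUP BY before the row limit is applied, the limit only decides how many students are shown, not which grades are averaged.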

Related

How to add this condition to this query?

I'm working on this query:
SELECT s.studentname,
Avg(cs.exam_season_one
+ cs.exam_season_two
+ cs.degree_season_one
+ cs.degree_season_two ) / 4 AS average
FROM courses_student cs
join students s
ON s.student_id = cs.student_id
join SECTION se
ON s.sectionid = se.sectionid
WHERE cs.courses_id = 1
AND ( se.classes_id = 2
OR se.classes_id = 5 )
AND s.studentname LIKE 'm%'
GROUP BY s.studentname
Up to this point everything works perfectly, but I need to add one last condition and I don't know how.
I need to get the students with the same average.
I mean only students with count(average) > 1 (I don't know if this is right).
Does anyone know how to solve this problem in this query?
PS: I use oracle.
Edit:
The create tables statements and sample data are here:
http://sqlfiddle.com/#!4/ebf636
trying to explain the problem more because maybe I did it the wrong way the first time!
First, the average is the average of 4 columns
The output I expect is to get the names of the students who belong to class 2 or class 5 (classes_id = 2, classes_id = 5), also their name should start with M
I want to check their average in a specific course (course_id = 1)
and the last condition I'm asking about is that I want to get the students who only have the same average in this course.
for example:
if we have 4 students and the averages in the course are (60,70,80,80) then I want to get only the last 2 student names because they have the same average. hope it's clear now!
You appear to be asking for the students where the average of the averages of their exam and degree seasons 1 and 2 are greater than 1 for certain sections.
Without sample data and expected output to validate against, it's difficult to answer. But for each student you want to GROUP BY the primary key that uniquely identifies the student (otherwise you may aggregate two students with the same name together), and you only need to check that the section exists (rather than JOINing the section, as that would create duplicate rows if there are multiple matching sections and skew the averages).
Then
if we have 4 students and the averages in the course are (60,70,80,80) then I want to get only the last 2 student names because they have the same average
You want to count how many students have the same average and then filter out those with unique averages:
SELECT studentname,
average
FROM (
SELECT a.*,
COUNT(*) OVER (PARTITION BY average) AS num_with_average
FROM (
SELECT MAX(s.studentname) AS studentname,
( Avg(cs.exam_season_one)
+ Avg(cs.exam_season_two)
+ Avg(cs.degree_season_one)
+ Avg(cs.degree_season_two) ) / 4 AS average
FROM courses_student cs
join students s
ON s.student_id = cs.student_id
WHERE cs.courses_id = 1
AND s.studentname LIKE 'm%'
AND EXISTS(
SELECT 1
FROM section se
WHERE s.sectionid = se.sectionid
AND se.classes_id IN (2, 5)
)
GROUP BY
s.student_id
) a
)
WHERE num_with_average > 1;
db<>fiddle here
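The counting step can be tried in isolation. The sketch below uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer) rather than Oracle, and a hypothetical table `student_avg` that already holds one average per student, with invented names and the averages (60, 70, 80, 80) from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_avg (studentname TEXT, average REAL);
    INSERT INTO student_avg VALUES
        ('mona', 60), ('mark', 70), ('maya', 80), ('milo', 80);
""")

# Count how many students share each average, then keep the non-unique ones.
shared = conn.execute("""
    SELECT studentname, average
    FROM (
        SELECT studentname, average,
               COUNT(*) OVER (PARTITION BY average) AS num_with_average
        FROM student_avg
    ) AS counted
    WHERE num_with_average > 1
    ORDER BY studentname
""").fetchall()

print(shared)  # [('maya', 80.0), ('milo', 80.0)]
```

The window count has to be computed in an inner query because a window function cannot be referenced in the WHERE clause of the same SELECT.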

How to get MAX value out of the GROUPs COUNT

I've recently started to learn T-SQL beyond basic inserts and selects. I have a test database that I train on, and there is one query that I can't really get to work.
There are 3 tables used in that query; in the picture there are simplified fields and relations.
I have the 2 following queries. The first one simply displays students and the number of marks from each subject. The second does almost what I want to achieve: it shows students and the maximum number of marks they got, e.g.
subject1 - (marks) 1, 5, 3, 4 count - 4
subject2 - (marks) 5, 4, 5 - count - 3
The query shows 4, and from what I checked it returns correct results, but I want one more thing: to show the name of the subject with the maximum number of marks, so in the example case subject1.
--Query 1--
SELECT s.Surname, subj.SubjectName, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE m.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName
ORDER BY s.Surname
--Query 2--
SELECT query.Surname, MAX(Marks_count) as Maximum_marks_count FROM (SELECT s.Surname, subj.SubjectName, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE m.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName) as query
GROUP BY query.Surname
ORDER BY query.Surname
--Query 3 - not working as supposed--
SELECT query.Surname, query.SubjectName, MAX(Marks_count) as Maximum_marks_count FROM (SELECT s.Surname, subj.SubjectName, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE m.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName) as query
GROUP BY query.Surname, query.SubjectName
ORDER BY query.Surname
Part of the query 1 result
Part of the query 2 and unfortunately query 3 result
The problem is that when I add the subject name to the select statement, I get the same results as from query 1: there is no longer a maximum number of marks, just students, subjects and the number of marks from each subject.
If someone could tell me what I'm missing, I would much appreciate it :)
Here's a query that gets the highest mark per student. Put it at the top of your SQL file/batch and it will make another "table" you can join to your other tables to get the student name and the subject name:
WITH studentBest AS (
SELECT * FROM(
SELECT *, ROW_NUMBER() OVER(PARTITION BY studentid ORDER BY mark DESC) rown
FROM marks) a
WHERE rown = 1)
You use it like this (for example)
--the WITH bit goes above this line
SELECT *
FROM
studentBest sb
INNER JOIN
subjects s
ON sb.subjectid = s.subjectnumber
Etc
That's also how you should be doing your joins
How does it work? Well, it establishes an incrementing counter that restarts every time studentid changes (the PARTITION BY clause), and the numbering goes in descending mark order (the ORDER BY clause). An outer query selects only those rows with 1 in the row number, i.e. the top mark per student.
Why can't I use group by?
You can, but you have to write a query that summarises the marks table into the top mark (max) per student, and then you have to join that data back to the marks table to retrieve the subject. All in all it's a lot more faff, and often less efficient.
What if there are two subjects with the same mark?
Use RANK instead of ROW_NUMBER if you want to see both
Edit in response to your comment:
An extension of the above method:
SELECT * FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY st ORDER BY c DESC) rn FROM
(
SELECT studentid st, subjectid su, count(*) c
FROM marks
GROUP BY studentid, subjectid
) a
) b
INNER JOIN students stu on b.st = stu.studentnumber
INNER JOIN subjects sub on b.su = sub.subjectnumber
WHERE
b.rn = 1
We count the marks by student/subject, then row-number them in descending order of count within each student, then choose only the first row per student and join in the other wanted data.
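For anyone who wants to try the count-then-rank idea without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table layout and the sample marks are invented for the example; only the shape of the query matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (StudentId INTEGER, SubjectId INTEGER, Mark INTEGER);
    INSERT INTO marks VALUES
        (1, 1, 1), (1, 1, 5), (1, 1, 3), (1, 1, 4),  -- student 1, subject 1: 4 marks
        (1, 2, 5), (1, 2, 4), (1, 2, 5),             -- student 1, subject 2: 3 marks
        (2, 2, 3), (2, 2, 4),                        -- student 2, subject 2: 2 marks
        (2, 1, 5);                                   -- student 2, subject 1: 1 mark
""")

# Innermost query: marks per (student, subject).
# Middle query: rank each student's subjects by that count, highest first.
# Outer query: keep only the best-ranked subject per student.
busiest_subject = conn.execute("""
    SELECT StudentId, SubjectId, marks_count
    FROM (
        SELECT StudentId, SubjectId, marks_count,
               ROW_NUMBER() OVER (PARTITION BY StudentId
                                  ORDER BY marks_count DESC) AS rn
        FROM (
            SELECT StudentId, SubjectId, COUNT(*) AS marks_count
            FROM marks
            GROUP BY StudentId, SubjectId
        ) AS counted
    ) AS ranked
    WHERE rn = 1
    ORDER BY StudentId
""").fetchall()

print(busiest_subject)  # [(1, 1, 4), (2, 2, 2)]
```

The partition is by student only. Partitioning by both student and subject would put every counted row in its own partition, so every row would get row number 1.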
OK, thanks to Caius Jard, another Stack Overflow question and a bit of experimenting, I managed to write a working query, so this is how I did it.
First I created a view from query 1 and added one more column to it: studentId.
Then I wrote query which almost satisfied me. That question helped me a lot with that task: Question
SELECT marks.Surname,
marks.SubjectName,
marks.Marks_count,
ROW_NUMBER() OVER(PARTITION BY marks.Surname ORDER BY marks.Surname) as RowNum
FROM MarksAmountPerStudentAndSubject marks
INNER JOIN (SELECT MarksAmountPerStudentAndSubject.Id,
MAX(MarksAmountPerStudentAndSubject.Marks_count) as MaxAmount
FROM MarksAmountPerStudentAndSubject
GROUP BY MarksAmountPerStudentAndSubject.Id) m
ON m.Id = marks.Id and marks.Marks_count = m.MaxAmount
It gives following results
That's what I wanted to achieve, with one exception: if a student has the same number of marks in multiple subjects, it displays all of them. That's fine, but I decided to restrict this to the first result for each student. I couldn't simply put TOP(1) there, so I used a solution similar to the one Caius Jard showed: ROW_NUMBER and a window function, which let me choose the records whose row number equals 1.
I created another view from this query and I could simply write the final one
SELECT marks.Surname, marks.SubjectName, marks.Marks_count
FROM StudentsMaxMarksAmount marks
WHERE marks.RowNum = 1
ORDER BY marks.Surname
With result

writing a query using advanced group by

I have a single table database consists of the following fields:
ID, Seniority (years), outcome and some other less important fields.
Table row example:
ID:36 Seniority(years):1.79 outcome:9627
I need to write a query (sql server) in relatively simple code that returns the average outcome, grouped by the Seniority field, with leaps of five years (0-5 years, 6-10 etc...) with the condition that the average will be shown only if the group has more than 3 rows.
Result row example:
range:0-5 average:xxxx
Thank you very much
Use a CASE expression to create the different seniority groups. Try this
select case when Seniority between 0 and 5 then '0-5'
when Seniority between 6 and 10 then '6-10'
..
End,
Avg(outcome)
From yourtable
Group by case when Seniority between 0 and 5 then '0-5'
when Seniority between 6 and 10 then '6-10'
..
End
Having count(1)>=3
Since you have decimal places, if you want to count 5.4 in the 0-5 group and 5.6 in the 6-10 group, use Round(Seniority,0) instead of Seniority in the CASE expression.
P.s.
0-5 contains 6 values while 6-10 contains 5.
select 'range:'
+ cast (isnull(nullif(floor((abs(seniority-1))/5)*5+1,1),0) as varchar)
+ '-'
+ cast ((floor((abs(seniority-1))/5)+1)*5 as varchar) as seniority_group
,avg(outcome)
from t
group by floor((abs(seniority-1))/5)
having count(*) >= 3
;
This would be something like:
select floor(seniority / 5), avg(outcome)
from t
group by floor(seniority / 5)
having count(*) >= 3;
Note: This breaks the seniority into equal sized groups which is 0-4, 5-9, and so on. This seems more reasonable than having unequal groups.
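A quick way to check the bucketing is to run it on a handful of rows. The sketch below uses Python's built-in sqlite3 module instead of SQL Server, so `CAST(... AS INTEGER)` stands in for `floor` (equivalent for non-negative seniority). The sample rows other than the one from the question are invented, and the filter is written as `COUNT(*) > 3` to follow the question's "more than 3 rows" wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (seniority REAL, outcome INTEGER);
    INSERT INTO t VALUES
        (1.79, 9627), (0.5, 1000), (3.0, 2000), (4.9, 3373),  -- bucket starting at 0
        (5.0, 5000), (7.2, 7000), (9.99, 6000), (6.0, 6000),  -- bucket starting at 5
        (12.0, 100), (14.0, 200);                             -- bucket starting at 10
""")

# Each row falls into the 5-year bucket [range_start, range_start + 5).
# Buckets with 3 rows or fewer are dropped by the HAVING clause.
averages = conn.execute("""
    SELECT CAST(seniority / 5 AS INTEGER) * 5 AS range_start,
           AVG(outcome) AS avg_outcome
    FROM t
    GROUP BY CAST(seniority / 5 AS INTEGER)
    HAVING COUNT(*) > 3
    ORDER BY range_start
""").fetchall()

print(averages)  # [(0, 4000.0), (5, 6000.0)]
```

The bucket starting at 10 has only two rows, so it does not appear in the result.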
You can follow Gordon's answer (though you would need to edit it a little), but I would do this with an additional table holding all possible intervals. You can then add an appropriate index to speed it up.
create table intervals
(
id int identity(1, 1),
start int,
[end] int -- END is a reserved word in T-SQL, so it must be bracketed
)
insert into intervals values
(0, 5),
(6, 10)
...
select i.id, avg(t.outcome) as outcome
from intervals i
join tablename t on t.seniority between i.start and i.[end]
group by i.id
having count(*) >=3
If creating new tables is not an option you can always use a CTE:
;with intervals as(
select * from
(values
(0, 5),
(6, 10)
--...
) t(start, [end])
)
select i.start, i.[end], avg(t.outcome) as outcome
from intervals i
join tablename t on t.seniority between i.start and i.[end]
group by i.start, i.[end]
having count(*) >=3

Better way to detect the number of consecutive records for a lot of different unique user IDs

So I'm currently waiting for my query to run (which takes about 5 minutes)
I have developed a query that will look at a student's attendance record, identify a run of Attendance Codes and group them based on date and class time.
This query works great if I'm trying to find the highest number of absences for a particular student, but becomes trickier when I try to create a table which shows the highest run of absences for all active students.
To achieve this I am using a cursor to assign the StudentNo (unique id) to a parameter and then run my original query, and place the results into a temporary table called @Results.
Here is my code:
DECLARE @StudentId INT
DECLARE @getStudentId CURSOR
DECLARE @Results TABLE(StudentNo INT,AttendanceCode VarChar(2),StartDate DateTime,EndDate DateTime,"# of Classes" INT)
SET @getStudentId = CURSOR FOR
SELECT StudentNo
FROM [dbo].[Students]
OPEN @getStudentId
FETCH NEXT
FROM @getStudentId INTO @StudentId
WHILE @@FETCH_STATUS = 0
BEGIN
WITH AttendanceCodeMaster AS
(SELECT
[dbo].[Students].StudentNo,
CAST(CONVERT(date,[dbo].[CourseOfferingSchedule].ClassDate,101) as DATETIME) + CAST(CONVERT(time,dbo.CourseOfferingSchedule.ClassStartTime,101) AS DATETIME) as ClassDate,
[dbo].[CourseOfferingAttendanceScheduled].AttendanceCode
FROM [dbo].[CourseOfferingAttendanceScheduled]
INNER JOIN [dbo].[Students] on [dbo].[CourseOfferingAttendanceScheduled].StudentNo = [dbo].[Students].StudentNo
INNER JOIN dbo.[CourseOfferingSchedule] on [dbo].[CourseOfferingAttendanceScheduled].ScheduleID = [dbo].[CourseOfferingSchedule].ScheduleID
INNER JOIN [dbo].[StudentStatus] on [dbo].[Students].StudentStatusID = [dbo].[StudentStatus].StudentStatusID
where
[dbo].[Students].StudentNo = @StudentId and StudentStatus = 'Active' and Complete = 'Y'
),
RunGroup AS
(SELECT StudentNo,ClassDate, AttendanceCode, (SELECT COUNT(*) From AttendanceCodeMaster as G WHERE G.AttendanceCode <> GR.AttendanceCode AND G.ClassDate <= GR.ClassDate) as RunGroup
FROM AttendanceCodeMaster as GR ),
AbsenceStreaks AS
(SELECT
StudentNo,
AttendanceCode,
MIN(ClassDate) as StartDate,
MAX(ClassDate)as EndDate,
COUNT(*) as '# of Classes'
FROM RunGroup
where AttendanceCode = 'A'
GROUP BY StudentNo,AttendanceCode, RunGroup),
LongestStreak AS
(SELECT TOP 1 * FROM AbsenceStreaks
Order BY '# of Classes' Desc)
INSERT INTO @Results SELECT * FROM LongestStreak
FETCH NEXT
FROM @getStudentId INTO @StudentId
END
CLOSE @getStudentId
DEALLOCATE @getStudentId
SELECT * from @Results
where "# of Classes" >= 30
order by StudentNo
You do not need a cursor to solve this problem. The following may be making some assumptions on field contents and names (because it is hard to follow your query logic), but it should give you the right approach.
The key is that a string of absences can be recognized by enumerating all class days for a student (in a class) and then enumerating all class days for a student by whether or not they attended. The difference between these values is constant, for a sequence of absences or presences.
select studentNo, grp, AttendanceCode, count(*) as numInRow,
min(ClassDate) as DateStart, max(ClassDate) as DateEnd
from (select acm.*,
(row_number() over (partition by StudentNo order by ClassDate) -
row_number() over (partition by StudentNo, AttendanceCode order by ClassDate)
) as grp
from AttendanceCodeMaster acm
) acm
group by studentNo, grp, AttendanceCode;
(I'm not sure if there is a class code in there somewhere as well.)
This should give you the information you want to find sequences of absences or presences.
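The difference-of-row-numbers trick is easier to trust after watching it on a few rows. Here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer) in place of SQL Server. The flat `attendance` table, its dates and the codes 'A'/'P' are assumptions made for the example, standing in for the AttendanceCodeMaster CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendance (StudentNo INTEGER, ClassDate TEXT, AttendanceCode TEXT);
    INSERT INTO attendance VALUES
        (1, '2024-01-01', 'P'), (1, '2024-01-02', 'A'), (1, '2024-01-03', 'A'),
        (1, '2024-01-04', 'A'), (1, '2024-01-05', 'P'), (1, '2024-01-06', 'A'),
        (2, '2024-01-01', 'A'), (2, '2024-01-02', 'A'), (2, '2024-01-03', 'P');
""")

# Row number over all of a student's classes minus row number within the same
# attendance code stays constant along an unbroken run of that code.
streaks = conn.execute("""
    SELECT StudentNo, AttendanceCode, COUNT(*) AS num_in_row,
           MIN(ClassDate) AS date_start, MAX(ClassDate) AS date_end
    FROM (
        SELECT StudentNo, ClassDate, AttendanceCode,
               ROW_NUMBER() OVER (PARTITION BY StudentNo ORDER BY ClassDate)
             - ROW_NUMBER() OVER (PARTITION BY StudentNo, AttendanceCode
                                  ORDER BY ClassDate) AS grp
        FROM attendance
    ) AS numbered
    WHERE AttendanceCode = 'A'
    GROUP BY StudentNo, AttendanceCode, grp
    ORDER BY StudentNo, date_start
""").fetchall()

# Longest absence streak per student, picked out on the Python side.
longest = {}
for student, _code, length, start, end in streaks:
    if student not in longest or length > longest[student][0]:
        longest[student] = (length, start, end)

print(longest)
```

All students are processed in one set-based statement, which is what removes the need for the cursor loop.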

SQL help: select the last 3 comments for EACH student?

I have two tables to store student data for a grade-school classroom:
Behavior_Log has the columns student_id, comments, date
Student_Roster has the columns student_id, firstname, lastname
The database is used to store daily comments about student behavior, and sometimes the teacher makes multiple comments about a student in a given day.
Now let's say the teacher wants to be able to pull up a list of the last 3 comments made for EACH student, like this:
Jessica 7/1/09 talking
Jessica 7/1/09 passing notes
Jessica 5/3/09 absent
Ciboney 7/2/09 great participation
Ciboney 4/30/09 absent
Ciboney 2/22/09 great participation
...and so on for the whole class
The single SQL query must return a set of comments for each student to eliminate the human-time-intensive need for the teacher to run separate queries for each student in the class.
I know that this sounds similar to "SQL Statement Help - Select latest Order for each Customer", but I need to display the last 3 entries for each person, and I can't figure out how to get from here to there.
Thanks for your suggestions!
A slightly modified solution from this article in my blog:
Analytic functions: SUM, AVG, ROW_NUMBER
SELECT sc.student_id, date, comments
FROM (
SELECT student_id, date, comments, (@r := @r + 1) AS rn
FROM (
SELECT @_student_id:= -1
) vars,
(
SELECT *
FROM
behavior_log a
ORDER BY
student_id, date DESC
) ao
WHERE CASE WHEN @_student_id <> student_id THEN @r := 0 ELSE 0 END IS NOT NULL
AND (@_student_id := student_id) IS NOT NULL
) sc
JOIN Student_Roster sr
ON sr.student_id = sc.student_id
WHERE rn <= 3
A different approach would be to use the group_concat function in a single subselect, ordering the comments by date and keeping only the first three with substring_index (this assumes no comment contains a newline).
select (
select substring_index(
group_concat( concat( date, ', ', comments ) order by date desc separator '\n' ),
'\n', 3 )
from Behavior_Log
where student_id = s.student_id )
from Student_Roster s
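On engines with window functions (MySQL 8.0 and later has them; the answers above predate that), the same "last 3 per student" result can be had with ROW_NUMBER. This is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL, with the sample comments from the question written as ISO dates plus one older, invented row per student so the cutoff is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student_Roster (student_id INTEGER, firstname TEXT, lastname TEXT);
    CREATE TABLE Behavior_Log   (student_id INTEGER, comments TEXT, date TEXT);
    INSERT INTO Student_Roster VALUES (1, 'Jessica', 'A'), (2, 'Ciboney', 'B');
    INSERT INTO Behavior_Log VALUES
        (1, 'talking', '2009-07-01'), (1, 'passing notes', '2009-07-01'),
        (1, 'absent', '2009-05-03'),  (1, 'late', '2009-01-15'),
        (2, 'great participation', '2009-07-02'), (2, 'absent', '2009-04-30'),
        (2, 'great participation', '2009-02-22'), (2, 'late', '2009-01-10');
""")

# Number each student's comments from newest to oldest, then keep the first 3.
last_three = conn.execute("""
    SELECT sr.firstname, bl.date, bl.comments
    FROM (
        SELECT student_id, date, comments,
               ROW_NUMBER() OVER (PARTITION BY student_id
                                  ORDER BY date DESC) AS rn
        FROM Behavior_Log
    ) AS bl
    JOIN Student_Roster AS sr ON sr.student_id = bl.student_id
    WHERE bl.rn <= 3
    ORDER BY sr.firstname, bl.date DESC, bl.comments
""").fetchall()

for row in last_three:
    print(row)
```

The two invented 'late' rows are the oldest for each student, so they are the ones that fall outside the three most recent comments.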