Use Rank() over a pseudo Column Name - sql

I have a table with columns:
StudentName
Marks1
Marks2
from which I need to run a query that calculates the average of the two marks and ranks the rows from highest average to lowest.
I executed the following query:
SELECT
*,
(SELECT AVG(c) FROM (VALUES(Marks1),(Marks2)) T (c)) AS Average,
RANK() OVER (ORDER BY Average DESC) AS Position
from Marks;
But that gives an error:
Average is an Invalid Column Name.
How do I fix this? How do I write a query that applies RANK() over Average?

You can't reference a column by its alias elsewhere in the same SELECT list; the only place you can reference the alias is the query's ORDER BY clause.
What you can do, however, is move the subquery to the FROM, and then you can reference the column returned in your (outer) SELECT:
SELECT M.*,--List your columns here, don't use *
A.Average,
RANK() OVER (ORDER BY A.Average DESC) AS Position
FROM Marks M
CROSS APPLY(SELECT AVG(Mark) AS Average FROM (VALUES(Marks1),(Marks2)) V(Mark) ) A;
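The same idea can also be sketched with a common table expression: compute the average in an inner query, then rank by its alias in the outer one. This is only an illustration; the sample rows and the CTE name Averaged are hypothetical, not from the question. It divides by 2.0 because in SQL Server AVG over integer values (and integer division) returns a truncated integer, which could create false ties.

```sql
-- Hypothetical sample data standing in for the Marks table.
WITH Marks AS (
    SELECT *
    FROM (VALUES ('Ann', 90, 80),
                 ('Bob', 70, 75),
                 ('Cid', 85, 85)) AS v (StudentName, Marks1, Marks2)
),
Averaged AS (
    -- Compute the average once; 2.0 forces decimal rather than integer division.
    SELECT StudentName, Marks1, Marks2,
           (Marks1 + Marks2) / 2.0 AS Average
    FROM Marks
)
-- The alias Average is visible here because it comes from the inner query.
SELECT StudentName, Marks1, Marks2, Average,
       RANK() OVER (ORDER BY Average DESC) AS Position
FROM Averaged
ORDER BY Position, StudentName;
-- Ann and Cid both average 85 and share Position 1; Bob (72.5) gets Position 3.
```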

You should just inline the average of the two marks in the outer query. Divide by 2.0 rather than 2 so that integer marks are not truncated by integer division:
SELECT *, RANK() OVER (ORDER BY (Marks1 + Marks2) / 2.0 DESC) AS Position
FROM Marks
ORDER BY (Marks1 + Marks2) / 2.0 DESC;

Related

How do I grab each student’s 3rd max assignment mark in each subject

I am trying to write an SQL query that selects each student's 3rd best assignment mark in each subject. I tried the query below, but it isn't working. I am getting this error: [Code: 0, SQL State: 21000] ERROR: more than one row returned by a subquery used as an expression. I would be grateful for any answers.
This is the table structure Students , Courses(Id) , bridging table called StudentsCourses(ID, StudentID,CourseID) and then assignment table which has StudentsCourse(FK) and Grade
select max(Assignments.Grade)
from Assignments
where grade < (select max(Assignments.Grade)
from Assignments
where grade < (select max(Assignments.Grade)
from Assignments
group by Assignments.StudentCourseID))
You can use window functions:
select *
from (
select a.*, row_number() over(partition by student_id, subject_id order by grade desc) as rn
from assignments a
) a
where rn = 3
Your question is a bit unclear about the structure of the assignments table. This assumes that a student is identified by student_id and a subject by subject_id; you may need to adjust that to your actual column names.
Use row_number():
select a.*
from (select a.*,
row_number() over (partition by student_id, StudentCourseID order by grade desc) as seqnum
from assignments a
) a
where seqnum = 3;
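Note: row_number() counts duplicates separately, so if a student's top grades are tied (or all the assignments have the same value), the row with seqnum = 3 can hold the same value as the highest grade.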
If you want the third highest distinct score, then use dense_rank() instead of row_number().
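To make the difference concrete, here is a small sketch with hypothetical sample rows (one student, one course, two tied top grades). The column names follow the answer above; the data is invented for illustration.

```sql
-- Hypothetical sample rows: one student, one course, two tied top grades.
WITH assignments AS (
    SELECT *
    FROM (VALUES (1, 10, 95),
                 (1, 10, 95),
                 (1, 10, 90),
                 (1, 10, 80)) AS v (student_id, StudentCourseID, grade)
)
SELECT grade,
       ROW_NUMBER() OVER (PARTITION BY student_id, StudentCourseID
                          ORDER BY grade DESC) AS seqnum,
       DENSE_RANK() OVER (PARTITION BY student_id, StudentCourseID
                          ORDER BY grade DESC) AS dense_seq
FROM assignments
ORDER BY seqnum;
-- grade  seqnum  dense_seq
-- 95     1       1
-- 95     2       1
-- 90     3       2
-- 80     4       3
```

Filtering on seqnum = 3 returns 90 (the third row), while filtering on dense_seq = 3 returns 80 (the third distinct grade).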

SQL Query to obtain the maximum value for each unique value in another column

ID Sum Name
a 10 Joe
a 8 Mary
b 21 Kate
b 110 Casey
b 67 Pierce
What would you recommend as the best way to
obtain for each ID the name that corresponds to the largest sum (grouping by ID).
What I tried so far:
select ID, SUM(Sum) s, Name
from Table1
group by ID, Name
Order by SUM(Sum) DESC;
this will arrange the records into groups that have the highest sum first. Then I have to somehow flag those records and keep only those. Any tips or pointers? Thanks a lot
In the end I'd like to obtain:
a 10 Joe
b 110 Casey
You want the row_number() function:
select id, [sum], name
from (select t.*,
row_number() over (partition by id order by [sum] desc) as seqnum
from table1 t
) t
where seqnum = 1;
Your question is more confusing than it needs to be because you have a column called sum. You should avoid using SQL reserved words for identifiers.
The row_number() function assigns a sequential number to a group of rows, starting with 1. The group is defined by the partition by clause. In this case, all rows with the same id are in the same group. The ordering of the numbers is determined by the order by clause, so the one with the largest value of sum gets the value of 1.
If you might have duplicate maximum values and you want all of them, use the related function rank() or dense_rank().
select *
from
(
select *
,rn = row_number() over (partition by Id order by [sum] desc)
from Table1
)x
where x.rn=1

sql query finding most often level appear

I have a table Student in SQL Server with these columns:
[ID], [Age], [Level]
I want the query that returns each age value that appears in Students, and finds the level value that appears most often. For example, if there are more 'a' level students aged 18 than 'b' or 'c' it should print the pair (18, a).
I am new to SQL Server and I want a simple answer with nested query.
You can do this using window functions:
select t.*
from (select age, level, count(*) as cnt,
row_number() over (partition by age order by count(*) desc) as seqnum
from student s
group by age, level
) t
where seqnum = 1;
The inner query aggregates the data to count the number of students at each level for each age. The row_number() then numbers these rows within each age (the partition by), with the largest count first. The where clause keeps only the row with the highest count for each age.
In the case of ties, this returns just one of the values. If you want all of them, use rank() instead of row_number().
One more option uses the ROW_NUMBER ranking function in the ORDER BY clause. WITH TIES returns every row that ties with the last row of the limited result set, so TOP 1 WITH TIES returns all rows whose ROW_NUMBER is 1, that is, one row per age.
SELECT TOP 1 WITH TIES age, level
FROM dbo.Student
GROUP BY age, level
ORDER BY ROW_NUMBER() OVER(PARTITION BY age ORDER BY COUNT(*) DESC)
Or a second version of the query, which compares the count for each (age, level) pair with the maximum count per age:
SELECT *
FROM (
SELECT age, level, COUNT(*) AS cnt,
MAX(COUNT(*)) OVER(PARTITION BY age) AS mCnt
FROM dbo.Student
GROUP BY age, level
)x
WHERE x.cnt = x.mCnt
Another option but will require later version of sql-server:
;WITH x AS
(
SELECT age,
level,
occurrences = COUNT(*)
FROM Student
GROUP BY age,
level
)
SELECT *
FROM x x
WHERE EXISTS (
SELECT *
FROM x y
WHERE x.occurrences > y.occurrences
)
I realise it doesn't quite answer the question, as it only returns the age/level combinations where there is more than one level for the age.
Maybe someone can help to amend it so that it includes the single-level ages as well in the result set: http://sqlfiddle.com/#!3/d597b/9
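One possible amendment, offered as a sketch rather than the original author's fix: correlate the subquery on age and flip the test to NOT EXISTS, so a row is kept when no other level of the same age has a higher count. Ages with a single level then pass automatically. The sample rows and the aliases x1 and x2 are hypothetical.

```sql
-- Hypothetical sample data: age 18 has levels a (twice) and b; age 19 has only c.
WITH Student AS (
    SELECT *
    FROM (VALUES (1, 18, 'a'),
                 (2, 18, 'a'),
                 (3, 18, 'b'),
                 (4, 19, 'c')) AS v (ID, Age, Level)
),
x AS (
    SELECT Age, Level, COUNT(*) AS occurrences
    FROM Student
    GROUP BY Age, Level
)
SELECT x1.Age, x1.Level
FROM x AS x1
WHERE NOT EXISTS (
    -- Keep the row only if no level of the same age occurs more often.
    SELECT *
    FROM x AS x2
    WHERE x2.Age = x1.Age
      AND x2.occurrences > x1.occurrences
)
ORDER BY x1.Age;
-- Returns (18, a) and (19, c).
```

Like rank(), this keeps every tied level when two levels share the highest count for an age.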
with combinations as (
select age, level, count(*) occurrences
from Student
group by age, level
)
select age, level
from combinations c
where occurrences = (select max(occurrences)
from combinations
where age = c.age)
This finds every age and level combination in the Student table and counts the number of occurrences of each.
Then, for each age, it keeps the combination whose occurrence count is the highest for that age, and returns the age and level for that row.
This has the advantage of not being tied to SQL Server - it's vanilla SQL. However, a window function like Gordon pointed out may perform better on SQL Server.

Average of Sum minus Minimum

I have an SQL statement that grabs the grades of different activity types (Homework, Quiz, etc), and if there's a drop lowest for that type, it drops, else, it remains. The errors are below as well as the SQL Code.
SELECT Student.firstName, Student.lastName, 'Grades' =
CASE
WHEN Grades.activityType = 'Homework' THEN
CASE WHEN Policy.drop_hw = 1 THEN
(AVG(SUM(Grades.grade) - MIN(Grades.grade))) * (Policy.homework / 100)
ELSE
(AVG(Grades.grade) * (Policy.homework / 100))
END
END, Course.courseNum, Course.sectNum, Grades.activityType
FROM ...
Here are the errors I'm getting:
- Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
- Column 'Policy.drop_hw' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Look into analytical functions. (SO question, Oracle documentation).
Something like this:
AVG(Grades.grade) OVER (PARTITION BY Grades.student_id) AS avg_of_grades
and:
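(SUM(Grades.grade) OVER (PARTITION BY Grades.student_id)
- MIN(Grades.grade) OVER (PARTITION BY Grades.student_id)) * 1.0
/ NULLIF(COUNT(Grades.grade) OVER (PARTITION BY Grades.student_id) - 1, 0) AS avg_grades_with_drop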
Set the partitioning with whatever makes sense in your case; we can't tell since you omitted the FROM ... in your example.
You can then use those expressions (or, from an outer query, their column aliases) in any calculations inside your CASE expression.
If you only need to drop one lowest grade (even when several grades tie for lowest):
SELECT student_id, AVG(grade)
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY student_id ORDER BY grade) rn
FROM my_tables
) q
WHERE NOT (drop_hw = 1 AND rn = 1)
GROUP BY
student_id
If you need to drop all lowest grades:
SELECT student_id, AVG(grade)
FROM (
SELECT *, MIN(grade) OVER (PARTITION BY student_id) mingrade
FROM my_tables
) q
WHERE NOT (drop_hw = 1 AND grade = mingrade)
GROUP BY
student_id
The sum-operator gives one result (per group). The min-operator, too. So over what should the avg-operator aggregate?
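One way to read the intent, as a sketch under assumptions: "the average with the lowest grade dropped" is the sum minus the minimum, divided by one less than the count, so no nested aggregate is needed. The table layout and sample rows below are hypothetical, since the question omits its FROM clause.

```sql
-- Hypothetical sample data: student 1 has three grades, student 2 has one.
WITH Grades AS (
    SELECT *
    FROM (VALUES (1, 60),
                 (1, 80),
                 (1, 100),
                 (2, 90)) AS v (student_id, grade)
)
SELECT student_id,
       AVG(grade * 1.0) AS plain_avg,
       -- (sum - lowest) / (count - 1); NULLIF avoids division by zero
       -- when a student has only one grade.
       (SUM(grade) - MIN(grade)) * 1.0
           / NULLIF(COUNT(grade) - 1, 0) AS avg_with_drop
FROM Grades
GROUP BY student_id
ORDER BY student_id;
-- Student 1: plain_avg 80, avg_with_drop 90 ((240 - 60) / 2).
-- Student 2: plain_avg 90, avg_with_drop NULL (nothing left after the drop).
```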

How do I use ROW_NUMBER()?

I want to use ROW_NUMBER() to get:
1. The max(ROW_NUMBER()), which I guess would also be the count of all rows.
I tried doing:
SELECT max(ROW_NUMBER() OVER(ORDER BY UserId)) FROM Users
but it didn't seem to work...
2. The ROW_NUMBER() for a given piece of information, i.e. if I have a name and I want to know what row the name came from.
I assume it would be something similar to what I tried for #1:
SELECT ROW_NUMBER() OVER(ORDER BY UserId) From Users WHERE UserName='Joe'
but this didn't work either...
Any Ideas?
For the first question, why not just use:
SELECT COUNT(*) FROM myTable
to get the count.
And for the second question, the primary key of the row is what should be used to identify a particular row. Don't try and use the row number for that.
If you returned ROW_NUMBER() in your main query,
SELECT ROW_NUMBER() OVER (ORDER BY Id) AS RowNumber, Field1, Field2, Field3
FROM [User]
then when you want to go 5 rows back, you can take the current row number and use the following query to find the row at current row - 5 (here @CurrentRow is a variable holding the current row number):
SELECT us.Id
FROM (SELECT ROW_NUMBER() OVER (ORDER BY Id) AS Row, Id
FROM [User]) us
WHERE Row = @CurrentRow - 5
Though I agree with others that you could use count() to get the total number of rows, here is how you can use row_number():
To get the total number of rows:
with temp as (
select row_number() over (order by id) as rownum
from table_name
)
select max(rownum) from temp
To get the row numbers where name is Matt:
with temp as (
select name, row_number() over (order by id) as rownum
from table_name
)
select rownum from temp where name like 'Matt'
You can further use min(rownum) or max(rownum) to get the first or last row for Matt respectively.
These were very simple implementations of row_number(). You can use it for more complex grouping. Check out my response on Advanced grouping without using a sub query
If you need the table's total row count, there is an alternative to SELECT COUNT(*).
Because SELECT COUNT(*) has to scan the table or one of its indexes to return the row count, it can take a long time on a large table. In that case you can read the count from the sysindexes system table instead, whose rows column holds the row count for each table in your database:
SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name') AND indid < 2
This reads metadata rather than the data itself, so it is usually much faster, but the value may not be exact at any given moment, and sysindexes is a compatibility view that is deprecated in newer versions of SQL Server.
You can use this to get the first record that matches the WHERE clause:
SELECT TOP(1) * , ROW_NUMBER() OVER(ORDER BY UserId) AS rownum
FROM Users
WHERE UserName = 'Joe'
ORDER BY rownum ASC
ROW_NUMBER() returns a unique number for each row, starting with 1. You can use it by writing:
ROW_NUMBER() OVER (ORDER BY Column_Name DESC) AS RowNum
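This may not be directly related to the question, but I found it useful when using ROW_NUMBER and no particular order matters; ordering by a constant subquery numbers the rows in arbitrary order: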
SELECT *,
ROW_NUMBER() OVER (ORDER BY (SELECT 100)) AS Any_ID
FROM #Any_Table
select
Ml.Hid,
ml.blockid,
row_number() over (partition by ml.blockid order by Ml.Hid desc) as rownumber,
H.HNAME
from MIT_LeadBechmarkHamletwise ML
join [MT.HAMLE] h on ML.Hid=h.HID
SELECT num, UserName FROM
(SELECT UserName, ROW_NUMBER() OVER(ORDER BY UserId) AS num
From Users) AS numbered
WHERE UserName='Joe'
You can use ROW_NUMBER to limit a query's results, for example for paging.
Example:
SELECT * FROM (
select row_number() OVER (order by createtime desc) AS ROWINDEX,*
from TABLENAME ) TB
WHERE TB.ROWINDEX between 1 and 10
The query above returns page 1 (the first 10 rows) of the results from TABLENAME.
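The same pattern generalises to any page. The sketch below assumes T-SQL variables named @PageNumber and @PageSize (hypothetical names) and uses a few invented rows, with integers standing in for createtime, so that it runs on its own.

```sql
-- Hypothetical paging parameters: second page, two rows per page.
DECLARE @PageNumber int = 2, @PageSize int = 2;

-- Hypothetical sample data; integers stand in for createtime values.
WITH Items AS (
    SELECT *
    FROM (VALUES ('a', 5), ('b', 4), ('c', 3), ('d', 2), ('e', 1))
         AS v (name, createtime)
),
Numbered AS (
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY createtime DESC) AS ROWINDEX
    FROM Items
)
SELECT name, ROWINDEX
FROM Numbered
-- Page n covers rows (n - 1) * size + 1 through n * size.
WHERE ROWINDEX BETWEEN (@PageNumber - 1) * @PageSize + 1
                   AND @PageNumber * @PageSize
ORDER BY ROWINDEX;
-- Returns rows 3 and 4: c and d.
```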
If you absolutely want to use ROW_NUMBER for this (instead of count(*)) you can always use:
SELECT TOP 1 ROW_NUMBER() OVER (ORDER BY Id)
FROM USERS
ORDER BY ROW_NUMBER() OVER (ORDER BY Id) DESC
You need to create a virtual table (a common table expression) using WITH ... AS, as shown in the query below.
Using this virtual table, you can perform CRUD operations with respect to row_number.
QUERY:
WITH NumberedUsers AS
(SELECT row_number() OVER(ORDER BY UserId) rn, * FROM Users)
SELECT * FROM NumberedUsers WHERE UserName='Joe'
You can use INSERT, UPDATE or DELETE in the last statement instead of SELECT.
The SQL ROW_NUMBER() function sorts the rows of a record set and assigns each an order number. So it is used to number rows, for example to identify the top 10 rows with the highest order amount, or to identify each customer's highest-amount order.
If you want to sort the dataset and number the rows separately within categories, use ROW_NUMBER() with a PARTITION BY clause, for example to sort each customer's orders among themselves when the dataset contains all orders.
SELECT
SalesOrderNumber,
CustomerId,
SubTotal,
ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY SubTotal DESC) rn
FROM Sales.SalesOrderHeader
But as I understand it, you want to calculate the number of rows grouped by a column. To visualize the requirement: if you want to see the count of all orders of the related customer as a separate column beside the order info, you can use the COUNT() aggregate function with a PARTITION BY clause.
For example,
SELECT
SalesOrderNumber,
CustomerId,
COUNT(*) OVER (PARTITION BY CustomerId) CustomerOrderCount
FROM Sales.SalesOrderHeader
This query:
SELECT ROW_NUMBER() OVER(ORDER BY UserId) From Users WHERE UserName='Joe'
will return one row for each row where the UserName is 'Joe' (and no rows if there is no UserName='Joe').
The rows are numbered in order of UserId, and the row number starts at 1 and increments for as many rows as contain UserName='Joe'.
If it does not work for you, then your WHERE clause has an issue or there is no UserId column in the table. Check the spelling of both column names, UserId and UserName.
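The numbering restarts at 1 because the WHERE clause filters rows before ROW_NUMBER() is evaluated. A small sketch with hypothetical sample rows contrasts filtering first with numbering first (in a derived table) and filtering afterwards:

```sql
-- Hypothetical sample data: Joe is the third user by UserId.
WITH Users AS (
    SELECT *
    FROM (VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Joe'), (4, 'Zoe'))
         AS v (UserId, UserName)
)
-- Filter first: only Joe's row is numbered, so it gets 1.
SELECT 'filter first' AS approach,
       ROW_NUMBER() OVER (ORDER BY UserId) AS num
FROM Users
WHERE UserName = 'Joe'
UNION ALL
-- Number first: all rows are numbered, then Joe's row is picked out.
SELECT 'number first', num
FROM (SELECT UserName,
             ROW_NUMBER() OVER (ORDER BY UserId) AS num
      FROM Users) AS numbered
WHERE UserName = 'Joe';
-- filter first -> 1, number first -> 3
```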