I have this query that finds the name of the teacher with the 4-th highest salary. I don't understand this part
SELECT COUNT (DISTINCT T2.salary)
FROM teacher as T2
WHERE T2.salary > T1.salary
) = 3
from
SELECT name
FROM teacher as T1
WHERE (
SELECT COUNT (DISTINCT T2.salary)
FROM teacher as T2
WHERE T2.salary > T1.salary
) = 3;
The way I understand count is that it gives a final result, not that we can interrupt its work by specifying a number.
This is the teacher table: https://imgur.com/a/tZVk1O8
(I couldn't upload it here due to a server error)
Focusing on the subquery:
SELECT COUNT(DISTINCT T2.salary)
FROM teacher AS T2
WHERE T2.salary > T1.salary
This will return the count of distinct teachers having a salary greater than the teacher, in each row of the teacher table. Asserting that this count be equal to 3 means that any matching teacher would have the 4th highest salary (since first 3 positions excluded).
Note that your logic should behave identically to DENSE_RANK. You could also have used:
WITH cte AS (
SELECT *, DENSE_RANK() OVER (ORDER BY salary DESC) rnk
FROM teacher
)
SELECT name
FROM cte
WHERE rnk = 4;
I wouldn't write it that way, but what it does is count, for each teacher, how many salaries are higher than his/her salary.
If 3 salaries are higher than a given teacher's salary then that teacher must be ranked 4th.
The performance of this query will be disastrous with large tables. You should use the rank window function instead.
Related
I'm trying to understand below query, how its working.
SELECT *
FROM Employee Emp1
WHERE (N-1) = (
SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employee Emp2
WHERE Emp2.Salary > Emp1.Salary
)
Lets say I have 5 distinct salaries and want to get 3rd largest salary. So Inner query will run first and then outer query ?
I'm getting confused how its being done in sql engine. Curious to know. Becasue if its 3rd largest then 3-1 = 2, so that 2 needs to be matched with inner count as well. How inner count is being operated.
Can anyone explain the how its working .. ?
The subquery is correlated subquery, so it conceptually executes once for each row in the outer query (database optimizations left apart).. What it does is count how many employees have a salary greater than the one on the row in the outer query: if there are 2 employee with a higher salary, then you know that the employee on the current row in the outer query has the third highest salary.
Another way to phrase this is to use row_number() for this:
select *
from (
select
e.*,
row_number() over(order by salary desc) rn
from employee e
) t
where rn = 3
Depending on how you want to handle duplicates, dense_rank() might also be an option.
SELECT * FROM (SELECT EMP.ID,RANK() OVER (ORDER BY SALARY DESC) AS NOS FROM EMPLOYEE) T WHERE T.NOS=3
Then from this select the one with any desired rank.
It is easier to understand when you run this query:
select e1.*,
(select count(distinct e2.salary)
from employee e2
where e2.salary > e1.salary) as n
from employee e1
This is my sample table:
create table employee(salary) as (
select * from table(sys.odcinumberlist(1500, 1200, 1400, 1500, 1100)));
so my output is:
SALARY N
---------- ----------
1500 0
1200 2
1400 1
1500 0
1100 3
As you can see that subquery counts, for each row, salaries which are greater than salary in current row. So for instance, for 1400 there is one DISTINCT greater salary (1500). 1500 appears twice in my table, but distinct makes that it is counted once. So 1400 is second in order.
Your query has this count moved to the where part and compared with required value. We have to substract one, because for highest salary there is no higher value, for second salary one row etc.
It's one of the methods used to find such values, newer Oracle versions introduced analytic functions (rank, row_number, dense_rank) which eliminates the need of using subqueries for such purposes. They are faster, more efficient. For your query dense_rank() would be useful.
I'm a beginner to oracle. In recent search I've seen WHERE N-1,3-2 ..so on.
How does it work in searching data?
This is my code attemtp so far:
SELECT name, salary
FROM #Employee e1
WHERE N-1 = (SELECT COUNT(DISTINCT salary) FROM #Employee e2
WHERE e2.salary > e1.salary)
It's a classic( and pretty old ) SQL query to get nth highest salary. I assume It is no longer used( I haven't seen ) in any production codes, but could be a favourite question among the interviewers.
The N you are referring to is not a column or some unknown entity but a placeholder which should translate to a valid integer or bind argument in the working query. It is a correlated subquery, a subquery that is evaluated once for each row processed by the outer query. The way it works is that it takes a count of distinct list of salary values from employees that have a salary greater than each one of employees coming from the outer query and restricts the result where that count is equal to N-1.Which means you get those rows with nth highest salaries.
A more commonly used way to do this would be to use analytic function dense_rank() ( or rank depending on your need ). Read the documentation for these functions in case you aren't aware of them.
SELECT first_name,
salary
FROM (
SELECT e.*,
dense_rank() OVER(
ORDER BY salary desc
) rn
FROM employees e
)
WHERE rn = 6; -- ( n = 6 )
In Oracle 12c and higher versions, even though the above query works, a handy option to use is the FETCH..FIRST syntax.
SELECT *
FROM employees
ORDER BY salary DESC OFFSET 6 ROWS FETCH FIRST 1 ROWS WITH TIES; --n=6
Tables:
Department (dept_id,dept_name)
Students(student_id,student_name,dept_id)
I am using Oracle. I have to print the name of that department that has the minimum no. of students. Since I am new to SQL, I am stuck on this problem. So far, I have done this:
select d.department_id,d.department_name,
from Department d
join Student s on s.department_id=d.department_id
where rownum between 1 and 3
group by d.department_id,d.department_name
order by count(s.student_id) asc;
The output is incorrect. It is coming as IT,SE,CSE whereas the output should be IT,CSE,SE! Is my query right? Or is there something missing in my query?
What am I doing wrong?
One of the possibilities:
select dept_id, dept_name
from (
select dept_id, dept_name,
rank() over (order by cnt nulls first) rn
from department
left join (select dept_id, count(1) cnt
from students
group by dept_id) using (dept_id) )
where rn = 1
Group data from table students at first, join table department, rank numbers, take first row(s).
left join are used is used to guarantee that we will check departments without students.
rank() is used in case that there are two or more departments with minimal number of students.
To find the department(s) with the minimum number of students, you'll have to count per department ID and then take the ID(s) with the minimum count.
As of Oracle 12c this is simply:
select department_id
from student
group by department_id
order by count(*)
fetch first row with ties
You then select the departments with an ID in the found set.
select * from department where id in (<above query>);
In older versions you could use RANK instead to rank the departments by count:
select department_id, rank() over (order by count(*)) as rnk
from student
group by department_id
The rows with rnk = 1 would be the department IDs with the lowest count. So you could select the departments with:
select * from department where (id, 1) in (<above query>);
In my Employee table, I wanted to find the 3rd highest salary. Someone provided me with the following query to do this:
SELECT *
FROM employee C1
WHERE 3 = (SELECT Count(DISTINCT( C2.salary ))
FROM employee C2
WHERE C2.salary >= C1.salary)
This query works, but I don't how it works. What kind of query is this?
As others have said, this type of query is called a correlated sub-query. It's a sub-query because there is a query within a query and it's correlated because the inner query references the outer query in its definition.
Consider the inner query:
SELECT Count(DISTINCT( C2.salary ))
FROM employee C2
WHERE C2.salary >= C1.salary
Conceptually, this inner query will be evaluated once for every row produced by the outer query before the WHERE clause is applied, basically once for every row in employee. It will produce a single value, the count of rows from employee where the salary is less than the salary of the outer row.
The outer query will only return records where the value produced by the inner query is exactly 3. Assuming unique salary values, there is only one row from the employee table where there will be exactly 3 records with a salary value greater than or equal to it (the one row) and that one row is necessarily the third-highest salary value.
It's clever, but unnecessarily weird and probably not as optimal as something more straightforward.
Maybe a better solution would have been
SELECT TOP 1 *
FROM (
SELECT TOP 3 * FROM employee ORDER BY Salary DESC
) t
ORDER BY Salary ASC
Easier to read and more efficient than a correlated sub-query.
You can also use Dense_Rank to rank the salaries greatest to least, then select the ones that are ranked 3rd. This will also prevent you from getting the wrong salary if the top 2 are identical like the other answers above mine are doing. This has a better looking execution plan than the Distinct count one also
SELECT *
FROM (
SELECT *,
DENSE_RANK() OVER (ORDER BY Salary DESC) salary_rank
FROM employee e
) t
WHERE salary_rank = 3
Could also rewrite this with a common table expression.
WITH top_three
AS
(
SELECT TOP 3 * FROM employee ORDER BY Salary DESC
)
SELECT TOP 1 *
FROM top_three
ORDER BY Salary ASC;
Or, if you need to look for other ranks in this you can use row_number().
WITH ranked
AS
(
SELECT rank = ROW_NUMBER()OVER(ORDER BY Salary DESC), *
FROM employee
ORDER BY Salary DESC
)
SELECT *
FROM ranked
WHERE rank = #whatever_rank_you_want;
How can I find the Nth highest salary in a table containing salaries in SQL Server?
You can use a Common Table Expression (CTE) to derive the answer.
Let's say you have the following salaries in the table Salaries:
EmployeeID Salary
--------------------
10101 50,000
90140 35,000
90151 72,000
18010 39,000
92389 80,000
We will use:
DECLARE #N int
SET #N = 3 -- Change the value here to pick a different salary rank
SELECT Salary
FROM (
SELECT row_number() OVER (ORDER BY Salary DESC) as SalaryRank, Salary
FROM Salaries
) as SalaryCTE
WHERE SalaryRank = #N
This will create a row number for each row after it has been sorted by the Salary in descending order, then retrieve the third row (which contains the third-highest record).
SQL Fiddle
For those of you who don't want a CTE (or are stuck in SQL 2000):
[Note: this performs noticably worse than the above example; running them side-by-side with an exceution plans shows a query cost of 36% for the CTE and 64% for the subquery]:
SELECT TOP 1 Salary
FROM
(
SELECT TOP N Salary
FROM Salaries
ORDER BY Salary DESC
) SalarySubquery
ORDER BY Salary ASC
where N is defined by you.
SalarySubquery is the alias I have given to the subquery, or the query that is in parentheses.
What the subquery does is it selects the top N salaries (we'll say 3 in this case), and orders them by the greatest salary.
If we want to see the third-highest salary, the subquery would return:
Salary
-----------
80,000
72,000
50,000
The outer query then selects the first salary from the subquery, except we're sorting it ascending this time, which sorts from smallest to largest, so 50,000 would be the first record sorted ascending.
As you can see, 50,000 is indeed the third-highest salary in the example.
You could use row_number to pick a specific row. For example, the 42nd highest salary:
select *
from (
select row_number() over (order by Salary desc) as rn
, *
from YourTable
) as Subquery
where rn = 42
Windowed functions like row_number can only appear in select or order by clauses. The workaround is placing the row_number in a subquery.
select MIN(salary) from (
select top 5 salary from employees order by salary desc) x
EmpID Name Salary
1 A 100
2 B 800
3 C 300
4 D 400
5 E 500
6 F 200
7 G 600
SELECT * FROM Employee E1
WHERE (N-1) = (
SELECT COUNT(DISTINCT(E2.Salary))
FROM Employee E2
WHERE E2.Salary > E1.Salary
)
Suppose you want to find 5th highest salary, which means there are total 4 employees who have salary greater than 5th highest employee. So for each row from the outer query check the total number of salaries which are greater than current salary. Outer query will work for 100 first and check for number of salaries greater than 100. It will be 6, do not match (5-1) = 6 where clause of outerquery. Then for 800, and check for number of salaries greater than 800, 4=0 false then work for 300 and finally there are totally 4 records in the table which are greater than 300. Therefore 4=4 will meet the where clause and will return
3 C 300.
try it...
use table_name
select MAX(salary)
from emp_salary
WHERE marks NOT IN (select MAX(marks)
from student_marks )
Simple way WITHOUT using any special feature specific to Oracle, MySQL etc.
Suppose in EMPLOYEE table Salaries can be repeated.
Use query to find out rank of each ID.
select *
from (
select tout.sal, id, (select count(*) +1 from (select distinct(sal) distsal from
EMPLOYEE ) where distsal >tout.sal) as rank from EMPLOYEE tout
) result
order by rank
First we find out distinct salaries. Then we find out count of distinct salaries greater than each row. This is nothing but the rank of that id. For highest salary, this count will be zero. So '+1' is done to start rank from 1.
Now we can get IDs at Nth rank by adding where clause to above query.
select *
from (
select tout.sal, id, (select count(*) +1 from (select distinct(sal) distsal from
EMPLOYEE ) where distsal >tout.sal) as rank from EMPLOYEE tout
) result
where rank = N;
The easiest method is to get 2nd higest salary from table in SQL:
sql> select max(sal) from emp where sal not in (select max(sal) from emp);
Dont forget to use the distinct keyword:-
SELECT TOP 1 Salary
FROM
(
SELECT Distinct TOP N Salary
FROM Salaries
ORDER BY Salary DESC
) SalarySubquery
ORDER BY Salary ASC
Solution 1: This SQL to find the Nth highest salary should work in SQL Server, MySQL, DB2, Oracle, Teradata, and almost any other RDBMS: (note: low performance because of subquery)
SELECT * /*This is the outer query part */
FROM Employee Emp1
WHERE (N-1) = ( /* Subquery starts here */
SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employee Emp2
WHERE Emp2.Salary > Emp1.Salary)
The most important thing to understand in the query above is that the subquery is evaluated each and every time a row is processed by the outer query. In other words, the inner query can not be processed independently of the outer query since the inner query uses the Emp1 value as well.
In order to find the Nth highest salary, we just find the salary that has exactly N-1 salaries greater than itself.
Solution 2: Find the nth highest salary using the TOP keyword in SQL Server
SELECT TOP 1 Salary
FROM (
SELECT DISTINCT TOP N Salary
FROM Employee
ORDER BY Salary DESC
) AS Emp
ORDER BY Salary
Solution 3: Find the nth highest salary in SQL Server without using TOP
SELECT Salary FROM Employee
ORDER BY Salary DESC OFFSET N-1 ROW(S)
FETCH FIRST ROW ONLY
Note that I haven’t personally tested the SQL above, and I believe that it will only work in SQL Server 2012 and up.
SELECT * FROM
(select distinct postalcode from Customers order by postalcode DESC)
limit 4,1;
4 here means leave first 4 and show the next 1.
Try this it works for me.
Very simple one query to find nth highest salary
SELECT DISTINCT(Sal) FROM emp ORDER BY Salary DESC LIMIT n,1