Why isn't "ANY" working properly?

I'm learning SQL using Oracle 10g. I need a query that returns the department with the most employees so I can use it in an UPDATE statement. I already solved it, but I couldn't figure out why this query doesn't work:
select deptno
from (select deptno,
count(*) num
from emp
group by deptno)
where not num < any(select count(deptno)
from emp
group by deptno)
It puzzles me more since according to the documentation it should be equivalent and optimized into the following:
select deptno
from (select deptno,
count(*) num
from emp
group by deptno )
where not exists( select deptno,
count(*)
from emp
having count(*) > num
group by deptno)
That one works without errors. The following also work:
select deptno
from (select deptno,
count(*) num
from emp
group by deptno)
where num = (select max(alias)
from (select count(deptno) alias
from emp
group by deptno))
and:
select deptno
from emp
group by deptno
having not count(deptno) < any( select count(deptno)
from emp
group by deptno)
Edit. Probably it'll help if I post the return values of the inner selects.
The first select returns:
Dept. Number Employees
30 6
20 5
10 3
The last one returns (3,5,6)
I checked them individually. It's also weird that if I put the values manually it works as expected and will return 30 as the department with most employees.
select deptno
from (select deptno,
count(*) num
from emp
group by deptno)
where not num < any(6,5,3)
I'm using Oracle 10g 10.2.0.1.0
Last edit, probably. I still don't know what's happening, but the behaviour is as if the last select were somehow returning null. So even if I remove the "not", it still doesn't select anything.
If someone is interested I also found this useful:
TSQL - SOME | ANY why are they same with different names?
Read the first answer. It's probably better to avoid the use of any/some, all.

Here's a similar example which may clarify things (Standard SQL, can be easily transformed for Oracle):
WITH T
AS
(
SELECT *
FROM (
VALUES (0),
(1),
(2),
(NULL)
) AS T (c)
)
SELECT DISTINCT c
FROM T
WHERE 1 > ALL (SELECT c FROM T T2);
This returns the empty set, which is reasonable: 1 > 1 and 1 > 2 are FALSE, so the value 1 is not greater than all values in the set. (Even without those rows, 1 > NULL is UNKNOWN, so the predicate could never be TRUE while the null is present.)
However, adding the NOT operator:
WHERE NOT 1 > ALL (SELECT c FROM T T2);
returns all values in the set, including the null value. At first glance this seems wrong: given that 1 > 2 is FALSE we can say with certainty that the value 1 is not greater than all values in the set, regardless of the null.
However, in this case the NOT simply flips the earlier result, i.e. the opposite of no rows is all rows! ;)
Further consider the negated comparison using a column (rather than the literal value 1):
WHERE NOT c > ALL (SELECT c FROM T T2);
This time it returns all rows except for the null value.
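The three variants above can be traced with a small pure-Python model of SQL's three-valued logic. This is only a sketch: the helper names (gt, sql_not, gt_all) are made up for illustration, None stands in for both NULL and UNKNOWN, and the rule that a WHERE clause keeps a row only when the predicate is TRUE is the standard behaviour as I understand it.

```python
def gt(a, b):
    """SQL-style a > b; None models NULL, and a None result models UNKNOWN."""
    if a is None or b is None:
        return None
    return a > b


def sql_not(v):
    """NOT under three-valued logic: NOT UNKNOWN stays UNKNOWN."""
    return None if v is None else (not v)


def gt_all(x, values):
    """x > ALL (values): FALSE if any comparison is FALSE,
    otherwise UNKNOWN if any comparison is UNKNOWN, otherwise TRUE."""
    results = [gt(x, v) for v in values]
    if any(r is False for r in results):
        return False
    if any(r is None for r in results):
        return None
    return True


T = [0, 1, 2, None]  # the column c of table T, including the NULL

# A WHERE clause keeps a row only when its predicate is TRUE.
rows_1 = [c for c in T if gt_all(1, T) is True]           # WHERE 1 > ALL (...)
rows_2 = [c for c in T if sql_not(gt_all(1, T)) is True]  # WHERE NOT 1 > ALL (...)
rows_3 = [c for c in T if sql_not(gt_all(c, T)) is True]  # WHERE NOT c > ALL (...)

print(rows_1, rows_2, rows_3)
```

In this model the null row drops out of the last query because every comparison involving it is UNKNOWN, and NOT UNKNOWN is still UNKNOWN.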

Correction (update)
not num < any(select ...)
should be the same as your other queries. You can also try this variation:
num >= ALL(select ...)
but I can't understand why yours is giving wrong results. Perhaps it is because of the precedence of NOT. Can you try this instead?
not ( num < ANY(select ...) )
Full queries:
select deptno
from (select deptno, count(*) num from emp group by deptno)
where num >= all(select count(deptno) from emp group by deptno)
and:
select deptno
from (select deptno, count(*) num from emp group by deptno)
where not ( num < any(select count(deptno) from emp group by deptno) )
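For comparison, here is a sketch that checks the expected answer on a throwaway database. It uses Python's sqlite3 module as a stand-in for Oracle, which is an assumption on my part: SQLite has no ANY/ALL, so the sketch runs the NOT EXISTS and MAX formulations from the question instead. The emp rows are invented to match the counts posted in the question (30 has 6, 20 has 5, 10 has 3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (empno integer, deptno integer)")
# Hypothetical rows that reproduce the counts from the question.
rows = list(enumerate([30] * 6 + [20] * 5 + [10] * 3))
conn.executemany("insert into emp values (?, ?)", rows)

# "No other department has a larger head count."
not_exists_form = conn.execute("""
    select deptno
    from (select deptno, count(*) num from emp group by deptno) d
    where not exists (select 1
                      from emp
                      group by deptno
                      having count(*) > d.num)
""").fetchall()

# "The head count equals the largest head count."
max_form = conn.execute("""
    select deptno
    from (select deptno, count(*) num from emp group by deptno)
    where num = (select max(n)
                 from (select count(deptno) n from emp group by deptno))
""").fetchall()

print(not_exists_form, max_form)
```

Both formulations should agree with what NOT (num < ANY (...)) and num >= ALL (...) are meant to return, so they make a handy cross-check when the quantified version behaves oddly.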

Related

GROUP BY - inline view and unjoined tables

I have the following Oracle query, where I attempt to find the department with the highest average salary. I would like to use an inline view (i.e. retain the b dataset) for this implementation, but I'm struggling to get the WHERE and GROUP BY components right. I know the GROUP BY and WHERE (which is nonexistent) below are wrong. But how do I correct them?
select a.deptno from emp a,
(select max(avg_sal) max_avg_sal from (select
avg(sal) avg_sal from emp group by deptno) ) b
group by a.deptno, b.max_avg_sal
having avg(a.sal) = b.max_avg_sal
Expected Result
deptno
10
Emp Structure
deptno staff sal
10 A 1000
10 B 1500
11 C 1100
12 D 1000
12 E 900
12 F 1000
Is this what you want?
select e.*
from (select e.*, avg(e.sal) over (partition by e.deptno) as avg_salary
from emp e
) e
order by avg_salary desc
fetch first 1 row only;
fetch first is available in Oracle 12c+. You can do similar things with an additional subquery in earlier versions.
You can use a subquery:
select deptno from tablename
group by deptno
having avg(sal) = (select max(asal) from (select avg(sal) as asal from tablename group by deptno) A)
The straightforward way is:
select deptno
from emp
group by deptno
order by avg(sal) desc
fetch first row with ties;
FETCH FIRST is available as of Oracle 12c.
In Oracle 11g we could use this instead:
select deptno
from
(
select deptno, avg(sal) as avg_salary, max(avg(sal)) over () as max_avg_salary
from emp
group by deptno
)
where avg_salary = max_avg_salary;
But you want an inline view, another word for a derived table (a subquery in the from clause). That looks way more clumsy. One example without FETCH FIRST and without window functions:
with d as
(
select deptno, avg(sal) as avg_salary
from emp
group by deptno
)
, dmax as
(
select max(avg_salary) as max_avg_salary
from d
)
select d.*
from d
join dmax on dmax.max_avg_salary = d.avg_salary;
I find this very obfuscated and don't recommend it at all. You can do the same without WITH clauses of course. Then it is even less readable.
I don't know why you'd want to write it this way, but if you really want only inline views and no windowing clauses, you can write it this way:
select b.deptno
from (SELECT deptno, avg(sal) avgsal from emp group by deptno ) b
cross join (SELECT max(avgsal) maxavgsal FROM (SELECT avg(sal) avgsal FROM emp group by deptno )) c
where b.avgsal = c.maxavgsal;
This is the same thing, if you don't like CROSS JOIN for some reason:
select b.deptno
from (SELECT deptno, avg(sal) avgsal from emp group by deptno ) b
inner join ( SELECT max(avgsal) maxavgsal FROM
( SELECT avg(sal) avgsal FROM emp group by deptno ) ) c
on b.avgsal = c.maxavgsal;
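As a quick sanity check, the inline-view approach can be run against the sample rows from the question. The sketch below uses Python's sqlite3 module as a stand-in for Oracle (my assumption, not something the question uses); the table layout simply mirrors the "Emp Structure" listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (deptno integer, staff text, sal integer)")
# Sample rows copied from the question's "Emp Structure" listing.
conn.executemany("insert into emp values (?, ?, ?)", [
    (10, "A", 1000), (10, "B", 1500),
    (11, "C", 1100),
    (12, "D", 1000), (12, "E", 900), (12, "F", 1000),
])

# b holds one average per department, c holds the single highest average.
top_depts = conn.execute("""
    select b.deptno
    from (select deptno, avg(sal) avgsal from emp group by deptno) b
    cross join (select max(avgsal) maxavgsal
                from (select avg(sal) avgsal from emp group by deptno)) c
    where b.avgsal = c.maxavgsal
""").fetchall()

print(top_depts)
```

Because c always has exactly one row, the cross join just attaches the maximum to every department row, and the WHERE clause keeps the department (or departments, on a tie) that reaches it.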

Explain how to find the 3rd MAX salary in the emp table

Could anyone please explain how the query below is executed at runtime?
select distinct sal
from emp e1
where 3 = (select count(distinct sal)
from emp e2
where e1.sal <= e2.sal);
select distinct sal
from emp e1
where 3 = (
select count(distinct sal)
from emp e2
where e1.sal <= e2.sal
)
It's a correlated query, which means the subquery runs for each row of the outer query.
The subquery returns the count of distinct salaries that are greater than or equal to the given salary.
For example, suppose the emp table contains the following values:
10
20
30
40
Say the outer query is at the row with sal = 40. The count returned by the subquery will be 1.
For sal = 30, count = 2
For sal = 20, count = 3
For sal = 10, count = 4
So the only row matching your criteria is the row with sal = 20, which is what you wanted.
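The walk-through above can be reproduced with a small script. It is a sketch that uses Python's sqlite3 module in place of Oracle (an assumption made only so the example is self-contained) and the four example salaries 10, 20, 30, 40; the column alias n_at_or_above is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (sal integer)")
conn.executemany("insert into emp values (?)", [(10,), (20,), (30,), (40,)])

# Show the per-row result of the correlated subquery.
counts = conn.execute("""
    select e1.sal,
           (select count(distinct e2.sal)
            from emp e2
            where e1.sal <= e2.sal) as n_at_or_above
    from emp e1
    order by e1.sal desc
""").fetchall()

# The original query: keep the salary that has exactly 3 distinct
# salaries at or above it.
third = conn.execute("""
    select distinct e1.sal
    from emp e1
    where 3 = (select count(distinct e2.sal)
               from emp e2
               where e1.sal <= e2.sal)
""").fetchall()

print(counts, third)
```

Changing the literal 3 to another value n picks the n-th highest distinct salary in the same way.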
A better way is to use dense_rank:
select distinct salary
from (
select t.*,
dense_rank() over (
order by salary desc
) as rnk
from your_table t
) t
where rnk = 3;
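I think a shorter way is to use the (rather new) function NTH_VALUE. It needs an ORDER BY and a frame that spans the whole result set, and it counts rows rather than distinct salaries, so duplicate salaries are not collapsed:
SELECT DISTINCT NTH_VALUE(salary, 3) OVER (ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM your_table;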

Selecting based on condition having multiple options

I'm totally confused about using aggregate functions after where clause or anywhere after mentioning the table_name
EMP Table as posted on http://viditkothari.co.in/post/27045365558/sql-commands-1
Query Info:
Display all the emp who have sal equal to any of the emp of deptno 30
Suggested query:
select *
from employee_4521
where sal having (select sal
from employee_4521
where deptno = 30);
Returns following error:
ERROR at line 1:
ORA-00920: invalid relational operator
with an asterisk marked under the 'h' of the having clause
There doesn't appear to be any reason to use an aggregate function here. Just use an IN or an EXISTS
select *
from employee_4521
where sal in (select sal
from employee_4521
where deptno=30);
or
select *
from employee_4521 a
where exists( select 1
from employee_4521 b
where b.deptno = 30
and a.sal = b.sal );

ORACLE sql query for getting top 3 salaries rownum greater than

I want to write a query to display employees getting top 3 salaries
SELECT *
FROM (SELECT salary, first_name
FROM employees
ORDER BY salary desc)
WHERE rownum <= 3;
But I don't understand how rownum is calculated for the nested query.
Will the following work? If it has a problem, please help me understand why:
SELECT *
FROM (SELECT salary, first_name
FROM employees
ORDER BY salary )
WHERE rownum >= 3;
I went through the question Oracle/SQL: Why does query "SELECT * FROM records WHERE rownum >= 5 AND rownum <= 10" - return zero rows, but it just points to another link, which does not give the answer.
a_horse_with_no_name's answer is a good one,
but just to help you understand why your 1st query works and your 2nd doesn't:
When you use the subquery, Oracle doesn't magically use the rownum of the subquery. It just receives the data already ordered and assigns rownum in the outer query as rows pass the filter: the first row that matches the criteria gets rownum 1, and so on. With rownum >= 3, the first candidate row would get rownum 1, fails the test, and is discarded, so no row ever reaches 3. This is why your 2nd query returns no rows.
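The behaviour described above can be imitated in a few lines of Python. This is only a mental model, not Oracle's actual implementation: the function name filter_with_rownum and the salary list are made up, and the key assumption is that the counter advances only when a row is accepted.

```python
def filter_with_rownum(rows, predicate):
    """Imitate ROWNUM assignment: each candidate row is offered the next
    number, and the counter only advances if the row passes the filter."""
    accepted = []
    rownum = 1
    for row in rows:
        if predicate(rownum):
            accepted.append((rownum, row))
            rownum += 1
        # A rejected row is dropped and the next row is offered the same number.
    return accepted


# Rows as delivered by the ordered inline view (hypothetical salaries).
ordered_salaries = [5000, 4000, 3000, 2000, 1000]

top_three = filter_with_rownum(ordered_salaries, lambda rn: rn <= 3)
from_third = filter_with_rownum(ordered_salaries, lambda rn: rn >= 3)

print(top_three)
print(from_third)
```

With rn >= 3 every row is offered number 1 and rejected, so the counter never moves, which matches the empty result of the second query.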
If you want to limit the starting row, you need to keep the subquery's rownum, i.e.:
SELECT *
FROM (SELECT t.*, rownum rn
FROM (SELECT salary, first_name
FROM employees
ORDER BY salary ) t ) sq
WHERE sq.rn >= 3;
But as a_horse_with_no_name said there are better options ...
EDIT: To make things clearer, look at this query:
with t as (
select 'a' aa, 4 sal from dual
union all
select 'b' aa, 1 sal from dual
union all
select 'c' aa, 5 sal from dual
union all
select 'd' aa, 3 sal from dual
union all
select 'e' aa, 2 sal from dual
order by aa
)
select sub.*, rownum main_rn
from (select t.*, rownum sub_rn from t order by sal) sub
where rownum < 4
Note the difference between the sub rownum and the main rownum, and see which one is used for the criteria.
The "rownum" of a query is assigned before an order by is applied to the result. So the row with rownum 42 could wind up being the first row.
Generally speaking you need to use the rownum from the inner query to limit your overall output. This is very well explained in the manual:
http://docs.oracle.com/cd/E11882_01/server.112/e26088/pseudocolumns009.htm#i1006297
I prefer using row_number() instead, because you have much better control over the sorting and additionally it's a standard feature that works on most modern DBMS:
SELECT *
FROM (
SELECT salary,
first_name,
row_number() over (order by salary) as rn
FROM employees
)
WHERE rn <= 3
ORDER BY salary;
You should understand that the derived table in this case is only necessary to be able to apply a condition on the generated rn column. It's not there to avoid the "rownum problem", as the value of row_number() only depends on the order specified in the over(...) part (it is independent of any ordering applied to the query itself).
Note this would not return employees that have the same salary and would still fall under the top three. In that case using dense_rank() is probably more appropriate.
If you want to select the people with the top 3 salaries, perhaps you should consider using analytics, something more like:
SELECT *
FROM (
SELECT salary, first_name, dense_rank() over(order by salary desc) sal_rank
FROM employees
)
WHERE sal_rank <= 3
i.e. ALL people with the 3rd highest (ranked) salary amount (or more).
The advantage of this over using plain rownum is that if you have multiple people with the same salary they will all be returned.
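To see the difference with ties, here is a sketch that runs both ranking functions over the same data. It uses Python's sqlite3 module rather than Oracle, and it assumes the bundled SQLite is version 3.25 or later (the first release with window functions). The names and salaries are invented, and top3 is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employees (first_name text, salary integer)")
# Hypothetical data with a tie for the highest salary.
conn.executemany("insert into employees values (?, ?)", [
    ("Ann", 5000), ("Bob", 5000), ("Cid", 4000), ("Dee", 3000), ("Eve", 2000),
])


def top3(rank_function):
    """Return salaries whose rank is <= 3 under the given window function.
    rank_function is only ever one of the fixed strings passed below."""
    sql = f"""
        select salary
        from (select salary,
                     {rank_function}() over (order by salary desc) as rnk
              from employees)
        where rnk <= 3
        order by salary desc
    """
    return [salary for (salary,) in conn.execute(sql)]


by_row_number = top3("row_number")  # exactly three rows
by_dense_rank = top3("dense_rank")  # everyone on the three highest salary levels

print(by_row_number, by_dense_rank)
```

row_number() cuts off after three rows regardless of ties, while dense_rank() keeps every employee on the three highest distinct salary levels.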
Easiest way to print the 5th highest salary:
SELECT MIN(SALARY) FROM (SELECT SALARY FROM EMPLOYEES ORDER BY SALARY DESC) WHERE ROWNUM BETWEEN 1 AND 5
In the same way, if you want to print the 3rd or 4th highest salary, just change the last value (i.e. instead of 5 use 3 or 4):
SELECT MIN(SALARY) FROM (SELECT SALARY FROM EMPLOYEES ORDER BY SALARY DESC) WHERE ROWNUM BETWEEN 1 AND 4
SELECT MIN(SALARY) FROM (SELECT SALARY FROM EMPLOYEES ORDER BY SALARY DESC) WHERE ROWNUM BETWEEN 1 AND 3
SELECT EMPNO,
SAL,
(SELECT SUM(E.SAL) FROM TEST E WHERE E.EMPNO <= T.EMPNO) R_SAL
FROM (SELECT EMPNO, SAL FROM TEST ORDER BY EMPNO) T
Easiest way to find the top 3 employees in oracle returning all fields details:
SELECT *
FROM (
SELECT * FROM emp
ORDER BY sal DESC)
WHERE rownum <= 3 ;
select *
from (
select emp.*,
row_number() over(order by sal desc)r
from emp
)
where r <= 3;
SELECT Max(Salary)
FROM Employee
WHERE Salary < (SELECT Max(salary) FROM employee WHERE Salary NOT IN (SELECT max(salary) FROM employee));

SQL: aggregate function and group by

Consider the Oracle emp table. I'd like to get the employees with the top salary with department = 20 and job = clerk. Also assume that there is no "empno" column, and that the primary key involves a number of columns. You can do this with:
select * from scott.emp
where deptno = 20 and job = 'CLERK'
and sal = (select max(sal) from scott.emp
where deptno = 20 and job = 'CLERK')
This works, but I have to duplicate the test deptno = 20 and job = 'CLERK', which I would like to avoid. Is there a more elegant way to write this, maybe using a group by? BTW, if this matters, I am using Oracle.
The following is slightly over-engineered, but is a good SQL pattern for "top x" queries.
SELECT
*
FROM
scott.emp
WHERE
(deptno,job,sal) IN
(SELECT
deptno,
job,
max(sal)
FROM
scott.emp
WHERE
deptno = 20
and job = 'CLERK'
GROUP BY
deptno,
job
)
Also note that this will work in Oracle and PostgreSQL (I think) but not MS SQL. For something similar in MS SQL see the question "SQL Query to get latest price".
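For anyone who wants to try the row-value IN pattern without an Oracle instance, here is a sketch using Python's sqlite3 module. That is an assumption on my part: SQLite accepts row values from version 3.15 on, and the four sample rows below are a hand-picked subset in the style of the scott.emp table rather than the full data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table emp (ename text, deptno integer, job text, sal integer)"
)
# A few sample rows; only the first two satisfy deptno = 20 and job = 'CLERK'.
conn.executemany("insert into emp values (?, ?, ?, ?)", [
    ("SMITH", 20, "CLERK", 800),
    ("ADAMS", 20, "CLERK", 1100),
    ("FORD", 20, "ANALYST", 3000),
    ("JAMES", 30, "CLERK", 950),
])

# The filter appears once, inside the subquery; the outer query matches
# on the whole (deptno, job, sal) tuple.
top_paid = conn.execute("""
    select ename
    from emp
    where (deptno, job, sal) in (select deptno, job, max(sal)
                                 from emp
                                 where deptno = 20
                                   and job = 'CLERK'
                                 group by deptno, job)
""").fetchall()

print(top_paid)
```

Dropping the inner WHERE clause would turn this into a "top earner per department and job" query, which is what makes the pattern reusable.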
If I was certain of the targeted database I'd go with Mark Nold's solution, but if you ever want some dialect agnostic SQL*, try
SELECT *
FROM scott.emp e
WHERE e.deptno = 20
AND e.job = 'CLERK'
AND e.sal = (
SELECT MAX(e2.sal)
FROM scott.emp e2
WHERE e.deptno = e2.deptno
AND e.job = e2.job
)
*I believe this should work everywhere, but I don't have the environments to test it.
In Oracle I'd do it with an analytical function, so you'd only query the emp table once :
SELECT *
FROM (SELECT e.*, MAX (sal) OVER () AS max_sal
FROM scott.emp e
WHERE deptno = 20
AND job = 'CLERK')
WHERE sal = max_sal
It's simpler, easier to read and more efficient.
If you want to modify it to list this information for all departments, then you'll need to use the "PARTITION BY" clause in OVER:
SELECT *
FROM (SELECT e.*, MAX (sal) OVER (PARTITION BY deptno) AS max_sal
FROM scott.emp e
WHERE job = 'CLERK')
WHERE sal = max_sal
ORDER BY deptno
That's great! I didn't know you could do a comparison of (x, y, z) with the result of a SELECT statement. This works great with Oracle.
As a side-note for other readers, the above query is missing a "=" after "(deptno,job,sal)". Maybe the Stack Overflow formatter ate it (?).
Again, thanks Mark.
In Oracle you can also use the EXISTS operator, which in some cases is faster.
For example...
SELECT name, number
FROM cust
WHERE cust_id IN
( SELECT cust_id FROM big_table )
AND entered > SYSDATE -1
would be slow.
but
SELECT name, number
FROM cust c
WHERE EXISTS
( SELECT cust_id FROM big_table WHERE cust_id=c.cust_id )
AND entered > SYSDATE -1
would be very fast with proper indexing. You can also use this with multiple parameters.
There are many solutions. You could also keep your original query layout by simply adding table aliases and joining on the column names, you would still only have DEPTNO = 20 and JOB = 'CLERK' in the query once.
SELECT
*
FROM
scott.emp emptbl
WHERE
emptbl.DEPTNO = 20
AND emptbl.JOB = 'CLERK'
AND emptbl.SAL =
(
select
max(salmax.SAL)
from
scott.emp salmax
where
salmax.DEPTNO = emptbl.DEPTNO
AND salmax.JOB = emptbl.JOB
)
It could also be noted that the key word "ALL" can be used for these types of queries which would allow you to remove the "MAX" function.
SELECT
*
FROM
scott.emp emptbl
WHERE
emptbl.DEPTNO = 20
AND emptbl.JOB = 'CLERK'
AND emptbl.SAL >= ALL
(
select
salmax.SAL
from
scott.emp salmax
where
salmax.DEPTNO = emptbl.DEPTNO
AND salmax.JOB = emptbl.JOB
)
I hope that helps and makes sense.