What select is executed first in a nested subquery? - sql

For such a subquery, what select is executed first?
SELECT name, salary, dept_id
FROM employee
WHERE salary >
( SELECT AVG(salary) FROM employee
WHERE dept_no =
( SELECT dept_no FROM employee
WHERE last_name =
( SELECT last_name FROM employee
WHERE salary > 50000))) ;
This: SELECT last_name FROM employee ?

SQL is a declarative language, not a procedural one. A query does not specify the execution path; it specifies the logic of the result set. So any of the subqueries could be "executed" first, depending on what the SQL optimizer decides to do.
That said, it is probably more important to understand the query logic than to understand how it is executed (at least at this stage). Your subqueries are all uncorrelated, so you can start with either the innermost or the outermost one and work from there. Something like:
Get all employees whose salary
is greater than the average salary for the department
where employees with the same last name
have a salary greater than 50,000
Whether that is how the query is executed is immaterial. Something like that is what the query will return.
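To make the inside-out reading concrete, here is a minimal sketch that runs the nested query and then the same logic one level at a time. It uses Python's built-in sqlite3 module rather than the asker's database, and the table contents are invented for illustration. It also uses dept_no throughout (the question mixes dept_id and dept_no), and the rows are chosen so that every subquery compared with = returns exactly one value.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (name TEXT, last_name TEXT, salary INTEGER, dept_no INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Lee", 60000, 10),
        ("Bob", "Kim", 40000, 10),
        ("Cid", "Roy", 45000, 20),
        ("Dee", "Poe", 30000, 20),
    ],
)

# The nested form, as in the question.
nested = conn.execute("""
    SELECT name, salary, dept_no
    FROM employee
    WHERE salary >
      (SELECT AVG(salary) FROM employee
       WHERE dept_no =
         (SELECT dept_no FROM employee
          WHERE last_name =
            (SELECT last_name FROM employee
             WHERE salary > 50000)))
""").fetchall()

# The same logic read inside-out, one level per statement.
(last_name,) = conn.execute(
    "SELECT last_name FROM employee WHERE salary > 50000"
).fetchone()
(dept_no,) = conn.execute(
    "SELECT dept_no FROM employee WHERE last_name = ?", (last_name,)
).fetchone()
(avg_salary,) = conn.execute(
    "SELECT AVG(salary) FROM employee WHERE dept_no = ?", (dept_no,)
).fetchone()
stepwise = conn.execute(
    "SELECT name, salary, dept_no FROM employee WHERE salary > ?", (avg_salary,)
).fetchall()

print(nested)
print(stepwise)
```

Both forms return the same rows here. That says nothing about the order in which the engine actually evaluated the subqueries; it only shows that reading the query inside-out gives the right result.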

Related

Using ROUND, AVG and COUNT in the same SQL query

I need to write a query that first counts the people working in each department, then calculates the average number of people working in a department, and finally rounds it to one decimal place. I tried so many different variations.
This is what I have so far. It's not the first version I tried, but I always get the same error message (ORA-00979 - not a group by expression):
SELECT department_id,
ROUND(AVG(c.cnumber),1)
FROM employees c
WHERE c.cnumber =
(SELECT COUNT(c.employee_id)
FROM employees c)
GROUP BY department_id;
I really don't know what to do at this point and would appreciate any help.
Try this (Oracle syntax) example from your description:
with department_count as (
SELECT department_id, COUNT(c.employee_id) as employee_count
FROM employees c
group by department_id
)
SELECT department_id,
ROUND(AVG(c.employee_count),1)
FROM department_count c
GROUP BY department_id;
But this query does not make much sense. COUNT returns a single integer per department, so in this case AVG over that one value returns the same number as COUNT.
Maybe you need to calculate the number of employees and the average salary per department?
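If the goal is instead the average headcount across all departments (an assumption about what the asker meant), the outer query should not group again. A minimal sketch using Python's built-in sqlite3 module and made-up rows, not the asker's Oracle table:

```python
import sqlite3

# Hypothetical data: three departments with 3, 2 and 2 employees.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, department_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 30), (7, 30)],
)

# Step 1: one row per department with its headcount.
per_department = conn.execute("""
    SELECT department_id, COUNT(employee_id) AS employee_count
    FROM employees
    GROUP BY department_id
    ORDER BY department_id
""").fetchall()

# Step 2: average those counts over all departments and round to one decimal.
# There is no GROUP BY in the outer query, so AVG sees every department row.
(avg_headcount,) = conn.execute("""
    WITH department_count AS (
        SELECT department_id, COUNT(employee_id) AS employee_count
        FROM employees
        GROUP BY department_id
    )
    SELECT ROUND(AVG(employee_count), 1)
    FROM department_count
""").fetchone()

print(per_department)
print(avg_headcount)
```

With counts of 3, 2 and 2, the average is 7/3, which rounds to 2.3.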

SQL query from Lynda.com

I have a question about two queries. Will these two queries give the same result? I am trying to find the average salary by department:
Select s1.department, avg(s1.salary)
From
(Select department, salary
From staff
Where salary > 100000) s1
Group by s1.department
vs
select department, avg(salary) as avg_salary
from staff
where salary > 100000
group by department
Yes, both queries return the same result.
The top query gets its data from a subselect, which in turn gets its data from the table, whereas the bottom query gets it straight from the table itself.
Both apply the same filter (salary > 100000) and there are no additional filters, so the result will be the same.
You can test it yourself, though; don't just take my word for it.
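One way to run that test is the sketch below. It uses Python's built-in sqlite3 module with invented staff rows; an ORDER BY is added to both queries only to make the comparison deterministic.

```python
import sqlite3

# Hypothetical staff rows for the comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [
        ("Sales", 120000),
        ("Sales", 90000),
        ("Sales", 140000),
        ("IT", 110000),
        ("IT", 80000),
        ("HR", 95000),
    ],
)

# Variant 1: filter inside a derived table, then group.
via_subselect = conn.execute("""
    SELECT s1.department, AVG(s1.salary)
    FROM (SELECT department, salary
          FROM staff
          WHERE salary > 100000) s1
    GROUP BY s1.department
    ORDER BY s1.department
""").fetchall()

# Variant 2: filter and group directly on the table.
direct = conn.execute("""
    SELECT department, AVG(salary) AS avg_salary
    FROM staff
    WHERE salary > 100000
    GROUP BY department
    ORDER BY department
""").fetchall()

print(via_subselect)
print(direct)
```

Note that both variants average only the salaries above 100000, so HR drops out entirely and the averages are not the overall department averages.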

ORACLE SQL dealing with different tables

I will explain the problem I am stuck on. I have a table named empl02 which contains Lastname, salary, and position for all the employees. I am asked to display Lastname, salary, and position for all employees making more money than the highest-paid member of a certain position; we will call this position 'server'. I cannot just do something simple like...
select Lastname,salary,position FROM empl02
WHERE
salary > 125000;
Rather, it must be dynamic. I feel the logic is pretty simple; I'm just not sure how to translate it into SQL. I am thinking of something along the lines of
"SELECT Lastname,salary,position from empl02 where salary > MAX(SALARY) of position(server)". How can I translate this task into SQL?
You need to retrieve the "reference" salary as a sub-query:
select lastname, salary, position
from empl02
where salary > (select max(salary)
from empl02
where position = 'server');
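A minimal sketch of this pattern, run through Python's built-in sqlite3 module with invented rows and the question's 'server' position as the reference group. The threshold is computed by the subquery, so it follows the data instead of being hard-coded.

```python
import sqlite3

# Hypothetical rows; the highest-paid server earns 125000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE empl02 (lastname TEXT, salary INTEGER, position TEXT)")
conn.executemany(
    "INSERT INTO empl02 VALUES (?, ?, ?)",
    [
        ("Adams", 90000, "server"),
        ("Baker", 125000, "server"),
        ("Clark", 130000, "manager"),
        ("Davis", 100000, "cook"),
        ("Evans", 150000, "owner"),
    ],
)

# The scalar subquery supplies the reference salary for the outer filter.
above_top_server = conn.execute("""
    SELECT lastname, salary, position
    FROM empl02
    WHERE salary > (SELECT MAX(salary)
                    FROM empl02
                    WHERE position = 'server')
    ORDER BY lastname
""").fetchall()

print(above_top_server)
```

If Baker's salary changes, the result changes with it and the query text stays the same.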

Oracle 11g: Write a query that lists the highest earners for each department

This is a problem I've spent hours on now, and tried various different ways. It HAS to use Subqueries.
"Write a query that lists the highest earners for each department. Include the last_name, department_id, and the salary for each employee."
I've done a ton of subquery methods, and nothing works. I either get an error, or "No rows return". I'm assuming because one of the department_id is null, but even with NVL(department_id), I'm still having trouble. I tried splitting the table, and had no luck. Textbook's no help, my instructor is kind of useless, please... any help at all.
Here's a snapshot of the values, if that helps.
https://www.dropbox.com/s/bxtntlzqixdizzp/helpme.png?dl=0
You can rank the values within each department - then pull only the first place ranks in the outer query.
select a.last_name
,a.department_id
,a.salary
from (
select last_name
,department_id
,salary
,rank() over (partition by department_id order by salary desc) as rnk
from tablename
) a
where rnk=1
The PARTITION BY clause groups together all employees who share the same department. Rows with a NULL department_id form a group of their own, so this should work regardless of the null value.
Within each group, ORDER BY salary DESC sorts the rows from highest to lowest salary and RANK() numbers them; employees tied for the top salary all get rank 1. You can run just the inner query to get an idea of what it does.
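The sketch below runs the ranking query on invented rows, including one employee without a department and one tie. It uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer), not Oracle 11g. For contrast it also runs a correlated MAX subquery, which is an assumption about the kind of query the asker tried; that form loses the NULL-department row, because a comparison with NULL is never true.

```python
import sqlite3

# Hypothetical rows: a tie in department 60 and one employee with no department.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (last_name TEXT, department_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Adams", 90, 24000),
        ("Baker", 90, 17000),
        ("Clark", 60, 9000),
        ("Davis", 60, 9000),
        ("Evans", 60, 4200),
        ("Frank", None, 7000),
    ],
)

# Rank inside each department, then keep only rank 1 in the outer query.
ranked = conn.execute("""
    SELECT a.last_name, a.department_id, a.salary
    FROM (
        SELECT last_name, department_id, salary,
               RANK() OVER (PARTITION BY department_id
                            ORDER BY salary DESC) AS rnk
        FROM employees
    ) a
    WHERE a.rnk = 1
""").fetchall()

# Correlated subquery: for Frank, "e2.department_id = NULL" matches nothing,
# MAX returns NULL, and "salary = NULL" filters the row out.
correlated = conn.execute("""
    SELECT e.last_name, e.department_id, e.salary
    FROM employees e
    WHERE e.salary = (SELECT MAX(e2.salary)
                      FROM employees e2
                      WHERE e2.department_id = e.department_id)
""").fetchall()

print(sorted(ranked, key=lambda row: row[0]))
print(sorted(correlated, key=lambda row: row[0]))
```

The ranked version returns Adams, Clark, Davis and Frank; the correlated version returns the same rows without Frank.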

Can I write a condition on a single column name of a relation in HAVING clause

Considering the following relational schema
customers(id, name, age, address, salary)
I tried a query
SELECT SUM(salary), age FROM customers
GROUP BY age HAVING age > 23 ; ...(1)
I was surprised to see that it worked fine and that I could write a single column condition also in HAVING clause.
Even this is also working
SELECT SUM(salary), age FROM customers
GROUP BY age, salary HAVING age > 23 AND salary >2000; ...(2)
Otherwise, I should have written it like this : (using WHERE clause)
SELECT SUM(salary), age FROM customers
WHERE age > 23 GROUP BY age; ...(3)
And
SELECT SUM(salary), age FROM customers
WHERE age > 23 AND salary >2000 GROUP BY age, salary ; ..(4)
But when I tried more combinations, I found that a column used in a HAVING condition must also be present in the GROUP BY clause.
Am I correct, or is there another way to write a single-column condition in the HAVING clause?
And why does this work at all? I had earlier studied that only conditions on aggregate functions can be written in the HAVING clause.
You're generally correct. The important thing is to understand how grouping works.
When using GROUP BY, the server scans the rows and buckets them into groups. Each group then acts as a single new row. When working with these new rows (in the SELECT, HAVING, or ORDER BY clauses), the server needs to know their attribute values. These values are aggregates of the underlying rows' attribute values, or expressions built from those aggregates.
When an attribute or expression appears in the GROUP BY clause, its value is the same for every row in a group, so any aggregate of it is fully determined and the server lets us simplify. We can write something like
SELECT object_type, count(*)
FROM user_objects
GROUP BY object_type
HAVING MAX(object_type) like '%O%'
ORDER BY MIN(object_type)
This works fine, but we can simply write
SELECT object_type, count(*)
FROM user_objects
GROUP BY object_type
HAVING object_type like '%O%'
ORDER BY object_type
which means exactly the same. If a column is not listed in the GROUP BY clause, the rule above no longer holds, so we cannot use that column directly without an aggregate.
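The equivalence of the two forms can be checked with a small sketch. It uses Python's built-in sqlite3 module and a hypothetical objects table standing in for Oracle's user_objects. SQLite is more permissive than Oracle about ungrouped columns, so this only demonstrates the equivalence, not the error Oracle raises for a column missing from GROUP BY.

```python
import sqlite3

# Hypothetical stand-in for user_objects with a single column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (object_type TEXT)")
conn.executemany(
    "INSERT INTO objects VALUES (?)",
    [
        ("TABLE",), ("TABLE",), ("INDEX",), ("VIEW",),
        ("PROCEDURE",), ("SYNONYM",), ("SYNONYM",),
    ],
)

# Explicit aggregates over the grouping column.
with_aggregates = conn.execute("""
    SELECT object_type, COUNT(*)
    FROM objects
    GROUP BY object_type
    HAVING MAX(object_type) LIKE '%O%'
    ORDER BY MIN(object_type)
""").fetchall()

# The grouping column used directly.
plain = conn.execute("""
    SELECT object_type, COUNT(*)
    FROM objects
    GROUP BY object_type
    HAVING object_type LIKE '%O%'
    ORDER BY object_type
""").fetchall()

print(with_aggregates)
print(plain)
```

Inside one group every row has the same object_type, so MAX and MIN of it are just that value, which is why both queries return the same rows.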
SELECT SUM(salary), age
FROM customers
GROUP BY age, salary
HAVING age > 23 AND salary >2000;
This gives you one record per age and salary, as you group by these. Afterwards you remove some of the result rows. Since every row in a group has the same salary, SUM(salary) is just that salary multiplied by the number of rows in the group.
If these are your records for instance:
salary  age  something
1000    30   100
1000    30   200
2000    30   300
2000    40   400
then you group like this:
salary  age  something
1000    30   100
             200
2000    30   300
2000    40   400
For the group 1000/30 the sum(something) is 300 and avg(something) is 150. But sum(salary) is 2000 (two rows of 1000 each), while avg(salary), min(salary), and max(salary) are all 1000, because every row in the group carries the same salary value.
The HAVING clause then keeps only the result rows where age is over 23 and salary is over 2000, and removes the rest. You could have excluded the unwanted records before grouping by using a WHERE clause instead, thus saving the DBMS some work. But you made the DBMS collect all age and salary groups first, only to say which ones you dismiss afterwards.
I agree, though, that it would be better if the DBMS raised an error telling you that sum(salary) makes little sense here, as salary is one of the grouping columns.
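The four sample rows can be run through a quick check of what each aggregate returns per group, and of whether filtering with HAVING or with WHERE makes a difference to the result. This is a sketch using Python's built-in sqlite3 module; the table name sample is invented, and the salary threshold is lowered to 1500 (an assumption) because no sample row has a salary above 2000.

```python
import sqlite3

# The sample rows from the answer, in a hypothetical table called "sample".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (salary INTEGER, age INTEGER, something INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1000, 30, 100), (1000, 30, 200), (2000, 30, 300), (2000, 40, 400)],
)

# Aggregates per (age, salary) group.
groups = conn.execute("""
    SELECT salary, age, SUM(something), AVG(something),
           SUM(salary), AVG(salary), MIN(salary)
    FROM sample
    GROUP BY age, salary
    ORDER BY age, salary
""").fetchall()

# Filter after grouping (HAVING) ...
having_rows = conn.execute("""
    SELECT SUM(salary), age
    FROM sample
    GROUP BY age, salary
    HAVING age > 23 AND salary > 1500
    ORDER BY age
""").fetchall()

# ... versus filter before grouping (WHERE).
where_rows = conn.execute("""
    SELECT SUM(salary), age
    FROM sample
    WHERE age > 23 AND salary > 1500
    GROUP BY age, salary
    ORDER BY age
""").fetchall()

for row in groups:
    print(row)
print(having_rows)
print(where_rows)
```

For the 1000/30 group, SUM(salary) comes out as 2000 while AVG and MIN stay at 1000. The HAVING and WHERE variants return identical rows; the difference is only in how much work the DBMS does before discarding data.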