How do I use subqueries efficiently as mentioned here - sql

select emp_name, salary
from emp12
where exists (select * from emp12 where salary > 3700)
I am running this query on SQL Server. I don't get any error, but the problem is that I have specified salary > 3700.
In the resulting output I get all the salary details, including the ones below 3700, and that's what I don't understand.
How is it possible that salaries below 3700 also show up in the output?
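A likely explanation is that the subquery is not correlated with the outer query: EXISTS only asks whether any row in emp12 has a salary above 3700, and if one does, the condition is true for every outer row. The sketch below reproduces this in SQLite through Python's sqlite3 module rather than SQL Server; the sample rows are made up, and only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp12 (emp_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp12 VALUES (?, ?)",
    [("Ann", 3000), ("Bob", 3700), ("Cid", 4200)],
)

# Uncorrelated EXISTS: the subquery does not refer to the outer row,
# so it is either true for all rows or false for all rows.
all_rows = conn.execute(
    "SELECT emp_name, salary FROM emp12 "
    "WHERE EXISTS (SELECT * FROM emp12 WHERE salary > 3700) "
    "ORDER BY salary"
).fetchall()

# Plain filter: the condition is checked against each row individually.
filtered = conn.execute(
    "SELECT emp_name, salary FROM emp12 WHERE salary > 3700 ORDER BY salary"
).fetchall()

print(all_rows)   # every row, because at least one salary exceeds 3700
print(filtered)   # only the rows above 3700
```

If the goal is simply the employees earning more than 3700, the plain WHERE salary > 3700 filter is enough and no subquery is needed.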

Related

How to group things that are not in a specific range in SQL?

I'm revising for an upcoming SQL test and I'm having trouble finding a query that meets one of the requirements of the questions. Here's the question:
The HR department needs to find the high-salary and low-salary employees. Modify your query from (7) to display the last name and salary for all employees whose salary is not in the range 5,000 through 12,000
Here's what I've got:
SELECT last_name, salary
FROM employees
WHERE salary BETWEEN 5000 AND 12000;
When I execute the query I get all of the employees who land between those 2 values, I need the employees who land outside of the range of the 2 values. Do I need to use '<' '>'?
Any help would be greatly appreciated :)
SELECT last_name, salary
FROM employees
WHERE salary NOT BETWEEN 5000 AND 12000
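To answer the '<' '>' part of the question: BETWEEN includes both endpoints, so NOT BETWEEN 5000 AND 12000 should behave like salary < 5000 OR salary > 12000. A small check in SQLite via Python's sqlite3 module follows; the rows are invented for illustration, and only the table and column names come from the question.

```python
import sqlite3

# Hypothetical rows, including both boundary values of the range.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Low", 4200), ("EdgeLow", 5000), ("Mid", 8000),
     ("EdgeHigh", 12000), ("High", 17000)],
)

not_between = conn.execute(
    "SELECT last_name, salary FROM employees "
    "WHERE salary NOT BETWEEN 5000 AND 12000 ORDER BY salary"
).fetchall()

# The same filter spelled out with comparison operators.
comparisons = conn.execute(
    "SELECT last_name, salary FROM employees "
    "WHERE salary < 5000 OR salary > 12000 ORDER BY salary"
).fetchall()

print(not_between)
print(comparisons)
```

Both forms exclude the boundary salaries 5000 and 12000, since those count as inside the range.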

Misuse of aggregate function AVG() in SQL

I have an Employees table which looks like this:
employee_id  employee_name  employee_salary
1            Tom            35000
2            Sarah          50000
3            David          45000
4            Rosie          55000
5            Michael        45000
I need to return the employees salary that is higher than the average salary but the below command is having an error saying '1 misuse of aggregate function AVG()'.
SELECT employee_salary
FROM Employees
WHERE employee_salary > AVG(employee_salary);
The output that I'm expecting to get is:
employee_id  employee_name  employee_salary
2            Sarah          50000
4            Rosie          55000
Please advise, thank you!
I need to write the SQL query to return the number of employees for each department.
I assume you're looking for something like this:
SELECT Department.department_id
,COUNT(Employees.employee_id) AS TotalEmployees
FROM Department
LEFT JOIN Employees
ON Employees.department_id = Department.department_id
GROUP BY Department.department_id
Also, I need to return the employees salary that is higher than the average salary
The simplest way for a beginner SQL programmer to return the salaries that are higher than average is probably something like this:
SELECT employee_salary
FROM Employees
WHERE employee_salary > (SELECT AVG(employee_salary)
FROM Employees)
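As a quick check, the subquery approach can be run against the sample table from the question. This sketch uses SQLite through Python's sqlite3 module, and selects all three columns (an adjustment to match the expected output in the question, not part of the answer above).

```python
import sqlite3

# Sample rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "employee_id INTEGER, employee_name TEXT, employee_salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, "Tom", 35000), (2, "Sarah", 50000), (3, "David", 45000),
     (4, "Rosie", 55000), (5, "Michael", 45000)],
)

# The aggregate lives in a scalar subquery, which is evaluated to a single
# value (the average, 46000 here) before the outer WHERE compares each row.
above_average = conn.execute(
    "SELECT employee_id, employee_name, employee_salary "
    "FROM Employees "
    "WHERE employee_salary > (SELECT AVG(employee_salary) FROM Employees) "
    "ORDER BY employee_id"
).fetchall()

print(above_average)
```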
As the others said, the other questions just require a bit of research. There are tonnes of resources out there to learn, but it takes time...
I need to write the SQL query to return the number of employees for each
department. However, my below command is not correct:
This is not what you ask for.
You get the join correct, but you ask for:
SELECT COUNT(Employees.employment_id)
This counts how often different employment ids exist, which is 1 for an employee in one department, or X, with X being the number of entries in the join. As the department_id entry is part of the Employees table, an employee cannot belong to several departments, so this cannot happen. It is not asking for what you want.
I'm using the LEFT JOIN here because I am returning the result from the
Employees table is this right?
Depends - a normal join should work here. Left is only sensible if the other side can be empty - which I would assume is not possible (there are no rows with Employees.department_id being NULL).
What you want is a COUNT(*) and a GROUP BY department_id. And obviously the department id:
SELECT Department.department_id, COUNT(*) FROM....
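One detail worth checking when choosing between the join types and between COUNT(*) and COUNT(column) is how a department without employees is reported. The sketch below is a minimal illustration in SQLite via Python's sqlite3 module; the rows, and the helper per_department, are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (department_id INTEGER PRIMARY KEY);
CREATE TABLE Employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
INSERT INTO Department VALUES (10), (20), (30);      -- 30 has no employees
INSERT INTO Employees VALUES (1, 10), (2, 10), (3, 20);
""")

def per_department(join, counted):
    """Hypothetical helper: count per department for a join type and COUNT argument."""
    sql = (
        f"SELECT Department.department_id, COUNT({counted}) "
        f"FROM Department {join} Employees "
        "ON Employees.department_id = Department.department_id "
        "GROUP BY Department.department_id "
        "ORDER BY Department.department_id"
    )
    return conn.execute(sql).fetchall()

# LEFT JOIN + COUNT(column): NULLs are not counted, so the empty department shows 0.
left_count_id = per_department("LEFT JOIN", "Employees.employee_id")
# LEFT JOIN + COUNT(*): the NULL-padded row is counted, so the empty department shows 1.
left_count_star = per_department("LEFT JOIN", "*")
# Inner join + COUNT(*): the empty department disappears from the result.
inner_count_star = per_department("JOIN", "*")

print(left_count_id)
print(left_count_star)
print(inner_count_star)
```

If every department is guaranteed to have at least one employee, all three variants agree; they only differ for departments with no matching rows.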
Furthermore, are there any tips to speed up SQL Server's performance?
Just pointing you to https://use-the-index-luke.com/ - indices are a cornerstone for any decent performance.
Ignoring your second question - one per question please.

Simple Subquery Statement, receiving an error

We are learning subqueries in Oracle SQL. I'm receiving an error "SQL command not properly ended" with an example from my textbook that should work.
I have tried re-spacing the subquery while keeping the exact code, and as far as I can tell this should work:
SELECT last_name, salary
FROM employees
WHERE salary > 11000
(SELECT salary
FROM employees
WHERE last_name='Abel');
ERROR at line 4: ORA-00933: SQL command not properly ended
There needs to be something between the 11000 and the following subselect. As an example, it might be that the following was intended:
SELECT last_name, salary
FROM employees
WHERE salary > 11000 AND
salary IN (SELECT salary
FROM employees
WHERE last_name='Abel');
Is this what you want?
SELECT last_name, salary
FROM employees
WHERE salary > 11000 AND
last_name = 'Abel';
This would return employees named "Abel" whose salary exceeds 11,000.
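Another possible reading, which is only a guess at what the textbook intended, is that the 11000 was an annotation showing what the subquery returns, and that the query compares salary directly against the subquery. The sketch below tries that form in SQLite via Python's sqlite3 module (not Oracle); the rows are hypothetical.

```python
import sqlite3

# Hypothetical rows; Abel's salary is set to 11000 to mirror the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Abel", 11000), ("King", 24000), ("Kochhar", 17000), ("Hunold", 9000)],
)

# Scalar subquery: the inner SELECT yields Abel's salary, and the outer
# query keeps everyone who earns more than that.
earn_more_than_abel = conn.execute(
    "SELECT last_name, salary FROM employees "
    "WHERE salary > (SELECT salary FROM employees WHERE last_name = 'Abel') "
    "ORDER BY salary DESC"
).fetchall()

print(earn_more_than_abel)
```

This form assumes the subquery returns exactly one row; with several employees named Abel, Oracle would raise an error for a single-row subquery returning more than one row.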

SQL query from Lynda.com

I have a question about two queries. Will these two queries give the same result? I am trying to find the average salary by department:
Select s1.department, avg(s1.salary)
From
(Select department, salary
From staff
Where salary > 100000) s1
Group by s1.department
vs
select department, avg(salary) as avg_salary
from staff
where salary > 100000
group by department
Yes, both give the same amounts back.
The top query gets its data from a subselect, which in turn gets its data from the table, whereas the bottom query gets it straight from the table itself.
There are no additional filters in there, so the result will be the same.
You can test it out, however; don't take my word for it.
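One way to test it, sketched here with SQLite via Python's sqlite3 module: load a few rows and compare the two result sets. The staff rows are invented for illustration, and an ORDER BY is added to both queries so the comparison does not depend on row order.

```python
import sqlite3

# Hypothetical staff rows; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("Eng", 120000), ("Eng", 150000), ("Eng", 90000),
     ("Ops", 110000), ("Ops", 80000), ("HR", 70000)],
)

# Filter inside a derived table, then aggregate.
via_subquery = conn.execute(
    "SELECT s1.department, AVG(s1.salary) "
    "FROM (SELECT department, salary FROM staff WHERE salary > 100000) s1 "
    "GROUP BY s1.department ORDER BY s1.department"
).fetchall()

# Filter and aggregate directly on the table.
direct = conn.execute(
    "SELECT department, AVG(salary) AS avg_salary "
    "FROM staff WHERE salary > 100000 "
    "GROUP BY department ORDER BY department"
).fetchall()

print(via_subquery)
print(direct)
```

Note that both queries average only the salaries above 100000, so neither gives the average salary of the whole department.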

Can I write a condition on a single column name of a relation in HAVING clause

Considering the following relational schema
customers(id, name, age, address, salary)
I tried a query
SELECT SUM(salary), age FROM customers
GROUP BY age HAVING age > 23 ; ...(1)
I was surprised to see that it worked fine and that I could write a single column condition also in HAVING clause.
Even this is also working
SELECT SUM(salary), age FROM customers
GROUP BY age, salary HAVING age > 23 AND salary >2000; ...(2)
Otherwise, I should have written it like this : (using WHERE clause)
SELECT SUM(salary), age FROM customers
WHERE age > 23 GROUP BY age; ...(3)
And
SELECT SUM(salary), age FROM customers
WHERE age > 23 AND salary >2000 GROUP BY age, salary ; ..(4)
But when I tried more combinations I found that
the column on which a condition is applied in the HAVING clause must also be present in the GROUP BY clause.
Am I correct, or is it possible to write a single-column condition in the HAVING clause in some other way?
Why does it work at all? I had earlier studied that we can write only conditions on aggregate functions in the HAVING clause.
You're generally correct. The important thing is to understand grouping itself.
When using GROUP BY, the server scans rows and buckets them into groups. Each group then works as a single new row. When operating on these new rows (in the SELECT, HAVING or ORDER BY clauses), the server needs to know their attribute values. These attribute values are aggregations of the rows' attribute values, or expressions over those aggregations.
When an attribute or expression is used in the GROUP BY clause, its aggregation values are fully deterministic, because every row in the group holds the same value, so the server lets us simplify. We can write something like
SELECT object_type, count(*)
FROM user_objects
GROUP BY object_type
HAVING MAX(object_type) like '%O%'
ORDER BY MIN(object_type)
This works fine, but we can simply write
SELECT object_type, count(*)
FROM user_objects
GROUP BY object_type
HAVING object_type like '%O%'
ORDER BY object_type
which means exactly the same. If a column is not mentioned in the GROUP BY clause, the rule above no longer holds, so we cannot use it directly without aggregation.
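To see that a HAVING condition on a grouped column and the equivalent WHERE condition give the same result, queries (1) and (3) from the question can be compared on a few rows. This is a minimal sketch in SQLite via Python's sqlite3 module; the customer rows are made up. SQLite is more permissive than most engines about ungrouped columns, so the sketch only checks the equivalence, not which queries a stricter DBMS would reject.

```python
import sqlite3

# Hypothetical rows for the customers(id, name, age, address, salary) schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER, name TEXT, age INTEGER, address TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [(1, "A", 22, "x", 1500), (2, "B", 25, "y", 2000),
     (3, "C", 25, "z", 3000), (4, "D", 30, "w", 4000)],
)

# Query (1): group everything first, then drop the groups with age <= 23.
having_rows = conn.execute(
    "SELECT SUM(salary), age FROM customers "
    "GROUP BY age HAVING age > 23 ORDER BY age"
).fetchall()

# Query (3): drop the rows with age <= 23 first, then group what is left.
where_rows = conn.execute(
    "SELECT SUM(salary), age FROM customers "
    "WHERE age > 23 GROUP BY age ORDER BY age"
).fetchall()

print(having_rows)
print(where_rows)
```

The results match because age is constant within each group, so filtering on it before or after grouping removes the same rows.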
SELECT SUM(salary), age
FROM customers
GROUP BY age, salary
HAVING age > 23 AND salary >2000;
This gives you one record per age and salary, as you group by these. Later you remove some of the result lines. Within a group every row has the same salary, so the sum of the salary is just that salary multiplied by the number of rows in the group.
If these are your records for instance:
salary  age  something
1000    30   100
1000    30   200
2000    30   300
2000    40   400
then you group like this:
salary  age  something
1000    30   100
             200
2000    30   300
2000    40   400
For the group 1000/30 the sum(something) is 300 and avg(something) is 150. But avg(salary) is 1000 and min(salary) is 1000, and so on, because it is just one salary value you are talking about; sum(salary) merely adds that one value once per row in the group, giving 2000 here.
The HAVING clause then keeps only the result lines where age is over 23 and salary is over 2000, and removes the rest. You could have removed these records from evaluation by using a WHERE clause instead, thus saving the DBMS some work. But you made the DBMS collect all age and salary groups first, only to say which ones you dismiss afterwards.
I agree though that it would be better if the DBMS raised an error telling you that sum(salary) is questionable here, as there is just the one salary value in the group.
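The behaviour of the aggregates on the four sample rows can be checked directly. The sketch below uses SQLite via Python's sqlite3 module; the table name sample_rows is hypothetical, while the column names and values are the ones from the example above.

```python
import sqlite3

# The four sample rows from the example; the table name is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_rows (salary INTEGER, age INTEGER, something INTEGER)")
conn.executemany(
    "INSERT INTO sample_rows VALUES (?, ?, ?)",
    [(1000, 30, 100), (1000, 30, 200), (2000, 30, 300), (2000, 40, 400)],
)

# One result row per (age, salary) group.
groups = conn.execute(
    "SELECT salary, age, SUM(something), AVG(something), "
    "       SUM(salary), AVG(salary), MIN(salary) "
    "FROM sample_rows GROUP BY age, salary ORDER BY age, salary"
).fetchall()

for row in groups:
    print(row)
```

For the 1000/30 group, AVG(salary) and MIN(salary) both return the group's salary, while SUM(salary) adds that value once per row in the group.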