Unusual behavior of aggregate func + several joins - sql

Here is my RDB structure.
I am trying to count the number of departments and employees related to a single location.
select street_address, count(distinct(d.department_id)), count(emp.employee_id)
from locations loc
inner join departments d
on d.location_id = loc.location_id
inner join employees emp
on emp.department_id =d.department_id
group by street_address
Query execution result:
But without distinct when counting d.department_id, it produces the wrong result.
Could somebody explain what happens during query execution and why distinct fixes this issue?

You get the wrong count with count(d.department_id) because several employees can belong to the same department_id. The join to employees produces one row per employee, so each department_id is repeated once for every employee in that department, and the department count ends up equal to the employee count.
With count(distinct d.department_id), each department_id is counted only once, no matter how many employee rows it appears in.
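To see the row multiplication directly, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows (one location, two departments, four employees) and the trimmed-down column lists are assumptions made up for illustration; they are not the asker's actual data.

```python
import sqlite3

# In-memory database with a cut-down, hypothetical version of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations   (location_id INTEGER PRIMARY KEY, street_address TEXT);
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, location_id INTEGER);
    CREATE TABLE employees   (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
    INSERT INTO locations   VALUES (1, '10 Main St');
    INSERT INTO departments VALUES (10, 1), (20, 1);
    INSERT INTO employees   VALUES (100, 10), (101, 10), (102, 10), (103, 20);
""")

# The joined result has one row per employee, so department 10 appears 3 times.
joined_rows = conn.execute("""
    SELECT d.department_id, emp.employee_id
    FROM locations loc
    INNER JOIN departments d ON d.location_id = loc.location_id
    INNER JOIN employees emp ON emp.department_id = d.department_id
    ORDER BY emp.employee_id
""").fetchall()

# Aggregates run over those joined rows, not over the departments table itself.
counts = conn.execute("""
    SELECT loc.street_address,
           COUNT(d.department_id)          AS dept_rows,
           COUNT(DISTINCT d.department_id) AS dept_distinct,
           COUNT(emp.employee_id)          AS employees
    FROM locations loc
    INNER JOIN departments d ON d.location_id = loc.location_id
    INNER JOIN employees emp ON emp.department_id = d.department_id
    GROUP BY loc.street_address
""").fetchone()

print(joined_rows)
print(counts)
```

With this sample data the plain count of department_id matches the employee count, while the distinct count gives the number of departments.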


is this a subquery or just an inner join

I am studying the subquery concept. Below is a query taken from Wikipedia: https://en.wikipedia.org/wiki/Correlated_subquery
SELECT employees.employee_number, employees.name
FROM employees INNER JOIN
(SELECT department, AVG(salary) AS department_average
FROM employees
GROUP BY department) AS temp ON employees.department = temp.department
WHERE employees.salary > temp.department_average;
This SQL is a rewritten version of the correlated subquery below:
SELECT
employee_number,
name,
(SELECT AVG(salary)
FROM employees
WHERE department = emp.department) AS department_average
FROM employees AS emp;
And now my question:
Is the SQL in the rewritten version a subquery? I am confused about this part:
INNER JOIN
(SELECT department, AVG(salary) AS department_average
FROM employees
GROUP BY department) AS temp ON employees.department = temp.department
WHERE employees.salary > temp.department_average;
Welcome to Stack Overflow. This is certainly confusing, so I'd make it a little simpler by using two different tables and no table aliases.
I'd say if it's in the FROM clause, it's called a join:
SELECT employee_id, department_name
FROM employees JOIN departments USING (department_id);
If it's in the WHERE clause, it's called a subquery:
SELECT employee_id
FROM employees
WHERE employee_id IN ( -- IN rather than =, in case one employee manages several departments
SELECT manager_id
FROM departments
WHERE employees.employee_id = departments.manager_id);
If it's in the SELECT clause, it's called a scalar subquery (thanks, @Matthew McPeak):
SELECT employee_id,
(SELECT department_name
FROM departments
WHERE departments.department_id = employees.department_id)
FROM employees;
Not exactly. The equivalent would be a left join. The correlated version keeps all rows in the employees table, even when there is no match. The inner join requires that there be a match.
In general, the execution plans are not going to be exactly the same, because the SQL engine does not know beforehand whether all rows match.
With the additional filtering condition, the two versions are equivalent. Note that the filter for the correlated version requires a subquery or CTE because the where clause does not recognize column aliases.
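The difference between the two forms can be checked with a small sketch in Python's sqlite3. The three sample rows, including one employee with a NULL department (the "no match" case), are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_number INTEGER PRIMARY KEY, name TEXT,
                            department TEXT, salary REAL);
    -- Hypothetical data: Cy has no department, so nothing matches him.
    INSERT INTO employees VALUES (1, 'Ann', 'IT', 100), (2, 'Bob', 'IT', 200),
                                 (3, 'Cy', NULL, 150);
""")

derived = """(SELECT department, AVG(salary) AS department_average
              FROM employees GROUP BY department) AS temp"""

# Correlated scalar subquery: every employee row is kept.
correlated = conn.execute("""
    SELECT employee_number,
           (SELECT AVG(salary) FROM employees
            WHERE department = emp.department) AS department_average
    FROM employees AS emp ORDER BY employee_number
""").fetchall()

# INNER JOIN drops the unmatched row; LEFT JOIN keeps it.
inner = conn.execute(f"""
    SELECT e.employee_number, temp.department_average
    FROM employees AS e INNER JOIN {derived} ON e.department = temp.department
    ORDER BY e.employee_number
""").fetchall()
left = conn.execute(f"""
    SELECT e.employee_number, temp.department_average
    FROM employees AS e LEFT JOIN {derived} ON e.department = temp.department
    ORDER BY e.employee_number
""").fetchall()

# With the salary filter, the join form and the wrapped correlated form agree.
filtered_join = conn.execute(f"""
    SELECT e.employee_number FROM employees AS e
    INNER JOIN {derived} ON e.department = temp.department
    WHERE e.salary > temp.department_average
""").fetchall()
filtered_correlated = conn.execute("""
    SELECT employee_number
    FROM (SELECT employee_number, salary,
                 (SELECT AVG(salary) FROM employees
                  WHERE department = emp.department) AS department_average
          FROM employees AS emp)
    WHERE salary > department_average
""").fetchall()

print(correlated, inner, left, filtered_join, filtered_correlated)
```

The correlated version has to be wrapped in an outer query before filtering on department_average, which is the point about WHERE not seeing column aliases.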

Count the number of employees for every country

I have this task:
Count the number of employees for every country. Show only those countries where more than 20 employees work.
employee_id belongs to the Employees table.
country belongs to a different table, Countries, and we need country_name from that table.
I have no idea how to solve this task. Below is what I was able to create. I think we should use an inner join.
SELECT a.employee_id
, b.country_name
, COUNT(a.employee_id) AS count
FROM employees a
INNER JOIN countries b ON a.employee_id = b.country_name
GROUP BY b.country_name
WHERE employee_id >20;
I think I need help from the beginning.
Thanks
Your join doesn't seem correct, but as I don't know the table structure, I can't say what the right column is (I'm going to assume that it should be country_name). Even so, try this:
SELECT b.country_name
, COUNT(a.employee_id) AS count
FROM employees a
INNER JOIN countries b ON a.country_name = b.country_name
GROUP BY b.country_name
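HAVING COUNT(a.employee_id) > 20;
To filter on an aggregate you need the HAVING clause, which is applied after grouping. WHERE filters individual rows before grouping and must come before GROUP BY.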
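Here is a small runnable sketch of the same pattern in Python's sqlite3. Since the real table structure is unknown, the country_id join column is an assumption, and the threshold is lowered from 20 to 2 (the hypothetical min_employees parameter) so that a handful of sample rows is enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed structure: employees reference countries through country_id.
    CREATE TABLE countries (country_id TEXT PRIMARY KEY, country_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, country_id TEXT);
    INSERT INTO countries VALUES ('CA', 'Canada'), ('JP', 'Japan');
    INSERT INTO employees VALUES (1, 'CA'), (2, 'CA'), (3, 'CA'), (4, 'JP');
""")

min_employees = 2  # stands in for the 20 from the task

# HAVING filters whole groups after COUNT has been computed for each country.
big_countries = conn.execute("""
    SELECT c.country_name, COUNT(e.employee_id) AS employee_count
    FROM employees e
    INNER JOIN countries c ON c.country_id = e.country_id
    GROUP BY c.country_name
    HAVING COUNT(e.employee_id) > ?
""", (min_employees,)).fetchall()

print(big_countries)
```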

need to convert number to string with nvl and to_char in sql developer

I'm trying to write a query that will give me all the info from my departments table and join with employees table to get the names of the managers of all departments. I can get them except for one department with no manager, and for that I need to print out "no manager". I have tried using nvl and to_char in a WHERE clause but I don't think I am writing it correctly.
Here is the code I have written:
SELECT d.department_id,d.DEPARTMENT_NAME,d.LOCATION_ID,d.MANAGER_ID,
e.first_name||' '||e.last_name AS Manager
FROM departments d
JOIN employees e ON d.MANAGER_ID = e.employee_ID
WHERE NVL(TO_CHAR(d.MANAGER_ID),'No Manager');
When I run it without the WHERE clause, I get the correct output except for that one missing department.
What you need is an OUTER JOIN.
select d.department_id,
d.department_name,
d.location_id,
coalesce(to_char(d.manager_id), 'No Manager') as manager_id, -- to_char: Oracle requires COALESCE arguments of one datatype
coalesce(e.first_name||' '||e.last_name, 'No Manager') as manager
from departments d
left outer join
employees e
on d.manager_id = e.employee_id;
With an inner join, you will get only those departments which have a manager.
LEFT OUTER JOIN ensures the query returns all records from the table on the left, i.e. departments.
COALESCE is similar to NVL and is standard across different DBMS, while NVL is Oracle specific.
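The effect of the outer join can be tried out with a sketch in Python's sqlite3. This is not Oracle: SQLite has no TO_CHAR, so CAST(... AS TEXT) is swapped in for the number-to-string conversion, and the two departments and single employee are made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees   (employee_id INTEGER PRIMARY KEY,
                              first_name TEXT, last_name TEXT);
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY,
                              department_name TEXT, manager_id INTEGER);
    -- Hypothetical rows: Treasury has no manager.
    INSERT INTO employees   VALUES (100, 'Steven', 'King');
    INSERT INTO departments VALUES (90, 'Executive', 100), (120, 'Treasury', NULL);
""")

# LEFT OUTER JOIN keeps Treasury; COALESCE replaces the resulting NULLs.
with_outer_join = conn.execute("""
    SELECT d.department_name,
           COALESCE(CAST(d.manager_id AS TEXT), 'No Manager') AS manager_id,
           COALESCE(e.first_name || ' ' || e.last_name, 'No Manager') AS manager
    FROM departments d
    LEFT OUTER JOIN employees e ON d.manager_id = e.employee_id
    ORDER BY d.department_id
""").fetchall()

# The inner join silently drops the department without a manager.
with_inner_join = conn.execute("""
    SELECT d.department_name
    FROM departments d
    JOIN employees e ON d.manager_id = e.employee_id
    ORDER BY d.department_id
""").fetchall()

print(with_outer_join)
print(with_inner_join)
```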

sql query null data was not retrieved

Table DEPARTMENT
Table EMPLOYEE
The Operations department has no employees, so I expected the query to also return this row (image 1):
Department_ID=10, Department_Name=Operations, Employees=0
Why doesn't that happen?
SELECT EMPLOYEE.Department_ID, DEPARTMENT.Department_Name, Count(*) AS Employees
FROM EMPLOYEE right JOIN DEPARTMENT ON DEPARTMENT.Department_ID = EMPLOYEE.Department_ID
GROUP BY DEPARTMENT.Department_Name, EMPLOYEE.Department_ID
Since the principal data you care about for this query is coming from the DEPARTMENT table, you may want to consider rewriting your query to be:
SELECT DEPARTMENT.Department_ID, DEPARTMENT.Department_Name, Count(EMPLOYEE.Employee_ID) As Employees
FROM DEPARTMENT
LEFT JOIN EMPLOYEE ON EMPLOYEE.Department_ID = DEPARTMENT.Department_ID
GROUP BY DEPARTMENT.Department_ID, DEPARTMENT.Department_Name
The default join is an inner join, which only returns rows for which at least one row is found on both sides. Replace join with left join to retrieve departments without employees.
Example code:
SELECT d.Department_ID
, d.Department_Name
, count(e.Employee_ID) AS Employees
FROM Department d
LEFT JOIN
Employee e
ON d.Department_ID = e.Department_ID
GROUP BY
d.Department_ID
, d.Department_Name
This should do the trick. You could use a RIGHT JOIN if you list the EMPLOYEE table first, but that is a poor habit: soon your queries become a mix of LEFT and RIGHT joins, which is very hard to read, even for seasoned SQL professionals. By sticking with LEFT JOIN you keep the query maintainable and understandable. (In very rare circumstances a RIGHT JOIN may simplify a query that has a complex order of precedence, but I have only done that something like twice, to avoid adding parentheses around groups of joins.)
SELECT
D.Department_ID,
D.Department_Name,
Employees = Count(E.Employee_ID) -- Count(*) would report 1 for a department with no employees
FROM
dbo.DEPARTMENT D
LEFT JOIN dbo.EMPLOYEE E
ON D.Department_ID = E.Department_ID
GROUP BY
D.Department_ID,
D.Department_Name
Also, I recommend that you use aliases for your tables instead of full table names. The query becomes much easier to scan and understand when there is consistent use of aliases. Spelling out the entire table name all too often obscures other parts of the query.
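One detail worth checking with any of these LEFT JOIN variants is what gets counted. The sketch below, using Python's sqlite3 with two invented departments and two invented employees, puts COUNT(*) next to COUNT(e.Employee_ID) in one query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Department_ID INTEGER PRIMARY KEY, Department_Name TEXT);
    CREATE TABLE EMPLOYEE   (Employee_ID INTEGER PRIMARY KEY, Department_ID INTEGER);
    -- Hypothetical rows: Operations has no employees.
    INSERT INTO DEPARTMENT VALUES (10, 'Operations'), (20, 'Sales');
    INSERT INTO EMPLOYEE   VALUES (1, 20), (2, 20);
""")

# COUNT(*) counts joined rows, including the NULL-padded row that the LEFT JOIN
# produces for Operations. COUNT(e.Employee_ID) skips NULLs, so it yields 0.
rows = conn.execute("""
    SELECT d.Department_ID,
           d.Department_Name,
           COUNT(*)              AS joined_rows,
           COUNT(e.Employee_ID)  AS employees
    FROM DEPARTMENT d
    LEFT JOIN EMPLOYEE e ON e.Department_ID = d.Department_ID
    GROUP BY d.Department_ID, d.Department_Name
    ORDER BY d.Department_ID
""").fetchall()

print(rows)
```

Selecting and grouping by d.Department_ID rather than e.Department_ID matters for the same reason: the employee-side column is NULL for a department without employees.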

sql show current instance

Here is the assignment I was given:
List the employees who have transferred between departments during their employment. You should show their current department, and the date when they transferred to the current department. This is a pretty tough query. My solution was to use a subquery to determine which employees have been in more than one department, and use that result in the base query.
The problem is that I don't know how to display each transferred employee only once. An employee has been transferred if their employee ID appears in more than one record of the EmployeeDepartmentHistory table (e.g. EmployeeID 1 appears in both record 1 and record 2 because the person has been in two departments). How would I go about this? Here's what I have so far:
SELECT EmployeeDepartmentHistory.EmployeeID,Person.Contact.FirstName, Person.Contact.LastName, Department.Name
From HumanResources.Department INNER JOIN
HumanResources.EmployeeDepartmentHistory ON
HumanResources.Department.DepartmentID =
HumanResources.EmployeeDepartmentHistory.DepartmentID INNER JOIN
HumanResources.Employee ON HumanResources.EmployeeDepartmentHistory.EmployeeID
= HumanResources.Employee.EmployeeID INNER JOIN
Person.Contact ON HumanResources.Employee.ContactID = Person.Contact.ContactID
WHERE EmployeeDepartmentHistory.EmployeeID=(SELECT COUNT(HumanResources.EmployeeDepartmentHistory.EmployeeID)
FROM HumanResources.EmployeeDepartmentHistory
WHERE EmployeeDepartmentHistory.EmployeeID = Employee.EmployeeID
Group by EmployeeDepartmentHistory.EmployeeID)
I'm not sure I understand the requirement correctly, but have you tried a HAVING clause at the end?
That is:
HAVING COUNT(EmployeeDepartmentHistory.EmployeeID) = 2
(2 assumes that EmployeeDepartmentHistory also contains a row for the current department and that the employee transferred exactly once.)
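One possible way to combine the HAVING idea with the "current department" requirement is sketched below in Python's sqlite3. It rests on assumptions not confirmed in the question: that the history table has StartDate and EndDate columns, that the current assignment is the row whose EndDate is NULL, and that any employee with more than one history row counts as transferred (so it uses > 1 instead of = 2). Schema prefixes are dropped because SQLite has no schemas, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE EmployeeDepartmentHistory (
        EmployeeID INTEGER, DepartmentID INTEGER,
        StartDate TEXT, EndDate TEXT);  -- assumed: EndDate IS NULL marks the current row
    INSERT INTO Department VALUES (1, 'Engineering'), (2, 'Marketing');
    INSERT INTO EmployeeDepartmentHistory VALUES
        (1, 1, '2001-01-01', '2003-06-30'),
        (1, 2, '2003-07-01', NULL),
        (2, 1, '2002-01-01', NULL);
""")

# The subquery finds employees with more than one history row; the outer query
# then keeps only their current row, so each employee appears exactly once.
transferred = conn.execute("""
    SELECT h.EmployeeID, d.Name, h.StartDate
    FROM EmployeeDepartmentHistory h
    INNER JOIN Department d ON d.DepartmentID = h.DepartmentID
    WHERE h.EndDate IS NULL
      AND h.EmployeeID IN (SELECT EmployeeID
                           FROM EmployeeDepartmentHistory
                           GROUP BY EmployeeID
                           HAVING COUNT(*) > 1)
    ORDER BY h.EmployeeID
""").fetchall()

print(transferred)
```

The names from Person.Contact could be joined in the same way as in the original query.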