INNER JOIN and Count POSTGRESQL - sql

I am learning PostgreSQL and inner joins. I have the following tables.
Employee
Id  Name      DepartmentId
1   John S.   1
2   Smith P.  1
3   Anil K.   2
Department
Id  Name
1   HR
2   Admin
I want a query that returns the department name and the number of employees in each department.
SELECT Department.name , COUNT(Employee.id) FROM Department INNER JOIN Employee ON Department.Id = Employee.DepartmentId Group BY Employee.department_id;
I don't know what I did wrong, as I am new to database queries.

When involving all rows or major parts of the "many" table, it's typically faster to aggregate first and join later. Certainly the case here, since we are after counts for "each department", and there is no WHERE clause at all.
SELECT d.name, COALESCE(e.ct, 0) AS nr_employees
FROM department d
LEFT JOIN (
SELECT department_id AS id, count(*) AS ct
FROM employee
GROUP BY department_id
) e USING (id);
Also made it a LEFT [OUTER] JOIN, to keep departments without any employees in the result. And COALESCE to report 0 employees instead of NULL in that case.
Related, with more explanation:
Query with LEFT JOIN not returning rows for count of 0
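To see the difference between the two join types on the question's data, here is a minimal sketch using Python's built-in sqlite3 module (not PostgreSQL, so treat it as an illustration only). The third department, 'Legal', is a made-up row added so that one department has no employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    -- 'Legal' is a hypothetical department without employees
    INSERT INTO department VALUES (1, 'HR'), (2, 'Admin'), (3, 'Legal');
    INSERT INTO employee VALUES (1, 'John S.', 1), (2, 'Smith P.', 1), (3, 'Anil K.', 2);
""")

# INNER JOIN: departments without employees disappear from the result.
inner_rows = conn.execute("""
    SELECT d.name, count(e.id)
    FROM department d
    JOIN employee e ON e.department_id = d.id
    GROUP BY d.id
    ORDER BY d.id
""").fetchall()

# Aggregate first, then LEFT JOIN; COALESCE turns the missing count into 0.
left_rows = conn.execute("""
    SELECT d.name, COALESCE(e.ct, 0) AS nr_employees
    FROM department d
    LEFT JOIN (
        SELECT department_id AS id, count(*) AS ct
        FROM employee
        GROUP BY department_id
    ) e USING (id)
    ORDER BY d.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Only the second result contains a row for the empty department.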
Your original query would work too, after fixing the GROUP BY clause:
SELECT department.name, COUNT(employee.id)
FROM department
INNER JOIN employee ON department.id = employee.department_id
Group BY department.id; --!
That's assuming department.id is the PRIMARY KEY of the table, in which case it covers all columns of that table, including department.name. And you may want LEFT JOIN like above.
Aside: Consider legal, lower-case names exclusively in Postgres. See:
Are PostgreSQL column names case-sensitive?

Related

how to count different values from different tuples into the same schema in sql

I have a table of hospitals details, department details linked to it, types of workers (the staff) in different tables, and their salary information.
I want to extract the following for each hospital: the average and the sum of the salaries, the number of nurses, the number of research doctors and the number of beds in all the departments of a specific hospital.
I built this view of all the workers salary information:
CREATE VIEW workers AS
SELECT hospcod, docsal as sal, 'treatdoc' as typework
FROM doc NATURAL JOIN treatdoc NATURAL JOIN dept
UNION
SELECT hospcod, nursal, 'nurse'
FROM nurse NATURAL JOIN dept
UNION
SELECT hospcod, docsal, 'rsrchdoc'
FROM doc NATURAL JOIN rsrchdoc NATURAL JOIN lab;
The departments and the labs have the hospital code column, which ties each worker's information to a specific hospital.
So I have one schema for all the staff with their roles: workers(hospital_code, salary, type_of_worker).
Here is the query I'm trying to build:
SELECT hospname, sum(workers.sal), avg(workers.sal), count(dept.numbed),
(SELECT count(typework) from workers where typework = 'nurse') nurse_num,
(SELECT count(typework) from workers where typework = 'rsrchdoc') rsrchdoc_num
FROM hosp NATURAL JOIN dept NATURAL JOIN workers
GROUP BY hospname;
I want to count, for each hospital, the number of nurses and the number of research doctors, but the counts should be correlated to the individual hospitals (the query above gives me the same number of nurses / rsrchdocs for every hospital). The columns should be grouped by hospname: the salary info (avg, sum) should cover all tuples, which I already get properly, but nurse_num should count only rows with typework = 'nurse', and rsrchdoc_num only rows with typework = 'rsrchdoc'.
Does someone have an idea how I can combine those columns in one query?
Thank you!
There is an error in your query, I will try to explain.
When you do:
(SELECT count(typework) from workers where typework = 'nurse') nurse_num,
You are getting a constant that is not affected by the GROUP BY you apply afterwards.
What you have to do is a JOIN (like you did in the view) that links the nurses and the research doctors to a specific hospital.
I will give an example in pseudocode:
SELECT hosp.hosp_name, sum(nurse.salary), avg(nurse.salary)
FROM hosp
JOIN nurse ON nurse.hosp_name = hosp.hosp_name
GROUP BY hosp.hosp_name
This query will give you one row per hospital with the sum and average of its nurses' salaries (a nurse who works in more than one hospital is counted once in each of them).
Then you have to do the same thing for the doctors, in a separate query.
SELECT hosp.hosp_name, sum(doctors.salary), avg(doctors.salary)
FROM hosp
JOIN doctors ON doctors.hosp_name = hosp.hosp_name
GROUP BY hosp.hosp_name
And finally you will have to join both (you may perform the aggregation first, in subqueries, to make it more readable):
SELECT hosp.hosp_name, sum_sal_doc, avg_sal_doc, sum_nur_doc, avg_nur_doc
FROM hosp
LEFT JOIN ( SELECT doctors.hosp_name, sum(doctors.salary) as sum_sal_doc, avg(doctors.salary) as avg_sal_doc
FROM doctors
GROUP BY doctors.hosp_name
) t1 ON t1.hosp_name = hosp.hosp_name
LEFT JOIN ( SELECT nurses.hosp_name, sum(nurses.salary) as sum_nur_doc, avg(nurses.salary) as avg_nur_doc
FROM nurses
GROUP BY nurses.hosp_name
) t2 ON t2.hosp_name = hosp.hosp_name
There must be a one-to-many relationship from hosp to dept and from hosp to workers, so if you join all three tables directly, every dept row is repeated for every worker of the same hospital (and vice versa), which inflates the aggregates. You therefore have to aggregate either dept or workers in a subquery that returns a single row per hospital, as follows:
SELECT h.hospname,
sum(w.sal) total_all_worker_sal,
avg(w.sal) avg_all_workers_sal,
d.numbed,
count(case when w.typework = 'nurse' then 1 end) nurse_num,
count(case when w.typework = 'rsrchdoc' then 1 end) rsrchdoc_num
FROM hosp h
JOIN (select hospital_code , sum(numbed) numbed
-- used SUM as numbed must be number of bed in department
-- COUNT will give you only number of department if you use count(d.numbed)
from dept
group by hospital_code) d ON h.hospital_code = d.hospital_code
JOIN workers w ON h.hospital_code = w.hospital_code -- join workers on their own hospital code (the question's view names this column hospcod)
GROUP BY h.hospital_code , h.hospname, d.numbed;
-- used h.hospital_code to separate the records if two hospitals have same name
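The following is a rough sketch of why the pre-aggregated subquery matters, run against SQLite through Python's sqlite3 module. The table layouts, the column name hospcod, and all sample rows are assumptions made up for illustration, and workers is a plain table here rather than the question's view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema and data
    CREATE TABLE hosp (hospcod INTEGER PRIMARY KEY, hospname TEXT);
    CREATE TABLE dept (hospcod INTEGER, numbed INTEGER);
    CREATE TABLE workers (hospcod INTEGER, sal INTEGER, typework TEXT);
    INSERT INTO hosp VALUES (1, 'General'), (2, 'City');
    INSERT INTO dept VALUES (1, 10), (1, 20), (2, 5);
    INSERT INTO workers VALUES
        (1, 100, 'nurse'), (1, 200, 'nurse'), (1, 300, 'rsrchdoc'),
        (2, 400, 'treatdoc'), (2, 600, 'nurse');
""")

# Naive three-way join: each worker row is repeated once per department.
naive_sum = conn.execute("""
    SELECT sum(w.sal)
    FROM hosp h
    JOIN dept d ON d.hospcod = h.hospcod
    JOIN workers w ON w.hospcod = h.hospcod
    WHERE h.hospcod = 1
""").fetchone()[0]

# Beds aggregated per hospital first, then conditional counts per worker type.
per_hospital = conn.execute("""
    SELECT h.hospname,
           sum(w.sal), avg(w.sal), d.numbed,
           count(CASE WHEN w.typework = 'nurse' THEN 1 END),
           count(CASE WHEN w.typework = 'rsrchdoc' THEN 1 END)
    FROM hosp h
    JOIN (SELECT hospcod, sum(numbed) AS numbed
          FROM dept GROUP BY hospcod) d ON d.hospcod = h.hospcod
    JOIN workers w ON w.hospcod = h.hospcod
    GROUP BY h.hospcod, h.hospname, d.numbed
    ORDER BY h.hospcod
""").fetchall()

print(naive_sum)     # inflated: the real salary total for hospital 1 is 600
print(per_hospital)
```

With two departments in hospital 1, the naive join doubles the salary total, while the second query reports each hospital's own totals and counts.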

subquery and join not giving the same result

1
select *
from employees
where salary > (select max(salary) from employees where department_id=50)
2
select *
from employees e left join
employees d
on e.DEPARTMENT_ID =d.DEPARTMENT_ID
where d.salary > (select max(salary) from employees where department_id=50)
Why does the second query return multiple records?
I want to achieve the same result as the 1st query using a join.
Thanks in advance.
Rocky, the first select is correct. Why do you want to do any join? Without further information the objective of the second select is not clear (nonsense).
I can't see the point of joining the table to itself on DEPARTMENT_ID. Anyway, the duplicates appear because you are joining the table to itself on a column that is not the primary key, so you are basically multiplying each employee by all the employees of the same department. This version eliminates the duplicates but is still no improvement over the first one.
select *
from employees e left join
employees d
on e.employee_ID = d.employee_ID
where d.salary > (select max(salary) from employees where department_id=50)
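The row multiplication is easy to reproduce. Below is a small sketch using Python's sqlite3 module; the six employees and their salaries are invented sample data, and count_rows is a hypothetical helper written only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: two employees in each of three departments
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                            department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (1, 50, 3000), (2, 50, 4000),
        (3, 60, 3500), (4, 60, 5000),
        (5, 70, 4500), (6, 70, 4000);
""")

def count_rows(join_column):
    # join_column comes only from the fixed strings below, never from user input
    sql = (f"SELECT count(*) FROM employees e "
           f"LEFT JOIN employees d ON e.{join_column} = d.{join_column}")
    return conn.execute(sql).fetchone()[0]

rows_by_department = count_rows("department_id")  # each employee meets every colleague
rows_by_employee = count_rows("employee_id")      # each employee meets only itself

print(rows_by_department, rows_by_employee)
```

Joining on the non-unique department_id yields 2 x 2 rows per department, whereas joining on the primary key keeps one row per employee.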
You are probably looking for an anti join. This is a pattern mainly used in a young DBMS where IN and EXISTS clauses are slow compared to joins, because the developers focused on joins only.
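You are looking for all employees whose salaries are greater than all salaries in department 50. In other words: WHERE NOT EXISTS a salary greater or equal in department 50. (Note one edge case: if department 50 has no salaries at all, max() yields NULL and your first query returns no rows, while NOT EXISTS returns every employee.)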
Your query can hence be written as:
select *
from employees e
where not exists
(
select null
from employees e50
where e50.department_id = 50
and e50.salary >= e.salary
);
As an anti join (an outer join where you dismiss all matches):
select *
from employees e
left join employees e50 on e50.department_id = 50 and e50.salary >= e.salary
where e50.salary is null;
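As a quick check that the three formulations agree, here is a sketch run against SQLite via Python's sqlite3 module. The sample rows are invented; department 50 is non-empty and no salary is NULL, which is the situation in which the queries are expected to match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; the highest salary in department 50 is 4000
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                            department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (1, 50, 3000), (2, 50, 4000),
        (3, 60, 3500), (4, 60, 5000),
        (5, 70, 4500), (6, 70, 4000);
""")

def ids(sql):
    # Return the employee ids produced by a query, as a plain list
    return [row[0] for row in conn.execute(sql)]

with_subquery = ids("""
    SELECT employee_id FROM employees
    WHERE salary > (SELECT max(salary) FROM employees WHERE department_id = 50)
    ORDER BY employee_id
""")

with_not_exists = ids("""
    SELECT e.employee_id FROM employees e
    WHERE NOT EXISTS (SELECT NULL FROM employees e50
                      WHERE e50.department_id = 50 AND e50.salary >= e.salary)
    ORDER BY e.employee_id
""")

with_anti_join = ids("""
    SELECT e.employee_id FROM employees e
    LEFT JOIN employees e50
           ON e50.department_id = 50 AND e50.salary >= e.salary
    WHERE e50.salary IS NULL
    ORDER BY e.employee_id
""")

print(with_subquery, with_not_exists, with_anti_join)
```

The anti join produces no duplicates because only rows without any match survive the IS NULL filter.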

SQL Join left join or left outer join

I have a question about SQL joins. I have a table employee with employeeid as the primary key and some other columns for the employee. There is another table called employeeaddress in which employeeid is a foreign key. One employee can have many employeeaddresses (a one-to-many relationship).
If I want to write a query which will fetch the following columns
employee.employeeid, employee.empname,
employeeaddress.employeeaddressid, employeeaddress.addr1,
employeeaddress.addr2
So there can be an employee with no employeeaddress. But anyway I wanted to fetch all the employees who may have zero or multiple addresses.
Do I need to apply left join or left outer join? I want the following result for a table that has 2 employees John and Michael where John has two employeeaddresses with employeeaddressid 21 and 22 and Michael has no employeeaddress
1, John, 21, addr1 for John, addr2 for John
1, John, 22, another addr1 for John, another addr2 for John
2, Michael, NULL , NULL , NULL
The above result is arranged in the following fashion
employee.employeeid, employee.empname, employeeaddress.employeeaddressid, employeeaddress.addr1, employeeaddress.addr2
Please help.
Based on your description it sounds like you're looking for a query as follows. If you also wanted the address details, you'll just have to add a left join to the outer query.
Also, as the comments have alluded to, LEFT JOIN is shorthand for LEFT OUTER JOIN; they produce the same results.
SELECT *
FROM employee
inner join
(
SELECT
employee.employeeid,
count(employeeaddress.employeeaddressid) as addresscount -- counts only matched addresses, so 0 when there are none
FROM employee
left join employeeaddress ON employeeaddress.employeeid = employee.employeeid
group by employee.employeeid
) counts on counts.employeeid = employee.employeeid
WHERE counts.addresscount = 0 -- Or 1, or 5 or > 1, etc.
LEFT JOIN should be all you need.
SQL Fiddle Example
SELECT e.employeeID ,
e.empName ,
ea.employeeAddressID ,
ea.addr1 ,
ea.addr2
FROM Employee e
LEFT JOIN EmployeeAddress ea ON ea.employeeID = e.employeeID
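The LEFT JOIN can be tried on the question's John and Michael example with Python's sqlite3 module. This is a sketch under assumed table definitions (the column types and the foreign-key column layout are guesses based on the description).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: employeeaddress.employeeid references employee.employeeid
    CREATE TABLE employee (employeeid INTEGER PRIMARY KEY, empname TEXT);
    CREATE TABLE employeeaddress (employeeaddressid INTEGER PRIMARY KEY,
                                  employeeid INTEGER, addr1 TEXT, addr2 TEXT);
    INSERT INTO employee VALUES (1, 'John'), (2, 'Michael');
    INSERT INTO employeeaddress VALUES
        (21, 1, 'addr1 for John', 'addr2 for John'),
        (22, 1, 'another addr1 for John', 'another addr2 for John');
""")

# LEFT JOIN keeps Michael, padding the address columns with NULL (None in Python).
rows = conn.execute("""
    SELECT e.employeeid, e.empname, ea.employeeaddressid, ea.addr1, ea.addr2
    FROM employee e
    LEFT JOIN employeeaddress ea ON ea.employeeid = e.employeeid
    ORDER BY e.employeeid, ea.employeeaddressid
""").fetchall()

for row in rows:
    print(row)
```

Replacing LEFT JOIN with LEFT OUTER JOIN gives the same rows, since the two spellings are synonyms.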

Representing 'not in' subquery as join

I am trying to convert the following query:
select *
from employees
where emp_id not in (select distinct emp_id from managers);
into a form where I represent the subquery as a join. I tried doing:
select *
from employees a, (select distinct emp_id from managers) b
where a.emp_id!=b.emp_id;
I also tried:
select *
from employees a, (select distinct emp_id from managers) b
where a.emp_id not in b.emp_id;
But it does not give the same result. I have tried the 'INNER JOIN' syntax as well, but to no avail. I have become frustrated with this seemingly simple problem. Any help would be appreciated.
Assume employee Data set of
Emp_ID
1
2
3
4
5
6
7
Assume Manager data set of
Emp_ID
1
2
3
4
5
8
9
select *
from employees
where emp_id not in (select distinct emp_id from managers);
The above isn't joining tables, so no Cartesian product is generated; you are just looking at the 7 employee records.
The above would result in 6 and 7. Why? Only 6 and 7 from the employee data aren't in the managers table. 8 and 9 in managers are ignored, as you're only returning data from employees.
select *
from employees a, (select distinct emp_id from managers) b
where a.emp_id!=b.emp_id;
The above didn't work because a Cartesian product is generated: all of employees to all of managers (with 7 records in each table, 7*7=49).
So instead of just evaluating the employee data like you did in the first query, you now evaluate every manager against every employee,
so select * results in
1,1
1,2
1,3
1,4
1,5
1,8
1,9
2,1
2,2...
Less the rows the where clause removes (those where the two IDs are equal)...
so 7*7-5, or 44 rows, since only the five IDs 1 to 5 appear in both tables. That's not what you wanted.
I also tried:
select *
from employees a, (select distinct emp_id from managers) b
where a.emp_id not in b.emp_id;
Again a Cartesian... All of Employee to ALL OF Managers
So this is why a left join works
SELECT e.*
FROM employees e
LEFT OUTER JOIN managers m
on e.emp_id = m.emp_id
WHERE m.emp_id is null
This says join on ID first, so don't generate a Cartesian product but actually join on a value to limit the results. But since it's a LEFT join, return EVERYTHING from the LEFT table (employees) and only the rows that match from managers.
So in our example, joining on e.emp_id = m.emp_id would return
1,1
2,2
3,3
4,4
5,5
6,NULL
7,NULL
Now the where clause is applied, so only
6,NULL
7,NULL are retained...
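The walk-through above can be replayed with Python's sqlite3 module on the same two sample data sets. This is only a sketch; the single-column table definitions are an assumption, since the real tables surely have more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Single-column tables holding the sample IDs from the walk-through
    CREATE TABLE employees (emp_id INTEGER);
    CREATE TABLE managers (emp_id INTEGER);
    INSERT INTO employees VALUES (1), (2), (3), (4), (5), (6), (7);
    INSERT INTO managers VALUES (1), (2), (3), (4), (5), (8), (9);
""")

not_in = [r[0] for r in conn.execute("""
    SELECT emp_id FROM employees
    WHERE emp_id NOT IN (SELECT DISTINCT emp_id FROM managers)
    ORDER BY emp_id
""")]

# Comma join with != : every employee paired with every non-equal manager id.
cross_join_rows = conn.execute("""
    SELECT count(*)
    FROM employees a, (SELECT DISTINCT emp_id FROM managers) b
    WHERE a.emp_id != b.emp_id
""").fetchone()[0]

# LEFT JOIN, then keep only the employees that found no manager row.
anti_join = [r[0] for r in conn.execute("""
    SELECT e.emp_id
    FROM employees e
    LEFT OUTER JOIN managers m ON e.emp_id = m.emp_id
    WHERE m.emp_id IS NULL
    ORDER BY e.emp_id
""")]

print(not_in, cross_join_rows, anti_join)
```

On this data the NOT IN query and the left join both return 6 and 7, while the comma join with != returns 44 pairings.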
Older, pre-ANSI-92 dialects (for example Sybase and older SQL Server versions) wrote left joins as *= in the where clause...
select *
from employees a, managers b
where a.emp_id *= b.emp_id --the * sits on the side whose rows are all kept, so *= is the LEFT join and =* the RIGHT join
and b.emp_ID is null;
But I find this notation harder to read as the join can get mixed in with the other limiting criteria...
Try this:
select e.*
from employees e
left join managers m on e.emp_id = m.emp_id
where m.emp_id is null
This will join the two tables. Then we discard all rows where we found a matching manager and are left with employees who aren't managers.
Your best bet would probably be a left join:
select
e.*
from employees e
left join managers m on e.emp_id = m.emp_id
where
m.emp_id is null;
The idea here is you're saying that you want to select everything from employees, including anything that matches in the manager table based on emp_id and then filtering out the rows that actually have something in the manager table.
Use Left Outer Join instead
select e.*
from employees e
left outer join managers m
on e.emp_id = m.emp_id
where m.emp_id is null
A left outer join preserves the rows from the e table even if they do not have a match in the m table, based on the emp_id field. Then we filter on where m.emp_id is null: give me all the rows from e where there's no matching record in the m table.
A bit more on the subject can be found here:
Visual representation of joins
from employees a, (select distinct emp_id from managers) b implies a cross join: all possible combinations between the tables (and you needed a left outer join instead)
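The MINUS keyword (Oracle; standard SQL and PostgreSQL call it EXCEPT) should do the trick, as long as both selects return the same columns. Note that this returns only the IDs, not the full employee rows:
SELECT e.emp_id FROM employees e
MINUS
SELECT m.emp_id FROM managers m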
Hope that helps...
select *
from employees
where Not (emp_id in (select distinct emp_id from managers));

Combining tables in SQL/QlikView

Is it possible to combine 2 tables with a join or similar construct so that all non-matching rows end up in one group? Something like this:
All employees with a department get their real department name, and all with no department end up in the group "Other".
Department:
SectionDesc ID
Dep1 500
Dep2 501
Employee:
Name ID
Anders 500
Erik 501
root 0
Output:
Anders Dep1
Erik Dep2
root Other
Best Regards Anders Olme
What you are looking for is an outer join:
SELECT e.name, d.name
FROM employee e
LEFT OUTER JOIN departments d ON e.deptid = d.deptid
This would give you a d.name of NULL for every employee without a department. You can change this to 'Other' with something like this:
CASE WHEN d.name IS NULL THEN 'Other' Else d.name END
(Other, simpler versions for different DBMSs exist, but this should work for most.)
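Applied to the tables from the question, the outer join plus the CASE expression can be sketched as follows with Python's sqlite3 module. The table and column names are taken from the question's sample data; COALESCE is shown alongside as one of the shorter variants, which SQLite and PostgreSQL both support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sample data from the question
    CREATE TABLE Department (SectionDesc TEXT, ID INTEGER);
    CREATE TABLE Employee (Name TEXT, ID INTEGER);
    INSERT INTO Department VALUES ('Dep1', 500), ('Dep2', 501);
    INSERT INTO Employee VALUES ('Anders', 500), ('Erik', 501), ('root', 0);
""")

with_case = conn.execute("""
    SELECT e.Name,
           CASE WHEN d.SectionDesc IS NULL THEN 'Other' ELSE d.SectionDesc END
    FROM Employee e
    LEFT OUTER JOIN Department d ON d.ID = e.ID
    ORDER BY e.ID DESC
""").fetchall()

# COALESCE returns its first non-NULL argument, so it does the same job here.
with_coalesce = conn.execute("""
    SELECT e.Name, COALESCE(d.SectionDesc, 'Other')
    FROM Employee e
    LEFT OUTER JOIN Department d ON d.ID = e.ID
    ORDER BY e.ID DESC
""").fetchall()

print(with_case)
print(with_coalesce)
```

Both queries place root, whose ID matches no department, in the group 'Other'.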
QlikView is a bit tricky, as all joins in QlikView are inner joins by default. There is some discussion in the online help about the different joins, short version is that you can create a new table based on different joins in the script that reads in your data. So you could have something like this in your script:
Emps: SELECT * FROM EMPLOYEES;
Deps: SELECT * FROM DEPARTMENTS;
/* or however else you get your data into QlikView */
EmpDep:
SELECT Emps.name, Deps.name
FROM EMPS LEFT JOIN Deps
In order for this join to work the column names for the join have to be the same in both tables. (If necessary, you can construct new columns for the join when loading the base tables.)