Finding couples of occurrences - Postgresql


I need to find the possible pairs of departments that each employee worked for. Each pair must consist of 2 different departments, so if an employee worked for more than 2 departments, the history has to be split into pairs of departments that show each transfer. The employee's contract at the first department must start earlier than the contract at the second department. The result should list the columns "first_name", "last_name", "deptnr1", "dept1", "deptnr2", "dept2".
For example, if John Doe worked for department A (deptnr 9) from 01/01/2016 to 06/06/2016 and for department B (deptnr 3) from 10/06/2016 to 12/12/2017, the result should be:
John, Doe, 9, A, 3, B
If he then returned to his job at department A, there should be another pair to make his 2nd transfer visible:
John, Doe, 3, B, 9, A
So if he transfers around, there should be one pair per transfer; in this case 2 transfers, thus 2 pairs out of 3 contracts (A => B => A gives A,B / B,A).
I have 4 tables.
Person (PK email, first_name, last_name, FK postcode, FK place_name)
Employee(PK employeenr, FK email)
Department(PK departmentnr, name, FK postcode, FK place_name)
Contract(PK periode_begin, PK periode_end, FK departmentnr, FK employeenr)
I have tried the query below, but I don't know how to use aliases to take values from, say, department.name and put them into separate columns such as dept1 and dept2. I also couldn't figure out how to build pairs out of, say, four transfers (A => B => C => D => E should give A,B / B,C / C,D / D,E).
SELECT
first_name,
last_name,
d1.departmentnr AS deptnr1,
d1.name AS dept1,
d2.departmentnr AS deptnr2,
d2.name AS dept2,
FROM person
INNER JOIN employee ON employee.email=person.email
INNER JOIN contract ON contract.employeenr = employee.employeenr
INNER JOIN department d1 ON department.departmentnr = contract.departmentnr
where contract.employeenr in
(SELECT employeenr FROM contract
GROUP BY employeenr HAVING COUNT(employeenr)>1
AND COUNT(employeenr)>1)

Use the window function lead():
select
    first_name,
    last_name,
    deptnr1,
    dept1,
    deptnr2,
    dept2
from (
    select
        first_name,
        last_name,
        departmentnr as deptnr1,
        name as dept1,
        lead(departmentnr) over w as deptnr2,
        lead(name) over w as dept2,
        periode_begin
    from person p
    join employee e using(email)
    join contract c using(employeenr)
    join department d using(departmentnr)
    window w as (partition by email order by periode_begin)
) s
where deptnr2 is not null
order by first_name, last_name, periode_begin
Read also about window functions in the documentation.
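To see how lead() pairs each contract with the next one, here is a minimal sketch that replays the John Doe example (A => B => A). It makes several assumptions that are not in the original post: it runs on SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, the person and employee tables are left out, the third contract's dates are invented, and the extra condition deptnr1 <> deptnr2 is my own addition to cover the "2 different departments" requirement.

```python
import sqlite3

# Hypothetical miniature of the schema: only department and contract.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (departmentnr INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contract (
        periode_begin TEXT, periode_end TEXT,
        departmentnr INTEGER, employeenr INTEGER
    );
    INSERT INTO department VALUES (9, 'A'), (3, 'B');
    -- ISO dates sort chronologically as text; the third row is invented.
    INSERT INTO contract VALUES
        ('2016-01-01', '2016-06-06', 9, 1),
        ('2016-06-10', '2017-12-12', 3, 1),
        ('2017-12-13', '2018-12-31', 9, 1);
""")

# lead() looks one row ahead inside each employee's partition, so every
# contract is paired with the contract that follows it chronologically.
transfers = conn.execute("""
    SELECT deptnr1, dept1, deptnr2, dept2
    FROM (
        SELECT c.departmentnr AS deptnr1,
               d.name AS dept1,
               lead(c.departmentnr) OVER (
                   PARTITION BY c.employeenr ORDER BY c.periode_begin
               ) AS deptnr2,
               lead(d.name) OVER (
                   PARTITION BY c.employeenr ORDER BY c.periode_begin
               ) AS dept2,
               c.periode_begin
        FROM contract c
        JOIN department d ON d.departmentnr = c.departmentnr
    ) s
    WHERE deptnr2 IS NOT NULL      -- the last contract has no successor
      AND deptnr1 <> deptnr2       -- assumption: skip renewals in the same department
    ORDER BY periode_begin
""").fetchall()

for row in transfers:
    print(row)
# (9, 'A', 3, 'B')
# (3, 'B', 9, 'A')
```

Three contracts produce two rows because the last contract has nothing after it, which is exactly what the "deptnr2 is not null" filter removes.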

Related

sub-queries are running fast but joining them is taking forever

I have two tables (sample below) with some additional columns that I have not shown here. The only way to join the two tables is by using a combination of first name, last name, and address.
table A (~3000 rows):
First Name | Last Name | Address
Jane       | Doe       | 123 Main St
Jack       | Jones     | 100 Chestnut St
Tom        | Locke     | 50 Market St
table B (~9M rows):
First Name | Last Name | Address
Jane       | Doe       | 123 Main St
Jack       | Jones     | 100 Chestnut St
Jeremy     | Thomas    | 27 Spruce St
I have tried the following code -
select * from
(select first_name, last_name, address, concat(first_name, last_name, address) as con_A
from table_A) as A
join
(select first_name, last_name, address, concat(first_name, last_name, address) as con_B
from table_B) as B
on A.con_A=B.con_B
The above code is a generalization of what my code looks like. I have tried to only put the columns I need in the sub queries in my original code.
Each of the two subqueries runs within seconds on its own, but the query takes over an hour to execute when I join them.

I don't know that I'd use inline tables for this ... why not just a direct join?
select
    A.first_name,
    A.last_name,
    A.address
from
    table_A A
    join table_B B on A.first_name = B.first_name AND
                      A.last_name = B.last_name AND
                      A.address = B.address
Now this is an inner join, so you'll only get exact matches for both. If you want to show records from one table whether they match or not, you'll need to use an outer join (left or right depending on the table you want to drive the results).
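The direct join can be tried on the sample rows from the question. This is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module rather than the asker's (unnamed) database, and the composite index idx_b_name_addr is a hypothetical addition that is not part of the original answer. Comparing the raw columns gives the database the option of using such an index, which a join on a concatenated expression generally cannot do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_A (first_name TEXT, last_name TEXT, address TEXT);
    CREATE TABLE table_B (first_name TEXT, last_name TEXT, address TEXT);
    INSERT INTO table_A VALUES
        ('Jane', 'Doe', '123 Main St'),
        ('Jack', 'Jones', '100 Chestnut St'),
        ('Tom', 'Locke', '50 Market St');
    INSERT INTO table_B VALUES
        ('Jane', 'Doe', '123 Main St'),
        ('Jack', 'Jones', '100 Chestnut St'),
        ('Jeremy', 'Thomas', '27 Spruce St');
    -- Hypothetical index on the large table, covering the three join columns.
    CREATE INDEX idx_b_name_addr ON table_B (first_name, last_name, address);
""")

# Inner join on the three columns: only people present in both tables remain.
matches = conn.execute("""
    SELECT A.first_name, A.last_name, A.address
    FROM table_A A
    JOIN table_B B
      ON A.first_name = B.first_name
     AND A.last_name  = B.last_name
     AND A.address    = B.address
    ORDER BY A.first_name
""").fetchall()

print(matches)
# [('Jack', 'Jones', '100 Chestnut St'), ('Jane', 'Doe', '123 Main St')]
```

Tom Locke and Jeremy Thomas each appear in only one table, so the inner join drops them.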

Instead of building a concatenated key in the subqueries, you can join on the columns directly for better query performance.
select A.first_name, A.last_name, A.address from
(select first_name, last_name, address from table_A) as A
join
(select first_name, last_name, address from table_B) as B
on A.first_name=B.first_name and A.last_name=B.last_name and A.address=B.address
One more thing: pick the join type based on your need. A plain (inner) join returns only the rows that match in both tables; use a left or right join if you also need the unmatched rows from one side. Joining 3k records with 9M records is a costly operation, because many row combinations have to be compared.

My SQL Join is only producing half the right aggregate output

SELECT E.DNO as DeptNum, COUNT(E.SSN) as EmployeeCount, COUNT(D.ESSN) as DependentCount
FROM Dependent D
RIGHT OUTER JOIN Employee E ON D.ESSN = E.SSN
GROUP BY E.DNO
The goal is to find the total number of employees and total number of dependents for every department. I am utilizing an Employee Table that features Employee, SSN, Department Number and a Dependent Table that has Dependent SSN, Birthdate, and Gender.
The output should be as follows:
Dept Num | Employee Count | Dependent Count
1        | 1              | 0
2        | 3              | 7
3        | 5              | 2
But instead I am getting:
Dept Num | Employee Count | Dependent Count
1        | 1              | 0
2        | 9              | 7
3        | 4              | 2
One thing of note: the dependent's SSN column holds the parent's (employee's) SSN, and that is the only way to define the relationship between the tables. Also, I know I need an outer join because we want to list ALL departments, even though Dept 1 has no rows in the Dependent table.
Can anyone tell me why this isn't working?

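Try using distinct on the employee count so that an employee with several dependents is not counted once per dependent. Keep the dependent count non-distinct: ESSN holds the parent's SSN, so COUNT(DISTINCT D.ESSN) would count the employees who have dependents rather than the dependents themselves.
SELECT E.DNO as DeptNum, COUNT(DISTINCT E.SSN) as EmployeeCount,
       COUNT(D.ESSN) as DependentCount
FROM Employee E LEFT JOIN
     Dependent D
     ON D.ESSN = E.SSN
GROUP BY E.DNO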
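The double counting is easy to reproduce on a tiny data set. The sketch below is an illustration under assumptions: it runs on SQLite through Python's sqlite3 module, the rows are invented, and the Dependent table is reduced to ESSN plus a Gender column. It compares COUNT(E.SSN) with COUNT(DISTINCT E.SSN) over the same outer join, counting dependents with a plain COUNT(D.ESSN) in both cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (SSN TEXT PRIMARY KEY, DNO INTEGER);
    CREATE TABLE Dependent (ESSN TEXT, Gender TEXT);
    -- Invented rows: department 2 has two employees, one of them with 3 dependents.
    INSERT INTO Employee VALUES ('111', 1), ('222', 2), ('333', 2);
    INSERT INTO Dependent VALUES ('222', 'F'), ('222', 'M'), ('222', 'F');
""")

query = """
    SELECT E.DNO,
           COUNT({employee_expr}) AS EmployeeCount,
           COUNT(D.ESSN) AS DependentCount   -- NULLs from the outer join are skipped
    FROM Employee E
    LEFT JOIN Dependent D ON D.ESSN = E.SSN
    GROUP BY E.DNO
    ORDER BY E.DNO
"""

# The join repeats employee 222 once per dependent, which inflates COUNT(E.SSN).
naive = conn.execute(query.format(employee_expr="E.SSN")).fetchall()
fixed = conn.execute(query.format(employee_expr="DISTINCT E.SSN")).fetchall()

print(naive)  # [(1, 1, 0), (2, 4, 3)]
print(fixed)  # [(1, 1, 0), (2, 2, 3)]
```

Department 1 still shows up with zero dependents because Employee is the preserved side of the outer join.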

JOIN - 2 tasks - sql developer

I have 2 tasks:
1. FIRST TASK
Show first_name, last_name (from employees), job_title, employee_id (from jobs), start_date, end_date (from job_history)
My idea:
SELECT s.employee_id
, first_name
, last_name
, job_title
, employee_id
, start_date
, end_date
FROM employees
INNER JOIN jobs hp
on s.employee_id = hp.employee_id
INNER JOIN job_history
on hp.jobs = h.jobs
I know it doesn't work. I'm receiving: "HP"."EMPLOYEE_ID": invalid identifier
What does "on s.employee_id = hp.employee_id" mean? Maybe I should write something else instead of this.
2. SECOND TASK
Show department_name (from departments), average and max salary for each department (those data are from employees) and how many employees are working in those departments (from employees). Choose only departments with more than 1 person. Round the results to 2 decimal places.
I have the pieces, but I don't know how to connect them.
My idea:
SELECT department_name,average(salary),max(salary),count(employees_id)
FROM employees
INNER JOIN departments
on employees_id = departments_id
HAVING count(department) > 1
SELECT ROUND(average(salary),2) from employees

I modified your queries a bit by improving table aliasing. Hopefully, if the right columns are present in the tables as you say, it should work:
SELECT s.employee_id, s.first_name, s.last_name,
hp.job_title, hp.employee_id,
h.start_date, h.end_date
FROM employees s
INNER JOIN jobs hp
on s.employee_id = hp.employee_id
INNER JOIN job_history h
on hp.jobs = h.jobs;
When we say on s.employee_id = hp.employee_id, it means that if, for example, employee_id = 1234 is present in both the employees and jobs tables, then SQL will bring all the columns from both tables onto the same row for employee_id = 1234. You can then pick different columns in the SELECT clause as if they were in a single table (which was not the case before joining). This is the main logic behind SQL joins.
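The "invalid identifier" error in the question suggests that jobs has no employee_id column at all. The sketch below shows the same join logic under an assumed layout modelled on Oracle's HR sample schema, where jobs is reached through a job_id column and job_history carries employee_id. These column names and the sample rows are assumptions, not something the post confirms, and the code runs on SQLite through Python's sqlite3 module rather than Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed columns, following the usual HR sample layout.
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                            first_name TEXT, last_name TEXT, job_id TEXT);
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY, job_title TEXT);
    CREATE TABLE job_history (employee_id INTEGER, start_date TEXT,
                              end_date TEXT, job_id TEXT);
    INSERT INTO employees VALUES (1234, 'Ada', 'Stone', 'IT_PROG');
    INSERT INTO jobs VALUES ('IT_PROG', 'Programmer'), ('ST_CLERK', 'Stock Clerk');
    INSERT INTO job_history VALUES (1234, '2015-01-01', '2016-12-31', 'ST_CLERK');
""")

# Each ON clause names the column pair that ties two tables together;
# rows whose values are equal are placed side by side in the result.
rows = conn.execute("""
    SELECT e.first_name, e.last_name, j.job_title, e.employee_id,
           h.start_date, h.end_date
    FROM employees e
    JOIN job_history h ON h.employee_id = e.employee_id  -- past assignments
    JOIN jobs j ON j.job_id = h.job_id                   -- title of each past job
""").fetchall()

print(rows)
# [('Ada', 'Stone', 'Stock Clerk', 1234, '2015-01-01', '2016-12-31')]
```

Every column in the SELECT list is prefixed with a table alias, which avoids ambiguity when the same column name exists in more than one table.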
As for your 2nd task, try the query below. I made some modifications to the aggregation by introducing COUNT(DISTINCT s.employees_id): if the same employees_id is present twice for some reason, you still want to count it as one person. The average is wrapped in ROUND(..., 2) to get the 2 decimal places the task asks for.
SELECT d.department_name, round(avg(s.salary), 2), max(s.salary), count(distinct s.employees_id)
FROM employees s
INNER JOIN departments d
on s.employees_id = d.departments_id
GROUP BY d.department_name
HAVING COUNT(DISTINCT s.employees_id) > 1;
Let me know if there is still any issue. Hopefully, this works.
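For the second task, here is a small runnable sketch of the GROUP BY / HAVING pattern. It assumes that employees and departments share a department_id column (as in Oracle's HR sample schema), which differs from the employees_id / departments_id names used in the question. The rows are invented and the code runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed columns: department_id links the two tables.
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY,
                              department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY,
                            salary REAL, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'IT'), (20, 'Sales');
    INSERT INTO employees VALUES (1, 1000.0, 10), (2, 2000.0, 10),
                                 (3, 2500.0, 10), (4, 3000.0, 20);
""")

stats = conn.execute("""
    SELECT d.department_name,
           ROUND(AVG(e.salary), 2) AS avg_salary,   -- rounded to 2 decimals
           MAX(e.salary) AS max_salary,
           COUNT(*) AS employee_count
    FROM employees e
    JOIN departments d ON d.department_id = e.department_id
    GROUP BY d.department_name
    HAVING COUNT(*) > 1            -- keep departments with more than 1 person
""").fetchall()

for name, avg_salary, max_salary, employee_count in stats:
    print(name, avg_salary, max_salary, employee_count)
```

Only the IT department is returned here: Sales has a single employee, so the HAVING clause filters it out after grouping.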

Select highest association when given record's level is random

There are three tables involved. DBMS is Oracle 10g.
Employees = individual employee records
    Emp_id (PK)
    Emp_name
    various detail fields
Department = contains hierarchical org structure
    Dept_code (PK)
    Department Name
    Parent_id (refers to dept_code in same table for parent dept)
    depth_level (1=highest level, 2=sub-depts of 1, etc ... max = 6)
    various detail fields
Association = mapping employees to departments
    Assoc_id (PK)
    emp_id (FK)
    dept_code (FK)
    other fields that relate different association types
Where the association maps employees to departments at various depths, I want to run a query that counts all employees grouped at depth = 2. If an employee works in a dept at level 6, I would need to resolve level 5, then level 4, then level 3 to get to level 2, but if they work in a dept at level 3, I only need to resolve to level 2.
Trying to figure out the most efficient way. So far, I'm looking at running 5 separate queries, one for each depth with varying numbers of subqueries to resolve the depth levels and then combining with union. My second thought was to create a static reference table to map each department code to a level 2 label, but maintaining that table would be problematic.
Anybody got any better ideas?

Recursive CTE saved the day. I apologize if my question wasn't clear; here is my solution, although I may have changed some of the field names from the original post. I plan to replace the static value for U.ID in the first part of the union query with a parameter that will accept any department code and retrieve its respective subordinate departments.
In this case dept code '5000002' is the IT department, the results display all employees in various levels of the IT department hierarchy.
select r.full_name, r.id, u.dept_name, u.dept_id, u.dept_level
from clarity.srm_resources r,
     clarity.PRJ_OBS_ASSOCIATIONS a,
     (with DIRECT_DEPT (Parent_ID, Dept_ID, Dept_Name, Dept_Level)
      as
      (
        SELECT U.PARENT_ID, U.ID AS DEPT_ID, U.NAME AS DEPT_NAME, 0 AS Dept_Level
        FROM clarity.prj_obs_units u
        where u.type_id = '5000001'
        AND U.ID = '5000002'
        UNION ALL
        SELECT U.PARENT_ID, U.ID AS DEPT_ID, U.NAME AS DEPT_NAME, Dept_Level + 1
        FROM clarity.prj_obs_units u
        INNER JOIN DIRECT_DEPT D
        ON U.PARENT_ID = D.DEPT_ID
        where u.type_id = '5000001'
      )
      SELECT Parent_ID, Dept_ID, Dept_Name, Dept_Level
      FROM DIRECT_DEPT) u
where a.record_id = r.id
and a.unit_id = u.dept_id
and a.table_name = 'SRM_RESOURCES'
and r.is_active = '1'
;
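The same recursive idea can also answer the original question directly, that is, counting employees grouped by their level-2 department. The following is a sketch under assumptions: it uses the simplified table and column names from the question with invented rows, and it runs on SQLite through Python's sqlite3 module, where the CTE needs the RECURSIVE keyword (Oracle's recursive WITH omits it and, as far as I know, requires 11g Release 2 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_code INTEGER PRIMARY KEY, dept_name TEXT,
                             parent_id INTEGER, depth_level INTEGER);
    CREATE TABLE association (assoc_id INTEGER PRIMARY KEY,
                              emp_id INTEGER, dept_code INTEGER);
    -- Invented hierarchy: Company > IT > Infrastructure > Networks, Company > Finance.
    INSERT INTO department VALUES
        (1, 'Company', NULL, 1),
        (2, 'IT', 1, 2),
        (3, 'Finance', 1, 2),
        (4, 'Infrastructure', 2, 3),
        (5, 'Networks', 4, 4);
    INSERT INTO association VALUES
        (1, 100, 2), (2, 101, 4), (3, 102, 5), (4, 103, 3);
""")

counts = conn.execute("""
    WITH RECURSIVE dept_tree (dept_code, level2_name) AS (
        -- Anchor: every level-2 department is its own level-2 ancestor.
        SELECT dept_code, dept_name
        FROM department
        WHERE depth_level = 2
        UNION ALL
        -- Step: a child inherits the level-2 ancestor of its parent.
        SELECT d.dept_code, t.level2_name
        FROM department d
        JOIN dept_tree t ON d.parent_id = t.dept_code
    )
    SELECT t.level2_name, COUNT(DISTINCT a.emp_id) AS employees
    FROM association a
    JOIN dept_tree t ON t.dept_code = a.dept_code
    GROUP BY t.level2_name
    ORDER BY t.level2_name
""").fetchall()

print(counts)
# [('Finance', 1), ('IT', 3)]
```

Walking down from level 2 labels every department with its level-2 ancestor in one pass, so no separate query per depth and no hand-maintained mapping table is needed.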

Two Columns with different Query select at a time

I have 2 tables, "Staffs" and "Staffjoins".
Staffs has two columns:
"sid" - teacher ID (primary key)
"Sname" - name of the teacher
Staffjoins has three columns:
"sid" - teacher ID, foreign key (references the Staffs table)
"cname" - college name
"Salary" - teacher salary
My question is:
I enter 10 rows into Staffs, each with a unique sid and an Sname.
Then I enter 10 rows into the Staffjoins table:
3 rows with cname="College1",
2 rows with cname="College2",
2 rows with cname="College3",
3 rows with cname="College4".
Every row has a "Salary" and a different sid. I want to get, for each college, the college name together with the name of the teacher who earns the highest salary.

Salary is in the wrong table, and there should be a colleges table.
The best you can do is something like this:
Select c.cname, t.Sname
From StaffJoins c
Inner join (Select cname, Max(Salary) as Salary From StaffJoins Group by cname) as biggestearners
on biggestearners.cname = c.cname and biggestearners.salary = c.salary
inner join Staffs t on c.sid = t.sid
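Here is a runnable sketch of this "highest earner per group" pattern. The teacher names and salaries are invented for illustration, only two colleges are included, and the code runs on SQLite through Python's sqlite3 module as a stand-in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staffs (sid INTEGER PRIMARY KEY, Sname TEXT);
    CREATE TABLE Staffjoins (sid INTEGER REFERENCES Staffs(sid),
                             cname TEXT, Salary INTEGER);
    -- Invented sample rows.
    INSERT INTO Staffs VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen'), (4, 'Dina');
    INSERT INTO Staffjoins VALUES
        (1, 'College1', 4000), (2, 'College1', 5500),
        (3, 'College2', 4800), (4, 'College2', 4200);
""")

# The derived table holds one row per college with its top salary; joining it
# back on (cname, Salary) keeps only the teachers who earn that amount.
top_earners = conn.execute("""
    SELECT c.cname, t.Sname
    FROM Staffjoins c
    JOIN (SELECT cname, MAX(Salary) AS Salary
          FROM Staffjoins
          GROUP BY cname) AS biggestearners
      ON biggestearners.cname = c.cname
     AND biggestearners.Salary = c.Salary
    JOIN Staffs t ON t.sid = c.sid
    ORDER BY c.cname
""").fetchall()

print(top_earners)
# [('College1', 'Ben'), ('College2', 'Chen')]
```

If two teachers at the same college share the top salary, this pattern returns both of them.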
