Join two of the same tables to another table and output the info of the (same) table in the same row - sql

Sorry for the bad/long title, but I don't know how else to put it.
What I want to do is join two 'A' tables and join them to the 'B' table, where both 'A' rows have a foreign key in common, and display info from both 'A' rows in the same output row while preventing duplicates such as those in the example in the pic:
I know the query is just doing its job, but is there a way to prevent 'duplicates' by comparing rows before output?
Here's what I tried. I know it may perform badly and there may be better ways, but this is for a mini-project with a small DB, where performance shouldn't really matter:
SELECT w.emp_id AS emp1_id, w2.emp_id AS emp2_id,
e.fname || ' ' || e.lname AS emp1_name, e1.fname || ' ' || e1.lname AS emp2_name,
e.jobtitle AS emp1_jobtitle, e1.jobtitle AS emp2_jobtitle, e2.fname || ' ' || e2.lname AS cs_name
FROM work_on w
LEFT JOIN work_on w2
on w.emp_id != w2.emp_id and w.ticket_id = w2.ticket_id
LEFT JOIN employee e
on w.emp_id = e.emp_id
LEFT JOIN employee e1
on w2.emp_id = e1.emp_id
LEFT JOIN ticket t
on t.ticket_id = w.ticket_id
LEFT JOIN customer_problem p
on p.problem_id = t.problem_id
LEFT JOIN employee e2
on e2.emp_id = p.emp_id
WHERE e2.emp_id = 20 and p.submit_date >= '2018-04-08'
and p.submit_date <= '2018-04-11' and e1.emp_id != e.emp_id
ORDER BY w.emp_id;
My tables:
Employee | Work_On   | Ticket     | Problem
---------+-----------+------------+-----------
emp_id   | work_id   | ticket_id  | problem_id
fname    | emp_id    | problem_id | emp_id
lname    | ticket_id |            |
In this case I'm trying to pair up two Employee rows via Work_On where they have the Ticket in common, plus another Employee who is connected to the ticket via the Problem table.

Here is one option using least/greatest:
SELECT DISTINCT
LEAST(w.emp_id, w2.emp_id) AS emp1_id,
GREATEST(w.emp_id, w2.emp_id) AS emp2_id,
LEAST(e.fname || ' ' || e.lname, e1.fname || ' ' || e1.lname) AS emp1_name,
GREATEST(e.fname || ' ' || e.lname, e1.fname || ' ' || e1.lname) AS emp2_name,
LEAST(e.jobtitle, e1.jobtitle) AS emp1_jobtitle,
GREATEST(e.jobtitle, e1.jobtitle) AS emp2_jobtitle,
e2.fname || ' ' || e2.lname AS cs_name
FROM work_on w
LEFT JOIN work_on w2
ON w.emp_id != w2.emp_id AND w.ticket_id = w2.ticket_id
LEFT JOIN employee e
ON w.emp_id = e.emp_id
LEFT JOIN employee e1
ON w2.emp_id = e1.emp_id
LEFT JOIN ticket t
ON t.ticket_id = w.ticket_id
LEFT JOIN customer_problem p
ON p.problem_id = t.problem_id
LEFT JOIN employee e2
ON e2.emp_id = p.emp_id
WHERE
e2.emp_id = 20 AND
p.submit_date >= '2018-04-08' AND
p.submit_date <= '2018-04-11' AND
e1.emp_id != e.emp_id
ORDER BY emp1_id;
To see why the least/greatest trick works, consider the following two records/columns:
emp1_id | emp2_id
2 | 15
15 | 2
It should be clear that while these records are distinct now, if we instead choose the least id followed by the greatest id, they appear identical:
LEAST(emp_id1, emp_id2) | GREATEST(emp_id1, emp_id2)
2 | 15
2 | 15
Then, using SELECT DISTINCT removes one of the two duplicate rows.
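To make the effect concrete, here is a minimal sketch run against an in-memory SQLite database from Python. The three work_on rows are made-up sample data, and SQLite's two-argument scalar MIN/MAX stand in for LEAST/GREATEST (an assumption of this sketch, since SQLite has no LEAST/GREATEST). It also shows an alternative that avoids DISTINCT altogether by joining only when the first id is smaller.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work_on (work_id INTEGER PRIMARY KEY, emp_id INTEGER, ticket_id INTEGER);
    -- hypothetical data: employees 2 and 15 share ticket 100, employee 7 works alone
    INSERT INTO work_on (emp_id, ticket_id) VALUES (2, 100), (15, 100), (7, 200);
""")

# Plain self-join: the same pair appears twice, once in each order.
both_orders = conn.execute("""
    SELECT w.emp_id, w2.emp_id
    FROM work_on w
    JOIN work_on w2 ON w.ticket_id = w2.ticket_id AND w.emp_id != w2.emp_id
    ORDER BY w.emp_id
""").fetchall()

# Smaller id first, larger id second: both rows become identical, DISTINCT keeps one.
normalised = conn.execute("""
    SELECT DISTINCT MIN(w.emp_id, w2.emp_id), MAX(w.emp_id, w2.emp_id)
    FROM work_on w
    JOIN work_on w2 ON w.ticket_id = w2.ticket_id AND w.emp_id != w2.emp_id
""").fetchall()

# Alternative: require w.emp_id < w2.emp_id so each pair is produced only once.
ordered_join = conn.execute("""
    SELECT w.emp_id, w2.emp_id
    FROM work_on w
    JOIN work_on w2 ON w.ticket_id = w2.ticket_id AND w.emp_id < w2.emp_id
""").fetchall()

print(both_orders)   # [(2, 15), (15, 2)]
print(normalised)    # [(2, 15)]
print(ordered_join)  # [(2, 15)]
```

One thing to watch with the least/greatest version: applying LEAST/GREATEST to the ids and to the names separately sorts each pair of columns independently, so emp1_name is not guaranteed to belong to emp1_id. The `<` join condition keeps each employee's columns together, which may make it the safer choice when names or job titles are shown.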

Related

How to access columns from correlated subquery

With the data below, I am trying to find employees of type Manager whose salary equals the sum of the salaries of the other employees in the same location and department, along with the employee IDs involved. Because the emp_ids differ, my query does not return any result. The group being summed should also contain more than one record,
so the result should not include the Raghu record. I also want the emp_ids of all the records that matched:
EMP_ID EMP_Name EMP_Loc EMP_Dept EMP_Sal Emp_type
1 Arjun Hyd Comp 1000 Manager
2 Ramesh Hyd Comp 500 Interim
3 Ragav Hyd Comp 300 Interim
4 Rajesh Hyd Comp 200 Interim
5 Raghu Hyd Comp 1000 Interim
select a.emp_dept , a.emp_loc ,a.emp_dept,b.emp_dept,a.emp_id,b.emp_id
from
(select sum(emp_sal) as sett,emp_loc,emp_dept,emp_id
from employee
where emp_type = 'Interim'
group by emp_loc,emp_dept,emp_id having count(emp_sal)>1
) a
inner join
(select emp_sal ,emp_loc,emp_dept,emp_id
from employee
where emp_type = 'Manager'
) b
on a.sett=b.emp_sal and a.emp_loc=b.emp_loc and a.emp_dept=b.emp_dept;
Sample data
solution:
SELECT a.EMP_ID, a.EMP_Name, a.EMP_Loc, a.EMP_Dept, a.EMP_Sal, a.Emp_type
FROM employee AS a INNER JOIN
(SELECT SUM(EMP_Sal) AS sal, EMP_Loc,EMP_Dept
FROM employee AS employee_1
WHERE (Emp_type = 'Interim')
GROUP BY EMP_Loc, EMP_Dept having Count(EMP_Sal)>1) AS b
ON a.EMP_Dept = b.EMP_Dept AND a.EMP_Loc = b.EMP_Loc AND a.EMP_Sal = b.sal
WHERE a.Emp_type= 'Manager'
Results:
Your problem is that you are including the emp_id in your first sub-select and grouping on it. This means that you get three rows, and the sum becomes pointless. There is never a case where the manager has 500, 200 or 300 salary, so the join doesn't work and there are therefore no records returned. If you do this instead it should work:
select a.emp_dept, a.emp_loc, b.emp_id
from
(select sum(emp_sal) as sett,emp_loc,emp_dept
from employee
where emp_type = 'Interim'
group by emp_loc,emp_dept having count(emp_sal)>1
) a
inner join
(select emp_sal ,emp_loc,emp_dept,emp_id
from employee
where emp_type = 'Manager'
) b
on a.sett=b.emp_sal and a.emp_loc=b.emp_loc and a.emp_dept=b.emp_dept;
Note that I also removed duplicate columns in the resultset. There is no point showing b.emp_dept as well as a.emp_dept, because the join condition guarantees that they will be the same.
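The effect of grouping on emp_id can be checked with a small sketch. It loads the sample rows from the question into an in-memory SQLite database (the column types are my assumption) and compares the two GROUP BY variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER, emp_name TEXT, emp_loc TEXT,
                           emp_dept TEXT, emp_sal INTEGER, emp_type TEXT);
    INSERT INTO employee VALUES
        (1, 'Arjun',  'Hyd', 'Comp', 1000, 'Manager'),
        (2, 'Ramesh', 'Hyd', 'Comp',  500, 'Interim'),
        (3, 'Ragav',  'Hyd', 'Comp',  300, 'Interim'),
        (4, 'Rajesh', 'Hyd', 'Comp',  200, 'Interim'),
        (5, 'Raghu',  'Hyd', 'Comp', 1000, 'Interim');
""")

# Grouping on emp_id as well: one group per employee, so SUM is just that salary
# and COUNT is always 1 (HAVING COUNT(...) > 1 would then discard every group).
per_employee = conn.execute("""
    SELECT emp_id, SUM(emp_sal), COUNT(emp_sal)
    FROM employee
    WHERE emp_type = 'Interim'
    GROUP BY emp_loc, emp_dept, emp_id
    ORDER BY emp_id
""").fetchall()

# Grouping on location and department only: one row holding the real total.
per_group = conn.execute("""
    SELECT emp_loc, emp_dept, SUM(emp_sal), COUNT(emp_sal)
    FROM employee
    WHERE emp_type = 'Interim'
    GROUP BY emp_loc, emp_dept
""").fetchall()

print(per_employee)  # [(2, 500, 1), (3, 300, 1), (4, 200, 1), (5, 1000, 1)]
print(per_group)     # [('Hyd', 'Comp', 2000, 4)]
```

With this particular sample the group total is 2000, which still does not equal the manager's 1000, so the corrected join only returns a row when all Interim salaries in the group add up to the manager's salary.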
EDIT
Following your revision to the question, I now understand what you are trying to do. This is really difficult to achieve, since you need to enumerate every possible combination of Interim salaries for a location and department and check whether any combination sums to the Manager's salary.
One possible way to do this would be the following:
WITH cte as
(SELECT a.emp_id as aempid, b.emp_id as bempid, 0 as cempid, 0 as dempid, a.emp_sal + b.emp_sal AS SumSal, a.emp_loc, a.emp_dept
FROM employee a
CROSS JOIN employee b WHERE a.emp_loc = b.emp_loc AND a.emp_dept = b.emp_dept
AND a.emp_type = b.emp_type AND a.emp_type = 'Interim' AND a.emp_id <> b.emp_id
AND b.emp_id > a.emp_id
UNION
SELECT a.emp_id, b.emp_id, c.emp_id, 0, a.emp_sal + b.emp_sal + c.emp_sal AS SumSal, a.emp_loc, a.emp_dept
FROM employee a
CROSS JOIN employee b
CROSS JOIN employee c
WHERE a.emp_loc = b.emp_loc AND a.emp_dept = b.emp_dept
AND a.emp_type = b.emp_type AND a.emp_type = 'Interim' AND a.emp_id <> b.emp_id
AND a.emp_loc = c.emp_loc AND a.emp_dept = c.emp_dept
AND a.emp_type = c.emp_type AND a.emp_id <> c.emp_id AND b.emp_id <> c.emp_id
AND b.emp_id > a.emp_id AND c.emp_id > b.emp_id
UNION
SELECT a.emp_id, b.emp_id, c.emp_id, d.emp_id, a.emp_sal + b.emp_sal + c.emp_sal + d.emp_sal AS SumSal, a.emp_loc, a.emp_dept
FROM employee a
CROSS JOIN employee b
CROSS JOIN employee c
CROSS JOIN employee d
WHERE a.emp_loc = b.emp_loc AND a.emp_dept = b.emp_dept
AND a.emp_type = b.emp_type AND a.emp_type = 'Interim' AND a.emp_id <> b.emp_id
AND a.emp_loc = c.emp_loc AND a.emp_dept = c.emp_dept
AND a.emp_type = c.emp_type AND a.emp_id <> c.emp_id AND b.emp_id <> c.emp_id
AND a.emp_loc = d.emp_loc AND a.emp_dept = d.emp_dept
AND a.emp_type = d.emp_type AND a.emp_id <> d.emp_id AND b.emp_id <> d.emp_id AND c.emp_id <> d.emp_id
AND b.emp_id > a.emp_id AND c.emp_id > b.emp_id AND d.emp_id > c.emp_id
)
SELECT m.emp_id as managerid, cte.aempid, cte.bempid, cte.cempid, cte.dempid, m.emp_sal, m.emp_loc, m.emp_dept
FROM employee m
INNER JOIN cte ON cte.emp_loc = m.emp_loc AND cte.emp_dept = m.emp_dept AND m.emp_sal = cte.sumsal
WHERE m.emp_type = 'Manager';
This is really ugly, but I could think of nothing better, and at least it returns the desired result. In practice of course, you would have to create it dynamically to match the maximum number of Interim employees in any one location/department.
A couple of points in case it is not clear. With this approach there is no need for a HAVING COUNT(*) > 1, since the requirement a.emp_id <> b.emp_id etc. guarantees that there are at least two different Interim employees. Also, the requirement that b.emp_id > a.emp_id (and likewise for c and d) ensures that each possible combination of a, b, c and d occurs only once, so in this case we get just 2, 3, 4 (in that order) plus 0 for d, not every permutation of 2, 3 and 4.
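As a cross-check of what the query is enumerating, here is a short Python sketch of the same idea outside the database. The salaries are the Interim rows from the question's sample data. itertools.combinations yields ids in increasing order, which plays the role of the b.emp_id > a.emp_id conditions.

```python
from itertools import combinations

# Interim employees from the sample data: emp_id -> emp_sal
interim_salaries = {2: 500, 3: 300, 4: 200, 5: 1000}
manager_salary = 1000  # Arjun, emp_id 1

# Try every group of at least two Interim employees; each group is generated
# once, with ids in ascending order, so no permutations of the same group appear.
matches = [
    ids
    for size in range(2, len(interim_salaries) + 1)
    for ids in combinations(sorted(interim_salaries), size)
    if sum(interim_salaries[i] for i in ids) == manager_salary
]

print(matches)  # [(2, 3, 4)]
```

This is only meant to confirm the expected answer for the sample; it does not replace the SQL, and like the SQL it grows quickly as the number of Interim employees per group increases.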
You are looking for all groups of employees that have a total salary that equals their manager's salary. In order to find all these groups you need a recursive query in SQL. In PostgreSQL you can use an array to collect the employees in that query.
Sample data:
emp_id | emp_type | salary
1 | Manager | 1000
2 | Interim | 500
3 | Interim | 500
4 | Interim | 300
5 | Interim | 200
Result:
manager_id | employee_ids
1 | {2,3}
1 | {2,4,5}
1 | {3,4,5}
The query:
with recursive cte(manager_id, manager_salary, employee_ids, total, loc, dept) as
(
select emp_id, emp_sal, array[]::integer[], 0, emp_loc, emp_dept
from employee
where emp_type = 'Manager'
union
select cte.manager_id, cte.manager_salary, cte.employee_ids || e.emp_id,
cte.total + e.emp_sal, cte.loc, cte.dept
from cte
join employee e
on e.emp_loc = cte.loc
and e.emp_dept = cte.dept
and e.emp_type <> 'Manager'
and e.emp_id > all(cte.employee_ids)
and cte.total + e.emp_sal <= cte.manager_salary
)
select manager_id, manager_salary, employee_ids
from cte
where total = manager_salary
and array_length(employee_ids, 1) > 1
order by manager_id, employee_ids;
Demo: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=5311cdaebdc875440d9e226ace8dfefc
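For readers without PostgreSQL arrays, the same recursion can be sketched in SQLite from Python. This is an adaptation, not the query above: the id list is kept as comma-separated text, and last_id and members are helper columns I introduced to replace `> all(employee_ids)` and `array_length`. The loc/dept values ('Hyd', 'Comp') are assumed, since the sample table above does not list them, and emp_ids are assumed to be positive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER, emp_type TEXT, emp_sal INTEGER,
                           emp_loc TEXT, emp_dept TEXT);
    INSERT INTO employee VALUES
        (1, 'Manager', 1000, 'Hyd', 'Comp'),
        (2, 'Interim',  500, 'Hyd', 'Comp'),
        (3, 'Interim',  500, 'Hyd', 'Comp'),
        (4, 'Interim',  300, 'Hyd', 'Comp'),
        (5, 'Interim',  200, 'Hyd', 'Comp');
""")

rows = conn.execute("""
    WITH RECURSIVE cte(manager_id, manager_salary, ids, last_id, members, total, loc, dept) AS (
        -- anchor: one empty group per manager
        SELECT emp_id, emp_sal, '', 0, 0, 0, emp_loc, emp_dept
        FROM employee
        WHERE emp_type = 'Manager'
        UNION ALL
        -- step: extend a group with an employee whose id is larger than the last one added,
        -- as long as the running total does not exceed the manager's salary
        SELECT cte.manager_id, cte.manager_salary,
               cte.ids || CASE WHEN cte.ids = '' THEN '' ELSE ',' END || e.emp_id,
               e.emp_id, cte.members + 1, cte.total + e.emp_sal, cte.loc, cte.dept
        FROM cte
        JOIN employee e
          ON e.emp_loc = cte.loc
         AND e.emp_dept = cte.dept
         AND e.emp_type <> 'Manager'
         AND e.emp_id > cte.last_id
         AND cte.total + e.emp_sal <= cte.manager_salary
    )
    SELECT manager_id, ids
    FROM cte
    WHERE total = manager_salary
      AND members > 1
    ORDER BY manager_id, ids
""").fetchall()

print(rows)  # [(1, '2,3'), (1, '2,4,5'), (1, '3,4,5')]
```

The three groups match the result table above. The `total + emp_sal <= manager_salary` condition prunes groups that have already overshot, which only works because salaries are not negative.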
As a bonus:
Finally, here is the query with the employee IDs unnested, so each group spans multiple lines with one line per employee:
with recursive cte(manager_id, manager_salary, employee_ids, total, loc, dept) as
(
select emp_id, emp_sal, array[]::integer[], 0, emp_loc, emp_dept
from employee
where emp_type = 'Manager'
union
select cte.manager_id, cte.manager_salary, cte.employee_ids || e.emp_id,
cte.total + e.emp_sal, cte.loc, cte.dept
from cte
join employee e
on e.emp_loc = cte.loc
and e.emp_dept = cte.dept
and e.emp_type <> 'Manager'
and e.emp_id > all(cte.employee_ids)
and cte.total + e.emp_sal <= cte.manager_salary
)
select cte.manager_id, cte.manager_salary,
dense_rank() over (partition by cte.manager_id order by cte.employee_ids) as grp,
e.*
from cte
cross join lateral unnest(employee_ids) link(employee_id)
join employee e on e.emp_id = link.employee_id
where cte.total = cte.manager_salary
and array_length(cte.employee_ids, 1) > 1
order by cte.manager_id, grp, e.emp_id;
Demo: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=7b6218f28cbf1e9d4c1a0c76c46b9601

How to unite several tables in a one so the names of the columns became the row names?

For instance, I have these two queries:
SELECT customer_id, first_name || ', ' || last_name || ', ' || email as "customer's info"
FROM customer
WHERE customer_id = 5
;
SELECT count(i.film_id) AS "num.of films rented" FROM payment p
JOIN rental r ON p.rental_id = r.rental_id
JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE r.rental_date >= ('2014-01-01'::date)
AND r.rental_date <= ('2017-05-03'::date)
AND p.customer_id = 5
;
I want this output:
metric1 | metric2
----------------------------
customer's info | blalalalal
num.of films rented | blalalalal
I tried something like this, but it doesn't work:
SELECT * FROM crosstab(
SELECT first_name || ', ' || last_name || ', ' || email
FROM customer WHERE customer_id = 5,
SELECT count(i.film_id) FROM payment p
JOIN rental r ON p.rental_id = r.rental_id
JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE r.rental_date >= ('2014-01-01'::date)
AND r.rental_date <= ('2017-05-03'::date))
AS ('fjfjf' TEXT, 'fjfjf' int );
Could you help me?
I don't know how to do this in Postgres.
Thanks a lot
I would UNION ALL the two queries together - but remember to CAST the count value as a string, as you need matching data types to UNION:
SELECT
'customer''s info' AS "name"
, first_name || ', ' || last_name || ', ' || email AS "value"
FROM customer c
WHERE c.customer_id = 5
UNION ALL
SELECT
'num.of films rented' AS "name"
, COUNT(i.film_id)::VARCHAR(5) AS "value"
FROM payment p
JOIN rental r ON p.rental_id = r.rental_id
JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE r.rental_date >= ('2014-01-01'::date)
AND r.rental_date <= ('2017-05-03'::date)
AND p.customer_id = 5
;
It is unclear to me why inventory is in the second join.
SELECT 'customer''s info' as metric1,
first_name || ', ' || last_name || ', ' || email as metric2
FROM customer
WHERE customer_id = 5
UNION ALL
SELECT 'num.of films rented' as metric1, count(*)::text AS metric2
FROM payment p JOIN
rental r
ON p.rental_id = r.rental_id
WHERE r.rental_date >= '2014-01-01'::date AND
r.rental_date <= '2017-05-03'::date AND
p.customer_id = 5;
You could also combine this into a single query if you are just trying to get the results in a single result set:
SELECT (first_name || ', ' || last_name || ', ' || email) as customer_info,
count(*) as num_films
FROM payment p JOIN
rental r
ON p.rental_id = r.rental_id JOIN
customer c
ON c.customer_id = p.customer_id
WHERE r.rental_date >= '2014-01-01'::date AND
r.rental_date <= '2017-05-03'::date AND
c.customer_id = 5
GROUP BY c.customer_id;
(This puts the values in one row with two columns.) Using a subquery, the results can be easily unpivoted.
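A minimal sketch of that unpivot step, run against SQLite from Python. The table layouts and the sample rows (customer 5 with three payments, two of them in the date range) are invented for illustration, and CAST(... AS TEXT) replaces the PostgreSQL `::text` cast. The one-row query goes into a CTE, and each column is then selected as its own labelled row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT,
                           last_name TEXT, email TEXT);
    CREATE TABLE rental  (rental_id INTEGER PRIMARY KEY, rental_date TEXT);
    CREATE TABLE payment (payment_id INTEGER PRIMARY KEY, customer_id INTEGER,
                          rental_id INTEGER);
    -- hypothetical sample rows
    INSERT INTO customer VALUES (5, 'Ann', 'Lee', 'ann@example.com');
    INSERT INTO rental  VALUES (1, '2015-02-01'), (2, '2016-07-15'), (3, '2018-01-01');
    INSERT INTO payment VALUES (1, 5, 1), (2, 5, 2), (3, 5, 3);
""")

rows = conn.execute("""
    WITH one_row AS (
        -- the single-row, two-column result
        SELECT c.first_name || ', ' || c.last_name || ', ' || c.email AS customer_info,
               COUNT(*) AS num_films
        FROM payment p
        JOIN rental r ON p.rental_id = r.rental_id
        JOIN customer c ON c.customer_id = p.customer_id
        WHERE r.rental_date >= '2014-01-01'
          AND r.rental_date <= '2017-05-03'
          AND c.customer_id = 5
        GROUP BY c.customer_id
    )
    -- unpivot: one labelled row per column, with the count cast to text
    SELECT 'customer''s info' AS metric1, customer_info AS metric2 FROM one_row
    UNION ALL
    SELECT 'num.of films rented', CAST(num_films AS TEXT) FROM one_row
    ORDER BY metric1
""").fetchall()

for metric1, metric2 in rows:
    print(metric1, "|", metric2)
```

The cast is what lets both branches of the UNION ALL share one metric2 column in a stricter database such as PostgreSQL; SQLite itself would accept mixed types.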

SQL Query Returns Duplicate Rows

SELECT E.SSN AS SocialSecurityNumber ,
E.Name + ' ' + E.LastName AS FullName ,
J.Job ,
R.Date ,
R.STime AS StartTime,
R.ETime AS EndTime,
CASE WHEN R.Date BETWEEN O.SDate AND O.EDate THEN O.OffID
ELSE 0
END AS OffReason
FROM Resume AS R
INNER JOIN Employees AS E ON E.ID = R.EmpID
INNER JOIN Jobs AS J ON J.ID = E.JobID
LEFT JOIN Offs AS O ON E.ID = O.EmpID
AND R.Date BETWEEN O.SDate AND O.EDate
WHERE E.JobLeft = 0
AND R.Date BETWEEN '2014-11-26 00:00:00'
AND '2014-11-26 23:59:59'
ORDER BY FullName
I wrote this SQL query for my employee resume report program. I want to retrieve when employees started work and, if they have a reason for not coming to work, return the reason ID, or 0 if they don't. The query returns duplicate rows when there are two or more off reasons for the same employee, such as vacation and national holiday. I searched for this but couldn't find the same situation. Which of the duplicate rows gets dropped is not important; one off reason is enough for me. I tried the DISTINCT keyword, but it doesn't solve the problem. What can I do to solve this?
The simplest would be to select only one of the reasons, using MAX for instance:
SELECT E.SSN AS SocialSecurityNumber,
E.Name + ' ' + E.LastName AS FullName,
J.Job,
R.DATE,
R.STime AS StartTime,
R.ETime AS EndTime,
MAX(CASE
WHEN R.DATE BETWEEN O.SDate AND O.EDate THEN O.OffID
ELSE 0
END) AS OffReason
FROM Resume AS R
INNER JOIN Employees AS E
ON E.ID = R.EmpID
INNER JOIN Jobs AS J
ON J.ID = E.JobID
LEFT JOIN Offs AS O
ON E.ID = O.EmpID
AND R.DATE BETWEEN O.SDate AND O.EDate
WHERE E.JobLeft = 0
AND R.DATE BETWEEN '2014-11-26 00:00:00' AND '2014-11-26 23:59:59'
GROUP BY E.SSN,
E.Name + ' ' + E.LastName,
J.Job,
R.DATE,
R.STime,
R.ETime
ORDER BY FullName
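The difference between DISTINCT and the MAX/GROUP BY version shows up clearly on a reduced example. The sketch below uses SQLite from Python with only the Resume and Offs tables, cut down to the columns involved. The rows are invented: employee 1 has two overlapping off reasons (3 and 7) and employee 2 has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Resume (EmpID INTEGER, Date TEXT);
    CREATE TABLE Offs   (EmpID INTEGER, SDate TEXT, EDate TEXT, OffID INTEGER);
    -- hypothetical sample rows
    INSERT INTO Resume VALUES (1, '2014-11-26'), (2, '2014-11-26');
    INSERT INTO Offs VALUES (1, '2014-11-24', '2014-11-28', 3),
                            (1, '2014-11-26', '2014-11-26', 7);
""")

# DISTINCT does not help: the two rows for employee 1 differ in OffReason.
distinct_rows = conn.execute("""
    SELECT DISTINCT R.EmpID, R.Date,
           CASE WHEN R.Date BETWEEN O.SDate AND O.EDate THEN O.OffID ELSE 0 END AS OffReason
    FROM Resume AS R
    LEFT JOIN Offs AS O ON O.EmpID = R.EmpID AND R.Date BETWEEN O.SDate AND O.EDate
    ORDER BY R.EmpID, OffReason
""").fetchall()

# Grouping on everything except the reason and taking MAX keeps one row per employee/date.
grouped_rows = conn.execute("""
    SELECT R.EmpID, R.Date,
           MAX(CASE WHEN R.Date BETWEEN O.SDate AND O.EDate THEN O.OffID ELSE 0 END) AS OffReason
    FROM Resume AS R
    LEFT JOIN Offs AS O ON O.EmpID = R.EmpID AND R.Date BETWEEN O.SDate AND O.EDate
    GROUP BY R.EmpID, R.Date
    ORDER BY R.EmpID
""").fetchall()

print(distinct_rows)  # [(1, '2014-11-26', 3), (1, '2014-11-26', 7), (2, '2014-11-26', 0)]
print(grouped_rows)   # [(1, '2014-11-26', 7), (2, '2014-11-26', 0)]
```

MAX simply picks the largest OffID; MIN would work equally well if any single reason is acceptable.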

HSQLDB v.2.3.0: USING predicate working as intended?

The following query when executed against snapshot 50 of HSQLDB 2.3.0 produces an error. The error message is "Error: duplicate column name in derived table: INST_ID"
SELECT c.lastname, to_date_string(c.dob), i.itag|| ': ' ||
m.mrn, to_date_string(en.dofs),
to_date_string(en.dols), pa.payertag,
y.dxname, d.icd9, d.icd10, d.icd10name,
to_datetime_string(s.dos), r.cpt, r.rxname, p.lastname || ', ' ||
p.firstname
FROM Encounters AS en
INNER JOIN Clients AS c USING ( cli_id )
INNER JOIN Client_MRNs AS m ON c.defmrn_id = m.mrn_id
INNER JOIN Institutions AS i USING ( inst_id )
INNER JOIN Payers AS pa USING ( payer_id )
INNER JOIN Encounter_DXs AS x USING ( enc_id )
INNER JOIN Diagnoses AS d USING ( dx_id )
INNER JOIN DXSynonyms AS y ON d.defsyn_id = y.syn_id
INNER JOIN Services AS s USING ( enc_id )
INNER JOIN RXCodes AS r USING ( rx_id )
INNER JOIN Providers AS p USING ( prov_id )
WHERE (s.dos >= 56453 AND s.dos < 56461)
ORDER BY c.lastname, en.dofs, s.dos;
However, when I execute the same query but replace all the USING predicates with ON ... = phrases it executes successfully:
SELECT c.lastname, to_date_string(c.dob), i.itag|| ': ' ||
m.mrn, to_date_string(en.dofs),
to_date_string(en.dols), pa.payertag,
y.dxname, d.icd9, d.icd10, d.icd10name,
to_datetime_string(s.dos), r.cpt, r.rxname, p.lastname || ', ' ||
p.firstname
FROM Encounters AS en
INNER JOIN Clients AS c ON c.cli_id = en.cli_id
INNER JOIN Client_MRNs AS m ON c.defmrn_id = m.mrn_id
INNER JOIN Institutions AS i ON i.inst_id = m.inst_id
INNER JOIN Payers AS pa ON pa.payer_id = en.payer_id
INNER JOIN Encounter_DXs AS x ON x.enc_id = en.enc_id
INNER JOIN Diagnoses AS d ON d.dx_id = x.dx_id
INNER JOIN DXSynonyms AS y ON d.defsyn_id = y.syn_id
INNER JOIN Services AS s ON s.enc_id = en.enc_id
INNER JOIN RXCodes AS r ON r.rx_id = s.rx_id
INNER JOIN Providers AS p ON p.prov_id = s.prov_id
WHERE (s.dos >= 56453 AND s.dos < 56461)
ORDER BY c.lastname, en.dofs, s.dos;
Is this working as intended? I like using USING because it results in less verbose, cleaner code. I won't include the DDL for the tables right now (but can), because the queries are big and involve many tables, but there are three tables that have INST_ID fields. One table has it as the primary key, and the other two have foreign keys to it. Really the only difference in the queries is "ON" vs "USING".
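After the first join, there is one INST_ID from ENCOUNTERS. After the second join, there is an additional one from CLIENT_MRNS. The third join, USING ( inst_id ), fails because of this duplication: the column named in USING has to be unambiguous on the left-hand side of the join.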

Vertical Join in an SQL Statement

I've got the following SQL Statement:
select * from Leaves inner join LeaveDetails on Leaves.LeaveId= LeaveDetails.LeaveId
inner join Employee on Leaves.EmployeeCode = Employee.EmployeeCode
inner join LeaveType on Leaves.LeaveTypeId= LeaveType.LeaveTypeId
inner join LeaveStatus on Leaves.StatusId = LeaveStatus.StatusId
inner join Employee_organizationaldetails on Employee_organizationaldetails.EmployeeCode=Employee.EmployeeCode
where Leaves.LeaveId = 7295
Employee_organizationaldetails contains another column called ReportingTo (the reporting officer), which is a foreign key to the same Employee table. Now I need to get the name of that employee.
How can I write the above query so that I get the name of the reporting officer as another column, without executing this separate query:
select (FirstName + ' ' + LastName) as name from Employee where EmployeeCode = ReportingTo
Here ReportingTo is the employee code. I need to join them vertically, something similar to the UNION operator.
You want to join back to another "copy" of the Employee table:
select *, (ro.FirstName + ' ' + ro.LastName) as ReportingName
from Leaves inner join LeaveDetails on Leaves.LeaveId= LeaveDetails.LeaveId
inner join Employee on Leaves.EmployeeCode = Employee.EmployeeCode
inner join LeaveType on Leaves.LeaveTypeId= LeaveType.LeaveTypeId
inner join LeaveStatus on Leaves.StatusId = LeaveStatus.StatusId
inner join Employee_organizationaldetails on Employee_organizationaldetails.EmployeeCode=Employee.EmployeeCode left outer join
Employee ro
on ro.EmployeeCode = Employee_organizationaldetails.ReportingTo
where Leaves.LeaveId = 7295;
You probably don't want the * -- I assume it is just a shorthand for the question. It is better to list columns explicitly, especially because there are duplicate column names.
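Here is a stripped-down sketch of the "second copy" idea, using SQLite from Python. Only the Employee and Employee_organizationaldetails tables are modelled, the three employees are invented, and `||` replaces the `+` string concatenation of the original dialect. The alias ro refers to the same Employee table, joined a second time on ReportingTo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmployeeCode INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Employee_organizationaldetails (EmployeeCode INTEGER, ReportingTo INTEGER);
    -- hypothetical sample rows: employees 2 and 3 report to employee 1
    INSERT INTO Employee VALUES (1, 'Asha', 'Rao'), (2, 'Ben', 'Cole'), (3, 'Cy', 'Dunn');
    INSERT INTO Employee_organizationaldetails VALUES (1, NULL), (2, 1), (3, 1);
""")

rows = conn.execute("""
    SELECT e.FirstName  || ' ' || e.LastName  AS EmployeeName,
           ro.FirstName || ' ' || ro.LastName AS ReportingName
    FROM Employee e
    JOIN Employee_organizationaldetails od ON od.EmployeeCode = e.EmployeeCode
    -- second copy of Employee, aliased ro, looked up via the ReportingTo code
    LEFT JOIN Employee ro ON ro.EmployeeCode = od.ReportingTo
    ORDER BY e.EmployeeCode
""").fetchall()

print(rows)
# [('Asha Rao', None), ('Ben Cole', 'Asha Rao'), ('Cy Dunn', 'Asha Rao')]
```

The LEFT JOIN keeps employees without a reporting officer; their ReportingName comes back as NULL. Qualifying every column with its alias (e or ro) avoids the ambiguity between the two copies.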
Possibly this will be helpful for you:
SELECT
*
, ReportingName = ro.FirstName + ' ' + ro.LastName
FROM (
SELECT *
FROM dbo.Leaves l
WHERE l.LeaveId = 7295
) l
JOIN dbo.LeaveDetails ld ON l.LeaveId = ld.LeaveId
JOIN dbo.Employee e ON l.EmployeeCode = e.EmployeeCode
JOIN dbo.LeaveType lt ON l.LeaveTypeId = lt.LeaveTypeId
JOIN dbo.LeaveStatus ls ON l.StatusId = ls.StatusId
JOIN dbo.Employee_organizationaldetails e2 ON e2.EmployeeCode = e.EmployeeCode
LEFT JOIN dbo.Employee ro ON ro.EmployeeCode = e2.ReportingTo