Query to combine data from 2 tables based on a condition for both the tables in SQL

I have 2 tables: ProfileInfo (ProfileID - PK) and EmployeeRole (EmpID - PK, ProfileID - FK). They are linked by ProfileID, and both tables have a LastUpdatedTimestamp column. I need to fetch combined data from both tables, filtered by a from/to range on the last-updated timestamp.
Sometimes both tables are updated at the same time, but most of the time only one of them is updated. Here is what I have tried, but it only returns data that was updated in both tables. First I tried a join, but it didn't work the way I expected:
select emp.emp_id as EmpId from EmployeeRole emp
FULL OUTER JOIN ProfileInfo pi on emp.profile_id = pi.profile_id
where emp.LST_UPDT_TS between '2017-09-18' and '2017-09-20' and
pi.LST_UPDT_TS between '2017-09-18' and '2017-09-20';
This returned only the emp ids that had changes in both tables.
Table Details:
EmployeeRole Emp_ID PK, Profile_id FK, LST_UPDT_TS TIMESTAMP
ProfileInfo Profile_Id PK, Profile_name, LST_UPDT_TS TIMESTAMP
Example: suppose 2 records of ProfileInfo and 1 record of EmployeeRole are updated. If neither ProfileInfo record is related to the updated EmployeeRole record, I need to get 3 emp_ids. If one of them is related to it, I need to get only 2 emp_ids.
I searched for similar answers for a while, but nothing worked. Please help.

This is just an example, your conditions may vary
SELECT
-- common data
COALESCE(emp.profile_id, pi.profile_id) as profile_id
,COALESCE(emp.LST_UPDT_TS, pi.LST_UPDT_TS) as LST_UPDT_TS
-- emp role
,emp.emp_id as EmpId
-- profile
, pi.Profile_name
FROM (SELECT *
FROM EmployeeRole
WHERE LST_UPDT_TS between '2017-09-18' and '2017-09-20') emp
FULL OUTER JOIN (
SELECT *
FROM ProfileInfo
WHERE LST_UPDT_TS between '2017-09-18' and '2017-09-20') pi
-- rows matching predicate
ON emp.profile_id = pi.profile_id
AND emp.LST_UPDT_TS = pi.LST_UPDT_TS

This worked, with a few small tweaks to my initial query:
select emp.emp_id as EmpId from EmployeeRole emp
JOIN ProfileInfo pi on emp.profile_id = pi.profile_id and
((emp.LST_UPDT_TS between '2017-09-18' and '2017-09-20') or
(pi.LST_UPDT_TS between '2017-09-18' and '2017-09-20'));
Thanks a lot @Serg, @Phil and @wildplasser
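To see why the two queries behave differently, here is a minimal sketch that uses Python's built-in sqlite3 module in place of the original database. The table and column names follow the question; the sample rows, the timestamps and the helper name changed_emp_ids are made up for illustration. Requiring both range checks keeps only rows where both timestamps are in range, while combining them with OR returns an emp_id when either table changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProfileInfo (profile_id INTEGER PRIMARY KEY, profile_name TEXT, LST_UPDT_TS TEXT);
CREATE TABLE EmployeeRole (emp_id INTEGER PRIMARY KEY,
                           profile_id INTEGER REFERENCES ProfileInfo(profile_id),
                           LST_UPDT_TS TEXT);
-- Hypothetical data: profiles 1 and 2 changed inside the range, 3 and 4 did not.
INSERT INTO ProfileInfo VALUES
  (1, 'p1', '2017-09-19 10:00:00'), (2, 'p2', '2017-09-19 11:00:00'),
  (3, 'p3', '2017-01-01 00:00:00'), (4, 'p4', '2017-01-01 00:00:00');
-- Only employee 30 changed inside the range.
INSERT INTO EmployeeRole VALUES
  (10, 1, '2017-01-01 00:00:00'), (20, 2, '2017-01-01 00:00:00'),
  (30, 3, '2017-09-18 09:00:00'), (40, 4, '2017-01-01 00:00:00');
""")

def changed_emp_ids(connector):
    """Return emp ids, combining the two range checks with AND or OR."""
    sql = f"""
        SELECT emp.emp_id
        FROM EmployeeRole emp
        JOIN ProfileInfo pi ON emp.profile_id = pi.profile_id
        WHERE (emp.LST_UPDT_TS BETWEEN :lo AND :hi)
           {connector} (pi.LST_UPDT_TS BETWEEN :lo AND :hi)
        ORDER BY emp.emp_id
    """
    rows = conn.execute(sql, {"lo": "2017-09-18", "hi": "2017-09-20"})
    return [r[0] for r in rows]

both_changed = changed_emp_ids("AND")    # changed in both tables
either_changed = changed_emp_ids("OR")   # changed in at least one table
print(both_changed, either_changed)
```

With this sample data, two profile updates plus one unrelated employee update yield three emp ids under OR, which matches the example given in the question.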

Related

How to tune query to fetch result faster | Oracle 19c |

I have tables with a huge number of records.
My tables: employee and customer
The issue is that I have 2 billion records in the employee table and 1 billion records in the customer table.
Employee columns
empid
empname
empage
empdcourse
Customer columns
custid
custdesc
custmessage
My query :
select empid from employee where empid not in ( select custid from customer);
Error: it fails with a tablespace error, and I am not allowed to increase the tablespace.
Is there any way I can tune my query, or run it batch by batch, so that I get the output?
Any solution is much appreciated!
I need this with high priority.
NOT EXISTS may be more efficient and use less memory in such a case.
(The query suggests Customer and Employee share the same PK. Does that mean you have a "super" table Person?)
Try this:
with tmp as
(select /*+full(c)*/
custid
from customer c)
select /*+full(e)*/
e.empid
from employee e, tmp t
where e.empid = t.custid(+)
and t.custid is null;
The full hint asks the optimizer for full table scans, which may help avoid the tablespace issue.
The outer join is often faster than NOT IN.
You can improve it further by adding the parallel hint, starting with a degree of 2 or 4, like this:
with tmp as
(select /*+full(c) parallel(c,2)*/
custid
from customer c)
select /*+full(e) parallel(e,2)*/
e.empid
from employee e, tmp t
where e.empid = t.custid(+)
and t.custid is null;
You can add indexes for columns, for example, if they aren’t primary keys:
CREATE INDEX empid_index
ON employee(empid);
You can also rewrite the query using NOT EXISTS:
select e.empid from employee e where not exists (select 1 from customer c where c.custid = e.empid);
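The three forms discussed above (NOT IN, NOT EXISTS and the outer join with an IS NULL filter) can be compared on a tiny data set. The following is only a sketch using Python's built-in sqlite3 module, not Oracle, so it says nothing about performance or tablespace usage; the sample rows and the helper run_all are invented for illustration. It does show one behavioural difference worth knowing before swapping the forms: if custid can be NULL, NOT IN returns no rows at all, while the other two forms are unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (empid INTEGER, empname TEXT);
CREATE TABLE customer (custid INTEGER, custdesc TEXT);
INSERT INTO employee VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO customer VALUES (2, 'x'), (4, 'y');
""")

QUERIES = {
    "not_in": """SELECT empid FROM employee
                 WHERE empid NOT IN (SELECT custid FROM customer)
                 ORDER BY empid""",
    "not_exists": """SELECT e.empid FROM employee e
                     WHERE NOT EXISTS (SELECT 1 FROM customer c WHERE c.custid = e.empid)
                     ORDER BY e.empid""",
    "outer_join": """SELECT e.empid FROM employee e
                     LEFT JOIN customer c ON c.custid = e.empid
                     WHERE c.custid IS NULL
                     ORDER BY e.empid""",
}

def run_all():
    """Run every query variant and collect the returned empids per variant."""
    return {name: [r[0] for r in conn.execute(sql)] for name, sql in QUERIES.items()}

without_nulls = run_all()

# A single NULL custid makes NOT IN evaluate to unknown for every non-matching row.
conn.execute("INSERT INTO customer VALUES (NULL, 'z')")
with_null = run_all()

print(without_nulls)
print(with_null)
```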

issue with hierarchical queries in db2

I have the following tables
Lead
id varchar
employee_id varchar
Employee
id varchar
lead_id varchar
There will be a group of employees assigned to a lead. The Lead table holds the employee id of the lead.
The employee table will have lead_id which will be the id key of the leader.
The table will also contain employees which are not assigned to any lead
I need a query which will display the hierarchical result which will list the leaders and the employees under the leader
leader1 (employee)
    employee1
    employee2
leader2 (employee)
    employee3
    employee4
Any idea how this kind of hierarchical result can be obtained by a db2 query?
The answer is a join of the two tables like
SELECT l.employee_id as leader_employee_id, e.id as employee_id
FROM LEAD l
INNER JOIN EMPLOYEE e
ON e.lead_id = l.employee_id
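To turn the flat join result into the indented listing the question asks for, the rows can be grouped by leader on the client side. This is a sketch with Python's built-in sqlite3 module standing in for DB2. The sample ids are hypothetical, and it assumes, as the join condition above implies, that employee.lead_id stores the leader's employee id; if it stores Lead.id instead, the ON clause would need to compare against l.id.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lead (id TEXT PRIMARY KEY, employee_id TEXT);
CREATE TABLE employee (id TEXT PRIMARY KEY, lead_id TEXT);
-- Hypothetical rows. Assumption: employee.lead_id holds the leader's employee id.
INSERT INTO lead VALUES ('L1', 'leader1'), ('L2', 'leader2');
INSERT INTO employee VALUES
  ('leader1', NULL), ('leader2', NULL),
  ('employee1', 'leader1'), ('employee2', 'leader1'),
  ('employee3', 'leader2'), ('employee4', 'leader2'),
  ('employee5', NULL);  -- not assigned to any lead
""")

rows = conn.execute("""
    SELECT l.employee_id AS leader_employee_id, e.id AS employee_id
    FROM lead l
    INNER JOIN employee e ON e.lead_id = l.employee_id
    ORDER BY l.employee_id, e.id
""")

# Group the (leader, employee) pairs so each leader is listed once.
team = defaultdict(list)
for leader_id, employee_id in rows:
    team[leader_id].append(employee_id)

lines = []
for leader_id, members in team.items():
    lines.append(f"{leader_id} (employee)")
    lines.extend(f"    {member}" for member in members)
print("\n".join(lines))
```

Because the join is an inner join, a leader with no assigned employees and an employee with no lead do not appear in the listing.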

How to copy records from inter-linked tables to another in a different database?

I have 3 tables that are inter-linked between each other. The design of the tables are as below.
First (PK:FirstID, vchar:Name, int:Year)
Second (PK:SecondID, FK:FirstID, int:Day, int:Month)
Third (PK:ThirdID, FK:SecondID, int:Speed, vchar:Remark)
I'm trying to copy records from 3 inter-linked tables from Database A to Database B. So my Transact-SQL looks something like this:
INSERT INTO First
(Name, Year)
SELECT Name, Year
FROM DB_A.dbo.First
WHERE Year >= 1992
INSERT INTO Second
(FirstID, Day, Month)
SELECT S.FirstID, S.Day, S.Month
FROM DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID
WHERE S.Month > 6
INSERT INTO Third
(SecondID, Speed, Remark)
SELECT T.SecondID, T.Speed, T.Remark
FROM DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID
WHERE T.Remark IS NOT NULL
These statements work fine until the starting value of First.FirstID differs between Database A and Database B, which happens because the three tables in Database B start out empty. At that point a foreign key constraint error is produced.
Possible Solutions
1. Reuse old First.FirstID. One solution I have figured out is to reuse the old First.FirstID from Database A. This can be done by running SET IDENTITY_INSERT TableName ON just before the insert into TableName and including TableName.TableNameID in the insert statement. However, my colleagues have advised me against doing this.
2. Overwrite Second.FirstID with the new First.FirstID and, subsequently, Third.SecondID with the new Second.SecondID. I'm trying to apply this solution using OUTPUT and a TABLE variable, by outputting all First.FirstID values into a temporary table variable and associating them with table Second, similar to this answer. However, I'm stuck on how to associate and replace the Second.FirstIDs with the correct IDs in the temporary table. An answer on how to do this would also be accepted as the answer to this question.
3. Use solution No. 1 and update the primary and foreign keys using UPDATE CASCADE. I just got this idea, but I have a feeling it will be very tedious. More research needs to be done, but if there's an answer that shows how to implement this successfully, I'll accept that answer.
So how do I copy records from 3 inter-linked tables to another 3 similar tables with different primary keys? Are there any better solutions than the ones proposed above?
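You can use the OUTPUT clause. This approach assumes you add an OldId column to the target First and Second tables to carry the source keys.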
CREATE TABLE #First (NewId INT PRIMARY KEY, OldId INT)
INSERT INTO First
(
Name,
Year,
OldId -- Added new column
)
OUTPUT Inserted.FirstID, Inserted.OldId INTO #First
SELECT
Name,
Year,
FirstID -- Old Id to OldId Column
FROM
DB_A.dbo.First
WHERE
Year >= 1992
Second Table
CREATE TABLE #Second (NewId INT PRIMARY KEY, OldId INT)
INSERT INTO Second
(
FirstID,
Day,
Month,
OldId -- Added new column
)
OUTPUT Inserted.SecondID, Inserted.OldId INTO #Second
SELECT
NF.NewId, --FirstID
S.Day,
S.Month,
S.SecondID
FROM
DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID INNER JOIN
#First NF ON F.FirstID = NF.OldId -- Old ids here
WHERE
S.Month > 6
Last one
INSERT INTO Third
(
SecondID,
Speed,
Remark
)
SELECT
OS.NewId, -- SecondID
Speed,
Remark
FROM
DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID INNER JOIN
#Second OS ON S.SecondID = OS.OldId
WHERE Remark IS NOT NULL
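The idea behind the statements above is to keep an old-id to new-id mapping for each parent table and use it when inserting the children. The sketch below shows that idea with Python's built-in sqlite3 module; it is not T-SQL and does not use the OUTPUT clause. A Python dictionary (first_map, a name invented here) plays the role of the #First mapping table, two in-memory databases stand in for Database A and Database B, and the sample rows are hypothetical.

```python
import sqlite3

src = sqlite3.connect(":memory:")   # stands in for Database A
dst = sqlite3.connect(":memory:")   # stands in for Database B
schema = """
CREATE TABLE First  (FirstID  INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT, Year INTEGER);
CREATE TABLE Second (SecondID INTEGER PRIMARY KEY AUTOINCREMENT,
                     FirstID INTEGER REFERENCES First(FirstID), Day INTEGER, Month INTEGER);
"""
src.executescript(schema)
dst.executescript(schema)
src.executescript("""
INSERT INTO First (FirstID, Name, Year) VALUES (7, 'old', 1990), (8, 'alpha', 1995), (9, 'beta', 2001);
INSERT INTO Second (FirstID, Day, Month) VALUES (8, 1, 7), (8, 2, 3), (9, 5, 12), (7, 9, 9);
""")

# Copy the parents and remember which new key each old key received.
first_map = {}
for old_id, name, year in src.execute(
        "SELECT FirstID, Name, Year FROM First WHERE Year >= 1992 ORDER BY FirstID"):
    cur = dst.execute("INSERT INTO First (Name, Year) VALUES (?, ?)", (name, year))
    first_map[old_id] = cur.lastrowid

# Copy the children, translating the foreign key through the mapping.
for old_first, day, month in src.execute(
        "SELECT FirstID, Day, Month FROM Second WHERE Month > 6 ORDER BY SecondID"):
    if old_first in first_map:  # skip children whose parent was not copied
        dst.execute("INSERT INTO Second (FirstID, Day, Month) VALUES (?, ?, ?)",
                    (first_map[old_first], day, month))

copied = dst.execute("""
    SELECT f.Name, s.Day, s.Month
    FROM Second s JOIN First f ON f.FirstID = s.FirstID
    ORDER BY s.SecondID
""").fetchall()
print(first_map, copied)
```

The Third table would follow the same pattern with a second mapping from old SecondID to new SecondID.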
First Solution
Using MERGE and OUTPUT together
Unlike INSERT ... OUTPUT, the OUTPUT clause of a MERGE statement can reference columns of the source table, so it can capture each old primary key alongside the newly generated one as the row is inserted.
Second Solution
NOTE: This solution only works if you are sure that you have another column that has its values unique in the table besides the table's primary key.
You may use this column as a link between the table in the source database and its sister table in the target database. The code below is an example taking into account that First.Name has unique values when month > 6.
-- no changes to insert code in First table
INSERT INTO First
(Name, Year)
SELECT Name, Year
FROM DB_A.dbo.First
WHERE Year >= 1992
INSERT INTO Second
(FirstID, Day, Month)
SELECT CurrentF.FirstID, Day, Month -- 2. Use the FirstID that has been input in First table
FROM DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID INNER JOIN
First CurrentF ON CurrentF.Name = F.Name -- 1. Join Name as a link
WHERE Month > 6
INSERT INTO Third
(SecondID, Speed, Remark)
SELECT CurrentS.SecondID, Speed, Remark --5. Get the proper SecondID
FROM DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID INNER JOIN
First CurrentF ON CurrentF.Name = F.Name INNER JOIN -- 3. Join using Name as Link
Second CurrentS ON CurrentS.FirstID= CurrentF.FirstID -- 4. Link Second and First table to get the proper SecondID.
WHERE Remark IS NOT NULL

Update multiple row values to same row and different columns

I am trying to update a table's columns from another table.
In the person table, there can be multiple contact persons with the same inst_id.
I have a firm table, which should hold the latest 2 contact details from the person table.
I am expecting the firm table as below:
If there is only one contact person, update person1 and email1. If there are 2, update both. If there are 3, discard the 3rd one.
Can someone help me with this?
This should work:
;with cte (rn, id, inst_id, person_name, email) as (
select row_number() over (partition by inst_id order by id) rn, *
from person
)
update f
set
person1 = cte1.person_name,
email1 = cte1.email,
person2 = cte2.person_name,
email2 = cte2.email
from firm f
left join cte cte1 on f.inst_id = cte1.inst_id and cte1.rn = 1
left join cte cte2 on f.inst_id = cte2.inst_id and cte2.rn = 2
The common table expression (cte) used as a source for the update numbers rows in the person table, partitioned by inst_id, and then the update joins the cte twice (for top 1 and top 2).
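The numbering step can be tried out with Python's built-in sqlite3 module. This is a sketch rather than the T-SQL above: it uses correlated subqueries against the cte instead of UPDATE ... FROM with two joins, it needs an SQLite build with window function support (3.25 or later), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, inst_id INTEGER, person_name TEXT, email TEXT);
CREATE TABLE firm (inst_id INTEGER PRIMARY KEY,
                   person1 TEXT, email1 TEXT, person2 TEXT, email2 TEXT);
-- Hypothetical data: one contact for institution 100, three for 101.
INSERT INTO person VALUES
  (1, 100, 'abc', 'abc@inst.com'),
  (2, 101, 'efg', 'efg@xym.com'),
  (3, 101, 'ijk', 'ijk@fg.com'),
  (4, 101, 'rtw', 'rtw@rtw.com');
INSERT INTO firm (inst_id) VALUES (100), (101);

-- Number the contacts per institution, then copy rows 1 and 2 into the firm columns.
WITH cte AS (
  SELECT ROW_NUMBER() OVER (PARTITION BY inst_id ORDER BY id) AS rn,
         inst_id, person_name, email
  FROM person
)
UPDATE firm SET
  person1 = (SELECT person_name FROM cte WHERE cte.inst_id = firm.inst_id AND cte.rn = 1),
  email1  = (SELECT email       FROM cte WHERE cte.inst_id = firm.inst_id AND cte.rn = 1),
  person2 = (SELECT person_name FROM cte WHERE cte.inst_id = firm.inst_id AND cte.rn = 2),
  email2  = (SELECT email       FROM cte WHERE cte.inst_id = firm.inst_id AND cte.rn = 2);
""")

firms = conn.execute("SELECT * FROM firm ORDER BY inst_id").fetchall()
for row in firms:
    print(row)
```

The third contact of institution 101 gets row number 3 and is therefore never copied. Ordering by id DESC in the window would pick the two most recent contacts instead of the two earliest.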
I think you won't need this update at all if you rethink your database structure. One great advantage of relational databases is that you don't need to store the same data several times in several tables. Instead, you keep one single table for one kind of data (like the person table in your case) and then reference it (through relationships or foreign keys, for example).
So what does this mean for your example? I suggest creating an institution table with two attributes such as contactperson1 and contactperson2. Don't store all the contact details (like email and name) there, just the primary key of the person, and make it a foreign key.
So you have a table 'Person' that should look something like this:
ID  INSTITUTION_ID  NAME  EMAIL
1   100             abc   abc@inst.com
2   101             efg   efg@xym.com
3   101             ijk   ijk@fg.com
4   101             rtw   rtw@rtw.com
...
And a table "Institution" like:
ID CONTACTPERSON1 CONTACTPERSON2
100 1 NULL
101 2 3
...
If you now want to change an email address, just update the person table. You don't need to update the firm table.
And how do you get your desired "table" with the two contact persons' details? Just make a query:
SELECT i.id, p1.name, p1.email, p2.name, p2.email
FROM institution i LEFT OUTER JOIN person p1 ON (i.contactperson1 = p1.id)
LEFT OUTER JOIN person p2 ON (i.contactperson2 = p2.id)
If you need this query often and access it like a "table" just store it as a view.
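As a rough illustration of the view suggestion, here is a sketch using Python's built-in sqlite3 module. The view name institution_contacts and the column aliases are invented for this example, and the rows are sample data. It shows that a change made once in the person table is visible through the view without any further update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, institution_id INTEGER, name TEXT, email TEXT);
CREATE TABLE institution (id INTEGER PRIMARY KEY,
                          contactperson1 INTEGER REFERENCES person(id),
                          contactperson2 INTEGER REFERENCES person(id));
INSERT INTO person VALUES
  (1, 100, 'abc', 'abc@inst.com'),
  (2, 101, 'efg', 'efg@xym.com'),
  (3, 101, 'ijk', 'ijk@fg.com'),
  (4, 101, 'rtw', 'rtw@rtw.com');
INSERT INTO institution VALUES (100, 1, NULL), (101, 2, 3);

-- Hypothetical view name; it wraps the query with the two outer joins.
CREATE VIEW institution_contacts AS
SELECT i.id,
       p1.name AS name1, p1.email AS email1,
       p2.name AS name2, p2.email AS email2
FROM institution i
LEFT OUTER JOIN person p1 ON i.contactperson1 = p1.id
LEFT OUTER JOIN person p2 ON i.contactperson2 = p2.id;
""")

def contacts(institution_id):
    """Read one institution's contact details through the view."""
    return conn.execute(
        "SELECT * FROM institution_contacts WHERE id = ?", (institution_id,)
    ).fetchone()

before = contacts(100)
# Change the email in exactly one place: the person table.
conn.execute("UPDATE person SET email = 'new@inst.com' WHERE id = 1")
after = contacts(100)
print(before, after, contacts(101))
```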

SQL Query: how to select records, but if a parent record exists select the most recent child of it

I have a complicated query I need to figure out but I'm not well versed enough in writing queries and sub queries.
The problem: I need to retrieve unique Patient records but if a record has a non null master_patient_id, I need to subquery or join on that master_patient table and query for the most recent (created_at desc limit 1) child patient of that master_patient.
The reason for this is that our system will create a new patient record for the same patient if they were readmitted to the same facility. Upon creating the 2nd record for a given patient we also create a master_patient record to associate the 2 patient records with it so that the system can know they are the same patient.
Now, I need to show a list of non duplicate patients. So I need to have a query that will get patients from the patient record, but query the master_patient table and only retrieve the latest patient associated to its master_patient.
Patient Table has: id, name, master_patient_id
and the patient belongs_to master_patient but isn't required to be present.
Master Patient table just has an id and has_many patients.
Desired results: should be unique patient records, but the only way to find out if patients are unique among themselves is to query the master_patient table to see if any patients belong_to it and then just retrieve the latest patient (child of master_patient).
I can't base my query off master_patient because those don't exist for patients that only have 1 record. Should I use some type of join or subquery?
Update: Thanks to @τεκ I was able to tweak his query to work in Postgres:
Update 2: 1 more tiny tweak to the query to make it shorter and correct a null id being returned:
SELECT MAX(patients.id) as id, *
FROM "patients"
JOIN (
SELECT MAX(created_at) AS created_at,
patient_master_id,
COALESCE(patient_master_id, id) pm_id
FROM patients
GROUP BY patient_master_id,
COALESCE(patient_master_id, id)
) s
ON (s.pm_id = patients.id or s.patient_master_id = patients.patient_master_id)
AND s.created_at = patients.created_at
GROUP BY patients.id, s.created_at, s.patient_master_id, s.pm_id
select max(p.id) as id from patient p
join (select
max(created_at) as created_at,
master_patient_id,
case when master_patient_id is null then id else null end as id
from patient
group by master_patient_id, case when master_patient_id is null then id else null end
) s on (s.id = p.id or s.master_patient_id = p.master_patient_id) and s.created_at = p.created_at
group by s.master_patient_id, s.id
There's probably a simpler, postgres-specific way to do it, but I don't know postgres. In T-SQL it's cross apply.
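One possible alternative, shown here only as a sketch, is a window function that numbers each patient's records from newest to oldest and keeps the first one. This is a different technique from the join on MAX(created_at) above. The example runs on Python's built-in sqlite3 module (SQLite 3.25 or later) rather than Postgres, uses the table and column names from the question text (patient, master_patient_id), and the sample rows and the grp key are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_patient (id INTEGER PRIMARY KEY);
CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT,
                      master_patient_id INTEGER REFERENCES master_patient(id),
                      created_at TEXT);
-- Hypothetical data: Ann was readmitted, so her two records share master patient 1.
INSERT INTO master_patient VALUES (1);
INSERT INTO patient VALUES
  (1, 'Ann (first stay)', 1,    '2020-01-01'),
  (2, 'Bob',              NULL, '2020-02-01'),
  (3, 'Ann (readmitted)', 1,    '2020-03-01'),
  (4, 'Cid',              NULL, '2020-04-01');
""")

latest = conn.execute("""
    SELECT id, name
    FROM (
        SELECT p.id, p.name,
               ROW_NUMBER() OVER (
                   -- Patients with a master share one group; the others form their own.
                   -- The 'm'/'p' prefixes keep master ids and patient ids from colliding.
                   PARTITION BY CASE WHEN p.master_patient_id IS NULL
                                     THEN 'p' || p.id
                                     ELSE 'm' || p.master_patient_id END
                   ORDER BY p.created_at DESC, p.id DESC
               ) AS rn
        FROM patient p
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()
print(latest)
```

Each patient without a master record is returned as is, and for a group of records that share a master patient only the most recently created one is kept.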