How to merge tables without duplicates and maintain foreign references? - sql

I'm creating a data warehouse for a healthcare company. They have separate databases for different hospitals, each containing tables for patients, their insurance, etc. A PK is unique only within one hospital's DB. When merging, I'm supposed to create a Master Patient table, a Master Insurance Company table, etc., each combining duplicate data into one record (e.g. by comparing the name and SSN fields for patients).
Any suggestions on how to do this merge and create correct FK references in the new tables? A record in the Patient table needs to have a correct reference to an insurance company in the Insurance table. Any help or general pointers are appreciated!

Load the data from the first hospital (H1) into the warehouse. Then, move in the patient data from the second hospital (H2):
insert into H1.Patients( pid, ... )
select P2.pid, P2.this, P2.that, ...
from H2.Patients P2
left join H1.Patients P1
on P1.ssn = P2.ssn
where P1.pid is null;
Now you have added the H2 patients that were not already in the H1 patient table, while patients already present keep their H1 patient ids. (You may have to handle pid collisions, since H2's pid values can clash with H1's.) Next, join the H2 insurance table with the H2 patient table to get each SSN, then join that with the H1 patient table to get the H1 patient id (pid):
insert into H1.Insurance( pid, ...)
select P1.pid, I2.this, I2.that,... -- To get H1's pid for H2's patients...
from H2.Insurance I2 -- Join the 2nd hospital's insurance table
join H2.Patients P2 -- ...with its patient table
on P2.pid = I2.pid -- ...based on its existing patient ID value.
join H1.Patients P1 -- Now you can join with the first hospital's patient table
on P1.ssn = P2.ssn -- ...using SSN from 2nd hospital's patient table
where anything_else; -- placeholder for any further filter you need
Repeat for other tables, using P1.pid to replace all uses of P2.pid.
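
The two steps above can be tried end to end in a small sketch. It uses SQLite through Python's `sqlite3` module purely so that it runs anywhere. The table names (`h1_patients`, `h2_insurance`, ...), the `name` and `company` columns, and the sample rows are all made up for illustration. Unlike the statement above, it omits `pid` from the patient insert so the warehouse assigns a fresh id, which is one possible way to avoid pid collisions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Warehouse (H1) tables; pid is assigned by the warehouse.
    CREATE TABLE h1_patients (pid INTEGER PRIMARY KEY, ssn TEXT UNIQUE, name TEXT);
    CREATE TABLE h1_insurance (pid INTEGER REFERENCES h1_patients(pid), company TEXT);
    -- Second hospital (H2) tables, whose pids overlap with H1's.
    CREATE TABLE h2_patients (pid INTEGER PRIMARY KEY, ssn TEXT, name TEXT);
    CREATE TABLE h2_insurance (pid INTEGER, company TEXT);

    INSERT INTO h1_patients VALUES (1, '111', 'Ann'), (2, '222', 'Bob');
    INSERT INTO h1_insurance VALUES (1, 'Acme'), (2, 'Zenith');
    INSERT INTO h2_patients VALUES (1, '222', 'Bob'), (2, '333', 'Cy');
    INSERT INTO h2_insurance VALUES (1, 'Nova'), (2, 'Acme');

    -- Step 1: add only the H2 patients whose SSN is not yet in the warehouse.
    -- pid is omitted, so SQLite assigns a new one instead of reusing H2's.
    INSERT INTO h1_patients (ssn, name)
    SELECT p2.ssn, p2.name
    FROM h2_patients p2
    LEFT JOIN h1_patients p1 ON p1.ssn = p2.ssn
    WHERE p1.pid IS NULL;

    -- Step 2: copy H2 insurance rows, replacing H2's pid with the
    -- warehouse pid that shares the same SSN.
    INSERT INTO h1_insurance (pid, company)
    SELECT p1.pid, i2.company
    FROM h2_insurance i2
    JOIN h2_patients p2 ON p2.pid = i2.pid
    JOIN h1_patients p1 ON p1.ssn = p2.ssn;
""")

patients = con.execute(
    "SELECT pid, ssn, name FROM h1_patients ORDER BY pid"
).fetchall()
insurance = con.execute("""
    SELECT p.name, i.company
    FROM h1_insurance i
    JOIN h1_patients p ON p.pid = i.pid
    ORDER BY p.pid, i.company
""").fetchall()
print(patients)
print(insurance)
```

Bob exists in both hospitals but ends up with a single patient row, and his H2 insurance row is attached to his H1 pid. Matching on SSN alone is a simplification; real data would likely need the name comparison mentioned in the question as well.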

The other answer just seems horribly inefficient. Instead of doing so many joins, I would write a SELECT from H1.P P1 and H2.P P2 where P1.SSN = P2.SSN and insert the result into H1. Then select the insurance FK, guarantor FK, and whatever else you have from this result and left join it to this table. Do something similar for the Insurance tables. Then select the records where the insurance ID matches the insurance FK in the patient table and update them with the new surrogate keys.
That said, I would like someone with more experience to take on this question as well.

Related

How to link tables correctly in SQL to add roles to staff?

Currently I have a staff table with columns:
Staff_Id, first_name, Surname.
My second table is:
Id, management_role.
When I link the tables, each staff member gets added to every management role. For example, a person in the first table called Jim is added three times, as manager, supervisor, and intern, and this happens for every staff member.
Some things to consider: are your ID columns primary keys for their respective tables? If not, is every value in the column unique? Also are ids not
From your description, you might be using a cross join here. What you need is an inner join, so that only rows with matching ids are joined together.
So you can do:
SELECT *
FROM staff_table as st
INNER JOIN management_table as mt
ON st.Staff_Id = mt.ID
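
To see the difference in row counts, here is a small sketch using SQLite through Python's `sqlite3` module. The sample names and roles are invented, and it assumes (as the join condition above does) that the `Id` column of the management table holds the matching staff id.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE staff_table (Staff_Id INTEGER PRIMARY KEY, first_name TEXT, Surname TEXT);
    -- Assumption: Id identifies the staff member who holds the role.
    CREATE TABLE management_table (Id INTEGER PRIMARY KEY, management_role TEXT);
    INSERT INTO staff_table VALUES (1, 'Jim', 'Hall'), (2, 'Sue', 'Lee'), (3, 'Al', 'Kim');
    INSERT INTO management_table VALUES (1, 'manager'), (2, 'supervisor'), (3, 'intern');
""")

# No join condition: every staff row is paired with every role row.
cross_rows = con.execute("""
    SELECT st.first_name, mt.management_role
    FROM staff_table st, management_table mt
""").fetchall()

# Inner join: only rows whose ids match are paired.
inner_rows = con.execute("""
    SELECT st.first_name, mt.management_role
    FROM staff_table AS st
    INNER JOIN management_table AS mt ON st.Staff_Id = mt.Id
    ORDER BY st.Staff_Id
""").fetchall()

print(len(cross_rows), len(inner_rows))
print(inner_rows)
```

With three staff rows and three role rows, the first query returns nine rows (the symptom described in the question) and the second returns three.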

SQL Statement querying same table 2 times

Ok, I've got brain cells melting at an alarming rate on this SQL statement. Not my database, but I've been tasked with extracting data. So here's what I'm dealing with...
It is medical data. We have a database where ALL of the people are listed in one table- patients, as well as doctors. Each person has a unique PersonID. Let's just start with the Person table:
Person:
PersonID, PersonType, LastName, FirstName
I have another table that is hospital admissions.
Admissions:
AdmissionID, PersonID and PrimaryMD
where PrimaryMD holds the PersonID of a doctor.
I need to extract each Admission, with the last name, and then the first name of the patient, but then I need to go back, based on the PrimaryMD identifier and use that value to pull the last name and first name of the doctor so that my results look like:
Admission | PatientLastName | PatientFirstName | DoctorLastName | DoctorFirstName
Ultimately, I'll need to pull address information for both the patient and the doctor. It is all stored in an address table keyed by the same PersonID as the Person table, so the doctor's address has to be looked up using PrimaryMD. But I can't figure out how to write two queries in the same statement against these similar columns. I tried using aliases, some left and inner joins, and even a union, but I can't seem to get things right.
Any assistance would be hugely appreciated.
Try this:
SELECT
a.AdmissionID,
pat.LastName AS PatientLastName,
pat.FirstName AS PatientFirstName,
doc.LastName AS DoctorLastName,
doc.FirstName AS DoctorFirstName
FROM Admissions a
INNER JOIN Person pat
ON a.PersonID = pat.PersonID
INNER JOIN Person doc
ON a.PrimaryMD = doc.PersonID
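To get addresses, join the Addresses table twice in the same way, once for the patient and once for the doctor, and add whichever address columns you need to the SELECT list: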
SELECT
a.AdmissionID,
pat.LastName AS PatientLastName,
pat.FirstName AS PatientFirstName,
doc.LastName AS DoctorLastName,
doc.FirstName AS DoctorFirstName
FROM Admissions a
INNER JOIN Person pat
ON a.PersonID = pat.PersonID
INNER JOIN Person doc
ON a.PrimaryMD = doc.PersonID
INNER JOIN Addresses addPat
ON pat.PersonID = addPat.PersonID
INNER JOIN Addresses addDoc
ON doc.PersonID = addDoc.PersonID
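
A runnable sketch of the same query, using SQLite through Python's `sqlite3` module, may make the aliasing easier to follow. The `City` column and the sample rows are hypothetical, since the question does not list the columns of the address table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Person (PersonID INTEGER PRIMARY KEY, PersonType TEXT,
                         LastName TEXT, FirstName TEXT);
    CREATE TABLE Admissions (AdmissionID INTEGER PRIMARY KEY,
                             PersonID INTEGER, PrimaryMD INTEGER);
    -- City is a hypothetical address column.
    CREATE TABLE Addresses (PersonID INTEGER, City TEXT);

    INSERT INTO Person VALUES (1, 'patient', 'Doe', 'John'),
                              (2, 'doctor', 'Smith', 'Ana');
    INSERT INTO Admissions VALUES (100, 1, 2);
    INSERT INTO Addresses VALUES (1, 'Oslo'), (2, 'Bergen');
""")

# Person and Addresses each appear twice under different aliases:
# pat/addPat follow a.PersonID, doc/addDoc follow a.PrimaryMD.
rows = con.execute("""
    SELECT a.AdmissionID,
           pat.LastName, pat.FirstName, addPat.City,
           doc.LastName, doc.FirstName, addDoc.City
    FROM Admissions a
    INNER JOIN Person pat ON a.PersonID = pat.PersonID
    INNER JOIN Person doc ON a.PrimaryMD = doc.PersonID
    INNER JOIN Addresses addPat ON pat.PersonID = addPat.PersonID
    INNER JOIN Addresses addDoc ON doc.PersonID = addDoc.PersonID
""").fetchall()
print(rows)
```

Each alias behaves like an independent copy of the table, which is why the patient and doctor columns can be selected side by side in one row.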

Returning two pieces of data from a table based on two different joins

I'm sure there is an answer out there for this, but I'm having a hard time explaining exactly what I'm looking for, which makes it hard to research.
I basically have 2 tables:
Table A:
PrimaryKeyID
SalesmanID
ManagerID
Table B:
List of all employees, with ID as the primary key (auto-incremented from 1)
I need to get both the Salesman and Manager Names from Table B, from a specific row in table A. Consider Table A like a transaction log.
Here is one way to do it:
SELECT A.PrimaryKeyID, SalesMan.NAME AS SalesmanName, Manager.NAME AS ManagerName
FROM TableA A
LEFT JOIN TableB SalesMan ON SalesMan.Id = A.SalesmanID
LEFT JOIN TableB Manager ON Manager.Id = A.ManagerID
WHERE (A.Your condition here) -- placeholder for your own filter on TableA
AND (SalesMan.Id IS NOT NULL OR Manager.Id IS NOT NULL) -- keep rows with at least one match
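
The effect of the two LEFT JOINs and the final NULL check can be seen in a small sketch, run with SQLite through Python's `sqlite3` module. The `Name` column and the sample rows are assumptions; the question only says that Table B has an `ID` primary key.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE TableB (Id INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT);
    CREATE TABLE TableA (PrimaryKeyID INTEGER PRIMARY KEY,
                         SalesmanID INTEGER, ManagerID INTEGER);
    INSERT INTO TableB (Name) VALUES ('Sam'), ('Mia');
    INSERT INTO TableA VALUES (10, 1, 2),        -- both ids present
                              (11, 1, NULL),     -- no manager recorded
                              (12, NULL, NULL);  -- neither recorded
""")

rows = con.execute("""
    SELECT A.PrimaryKeyID, SalesMan.Name, Manager.Name
    FROM TableA A
    LEFT JOIN TableB SalesMan ON SalesMan.Id = A.SalesmanID
    LEFT JOIN TableB Manager ON Manager.Id = A.ManagerID
    -- Drop rows where neither lookup found a match.
    WHERE SalesMan.Id IS NOT NULL OR Manager.Id IS NOT NULL
    ORDER BY A.PrimaryKeyID
""").fetchall()
print(rows)
```

Row 11 is kept with a NULL manager name because the join is a LEFT JOIN, while row 12 is removed by the WHERE clause.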

Inner joins with 2 foreign keys to one primary key

I have a table called
branch (branchid, branchname)
and another table called transfer
transfer(tranferid, sourcebranch, destinationbranch)
Both sourcebranch and destinationbranch are FKs to the branchid of the branch table.
I need to show a query that looks like this
Tranferid Source Destination
4 uk us
but all I can get is something like this
Tranferid Source Destinationid
4 uk 3
query sample
select tranferid, branch.branchname, transfer.destinationbranch
from transfer
inner join branch on branch.branchid == transfer.sourcebranch
How do I get the destination branch name to show? I have a CTE in mind.
You need to join table branch on table transfer twice so you can get the value for each column.
SELECT a.*,
b.branchName AS sourceBranchName,
c.branchName AS destinationBranchName
FROM transfer a
INNER JOIN branch b
ON a.sourcebranch = b.branchID
INNER JOIN branch c
ON a.destinationbranch = c.branchID
To learn more about joins, see the link below:
Visual Representation of SQL Joins
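
As a quick check, the sketch below runs the double join on the sample values from the question (transfer 4 from uk to us), using SQLite through Python's `sqlite3` module. The branch ids 1 and 2 and the `fr` row are made up; only destination id 3 appears in the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE branch (branchid INTEGER PRIMARY KEY, branchname TEXT);
    CREATE TABLE transfer (
        tranferid INTEGER PRIMARY KEY,
        sourcebranch INTEGER REFERENCES branch(branchid),
        destinationbranch INTEGER REFERENCES branch(branchid)
    );
    INSERT INTO branch VALUES (1, 'uk'), (2, 'fr'), (3, 'us');
    INSERT INTO transfer VALUES (4, 1, 3);
""")

# branch is joined once per foreign key, under aliases b and c.
rows = con.execute("""
    SELECT a.tranferid,
           b.branchname AS sourceBranchName,
           c.branchname AS destinationBranchName
    FROM transfer a
    INNER JOIN branch b ON a.sourcebranch = b.branchid
    INNER JOIN branch c ON a.destinationbranch = c.branchid
""").fetchall()
print(rows)
```

No CTE is needed here; the second alias is what turns the destination id into a name.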

SQL Join 2 tables

I have two tables: one named Person, which contains the columns ID and Name, and a second named Relation, which contains two columns, each holding the ID of a Person. It represents a relation between a customer and a serviceman. I'd like to join these two tables so that I get the names of the people in every relation. Is it possible to write this query with some kind of join?
EDIT:
I must be doing something wrong, because it's not working. I tried many forms of such queries, but I only got one column or errors. It's actually a school task, and I have already completed it (with a different JOIN query). I tried this approach first but failed. It seems to be a very common situation, so I don't know why it's so complicated for me.
Here are my tables:
CREATE TABLE Oprava..(Repair) (
KodPodvozku INTEGER PRIMARY KEY REFERENCES Automobil(KodPodvozku),
IDzakaznika..(IDcustomer) INTEGER REFERENCES Osoba(ID),
IDzamestnance..(IDemployee) INTEGER REFERENCES Osoba(ID)
);
CREATE TABLE Osoba..(Person) (
ID INTEGER CONSTRAINT primaryKeyOsoba PRIMARY KEY ,
Jmeno..(Name) VARCHAR(256) NOT NULL,
OP INTEGER UNIQUE NOT NULL
);
It's in Czech, but the words in brackets after ".." are the English equivalents.
PS: I am using Oracle SQL.
Assuming your tables are:
persons: (id, name)
relations: (customer_id, serviceman_id)
Using standard SQL:
SELECT p1.name AS customer_name,
p2.name AS serviceman_name
FROM persons p1
JOIN relations ON p1.id=relations.customer_id
JOIN persons p2 ON relations.serviceman_id=p2.id;
Further explanation
The join creates the following table:
p1.id|p1.name|relations.customer_id|relations.serviceman_id|p2.id|p2.name
Where p1.id=relations.customer_id, and p2.id=relations.serviceman_id. The SELECT clause chooses only the names from the JOIN.
Note that if all the ids from relations are also in persons, the result size would be exactly the size of the relations table. You might want to add a foreign key to verify that.
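
The point about result size and foreign keys can be checked with a short sketch. It uses SQLite through Python's `sqlite3` module rather than Oracle, so the details differ from the asker's setup: SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`. The sample names are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
con.executescript("""
    CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE relations (
        customer_id INTEGER NOT NULL REFERENCES persons(id),
        serviceman_id INTEGER NOT NULL REFERENCES persons(id)
    );
    INSERT INTO persons VALUES (1, 'Eva'), (2, 'Jan'), (3, 'Petr');
    INSERT INTO relations VALUES (1, 3), (2, 3);
""")

# A relation pointing at a non-existent person is rejected by the FK.
try:
    con.execute("INSERT INTO relations VALUES (1, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

joined = con.execute("""
    SELECT p1.name AS customer_name, p2.name AS serviceman_name
    FROM persons p1
    JOIN relations r ON p1.id = r.customer_id
    JOIN persons p2 ON r.serviceman_id = p2.id
    ORDER BY p1.id
""").fetchall()
relation_count = con.execute("SELECT COUNT(*) FROM relations").fetchone()[0]
print(rejected, joined, relation_count)
```

Because every id in `relations` is guaranteed to exist in `persons`, the join returns exactly one row per relation.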

SELECT *
FROM Relation
INNER JOIN Person P1
ON P1.ID = Relation.FirstPersonID
INNER JOIN Person P2
ON P2.ID = Relation.SecondPersonID

SELECT p1.name AS customer, p2.name AS serviceman
FROM person p1, person p2, relation
WHERE p1.id=relation.customerid AND p2.id=relation.servicemanid

Person(ID, Name)
Relation(ID)
You don't mention the other columns that Relation contains, but this is what you need:
Select name
from Person p
join Relation r
on p.ID = r.ID
This is an INNER JOIN, as are most of the other answers here. Please don't use it until you understand that a record with no matching row in the other table will be missing from the result set (i.e. you can lose data).
It's very important to understand the different types of join, so I would use this as an opportunity to learn them.
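
A minimal sketch of that data loss, using SQLite through Python's `sqlite3` module: the `customer_id` and `serviceman_id` column names and the sample rows are assumptions for illustration. One relation has no serviceman assigned yet.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE relation (customer_id INTEGER, serviceman_id INTEGER);
    INSERT INTO person VALUES (1, 'Eva'), (2, 'Jan');
    INSERT INTO relation VALUES (1, 2),      -- Eva is served by Jan
                                (2, NULL);   -- Jan has no serviceman yet
""")

# INNER JOIN drops the relation whose serviceman_id matches nobody.
inner_rows = con.execute("""
    SELECT c.name, s.name
    FROM relation r
    INNER JOIN person c ON c.id = r.customer_id
    INNER JOIN person s ON s.id = r.serviceman_id
    ORDER BY c.id
""").fetchall()

# LEFT JOIN keeps every relation and fills the missing side with NULL.
left_rows = con.execute("""
    SELECT c.name, s.name
    FROM relation r
    LEFT JOIN person c ON c.id = r.customer_id
    LEFT JOIN person s ON s.id = r.serviceman_id
    ORDER BY c.id
""").fetchall()
print(inner_rows)
print(left_rows)
```

Which join is right depends on whether relations with a missing person should appear in the output at all.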