SQL DB table Relationship - sql

I am trying to create a small DB based on flat files that I import from a non-database system. The import works and the DB is fine, but I added a new table that contains data from another system. I am trying to create a relationship between the tables, but because one table has duplicate rows (the flat file is the source), I am not able to set that relationship.
Example: Table 1 lists all procedures done for a patient by a physician. The patient can have many of the same procedures on the same day by the same physician (hence the duplicate rows). Table 2 has a list of physicians and their ID numbers. I want to set up a relationship between the two tables based on the physician's name, but I am getting errors because of the non-unique data.
Does anyone have a tip?
Thanks

the patient can have many of the same procedures on the same day by the same physician (hence the duplicate rows)
Normally, you should be able to set a foreign key relationship from Table1 to Table2 even in the presence of duplicate rows. This kind of error usually means you're trying to set the foreign key in the wrong table.
-- Your "Table2"
create table physicians (
physician_id integer primary key,
physician_name varchar(35) not null -- names are not unique
);
insert into physicians values
(1, 'Doctor Who'), (2, 'Dr. Watson');
create table patients (
patient_id integer primary key,
patient_name varchar(35) not null -- names are not unique
);
insert into patients values
(100, 'Melville, Herman'), (101, 'Poe, Edgar Allen');
-- Your "Table1"
-- Allows multiple physicians per date.
create table patient_procedures(
patient_id integer not null references patients (patient_id),
physician_id integer not null references physicians(physician_id),
procedure_date date not null default current_date,
procedure_name varchar(15) not null,
primary key (patient_id, physician_id, procedure_date, procedure_name)
);
insert into patient_procedures values
(100, 1, '2012-01-02', 'CBC'),
(100, 1, '2012-01-02', 'Thyroid panel');

I'm not sure from your description where the duplicate-data problem is. You have:
Table 1: procedures. Could be lots of rows for same physician
Table 2: physicians. Should be 1 row per physician (but there may be duplicates)
The relationship that would make sense would be 1[table 2 physician row] -> many[table 1 procedures rows]. i.e. table 2 would be the primary key table in the relationship: each table 2 row relating to between 0 and "many" table 1 rows. If you try to create this kind of relationship, then multiple duplicate table 1 rows are not a problem.
If, however, you have multiple rows per physician in table 2, then you won't be able to create this kind of relationship, because table 2 rows are not unique and thus can't act as the primary key element in the relationship. The problem then is one of data-cleansing: figuring out which rows in table 2 are duplicates, updating the table 1 rows to point to just ONE physician out of the duplicates, and then deleting the duplicate rows from table 2.
You mention physician ID#s and physician names. Physician name would be a bad choice for a unique key; if a user tries to add a new physician called "John Smith" when there already is another physician of that name, either
You've set up a unique index on PhysicianName, their change gets rejected, and you have an irate user; or
You haven't, and all the procedures of the existing physician (let's call him John A. Smith) will also be associated with the other physician (let's call him John B. Smith), and vice versa.
The relationship should be set up using the physician ID. If Table 1 (Procedures) includes a physician ID column, you're in luck. If it only includes physician Name, then you may have a data-cleansing problem, if there are already duplicate physician Names in Table 2.
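As a rough illustration of the data-cleansing check described above, the following sketch lists physician names that occur on more than one row of Table 2. It uses Python's built-in sqlite3 module as a stand-in for whatever database the question uses; the table and column names follow the first answer, and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE physicians (
        physician_id   INTEGER PRIMARY KEY,
        physician_name TEXT NOT NULL          -- names are not unique
    );
    -- Invented sample data: two different IDs share one name.
    INSERT INTO physicians VALUES
        (1, 'John Smith'), (2, 'John Smith'), (3, 'Jane Doe');
""")

# Names that appear on more than one physician row. These are the rows
# that need cleaning up before a name could be used to link the tables.
duplicate_names = conn.execute("""
    SELECT physician_name, COUNT(*) AS n
    FROM physicians
    GROUP BY physician_name
    HAVING COUNT(*) > 1
    ORDER BY physician_name
""").fetchall()

print(duplicate_names)  # [('John Smith', 2)]
```

If this query returns no rows, names happen to be unique in the current data, but linking on the physician ID is still the safer choice for the reasons given above.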

Related

Created a join table with a Primary Key column, and two Foreign Key columns, not pulling in data

I created a new table called Expirations that has its own unique ID, an auto-incrementing identity column, as its primary key. I also want this table to pull in data from two other tables, so I created columns for the InsuranceId and the LicenseId, making both columns foreign keys to connect them to the IDs in their respective tables (Insurance and License).
Any idea why my data is not automatically filling in for my Expirations table? I believe it was created as "many to one" for all of these columns. I'm not sure whether that is correct either, as I want the Expirations table to list all insurance IDs and license IDs.
Anyone know what I am doing wrong here?
Foreign keys don't mean that a table gets filled automatically.
Let's say you have a person table and a company table and a company_person table to show which person works in which company. Now you insert three companies and four persons. That doesn't mean that the company_person table gets filled with 3 x 4 = 12 records (i.e. all persons work in all companies). That would make no sense.
Foreign keys merely guarantee that you cannot insert invalid data (i.e. a person that doesn't exist or a company that doesn't exist) into the table in question. You must insert the records yourself.
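A small sketch of the point above, using Python's sqlite3 module as a stand-in database (an assumption; the question does not say which system is in use). The company, person, and company_person tables follow the example in this answer; the column names and sample rows are made up. The link table stays empty until rows are inserted explicitly, and the foreign key only rejects rows that point at a missing parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE person  (person_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE company_person (
        company_id INTEGER NOT NULL REFERENCES company (company_id),
        person_id  INTEGER NOT NULL REFERENCES person (person_id),
        PRIMARY KEY (company_id, person_id)
    );
    INSERT INTO company VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO person  VALUES (10, 'P'), (11, 'Q'), (12, 'R'), (13, 'S');
""")

def link_count():
    return conn.execute("SELECT COUNT(*) FROM company_person").fetchone()[0]

# Nothing was filled in automatically by declaring the foreign keys.
rows_before = link_count()

# A valid link has to be inserted by hand.
conn.execute("INSERT INTO company_person VALUES (1, 10)")

# A link to a company that does not exist is rejected.
try:
    conn.execute("INSERT INTO company_person VALUES (99, 10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows_after = link_count()
print(rows_before, rows_after, rejected)  # 0 1 True
```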
Basically, your Expirations table is like a fact table, and you want to load data into it from the dimension tables: the Insurance table (InsuranceId as primary key) and the License table (LicenseId as primary key).
But if you do not have any combinations of InsuranceId and LicenseId, how do you know which InsuranceId belongs to which LicenseId?
If your Expirations table is empty, the only way to fill it from those two tables alone would be a cross join between the Insurance and License tables, which gives a Cartesian product. You do not want that, because it makes no sense in the real world.
Hope my explanation helps.

Storing multiple values in a single column

I have two columns in my table: [Employee_id, Employee_phno].
Employee_id: primary key, data type = int
Employee_phno: allows null, data type = int
Now, how can I insert two phone numbers in the same employee_id?
For example:
employee_id   employee_phno
1             xxxxxxxxx
              yyyyyyyyyy
If you want multiple values for Employee_phno, it is better to make another table for the phone numbers. In that second table, set a foreign key that relates it to your first table.
Example:
1st table
Employee_id
1
2
3
2nd table
Employee_id Employee_phno
1 1234
2 1512
2 4523
Here you can see the employee with id = 2 has multiple Employee_phno
It is not possible to insert data that way in a single table as you have defined it.
If Employee_id is the primary key, then you can have only one record per employee. Since you have only one field for Employee_phno, it's not possible to store two phone numbers for the same employee.
To do so, you will have to do one of the following:
1. Add another column to the table, such as Employee_Alternate_phno. If not all employees have two numbers, you can make this column allow NULLs.
2. Create another mapping table, say EmployeeNumbers, with EmployeeId as a foreign key plus the number field. Whenever you want both Employee_phno values, you can join to the mapping table and retrieve them.
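The separate-table approach suggested in both answers could look roughly like this. It is a sketch using Python's sqlite3 module as a stand-in database; the table name employee_phone is hypothetical, and storing the number as text rather than int is an assumption made here so that leading zeros survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY
    );
    -- One row per (employee, phone number); hypothetical table name.
    CREATE TABLE employee_phone (
        employee_id   INTEGER NOT NULL REFERENCES employee (employee_id),
        employee_phno TEXT    NOT NULL,
        PRIMARY KEY (employee_id, employee_phno)
    );
    INSERT INTO employee VALUES (1), (2), (3);
    INSERT INTO employee_phone VALUES (1, '1234'), (2, '1512'), (2, '4523');
""")

# LEFT JOIN keeps employees that have no phone number at all.
phones = conn.execute("""
    SELECT e.employee_id, p.employee_phno
    FROM employee AS e
    LEFT JOIN employee_phone AS p ON p.employee_id = e.employee_id
    ORDER BY e.employee_id, p.employee_phno
""").fetchall()

print(phones)  # [(1, '1234'), (2, '1512'), (2, '4523'), (3, None)]
```

Employee 2 gets two rows in the result and employee 3 gets one row with no number, which is the behaviour the single-table layout could not provide.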

Proper way to make a relation between multiple rows of single table

I've got the following situation: I want to connect multiple records from one table with some kind of relation. A record could have no connection to others, or could have several (one or more). There is no hierarchy in this relation.
For example:
CREATE TABLE x
(
x_id SERIAL NOT NULL PRIMARY KEY,
data VARCHAR(10) NOT NULL
);
I've thought of two ideas:
Make a new column in this table, which will contain some relationId. It won't reference anything. When a new record is inserted, I will generate a new relationId and put it there. If I want to connect another record with this one, I will simply give it the same relationId.
Example:
CREATE TABLE x
(
x_id NUMBER(19, 0) NOT NULL PRIMARY KEY,
data VARCHAR(10) NOT NULL,
relation_id NUMBER(19, 0) NOT NULL
);
insert into x values (nextval, 'blah', 1);
insert into x values (nextval, 'blah2', 1);
It will connect these two rows.
pros:
very easy
easy queries to get all records connected to particular record
no overhead
cons:
Hibernate entity will contain only relationId, no collection of related records (or maybe it's possible somehow?)
Make a separate join table, and connect rows with a many-to-many relation. The join table would contain two columns with IDs, so one entry would connect two rows.
Example:
CREATE TABLE x
(
x_id SERIAL NOT NULL PRIMARY KEY,
data VARCHAR(10) NOT NULL
);
CREATE TABLE bridge_x
(
x_id1 NUMBER(19, 0) NOT NULL REFERENCES x (x_id),
x_id2 NUMBER(19, 0) NOT NULL REFERENCES x (x_id),
PRIMARY KEY(x_id1, x_id2)
);
insert into x values (1, 'blah');
insert into x values (2, 'blah2');
insert into bridge_x values (1, 2);
insert into bridge_x values (2, 1);
pros:
normalized relation
easy Hibernate entity mapping, with a collection containing related records
cons:
overhead (with multiple connected rows, every pair must be inserted)
What is the best way to do this? Is there any other way than these two?
The best way in my experience is to use normalization as you've said in your second option. What you are looking for here is to create a foreign key.
So if you take the tables from your second example and apply the following SQL statement, each row in x can have zero to many relations.
ALTER TABLE `bridge_x` ADD CONSTRAINT `fk_1` FOREIGN KEY (`x_id1`) REFERENCES `x`(`x_id`) ON DELETE NO ACTION ON UPDATE NO ACTION;
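For illustration, here is one way the bridge-table option from the question might be queried. This is a sketch in Python's sqlite3 module, not the poster's actual setup. It also includes a variation that is not in the question: a CHECK constraint forces x_id1 < x_id2 so each pair is stored once, and a UNION query reads the relation in both directions. The helper name related_ids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE x (
        x_id INTEGER PRIMARY KEY,
        data TEXT NOT NULL
    );
    CREATE TABLE bridge_x (
        x_id1 INTEGER NOT NULL REFERENCES x (x_id),
        x_id2 INTEGER NOT NULL REFERENCES x (x_id),
        PRIMARY KEY (x_id1, x_id2),
        CHECK (x_id1 < x_id2)   -- assumption: store each unordered pair once
    );
    INSERT INTO x VALUES (1, 'blah'), (2, 'blah2'), (3, 'blah3');
    INSERT INTO bridge_x VALUES (1, 2), (1, 3);
""")

def related_ids(conn, x_id):
    """Return the ids directly linked to x_id, whichever side it was stored on."""
    rows = conn.execute("""
        SELECT x_id2 FROM bridge_x WHERE x_id1 = ?
        UNION
        SELECT x_id1 FROM bridge_x WHERE x_id2 = ?
        ORDER BY 1
    """, (x_id, x_id)).fetchall()
    return [row[0] for row in rows]

print(related_ids(conn, 1))  # [2, 3]
print(related_ids(conn, 2))  # [1]
```

Note that rows 2 and 3 are not linked to each other here. If the relation is meant to be a group in which every member is connected to every other member, each pair still has to be inserted, which is the overhead the question mentions.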

How to reference foreign key from more than one column (Inconsistent values)

I have three tables:
The first one is emps:
create table emps (id number primary key , name nvarchar2(20));
The second one is cars:
create table cars (id number primary key , car_name varchar2(20));
The third one is accounts:
create table accounts (acc_id number primary key, woner_table nvarchar2(20) ,
woner_id number references emps(id) references cars(id));
Now I Have these values for selected tables:
Emps:
ID Name
-------------------
1 Ali
2 Ahmed
Cars:
ID Name
------------------------
107 Camery 2016
108 Ford 2012
I want to insert values into the accounts table so that its data looks like this:
Accounts:
Acc_no Woner_Table Woner_ID
------------------------------------------
11013 EMPS 1
12010 CARS 107
I tried to perform this SQL statement:
Insert into accounts (acc_id , woner_table , woner_id) values (11013,'EMPS',1);
BUT I get this error:
ERROR at line 1:
ORA-02291: integrity constraint (HR.SYS_C0016548) violated - parent key not found.
This error occurs because the value of the woner_id column doesn't exist in the cars table.
My work requires linking the tables in this way.
How can I solve this problem?
That is: how can I reference the tables as described above and insert values without hitting this error?
One-of relationships are tricky in SQL. With your data structure here is one possibility:
create table accounts (
acc_id number primary key,
emp_id number references emps(id),
car_id number references cars(id),
id as (coalesce(emp_id, car_id)),
woner_table as (case when emp_id is not null then 'Emps'
when car_id is not null then 'Cars'
end),
constraint chk_accounts_car_emp check (emp_id is null or car_id is null)
);
You can fetch the id in a select. However, for the insert, you need to be explicit:
Insert into accounts (acc_id , emp_id)
values (11013, 1);
Note: Earlier versions of Oracle do not support virtual columns, but you can do almost the same thing using a view.
Your approach should be changed such that your Account table contains two foreign key fields - one for each foreign table. Like this:
create table accounts (acc_id number primary key,
empsId number references emps(id),
carsId number references cars(id));
The easiest, most straightforward method is, as STLDeveloper says, to add additional FK columns, one for each table. This also brings the benefit of the database being able to enforce referential integrity.
But if you choose not to do that, the next option is to use one column for the FK values and a second column to indicate which table the value refers to. This keeps the number of columns small (two at most), regardless of the number of tables referenced. However, it significantly increases the programming burden for the application logic and/or PL/SQL and SQL. And, of course, you completely lose database enforcement of RI.
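The "one FK column per parent table" layout recommended above can be sketched as follows, using Python's sqlite3 module as a stand-in for Oracle (so the types and syntax are SQLite's, not Oracle's). Two details are assumptions of this sketch rather than part of the answers: the CHECK constraint requires exactly one owner column to be filled, which is stricter than the check in the first answer, and the view name account_owners is hypothetical. The view takes the place of the virtual columns, as the first answer notes is possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE emps (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cars (id INTEGER PRIMARY KEY, car_name TEXT);

    CREATE TABLE accounts (
        acc_id INTEGER PRIMARY KEY,
        emp_id INTEGER REFERENCES emps (id),
        car_id INTEGER REFERENCES cars (id),
        -- assumption: exactly one of the two owner columns must be set
        CHECK ((emp_id IS NULL) <> (car_id IS NULL))
    );

    -- Hypothetical view that rebuilds the layout asked for in the question.
    CREATE VIEW account_owners AS
        SELECT acc_id,
               CASE WHEN emp_id IS NOT NULL THEN 'EMPS' ELSE 'CARS' END
                   AS woner_table,
               COALESCE(emp_id, car_id) AS woner_id
        FROM accounts;

    INSERT INTO emps VALUES (1, 'Ali'), (2, 'Ahmed');
    INSERT INTO cars VALUES (107, 'Camery 2016'), (108, 'Ford 2012');
    INSERT INTO accounts (acc_id, emp_id) VALUES (11013, 1);
    INSERT INTO accounts (acc_id, car_id) VALUES (12010, 107);
""")

owners = conn.execute(
    "SELECT acc_id, woner_table, woner_id FROM account_owners ORDER BY acc_id"
).fetchall()
print(owners)  # [(11013, 'EMPS', 1), (12010, 'CARS', 107)]

def insert_fails(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Car 999 does not exist, so the foreign key rejects the row.
bad_parent = insert_fails("INSERT INTO accounts (acc_id, car_id) VALUES (1, 999)")
# Both owners set at once violates the CHECK constraint.
two_owners = insert_fails(
    "INSERT INTO accounts (acc_id, emp_id, car_id) VALUES (2, 1, 107)"
)
```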

How do I check constraints between two tables when inserting into a third table that references the other two tables?

Consider this example schema:
Customer ( int CustomerId pk, .... )
Employee ( int EmployeeId pk,
int CustomerId references Customer.CustomerId, .... )
WorkItem ( int WorkItemId pk,
int CustomerId references Customer.CustomerId,
null int EmployeeId references Employee.EmployeeId, .... )
Basically, three tables:
A customer table with a primary key and some additional columns
An employee table with a primary key and a foreign key constraint referencing the customer table's primary key, representing an employee of the customer.
A work item table, which stores work done for the customer, and also info about the specific employee who the work was performed for.
My question is: how do I, at the database level, test whether an employee is actually associated with a customer when adding new work items?
If, for example, Scott (employee) works at Microsoft (customer), and Jeff (employee) works at StackOverflow (customer), how do I prevent somebody from adding a work item to the database with customer = Microsoft and employee = Jeff, which does not make sense?
Can I do it with check constraints or foreign keys or do I need a trigger to test for it manually?
Should mention that I use SQL Server 2008.
UPDATE: I should add that WorkItem.EmployeeId can be null.
Thanks, Egil.
Wouldn't a foreign key on a composite column (CustomerId, EmployeeId) work?
ALTER TABLE WorkItem
ADD CONSTRAINT FK_Customer_Employee FOREIGN KEY (CustomerId, EmployeeId)
REFERENCES Employee (CustomerId, EmployeeId);
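A sketch of how the composite foreign key might behave, written with Python's sqlite3 module as a stand-in for SQL Server 2008, so details may differ on the real system. One thing the statement above depends on is that the referenced column pair must itself be unique in Employee; the UNIQUE constraint below provides that. The Name columns and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Customer (
        CustomerId INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    CREATE TABLE Employee (
        EmployeeId INTEGER PRIMARY KEY,
        CustomerId INTEGER NOT NULL REFERENCES Customer (CustomerId),
        Name       TEXT NOT NULL,
        UNIQUE (CustomerId, EmployeeId)    -- target of the composite FK
    );
    CREATE TABLE WorkItem (
        WorkItemId INTEGER PRIMARY KEY,
        CustomerId INTEGER NOT NULL REFERENCES Customer (CustomerId),
        EmployeeId INTEGER,                -- may be NULL, as in the question
        FOREIGN KEY (CustomerId, EmployeeId)
            REFERENCES Employee (CustomerId, EmployeeId)
    );
    INSERT INTO Customer VALUES (1, 'Microsoft'), (2, 'StackOverflow');
    INSERT INTO Employee VALUES (1, 1, 'Scott'), (2, 2, 'Jeff');
    INSERT INTO WorkItem VALUES (1, 1, 1);     -- Microsoft + Scott
    INSERT INTO WorkItem VALUES (2, 1, NULL);  -- Microsoft, no employee
""")

# Microsoft + Jeff: the pair (1, 2) does not exist in Employee.
try:
    conn.execute("INSERT INTO WorkItem VALUES (3, 1, 2)")
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True

work_item_count = conn.execute("SELECT COUNT(*) FROM WorkItem").fetchone()[0]
print(mismatch_rejected, work_item_count)  # True 2
```

In this sketch a NULL EmployeeId simply skips the composite check, while the separate single-column foreign key on CustomerId still applies to that row.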
You might be able to do this by creating a view "WITH SCHEMABINDING" that spans those tables and enforces the collective constraints of the individual tables.
Why do you want EmployeeId to be nullable in WorkItem? Maybe you should add another table to avoid that particular oddity. From what I can see, the easiest thing to do is to add a unique constraint on EmployeeId in WorkItem, and maybe even a unique constraint on CustomerId if that is what you want.
A more general way to add constraints spanning many tables is to define a view that should always be empty, and add the constraint that it is empty.
What are you trying to model here?
1. You're a contracting agency or the like, and you have a bunch of contractors who are (for some period of time) assigned to a customer.
2. You're actually storing information about other companies' employees (maybe you're providing outsourced payroll services, for example).
In case (1), it looks like you have a problem with the Employee table. In particular, when Scott's contract with MS is up and he gets contracted to someone else, you can't keep the historical data, because you need to change the CustomerId. Which also invalidates all the WorkItems. Instead, you should have a fourth table, e.g., CustomerEmployee to store that. Then WorkItem should reference that table.
In case (2), your primary key on Employee should really be CustomerId, EmployeeId. Two customers could have the same employee ID number. Then Kieron's foreign key will work.
I recently ran into a similar situation. Consider the schema:
Table company (id_cia PK)
Table product_group (id_cia FK to company, id_group PK)
Table products (id_group FK to product_group, id_product PK, id_used_by_the_client null)
Rule: within a company, each id_used_by_the_client value may be assigned to only one product, but this field can be null. Example:
Insert into company (1) = allowed
Insert into company (2) = allowed
Insert into product_group (1, 1) = allowed
Insert into product_group (1,2) = allowed
Insert into product_group (2,3) = allowed
Insert into products values (1, 1, null) = allowed
Insert into products values (1, 2, null) = allowed
Insert into products values (1, 3, 1) = allowed
Insert into products values (1, 4, 1) = not allowed, in the group 1 that belongs to company 1 already exists an id_used_by_the_client = 1.
Insert into products values (2, 4, 1) = not allowed, in the group 2 that belongs to company 1 already exists an id_used_by_the_client = 1.
Insert into products values (3, 4, 1) = allowed, in the group 3 that belongs to company 2 there is no id_used_by_the_client = 1.
I decided to use a trigger to control this integrity.
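The answer does not show the trigger itself, so the following is only a guess at what such a trigger could look like, written for SQLite through Python's sqlite3 module rather than for the poster's database. The trigger name and error message are made up, and only INSERT is covered; an UPDATE of id_used_by_the_client or id_group would need a similar trigger. The inserts replay the example list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE company (id_cia INTEGER PRIMARY KEY);
    CREATE TABLE product_group (
        id_cia   INTEGER NOT NULL REFERENCES company (id_cia),
        id_group INTEGER PRIMARY KEY
    );
    CREATE TABLE products (
        id_group   INTEGER NOT NULL REFERENCES product_group (id_group),
        id_product INTEGER PRIMARY KEY,
        id_used_by_the_client INTEGER          -- may be NULL
    );

    -- Hypothetical trigger: reject a client id that is already used by any
    -- product in any group of the same company.
    CREATE TRIGGER products_client_id_unique
    BEFORE INSERT ON products
    WHEN NEW.id_used_by_the_client IS NOT NULL
    BEGIN
        SELECT RAISE(ABORT, 'id_used_by_the_client already used in this company')
        WHERE EXISTS (
            SELECT 1
            FROM products AS p
            JOIN product_group AS g  ON g.id_group  = p.id_group
            JOIN product_group AS ng ON ng.id_group = NEW.id_group
            WHERE g.id_cia = ng.id_cia
              AND p.id_used_by_the_client = NEW.id_used_by_the_client
        );
    END;

    INSERT INTO company VALUES (1), (2);
    INSERT INTO product_group VALUES (1, 1), (1, 2), (2, 3);
""")

def try_insert(row):
    """Insert one products row; return True if accepted, False if rejected."""
    try:
        conn.execute("INSERT INTO products VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

attempts = [(1, 1, None), (1, 2, None), (1, 3, 1),
            (1, 4, 1), (2, 4, 1), (3, 4, 1)]
results = [try_insert(row) for row in attempts]
print(results)  # [True, True, True, False, False, True]
```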
Either:
make the EmployeeID column the Primary Key of Employee (and possibly an auto-id) and store the EmployeeID in the WorkItem record as a foreign key, instead of storing the Employee and Customer IDs in WorkItem. You can retrieve a WorkItem's Customer details by joining to the Customer table via the Employee table.
Or:
make the WorkItem's EmployeeID and CustomerID columns a composite foreign key to Employee.
I favour the first approach, personally.