Creating PostgreSQL tables + relationships - PROBLEMS with relationships - ONE TO ONE - sql

So I am supposed to create this schema + relationships exactly the way this ERD depicts it. Here I only show the tables that I am having problems with:
So I am trying to make it one to one but for some reason, no matter what I change, I get one to many on whatever table has the foreign key.
This is my sql for these two tables.
CREATE TABLE lab4.factory(
factory_id INTEGER UNIQUE,
address VARCHAR(100) NOT NULL,
PRIMARY KEY ( factory_id )
);
CREATE TABLE lab4.employee(
employee_id INTEGER UNIQUE,
employee_name VARCHAR(100) NOT NULL,
factory_id INTEGER REFERENCES lab4.factory(factory_id),
PRIMARY KEY ( employee_id )
);
Here I get the same thing. I am not getting the one to one relationship but one to many. Invoiceline is a weak entity.
And here is my code for the second image.
CREATE TABLE lab4.product(
product_id INTEGER PRIMARY KEY,
product_name INTEGER NOT NULL
);
CREATE TABLE lab4.invoiceLine(
line_number INTEGER NOT NULL,
quantity INTEGER NOT NULL,
curr_price INTEGER NOT NULL,
inv_no INTEGER REFERENCES invoice,
product_id INTEGER REFERENCES lab4.product(product_id),
PRIMARY KEY ( inv_no, line_number )
);
I would appreciate any help. Thanks.

One-to-one isn't well represented as a first-class relationship type in standard SQL. Much like many-to-many, which is achieved using a connector table and two one-to-many relationships, there's no true "one to one" in SQL.
There are a couple of options:
Create an ordinary foreign key constraint ("one to many" style) and then add a UNIQUE constraint on the referring FK column. This means each referred-to value can appear at most once in the referring column, making the relationship one-to-one optional. This is a fairly simple and quite forgiving approach that works well.
Use a normal FK relationship that could model 1:m, and let your app ensure it's only ever 1:1 in practice. I do not recommend this: adding the unique index on the FK column has only a small write-performance cost, and it helps ensure data validity, find app bugs, and avoid confusing someone else who needs to modify the schema later.
Create reciprocal foreign keys - possible only if your database supports deferrable foreign key constraints. This is a bit more complex to code, but allows you to implement one-to-one mandatory relationships. Each entity has a foreign key reference to the other's PK in a unique column. One or both of the constraints must be DEFERRABLE and either INITIALLY DEFERRED or used with a SET CONSTRAINTS call, since you must defer one of the constraint checks to set up the circular dependency. This is a fairly advanced technique that is not necessary for the vast majority of applications.
Use pre-commit triggers if your database supports them, so you can verify that when entity A is inserted exactly one entity B is also inserted and vice versa, with corresponding checks for updates and deletes. This can be slow and is usually unnecessary, plus many database systems don't support pre-commit triggers.
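As a rough illustration of the first option (FK plus UNIQUE), here is a minimal sketch. It assumes an in-memory SQLite database driven from Python as a stand-in for PostgreSQL, drops the lab4 schema prefix, and uses made-up sample rows; the UNIQUE-on-the-FK-column idea itself carries over to PostgreSQL unchanged.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE factory (
    factory_id INTEGER PRIMARY KEY,
    address    VARCHAR(100) NOT NULL
);
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    employee_name VARCHAR(100) NOT NULL,
    -- UNIQUE on the FK column: each factory can be referenced by at most one employee
    factory_id    INTEGER UNIQUE REFERENCES factory (factory_id)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO factory VALUES (1, '1 Main St')")
conn.execute("INSERT INTO employee VALUES (10, 'Ann', 1)")


def try_insert(sql):
    """Run one statement; return True if accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second employee pointing at the same factory violates the UNIQUE constraint.
second_employee_same_factory = try_insert("INSERT INTO employee VALUES (11, 'Bob', 1)")
# The relationship is optional: an employee with no factory is still allowed.
employee_without_factory = try_insert("INSERT INTO employee VALUES (12, 'Cy', NULL)")
```

Because the FK column is nullable, this models one-to-one optional; a mandatory relationship in both directions needs the reciprocal, deferrable constraints described in the third option.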

How to force two parent records to have same grandparent record in SQL

Here is a picture of the schema for a contrived example that demonstrates the problem I am facing.
My question is: In SQL, how do I make sure that the parent records of Door (House and Door Blueprint) both have the same House Blueprint parent? In other words, how do I make sure that every Door record has only one House Blueprint grandparent?
I am already using best practices for foreign keys to designate the one to many relationships. I need the Door table, because the door instance could be painted any color based on the house. I need the Door Blueprint table, because I want to track who designed the door. Also, in this contrived example, the Door Blueprint can only have a single parent House Blueprint (I know this isn't realistic, so just ignore the possibility that Door Blueprints could be used in multiple House Blueprints).
The problem I am running into is that I sometimes get Door records with a House record attached to one House Blueprint and a Door Blueprint attached to a different House Blueprint record. This should never happen. And I could probably prevent this in my record insertion logic, but that is not at the SQL level.
It makes me uneasy that I have two different paths back to House Blueprint from Door, but I don't see any other way of doing it.
I'm not really looking for a bunch of code snippets, because I can figure out the syntax myself. Rather, I'm looking for a high-level approach to solving the problem in SQL. Also, I am using SQLite3, but I imagine this problem can be solved in any RDBMS.
Thanks in advance for any help with this!
In a relational database, you'd use assertions. In a SQL database, use overlapping foreign key references to unique constraints. (Not foreign key references to candidate keys.)
create table house_blueprints (
house_blueprint_id integer primary key
);
create table houses (
house_id integer primary key,
house_blueprint_id integer not null,
foreign key (house_blueprint_id)
references house_blueprints (house_blueprint_id),
unique (house_id, house_blueprint_id)
);
create table door_blueprints (
door_blueprint_id integer primary key,
house_blueprint_id integer not null,
foreign key (house_blueprint_id)
references house_blueprints (house_blueprint_id),
unique (door_blueprint_id, house_blueprint_id)
);
create table doors (
door_id integer primary key,
house_id integer not null,
house_blueprint_id integer not null,
foreign key (house_id, house_blueprint_id)
references houses (house_id, house_blueprint_id),
door_blueprint_id integer not null,
foreign key (door_blueprint_id, house_blueprint_id)
references door_blueprints (door_blueprint_id, house_blueprint_id)
);
The unique constraints aren't candidate keys, because they're not minimal. But they're necessary to provide targets for the overlapping foreign keys.
The table "doors" has one column for house_blueprint_id, and two overlapping foreign key constraints that use it. No way those foreign key constraints can have different values for house_blueprint_id.
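To see the overlapping foreign keys at work, here is a minimal sketch that loads the schema above into an in-memory SQLite database from Python and tries a few inserts. The sample ids (100, 10, 20) and the helper name door_accepted are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs

conn.executescript("""
CREATE TABLE house_blueprints (house_blueprint_id INTEGER PRIMARY KEY);
CREATE TABLE houses (
    house_id INTEGER PRIMARY KEY,
    house_blueprint_id INTEGER NOT NULL
        REFERENCES house_blueprints (house_blueprint_id),
    UNIQUE (house_id, house_blueprint_id)
);
CREATE TABLE door_blueprints (
    door_blueprint_id INTEGER PRIMARY KEY,
    house_blueprint_id INTEGER NOT NULL
        REFERENCES house_blueprints (house_blueprint_id),
    UNIQUE (door_blueprint_id, house_blueprint_id)
);
CREATE TABLE doors (
    door_id INTEGER PRIMARY KEY,
    house_id INTEGER NOT NULL,
    house_blueprint_id INTEGER NOT NULL,
    door_blueprint_id INTEGER NOT NULL,
    FOREIGN KEY (house_id, house_blueprint_id)
        REFERENCES houses (house_id, house_blueprint_id),
    FOREIGN KEY (door_blueprint_id, house_blueprint_id)
        REFERENCES door_blueprints (door_blueprint_id, house_blueprint_id)
);
INSERT INTO house_blueprints VALUES (1), (2);
INSERT INTO houses VALUES (100, 1);          -- house 100 is built from blueprint 1
INSERT INTO door_blueprints VALUES (10, 1);  -- door blueprint 10 belongs to blueprint 1
INSERT INTO door_blueprints VALUES (20, 2);  -- door blueprint 20 belongs to blueprint 2
""")


def door_accepted(door_id, house_id, house_blueprint_id, door_blueprint_id):
    """Try to insert a door; return False if a foreign key rejects it."""
    try:
        conn.execute(
            "INSERT INTO doors (door_id, house_id, house_blueprint_id, door_blueprint_id) "
            "VALUES (?, ?, ?, ?)",
            (door_id, house_id, house_blueprint_id, door_blueprint_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# House 100 and door blueprint 10 share blueprint 1: accepted.
matching = door_accepted(1, 100, 1, 10)
# Door blueprint 20 belongs to blueprint 2, so (20, 1) has no parent row: rejected.
mismatch_via_door_blueprint = door_accepted(2, 100, 1, 20)
# Claiming blueprint 2 instead breaks the other key, since (100, 2) is not a house: rejected.
mismatch_via_house = door_accepted(3, 100, 2, 20)
```

Whichever value is stored in doors.house_blueprint_id, it has to satisfy both composite keys at once, which is what rules out the two-grandparent case.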

Primary key in "many-to-many" table

I have a table in a SQL database that provides a "many-to-many" connection.
The table contains id's of both tables and some fields with additional information about the connection.
CREATE TABLE SomeTable (
f_id1 INTEGER NOT NULL,
f_id2 INTEGER NOT NULL,
additional_info text NOT NULL,
ts timestamp NULL DEFAULT now()
);
The table is expected to contain 10 000 - 100 000 entries.
What is the better way to design the primary key? Should I create an additional 'id' field, or create a composite primary key from both ids?
The DBMS is PostgreSQL.
This is a "hard" question in the sense that there are pretty good arguments on both sides. I have a bias toward putting in auto-incremented ids in all tables that I use. Over time, I have found that this simply helps with the development process and I don't have to think about whether they are necessary.
A big reason for this is so foreign key references to the table can use only one column.
In a many-to-many junction table (aka "association table"), this probably isn't necessary:
It is unlikely that you will add a table with a foreign key relationship to a junction table.
You are going to want a unique index on the columns anyway.
They will probably be declared not null anyway.
Some databases actually store data based on the primary key. So, when you do an insert, data must be moved on pages to accommodate the new values. Postgres is not one of those databases. It treats the primary key index just like any other index. In other words, you are not incurring "extra" work by declaring one or more columns as a primary key.
My conclusion is that having the composite primary key is fine, even though I would probably have an auto-incremented primary key with separate constraints. The composite primary key will occupy less space, so it will probably be more efficient than an auto-incremented id. However, if there is any chance that this table would be used for a foreign key relationship, then add in another id field.
A surrogate key won't protect you from adding multiple instances of (f_id1, f_id2), so you should definitely have a unique constraint or primary key for that. What would the purpose of a surrogate key be in your scenario?
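The point about duplicates can be checked quickly. This is a minimal sketch using an in-memory SQLite database from Python rather than PostgreSQL; the table names junction_surrogate_only and junction_composite and the helper insert_twice are hypothetical, and the ts column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode

conn.executescript("""
-- Variant 1: surrogate id only, no constraint on the pair itself.
CREATE TABLE junction_surrogate_only (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    f_id1 INTEGER NOT NULL,
    f_id2 INTEGER NOT NULL,
    additional_info TEXT NOT NULL
);
-- Variant 2: the pair itself is the (composite) primary key.
CREATE TABLE junction_composite (
    f_id1 INTEGER NOT NULL,
    f_id2 INTEGER NOT NULL,
    additional_info TEXT NOT NULL,
    PRIMARY KEY (f_id1, f_id2)
);
""")


def insert_twice(table):
    """Insert the same (f_id1, f_id2) pair twice; return how many rows were accepted."""
    accepted = 0
    for _ in range(2):
        try:
            conn.execute(
                f"INSERT INTO {table} (f_id1, f_id2, additional_info) VALUES (1, 2, 'x')"
            )
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # duplicate pair rejected by the primary key
    return accepted


surrogate_rows = insert_twice("junction_surrogate_only")  # both copies go in
composite_rows = insert_twice("junction_composite")       # second copy is rejected
```

If a surrogate id is kept anyway, adding UNIQUE (f_id1, f_id2) to the first variant gives the same protection.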
Yes, that's actually what people commonly do; that key is called a surrogate key. I'm not exactly sure about PostgreSQL, but in MySQL a surrogate key lets you delete or edit records from the user interface. Besides, it allows the database to query a single key column faster than it could multiple columns. Hope it helps.

REFERENCES keyword in SQLite3

I was hoping someone could explain to me the purpose of the SQL keyword REFERENCES
CREATE TABLE wizards(
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
age INTEGER
, color TEXT);
CREATE TABLE powers(
id INTEGER PRIMARY KEY AUTOINCREMENT,
name STRING,
damage INTEGER,
wizard_id INTEGER REFERENCES wizards(id)
);
I've spent a lot of time trying to look this up, and I initially thought that it would constrain the data you can enter into the powers table (based on whether the wizard_id exists in the wizards table). However, I am still able to insert data into both columns without any constraint that I have noticed.
So, is the keyword REFERENCES just for increasing querying speed? What is its true purpose?
Thanks
It creates a foreign key to the other table. This can have performance benefits, but foreign keys are mostly about data integrity. It means that (in your case) the wizard_id field of powers must have a value that exists in the id field of the wizards table. In other words, powers must refer to a valid wizard. Many databases also use this information to propagate deletions or other changes, so the tables stay in sync.
Just noticed this. A reason that you're able to bypass the key constraint may be that foreign keys aren't enabled. See Enabling foreign keys in the SQLite3 documentation.
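Here is a small sketch of that effect, using Python's sqlite3 module with an in-memory database. The helper name orphan_power_accepted and the sample power are made up; the PRAGMA is set explicitly in both directions because the default depends on how SQLite was built.

```python
import sqlite3

# Autocommit mode: PRAGMA foreign_keys has no effect inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)

conn.executescript("""
CREATE TABLE wizards (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    age INTEGER,
    color TEXT
);
CREATE TABLE powers (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    damage INTEGER,
    wizard_id INTEGER REFERENCES wizards (id)
);
""")


def orphan_power_accepted(enforce):
    """Insert a power whose wizard_id matches no wizard; report whether it was accepted."""
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    try:
        conn.execute(
            "INSERT INTO powers (name, damage, wizard_id) VALUES ('fireball', 9, 999)"
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted_without_enforcement = orphan_power_accepted(False)  # REFERENCES is parsed but not checked
accepted_with_enforcement = orphan_power_accepted(True)      # the same insert is now rejected
```

The setting applies per connection, so it has to be issued every time the database is opened. Rows inserted while enforcement was off are not re-checked when it is turned on.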
From what I've gathered, there are two main benefits of using REFERENCES, and an important distinction to be made between its use with and without FOREIGN KEY.
It gives the DBMS room to optimize
Without using REFERENCES, SQLite would not know that attribute id and attribute wizard_id are functionally equivalent. The more known constraints you can define for the Database Management System (SQLite in this case), the more freedom it has to optimize the way it handles your data under the hood.
It can enforce or encourage good practice
A REFERENCES declaration can also be useful for enforcement and warning provision. For example, say you have two tables, A and B, and you assume that A.name is functionally equivalent to B.name, so you attempt a join: SELECT * FROM A, B WHERE A.name = B.name. If REFERENCES was never used to indicate functional equivalency between these two attributes, the DBMS could warn you when you make the join, which would be helpful in the case that these attributes only happen to have the same name but are not actually meant to represent the same thing.
REFERENCES does not always create a "foreign key"
Contrary to what has already been suggested, references and foreign keys are not the same thing. A reference declares functional equivalency between attributes. A foreign key refers to the primary key of another table.
EDIT: @IanMcLaird has corrected me: the use of REFERENCES does always create a foreign key of some kind, although this conflicts with the popular definition of foreign key as "a set of attributes in a table that refers to the primary key of another table" (Wikipedia).
Using REFERENCES without FOREIGN KEY may create a "column-level foreign key" which operates contrary to the popular definition of "foreign key."
There is a difference between the following statements.
driver_id INT REFERENCES Drivers
driver_id INT REFERENCES Drivers(id)
driver_id INT,
FOREIGN KEY(driver_id) REFERENCES Drivers(id)
The first statement assumes that you would like to reference the primary key of Drivers since no attribute is specified. The third statement requires that id be the primary key of Drivers. Both assume you want to make a foreign key by the popular definition provided above; both create a table-level foreign key.
The second statement is tricky. If specifying an attribute which is the primary key of Drivers, the DBMS may opt to create a table-level foreign key. But the specified attribute does not have to be the primary key of Drivers, and if it isn't, the DBMS will create a column-level foreign key. This is somewhat unintuitive for those who are first approaching databases and learn the less-flexible, popular definition of "foreign key."
Some people may use these three statements as if they are the same, and they may be functionally identical in many general use cases, but they are not the same.
All that said, this is just my understanding. I am not an expert in this subject and would greatly appreciate additions, corrections, and affirmations.

Converting an ER diagram to relational model

I know how to convert an entity set, relationship, etc. into the relational model, but what I wonder is what we should do when an entire diagram is given. How do we convert it? Do we create a separate table for each relationship and for each entity set? For example, if we are given the following ER diagram:
My solution to this is like the following:
-- This part includes the Employees entity set
CREATE TABLE Employees (
ssn CHAR(11),
name CHAR(20),
lot INTEGER,
PRIMARY KEY (ssn)
);
-- This part includes the purchaser relationship and the Policies entity set
CREATE TABLE Policies (
policyid INTEGER,
cost REAL,
ssn CHAR(11) NOT NULL,
PRIMARY KEY (policyid),
FOREIGN KEY (ssn) REFERENCES Employees
ON DELETE CASCADE
);
-- This part includes the Dependents weak entity set and the beneficiary relationship
CREATE TABLE Dependents (
pname CHAR(20),
age INTEGER,
policyid INTEGER,
PRIMARY KEY (pname, policyid),
FOREIGN KEY (policyid) REFERENCES Policies
ON DELETE CASCADE
);
My questions are:
1) Is my conversion correct?
2) What are the steps for converting a complete diagram into the relational model?
Here are the steps that I follow. Are they correct?
- I first look for any weak entities or key constraints. If there
is one, I create a single table for that entity set and the related
relationship. (Dependents with beneficiary, and Policies with purchaser in my case.)
- I create a separate table for each entity set that does not have any participation
or key constraints. (Employees in my case.)
- If there are relationships with no constraints, I create a separate table for them.
- So, in conclusion, every relationship and entity set in the diagram is included
in a table.
If my steps are not correct or there is something I am missing, could you please write out the steps for the conversion? Also, what do we do if there is only a participation constraint for a relationship, but no key constraint? Do we again create a single table for the related entity set and relationship?
I appreciate any help. I am new to databases and trying to learn this conversion.
Thank you
Hi @bigO, I think it is safe to say that your conversion is correct and the steps that you have followed are sound. However, from an implementation point of view, there may be room for improvement. What you have implemented is more of a logical model than a physical model.
It is common practice to add a surrogate instance identifier to a physical table; this is a general requirement for most persistence engines and, as pointed out by @Pieter Geerkens, aids database efficiency. The value of the instance id, for example EmployeeId (INT), would be automatically generated by the database on insert. This would also help with the issue that @Pieter Geerkens has pointed out with the SSN. Add the Id as the first column of all your tables; I follow a convention of tablenameId. Make your current primary keys into secondary keys (the natural keys).
Adding the Ids then makes it necessary to implement a DependentPolicy intersection table
DependentPolicyId, (PK)
PolicyId,
DependentId
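One possible reading of this suggestion, sketched with Python's sqlite3 module and an in-memory database. Everything beyond the DependentPolicy columns listed above is an assumption made for illustration: the surrogate column names follow the tablenameId convention, the UNIQUE constraints on ssn and on (PolicyId, DependentId) are my guess at the "secondary keys", and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce FKs and cascades

conn.executescript("""
CREATE TABLE Employee (
    EmployeeId INTEGER PRIMARY KEY,   -- surrogate key, generated on insert
    ssn  CHAR(11) NOT NULL UNIQUE,    -- natural key kept as a secondary key
    name CHAR(20),
    lot  INTEGER
);
CREATE TABLE Policy (
    PolicyId   INTEGER PRIMARY KEY,
    cost       REAL,
    EmployeeId INTEGER NOT NULL REFERENCES Employee (EmployeeId) ON DELETE CASCADE
);
CREATE TABLE Dependent (
    DependentId INTEGER PRIMARY KEY,
    pname CHAR(20) NOT NULL,
    age   INTEGER
);
CREATE TABLE DependentPolicy (
    DependentPolicyId INTEGER PRIMARY KEY,
    PolicyId    INTEGER NOT NULL REFERENCES Policy (PolicyId) ON DELETE CASCADE,
    DependentId INTEGER NOT NULL REFERENCES Dependent (DependentId) ON DELETE CASCADE,
    UNIQUE (PolicyId, DependentId)    -- assumed: a dependent appears once per policy
);
""")

# Hypothetical sample rows; the surrogate ids are filled in by SQLite.
emp_id = conn.execute(
    "INSERT INTO Employee (ssn, name, lot) VALUES ('123-45-6789', 'Ann', 7)"
).lastrowid
pol_id = conn.execute(
    "INSERT INTO Policy (cost, EmployeeId) VALUES (99.5, ?)", (emp_id,)
).lastrowid
dep_id = conn.execute(
    "INSERT INTO Dependent (pname, age) VALUES ('Joe', 8)"
).lastrowid
conn.execute(
    "INSERT INTO DependentPolicy (PolicyId, DependentId) VALUES (?, ?)", (pol_id, dep_id)
)

# Deleting the employee cascades to the policy and then to the intersection rows.
conn.execute("DELETE FROM Employee WHERE EmployeeId = ?", (emp_id,))
links_left = conn.execute("SELECT COUNT(*) FROM DependentPolicy").fetchone()[0]
dependents_left = conn.execute("SELECT COUNT(*) FROM Dependent").fetchone()[0]
```

Note one consequence of this layout: the Dependent row outlives its policy, which differs from the weak-entity behaviour in the question, where dependents are deleted along with the policy.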
You may then need to consider what the natural key of the Dependent table is.
I notice that you have age as an attribute. You should consider whether this is the age at the time the policy is created or the actual age of the dependent, in which case you should be using date of birth.
Other ornamentations you could consider are creation and modified dates.
I also generally favor using the singular for a table name, i.e. Employee, not Employees.
Welcome to the world of data modeling and design.

database design pattern: many to many relationship across tables?

I have the following tables:
Section and Content
And I want to relate them.
My current approach is the following table:
In which I would store
Section to Section
Section to Content
Content to Section
Content to Content
Now, while I clearly can do that by adding a pair of fields that indicate whether the source is a section or a content, and whether the target is a section or a content, I'd like to know if there's a cleaner way to do this, if possible using just one table for the relationship, which would be the cleanest in my opinion. I'd also like the table to be somehow related to the Section and Content tables so I can avoid manually adding constraints, or triggers that delete the relationships when a Section or Content is deleted...
Thanks as usual for the input! <3
Here's how I would design it:
CREATE TABLE Pairables (
PairableID INT IDENTITY PRIMARY KEY,
...other columns common to both Section and Content...
);
CREATE TABLE Sections (
SectionID INT PRIMARY KEY,
...other columns specific to sections...
FOREIGN KEY (SectionID) REFERENCES Pairables(PairableID)
);
CREATE TABLE Contents (
ContentID INT PRIMARY KEY,
...other columns specific to contents...
FOREIGN KEY (ContentID) REFERENCES Pairables(PairableID)
);
CREATE TABLE Pairs (
PairID INT NOT NULL,
PairableID INT NOT NULL,
IsSource BIT NOT NULL,
PRIMARY KEY (PairID, PairableID),
FOREIGN KEY (PairableID) REFERENCES Pairables(PairableID)
);
You would insert two rows in Pairs for each pair.
Now it's easy to search for either type of pairable entity, you can search for either source or target in the same column, and you still only need one many-to-many intersection table.
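A minimal sketch of how the two-rows-per-pair layout might be used, run against an in-memory SQLite database from Python. The original DDL uses IDENTITY and BIT, which SQLite does not have, so INTEGER columns are substituted; the title and body columns and the sample rows are hypothetical stand-ins for the elided columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Pairables (PairableID INTEGER PRIMARY KEY);
CREATE TABLE Sections (
    SectionID INTEGER PRIMARY KEY REFERENCES Pairables (PairableID),
    title TEXT                      -- hypothetical section-specific column
);
CREATE TABLE Contents (
    ContentID INTEGER PRIMARY KEY REFERENCES Pairables (PairableID),
    body TEXT                       -- hypothetical content-specific column
);
CREATE TABLE Pairs (
    PairID     INTEGER NOT NULL,
    PairableID INTEGER NOT NULL REFERENCES Pairables (PairableID),
    IsSource   INTEGER NOT NULL CHECK (IsSource IN (0, 1)),  -- stands in for BIT
    PRIMARY KEY (PairID, PairableID)
);
INSERT INTO Pairables VALUES (1), (2);
INSERT INTO Sections VALUES (1, 'Intro');
INSERT INTO Contents VALUES (2, 'Hello');
-- Pair 1 links Section 1 (source) to Content 2 (target): two rows, one per side.
INSERT INTO Pairs VALUES (1, 1, 1);
INSERT INTO Pairs VALUES (1, 2, 0);
""")

# Find every target paired with a given source, whatever its type.
targets_of_section_1 = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.PairableID
        FROM Pairs AS s
        JOIN Pairs AS t ON t.PairID = s.PairID AND t.IsSource = 0
        WHERE s.PairableID = ? AND s.IsSource = 1
        """,
        (1,),
    )
]
```

Nothing in this sketch stops a pair from having two sources or no target; that would need extra constraints or application logic.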
Yes, there is a much cleaner way to do this:
one table tracks the relations from Section to Section and enforces them as foreign key constraints
one table tracks the relations from Section to Content and enforces them as foreign key constraints
one table tracks the relations from Content to Section and enforces them as foreign key constraints
one table tracks the relations from Content to Content and enforces them as foreign key constraints
This is much cleaner than a single table with overloaded IDs that cannot be enforced by foreign key constraints. The fact that neither data modeling nor domain modeling patterns ever mention a pattern like the one you describe should be your first alarm bell. The second alarm should be that the engine cannot enforce the constraints you envision and you have to delve into triggers.
Having four distinct relationships modeled in one table brings no elegance to the model; it only adds obfuscation. The relational model is not C++: it has no inheritance, it has no polymorphism, it has no overloading. Trying to force an OO mindset onto data modeling has led many a fine developer into a mud of unmaintainable trigger mesh of on-disk table-like bits vaguely resembling 'data'.
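The four-table layout from this answer could look roughly like the sketch below, again using an in-memory SQLite database from Python. The table and column names are invented for illustration, and ON DELETE CASCADE is added to cover the asker's wish that relationships disappear when a Section or Content is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce FKs and cascades

conn.executescript("""
CREATE TABLE Section (section_id INTEGER PRIMARY KEY);
CREATE TABLE Content (content_id INTEGER PRIMARY KEY);

CREATE TABLE SectionToSection (
    source_id INTEGER NOT NULL REFERENCES Section (section_id) ON DELETE CASCADE,
    target_id INTEGER NOT NULL REFERENCES Section (section_id) ON DELETE CASCADE,
    PRIMARY KEY (source_id, target_id)
);
CREATE TABLE SectionToContent (
    source_id INTEGER NOT NULL REFERENCES Section (section_id) ON DELETE CASCADE,
    target_id INTEGER NOT NULL REFERENCES Content (content_id) ON DELETE CASCADE,
    PRIMARY KEY (source_id, target_id)
);
CREATE TABLE ContentToSection (
    source_id INTEGER NOT NULL REFERENCES Content (content_id) ON DELETE CASCADE,
    target_id INTEGER NOT NULL REFERENCES Section (section_id) ON DELETE CASCADE,
    PRIMARY KEY (source_id, target_id)
);
CREATE TABLE ContentToContent (
    source_id INTEGER NOT NULL REFERENCES Content (content_id) ON DELETE CASCADE,
    target_id INTEGER NOT NULL REFERENCES Content (content_id) ON DELETE CASCADE,
    PRIMARY KEY (source_id, target_id)
);

INSERT INTO Section VALUES (1), (2);
INSERT INTO Content VALUES (10);
INSERT INTO SectionToSection VALUES (1, 2);
INSERT INTO SectionToContent VALUES (1, 10);
INSERT INTO ContentToSection VALUES (10, 2);
""")

# The engine itself rejects a link to a Content row that does not exist.
try:
    conn.execute("INSERT INTO SectionToContent VALUES (2, 999)")
    bad_link_accepted = True
except sqlite3.IntegrityError:
    bad_link_accepted = False

# Deleting Section 1 removes every relationship row that mentions it, no triggers needed.
conn.execute("DELETE FROM Section WHERE section_id = 1")
remaining = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("SectionToSection", "SectionToContent", "ContentToSection")
}
```

Only the ContentToSection row survives here, because it links Content 10 to Section 2 and never mentioned Section 1.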