I am new to PostgreSQL and am working on an example database for learning purposes. I have a bakery database with a table for recipes and a table for ingredients. I am trying to understand the schema needed to connect the two tables such that the recipes table has a list of ingredients that references the ingredients table, but I am not sure if I need a third table or if I can get away with just the two tables.
CREATE TABLE ingredients
(
ing_id SERIAL PRIMARY KEY,
name varchar(255) NOT NULL,
quantity integer NOT NULL
);
CREATE TABLE recipes
(
rec_id SERIAL PRIMARY KEY,
name varchar(120) NOT NULL,
list_of_ingredients text NOT NULL
);
EDIT:
So let's say I have this in the ingredients table:
(1, flour, 40)
(2, eggs, 12)
(3, sugar, 23)
And this in the recipes table:
(1, cake, "3 flour, 4 eggs, 2 sugar")
I'm a bit confused on how to link the two tables.
The list_of_ingredients will need to reference the ingredients table as a foreign key. I understand that the whole point of NoSQL DBs is to allow for lists, so I'm not sure if I am approaching this totally wrong.
I will also write a Make_Recipe function that takes a recipe, checks that there are enough ingredients, and, if so, decreases the ingredient quantities accordingly.
I have read through these posts, but they don't quite fit the bill:
Database design for storing food recipes
Database Schema for Recipe/Ingredient/Measurement/Amount
Thanks for your time! Any help is much appreciated.
The relationship between recipes and ingredients is what is known as a many-to-many relationship - a recipe might contain any number of ingredients, while an ingredient might be used by any number of recipes.
In a relational database (and PostgreSQL is a relational database), the way to create a many-to-many relationship is by introducing a bridge table.
In the case of recipes and ingredients, you will have three tables.
One table for ingredients, specifying the name of the ingredient (and possibly other ingredient related data, if you can think of such data).
Another table for recipes, specifying the name of the recipe, a description, the text explanation, etc., and the size of the dish.
Then you have the bridge table, ingredientsToRecipes, that will contain a foreign key to the recipes table, a foreign key to the ingredients table, and the quantity needed for that specific ingredient in that specific recipe. Each of the two parent tables thus has a one-to-many relationship with the bridge table.
Remember the size of the dish in the recipe table? That would be needed to calculate up or down scaling of the quantity of ingredients when scaling up or down the size of the dish.
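The scaling itself is plain proportional arithmetic. Here is a minimal Python sketch of it; the helper name `scale_quantity` and the assumption that quantities grow linearly with the dish size are illustrative and not part of the schema.

```python
from fractions import Fraction

def scale_quantity(quantity, dish_size, target_size):
    """Scale an ingredient quantity from the recipe's dish size to a target size.

    Hypothetical helper: assumes quantities scale linearly with dish size.
    """
    if dish_size <= 0:
        raise ValueError("dish_size must be positive")
    return Fraction(quantity) * Fraction(target_size, dish_size)

# A recipe sized for 8 that needs 3 units of flour, scaled up to a size of 12.
scaled = scale_quantity(3, 8, 12)
```

Using `Fraction` keeps the result exact; since the quantity column is an integer, an application would still have to decide how to round a fractional result.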
So, a DDL for these tables might look something like this:
CREATE TABLE ingredients
(
ing_id SERIAL PRIMARY KEY,
name varchar(255) NOT NULL
);
CREATE TABLE recipes
(
rec_id SERIAL PRIMARY KEY,
name varchar(120) NOT NULL,
description text NOT NULL,
DishSize integer NOT NULL
);
CREATE TABLE ingredientsToRecipes
(
rec_Id integer REFERENCES recipes (rec_id),
ing_id integer REFERENCES ingredients (ing_id),
quantity integer NOT NULL,
quantity_unit varchar(100) NOT NULL,
PRIMARY KEY(rec_Id, ing_id)
);
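With this schema, the Make_Recipe check from the question could be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The `stock` column on ingredients (holding the quantity on hand, as in the question's original table) and the function name `make_recipe` are assumptions, and the description, DishSize and quantity_unit columns are left out for brevity.

```python
import sqlite3

def make_recipe(conn, rec_id):
    """Decrement stock and return True if every ingredient is available, else False."""
    with conn:  # one transaction: commit on success, roll back on an exception
        short = conn.execute(
            """
            SELECT COUNT(*)
            FROM ingredientsToRecipes AS itr
            JOIN ingredients AS i ON i.ing_id = itr.ing_id
            WHERE itr.rec_id = ? AND i.stock < itr.quantity
            """,
            (rec_id,),
        ).fetchone()[0]
        if short:
            return False  # at least one ingredient is short; nothing is changed
        conn.execute(
            """
            UPDATE ingredients
            SET stock = stock - (SELECT quantity FROM ingredientsToRecipes
                                 WHERE rec_id = ? AND ing_id = ingredients.ing_id)
            WHERE ing_id IN (SELECT ing_id FROM ingredientsToRecipes WHERE rec_id = ?)
            """,
            (rec_id, rec_id),
        )
        return True

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ingredients (ing_id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                          stock INTEGER NOT NULL);  -- stock: assumed column
CREATE TABLE recipes (rec_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE ingredientsToRecipes (
    rec_id INTEGER REFERENCES recipes (rec_id),
    ing_id INTEGER REFERENCES ingredients (ing_id),
    quantity INTEGER NOT NULL,
    PRIMARY KEY (rec_id, ing_id)
);
INSERT INTO ingredients VALUES (1, 'flour', 40), (2, 'eggs', 12), (3, 'sugar', 23);
INSERT INTO recipes VALUES (1, 'cake');
INSERT INTO ingredientsToRecipes VALUES (1, 1, 3), (1, 2, 4), (1, 3, 2);
""")

made = make_recipe(conn, 1)
stock = dict(conn.execute("SELECT name, stock FROM ingredients"))
```

The sample rows are the ones from the question. On a server with concurrent sessions, the check and the update would additionally need protection against races, for example by locking the ingredient rows with SELECT ... FOR UPDATE in PostgreSQL.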
I have a quick question with respect to many to many relationships in sql.
So theoretically I understand that if two entities in an ER model have an M:N relationship between them, we have to split that into two 1:N relationships by introducing an intersection/lookup table that has a composite primary key made up of the keys of both parent tables. But my question is: in addition to the composite primary key, can the intersection table have any extra columns that are not in either of the two parent tables? That is, apart from intersectionTableId, table1ID and table2ID, can there be a fourth column which is entirely new and not in either parent table? Please let me know.
In a word - yes. It's a common practice to denote properties of the relationship between the two entities.
E.g., consider you have a database storing the details of people and the sports teams they like:
CREATE TABLE person (
id INT PRIMARY KEY,
first_name VARCHAR(10),
last_name VARCHAR(10)
);
CREATE TABLE team (
id INT PRIMARY KEY,
name VARCHAR(10)
);
A person may like more than one team, which is your classic M:N relationship table. But, you could also add some details to this entity, such as when did a person start liking a team:
CREATE TABLE fandom (
person_id INT NOT NULL REFERENCES person(id),
team_id INT NOT NULL REFERENCES team(id),
fandom_started DATE,
PRIMARY KEY (person_id, team_id)
);
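To see the extra column in use, here is a small sketch that loads the three tables into an in-memory SQLite database (standing in for whichever engine you use) and reads the relationship attribute back through a join. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INT PRIMARY KEY, first_name VARCHAR(10), last_name VARCHAR(10));
CREATE TABLE team (id INT PRIMARY KEY, name VARCHAR(10));
CREATE TABLE fandom (
    person_id INT NOT NULL REFERENCES person(id),
    team_id INT NOT NULL REFERENCES team(id),
    fandom_started DATE,
    PRIMARY KEY (person_id, team_id)
);
-- Invented sample data.
INSERT INTO person VALUES (1, 'Ann', 'Lee');
INSERT INTO team VALUES (10, 'Rovers'), (20, 'United');
INSERT INTO fandom VALUES (1, 10, '2015-08-01'), (1, 20, '2021-03-14');
""")

# Teams a person likes, oldest fandom first; the date comes from the link table.
rows = conn.execute("""
    SELECT t.name, f.fandom_started
    FROM fandom AS f
    JOIN team AS t ON t.id = f.team_id
    WHERE f.person_id = ?
    ORDER BY f.fandom_started
""", (1,)).fetchall()
```

The date belongs to neither person nor team, only to the pairing of the two, which is why it lives in fandom.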
Yes, you can do that by modeling the "relationship" table yourself explicitly (just like your other entities).
Here are some posts about exactly that question.
Create code first, many to many, with additional fields in association table
Entity Framework CodeFirst many to many relationship with additional information
Several times now I have come back to trying to understand relational database theory, and I still have had no success. I'll try once more.
Let's say I have two tables:
animals:
CREATE TABLE animals (id INTEGER PRIMARY KEY, name TEXT);
and food:
CREATE TABLE food (id INTEGER PRIMARY KEY, food TEXT);
What I need is to connect these two tables. For example, I want to select 'pig' from the animals table and receive all the things the pig can eat from the food table.
I just don't get how to relate them. I believe I can add a foreign key to the food table, which would link to the primary key of the animals table, but there is an issue I can't figure out:
What if I make entries to the database from, for example, a web form, where I enter the animal name and the product it eats? The animal name goes to the first table and automatically receives an id. It just autoincrements. So, in order to make a relation in the second table, I must select the new ID from the first table! So we get THREE SQL requests:
1) INSERT INTO animals (name) VALUES ('pig');
2) SELECT id FROM animals WHERE name='pig'; (we store it in a variable, does not really matter for now)
3) INSERT INTO food (product, animal_id) VALUES ('something', 'id of a pig');
I just feel that it is wrong.
Or my mind is just not capable of understanding such complex abstractions.
Please advise.
You need a junction table that relates animals and food. This would look like:
CREATE TABLE AnimalFoods (
id INTEGER PRIMARY KEY,
AnimalId int references animals(id),
FoodId int references food(id)
);
You can then answer your questions using various joins among these tables.
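One such join, answering "what does the pig eat?", might look like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the helper name `foods_for` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animals (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE food (id INTEGER PRIMARY KEY, food TEXT);
CREATE TABLE AnimalFoods (
    id INTEGER PRIMARY KEY,
    AnimalId INT REFERENCES animals(id),
    FoodId INT REFERENCES food(id)
);
-- Invented sample data.
INSERT INTO animals (id, name) VALUES (1, 'pig'), (2, 'cow');
INSERT INTO food (id, food) VALUES (1, 'corn'), (2, 'grass'), (3, 'apples');
INSERT INTO AnimalFoods (AnimalId, FoodId) VALUES (1, 1), (1, 3), (2, 1), (2, 2);
""")

def foods_for(conn, animal_name):
    """Return the foods linked to the named animal, alphabetically."""
    rows = conn.execute("""
        SELECT f.food
        FROM animals AS a
        JOIN AnimalFoods AS af ON af.AnimalId = a.id
        JOIN food AS f ON f.id = af.FoodId
        WHERE a.name = ?
        ORDER BY f.food
    """, (animal_name,))
    return [food for (food,) in rows]

pig_foods = foods_for(conn, "pig")
```

Note that 'corn' is stored once in food and is shared by both animals through the junction rows.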
That's how you implement such a many-to-many relationship:
How to implement a many-to-many relationship in PostgreSQL?
And you can accomplish the task you describe with a single query using a data-modifying CTE:
WITH ins AS (
INSERT INTO animals (name) VALUES ('pig')
RETURNING animal_id -- return generated ID immediately (assumes the key column is named animal_id)
)
INSERT INTO animal_food (food_id, animal_id) -- m:m link table
SELECT food_id, animal_id -- food_id is the known food's ID, passed in as a parameter
FROM ins;
Assuming we operate with a known, existing food (as if it was selected from a drop-down menu). Otherwise you need one more step to look up the food or possibly INSERT a row there, too:
Is SELECT or INSERT in a function prone to race conditions?
... still a single query.
The linked answer provides some insight in the more tricky matter of race conditions with concurrent transactions.
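If you talk to the database through a driver rather than raw SQL, the intermediate SELECT from the question can also be avoided by reading the generated key straight from the insert. The sketch below is a different technique from the data-modifying CTE above: it uses Python's sqlite3 module and its `lastrowid` attribute, because SQLite (used here only to keep the example runnable) does not, to my knowledge, accept an INSERT inside a WITH clause. The column names animal_id and food_id, the sample food and the helper name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animals (animal_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE food (food_id INTEGER PRIMARY KEY, food TEXT);
CREATE TABLE animal_food (
    animal_id INTEGER REFERENCES animals(animal_id),
    food_id INTEGER REFERENCES food(food_id),
    PRIMARY KEY (animal_id, food_id)
);
INSERT INTO food (food_id, food) VALUES (7, 'corn');  -- the known, existing food
""")

def add_animal_with_food(conn, name, food_id):
    """Insert an animal and link it to an existing food in one transaction."""
    with conn:
        cur = conn.execute("INSERT INTO animals (name) VALUES (?)", (name,))
        animal_id = cur.lastrowid  # generated key, no follow-up SELECT needed
        conn.execute(
            "INSERT INTO animal_food (animal_id, food_id) VALUES (?, ?)",
            (animal_id, food_id),
        )
    return animal_id

pig_id = add_animal_with_food(conn, "pig", 7)
links = conn.execute("SELECT animal_id, food_id FROM animal_food").fetchall()
```

Both inserts sit in a single transaction, so a failure of the second one does not leave an animal without its link.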
Can an animal eat multiple foods? If not, then you can have the animal be the primary key on the food table. If an animal can have multiple foods, then you can have an auto-increment ID (just like the one in the animal table) in the food table. As jWeaver pointed out, have a ParentId in the food table as the foreign key referencing the animal table.
Create an animal table with parentId and then in your food table use that parentId column to refer.
Example:
Animal(ParentId integer, NAME TEXT);
and
Food(FoodId integer, ParentId integer, Name Text);
That's it. Make sure you use the same ParentId in the food table to refer to the food for a specific animal.
Hope this makes sense to you.
If different animals eat the same kind of food, you should define a table whose foreign keys reference the ids of the two related tables:
CREATE TABLE animals (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE food (id INTEGER PRIMARY KEY, food TEXT);
CREATE TABLE eat (animal_id INTEGER, food_id INTEGER,
FOREIGN KEY (animal_id) REFERENCES animals(id),
FOREIGN KEY (food_id) REFERENCES food(id));
Assuming an animal can eat multiple foods and multiple animals can eat the same food, you need a many-to-many association between animals and food.
Animal(id integer, name Text)
Food(id integer, name TEXT)
AnimalFood(animalId integer,foodId integer)
CREATE TABLE animals (id int(11) not null auto_increment primary key, name text);
CREATE TABLE foods (id int(11) not null auto_increment primary key, name text);
CREATE TABLE animal_foods (animal_id int(11) not null, food_id int(11) not null);
I have implemented the following ways of storing relational topology:
1.A general junction relation table:
Table: Relation
Columns: id parent_type parent_id parent_prop child_type child_id child_prop
Most SQL engines are generally not able to execute joins against this table.
2.Relation specific junction tables
Table: Class2Student
Columns: id parent_id parent_prop child_id child_prop
Joins can be executed against these tables.
3.Storing lists/string maps of related objects in a text field on both bidirectional objects.
Class: Class
Class properties: id name students
Table columns: id name students_keys
Rows: 1 "history" [{type:Basic_student,id:1},{type:Advanced_student,id:3}]
To enable joins by the sql engines, it would be possible to write a custom module which would be made even easier if the contents of students_keys was simply [1,3], ie that a relation was to the explicit Student type.
The questions are the following in the context of:
I fail to see what the point of a junction table is. For example, I fail to see that any problems the following arguments for a junction table claim to relieve, actually exist:
Inability to logically correctly save a bidirectional relations (eg
there is no data orphaning in bidirectional relations or any
relations with a keys field, because one recursively saves and one can enforce
other operations (delete,update) quite easily)
Inability to join effectively
I am not soliciting personal opinions on best practices or any cult-like statements on normalization.
The explicit question(s) are the following:
What are the instances where one would want to query a junction table that is not provided by querying a owning object's keys field?
What are logical implementation problems in the context of computation provided by the sql engine where the junction table is preferable?
The only implementation difference with regards to a junction table vs a keys fields is the following:
When searching for a query of the following nature you would need to match against the keys field with either a custom indexing implementation or some other reasonable implementation:
class_dao.search({students:advanced_student_3,name:"history"});
search for Classes that have a particular student and name "history"
As opposed to searching the indexed columns of the junction table and then selecting the appropriate Classes.
I have been unable to identify why a junction table is logically preferable for quite literally any reason. I am not claiming this is the case, nor do I have a religious preference one way or another, as evidenced by the fact that I implemented multiple ways of achieving this. My problem is I do not know what they are.
The way I see it, you have several entities
CREATE TABLE StudentType
(
Id Int PRIMARY KEY,
Name NVarChar(50)
);
INSERT StudentType VALUES
(1, 'Basic'),
(2, 'Advanced'),
(3, 'SomeOtherCategory');
CREATE TABLE Student
(
Id Int PRIMARY KEY,
Name NVarChar(200),
OtherAttributeCommonToAllStudents Int,
Type Int,
CONSTRAINT FK_Student_StudentType
FOREIGN KEY (Type) REFERENCES StudentType(Id)
)
CREATE TABLE StudentAdvanced
(
Id Int PRIMARY KEY,
AdvancedOnlyAttribute Int,
CONSTRAINT FK_StudentAdvanced_Student
FOREIGN KEY (Id) REFERENCES Student(Id)
)
CREATE TABLE StudentSomeOtherCategory
(
Id Int PRIMARY KEY,
SomeOtherCategoryOnlyAttribute Int,
CONSTRAINT FK_StudentSomeOtherCategory_Student
FOREIGN KEY (Id) REFERENCES Student(Id)
)
Any attributes that are common to all students have columns on the Student table.
Types of student that have extra attributes are added to the StudentType table.
Each extra student type gets a Student<TypeName> table to store its specific attributes. These tables have an optional one-to-one relationship with Student.
I think that your "straw-man" junction table is a partial implementation of an EAV anti-pattern. The only time this is sensible is when you can't know what attributes you need to model, i.e. your data will be entirely unstructured. When this is a real requirement, relational databases start to look less desirable. On those occasions consider a NoSQL/document database alternative.
A junction table would be useful in the following scenario.
Say we add a Class entity to the model.
CREATE TABLE Class
(
Id Int PRIMARY KEY,
...
)
It's conceivable that we would like to store the many-to-many relationship between students and classes.
CREATE TABLE Registration
(
Id Int PRIMARY KEY,
StudentId Int,
ClassId Int,
CONSTRAINT FK_Registration_Student
FOREIGN KEY (StudentId) REFERENCES Student(Id),
CONSTRAINT FK_Registration_Class
FOREIGN KEY (ClassId) REFERENCES Class(Id)
)
This entity would be the right place to store attributes that relate specifically to a student's registration to a class, perhaps a completion flag for instance. Other data would naturally relate to this junction, perhaps a class-specific attendance record or a grade history.
If you don't relate Class and Student in this way, how would you select both all the students in a class and all the classes a student reads? Performance-wise, this is easily optimised by indices on key columns.
When a many-to-many relationship exists without any attributes I agree that, logically, the junction table needn't exist. However, in a relational database, junction tables are still a useful physical implementation, perhaps like this,
CREATE TABLE StudentClass
(
StudentId Int,
ClassId Int,
CONSTRAINT PK_StudentClass PRIMARY KEY (ClassId, StudentId),
CONSTRAINT FK_StudentClass_Student
FOREIGN KEY (StudentId) REFERENCES Student(Id),
CONSTRAINT FK_StudentClass_Class
FOREIGN KEY (ClassId) REFERENCES Class(Id)
)
this allows simple queries like
-- students in a class?
SELECT StudentId
FROM StudentClass
WHERE ClassId = @classId;
-- classes read by a student?
SELECT ClassId
FROM StudentClass
WHERE StudentId = @studentId;
Additionally, this enables a simple way to manage the relationship, partially or completely from either aspect, that will be familiar to relational database developers and sargable by query optimisers.
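As a rough, runnable illustration of both lookups, here is the same junction table in an in-memory SQLite database driven from Python. SQLite stands in for the T-SQL engine assumed above, and the sample rows, helper names and the extra index name are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Class (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE StudentClass (
    StudentId INTEGER REFERENCES Student(Id),
    ClassId INTEGER REFERENCES Class(Id),
    PRIMARY KEY (ClassId, StudentId)
);
-- The primary key leads with ClassId; this index serves lookups by StudentId.
CREATE INDEX IX_StudentClass_Student ON StudentClass (StudentId, ClassId);
INSERT INTO Student VALUES (1, 'Ada'), (3, 'Bo');
INSERT INTO Class VALUES (1, 'history'), (2, 'maths');
INSERT INTO StudentClass (StudentId, ClassId) VALUES (1, 1), (3, 1), (3, 2);
""")

def students_in(conn, class_id):
    """Ids of all students registered in a class."""
    rows = conn.execute(
        "SELECT StudentId FROM StudentClass WHERE ClassId = ? ORDER BY StudentId",
        (class_id,))
    return [r[0] for r in rows]

def classes_of(conn, student_id):
    """Ids of all classes a student reads."""
    rows = conn.execute(
        "SELECT ClassId FROM StudentClass WHERE StudentId = ? ORDER BY ClassId",
        (student_id,))
    return [r[0] for r in rows]

history_students = students_in(conn, 1)
bo_classes = classes_of(conn, 3)

# The composite key also rejects a duplicate registration.
try:
    conn.execute("INSERT INTO StudentClass (StudentId, ClassId) VALUES (3, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Both directions are plain equality lookups on indexed columns, and the uniqueness of each pairing is enforced by the engine rather than by application code that parses a keys field.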
I'm reading a book on EF4 and I came across this problem situation:
So I was wondering how to create this database so I can follow along with the example in the book.
How would I create these tables, using simple TSQL commands? Forget about creating the database, imagine it already exists.
You've been given the code. I want to share some information on why you might want to have two tables in a relationship like that.
First when two tables have the same Primary Key and have a foreign key relationship, that means they have a one-to-one relationship. So why not just put them in the same table? There are several reasons why you might split some information out to a separate table.
First, the information is conceptually separate. If the information contained in the second table relates to a separate specific concern, it is easier to work with if the data is in a separate table. For instance, in your example they have separated out images even though they only intend to have one record per SKU. This gives you the flexibility to easily change the table later to a one-many relationship if you decide you need multiple images. It also means that when you query just for images you don't have to actually hit the other (perhaps significantly larger) table.
Which brings us to reason two. You currently have a one-one relationship but you know that a future release is already scheduled to turn that into a one-many relationship. In this case it's easier to design it as a separate table from the start, so that you won't break all your code when you move to that structure. If I were planning to do this I would go ahead and create a surrogate key as the PK and create a unique index on the FK. This way when you go to the one-many relationship, all you have to do is drop the unique index and replace it with a regular index.
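The surrogate-key-plus-unique-index idea can be sketched as follows. The example uses Python's sqlite3 module with an in-memory database rather than SQL Server, and the table name ProductImage, the index names and the sample values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (SKU TEXT PRIMARY KEY, Description TEXT);
CREATE TABLE ProductImage (
    Id INTEGER PRIMARY KEY,                      -- surrogate key
    SKU TEXT NOT NULL REFERENCES Product(SKU),
    ImageURL TEXT
);
-- The unique index is what keeps the relationship one-to-one for now.
CREATE UNIQUE INDEX UX_ProductImage_SKU ON ProductImage (SKU);
INSERT INTO Product VALUES ('A1', 'Bread');
INSERT INTO ProductImage (SKU, ImageURL) VALUES ('A1', 'front.png');
""")

def add_image(conn, sku, url):
    """Try to add an image; return False if a uniqueness rule rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO ProductImage (SKU, ImageURL) VALUES (?, ?)", (sku, url))
        return True
    except sqlite3.IntegrityError:
        return False

second_before = add_image(conn, "A1", "side.png")  # blocked while one-to-one

# Moving to one-to-many: swap the unique index for a regular one.
conn.executescript("""
DROP INDEX UX_ProductImage_SKU;
CREATE INDEX IX_ProductImage_SKU ON ProductImage (SKU);
""")
second_after = add_image(conn, "A1", "side.png")
image_count = conn.execute(
    "SELECT COUNT(*) FROM ProductImage WHERE SKU = 'A1'").fetchone()[0]
```

Neither the table definition nor any existing row has to change during the migration; only the index does.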
Another reason to separate out a one-one relationship is if the table is getting too wide. Sometimes you just have too much information about an entity to easily fit it in the maximum size a record can have. In this case, you tend to take the least used fields (or those that conceptually fit together) and move them to a separate table.
Another reason to separate them out is that although you have a one-one relationship, you may not need a record of what is in the child table for most records in the parent table. So rather than having a lot of null values in the parent table, you split it out.
The code shown by the others assumes a character-based PK. If you want a relationship of this sort when you have an auto-generating Int or GUID, you need to do the autogeneration only on the parent table. Then you store that value in the child table rather than generating a new one on that table.
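A minimal sketch of that pattern, again using Python's sqlite3 module as a stand-in for SQL Server: the key is generated on the parent only and then copied into the child. The helper name `add_product` and the sample values are assumptions. In T-SQL, reading back the generated key would typically be done with SCOPE_IDENTITY() or an OUTPUT clause instead of `lastrowid`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (
    SKU INTEGER PRIMARY KEY AUTOINCREMENT,   -- generated here only
    Description TEXT NOT NULL
);
CREATE TABLE ProductWebInfo (
    SKU INTEGER PRIMARY KEY REFERENCES Product(SKU),  -- copied from the parent
    ImageUrl TEXT
);
""")

def add_product(conn, description, image_url=None):
    """Insert the parent row, then reuse its generated key for the optional child row."""
    with conn:
        sku = conn.execute(
            "INSERT INTO Product (Description) VALUES (?)", (description,)).lastrowid
        if image_url is not None:
            conn.execute(
                "INSERT INTO ProductWebInfo (SKU, ImageUrl) VALUES (?, ?)",
                (sku, image_url))
    return sku

bread = add_product(conn, "Bread", "bread.png")
rolls = add_product(conn, "Rolls")  # no child row needed
child_keys = [r[0] for r in conn.execute("SELECT SKU FROM ProductWebInfo")]
```

The second product shows the optional side of the relationship: it has a parent row and no child row, so nothing has to be stored as NULL.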
When it says the tables share the same primary key, it just means that there is a field with the same name in each table, both set as Primary Keys.
Create Tables
CREATE TABLE [Product (Chapter 2)](
SKU varchar(50) NOT NULL,
Description varchar(50) NULL,
Price numeric(18, 2) NULL,
CONSTRAINT [PK_Product (Chapter 2)] PRIMARY KEY CLUSTERED
(
SKU ASC
)
)
CREATE TABLE [ProductWebInfo (Chapter 2)](
SKU varchar(50) NOT NULL,
ImageURL varchar(50) NULL,
CONSTRAINT [PK_ProductWebInfo (Chapter 2)] PRIMARY KEY CLUSTERED
(
SKU ASC
)
)
Create Relationships
ALTER TABLE [ProductWebInfo (Chapter 2)]
ADD CONSTRAINT fk_SKU
FOREIGN KEY(SKU)
REFERENCES [Product (Chapter 2)] (SKU)
It may look a bit simpler if the table names are just single words (and not key words, either), for example, if the table names were just Product and ProductWebInfo, without the (Chapter 2) appended:
ALTER TABLE ProductWebInfo
ADD CONSTRAINT fk_SKU
FOREIGN KEY(SKU)
REFERENCES Product(SKU)
This is simply an example that I threw together using the table designer in SSMS, but it should give you an idea (note the foreign key constraint at the end):
CREATE TABLE dbo.Product
(
SKU int NOT NULL IDENTITY (1, 1),
Description varchar(50) NOT NULL,
Price numeric(18, 2) NOT NULL
) ON [PRIMARY]
ALTER TABLE dbo.Product ADD CONSTRAINT
PK_Product PRIMARY KEY CLUSTERED
(
SKU
)
CREATE TABLE dbo.ProductWebInfo
(
SKU int NOT NULL,
ImageUrl varchar(50) NULL
) ON [PRIMARY]
ALTER TABLE dbo.ProductWebInfo ADD CONSTRAINT
FK_ProductWebInfo_Product FOREIGN KEY
(
SKU
) REFERENCES dbo.Product
(
SKU
) ON UPDATE NO ACTION
ON DELETE NO ACTION
See how to create a foreign key constraint. http://msdn.microsoft.com/en-us/library/ms175464.aspx This also has links to creating tables. You'll need to create the database as well.
To answer your question:
ALTER TABLE ProductWebInfo
ADD CONSTRAINT fk_SKU
FOREIGN KEY (SKU)
REFERENCES Product(SKU)
Take these tables for example.
Item
id
description
category
Category
id
description
An item can belong to many categories and a category obviously can be attached to many items.
How would the database be created in this situation? I'm not sure. Someone said create a third table, but do I need to do that? Do I literally do a
create table bla bla
for the third table?
Yes, you need to create a third table with mappings of ids, something with columns like:
item_id (Foreign Key)
category_id (Foreign Key)
edit: you can treat item_id and category_id as a primary key, they uniquely identify the record alone. In some applications I've found it useful to include an additional numeric identifier for the record itself, and you might optionally include one if you're so inclined
Think of this table as a listing of all the mappings between Items and Categories. It's concise, and it's easy to query against.
edit: removed (unnecessary) primary key.
Yes, you cannot form a third-normal-form many-to-many relationship between two tables with just those two tables. You can form a one-to-many (in one of the two directions) but in order to get a true many-to-many, you need something like:
Item
id primary key
description
Category
id primary key
description
ItemCategory
itemid foreign key references Item(id)
categoryid foreign key references Category(id)
You do not need a category in the Item table unless you have some privileged category for an item which doesn't seem to be the case here. I'm also not a big fan of introducing unnecessary primary keys when there is already a "real" unique key on the joining table. The fact that the item and category IDs are already unique means that the entire record for the ItemCategory table will be unique as well.
Simply monitor the performance of the ItemCategory table using your standard tools. You may require an index on one or more of:
itemid
categoryid
(itemid,categoryid)
(categoryid,itemid)
depending on the queries you use to join the data (and one of the composite indexes would be the primary key).
The actual syntax for the entire job would be along the lines of:
create table Item (
id integer not null primary key,
description varchar(50)
);
create table Category (
id integer not null primary key,
description varchar(50)
);
create table ItemCategory (
itemid integer references Item(id),
categoryid integer references Category(id),
primary key (itemid,categoryid)
);
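To try these tables out, here is a short sketch that loads them into an in-memory SQLite database from Python and queries the relationship in both directions. The sample rows, the helper names and the extra index on (categoryid, itemid) are illustrative assumptions. The PRAGMA is needed only because SQLite leaves foreign key enforcement off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Item (id INTEGER NOT NULL PRIMARY KEY, description VARCHAR(50));
CREATE TABLE Category (id INTEGER NOT NULL PRIMARY KEY, description VARCHAR(50));
CREATE TABLE ItemCategory (
    itemid INTEGER REFERENCES Item(id),
    categoryid INTEGER REFERENCES Category(id),
    PRIMARY KEY (itemid, categoryid)
);
-- Reverse composite index for lookups that start from a category.
CREATE INDEX ix_itemcategory_cat ON ItemCategory (categoryid, itemid);
INSERT INTO Item VALUES (1, 'hammer'), (2, 'drill');
INSERT INTO Category VALUES (1, 'tools'), (2, 'power');
INSERT INTO ItemCategory VALUES (1, 1), (2, 1), (2, 2);
""")

def items_in(conn, category):
    """Descriptions of all items attached to the named category."""
    rows = conn.execute("""
        SELECT i.description
        FROM Category AS c
        JOIN ItemCategory AS ic ON ic.categoryid = c.id
        JOIN Item AS i ON i.id = ic.itemid
        WHERE c.description = ?
        ORDER BY i.description
    """, (category,))
    return [r[0] for r in rows]

def categories_of(conn, item):
    """Descriptions of all categories the named item belongs to."""
    rows = conn.execute("""
        SELECT c.description
        FROM Item AS i
        JOIN ItemCategory AS ic ON ic.itemid = i.id
        JOIN Category AS c ON c.id = ic.categoryid
        WHERE i.description = ?
        ORDER BY c.description
    """, (item,))
    return [r[0] for r in rows]

tools_items = items_in(conn, "tools")
drill_categories = categories_of(conn, "drill")

# A link to a category that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO ItemCategory VALUES (1, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```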
There are other sorts of things you should consider, such as making your ID columns into identity/autoincrement columns, but that's not directly relevant to the question at hand.
Yes, you need a "join table". In a one-to-many relationship, objects on the "many" side can have an FK reference to objects on the "one" side, and this is sufficient to determine the entire relationship, since each of the "many" objects can only have a single "one" object.
In a many-to-many relationship, this is no longer sufficient because you can't stuff multiple FK references in a single field. (Well, you could, but then you would lose atomicity of data and all of the nice things that come with a relational database).
This is where a join table comes in - for every relationship between an Item and a Category, the relation is represented in the join table as a pair: Item.id x Category.id.