How to properly index a table two other tables have a one-to-many relationship to?

How to properly index a table two other tables have a one-to-many relationship to? - sql

Imagine I have three tables, called "customers", "companies" and "phone_numbers". Both customers and companies can have multiple phone numbers. What would be the best way to index phone_numbers? Have both customer_id and company_id and keep one of them null? What if there are more than two tables with a one-to-many relationship with phone_numbers?

Your business rules might only state one-to-many, but in reality people & companies can be a many-to-many relationship. One person can have many phone numbers (home, cell, etc), and a phone number can relate to many people (myself, my significant other, etc). Likewise, a company number and my business number can be the same - you just use an extension number to reach me directly.
Indexing the foreign keys would be a good idea, but beware of premature optimization. Depending on setup, I'd consider a unique constraint on the phone number column but I would not have the phone number column itself as a primary key.

I would go with identity columns in the customer and company tables, then in the phone number table do as you said and keep one null and the other populated. I do something similar to this and it works out fine as long as you validate data so that it doesn't go in with both values being null. For a more elegant solution you could have two columns: one that is an id, and another that is a type identifier. Say 1 for customers and 2 for companies, that way you don't have to worry about null data or a lot of extra columns.

I'd add two columns to the phone_numbers table. The first would be an index that tells you what table to associate with (say, 1 = customers and 2 = companies). The second would be the foreign key to the appropriate table.
This way you can add as many phone number sources as you want.
If a particular person or company has more than one phone number, there would be multiple rows in the phone_numbers table.

The closest thing I have to a pattern is the following -- any two entities with a many-to-many relationship require an associative entity (a cross-reference table) between them, like so (surrogate keys assumed):
CREATE TABLE CUSTOMER_XREF_PHONE
( CUSTOMER_ID NUMBER NOT NULL,
PHONE_NUMBER_ID NUMBER NOT NULL,
CONSTRAINT CUSTOMER_XREF_PHONE_PK
PRIMARY KEY (CUSTOMER_ID, PHONE_NUMBER_ID),
CONSTRAINT CUSTOMER_XREF_PHONE_UK
UNIQUE (PHONE_NUMBER_ID, CUSTOMER_ID),
CONSTRAINT CUSTOMER_XREF_PHONE_FK01
FOREIGN KEY (CUSTOMER_ID)
REFERENCES CUSTOMER (CUSTOMER_ID) ON DELETE CASCADE,
CONSTRAINT CUSTOMER_XREF_PHONE_FK02
FOREIGN_KEY (PHONE_NUMBER_ID)
REFERENCES PHONE_NUMBERS (PHONE_NUMBER_ID) ON DELETE CASCADE
);
Such an implementation pattern can:
Be fully protected by database-level referential integrity constraints
Support bi-directional access (sometimes you need to see who else has that phone number)
Be self-cleaning if your database supports ON DELETE CASCADE
Be extended through the use of a "relationship type" attribute to map multiple independent relationships between the entities,
such as:
customer has a home telephone number
customer has a daytime telephone number
customer has a fax telephone number
customer has a mobile telephone number

Related

Postgresql: Primary key for table with one column

Sometimes, there are certain tables in an application with only one column in each of them. Data of records within the respective columns are unique. Examples are: a table for country names, a table for product names (up to 60 characters long, say), a table for company codes (3 characters long and determined by the user), a table for address types (say, billing, delivery), etc.
For tables like these, as the records are unique and not null, the only column can be used as the primary key, technically speaking.
So my question is, is it good enough to use that column as the primary key for the table? Or, is it still desirable to add another column (country_id, product_id, company_id, addresstype_id) as the primary key for the table? Why?
Thanks in advance for any advice.

there is always a debate between using surrogate keys and composite keys as primary key. using composite primary keys always introduces some complexity to your database design so to your application.
think that you have another table which is needed to have direct relationship between your resulting table (billing table). For the composite key scenario you need to have 4 columns in your related table in order to connect with the billing table. On the other hand, if you use surrogate keys, you will have one identity column (simplicity) and you can create unique constraint on (country_id, product_id, company_id, addresstype_id)
but it is hard to say this approach is better then the other one because they both have Pros and Cons.
You can check This for more information

One Primary Key Value in many tables

This may seem like a simple question, but I am stumped:
I have created a database about cars (in Oracle SQL developer). I have amongst other tables a table called: Manufacturer and a table called Parentcompany.
Since some manufacturers are owned by bigger corporations, I will also show them in my database.
The parentcompany table is the "parent table" and the Manufacturer table the "child table".
for both I have created columns, each having their own Primary Key.
For some reason, when I inserted the values for my columns, I was able to use the same value for the primary key of Manufacturer and Parentcompany
The column: ManufacturerID is primary Key of Manufacturer. The value for this is: 'MBE'
The column: ParentcompanyID is primary key of Parentcompany. The value for this is 'MBE'
Both have the same value. Do I have a problem with the thinking logic?
Or do I just not understand how primary keys work?
Does a primary key only need to be unique in a table, and not the database?
I would appreciate it if someone shed light on the situation.

A primary key is unique for each table.
Have a look at this tutorial: SQL - Primary key
A primary key is a field in a table which uniquely identifies each
row/record in a database table. Primary keys must contain unique
values. A primary key column cannot have NULL values.
A table can have only one primary key, which may consist of single or
multiple fields. When multiple fields are used as a primary key, they
are called a composite key.
If a table has a primary key defined on any field(s), then you cannot
have two records having the same value of that field(s).

Primary key is table-unique. You can use same value of PI for every separate table in DB. Actually that often happens as PI often incremental number representing ID of a row: 1,2,3,4...
For your case more common implementation would be to have hierarchical table called Company, which would have fields: company_name and parent_company_name. In case company has a parent, in field parent_company_name it would have some value from field company_name.

There are several reasons why the same value in two different PKs might work out with no problems. In your case, it seems to flow naturally from the semantics of the data.
A row in the Manufacturers table and a row in the ParentCompany table both appear to refer to the same thing, namely a company. In that case, giving a company the same id in both tables is not only possible, but actually useful. It represents a 1 to 1 correspondence between manufacturers and parent companies without adding extra columns to serve as FKs.

Thanks for the quick answers!
I think I know what to do now. I will create a general company table, in which all companies will be stored. Then I will create, as I go along specific company tables like Manufacturer and parent company that reference a certain company in the company table.
To clarify, the only column I would put into the sub-company tables is a column with a foreign key referencing a column of the company table, yes?
For the primary key, I was just confused, because I hear so much about the key needing to be unique, and can't have the same value as another. So then this condition only goes for tables, not the whole database. Thanks for the clarification!

Is unique foreign keys across multiple tables via normalization and without null columns possible?

This is a relational database design question, not specific to any RDBMS. A simplified case:
I have two tables Cars and Trucks. They have both have a column, say RegistrationNumber, that must be unique across the two tables.
This could probably be enforced with some insert/update triggers, but I'm looking for a more "clean" solution if possible.
One way to achieve this could be to add a third table, Vehicles, that holds the RegistrationNumber column, then add two additional columns to Vehicles, CarID and TruckID. But then for each row in Vehicles, one of the columns CarID or TruckID would always be NULL, because a RegistrationNumber applies to either a Car or a Truck leaving the other column with a NULL value.
Is there anyway to enforce a unique RegistrationNumber value across multiple tables, without introducing NULL columns or relying on triggers?

This is a bit complicated. Having the third table, Vehicles is definitely part of the solution. The second part is guaranteeing that a vehicle is either a car or a truck, but not both.
One method is the "list-all-the-possibilities" method. This is what you propose with two columns. In addition, this should have a constraint to verify that only one of the ids is filled in. A similar approach is to have the CarId and TruckId actually be the VehicleId. This reduces the number of different ids floating around.
Another approach uses composite keys. The idea is:
create table Vehicles (
Vehicle int primary key,
Registration varchar(255),
Type varchar(255),
constraint chk_type check (type in ('car', 'truck')),
constraint unq_type_Vehicle unique (type, vehicle), -- this is redundant, but necessary
. . .
);
create table car (
VehicleId int,
Type varchar(255), -- always 'car' in this table
constraint fk_car_vehicle foreign key (VehicleId) references Vehicles(VehicleId),
constraint fk_car_vehicle_type foreign key (Type, VehicleId) references Vehicles(Type, VehicleId)
);

See the following tags: class-table-inheritance shared-primary-key
You have already outlined class table inheritance in your question. The tag will just add a few details, and show you some other questions whose answers may help.
Shared primary key is a handy way of enforcing the one-to-one nature of IS-A relationships such as the relationship between a vehicle and a truck. It also allows a foreign key in some other table to reference a vehicle and also a truck or a car, as the case may be.

You can add the third table Vehicles containing a single column RegistratioNumber on which you apply the unique constraint, then on the existing tables - Cars and Trucks - use the RegistrationNumber as a foreign key on the Vehicles table. In this way you don't need an extra id, avoid the null problem and enforce the uniqueness of the registration number.
Update - this solution doesn't prevent a car and a truck to share the same registration number. To enforce this constraint you need to add either triggers or logic beyond plain SQL. Otherwise you may want to take a look at Gordon's solution that involves Composite Foreign Keys.

I can't find any primary key in some of my relations

Alright so I read from somewhere
Every table should have a primary key
But some of my tables don't seem to behave!
I'd also like to know whether the relations as I'm using are fine or I need to dissolve them further, I'm open to suggestions.
The relations are
Dealers(DealerId(PK),DealerName)
Order(DealerId(FK),OrderDate,TotalBill)
Sales(DealerId(FK),ItemType,OrderDate,Quantity,Price)
P.S. I can't make a table named Items(ItemCode,Type,Price) Because the price is variable for different dealers. And all the constraints i.e not null + check that I needed are dealt with already just didn't mention.
1. Are the relations dissolved well?
2. Should I care about setting primary keys in the tables that don't have it already?
Helpful responses appreciated.

In your case, you should add an auto increment integer field to Order and Sales and set that to be the primary key.
In Relational Database Theory, you can sometimes identify a sub-set of the fields to use as a primary key, as long as those columns are non-null and unique. However, (1) the order table cannot have a primary key from DealerID and OrderDate because a dealer could make two orders on the same date. Maybe even for the same amount, which would mean that no sub-set of fields is unique, and (2) even when familiar data can uniquely identify the data, an auto-increment integer can be a good key.
I also think that you want a foreign key from Sales to Order. You are probably using DealerId and OrderDate for joins, but this will not work correctly if a dealer makes two orders on the same date.
Finally, take advice like
Every table should have a primary key
with a grain of salt. Linking tables used for many-to-many relationships can work perfectly fine without a primary key, although a primary key can be an improvement, since it will make deleting records easier, and if you don't have a primary key on a linking table, I would still recommend a unique index on all the fields, in which case that can be the primary key.

Do you really need separate Sales Table ?
Dealers(DealerId(PK),DealerName)
Order(OrderId(PK), DealerId(FK),OrderDate, ItemType, Quantity,Price)
Also,
TotalBill (can be calculated) = Quantity * Price

About the question 1 you should answer this question:
A sale can be made without an order?
If yes, your DealerId(FK) in Sales is alright, assuming that a sale will only exist if a dealer made it.
If no, you should put an OrderId(FK) in Sales, instead of DealerId(FK). If a sale belongs to an order, this order belongs do a dealer, so you already have the relation from dealers to sales.
About the question 2, you should have primary keys on your tables, because this is the way you have to select, update and delete some specific item on your database. Remembering that a primary key is not always an auto increment column.
And about the Items table, if the price is variable to different dealers, so you have an M to N relationship between Dealers and Items, which means you could have an intermediate table like this example:
DealerItemPrices(DealerId(FK), ItemId(FK), Price)
And these two Foreign Keys should be Unique Composite Keys, in this way a Dealer Y can't have two distinct prices to the same item.
Hope it helps!

How should this database sub type relationship be modelled?

I am revising a legacy multi-tenant application where the shopping cart function stores multiple vendors and multiple clients in the same database. Some clients of one vendor may be clients of a different vendor. Some vendors might actually be clients of another vendor.
I currently have a table for the super-type 'party' with primary key party_ID, a table for the subtype 'company' with primary key company_ID (references party_ID) and a table for the role of 'vendor' with primary key vendor_ID (references company_ID). I also have a junction table, 'client' with a composite primary key of vendor_ID and party_ID.
My question is how should the 'order' table reference the vendor and client tables? My first thought is that the table should have a composite primary key of vendor_ID, client_ID and order_ID (order_ID could be auto-increment across the table or sequential per vendor_ID + client_ID)
but this seemed a bit fishy as there were three attributes making up the key...
Does anyone have any insight into this topic? Most 'shopping carts' only deal with a single vendor, so the order table simply lists client_ID as a foreign key.
Thanks!

My question is how should the order table reference the vendor and
client tables? My first thought is that the table should have a
composite primary key of 'vendor_ID', 'client_ID' and 'order_ID' but
this seemed a bit fishy as there were three keys...
Composite primary key doesn't mean three keys. It means one key consisting of three columns.
But that's not the real issue.
An order is an accounting record; it must not change over time. Storing the ID numbers is risky unless you've built temporal tables, and I doubt you've done that. If a vendor changes its name today, its name no longer matches the name on earlier orders. You must not let that happen with accounting records.
Unless you mean something unusual by "order", I'd expect Order_id to be its primary key. There might be other constraints; there might even be other key constraints to prevent duplicate orders that differ only by Order_id. But I'd still expect Order_id to be the primary key of a table of orders.
If vendors and clients are subtypes, I'd expect any (high risk) id numbers you store to reference the id numbers in the subtype tables. In your case, you seem to have an additional table that identifies the clients of vendors; it contains the columns {vendor_id, client_id}. The foreign key references for that table should be obvious.
Your table of orders should have one foreign key reference to that table, not one foreign key to vendors and another foreign key to clients. So in the table of orders, foreign key (vendor_id, client_id) references vendor_clients (vendor_id, client_id). The table of vendor clients will need either a primary key constraint or a unique constraint on {vendor_id, client_id}.
But you shouldn't do that for accounting unless you're using temporal tables. Instead, you should probably store both the id numbers and the text.

I would start with something like this. I do admit that I still do not quite understand difference between company, vendor, and client in your question. As Catcall mentioned, in this model you are not allowed to delete Parties (People, Organizations); accounting records should be frozen -- usually by capturing current customer/supplier info in order table.

For your primary key, you'll want just order_id.
Really, the composite (and unique) key I would use would be [vendor_id, client_id, occurredAt] (where occurredAt is a timestamp) - assuming orders could only be placed once a millisecond. However, this is something of a wide key, and some systems don't appreciate those. You'll still need these columns, and probably indexed, however.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas