Design Pattern required for database schema with two classes both composing a third class - sql

Consider a system which has classes for both letters and people; both of these classes compose an address. When designing a database for the system it seems sensible to have a separate schema for the address but this leads to an issue; I don't know how to have a foreign key in the address table cleanly identify what it belongs to because the foreign key could identify either a letter or a person. Further more I expect that further classes will be added which will also compose an address.
I would be grateful for design patterns addressing this kind of design point.
Best Regards

I don't know how to have a foreign key
in the address table cleanly identify
what it belongs to because the foreign
key could identify either a letter or
a person.
It sounds like you have got that the wrong way around. There would be no foreign key in the address table; rather, the letters table would have a foreign key referencing the address table and the persons table would also have a foreign key referencing the address table.
In SQL, the REFERENTIAL_CONSTRAINTS view in the Information Schema catalog will tell you which tables are referencing the address table.
In our shop we regularly debate whether an address should be modelled as an entity in its own right. The problem with treating an address as an entity is that there is no reliable key beyond the attributes themselves (postal code, house name or number, etc) and many variations can identify the same address (I can write my home address twenty different ways). Then you need to consider post office boxes, care of addresses, Santa Claus, etc. And the Big Question: do you allow an address to be amended? If someone moves house, do they keep the same address entity with amended attributes (that way all linked entities get the address change) or do you lock-down addresses' attributes and force the creation of a new address entity then relink all the referencing addresses to the new one (and do you delete the now-orphaned address entity and if yes then why did you bother to replace it...?) But your application still needs to allow an address to be amended in case the post office changes it in real life, but how to you prevent this ability from being misused i.e. using it for aforementioned illegal house moves? You can use a webservice to scrub your address data so that is is correct but an external system way have less clean data and therefore you can't get your address entities to match anymore...?
I like to keep it simple: an address is a single attribute being the plaintext you must put on an item of mail for it to be delivered by the post office to the addressable entity in question; it is not an entity in its own right because it lacks an identifier. For practical reasons (e.g. being able to print it on an address label), this single attibute is usually split into subatomic elements (address_line_1, address_line_2, ... postal_code, whatever).
Because most SQL products lack support for domains, I have no problem duplicating the column names, data types, constraints, etc between for each table that models an addressable entity.

Surely the foreign keys should be from the Letter and People tables referencing the primary key on the Address table?
So, the Letter table contains an AddressId column referencing the Id on the Address table, as does the Person table, and any future classes which compose an address.
If Letters and Persons compose multiple addresses, then intermediate link tables will be required.

Related

SQL Join to either table, Best way or alternative design

I am designing a database for a system and I came up with the following three tables
My problem is that an Address can belong to either a Person or a Company (or other things in the future) So how do I model this?
I discarded putting the address information in both tables (Person
and Company) because of it would be repeated
I thought of adding two columns (PersonId and CompanyId) to the
Address table and keep one of them null, but then I will need to add
one column for every future relation like this that appears (for
example an asset can have an address where its located at)
The last option that occur to me was to create two columns, one
called Type and other Id, so a pair of values would represent a
single record in the target table, for example: Type=Person,Id=5 and
Type=Company,Id=9 this way I can Join the right table using the type
and it will only be two columns no matter how many tables relate to
this table. But I cannot have constraints which reduce data integrity
I don't know if I am designing this properly. I think this should be a common issue (I've faced it at least three times during this small design in objects like Contact information, etc...) But I could not find many information or examples that would resemble mine.
Thanks for any guidance that you can give me
There are several basic approaches you could take, depending on how much you want to future proof your system.
In general, Has-One relationships are modeled by a foreign key on the owning entity, pointing to the primary key on the owned entity. So you would have an AddressId on both Company and Person,which would be a foreign key to Address.Id. The complexity in your case is how to handle the fact that a person can have multiple addresses. If you are 100% sure that there will only ever be a home and work address, you could put two foreign key columns on Person, but this becomes a big problem if there's a third, fourth, fifth etc. address. The other option is to create a join table, PersonAddress, with three columns a PersonId an AddressId and a AddressType, to indicate whether its a home work or whatever address.

Can multiple relationship occur between two entities?

I have following two entities, where a person address is kept in a separate table because a person can have multiple addresses.
The mailing address however is one of the multiple addresses stored in address table, which is to be referred in person table.
Can following relationship exist?
The answer to your direct question is "Yes". However, you cannot declare the model only using create table because:
Declaring the foreign key address.person_id requires that the person table already exist.
Declaring the foreign key person.mailing_address requires that the address table already exist.
Hence, to implement the model, you need to use alter table to add one or both of the constraints after both tables are created.
Is this the model you want? One feature of an address is that multiple people can have the same address. Your model does not allow that. To handle this, you would typically have three tables:
Person
Address
PersonAddress
The third table has one row for each person/address pair. It can also have other information such as:
Type ("mailing" versus other types)
Effective and end dates.
Perhaps other information.
If you want to guarantee uniqueness of the "mailing" address in such a model, many databases support filtered unique indexes, to ensure there are no duplicate mailing addresses.
I'm not sure why you have person_id as an fk on address but that doesn't look correct. There are lots of correct ways to model this and the best one for you will depend on you particular circumstances - but a couple of options are:
If you know all the types of addresses there can be then add multiple address fk fields to the person e.g. billing address, shipping address, etc. This makes querying quick and simple but is inflexible: adding a new address type in the future is relatively complex to implement
Add an intersection table with fks for person and address and an address type field. This has a slight overhead when it comes to querying but has the advantage if being very flexible: adding a new address type is trivial

Is my relation a one-to-many or a many-to-many?

I have a model called Addresss which as the name sounds, is a list of addresses.
These addresses can belong to a Client and a client can have many of these addresses.
To link these addresses to the client, I will simply have a table called ClientAddress with 3 columns: id, client_id and address_id.
Is this an example of a one to many or a many-to-many relationship? I currently have it setup as a ManyToMany relationship in Phalcon however I'm not sure if it should actually be One to Many.
It's a one-to-many relation. One client (can) have multiple addresses. One address belongs to only one client.
Regarding your clientAddress table, I'd get rid off it as you can store the client id on the adress table.
If, as your tags suggest you're using phalcon and decide do go with phalcon's orm you should have a look at the documentation : Working with Models
Its all depends how you are thinking about your Entities
Lets start from Client
We would like to store information about Client. Now if our Client has only one specific Location then we have two options. We can directly store the Address information in same table as-
clients table
id, f_name, l_name, address, current_city, home_city, etc....
In this case there is nothing about Relation
If you are interested then you can split this table and store Location information on other table which you may name as addresses. Then the Relation between Client and Address will be one-to-one relation.
Now, if our Client has different office on different Location then it is mandatory to store Location information on different table. Then the Relation will be one-to-many as our Client has different `Location'.
Now, if we have many Client on same Location (same building may be) and our Client has many office in many Location then it will be as many-to-many relation. As Client hasMany Address and Address hasMany Client we new a pivot or intermediate table to hold our Relation information.
It depends on quite how complex your model is likely to become.
Suppose your "clients" are branches of a national company. Eech "client" may then have multiple adrresses - delivery address and billing address for instance. And equally, since the accounts-payable function may be centralised to either a regional or national branch (which may itself have, or not have a "delivery address)" the billing address is likely to be shared amongst many "clients."
So, in this scenario, many-to-many.

How to represent this type of FK relationship in SQL Server 2005?

Assume that I have two parent tables: Company and Contact. Assume that both Company and Contact records need to have 0 or more addresses recorded against them, where the format for addresses for both record types is identical. This implies that there is a need for a third table Address where addresses should be stored.
My question is, how can the relationship between Company and Contact and Address be expressed using foreign key constraints, so that things like CASCADE DELETE will work? (I'm guessing this means that I cannot have an OwnerTable field in address).
I'm using SQL Server 2005.
Stated:
- Each Company can have many Addresses
- Each Contact can have many Addresses
This also seems to imply that each address can be re-used multiple times:
- Each Address can have many Companies / Contacts
If this is the case then you actually need a link table, allowing many:many relationships. And n your case I'd have a link table for Company (CompanyID, AddressID) and a link table for Contact (ContactID, AddressID).
You could configure things such that deleting an address deletes all corresponding records in the link tables. But to Delete and Address if all mapped Companies and Contacts are deleted, that would require a trigger.
If an address is actually only used once, and you expressly want the deletion of a Company/Contact to delete the associated addresses...
1. Again, use a trigger
2. Have a ContactAddress table and a CompanyAddressTable
I'm not aware of any tricks that will allow one table to foreign key to two different tables, and allow both primary tables to cascade delete to the single foreign table. This is expressly prohibited in SQL Server to prevent cascaded deletes across circular references.
Should an auto cascade work if the address is shared? You could use a INSTEAD OF DELETE trigger to implement this special cascade correctly.

SQL Database Design Best Practice (Addresses)

Of course I realize that there's no one "right way" to design a SQL database, but I wanted to get some opinions on what is better or worse in my particular scenario.
Currently, I'm designing an order entry module (Windows .NET 4.0 application with SQL Server 2008) and I'm torn between two design decisions when it comes to data that can be applied in more than one spot. In this question I'll refer specifically to Addresses.
Addresses can be used by a variety of objects (orders, customers, employees, shipments, etc..) and they almost always contain the same data (Address1/2/3, City, State, Postal Code, Country, etc). I was originally going to include each of these fields as a column in each of the related tables (e.g. Orders will contain Address1/2/3, City, State, etc.. and Customers will also contain this same column layout). But a part of me wants to apply DRY/Normalization principles to this scenario, i.e. have a table called "Addresses" which is referenced via Foreign Key in the appropriate table.
CREATE TABLE DB.dbo.Addresses
(
Id INT
NOT NULL
IDENTITY(1, 1)
PRIMARY KEY
CHECK (Id > 0),
Address1 VARCHAR(120)
NOT NULL,
Address2 VARCHAR(120),
Address3 VARCHAR(120),
City VARCHAR(100)
NOT NULL,
State CHAR(2)
NOT NULL,
Country CHAR(2)
NOT NULL,
PostalCode VARCHAR(16)
NOT NULL
)
CREATE TABLE DB.dbo.Orders
(
Id INT
NOT NULL
IDENTITY(1000, 1)
PRIMARY KEY
CHECK (Id > 1000),
Address INT
CONSTRAINT fk_Orders_Address
FOREIGN KEY REFERENCES Addresses(Id)
CHECK (Address > 0)
NOT NULL,
-- other columns....
)
CREATE TABLE DB.dbo.Customers
(
Id INT
NOT NULL
IDENTITY(1000, 1)
PRIMARY KEY
CHECK (Id > 1000),
Address INT
CONSTRAINT fk_Customers_Address
FOREIGN KEY REFERENCES Addresses(Id)
CHECK (Address > 0)
NOT NULL,
-- other columns....
)
From a design standpoint I like this approach because it creates a standard address format that is easily changeable, i.e. if I ever needed to add Address4 I would just add it in one place rather than to every table. However, I can see the number of JOINs required to build queries might get a little insane.
I guess I'm just wondering if any enterprise-level SQL architects out there have ever used this approach successfully, or if the number of JOINs that this creates would create a performance issue?
You're on the right track by breaking address out into its own table. I'd add a couple of additional suggestions.
Consider taking the Address FK columns out of the Customers/Orders tables and creating junction tables instead. In other words, treat Customers/Addresses and Orders/Addresses as many-to-many relationships in your design now so you can easily support multiple addresses in the future. Yes, this means introducing more tables and joins, but the flexibility you gain is well worth the effort.
Consider creating lookup tables for city, state and country entities. The city/state/country columns of the address table then consist of FKs pointing to these lookup tables. This allows you to guarantee consistent spellings across all addresses and gives you a place to store additional metadata (e.g., city population) if needed in the future.
I just have some cautions. For each of these, there's more than one way to fix the problem.
First, normalization doesn't mean "replace text with an id number".
Second, you don't have a key. I know, you have a column declared "PRIMARY KEY", but that's not enough.
insert into Addresses
(Address1, Address2, Address3, City, State, Country, PostalCode)
values
('President Obama', '1600 Pennsylvania Avenue NW', NULL, 'Washington', 'DC', 'US', '20500'),
('President Obama', '1600 Pennsylvania Avenue NW', NULL, 'Washington', 'DC', 'US', '20500'),
('President Obama', '1600 Pennsylvania Avenue NW', NULL, 'Washington', 'DC', 'US', '20500'),
('President Obama', '1600 Pennsylvania Avenue NW', NULL, 'Washington', 'DC', 'US', '20500');
select * from Addresses;
1;President Obama;1600 Pennsylvania Avenue NW;;Washington;DC;US;20500
2;President Obama;1600 Pennsylvania Avenue NW;;Washington;DC;US;20500
3;President Obama;1600 Pennsylvania Avenue NW;;Washington;DC;US;20500
4;President Obama;1600 Pennsylvania Avenue NW;;Washington;DC;US;20500
In the absence of any other constraints, your "primary key" identifies a row; it doesn't identify an address. Identifying a row is usually not good enough.
Third, "Address1", "Address2", and "Address3" aren't attributes of addresses. They're attributes of mailing labels. (Lines on a mailing label.) That distinction might not be important to you. It's really important to me.
Fourth, addresses have a lifetime. Between birth and death, they sometimes change. They change when streets get re-routed, buildings get divided, buildings get undivided, and sometimes (I'm pretty sure) when a city employee has a pint too many. Natural disasters can eliminate whole communities. Sometimes buildings get renumbered. In our database, which is tiny compared to most, about 1% per year change like that.
When an address dies, you have to do two things.
Make sure nobody uses that address to mail, ship, or whatever.
Make sure its death doesn't affect historical data.
When an address itself changes, you have to do two things.
Some data must reflect that change. Make sure it does.
Some data must not reflect that change. Make sure it doesn't.
Fifth, DRY doesn't apply to foreign keys. Their whole purpose is to be repeated. The only question is how wide a key? An id number is narrow, but requires a join. (10 id numbers might require 10 joins.) An address is wide, but requires no joins. (I'm talking here about a proper address, not a mailing label.)
That's all I can think of off the top of my head.
I think there is a problem you are not aware of and that is that some of this data is time sensitive. You do not want your records to show you shipped an order to 35 State St, Chicago Il, when you actually sent it to 10 King Street, Martinsburg WV but the customer moved two years after the order was shipped. So yes, build an address table to get the address at that moment in time as long as any change to the address for someone like a customer results in a new addressid not in changing the current address which would break the history on an order.
You would want the addresses to be in a separate table only if they were entities in their own right. Entities have identity (meaning it matters if two objects pointed to the same address or to different ones), and they have their own lifecycle apart from other entities. If this was the case with your domain, I think it would be totally apparent and you wouldn't have a need to ask this question.
Cade's answer explains the mutability of addresses, something like a shipping address is part of an order and shouldn't be able to change out from under the order it belongs to. This shows that the shipping address doesn't have its own lifecycle. Handling it as if it was a separate entity can only lead to more opportunities for error.
"Normalization" specifically refers to removing redundancies from data so you don't have the same item represented in different places. Here the only redundancy is in the DDL, it's not in the data, so "normalization" is not relevant here. (JPA has the concept of embedded classes that can address the redundancy).
TLDR: Use a separate table if the address is truly an Entity, with its own distinct identity and its own lifecycle. Otherwise don't.
What you have to answer for yourself is the question whether the same address in everyday language is actually the same address in your database. If somebody "changes his address" (colloquially), he really links himself to another address. The address per se only changes when a street is renamed, a zip-code reform takes place or a nuke hits. And those are rare events (hopefully for the most part). There goes your main profit: change in one place for multiple rows (of multiple tables).
If you should actually change an address for that in your model - in the sense of an UPDATE on table address - that may or may not work for other rows that link to it. Also, in my experience, even the exact same address has to look different for different purposes. Understand the semantic differences and you will arrive at the right model that represents your real world best.
I have a number of databases where I use a common table of streets (which uses a table of cities (which uses a table of countries, ...)). In combination with a street number think of it as geocodes (lat/lon), not "street names". Addresses are not shared among different tables (or rows). Changes to street names and zip codes cascade, other changes don't.
You would normally normalise the data as far as possible, so use the table 'Addresses'.
You can use views to de-normalise the data afterwards which use indexes and should give a method to access data with easy references, whilst leaving the underlying structure normalised fully.
The number of joins shouldn't be a major issue, index based joins aren't too much of an overhead.
It's fine to have a split out addresses table.
However, you have to avoid the temptation of allowing multiple rows to refer to the same address without an appropriate system for managing options for the user to decide whether and how changing an address splits out a row for the new address change, i.e. You have the same address for billing and ship-to. Then a user says their address is changing. To start with, old orders might (should?) need their ship-to addresses retained, so you can't change it in-place. But the user might also need to say this address I'm changing is only going to change the ship-to.
One should maintain some master tables for City, State and Country. This way one can avoid the different spellings for these entities which might end up with mapping same city with some different state/country.
One can simply map the CityId in the address table as foreign key as shown below, instead of having all the three fields separately (City, State and Country) as plain text in address table itself.
Address: {
CityId
// With other fields
}
City: {
CityId
StateId
// Other fields
}
State: {
StateId
CountryId
// Other fields
}
Country: {
CountryId
// Other fields
}
If one maintains all the three ids (CityId, StateId and CountryId) in address table, at the end you have to make joins against those tables. Hence my suggestion would be to have only CityId and then retrieve rest of the required information though joins with above table structure.
I prefer to use an XREF table that contains a FK reference to the person/business table, a FK reference to the address table and, generally, a FK reference to a role table (HOME, OFFICE, etc) to delineate the actual type of address. I also include an ACTIVE flag to allow me to choose to ignore old address while preserving the ability to maintain an address history.
This approach allows me to maintain multiple addresses of varying types for each primary entity