Records linked to any table? - sql

Hi, I'm struggling a bit with this and could use some ideas...
Say my database has the following tables:
Customers
Suppliers
SalesInvoices
PurchaseInvoices
Currencies
etc etc
I would like to be able to add a "Notes" record to ANY type of record
The Notes table would look like this:
NoteID Int (PK)
NoteFK Int
NoteFKType Varchar(3)
NoteText varchar(100)
NoteDate Datetime
Where NoteFK is the PK of a customer or supplier etc., and NoteFKType says what type of record the note is against.
Now I realise that I cannot add an FK which references multiple tables without NoteFK needing to be present in all tables.
So how would you design the above?
A note needs to be attachable to a row in any of the above tables.
Cheers,
Daniel

You have to accept the limitation that you cannot teach the database about this foreign key constraint. So you will have to do without the integrity checking (and cascading deletes).
Your design is fine.
It is easily extensible to extra tables, you can have multiple notes per entity, and the target tables do not even need to be aware of the notes feature.
An advantage that this design has over using a separate notes table per entity table is that you can easily run queries across all notes, for example "most recent notes", or "all notes created by a given user".
As for the argument that the table will grow too big: splitting it into, say, five tables will shrink each table to about a fifth of the size, but this will not make any difference for index-based access. Databases are built to handle big tables (as long as they are properly indexed).

I think your design is OK, if you can accept the fact that the database will not check whether a note references an existing entity in another table. It's the only design I can think of that doesn't require duplication and is scalable to more tables.
The way you designed it, when you add another entity type that you'd like to have notes for, you won't have to change your model. Also, you don't have to include any additional columns in your existing model, or additional tables.
To ensure data integrity, you can create a set of triggers or some software solution that will clean the notes table once in a while.
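The "clean the notes table once in a while" idea could look roughly like the sketch below. It uses Python's built-in sqlite3 module only so that it runs anywhere; the type codes 'CUS' and 'SUP', the key column names and the function name delete_orphan_notes are assumptions made for illustration, not part of the original design.

```python
import sqlite3

# Assumed mapping from NoteFKType code to (parent table, parent PK column).
PARENTS = {"CUS": ("Customers", "CustomerID"), "SUP": ("Suppliers", "SupplierID")}


def delete_orphan_notes(conn):
    """Delete notes whose parent row no longer exists; return the number removed."""
    removed = 0
    for fk_type, (table, pk) in PARENTS.items():
        cur = conn.execute(
            f"DELETE FROM Notes WHERE NoteFKType = ? "
            f"AND NoteFK NOT IN (SELECT {pk} FROM {table})",
            (fk_type,),
        )
        removed += cur.rowcount
    conn.commit()
    return removed


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY);
    CREATE TABLE Notes (
        NoteID INTEGER PRIMARY KEY,
        NoteFK INTEGER NOT NULL,
        NoteFKType VARCHAR(3) NOT NULL,
        NoteText VARCHAR(100)
    );
    INSERT INTO Customers VALUES (1);
    INSERT INTO Suppliers VALUES (1);
    -- The second note points at customer 2, which does not exist.
    INSERT INTO Notes (NoteFK, NoteFKType, NoteText) VALUES
        (1, 'CUS', 'valid'), (2, 'CUS', 'orphan'), (1, 'SUP', 'valid');
""")
orphans_removed = delete_orphan_notes(conn)
```

A job like this only repairs the data after the fact, so orphaned notes can exist between runs.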

I would think twice before doing what you suggest. It might seem simple and elegant in the short term, but if you are truly interested in data integrity and performance, then having separate notes tables for each parent table is the way to go. Over the years, I've approached this problem using the solutions found in the other answers (triggers, GUIDs, etc.). I've come to the conclusion that the added complexity and loss of performance aren't worth it. By having separate note tables for each parent table, with appropriate foreign key constraints, lookups and joins will be simple and fast. When combining the related items into one table, join syntax becomes ugly and your notes table will grow to be huge and slow.

I agree with Michael McLosky, to a degree.
The question in my mind is: What is the technical cost of having multiple notes tables?
In my mind, it is preferable to consolidate the same functionality into a single table. It also makes reporting and other further development simpler. Not to mention keeping the list of tables smaller and easier to manage.
It's a balancing act: you need to try to predetermine both the benefits and the costs of doing something like this. My personal preference is database referential integrity. Application management of integrity should, in my opinion, be limited to business logic. The database should ensure the data is always consistent and valid...
To actually answer your question...
The option I would use is a check constraint that calls a user-defined function to check the values. This works in Microsoft SQL Server...
CREATE TABLE Test_Table_1 (id INT IDENTITY(1,1), val INT)
GO
CREATE TABLE Test_Table_2 (id INT IDENTITY(1,1), val INT)
GO
CREATE TABLE Test_Table_3 (fk_id INT, table_name VARCHAR(64))
GO
CREATE FUNCTION id_exists (@id INT, @table_name VARCHAR(64))
RETURNS INT
AS
BEGIN
    -- Return 1 when @id exists in the named table, otherwise 0.
    -- BEGIN/END keeps each ELSE attached to the table-name test.
    IF (@table_name = 'Test_Table_1')
    BEGIN
        IF EXISTS(SELECT * FROM Test_Table_1 WHERE id = @id)
            RETURN 1
    END
    ELSE IF (@table_name = 'Test_Table_2')
    BEGIN
        IF EXISTS(SELECT * FROM Test_Table_2 WHERE id = @id)
            RETURN 1
    END
    RETURN 0
END
GO
ALTER TABLE Test_Table_3 WITH CHECK ADD CONSTRAINT
CK_Test_Table_3 CHECK ((dbo.id_exists(fk_id,table_name)=(1)))
GO
ALTER TABLE [dbo].[Test_Table_3] CHECK CONSTRAINT [CK_Test_Table_3]
GO
INSERT INTO Test_Table_1 SELECT 1
GO
INSERT INTO Test_Table_1 SELECT 2
GO
INSERT INTO Test_Table_1 SELECT 3
GO
INSERT INTO Test_Table_2 SELECT 1
GO
INSERT INTO Test_Table_2 SELECT 2
GO
INSERT INTO Test_Table_3 SELECT 3, 'Test_Table_1'
GO
INSERT INTO Test_Table_3 SELECT 3, 'Test_Table_2'
GO
In that example, the final insert statement would fail.

You can get FK referential integrity, at the cost of having one column in the notes table for each other table.
create table Notes (
id int PRIMARY KEY,
note varchar (whatever),
customer_id int NULL REFERENCES Customer (id),
product_id int NULL REFERENCES Product (id)
)
Then you'll need a constraint to make sure that you have only one of the columns set.
Or maybe not, maybe you might want a note to be able to be associated with both a customer and a product. Up to you.
This design would require adding a new column to Notes if you want to add another referencing table.
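A minimal sketch of the "only one of the columns set" constraint, using Python's built-in sqlite3 module so it can be run as-is. The CHECK expression relies on SQLite treating comparisons as 0/1 integers; in SQL Server the same rule would need CASE expressions instead. The helper name try_insert is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY);
    CREATE TABLE Product (id INTEGER PRIMARY KEY);
    CREATE TABLE Notes (
        id INTEGER PRIMARY KEY,
        note VARCHAR(100),
        customer_id INT NULL REFERENCES Customer (id),
        product_id INT NULL REFERENCES Product (id),
        -- exactly one of the two parent columns must be filled in
        CHECK ((customer_id IS NOT NULL) + (product_id IS NOT NULL) = 1)
    );
    INSERT INTO Customer VALUES (1);
    INSERT INTO Product VALUES (1);
""")


def try_insert(customer_id, product_id):
    """Insert a note; return False if a CHECK or FK constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Notes (note, customer_id, product_id) VALUES (?, ?, ?)",
                ("example", customer_id, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Here try_insert(1, None) and try_insert(None, 1) are accepted, while setting both columns, neither column, or a customer id that does not exist is rejected by the database itself.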

You could add a GUID field to the Customers, Suppliers, etc. tables. Then in the Notes table, change the foreign key to reference that GUID.
This does not help for data integrity. But it makes M-to-N relationships easily possible to any number of tables and it saves you from having to define a NoteFKType column in the Notes table.

You can easily implement a "multi" foreign key with triggers. Triggers give you a very flexible mechanism, and you can do any integrity checks you wish.
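As a rough illustration of the trigger approach, the sketch below rejects a note whose parent row is missing. It is written against SQLite through Python's sqlite3 module so that it is runnable; SQL Server trigger syntax is different (it uses the inserted pseudo-table rather than NEW). The type codes 'CUS' and 'SUP' and the key column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY);
    CREATE TABLE Notes (
        NoteID INTEGER PRIMARY KEY,
        NoteFK INTEGER NOT NULL,
        NoteFKType VARCHAR(3) NOT NULL,
        NoteText VARCHAR(100)
    );
    -- Abort the insert unless the (type, key) pair points at an existing row.
    CREATE TRIGGER notes_parent_must_exist
    BEFORE INSERT ON Notes
    BEGIN
        SELECT RAISE(ABORT, 'note references a missing parent row')
        WHERE NOT (
            (NEW.NoteFKType = 'CUS' AND EXISTS
                (SELECT 1 FROM Customers WHERE CustomerID = NEW.NoteFK))
            OR
            (NEW.NoteFKType = 'SUP' AND EXISTS
                (SELECT 1 FROM Suppliers WHERE SupplierID = NEW.NoteFK))
        );
    END;
    INSERT INTO Customers VALUES (1);
""")


def add_note(fk, fk_type, text):
    """Insert a note; return False if the trigger rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Notes (NoteFK, NoteFKType, NoteText) VALUES (?, ?, ?)",
                (fk, fk_type, text),
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

This covers inserts only. Full coverage would also need triggers for updates to Notes and for deletes on each parent table, which is where the extra complexity mentioned in other answers comes from.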

Why don't you do it the other way around and have a foreign key in the other tables (Customer, Supplier, etc.) to NotesID? This way you have a one-to-one mapping.

Related

SQL - How to limit data entry of one attribute depending on another attribute?

Below is the DDL for the table I want to create. However, I want the attribute 'Expertise_breed' to be derived from 'Expertise_animal'. For example, if 'Dog' is entered into 'Expertise_animal' I don't want to be able to enter in a breed of cat. How would I go about achieving this?
I'm working with SQL Server Management Studio 2012
CREATE TABLE tExpertise
(
Expertise_ID int NOT NULL PRIMARY KEY, --E.G Data '001'
Expertise_type varchar(8) NOT NULL, --E.G Data 'Domestic'
Expertise_animal varchar(30) NOT NULL, --E.G Data 'Dog'
Expertise_breed varchar(30) NOT NULL --E.G Data 'Poodle'
)
This is a relational data situation; you should use relational tables.
I would have three:
AnimalClassification - (domestic,wild,other)
AnimalSpecies (dog,cat,goat)
AnimalBreed (Poodle, Beagle)
Animal species would have a foreign key to animal classification i.e.
Dog - domestic
Animal breed would have a foreign key to animal species i.e.
Beagle - dog
You can create a trigger on insert and/or update and compare those two columns for each row. You can refer to the inserted entries via 'inserted' alias.
If you know what mappings are allowed (eg dog x poodle) you could store it in some table and join to it in the insert to filter out the wrong ones.
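One way to have the database enforce such a mapping table, rather than joining to it in each insert, is a composite foreign key from the (animal, breed) pair to the table of allowed pairs. The sketch below is a variation on the idea above, not the answerer's exact method. It uses SQLite via Python's sqlite3 module; the table name AllowedBreed, the helper add_expertise and the 'Siamese' row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE AllowedBreed (
        animal VARCHAR(30) NOT NULL,
        breed VARCHAR(30) NOT NULL,
        PRIMARY KEY (animal, breed)
    );
    CREATE TABLE tExpertise (
        Expertise_ID INTEGER PRIMARY KEY,
        Expertise_type VARCHAR(8) NOT NULL,
        Expertise_animal VARCHAR(30) NOT NULL,
        Expertise_breed VARCHAR(30) NOT NULL,
        -- the pair must match one allowed (animal, breed) combination
        FOREIGN KEY (Expertise_animal, Expertise_breed)
            REFERENCES AllowedBreed (animal, breed)
    );
    INSERT INTO AllowedBreed VALUES
        ('Dog', 'Poodle'), ('Dog', 'Beagle'), ('Cat', 'Siamese');
""")


def add_expertise(expertise_id, kind, animal, breed):
    """Insert a row; return False if the animal/breed pair is not allowed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO tExpertise VALUES (?, ?, ?, ?)",
                (expertise_id, kind, animal, breed),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, a 'Dog' with breed 'Poodle' is accepted and a 'Dog' with breed 'Siamese' is rejected, and new combinations are added as data rather than as code.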
Theoretically, what you want can be achieved using table level constraints, a generic way of doing this being the following (not tested):
CREATE FUNCTION dbo.validateExpertise(
    @Expertise_type varchar(8),
    @Expertise_animal varchar(30),
    @Expertise_breed varchar(30)
)
RETURNS BIT
AS
BEGIN
    -- Example rule: reject a dog whose breed is not a known dog breed
    IF (@Expertise_animal = 'Dog' AND @Expertise_breed NOT IN ('Poodle', 'Beagle'))
        RETURN 0;
    -- other validations can come here
    RETURN 1;
END
GO
-- add a table level constraint
-- WITH NOCHECK can be used to not check existing data
ALTER TABLE tExpertise ADD CONSTRAINT chkExpertise
CHECK (dbo.validateExpertise(Expertise_type, Expertise_animal, Expertise_breed) = 1)
While this may help you, it is not recommended to put such complex validation at the database level. Complex validations are meant to be implemented (at least) in the business layer of your application, which is typically within the logic tier (usually ASP.NET MVC, a WCF service, a web service, etc.). Some validations are also put in the presentation layer to avoid round-trip delays.
The database is meant primarily for data persistence and retrieval. Of course, simple constraints such as FKs, unique constraints, column-level constraints etc. are welcome, as they act as a good safety net.
Also, keep in mind that constraints like the one mentioned above will be evaluated for every INSERT or UPDATE on the table and might seriously degrade performance for statements that touch a large number of rows.

SQL database design pattern for user favorites?

I asked this on the database site, but it seems to be really slow moving. I'm new to SQL and databases in general; the only thing I have worked on with an SQL database used one-to-many relationships. I want to know the easiest way to go about implementing a "favorites" mechanism for users in my DB, similar to what loads of sites like YouTube offer. Users are of course unique, so one user can have many favorites, but one item can also be favorited by many users. Is this considered a many-to-many relationship? What is the typical design pattern for doing this? Many-to-many relationships look like a headache (I'm using SQLAlchemy, so my tables are interacted with like objects), but this seems to be a fairly common feature on sites, so I was wondering what the most straightforward and easy way to go about it is. Thanks
Yes, this is a classic many-to-many relationship. Usually, the way to deal with it is to create a link table, so in say, T-SQL you'd have...
create table [user] -- user is a reserved word in T-SQL, so it needs brackets
(
user_id int identity primary key,
-- other user columns
)
create table item
(
item_id int identity primary key,
-- other item columns
)
create table userfavoriteitem
(
user_id int foreign key references [user](user_id),
item_id int foreign key references item(item_id),
-- other information about favoriting you want to capture
)
To see who favorited what, all you need to do is run a query on the userfavoriteitem table which would now be a data mine of all sorts of useful stats about what items are popular and who liked them.
select ufi.item_id
from userfavoriteitem ufi
where ufi.user_id = @user_id -- the user whose favorites you want
Or you can even get the most popular items on your site using the query below, though if you have a lot of users this will get slow, and the results should be saved in a special table updated by a scheduled job on the backend every so often...
select top 10 ufi.item_id, count(ufi.item_id) as favorite_count
from userfavoriteitem ufi
group by ufi.item_id
order by favorite_count desc
I've never seen any explicitly-for-database design patterns (except a couple of trivial misuses of the phrase 'design pattern' when it became fashionable some years ago).
M:M relationships are OK: use a link table (aka association table etc etc). Your example of a User and Favourite sounds like M:M indeed.
create table LinkTable
(
Id int IDENTITY(1, 1), -- PK of this table
IdOfTable1 int, -- PK of table 1
IdOfTable2 int -- PK of table 2
)
...and create a UNIQUE index on (IdOfTable1, IdOfTable2). Or do away with the Id column and make the PK on (IdOfTable1, IdOfTable2) instead.
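The link-table pattern described in these answers can be tried out with the sketch below. It uses SQLite through Python's built-in sqlite3 module rather than T-SQL or SQLAlchemy, with the composite primary key variant so a user cannot favorite the same item twice. The table names, helper names and sample ids are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE items (item_id INTEGER PRIMARY KEY);
    CREATE TABLE user_favorite_item (
        user_id INTEGER NOT NULL REFERENCES users (user_id),
        item_id INTEGER NOT NULL REFERENCES items (item_id),
        PRIMARY KEY (user_id, item_id)  -- one row per (user, item) pair
    );
    INSERT INTO users VALUES (1), (2);
    INSERT INTO items VALUES (10), (20);
""")


def add_favorite(user_id, item_id):
    """Record a favorite; return False for duplicates or unknown ids."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO user_favorite_item VALUES (?, ?)", (user_id, item_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def favorites_of(user_id):
    """Item ids favorited by one user."""
    rows = conn.execute(
        "SELECT item_id FROM user_favorite_item WHERE user_id = ? ORDER BY item_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def most_popular(limit=10):
    """(item_id, favorite_count) pairs, most favorited first."""
    return conn.execute(
        "SELECT item_id, COUNT(*) AS favs FROM user_favorite_item "
        "GROUP BY item_id ORDER BY favs DESC, item_id LIMIT ?",
        (limit,),
    ).fetchall()
```

An ORM such as SQLAlchemy would typically map the same three tables; the schema itself does not change.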

SQL Server foreign key to multiple tables

I have the following database schema:
members_company1(id, name, ...);
members_company2(id, name, ...);
profiles(memberid, membertypeid, ...);
membertypes(id, name, ...)
[
{ id : 1, name : 'company1', ... },
{ id : 2, name : 'company2', ... }
];
So each profile belongs to a certain member either from company1 or company2 depending on membertypeid value
members_company1 ————————— members_company2
———————————————— ————————————————
id ——————————> memberid <——————————— id
name membertypeid name
/|\
|
|
profiles |
—————————— |
memberid ————————+
membertypeid
I am wondering if it's possible to create a foreign key in profiles table for referential integrity based on memberid and membertypeid pair to reference either members_company1 or members_company2 table records?
A foreign key can only reference one table, as stated in the documentation (emphasis mine):
A foreign key (FK) is a column or combination of columns that is used
to establish and enforce a link between the data in two tables.
But if you want to start cleaning things up, you could create a members table as @KevinCrowell suggested, populate it from the two members_company tables and replace them with views. You can use INSTEAD OF triggers on the views to 'redirect' updates to the new table. This is still some work, but it would be one way to fix your data model without breaking existing applications (if it's feasible in your situation, of course).
Operating under the assumption that you can't change the table structure:
Option 1
How important is referential integrity to you? Are you only doing inner joins between these tables? If you don't have to worry too much about it, then don't worry about it.
Option 2
Ok, you probably have to do something about this. Maybe you do have inner joins only, but you have to deal with data in profiles that doesn't relate to anything in the members tables. Could you create a job that runs once per day or week to clean it out?
Option 3
Yeah, that one may not work either. You could create a trigger on the profiles table that checks the reference to the members tables. This is far from ideal, but it does guarantee instantaneous checks.
My Opinion
I would go with option 2. You're obviously dealing with a less-than-ideal schema. Why make this worse than it has to be? Let the bad data sit for a week; clean the table every weekend.
No. A foreign key can reference one and only one primary key and there is no way to spread primary keys across tables. The kind of logic you hope to achieve will require use of a trigger or restructuring your database so that all members are based off a core record in a single table.
Come on, you can create a table but you cannot modify members_company1 or members_company2?
Your idea of creating a members table will require more actions when new records are inserted into the members_company tables.
So you can create triggers on members_company1 and members_company2, and that does not count as modifying them?
What are the constraints on what you can do?
If you just need compatibility on selects to members_company1 and members_company2 then create a real members table and create views for members_company1 and members_company2.
A basic select does not know it is a view or a table on the other end.
CREATE VIEW dbo.members_company1
AS
SELECT id, name
FROM members
where companyID = 1
You could possibly even handle inserts, updates, and deletes with INSTEAD OF triggers (see "INSTEAD OF INSERT Triggers" in the SQL Server documentation).
A foreign key cannot reference two tables. Assuming you don't want to correct your design by merging members_company1 and members_company2 tables, the best approach would be to:
Add two columns called member_company1_id and member_company2_id to your profiles table and create two foreign keys to the two tables and allow nulls. Then you could add a constraint to ensure 1 of the columns is null and the other is not, at all times.

MS SQL share identity seed amongst tables

In MS SQL is it possible to share an identity seed across tables? For example I may have 2 tables:
Table: PeopleA
id
name
Table: PeopleB
id
name
I'd like for PeopleA.id and PeopleB.id to always have unique values between themselves. I.e. I want them to share the same Identity seed.
Note: I do not want to hear about table partitioning please, only about if it's possible to share a seed across tables.
Original answer
No you can't and if you want to do this, your design is almost
certainly flawed.
When I wrote this in 2010 that was true. However, SQL Server now has Sequences, which can do what the OP wants to do. While this may not help the OP (who surely has long since solved his problem), it may help someone else looking to do the same thing. I do still think that wanting to do this is usually a sign of a design flaw, but it is possible out of the box now.
No, but I guess you could create an IDENTITY(1, 2) on the one table and an IDENTITY(2, 2) on the other. It's not a very robust design though.
Could you instead refer to your entities as 'A1', 'A2', ... if they come from TableA and 'B1', 'B2', etc... if they come from TableB? Then it's impossible to get duplicates. Obviously you don't actually need to store the A and the B in the database as it is implied.
Not sure what your design is, but sometimes it is useful to use an inheritance-type model, where you have a base table and then sub-tables with substantially different attributes, e.g.:
Person
------
PersonID <-- PK, autoincrement
FirstName
LastName
Address1
...
Employee
--------
PersonID <-- PK (not autoincrement), FK to Person
JobRoleID
StartDate
Photo
...
Associate
---------
PersonID <-- PK (not autoincrement), FK to Person
AssociateBranchID
EngagementTypeID
...
In this case you would insert the base values to Person, and then use the resulting PersonID to insert into either Employee or Associate table.
If you really need this, create a third table PeopleMaster, where the identity(1,1) exists, make the two other tables just have int FKs to this identity value. Insert into the PeopleMaster and then into PeopleA or PeopleB.
I would really consider this a bad design though. Create one table with a PeopleType flag ("A" or "B") and include all common columns, and create child tables if necessary (for any different columns between the PeopleA and PeopleB)
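For readers who do want the PeopleMaster approach, here is a minimal sketch of the insert order it implies: take an id from the master table first, then reuse it in PeopleA or PeopleB. It runs on SQLite via Python's sqlite3 module, where AUTOINCREMENT stands in for IDENTITY(1,1); the helper add_person and the sample names are made up for this example.

```python
import sqlite3

CHILD_TABLES = ("PeopleA", "PeopleB")

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE PeopleMaster (id INTEGER PRIMARY KEY AUTOINCREMENT);
    CREATE TABLE PeopleA (
        id INTEGER PRIMARY KEY REFERENCES PeopleMaster (id),
        name TEXT
    );
    CREATE TABLE PeopleB (
        id INTEGER PRIMARY KEY REFERENCES PeopleMaster (id),
        name TEXT
    );
""")


def add_person(table, name):
    """Allocate an id from PeopleMaster, then insert into the chosen child table."""
    if table not in CHILD_TABLES:
        raise ValueError(f"unknown table: {table}")
    with conn:  # both inserts commit together or not at all
        new_id = conn.execute("INSERT INTO PeopleMaster DEFAULT VALUES").lastrowid
        conn.execute(f"INSERT INTO {table} (id, name) VALUES (?, ?)", (new_id, name))
    return new_id


ids = [
    add_person("PeopleA", "Ann"),
    add_person("PeopleB", "Bob"),
    add_person("PeopleA", "Cy"),
]
```

Because every id comes from the one master table, PeopleA.id and PeopleB.id can never collide, at the price of an extra insert per row.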
No.
But I have worked on projects where a similar concept was used. In my case what we did was have a table called [MasterIdentity] which had one column [Id] (an identity seed). No other table in the database had any columns with an identity seed and when Identities were required a function/stored proc was called to insert a value into the [MasterIdentity] table and return the seed.
No, there is nothing built into SQL Server to do this.
Obviously there are workarounds such as both using an FK relationship to a table which does have a single IDENTITY and having some fancy constraints or triggers.

Merging databases how to handle duplicate PK's

We have three databases that are physically separated by region, one in LA, SF and NY. All the databases share the same schema but contain data specific to their region. We're looking to merge these databases into one and mirror it. We need to preserve the data for each region but merge them into one db. This presents quite a few issues for us, for example we will certainly have duplicate Primary Keys, and Foreign Keys will be potentially invalid.
I'm hoping to find someone who has had experience with a task like this who could provide some tips, strategies and words of experience on how we can accomplish the merge.
For example, one idea was to create composite keys and then change our code and sprocs to find the data via the composite key (region/original pk). But this requires us to change all of our code and sprocs.
Another idea was to just import the data and let it generate new PK's and then update all the FK references to the new PK. This way we potentially don't have to change any code.
Any experience is welcome!
I have no first-hand experience with this, but it seems to me like you ought to be able to uniquely map PK -> New PK for each server. For instance, generate new PKs such that data from LA server has PK % 3 == 2, SF has PK % 3 == 1, and NY has PK % 3 == 0. And since, as I understood your question anyway, each server only stores FK relationships to its own data, you can update the FKs in identical fashion.
NewLA = OldLA*3-1
NewSF = OldSF*3-2
NewNY = OldNY*3
You can then merge those and have no duplicate PKs. This is essentially, as you already said, just generating new PKs, but structuring it this way allows you to trivially update your FKs (assuming, as I did, that the data on each server is isolated). Good luck.
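The remainder-based remapping above is simple enough to check with a few lines of Python. The function and dictionary names below are made up for this sketch; the arithmetic follows the scheme in this answer (LA keys end up with remainder 2, SF with remainder 1, NY with remainder 0 when divided by 3).

```python
# Amount subtracted after multiplying the old key by 3, per region.
REGION_SHIFT = {"LA": 1, "SF": 2, "NY": 0}

# Reverse lookup: remainder of the new key modulo 3 -> region.
REGION_BY_REMAINDER = {2: "LA", 1: "SF", 0: "NY"}


def remap_pk(region, old_pk):
    """Map a region-local key to a key that is unique across all three regions."""
    return old_pk * 3 - REGION_SHIFT[region]


def region_of(new_pk):
    """Recover the originating region from a remapped key."""
    return REGION_BY_REMAINDER[new_pk % 3]
```

The same function is applied to every foreign key column that points at a remapped primary key, which is what keeps the references consistent. Note that the new keys are roughly three times the old ones, so the largest existing key must stay below a third of the column's maximum value.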
BEST: add a column for RegionCode and include it in your PKs, although it sounds like you don't want to do all that leg work.
HACK: if your IDs are INTs, a quick fix would be to add a fixed value based on region to each key on import. INTs can be as large as: 2,147,483,647
local server data:
LA IDs: 1,2,3,4,5,6
SF IDs: 1,2,3,4,5
NY IDs: 1,2,3,4,5,6,7,9
add 100000000 to LA's IDs
add 200000000 to SF's IDs
add 300000000 to NY's IDs
combined server data:
LA IDs: 100000001,100000002,100000003,100000004,100000005,100000006
SF IDs: 200000001,200000002,200000003,200000004,200000005
NY IDs: 300000001,300000002,300000003,300000004,300000005,300000006,300000007,300000009
I have done this and I say change your keys (pick a method) rather than changing your code. Invariably you will either miss a stored procedure or introduce a bug. With data changes, it is pretty easy to write tests to look for orphaned records or to verify that things were matched up correctly. With code changes, especially code that is working correctly, it is too easy to miss something.
One thing you could do is set up the tables with regional data to use GUIDs. That way, the primary keys in each region are unique, and you can mix and match data (import data from one region to another). For the tables which have shared data (like type tables), you can keep the primary keys the way they are (since they should be the same everywhere).
Here is some information about GUIDs:
http://www.sqlteam.com/article/uniqueidentifier-vs-identity
Maybe SQL Server Management Studio lets you convert columns to use GUID's easily. I hope so!
Best of luck.
What I have done in a situation like this is:
1) Create a new db with the same schema, but only the tables: no PKs, FKs, checks, etc.
2) Transfer the data from DB1 to this source db.
3) For each table in the target database, find the top number for the PK.
4) For each table in the source database, update its PKs, FKs, etc., starting with (top number + 1) from the target db.
5) For each table in the target database, set identity insert to on.
6) Import the data from the source db to the target db.
7) For each table in the target database, set identity insert to off.
8) Clear the source db.
9) Repeat for DB2.
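The core of this procedure, shifting the incoming keys past the target's current top key and shifting the foreign keys by the same amount, can be sketched as follows. This is a simplified illustration rather than the exact steps above: it uses two in-memory SQLite databases through Python's sqlite3 module, shifts the keys while copying instead of updating the source first, and borrows a two-table PERSON/ADDRESS schema purely as an example. The sample rows and function names are assumptions.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE PERSON (ID INTEGER PRIMARY KEY, Name VARCHAR(100) NOT NULL);
    CREATE TABLE ADDRESS (ID INTEGER PRIMARY KEY, City VARCHAR(100) NOT NULL,
                          PERSON_ID INT);
"""


def top_pk(conn, table):
    """Highest existing ID in a table (0 when the table is empty)."""
    return conn.execute(f"SELECT COALESCE(MAX(ID), 0) FROM {table}").fetchone()[0]


def merge_into(target, source):
    """Copy source rows into target, shifting PKs and FKs past the target's top IDs."""
    person_shift = top_pk(target, "PERSON")
    address_shift = top_pk(target, "ADDRESS")
    people = source.execute("SELECT ID, Name FROM PERSON").fetchall()
    addresses = source.execute("SELECT ID, City, PERSON_ID FROM ADDRESS").fetchall()
    with target:  # one transaction for the whole import
        target.executemany(
            "INSERT INTO PERSON (ID, Name) VALUES (?, ?)",
            [(pk + person_shift, name) for pk, name in people],
        )
        target.executemany(
            "INSERT INTO ADDRESS (ID, City, PERSON_ID) VALUES (?, ?, ?)",
            [
                # the FK moves by the same amount as the PK it points at
                (pk + address_shift, city, None if fk is None else fk + person_shift)
                for pk, city, fk in addresses
            ],
        )


target = sqlite3.connect(":memory:")
target.executescript(SCHEMA + """
    INSERT INTO PERSON VALUES (1, 'Bob'), (2, 'Cy');
    INSERT INTO ADDRESS VALUES (1, 'NY', 2);
""")
source = sqlite3.connect(":memory:")
source.executescript(SCHEMA + """
    INSERT INTO PERSON VALUES (1, 'Ann');
    INSERT INTO ADDRESS VALUES (1, 'LA', 1);
""")
merge_into(target, source)
```

In SQL Server the explicit ID inserts are what require SET IDENTITY_INSERT to be switched on for each target table, as described in the steps above.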
As Jon mentioned, I would use GUIDs to solve the merge task. I see two different solutions that use GUIDs:
1) Permanently change your database schema to use GUIDs instead of INTEGER (IDENTITY) as primary key.
This is a good solution in general, but if you have a lot of non-SQL code that is somehow bound to the way your identifiers work, it could require quite a few code changes. Since you are merging databases, you will probably need to update your application anyway, so that it works with only one region's data based on the logged-in user, etc.
2) Temporarily add GUIDs for migration purposes only, and after the data is migrated, drop them:
This one is a bit trickier, but once you have written the migration script, you can (re-)run it as many times as needed to merge the databases again in case something goes wrong the first time. Here is an example:
Table: PERSON (ID INT PRIMARY KEY, Name VARCHAR(100) NOT NULL)
Table: ADDRESS (ID INT PRIMARY KEY, City VARCHAR(100) NOT NULL, PERSON_ID INT)
Your alter scripts are (note that for all PK we automatically generate the GUID):
ALTER TABLE PERSON ADD UID UNIQUEIDENTIFIER NOT NULL DEFAULT (NEWID())
ALTER TABLE ADDRESS ADD UID UNIQUEIDENTIFIER NOT NULL DEFAULT (NEWID())
ALTER TABLE ADDRESS ADD PERSON_UID UNIQUEIDENTIFIER NULL
Then you update the FKs to be consistent with INTEGER ones:
--// set ADDRESS.PERSON_UID
UPDATE ADDRESS
SET ADDRESS.PERSON_UID = PERSON.UID
FROM ADDRESS
INNER JOIN PERSON
ON ADDRESS.PERSON_ID = PERSON.ID
You do this for all PKs (automatically generate GUID) and FKs (update as shown above).
Now you create your target database. In this target database you also add the UID columns for all the PKs and FKs. Also disable all FK constraints.
Now you insert from each of your source databases to the target one (note: we do not insert PKs and integer FKs):
INSERT INTO TARGET_DB.dbo.PERSON (UID, NAME)
SELECT UID, NAME FROM SOURCE_DB1.dbo.PERSON
INSERT INTO TARGET_DB.dbo.ADDRESS (UID, CITY, PERSON_UID)
SELECT UID, CITY, PERSON_UID FROM SOURCE_DB1.dbo.ADDRESS
Once you inserted data from all the databases, you run the code opposite to the original to make integer FKs consistent with GUIDs on the target database:
--// set ADDRESS.PERSON_ID
UPDATE ADDRESS
SET ADDRESS.PERSON_ID = PERSON.ID
FROM ADDRESS
INNER JOIN PERSON
ON ADDRESS.PERSON_UID = PERSON.UID
Now you may drop all the UID columns:
ALTER TABLE PERSON DROP COLUMN UID
ALTER TABLE ADDRESS DROP COLUMN UID
ALTER TABLE ADDRESS DROP COLUMN PERSON_UID
So in the end you should get a rather long migration script that should do the job for you. The point is: IT IS DOABLE.
NOTE: none of the code written here has been tested.