I designing a SQL db system(with Postgre) and I have a question about what is the common practice to create a relationship / reference that can persist even when referenced objects are deleted.
For example, there is a UserORM, and ActivityORM, and UserActivityRelation. ActivityORM holds user.id as foreign key to tell who created the activity, and relation table is about which users should know about the activity.
Now, if I want to remove the actor from db, I still want ActivityORM and the relation table to persist so that other users can still know about the activities. I want to know what is the most common / best practice to design such system. Simple answers might be not assigning them as foreign keys, or create an inactive state, but I wonder if there is any better ways. Thank you.
As you mentioned, removing the problematical foreign key constraints would solve the immediate problem of deleting either users or activities on either side of the bridge table. But then this would allow broken relationships to end up in your database.
I propose using soft deletion here. Add an active bit column to both the UserORM and ActivityORM tables. Then, when you need to delete a user or activity, just mark that record as inactive. This would guarantee that the key relationships do not get broken during the deletion. And this approach would also let you view relationships which used to exist, prior to any deletions.
Related
This might sound complicated, so I'll give an example.
Say, I have two tables Instructor and Class.
Instructor has a required field called PreferredClassID which has a foreign key against Class.
Class has a required field called CurrentInstructorID which is a foreign key against Instructor
Is it possible to insert a row to either of these tables?
Cause if I insert a row to Instructor, I won't be able to as I'll need to supply a PreferredClassID, but I can't create a Class row either because it needs a CurrentInstructorID.
If I can't do this, how would I solve this problem? Would I just need to make one of those fields non-required (even if business requirements specifies it really should be required?)
If you find yourself here, reevaluate your data relation model.
In this case, you could simply have a lookup table called PreferredCourse with courseId and instructorId.
This will enforce that both the course and instructor exist before adding the row to the PreferredCourse lookup. Maintaining business model requirements without bending the rules of database model requirements.
While it may seem excessive to have another table, it will prevent a whole lot of maintenance overhead in both your database procedures and jobs, and your application code. Circular references create nothing but headaches and are easily solved with small lookup tables and JOINs.
The Impaler gave an example of how to accomplish this with your current data structure. Please note, that you have to 1: make a key nullable in at least one of the tables, and then 2: Perform INSERTs in a specified order. Or, 3: disable the constraints, 4: perform INSERTS, 5: reenable constraints, 6: roll back transaction if constraints are now broken.
There is a whole lot that can go wrong, simply fix the relation model now before things get out of hand.
As long as one of those foreign keys allows a null value, you're good. So you:
Insert the row that accepts the null value first (say Instructor), with a null value on the FK. Get the ID of the inserted row.
Insert in the other table (say Class). In the FK you use the ID you got from step #1. Once inserted, you get the ID of this new row.
Update the FK on the first row (Instructor) with the ID you got from step #2.
Commit.
Alternatively, if both FKs are NOT NULL then you have a bit of a problem. The options I see for this last case are:
Use deferrable FK integrity check. Some databases do allow you to insert without checking integrity until the COMMIT happens. This is really tricky, and enabling this is looking for trouble.
Disable the FK for a short period of time. Some databases allow you to enable/disable constraints. You are not deleting them, just temporarily disabling them. If you do this, don't forget to enable them back.
Drop the constraint temporarily, while you do the insert, and the add it again. This is really a work around of last resort. Adding/Dropping constraint are DML statements and usually cannot participate in a transaction. Do this at your own peril.
Something to consider (as per user7396598's answer) is looking at how normal forms apply to your data as it fits within your relational model.
In this case, it might be worth looking at the following:
With your Instructor table, is the PreferredClassID a necessary component? Does an instructor -need- to have a preferred class, or is it okay to say "Hey, I'm creating an entry for a new instructor, I don't know their preferred class."
(if they're new, they might not have a preferred class that your school offers)
This is a case where you definitely want to have a foreign key, but it should be okay to say 'I don't necessarily know the value I want to put there.'
In a similar vein, does a Class need to have an instructor when it's created? Is it possible to create a Class that an instructor has not been assigned to yet?
Again, both of these points are really a case of 'I don't know what I want to put here, but when I do, it should be a specific instance that exists in another table.'
I'm trying to design my database with very basic tables and I am confused on the CORRECT way to do it.
I've attached a picture of the main info, and I'm not quite sure how to link them. Meaning what should be a foreign key, or should some of these tables include of LIST<> of the other tables.
UPDATE TO TABLES
As per your requirements, You are right about the associative table
Client can have multiple accounts And Accounts can have multiple clients
Then, Many (Client) to Many (Account)
So, Create an associate table to break the many to many relationship first. Then join it that way
Account can have only one Manager
Which means One(Manager) to Many(Accounts)
So, add an attribute called ManagerID in Accounts
Account can have many traedetail
Which means One(Accounts) to Many(TradeDetails)
So, add an attribute called AccountID in TradeDetails
Depends on whether you are looking to have a normalized database or some other type of design paradigm. I recommend doing some reading on the concepts of database normalization and referential integrity.
What I would do is make tables that have a 1 to 1 relationship such as account/manager into a single table (unless you can think of a really good reason not to). Add Clientid as a foreign key to Account. Add AccountID as a foreign key to TradeDetail. You are basically setting up everything as 1 to many relationships where the table that has 1 record for the id has the field as a primary key and the table that has many has it as a foreign key.
I have an Sql-Server database (2008R2, if that is important) with lots of tables.
Some table contains foreign key references to other tables, and some of the foreign key constraints have cascade on delete, and some are restricted delete.
To be user friendly, in my application I want to figure out all other references that needs to be deleted before I can delete the one I'm currently deleting and present it to the user.
Some generic problems described below:
I try to delete from Customer where Id = 1 and Order has a foreign key to Customer (and that foreign key restricts user from deleting customer before orders are deleted), I would want to get a result of all Orders that restricts me from deleting Customer.
If Contracts have references to Customer as well, and that foreign key is cascade, but Contract is referenced by Alarm with restricted delete, I want to get which Contract was responsible for not being able to delete Customer, and also the Alarms which was responsible for not being able to delete Customer.
I want this behavior to be recursive, so that I get all connections, direct or indirect which hindered me from deleting Customer in the first place. I also want to be able to get this information wherever I start (No matter if I wanted to delete a Customer, Order, Alarm etc)
It feels like someone aught to have had similar problems before me, but I can mostly find information to get table->table foreign key restrictions, not in relation to a specific entity in the database (i.e. customer with Id = 1)
Is there any simple way of doing this?
basically I look at this as part of database/application architecture and you need to know as the developer or DBA how you tables are connected.
You need to use the customer ID and query against any table that uses customerID as a foreign key and that would give you a result.same with the contracts. You can do this with some ERD tools, but I tend to make my own classes to do this for my databases, this way I have total control of what I want to do. For instance in the delete method for a customer, a company may just want to disable or set active to false for the customer. or truly delete everything.
There is a way to cascade deletion in the database, but I am not sure you want to do that, since you are asking for a result to be returned.
I'd like to get some advice on database design. Specifically, consider the following (hypothetical) scenario:
Employees - table holding all employee details
Users - table holding employees that have username and password to access software
UserLog - table to track when users login and logout and calculate
time on software
In this scenario, if an employee leaves the company I also want to make sure I delete them from the Users table so that they can no longer access the software. I can achieve this using ON DELETE CASCADE as part of the FK relationship between EmployeeID in Employees and Users.
However, I don't want to delete their details from the UserLog as I am interested in collating data on how long people spend on the software and the fact that they no longer work at the company does not mean their user behaviour is no longer relevant.
What I am left with is a table UserLog that has no relationships with any other tables in my database. Is this a sensible idea?
Having looked through books etc / googled online I haven't come across any DB schemas with tables that have no relationships with others and so my gut instinct here is saying that my approach is not robust...
I'd appreciate some guidance please.
My personal preference in this case would be to "soft delete" an employee by adding a "DeletedDate" column to the Employees table. This will allow you to maintain referential integrity with your UserLog table and all details for all employees, past and present, remain available in the database.
The downside to this approach is that you need to add application logic to check for active employees.
Yes, this is perfectly sensible. The log is just a raw audit of data that should never change. It doesn't need to be normalized (and shouldn't be) and/or linked to other tables.
Ideally, I would put write-heavy audit logging in a different database entirely than the read-heavy transactional day-to-day stuff. They may grow differently over time. But starting small it's fine to keep them in the same database as long as you understand the fundamental differences between them.
On a side note, I would recommend not deleting the users from the tables. Maybe have some kind of IsActive or IsDeleted bit on them that would effectively blind them from the application, but deleting should be avoided if possible.
The problem you have here is that it's perfectly possible to insert UserLog data for users that have never existed as there's no link to the table that defines valid users.
I would say that perhaps the better course of action would be to mark the users as invalid and remove all their personal details when they leave rather than delete the record entirely.
That's not to say there aren't situations where it is valid to have a table (or tables) on the database that don't reference others.
Is this a sensible idea
The problem is this. Since the data isn't linked you can delete something from the employee table and still have references in the UserLog. After the employee infomration is deleted, you have no way of knowing what Log data ties back to. Is this ok? Technically yes. There is nothing preventing you from doing it, but then why are you keeping the data in the first place? You also have no guarantee that the data in the table actually is about an employee. Someone could accidently enter a wrong EmployeeID in the table that doesn't belong to anyone. Keys help prevent data corruption. It's always better to have extra data than it is to have bad data.
What I've found is that you never want to delete data when possible. Space is cheap, and you can add flags etc. to show the record isn't active. Yes, this does cause more work (this can be quickly remedied by creating a view which only shows active employees), and saying that you should never delete data is far fetched, but you start linking data together. Deleting becomes very difficult. If you are not adding a FK just so you can delete records, it's a tell tale sign you need to rethink your strategy.
Relying on Cascade Delete can be very dangerous too. The model you are stating is that anytime you don't want data deleted you have to know not to add a FK to that table which links it back to users. It doesn't take long for someone to forget this.
What you can do is use logical deletion or disabling a user by adding a bool value Deleted or Disabled to the Users table.
Or replace the EmployeeId with the name of the employee in the UserLog.
An alternative to using the soft delete process, is to store all the historical details you would want about the user at the time the log record is created rather than store the employee id. So you might have username, logintime, logouttime, sessionlength in your table.
Sensible? Sure, as in it makes sense as you've described your need to keep those users indefinitely. The problem you'll run into is maintaining the tables. Instead of doing a cascading update once, you'll have to use at least two updates in order to insert a new user.
I think a table as you are suggesting is perfectly fine. I frequently encounter log tables that are do not have explicit relationships with other tables. Just because a database is "relational" doesn't mean everything has to relate haha.
One thing that I do notice though is that you are using EmployeeID in the log, but not using it as a foreign key to your Employee table. I understand why you don't want that, since you will be dropping employees. But, if you are dropping them completely, then the EmployeeID column is meaningless.
A solution to this would be to keep a flag for employees, such as active, that tracks if they are active or not. That way, the log data is meaningful.
IANADBA but it's generally considered very bad practice indeed to delete almost anything from a DB ever,It would be far better here to have some kind of locked flag / "deleted" datestamp on your users table and preserve your FK.
This is more of a curiosity at the moment, but let's picture an environment where I bill on a staunch nickle&dime basis. I have many operations that my system does and they're all billable. All these operations are recorded across various tables (these tables need to be separate because they record very different kinds of information). I also want to micro manage my accounts receivables. (Forgive me if you find inconsistencies here, as this example is not a real situation)
Is there a somewhat standard way of substituting a foreign key with something that can verify that the identifier in column X on my billing table is an existing identifier within one of many operations record tables?
One idea is that when journalizing account activity, I could reference the operation's identifier as well as the operation (specifically, the table that it's in) and use a CHECK constraint. This is probably the best way to go so that my journal is not ambiguous.
Are there other ways to solve this problem, de-facto or proprietary?
Do non-relational databases solve this problem?
EDIT:
To rephrase my initial question,
Is there a somewhat standard way of substituting a foreign key with something that can verify that the identifier in column X on my billing table is an existing identifier within one of many (but not necessarily all) operations record tables?
No, there's no way to achieve this with a single foreign key column.
You can do basically one of two things:
in your table which potentially references any of the other x tables, have x foreign key reference fields (ideally: ID's of type INT), only one of which will ever be non-NULL at any given time. Each FK reference key references exactly one of your other data tables
or:
have one "child" table per master table with a proper and enforced reference, and pull together the data from those n child tables into a view (instead of a table) for your reporting / billing.
Or just totally forget about referential integrity - which I would definitely not recommend!
you can Implementing Table inheritance
see article
http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server
An alternative is to enforce complex referential integrity rules via a trigger. However,and not knowing exactly what your design is, usually when these types of questions are asked it is to work around a bad design. Look at the design first and see if you can change it to make this something that can be handled through FKs, they are much more managable than doing this sort of thing through triggers.
If you do go the trigger route, don't forget to enforce updates as well as inserts and make sure your trigger will work properly with a set-based multi-row insert and update.
A design alternative is to havea amaster table that is parent to all your tables with the differnt details and use the FK against that.