I have a single table with images which I need to link to 6 other tables. Let's say those tables are - Users, Tables, Foods, Restaurants, Categories, and Ships.
Should I create 6 different junction tables so each table has its own junction table - Images_Users, Images_Tables, Images_Restaurants, etc.?
Or is it better to create one table with a field to distinguish where it links -
Images_Entity with fields - Id, Image_Id, Entity_Id, Entity_Type (I use this to distinguish whether it's the Users table, Foods, or whatever). I don't like this solution since I will lack an FK constraint in this case, but I'm leaning towards it since the project will already have a large number of tables.
Perhaps there is a third approach? Create 6 image tables? Which solution is the best performance wise?
EDIT*
The database will be used to display data; insert and update performance is not an issue, only select statements. I just figured out that no image can link to two entries (this makes the junction tables redundant).
Let me rephrase the question entirely - What is the best way to connect the Images table with only one of the 6 other tables using a one-to-many association?
So Images table should contain FK and can link to only one of the 6 tables, never two at the same time.
One possible approach is to add a UserId, RestaurantId, TableId, FoodId etc to the Images table.
That way, you can add a proper FK to each of those columns.
Using a constraint or trigger (depending on the DBMS), you can enforce that exactly one of those fields is not null. The actual validation of that one filled in id is handled by the FK constraints.
This way it is fairly easy to enforce all the rules you want to have.
With separate junction tables this is harder to manage. When you insert a junction row for a Table_Image, you must validate whether there is no such record for any of the other entities in the other junction tables.
You could use exclusive FKs or inheritance as described in this post.
If you opt for the former, the CHECK employed for exclusive FKs would need to use slightly different syntax (from what was used in the link) to be practical:
CHECK (
(
CASE WHEN UserID IS NULL THEN 0 ELSE 1 END
+ CASE WHEN TableID IS NULL THEN 0 ELSE 1 END
+ CASE WHEN FoodID IS NULL THEN 0 ELSE 1 END
+ CASE WHEN RestaurantID IS NULL THEN 0 ELSE 1 END
+ CASE WHEN CategoryID IS NULL THEN 0 ELSE 1 END
+ CASE WHEN ShipID IS NULL THEN 0 ELSE 1 END
)
= 1
)
The CHECK above ensures that exactly one of the FKs is non-NULL at any given time.
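As a rough, runnable illustration of the exclusive-FK idea, here is a sketch using Python's built-in sqlite3 module. The Images column names follow the CHECK above; the Id columns on the parent tables, the sample rows, and the try_insert helper are assumptions made for the demo, and other DBMSs may need slightly different DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on

# Six parent tables, each with a single row whose Id is 1 (demo data).
for name in ["Users", "Tables", "Foods", "Restaurants", "Categories", "Ships"]:
    conn.execute(f"CREATE TABLE {name} (Id INTEGER PRIMARY KEY)")
    conn.execute(f"INSERT INTO {name} (Id) VALUES (1)")

conn.execute("""
CREATE TABLE Images (
    Id           INTEGER PRIMARY KEY,
    UserID       INTEGER REFERENCES Users(Id),
    TableID      INTEGER REFERENCES Tables(Id),
    FoodID       INTEGER REFERENCES Foods(Id),
    RestaurantID INTEGER REFERENCES Restaurants(Id),
    CategoryID   INTEGER REFERENCES Categories(Id),
    ShipID       INTEGER REFERENCES Ships(Id),
    CHECK (
        (
            CASE WHEN UserID       IS NULL THEN 0 ELSE 1 END
          + CASE WHEN TableID      IS NULL THEN 0 ELSE 1 END
          + CASE WHEN FoodID       IS NULL THEN 0 ELSE 1 END
          + CASE WHEN RestaurantID IS NULL THEN 0 ELSE 1 END
          + CASE WHEN CategoryID   IS NULL THEN 0 ELSE 1 END
          + CASE WHEN ShipID       IS NULL THEN 0 ELSE 1 END
        ) = 1
    )
)
""")


def try_insert(columns: dict) -> bool:
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    names = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    try:
        conn.execute(f"INSERT INTO Images ({names}) VALUES ({marks})",
                     tuple(columns.values()))
        return True
    except sqlite3.IntegrityError:
        return False


one_owner = try_insert({"UserID": 1})                # exactly one FK set
two_owners = try_insert({"UserID": 1, "FoodID": 1})  # CHECK should reject
no_owner = try_insert({"Id": 99})                    # all FKs NULL, CHECK should reject
bad_parent = try_insert({"ShipID": 42})              # no Ships row 42, FK should reject
```

The CHECK handles the "exactly one" rule, while the ordinary FK constraints validate whichever id is filled in.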
BTW, your suspicions about "dynamic" FK (with the "type" field) were correct.
Operational databases of identical structure work in several countries.
country A has table Users with column user_id
country B has table Users with column user_id
country C has table Users with column user_id
When data from all three databases is brought to the staging area for the further data warehousing purposes all three operational tables are integrated into a single table Users with dwh_user_id.
The logic looks like following:
if record comes from A then dwh_user_id = 1000000 + user_id
if record comes from B then dwh_user_id = 4000000 + user_id
if record comes from C then dwh_user_id = 8000000 + user_id
I have a strong feeling that it is a very bad approach. What would be a better approach?
(user_id + country_iso_code maybe?)
In general, it's a terrible idea to inject logic into your primary key in this way. It really sets you up for failure - what if country A's user_id values reach 3000000? At that point 1000000 + user_id runs into the range reserved for country B.
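To make the failure mode concrete, the offset rule from the question can be written out and checked. The offsets are the ones given in the question; the function name and the sample ids are only illustrative.

```python
# Offsets taken from the question; everything else is illustrative.
OFFSETS = {"A": 1_000_000, "B": 4_000_000, "C": 8_000_000}


def dwh_user_id(country: str, user_id: int) -> int:
    """Offset-based key from the question: fixed offset per country plus the local id."""
    return OFFSETS[country] + user_id


small_a = dwh_user_id("A", 1)          # 1_000_001, safely inside A's range
big_a = dwh_user_id("A", 3_000_001)    # spills into B's range
first_b = dwh_user_id("B", 1)          # 4_000_001
collision = big_a == first_b           # two different users, one warehouse key
```

Country A only has room for ids below 3000000, and country B only for ids below 4000000, before the generated keys start to overlap.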
There are a variety of solutions.
Ideally, you include the column "country" in all tables, and use that together with the ID as the primary key. This keeps the logic identical between master and country records.
If you're working with a legacy system, and cannot modify the country tables, but can modify the master table, add the key there, populate it during load, and use the combination of country and ID as the primary key.
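A minimal sketch of the compound-key approach, using Python's sqlite3 module. The country_code column name, the load helper, and the sample ids are assumptions for illustration; the point is only that the same user_id can coexist for different countries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Users (
    country_code TEXT    NOT NULL,   -- populated during load (assumed column name)
    user_id      INTEGER NOT NULL,   -- id exactly as it is in the source database
    PRIMARY KEY (country_code, user_id)
)
""")


def load(country_code: str, user_ids: list) -> None:
    """Hypothetical load step: stamp each incoming row with its country."""
    conn.executemany(
        "INSERT INTO Users (country_code, user_id) VALUES (?, ?)",
        [(country_code, uid) for uid in user_ids],
    )


load("A", [1, 2, 3])
load("B", [1, 2])
load("C", [1])

rows = conn.execute(
    "SELECT country_code, user_id FROM Users ORDER BY country_code, user_id"
).fetchall()
```

No arithmetic is involved, so the key cannot overflow into another country's range however many users a country has.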
The way we handle this scenario in Ajilius is to add metadata columns to the load. Values like SERVER_NAME or DATABASE_NAME might provide enough unique information to make a compound key unique.
An alternative scenario is to generate a GUID for each row at extract or load time, which would then uniquely identify each row.
The data vault guys like to use a hash across the row, but in this case it would only work if no row was ever a complete duplicate.
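The two surrogate-key ideas above can be sketched with Python's standard library. The dictionary layout, the column names, and the choice of SHA-256 are assumptions for illustration, not what any particular tool does.

```python
import hashlib
import uuid


def with_guid(row: dict) -> dict:
    """Attach a random GUID at extract/load time (uuid4 is random, not derived from the data)."""
    return {"dwh_user_guid": str(uuid.uuid4()), **row}


def row_hash(row: dict) -> str:
    """Hash across the whole row; identical rows always produce the same hash."""
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


user_a = {"country": "A", "user_id": 7}
user_b = {"country": "B", "user_id": 7}

guid_1 = with_guid(user_a)["dwh_user_guid"]
guid_2 = with_guid(user_a)["dwh_user_guid"]   # same row, different GUID each load
hash_a = row_hash(user_a)
hash_b = row_hash(user_b)
hash_a_again = row_hash(dict(user_a))         # exact duplicate row, same hash
```

Note the trade-off: the GUID changes every time the row is reloaded unless it is stored, while the hash is repeatable but cannot tell exact duplicates apart.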
This is why they made the Uniqueidentifier data type. See here.
If you can't change to that, I would put each one in a different table and then union them in a view. Something like:
create view vWorld
as
select 1 as CountryId, user_id
from SpainUsers
UNION ALL
select 2 as CountryId, user_id
from USUsers
Most efficient way to do this would be :-
If record from Country A, then user * 0 = Hence dwh_user_id = 0.
If record from Country B, then (user * 0)- 1 = Hence dwh_user_id = -1.
If record from Country C, then (user * 0)+ 1 = Hence dwh_user_id = 1.
Suggesting this logic assuming the dwh_user_id is supposed to be a number field.
I have a table representing an item like so:
item(id, description, motor_id, product_id, drive_id);
The issue I have is that an item can be only one of a "motor", a "drive", or a "product". In each case the foreign key points to a different table full of motors, products or drives.
Can this table design for item be improved, or is it pretty much good as is?
My concern is that having 3 foreign keys, only one of which can be used at a time is a sign of bad design.
Structure of other tables
motor(id, description, hp, voltage, motor_orientation);
drive(id, feature, model, weight, dimensions, hp);
product(id, description, model, class, type, option1, option1_cost);
This is a legitimate way to implement a "one-of" relationship using a relational database. In addition to the above, you should mandate that exactly one of the columns is not NULL:
alter table item
add constraint chk_item_foreignkeys
check (((case when motor_id is not null then 1 else 0 end) +
        (case when product_id is not null then 1 else 0 end) +
        (case when drive_id is not null then 1 else 0 end)
       ) = 1
      );
This method of implementing a "one-of" relationship has the advantage that foreign keys are explicitly declared and enforced. It has the downside that most databases will still reserve space for each key, even though only one is used.
Let's assume I have 2 tables:
Order -< OrderItem
I have another table with 2 FKs:
Feature
- Id
- FkOrderId
- FkOrderItemId
- Text
UPDATE
This table is linked to another called FeatureReason which is common to both types of record, be they OrderFeatures or OrderItem features.
Feature -< FeatureReason
If I had 2 feature tables to account for both types of records, would this then require 2 FeatureReason tables. Same issue here with the FeatureReason table needing to have 2 FKs, each pointing to a different master table.
An Order can have a Feature record, as can an OrderItem. Therefore either "FkOrderId" OR FkOrderItemId would be populated. Is this fine to do?
I would also seriously think about using Views to insert/edit and read either OrderFeatures or OrderItemFeatures.
Thoughts appreciated.
I would recommend using the following structure, because if you have 2 foreign keys, either of which can be null, you can end up with rows where both columns are null or both have a value.
Added the FeatureReason table too
You can do this, but why? What is your reasoning for collating these two distinct items in a single table?
I would suggest having two separate tables, OrderFeatures and OrderItemFeatures, and on those occasions that you need to query both, collate them with a union query.
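A small sketch of the two-table layout with a union query, run through Python's sqlite3 module. The table names OrderFeatures and OrderItemFeatures come from the suggestion above; the exact columns, the sample rows, and the Source label column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE "Order" (Id INTEGER PRIMARY KEY);
CREATE TABLE OrderItem (
    Id        INTEGER PRIMARY KEY,
    FkOrderId INTEGER NOT NULL REFERENCES "Order"(Id)
);
-- Each feature table has exactly one mandatory FK, so no nullable FK columns are needed.
CREATE TABLE OrderFeatures (
    Id        INTEGER PRIMARY KEY,
    FkOrderId INTEGER NOT NULL REFERENCES "Order"(Id),
    Text      TEXT
);
CREATE TABLE OrderItemFeatures (
    Id            INTEGER PRIMARY KEY,
    FkOrderItemId INTEGER NOT NULL REFERENCES OrderItem(Id),
    Text          TEXT
);
INSERT INTO "Order" (Id) VALUES (1);
INSERT INTO OrderItem (Id, FkOrderId) VALUES (10, 1);
INSERT INTO OrderFeatures (FkOrderId, Text) VALUES (1, 'gift wrap');
INSERT INTO OrderItemFeatures (FkOrderItemId, Text) VALUES (10, 'engraving');
""")

# Collate both kinds of feature only when a combined result is actually needed.
all_features = conn.execute("""
SELECT 'Order' AS Source, FkOrderId AS ParentId, Text FROM OrderFeatures
UNION ALL
SELECT 'OrderItem' AS Source, FkOrderItemId AS ParentId, Text FROM OrderItemFeatures
ORDER BY Source
""").fetchall()
```

The same SELECT could be wrapped in a view if the combined result is queried often.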
It is possible to have 2 foreign keys in one table. As long as each foreign key maps to the primary key of another table, it's OK.
By not populating FkOrderItemId or FkOrderId, will you not be violating one or other of the FK constraints?
You can populate FkOrderItemId or FkOrderId according to your needs, I'm just not sure about defining an FK where it is not mandatory to supply a FK value.
Just a thought...
I'm thinking of adding a relationship table to a database and I'd like to include a sort of reverse relation functionality by using a FK pointing to a PK within the same table. For example, Say I have table RELATIONSHIP with the following:
ID (PK) Relation ReverseID (FK)
1 Parent 2
2 Child 1
3 Grandparent 4
4 Grandchild 3
5 Sibling 5
First, is this even possible? Second, is this a good way to go about this? If not, what are your suggestions?
1) It is possible.
2) It may not be as desirable in your case as you might want. You have cycles, as opposed to an acyclic structure, so with the FK in place you cannot insert any of those rows as they are. One possibility is to allow NULLs in the ReverseID column in the table DDL, INSERT all the rows with a NULL ReverseID, and then run an UPDATE to set the ReverseID values, which will now have valid rows to reference. Another possibility is to disable the foreign key, or not create it until the data is in a completely valid state, and then apply it.
3) You would have to do an operation like this almost every time, and if EVERY relationship has an inverse you either wouldn't be able to enforce NOT NULL in the schema or you would regularly be disabling and re-enabling constraints.
4) The sibling situation is the same.
I would be fine using the design if this is controlled in some way and you understand the implications.
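The insert-with-NULL-then-UPDATE workaround from point 2 can be sketched with Python's sqlite3 module. The table and column names follow the question; the insert_direct helper is made up for the demo, and the behaviour shown is SQLite's (other DBMSs may check constraints at different times or support deferred constraints).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE RELATIONSHIP (
    ID        INTEGER PRIMARY KEY,
    Relation  TEXT NOT NULL,
    ReverseID INTEGER REFERENCES RELATIONSHIP(ID)   -- nullable on purpose
)
""")


def insert_direct() -> bool:
    """Try to insert 'Parent' pointing at row 2 before row 2 exists."""
    try:
        conn.execute("INSERT INTO RELATIONSHIP VALUES (1, 'Parent', 2)")
        return True
    except sqlite3.IntegrityError:
        return False


direct_ok = insert_direct()

pairs = [(1, "Parent", 2), (2, "Child", 1), (3, "Grandparent", 4),
         (4, "Grandchild", 3), (5, "Sibling", 5)]

# Step 1: insert every row with a NULL ReverseID.
conn.executemany(
    "INSERT INTO RELATIONSHIP (ID, Relation, ReverseID) VALUES (?, ?, NULL)",
    [(row_id, relation) for row_id, relation, _ in pairs],
)
# Step 2: fill in ReverseID now that every referenced row exists.
conn.executemany(
    "UPDATE RELATIONSHIP SET ReverseID = ? WHERE ID = ?",
    [(reverse_id, row_id) for row_id, _, reverse_id in pairs],
)

rows = conn.execute(
    "SELECT ID, Relation, ReverseID FROM RELATIONSHIP ORDER BY ID"
).fetchall()
```

The cost is visible in the DDL: ReverseID has to stay nullable, so the schema cannot guarantee that every relation has an inverse.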
I have a table which has employee relationship defined within itself.
i.e.
EmpID Name SeniorId
-----------------------
1 A NULL
2 B 1
3 C 1
4 D 3
and so on...
where SeniorId is a foreign key that references the EmpID column of the same table.
I want to clear all rows from this table without removing any constraint.
The deletion needs to be performed in this order:
4, 3, 2, 1
How can I do this?
EDIT:
Jhonny's answer is working for me, but which of the answers is more efficient?
I don't know if I am missing something, but maybe you can try this.
UPDATE employee SET SeniorID = NULL
DELETE FROM employee
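As a quick check of this two-statement approach, here is a sketch against an in-memory SQLite database via Python's sqlite3 module. The schema mirrors the question's table; the delete_root_first helper is made up for the demo, and the results shown are SQLite's behaviour rather than a guarantee for every DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE employee (
    EmpID    INTEGER PRIMARY KEY,
    Name     TEXT,
    SeniorID INTEGER REFERENCES employee(EmpID)
)
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "A", None), (2, "B", 1), (3, "C", 1), (4, "D", 3)],
)


def delete_root_first() -> bool:
    """Deleting employee 1 while 2 and 3 still reference it should be refused."""
    try:
        conn.execute("DELETE FROM employee WHERE EmpID = 1")
        return True
    except sqlite3.IntegrityError:
        return False


root_deleted = delete_root_first()

# Break every self-reference first, then nothing blocks the delete.
conn.execute("UPDATE employee SET SeniorID = NULL")
conn.execute("DELETE FROM employee")
remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The constraint itself is never dropped or disabled; only the data that would violate it is cleared first.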
If the table is very large (cardinality of millions), and there is no need to log the DELETE transactions, dropping the constraint and TRUNCATEing and recreating constraints is by far the most efficient way. Also, if there are foreign keys in other tables (and in this particular table design it would seem to be so), those rows will all have to be deleted first in all cases, as well.
Normalization says nothing about recursive/hierarchical/tree relationships, so I believe that is a red herring in your reply to DVK's suggestion to split this into its own table - it certainly is viable to make a vertical partition of this table already and also to consider whether you can take advantage of that to get any of the other benefits I list below. As DVK alludes to, in this particular design, I have often seen a separate link table to record self-relationships and other kinds of relationships. This has numerous benefits:
- have many-to-many up AND down instead of many-to-one (uncommon, but potentially useful)
- track different types of direct relationships - manager, mentor, assistant, payroll approver, expense approver, technical report-to - with rows in the relationship and relationship type tables instead of new columns in the employee table
- track changing hierarchies in a temporally consistent way (including terminated employee hierarchy history) by including active indicators and effective dates on the relationship rows - this is only fully possible when normalizing the relationship into its own table
- no NULLs in the SeniorID (actually on either ID) - this is a distinct advantage in avoiding bad logic, but NULLs will usually appear in views when you have to left join to the relationship table anyway
- a better dedicated indexing strategy - as opposed to adding SeniorID to selected indexes you already have on Employee (especially as the number of relationship types grows)
And of course, the more information you relate to this relationship, the more strongly is indicated that the relationship itself merits a table (i.e. it is a "relation" in the true sense of the word as used in relational databases - related data is stored in a relation or table - related to a primary key), and thus a normal form for relationships might strongly indicate that the relationship table be created instead of a simple foreign key relationship in the employee table.
Benefits also include its straightforward delete scenario:
DELETE FROM EmployeeRelationships;
DELETE FROM Employee;
You'll note a striking equivalence to the accepted answer here on SO, since, in your case, employees with no senior relationship have a NULL - so in that answer the poster set all to NULL first to eliminate relationships and then remove the employees.
There is a possibly appropriate usage of TRUNCATE depending upon constraints (EmployeeRelationships is typically able to be TRUNCATEd since its primary key is usually a composite and not a foreign key in any other table).
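To show what the separate relationship table might look like, here is a sketch using Python's sqlite3 module. The table names Employee and EmployeeRelationships come from the answer above; the column names, the relationship types, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    EmpID INTEGER PRIMARY KEY,
    Name  TEXT NOT NULL
);
-- One row per relationship; no NULLs, and new relationship types need no new columns.
CREATE TABLE EmployeeRelationships (
    EmpID            INTEGER NOT NULL REFERENCES Employee(EmpID),
    RelatedEmpID     INTEGER NOT NULL REFERENCES Employee(EmpID),
    RelationshipType TEXT    NOT NULL,
    PRIMARY KEY (EmpID, RelatedEmpID, RelationshipType)
);
INSERT INTO Employee VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO EmployeeRelationships VALUES
    (2, 1, 'senior'), (3, 1, 'senior'), (4, 3, 'senior'), (4, 2, 'mentor');
""")


def employees_deletable() -> bool:
    """Employees cannot be removed while relationship rows still point at them."""
    try:
        conn.execute("DELETE FROM Employee")
        return True
    except sqlite3.IntegrityError:
        return False


deletable_before = employees_deletable()

# The delete scenario from the answer: relationships first, then employees.
conn.execute("DELETE FROM EmployeeRelationships")
conn.execute("DELETE FROM Employee")
employees_left = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```

Effective dates or active indicators, as mentioned in the list of benefits, would simply be extra columns on EmployeeRelationships.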
Try this
DELETE FROM employee;
Inside a loop, run a command that deletes all rows with an unreferenced EmpID until there are zero rows left. There are a variety of ways to write that inner DELETE command:
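-- Exclude NULLs in the subquery: NOT IN against a set that contains NULL matches no rows.
DELETE FROM employee WHERE EmpID NOT IN (SELECT SeniorID FROM employee WHERE SeniorID IS NOT NULL)
DELETE FROM employee e1 WHERE NOT EXISTS
(SELECT * FROM employee e2 WHERE e2.SeniorID = e1.EmpID)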
and probably a third one using a JOIN, but I'm not familiar with the SQL Server syntax for that.
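The loop described above can be sketched in Python with sqlite3, repeatedly deleting employees that nobody references as a senior. The schema mirrors the question; the function name and the stop-when-nothing-was-deleted guard are additions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE employee (
    EmpID    INTEGER PRIMARY KEY,
    Name     TEXT,
    SeniorID INTEGER REFERENCES employee(EmpID)
)
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "A", None), (2, "B", 1), (3, "C", 1), (4, "D", 3)],
)


def clear_leaf_first(connection: sqlite3.Connection) -> int:
    """Delete unreferenced employees pass by pass; returns the number of passes run."""
    passes = 0
    while connection.execute("SELECT COUNT(*) FROM employee").fetchone()[0] > 0:
        cursor = connection.execute("""
            DELETE FROM employee
            WHERE NOT EXISTS (
                SELECT 1 FROM employee AS e2
                WHERE e2.SeniorID = employee.EmpID
            )
        """)
        passes += 1
        if cursor.rowcount == 0:
            # Only rows that reference each other in a cycle are left; stop instead of spinning.
            break
    return passes


passes = clear_leaf_first(conn)
remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Each pass removes the current "leaves" of the hierarchy, so the number of passes grows with the depth of the tree rather than the number of rows.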
One solution is to normalize this by splitting out "senior" relationship into a separate table. For the sake of generality, make that second table "empID1|empID2|relationship_type".
Barring that, you need to do this in a loop. One way is to do it:
declare @count int
select @count = count(1) from employee
while (@count > 0)
BEGIN
    -- delete every employee who is not referenced as anyone's senior
    delete employee WHERE NOT EXISTS
        (select 1 from employee e_senior
         where employee.EmpID = e_senior.SeniorID)
    select @count = count(1) from employee
END