I am creating an ERD, and one of my subclasses has a different PK than its superclass. Is it OK to do so?
I have a superclass Accounts, which has Username as its PK, and then BannedAccounts, which has AccountsUsername (a foreign key) and BanDate as its PK.
The reason I did so is because the same account can be banned multiple times.
Is it correct?
Here is an image of the diagram:
https://prnt.sc/vfb56o
Yes, that is fine, and it is common to do this. The table BannedAccounts could be called a fact table since it forms a time series of data points for an account. On the other hand, the Accounts table is a dimension table because it is used to categorize fact data.
Another way to talk about it is to categorize entities as weak or strong. Accounts would probably be a strong entity because it does not depend on other entities to exist. In contrast, BannedAccounts is a weak entity because its existence is dependent on Accounts; it doesn't make sense to talk about which accounts are banned if there is no definition of an account.
To define the relationship, BannedAccounts is a child and Accounts is a parent because BannedAccounts references the primary key of Accounts. The relationship can be further classified as a strong (or identifying) relationship because the primary key of BannedAccounts contains the entire primary key of Accounts. If the key for Accounts were composite (consisting of more than one column), and BannedAccounts did not include all of those columns in its primary key, then the relationship would be called weak or non-identifying.
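As a rough sketch of the schema described above (assuming SQLite driven from Python's sqlite3 module, with dates stored as text; the sample usernames and dates are made up), the composite primary key lets one account be banned on several dates, while the foreign key still ties every ban to an existing account:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Accounts (
    Username TEXT PRIMARY KEY
);
CREATE TABLE BannedAccounts (
    AccountsUsername TEXT NOT NULL REFERENCES Accounts (Username),
    BanDate          TEXT NOT NULL,
    PRIMARY KEY (AccountsUsername, BanDate)  -- contains the whole parent key
);
""")

conn.execute("INSERT INTO Accounts VALUES ('alice')")
# The same account banned twice, on different dates.
conn.execute("INSERT INTO BannedAccounts VALUES ('alice', '2020-01-05')")
conn.execute("INSERT INTO BannedAccounts VALUES ('alice', '2020-11-09')")

ban_count = conn.execute(
    "SELECT COUNT(*) FROM BannedAccounts WHERE AccountsUsername = 'alice'"
).fetchone()[0]


def ban_is_rejected(username, ban_date):
    """Return True if the database refuses to store this ban."""
    try:
        conn.execute("INSERT INTO BannedAccounts VALUES (?, ?)", (username, ban_date))
    except sqlite3.IntegrityError:
        return True
    return False


orphan_rejected = ban_is_rejected("bob", "2020-11-09")       # no such account
duplicate_rejected = ban_is_rejected("alice", "2020-11-09")  # same account, same date
```

Both rejected inserts fail for different reasons: the first violates the foreign key, the second violates the composite primary key.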
Related
I am working through a Pluralsight course that builds an MVC application using an Entity Framework code-first approach. I was confused about the database schema used for the project.
As you can see, the relationship between Securities and its related tables seems to be one-to-one, but the confusion comes when I realize there is no foreign key to relate the two sub-tables and they appear to share the same primary key column.
The previous video made the Securities model class abstract so that the "Stock" and "MutualFund" model classes could inherit from it and contain all the related data. To me, however, it seems that the same thing could be done using a couple of foreign keys.
I guess my question is: does this method of linking tables serve any useful purpose in SQL or EF? It seems to me that in order to create a new record in one table, all tables would need a new record, which is where I really get confused.
In ORM and EF terminology, this setup is referred to as the "Table per Type" inheritance paradigm, where there is a table per subclass, a base class table, and the primary key is shared between the subclasses and the base class.
e.g. In this case, Securities_Stock and Securities_MutualFund are two subclasses of the Securities base class / table (possibly abstract).
The relationship will be 0..1 (subclass) to 1 (base class) - i.e. only one of the records in Securities_MutualFund or Securities_Stock will exist for each base table Securities row.
There's also often a discriminator column on the base table to indicate which subclass table to join to, but that doesn't seem to be the case here.
It is also common to enforce referential integrity from the subclass tables to the base table with a foreign key.
To answer your question, the reason why there's no FK between the two subclass instance tables is because each instance (with a unique Id) will only ever be in ONE of the sub class tables - it is NOT possible for the same Security to be both a mutual fund and a share.
You are right: in order for a new concrete Security record to be added, a row is needed in the base Securities table (it must be inserted first, as there are FKs from the subclass tables to the base table), and then a row is inserted into one of the subclass tables with the rest of the 'specific' data.
If a foreign key were added between Stock and MutualFund, it would be impossible to insert new rows into the tables.
The full pattern often looks like this:
CREATE TABLE BaseTable
(
    Id INT PRIMARY KEY, -- Can also be Identity
    -- ... Common columns here
    Discriminator CHAR(1) NOT NULL -- Type usually has a small range, so `INT` or `CHAR` are common
);
CREATE TABLE SubClassTable
(
    Id INT PRIMARY KEY, -- Not identity, must be manually inserted
    -- Specialized SubClass columns here
    FOREIGN KEY (Id) REFERENCES BaseTable(Id)
);
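To make the insert order concrete, here is a small sketch of the pattern, assuming SQLite via Python's sqlite3 module. The Symbol and Sector columns, the 'S'/'M' discriminator values, and the add_stock helper are hypothetical, chosen only for illustration; they are not taken from the course schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Securities (
    Id            INTEGER PRIMARY KEY,
    Symbol        TEXT NOT NULL,                                   -- hypothetical common column
    Discriminator TEXT NOT NULL CHECK (Discriminator IN ('S', 'M'))
);
CREATE TABLE Securities_Stock (
    Id     INTEGER PRIMARY KEY REFERENCES Securities (Id),         -- shared key, also the FK
    Sector TEXT                                                    -- hypothetical subclass column
);
""")


def add_stock(security_id, symbol, sector):
    """Insert the base row first, then the subclass row, in one transaction."""
    with conn:
        conn.execute("INSERT INTO Securities VALUES (?, ?, 'S')", (security_id, symbol))
        conn.execute("INSERT INTO Securities_Stock VALUES (?, ?)", (security_id, sector))


add_stock(1, "ACME", "Industrials")

# A subclass row without a matching base row is refused by the foreign key.
try:
    conn.execute("INSERT INTO Securities_Stock VALUES (99, 'Energy')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Reading a concrete stock back means joining on the shared Id.
joined = conn.execute("""
    SELECT s.Symbol, st.Sector
    FROM Securities s
    JOIN Securities_Stock st ON st.Id = s.Id
""").fetchall()
```

Wrapping both inserts in one transaction means a failure in the second insert also undoes the base row, so no Securities row is left without its subclass data.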
I have the following problem: there are multiple scenarios that might be right or wrong. I've been searching for a while and haven't found a specific answer:
Doctor Clinic Example:
We have doctor, patient, treatment, treatment-type
Doctor: id, name....
Patient: id, name...
Treatment: date, cost
Treatment-Type: id, name
A Doctor can perform multiple treatments, and a Patient can receive multiple treatments, so each is connected to Treatment by a 1-N relationship.
Treatment entity is a weak entity, as it cannot be defined in the absence of Doctor or Patient, so my question is, when we convert this ERD to actual tables, which is the correct (or the best-practice) scenario?
1 - doctor-id, patient-id cannot define the Treatment table uniquely, so we add to Treatment table the treatment-id field, and the PK is (doctor-id, patient-id, treatment-id).
2 - We add treatment-id field, and the PK is (treatment-id).
3 - The PK will be (doctor-id, patient-id, date).
I struggled to find out whether 'date' can be part of a PK, and also whether I can create a unique ID for a weak entity.
Thanks in advance.
Weak entity sets are entity sets that are partially identified by a parent entity set's primary key. A weak entity set necessarily depends on its parent entity set for existence (we say it participates totally in its identifying relationship), but not everything with an existence dependency is a weak entity set. Regular entity sets can also participate totally in one or more relationships. So, it depends on how you identify an entity set. See also my answer to the question "is optionality (mandatory, optional) and participation (total, partial) are same?"
An entity set that is uniquely identified by its own attributes is a regular entity set. An entity set that is partially identified by a parent entity set's primary key is a weak entity set. An entity set that is fully identified by a parent entity set's primary key is a subtype.
You should also note that weak entity sets can only have one parent entity set according to the entity-relationship model as Chen described it. Being identified by multiple parent entity sets would make it a relationship rather than an entity set.
In some schema design tools, a different interpretation is used where tables are equated to entity sets and relationships equated to FK constraints, and an identifying relationship would be an FK that is part of the PK of a table. This approach is closer to the network data model than the entity-relationship model, despite having adopted much of ER's terminology.
Let's take a look at your examples:
In example 1, we should consider whether treatment-id is identifying on its own (i.e. a surrogate key) or only in combination with doctor-id and patient-id (i.e. an ordinal number). If it's a surrogate key, it would be a mistake to include doctor-id and patient-id in the PK; example 2 would be the right way of handling it. If it's an ordinal number, then it's basically the same as example 3: two foreign entity keys and a value set in a primary key. I'll say more about that in my comments on example 3.
In example 2, treatment-id is a surrogate key which means Treatment is a regular entity set which participates totally in its relationships with Patient and Doctor. This would be my recommended solution, since it's the simplest.
In example 3, you have a primary key consisting of two foreign entity keys and a value set.
The entity-relationship model doesn't cover such relations: relations with a single entity key are called entity relations, and relations with multiple entity keys are called relationship relations. Value sets are only described as the codomains of attributes, not the domains. The ER model's inability to handle arbitrary relations is a consequence of artificial distinctions between entity sets and value sets, and between attributes and relationships. Other data modeling disciplines like the relational model and object-role modeling are complete and can handle any kind of relation.
Back to example 3, despite the ER model's shortcomings, it's not invalid to create such a table/relation in an actual database. However, think about what the primary key means - can a patient receive only one treatment per day from the same doctor? I would think multiple treatments should be possible, in which case you might need to add another ordinal number, e.g. (doctor-id, patient-id, date, treatment-id). In that case, it might be simpler just to do (doctor-id, patient-id, treatment-id).
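To see what the choice of key means in practice, here is a rough comparison of examples 2 and 3, assuming SQLite via Python's sqlite3 module. The table name TreatmentByDate, the column name treatment_date, and the sample rows are my own illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Doctor  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Patient (id INTEGER PRIMARY KEY, name TEXT);

-- Example 3: two foreign keys plus the date form the primary key.
CREATE TABLE TreatmentByDate (
    doctor_id      INTEGER NOT NULL REFERENCES Doctor (id),
    patient_id     INTEGER NOT NULL REFERENCES Patient (id),
    treatment_date TEXT    NOT NULL,
    cost           REAL,
    PRIMARY KEY (doctor_id, patient_id, treatment_date)
);

-- Example 2: a surrogate key identifies each treatment on its own.
CREATE TABLE Treatment (
    treatment_id   INTEGER PRIMARY KEY,
    doctor_id      INTEGER NOT NULL REFERENCES Doctor (id),
    patient_id     INTEGER NOT NULL REFERENCES Patient (id),
    treatment_date TEXT    NOT NULL,
    cost           REAL
);

INSERT INTO Doctor  VALUES (1, 'Dr. A');
INSERT INTO Patient VALUES (1, 'P. B');
""")


def count_same_day_inserts(table):
    """Try to record two treatments for the same doctor, patient and day."""
    accepted = 0
    for cost in (50.0, 80.0):
        try:
            conn.execute(
                f"INSERT INTO {table} (doctor_id, patient_id, treatment_date, cost) "
                "VALUES (1, 1, '2021-03-01', ?)",
                (cost,),
            )
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # the primary key rejected the second same-day row
    return accepted


composite_accepted = count_same_day_inserts("TreatmentByDate")
surrogate_accepted = count_same_day_inserts("Treatment")
```

With the composite key only the first same-day row is stored; with the surrogate key both are. Whether that restriction is wanted depends on the business rule.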
One argument against such composite/natural keys is that they add up - a many-to-many association between two relations, each with 3 columns in their primary keys, could have up to 6 columns in its primary key! That gets inconvenient quickly, but on the other hand, those columns are relevant related info that would otherwise need to be retrieved from joined tables if the association was identified by a surrogate key.
Sorry about the long answer, but I hope this covers all the fine points. Let me know if you have any questions.
This is a bird-watcher database example. Say you have three entities: BirdSpecies, Location and Observer. To have an entity Observation, you need all three of these. Without them there is no observation.
My understanding is that the requirement above makes Observation a weak entity. But what if the same person can spot the same species on the same location several times? Then the entry won't be unique.
My question is therefore, can you have a primary key for Observation that is just a number, sequentially increasing for each observation, and the entity still being a weak entity?
I think that the weakness of the new entity is determined by its relations, no matter what its primary key is.
To understand this, imagine that instead of having a sequentially increasing number, you have a date-time, unique to each observation. This doesn't change the fact that if you remove one of the three entities there is no observation.
Weak entities are identified by a single parent entity's primary key and another attribute. Weak entities are typically parts of a whole. Observation (without introducing a surrogate key) is a ternary relationship, not a weak entity.
To record multiple observations by the same person of the same species in the same location, I would include a date/time value in the Observation relation and primary key, or alternatively a non-prime count column to record the number of observations. Remember that relations cannot have duplicate entries, so it's not uniqueness that is at risk without a distinguishing column, but your ability to record multiple entries. SQL DBMSs, however, aren't properly relational and will allow you to shoot yourself in the foot.
Once you introduce a surrogate key, you reify the relationship into an associative entity. Entities identified by a surrogate key are always strong entities, since they're identified by their own attributes. A surrogate key allows you to record otherwise duplicate entries, which is why surrogate keys are often complemented with unique keys on other attributes.
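A minimal sketch of a surrogate key complemented by a unique key, assuming SQLite via Python's sqlite3 module. The column names and sample values are hypothetical, and the three parent tables and their foreign keys are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Observation (
    observation_id INTEGER PRIMARY KEY,   -- surrogate key
    observer_id    INTEGER NOT NULL,
    species_id     INTEGER NOT NULL,
    location_id    INTEGER NOT NULL,
    observed_at    TEXT    NOT NULL,
    -- Unique key on the natural attributes, so the surrogate key
    -- cannot be used to store the same observation twice.
    UNIQUE (observer_id, species_id, location_id, observed_at)
);
""")


def record(observer_id, species_id, location_id, observed_at):
    """Return True if the observation was stored, False if it was a duplicate."""
    try:
        conn.execute(
            "INSERT INTO Observation (observer_id, species_id, location_id, observed_at) "
            "VALUES (?, ?, ?, ?)",
            (observer_id, species_id, location_id, observed_at),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    record(1, 7, 3, "2021-05-01T08:00"),  # first sighting
    record(1, 7, 3, "2021-05-01T09:30"),  # same trio, later time: allowed
    record(1, 7, 3, "2021-05-01T09:30"),  # exact repeat: rejected by the unique key
]
```

Without the UNIQUE constraint, the third insert would succeed and the table would hold two indistinguishable observations.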
Let's say you have two entities named Parent and Child.
The Child entity is DEPENDENT on the Parent entity.
A weak key of child entity is the NAMEOFCHILD.
Is it possible for the Parent entity to have NAMEOFCHILD as a foreign key?
This idea has not been talked about in class. I was wondering whether this is possible in SQL.
If so, should I just add
FOREIGN KEY (NAMEOFCHILD) REFERENCES CHILD (NAMEOFCHILD)
in my table?
In the database schema, yes (if Child.NAMEOFCHILD has a unique index). In Entity Framework, no. EF doesn't support associations to unique indexes (yet). But this is just on the technical level. Whether it's meaningful is another question.
Also, beware of painting yourself into a corner. When both foreign keys are not nullable you'd never be able to insert data, because you can't insert two records at a time and sequential inserts always cause foreign key violations. You would be able to design the database schema but never get any data in.
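The corner can be reproduced with a short sketch, assuming SQLite via Python's sqlite3 module with its default immediate foreign key checking. The Id and ParentId columns and the sample name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Parent (
    Id          INTEGER PRIMARY KEY,
    NameOfChild TEXT NOT NULL,
    FOREIGN KEY (NameOfChild) REFERENCES Child (NameOfChild)
);
CREATE TABLE Child (
    ParentId    INTEGER NOT NULL REFERENCES Parent (Id),
    NameOfChild TEXT    NOT NULL UNIQUE,   -- unique, so Parent may reference it
    PRIMARY KEY (ParentId, NameOfChild)
);
""")


def insert_succeeds(sql):
    """Return True if the statement is accepted, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Whichever row goes first, the row it must reference does not exist yet.
parent_first_ok = insert_succeeds("INSERT INTO Parent VALUES (1, 'Tim')")
child_first_ok = insert_succeeds("INSERT INTO Child VALUES (1, 'Tim')")
```

Both inserts are refused, so the schema is valid but both tables stay empty.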
So as you can see I have an Identifying 1 to many relationship in the tables above.
If I were to change this relationship to an identifying 1 to 1 relationship, then the auto_leads table would still contain the two primary key columns from its parent leads table. In other words, nothing would change.
Does an identifying relationship have any meaning in the context of relational models? It doesn't appear to change its effect with respect to relationships.
Identifying relationship is an ER-modelling concept which arises because ER modelling assumes there is some semantic significance to having a primary key for each entity. Primary keys have no special role in relational database design and therefore the concept of an identifying relationship is usually of no great importance.
Consider the example of a table with two candidate keys, A and B. A is also a foreign key. According to ER-modelling convention, if A is chosen as the primary key then the foreign key relationship is an identifying one. If A is an alternate key then the relationship is deemed to be non-identifying. Yet the form, function, integrity constraints and presumably the business meaning are exactly the same in both cases. The concept of identifying relationships is only as important as you want it to be.
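As a small illustration of that point (a sketch assuming SQLite via Python's sqlite3 module; the table names ChildPkA and ChildPkB and the sample rows are made up), the two variants below differ only in which candidate key is labelled PRIMARY KEY, and they accept and reject the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Parent (A INTEGER PRIMARY KEY);
INSERT INTO Parent VALUES (1);

-- "Identifying": the foreign key column A is the primary key.
CREATE TABLE ChildPkA (
    A INTEGER NOT NULL PRIMARY KEY REFERENCES Parent (A),
    B TEXT    NOT NULL UNIQUE
);

-- "Non-identifying": A is only an alternate key, B is the primary key.
CREATE TABLE ChildPkB (
    A INTEGER NOT NULL UNIQUE REFERENCES Parent (A),
    B TEXT    NOT NULL PRIMARY KEY
);
""")

attempts = [
    (1, "x"),  # valid row
    (1, "y"),  # repeats candidate key A
    (2, "z"),  # references a Parent row that does not exist
]


def try_all(table):
    """Run the same inserts against a table and report which were accepted."""
    accepted = []
    for a, b in attempts:
        try:
            conn.execute(f"INSERT INTO {table} (A, B) VALUES (?, ?)", (a, b))
            accepted.append(True)
        except sqlite3.IntegrityError:
            accepted.append(False)
    return accepted


outcomes = {table: try_all(table) for table in ("ChildPkA", "ChildPkB")}
```

In both tables only the first row is stored, since uniqueness of A and B and the reference to Parent are enforced either way.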