I'm reading Pro JPA 2. The book talks begins by talking about ORM in the first few pages.
It talks about mapping a single Java class named Employee with the following instance variables - id,name,startDate, salary.
It then goes on to the issue of how this class can be represented in a relational database and suggests the following scheme.
table A: emp
id - primary key
startDate
table B: emp_sal
id - primary key in this table, which is also a foreign key referencing the 'id' column in table A.
It thus seems to suggest that persisting an Employee instance to the database would require operations on two(multiple) tables.
Should the Employee class have an instance variable 'salary' in the first place?
I think it should possibly belong to a separate class (Class salary maybe?) representing salary and thus the example doesn't seem very intuitive.
What am I missing here?
First, the author explains that there are multiples ways to represent a class in a database: sometimes the mapping of a class to a table is straightforward, sometimes you don't have a direct correspondence between attributes and columns, sometimes a single class is represented by multiples tables:
In scenario (C), the EMP table has
been split so that the salary
information is stored in a separate
EMP_SAL table. This allows the
database administrator to restrict
SELECT access on salary information to
those users who genuinely require it.
With such a mapping, even a single
store operation for the Employee class
now requires inserts or updates to two
different tables.
So even storing the data from a single class in a database can be a challenging exercise.
Then, he describes how relationships are different. At the object level model, you traverse objects via their relations. At the relational model level, you use foreign keys and joins (sometimes via a join table that doesn't even exist at the object model level).
Inheritance is another "problem" and can be "simulated" in various ways at the relational model level: you can map an entire hierarchy into a single table, you can map each concrete class to its own table, you can map each class to its own table.
In other words, there is no direct and unique correspondence between an object model and a relational model. Both rely on different paradigms and the fit is not perfect. The difference between both is known as the impedance mismatch, which is something ORM have to deal with (allowing the mapping between an object model and the many possible representations in a relation model). And this is what the whole section you're reading is about. This is also what you missed :)
Related
For a school project, we have to create our own database. I decided to create a database to manage my electronic component inventory. As a requirement, we needed to create an ER diagram, then from that diagram derive the database schema. Unfortunately for me, the professor believes that the diagram I created can be simplified and the "Part" entity is unnecessary.
This is the diagram I came up with, and here is the derived schema.
If I remove the Part entity, then in order for a Circuit entity to "use" any number of any part, and have each part associated with possibly any circuit, I would have to have a separate M-to-N relationship from each component type to Circuit. Each of those relationships would generate a new table. This would definitely go over the strict maximum number of tables we are allowed for the project.
If the professor specifically mentioned Part was unnecessary, then there must be some way to remove it that results in a simpler ER diagram and schema - but I can't see what it is.
Maybe you guys can see what it is and give me a hint?
EDIT:
Dan W had a great suggestion. I could eliminate the Part by giving each part type (Capacitor, Resistor, etc.) their own keys. Then inside of uses part, include foreign keys to those components. I would have to assume that each entry of the table would only be associated with a single part, the rest being null. Here's the resulting schema. This schema should work well. But now I have to figure out exactly what modifications to the ER diagram would correspond to this schema.
EDIT2:
I've come to the conclusion that the relationship I'm looking for is n-ary. According to several sources, to convert from the n-ary to a schema you include the primary key of each participating entity type's relation as foreign key. Then add the simple attributes. This is what I came up with.
You have a strict maximum number of tables (physical design) but are you restricted in your ER diagram to that number of entities (logical design)? All of your entities for parts - resistors, transistors, capacitors, and General IC - could be stored in one parts table with all the attributes of Part, resistors, transistors, capacitors and General IC as nullable columns. If an attribute is valid for all types then it is not nullable. Include another column in the parts table which identifies the type of part (resistor, transistor, capacitor or IC) although you already have a type column in all the entities which might also serve for this.
The Parts table in your schema is now:
PartID (PK)
Quantity
Drawer
Part Type
Value
Tolerance
Subtype
Power Rating
Voltage
Term_Style
Diam
Height
Lead_Space
Name
Case
Polarity
Use
V_CE
P_D
I_C
H_FE
Package
Pins
Description
and you drop the Resistor, Capacitor, Transistor and General IC tables in your schema. Leave those entities in your ER diagram because that shows which attributes in the Parts table is required (shouldn't be null) for each part type.
I often see "linking" tables for M:N relationship, where N can be 1..X types of entities/classes, so that the table contains classNameId referring to ClassName table and classPK referring to the particular Entity table.
How is this called ? Does it have an alternative with the same effect without having the EntityName table ?
In the ER model, entities and subentities can be related by inheritance, the same way classes and subclasses are in an object model. The problem comes up when you transform your ER model into a relational model. The relational model does not support inheritance as such.
The design pattern is called is called generalization-specialization or gen-spec for short. Unfortunately, many database tutorials skip over how to design tables for a gen-spec situation.
But it's well understood.It looks quite different from your model, but you could create views that make it look like your model if necessary. Look up "generalization specialization relational modeling" for explanations of how to do this.
The main trick is that the specialized tables "inherit" the value of their primary key from the PK of the generalized table. The meaning of "inherit" here is that it's a copy of the same value. Thus the PK in each specialized table is also an FK link back to the correpsonding entry in the generalized table.
Coming from question “Relation” versus “relationship”
What are definitions of "relation" vs. "relationship" in RDBMS (or database theory)?
Update:
I was somewhat perplexed by comment to my question:
"relation is a synonym for table, and
thus has a very precise meaning in
terms of the schema stored in the
computer"
Update2:
Had I answered incorrectly that question , in terms of RDBMS, having written that relation is one-side direction singular connection-dependence-link,
i.e. from one table to another while relationship implies (not necessarily explicitly) more than one link connection in one direction (from one table to another)?
A RELATION is a subset of the cartesian product of a set of domains (http://mathworld.wolfram.com/Relation.html). In everyday terms a relation (or more specifically a relation variable) is the data structure that most people refer to as a table (although tables in SQL do not necessarily qualify as relations).
Relations are the basis of the relational database model.
Relationships are something different. A relationship is a semantic "association among things".
Relation is a mathematical term referring to a concept from set theory. Basically, in RDBMS world, the "relational" aspect is that data is organized into tables which reflect the fact that each row (tuple) is related to all the others. They are all the same type of info.
But then, your have ER (Entity Relationship) which is a modeling methodology in which you identify objects and their relationships in the real world. Then each object is modelled as a table, and each relationship is modelled as a table that contains only foreign keys.
For instance, if you have 3 entities: Teacher, Student, Class; then you might also create a couple of tables to record these 2 relationships: TaughtBy and StudyingIn. The TaughtBy table would have a record with a Teacher ID and a Class ID to record that this class is taught by this teacher. And the StudyingIn table would have a Student ID and a Class ID to reflect that the student is taking this class.
That way, each student can be in many Classes, and each Teacher can be in many classes without needing to have a field which contains a list of class ids in any records. SQL cannot deal with field containing a list of things.
A relation is a table with columns and rows.
and
relationship is association between relations/tables
for example employee table has relation in branch its called relationship between employee table and branch table
I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure I think my client may reject the proposal.. although if its the only way to get it right this is what I will have to do.
I have an Item table that can me link to any number of tables, and these tables may increase over time. The Item can only me linked to one other table, but the record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Example of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration"
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My intital thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is, that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with Linq to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building, Office. For the lack of a better term I have used Entity. Then implement super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and a FK in all sub-type tables. Now, you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
your link table is ok.
the trouble you will have is that you will need to generate dynamic sql at runtime. parameterized sql does not typically allow the objects inthe FROM list to be parameters.
i fyou want to avoid this, you may be able to denormalize a little - say by creating a table to hold the id (assuming the ids are unique across the other tables) and the type_id representing which table is the source, and a generated description - e.g. the name value from the inital record.
you would trigger the creation of this denormalized list when the base info is modified, and you could use that for generalized queries - and then resort to your dynamic queries when needed at runtime.
I hear a lot about subtyping tables when designing a database, and I'm fully aware of the theory behind them. However, I have never actually seen table subtyping in action. How can you create subtypes of tables? I am using MS Access, and I'm looking for a way of doing it in SQL as well as through the GUI (Access 2003).
Cheers!
An easy example would be to have a Person table with a primary key and some columns in that table. Now you can create another table called Student that has a foreign key to the person table (its supertype). Now the student table has some columns which the supertype doesn't have like GPA, Major, etc. But the name, last name and such would be in the parent table. You can always access the student name back in the Person table through the foreign key in the Student table.
Anyways, just remember the following:
The hierarchy depicts relationship between supertypes and subtypes
Supertypes has common attributes
Subtypes have uniques attributes
Subtypes of tables is a conceptual thing in EER diagrams. I haven't seen an RDBMS (excluding object-relational DBMSs) that supports it directly. They are usually implemented in either
A set of nullable columns for each property of the subtype in a single table
With a table for base type properties and some other tables with at most one row per base table that will contain subtype properties
The notion of table sub-types is useful when using an ORM mapper to produce class sub-type heirarchy that exactly models the domain.
A sub-type table will have both a Foreign Key back to its parent which is also the sub-types table's primary key.
Keep in mind that in designing a bound application, as with an Access application, subtypes impose a heavy cost in terms of joins.
For instance, if you have a supertype table with three subtype tables and you need to display all three in a single form at once (and you need to show not just the supertype date), you end up with a choice of using three outer joins and Nz(), or you need a UNION ALL of three mutually exclusive SELECT statements (one for each subtype). Neither of these will be editable.
I was going to paste some SQL from the first major app where I worked with super/subtype tables, but looking at it, the SQL is so complicated it would just confuse people. That's not so much because my app was complicated, but it's because the nature of the problem is complex -- presenting the full set of data to the user, both super- and subtypes, is by its very nature complex. My conclusion from working with it was that I'd have been better off with only one subtype table.
That's not to say it's not useful in some circumstances, just that Access's bound forms don't necessarily make it easy to present this data to the user.
I have a similar problem I've been working on.
While looking for a repeatable pattern, I wanted to make sure I didn't abandon referential integrity, which meant that I wouldn't use a (TABLE_NAME, PK_ID) solution.
I finally settled on:
Base Type Table: CUSTOMER
Sub Type Tables: PERSON, BUSINESS, GOVT_ENTITY
I put nullable PRERSON_ID, BUSINESS_ID and GOVT_ENTITY_ID fields in CUSTOMER, with foreign keys on each, and a check constraint that only one is not null. It's easy to add new sub types, just need to add the nullable foreign key and modify the check constraint.