Extensive Data Dictionary and ER Diagram - sql

This is a question for an assignment. Can somebody please help me?
Criteria
Extensive data dictionary that contains appropriate data items and all relevant details of each data item.
Extensive ER diagram that contains appropriate tables and constraints used. All data items in the data dictionary are reflected in the tables.
Given Data Dictionaries
Finally my question
Should I create an extensive data dictionary when there is already a data dictionary?
Is what I did below correct?

You'll probably understand that I cannot solve your assignment for you: one day you might write a mission-critical system for a plane or medical device that I'll use, and I want to be sure that you'll have all the skills needed ;-)
But here are some hints to guide you:
The data dictionary provided is not as extensive as it should be, so I guess you have to fill in the missing cells. For example:
If every employee belongs to a department, do you think that Employee.Department_id is nullable?
If several employees may belong to the same department, do you think that Employee.Department_id is unique?
What about the descriptions and examples?
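To make the nullable/unique hints concrete, here is a minimal sketch using Python's built-in sqlite3 module. Only Employee.Department_id comes from the question; the Department table, the other columns and the sample rows are assumptions made up for illustration, not the assignment's actual dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Department (
    Department_id INTEGER PRIMARY KEY,
    Name          TEXT NOT NULL          -- hypothetical column
);
CREATE TABLE Employee (
    Employee_id   INTEGER PRIMARY KEY,   -- hypothetical column
    Name          TEXT NOT NULL,         -- hypothetical column
    -- NOT NULL: every employee belongs to a department.
    -- No UNIQUE: several employees may share one department.
    Department_id INTEGER NOT NULL REFERENCES Department(Department_id)
);
""")
conn.execute("INSERT INTO Department VALUES (1, 'Sales')")
conn.execute("INSERT INTO Employee VALUES (1, 'Ann', 1)")
conn.execute("INSERT INTO Employee VALUES (2, 'Bob', 1)")  # same department is fine


def insert_fails(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# An employee without a department, or with an unknown one, is rejected.
no_department = insert_fails("INSERT INTO Employee VALUES (3, 'Cy', NULL)")
unknown_department = insert_fails("INSERT INTO Employee VALUES (4, 'Di', 99)")
```

Each answer you give to the hints (nullable or not, unique or not) ends up as one such constraint, which is why those cells of the dictionary matter.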
Your second ERD uses Chen notation. It is excellent for showing entities, relationships and attributes, but it is not meant to replicate tables. While the diagram seems correct at first sight, some improvements are needed:
The cardinalities between the entities and relationships are definitely missing.
Primary key attributes should be underlined.
Foreign keys are usually not shown, since they are deduced from the relationships and cardinality.
Your first ERD uses Barker's notation. While it also shows entities, relationships and attributes, it is meant to map entities and attributes to tables, and keys. In this regard, it's better in view of your assignment requirement to show all the attributes of the dictionary. Some improvements are required:
Primary keys are well identified. But there are problems with the foreign keys: put an FK only in front of the columns identified as foreign keys in the data dictionary.
Between the entities, you should use the right symbols to reflect the cardinality (a simple bar on the side where one item corresponds, a crow's foot on the side where several items correspond, and an o on the side where there could be no item).
While it is possible to simply show the relationship between entities by connecting them to the table header or the bottom line, in a detailed diagram showing all the fields it is better to graphically connect the boxes at the level of the primary and foreign keys that implement the relationship.

Related

From Visio to SQL

How do we translate something like this into SQL?
Entity A -thick line- relation -simple line- Entity B
It's easy enough to write any of the other connections, but somehow I can't figure it out when it comes to one thick line and one simple line, as shown above.
I have a primary key which is the date of a football season (Entity A - Season) and an entity (Entity B - Football team) whose primary key has two parts: its name and the primary key of the Season entity. Because of that doubt, I can't relate them properly.
Relationships (the diamonds) do not typically become independent tables. However, a many-many relationship usually does get a separate table. Depending on your notation (there are many), your diagram could represent a many-many relationship or a 1:1 relationship.
Strong entities (your rectangles) get tables.
In your ER diagram, you will also typically see the attributes of each entity in circles connected by lines to the entity itself. Those attributes are turned into columns of the corresponding table. Attributes which are underlined in the diagram represent the primary key of that entity.
Additional or strange constraints that aren't typically easily represented in an ER diagram are usually put as side notes.
To answer your question, you must know whether or not it's a many-many relationship; if so, you would create a SeasonClub table with the two different primary keys inside it.
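If it does turn out to be many-many, the SeasonClub table could look roughly like the sketch below (Python's built-in sqlite3). The column names, the Club table name and the sample rows are assumptions; only the idea of a table holding the two primary keys comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Season (
    season_date TEXT PRIMARY KEY      -- the season is identified by its date
);
CREATE TABLE Club (
    name TEXT PRIMARY KEY             -- hypothetical: the team is identified by name
);
-- One row per (season, club) pair; both keys together form the primary key.
CREATE TABLE SeasonClub (
    season_date TEXT NOT NULL REFERENCES Season(season_date),
    club_name   TEXT NOT NULL REFERENCES Club(name),
    PRIMARY KEY (season_date, club_name)
);
""")
conn.executemany("INSERT INTO Season VALUES (?)",
                 [("2009-08-01",), ("2010-08-01",)])
conn.executemany("INSERT INTO Club VALUES (?)", [("Rovers",), ("United",)])
conn.executemany("INSERT INTO SeasonClub VALUES (?, ?)", [
    ("2009-08-01", "Rovers"),
    ("2009-08-01", "United"),
    ("2010-08-01", "Rovers"),
])

# Clubs taking part in one season, and the number of seasons for one club.
clubs_2009 = [row[0] for row in conn.execute(
    "SELECT club_name FROM SeasonClub WHERE season_date = ? ORDER BY club_name",
    ("2009-08-01",))]
seasons_rovers = conn.execute(
    "SELECT COUNT(*) FROM SeasonClub WHERE club_name = 'Rovers'").fetchone()[0]
```

If the relationship is really one-to-many instead, the extra table is unnecessary and a foreign key column on the "many" side is enough.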

Simplify Database ER Diagram/Schema

For a school project, we have to create our own database. I decided to create a database to manage my electronic component inventory. As a requirement, we needed to create an ER diagram, then from that diagram derive the database schema. Unfortunately for me, the professor believes that the diagram I created can be simplified and the "Part" entity is unnecessary.
This is the diagram I came up with, and here is the derived schema.
If I remove the Part entity, then in order for a Circuit entity to "use" any number of any part, and have each part associated with possibly any circuit, I would have to have a separate M-to-N relationship from each component type to Circuit. Each of those relationships would generate a new table. This would definitely go over the strict maximum number of tables we are allowed for the project.
If the professor specifically mentioned Part was unnecessary, then there must be some way to remove it that results in a simpler ER diagram and schema - but I can't see what it is.
Maybe you guys can see what it is and give me a hint?
EDIT:
Dan W had a great suggestion. I could eliminate the Part by giving each part type (Capacitor, Resistor, etc.) their own keys. Then inside of uses part, include foreign keys to those components. I would have to assume that each entry of the table would only be associated with a single part, the rest being null. Here's the resulting schema. This schema should work well. But now I have to figure out exactly what modifications to the ER diagram would correspond to this schema.
EDIT2:
I've come to the conclusion that the relationship I'm looking for is n-ary. According to several sources, to convert from the n-ary to a schema you include the primary key of each participating entity type's relation as foreign key. Then add the simple attributes. This is what I came up with.
You have a strict maximum number of tables (physical design) but are you restricted in your ER diagram to that number of entities (logical design)? All of your entities for parts - resistors, transistors, capacitors, and General IC - could be stored in one parts table with all the attributes of Part, resistors, transistors, capacitors and General IC as nullable columns. If an attribute is valid for all types then it is not nullable. Include another column in the parts table which identifies the type of part (resistor, transistor, capacitor or IC) although you already have a type column in all the entities which might also serve for this.
The Parts table in your schema is now:
PartID (PK)
Quantity
Drawer
Part Type
Value
Tolerance
Subtype
Power Rating
Voltage
Term_Style
Diam
Height
Lead_Space
Name
Case
Polarity
Use
V_CE
P_D
I_C
H_FE
Package
Pins
Description
and you drop the Resistor, Capacitor, Transistor and General IC tables from your schema. Leave those entities in your ER diagram, because they show which attributes in the Parts table are required (shouldn't be null) for each part type.
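A cut-down sketch of such a Parts table, using Python's built-in sqlite3, might look like this. It keeps only a handful of the columns listed above, and the assumption that Value and Tolerance are the resistor attributes is mine (the original diagram is not visible here); the CHECK constraints are one possible way to express "required for this part type".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parts (
    PartID    INTEGER PRIMARY KEY,
    Quantity  INTEGER NOT NULL,       -- valid for every part type
    Drawer    TEXT NOT NULL,          -- valid for every part type
    PartType  TEXT NOT NULL
        CHECK (PartType IN ('resistor', 'capacitor', 'transistor', 'ic')),
    Value     REAL,                   -- type-specific columns are nullable
    Tolerance REAL,
    V_CE      REAL,
    H_FE      REAL,
    Pins      INTEGER,
    -- Assumed rule: a resistor row must carry Value and Tolerance.
    CHECK (PartType <> 'resistor'
           OR (Value IS NOT NULL AND Tolerance IS NOT NULL))
);
""")
conn.execute("INSERT INTO Parts (Quantity, Drawer, PartType, Value, Tolerance) "
             "VALUES (50, 'A1', 'resistor', 4700, 5)")
conn.execute("INSERT INTO Parts (Quantity, Drawer, PartType, V_CE, H_FE) "
             "VALUES (10, 'B2', 'transistor', 40, 100)")

# A resistor without its required attributes violates the CHECK constraint.
try:
    conn.execute("INSERT INTO Parts (Quantity, Drawer, PartType) "
                 "VALUES (5, 'A2', 'resistor')")
    incomplete_resistor_rejected = False
except sqlite3.IntegrityError:
    incomplete_resistor_rejected = True

resistor_count = conn.execute(
    "SELECT COUNT(*) FROM Parts WHERE PartType = 'resistor'").fetchone()[0]
```

The trade-off is the usual one for a single wide table: fewer tables and simpler joins to Circuit, at the cost of many NULLs and per-type rules that live in constraints rather than in the table structure.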

Compound primary key table with subtypes

A database architect and I were arguing over whether a table with a compound primary key and subtypes makes sense relationally, and whether it is good practice.
Say we have two tables Employee and Project. We create a composite table Employee_Project with a composite primary key back to Employee and Project.
Is there a valid way for Employee_Project to have subtypes? Or can you think of any scenario where a composite key table can have subtypes?
To me a composite key relationship is an 'Is A' relationship (Employee_Project is an Employee and a Project). Subtypes are also an 'Is A' relationship. So if you have a composite key with a subtype, it's two 'Is A' relationships in one sentence, which makes me believe this is a bad practice.
Employee-project is a bit hard, but one can imagine something like this -- although I'm not much of a chemist.
Or something like this, which would require different legal forms (fields) for single person ownership vs joint (time-share).
Or like this, providing that different forms are needed for full time and temp.
Employee projects have subtypes if the candidate subtypes are
not utterly different, but
not exactly alike
That means that
Every employee project has some attributes (columns) in common. So they're not utterly different.
Some employee projects have different attributes than others. So they're not exactly alike.
The determination has to do with common and distinct attributes. It doesn't have anything to do with the number of columns in a candidate key. Do you have employee projects that are not utterly different, but not exactly alike?
The most common business supertype/subtype example concerns organizations and individuals. They're not utterly different.
Both have addresses.
Both have phone numbers.
Both can be plaintiffs and defendants in court.
But they're not exactly alike.
Individuals can go to college.
Organizations can have a CEO.
Individuals can get married.
Individuals can have children.
Organizations (in the USA) can be liquidated.
So you can express individuals and organizations as subtypes of a supertype called, say, "Parties". The attributes all the subtypes have in common relate to the supertype.
Parties have addresses.
Parties have phone numbers.
Parties can be plaintiffs and defendants in court.
Again, this has to do with attributes that are held in common, and attributes that are distinct. It has nothing to do with the number of columns in a candidate key.
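As a rough illustration that the number of key columns is not an obstacle, here is a sketch (Python's built-in sqlite3) of a subtype table hanging off a two-column key. The employee_project_temp table and the start_date and agency columns are hypothetical, loosely based on the full time versus temp idea mentioned earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY);
CREATE TABLE project  (project_id  INTEGER PRIMARY KEY);

-- Supertype: attributes every employee project has in common.
CREATE TABLE employee_project (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    project_id  INTEGER NOT NULL REFERENCES project(project_id),
    start_date  TEXT NOT NULL,
    PRIMARY KEY (employee_id, project_id)
);

-- Subtype: attributes only some employee projects have. Its primary key
-- is the same two columns, which also reference the supertype row.
CREATE TABLE employee_project_temp (
    employee_id INTEGER NOT NULL,
    project_id  INTEGER NOT NULL,
    agency      TEXT NOT NULL,
    PRIMARY KEY (employee_id, project_id),
    FOREIGN KEY (employee_id, project_id)
        REFERENCES employee_project(employee_id, project_id)
);
""")
conn.execute("INSERT INTO employee VALUES (1)")
conn.execute("INSERT INTO project VALUES (10)")
conn.execute("INSERT INTO employee_project VALUES (1, 10, '2010-01-04')")
conn.execute("INSERT INTO employee_project_temp VALUES (1, 10, 'Acme Staffing')")

# A subtype row without a matching supertype row is rejected.
try:
    conn.execute("INSERT INTO employee_project_temp VALUES (1, 99, 'Acme Staffing')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Whether you actually need the subtype table still depends only on the question above: are there attributes that some employee projects have and others don't?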
To me a composite key relationship is an 'Is A' relationship (Employee_Project is an Employee and a Project).
Database designers don't think that way. We think in terms of a table's predicate.
If an employee can have many projects and a project can have many employees, it is a many-to-many join that RDBMSs can only represent easily in one way (the way you have outlined above). You can see in the ER diagram below (employee/departments is one of the classic many-to-many examples) that it does not have a separate ER component. The separate table is a leaky abstraction of RDBMSs (which is probably why you are having a hard time modeling it).
http://www.library.cornell.edu/elicensestudy/dlfdeliverables/fallforum2003/ERD_final.doc
Bridge Entities
When an instance of an entity may be related to multiple instances of another entity and vice versa, that is called a “many-to-many relationship.” In the example below, a supplier may provide many different products, and each type of product may be offered by many suppliers:
While this relationship model is perfectly valid, it cannot be translated directly into a relational database design. In a relational database, relationships are expressed by keys in a table column that point to the correct instance in the related table. A many-to-many relationship does not allow this relationship expression, because each record in each table might have to point to multiple records in the other table.
http://users.csc.calpoly.edu/~jdalbey/205/Lectures/ERD_image004.gif
Here they do not even bother with a separate box, although they add one later (at this step it is a 'pure' ER diagram). It can also be explicitly represented with a box and a diamond superimposed on each other.

ER inheritance modeling

A supply farm can have a transportation document. If present it can be one of two types: internal or external. Both documents share some common data, but have different specialized fields.
I thought of modeling this in an OO-ish fashion like this:
http://www.arsmaior.com/tmp/mod1.png
In the document table, one of the two doc_*_id is null, the other is the foreign key with the corresponding table.
That is opposed to the other schema where the common data is redundant:
http://www.arsmaior.com/tmp/mod2.png
I'm trying to discover the pros and cons of both approaches.
How do I SELECT all the internal docs in both cases? We have a sort of mutually exclusive pair of foreign keys, so the JOINs are not so trivial.
Is the first approach completely junky?
Classical ER modeling doesn't include foreign keys, and the gist of your question revolves around how the foreign keys are going to work. I think that what you are really doing is relational modeling, even though you are using ER diagrams.
In terms of relational modeling, there is a third way to model inheritance. That is to use the same ID for the specialized tables as is used for the generalized table. Then the ID field of the doc_internal table is both the primary key for the doc_internal table and also a foreign key referencing the supply_farm table. Ditto for the doc_external table.
The ID field in the supply_farm table is both the primary key of the supply_farm table and also a foreign key that references either the doc_internal or the doc_external table, depending. The joins magically get the right data together.
It takes a little programming to set this up, but it's well worth it.
For more details I suggest you google "generalization specialization relational modeling". There are some excellent articles on this subject out there on the web.
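A minimal sketch of the shared-ID idea, using Python's built-in sqlite3, could look like this. It takes supply_farm as the generalized table, as this answer does; the description, warehouse and carrier columns and the sample rows are made up for illustration. Only the direction "specialized table references generalized table" is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Generalized table: the data common to both kinds of document.
CREATE TABLE supply_farm (
    id          INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
-- Specialized tables: id is the primary key and also a foreign key
-- to the generalized table, so the same value identifies both rows.
CREATE TABLE doc_internal (
    id        INTEGER PRIMARY KEY REFERENCES supply_farm(id),
    warehouse TEXT NOT NULL
);
CREATE TABLE doc_external (
    id      INTEGER PRIMARY KEY REFERENCES supply_farm(id),
    carrier TEXT NOT NULL
);
""")
conn.executemany("INSERT INTO supply_farm VALUES (?, ?)",
                 [(1, "first"), (2, "second"), (3, "no document")])
conn.execute("INSERT INTO doc_internal VALUES (1, 'W1')")
conn.execute("INSERT INTO doc_external VALUES (2, 'FastFreight')")

# All internal docs: a plain join on the shared id, no nullable FK columns.
internal_docs = conn.execute("""
    SELECT s.id, s.description, i.warehouse
    FROM supply_farm AS s
    JOIN doc_internal AS i ON i.id = s.id
    ORDER BY s.id
""").fetchall()
```

Note that nothing in this sketch stops the same id from appearing in both doc_internal and doc_external; enforcing "never both" needs an extra rule on top.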
Both approaches are correct, and which one to use depends on the use cases, the kind and volume of data you want to store, and the kinds of queries you will mostly run. You can also think of combining these two strategies when the inheritance hierarchies are complex.
One use case where I think the first approach would be preferred is when you want to search through all the documents, for example, based on description or any common field.
This document (although specific to Hibernate) can provide a little more insight into different inheritance modelling strategies.
If I have understood this correctly, then a supply farm corresponds to either 0 or 1 documents, and a document is always either internal or external (never both).
If so, then why not just use a single table, like so:
**SUPPLY_FARM_DOC**
ID Int (PK)
DOC_ID Int
INTERNAL_FLAG Boolean
DESCRIPTION Varchar(40)
SOME_DATA Varchar(40)
OTHER_DATA Varchar(40)
etc.
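With a single table like that, selecting the internal docs becomes a simple filter. A small sketch with Python's built-in sqlite3 follows; the sample rows are invented, and since SQLite has no Boolean type the flag is stored as 0/1 here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLY_FARM_DOC (
    ID            INTEGER PRIMARY KEY,
    DOC_ID        INTEGER,
    -- 0/1 stands in for the Boolean INTERNAL_FLAG; NULL means no document.
    INTERNAL_FLAG INTEGER CHECK (INTERNAL_FLAG IN (0, 1)),
    DESCRIPTION   VARCHAR(40)
);
INSERT INTO SUPPLY_FARM_DOC VALUES (1, 100, 1, 'internal transfer');
INSERT INTO SUPPLY_FARM_DOC VALUES (2, 200, 0, 'external shipment');
INSERT INTO SUPPLY_FARM_DOC VALUES (3, NULL, NULL, 'no document');
""")

# All internal docs: no join needed.
internal_ids = [row[0] for row in conn.execute(
    "SELECT ID FROM SUPPLY_FARM_DOC WHERE INTERNAL_FLAG = 1 ORDER BY ID")]
```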

How to model a mutually exclusive relationship in SQL Server

I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure, I think my client may reject the proposal, although if it's the only way to get it right, that is what I will have to do.
I have an Item table that can be linked to any number of tables, and these tables may increase over time. The Item can only be linked to one other table, but the record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Examples of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration.
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My initial thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with LINQ to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building, Office. For the lack of a better term I have used Entity. Then implement super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and a FK in all sub-type tables. Now, you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
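Here is a minimal sketch of that super-type/sub-type layout, using Python's built-in sqlite3. Only Entity, EntityID, Item and the sub-type names come from the thread; EntityType, FullName, RoomNumber and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Super-type: one row per thing that can own items.
CREATE TABLE Entity (
    EntityID   INTEGER PRIMARY KEY,
    EntityType TEXT NOT NULL          -- hypothetical discriminator
);
-- Sub-types: EntityID is both PK and FK.
CREATE TABLE Person (
    EntityID INTEGER PRIMARY KEY REFERENCES Entity(EntityID),
    FullName TEXT NOT NULL
);
CREATE TABLE Office (
    EntityID   INTEGER PRIMARY KEY REFERENCES Entity(EntityID),
    RoomNumber TEXT NOT NULL
);
-- A single NOT NULL owner column: each item belongs to exactly one Entity.
CREATE TABLE Item (
    ID       INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL,
    EntityID INTEGER NOT NULL REFERENCES Entity(EntityID)
);
""")
conn.execute("INSERT INTO Entity VALUES (1, 'Person')")
conn.execute("INSERT INTO Entity VALUES (2, 'Office')")
conn.execute("INSERT INTO Person VALUES (1, 'Ann')")
conn.execute("INSERT INTO Office VALUES (2, '12B')")
conn.executemany("INSERT INTO Item (Name, EntityID) VALUES (?, ?)",
                 [("Pen", 1), ("Poster", 2), ("Stapler", 2)])

# Items owned by a given office: an ordinary join, no table names in data.
office_items = [row[0] for row in conn.execute("""
    SELECT i.Name
    FROM Item AS i
    JOIN Office AS o ON o.EntityID = i.EntityID
    WHERE o.RoomNumber = '12B'
    ORDER BY i.Name
""")]
```

Adding a Conference Room later would mean one more sub-type table referencing Entity; the Item table itself would not change.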
Your link table is OK.
The trouble you will have is that you will need to generate dynamic SQL at runtime. Parameterized SQL does not typically allow the objects in the FROM list to be parameters.
If you want to avoid this, you may be able to denormalize a little - say by creating a table to hold the id (assuming the ids are unique across the other tables), the type_id representing which table is the source, and a generated description - e.g. the name value from the initial record.
You would trigger the creation of this denormalized list when the base info is modified, and you could use that for generalized queries - and then resort to your dynamic queries when needed at runtime.
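To show what the dynamic SQL point means in practice, here is a small sketch with Python's built-in sqlite3 (SQL Server behaves similarly in that a table name cannot be a bound parameter, but the code below only demonstrates SQLite). The linked_name helper, the ALLOWED_TABLES whitelist and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Office (ID INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO Person VALUES (1, 'Ann');
INSERT INTO Office VALUES (1, 'Room 12');
""")

# A table name cannot be bound as a parameter; the statement is rejected.
try:
    conn.execute("SELECT Name FROM ? WHERE ID = ?", ("Person", 1))
    table_parameter_accepted = True
except sqlite3.OperationalError:
    table_parameter_accepted = False

# So the table name has to be spliced into the SQL text. Checking it
# against a fixed whitelist first keeps arbitrary strings out of the query.
ALLOWED_TABLES = {"Person", "Office"}


def linked_name(table_name, linked_id):
    """Look up the Name of the row that a LinkedItem entry points to."""
    if table_name not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table_name!r}")
    row = conn.execute(
        f'SELECT Name FROM "{table_name}" WHERE ID = ?', (linked_id,)
    ).fetchone()
    return row[0] if row else None
```

For example, linked_name("Office", 1) looks up the Office row, while a LinkedToTableName value that is not in the whitelist raises ValueError instead of reaching the database.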