Is there any abnormality in this modeling? - sql

I am using a table to centralize the addresses. Customers and Suppliers have a reference to address.
A Customer has an Address, an Address may or may not be associated with a Customer.
A Supplier has an Address, an Address may or may not be associated with a Supplier.
To ensure that an address is not associated with more than one Customer or Supplier, I have a unique index on the Customer and Supplier tables on AddressID column.
I am suspicious that this relationship is abnormal, because I am not able to map it using Entity-Framework with FluentAPI.
Edit:
In my real scenario, the address table will have many more columns.
In fact this is an adaptation to simplify a complex scenario where
the address table is a financial release and the tables customer and
supplier are representing the origin of the financial release, like
Sale and Purchase.

Your model seems reasonable. The merits of having a second table when you want to enforce a 1-1 relationship may not be obvious to everyone.
I can think of two good reasons off-hand:
You want the addresses in one place so you can treat all addresses equivalently (say geo-coding them, standardizing them, extracting features).
The address column is long and many queries do not require it, so you gain efficiency by not storing the address with the rest of the data ("vertical partitioning").
And there may be other reasons. I can't speak to why EF makes such relationships difficult to express.

In your current design this is not EF issue and by it self it cannot prevent you to assign the same address to a customer and to a supplier. If you go this path you are the sole responsible of enforce this uniqueness through business rules and validations in your model.
In the other hand, the correctness (or not) of your model design, apart from what Linoff points out in his answer, depends of the nature of your problem and what important and address is to your business. For example, if this is an app for Post Office, then the Address merits an individual table, as it will be one of the core concepts to your application. But if not, with the current approach you are going to add complexity to your model.

Related

need help answering question about sql data integrity

The question is:
For your final designed database, find a scenario in which a relatively prominent business data
integrity can not be ensured by your current primary keys and foreign keys, nor by adding directly
more of such keys or check clauses in the created tables.
In other words, the data integrity ensured by
the keys within the database may not be enough to ensure all the data integrity within the business
context.
Write a SQL statement that will determine if such a problem exists or not, and where, for any
given state of the database.
I am not too sure what this question is asking or how to approach it.
Need help writing a sql code for this question.
I think the question is asking you to define some business logic that cannot be encoded in the database. However, it then wants you to find conflicts that could occur because the business logic is not encoded in the database. This second part seems to be in conflict with the first, but not necessarily.
An example based on your previous assignment would be if a coach is suddenly sick and there are too few additional coaches to cover the booked clients, or some coaches are not qualified to replace the sick coach, or had previous conflicts with certain clients and therefore can't be assigned to those clients. Therefore, some training bookings must be cancelled.
The decision on which are best to cancel may be difficult or impossible to code in SQL, but you can use SQL to verify that all of the sick coach's slots have either been filled by others or cancelled after the external business logic is applied.
EDIT: I think the above scenario fits the question's requirement that you can't find the conflicts (such as clients that don't like certain coaches) in your existing foreign key relationships, but you can verify that the external logic is consistent with the final requirements (all slots accounted for).
Perhaps a better example is the traveling salesman problem: It is difficult to code the least cost routing in SQL, but it's easy to verify that all cities have been visited.
The scenarios where every row can have variable number of attributes and each attribute can have variable number of datatypes, is generally modeled using EAV datamodel. EAV wikipedia.
Here, attribute can have variety of values and so, we cannot enforce check constraint always. In some scenarios, if attributes finite list is not available, we cannot have foreign key for attributeID.
This datamodel is popular in the medical history datamodel, where every patient can have different kinds of symptoms.
May be this can be an example for a scenario, where data integrity cannot be completely enforced.

What are the best examples of natural keys in SQL?

I've been reading the great debates about natural vs surrogate keys in data modeling, and to be clear I'm not trying to get into that thorny question here. All I want to know is what are some of the best examples of good natural keys?
All I seem to find online are keys that someone thought might be good but turn out not to be, like social security numbers. (For that one: privacy concerns, not everyone has one, reused after death, can be changed after identity theft, can double as business tax id.)
My own guess is that internationally standardized codes (ISBN, VIN, country codes, language codes) would make good keys.
Invoice numbers, vehicle registration numbers, scheduled flight codes, login names, email addresses, employee numbers, room numbers, UPC codes. There are also many thousands of industry, public and international standards for everything from currencies, languages, financial instruments, chemical compounds and medical diagnoses. All of these are potentially good candidates for key attributes. Some sensible criteria for choosing and designing keys are: Simplicity, Stability and Familiarity (i.e. familiar within the business or other context in which they are used).
Some people seem to struggle with the choice of "natural" key attributes because they hypothesize situations where a particular key might not be unique in some given population. This misses the point. The point of a key is to impose a business rule that attributes must and will be unique for the population of data within a particular table at any given point in time. The table always represents data in a particular and hopefully well-understood context (the "business domain" AKA "domain of discourse"). It is the intention/requirement to apply a uniqueness constraint within that domain that matters.
For example, if my website requires each user to supply a unique email address when they register then email address may be a valid choice of key in the database supporting that website. The fact that there are other populations of people in other domains where email addresses are not required to be unique does not necessarily invalidate that choice of key for my website.
Assume, there is a table named person. When we use the columns LastName, FirstName and Address together as a key, then this will be a natural key as those columns are completely natural to people, and there is also a logical relationship between the columns in the table.
Your DNA code would be one really good example of a natural key in real life.

Is this Library Management System ER diagram correct?

Quick question about an ER/EER Diagram.
I have made this Entity Relationship Diagram, but I have been told, that there is something wrong with it by a friend. Is there something wrong with it?
The ER diagram is a design of a Library Management System, where a member can borrow 5 books at a time. The rest of the functionality of the system is how a normal library functions.
Library Management System EER
i don't understand the utility of the relationship between the librarian and the card and i don't understand why the books are splitted in two entities.
I would do 3 entities:
-member
-card
-book
every member has one card, every card is of one member;
every member can take many books, every book can be taken by many members,
the relation between member and book create another table in the logic schema: loans. before inserting a new loan you can check if the member has alredy 5 active loans (by checking the attribute active in the loans table).
Your given context is incomplete for me. I do not see the whole description of your problem/situation, so I will answer based on assumptions, and the experience I had during my life. So let's see...
The tino user questioned the existence of two entities, title and volume, which is something important. Let me explain this for a moment, which will eliminate this as an error. Previously (a time ago) we had video rental stores (I don't know if this the right name where you live, english is not my native language). Remember? We used to go there to rent VHS tapes to watch at home.
What we rented were not films, but more copies/midia of them. A film will always have the same actor, director, title, etc., but a copy could have different attributes/properties, like the year that the media was manufactured, the available languages, the expiration year, among other things. So we had distinctly two different things.
But despite this, we have to consider whether there is a need to create two entities for persistence. We have to remember if we need to persist this information. If a copy/midia has no attributes, then it's entity should not exist, and what a user would rent really would be the movies titles.
In your case, the relationship between volume and title, I belive, is really expressing this discrepancy.
Let's talk about the relationship between librarian and title. What a librarian manages? Does It manages the titles that never change and are abstract things, or the physical objects present in the library? :)
Finally, let's talk about the borrows relationship. When we break down 1-N (or N-1) relationships, we always pass the primary key from the 1 side to the N side, solving the relationship to the formation of the Physical Model in a Entity-Relationship Diagram.
Despite this relationship here is a 0-5, to decompose it, we will not have exactly a 0-5 relationship. We would have in anyway to pass the primary key from the two sides to the table formed by this relationship. Therefore, here we have initially a N-N relationship between member and volume.
N-N relationships allow optional relations between entities. This means we can have the zero side cardinality here. To limit the number of books that can be rented, you need to implement a restriction/constraint with SQL, or with any procedural language in your database. In this case, you can implement a before insert trigger. This trigger has a duty to verify this restriction to allow or denny the completion of the operation as a whole.
Let it be clear that I'm not saying you should remove this notation. Your Conceptual Model should express it. But when you are decomposing, you have to remember that. I think you should just correct it.
Remember one important rule: Relations that have attributes/properties (the attributes/properties) can only exist in N-N relationships. If you have to put attributes/properties in a 1-N (or a N-1) relation, they (the attributes/properties) will always be on the N side. In summary, there are no N-1 (or 1-N) relationships with attributes in the relation. Only N-N relations can have attributes/properties. So be careful with this.
Any questions or clarification, please comment and I will answer.
I see no reason to distinguish member and card. Volume and Librarian don't have primary keys. Are they supposed to be weak entities? That doesn't make sense for Librarian and Volume needs an identifier to distinguish different copies.

Whether to Split Data in to Separate PostgreSQL Table

I am creating an app with a WPF frontend and a PostgreSQL database. The data includes patient addresses and supplier addresses. There is an average of about 3 contacts per mailing address listed. I'm estimating 10,000 - 15,000 contact records per database.
When designing the database structure, it occurred to me that rather than storing mailing addresses in a single "contacts" table, I could have one table storing names and other individual data, with a second table holding addresses. I could then create a relationship between the tables, to match addresses with contacts.
I have a pretty good idea how I can neatly organise situations such as changing the address of a single contact, where the other contacts are staying at the same address.
The question is: is it worth it? Can I expect to save much in the way of storage size? Will this impact the speed of queries adversley? How about if I was using something other than PostgreSQL?
I would strongly suggest normalizing this. You never know what kind of trouble you will run into. LedgerSMB has a relatively decent entity/user/contact/location schema that creates a very flexible environment. You can see it here (starts at line 363):
http://ledger-smb.svn.sourceforge.net/viewvc/ledger-smb/trunk/sql/Pg-database.sql?revision=3042&view=markup
Unless you think a large number of your users will be sharing addresses and they'll be often changing, I don't see the need to normalize out the address portion. In the various places I've worked and see users tables, sometimes it is, sometimes it isn't - never really seemed to make a terrible amount of trouble one way or another.
In terms of performance, with just 10-15k records and proper indexes, I can't imagine you'd notice too much difference one way or the other on modern hardware (although technically the separate table should be slower).
I agree with Joshua. Once it's set up properly (normalized) it's very easy to manage any changes in your app in the future.

DDD: Modeling M:N relation between two roots where the relation itself carries semantic meaning

Update Edited to reflect clarifications requested by Chris Holmes below. Originally I was referring to a Terminal as a Site but changed it to better reflect my actual domain.
At heart, I think this is a question about modeling a many to many relationship between two root entities where the relationship itself carries some semantic meaning.
In my domain
You can think of a Terminal as a branch location of our company
A Terminal can have a relationship with any number of customers
A customer can have a relationship with any number of terminals (standard many to many)
A customer\terminal relationship means that a customer can potentially store products at the Terminal
This relationship can be enabled\disabled. To be disabled merely means you are temporarily not allowed to store product, so a disabled relationship is different from no relationship at all.
A customer can have many Offices
A Terminal that has a relationship with a customer (enabled or not) must have a default office for that customer that they communicate with
There are some default settings that are applied to all transactions between a Customer and a Terminal, these are set up on a Terminal-Customer Relationship level
I think my objects here are pretty clear, Terminal, Customer, Office, and TerminalCustomerRelationship (since there is information being stored specifically about the relationship such as whether it is enabled, default office, ad default settings). Through multiple refactorings I have determined that both Terminal and Customer should probably be aggregate roots. This leaves me with the question of how to design my TerminalCustomerRelationship object to relate the two.
I suppose I could make the traversal from Terminal to TerminalCustomerRelationship unidirectional toward the relationship but I don't know how to break the relation from the relationship to the customer, especially since it needs to contain a reference to an Office which has a relationship to a Customer.
I'm new to this stuff and while most of DDD makes perfect sense I'm getting confused and need a fresh outlook. Can someone give me their opinion on how to deal with this situation?
Please note that I say Relationship not relation. In my current view it deserves to be an object in the same way that a Marriage would be an object in an application for a wedding chapel. Its most visible purpose is that it relates two objects, but it has other properties that rightfully belong to it as well.
By your description, you definitely need a "TerminalCustomerRelationship" entity to track the associated information. I would also convert the 'IsEnabled' flag into a first class 'Event' entity with a timestamp - this gives you the ability to save a history of the state changes (a more realistic view of what's happening in the domain.)
Here's a sample application (in VS2008) that refects your problem. You can tweak/test the code until the relationships make sense. Run "bin/debug/TerminalSampleApp.exe" and right-click "Terminal->Create Example" to get started.
Let me know if you find it useful.
Names can often clarify an object's responsibilities and bring a domain model into focus.
I am unclear what a Site is and that makes the entire model confusing, which makes it difficult for me to offer better advice. If a Site were a Vendor, for instance, then it would be easy to rename SiteCustomerRelationship as a Contract. In that context it makes perfect sense for Contract to be its own entity, and have the the model look like Vendor-Contract-Customer-Office.
There are other ways to look at this as well. Udi has a decent post on this sort of many-to-many relationship here.
You should not have a object Like SiteCustomerRelationship, its DB specific.
If its truly DDD you should have a Relation like:
Aggregate<Site> Customer.Site
IEnumerable<Aggregate<Office>> Customer.Offices
and perhaps
Aggregate<Office> Customer.DefaultOffice