DB relationship depends on table type - sql

I am trying to find the name of my problem:
A Comment table contains comments made by User on either a Concert or Play or Album or Painting, etc.
I am trying to avoid using many different comment tables CommentPlay, CommentAlbum!
This is perhaps a one-to-many relationship until I want to search for all Comments under a specific Play. There must be a comment_type (e.g. for_play, for_album) somewhere but I do not know how to include that in the relationship.
I am accessing the database using abstract SQL (Perl's DBIx::Class) and would like to remain db-vendor agnostic (among SQLite, MySQL, Pg).
Could someone give me a pointer as to what exactly is the name of the situation I am facing?

Related

database design: link table for multiple different tables

I have a conceptual question about database design which came up multiple times in my history as a developer.
Imagine I have a bigger database that is designed in the past, and already in production (for example you can take http://sqlfiddle.com/#!9/7c06f/1, a library database).
Now there is a new feature request: for every existing object in every table there shall be a "help text" (or something different, an error, a tag...) that you can all view in one place.
I implemented something like that multiple times in the past, but every time I'm not satisfied with the solution.
One solution is to have a link table with every table, like that:
CREATE TABLE BooksHelp (
BookId INT NOT NULL,
HelpText VARCHAR NOT NULL
);
CREATE TABLE AuthorHelp (
AuthorId INT NOT NULL,
HelpText VARCHAR NOT NULL
);
...
But I would need multiple link tables, which makes the selection of every existing "help text" difficult.
How would you design this problem? Is there another, better solution?
This appears to be similar trying to map polymorphism to the relational model - it's just a bad fit - there's no obvious right answer.
There are a few obvious solutions. The one you've identified (storing the helptext in a linked table) is neat, but requires lots of joins if you're retrieving all the books for an author belonging to a publisher etc. As the business logic says "all objects should have helptext" (and I assume each row has different helptext), it's not super logical to store this in a child table - "helptext" is an attribute of each object, not a related concept.
You could also add a helptext column to each table. That stores the attribute in the main table, and reduces the cognitive load (as well as the number of joins). This is logical if each author, book, etc. has their own helptext.
In the case of text items of some sort, especially if they are used in several tybles, like help texts, it seems preferable to centralize their text data in one table and store references in the tables that need them. This design also supports multi-lingual apps where the text items would have to be maintained in several languages.

Polymorphic Associations in SQL

I'm creating a project with big database of Movies and Series, both in seperate tables. Now I have other tables like Country for specifying production country of movie or/and serie. If I want to do this in many to many relation it requires connection table one for movie and one for serie. It would look like this:
DB with separate connection tables.
It is many tables, so I was searching for other solutions and I found this article and the "Using one data entity per class" method seems to be best for me. I implemented it like that: DB with combined connection table.
The second implementation seems to be good, but there's a one problem I'm facing and that's a complicated inserts. To insert new Movie, I need first to add new Production and second to insert Movie and pick just created Production ID.
My question is could it be fixed in some way ? Can't it be auto incremented from Movie table ?
I'm using 10.1.36-MariaDB and I'm realy sorry for my poor english :c
Since Movies and Series have only one column that is different (boxoffice), it makes sense to put the two tables together and let boxoffice be NULL for series. I don't understand the need for Production, but it seems to add as much complexity as it saves. Try to get rid of it.
I suggest that the id for Country be the standard 2-letter codes. This will be more compact and eliminate some stuff.
The is no good reason to have an id for a connection table. See many-to-many tips . Those tips will speed up many of your queries, and save space.
"Polymorphic" and many other neat-things-in-a-textbook don't necessarily work well in Relational Databases.

One column with multiple foreign keys situation

Here's the deal : this is my first ever database project and I am afraid my solution to this problem isn't quite the best. The database keeps track of different "types" of cooperators. Those types are companies, organisations, employed workers, other "persons" ...
All of those have totally different sets of information, but they all have one in common - contact information. I decided to let user enter what kind of contact information he wants to add to any of the cooperator, whether its e-mail, phone, URL, Fax and so on ...
So I created "Contacts" table in which all the contact data for all the cooperators will be put, regardless of what type the cooperator is it.
The TablesList is the table that contains the list of cooperator types (Companies, Organisations, Workers)
Each row in the Contacts table must contain "TableID" number which identifies what type of the cooperator is it (Company/Worker/Organisation...), and must contain "RowID" which identifies what exact company/worker/organisation the contact is about.
The problem that exists it that Contacts table contains foreign keys from 3 other tables in 1 column, which cannot be good. I could remove the relationships and just fill the column with thos ID's without the DBMS knowing about the constrains, but that just doesnt look like a good solution to me, so now I doubt this idea is any good.
What do you suggest ?
Keep in mind that in future there may be some more types of cooperators added if needed (like temp/contract workers, agencies) and Contacts table should be designed to support them too
Thanks in advance !
Btw im using SQL CE and C#
here is the sketch of whats going on :
EDIT
Although it doesnt feel right, I just removed the relationships and it works just fine with my application so far
I suspect your design is over-normalized. You can simplify it by consolidating data into three tables: Companies, Workers and Organisations. It's not going to be normalized but it is much simpler to work with.
Check out http://www.codinghorror.com/blog/2008/07/maybe-normalizing-isnt-normal.html

How to model a mutually exclusive relationship in SQL Server

I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure I think my client may reject the proposal.. although if its the only way to get it right this is what I will have to do.
I have an Item table that can me link to any number of tables, and these tables may increase over time. The Item can only me linked to one other table, but the record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Example of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration"
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My intital thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is, that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with Linq to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building, Office. For the lack of a better term I have used Entity. Then implement super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and a FK in all sub-type tables. Now, you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
your link table is ok.
the trouble you will have is that you will need to generate dynamic sql at runtime. parameterized sql does not typically allow the objects inthe FROM list to be parameters.
i fyou want to avoid this, you may be able to denormalize a little - say by creating a table to hold the id (assuming the ids are unique across the other tables) and the type_id representing which table is the source, and a generated description - e.g. the name value from the inital record.
you would trigger the creation of this denormalized list when the base info is modified, and you could use that for generalized queries - and then resort to your dynamic queries when needed at runtime.

Table Naming Dilemma: Singular vs. Plural Names [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
Academia has it that table names should be the singular of the entity that they store attributes of.
I dislike any T-SQL that requires square brackets around names, but I have renamed a Users table to the singular, forever sentencing those using the table to sometimes have to use brackets.
My gut feel is that it is more correct to stay with the singular, but my gut feel is also that brackets indicate undesirables like column names with spaces in them etc.
Should I stay, or should I go?
I had same question, and after reading all answers here I definitely stay with SINGULAR, reasons:
Reason 1 (Concept). You can think of bag containing apples like "AppleBag", it doesn't matter if contains 0, 1 or a million apples, it is always the same bag. Tables are just that, containers, the table name must describe what it contains, not how much data it contains. Additionally, the plural concept is more about a spoken language one (actually to determine whether there is one or more).
Reason 2. (Convenience). it is easier come out with singular names, than with plural ones. Objects can have irregular plurals or not plural at all, but will always have a singular one (with few exceptions like News).
Customer
Order
User
Status
News
Reason 3. (Aesthetic and Order). Specially in master-detail scenarios, this reads better, aligns better by name, and have more logical order (Master first, Detail second):
1.Order
2.OrderDetail
Compared to:
1.OrderDetails
2.Orders
Reason 4 (Simplicity). Put all together, Table Names, Primary Keys, Relationships, Entity Classes... is better to be aware of only one name (singular) instead of two (singular class, plural table, singular field, singular-plural master-detail...)
Customer
Customer.CustomerID
CustomerAddress
public Class Customer {...}
SELECT FROM Customer WHERE CustomerID = 100
Once you know you are dealing with "Customer", you can be sure you will use the same word for all of your database interaction needs.
Reason 5. (Globalization). The world is getting smaller, you may have a team of different nationalities, not everybody has English as a native language. It would be easier for a non-native English language programmer to think of "Repository" than of "Repositories", or "Status" instead of "Statuses". Having singular names can lead to fewer errors caused by typos, save time by not having to think "is it Child or Children?", hence improving productivity.
Reason 6. (Why not?). It can even save you writing time, save you disk space, and even make your computer keyboard last longer!
SELECT Customer.CustomerName FROM Customer WHERE Customer.CustomerID = 100
SELECT Customers.CustomerName FROM Customers WHERE Customers.CustomerID = 103
You have saved 3 letters, 3 bytes, 3 extra keyboard hits :)
And finally, you can name those ones messing up with reserved names like:
User > LoginUser, AppUser, SystemUser, CMSUser,...
Or use the infamous square brackets [User]
I prefer to use the uninflected noun, which in English happens to be singular.
Inflecting the number of the table name causes orthographic problems (as many of the other answers show), but choosing to do so because tables usually contain multiple rows is also semantically full of holes. This is more obvious if we consider a language that inflects nouns based on case (as most do):
Since we're usually doing something with the rows, why not put the name in the accusative case? If we have a table that we write to more than we read, why not put the name in dative? It's a table of something, why not use the genitive? We wouldn't do this, because the table is defined as an abstract container that exists regardless of its state or usage. Inflecting the noun without a precise and absolute semantic reason is babbling.
Using the uninflected noun is simple, logical, regular and language-independent.
If you use Object Relational Mapping tools or will in the future I suggest Singular.
Some tools like LLBLGen can automatically correct plural names like Users to User without changing the table name itself. Why does this matter? Because when it's mapped you want it to look like User.Name instead of Users.Name or worse from some of my old databases tables naming tblUsers.strName which is just confusing in code.
My new rule of thumb is to judge how it will look once it's been converted into an object.
one table I've found that does not fit the new naming I use is UsersInRoles. But there will always be those few exceptions and even in this case it looks fine as UsersInRoles.Username.
Others have given pretty good answers as far as "standards" go, but I just wanted to add this... Is it possible that "User" (or "Users") is not actually a full description of the data held in the table? Not that you should get too crazy with table names and specificity, but perhaps something like "Widget_Users" (where "Widget" is the name of your application or website) would be more appropriate.
What convention requires that tables have singular names? I always thought it was plural names.
A user is added to the Users table.
This site agrees:
http://vyaskn.tripod.com/object_naming.htm#Tables
This site disagrees (but I disagree with it):
http://justinsomnia.org/writings/naming_conventions.html
As others have mentioned: these are just guidelines. Pick a convention that works for you and your company/project and stick with it. Switching between singular and plural or sometimes abbreviating words and sometimes not is much more aggravating.
How about this as a simple example:
SELECT Customer.Name, Customer.Address FROM Customer WHERE Customer.Name > "def"
vs.
SELECT Customers.Name, Customers.Address FROM Customers WHERE Customers.Name > "def"
The SQL in the latter is stranger sounding than the former.
I vote for singular.
I am of the firm belief that in an Entity Relation Diagram, the entity should be reflected with a singular name, similar to a class name being singular. Once instantiated, the name reflects its instance. So with databases, the entity when made into a table (a collection of entities or records) is plural. Entity, User is made into table Users. I would agree with others who suggested maybe the name User could be improved to Employee or something more applicable to your scenario.
This then makes more sense in a SQL statement because you are selecting from a group of records and if the table name is singular, it doesn't read well.
I stick with singular for table names and any programming entity.
The reason? The fact that there are irregular plurals in English like mouse ⇒ mice and sheep ⇒ sheep. Then, if I need a collection, i just use mouses or sheeps, and move on.
It really helps the plurality stand out, and I can easily and programatically determine what the collection of things would look like.
So, my rule is: everything is singular, every collection of things is singular with an s appended. Helps with ORMs too.
IMHO, table names should be plural like Customers.
Class names should be singular like Customer if it maps to a row in the Customers table.
Singular. I don't buy any argument involving which is most logical - every person thinks his own preference is most logical. No matter what you do it is a mess, just pick a convention and stick to it. We are trying to map a language with highly irregular grammar and semantics (normal spoken and written language) to a highly regular (SQL) grammar with very specific semantics.
My main argument is that I don't think of the tables as a set but as relations.
So, the AppUser relation tells which entities are AppUsers.
The AppUserGroup relation tells me which entities are AppUserGroups
The AppUser_AppUserGroup relation tells me how the AppUsers and AppUserGroups are related.
The AppUserGroup_AppUserGroup relation tells me how AppUserGroups and AppUserGroups are related (i.e. groups member of groups).
In other words, when I think about entities and how they are related I think of relations in singular, but of course, when I think of the entities in collections or sets, the collections or sets are plural.
In my code, then, and in the database schema, I use singular. In textual descriptions, I end up using plural for increased readability - then use fonts etc. to distinguish the table/relation name from the plural s.
I like to think of it as messy, but systematic - and this way there is always a systematically generated name for the relation I wish to express, which to me is very important.
I also would go with plurals, and with the aforementioned Users dilemma, we do take the square bracketing approach.
We do this to provide uniformity between both database architecture and application architecture, with the underlying understanding that the Users table is a collection of User values as much as a Users collection in a code artifact is a collection of User objects.
Having our data team and our developers speaking the same conceptual language (although not always the same object names) makes it easier to convey ideas between them.
I personaly prefer to use plural names to represent a set, it just "sounds" better to my relational mind.
At this exact moment i am using singular names to define a data model for my company, because most of the people at work feel more confortable with it.
Sometimes you just have to make life easier to everyone instead of imposing your personal preferences.
(that's how i ended up in this thread, to get a confirmation on what should be the "best practice" for naming tables)
After reading all the arguing in this thread, i reached one conclusion:
I like my pancakes with honey, no matter what everybody's favorite flavour is. But if i am cooking for other people, i will try to serve them something they like.
Singular. I'd call an array containing a bunch of user row representation objects 'users', but the table is 'the user table'. Thinking of the table as being nothing but the set of the rows it contains is wrong, IMO; the table is the metadata, and the set of rows is hierarchically attached to the table, it is not the table itself.
I use ORMs all the time, of course, and it helps that ORM code written with plural table names looks stupid.
I've actually always thought it was popular convention to use plural table names. Up until this point I've always used plural.
I can understand the argument for singular table names, but to me plural makes more sense. A table name usually describes what the table contains. In a normalized database, each table contains specific sets of data. Each row is an entity and the table contains many entities. Thus the plural form for the table name.
A table of cars would have the name cars and each row is a car. I'll admit that specifying the table along with the field in a table.field manner is the best practice and that having singular table names is more readable. However in the following two examples, the former makes more sense:
SELECT * FROM cars WHERE color='blue'
SELECT * FROM car WHERE color='blue'
Honestly, I will be rethinking my position on the matter, and I would rely on the actual conventions used by the organization I'm developing for. However, I think for my personal conventions, I'll stick with plural table names. To me it makes more sense.
I don't like plural table names because some nouns in English are not countable (water, soup, cash) or the meaning changes when you make it countable (chicken vs a chicken; meat vs bird).
I also dislike using abbreviations for table name or column name because doing so adds extra slope to the already steep learning curve.
Ironically, I might make User an exception and call it Users because of USER (Transac-SQL), because I too don't like using brackets around tables if I don't have to.
I also like to name all the ID columns as Id, not ChickenId or ChickensId (what do plural guys do about this?).
All this is because I don't have proper respect for the database systems, I am just reapplying one-trick-pony knowledge from OO naming conventions like Java's out of habit and laziness. I wish there were better IDE support for complicated SQL.
We run similar standards, when scripting we demand [ ] around names, and where appropriate schema qualifiers - primarily it hedges your bets against future name grabs by the SQL syntax.
SELECT [Name] FROM [dbo].[Customer] WHERE [Location] = 'WA'
This has saved our souls in the past - some of our database systems have run 10+ years from SQL 6.0 through SQL 2005 - way past their intended lifespans.
If we look at MS SQL Server's system tables, their names as assigned by Microsoft are in plural.
The Oracle's system tables are named in singular. Although a few of them are plural.
Oracle recommends plural for user-defined table names.
That doesn't make much sense that they recommend one thing and follow another.
That the architects at these two software giants have named their tables using different conventions, doesn't make much sense either... After all, what are these guys ... PhD's?
I do remember in academia, the recommendation was singular.
For example, when we say:
select OrderHeader.ID FROM OrderHeader WHERE OrderHeader.Reference = 'ABC123'
maybe b/c each ID is selected from a particular single row ...?
The system tables/views of the server itself (SYSCAT.TABLES, dbo.sysindexes, ALL_TABLES, information_schema.columns, etc.) are almost always plural. I guess for the sake of consistency I'd follow their lead.
I am a fan of singular table names as they make my ER diagrams using CASE syntax easier to read, but by reading these responses I'm getting the feeling it never caught on very well? I personally love it. There is a good overview with examples of how readable your models can be when you use singular table names, add action verbs to your relationships and form good sentences for every relationships. It's all a bit of overkill for a 20 table database but if you have a DB with hundreds of tables and a complex design how will your developers ever understand it without a good readable diagram?
http://www.aisintl.com/case/method.html
As for prefixing tables and views I absolutely hate that practice. Give a person no information at all before giving them possibly bad information. Anyone browsing a db for objects can quite easily tell a table from a view, but if I have a table named tblUsers that for some reason I decide to restructure in the future into two tables, with a view unifying them to keep from breaking old code I now have a view named tblUsers. At this point I am left with two unappealing options, leave a view named with a tbl prefix which may confuse some developers, or force another layer, either middle tier or application to be rewritten to reference my new structure or name viewUsers. That negates a large part of the value of views IMHO.
Tables: plural
Multiple users are listed in the users table.
Models: singular
A singular user can be selected from the users table.
Controllers: plural
http://myapp.com/users would list multiple users.
That's my take on it anyway.
I once used "Dude" for the User table - same short number of characters, no conflict with keywords, still a reference to a generic human. If I weren't concerned about the stuffy heads that might see the code, I would have kept it that way.
I've always used singular simply because that's what I was taught. However, while creating a new schema recently, for the first time in a long time, I actively decided to maintain this convention simply because... it's shorter. Adding an 's' to the end of every table name seems as useless to me as adding 'tbl_' in front of every one.
This may be a bit redundant, but I would suggest being cautious. Not necessarily that it's a bad thing to rename tables, but standardization is just that; a standard -- this database may already be "standardized", however badly :) -- I would suggest consistency to be a better goal given that this database already exists and presumably it consists of more than just 2 tables.
Unless you can standardize the entire database, or at least are planning to work towards that end, I suspect that table names are just the tip of the iceberg and concentrating on the task at hand, enduring the pain of poorly named objects, may be in your best interest --
Practical consistency sometimes is the best standard... :)
my2cents ---
As others have mentioned here, conventions should be a tool for adding to the ease of use and readability. Not as a shackle or a club to torture developers.
That said, my personal preference is to use singular names for both tables and columns. This probably comes from my programming background. Class names are generally singular unless they are some sort of collection. In my mind I am storing or reading individual records in the table in question, so singular makes sense to me.
This practice also allows me to reserve plural table names for those that store many-to-many relationships between my objects.
I try to avoid reserved words in my table and column names, as well. In the case in question here it makes more sense to go counter to the singular convention for Users to avoid the need to encapsulate a table that uses the reserved word of User.
I like using prefixes in a limited manner (tbl for table names, sp_ for proc names, etc), though many believe this adds clutter. I also prefer CamelBack names to underscores because I always end up hitting the + instead of _ when typing the name. Many others disagree.
Here is another good link for naming convention guidelines: http://www.xaprb.com/blog/2008/10/26/the-power-of-a-good-sql-naming-convention/
Remember that the most important factor in your convention is that it make sense to the people interacting with the database in question. There is no "One Ring to Rule Them All" when it comes to naming conventions.
Possible alternatives:
Rename the table SystemUser
Use brackets
Keep the plural table names.
IMO using brackets is technically the safest approach, though it is a bit cumbersome. IMO it's 6 of one, half-a-dozen of the other, and your solution really just boils down to personal/team preference.
My take is in semantics depending on how you define your container. For example, A "bag of apples" or simply "apples" or an "apple bag" or "apple".
Example:
a "college" table can contain 0 or more colleges
a table of "colleges" can contain 0 or more collegues
a "student" table can contain 0 or more students
a table of "students" can contain 0 or more students.
My conclusion is that either is fine but you have to define how you (or people interacting with it) are going to approach when referring to the tables; "a x table" or a "table of xs"
I think using the singular is what we were taught in university. But at the same time you could argue that unlike in object oriented programming, a table is not an instance of its records.
I think I'm tipping in favour of the singular at the moment because of plural irregularities in English. In German it's even worse due to no consistent plural forms - sometimes you cannot tell if a word is plural or not without the specifying article in front of it (der/die/das). And in Chinese languages there are no plural forms anyway.
I only use nouns for my table names that are spelled the same, whether singular or plural:
moose
fish
deer
aircraft
you
pants
shorts
eyeglasses
scissors
species
offspring
I did not see this clearly articulated in any of the previous answers. Many programmers have no formal definition in mind when working with tables. We often communicate intuitively in terms of of "records" or "rows". However, with some exceptions for denormalized relations, tables are usually designed so that the relation between the non-key attributes and the key constitutes a set theoretic function.
A function can be defined as a subset of a cross-product between two sets, in which each element of the set of keys occurs at most once in the mapping. Hence the terminology arising from from that perspective tends to be singular. One sees the same singular (or at least, non-plural) convention across other mathematical and computational theories involving functions (algebra and lambda calculus for instance).
I always thought that was a dumb convention. I use plural table names.
(I believe the rational behind that policy is that it make it easier for ORM code generators to produce object & collection classes, since it is easier to produce a plural name from a singular name than vice-versa)