SQL Schema to allow users to add comments to various tables - sql

So, I'm building a website and I'm going to have standard CMS tables like Article, Blog, Poll, etc. I want to let users post comments to any of these items. So my question is, do I need to create separate comments tables for each (e.g. ArticleComment, BlogComment, PollComment), or can I just make a generic Comment table that could be used with any table? What has worked for people?
Method 1: Many Comment Tables
Article {ArticleID [PK], Title, FriendlyUrl}
ArticleComment {ArticleCommentID [PK], ArticleID [FK], Comment}
Blog {BlogID, Title, PubDate, Category}
BlogComment {BlogCommentID [PK], BlogID [FK], Comment}
Poll {PollID, Title, IsClosed}
PollComment {PollCommentID [PK], PollID [FK], Comment}
Method 2: Single Comment Table
Article {ArticleID [PK], Title, FriendlyUrl}
Blog {BlogID, Title, PubDate, Category}
Poll {PollID, Title, IsClosed}
Comment {CommentID [PK], ReferenceID [FK], Comment}

I'd go with the generic comment table. It will make a lot of things much simpler. I'd also tag comments with the ID of the user who created them, or other source-identifying information (IP address, etc.). Even if you don't display this it can be very handy when you have to clean up spam, etc.

There seem to be two major ways of mapping OO-inheritance to relational databases:
Take all the attributes from the parent class and all the child classes and put them in the table, together with a 'which class is this?' field. Each object is serialized as one row in one table.
Create one table for the parent class and one table for each child class. The parent class table contains the 'which class is this?' field. Each child class table contains a foreign key pointing to the parent class table. Each object is serialized as one row in the parent class table and one row in the child class table.
Method one doesn't really scale well: it quickly winds up with lots of nullable fields, almost always null, and scary CHECK constraints. But it is fairly simple for small class hierarchies.
Method two scales much better, but is more work. It also results in many more tables in your schema.
I suggest taking a look at method two for your Articles/Polls/Blogs tables; to me, they sound like child tables of a Content table or something similar. You will then have a very clear and easy place to attach comments: to the Content.
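A minimal sketch of method two applied to this question, assuming SQLite through Python's sqlite3 module. The Content table, its ContentType column, and the sample rows are hypothetical names chosen for illustration; only Article, Poll, and Comment come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
-- Parent table: one row per piece of content, with the 'which class is this?' field.
CREATE TABLE Content (
    ContentID   INTEGER PRIMARY KEY,
    ContentType TEXT NOT NULL CHECK (ContentType IN ('article', 'blog', 'poll')),
    Title       TEXT NOT NULL
);
-- Child tables: same key as the parent, plus the type-specific columns.
CREATE TABLE Article (
    ContentID   INTEGER PRIMARY KEY REFERENCES Content(ContentID),
    FriendlyUrl TEXT
);
CREATE TABLE Poll (
    ContentID INTEGER PRIMARY KEY REFERENCES Content(ContentID),
    IsClosed  INTEGER NOT NULL DEFAULT 0
);
-- Comments attach to the parent, so the foreign key is a real one.
CREATE TABLE Comment (
    CommentID INTEGER PRIMARY KEY,
    ContentID INTEGER NOT NULL REFERENCES Content(ContentID),
    Comment   TEXT NOT NULL
);
""")

conn.execute("INSERT INTO Content VALUES (1, 'article', 'Hello')")
conn.execute("INSERT INTO Article VALUES (1, 'hello')")
conn.execute("INSERT INTO Content VALUES (2, 'poll', 'Favourite database?')")
conn.execute("INSERT INTO Poll VALUES (2, 0)")
conn.executemany(
    "INSERT INTO Comment (ContentID, Comment) VALUES (?, ?)",
    [(1, "Nice article"), (2, "I voted"), (1, "Agreed")],
)

def comments_for(content_id):
    """Return the comment texts for one content row, oldest first."""
    rows = conn.execute(
        "SELECT Comment FROM Comment WHERE ContentID = ? ORDER BY CommentID",
        (content_id,),
    )
    return [row[0] for row in rows]
```

Because every article, blog, and poll has a Content row, a comment that points at a non-existent item is rejected by the database rather than by application code.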

Why do you want to keep all of your comments in the same table? Will you be treating all comments as a group? If you don't anticipate working with all of the comments on all items as a single group then there isn't really a reason to bunch them all together. Just because two entities in a database share the same attributes doesn't mean that they should be put in the same physical table.

I'd suggest just one comment table, adding an ItemID field that tells which type of item the comment is for:
Article {ArticleID [PK], Title, FriendlyUrl}
Blog {BlogID, Title, PubDate, Category}
Poll {PollID, Title, IsClosed}
Comment {CommentID [PK], ReferenceID [FK], ItemID, Comment}
Item {ItemID, Type}
The last table would contain records such as (1, 'article'), (2, 'blog'), etc.
That way you'll be able to identify which content type each comment was made for.
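A rough sketch of how that lookup might work, assuming SQLite through Python's sqlite3 module. The index name, the helper function, and the sample rows are made up for illustration. Note that in this layout ReferenceID cannot be a database-enforced foreign key, because it points at a different table depending on ItemID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, Type TEXT NOT NULL UNIQUE);
CREATE TABLE Comment (
    CommentID   INTEGER PRIMARY KEY,
    ReferenceID INTEGER NOT NULL,  -- ArticleID, BlogID or PollID, depending on ItemID
    ItemID      INTEGER NOT NULL REFERENCES Item(ItemID),
    Comment     TEXT NOT NULL
);
-- Hypothetical index so that per-item lookups do not scan the whole table.
CREATE INDEX idx_comment_target ON Comment (ItemID, ReferenceID);

INSERT INTO Item VALUES (1, 'article'), (2, 'blog'), (3, 'poll');
INSERT INTO Comment (ReferenceID, ItemID, Comment) VALUES
    (7, 1, 'Comment on article 7'),
    (7, 2, 'Comment on blog 7'),
    (7, 1, 'Another on article 7');
""")

def comments_for(item_type, reference_id):
    """Return comment texts for one item of the given type, oldest first."""
    rows = conn.execute(
        """
        SELECT c.Comment
        FROM Comment AS c
        JOIN Item AS i ON i.ItemID = c.ItemID
        WHERE i.Type = ? AND c.ReferenceID = ?
        ORDER BY c.CommentID
        """,
        (item_type, reference_id),
    )
    return [row[0] for row in rows]
```

Article 7 and blog 7 share the same ReferenceID here; the ItemID column is what keeps their comments apart.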

I am working on a system where we used the following model for comments:
Data Table(s)       Many-to-many Assoc            Comment Table
CommentableId   ->  CommentableId/CommentId   ->  Comment_Id
Not my design, but I like the flexibility. It allows us to use one comment in many different places. Since this is not trivial to implement in the UI, users don't get to see this feature (just a text box to type in a comment), but it is used when we do batch imports and legacy data processing in the database.
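For illustration only, here is how such an association table could look in SQLite via Python's sqlite3 module. The table name CommentableComment, the Body column, and the sample ids are assumptions; the real system's names are not given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Comment (
    CommentId INTEGER PRIMARY KEY,
    Body      TEXT NOT NULL
);
-- Association table: each data row exposes a CommentableId, and one comment
-- may be linked to any number of them.
CREATE TABLE CommentableComment (
    CommentableId INTEGER NOT NULL,
    CommentId     INTEGER NOT NULL REFERENCES Comment(CommentId),
    PRIMARY KEY (CommentableId, CommentId)
);

INSERT INTO Comment VALUES (1, 'Imported from legacy system'), (2, 'Checked by QA');
-- Comment 1 is reused on two different commentable rows (100 and 200).
INSERT INTO CommentableComment VALUES (100, 1), (200, 1), (200, 2);
""")

def comments_for(commentable_id):
    """Return the comment bodies linked to one commentable row."""
    rows = conn.execute(
        """
        SELECT c.Body
        FROM CommentableComment AS cc
        JOIN Comment AS c ON c.CommentId = cc.CommentId
        WHERE cc.CommentableId = ?
        ORDER BY c.CommentId
        """,
        (commentable_id,),
    )
    return [row[0] for row in rows]
```

A batch import can then attach one existing comment to many rows by inserting link rows only, without duplicating the comment text.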

Related

How appropriate is one column table

Does this kind of design come along with overhead or data redundancy?
The table structure should support CRUD on tags (for something like manga/anime tags), so that specific resources can be found by selecting tags. * marks a primary key column.
tag (tagID*, tagName)
tagMap (tagSetID*, tagID*)
tagSet (tagSetID*)
announce (announceID*, tagSetID, title, content)
There is nothing at all wrong with your design. Most of the time, we might expect the tagSet table to also have a name column, e.g.
tagSet (tagSetID*, tagSetName)
That you don't have one isn't really an issue. This is really a standard many to many relationship between tags and sets, with the tagMap table serving as the junction table.
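As a sketch of how the junction table is used, assuming SQLite via Python's sqlite3 module (the sample rows and the helper function name are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag    (tagID INTEGER PRIMARY KEY, tagName TEXT NOT NULL UNIQUE);
CREATE TABLE tagSet (tagSetID INTEGER PRIMARY KEY);
-- Junction table: composite primary key, one row per (set, tag) pair.
CREATE TABLE tagMap (
    tagSetID INTEGER NOT NULL REFERENCES tagSet(tagSetID),
    tagID    INTEGER NOT NULL REFERENCES tag(tagID),
    PRIMARY KEY (tagSetID, tagID)
);
CREATE TABLE announce (
    announceID INTEGER PRIMARY KEY,
    tagSetID   INTEGER REFERENCES tagSet(tagSetID),
    title      TEXT NOT NULL,
    content    TEXT
);

INSERT INTO tag VALUES (1, 'manga'), (2, 'anime');
INSERT INTO tagSet VALUES (1), (2);
INSERT INTO tagMap VALUES (1, 1), (1, 2), (2, 2);
INSERT INTO announce VALUES
    (1, 1, 'New volume', 'Volume 12 is out'),
    (2, 2, 'New season', 'Season 3 announced');
""")

def titles_tagged(tag_name):
    """Return the titles of announcements whose tag set contains the tag."""
    rows = conn.execute(
        """
        SELECT a.title
        FROM announce AS a
        JOIN tagMap AS m ON m.tagSetID = a.tagSetID
        JOIN tag    AS t ON t.tagID = m.tagID
        WHERE t.tagName = ?
        ORDER BY a.announceID
        """,
        (tag_name,),
    )
    return [row[0] for row in rows]
```

Each tag name is stored once in tag, so the only repeated values are the integer keys in tagMap.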

Site-wide comments with different type of pages and special requirements

I am interested in designing the database (well, I'm only concerned about one table really) for a site with the following requirements:
There is an items page, which lists items. items.xyz?id=t displays the item with ID t. I need the IDs of the items to be consecutive. The first item has ID 1, the second ID 2 and so on. Each item page has comments on that item.
There are other pages, such as objects, where objects.xyz?id=t displays the object with ID t. The IDs here need not necessarily be consecutive (and they can overlap with item IDs, but it's ok if you suggest something that forces them not to overlap). These also have comments.
My question is how to design the Comments table. If I have an EntityID in it that represents the page the comment should be displayed on (be it an item page or an object page), then should I make it so that the ItemID never overlaps the ObjectID by making all ObjectIDs start from, say, 10^9 and using a GUID table? (The ItemIDs increase very slowly.) Is this acceptable practice?
Right now I'm doing it by having a bunch of nullable boolean fields in each comment: IsItem, IsObjectType1, IsObjectType2, ..., which allows me to know where each comment should be displayed. This isn't so bad since I only have a few objects, but it seems like an ugly hack.
What is the best way to go about this?
I see three solutions (assuming it is impossible or undesired to put Pages and Objects in one table). Either:
Tell the comment which it belongs to by giving it two columns: PageId and ObjectId.
That way you can also give these columns foreign keys to the respective tables and add proper indexes.
Introduce a table 'Entity' that has a unique id, a PageId and an ObjectId. Either column is optional, of course, but exactly one of them must be filled: not neither, not both.
This way, you move all the potential garbage of having separate entities to this table, not polluting the Comments table, which should contain just comments. You isolate the mess.
Create a link table between Comments and Items and another table between Comments and Objects. Items and Objects are completely unrelated, and you don't have to pollute the Comments table with a lot of NULL values in multiple columns. When you create a comment, you decide if it links to an Item or an Object by inserting a link in either ItemComments or ObjectComments. Reading comments for an item or object is a matter of two simple joins.
With the Entity table, the Comments table can then contain only a single EntityId that refers to the Id in the Entity table.
The big advantage to this approach is twofold:
a) You can link other things to the same table too, without much hassle.
b) You can add other kinds of Entities and they will automatically support Comments and other things you might add, as mentioned in a).
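One possible way to express the Entity table and its "exactly one of them must be filled" rule, sketched with SQLite via Python's sqlite3 module. The CHECK expression, the UNIQUE constraints, and the Body column are my assumptions about how the rule could be enforced, not something the answer spells out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Pages   (PageId   INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Objects (ObjectId INTEGER PRIMARY KEY, Name  TEXT);

CREATE TABLE Entity (
    EntityId INTEGER PRIMARY KEY,
    PageId   INTEGER UNIQUE REFERENCES Pages(PageId),
    ObjectId INTEGER UNIQUE REFERENCES Objects(ObjectId),
    -- Exactly one of the two columns is NULL, so exactly one is filled.
    CHECK ((PageId IS NULL) <> (ObjectId IS NULL))
);

CREATE TABLE Comments (
    CommentId INTEGER PRIMARY KEY,
    EntityId  INTEGER NOT NULL REFERENCES Entity(EntityId),
    Body      TEXT NOT NULL
);

INSERT INTO Pages   VALUES (1, 'First page'), (2, 'Second page');
INSERT INTO Objects VALUES (1, 'First object'), (2, 'Second object');
INSERT INTO Entity (EntityId, PageId, ObjectId) VALUES (10, 1, NULL), (11, NULL, 1);
INSERT INTO Comments (EntityId, Body) VALUES (10, 'On the page'), (11, 'On the object');
""")

def comments_for_page(page_id):
    """Return comment bodies for a page, resolved through the Entity table."""
    rows = conn.execute(
        """
        SELECT c.Body
        FROM Comments AS c
        JOIN Entity AS e ON e.EntityId = c.EntityId
        WHERE e.PageId = ?
        ORDER BY c.CommentId
        """,
        (page_id,),
    )
    return [row[0] for row in rows]
```

Page 1 and object 1 share the same numeric id, but they map to different EntityId values, so their comments never mix.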

Ideal database schema for similar structures

In our business application, we have a need to store user or system generated "comments" about a particular entity. For example, comments can be created about a customer, an order, or a purchase order. These comments all share many of the same characteristics, with the exception of the referenced entity.
All comments require the date, time, user, and comment text. They also require a foreign key to the referenced table so that you can look up comments for that particular entity.
Currently the application has a separate table for each type of comment (e.g. customer_comments, order_comments, purchaseOrder_comments). The schemas of these tables are all identical with respect to the date, time, user and comment text, but each has a FK to the respective table that the comment is for.
Is this the best design or is there a better design?
Personally, I'd create a single comment table and then use one intersection table per entity (customer, order, etc.) to link the comment with the entity.
You could put all these comments into one table, comments, and add another column, commented_table, which would make it a polymorphic many-to-one. Some frameworks/ORMs (like Rails' ActiveRecord) have built-in support for that; if whatever you're using doesn't, it's basically as simple as adding another WHERE clause.
If it had been me, I think I would have had a single Comments table with an extra comment_type_id field that maps to whether the comment should be available for customer entities, order entities, etc... Somewhere else you'd need to have a comment_type table that comment_type_id refers to.
This way, if you decide to add an extra piece of meta-data to comments, you only have to do it in a single Comments table rather than in customer_comments, order_comments, etc...
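To make the first suggestion concrete (a single comment table plus one intersection table per entity), here is a small sketch using SQLite via Python's sqlite3 module. The link table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id)
);

-- One table holds the attributes every comment shares.
CREATE TABLE comments (
    comment_id   INTEGER PRIMARY KEY,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    user_name    TEXT NOT NULL,
    comment_text TEXT NOT NULL
);

-- One intersection table per entity keeps real foreign keys on both sides.
CREATE TABLE customer_comment_link (
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    comment_id  INTEGER NOT NULL REFERENCES comments(comment_id),
    PRIMARY KEY (customer_id, comment_id)
);
CREATE TABLE order_comment_link (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    comment_id INTEGER NOT NULL REFERENCES comments(comment_id),
    PRIMARY KEY (order_id, comment_id)
);

INSERT INTO customers VALUES (1, 'Acme');
INSERT INTO orders VALUES (500, 1);
INSERT INTO comments (comment_id, user_name, comment_text) VALUES
    (1, 'alice',  'Prefers email contact'),
    (2, 'system', 'Order shipped late');
INSERT INTO customer_comment_link VALUES (1, 1);
INSERT INTO order_comment_link VALUES (500, 2);
""")

def customer_comments(customer_id):
    """Return (user_name, comment_text) pairs for one customer."""
    return list(conn.execute(
        """
        SELECT c.user_name, c.comment_text
        FROM customer_comment_link AS l
        JOIN comments AS c ON c.comment_id = l.comment_id
        WHERE l.customer_id = ?
        ORDER BY c.comment_id
        """,
        (customer_id,),
    ))

def order_comments(order_id):
    """Return (user_name, comment_text) pairs for one order."""
    return list(conn.execute(
        """
        SELECT c.user_name, c.comment_text
        FROM order_comment_link AS l
        JOIN comments AS c ON c.comment_id = l.comment_id
        WHERE l.order_id = ?
        ORDER BY c.comment_id
        """,
        (order_id,),
    ))
```

Adding a new piece of comment metadata then means altering only the comments table; the link tables stay as they are.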

What to do if 2 (or more) relationship tables would have the same name?

So I know the convention for naming M-M relationship tables in SQL is to have something like so:
For tables User and Data the relationship table would be called
UserData
User_Data
or something similar (from here)
What happens then if you need to have multiple relationships between User and Data, representing each in its own table? I have a site I'm working on where I have two primary items and multiple independent M-M relationships between them. I know I could just use a single relationship table and have a field which determines the relationship type, but I'm not sure whether this is a good solution. Assuming I don't go that route, what naming convention should I follow to work around my original problem?
To make it clearer, say my site is an auction site (it isn't, but the principle is similar). I have registered users and I have items. A user does not have to be registered to post an item, but they do need to be registered to do anything else. I have a table User, which has info on registered users, and a table Items, which has info on posted items. Now a user can bid on an item, but they can also report an item (spam, etc.); both of these are M-M relationships. All that happens when either event occurs is that an email is generated. In my scenario I have no reason to keep track of the actual "report" or "bid" other than to know who bid on or reported what.
I think you should name tables after their function. Let's say we have Cars and People tables. A car has owners and a car has assigned drivers. A driver can have more than one car. You could call one of the tables CarsDrivers and the other CarsOwners.
EDIT
In your situation I think you should have two tables: AuctionsBids and AuctionsReports. I believe that a report requires an additional dictionary (spam, illegal item, ...) and a bid requires other parameters like price and bid date, so having two tables is justified. You will probably be accessing bids more often than reports. Sending email will be slightly more complicated than when this data is stored in one table, but it is not really a big problem.
I don't really see this as a true M-M mapping table. Those usually are JUST a mapping. From your example most of these will have additional information as well. For example, a table of bids, which would have a User and an Item, will probably have info on what the bid was, when it was placed, etc. I would call this table... wait for it... Bids.
For reporting items you might want what was offensive about it, when it was placed, etc. Call this table OffenseReports or something.
You can name tables whatever you want. I would just name them something that makes sense. I think the convention of naming them Table1Table2 is just because sometimes the relationships don't make a lot of sense to an outside observer.
There's no official or unofficial convention for relation or table names. You can name them however you like.
If you have multiple user_data relationships with the same keys that makes absolutely no sense. If you have different keys, name the relation in a descriptive way like: stores_products_manufacturers or stores_products_paymentMethods
I think you're only confused because the join tables are currently simple. Once you add more information, I think it will be obvious that you should append a functional suffix. For example:
Table User
    UserID
    EmailAddress
Table Item
    ItemID
    ItemDescription
Table UserItem_SpamReport
    UserID
    ItemID
    ReportDate
Table UserItem_Post
    UserID -- can be (NULL, -1, '', ...)
    ItemID
    PostDate
Table UserItem_Bid
    UserId
    ItemId
    BidDate
    BidAmount
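A hedged sketch of two of these tables in SQLite via Python's sqlite3 module, to show that each relationship table answers its own "who did what" question. The column types, the composite key on the report table, and the sample data are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (UserID INTEGER PRIMARY KEY, EmailAddress TEXT NOT NULL);
CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, ItemDescription TEXT NOT NULL);

-- A user may bid on the same item several times, so there is no composite key here.
CREATE TABLE UserItem_Bid (
    UserID    INTEGER NOT NULL REFERENCES User(UserID),
    ItemID    INTEGER NOT NULL REFERENCES Item(ItemID),
    BidDate   TEXT    NOT NULL,
    BidAmount INTEGER NOT NULL
);
-- Assumption: a user reports a given item at most once.
CREATE TABLE UserItem_SpamReport (
    UserID     INTEGER NOT NULL REFERENCES User(UserID),
    ItemID     INTEGER NOT NULL REFERENCES Item(ItemID),
    ReportDate TEXT    NOT NULL,
    PRIMARY KEY (UserID, ItemID)
);

INSERT INTO User VALUES (1, 'ann@example.com'), (2, 'bob@example.com');
INSERT INTO Item VALUES (10, 'Old lamp');
INSERT INTO UserItem_Bid VALUES
    (1, 10, '2024-01-01', 15),
    (2, 10, '2024-01-02', 20),
    (1, 10, '2024-01-03', 25);
INSERT INTO UserItem_SpamReport VALUES (2, 10, '2024-01-04');
""")

def bidders_for(item_id):
    """Return the distinct email addresses of users who bid on an item."""
    rows = conn.execute(
        """
        SELECT DISTINCT u.EmailAddress
        FROM UserItem_Bid AS b
        JOIN User AS u ON u.UserID = b.UserID
        WHERE b.ItemID = ?
        ORDER BY u.EmailAddress
        """,
        (item_id,),
    )
    return [row[0] for row in rows]

def reporters_for(item_id):
    """Return the email addresses of users who reported an item."""
    rows = conn.execute(
        """
        SELECT u.EmailAddress
        FROM UserItem_SpamReport AS r
        JOIN User AS u ON u.UserID = r.UserID
        WHERE r.ItemID = ?
        ORDER BY u.EmailAddress
        """,
        (item_id,),
    )
    return [row[0] for row in rows]
```

The two tables link the same pair of parents, yet their extra columns and uniqueness rules differ, which is why a functional suffix reads more naturally than a bare UserItem name.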
Then the relation will have a Role. For instance a stock has 2 companies associated: an issuer and a buyer. The relationship is defined by the role the parent and child play to each other.
You could put each role in a separate table named after the role (e.g. Stock_Issuer, Stock_Buyer, etc.; each is a one-to-many relationship between company and stock), or record the role in a relation-type column on a single relationship table.
The stock example is pretty fixed, so two tables would be fine. When there are multiple types of relations possible and you can't foresee them now, normalizing it into a relationtype column would seem the better option.
This also depends on the quality of the developers who have to work with your model. The column approach is a bit more abstract, but if they don't get it, maybe they'd better stay away from databases altogether.
Both will work fine I guess.
Good luck, GJ

SQL Database Structure for Custom Categories

I am creating an online blog website and for each blog post, I'd like to have the user be able to create/edit/delete their own category so that they may categorize their post.
What is generally considered a good database design for user generated categories?
This is my proposed table design. (Is there a name for this type of db?)
USER_TABLE
user_id (pk), user_name
CATEGORY_TABLE
category_id (pk), category_name
USER_CATEGORIES
user_id (fk), category_id (fk)
Thanks for helping out. I'm confident there's a post somewhere regarding this but I was unable to find it. If this is a dupe please let me know and I will remove this question.
This is a many to many relationship. This would allow each user to potentially have many different categories and each category to potentially have many different users. This seems like a useful model for what you are trying to do.
I think your schema looks good. You are keeping the category labels in one table to avoid duplication and then just assigning their IDs to the users.
If what you are trying to do is to have "private" categories for each user then this is fine.
If, on the other hand, categories are supposed to be public (something like tags on Stack Overflow), then you may consider another option: do not store the user<->category relationship; instead, add a use_counter field to the category table and use triggers to increment it when a category is used (a blog entry is categorized) or decrement it when it is "freed" (a blog entry is deleted or its category is removed). When use_counter reaches 0, remove the category.
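A minimal sketch of the use_counter idea with SQLite triggers, run through Python's sqlite3 module. The post_category table and the trigger names are hypothetical, and trigger syntax differs between database engines, so treat this as one possible shape rather than a drop-in solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL UNIQUE,
    use_counter   INTEGER NOT NULL DEFAULT 0
);
-- Hypothetical table recording which blog post uses which category.
CREATE TABLE post_category (
    post_id     INTEGER NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    PRIMARY KEY (post_id, category_id)
);

-- Increment the counter whenever a post is categorized.
CREATE TRIGGER category_used AFTER INSERT ON post_category
BEGIN
    UPDATE category SET use_counter = use_counter + 1
    WHERE category_id = NEW.category_id;
END;

-- Decrement when a categorization is removed; drop the category at zero.
CREATE TRIGGER category_freed AFTER DELETE ON post_category
BEGIN
    UPDATE category SET use_counter = use_counter - 1
    WHERE category_id = OLD.category_id;
    DELETE FROM category
    WHERE category_id = OLD.category_id AND use_counter <= 0;
END;
""")

def use_counter(category_name):
    """Return the current counter, or None if the category no longer exists."""
    row = conn.execute(
        "SELECT use_counter FROM category WHERE category_name = ?",
        (category_name,),
    ).fetchone()
    return row[0] if row else None

conn.execute("INSERT INTO category (category_id, category_name) VALUES (1, 'sql')")
conn.execute("INSERT INTO post_category VALUES (10, 1)")
conn.execute("INSERT INTO post_category VALUES (11, 1)")
count_after_two_posts = use_counter("sql")

conn.execute("DELETE FROM post_category WHERE post_id = 10")
count_after_one_removed = use_counter("sql")

conn.execute("DELETE FROM post_category WHERE post_id = 11")
count_after_all_removed = use_counter("sql")
```

One thing to watch with this design: a category inserted but never used starts at 0 and is only cleaned up after it has been used and freed at least once.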