Comment strategy needed - sql

I'm working on a new web application that contains a library with sections such as:
books
pictures
files
Each of these sections has different properties in the database, and I can't store their information in one table, so I need to create 3 different tables.
Visitors can comment on books, files and pictures, and I want to develop one comment module and store all comments in one table; let's call it (comments).
My question is: what strategy should I follow to get this done?
I am thinking about creating a reference column [reference_id] [nvarchar 50]
and storing the comments like this:
files_{id of file}
pictures_{id of picture} and so on... Would that be a good method?
Thanks

You should use separate ItemId and ItemType columns.
Additionally, you can create an ItemTypes table and store ItemId and ItemTypeId in the comments table.
A structure like pictures_{id of picture} will waste a lot of space and will not help performance or later code development.
Example: how would you extract the item type from something like this:
picture_1234
You have to search for "_", convert the remaining text to a number, and write a lot of SQL code...
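As a rough illustration of the separate-columns idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table and column names (item_types, comments, item_type_id, body) and the helper comments_for are assumptions made for this example, not names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_types (
        item_type_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE       -- 'book', 'picture', 'file'
    );
    CREATE TABLE comments (
        comment_id   INTEGER PRIMARY KEY,
        item_id      INTEGER NOT NULL,           -- id of the row in books/pictures/files
        item_type_id INTEGER NOT NULL REFERENCES item_types(item_type_id),
        body         TEXT NOT NULL
    );
    -- One index serves every "comments for this item" lookup.
    CREATE INDEX comments_by_item ON comments (item_type_id, item_id);

    INSERT INTO item_types (item_type_id, name)
        VALUES (1, 'book'), (2, 'picture'), (3, 'file');
    INSERT INTO comments (item_id, item_type_id, body) VALUES
        (1234, 2, 'Nice picture'),
        (1234, 1, 'Great book'),
        (1234, 2, 'Where was this taken?');
""")

def comments_for(item_type, item_id):
    """Return comment bodies for one item, with no string parsing involved."""
    rows = conn.execute(
        """SELECT c.body
           FROM comments c
           JOIN item_types t ON t.item_type_id = c.item_type_id
           WHERE t.name = ? AND c.item_id = ?
           ORDER BY c.comment_id""",
        (item_type, item_id),
    )
    return [body for (body,) in rows]

picture_comments = comments_for("picture", 1234)
```

Because the type and the id are plain columns, the same numeric id can exist in several tables without clashing, and filtering is an ordinary indexed comparison.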

I answered a very similar question:
In a StackOverflow clone, what relationship should a Comments table have to Questions and Answers?
In your case, I would recommend creating a single table Commentables:
CREATE TABLE Commentables (
item_id INT AUTO_INCREMENT PRIMARY KEY,
item_type CHAR(1) NOT NULL,
UNIQUE KEY (item_id, item_type)
);
Then each of Books, Pictures, Files has a 1:1 relationship to Commentables.
CREATE TABLE Books (
book_id INT PRIMARY KEY, -- but not auto-increment
item_type CHAR(1) NOT NULL DEFAULT 'B',
FOREIGN KEY (book_id, item_type) REFERENCES Commentables(item_id, item_type)
);
Do the same for Pictures and Files. The item_type should always be 'B' for books, always 'P' for pictures, always 'F' for files. Therefore you can't have a book and a picture reference the same row in Commentables.
Then your comments can reference one table, Commentables:
CREATE TABLE Comments (
comment_id INT AUTO_INCREMENT PRIMARY KEY,
item_id INT NOT NULL,
FOREIGN KEY (item_id) REFERENCES Commentables (item_id)
);
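To see how the pieces fit together, here is a small sketch of the same pattern run against SQLite through Python's sqlite3 module (a stand-in for MySQL, so the syntax differs slightly). The CHECK constraints, the title and body columns, and the add_book helper are assumptions added for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Commentables (
        item_id   INTEGER PRIMARY KEY,            -- assigned automatically
        item_type TEXT NOT NULL CHECK (item_type IN ('B', 'P', 'F')),
        UNIQUE (item_id, item_type)
    );
    CREATE TABLE Books (
        book_id   INTEGER PRIMARY KEY,            -- reuses the Commentables id
        item_type TEXT NOT NULL DEFAULT 'B' CHECK (item_type = 'B'),
        title     TEXT NOT NULL,
        FOREIGN KEY (book_id, item_type) REFERENCES Commentables (item_id, item_type)
    );
    CREATE TABLE Comments (
        comment_id INTEGER PRIMARY KEY,
        item_id    INTEGER NOT NULL REFERENCES Commentables (item_id),
        body       TEXT NOT NULL
    );
""")

def add_book(title):
    """Create the parent Commentables row first, then the book with the same id."""
    with conn:  # commits on success, rolls back on error
        item_id = conn.execute(
            "INSERT INTO Commentables (item_type) VALUES ('B')").lastrowid
        conn.execute(
            "INSERT INTO Books (book_id, title) VALUES (?, ?)", (item_id, title))
    return item_id

book_id = add_book("Example book")
conn.execute(
    "INSERT INTO Comments (item_id, body) VALUES (?, ?)", (book_id, "Good read"))

# A Commentables row of type 'P' cannot be claimed by a row in Books.
picture_id = conn.execute(
    "INSERT INTO Commentables (item_type) VALUES ('P')").lastrowid
try:
    conn.execute(
        "INSERT INTO Books (book_id, title) VALUES (?, ?)", (picture_id, "Oops"))
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True
```

The composite foreign key is what prevents a book and a picture from sharing one Commentables row; Comments only needs the single item_id.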

Related

PostgreSQL - Ensuring that at least one row exists in Table B for each row in Table A

Currently we have a products table which is fairly straightforward, the relevant part of the structure is something like this:
id SERIAL PRIMARY KEY,
title text NOT NULL,
description text NOT NULL,
[...]
We now need to support an arbitrary number of languages for the title and description of each product, and the default language can vary from product to product (some sites may be multilingual from page to page).
So far, nothing too difficult - add a product_metadata table something like this:
product_id int NOT NULL REFERENCES products(id),
language_code_id int NOT NULL REFERENCES language_codes(id),
title text NOT NULL,
description text NOT NULL,
[...]
CONSTRAINT product_metadata_pkey PRIMARY KEY (product_id, language_code_id)
It seems like the next logical step is to move the existing title and description data into the new table and remove those columns from products, but this means that new rows in products can be added without a title or description.
Using id SERIAL PRIMARY KEY in product_metadata (and replacing the existing composite primary key with a unique constraint) and adding a default_metadata_id int NOT NULL REFERENCES product_metadata(id) column to products would ensure at least one metadata row per product, but it creates a loop between the tables.
It looks like using a deferrable constraint would accommodate this as long as the insert queries were written to insert into both tables before committing, but creating a deliberate cycle and relying on this kind of behaviour seems... messy. Is there a neater way to achieve the same thing, or is this one of those cases where that really is the right way to go?
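To make the cycle concrete, here is a minimal sketch of the deferrable-constraint approach described above. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (both support DEFERRABLE INITIALLY DEFERRED foreign keys, but this is only an illustration, not tested PostgreSQL DDL). The insert_product helper and the sample values are assumptions.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT/ROLLBACK statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        default_metadata_id INTEGER NOT NULL
            REFERENCES product_metadata (id) DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE product_metadata (
        id INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL
            REFERENCES products (id) DEFERRABLE INITIALLY DEFERRED,
        language_code_id INTEGER NOT NULL,
        title TEXT NOT NULL,
        description TEXT NOT NULL,
        UNIQUE (product_id, language_code_id)
    );
""")

def insert_product(product_id, metadata_id, with_metadata=True):
    """Insert a product and (optionally) its default metadata in one transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO products (id, default_metadata_id) VALUES (?, ?)",
            (product_id, metadata_id))
        if with_metadata:
            conn.execute(
                "INSERT INTO product_metadata"
                " (id, product_id, language_code_id, title, description)"
                " VALUES (?, ?, 1, 'Title', 'Description')",
                (metadata_id, product_id))
        conn.execute("COMMIT")  # deferred foreign keys are checked here
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return False

ok = insert_product(1, 1)
rejected = not insert_product(2, 2, with_metadata=False)
```

The product without a metadata row is only rejected at COMMIT, which is exactly the "insert into both tables before committing" behaviour the question is uneasy about.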

Storing single form table questions in 1 or multiple tables

I have been coding ASP.NET forms inside web applications for a long time now. Generally most web apps have a user that logs in, picks a form to fill out and answers questions so your table looks like this
Table: tblInspectionForm
Fields:
inspectionformid (either autoint or guid)
userid (user ID who entered it)
datestamp (added, modified, whatever)
Question1Answer: boolean (maybe a yes/no)
Question2Answer: int (maybe foreign key for sub table 1 with dropdown values)
Question3Answer: int (foreign key for sub table 2 with dropdown values)
If I'm not mistaken it meets both 2nd and 3rd normal forms. You're not storing user names in the tables, just the IDs. You aren't storing the dropdown or "yes/no" values in Q-3, just IDs of other tables.
However, IF all the questions are exactly the same data type (assume there's no Q1 or Q1 is also an int) and link to the exact same foreign key (e.g. a form that has 20 questions, all on a 1-10 scale or with the same answers to choose from), would it be better to do something like this?
so .. Table: tblInspectionForm
userid (user ID who entered it)
datestamp (added, modified, whatever)
... and that's it for table 1 .. then
Table2: tblInspectionAnswers
inspectionformid (composite key that links back to table1 record)
userid (composite key that links back to table1 record)
datestamp (composite key that links back to table1 record)
QuestionIDNumber: int (question 1, question 2, question3)
QuestionAnswer: int (foreign key)
This wouldn't just apply to forms that only have the same types of answers for a single form. Maybe your form has 10 of these 1-10 ratings (int), 10 boolean-valued questions, and then 10 freeform.. You could break it into three tables.
The disadvantage would be that when you save the form, you're making 1 call for every question on your form. The upside is that if you have a lot of nightly integrations or replications that pull your data and you decide to add a new question, you don't have to manually modify any replications to reporting data sources or anything else that's been designed to read/query your form data. If you originally had 20 questions and you deploy a change to your app that adds a 21st, it will automatically get pulled into any outside replications, data sources, or reporting that queries this data. Another advantage is that if you have a REALLY LONG form (this happens a lot, maybe in the real estate industry, where inspection forms have hundreds of questions that go beyond the 8k limit for a table row), you won't end up running into problems.
Would this kind of scenario ever be the preferred way of saving form data?
As a rule of thumb, whenever you see a set of columns with numbers in their names, you know the database is poorly designed.
What you want to do in most cases is have a table for the form / questionnaire, a table for the questions, a table for the potential answers (for multiple-choice questions), and a table for answers that the user chooses.
You might also need a table for question type (i.e free-text, multiple-choice, yes/no).
Basically, the schema should look like this:
create table Forms
(
id int identity(1,1) not null primary key,
name varchar(100) not null, -- with a unique index
-- other form related fields here
)
create table QuestionTypes
(
id int identity(1,1) not null primary key,
name varchar(100) not null, -- with a unique index
)
create table Questions
(
id int identity(1,1) not null primary key,
form_id int not null foreign key references Forms(id),
type_id int not null foreign key references QuestionTypes(id),
content varchar(1000)
)
create table Answers
(
id int identity(1,1) not null primary key,
question_id int not null foreign key references Questions(id),
content varchar(1000)
-- For quizzes, uncomment the next row (and add a comma after content):
-- isCorrect bit not null
)
create table Results
(
id int identity(1,1) not null primary key,
form_id int not null foreign key references Forms(id)
-- in case only registered users can fill the form, uncomment the next row (and add a comma after form_id):
-- user_id int not null foreign key references Users(id)
)
create table UserAnswers
(
result_id int not null foreign key references Results(id),
question_id int not null foreign key references Questions(id),
answer_id int not null foreign key references Answers(id),
content varchar(1000) null -- for free text questions
)
This design will require a few joins when generating the forms (and if you have multiple forms per application, you just add an application table that the form can reference), and a few joins to get the results, but it's the best dynamic forms database design I know.
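To show the kind of joins involved when reading results back, here is a cut-down sketch of the schema above, run on SQLite through Python's sqlite3 module rather than SQL Server. It omits QuestionTypes for brevity, makes answer_id nullable so a free-text answer needs no chosen option (an assumption that departs from the schema above), and uses made-up sample data and a hypothetical result_answers helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Forms (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE Questions (
        id INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL REFERENCES Forms(id),
        content TEXT
    );
    CREATE TABLE Answers (
        id INTEGER PRIMARY KEY,
        question_id INTEGER NOT NULL REFERENCES Questions(id),
        content TEXT
    );
    CREATE TABLE Results (
        id INTEGER PRIMARY KEY,
        form_id INTEGER NOT NULL REFERENCES Forms(id)
    );
    CREATE TABLE UserAnswers (
        result_id INTEGER NOT NULL REFERENCES Results(id),
        question_id INTEGER NOT NULL REFERENCES Questions(id),
        answer_id INTEGER REFERENCES Answers(id),  -- NULL for free-text questions
        content TEXT                               -- used by free-text questions
    );

    INSERT INTO Forms VALUES (1, 'Inspection');
    INSERT INTO Questions VALUES (1, 1, 'Is the roof intact?'), (2, 1, 'Notes');
    INSERT INTO Answers VALUES (1, 1, 'Yes'), (2, 1, 'No');
    INSERT INTO Results VALUES (1, 1);
    INSERT INTO UserAnswers VALUES (1, 1, 1, NULL), (1, 2, NULL, 'Gutter is loose');
""")

def result_answers(result_id):
    """Return (question, answer) pairs for one filled-in form."""
    rows = conn.execute(
        """SELECT q.content, COALESCE(a.content, ua.content)
           FROM UserAnswers ua
           JOIN Questions q ON q.id = ua.question_id
           LEFT JOIN Answers a ON a.id = ua.answer_id
           WHERE ua.result_id = ?
           ORDER BY q.id""",
        (result_id,))
    return rows.fetchall()

filled_form = result_answers(1)
```

Adding a 21st question is then an INSERT into Questions, and the query above picks it up without any schema change.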
I'm not sure whether it's "preferred" but I have certainly seen that format used commercially.
You could potentially make the secondary table more flexible with multiple answer columns (answer_int, answer_varchar, answer_datetime), and assign a question value that you can relate to get the answer from the right column.
So if q_value = 2 you know to look in answer_varchar, whereas if q_value = 1 you know it is an int and requires a lookup (the name of which could also be specified with the question and stored in a column).
I use an application at the moment which splits answers into combobox, textfield, numeric, date etc in this fashion. The application actually uses a JSON form which splits out the data as it saves into the separate columns. It's a bit flawed as it saves JSON into these columns but the principle can work.
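A minimal sketch of the typed-column idea, assuming a hypothetical q_value code stored per question (1 = int, 2 = varchar, 3 = datetime) and using SQLite through Python's sqlite3 module. The table layout and the read_answer helper are illustrative guesses, not the schema of the application mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (
        question_id INTEGER PRIMARY KEY,
        q_value INTEGER NOT NULL   -- 1 = int answer, 2 = varchar answer, 3 = datetime answer
    );
    CREATE TABLE answers (
        question_id INTEGER NOT NULL REFERENCES questions(question_id),
        answer_int INTEGER,
        answer_varchar TEXT,
        answer_datetime TEXT
    );
    INSERT INTO questions VALUES (1, 1), (2, 2);
    INSERT INTO answers VALUES (1, 7, NULL, NULL), (2, NULL, 'Needs repainting', NULL);
""")

# Fixed mapping from the question's type code to the column holding its answer.
ANSWER_COLUMN = {1: "answer_int", 2: "answer_varchar", 3: "answer_datetime"}

def read_answer(question_id):
    """Look up the question's type code, then read the matching answer column."""
    (q_value,) = conn.execute(
        "SELECT q_value FROM questions WHERE question_id = ?",
        (question_id,)).fetchone()
    column = ANSWER_COLUMN[q_value]  # comes from the fixed mapping, never from user input
    (value,) = conn.execute(
        f"SELECT {column} FROM answers WHERE question_id = ?",
        (question_id,)).fetchone()
    return value
```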
You could go with a single identity field for the parent table key that the child table would reference.

Add files to multiple tables M:N

What is the best data model for adding multiple files to multiple tables? I have, for example, 5 tables (articles, blogs, posts...) and for each item I would like to store multiple files. The files table contains only file paths (not the physical files).
Example:
I'm using the links table, but when I create a new table in the future, for example "comments", I will need to add a new column to the links table.
Is there a better way of modeling such data?
One way to solve this is to use the table inheritance pattern. The main idea is to have a base table (let's call it content) with general shared information about all the items (e.g., creation date) and most importantly, the relationship with files. Then, you may add additional content types in the future without having to worry about their relation to files, since the content parent type already handles it.
E.g.:
CREATE TABLE files (
id NUMERIC PRIMARY KEY,
path VARCHAR(100) NOT NULL
);
CREATE TABLE content (
id NUMERIC PRIMARY KEY,
created TIMESTAMP NOT NULL
);
CREATE TABLE links (
file_id NUMERIC NOT NULL REFERENCES files(id),
content_id NUMERIC NOT NULL REFERENCES content(id),
PRIMARY KEY (file_id, content_id)
);
CREATE TABLE articles (
id NUMERIC PRIMARY KEY REFERENCES content(id),
title VARCHAR(400),
subtitle VARCHAR(400)
);
-- etc...
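As a quick check of the pattern, the sketch below runs the same tables on SQLite through Python's sqlite3 module and then adds a comments content type without touching links. The column types are simplified, and the helpers new_content, attach and paths_for, the comments table, and the sample paths are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
    CREATE TABLE content (id INTEGER PRIMARY KEY, created TEXT NOT NULL);
    CREATE TABLE links (
        file_id INTEGER NOT NULL REFERENCES files(id),
        content_id INTEGER NOT NULL REFERENCES content(id),
        PRIMARY KEY (file_id, content_id)
    );
    CREATE TABLE articles (id INTEGER PRIMARY KEY REFERENCES content(id), title TEXT);
    -- A content type added later needs no change to links.
    CREATE TABLE comments (id INTEGER PRIMARY KEY REFERENCES content(id), body TEXT);
""")

def new_content():
    """Create the shared parent row and return its id for the child table to reuse."""
    return conn.execute(
        "INSERT INTO content (created) VALUES (datetime('now'))").lastrowid

def attach(content_id, path):
    """Store a file path and link it to any kind of content."""
    file_id = conn.execute(
        "INSERT INTO files (path) VALUES (?)", (path,)).lastrowid
    conn.execute(
        "INSERT INTO links (file_id, content_id) VALUES (?, ?)", (file_id, content_id))

def paths_for(content_id):
    rows = conn.execute(
        """SELECT f.path FROM links l
           JOIN files f ON f.id = l.file_id
           WHERE l.content_id = ? ORDER BY f.id""",
        (content_id,))
    return [path for (path,) in rows]

article_id = new_content()
conn.execute("INSERT INTO articles (id, title) VALUES (?, 'Hello')", (article_id,))
comment_id = new_content()
conn.execute("INSERT INTO comments (id, body) VALUES (?, 'First!')", (comment_id,))

attach(article_id, "/uploads/a.pdf")
attach(article_id, "/uploads/b.png")
attach(comment_id, "/uploads/c.txt")
```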

Store array of items in SQL table

I know this has probably been asked a million times but I can't find anything definite for me. I'm making a website involving users who can build a list of items. I'm wondering what would be the best way to store their items in an SQL table?
I'm thinking I will need to make a separate table for each user, since I can't see any way to store an array. I think this would be inefficient, however.
Depending on what an "item" is, there seem to be two possible solutions:
1) a one-to-many relationship between users and items
2) a many-to-many relationship between users and items
If a single item (such as a "book") can be "assigned" to more than one user, it's 2). If each item is unique and can only belong to a single user it's 1).
one-to-many relationship:
create table users
(
user_id integer primary key not null,
username varchar(100) not null
);
create table items
(
item_id integer primary key not null,
user_id integer not null references users(user_id),
item_name varchar(100) not null
);
many-to-many relationship:
create table users
(
user_id integer primary key not null,
username varchar(100) not null
);
create table items
(
item_id integer primary key not null,
item_name varchar(100) not null
);
create table user_items
(
user_id integer not null references users(user_id),
item_id integer not null references items(item_id),
primary key (user_id, item_id) -- each user/item pair is stored only once
);
Because of your extremely vague description, this is the best I can think of.
There is no need to use an array or something similar. It seems you are new to database modelling, so you should read up about normalisation. Each time you think about "arrays" you are probably thinking about "tables" (or relations).
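For illustration, here is the many-to-many variant loaded with a few rows and queried for one user's list, using SQLite through Python's sqlite3 module. The sample users and items and the items_of helper are made up for the example; the composite primary key on user_items is an addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_name TEXT NOT NULL);
    CREATE TABLE user_items (
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        item_id INTEGER NOT NULL REFERENCES items(item_id),
        PRIMARY KEY (user_id, item_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO items VALUES (10, 'book'), (11, 'lamp');
    -- The "array" of each user's items is just a set of rows.
    INSERT INTO user_items VALUES (1, 10), (1, 11), (2, 10);
""")

def items_of(username):
    """Return the names of all items on one user's list."""
    rows = conn.execute(
        """SELECT i.item_name
           FROM users u
           JOIN user_items ui ON ui.user_id = u.user_id
           JOIN items i ON i.item_id = ui.item_id
           WHERE u.username = ?
           ORDER BY i.item_id""",
        (username,))
    return [name for (name,) in rows]
```

All users share the same three tables; a user's list is selected with a WHERE clause rather than by picking a per-user table.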
Edit (just saw you mentioned MySQL): the above SQL will not create a foreign key constraint in MySQL (even though it will run without an error) due to MySQL's stupid "I'm not telling you if I can't do something" attitude. You need to define the foreign keys separately.
A separate table for each user/account would be best. This will limit the size of the necessary tables and allow for faster searching. When you present data, you are usually displaying data for the current user/account. When you have to search through one shared table to find the relevant information, the application will start to slow down as that table grows. Write the application as if it will be used to the fullest extent of SQL. This will limit the need for redesign in the future if the website becomes popular.

T-SQL Tag Database Architecture Design?

Scenario
I am building a database that contains a series of different tables. These consist of a COMMENTS table, a BLOGS table & an ARTICLES table. I want to be able to add new items to each table, and tag them with between 0 and 5 tags to help the user search for particular information that is relevant more easily.
Initial thoughts for architecture
My first thoughts were to have a centralised table of TAGS. This table would list all of the available tags using a TagID field & a TagName field. Since each item can have many tags and each tag can have many items, I would need a MANY-TO-MANY relationship between each item table and the TAGS table.
For Example:
Many COMMENTS can have many TAGS.
Many TAGS can have many COMMENTS.
Many ARTICLES can have many TAGS.
Many TAGS can have many ARTICLES.
etc.....
Current Understanding
From previous experience I understand that a way of implementing this structure in T-SQL is to have an adjoining table between the COMMENTS table and the TAGS table. This adjoining table would contain the CommentID & the TagID, as well as its own unique CommentTagID. This structure would also apply to all other items.
Questions
Firstly is this the right way to go about implementing such a database architecture? If not, what other methods would be feasible? Since the database will eventually contain a lot of information, I need to ensure that it is scalable. Is this a scalable implementation?
If I had lots of these tables would this architecture make CRUD operations very slow?
Should I use GUIDs or Incrementing INTs for the ID fields?
Help & suggestions would be appreciated greatly.
Thank you.
You may also want to look at WordPress schema and database description to see how others are solving a similar problem.
Keeping a centralized table of tags is a good idea if you will ever need to do one of the following:
Build a complete list of all tags (that is mixing blog tags, comment tags and article tags)
Update the tags so that they get updated everywhere: when you change sqlserver to sql-server, it gets changed in blogs, articles and comments alike.
Option 1 is very useful for building tag clouds, so I'd recommend building a table of tags and referencing it from your tables.
If you won't ever need to update the tags as described in option 2, you don't need a surrogate key for them.
You will most probably need a UNIQUE constraint on them anyway, and there is no reason not to make it the PRIMARY KEY if you are not going to update them.
This will also save you lots of joins: you don't need to join with the tags table to show the tags.
GUIDs are simpler to manage, but they make the indexes and link tables quite large.
You can assign a numerical identifier to each table and link like this:
tTag (tag VARCHAR(30) NOT NULL PRIMARY KEY)
tTaggable (type INT NOT NULL, id INT NOT NULL, PRIMARY KEY (type, id))
tTagLink (
tag VARCHAR(30) NOT NULL FOREIGN KEY REFERENCES tTag,
type INT NOT NULL, id INT NOT NULL,
PRIMARY KEY (tag, type, id),
FOREIGN KEY (type, id) REFERENCES tTaggable
)
tBlog (
id INT NOT NULL PRIMARY KEY,
type INT NOT NULL, CHECK(type = 1),
FOREIGN KEY (type, id) REFERENCES tTaggable,
…)
tArticle (
id INT NOT NULL PRIMARY KEY,
blog INT NOT NULL FOREIGN KEY REFERENCES tBlog,
type INT NOT NULL, CHECK(type = 2),
FOREIGN KEY (type, id) REFERENCES tTaggable,
…)
tComment (
id INT NOT NULL PRIMARY KEY,
article INT NOT NULL FOREIGN KEY REFERENCES tArticle,
type INT NOT NULL, CHECK(type = 3),
FOREIGN KEY (type, id) REFERENCES tTaggable,
…)
Note that if you want to delete a blog, an article or a comment, you should delete from tTaggable as well.
This way, tTaggable is only used to ensure the referential integrity. To query all tags for an article, you just issue this query:
SELECT tag
FROM tTagLink
WHERE type = 2
AND id = 1234567
This way you get all tags by querying a single table, without any joins.
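The sketch below tries the same layout on SQLite through Python's sqlite3 module (not T-SQL, so the syntax is slightly different), keeping only tArticle among the item tables. The sample tags and the tags_for helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE tTag (tag TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE tTaggable (
        type INTEGER NOT NULL,
        id INTEGER NOT NULL,
        PRIMARY KEY (type, id)
    );
    CREATE TABLE tTagLink (
        tag TEXT NOT NULL REFERENCES tTag (tag),
        type INTEGER NOT NULL,
        id INTEGER NOT NULL,
        PRIMARY KEY (tag, type, id),
        FOREIGN KEY (type, id) REFERENCES tTaggable (type, id)
    );
    CREATE TABLE tArticle (
        id INTEGER NOT NULL PRIMARY KEY,
        type INTEGER NOT NULL CHECK (type = 2),
        FOREIGN KEY (type, id) REFERENCES tTaggable (type, id)
    );
    INSERT INTO tTag VALUES ('sql-server'), ('design');
    INSERT INTO tTaggable VALUES (2, 1234567);
    INSERT INTO tArticle VALUES (1234567, 2);
    INSERT INTO tTagLink VALUES ('sql-server', 2, 1234567), ('design', 2, 1234567);
""")

def tags_for(type_id, item_id):
    """Read an item's tags from the link table alone, with no joins."""
    rows = conn.execute(
        "SELECT tag FROM tTagLink WHERE type = ? AND id = ? ORDER BY tag",
        (type_id, item_id))
    return [tag for (tag,) in rows]

article_tags = tags_for(2, 1234567)

# tTaggable guards integrity: a link to a non-existent comment is refused.
try:
    conn.execute("INSERT INTO tTagLink VALUES ('design', 3, 42)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```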
A many-to-many relationship is usually implemented exactly as you describe it.
Auto-incrementing IDs are a good idea since they are guaranteed to be unique.
You can use GUIDs if you want to tag comments and articles through the same link table (instead of 6 tables you need just 5), but searching with GUIDs may be slower.