Database Design Question - Categories / Subcategories - sql

I have a question for how I would design a few tables in my database. I have a table to track Categories and one for Subcategories:
TABLE Category
CategoryID INT
Description NVARCHAR(500)
TABLE Subcategory
SubcategoryID INT
CategoryID INT
Description NVARCHAR(500)
A category might be something like Electronics, and its Subcategories might be DVD Players, Televisions, etc.
I have another table that is going to be referencing the Category/Subcategory. Does it need to reference the SubcategoryID?
TABLE Product
SubcategoryID INT -- should this be subcategory?
Is there a better way to do this or is this the right way? I'm not much of a database design guy. I'm using SQL Server 2008 R2 if that matters.

Your design is appropriate. I'm a database guy turned developer, so I can understand the inclination to have Category and SubCategory in one table, but you can never go wrong by KISS.
Unless extreme performance or infinite hierarchy is a requirement (I'm guessing not), you're good to go.
If being able to associate multiple subcategories with a product is a requirement, to @Mikael's point, you would need a setup like this, which creates a many-to-many relationship via a join/intersect table, Product_SubCategory:
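CREATE TABLE Category (CategoryID int PRIMARY KEY, Description nvarchar(100));
CREATE TABLE SubCategory (SubCategoryID int PRIMARY KEY, CategoryID int REFERENCES Category (CategoryID), Description nvarchar(100));
CREATE TABLE Product (ProductID int PRIMARY KEY, Description nvarchar(100));
-- Join table: one row per (product, subcategory) pair
CREATE TABLE Product_SubCategory (ProductID int REFERENCES Product (ProductID), SubCategoryID int REFERENCES SubCategory (SubCategoryID), PRIMARY KEY (ProductID, SubCategoryID));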
Hope that helps...
Eric Tarasoff

Having two separate tables for Categories and SubCategories depends on your situation.
If you keep it the way it is you are limited to a Category > Subcategory scenario, as in you can't have SubCategories of SubCategories.
If you make them into one table you need a column for ParentID. If a category is the topmost it will have a ParentID of 0. If you want to allow unlimited subcategories for each subcategory, e.g. Electronics > Recordable Media > Blu-ray > 4GB, you will need to use recursion to display them.
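A minimal sketch of the single-table variant, assuming SQLite (via Python's sqlite3 module) as a stand-in for SQL Server. The sample rows are invented, and ParentID = 0 marks a top-level category as described above. A recursive common table expression walks the hierarchy, so the recursion can live in the query rather than in application code; SQL Server also has recursive CTEs, though the syntax differs slightly (no RECURSIVE keyword, + for string concatenation).

```python
import sqlite3

# SQLite is only a stand-in here; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (
    CategoryID  INTEGER PRIMARY KEY,
    ParentID    INTEGER NOT NULL,   -- 0 marks a top-level category
    Description TEXT NOT NULL
);
-- Invented sample data
INSERT INTO Category VALUES
    (1, 0, 'Electronics'),
    (2, 1, 'Recordable Media'),
    (3, 2, 'Blu-ray'),
    (4, 1, 'Televisions');
""")

# Start from the top-level rows, then repeatedly attach children to the
# rows found so far, building a breadcrumb-style path as we go.
rows = conn.execute("""
WITH RECURSIVE tree(CategoryID, Path) AS (
    SELECT CategoryID, Description
    FROM Category
    WHERE ParentID = 0
    UNION ALL
    SELECT c.CategoryID, tree.Path || ' > ' || c.Description
    FROM Category AS c
    JOIN tree ON c.ParentID = tree.CategoryID
)
SELECT Path FROM tree ORDER BY Path
""").fetchall()

paths = [r[0] for r in rows]
for p in paths:
    print(p)
```

Because ParentID uses 0 rather than NULL for the root, it cannot be declared as a foreign key to Category itself; using NULL for top-level rows would allow that.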

Attach tags to the products instead of using a category hierarchy. It is much more flexible.
create table product (id, name,...)
create table tag (id, name, description)
create table product_tag (product_id, tag_id)
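The three tables above could be fleshed out roughly as follows. This is a sketch using SQLite through Python's sqlite3 module, not SQL Server; the column types, the keys, the sample rows, and the helper products_with_tag are all assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tag (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    description TEXT
);
-- One row per (product, tag) pair; the compound key prevents duplicates.
CREATE TABLE product_tag (
    product_id INTEGER NOT NULL REFERENCES product(id),
    tag_id     INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (product_id, tag_id)
);
-- Invented sample data
INSERT INTO product VALUES (1, 'Sample DVD player'), (2, 'Sample television');
INSERT INTO tag VALUES
    (1, 'electronics', NULL), (2, 'dvd-players', NULL), (3, 'televisions', NULL);
INSERT INTO product_tag VALUES (1, 1), (1, 2), (2, 1), (2, 3);
""")

def products_with_tag(tag_name):
    """Return the names of all products carrying the given tag."""
    rows = conn.execute("""
        SELECT p.name
        FROM product AS p
        JOIN product_tag AS pt ON pt.product_id = p.id
        JOIN tag AS t ON t.id = pt.tag_id
        WHERE t.name = ?
        ORDER BY p.name
    """, (tag_name,)).fetchall()
    return [r[0] for r in rows]

electronics = products_with_tag("electronics")
televisions = products_with_tag("televisions")
print(electronics, televisions)
```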

If categories and subcategories have the same attributes, then collapse them into one table.
If one 'sub' category can belong to more than one 'parent' category then add a link class, otherwise add a single column to point to a parent.
e.g. if you have Electronics > TV, can you also have Entertainment > TV ? etc.
Your other table should reference just the category_id (note - not parent_category_id)
hth

As long as Sub-Categories are never repeated in a different Category, and especially if they have different attributes, then your proposed method is good.
The one problem can come when you are adding/editing Products, and you don't have a field for Category, even though you probably want a control where the user can edit the Category.

It depends on your requirements. If every Product is linked to no more than one SubCategory you should have SubCategoryID in Products. There is no need to add CategoryID as well.
Other scenarios that require a different model might be that a Product could link directly to a Category instead of a SubCategory or that one Product could be linked to more than one SubCategory or that a SubCategory is linked to more than one Category.
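For the simple case described first, here is a sketch in which Product stores only SubcategoryID and the category is derived with a join. It assumes every product has exactly one subcategory, uses SQLite (via Python's sqlite3 module) rather than SQL Server, and the ProductID column and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Category (
    CategoryID  INTEGER PRIMARY KEY,
    Description TEXT NOT NULL
);
CREATE TABLE Subcategory (
    SubcategoryID INTEGER PRIMARY KEY,
    CategoryID    INTEGER NOT NULL REFERENCES Category(CategoryID),
    Description   TEXT NOT NULL
);
-- Product points at the subcategory only; no CategoryID column is needed.
CREATE TABLE Product (
    ProductID     INTEGER PRIMARY KEY,
    SubcategoryID INTEGER NOT NULL REFERENCES Subcategory(SubcategoryID),
    Description   TEXT NOT NULL
);
-- Invented sample data
INSERT INTO Category VALUES (1, 'Electronics');
INSERT INTO Subcategory VALUES (10, 1, 'DVD Players'), (11, 1, 'Televisions');
INSERT INTO Product VALUES (100, 11, 'Example 40-inch TV');
""")

# The category is reachable through the subcategory.
row = conn.execute("""
    SELECT p.Description, s.Description, c.Description
    FROM Product AS p
    JOIN Subcategory AS s ON s.SubcategoryID = p.SubcategoryID
    JOIN Category AS c ON c.CategoryID = s.CategoryID
    WHERE p.ProductID = 100
""").fetchone()
print(row)

# A product pointing at a non-existent subcategory is rejected by the FK.
try:
    conn.execute("INSERT INTO Product VALUES (101, 999, 'Orphan product')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```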

Related

I am making a SQL database of categories and subcategories. What is the best way to link these tables?

The database has a table called "categories" with columns CATEGORY_ID(primary key) and CATEGORY_NAME.
I have subcategories for each category.
Which of the methods below is better for accessing the data?
Method 1: The "CATEGORY_ID" column in the "categories" table is a FOREIGN KEY in the "subcategories" table.
Method 2: Maintaining a separate table for each category representing the subcategories.
I prefer to use the same table for categories and subcategories, like this:
Table Categories
[CATEGORY_ID, CATEGORY_NAME, PARENT_CATEGORY_ID]
This helps when you don't know in advance how many levels of subcategories there will be.
Here is an example scenario: we have a product table where all the product records are stored. In the same way, we have a customer table where the customer records are stored. A sales table records the daily sales, that is, which customer purchased which product. So the sales table has to be linked to both the product table and the customer table.
The query to link the three tables is as follows: SELECT product_name, customer.name, date_of_sale FROM sales, product, customer WHERE product.product_id = sales.product_id AND customer.customer_id = sales.customer_id LIMIT 0, 30
It is better to go with Method 1 since it is more scalable.
Let me elaborate on this. If we go with Method 1, we need to maintain only two tables, Categories and Subcategories. If we get new categories or subcategories in the future, we can simply add rows to these two tables.
If we consider the same situation with Method 2, we need to create a new table every time, which may become a maintenance overhead.
Let me be a bit more direct. You explain in a comment that Method 2 is a separate table for each category. If so, then Method 2 -- in general -- is just wrong.
There are two methods for storing this type of information. One is a Categories table with a (single) Subcategories table. The Subcategories table would have CategoryId, a foreign key reference back to Categories. This is the normalized data model.
The second method is to store everything in one table. Each row would be a category/subcategory combination. Information about a given category would be duplicated across multiple rows, so this is not a normalized approach. However, this is a typical approach when doing dimensional modeling for decision support systems.
If the subcategories are just names of things, there is a third approach, which would be to store a list of the subcategories within each Category row. The list would not be a delimited string. It would be JSON, a nested table, XML, array, or similar collection data type supported by the database you are using. I am mentioning this as a possibility, but not recommending it.

In SQL Server I need to change data structure of relationships (FK)

Ok I wasn't entirely sure what to title this question, so here's the situation.
I'm big on data integrity... Meaning as many constraints and rules that I can use I want to use in SQL Server and not rely on the application.
So I have a website that has a business directory, and those businesses can create a post.
So I have two tables like this:
tbl_Business ( BusinessID, Title, etc. )
tbl_Business_Post ( PostID, BusinessID, PostTitle, etc. )
There's a FK relationship for the column BusinessID between the two tables. A post cannot exist in the tbl_Business_Post table without the BusinessID existing in the tbl_Business table.
So pretty standard...
I've recently added classifieds to the site. So now I have two more tables:
tbl_Classified ( ClassifiedID, SellerID, ClassifiedTitle, etc. )
tbl_Classified_Seller ( SellerID, SellerName, etc. )
What I'm wanting to do is take advantage of my tbl_Business_Post table to include classifieds in that as well. Think of its usage like a feed... So the site will show recent posts from businesses and classifieds all in one feed.
Here's where I need guidance.
I was tempted to remove the FK relationship on tbl_Business_Post...
I thought about creating another separate Posts table that holds the classifieds posts.
Is there a way to make a conditional FK relationship based on a column? For example, if it's a business posting, the BusinessID must exist in the Business table, or if it's a classifieds post, the SellerID must exist in the Seller table?
Or should I create a separate table to hold the classifieds posts and UNION both the tables on the query?
You might question why I have a "Posts" table and that's hard to explain... but I do need it for the way the site is organized and how the feed works.
It's just that the posts table is perfect and I wanted to combine all posts and organize them by type (Ie: 'business', 'classified', 'etc.') as there might be more later.
So it comes down to, what's the best way to organize this to sustain data integrity from SSMS?
Thank you for guidance.
======== EDIT =========
Full explanation of tbl_Business_Post
PostID PK
Post_Type int <-- 1-21 is business types, 22 for classified type
BusinessID INT <-- This is the FK currently for the tbl_Business
SiblingID INT <-- This is the ID of the related item they're posting on. So for example, if they post a story about one of their products, this is the ProductID, if it's a service, this is the ServiceID.
Post_Title <-- Depending on the post, this could be a Product title, a service title, etc.
So if I changed the structure so it's as follows:
PostID PK
Post_Type int
BusinessID INT <-- this is populated on insert if it's a business.
SellerID INT <-- This is populated on insert if it's a classified seller
SiblingID INT <-- This is either the ClassifiedID or ProductID, ServiceID, etc., depending on post type.
So leaning toward Peter's 1st solution/example... interested in the proper way to create check constraints or triggers on this so that if the type is 1-21, it makes sure BusinessID exists in the Business table, or if it's type 22, make sure the SellerID exists in the seller table.
Even going further with this:
If Post_Type = 22, I should make sure that not only is the Seller in the seller table, but the SiblingID is also the ClassifiedID in the Classified table.
1) There's no way to do this kind of conditional FK you're thinking of. What you need here is basically a FK from tbl_Business_Post which points logically to one of two tables, depending on the value in another column of tbl_Business_Post. This situation is what people encounter quite often. But in a relational DB this is not a very native idea.
So OK, this cannot be enforced with a FK. Instead, you can probably enforce this with a trigger or check constraint on tbl_Business_Post.
2) Alternatively, you can do the below.
Create some table tbl_Basic_Post, put there all columns which pertain to the post itself (e.g. PostTitle) and not to the parent entity which this post record belongs/points to (Business or Classified). Then create two other tables which point via a FK to the tbl_Basic_Post table like e.g.
tbl_Business_Post.Basic_Post_ID (FK)
tbl_Classified_Post.Basic_Post_ID (FK)
Put in these two tables the columns which are Business_Post/Classified_Post-specific
(you see, this is basically inheritance in relational DB terms).
Also, make each of these two tables have FKs to their respective parent tables
tbl_Business and tbl_Classified too. Now these FKs become unconditional (in your sense).
To get business posts you join tbl_Basic_Post and tbl_Business_Post.
To get classified posts you join tbl_Basic_Post and tbl_Classified_Post.
Both approaches have their pros and cons.
Approach 1) is simple, does not lead to the creation of too many tables; but it's not trivial to enforce the data integrity.
Approach 2) does not require anything special to enforce data integrity but leads to the creation of more tables.
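Approach 1) can be sketched with the two nullable columns from the question's edit (BusinessID and SellerID), each with its own ordinary foreign key, plus a CHECK constraint that decides which one must be filled in for a given Post_Type. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than SQL Server, the sample rows and the helper try_insert are invented, and it does not cover the SiblingID rule. The FOREIGN KEY and CHECK clauses themselves use standard SQL, so the same idea should carry over to SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE tbl_Business (BusinessID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE tbl_Classified_Seller (SellerID INTEGER PRIMARY KEY, SellerName TEXT NOT NULL);
CREATE TABLE tbl_Business_Post (
    PostID     INTEGER PRIMARY KEY,
    Post_Type  INTEGER NOT NULL CHECK (Post_Type BETWEEN 1 AND 22),
    BusinessID INTEGER REFERENCES tbl_Business(BusinessID),          -- NULL for classifieds
    SellerID   INTEGER REFERENCES tbl_Classified_Seller(SellerID),   -- NULL for business posts
    Post_Title TEXT NOT NULL,
    -- Types 1-21 need a business and no seller; type 22 needs the reverse.
    CHECK (
        (Post_Type BETWEEN 1 AND 21 AND BusinessID IS NOT NULL AND SellerID IS NULL)
        OR
        (Post_Type = 22 AND SellerID IS NOT NULL AND BusinessID IS NULL)
    )
);
-- Invented sample data
INSERT INTO tbl_Business VALUES (1, 'Example Bakery');
INSERT INTO tbl_Classified_Seller VALUES (7, 'Example Seller');
""")

def try_insert(post_type, business_id, seller_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO tbl_Business_Post (Post_Type, BusinessID, SellerID, Post_Title) "
            "VALUES (?, ?, ?, 'demo')",
            (post_type, business_id, seller_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(3, 1, None),    # business post for an existing business
    try_insert(22, None, 7),   # classified post for an existing seller
    try_insert(22, 1, None),   # classified post without a seller: CHECK fails
    try_insert(3, 99, None),   # business post for an unknown business: FK fails
]
print(results)
```

The foreign keys are skipped when the column is NULL, which is what makes them behave conditionally once the CHECK constraint controls which column may be NULL.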

How do I tie a table to a value (for example an ID) in SQL?

These are actually two questions relating to an online bookstore.
I have a table for customers in SQL, and it has all this customer info, including a customer ID. Then I have a table full of books on sale. Lastly, I have a table for a shopping cart.
Now the shopping cart is going to be a table very similar to the books table, only it will have fewer items since it'll contain some subset of the book table's contents.
I want the entire cart to be tied to a single customer's ID, and I want every entry in that cart to come directly from the Books table.
How do I go about defining such a table? I mean, what statements do I need?
As it is, I'm confused about the issue, because the entire cart table is essentially an attribute of a single customer, but I have no idea how to represent that in SQL. I want to be able to look up the cart table using the customer ID, basically.
Any help would be greatly appreciated.
You basically need these four tables:
create table books (book_id int, name varchar(200), author varchar(200));
create table customers (customer_id int, name varchar(200));
create table carts (cart_id int, customer_id int);
create table cart_details (cart_id int, line_number int, book_id int, qty int, price numeric(18,2));
In essence, you will store objects like this:
Books into books table
Customers into customers table
Carts into carts and cart_details table. carts will represent a cart for a certain customer, and cart_details will represent cart's content.
Whenever you want to retrieve a cart corresponding to a certain customer you can just do:
select * from carts inner join customers using (customer_id);
If you also want cart's detail you can do:
select * from carts
inner join cart_details using (cart_id)
inner join customers using (customer_id)
;
Note: I tried to write the examples in SQL syntax that is as general as possible, since you didn't say which RDBMS you are using. I also left out, on purpose, all details related to primary and foreign keys so you can understand the tables and their attributes first.
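With primary and foreign keys filled in, the four tables might look like the sketch below. It uses SQLite through Python's sqlite3 module because no RDBMS was named; the key choices, the sample rows, and the helper cart_lines are assumptions, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE books (book_id INTEGER PRIMARY KEY, name TEXT NOT NULL, author TEXT NOT NULL);
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Each cart belongs to exactly one customer.
CREATE TABLE carts (
    cart_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
);
-- Each cart line must point at an existing cart and an existing book.
CREATE TABLE cart_details (
    cart_id     INTEGER NOT NULL REFERENCES carts(cart_id),
    line_number INTEGER NOT NULL,
    book_id     INTEGER NOT NULL REFERENCES books(book_id),
    qty         INTEGER NOT NULL CHECK (qty > 0),
    price       NUMERIC NOT NULL,
    PRIMARY KEY (cart_id, line_number)
);
-- Invented sample data
INSERT INTO books VALUES (1, 'Example Book A', 'Author One'), (2, 'Example Book B', 'Author Two');
INSERT INTO customers VALUES (42, 'Example Customer');
INSERT INTO carts VALUES (500, 42);
INSERT INTO cart_details VALUES (500, 1, 1, 2, 9.5), (500, 2, 2, 1, 20);
""")

def cart_lines(customer_id):
    """Look up the cart contents by customer ID."""
    return conn.execute("""
        SELECT b.name, d.qty, d.price
        FROM carts AS c
        JOIN cart_details AS d ON d.cart_id = c.cart_id
        JOIN books AS b ON b.book_id = d.book_id
        WHERE c.customer_id = ?
        ORDER BY d.line_number
    """, (customer_id,)).fetchall()

lines = cart_lines(42)
total = sum(qty * price for _, qty, price in lines)
print(lines, total)
```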
You could address this by creating a CustomerCart table that references the Customer and Book tables. It could have a format like the following (I'm making this as non-implementation-specific as possible, partly because you haven't indicated what RDBMS you're using):
Composite primary key consisting of CustomerId + a CartId (allows a Customer to have multiple / saved / historical carts)
BookId (each record then represents an instance of a book from the catalog being added to a customer's cart)
Measures, such as Quantity, that relate to the instance of the given book in the given customer's cart
Think of your problem like this... one customer can have many books in their shopping cart, and one book can exist in many customers' shopping carts, so you have a many-to-many relationship.
With this in mind, the way to handle this problem is create a table that tracks the associations between customers and books, including any other information that is relevant.
create table shopping_cart (
customerId int,
bookId int,
quantity int,
etc...
)
It depends on the functionality you want to support with a cart.
The easiest would be to create a cart table with BookID and CustomerID. However, if a customer or book is ever removed, you will (through cascade deletes, if allowed) automatically remove that book from everyone's cart, or remove all of a person's cart items if the person is removed. Do you want that to happen? If not, then you need to keep all book and customer information in the cart.
Secondly, do you want to keep a record of everything they've purchased over time in a "Cart"? Do you want them to be able to check out with a subset of the items in their cart? Is the life of the cart their visit to the site, or would the items remain months or years later when they come back?
The answers to these types of questions determine the appropriate design.

Table Design For Multiple Different Products On One Order

If I were to have an online shopping website that sold apples and monitors and these were stored in different tables because the distinguishing property of apples is colour and that of monitors is resolution how would I add these both to an invoice table whilst still retaining referential integrity and not unioning these tables?
Invoices(InvoiceId)
        |
InvoiceItems(ItemId, ProductId)
        |
Products(ProductId)
        |                                   |
Apples(AppleId, ProductId, Colour)    Monitors(MonitorId, ProductId, Resolution)
In the first place, I would store them in a single Products table, not in two different tables.
In the second place, (unless each invoice was for only one product) I would not add them to a single invoice table - instead, I would set up an Invoice_Products table, to link between the tables.
I suggest you look into Database Normalisation.
A question for your data model is: what reference scheme will you use to identify products? Maybe SKU?
Then identify each apple as a product by assigning an SKU. Likewise for monitors. Then use the SKU in the invoice item. Something like this:
product {sku}
key {sku};
invoice_item {invoice_id, sku}
key {invoice_id, sku} ;
apple {color, sku}
key {color}
key {sku};
monitor {size, sku}
key {size}
key {sku};
with appropriate constraints... in particular, the union of apple {sku} and monitor {sku} == product {sku}.
So Invoice table has a ProductID FK, and a ProductID can be either an AppleID (PK color) or MonitorID (PK resolution)?
If so, you can introduce a ProductTypeID with values like 0=apple, 1=monitor, or an isProductTypeApple boolean if there are only ever going to be 2 product types, and include that in the ProductID table PK.
You also need to include the ProductTypeID field in the Apple table and Monitor table PK.
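A sketch of the ProductTypeID idea: the type is part of the Products key, each subtype table pins its own type with a CHECK constraint, and a composite foreign key ties the two together, so an apple row can only point at a product that is registered as an apple. SQLite (via Python's sqlite3 module) is used as a stand-in, the exact column layout and sample rows are assumptions, and the helper accepted exists only for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Products (
    ProductId     INTEGER NOT NULL,
    ProductTypeId INTEGER NOT NULL CHECK (ProductTypeId IN (0, 1)),  -- 0 = apple, 1 = monitor
    PRIMARY KEY (ProductId, ProductTypeId)
);
CREATE TABLE Apples (
    ProductId     INTEGER NOT NULL,
    ProductTypeId INTEGER NOT NULL CHECK (ProductTypeId = 0),
    Colour        TEXT NOT NULL,
    PRIMARY KEY (ProductId, ProductTypeId),
    FOREIGN KEY (ProductId, ProductTypeId) REFERENCES Products (ProductId, ProductTypeId)
);
CREATE TABLE Monitors (
    ProductId     INTEGER NOT NULL,
    ProductTypeId INTEGER NOT NULL CHECK (ProductTypeId = 1),
    Resolution    TEXT NOT NULL,
    PRIMARY KEY (ProductId, ProductTypeId),
    FOREIGN KEY (ProductId, ProductTypeId) REFERENCES Products (ProductId, ProductTypeId)
);
-- Invoice items reference Products only, whatever the product type is.
CREATE TABLE InvoiceItems (
    InvoiceId     INTEGER NOT NULL,
    ProductId     INTEGER NOT NULL,
    ProductTypeId INTEGER NOT NULL,
    PRIMARY KEY (InvoiceId, ProductId, ProductTypeId),
    FOREIGN KEY (ProductId, ProductTypeId) REFERENCES Products (ProductId, ProductTypeId)
);
-- Invented sample data: product 1 is an apple, product 2 is a monitor
INSERT INTO Products VALUES (1, 0), (2, 1);
""")

def accepted(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    accepted("INSERT INTO Apples VALUES (1, 0, 'red')"),            # apple for an apple product
    accepted("INSERT INTO Monitors VALUES (2, 1, '1920x1080')"),    # monitor for a monitor product
    accepted("INSERT INTO Apples VALUES (2, 0, 'green')"),          # product 2 is not an apple: FK fails
    accepted("INSERT INTO Apples VALUES (2, 1, 'green')"),          # wrong type for Apples: CHECK fails
    accepted("INSERT INTO InvoiceItems VALUES (1000, 1, 0)"),       # invoice line for the apple
    accepted("INSERT INTO InvoiceItems VALUES (1000, 2, 1)"),       # invoice line for the monitor
]
print(results)
```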
I like name-value tables for these...It might be easier to redesign so it goes 'Product' and then 'product details'...product details holds the product id, the detail type and then the value. This would allow you to hold apples and monitors in the same table regardless of identifying attribute (and leave it open for other product to be added later on).
A similar approach can be taken in the invoice table: have a 'product_type' column that tells you which table to look into (apple or monitor) and then a 'product_id' that references whatever ID column is in the apple/monitor table. Querying a setup like this is a bit difficult and may force you to use dynamic SQL. I'd only take this route if you have no control over doing the redesign above (which other answers posted here also refer to).
The first solution is preferable, I would think: change the design of this db to the name-value pair with the products and you'll save headaches writing your queries later.

How to maintain subcategory in MYSQL?

I am having categories as following,
Fun
    Jokes
    Comedy
Action
    Movies
    TV Shows
Now one video can have multiple categories or subcategories. Let's say VideoId 23 is present in the categories Fun, Fun->Comedy, and Action->TV Shows, but not in the Action category. I can't work out how I should maintain these categories in the database. Should I create a single column "CategoryId AS VARCHAR" in Videos and store the category IDs as comma-separated values (1,3,4)? But then how would I fetch the records if someone is browsing the category Jokes?
Or should I create another table which will have videoId and categoryid, in that case if a Video is present in 3 different categories then 3 rows will be added to that new table
Please suggest some way of how to maintain categories for a particular record in the table
Thanks
Your categories table could have a column called parentID that references another entry in the categories table. It would be a foreign key to itself. NULL would represent a top-level category. Anything other than NULL would mean "I am a child category of this category". You could still assign a video to any category: top-level, child, or somewhere in between.
Also, use auto-increment NOT NULL integers for your primary keys, not varchar. It's a performance consideration.
To answer your comment:
3 tables: Videos, Categories, and Video_Category
Video_Category would have VideoID and CategoryID columns. The primary key would be a combination of the two columns (a compound primary key)
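The three tables could be sketched like this, using the category tree from the question. SQLite (through Python's sqlite3 module) stands in for MySQL here; the Name and Title columns, the ID values, and the helper videos_in are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Categories (
    CategoryID INTEGER PRIMARY KEY,
    ParentID   INTEGER REFERENCES Categories(CategoryID),  -- NULL = top-level category
    Name       TEXT NOT NULL
);
CREATE TABLE Videos (VideoID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
-- One row per (video, category) pair, with a compound primary key.
CREATE TABLE Video_Category (
    VideoID    INTEGER NOT NULL REFERENCES Videos(VideoID),
    CategoryID INTEGER NOT NULL REFERENCES Categories(CategoryID),
    PRIMARY KEY (VideoID, CategoryID)
);
INSERT INTO Categories VALUES
    (1, NULL, 'Fun'),    (2, 1, 'Jokes'),  (3, 1, 'Comedy'),
    (4, NULL, 'Action'), (5, 4, 'Movies'), (6, 4, 'TV Shows');
INSERT INTO Videos VALUES (23, 'Example video');
-- Video 23 is in Fun, Fun->Comedy and Action->TV Shows, but not in Action.
INSERT INTO Video_Category VALUES (23, 1), (23, 3), (23, 6);
""")

def videos_in(category_name):
    """Return the IDs of videos assigned directly to the named category."""
    rows = conn.execute("""
        SELECT v.VideoID
        FROM Videos AS v
        JOIN Video_Category AS vc ON vc.VideoID = v.VideoID
        JOIN Categories AS c ON c.CategoryID = vc.CategoryID
        WHERE c.Name = ?
        ORDER BY v.VideoID
    """, (category_name,)).fetchall()
    return [r[0] for r in rows]

in_comedy = videos_in("Comedy")
in_action = videos_in("Action")
in_jokes = videos_in("Jokes")
print(in_comedy, in_action, in_jokes)
```

Browsing a category then becomes a plain join with no string parsing, which is the main advantage over a comma-separated column.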
You have two choices, parentID (better as INT) to refer to the parent or an extra table with categoryID - parentID.
The last one may provide a better logical separation and allows you to have multiple categories.
I suggest creating another table that holds videoId and categoryid. Then you can use a SQL query as follows:
select a.*,GROUP_CONCAT(b.category_id) as category_ids
from table_video a
left join table_video_category b on a.video_id=b.video_id
group by a.video_id