Imagine that we have a website where users can read articles, view photos, watch videos, and much more. Every "item" can be commented on, so we need somewhere to store those comments. Let's discuss the storage options for this case.
Distributed solution
We can obviously create separate tables for each "item", so that we have tables like:
CREATE TABLE IF NOT EXISTS `article_comments` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`createdBy` int(11) DEFAULT NULL,
`createdAt` int(11) DEFAULT NULL,
`article` int(11) DEFAULT NULL,
`content` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
and then, similarly, photo_comments, video_comments, and so on. The advantages of this approach are as follows:
we can specify a foreign key to every "item" table,
the database is divided into logical parts,
there is no problem with exporting such data.
Disadvantages:
many tables
probably hard to maintain (adding fields, etc.)
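A minimal sketch of the foreign-key advantage listed above. It assumes a hypothetical `articles` table with an int(11) primary key `id`, which is not shown in the question; the constraint name is also made up for illustration.

```sql
-- Assumption: a table `articles` (`id` int(11) PRIMARY KEY) already exists
-- and uses the InnoDB engine. Neither the table nor the constraint name
-- appears in the original post.
ALTER TABLE `article_comments`
  ADD CONSTRAINT `fk_article_comments_article`
  FOREIGN KEY (`article`) REFERENCES `articles` (`id`)
  ON DELETE CASCADE;  -- deleting an article also deletes its comments
```

Each per-item table (photo_comments, video_comments, ...) would need its own constraint of this kind, which is part of the maintenance cost noted under the disadvantages.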
Centralized solution
On the other hand we can merge all those tables into two:
CREATE TABLE IF NOT EXISTS `comment_types` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
and
CREATE TABLE IF NOT EXISTS `comments` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`createdBy` int(11) DEFAULT NULL,
`createdAt` int(11) DEFAULT NULL,
`type` int(11) DEFAULT NULL,
`content` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
Table comment_types is a dictionary: it contains key-value pairs mapping each commented item "type" to its name, for example:
1:Articles
2:Photos
3:Videos
Table comments stores the usual data, with an additional type field.
Advantages:
Maintenance (adding / removing fields),
Adding new comment types "on the fly".
Disadvantages:
Harder to migrate / export,
Possible performance drop when querying large dataset.
Discussion:
Which storage option will be better in terms of query performance (assume that the dataset IS big enough for that to matter)?
Again, performance: will adding an INDEX on type remove or drastically reduce that performance drop?
Which storage option will be better in terms of management and possible future migration (distributed will be better, of course, but let's see whether centralized is really that far behind)?
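To make the index question concrete, here is a rough sketch. Note that the comments table as posted has a type column but no column identifying which article or photo a comment belongs to, so `item_id` below is a hypothetical addition, as is the index name.

```sql
-- Assumption: `item_id` is NOT part of the original table; it is added here
-- only so that a comment can be tied to one specific article/photo/video.
ALTER TABLE `comments`
  ADD COLUMN `item_id` int(11) DEFAULT NULL AFTER `type`,
  ADD INDEX `idx_type_item` (`type`, `item_id`);

-- Ask the optimizer how it would fetch the comments of one item.
-- With the composite index in place, the `key` column of the EXPLAIN
-- output would normally name idx_type_item instead of showing a full scan.
EXPLAIN
SELECT `id`, `createdBy`, `createdAt`, `content`
FROM `comments`
WHERE `type` = 1 AND `item_id` = 42
ORDER BY `createdAt`;
```

An index on type alone only narrows the search to "all article comments"; a composite index on the type and the item is what lets a single page's comments be located directly.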
I'm not sure either of the disadvantages you list for option 2 is serious: data export is easily accomplished with a simple WHERE clause, and I wouldn't worry about performance. Option 2 is properly normalised, and in a modern relational database performance should be excellent (and can be tuned further with appropriate indexes etc. if necessary).
I would only consider the first option if I could prove that it was necessary for performance, scalability or other reasons - but it must be said that seems unlikely.
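As an illustration of the export-by-WHERE-clause point, the following sketch pulls only the article comments out of the shared table. It assumes the dictionary row 'Articles' from the question exists in comment_types.

```sql
-- Select every comment whose type is named 'Articles' in the dictionary table.
SELECT c.`id`, c.`createdBy`, c.`createdAt`, c.`content`
FROM `comments` AS c
JOIN `comment_types` AS t ON t.`id` = c.`type`
WHERE t.`name` = 'Articles';
```

The same filter can feed an INSERT ... SELECT or a dump tool if the rows need to be moved rather than just read.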
I have the following issue. We are currently working on a system for a shuttle service company. Now, part of the entities in the database for this system include numerous lookup tables (such as vehicle_type, employee_status, etc), as well as some other tables, such as vehicle and vehicle_service log.
Now the issue we as a team are having is that we cannot agree on what the logical relationship cardinalities between entities should be. The two main problem relationships include the tables defined as follows:
CREATE TABLE IF NOT EXISTS `user_type` (
`type_id` tinyint(4) NOT NULL AUTO_INCREMENT,
`description` varchar(200) NOT NULL,
PRIMARY KEY (`type_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Store the user types - employee or consultant' AUTO_INCREMENT=1 ;
which is linked to
CREATE TABLE IF NOT EXISTS `user` (
`user_id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(100) NOT NULL,
`password` varchar(100) NOT NULL,
`user_type` tinyint(4) NOT NULL,
PRIMARY KEY (`user_id`),
KEY `user_type` (`user_type`),
KEY `username` (`username`),
KEY `login` (`username`,`password`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Table used when logging in to check access level, type of user, etc. ' AUTO_INCREMENT=1 ;
The user table includes other, irrelevant data. So the issue here is that I say the relationship should be 1-many (because MySQL Workbench reverse engineered it that way and it makes more sense), while another team member says that it should be 0-many (because some records may exist in the user_type table that aren't used in the user table).
The other table relationship we are having words about is defined as follows:
CREATE TABLE IF NOT EXISTS `vehicle` (
`vehicle_id` int(11) NOT NULL AUTO_INCREMENT,
`registration_number` varchar(10) NOT NULL,
PRIMARY KEY (`vehicle_id`),
UNIQUE KEY `registration_number` (`registration_number`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Actual vehicle information' AUTO_INCREMENT=1 ;
Again, with some other columns not relevant to the question. This links with
CREATE TABLE IF NOT EXISTS `service_log` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`vehicle_id` int(11) NOT NULL,
`description` text NOT NULL,
`date` date NOT NULL,
`cost` double NOT NULL,
PRIMARY KEY (`id`),
KEY `vehicle_id` (`vehicle_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Store records of all services to vehicles' AUTO_INCREMENT=1 ;
Should this be 1-many or 0-many, given that a vehicle may not yet have gone in for a service? In my view it should be 1-many, but I don't know if this works logically.
We are all very confused about this whole logical modelling thing, so any help would be much appreciated!
I figured it would be easier for me to create the DB first and then reverse engineer it to a physical model, but never thought about the logical one.
Use zero-to-many if the relationship is optional. For example, a sales rep would have zero or many customers. Why is that? Because a new sales rep has no customers to begin with, unless of course he/she takes over the accounts of a sales rep who has resigned.
On the other hand, one-to-many is mandatory. For example, an Order, which has an order date and the customer who ordered, should have at least one record in the Order Detail table. Let's say a customer ordered a tablet on 04/22/2013; then he/she would have:
Order table
----------------------------------------------
OrderId   OrderDate    CustomerNum
----------------------------------------------
1         04/22/2013   101
Order detail table
----------------------------------------------
OrderId   ProductId   Qty   QuotedPrice
----------------------------------------------
1         T101        1     500
So, in your case, user_type to user is 1 to 0-or-many, because a user type may not have been used by any user yet.
Likewise, vehicle to service_log is 1 to 0-or-many, since a vehicle may not have had a service done yet.
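One way to see the "zero" side of these relationships in the data is a LEFT JOIN that keeps parent rows with no children. This is only a sketch against the tables posted in the question; the aliases and the `user_count` / `service_count` column names are invented for illustration.

```sql
-- User types together with how many users currently have them.
-- Types that no user references still appear, with user_count = 0.
SELECT ut.`type_id`, ut.`description`, COUNT(u.`user_id`) AS user_count
FROM `user_type` AS ut
LEFT JOIN `user` AS u ON u.`user_type` = ut.`type_id`
GROUP BY ut.`type_id`, ut.`description`;

-- Vehicles that have not been serviced yet show service_count = 0.
SELECT v.`vehicle_id`, v.`registration_number`, COUNT(s.`id`) AS service_count
FROM `vehicle` AS v
LEFT JOIN `service_log` AS s ON s.`vehicle_id` = v.`vehicle_id`
GROUP BY v.`vehicle_id`, v.`registration_number`;
```

If rows with a count of 0 are legitimate in your business rules, the child side of the relationship is optional, i.e. zero-or-many.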
Are there libraries that focus on taking two database exports, finding the differences and creating update/alter statements for it? Basically an update script from export A to export B.
For instance this:
-- Version 1
CREATE TABLE IF NOT EXISTS `mytable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
-- Version 2
CREATE TABLE IF NOT EXISTS `mytable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
`description` text,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
-- Would result in this:
ALTER TABLE `mytable`
ADD `description` text;
Edit: this question is related to libraries for MySQL, not tools.
There are a few MySQL comparison tools out there.
SQLyog
Redgate MySQL Compare
Redgate (http://www.red-gate.com/products/sql-development/sql-compare/index-b) offers a very good and stable solution to this.
I believe the Ultimate edition of Visual Studio 2010 can also compare schemas; however, I'm not sure if it will generate the ALTER scripts for you.
Edit:
I just remembered this too: http://opendbiff.codeplex.com/. However, I didn't have much luck when I last looked at it.
This node module could be useful. It diffs live databases, but then it should be simple to create a live database from an SQL dump.
https://github.com/contra/dbdiff
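If both exports can be loaded into the same MySQL server, a crude column-level diff is also possible with plain SQL against information_schema. This is a sketch only, assuming the two exports were imported into hypothetical schemas named `db_v1` and `db_v2`; it lists what is missing but does not generate ALTER statements.

```sql
-- Columns that exist in db_v2 but not in db_v1 (including columns of
-- tables that are entirely new in db_v2).
-- Assumption: `db_v1` and `db_v2` are placeholder schema names.
SELECT b.TABLE_NAME, b.COLUMN_NAME, b.COLUMN_TYPE, b.IS_NULLABLE
FROM information_schema.COLUMNS AS b
LEFT JOIN information_schema.COLUMNS AS a
       ON a.TABLE_SCHEMA = 'db_v1'
      AND a.TABLE_NAME   = b.TABLE_NAME
      AND a.COLUMN_NAME  = b.COLUMN_NAME
WHERE b.TABLE_SCHEMA = 'db_v2'
  AND a.COLUMN_NAME IS NULL
ORDER BY b.TABLE_NAME, b.ORDINAL_POSITION;
```

For the mytable example in the question, this would report the `description` column of type text. Changed column types, indexes and dropped columns would each need a similar query.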
I am designing a many-to-many relationship between two models: Cell and Isolator:
CREATE TABLE `isolator` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(127) NOT NULL,
`surname` varchar(127) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `cell` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(127) NOT NULL,
`gen` varchar(127) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `cell_isolators` (
`cell_id` bigint(20) NOT NULL DEFAULT '0',
`isolator_id` bigint(20) NOT NULL DEFAULT '0',
PRIMARY KEY (`cell_id`,`isolator_id`),
CONSTRAINT `cell_isolator_id` FOREIGN KEY (`isolator_id`) REFERENCES `isolator` (`id`),
CONSTRAINT `isolator_cell_id` FOREIGN KEY (`cell_id`) REFERENCES `cell` (`id`) ON DELETE CASCADE
);
The client has asked whether it is possible to specify the order in which the list of isolators for a cell is shown, since it's important for publication purposes.
What is the best approach to model that? I was thinking of adding a third field to the many-to-many relation (e.g. sort_order), but I would like to know if there are other alternatives.
Thanks!
The only way to do that in the general case is to add a column to "cell_isolators".
The data type is application-dependent to a certain extent. I've seen this done with columns as integers, floats or decimals, and alphanumerics (varchar(n)).
Populating and maintaining it can be troublesome, though. That doesn't have anything to do with the database design; it's just painful to maintain the sort order for more than a few rows, especially if there are regular inserts to the table and changes to the sort order. Fortunately, that job usually falls to the user interface, not to the dbms.
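A minimal sketch of the extra-column approach, using the sort_order name suggested in the question. The integer type and the default of 0 are assumptions; as noted above, other data types are used in practice too.

```sql
-- Add an ordering column to the link table.
ALTER TABLE `cell_isolators`
  ADD COLUMN `sort_order` int NOT NULL DEFAULT 0;

-- List the isolators of one cell (cell 1 here, as an example) in the chosen
-- order. The isolator id is a tie-breaker so that rows sharing the same
-- sort_order still come back in a stable order.
SELECT i.`id`, i.`name`, i.`surname`
FROM `cell_isolators` AS ci
JOIN `isolator` AS i ON i.`id` = ci.`isolator_id`
WHERE ci.`cell_id` = 1
ORDER BY ci.`sort_order`, i.`id`;
```

Leaving gaps between values (10, 20, 30, ...) is a common way to make later insertions cheaper, at the cost of occasional renumbering.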
I just wanted to know what would happen if, in my book database, I had two different authors with the same name. How could I redesign my database to sort this problem out? Do I have to assign primary and secondary keys or something? By the way, this question is related to my previous one.
An AUTHORS table would help your book database - you could store the author info once, but associate it with multiple books:
DROP TABLE IF EXISTS `example`.`authors`;
CREATE TABLE `example`.`authors` (
`author_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`firstname` varchar(45) NOT NULL,
`lastname` varchar(45) NOT NULL,
PRIMARY KEY (`author_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Books can have multiple authors, so you'd need a many-to-many table to relate authors to books:
DROP TABLE IF EXISTS `example`.`book_authors_map`;
CREATE TABLE `example`.`book_authors_map` (
`book_id` int(10) unsigned NOT NULL,
`author_id` int(10) unsigned NOT NULL,
PRIMARY KEY (`book_id`,`author_id`),
KEY `FK_authors` (`author_id`),
CONSTRAINT `FK_books` FOREIGN KEY (`book_id`) REFERENCES `books` (`book_id`),
CONSTRAINT `FK_authors` FOREIGN KEY (`author_id`) REFERENCES `authors` (`author_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
You should almost always use your own in-house ID system, even if it's never displayed to your users. In your database each book will have its own 'id' attribute, which you can just auto-increment by 1 each time.
The reason for doing this, other than the example in your question, is that even if you use a seemingly unique identifier (like an ISBN), the standard could change (and has changed) at some point in time, leaving you with a lot of work to do to update your database.
If you have two different authors with the exact same name, each author should have some sort of unique ID to differentiate them, either a GUID or an autonumber.
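To illustrate with the authors table from the earlier answer: two authors can share a name because the auto-increment key, not the name, identifies the row. The sample name below is made up.

```sql
-- Two different people who happen to share a name (illustrative data).
INSERT INTO `example`.`authors` (`firstname`, `lastname`)
VALUES ('John', 'Smith'), ('John', 'Smith');

-- Both rows come back, each with its own author_id, and it is the
-- author_id that book_authors_map stores.
SELECT `author_id`, `firstname`, `lastname`
FROM `example`.`authors`
WHERE `firstname` = 'John' AND `lastname` = 'Smith';
```

In practice you would probably store something extra (birth year, a short bio) so that a human can tell the two apart as well.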
Use natural keys where they exist: in this case, ISBNs.
I have been slowly learning SQL the last few weeks. I've picked up all of the relational algebra and the basics of how relational databases work. What I'm trying to do now is learn how it's implemented.
A stumbling block I've come across in this is foreign keys in MySQL. I can't seem to find much about them other than that they exist in the InnoDB storage engine that MySQL has.
What is a simple example of foreign keys implemented in MySQL?
Here's part of a schema I wrote that doesn't seem to be working, in case you would rather point out my flaw than show me a working example.
CREATE TABLE `posts` (
`pID` bigint(20) NOT NULL auto_increment,
`content` text NOT NULL,
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`uID` bigint(20) NOT NULL,
`wikiptr` bigint(20) default NULL,
`cID` bigint(20) NOT NULL,
PRIMARY KEY (`pID`),
Foreign Key(`cID`) references categories,
Foreign Key(`uID`) references users
) ENGINE=InnoDB;
Assuming your categories and users table already exist and contain cID and uID respectively as primary keys, this should work:
CREATE TABLE `posts` (
`pID` bigint(20) NOT NULL auto_increment,
`content` text NOT NULL,
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`uID` bigint(20) NOT NULL,
`wikiptr` bigint(20) default NULL,
`cID` bigint(20) NOT NULL,
PRIMARY KEY (`pID`),
Foreign Key(`cID`) references categories(`cID`),
Foreign Key(`uID`) references users(`uID`)
) ENGINE=InnoDB;
The column name is required in the references clause.
Edited: Robert and Vinko state that you need to declare the name of the referenced column in the foreign key constraint. This is necessary in InnoDB, although in standard SQL you're permitted to omit the referenced column name if it's the same name in the parent table.
One idiosyncrasy I've encountered in MySQL is that foreign key declaration will fail silently in several circumstances:
Your MySQL installation doesn't include the InnoDB engine
Your MySQL config file doesn't enable the InnoDB engine
You don't declare your table with the ENGINE=InnoDB table modifier
The foreign key column isn't exactly the same data type as the primary key column in the referenced table
Unfortunately, MySQL gives no message that it has failed to create the foreign key constraint. It simply ignores the request, and creates the table without the foreign key (if you SHOW CREATE TABLE posts, you may see no foreign key declaration). I've always thought this is a bad feature of MySQL!
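Since the failure can be silent, it may be worth checking after the fact. A sketch of two checks, assuming the posts table lives in the currently selected database:

```sql
-- 1. Is InnoDB available? Look for the InnoDB row and its Support column
--    (YES or DEFAULT means it can be used).
SHOW ENGINES;

-- 2. Which foreign keys actually exist on `posts`?
--    An empty result means no constraint was created.
SELECT CONSTRAINT_NAME, COLUMN_NAME,
       REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'posts'
  AND REFERENCED_TABLE_NAME IS NOT NULL;
```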
Tip: the integer argument for integer data types (e.g. BIGINT(20)) is not necessary. It has nothing to do with the storage size or range of the column. BIGINT is always the same size regardless of the argument you give it. The number refers to how many digits MySQL will pad the column if you use the ZEROFILL column modifier.
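A small demonstration of that tip. The table name `width_demo` is made up, and newer MySQL releases (8.0.17 and later) emit deprecation warnings for both integer display widths and ZEROFILL, so treat this as an illustration rather than a recommendation.

```sql
-- `width_demo` is a throwaway table used only for this illustration.
CREATE TABLE `width_demo` (
  `a` int(5) ZEROFILL,   -- display width 5, padded with leading zeros
  `b` int(5)             -- same width, but no ZEROFILL
) ENGINE=InnoDB;

INSERT INTO `width_demo` (`a`, `b`) VALUES (42, 1234567);

-- `a` is displayed as 00042; `b` is stored and returned in full (1234567)
-- even though it has more than 5 digits, because the width limits nothing.
SELECT `a`, `b` FROM `width_demo`;
```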
This has some code showing how to create foreign keys by themselves, and in CREATE TABLE.
Here's one of the simpler examples from that:
CREATE TABLE parent (
id INT NOT NULL,
PRIMARY KEY (id)
) ENGINE=INNODB;
CREATE TABLE child (
id INT,
parent_id INT,
INDEX par_ind (parent_id),
FOREIGN KEY (parent_id) REFERENCES parent(id)
ON DELETE CASCADE
) ENGINE=INNODB;
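To see what the constraint in this example actually does, here is a short sketch using the parent and child tables exactly as defined above, assuming both start out empty. The inserted values are arbitrary.

```sql
INSERT INTO parent (id) VALUES (1);
INSERT INTO child (id, parent_id) VALUES (10, 1), (11, 1);

-- This insert is rejected with error 1452, because no parent row has id 99:
-- INSERT INTO child (id, parent_id) VALUES (12, 99);

-- ON DELETE CASCADE: removing the parent row also removes its child rows.
DELETE FROM parent WHERE id = 1;

SELECT COUNT(*) AS remaining_children FROM child;  -- 0 on the empty tables assumed here
```

Without ON DELETE CASCADE, the DELETE itself would be rejected for as long as child rows still reference the parent.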
I agree with Robert. You are missing the name of the column in the references clause (and you should be getting error 150). I'll add that you can check how the tables actually got created with:
SHOW CREATE TABLE posts;
The previous answers deal with the foreign key constraint. While the foreign key constraint is definitely useful to maintain referential integrity, the concept of "foreign key" itself is fundamental to the relational model of data, regardless of whether you use the constraint or not.
Whenever you do an equijoin, you are equating a foreign key to something, usually the key it references. Example:
select *
from
Students
inner join
StudentCourses
on Students.StudentId = StudentCourses.StudentId
StudentCourses.StudentId is a foreign key referencing Students.StudentId.