Why use Foreign Key constraints in MySQL? - sql

I was wondering,
What would be my motivation to use foreign key constraints in MySQL, given that I am sure I can control the kinds of values that are added?
Does it improve performance?

Foreign keys enforce referential integrity. These constraints guarantee that a row in a table order_details with a field order_id referencing an orders table will never have an order_id value that doesn't exist in the orders table.
Foreign keys aren't required to have a working relational database (in fact MyISAM, which was MySQL's default storage engine before version 5.5, doesn't support FKs; InnoDB, the default since then, does), but they are definitely essential to avoid broken relationships and orphan rows (i.e. to preserve referential integrity). The ability to enforce referential integrity at the database level is required for the C in ACID to stand.
As for your concerns regarding performance, in general there is a performance cost, but it will probably be negligible. I suggest putting in all your foreign key constraints, and only experimenting without them if you have real performance issues that you cannot solve otherwise.
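As a minimal sketch of the orders / order_details case described above, the following uses Python's built-in sqlite3 module as a stand-in, because it runs without a server. That choice is an assumption of this example, not something from the question: MySQL (InnoDB) enforces the same kind of rule, but its DDL syntax and error types differ. The column `detail_id` and the function name are made up for illustration.

```python
import sqlite3


def orphan_insert_is_rejected() -> bool:
    """Return True if the database refuses an order_details row with no matching order."""
    conn = sqlite3.connect(":memory:")
    # SQLite-specific: foreign key enforcement is off by default per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE order_details (
            detail_id INTEGER PRIMARY KEY,  -- hypothetical column for this sketch
            order_id  INTEGER NOT NULL REFERENCES orders(order_id)
        )
        """
    )
    conn.execute("INSERT INTO orders (order_id) VALUES (1)")
    # The parent row exists, so this insert is accepted.
    conn.execute("INSERT INTO order_details (order_id) VALUES (1)")
    try:
        # There is no order 999, so the constraint should reject this row.
        conn.execute("INSERT INTO order_details (order_id) VALUES (999)")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False


print(orphan_insert_is_rejected())
```

The point of the sketch is that the rejection happens inside the database, so it applies to every client, not only to application code that remembers to check.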

One reason not to use them is that a set of tables with foreign key constraints cannot be sharded into multiple databases.

Related

What database design do you favor; one with actual physical relations or one that mimics it?

What is the best practice while designing a relational database? To have physical relationships between tables (actual line drawn from table to table(s)) or mimic the relationship only?
E.g.
TableA has columns
ID, Name
TableB has columns
ID, CNIC, TableA_ID
Now, TableA_ID doesn't have an actual foreign key constraint on it but still stores the value that maps to the ID column of TableA.
I consider the latter to be good, and I believe that the former slows things down and has cascade operation problems?
One of the main jobs of a database is to ensure data integrity.
A part of data integrity is referential integrity.
Relational databases ensure referential integrity with Foreign key constraints.
Remove the constraint, and you remove the database ability to guard against corrupt data.
Therefore, you should always specify foreign keys when your tables are related, even at the price of a performance penalty (which is usually negligible anyway).
You can't mimic a constraint if it doesn't exist - even if you have a front-end application that validates all the data being entered into the database - nothing is stopping a developer, DBA, or anyone that has a direct access to the database to enter corrupt data by mistake.
The relational model doesn't contain any kind of physical links. One relation (table) may reference another using a "key--foreign key" pair (the "key" need not be the primary key). To ensure the integrity of those references, a foreign key constraint is required. Foreign key constraints should be used in your design "by default".
Usually, a database designer can add an index on the FK columns and not worry about other performance issues, because those are well managed in production (e.g. by temporarily disabling FK checks, or by bulk load operations that ignore FKs).
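To illustrate what "mimicking" the relationship allows, here is a small sketch using the TableA / TableB layout from the question, again with Python's sqlite3 module as an assumed stand-in for whatever engine is actually in use. The sample values and the function name are invented for the example. Without a constraint, a row pointing at a non-existent parent is accepted silently, and the only way to notice is to go looking for it.

```python
import sqlite3


def find_orphans() -> list:
    """Insert an orphan into an unconstrained TableB, then locate it with a LEFT JOIN."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TableA (ID INTEGER PRIMARY KEY, Name TEXT)")
    # TableA_ID only mimics the relationship: there is no REFERENCES clause.
    conn.execute(
        "CREATE TABLE TableB (ID INTEGER PRIMARY KEY, CNIC TEXT, TableA_ID INTEGER)"
    )
    conn.execute("INSERT INTO TableA (ID, Name) VALUES (1, 'first')")
    conn.execute("INSERT INTO TableB (ID, CNIC, TableA_ID) VALUES (10, 'x', 1)")
    # No TableA row has ID 42, yet nothing stops this insert.
    conn.execute("INSERT INTO TableB (ID, CNIC, TableA_ID) VALUES (11, 'y', 42)")
    orphans = conn.execute(
        """
        SELECT b.ID, b.TableA_ID
        FROM TableB AS b
        LEFT JOIN TableA AS a ON a.ID = b.TableA_ID
        WHERE a.ID IS NULL
        """
    ).fetchall()
    conn.close()
    return orphans


print(find_orphans())
```

A query like this is the kind of periodic cleanup check a team takes on when the constraint is left out.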

Should I remove the foreign keys if we manually guarantee database integrity?

I use foreign keys at work. But we pretty much manually manage our tables and we always make sure that we always have a parent entry in another table for a child entry that references it by its Id. We insert, update and delete the parent and child entities in the table in the same transaction.
So why should we still keep those foreign keys? They slow the database down when inserting new entities in the database and may be one of the reasons we get deadlocks from time to time.
Are they actually used by Sql Server for other things? Like gathering better statistics or is their only purpose to keep data integrity?
You shouldn't drop the foreign key constraints.
Checks at the database level are the last integrity barrier protecting your data.
You might want to remove foreign keys for performance reasons, but you might end up having to maintain a partially corrupted DB, which ends up being a nightmare.
Can Foreign key improve performance
Foreign key constraints improve performance at the time of reading data,
but at the same time they slow down performance at the time of
inserting / modifying / deleting data.
In case of reading the query, the optimizer can use foreign key
constraints to create more efficient query plans as foreign key
constraints are pre declared rules. This usually involves skipping
some part of the query plan because for example the optimizer can see
that because of a foreign key constraint, it is unnecessary to execute
that particular part of the plan.

Create an index on columns

We have some tables to which we didn't give any foreign key constraints. If we index those columns without adding foreign key constraints, is there any performance issue?
Indexes will increase your performance only if you are using the indexed columns in your queries. So index a column only if you are using that column frequently in your queries. Whether a column has a foreign key constraint has no bearing on whether it can be indexed.
If you do not define a foreign key constraint on a column that is supposedly acting as a foreign key in your DB, you will probably face data-integrity issues, as already mentioned by #The scrum.
Depends on the query optimizer. Optimizers can make certain assumptions when they know about foreign key constraints. There used to be a bug in SQL Server's optimizer related to those assumptions.
But query performance isn't the real issue. Data integrity is the real issue.
Is it important to get the wrong answer really, really fast?
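The independence of indexes and constraints can be checked with a short sketch. This one assumes SQLite through Python's sqlite3 module purely for convenience, and the table, column, and index names are invented. The query planner picks up an index on a column that has no foreign key constraint at all; plan output in MySQL or SQL Server looks different, but the principle is the same.

```python
import sqlite3


def plan_for_lookup() -> str:
    """Return the query plan text for a lookup on an indexed column with no FK."""
    conn = sqlite3.connect(":memory:")
    # parent_id has no REFERENCES clause; it is just an ordinary column.
    conn.execute(
        "CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, note TEXT)"
    )
    conn.execute("CREATE INDEX idx_child_parent ON child (parent_id)")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT note FROM child WHERE parent_id = 1"
    ).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable description.
    return " | ".join(str(row[-1]) for row in rows)


print(plan_for_lookup())
```

The plan text should mention `idx_child_parent`, which shows the index helps the read path whether or not a constraint exists. What the index cannot do is stop an invalid `parent_id` from being stored.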

Why use primary keys?

What are primary keys used for, aside from identifying a unique column in a table? Couldn't this be done by simply using an autoincrement constraint on a column? I understand that PK and FK are used to relate different tables, but can't this be done by just using join?
Basically what is the database doing to improve performance when joining using primary keys?
Mostly for referential integrity with foreign keys. When you have a PK, the database will also create an index behind the scenes, so you don't need table scans when looking up values.
RDBMS products are usually optimized to work with tables that have primary keys. Most store statistics, which help optimize query plans. These statistics are very important to performance, especially on larger tables; they will not work the same way without primary keys, and you end up with unpredictable query response times.
Most database best-practice books suggest creating all tables with a primary key, with no exceptions, and it would be wise to follow this practice. Not many things say junior software dev more than one who builds a database without referential integrity!
Some PKs are simply an auto-incremented column. Also, you typically join USING the PK and FK. There has to be some relationship to do a join. Additionally, most DBMS automatically index PKs by default, which improves join performance as well as querying for a particular record based on ID.
You can join without a primary key within a query, however, you must have a primary key defined to enforce data integrity constraints, at least with SQL Server. (Foreign Keys, etc..)
Also, here is an interesting read for you on Primary Keys.
In Microsoft Access, if you have a linked table to, say, SQL Server, the source table must have a primary key in order for the linked table to be writeable. At least, that was the case with Access 2000 and SQL Server 6.5. It may be different with later versions.
Keys are about data integrity as well as identification. The uniqueness of a key is guaranteed by having a constraint in the database to keep out "bad" data that would otherwise violate the key. The fact that data integrity rules are guaranteed in that way is precisely what makes a key usable as an identifier. That goes for any key. One key per table by convention is called a "primary" key but that doesn't make other alternate keys any less important.
In practice we need to be able to enforce uniqueness rules against all types of data (not just numbers) to satisfy the demands of data quality and usability.
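The idea that a key is a uniqueness constraint, and that alternate keys matter as much as the primary one, can be sketched as follows. The `users` table, its columns, and the sample addresses are hypothetical, and SQLite via Python's sqlite3 module is used only as an assumed, convenient engine. Note that the alternate key here is a text column, not a number.

```python
import sqlite3


def rejected_inserts() -> list:
    """Try to violate the primary key and an alternate key; return which attempts failed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,   -- primary key
            email   TEXT NOT NULL UNIQUE   -- alternate key on non-numeric data
        )
        """
    )
    conn.execute("INSERT INTO users (user_id, email) VALUES (1, 'a@example.com')")
    attempts = {
        "duplicate primary key": (1, "b@example.com"),
        "duplicate alternate key": (2, "a@example.com"),
    }
    rejected = []
    for label, row in attempts.items():
        try:
            conn.execute("INSERT INTO users (user_id, email) VALUES (?, ?)", row)
        except sqlite3.IntegrityError:
            # The constraint keeps out the row that would break uniqueness.
            rejected.append(label)
    conn.close()
    return rejected


print(rejected_inserts())
```

An auto-incrementing column on its own only generates values; it is the declared key constraint that makes the database refuse duplicates.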

SQL Server Foreign Key constraint benefits

We're designing a database in which I need to consider some FK(foreign key)
constraints. But it is not limited to
formal structuring and normalization.
We go for it only if it provides any
performance or scalability benefits.
I've been going thru some interesting articles and googling for practical benefits. Here are some links:
http://www.mssqltips.com/tip.asp?tip=1296
I wanted to know more about the benefits of FK (apart from the formal structuring and the famous cascaded delete\update).
FK are not 'indexed' by default so what are the considerations while indexing an FK?
How to handle nullable fields which are mapped as foreign key - is this allowed?
Apart from indexing, does this help in optimizing query-execution plans in SQL-Server?
I know there's more but I'd prefer experts speaking on this. Please guide me.
Foreign keys provide no performance or scalability benefits.
Foreign keys enforce referential integrity. This can provide a practical benefit by raising an error if someone attempted to delete rows from the parent table in error.
Foreign keys are not indexed by default. You should index your foreign key columns, as this avoids a table scan on the child table when you delete/update your parent row.
You can make a foreign key column nullable and insert null.
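The three points in this answer (a blocked parent delete, an explicit index on the FK column, and a nullable FK) can be put together in one sketch. It assumes SQLite through Python's sqlite3 module rather than SQL Server, so the DDL and the exception type are not what SQL Server would show, and the `parent` / `child` names are invented for illustration.

```python
import sqlite3


def fk_behaviour() -> dict:
    """Show that NULL is allowed in a nullable FK and that deleting a referenced parent fails."""
    conn = sqlite3.connect(":memory:")
    # SQLite-specific: foreign key enforcement must be switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES parent(id))"
    )
    # The FK column is not indexed automatically, so the index is created explicitly.
    conn.execute("CREATE INDEX idx_child_parent_id ON child (parent_id)")
    conn.execute("INSERT INTO parent (id) VALUES (1)")
    conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 1)")
    # A nullable FK column accepts NULL: this child has no parent at all.
    conn.execute("INSERT INTO child (id, parent_id) VALUES (2, NULL)")
    try:
        # A child row still references parent 1, so the delete should be refused.
        conn.execute("DELETE FROM parent WHERE id = 1")
        delete_blocked = False
    except sqlite3.IntegrityError:
        delete_blocked = True
    null_rows = conn.execute(
        "SELECT COUNT(*) FROM child WHERE parent_id IS NULL"
    ).fetchone()[0]
    conn.close()
    return {"delete_blocked": delete_blocked, "null_rows": null_rows}


print(fk_behaviour())
```

The index on `child.parent_id` is what lets the engine check for referencing rows during the parent delete without scanning the whole child table.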
The main benefit is that your database will not end up inconsistent if your buggy client code tries to do something wrong. Foreign keys are a type of 'constraint', so that's how you should use them.
They do not have any "functional" benefit, they will not optimize anything. You still have to create indexes yourself, etc. And yes, you can have NULL values in a column that is a foreign key.
FK constraints keep your data consistent. That's it. This is the main benefit.
FK constraints will not provide you with any performance gain.
But unless you have a deliberately denormalized db structure, I'd recommend using FK constraints. The main reason is consistency.
I have read at least one example on the net where it was shown that foreign keys do improve performance, because the optimiser does not have to do additional checks across tables: it already knows the data meets certain criteria due to the FK. Sorry, I don't have a link, but the blog gave detailed output of the query plans to prove it.
As mentioned, they are for data integrity. Any performance "loss" would be utterly wiped out by the time required to fix broken data.
However, there could be an indirect performance benefit.
For SQL Server at least, the columns in the FK must have the same datatype on each side. Without an FK, you could have an nvarchar parent and a varchar child, for example. When you join the two tables, you'll get datatype conversions, which can kill performance.
Example: different varchar lengths causing an issue