How does "default" known as a "constraint" in sql? - sql

Every constraint in SQL is trying to restrict some kind of illegal input. FOREIGN KEY, UNIQUE, CHECK, etc.
The thing is, DEFAULT, as the name suggests, does not restrict the input value, rather, "assigns" the value if it's not provided. So how could it possibly categorised as an "constraint" in SQL?

A DEFAULT constraint, constrains the insertion of an implicit NULL into a column.
That is one way it called be known as a constraint.
However, note that in the SQL-92 standard grammer the 'DEFAULT' column definition element is part of a "default clause" and is not referred to as a constraint.
Its referred to as a "constraint" in vendor specific implementations, notably Microsoft SQL Server.
Argument's exist, for and against.

In general it's all about data integrity. There are two types of integrity:
data integrity (about correct values in the columns) and
referential integrity (about relationships between tables).
So constraints (as MSDN says) - let you define the way the Database Engine automatically enforces the integrity of a database.
And finally - default constraint are a special type of constraint (i.e. special type of data integrity mechanisms) that specify what values are used in a column if you do not specify a value for the column when you insert a row.
In other words constraint is just a term that used for defining the mechanisms for ensuring the database integrity. Something like this.

Related

Possible to add any checks beyond type in BigQuery table?

Other than specifying the type of a column, are there any additional constraints that can be added? For example:
CREATE TABLE team(
code VARCHAR,
conference VARCHAR,
UNIQUE INDEX (code, conference),
CHECK (code is not null)
);
In other words, anything like:
Unique
Check
Constraint/references
etc.
BigQuery doesn't enforce referential integrity constraints like uniqueness.
You can however control column nullability (e.g. making a value required, disallowing NULLs).
Additionally, some types support parameterization, e.g. restricting the max length of strings/byte values, or constraining the precision/scale of decimal types.
Biquery is a columnar database, and did not have unique indexes, if you need unique identifier to identfy each row, need to add in your code logic to insert in BQ.
For better understanding you can refer this bq documents
https://cloud.google.com/bigquery/docs/tables-intro

Is there any difference between inline and out-of-line constraint in sql?

Which one to use or which one is the better? There's any difference?
searchtype_id bigint NOT NULL
or
CONSTRAINT searchtype_id_nn CHECK ((searchtype_id IS NOT NULL))
Is there a difference? Yes. NOT NULL is part of the storage definition of the type of the column. So, NOT NULL can affect how the value is stored (is a NULL-flag required?). A NOT NULL definition can also be used for optimizations during the compilation phase of a query.
By contrast, a CHECK constraint does validate that the data meets certain characteristics, but it is less likely that this information will be used in the compilation phase.
The NOT NULL definition predates the CHECK constraint and is standard across all databases.
NULL-ability is something that I think of as part of the type -- because it is a declaration built into the language that says "this column is required to have a value". An integer column that can take on NULL values is subtly different from an integer that cannot.
I would recommend using the NOT NULL syntax rather than a CHECK constraint. It gives the database more information about the column.
Both are different and it's hard to choose between them as some things apply to one while some doesn't to other like NOT NULL constraint can only be declared inline while CHECK constraints can applied to out-of-line constraint.
If I had to choose between one of them I'll choose out-of-line as:
I'll be able to individually name my constraints which helps in debugging once you get an error or something.
Also CHECK constraint will allow me to refer single as well as multiple columns which can only be used as out-of-line constraint.

Unique Constraint column can only contain one NULL value

A Unique Constraint can be created upon a column that can contain NULLs. However, at most, only a single row may ever contain a NULL in that column.
I do not understand why this is the case since, by definition, a NULL is not equal to another NULL (since NULL is really an unknown value and one unknown value does not equal another unknown value).
My questions:
1. Why is this so?
2. Is this specific to MsSQL?
I have a hunch that it is because a Unique Constraint can act as a reference field for a Foreign Key and that the FK would otherwise not know which record in the reference table to which it was refering if more than one record with NULL existed. But, it is just a hunch.
(Yes, I understand that UCs can be across multiple columns, but that doesn't change the question; rather it just complicates it a bit.)
Yes, it's "specific" to Microsoft SQL Server (in that some other database systems have the opposite approach, the one you expected - and the one defined in the ANSI standard, but I believe there are other database systems that are the same as SQL Server).
If you're working on a version of SQL Server that supports filtered indexes, you can apply one of those:
CREATE UNIQUE INDEX IX_T ON [Table] ([Column]) WHERE [Column] IS NOT NULL
(But note that this index cannot be the target of an FK constraint)
The "Why" of it really just comes down to, that's how it was implemented long ago (possibly pre-standards) and it's one of those awkward situations where to change it now could potentially break a lot of existing systems.
Re: Foreign Keys - you would be correct, if it wasn't for the fact that a NULL value in a foreign key column causes the foreign key not to be checked - there's no way (in SQL Server) to use NULL as an actual key.
Yes it's a SQL Server feature (and a feature of a few other DBMSs) that is contrary to the ISO SQL standard. It perhaps doesn't make much sense given the logic applied to nulls in other places in SQL - but then the ISO SQL Standard isn't very consistent about its treatment of nulls either. The behaviour of nullable uniqueness constraints in Standard SQL is not very helpful. Such constraints aren't necessarily "unique" at all because they permit duplicate rows. E.g., the constraint UNIQUE(foo,bar) permits the following rows to exist simultaneously in a table:
foo bar
------ ------
999 NULL
999 NULL
(!)
Avoid nullable uniqueness constraints. It's usually straightforward to move the columns to a new table as non-nullable columns and put the uniqueness constraint there. The information that would have been represented by populating those columns with nulls can (presumably) be represented by simply not populating those columns in the new table at all.

What's a foreign key for?

I've been using table associations with SQL (MySQL) and Rails without a problem, and I've never needed to specify a foreign key constraint.
I just add a table_id column in the belongs_to table, and everything works just fine.
So what am I missing? What's the point of using the foreign key clause in MySQL or other RDBMS?
Thanks.
A foreign key is a referential constraint between two tables
The reason foreign key constraints exist is to guarantee that the referenced rows exist.
The foreign key identifies a column or set of columns in one (referencing or child) table that refers to a column or set of columns in another (referenced or parent) table.
you can get nice "on delete cascade" behavior, automatically cleaning up tables
There are lots of reason of using foreign key listed over here: Why Should one use foreign keys
Rails (ActiveRecord more specifically) auto-guesses the foreign key for you.
... By default this is guessed to be the name of the association with an “_id” suffix.
Foreign keys enforce referential integrity.
Foreign key: A column or set of columns in a table whose values are required to match at least one PrimaryKey values of a row of another table.
See also:
http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html#method-i-belongs_to
http://c2.com/cgi/wiki?ForeignKey
The basic idea of foreign keys, or any referential constraint, is that the database should not allow you to store obviously invalid data. It is a core component of data consistency, one of the ACID rules.
If your data model says that you can have multiple phone numbers associated with an account, you can define the phone table to require a valid account number. It's therefore impossible to store orphaned phone records because you cannot insert a row in the phone table without a valid account number, and you can't delete an account without first deleting the phone numbers. If the field is birthdate, you might enforce a constraint that the date be prior to tomorrow's date. If the field is height, you might enforce that the distance be between 30 and 4000 cm. This means that it is impossible for any application to store invalid data in the database.
"Well, why can'd I just write all that into my application?" you ask. For a single-application database, you can. However, any business with a non-trivial database that stores data used business operations will want to access data directly. They'll want to be able to import data from finance or HR, or export addresses to sales, or create application user accounts by importing them from Active Directory, etc. For a non-trivial application, the user's data is what's important, and that's what they will want to access. At some point, they will want to access their data without your application code getting in the way. This is the real power and strength of an RDMBS, and it's what makes system integration possible.
However, if all your rules are stored in the application, then your users will need to be extremely careful about how they manipulate their database, lest they cause the application to implode. If you specify relational constraints and referential integrity, you require that other applications modify the data in a way that makes sense to any application that's going to use it. The logic is tied to the data (where it belongs) rather than the application.
Note that MySQL is absolute balls with respect to referential integrity. It will tend to silently succeed rather than throw errors, usually by inserting obviously invalid values like a datetime of today when you try to insert a null date into a datetime field with the constraint not null default null. There's a good reason that DBAs say that MySQL is a joke.
Foreign keys enforce referential integrity. Foreign key constraint will prevent you or any other user from adding incorrect records by mistake in the table. It makes sure that the Data (ID) being entered in the foreign key does exists in the reference table. If some buggy client code tries to insert incorrect data then in case of foreign key constraint an exception will raise, otherwise if the constraint is absent then your database will end up with inconsistent data.
Some advantages of using foreign key I can think of:
Make data consistent among tables, prevent having bad data( e.g. table A has some records refer to something does not exist in table B)
Help to document our database
Some framework is based on foreign keys to generate domain model

What are database constraints? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
What is a clear definition of database constraint? Why are constraints important for a database? What are the types of constraints?
Constraints are part of a database schema definition.
A constraint is usually associated with a table and is created with a CREATE CONSTRAINT or CREATE ASSERTION SQL statement.
They define certain properties that data in a database must comply with. They can apply to a column, a whole table, more than one table or an entire schema. A reliable database system ensures that constraints hold at all times (except possibly inside a transaction, for so called deferred constraints).
Common kinds of constraints are:
not null - each value in a column must not be NULL
unique - value(s) in specified column(s) must be unique for each row in a table
primary key - value(s) in specified column(s) must be unique for each row in a table and not be NULL; normally each table in a database should have a primary key - it is used to identify individual records
foreign key - value(s) in specified column(s) must reference an existing record in another table (via it's primary key or some other unique constraint)
check - an expression is specified, which must evaluate to true for constraint to be satisfied
To understand why we need constraints, you must first understand the value of data integrity.
Data Integrity refers to the validity of data. Are your data valid? Are your data representing what you have designed them to?
What weird questions I ask you might think, but sadly enough all too often, databases are filled with garbage data, invalid references to rows in other tables, that are long gone... and values that doesn't mean anything to the business logic of your solution any longer.
All this garbage is not alone prone to reduce your performance, but is also a time-bomb under your application logic that eventually will retreive data that it is not designed to understand.
Constraints are rules you create at design-time that protect your data from becoming corrupt. It is essential for the long time survival of your heart child of a database solution. Without constraints your solution will definitely decay with time and heavy usage.
You have to acknowledge that designing your database design is only the birth of your solution. Here after it must live for (hopefully) a long time, and endure all kinds of (strange) behaviour by its end-users (ie. client applications). But this design-phase in development is crucial for the long-time success of your solution! Respect it, and pay it the time and attention it requires.
A wise man once said: "Data must protect itself!". And this is what constraints do. It is rules that keep the data in your database as valid as possible.
There are many ways of doing this, but basically they boil down to:
Foreign key constraints is probably the most used constraint,
and ensures that references to other
tables are only allowed if there
actually exists a target row to
reference. This also makes it
impossible to break such a
relationship by deleting the
referenced row creating a dead link.
Check constraints can ensure that only specific values are allowed in
certain column. You could create a constraint only allowing the word 'Yellow' or 'Blue' in a VARCHAR column. All other values would yield an error. Get ideas for usage of check constraints check the sys.check_constraints view in the AdventureWorks sample database
Rules in SQL Server are just reusable Check Constraints (allows
you to maintain the syntax from a
single place, and making it easier to
deploy your constraints to other
databases)
As I've hinted here, it takes some thorough considerations to construct the best and most defensive constraint approach for your database design. You first need to know the possibilities and limitations of the different constraint types above. Further reading could include:
FOREIGN KEY Constraints - Microsoft
Foreign key constraint - w3schools
CHECK Constraints
Good luck! ;)
Constraints are nothing but the rules on the data. What data is valid and what is invalid can be defined using constraints. So, that integrity of data can be maintained.
Following are the widely used constraints:
Primary Key : which uniquely identifies the data . If this constraint has been specified for certain column then we can't enter duplicate data in that column
Check : Such as NOT NULL . Here we can specify what data we can enter for that particular column and what is not expected for that column.
Foreign key : Foreign key references to the row of other table. So that data referred in one table from another table is always available for the referencing table.
Constraints can be used to enforce specific properties of data. A simple example is to limit an int column to values [0-100000]. This introduction looks good.
Constraints dictate what values are valid for data in the database. For example, you can enforce the a value is not null (a NOT NULL constraint), or that it exists as a unique constraint in another table (a FOREIGN KEY constraint), or that it's unique within this table (a UNIQUE constraint or perhaps PRIMARY KEY constraint depending on your requirements). More general constraints can be implemented using CHECK constraints.
The MSDN documentation for SQL Server 2008 constraints is probably your best starting place.
UNIQUE constraint (of which a PRIMARY KEY constraint is a variant). Checks that all values of a given field are unique across the table. This is X-axis constraint (records)
CHECK constraint (of which a NOT NULL constraint is a variant). Checks that a certain condition holds for the expression over the fields of the same record. This is Y-axis constraint (fields)
FOREIGN KEY constraint. Checks that a field's value is found among the values of a field in another table. This is Z-axis constraint (tables).
A database is the computerized logical representation of a conceptual (or business) model, consisting of a set of informal business rules. These rules are the user-understood meaning of the data. Because computers comprehend only formal representations, business rules cannot be represented directly in a database. They must be mapped to a formal representation, a logical model, which consists of a set of integrity constraints. These constraints — the database schema — are the logical representation in the database of the business rules and, therefore, are the DBMS-understood meaning of the data. It follows that if the DBMS is unaware of and/or does not enforce the full set of constraints representing the business rules, it has an incomplete understanding of what the data means and, therefore, cannot guarantee (a) its integrity by preventing corruption, (b) the integrity of inferences it makes from it (that is, query results) — this is another way of saying that the DBMS is, at best, incomplete.
Note: The DBMS-“understood” meaning — integrity constraints — is not identical to the user-understood meaning — business rules — but, the loss of some meaning notwithstanding, we gain the ability to mechanize logical inferences from the data.
"An Old Class of Errors" by Fabian Pascal
There are basically 4 types of main constraints in SQL:
Domain Constraint: if one of the attribute values provided for a new
tuple is not of the specified attribute domain
Key Constraint: if the value of a key attribute in a new tuple
already exists in another tuple in the relation
Referential Integrity: if a foreign key value in a new tuple
references a primary key value that does not exist in the referenced
relation
Entity Integrity: if the primary key value is null in a new tuple
constraints are conditions, that can validate specific condition.
Constraints related with database are Domain integrity, Entity integrity, Referential Integrity, User Defined Integrity constraints etc.