MySQL compound keys and NULL values

I have noticed that if I have a unique compound key on two columns, column_a and column_b, then MySQL ignores this constraint if one column is NULL.
E.g.
If column_a = 1 and column_b = NULL, I can insert column_a = 1 and column_b = NULL as many times as I like.
If column_a = 1 and column_b = 2, I can only insert this combination once.
Is there a way to apply this constraint, other than changing the columns to NOT NULL and setting default values?

http://dev.mysql.com/doc/refman/5.0/en/create-index.html
"A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does not apply to NULL values except for the BDB storage engine. For other engines, a UNIQUE index allows multiple NULL values for columns that can contain NULL."
So, no, you can't get MySQL to treat NULL as a unique value. I guess you have a couple of choices: you could do what you suggested in your question and store a "special value" instead of null, or you could use the BDB engine for the table. I don't think this minor difference in behaviour warrants making an unusual choice of storage engine, though.
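To illustrate the behaviour, here is a minimal sketch (table and column names are just for demonstration):
CREATE TABLE t (
    column_a INT,
    column_b INT,
    UNIQUE KEY uq_a_b (column_a, column_b)
);
INSERT INTO t VALUES (1, 2);    -- ok
INSERT INTO t VALUES (1, 2);    -- fails with a duplicate key error
INSERT INTO t VALUES (1, NULL); -- ok
INSERT INTO t VALUES (1, NULL); -- also ok: rows containing NULL are exempt from the unique check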

I worked around this issue by creating a generated (stored) column on the same table defined as COALESCE(column_b, 0). I then made my unique composite index based on that column (and the other column) instead. Works very well.
Of course this was probably not possible back in 2010 :)
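For reference, a sketch of that workaround in MySQL 5.7+ generated-column syntax (table and column names are illustrative):
CREATE TABLE t (
    column_a INT NOT NULL,
    column_b INT NULL,
    -- stored generated column that maps NULL to a sentinel value
    column_b_key INT GENERATED ALWAYS AS (COALESCE(column_b, 0)) STORED,
    UNIQUE KEY uq_a_bkey (column_a, column_b_key)
);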

Related

Best practice to enforce uniqueness on column but allow some duplicates?

Here is what I am trying to figure out: there should be a table to store authorizations for our new client management system, and every authorization has its own unique identifier. This constraint would be pretty easy to translate to SQL, but unfortunately, because of the slowness of bureaucracy, sometimes we need to create an entry with a placeholder ID (e.g., "temp") so that the client can start taking services.
What would be the best practice to enforce this conditional uniqueness constraint?
These are what I could come up with my limited experience:
Use partial indexing, as mentioned in the PostgreSQL manual (5.3.3. -> Example 11-3.). It also mentions that "this is a particularly efficient approach when there are few successful tests and many unsuccessful ones." In our legacy DB that will be migrated, there are 130,000 rows and about 5 temp authorizations a month, but the whole table only grows by about 200 rows per year. Would this be the right approach? (I am also not sure what "efficient" means in this context.)
Create a separate table for the temp authorizations but then it would duplicate the table structure.
Define a unique constraint for a group of columns. An authorization is for a specific service for a certain time period issued to an individual.
EDIT:
I'm sorry I think my description of the authorization ID was a bit obscure: it is provided by a state department with the format of NMED012345678 and it is entered by hand. It is unique, but sometimes only provided at a later time for unknown reasons.
There is a simple, fast and secure way:
Add a boolean column to mark temporary entries which is NULL by default, say:
temp bool DEFAULT NULL CHECK (temp)
The added check constraint disallows FALSE; only NULL or TRUE are possible. The storage cost for the default NULL value is typically nothing, unless there are no other NULL values in the row:
How much disk-space is needed to store a NULL value using postgresql DB?
The column default means you don't normally have to take care of the column. It's NULL by default (which is the default default anyway, I'm just being explicit here). You only need to mark the few exceptions explicitly.
Then create a partial unique index like:
CREATE UNIQUE INDEX tbl_unique_id_uni ON tbl (unique_id) WHERE temp IS NULL;
That only includes rows that are supposed to be unique. Index size is not increased at all.
Be sure to add the predicate WHERE temp IS NULL to queries that are supposed to use the unique index.
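For illustration, assuming a table tbl with columns unique_id and temp as described, typical usage might look like this:
-- regular entry: covered by the partial unique index
INSERT INTO tbl (unique_id) VALUES ('NMED012345678');
-- placeholder entry: explicitly marked temporary, exempt from the unique index
INSERT INTO tbl (unique_id, temp) VALUES ('temp', true);
-- lookup that can use the partial unique index
SELECT * FROM tbl WHERE unique_id = 'NMED012345678' AND temp IS NULL;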
Related:
Create unique constraint with null columns
You can have several possibilities:
Make the temp identifiers unique; for instance, if they are automatically created (not entered manually), generate them from a sequence:
CREATE SEQUENCE temp_ids_seq ; -- this is done only once for the database
Whenever you need a new temporary id, issue:
'temp' || nextval('temp_ids_seq') AS id
Use a partial index, assuming that the placeholder value is 'temp':
CREATE UNIQUE INDEX tbl_unique_idx ON tbl (id) WHERE (id IS DISTINCT FROM 'temp')
For the sake of efficiency, in that case you would probably also want the complementary index:
CREATE INDEX tbl_temp_idx ON tbl (id) WHERE (id IS NOT DISTINCT FROM 'temp')
This last index will help queries seeking id = 'temp'.
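Putting the first approach together, inserting a placeholder row might then look like this (the table and the client_id column are assumptions for illustration):
INSERT INTO tbl (id, client_id)
VALUES ('temp' || nextval('temp_ids_seq'), 42);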
This is a bit long for a comment.
I think I would have an authorization table with a unique authorization. The authorization could then have two types: "approved" and "temporary". You could handle this with two columns.
However, I would probably have the authorization id as a serial column with the "approved" id being a field in the table. That table could have a unique constraint on it. You can use either a full unique constraint or a unique constraint with filtered values (Postgres allows multiple NULL values in a unique constraint, but the second is more explicit).
You can have the same process for the temporary authorizations -- using a different column. Presumably you have some mechanism for authorizing them and storing the approval date, time, and person.
I would not use two tables. Having authorizations spread among multiple tables just seems likely to sow confusion. Anywhere in the code where you want to see who has an authorization is a potential place to misread the data.
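A rough sketch of that single-table layout (all names here are illustrative, not taken from the question):
CREATE TABLE authorizations (
    authorization_id serial PRIMARY KEY,  -- internal surrogate key, always unique
    approved_id      text,                -- state-issued ID, NULL while still pending
    approved_at      timestamptz,         -- when the real ID was recorded
    approved_by      text                 -- who recorded it
);
-- enforce uniqueness only on approved IDs that are actually present
CREATE UNIQUE INDEX authorizations_approved_id_key
    ON authorizations (approved_id)
    WHERE approved_id IS NOT NULL;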
IMO it is not advisable to use remote keys as (part of) primary keys.
they are not under your control; they can change
you cannot guarantee correctness and/or uniqueness (email addresses, telephone numbers, licence numbers, serial numbers)
using them as a PK would cause them to be used as an FK in other tables referencing this table, with fat indexes and lots of cascading on change.
\i tmp.sql
CREATE TABLE the_persons
( seq SERIAL NOT NULL PRIMARY KEY -- surrogate key
, registrationnumber varchar -- "remote" KEY, not necessarily UNIQUE
, is_validated BOOLEAN NOT NULL DEFAULT FALSE
, last_name varchar
, dob DATE
);
CREATE INDEX name_dob_idx ON the_persons(last_name, dob)
;
CREATE UNIQUE INDEX registrationnumber_idx ON the_persons(registrationnumber,seq)
-- WHERE is_validated = False
;
CREATE UNIQUE INDEX registrationnumber_key ON the_persons(registrationnumber)
WHERE is_validated = True
;
INSERT INTO the_persons(is_validated,registrationnumber,last_name, dob)VALUES
( True, 'OKAY001', 'Smith', '1988-02-02')
,( True, 'OKAY002', 'Jones', '1988-02-02')
,( False, 'OKAY001', 'Smith', '1988-02-02')
,( False, 'OMG001', 'Smith', '1988-08-02')
;
-- validated records:
SELECT *
FROM the_persons
WHERE is_validated = True
;
-- some records with nasty cousins
SELECT *
FROM the_persons p
WHERE EXISTS (
SELECT *
FROM the_persons x
WHERE x.registrationnumber = p.registrationnumber
AND x.is_validated = False
)
AND last_name LIKE 'Smith%'
;

How to define unique indexes when softDeletes is enabled

Will Laravel 4.1 manage the creation of a unique index (where deleted_at = null) by itself when using softDeletes?
Is the approach below correct? Or is it going to mix in already deleted records?
Schema::create('example', function(Blueprint $table) {
$table->increments('id');
$table->integer('example')->unsigned()->unique(); //?????
$table->softDeletes();
});
The database is MySQL, but if there are solutions specific to other databases, you can provide them as well. However, it should be done within the Laravel framework! A uniform solution that works with all databases that Laravel officially supports is appreciated.
Update
It seems like this approach does not work, since it just ignores the softDeletes() option.
So, the proposed solution:
Schema::create('example', function(Blueprint $table) {
$table->increments('id');
$table->integer('example')->unsigned();
$table->softDeletes();
$table->unique(['example', 'deleted_at']);
});
The problem is that there could potentially be two rows with exactly the same timestamp in the deleted_at column.
What I actually need is a WHERE condition, something like:
$table->unique('example', array('where', 'deleted_at', '=', null));
or
$table->integer('example')->unsigned()->unique()->where('deleted_at', '=', null)
I would recommend making a two-column UNIQUE constraint over the column you want to be unique (example) and a dummy column, for instance called is_live. This column is always 1 when the row is not soft-deleted. When you soft-delete a row, set is_live = NULL.
The reason is related to the way "uniqueness" is defined. Unique constraints allow any number of rows that have a NULL value. This is because NULL is not equal to NULL in SQL, so therefore two NULLs count as "not the same".
For multi-column unique keys, if any column is NULL, the whole set of columns in the unique key behaves as if it's not the same as any other row. Therefore you can have any number of rows that have one column of the unique key the same, as long as the other column in the unique key is NULL.
create table example (
id serial primary key,
example int unsigned not null,
is_live tinyint default 1,
unique key (example, is_live)
);
Demo: http://sqlfiddle.com/#!9/8d1e4d/1
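To make the mechanics concrete, the soft-delete flow against this schema would look roughly like this:
INSERT INTO example (example, is_live) VALUES (100, 1);                -- live row, blocks duplicates
UPDATE example SET is_live = NULL WHERE example = 100 AND is_live = 1; -- soft delete
INSERT INTO example (example, is_live) VALUES (100, 1);                -- allowed again: NULL never collides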
PS: The direct answer to your question, about implementing a condition for indexes in MySQL, is that MySQL doesn't support this. Databases that do support partial indexes include:
PostgreSQL (https://www.postgresql.org/docs/current/static/indexes-partial.html)
Microsoft SQL Server (https://msdn.microsoft.com/en-us/library/cc280372.aspx)
Using $table->softDeletes() doesn't change how the Schema sets unique indexes. In your example only the example column will be unique.
If you want to have multiple columns in your unique index just run $table->unique(['column1', 'column2']).
To set a unique index you either chain it, like $table->integer('example')->unique(), or declare it on its own line, as I wrote above.
I have the same problem. Using two columns as a unique key, after a soft delete I can't create another row with the same unique data.
What I want is to achieve this SQL using the Blueprint table object (not a raw query):
CREATE UNIQUE INDEX test ON test_table USING btree (test_id, user_id) WHERE deleted_at IS NULL;
But the Blueprint and Fluent objects don't have any where method.

How to allow NULL value for one column in unique index on multiple columns

Seems like a pretty straightforward question, but I can't seem to locate the specific answer anywhere online.
I have a price model that references both a quote and a line item, so it has quote_id and line_item_id columns. I am using a multicolumn index with unique: true to prevent multiple prices for the same line_item from being attached to a quote. quote_id is the first column in the index.
However, I want to allow users to add prices to line_items that haven't been quoted. It works fine for one price, quote_id is null and line_item_id is present. However, the unique index is preventing me from attaching a second price to the same line_item. The null value in quote_id is being treated as unique.
Is there any way to make the unique constraint only apply when the quote is not null?
I know that allow_nil can be used in model validation, but can it be used in the index?
I am thinking something like:
add_index :prices, [:quote_id, :line_item_id], :unique => true, :allow_nil => true
PostgreSQL and Rails 4.
Is there any way to make the unique constraint only apply when the quote is not null?
Actually, that is the only way it can work. You can have multiple "identical" entries with one or more of the columns in a multicolumn index being NULL, because Postgres does not consider two NULL values identical for this purpose (as in most contexts).
The sequence of columns in the index doesn't matter for this. (It matters for other purposes, though.)
Make sure the underlying columns in the table itself can be NULL. (NULL is sometimes confused with the empty string ''.)
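A short demonstration of that behaviour, assuming the usual Rails-generated index and illustrative values:
-- given: CREATE UNIQUE INDEX index_prices_on_quote_id_and_line_item_id
--            ON prices (quote_id, line_item_id);
INSERT INTO prices (quote_id, line_item_id) VALUES (NULL, 7);  -- ok
INSERT INTO prices (quote_id, line_item_id) VALUES (NULL, 7);  -- also ok: NULL never equals NULL
INSERT INTO prices (quote_id, line_item_id) VALUES (1, 7);     -- ok
INSERT INTO prices (quote_id, line_item_id) VALUES (1, 7);     -- fails with a duplicate key error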
After re-reading your question, the more appropriate scenario seems to be this related answer:
Create a multicolumn index to enforce uniqueness
This one may be of help, too, but the question asks for the opposite of yours:
How to add a conditional unique index on PostgreSQL
If you actually want a partial index with only non-null values, you can do that too:
CREATE INDEX price_uni_idx ON price (quote_id, line_item_id)
WHERE quote_id IS NOT NULL
AND line_item_id IS NOT NULL;
You do not need that for your purpose, though. It may still be useful to exclude rows with NULL values from the index to make it faster. Details here:
Indexed ORDER BY with LIMIT 1

How to apply a unique key constraint on two columns of a table when the existing values cause an error

The CREATE UNIQUE INDEX statement terminated because a duplicate key was found
for the object name 'dbo.tblhm' and the index name 'New_id1'. The duplicate
key value is (45560, 44200).
I want to know how to create a unique key constraint over these two columns together. The values previously stored in the database do not satisfy that constraint, which is why I am getting the error above. How can I overcome this so that the constraint can be added and no values in the database have to be deleted?
If I follow you correctly, you have a duplicate key which you want to ignore but still want to apply a unique constraint going forward? I don't think this is possible. Either you need to remove the duplicate row (or update it so that it is not a duplicate), move the duplicated data into an archive table without a unique index, or add the index to the existing table without a unique constraint.
I stand to be corrected, but I don't think there is any other way round this.
Let's assume that you are creating a unique index on columns column1 and column2 in your table dbo.tblhm.
This assumes that no combination of column1, column2 values is repeated in any rows of table dbo.tblhm.
As per your error, the following combination (45560, 44200) of values for column1, column2 is present in more than 1 row and hence the constraint fails.
What you need to do is to clean up your data first using an UPDATE statement to change the column1 or column2 values in the rows that are duplicates BEFORE you try to create the constraint.
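For example, a query along these lines (the column names are assumed from the error message) will show which combinations need to be cleaned up first:
SELECT column1, column2, COUNT(*) AS duplicate_count
FROM dbo.tblhm
GROUP BY column1, column2
HAVING COUNT(*) > 1;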
AFAIK, in Oracle you have the "novalidate" keyword, which can be used to achieve what you want without cleaning up the existing data. But at least I am not aware of any way to achieve that in SQL Server without first cleaning up the data.
The error means exactly what it says - there is more than one row with the same key.
i.e. for
CREATE UNIQUE INDEX New_id1 on dbo.tblhm(Column1, Column2)
there is more than one row with the same values for Column1 and Column2
So either
Your data is corrupt (e.g. inserting without checking for duplicates) - you will need to find and merge / delete duplicate keys before recreating the index
Or your Index can't be unique (e.g. there is a valid reason why there can be more than one row with this key, e.g. at a business level).

How to update unique values in SQL using a PostgreSQL sequence?

In SQL, how do you update a table, setting a column to a different value for each row?
I want to update some rows in a PostgreSQL database, setting one column to a number from a sequence, where that column has a unique constraint. I hoped that I could just use:
update person set unique_number = (select nextval('number_sequence') );
but it seems that nextval is only called once, so the update uses the same number for every row, and I get a 'duplicate key violates unique constraint' error. What should I do instead?
Don't use a subselect, rather use the nextval function directly, like this:
update person set unique_number = nextval('number_sequence');
I consider pg's sequences a hack and a sign that incremental integers aren't the best way to key rows. PostgreSQL didn't get native support for UUIDs until 8.3, though:
http://www.postgresql.org/docs/8.3/interactive/datatype-uuid.html
The benefit of UUIDs is that the possible combinations are nearly infinite, unlike a random number, which will hit a collision one day.
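A minimal sketch of that alternative on a recent PostgreSQL version, using the uuid-ossp contrib module (table and column names are illustrative):
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE person_example (
    id   uuid PRIMARY KEY DEFAULT uuid_generate_v4(),  -- random UUID instead of a sequence
    name text
);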