How to define unique indexes when softDeletes is enabled - sql

Will Laravel 4.1 manage the creation of a unique index (where deleted_at is null) by itself when softDeletes() is used?
Is the approach below correct? Or is it going to mix in already deleted records?
Schema::create('example', function(Blueprint $table) {
    $table->increments('id');
    $table->integer('example')->unsigned()->unique(); // ?????
    $table->softDeletes();
});
The database is MySQL, but if there are solutions specific to other databases, you can provide them as well. However, it should be done within the Laravel framework! A uniform solution that works with all databases Laravel officially supports is appreciated.
Update
It seems like this approach does not work, since it just ignores the softDeletes() option.
So the proposed solution is:
Schema::create('example', function(Blueprint $table) {
    $table->increments('id');
    $table->integer('example')->unsigned();
    $table->softDeletes();
    $table->unique(['example', 'deleted_at']);
});
The problem is that two rows could potentially end up with exactly the same timestamp in the deleted_at column.
What I actually need is a where-condition.
$table->unique('example', array('where', 'deleted_at', '=', null));
or
$table->integer('example')->unsigned()->unique()->where('deleted_at', '=', null)

I would recommend making a two-column UNIQUE constraint over the column you want to be unique (example) and a dummy column for instance called is_live. This column is always '1' when the row is not soft-deleted. When you soft-delete a row, set is_live=NULL.
The reason is related to the way "uniqueness" is defined. Unique constraints allow any number of rows that have a NULL value, because NULL is not equal to NULL in SQL, so two NULLs count as "not the same".
For multi-column unique keys, if any column is NULL, the whole set of columns in the unique key behaves as if it's not the same as any other row. Therefore you can have any number of rows that have one column of the unique key the same, as long as the other column in the unique key is NULL.
create table example (
  id serial primary key,
  example int unsigned not null,
  is_live tinyint default 1,
  unique key (example, is_live)
);
Demo: http://sqlfiddle.com/#!9/8d1e4d/1
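To make the behaviour concrete, here is a rough sketch of the insert/soft-delete sequence against that table (the values are illustrative; in Laravel the soft-delete UPDATE would also set deleted_at):
insert into example (example) values (42);             -- live row: (42, 1)
insert into example (example) values (42);             -- rejected: duplicate key (42, 1)
update example set is_live = NULL where example = 42;  -- soft delete: the row becomes (42, NULL)
insert into example (example) values (42);             -- accepted: the new (42, 1) no longer conflicts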
PS: The direct answer to your question, about implementing a condition for indexes in MySQL, is that MySQL doesn't support this. Databases that do support partial indexes include:
PostgreSQL (https://www.postgresql.org/docs/current/static/indexes-partial.html)
Microsoft SQL Server (https://msdn.microsoft.com/en-us/library/cc280372.aspx)

Using $table->softDeletes() doesn't change how the schema builder sets unique indexes. In your example, only the example column will be unique.
If you want to have multiple columns in your unique index, just run $table->unique(['column1', 'column2']).
To set a unique index you either chain it, like $table->integer('example')->unique(), or declare it on a separate line, like I wrote above.

I have the same problem. Using two columns as a unique key, after a soft delete I can't create another row with the same unique data.
What I want is to achieve the following SQL using the Blueprint table object (not a raw query):
CREATE UNIQUE INDEX test ON test_table USING btree (test_id, user_id) WHERE deleted_at IS NULL;
But the Blueprint and Fluent objects don't have any where method.

Related

SQL table with incompatible columns (only 1 must be used at a time)

Context:
Let's consider that I have a database with a table "house". I also have tables "tiledRoof" and "thatchedRoof".
Aim:
All my houses must have only one roof at a time. It can be a tiled one or a thatched one, but not both. Even if it doesn't make a lot of sense, imagine that we might change the roof of our houses many times.
My solution:
I can think of two solutions to link houses to roofs:
Solution 1: delete/create roofs every time
The database would look like this (more or less pseudo SQL):
house {
  tiledRoof_id int DEFAULT NULL FOREIGN KEY REFERENCES tiledRoof(id)
  thatchedRoof_id int DEFAULT NULL FOREIGN KEY REFERENCES thatchedRoof(id)
  // Other columns ...
}
tiledRoof {
  id
  // Other columns ...
}
thatchedRoof {
  id
  // Other columns ...
}
So, I make "tiledRoof_id" and "thatchedRoof_id" nullable. Then if I want to link an house with a tiled roof, I do an upsert in the table "tiledRoof" . If a row have been created, I update "tiledRoof_id" to match the id created. Then, if my house was linked to a thatched roof, I delete a row in "thatchedRoof" and set "thatchedRoof_id" to NULL (I guess I can do it automatically by implementing the onDelete of my foreign key constraint).
Downsides:
Deleting a row and later creating a similar one might not be very clever. If I change my roof 50 times, I will create 50 rows and also delete 49 of them...
More queries to run than with the second solution.
Solution 2: add "enabler columns"
The database would look like this (more or less pseudo SQL):
house {
  tiledRoof_id int DEFAULT(...) FOREIGN KEY REFERENCES tiledRoof(id)
  thatchedRoof_id int DEFAULT(...) FOREIGN KEY REFERENCES thatchedRoof(id)
  tiledRoof_enabled boolean DEFAULT True
  thatchedRoof_enabled boolean DEFAULT False
  // Other columns ...
}
tiledRoof {
  id
  // Other columns ...
}
thatchedRoof {
  id
  // Other columns ...
}
I fill both "tiledRoof_id" and "thatchedRoof_id" with a foreign id that links each of my houses to a tile roof AND to a thatched roof.
To make my house not really having both roofs, I just enable one of them. To do so I implement 2 additional columns : "tiledRoof_enabled " and "thatchedRoof_enabled" that will define which roof is enabled.
Alternatively, I can use a single column to set the enabled roof if that column takes an integer (1 would means that the tiled one is enabled and 2 would means the thatched one).
Difficulty:
To make that solution work, it would require an implementation of the default value of "tiledRoof_id" and "thatchedRoof_id" that might not be possible: it would have to insert a new row into the corresponding roof table and use the resulting row id as the default value.
If that cannot be done, I have to start by running queries to create my roofs and only then create my house.
Question:
What is the best way to achieve my goal? One of the solutions I proposed? Another one? If it's the second of my propositions, I would be grateful if you could explain whether my difficulty can be resolved, and how.
Note:
I'm working with sqlite3 (mentioned just in case of syntax differences).
It sounds like you want a slowly changing dimension. Given only two types, I would suggest:
create table house_roofs (
  house_id int references houses(house_id),
  thatched_roof_id int references thatched_roofs(thatched_roof_id),
  tiled_roof_id int references tiled_roofs(tiled_roof_id),
  version_eff_dt datetime not null,
  version_end_dt datetime,
  check (thatched_roof_id is null or tiled_roof_id is null) -- only one at a time
);
This allows you to have properly declared foreign key relationships.
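A sketch of how the versioned table might be used, assuming an open version_end_dt marks the current roof (the ids and SQLite's datetime('now') are illustrative):
-- close the current version when the roof changes
update house_roofs set version_end_dt = datetime('now')
  where house_id = 1 and version_end_dt is null;
-- record the new roof as the current version
insert into house_roofs (house_id, tiled_roof_id, version_eff_dt)
  values (1, 7, datetime('now'));
-- the current roof of a house is the row with no end date
select * from house_roofs where house_id = 1 and version_end_dt is null;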
Are you sure you need to normalize the roof type? Why not simply add a boolean for each of the roof types to your house table? SQLite doesn't actually have a boolean type, so you could use an integer 0 or 1.
Note: You would still want to have the tables thatchedRoof and tiledRoof if there are details about each of those types that are generic for all roofs of that type.
If the tables thatchedRoof and tiledRoof contain details that are specific to each individual house, then this strategy may not work too well.
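A minimal sketch of that idea in SQLite; the CHECK constraint is my addition, to keep at most one roof type enabled per house:
create table house (
  id integer primary key,
  -- other house columns ...
  has_tiled_roof integer not null default 0,    -- 0 or 1
  has_thatched_roof integer not null default 0, -- 0 or 1
  check (has_tiled_roof + has_thatched_roof <= 1)
);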

Best practice to enforce uniqueness on column but allow some duplicates?

Here is what I am trying to figure out: there should be a table to store authorizations for our new client management system, and every authorization has its unique identifier. This constraint would be pretty easy to translate to SQL, but unfortunately, because of the slowness of bureaucracy, we sometimes need to create an entry with a placeholder ID (e.g., "temp") so that the client can start taking services.
What would be the best practice to enforce this conditional uniqueness constraint?
These are what I could come up with my limited experience:
Use partial indexing as mentioned in the PostgreSQL manual (section 5.3.3, Example 11-3). It also mentions that "this is a particularly efficient approach when there are few successful tests and many unsuccessful ones." In our legacy DB that will be migrated, there are 130,000 rows and about 5 temp authorizations a month, but the whole table only grows by about 200 rows per year. Would this be the right approach? (I am also not sure what "efficient" means in this context.)
Create a separate table for the temp authorizations, but that would duplicate the table structure.
Define a unique constraint for a group of columns. An authorization is for a specific service for a certain time period issued to an individual.
EDIT:
I'm sorry, I think my description of the authorization ID was a bit obscure: it is provided by a state department in the format NMED012345678 and it is entered by hand. It is unique, but sometimes it is only provided at a later time, for unknown reasons.
There is a simple, fast and secure way:
Add a boolean column to mark temporary entries, NULL by default, say:
temp bool DEFAULT NULL CHECK (temp)
The added check constraint disallows FALSE; only NULL or TRUE are possible. The storage cost for the default NULL value is typically nothing, unless there are no other NULL values in the row.
How much disk-space is needed to store a NULL value using postgresql DB?
The column default means you don't normally have to take care of the column. It's NULL by default (which is the default default anyway, I'm just being explicit here). You only need to mark the few exceptions explicitly.
Then create a partial unique index like:
CREATE UNIQUE INDEX tbl_unique_id_uni ON tbl (unique_id) WHERE temp IS NULL;
The index only includes rows that are supposed to be unique, so index size is not increased at all.
Be sure to add the predicate WHERE temp IS NULL to queries that are supposed to use the unique index.
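Putting the pieces together, a rough sketch of the whole approach (the table and column names follow the answer; the sample values are made up):
ALTER TABLE tbl ADD COLUMN temp bool DEFAULT NULL CHECK (temp);
CREATE UNIQUE INDEX tbl_unique_id_uni ON tbl (unique_id) WHERE temp IS NULL;

INSERT INTO tbl (unique_id)       VALUES ('NMED012345678');      -- real authorization
INSERT INTO tbl (unique_id)       VALUES ('NMED012345678');      -- rejected: duplicate in the partial index
INSERT INTO tbl (unique_id, temp) VALUES ('placeholder', true);  -- temporary entry
INSERT INTO tbl (unique_id, temp) VALUES ('placeholder', true);  -- allowed: temp rows are outside the index

-- queries that should use the unique index must repeat the predicate
SELECT * FROM tbl WHERE unique_id = 'NMED012345678' AND temp IS NULL;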
Related:
Create unique constraint with null columns
There are several possibilities:
Make the temp identifiers unique; for instance, if they are created automatically (not entered manually), make them:
CREATE SEQUENCE temp_ids_seq; -- done only once for the database
Whenever you need a new temporary id, issue
'temp' || nextval('temp_ids_seq') AS id
Use a partial index, assuming that the value which is allowed to repeat is 'temp':
CREATE UNIQUE INDEX tbl_unique_idx ON tbl (id) WHERE (id IS DISTINCT FROM 'temp')
For the sake of efficiency, in that case you would probably also want the complementary index:
CREATE INDEX tbl_temp_idx ON tbl (id) WHERE (id IS NOT DISTINCT FROM 'temp')
This last index will help queries seeking id = 'temp'.
This is a bit long for a comment.
I think I would have an authorization table with a unique authorization id. The authorization could then have two types, "approved" and "temporary", which you could handle with two columns.
However, I would probably have the authorization id as a serial column, with the "approved" id being a field in the table. That field could have a unique constraint on it: either a full unique constraint or a partial (filtered) unique index (Postgres allows multiple NULL values under a unique constraint, but the filtered variant is more explicit).
You can have the same process for the temporary authorizations -- using a different column. Presumably you have some mechanism for authorizing them and storing the approval date, time, and person.
I would not use two tables. Having authorizations spread among multiple tables just seems likely to sow confusion. Anywhere in the code where you want to see who has an authorization becomes a potential place to misread the data.
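A possible sketch of that layout (the column names and types are my guesses, not from the answer):
CREATE TABLE authorizations (
    authorization_id serial PRIMARY KEY,   -- internal id, always present
    approved_id      text UNIQUE,          -- state-issued id, stays NULL until it arrives
    temp_granted_by  text,                 -- who allowed the temporary authorization
    temp_granted_at  timestamptz           -- when it was allowed
);
Temporary rows simply leave approved_id NULL; since Postgres allows multiple NULLs under a unique constraint, the constraint only bites once the real identifier is filled in.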
IMO it is not advisable to use remote keys as (part of) primary keys:
they are not under your control; they can change
you cannot guarantee correctness and/or uniqueness (email addresses, telephone numbers, licence numbers, serial numbers)
using them as a PK would cause them to be used as FKs from other tables into this table, with fat indexes and lots of cascading on change.
CREATE TABLE the_persons
( seq SERIAL NOT NULL PRIMARY KEY -- surrogate key
, registrationnumber varchar -- "remote" KEY, not necessarily UNIQUE
, is_validated BOOLEAN NOT NULL DEFAULT FALSE
, last_name varchar
, dob DATE
);
CREATE INDEX name_dob_idx ON the_persons(last_name, dob)
;
CREATE UNIQUE INDEX registrationnumber_idx ON the_persons(registrationnumber,seq)
-- WHERE is_validated = False
;
CREATE UNIQUE INDEX registrationnumber_key ON the_persons(registrationnumber)
WHERE is_validated = True
;
INSERT INTO the_persons(is_validated,registrationnumber,last_name, dob)VALUES
( True, 'OKAY001', 'Smith', '1988-02-02')
,( True, 'OKAY002', 'Jones', '1988-02-02')
,( False, 'OKAY001', 'Smith', '1988-02-02')
,( False, 'OMG001', 'Smith', '1988-08-02')
;
-- validated records:
SELECT *
FROM the_persons
WHERE is_validated = True
;
-- some records with nasty cousins
SELECT *
FROM the_persons p
WHERE EXISTS (
    SELECT *
    FROM the_persons x
    WHERE x.registrationnumber = p.registrationnumber
      AND x.is_validated = False
)
AND last_name LIKE 'Smith%'
;

UNIQUE constraint controlled by a bit column

I have a table, something like:
FieldsOnForms (
  FieldID int (FK_Fields)
  FormID int (FK_Forms)
  isDeleted bit
)
The pair (FieldID, FormID) should be unique, BUT only if the row is not deleted (isDeleted = 0).
Is it possible to define such a constraint in SQL Server 2008 (without using triggers)?
P.S. Setting (FieldID, FormID, isDeleted) to be unique adds the possibility to mark one row as deleted, but I would like to be able to set n rows (per FieldID, FormID) to isDeleted = 1, and to have only one with isDeleted = 0.
You can have a unique index, using the SQL Server 2008 filtered indexes feature, or you can apply a UNIQUE index against a view (poor man's filtered index, works against earlier versions), but you cannot have a UNIQUE constraint such as you've described.
An example of the filtered index:
CREATE UNIQUE NONCLUSTERED INDEX IX_FieldsOnForms_NonDeletedUnique ON FieldsOnForms (FieldID,FormID) WHERE isDeleted=0
You could change your IsDeleted column to a DeletedDate and make it a DATETIME with the exact time when the row was logically deleted. Alternatively, you could add a DeletedDate column and then create an IsDeleted computed column against it, so that the column is still available if it's being used in code. You would then of course put a unique index over the DeletedDate (in addition to FieldID and FormID) instead of the IsDeleted column. That would allow exactly one row with a NULL DeletedDate per (FieldID, FormID) pair.
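One way to read the second suggestion in T-SQL (a sketch; the names are illustrative, and SQL Server treats NULLs as equal in a unique index, so only one undeleted row per pair is allowed):
ALTER TABLE FieldsOnForms ADD DeletedDate DATETIME NULL;
-- replace the bit column with a computed one so code reading isDeleted keeps working
ALTER TABLE FieldsOnForms DROP COLUMN isDeleted;
ALTER TABLE FieldsOnForms ADD isDeleted AS CAST(CASE WHEN DeletedDate IS NULL THEN 0 ELSE 1 END AS BIT);
-- a NULL DeletedDate marks the live row; the unique index allows only one per (FieldID, FormID)
CREATE UNIQUE INDEX IX_FieldsOnForms_OneLive ON FieldsOnForms (FieldID, FormID, DeletedDate);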
Albin posted a solution similar to this, but then deleted it. I'm not sure why, but if he re-posts it then his was here before mine. :)
No, unique means really unique. You'll either have to move your deleted records to another table or change IsDeleted to something that can be unique across all deleted records (say, a timestamp). Either solution will require additional work in your application, in a stored procedure, or in a trigger.

MySql compound keys and null values

I have noticed that if I have a unique compound key over two columns, column_a and column_b, then MySQL ignores this constraint if one column is NULL.
E.g.
if column_a = 1 and column_b = NULL, I can insert column_a = 1 and column_b = NULL as often as I like
if column_a = 1 and column_b = 2, I can only insert this value once.
Is there a way to apply this constraint, other than maybe changing the columns to Not Null and setting default values?
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
"A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does not apply to NULL values except for the BDB storage engine. For other engines, a UNIQUE index allows multiple NULL values for columns that can contain NULL."
So, no, you can't get MySQL to treat NULL as a unique value. I guess you have a couple of choices: you could do what you suggested in your question and store a "special value" instead of null, or you could use the BDB engine for the table. I don't think this minor difference in behaviour warrants making an unusual choice of storage engine, though.
I worked around this issue by creating a virtual (stored) generated column on the same table, defined as COALESCE(column_b, 0). I then made my unique composite index based on that column (and the other column) instead. Works very well.
Of course this was probably not possible back in 2010 :)
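A sketch of that workaround on MySQL 5.7 or later (the table and index names are illustrative):
ALTER TABLE my_table
  ADD COLUMN column_b_key INT AS (COALESCE(column_b, 0)) STORED;
ALTER TABLE my_table
  ADD UNIQUE KEY uq_a_bkey (column_a, column_b_key);
Rows where column_b is NULL now collide on (column_a, 0), which closes the two-NULL loophole; it does assume that 0 never occurs as a real column_b value.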

How to update unique values in SQL using a PostgreSQL sequence?

In SQL, how do I update a table, setting a column to a different value for each row?
I want to update some rows in a PostgreSQL database, setting one column to a number from a sequence, where that column has a unique constraint. I hoped that I could just use:
update person set unique_number = (select nextval('number_sequence') );
but it seems that nextval is only called once, so the update uses the same number for every row, and I get a 'duplicate key violates unique constraint' error. What should I do instead?
Don't use a subselect; the uncorrelated subquery is evaluated only once for the whole statement, which is why every row got the same number. Use the nextval function directly, like this:
update person set unique_number = nextval('number_sequence');
I consider pg's sequences a hack and a sign that incremental integers aren't the best way to key rows, although PostgreSQL didn't get native support for UUIDs until 8.3:
http://www.postgresql.org/docs/8.3/interactive/datatype-uuid.html
The benefit of UUIDs is that the number of possible values is so vast that collisions are effectively never an issue, unlike a smaller random number, which will hit a collision one day.
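For reference, on modern PostgreSQL (13 and later) gen_random_uuid() is built in; on older versions the pgcrypto or uuid-ossp extension provides an equivalent. A sketch with illustrative names:
CREATE TABLE person (
    id   uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    name text NOT NULL
);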