I recently read about a way to ensure unique values in a column in SQL while allowing multiple NULLs.
This was done using filtered indexes:
CREATE UNIQUE INDEX indexName ON tableName(columns) INCLUDE includeColumns
WHERE columnName IS NOT NULL
Could someone explain how this actually works? Is the UNIQUE constraint created on the column or not?
To answer your first question: When the index is filtered, anything that doesn't fit the criteria in the WHERE clause is simply left out of the index.
If the index is unique, the uniqueness is enforced only on the data that fits the criteria in the where clause.
To answer your second question: In SQL Server, unique constraints are implemented by creating unique indexes under the hood, so there really is not much difference between them. In any case, the uniqueness is enforced on an index and not directly on the table column.
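To see the filtering behaviour described above in action, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here (it calls these "partial indexes"), and the table, column and index names are made up for illustration. Because SQLite already treats NULLs as distinct in any unique index, the sketch uses a different filter (active = 1) so that the effect of the WHERE clause is actually visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; none of these names come from the question.
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT, active INTEGER)")
# Filtered (partial) unique index: only rows with active = 1 are indexed.
conn.execute("CREATE UNIQUE INDEX ux_accounts_email ON accounts(email) WHERE active = 1")

# Rows outside the filter never enter the index, so duplicates are accepted.
conn.execute("INSERT INTO accounts (email, active) VALUES ('a@example.com', 0)")
conn.execute("INSERT INTO accounts (email, active) VALUES ('a@example.com', 0)")
# The first row inside the filter is accepted as well.
conn.execute("INSERT INTO accounts (email, active) VALUES ('a@example.com', 1)")

# A second row inside the filter with the same email violates the index.
try:
    conn.execute("INSERT INTO accounts (email, active) VALUES ('a@example.com', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(duplicate_rejected, row_count)
```

In SQL Server the same idea is expressed with the WHERE columnName IS NOT NULL predicate from the question; the sketch only illustrates that uniqueness is checked for the rows that satisfy the filter and for no others.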
Related
In Teradata, I create table with unique primary key out of two varchar columns A and B. I will write queries that need to filter on one or both of these columns.
For best performance, should I submit a create index statement for each of the two columns (the table would have 3 indexes: the unique primary key(column A, B), non-unique column A, and non-unique column B)?
On this table, I only care about read performance and not insert/update performance.
In Teradata, if you specify a PRIMARY KEY clause when you create the table, then the table will automatically be created with a UNIQUE PRIMARY INDEX (UPI) on those PK columns. Although Teradata supports keys, it is more of an index-based DBMS.
In your case, you will have very, very fast reads (i.e. UPI access - single AMP, single row) only when you specify all of the fields in your PK. This applies to equality access as mentioned in the previous comments (thanks Dieter).
If you access the table on some but not ALL of the PK / UPI columns, then your query won't use the UPI access path. You'd need to define separate indexes or other optimization strategies, depending on your queries.
If you only care about read performance, then it makes sense to create secondary indexes on the separate columns. Just run the EXPLAIN on your query to make sure the indexes are actually being used by the Optimizer.
Another option is to ditch the PK specification altogether, especially if you never access the table on that group of columns. If there is one column you access more than the other, specify that one as your PRIMARY INDEX (non-unique) and create a secondary index on the other one. Something like:
CREATE TABLE mytable (
A INTEGER,
B INTEGER,
C VARCHAR(10)
)
PRIMARY INDEX(A) -- Non-unique primary index
;
CREATE INDEX (B) ON mytable; -- Create secondary index
You only need two indexes.
If you have a primary key on (A, B), then this also works for (A). If you want to filter on B, then you want an index on (B).
You might want to make it (B, A) so the index can handle cases such as:
where B = ? and A in (?, ?, ?)
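As a rough illustration of the (B, A) suggestion, the sketch below asks SQLite (via Python's sqlite3 module) how it would run both query shapes. This is an assumption-laden stand-in: SQLite uses B-tree indexes, whereas Teradata's hashed indexes may behave differently, so EXPLAIN on the real system remains the authority. The index name idx_b_a is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a TEXT, b TEXT, c TEXT)")
conn.execute("CREATE INDEX idx_b_a ON mytable (b, a)")  # hypothetical index name

def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

# Filter on B plus a list of A values: both index columns can be used.
both = plan("SELECT c FROM mytable WHERE b = ? AND a IN (?, ?, ?)", ("x", "p", "q", "r"))
# Filter on B alone: the leading column of the index is enough.
only_b = plan("SELECT c FROM mytable WHERE b = ?", ("x",))

print(both)
print(only_b)
```

The exact plan text varies between SQLite versions, so the sketch prints it rather than assuming a particular wording.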
I have a table in Oracle 11g with the columns below.
COL1_STATUS
COL2_ID
COL3_TYPE
COL4_DATE
I want to create a UNIQUE constraint combining all 4 columns, but only when COL1_STATUS = 10.
How can I do that? The table is already created, so I am looking for an ALTER command only.
Also, I have searched and found a similar question where a unique index is suggested, but I want to achieve this with a constraint.
Conditional unique constraint with multiple fields in oracle db
Thanks in advance.
A unique index and a constraint are essentially the same thing. A unique constraint is implemented using a unique index. So this really should do what you want:
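-- your_table is a placeholder for the real table name; columns are those from the question
create unique index idx_table_4 on
your_table(case when col1_status = 10 then col2_id end,
case when col1_status = 10 then col3_type end,
case when col1_status = 10 then col4_date end);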
In fact, the documentation describes this approach for defining a conditional unique constraint:
When you specify a unique constraint on one or more columns, Oracle
implicitly creates an index on the unique key. If you are defining
uniqueness for purposes of query performance, then Oracle recommends
that you instead create the unique index explicitly using a CREATE
UNIQUE INDEX statement. You can also use the CREATE UNIQUE INDEX
statement to create a unique function-based index that defines a
conditional unique constraint. See "Using a Function-based Index to
Define Conditional Uniqueness: Example" for more information.
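For anyone who wants to try the idea without an Oracle instance, here is a small sketch using Python's sqlite3 module, which also supports unique indexes on expressions. SQLite is an assumption here, not the asker's database, and the table name items and index name ux_items_status10 are invented. In SQLite the rows with a different status are accepted because their index key is all NULL and NULLs count as distinct in a unique index; Oracle's mechanics may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table using the column names from the question.
conn.execute(
    "CREATE TABLE items (col1_status INTEGER, col2_id INTEGER, col3_type TEXT, col4_date TEXT)"
)
# Each key expression is NULL unless col1_status = 10.
conn.execute(
    """
    CREATE UNIQUE INDEX ux_items_status10 ON items (
        CASE WHEN col1_status = 10 THEN col2_id END,
        CASE WHEN col1_status = 10 THEN col3_type END,
        CASE WHEN col1_status = 10 THEN col4_date END
    )
    """
)

def try_insert(row):
    """Return True if the row was accepted, False if uniqueness rejected it."""
    try:
        conn.execute("INSERT INTO items VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((20, 1, "A", "2020-01-01")),  # status is not 10: key is all NULL
    try_insert((20, 1, "A", "2020-01-01")),  # same values again, still accepted
    try_insert((10, 1, "A", "2020-01-01")),  # first row with status 10
    try_insert((10, 1, "A", "2020-01-01")),  # duplicate among status-10 rows
]
print(results)
```

Only the last insert is refused, which is the conditional behaviour the question asks for.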
The manual says "Indexes should not duplicate the columns of PRIMARY KEY, UNIQUE or FOREIGN key constraints as each of these constraints creates an index automatically." It is unclear to me whether that applies also to individual columns of a multi-column constraint. Say I have a unique constraint on columns (A,B) and I plan to do selects on B, do I need an index on B?
The quoted bit is from chapter 4. The answer can be found in chapter 2: "In HyperSQL 2.0, a multi-column index will speed up queries that contain joins or values on the first n columns of the index. You need NOT declare additional individual indexes on those columns unless you use queries that search only on a subset of the columns." So in my scenario I would need an extra index on B but I could obviate the need by making the uniqueness constraint be on (B,A) instead of (A,B).
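The leading-column rule is easy to check empirically. The sketch below uses Python's sqlite3 module rather than HyperSQL (an assumption made purely so the example is runnable; both use ordered indexes, but the planners are different products). The table t and index idx_t_b are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE (a, b) constraint makes SQLite create an index on (a, b) automatically.
conn.execute("CREATE TABLE t (a TEXT, b TEXT, c TEXT, UNIQUE (a, b))")

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Searching on the leading column can use the constraint's index.
on_a = plan("SELECT c FROM t WHERE a = 'x'")
# Searching only on the second column cannot seek into it.
on_b_before = plan("SELECT c FROM t WHERE b = 'y'")

# An extra index on b (or declaring the constraint as (b, a)) fixes that.
conn.execute("CREATE INDEX idx_t_b ON t (b)")
on_b_after = plan("SELECT c FROM t WHERE b = 'y'")

print(on_a)
print(on_b_before)
print(on_b_after)
```

The plan for b alone changes from a scan to an index search once idx_t_b exists; in HyperSQL the equivalent check would be done with its own EXPLAIN PLAN output.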
Why does the CREATE INDEX statement have a UNIQUE argument?
As I understand it, a non-clustered index contains a bookmark, a pointer to a row, which must be unique in order to distinguish even non-unique rows,
so doesn't that already ensure that a non-clustered index is unique?
Correct?
So, do I understand correctly that a non-unique index can exist only on a clustered table? Since
"A clustered index on a view must be unique" [1]
Since "The bottom, or leaf, level of the clustered index contains the actual data rows of the table" [1], do I understand correctly that the same effect as UNIQUE on a clustered index can be achieved by a unique constraint on (possibly all or part of) the columns of a table [2]?
Then what does the UNIQUE argument add to an index,
apart from confusing the definitions of basic concepts [3]?
Update:
This is the same pitfall again: explaining something that has already been explained many times, on the basis of undefined terms, turns every explanation into a never-ending guessing game.
Please see my subquestion [4], which is really a rewording of this same question.
Update2:
The problem lies in ambiguous or missing definitions, or in terms used improperly in the wrong contexts. If an index is defined as a structure serving to (find and) identify/point to real data, then non-unique or NULL indexes do not make any sense. Bye
Cited:
[1] CREATE INDEX (Transact-SQL), http://msdn.microsoft.com/en-us/library/ms188783.aspx
[2] CREATE TABLE (Transact-SQL), http://msdn.microsoft.com/en-us/library/ms174979.aspx
[3] Unique index or unique key?
[4] What is index and can non-clustered index be non-unique?
While a non-unique index is sufficient to distinguish between rows (as you said), the UNIQUE index serves as a constraint: it will prevent duplicates from being entered into the database - where "duplicates" are rows containing the same data in the indexed columns.
Example:
Firstname | Lastname | Login
================================
Joe | Smith | joes
Joe | Taylor | joet
Susan | Smith | susans
Let's assume that login names are by default generated from first name + first letter of last name.
What happens when we try to add Joe Sciavillo to the database? Normally, the system would happily generate loginname joes and insert (Joe,Sciavillo,joes). Now we'd have two users with the same username - probably a Bad Thing.
Now let's say we have a UNIQUE index on the Login column - the database will check that no other row with the same data already exists, before it allows inserting the new row. In other words, the attempt to insert another joes will be rejected, because that data wouldn't be unique in that column any more.
Of course, you could have unique indexes on multiple columns, in which case the combination of data would have to be unique (e.g. a unique index on Firstname,Lastname will happily accept a row with (Joe,Badzhanov), as the combination is not in the table yet, but would reject a second row with (Joe,Smith))
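The example can be replayed with a few lines of Python and the built-in sqlite3 module. This is just a sketch: SQLite stands in for whatever DBMS you use, the index names are invented, and the logins joeb and joes2 are made-up values added so that each insert trips at most one index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (firstname TEXT, lastname TEXT, login TEXT)")
# Single-column unique index on the login, and a composite one on the name pair.
conn.execute("CREATE UNIQUE INDEX ux_users_login ON users (login)")
conn.execute("CREATE UNIQUE INDEX ux_users_name ON users (firstname, lastname)")

conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("Joe", "Smith", "joes"), ("Joe", "Taylor", "joet"), ("Susan", "Smith", "susans")],
)

def try_insert(first, last, login):
    """Attempt an insert and report whether a unique index rejected it."""
    try:
        conn.execute("INSERT INTO users VALUES (?, ?, ?)", (first, last, login))
        return "inserted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

r1 = try_insert("Joe", "Sciavillo", "joes")   # login joes already exists
r2 = try_insert("Joe", "Badzhanov", "joeb")   # new name pair and new login
r3 = try_insert("Joe", "Smith", "joes2")      # name pair (Joe, Smith) already exists

print(r1)
print(r2)
print(r3)
```

The first and third inserts are refused by the login index and the composite name index respectively, while the second goes through.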
The UNIQUE index clause is really just a quirk of syntax in SQL Server and some other DBMSs. In Standard SQL, uniqueness constraints are implemented through the use of the PRIMARY KEY and UNIQUE CONSTRAINT syntax, not through indexes (there are no indexes in standard SQL).
The mechanism SQL Server uses internally to implement uniqueness constraints is called a unique index. A unique index gets created automatically for you whenever you create a PRIMARY KEY or UNIQUE constraint. For reasons best known to the SQL Server development team they decided to expose the UNIQUE keyword as part of the CREATE INDEX syntax, even though the constraint syntax does the same job.
In the interests of clarity and standards support I would recommend you avoid creating UNIQUE indexes explicitly wherever possible. Use the PRIMARY KEY or UNIQUE constraint syntax instead.
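The "constraint syntax creates a unique index for you" behaviour is not specific to SQL Server. As a hedged illustration, the sketch below uses Python's sqlite3 module (an assumed stand-in, with a hypothetical users table and constraint name) to declare uniqueness with constraint syntax only and then list the indexes the engine created on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Uniqueness is declared with constraint syntax only; no CREATE INDEX is issued.
conn.execute(
    "CREATE TABLE users (id INTEGER, login TEXT, CONSTRAINT uq_login UNIQUE (login))"
)

# PRAGMA index_list reports one row per index; columns 1 and 2 are name and unique flag.
indexes = [(row[1], row[2]) for row in conn.execute("PRAGMA index_list('users')")]
print(indexes)
```

SQLite reports an automatically named index with its unique flag set, even though only the constraint was written; in SQL Server the corresponding check would go through its own catalog views.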
The UNIQUE clause specifies that the values in the column(s) must be unique across the table, essentially adding a unique constraint. A clustered index on a table specifies that the ordering of the rows in the table will be the same as the index. A non-clustered index does not change the physical ordering, which is why it is OK to have multiple non-clustered but only one clustered index. You can have unique or non-unique clustered and non-clustered indexes on a table.
I think the underlying question is: what is the difference between unique and non-unique indexes?
The answer is that entries in unique indexes can each only point to a single row, while entries in non-unique indexes can point to many rows.
For example, consider an order item table:
ORDER_NO INTEGER
LINE_NO INTEGER
PRODUCT_NO INTEGER
QUANTITY DECIMAL
- with a unique index on ORDER_NO and LINE_NO, and a non-unique index on PRODUCT_NO.
For a single combination of ORDER_NO and LINE_NO there can only be one entry in the table, while for a single value of PRODUCT_NO there can be many entries in the table (because there will be many entries for that value in the index).
I was creating a new table today in 10g when I noticed an interesting behavior. Here is an example of what I did:
CREATE TABLE test_table ( field_1 INTEGER PRIMARY KEY );
Oracle will, by default, create a non-null unique index for the primary key. I double checked this. After a quick check, I find a unique index named SYS_C0065645. Everything is working as expected so far. Now I did this:
CREATE TABLE test_table ( field_1 INTEGER,
CONSTRAINT pk_test_table PRIMARY KEY (field_1) USING INDEX (CREATE INDEX idx_test_table_00 ON test_table (field_1)));
After describing my newly created index idx_test_table_00, I see that it is non-unique. I tried to insert duplicate data into the table and was stopped by the primary key constraint, proving that the functionality has not been affected. It seems strange to me that Oracle would allow a non-unique index to be used for a primary key constraint. Why is this allowed?
There is actually no structural difference between a unique index and a non-unique index, so Oracle can use either for the PK constraint. One advantage of allowing a PK definition like this is that you can disable or defer the constraint for data loading - this isn't possible with a unique index, so one could argue that this implementation is more flexible.
Why not allow it? I love that Oracle gives you lots of options and flexibility.
Maybe you can create one index and use it for two purposes:
validate the PK
help a query perform better
"Oracle will by default create a non-null unique index"
Oh, and the index has nothing to do with the not null aspect.
See this excellent article about non-unique indexes policing primary keys by Richard Foote. Richard shows that you will take a performance hit when using a non-unique index.
In other words: don't use non-unique indexes to police a primary key constraint unless you really need the constraint to be deferrable.