Non-clustered index in sql server - sql-server-2005

Can I specify more than one column in a non-clustered index, and what will it affect?

Like other people have said, this will have an overhead in terms of maintaining the table, it's just a case of whether the benefits outweigh the cost.
The only reasons I can think of for adding a multiple-column non-clustered index are to speed up queries where you regularly search on a combination of fields, or to make a combination of fields unique in the table. If those are your aims then generally I'd say go for it.
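As a rough illustration of the uniqueness case, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (SQLite has no clustered/non-clustered distinction, and the table and index names are made up for the example). It creates one unique index over two columns and shows that a repeated pair is rejected while a repeated single value is fine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_no INTEGER)"
)
# One index over two columns; UNIQUE applies to the combination, not to each column.
conn.execute(
    "CREATE UNIQUE INDEX ix_orders_customer_orderno ON orders (customer_id, order_no)"
)

conn.execute("INSERT INTO orders (customer_id, order_no) VALUES (1, 100)")
# Same customer_id, different order_no: allowed.
conn.execute("INSERT INTO orders (customer_id, order_no) VALUES (1, 101)")

try:
    # Same (customer_id, order_no) pair as the first row: violates the index.
    conn.execute("INSERT INTO orders (customer_id, order_no) VALUES (1, 100)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(duplicate_rejected, row_count)
```

The syntax for the equivalent SQL Server index differs slightly, so treat this only as a demonstration of the behaviour.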

Assuming you are asking about how to define multiple indexes, you can define as many non-clustered indexes as you want. It's the clustered index that you can only have one of, since it determines how rows are clumped together as records on disk.
The more indexes you create, the longer it will take to perform insert, update and delete operations.

Yes, you can have multiple columns in an index, be it clustered or non-clustered. It is very common. Its effect depends entirely on the data in your table.

Related

Is unique composite index as effective as non-composite for queries on first column?

I have a table with b-tree index on column A (non-unique). Now I want to add a check for uniqueness of column A and column B combination when inserting, so I want to add a unique composite index (A, B).
Should I drop existing non-composite index? (queries in most cases use single index, as I have read)?
Will unique composite index be as effective as non-unique non-composite one for queries only on column A?
If you have a lot of queries going for column A only in a where clause, then most likely you should keep the index on column A in addition to the new one.
The number of queries that would use the index and the difference in query cost are the 2 most important criteria for deciding whether or not to keep the index. Because it depends on many factors, like the amount of data in the table and the query parameters, you can, as Frank Heikens's comment says, use EXPLAIN ANALYZE to check important queries with and without the index to confirm your hypothesis.
There is a very small probability it would make sense to keep both indexes. If the unique index is almost never exercised (because you never do inserts or non-HOT updates, or queries that benefit from both columns) and you have precisely the right amount of memory and memory usage patterns, then it is possible for the single-column index to be small enough to stay in cache while the composite would not be.
But most likely what would happen is that the composite index would be used at least enough of the time that both indexes would be fighting with each other for cache space, making it overall less effective.
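One way to check whether a query on column A alone can use the composite index is to look at the plan. The sketch below is only an approximation of that workflow: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN rather than PostgreSQL's EXPLAIN ANALYZE, and the table and index names are invented. The exact plan text varies between SQLite versions, so the code only looks for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, payload TEXT)")
# Only the composite unique index exists; there is no separate index on a.
conn.execute("CREATE UNIQUE INDEX t_a_b_uq ON t (a, b)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last field of each row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


plan_a_only = plan("SELECT payload FROM t WHERE a = 1")
print(plan_a_only)
uses_composite = "t_a_b_uq" in plan_a_only
```

Whether the composite index is as cheap as the single-column one on your data is a separate question that only measuring on the real database can answer.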

When I use something like unique key(element1,element2), how does it work internally?

If I say index(element1), index(element2), does it use much less space than unique key(element1,element2)?
I know what they do is different. My understanding is that unique key(element1,element2) ensures that there are no two rows where both of those values are the same. Is this correct? Does it still index both keys individually?
But is this expensive in terms of disk space and checking to create such an index?
Maybe it's better not to have it if it's not critical that there are no duplicates?
An INDEX(a,b) uses less space than two indexes INDEX(a) and INDEX(b), because each index consists of (a part of) that column and the primary key. But read the note below about the functional difference between these indices.
Correct. A UNIQUE KEY ensures that no 2 rows have the same values for the columns in that key.
A UNIQUE INDEX is also an INDEX and can be used for searching. A special example of a UNIQUE KEY is the PRIMARY KEY.
Indexes do take up space on the disk, depending on your Storage Engine. If your application is write-heavy (like a logging table), sometimes it might be better to not have an index. Most tables are probably read-heavy though.
From a logical point of view, if it's not critical that there are no duplicates, don't add the unique index.
Edit: elaboration on pst's comment:
If you have INDEX(A), INDEX(B) and INDEX(A,B), your INDEX(A) is redundant. Drop it.
But INDEX(A,B) does not cover queries that only search on B, you need an INDEX(B) for that.
You can argue that INDEX(A) and INDEX(B) together can use MySQL's INDEX MERGE to stand in for INDEX(A,B). This leaves you the choice between two solutions:
Solution 1: INDEX(A,B) and INDEX(B)
Solution 2: INDEX(A) and INDEX(B)
Solution 2 will take less disk space, that is true. But read this Very Nice MySQLPerformanceBlog article about INDEX MERGE, which comes to this conclusion:
As a summary: Use multi column indexes is typically best idea if you use AND between such columns in where clause. Index merge does helps performance but it is far from performance of combined index in this case. In case you're using OR between columns - single column indexes are required for index merge to work and combined indexes can't be used for such queries.
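The point that INDEX(A,B) does not help a search on B alone can be checked from the query plan. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: the leftmost-prefix behaviour shown here is common to B-tree indexes, but plan output and optimizer tricks differ between engines). Table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, payload TEXT)")
conn.execute("CREATE INDEX ix_a_b ON t (a, b)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query_on_b = "SELECT payload FROM t WHERE b = 2"

# With only (a, b) available, b is not a leftmost prefix, so no index seek is possible.
before = plan(query_on_b)

# Adding a dedicated index on b gives the planner something to seek on.
conn.execute("CREATE INDEX ix_b ON t (b)")
after = plan(query_on_b)

print(before)
print(after)
```

On MySQL itself the same check would be done with EXPLAIN, looking at the key column of the output.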

mysql too many indexes?

I am spending some time optimizing our current database.
I am looking at indexes specifically.
There are a few questions:
Is there such a thing as too many indexes?
What will indexes speed up?
What will indexes slow down?
When is it a good idea to add an index?
When is it a bad idea to add an index?
Pros and cons of multiple indexes vs multi-column indexes?
What will indexes speed up?
Data retrieval -- SELECT statements.
What will indexes slow down?
Data manipulation -- INSERT, UPDATE, DELETE statements.
When is it a good idea to add an index?
When you need better data retrieval performance.
When is it a bad idea to add an index?
On tables that will see heavy data manipulation -- insertion, updating...
Pros and cons of multiple indexes vs multi-column indexes?
Queries need to respect the order of columns when dealing with a composite index (an index on more than one column), from left to right in the index definition. The column order in the statement doesn't matter, only the order of columns 1, 2 and 3 in the index: a statement needs to reference column 1 before the index can be used. If there's only a reference to column 2 or 3, the composite index on 1/2/3 cannot be used for the lookup.
In MySQL, generally only one index is used per table in each SELECT (subqueries and the like are treated as separate statements); the index-merge optimization is the exception. There's also a limit to the amount of space per table that MySQL allows. Additionally, running a function on an indexed column renders the index useless, for example:
WHERE DATE(datetime_column) = ...
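A common workaround is to rewrite the condition as a range on the bare column. The sketch below illustrates the difference using Python's built-in sqlite3 module in place of MySQL (an assumption made so the example is runnable; the events table, its columns and the index name are invented). It compares the plan for the function-wrapped predicate with the plan for the equivalent half-open range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
conn.execute("CREATE INDEX ix_events_created_at ON events (created_at)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# The function hides the column from the index, so every row must be examined.
wrapped = plan("SELECT payload FROM events WHERE date(created_at) = '2024-01-15'")

# The same day expressed as a half-open range on the raw column can use the index.
ranged = plan(
    "SELECT payload FROM events "
    "WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'"
)

print(wrapped)
print(ranged)
```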
I disagree with some of the answers on this question.
Is there such a thing as too many indexes?
Of course. Don't create indexes that aren't used by any of your queries. Don't create redundant indexes. Use tools like pt-duplicate-key-checker and pt-index-usage to help you discover the indexes you don't need.
What will indexes speed up?
Search conditions in the WHERE clause.
Join conditions.
Some cases of ORDER BY.
Some cases of GROUP BY.
UNIQUE constraints.
FOREIGN KEY constraints.
FULLTEXT search.
Other answers have advised that INSERT/UPDATE/DELETE are slower the more indexes you have. That's true, but consider that many uses of UPDATE and DELETE also have WHERE clauses, and in MySQL, UPDATE and DELETE support JOINs too. The benefit of indexes to these queries may more than make up for the overhead of updating the indexes.
Also, InnoDB locks rows affected by an UPDATE or DELETE. They call this row-level locking, but it's really index-level locking. If there's no index to narrow down the search, InnoDB has to lock a lot more rows than the specific row you're changing. It can even lock all the rows in the table. These locks block changes made by other clients, even if they don't logically conflict.
When is it a good idea to add an index?
If you know you need to run a query that would benefit from an index in one of the above cases.
When is it a bad idea to add an index?
If the index is a left-prefix of another existing index, or the index doesn't help any of the queries you need to run.
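The left-prefix rule is easy to check mechanically. The following is a small sketch of the idea only, not the algorithm used by pt-duplicate-key-checker; the function name and the index definitions are hypothetical, and it deliberately ignores uniqueness (a UNIQUE index that is a left-prefix of another still enforces a constraint, so it is not redundant).

```python
def redundant_prefix_indexes(indexes):
    """Return names of indexes whose column list is a left-prefix of another index.

    `indexes` maps an index name to its ordered tuple of column names.
    """
    redundant = []
    for name, cols in indexes.items():
        for other_name, other_cols in indexes.items():
            if name == other_name:
                continue
            is_prefix = (
                len(cols) <= len(other_cols)
                and tuple(other_cols[: len(cols)]) == tuple(cols)
            )
            # For exact duplicates, flag only the later name so one copy survives.
            if is_prefix and (len(cols) < len(other_cols) or name > other_name):
                redundant.append(name)
                break
    return sorted(redundant)


indexes = {
    "ix_a": ("a",),
    "ix_a_b": ("a", "b"),
    "ix_b": ("b",),
    "ix_b_dup": ("b",),
}
result = redundant_prefix_indexes(indexes)
print(result)
```

Here ix_a is flagged because ix_a_b starts with the same column, and ix_b_dup is flagged as an exact duplicate of ix_b.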
Pros and cons of multiple indexes vs multi-column indexes?
In some cases, MySQL can perform index-merge optimization, and either union or intersect the results from independent index searches. But it gives better performance to define a single index so the index-merge doesn't need to be done.
For one of my consulting customers, I defined a multi-column index on a many-to-many table where there was no index, and improved their join query by a factor of 94 million!
Designing the right indexes is a complex process, based on the queries you need to optimize. You shouldn't make broad rules like "index everything" or "index nothing to avoid slowing down updates."
See also my presentation How to Design Indexes, Really.
Is there such a thing as too many indexes?
Indexes should be informed by the problem at hand: the tables, the queries your application will run, etc.
What will indexes speed up?
SELECTs.
What will indexes slow down?
INSERTs will be slower, because you have to update the index.
When is it a good idea to add an index?
When your application needs another WHERE clause.
When is it a bad idea to add an index?
When you don't need it to query or enforce uniqueness constraints.
Pros and Cons of multiple indexes vs multi-column indexes?
I don't understand the question. If you have a uniqueness constraint that includes multiple columns, by all means model it as such.
Is there such a thing as too many indexes?
Yes. Don't go out looking to create indexes, create them as necessary.
What will indexes speed up?
Any queries against the indexed table/view.
What will indexes slow down?
Any INSERT statements against the indexed table will be slowed down, because each new record will need to be indexed.
When is it a good idea to add an index?
When a query is not running at an acceptable speed. You may be filtering on columns that are not part of the clustered PK, in which case you should add indexes based on the filters you are searching on (if the performance gain justifies it).
When is it a bad idea to add an index?
When you do it for the sake of it, i.e. over-optimization.
Pros and cons of multiple indexes vs multi-column indexes?
Depends on the queries you are trying to improve.
Is there such a thing as too many indexes?
Yup, like all things, too many indexes will slow down data manipulation.
When is it a good idea to add an index?
It's a good idea to add an index when your queries are too slow (e.g. they involve many joins). You should use this optimization only after you have built a solid model, to tweak the performance.

multiple cluster indices effect

My question is about the limitation of the clustered index on a table.
In theory, a single table can have only one clustered index. But what if I have datetime columns in a table, say "From date" and "To date"? These columns will often be required in WHERE clauses to populate reports in my application. And if I also require a clustered index on the primary key in the same table, how do I get the advantage of a clustered index on the other columns? In this case my queries will still run slower as the number of records grows.
In practice, too, you can have only a single clustered index on a table, since the table's data is physically ordered by that clustered index.
If you need two datetime columns frequently in WHERE clauses, the best choice would be to have a non-clustered index on those two columns, and possibly include additional columns that you frequently retrieve with those queries, in order to make it a covering index.
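To make the covering-index idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. Two assumptions to keep in mind: SQLite has no INCLUDE clause, so the extra column is simply appended to the index key, and the bookings table with its columns is invented for the example. A query is "covered" when every column it touches is in the index, so the base table is never read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings ("
    "id INTEGER PRIMARY KEY, from_date TEXT, to_date TEXT, amount REAL, notes TEXT)"
)
# The two date columns drive the search; amount rides along so reports need no table lookup.
conn.execute("CREATE INDEX ix_bookings_dates ON bookings (from_date, to_date, amount)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


date_filter = "WHERE from_date >= '2024-01-01' AND to_date <= '2024-01-31'"

# Every referenced column is in the index, so the index alone answers the query.
covered = plan("SELECT from_date, to_date, amount FROM bookings " + date_filter)

# notes is not in the index, so the table rows have to be fetched as well.
not_covered = plan("SELECT notes FROM bookings " + date_filter)

print(covered)
print(not_covered)
```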
There's really not much difference between a good, covering non-clustered index, and the clustered index, in terms of query performance.
However, you don't want to bloat your clustered index, since those columns will also be added to all non-clustered indices on the same table - keep it small, preferably an INT, ever-increasing, stable (not changing) and you should be just fine.
Another option is indexed (or materialized) views: you could create multiple views on the table, each with a different clustered index. That might be useful in a reporting scenario, but indexed views have lots of restrictions and will affect the performance of queries that modify the table data. Books Online has all the information you need to create and test them.
I suspect your real requirement is indeed to implement a reporting solution, and if so then it might be best to do it properly: create a separate database with a schema optimized for reporting (Google "star schema") and load data regularly from the main database into the reporting one. But that's a whole new area of development to investigate, and I wouldn't rush into it.
If you need clustered-index performance for multiple indexes on the same table, the only route I see is holding a copy of the table for each clustered index.
The clustered index affects the physical storage of the data in a table, so by definition there can only be one. You can widen your clustered index to include the other columns, but this can have its own disadvantages.
The performance advantage from a clustered index is that the records are stored in a manner that reflects the index (which is why random inserts and updates to the clustered index very quickly fragment the table), and therefore query performance based on this index can be as good as the read performance of your storage device. You can't get this with other indexes.
I suggest that you choose the index from which you would derive the most benefit from clustering and make that your clustered index. Make the rest of the indexes non-clustered. You may want to run some tests to find out what benefits will be derived from making different indexes clustered vs. non-clustered.

How to know when to use indexes and which type?

I've searched a bit and didn't see any similar question, so here goes.
How do you know when to put an index in a table? How do you decide which columns to include in the index? When should a clustered index be used?
Can an index ever slow down the performance of select statements? How many indexes is too many and how big of a table do you need for it to benefit from an index?
EDIT:
What about column data types? Is it ok to have an index on a varchar or datetime?
Well, the first question is easy:
When should a clustered index be used?
Always. Period. Except for a very few, rare, edge cases. A clustered index makes a table faster, for every operation. YES! It does. See Kim Tripp's excellent The Clustered Index Debate continues for background info. She also mentions her main criteria for a clustered index:
narrow
static (never changes)
unique
if ever possible: ever increasing
INT IDENTITY fulfills this perfectly - GUIDs do not. See GUIDs as Primary Key for extensive background info.
Why narrow? Because the clustering key is added to each and every index page of each and every non-clustered index on the same table (in order to be able to actually look up the data row, if needed). You don't want to have VARCHAR(200) in your clustering key....
Why unique?? See above - the clustering key is the item and mechanism that SQL Server uses to uniquely find a data row. It has to be unique. If you pick a non-unique clustering key, SQL Server itself will add a 4-byte uniqueifier to your keys. Be careful of that!
Next: non-clustered indices. Basically there's one rule: any foreign key in a child table referencing another table should be indexed, it'll speed up JOINs and other operations.
Furthermore, any queries that have WHERE clauses are good candidates - pick those first which are executed a lot. Put indices on columns that show up in WHERE clauses and in ORDER BY clauses.
Next: measure your system, check the DMVs (dynamic management views) for hints about unused or missing indices, and tweak your system over and over again. It's an ongoing process, you'll never be done! See here for info on those two DMVs (missing and unused indices).
Another word of warning: with a truckload of indices, you can make any SELECT query go really really fast. But at the same time, INSERTs, UPDATEs and DELETEs which have to update all the indices involved might suffer. If you only ever SELECT - go nuts! Otherwise, it's a fine and delicate balancing act. You can always tweak a single query beyond belief - but the rest of your system might suffer in doing so. Don't over-index your database! Put a few good indices in place, check and observe how the system behaves, and then maybe add another one or two, and again: observe how the total system performance is affected by that.
The rule of thumb is to index the primary key (implied, and defaults to clustered) and each foreign key column.
There is more, but you could do worse than using SQL Server's missing index DMVs.
An index may slow down a SELECT if the optimiser makes a bad choice, and it is possible to have too many. Too many will slow writes, and it's also possible to end up with overlapping indexes.
Answering the ones I can: I would say that every table, no matter how small, will always benefit from at least one index, as there has to be at least one way in which you are interested in looking up the data; otherwise why store it?
A general rule for adding indexes would be if you need to find data in the table using a particular field, or set of fields. This leads on to how many indexes are too many. Generally, the more indexes you have, the slower inserts and updates will be, as they also have to modify the indexes, but it all depends on how you use your data. If you need fast inserts then don't use too many. In reporting "read only" type data stores you can have a number of them to make all your lookups faster.
Unfortunately there is no one rule to guide you on the number or type of indexes to use, although the query optimiser of your chosen DB can give hints based on the queries you are executing.
As to clustered indexes, they are the ace card you only get to use once, so choose carefully. It's worth calculating the selectivity of the field you are thinking of putting it on, as it would be wasted on something like a boolean field (contrived example), where the selectivity of the data is very low.
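Selectivity here can be taken as the number of distinct values divided by the number of rows, so 1.0 means every value is unique and values near 0 mean heavy repetition. The sketch below computes it with Python's built-in sqlite3 module; the products table, its columns and the sample data are made up for illustration, and the same COUNT(DISTINCT ...) query should work on most SQL databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, sku TEXT, in_stock INTEGER)")
# 1000 rows: sku is unique per row, in_stock is a boolean-style flag (0 or 1).
conn.executemany(
    "INSERT INTO products (sku, in_stock) VALUES (?, ?)",
    [(f"SKU-{i}", i % 2) for i in range(1000)],
)


def selectivity(column):
    """Distinct values divided by total rows for one column of products."""
    # The column name is interpolated directly, so pass only trusted, hard-coded names.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM products"
    ).fetchone()
    return distinct / total


sku_selectivity = selectivity("sku")
flag_selectivity = selectivity("in_stock")
print(sku_selectivity, flag_selectivity)
```

With this data the sku column scores 1.0 and the flag scores 2/1000, which is the kind of gap the advice above is pointing at.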
This is really a very involved question, though a good starting place would be to index any column that you will filter results on. For example, if you often break products into groups by sale price, index the sale_price column of the products table to improve lookup times for that query.
If you are querying based on the value in a column, you probably want to index that column.
i.e.
SELECT a,b,c FROM MyTable WHERE x = 1
You would want an index on x.
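The effect of that index can be seen by comparing the query plan before and after creating it. This is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server (an assumption made for runnability; the index name is invented, and plan wording differs between engines and versions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER PRIMARY KEY, a, b, c, x INTEGER)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT a, b, c FROM MyTable WHERE x = 1"

# No index on x yet: the only option is to read the whole table.
before = plan(query)

conn.execute("CREATE INDEX ix_mytable_x ON MyTable (x)")

# With the index in place the planner can seek directly to rows where x = 1.
after = plan(query)

print(before)
print(after)
```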
Generally, I add indexes for columns which are frequently queried, and I add compound indexes when I'm querying on more than one column.
Indexes won't hurt the performance of a SELECT, but they may slow down INSERTs (or UPDATEs) if you have too many indexed columns per table.
As a rule of thumb - start off by adding indexes when you find yourself saying WHERE a = 123 (in this case, an index for "a").
You should use an index on columns that you use for selection and ordering - i.e. the WHERE and ORDER BY clauses.
Indexes can slow down select statements if there are many of them and you are using WHERE and ORDER BY on columns that have not been indexed.
As for the size of the table - from several thousand rows upwards you would start seeing real benefits from index usage.
Having said that, there are automated tools to do this, and SQL Server has a Database Tuning Advisor that will help with this.