Do unique indexes slow down queries? - SQL

Does it slow down query time to use a lot of unique indexes? I don't have that many; I'm just curious, I think I have heard this somewhere.
id (primary auto_increment)
username (unique)
password
salt (unique)
email (unique)
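For reference, that schema boils down to something like this (a rough sketch - MySQL syntax is assumed from auto_increment, and the column types and lengths are invented, not from the question):

CREATE TABLE users (
    id       INT AUTO_INCREMENT PRIMARY KEY,      -- clustered by default in InnoDB
    username VARCHAR(64)  NOT NULL UNIQUE,        -- unique index, checked on every write
    password VARCHAR(255) NOT NULL,               -- no index
    salt     VARCHAR(64)  NOT NULL UNIQUE,        -- unique index
    email    VARCHAR(255) NOT NULL UNIQUE         -- unique index
);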

It depends on the database server software and table/index type you are using.
Inserts will be slower any time you have indexes of any sort, but not necessarily by a lot - this will depend on the table size as well.
In general, unique indexes should speed up any SELECT queries that are able to take advantage of the index(es).

They will slow down your inserts and updates (since each has to be checked to see if the constraint is violated), but should not slow down selects. In fact, they could speed them up, since there are more choices for the optimizer to use to find your data.

In general, having more indexes will slow down inserts, updates and deletes, but can speed up queries - assuming that you query based on these fields.
If you need a unique index to guarantee consistency in your application then you should usually add it even if it will result in a slight performance hit. It's better to be correct than fast but wrong.
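To make the insert-time cost concrete, here is the uniqueness check in action against the hypothetical users table sketched above (all values are placeholders):

INSERT INTO users (username, password, salt, email)
VALUES ('alice', 'hash1', 'salt1', 'alice@example.com');   -- succeeds

INSERT INTO users (username, password, salt, email)
VALUES ('alice', 'hash2', 'salt2', 'other@example.com');   -- fails with a duplicate-key
                                                           -- error: 'alice' is already
                                                           -- in the username index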

Each time you add a unique key to a database table, it adds a non-clustered index. Clustered indexes are (by default, in SQL Server) created on primary keys and physically sort the table's data by that column (or columns). Each non-clustered index creates a set of leaf pages external to the table, sorted by the unique key. This allows for a much faster, more consistent search on the unique key because a full table scan is no longer required.
The downside is that every time you insert, update, or delete a row in the table, the server must go back and update all of those external leaf pages. This is usually not an issue if you have a small number of unique keys, but when you have many it can slow down response times. For more info, read the Wikipedia article on database indexes; MSDN also has a good article.

Query time, no. They will slow down INSERT and DELETE time though, since each index value has to be calculated and then inserted, or removed.

Related

Do I need to use this many indexes in my SQL Server 2008 database?

I'd appreciate some advice from SQL Server gurus here. Let me explain...
I have an SQL Server 2008 database table that has 21 columns. Here's a quick rundown of their types:
INT Primary Key
Several other INTs that are already indexed (used to reference this and other tables)
Several NVARCHAR(64) to hold user-provided text
Several NVARCHAR(256) to hold longer user-provided text
Several DATETIME2
One BIGINT
Several UNIQUEIDENTIFIER, one is already an index
The way this table is used is that it is presented to a user as a sortable table, and the user can choose which column to sort it by. This table may contain many thousands of records (currently about 21,000, and it will keep growing).
So my question is: do I need to add an index on each column to enable faster sorting?
PS. Forgot to say. The output obviously supports pagination, so the user sees no more than 100 rows at once.
Contrary to popular belief, just having an index on a column does not guarantee that any queries will be any faster!
If you constantly use SELECT * .. from that table, these non-clustered indices on a single column will most likely not be used at all.
A good nonclustered index is a covering index, which means it contains all the columns necessary to satisfy one or more given queries. If you have that situation, then a nonclustered index can make sense - otherwise, more often than not, the nonclustered index will be ignored by the query optimizer. The reason: if you need all the columns anyway, the query would have to do a key lookup from the nonclustered index into the actual data (the clustered index) for each row found, and the key lookup is a very expensive operation. Doing it for a lot of hits becomes overly costly, so the query optimizer will rather quickly switch to an index scan (possibly the clustered index scan) to fetch the data.
Don't over-index - use a well-designed clustered index, put indices on the foreign key columns to speed up joins - and then let it be for the time being. Observe your system, measure performance, maybe add an index here or there - but don't just overload the system with tons of indices!
Having too many indices can be worse than having none - every index must be maintained, i.e. updated for each INSERT, UPDATE and DELETE statement - and that takes time!
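As an illustration of the covering-index idea mentioned above, here is a sketch in SQL Server syntax; the Orders table and its columns are made up for the example:

-- Covers queries like:
--   SELECT CustomerName, OrderDate FROM Orders WHERE CustomerId = 42
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON Orders (CustomerId)
INCLUDE (CustomerName, OrderDate);   -- included columns satisfy the SELECT list,
                                     -- so no key lookup into the clustered index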
this table is ... presented to a user as a sortable table ... [that] may contain many thousands of records
If you're ordering many thousands of records for display, you're doing it wrong. Typical users can reasonably process at most around 500 typical records. Exceptional users can handle a couple thousand. Any more than that, and you're misleading your users into a false sense that they've seen a representative sample. This results in poor decision making and inefficient user workflow. Instead, you need to focus on a good search algorithm.
Another thing to keep in mind here is that more indexes mean slower inserts and updates. It's a balancing act. SQL Server keeps statistics on the queries and sorts it actually performs, and makes those statistics available to you. There are queries you can run that tell you exactly which indexes SQL Server thinks it could use. I would deploy without any sorting indexes and let it run for a week or two that way. Then look at the data, see what users actually sort on, and index just those columns.
Take a look at this link for an example and introduction on finding missing indexes:
http://sqlserverpedia.com/wiki/Find_Missing_Indexes
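If that link goes stale: the underlying idea is to query SQL Server's missing-index DMVs. A simplified sketch follows (real scripts usually rank the suggestions by the cost columns in the stats view):

SELECT d.[statement] AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle;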
Generally, indexes are used to accelerate WHERE conditions (and in some cases JOINs), so I don't think creating an index on any column other than the PRIMARY KEY will accelerate sorting. You can do your sorting on the client (if you use WinForms or WPF) or in the database for web scenarios.
Good Luck

Can having an index on a VARCHAR database field increase insertion speed?

I have two tables in my database: page and link. In each one I define the URL field as UNIQUE because I don't want repeated URLs.
Being a UNIQUE field, does it automatically have an index? Can creating an index for this field speed up insertions? What is the most appropriate index type for a VARCHAR field?
Can having a lot of rows slow down inserts because of this UNIQUE field? At the moment, I have 1,200,000 rows.
Yes, adding a UNIQUE constraint will create an index:
Adding a unique constraint will automatically create a unique btree index on the column or group of columns used in the constraint.
This won't speed up your INSERTs, though - it will actually slow them down:
Every insert will have to be checked (using the index) to ensure that uniqueness is maintained.
Inserts will also update the index and this doesn't come for free.
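In PostgreSQL terms, either of the following gives you that unique btree index on the url column (page is the table from the question; the constraint and index names are arbitrary):

ALTER TABLE page ADD CONSTRAINT page_url_key UNIQUE (url);

-- or, equivalently for lookup purposes, an explicit unique index:
CREATE UNIQUE INDEX page_url_idx ON page (url);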
Logically speaking, a constraint is one thing, and an index is another. Constraints have to do with data integrity; indexes have to do with speed.
Practically speaking, most DBMSs implement a unique constraint by building a unique index. A unique index lets the DBMS determine more quickly whether the values you're trying to insert are already in the table.
I suppose an index on a VARCHAR column might speed up an insert under certain circumstances. But generally an index slows inserts, because the DBMS has to
check all the constraints, then
insert the data, and finally
update the index.
A suitable index will speed up updates, because the DBMS can find the rows to be updated more quickly. (But it might have to update the index, too, which costs you a little bit.)
PostgreSQL can tell you which indexes it's using. See EXPLAIN.
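For example, against the hypothetical page table from the question:

EXPLAIN ANALYZE
SELECT * FROM page WHERE url = 'http://example.com/';

-- The plan shows either an index scan (the unique index was used):
--   Index Scan using page_url_key on page ...
-- or a sequential scan (it was not):
--   Seq Scan on page ...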
The B-tree/B+-tree index is the most common index type, and inserts and updates are generally slower with these indexes, whereas selecting a single row, selecting a range, and ORDER BY (ascending, in most cases) are very quick. This is because the index is ordered, so an insertion has to find out where to insert rather than just appending at the end of the table. In the case of a clustered index, inserts/updates are even worse because of page splits.
Being unique would probably make it a bit slower still, since the index has to be checked to make sure the new value is unique.
Also, VARCHAR is generally not the best choice for indexes if you are looking for optimal performance; an integer is much faster if it can be used. So there really is no 'best' index for VARCHAR; each index type has its own strengths and weaknesses, and there's always a tradeoff. It really depends on the situation and what you plan to do with it: do you only need inserts/updates, or do you also need to make selections? These are the things you need to ask.

Does having an identity column in a SQL table increase performance for queries on that table?

I have created a table without any indexes or identity columns. I don't have any particular need for an identity column. If I add one, will it speed up the queries (SELECT, UPDATE, DELETE) used with that table?
IDENTITY has nothing to do with the indexes or performance, it just allows you not to worry about the values of the surrogate keys.
It's a part of the table's metadata and not inferred from the actual values.
If you added an identity field as a PK, you might see an increase in performance, because it would create a unique index automatically. However, the increase in performance comes from the indexing, not from the fact that it is an identity. You could similarly increase performance by adding an index on whatever your current natural key is. If you have no natural key, you may need to rethink the table design. All tables should have a way to uniquely identify a row.
Now indexes may not do much for performance if you do not have many rows in the table.
Why are you not using indexes?
If you are trying to speed up your queries, the first thing you should do is add appropriate indexes. A primary key will have an index added automatically, but you'll have to create the rest of the indexes by hand.
It depends on your platform on how these are created. A quick search in Google should teach you how to add these indexes to increase your performance.
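As a generic sketch (the table and column names here are placeholders; this basic form works in SQL Server, MySQL and PostgreSQL alike):

CREATE INDEX ix_orders_customer_id ON orders (customer_id);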
If you have many rows (100K+) you are going to need indexes for performance gains; if you have few rows, you will not feel the performance gain.
Adding an identity column, as said before, will just be more data in the table and will not affect performance by itself, but you do still need indexes if you have many rows.

Should searchable date fields in a database table always be indexed?

If I have a field in a table of some date type, and I know that I will always be searching it using comparisons like BETWEEN, > or < and never =, could there be a good reason not to add an index for it?
The only reason not to add an index on a field you are going to search on is that the cost of maintaining the index outweighs its benefits.
This may happen if:
You have really heavy DML (inserts/updates/deletes) on your table,
The existence of the index makes it intolerably slow, and
It's more important to have fast DML than fast queries.
If it's not the case, then just create the index. The optimizer just won't use it if it thinks it's not needed.
There are far more bad reasons than good ones for skipping an index.
However, an index on the search column may not be enough if the index is nonclustered and non-covering. Queries like this are often good candidates for clustered indexes, however a covering index is just as good.
This is a great example of why this is as much art as science. Some considerations:
How often is data added to this table? If there is far more reading/searching than adding/changing (the whole point of some tables is to be a dump of data for reporting), then you want to go crazy with indexes. Your clustered index might be needed more for the ID field, but you can have plenty of multi-column indexes (where the date field comes later - the columns listed earlier in the index do a good job of reducing the result set) and covering indexes (where all returned values are in the index, so it's very fast, as if you were searching on the clustered index to begin with).
If the table is edited or added to often, or you have limited storage space and hence can't have tons of indexes, then you have to be more careful with your indexes. If your date criteria typically give a wide range of data, and you don't search often on other fields, then you could give the clustered index over to this date field - but think several times before you do that. Your clustered index being on a simple autonumber field is a bonus for all your indexes, since non-covering indexes use the clustered index to zip to the records for the result set. Don't move the clustered index to a date field unless the vast majority of your searching is on that date field. It's the nuclear option.
If you can't have a lot of covering indexes (data changes a lot on the table, there's limited space, your result sets are large and varied), and/or you really need the clustered index for another column, and the typical date criteria give a wide range of records, and you have to search a lot, you've got problems. If you can dump data to a reporting table, do that. If you can't, then you'll have to balance all these competing factors carefully. Maybe for the top 2-3 searches you minimize the result-set columns as much as you can and configure covering indexes, and you let the rest make do with a simple non-clustered index.
You can see why good DB people should be paid well. I know a lot of the factors, but I envy people who can balance all these things quickly and correctly without having to do a lot of profiling.
Don't index it IF you want to scan the entire table every time. I would want the database to try a range scan, so I'd add the index; I use SQL Server and it will use the index in most cases. However, different databases may not use the index.
Depending on the data, I'd go further than that, and suggest it could be a clustered index if you're going to be doing BETWEEN queries, to avoid the table scan.
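To make that concrete, this is the sort of range query in question (table and column names are invented for illustration); an index on order_date lets the engine seek to the start of the range and read forward instead of scanning the whole table:

SELECT order_id, order_date, total
FROM orders
WHERE order_date BETWEEN '2010-01-01' AND '2010-03-31';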
While an index helps for querying the table, it will also slow down inserts, updates and deletes somewhat. If you have a lot more changes in the table than queries, an index can hurt the overall performance.
If the table is small it might never use the indexes therefore adding them may just be wasting resources.
There are datatypes (like image in SQL Server) and data distributions where indexes are unlikely to be used or can't be used. For instance in SQL Server, it is pointless to index a bit field as there is not enough variability in the data for an index to do any good.
If you usually query with a LIKE clause and a wildcard as the first character, no index will be used, so creating one is another waste of resources.
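To illustrate that last point with a hypothetical customers table:

-- An index on name can be used: the prefix is fixed
SELECT * FROM customers WHERE name LIKE 'Smi%';

-- A plain b-tree index on name cannot be used: the wildcard comes first
SELECT * FROM customers WHERE name LIKE '%mith';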

How to know when to use indexes and which type?

I've searched a bit and didn't see any similar question, so here goes.
How do you know when to put an index in a table? How do you decide which columns to include in the index? When should a clustered index be used?
Can an index ever slow down the performance of select statements? How many indexes is too many and how big of a table do you need for it to benefit from an index?
EDIT:
What about column data types? Is it ok to have an index on a varchar or datetime?
Well, the first question is easy:
When should a clustered index be used?
Always. Period. Except for a very few, rare, edge cases. A clustered index makes a table faster, for every operation. YES! It does. See Kim Tripp's excellent The Clustered Index Debate continues for background info. She also mentions her main criteria for a clustered index:
narrow
static (never changes)
unique
if ever possible: ever increasing
INT IDENTITY fulfills this perfectly - GUIDs do not. See GUIDs as Primary Key for extensive background info.
Why narrow? Because the clustering key is added to each and every index page of each and every non-clustered index on the same table (in order to be able to actually look up the data row, if needed). You don't want to have VARCHAR(200) in your clustering key....
Why unique? See above - the clustering key is the item and mechanism that SQL Server uses to uniquely find a data row. It has to be unique. If you pick a non-unique clustering key, SQL Server itself will add a 4-byte uniqueifier to your keys. Be careful of that!
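Putting those four criteria together, the classic pattern looks like this in SQL Server (a sketch with invented names, not a prescription):

CREATE TABLE dbo.Sales (
    SaleId     INT IDENTITY(1,1) NOT NULL,   -- narrow, static, unique, ever-increasing
    CustomerId INT NOT NULL,
    SaleDate   DATETIME2 NOT NULL,
    CONSTRAINT PK_Sales PRIMARY KEY CLUSTERED (SaleId)
);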
Next: non-clustered indices. Basically there's one rule: any foreign key in a child table referencing another table should be indexed, it'll speed up JOINs and other operations.
Furthermore, any queries that have WHERE clauses are a good candidate - pick those first which are executed a lot. Put indices on columns that show up in WHERE clauses, in ORDER BY statements.
Next: measure your system, check the DMVs (dynamic management views) for hints about unused or missing indices, and tweak your system over and over again. It's an ongoing process; you'll never be done! See here for info on those two DMVs (missing and unused indices).
Another word of warning: with a truckload of indices, you can make any SELECT query go really really fast. But at the same time, INSERTs, UPDATEs and DELETEs which have to update all the indices involved might suffer. If you only ever SELECT - go nuts! Otherwise, it's a fine and delicate balancing act. You can always tweak a single query beyond belief - but the rest of your system might suffer in doing so. Don't over-index your database! Put a few good indices in place, check and observe how the system behaves, and then maybe add another one or two, and again: observe how the total system performance is affected by that.
Rule of thumb: index the primary key (implied, and it defaults to clustered) and each foreign key column.
There is more to it, but you could do worse than starting with SQL Server's missing-index DMVs.
An index may slow down a SELECT if the optimiser makes a bad choice, and it is possible to have too many. Too many will slow down writes, and it's also possible for indexes to overlap one another.
Answering the ones I can: I would say that every table, no matter how small, will always benefit from at least one index, as there has to be at least one way in which you are interested in looking up the data; otherwise, why store it?
A general rule for adding an index: you need to find data in the table using a particular field or set of fields. This leads on to how many indexes are too many. Generally, the more indexes you have, the slower inserts and updates will be, because they also have to modify the indexes - but it all depends on how you use your data. If you need fast inserts, don't use too many. In 'read only' reporting-type data stores you can have a number of them to make all your lookups faster.
Unfortunately there is no one rule to guide you on the number or type of indexes to use, although the query optimiser of your chosen DB can give hints based on the queries you are executing.
As to clustered indexes, they are the ace card you only get to play once, so choose carefully. It's worth calculating the selectivity of the field you are thinking of putting it on, as it can be wasted on something like a boolean field (contrived example) where the selectivity of the data is very low.
This is really a very involved question, though a good starting place would be to index any column that you will filter results on. I.e., if you often break products into groups by sale price, index the sale_price column of the products table to improve scan times for that query, etc.
If you are querying based on the value in a column, you probably want to index that column.
i.e.
SELECT a,b,c FROM MyTable WHERE x = 1
You would want an index on x.
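In concrete terms, for the example query above (the index name is arbitrary):

CREATE INDEX IX_MyTable_x ON MyTable (x);   -- lets WHERE x = 1 seek instead of scan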
Generally, I add indexes for columns which are frequently queried, and I add compound indexes when I'm querying on more than one column.
Indexes won't hurt the performance of a SELECT, but they may slow down INSERTs (or UPDATEs) if you have too many indexed columns per table.
As a rule of thumb - start off by adding indexes when you find yourself saying WHERE a = 123 (in this case, an index for "a").
You should use an index on columns that you use for selection and ordering - i.e. the WHERE and ORDER BY clauses.
Indexes can slow down select statements if there are many of them and you are using WHERE and ORDER BY on columns that have not been indexed.
As for size of table - several thousands rows and upwards would start showing real benefits to index usage.
Having said that, there are automated tools to do this, and SQL Server has the Database Engine Tuning Advisor, which will help with this.