Index all columns - sql

Knowing that an indexed column leads to a better performance, is it worthy to indexes all columns in all tables of the database? What are the advantages/disadvantages of such approach?
If it is worthy, is there a way to auto create indexes in SQL Server? My application dynamically adds tables and columns (depending on the user configuration) and I would like to have them auto indexed.

It is difficult to imagine real-world scenarios where indexing every column would be useful, for the reasons mentioned above. The type of scenario would require a bunch of different queries, all accessing exactly one column of the table. Each query could be accessing a different column.
The other answers don't address the issues during the select side of the query. Obviously, maintaining indexes is an issue, but if you are creating the table/s once and then reading many, many times, the overhead of updates/inserts/deletes is not a consideration.
An index contains the original data along with points to records/pages where the data resides. The structure of an index makes it fast to do things like: find a single value, retrieve values in order, count the number of distinct values, and find the minimum and maximum values.
An index does not only take space up on disk. More importantly, it occupies memory. And, memory contention is often the factor that determines query performance. In general, building an index on every column will occupy more space than then original data. (One exception would be a column that is relative wide and has relatively few values.)
In addition, to satisfy many queries you may need one or more indexes plus the original data. Your page cache gets rather filled with data, which can increase the number of cache misses, which in turn incurs more overhead.
I wonder if your question is really a sign that you have not modelled your data structures adequately. There are few cases where you want users to build ad hoc permanent tables. More typically, their data would be stored in a pre-defined format, which you can optimize for the access requirements.

No because you have to take in consideration that every time you add or update a record, you have to recalculate your indexes and having indexes on all columns would take a lot of time and lead to bad performance.
So databases like data warehouses where there use only select queries is a good idea but on normal database it's a bad idea.
Also, it's not because you are using a column in a where clause that you have to add an index on it.
Try to find a column where the record will be almost all unique like a primary key and that you don't edit often.
A bad idea would be to index the sex of a person cause there are only 2 possible values and the result of the index would only split the data then it will search in almost every records.

No, you should not index all of your columns, and there's several reasons for this:
There is a cost to maintain each index during an insert, update or delete statement, that will cause each of those transactions to take longer.
It will increase the storage required since each index takes up space on disk.
If the column values are not disperse, the index will not be used/ignored (ex: A gender flag).
Composite indexes (indexes with more than one column) can greatly benefit performance for frequently run WHERE, GROUP BY, ORDER BY or JOIN clauses, and multiple single indexes cannot be combined.
You are much better off using Explain plans and data access and adding indexes when necessary (and only when necessary, IMHO), rather than creating them all up front.

No, there is overhead in maintaining the indexes, so indexing all columns would slow down all of your insert, update and delete operations. You should index the columns that you are frequently referencing in WHERE clauses, and you will see a benefit.

Indexes take up space. And they take up time to create, rebuild, maintain, etc. So there's not a guaranteed return on performance for indexing just any old column. You should index the columns that give the performance for the operations you'll use. Indexes help reads, so if you're mostly reading, index columns that will be searched on, sorted by, or joined to other tables relationally. Otherwise, it's more expensive than what benefit you may see.

Every index requires additional CPU time and disk I/O overhead during
inserts and deletions.
Indies on non-primary keys might have to be hanged on updates, although an index on the primary key might not (this is beause updates typially do not modify the primary-key attributes).
Each extra index requires additional storage spae.
For queries whih involve onditions on several searh keys, e ieny
might not be bad even if only some of the keys have indies on them.
Therefore, database performane is improved less by adding indies when
many indies already exist.

Related

When isn't it appropriate to use a SQL Index

I was asked a question today on when wouldn't I want to create a SQL Index on a table.
The only thing I can think of is when you don't need one (i.e. a small table). That answer doesn't feel right. Is there a thresh-hold on when I should use an index and when I shouldn't?
When not to create an index on the table, there are lots of things to consider.
First, is that there are a lot of possible indexes you could create. For example, you could create an index containing not only every column in the table, but every permutation of the columns (since column ordering in indexes does matter). This could be a huge number of indexes as your column count gets higher.
Every index comes with a number of things that decrease performance in different ways. For example, they may take memory/disk space from what is available. Probably worse than this though, is the fact that indexes need to be updated when the table underneath it is updated. This means that every insert/update/delete in a table, can trigger an index update. As you have more indexes, that's more indexes to update, which can kill performance on your CUD operations, and can kill your server performance if you are doing these often.
Because of this performance impact, you want to avoid 'useless' indexes. Indexes that are used for every query are typically good, but an index used only once a day for a <1s query is probably useless. It's all a tradeoff in attempting to determine which indexes are useful enough to use and whose performance benefits are greater than the performance hits.
You could answer it with the conter question: When do you need an index?
You need an index, if you want to search for entries, to get your results faster. For example if the column is used in a where clause. Of course you could try index everything, but indexing will cause you to use extra memory/hard disk. So you only index columns you use to find your rows.
What rows MySQL for example is reading while trying to find your rows, you can analyze with the EXPLAIN command.
Does this help?
A rule of thumb is, to drop all indices except the unique index on the primary key, on small tables (less than about 100'000 rows).
Also, it is not appropriate to use an index, if the column is not for search purpose (e.g. the salary of employees).

Database indexes: A good thing, a bad thing, or a waste of time?

Adding indexes is often suggested here as a remedy for performance problems.
(I'm talking about reading & querying ONLY, we all know indexes can make writing slower).
I have tried this remedy many times, over many years, both on DB2 and MSSQL, and the result were invariably disappointing.
My finding has been that no matter how 'obvious' it was that an index would make things better, it turned out that the query optimiser was smarter, and my cleverly-chosen index almost always made things worse.
I should point out that my experiences relate mostly to small tables (<100'000 rows).
Can anyone provide some down-to-earth guidelines on choices for indexing?
The correct answer would be a list of recommendations something like:
Never/always index a table with less than/more than NNNN records
Never/always consider indexes on multi-field keys
Never/always use clustered indexes
Never/always use more than NNN indexes on a single table
Never/always add an index when [some magic condition I'm dying to learn about]
Ideally, the answer will give some instructive examples.
Indexes are kind of like chemotherapy...too much and it kills you...too little and you die...do it the wrong way and you die. You gotta know just how much, how often, and what kind to make it not kill you.
Your hardware, platform, environment, load all play a role. So to answer your questions..
Yes, possibly sometimes.
As a rule of thumb, primary keys and foreign keys need to be indexed. Usually primary key are indexed just by defining them as such, but FKs are not in every database (they definitely are not in SQL Server, I can't really speak for other dbs). You will be using these in joins, so it is generally critical to performance to define these.
Now if you have fields you often use in where clauses, they can benefit from indexes as well providing several things:
First the field must have a range of
values. A bit field or a field with
only 2 or 3 values will almost never
use an index.
Second the queries you write must be sargable. That is they must be designed to use indexes. I suspect if you never get performance improvements from what look like likely candidates for indexes, then you probably have queries that are not sargable. For instance take "WHERE Name like '%Smith'" as a where clause. Without knowing the first characters, the optimizer can't use the index.
Small tables rarely benefit much from indexes. If the optimizer can hold the whole thing in memory, then it is often faster to do so. If you were working with multimillion record tables, you would see that indexes are critical.
Indexing can be very complex and if you are interested in the subject, I suggest you get a good book on performance tuning your particular database and read in depth about them.
An index that's never used is a waste of disk space, as well as adding to the insert/update/delete time. It's probably best to define the clustering index first, then define
additional indexes as you find yourself writing WHERE clauses.
One common index mistake I see is people wondering why a select on col2 (or col3) takes so long when the index is defined as col1 ASC, col2 ASC, col3 ASC. When you have a multiple column index, your WHERE clause must use the first column in the index, or the first and second column in the index, and so forth.
If you need to access the data by col2, then you need an additional index that's defined as col2 ASC.
With small domain tables, it's sometimes faster to do a table scan than it is to read rows from the table using an index. This depends on the speed of your database machine and the speed of the network.
You need indexes. Only with indexes you can access data fast enough.
To make it as short as possible:
add indexes for columns you are frequently filtering (or grouping) for. (eg. a state or name)
like and sql functions could make the DBMS not use indexes.
add indexes only on columns which have many different values (eg. no boolean fields)
It is common to add indexes to foreign keys, but it is not always needed.
don't add indexes in very short tables
never add indexes when you don't know how they should enhance performance.
Finally: look into execution plans to decide how to optimize queries.
You'll add indexes just for a single, critical query. In this case, you'll add exactly the indexes that are needed in the query in question (multi-column indexes).
Basically when DB is collecting data and it's alive indexes have to go and evolve with that flow. There maybe really good index on table but after growing beyond of XXX records the same index in the same table is useless and in that case it should be refactored.
To have optimized and fast DB the only way is to monitor it all the time and refactor it over the time as records come in.
Real life example i got some time ago was super fast query restricted by some time range (created_at between A and B) and super slow query where time range was different. Same query, same database, same application and only one difference on time range.
Always use clustered indexes.
In fact you can't help but using them. The data in a table will be laid out on disk in some particular order anyway, it can't be save as a pile or something. You have the chance of specifying how exactly this data will be laid out. Why burn it?
When you have a table which gets new records appended and you observe that some value in those records always grow (like StackOverflow question number), make a clustered index out of it. Then the new data will not be inserted in the middle but will basically be appended to a file on disk which is a relatively cheap operation.
If a table is expected to be the target of a join then it is best to have a clustered index on that table so that the joins can be performed sequentially through the data pages. The columns in the clustered index will (on some DB systems) be included in all of the other indexes on that table, since those are the values that the indexes will use to reference the table data. To keep the other indexes from getting too large, the columns in the clustered index should be as narrow as possible, so it is best to use only numeric—rather than character—data types in the clustered index. In general, fewer columns are better than more columns, but notice that three int columns (12 bytes per row) are much better than one nvarchar(32) column (potentially 64 bytes per row).
If the clustered index is narrow, then a few additional indexes should not negatively impact performance very much even on very large tables.
Seems you are confusing two concepts here.
Adding indices *generally can only make a read query faster, very very rarely (almost never) slower. Adding an index never forces the query optimizer to use it. It will only use it if it thinks it can benefit from it, and it is generally very smart about those decisions.
For inserts/updates, of course, every index hurts performance a bit more... But at the other end of the spectrum, for, say a read only database, (like a USPS address database which is distributed monthly), in operational use there would ne no inserts/updates, so the only negative impact of additional indices is the disk space they take up.
This is entirely different that specifying that the query optimizer USE an index, in effect overriding what it would do on it's own... That can potentially make a query slower.
EDIT: Edited to eliminate opportunity for misinterpretation by overly literal readers.

Do indexes suck in SQL?

Say I have a table with a large number of rows and one of the columns which I want to index can have one of 20 values.
If I were to put an index on the column would it be large?
If so, why? If I were to partition the data into the data into 20 tables, one for each value of the column, the index size would be trivial but the indexing effect would be the same.
It's not the indexes that will suck. It's putting indexes on the wrong columns that will suck.
Seriously though, why would you need a table with a single column? What would the meaning of that data be? What purpose would it serve?
And 20 tables? I suggest you read up on database design first, or otherwise explain to us the context of your question.
Indexes (or indices) don't suck. A lot of very smart people have spent a truly remarkable amount of time of the last several decades ensuring that this is so.
Your schema, however, lacking the same amount of expertise and effort, may suck very badly indeed.
Partitioning, in the case described is equivalent to applying a clustered index. If the table is sorted otherwise (or is in arbitrary order) then the index necessarily has to occupy much more space. Depending on the platform, a non-clustered index may reduce in size as the sortedness of the rows with respect to the indexed value increases.
YMMV.
The short answer:
Do indexes suck: Yes and No
The longer answer:
They don't suck if used properly. Maybe you should start reading about how indexes work, why they can work and why they sometimes don't work.
Good starting points:
http://www.sqlservercentral.com/articles/Indexing/
No indexes don't suck, but you have to pay attention to how you use them or they can backfire on the performance of your queries.
First: Schema / design
Why would you create a table with only one column? That's probably taking normalization one step to far. Database design is one of the most important things to consider in optimizing performance
Second: Indexes
In a nutshell the indexes will help the database to perform a binary search of your record. Without an index on a column (or set of columns) the database will often fall back to a table scan. A table scan is very expensive because it involves enumerating each and every record.
It doesn't really matter THAT much for index scans how many records there are in the database table. Because of the (balanced) binary tree search doubling the amount of records will only result in one extra search step.
Determine the primary key of your table, SQL will automatically place a clustered index on that column(s). Clustered indexes perform really well. In addition you can place non-clustered indexes on columns that are used often in SELECT, JOIN, WHERE, GROUP BY and ORDER BY statements. Do remember that indexes have a certain overlap, try to never include your clustered index into a non-clustered index.
Also interesting might be the fill factor on the indexes. Do you want to optimize your table for reads (high fill factor - less storage, less IO) or for writes (low fill factor more storage, less rebuilding your database pages).
Third: Partitioning
One of the reasons to use partitioning is to optimize your data access. Let's say you have 1 million records of which 500,000 records are no longer relevant but stored for archiving purposes. In this case you could decide to partition the table and store the 500,000 old records on slow storage and the other 500,000 records on fast storage.
To measure is to know
The best way to get insight in what happens is to measure what happens to your cpu and io. Microsoft SQL server has some tools like the Profiler and Execution plans in Management Studio that will tell you the duration of your query, number of read/writes and cpu usage. Also the execution plan will tell you which or IF indexes are being used. To your surprise you might see a table scan although you didn't expect it.
Say I have a table with a large number of rows and one column which I want to index can have one of 20 values. If I were to put an index on the column would it be large?
The index size will be proportional to the number of your rows and the length of the indexed values.
The index keeps not only the indexed value, but also some kind of a pointer to the row (ROWID in Oracle, LCID in PostgreSQL, primary key in InnoDB etc).
If you have 10,000 rows and a 1 distinct value, you will still have 10,000 records in your index.
If so, why? If I were to partition the data into the data into 20 tables, one for each value of the column, the index size would be trivial but the indexing effect would be the same
In this case, you would come with 20 indexes being same in size in total as your original one.
This technique is sometimes used in fact in such called partitioned indexes. It has its advantages and drawbacks.
Standard b-tree indexes are best suited to fairly selective indexes, which this example would not be. You don't say what DBMS you are using; Oracle has another type of index called a bitmap index which is more suited to low-selectivity indexes in OLAP environments (since these indexes are expensive to maintain, making them unsuitable for OLTP environments).
The optimiser will decide bases on stats whether it thinks the index will help get the data in the fastest time; if it won't, the optmiser won't use it.
Partitioning is another strategy. In Oracle you can define a table as partitioned on some set of columns, and for the optimiser can automatically perform "partition elimination" like you suggest.
Sorry, I'm not quite sure what you mean by "large".
If your index is clustered, all the data for each record will be on the same leaf page, thereby creating the most efficient index available to your table as long as you write your queries against it properly.
If your index is non-clustered, then only the index related data will be on your leaf pages. Then, depending on suchs things as how many other indexes you have, coupled with details like your fill factor, your index may or may not be efficient. In general, if you don't have a ton of indexes on your table, you should be safe.
The efficiency of your index will also be determined by the data type of the 20 values you're speaking of going into the column. If those are pre-defined values, then their details should probably be in a lookup table with a simple primary key datatype (like Int/Number). Then add that column to your table as a foreign key with an index on the column.
Ultimately, you could have a perfect index on a column. But it's best use will be determined for the most part by the queries you write. So if your queries make use of the indexes, you're golden.
Indexes are purely for performance. If an index doesn't boost performance for the queries you're interested in, then it sucks.
As for disk usage, you have to weigh your concerns. Different SQL providers build indexes differently, but as a client, you generally trust that they do the best that can be done. In the case you're describing, a clustered index may be optimal for both size and performance.
It would be large enough to hold those values for all the rows, in a sorted order.
Say you have 20 different strings of 4 characters, and 1 million rows, it would at least be 4 million bytes (or 8 if 16-bit unicode) to hold those values.

When should you consider indexing your sql tables?

How many records should there be before I consider indexing my sql tables?
There's no good reason to forego obvious indexes (FKs, etc.) when you're creating the table. It will never noticeably affect performance to have unnecessary indexes on tiny tables, and it's good to take a first cut when your mind is into schema design. Also, some indexes serve to prevent duplicates, which can be useful regardless of table size.
I guess the proper answer to your question is that the number of records in the table should have nothing to do with when to create indexes.
I would create the index entries when I create my table. If you decide to create indices after the table has grown to 100, 1000, 100000 entries it can just take alot of time and perhaps make your database unavailable while you are doing it.
Think about the table first, create the indices you think you'll need, and then move on.
In some cases you will discover that you should have indexed a column, if thats the case, fix it when you discover it.
Creating an index on a searched field is not a pre-optimization, its just what should be done.
When the query time is unacceptable. Better yet, create a few indexes now that are likely to be useful, and run an EXPLAIN or EXPLAIN ANALYZE on your queries once your database is populated by representative data. If the indexes aren't helping, drop them. If there are slow queries that could benefit from more or different indexes, change the indexes.
You are not going to be locked in to an initial choice of indexes. Experiment, and make sure you measure performance!
In general I agree with the previous advice.
Always declare the referential integrity for the tables (Primary Key, Foreign Keys), column constraints (not null, check). Saves you from nightmares when apps put bad data into the tables (even in development).
I'd consider adding indexes for the common access columns (columns in your where clauses which are used in =, <> tests), as well.
Most of the modern RDBMS implementations are quite good at keeping you indexes up to date, without hitting your performance. So, the cost of having indexes is minimal.
Also, most RDBMS's have query plan evaluators which look at the relative costs going to the data rows via the index, or using some sort of table scan. So, again the performance hits are minimal.
Two.
I'm serious. If there are two rows now, and there will always be two rows, the cost of indexing is almost zero. It's quicker to index than to ponder whether you should. It won't take the optimizer very long to figure out that scanning the table is quicker than using the index.
If there are two rows now, but there will be 200,000 in the near future, the cost of not indexing could become prohibitively high. The right time to consider indexing is now.
Having said this, remember that you get an index automatically when you declare a primary key. Creating a table with no primary key is asking for trouble in most cases. So the only time you really need to consider indexing is when you want an index other than the index on the primary key. You need to know the traffic, and the anticipated volume to make this call. If you get it wrong, you'll know, and you can reverse the decision.
I once saw a reference table that had been created with no index when it contained 20 rows. Due to a business change, this table had grown to about 900 rows, but the person who should have noticed the absence of an index didn't. The time to insert a new order had grown from about 10 seconds to 15 minutes.
As a matter of routine I perform the following on read heavy tables:
Create indexes on common join fields such as Foreign Keys when I create the table.
Check the query plan for Views or Stored Procedures and add indexes wherever a table scan is indicated.
Check the query plan for queries by my application and add indexes wherever a table scan is indicated. (and often try to make them into Stored Procedures)
On write heavy tables (like activity logs) I avoid indexes unless they are absolutely necessary. I also tend to archive such data into indexed tables at regular intervals.
It depends.
How much data is in the table? How often is data inserted? A lot of indexes can slow down insertion time. Do you always query all the rows of the table? In this case indexes probably won't help much.
Those aren't common usages though. In most cases, you know you're going to be querying a subset of data. ON what fields? Are there common fields that are always joined on? Look at query plans for common or typical queries, it will generally show you where it's spending all of its time.
If there's a unique constraint on the table (and there should be at least one), then that will usually be enforced by a unique index.
Otherwise, you add indexes when the query performance is bad and adding the index will demonstrably improve the performance. There are books on the subject of how to create good sets of indexes on tables, including Relational Database Index Design and the Optimizers. It will give you a lot of ideas and the reasons why they are good.
See also:
No indexes on small tables
When to create a new SQL Server index
Best Practices and Anti-Patterns in Creating Indexes
and, no doubt, a host of others.

No indexes on small tables?

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Donald Knuth). My SQL tables are unlikely to contain more than a few thousand rows each (and those are the big ones!). SQL Server Database Engine Tuning Advisor dismisses the amount of data as irrelevant. So I shouldn't even think about putting explicit indexes on these tables. Correct?
The value of indexes is in speeding reads. For instance, if you are doing lots of SELECTs based on a range of dates in a date column, it makes sense to put an index on that column. And of course, generally you add indexes on any column you're going to be JOINing on with any significant frequency. The efficiency gain is also related to the ratio of the size of your typical recordsets to the number of records (i.e. grabbing 20/2000 records benefits more from indexing than grabbing 90/100 records). A lookup on an unindexed column is essentially a linear search.
The cost of indexes comes on writes, because every INSERT also requires an internal insert to each column index.
So, the answer depends entirely on your application -- if it's something like a dynamic website where the number of reads can be 100x or 1000x the writes, and you're doing frequent, disparate lookups based on data columns, indexing may well be beneficial. But if writes greatly outnumber reads, then your tuning should focus on speeding those queries.
It takes very little time to identify and benchmark a handful of your app's most frequent operations both with and without indexes on the JOIN/WHERE columns, I suggest you do that. It's also smart to monitor your production app and identify the most expensive, and most frequent queries, and focus your optimization efforts on the intersection of those two sets of queries (which could mean indexes or something totally different, like allocating more or less memory for query or join caches).
Knuth's wise words are not applicable to the creation (or not) of indexes, since by adding indexes you are not optimising anything directly: you are providing an index that the DBMSs optimiser may use to optimise some queries. In fact, you could better argue that deciding not to index a small table is premature optimisation, as by doing so you restrict the DBMS optimiser's options!
Different DBMSs will have different guidelines for choosing whether or not to index columns based on various factors including table size, and it is these that should be considered.
What is an example of premature optimisation in databases: "denormalising for performance" before any benchmarking has indicated that the normalised database actually has any performance issues.
Primary key columns will be indexed for the unique constraint. I would still index all foreign key columns. The optimizer can choose to ignore your index if it is irrelevant.
If you only have a little bit of data then the extra cost for insert/update should not be significant either.
Absolutely incorrect. 100% incorrect. Don't put a million pointless indexes, but you do want a Primary Key (in most cases), and you do want it CLUSTERED correctly.
Here's why:
SELECT * FROM MySmallTable <-- No worries... Index won't help
SELECT
*
FROM
MyBigTable INNER JOIN MySmallTable ON... <-- Ahh, now I'm glad I have my index.
Here's a good rule to go by.
"Since I have a TABLE, I'm likely going to want to query it at some time... If I'm going to query it, I'm likely going to do so in a consistent way..." <-- That's how you should index the table.
EDIT: I'm adding this line: If you have a concrete example in mind, I'll show you how to index it, and how much of a savings you'll get from doing so. Please supply a table, and an example of how you plan in using that table.
I suggest that you follow the usual rules about indexing, which approximately means "create indexes on those columns that you use in your queries".
This might sound unnecessary with such a small database. As others have already said: as long as your database stays as small as you have described, the queries will be fast enough anyway, and the indexes aren't really needed. They can even slow down insertions and updates, but unless you have very specific requirements there, it doesn't matter with such a small database.
But, if the database grows (which databases sometimes have a tendency to do) you don't have to remember to add indexes to that old database that you've probably forgotten about by then. Maybe it has even been installed at one your customers, and you can't modify it!
I guess what I'm saying is this: indexes should be such a natural part of your database design, that it is the lack of indexes that is the optimization, premature or not.
It depends. Is the table a reference table?
There are tables of a thousand rows where the absence of an index, and the resulting table scans can make the difference between a fairly simple operation delaying the user by 5 minutes instead of 5 seconds. I have seen exactly this problem, using a DBMS other than SQL Server.
Generally, if the table is a reference table, updates on it will be relatively rare. This means that the performance hit for updating the index will also be relatively rare. If the optimizer passes over the index, the performance hit on the optimizer will be negligible. The space needed to store the index will also be negligible.
If you declare a primary key, you should get an automatic index on that key. That automatic index will almost always do you enough good to justify its cost. Leave it in there. If you create a reference table without a primary key, there are other problems in your design methodology.
If you do frequent searches or frequent joins on some set of columns other than the primary key, an additional index might pay for itself. Don't fix that problem unless it is a problem.
Here's the general rule of thumb: go with the default behavior of the DBMS, unless you find a reason not to. Anything else is a premature preoccupation with optimization on your part.
If the rows have narrow width, and a few thousand rows fit on say 10-20 8K pages, it is unlikely that the SQL optimiser would elect to use an index even if you create one.
Put indexes ONLY if you have to :)
There are times when putting indexes can actually hurt performance, depending on what the table is used for...
So, in other words, you would think about putting indexes on tables when it is necessary as determined by profiling the application.
Indexes are often created implicitly when using UNIQUE constraints. I wouldn't try to avoid their use in that case!
As a general rule of thumb, it's good to avoid smaller indexes as they typically won't be used.
But sometimes they can provide a huge boost as I outlined here.
I guess there is an auto indexing on the primary key of the table which should be sufficient when querying on a table with less data.
So, yes explicit indexes can be avoided in case there is a small data set to be worked upon.
Even if you have an index, SQL Server might not even use it, depending on the statistics for that table. And if you plan to put in an index for a report that will run at most a couple times a year, bear in mind that the INSERT/UPDATE penalties for adding the index will be in effect ALL THE TIME. Before adding an index, ask yourself if it is worth the performance penalty.
You have to understand that based upon the query two lookups may be done, one into the index to get the pointer to the row, the next to the row itself. If the data that is being queried is in the index columns that extra step may not be necessary.
It is entirely possible that double dipping for data may be slower even if the optimizer goes after the index. Whether or not we care is up to application profiling and eventual explain plans.