Can deleting rows from a table cause index fragmentation?

I removed some rows from a very large table, then ran a query that usually completes within a few seconds and found it running very slowly. I rebuilt my index, ran the query again, and found it to be fast again. Could deleting those rows have caused the index to become fragmented?

Yes, deleting rows affects the index, and maintenance should take place periodically to keep the index reasonably in sync with the existing data.
Rebuilding the index was likely unnecessary - according to the Microsoft documentation, you only need to do that when physical fragmentation reaches 30 percent or more. REORGANIZE is usually a better choice - think of it as defragmenting the index.
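If you want to check where an index stands before deciding, the physical-stats DMV reports fragmentation directly. A quick sketch (the table and index names here are placeholders):

    -- Check fragmentation for one table's indexes
    SELECT i.name AS index_name,
           ips.avg_fragmentation_in_percent,
           ips.page_count
    FROM sys.dm_db_index_physical_stats(
             DB_ID(), OBJECT_ID('dbo.MyTable'), NULL, NULL, 'LIMITED') AS ips
    JOIN sys.indexes AS i
      ON i.object_id = ips.object_id
     AND i.index_id = ips.index_id;

    -- Below roughly 30% fragmentation, a reorganize is usually enough:
    ALTER INDEX IX_MyTable_MyColumn ON dbo.MyTable REORGANIZE;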
This is a good article series on SQL Server Index Fragmentation.

Related

Does deleting rows from a table disrupt indexes?

I need to know whether deleting rows (I am talking about SQL Server) from a table that has indexes (clustered or non-clustered, in both situations) can do any damage to the indexes. What happens to indexes when we delete rows? Also, which is better for performance: deleting rows from a table after processing them, or marking them as processed (when we will need to reuse them around 20 more times)? Thanks for the answers.
I don't know what you mean by "damage". When you delete rows from the table, the index entries need to be deleted as well. This does not "damage" the index per se. At least, the index continues to be useful.
If you have lots of deletes, updates, and inserts, then over time the index will be fragmented. This does affect performance. At some point it becomes useful to re-build the index for performance purposes. You can read about this in the documentation.
I would not worry about rebuilding the indexes because of a handful of deletes. It takes a bit of work to really fragment an index.
My answer is YES.
An index is built on the data in the table, and, in short, when data is deleted from the table the level of fragmentation rises.
A rise in fragmentation levels affects data retrieval in many ways.

Batch index updates?

I'm writing several hundred or potentially several thousand rows into a set of tables at a time, each of which is heavily indexed both internally and via indexed views.
Generally, the inserts are occurring where the rows inserted will be adjacent in the index.
I expect these inserts to be expensive, but they are really slow. I think part of the performance issue is that the indexes are being updated with each individual INSERT.
Is there a way to tell SQL Server to hold off on updating the indexes until I am finished with my batch of inserts so the index trees will only need to be updated once?
These are executed as separate statements due to needing to show the user a progress bar during save and log any individual issues, but are all coming from the same connection in C#. I can place them in a transaction if needed, though I'd prefer not to.
You are paying the cost of adding those rows to the index one way or another. Not updating the index during the insert would cause a correctness problem for concurrent statements - any query on that table that used any of the indexes would not "see" the new rows!
If speed is of the essence, and downtime after the insert isn't a major concern, you can:
Disable non-clustered indexes on the target table
Insert
Rebuild non-clustered indexes
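A sketch of that disable/insert/rebuild pattern (table and index names are placeholders; only disable the non-clustered indexes, since disabling a clustered index makes the table inaccessible):

    -- Disable each non-clustered index on the target table
    ALTER INDEX IX_Target_ColA ON dbo.Target DISABLE;
    ALTER INDEX IX_Target_ColB ON dbo.Target DISABLE;

    -- ... perform the batch of INSERTs here ...

    -- Rebuilding re-enables each index, updating its tree in one pass
    ALTER INDEX IX_Target_ColA ON dbo.Target REBUILD;
    ALTER INDEX IX_Target_ColB ON dbo.Target REBUILD;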
You probably should clarify some more about your table:
How wide is the table?
How many indexes?
How wide are the indexes?
If you have 20 indexes and each index has 5 fields, you are really updating 100 extra fields per row which can get expensive quickly.
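One way to get a feel for those numbers is to query the catalog views. A sketch, assuming dbo.Target is the table in question:

    -- List each index on the table and how many columns it carries
    SELECT i.name AS index_name,
           i.type_desc,
           COUNT(ic.column_id) AS columns_in_index
    FROM sys.indexes AS i
    JOIN sys.index_columns AS ic
      ON ic.object_id = i.object_id
     AND ic.index_id = i.index_id
    WHERE i.object_id = OBJECT_ID('dbo.Target')
    GROUP BY i.name, i.type_desc;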

Why is my index getting fragmented?

I have a simple little table (just 8 fields). An hour ago I rebuilt one of the indexes on the table, which reset it to 0% fragmentation, but now it’s up to 38%.
The table itself has 400k records in it, but only 158 new ones have been inserted since I rebuilt the index. There have been no updates to records, but perhaps a couple of deletes.
Why should the index be getting so fragmented?
The index is non-unique, non-clustered just on one field.
The database is running on SQL Server 2005 but with a compatibility level of SQL Server 2000.
Thanks
Check the fill factor for that index when it is rebuilt. The fill factor may be too high. If so, the index pages will be too full after the rebuild, and adding new rows will soon start to cause page splits (fragmentation). Reducing the fill factor on rebuild leaves room for more new records to be inserted into the index pages before page splitting starts to occur.
http://msdn.microsoft.com/en-us/library/aa933139%28SQL.80%29.aspx
Fill factor 0 is equal to 100, so you are not allowing any room for inserts. You should be choosing a lower fill factor if you will be inserting.
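For example, rebuilding with a lower fill factor leaves free space on each leaf page; the 80 below is illustrative, and the index name is a placeholder:

    -- Leave 20% free space per leaf page for future inserts
    ALTER INDEX IX_MyTable_MyColumn ON dbo.MyTable
    REBUILD WITH (FILLFACTOR = 80);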

SQL Server Indexes - Initial slow performance after creation

Using SQL Server 2005. This is something I've noticed while doing some performance analysis.
I have a large table with about 100 million rows. I'm comparing the performance of different indexes on the table, to see which is optimal for my test scenario, which is doing about 10,000 inserts on that table, among other things on other tables. While my test is running, I capture a SQL Profiler trace, which I load into a SQL table when the test has finished so I can analyse the stats.
The first test run after recreating a different set of indexes on the table is very noticeably slower than subsequent runs - typically about 10-15 times slower for the inserts on this table on the first run after the index creation.
Each time, I clear the data and execution plan cache before the test.
What I want to know, is the reason for this initial poorer performance with a newly created set of indexes?
Is there a way I can monitor what is happening to cause this for the first run?
One possibility is that the default fill factor of zero is coming into play.
This means that there's 'no room' in the index to accommodate your inserts. When you insert, a page split in the index is needed, which adds some empty space to store the new index information. As you carry out more inserts, more space is created in the index. After a while the rate of splitting will go down, because your inserts are hitting pages that are not fully filled, so splits are not needed. An insert requiring page splits is more expensive than one that doesn't.
You can set the fill factor when you create the index. It's a classic trade-off between space used and the performance of different operations.
I'm going to include a link to some Sybase ASE docs, 'cos they are nicely written and mostly applicable to SQL Server too.
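Two sketches that may help here (all object names are placeholders). First, you can sample the cumulative page-split counter before and after a test run to confirm that splits are the culprit; second, the fill factor can be set when the index is created:

    -- Cumulative page splits since server start; sample before and
    -- after the first test run and compare the difference
    SELECT cntr_value AS page_splits_since_startup
    FROM sys.dm_os_performance_counters
    WHERE counter_name = 'Page Splits/sec'
      AND object_name LIKE '%Access Methods%';

    -- Set the fill factor at index creation (the 90 is illustrative)
    CREATE NONCLUSTERED INDEX IX_BigTable_TestColumn
    ON dbo.BigTable (TestColumn)
    WITH (FILLFACTOR = 90);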
Just to clarify:
1) You build an index on a table with 100m pre-existing rows.
2) You insert 10k rows into the table
3) You insert another 10k rows into the table
Step 3 is 10x faster than step 2?
What kind of index is the new index - not clustered, right? Because inserts on a clustered index will cause very different behavior. In addition, is there any significant difference in the profile of the 2 inserts, because depending on the clustered index, they will have different behavior. Typically, it should either have no clustered index or be clustered on an increasing key.
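To illustrate the "increasing key" point, a minimal sketch with hypothetical names: with an ever-increasing clustered key, new rows always land at the end of the B-tree instead of splitting pages in the middle.

    -- New rows append at the end of the clustered index,
    -- so inserts cause no page splits mid-tree
    CREATE TABLE dbo.BigTable (
        Id   INT IDENTITY(1,1) NOT NULL,
        Data VARCHAR(100) NOT NULL,
        CONSTRAINT PK_BigTable PRIMARY KEY CLUSTERED (Id)
    );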

SQL Server: What is the difference between Index Rebuilding and Index Reorganizing?

What is the difference between Index Rebuilding and Index Reorganizing?
Think about how the index is implemented. It's generally some kind of tree, such as a B+ tree or B-tree. The index itself is created by looking at the keys in the data and building the tree so the table can be searched efficiently.
When you reorganize the index, you go through the existing index, cleaning up blocks for deleted records and so on. This could be done (and is, in some databases) when you make a deletion, but that imposes a performance penalty. Instead, you do it separately, in more or less batch mode.
When you rebuild the index, you delete the existing tree and read all the records, building a new tree directly from the data. That gives you a new, and hopefully optimized, tree that may be better than the results of reorganizing the table; it also lets you regenerate the tree if it somehow has been corrupted.
REBUILD locks the table for the whole operation (which may take hours or even days if the table is large).
REORGANIZE doesn't lock the table.
Well, actually, it places some temporary locks on the pages it is working with right now, but they are removed as soon as the operation on them is complete (which is a fraction of a second for any given lock).
As @Andomar noted, there is an option to REBUILD an index online, which creates the new index and, when the operation is complete, just replaces the old index with the new one.
This of course means you should have enough space to keep both the old and the new copy of the index.
REBUILD is also a DDL operation which changes the system tables, affects statistics, enables disabled indexes, etc.
REORGANIZE is a pure cleanup operation which leaves all system state as is.
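The two operations side by side, as a sketch (index and table names are placeholders; ONLINE = ON requires Enterprise Edition):

    -- Full rebuild; with ONLINE = ON the old index stays available
    -- while the new copy is built (Enterprise Edition only)
    ALTER INDEX IX_MyTable_MyColumn ON dbo.MyTable
    REBUILD WITH (ONLINE = ON);

    -- In-place cleanup of the leaf level; always online
    ALTER INDEX IX_MyTable_MyColumn ON dbo.MyTable REORGANIZE;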
There are a number of differences. Basically, rebuilding is a total rebuild of an index - it will build a new index, then drop the existing one, whereas reorganising it will simply, well... it will reorganise it.
This blog entry I came across a while back will explain it much better than I can. :)
Rebuilding drops the current index and recreates a new one.
Reorganizing is like putting the house in order with what you already have.
It is good practice to use 30% fragmentation to decide between rebuild and reorganize:
<30% reorganize vs. >30% rebuild
"Reorganize index" is a process of cleaning, organizing, and defragmenting of "leaf level" of the B-tree (really, data pages).
Rebuilding of the index is changing the whole B-tree, recreating the index.
It’s recommended that index should be reorganized when index fragmentation is from 10% to 40%; if index fragmentation is great than 40%, it’s better to rebuild it.
Rebuilding of an index takes more resources, produce locks and slowing performance (if you choose to keep table online). So, you need to find right time for that process.
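A sketch of a decision query using those thresholds (10% and 40% as above; adjust to taste):

    -- Suggest an action per index based on fragmentation
    SELECT OBJECT_NAME(ips.object_id) AS table_name,
           i.name AS index_name,
           ips.avg_fragmentation_in_percent,
           CASE
               WHEN ips.avg_fragmentation_in_percent > 40 THEN 'REBUILD'
               WHEN ips.avg_fragmentation_in_percent >= 10 THEN 'REORGANIZE'
               ELSE 'leave as is'
           END AS suggested_action
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    JOIN sys.indexes AS i
      ON i.object_id = ips.object_id
     AND i.index_id = ips.index_id
    WHERE i.index_id > 0;   -- index_id 0 is the heap itself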
In addition to the differences above (basically rebuild will create the index anew, and then "swap it in" for the existing one, rather than trying to fix the existing one), an important consideration is that a rebuild - even an Enterprise ONLINE rebuild - will interfere with snapshot isolation transactions.
TX1 starts snapshot transaction
TX1 reads from T
TX2 rebuilds index on T
TX2 rebuild complete
TX1 reads from T again:
Error 3961, Snapshot isolation transaction failed in database because the object accessed by the statement has been modified by a DDL statement in another concurrent transaction since the start of this transaction. It is disallowed because the metadata is not versioned. A concurrent update to metadata can lead to inconsistency if mixed with snapshot isolation.
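A minimal two-session sketch of that sequence (assumes the database has ALLOW_SNAPSHOT_ISOLATION enabled; dbo.T is a placeholder):

    -- Session 1 (TX1)
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRANSACTION;
    SELECT COUNT(*) FROM dbo.T;    -- first read succeeds

    -- Session 2 (TX2), while TX1 is still open
    ALTER INDEX ALL ON dbo.T REBUILD;

    -- Session 1 (TX1) again
    SELECT COUNT(*) FROM dbo.T;    -- fails with error 3961
    ROLLBACK;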
Rebuild index - rebuilds one or more indexes for a table in the specified database.
Reorganize index - defragments clustered and secondary indexes of the specified table.