Index fragmentation - azure-sql-database

I have an Azure SQL database in which most tables are fragmented more than 95%. I rebuild the indexes weekly. The tables are truncated and reloaded on each load (every 2 hours). After rebuilding, fragmentation increases rapidly and reaches 95% again within a couple of days. How can I improve the situation?

Use the drop-and-recreate index method to reduce index fragmentation.
• Reducing index fragmentation: if avg_fragmentation_in_percent is between 30% and 100%, use ALTER INDEX ... REBUILD. This is the replacement for DBCC DBREINDEX and can rebuild the index online or offline. Alternatively, use the drop-and-recreate index method.
Refer to: SQL Server - Fragmentation on Azure SQL database
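As a sketch (table and index names below are placeholders, not from the question), fragmentation can be measured with sys.dm_db_index_physical_stats and the affected indexes rebuilt:

```sql
-- List indexes in the current database with more than 30% fragmentation
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name                     AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30;

-- Rebuild a heavily fragmented index (placeholder names)
ALTER INDEX [IX_MyIndex] ON [dbo].[MyTable] REBUILD;
```

Since the tables here are truncated and fully reloaded every two hours, it may also be cheaper to disable the indexes before each load and rebuild them afterwards, so the load itself does not generate fragmentation.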

Related

Prevent DTU exhaustion when creating a new index

We're using Azure SQL as a single database and under the DTU pricing model. We have a table with ~50M records, and we'd like to add a new non-clustered index on a single string attribute.
The problem is that this is a production database. If I were to use the simple T-SQL syntax of
CREATE NONCLUSTERED INDEX [IndexName]
ON [dbo].[TableName]([FieldName] ASC);
GO
The index creation pegs DTUs at 100% for several minutes, which essentially starves any queries from our application code.
Is there a way in Azure SQL to instruct SQL Server "Hey, only use X DTUs for this indexing operation"?
You can use the MAXDOP option to limit the number of processors used:
ALTER INDEX test_idx on test_table REBUILD WITH (ONLINE=ON, MAXDOP=1, RESUMABLE=ON) ;
You may also consider the Resumable Online Index Rebuild feature, which lets you schedule repeated one-minute executions for those indexes:
ALTER INDEX [ix_CustomerIDs]
ON [ContosoSales].[ConstosoTransactionData]
REBUILD
WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 1 MINUTES);
GO
ALTER INDEX ix_CustomerIDs ON [ContosoSales].[ConstosoTransactionData] PAUSE
ALTER INDEX ix_CustomerIDs ON [ContosoSales].[ConstosoTransactionData] RESUME
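Since the question is about creating a new index rather than rebuilding an existing one, note that CREATE INDEX accepts the same options. This is a sketch using the question's own placeholder names, and it assumes SQL Server 2019 or Azure SQL Database, where RESUMABLE is supported for CREATE INDEX:

```sql
CREATE NONCLUSTERED INDEX [IndexName]
ON [dbo].[TableName] ([FieldName] ASC)
WITH (ONLINE = ON, MAXDOP = 1, RESUMABLE = ON, MAX_DURATION = 1 MINUTES);
GO

-- Pause and later resume the creation to spread the DTU cost over time
ALTER INDEX [IndexName] ON [dbo].[TableName] PAUSE;
ALTER INDEX [IndexName] ON [dbo].[TableName] RESUME;
```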

Is gathering statistics necessary after dropping indexes and recreating them?

I use Oracle SQL, and I have one question. I have a table with indexes, and its statistics are gathered. I have a procedure that drops the indexes, inserts data into the table, and then recreates the same indexes. Do I need to gather statistics again after this procedure, or will the indexes be recognized anyway?
For gathering statistics I use:
EXEC dbms_stats.gather_table_stats(''MIGBUFFER'','''||table_name||''',cascade=>TRUE);
No. There is no need to gather index statistics immediately after index creation, since Oracle automatically gathers stats for indexes at creation time.
From the documentation:
Oracle Database now automatically collects statistics during index
creation and rebuild.
In earlier releases you could use the COMPUTE STATISTICS clause to start or stop the collection of statistics on an index, but that clause is now deprecated: Oracle Database automatically collects statistics during index creation and rebuild. The clause is still accepted for backward compatibility and will not cause errors.
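As a quick check (assuming the MIGBUFFER schema from the question; the table name below is a placeholder), LAST_ANALYZED in the data dictionary shows that freshly created indexes already have statistics:

```sql
-- Index statistics are populated at creation/rebuild time
SELECT index_name, last_analyzed, num_rows
FROM   all_indexes
WHERE  owner = 'MIGBUFFER'
AND    table_name = 'MY_TABLE';  -- placeholder table name
```

Note that table statistics are a separate matter: the question's gather_table_stats call with cascade=>TRUE also refreshes table-level statistics, which index creation alone does not do, so gathering table statistics after the insert is still worthwhile.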

Locking table for more transactions

I am using Oracle 11g.
I am trying to implement a scenario of concurrent loading into a table with index rebuilds. I have a few flows, each of which goes through these steps:
1. load data from source,
2. transform the data,
3. turn off the index on DWH table,
4. load data into DWH,
5. rebuild index on DWH table.
I turn off and rebuild the indexes for better performance, but there are situations where one flow is rebuilding the index while another tries to turn it off. What I need is a lock placed between steps 2 and 3 that is released after step 5.
Oracle's built-in LOCK TABLE mechanism is not sufficient, because the lock is released at the end of the transaction, so any ALTER statement (which commits implicitly) releases the lock.
The question is how to solve the problem?
Problem solved. What I wanted to do can be achieved easily with the DBMS_LOCK package:
1. DBMS_LOCK.REQUEST(...)
2. ALTER INDEX ... UNUSABLE
3. Load data
4. ALTER INDEX ... REBUILD
5. DBMS_LOCK.RELEASE(...)
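A minimal PL/SQL sketch of that sequence (the lock name and index name are placeholders, and DBMS_LOCK requires an explicit EXECUTE grant to the calling user). The key detail is release_on_commit => FALSE, so the implicit commits of the ALTER statements do not drop the lock:

```sql
DECLARE
  l_lockhandle VARCHAR2(128);
  l_result     INTEGER;
BEGIN
  -- Map a lock name to a handle (safe to call from every session)
  DBMS_LOCK.ALLOCATE_UNIQUE(lockname   => 'DWH_INDEX_MAINT',
                            lockhandle => l_lockhandle);

  -- Block until the exclusive lock is granted (0 = success)
  l_result := DBMS_LOCK.REQUEST(lockhandle        => l_lockhandle,
                                lockmode          => DBMS_LOCK.X_MODE,
                                timeout           => DBMS_LOCK.MAXWAIT,
                                release_on_commit => FALSE);

  EXECUTE IMMEDIATE 'ALTER INDEX my_dwh_idx UNUSABLE';
  -- ... load data here ...
  EXECUTE IMMEDIATE 'ALTER INDEX my_dwh_idx REBUILD';

  l_result := DBMS_LOCK.RELEASE(lockhandle => l_lockhandle);
END;
/
```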
You can use DBMS_SCHEDULER:
1. run jobs that load and transform the data,
2. turn off indexes,
3. run jobs that load data into DWH,
4. rebuild indexes.
Using Chains:
http://docs.oracle.com/cd/B28359_01/server.111/b28310/scheduse009.htm#CHDCFBHG
You could range-partition your table by insertion time (on a minutes/hours basis) and enable the indexes on a partition only once its time window has passed. Each partition should have all indexes disabled after creation, and only one process should enable the indexes.

Deleting column doesn't reduce database size

I have a test database with one table holding some 50 million records. The table initially had 50 columns and has no indexes. When I execute the sp_spaceused procedure I get 24733.88 MB as the result. To reduce the size of this database, I removed 15 columns (mostly int columns) and ran sp_spaceused again, and I still get 24733.88 MB as the result.
Why is the database size not reducing after removing so many columns? Am I missing anything here?
Edit: I have tried shrinking the database, but it didn't help either.
Try running the following command:
DBCC CLEANTABLE ('dbname', 'yourTable', 0)
It will free space from dropped variable-length columns in tables or indexed views. More information here: DBCC CLEANTABLE, and here: Space used does not get changed after dropping a column.
Also, as correctly pointed out in the link posted in the first comment to this answer: after you've executed the DBCC CLEANTABLE command, you need to REBUILD your clustered index, if the table has one, in order to get the space back:
ALTER INDEX IndexName ON YourTable REBUILD
When a variable-length column is dropped from a table, the size of the table is not reduced. The table size stays the same until the indexes are reorganized or rebuilt.
There is also DBCC command DBCC CLEANTABLE, which can be used to reclaim any space previously occupied with variable length columns. Here is the syntax:
DBCC CLEANTABLE ('MyDatabase', 'MySchema.MyTable', 0) WITH NO_INFOMSGS;
GO
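One caveat worth noting: DBCC CLEANTABLE reclaims space only from dropped variable-length columns (varchar, nvarchar, varbinary, sql_variant, xml, text, ntext, image). The columns dropped in the question are mostly fixed-length int columns, and the table has no indexes (it is a heap), so rebuilding the table itself is what actually reclaims that space (SQL Server 2008 and later; the names below are placeholders):

```sql
-- Rebuild the heap in place to reclaim space from dropped fixed-length columns
ALTER TABLE [dbo].[MyBigTable] REBUILD;

-- Then re-check the space used
EXEC sp_spaceused 'dbo.MyBigTable';
```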
The database size will not shrink simply because you have deleted objects. The database server usually keeps the reclaimed space to reuse for subsequent inserts or new objects.
To reclaim the freed space, you have to shrink the database file. See: How do I shrink my SQL Server Database?
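For completeness, a sketch of shrinking the data file (the logical file name and target size are placeholders; shrink sparingly, since it fragments indexes and the freed space will usually just be reused by the next load):

```sql
-- Find the logical file names and current sizes for the current database
SELECT name, size * 8 / 1024 AS size_mb
FROM sys.database_files;

-- Shrink the data file down to a target size in MB (placeholder name/size)
DBCC SHRINKFILE (N'MyDatabase_Data', 20000);
```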

What is the difference between OFFLINE and ONLINE index rebuild in SQL Server?

When rebuilding an index, there is an option for ONLINE=OFF and ONLINE=ON. I know that when ONLINE mode is on, SQL Server makes a copy of the index, switches new queries to using it, and then rebuilds the original index, tracking changes to both using versioning (correct me if I am wrong).
But what does SQL do in OFFLINE mode?
In ONLINE mode the new index is built while the old index remains accessible for reads and writes. Any update on the old index is also applied to the new index. An antimatter column is used to track possible conflicts between the updates and the rebuild (e.g. a delete of a row that was not yet copied). See Online Index Operations. When the process is completed, the table is locked for a brief period and the new index replaces the old one. If the index contains LOB columns, ONLINE operations are not supported in SQL Server 2005/2008/R2.
In OFFLINE mode the table is locked upfront for any read or write, and then the new index gets built from the old index, while holding a lock on the table. No read or write operation is permitted on the table while the index is being rebuilt. Only when the operation is done is the lock on the table released and reads and writes are allowed again.
Note that in SQL Server 2012 the restriction on LOBs was lifted, see Online Index Operations for indexes containing LOB columns.
The main differences are:
1) An OFFLINE index rebuild is faster than an ONLINE rebuild.
2) Extra disk space is required during an ONLINE index rebuild.
3) An ONLINE index rebuild still acquires locks: at the end of the operation, a schema modification (Sch-M) lock blocks all other concurrent access to the table, but it is held only for the very short period while the old index is dropped and the statistics are updated.
Online index rebuilds are less intrusive when it comes to locking tables. Offline rebuilds cause heavy locking of tables which can cause significant blocking issues for things that are trying to access the database while the rebuild takes place.
"Table locks are applied for the duration of the index operation [during an offline rebuild]. An offline index operation that creates, rebuilds, or drops a clustered, spatial, or XML index, or rebuilds or drops a nonclustered index, acquires a Schema modification (Sch-M) lock on the table. This prevents all user access to the underlying table for the duration of the operation. An offline index operation that creates a nonclustered index acquires a Shared (S) lock on the table. This prevents updates to the underlying table but allows read operations, such as SELECT statements."
http://msdn.microsoft.com/en-us/library/ms188388(v=sql.110).aspx
Additionally, online index rebuilds are an Enterprise (or Developer) edition-only feature.
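To make the syntax difference concrete (table and index names are placeholders; ONLINE = OFF is the default if the option is omitted):

```sql
-- OFFLINE rebuild (the default): holds a Sch-M lock for the whole operation,
-- blocking all reads and writes on the table until it finishes
ALTER INDEX [IX_MyIndex] ON [dbo].[MyTable] REBUILD WITH (ONLINE = OFF);

-- ONLINE rebuild: the table stays readable and writable during the rebuild;
-- Enterprise/Developer editions and Azure SQL Database only
ALTER INDEX [IX_MyIndex] ON [dbo].[MyTable] REBUILD WITH (ONLINE = ON);
```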