Heavily fragmented Indexes - SQL Server 2005 - sql

I recently inherited a poorly maintained production database with heavily fragmented indexes (most of them above 80% fragmentation). I requested downtime from my manager to perform an index rebuild, but unfortunately downtime is not allowed at the moment. If an online index reorganize is not an option either, can I do the following?
Restore a fresh production copy to a test instance
Rebuild Indexes, update statistics
Overwrite the prod database from the test instance
Apply transaction logs to bring the database back up to date.
Though the above method also requires downtime, it is relatively short. I wanted to know whether this can be done, or whether I am just being stupid :) Please advise
RK

SQL Server 2005 has online index rebuilds (that is, non-blocking).
Otherwise, the rebuild is never "offline" in the sense of taking the database or server offline, but it does hold an exclusive lock on the table/index being rebuilt.
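For illustration, a minimal rebuild/reorganize might look something like the sketch below (the table name is a placeholder, and on SQL Server 2005 the ONLINE option requires Enterprise Edition):

-- Rebuild every index on one table; ONLINE = ON keeps the table available
-- (placeholder table name; needs Enterprise Edition on SQL Server 2005)
ALTER INDEX ALL ON dbo.MyTable
REBUILD WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);

-- If a rebuild is not possible, a reorganize is always an online operation
ALTER INDEX ALL ON dbo.MyTable REORGANIZE;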

If most of the application's access is via seeks, then the fragmentation is not a problem. Otherwise I'd try to find some time in which the index rebuild will have minimal effect, and do it via a scheduled job. Surely the application isn't running 24/7/365 with poor maintenance, without the company expecting some problem to occur. (Do they change the oil on their cars?)
As for your 4-step solution: copying the data to another database, rebuilding the indexes, and copying it back won't accomplish anything more than just rebuilding the existing indexes in place. On copying it back the indexes are rebuilt anyway, so just try to schedule a couple of tables at a time until you get everything done.
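To help schedule a few tables at a time, a rough sketch of how you might list the worst offenders first (run it in the fragmented database; the 30% threshold is just a common rule of thumb):

-- List indexes above 30% fragmentation, worst first, so they can be
-- rebuilt a few at a time from a scheduled job
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30
ORDER BY ips.avg_fragmentation_in_percent DESC;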
Good luck.

Related

IBM i SQL - dump plan cache

I run a heavy query on IBM i. The first time it takes a long time; subsequent runs are much faster. It seems to be creating a temporary index. How can I remove this index so I can re-test as if it were the first run?
Use the Visual Explain (VE) tool in the Run SQL Scripts component of ACS to see the differences between runs.
If indeed the issue is a system maintained temporary index (MTI), you can track it down via the schema's tooling in ACS and delete it if you so desire.
However, an MTI only gets deleted by the system when the system reboots (IPL).
So if you are seeing differences without rebooting the server, I suspect the differences are caused by pseudo-closing. By default, once the DB sees the same query a few times (3 is the default), instead of hard closing its cursors, it will pseudo-close them.
Again, VE will show "hard opens" and "pseudo opens".
To get the pseudo-closed cursors to hard close, simply disconnect and reconnect.

How to continuously deliver a SQL-based app?

I'm looking to apply continuous delivery concepts to a web app we are building, and wondering if there is any solution for protecting the database from an accidental erroneous commit. For example, a bug that erases a whole table instead of a single record.
How can this issue's impact be limited according to continuous delivery doctrine, where the application is deployed gradually over segments of infrastructure?
Any ideas?
Well, first, you cannot tell just from looking what is a bad SQL statement. You might have wanted to delete the entire contents of the table. Therefore it is not physically possible to have an automated tool that detects intent.
So to protect your database, first make sure you are in full recovery (not simple) mode and have full backups nightly and transaction log backups every 15 minutes or so. Now you cannot lose much information no matter how badly the process breaks. Your dbas should be trained to be able to recover to a point in time. If you don't have any dbas, I'd suggest the best thing you can do to protect your data is hire some. This is a non-negotiable in any non-trivial database environment and it is terribly risky not to have trained, experienced dbas if your data is critical to the business.
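A rough sketch of that backup baseline (the database name and file paths are placeholders):

-- Make sure point-in-time recovery is possible (placeholder names/paths)
ALTER DATABASE MyAppDb SET RECOVERY FULL;

-- Nightly full backup
BACKUP DATABASE MyAppDb TO DISK = N'D:\Backups\MyAppDb_full.bak';

-- Log backup, scheduled every 15 minutes or so
BACKUP LOG MyAppDb TO DISK = N'D:\Backups\MyAppDb_log.trn';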
Next, you need to treat SQL like any other code, it should be in source control in scripts. If you are terribly concerned about accidental deletions, then write the scripts for deletes to copy all deletes to a staging table and delete the content of the staging table once a week or so. Enforce this convention in the code reviews. Or better yet set up an auditing process that runs through triggers. Once all records are audited, it is much easier to get back the 150 accidental deletions without having to restore a database. I would never consider having any enterprise application without auditing.
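As an illustration of the trigger-based audit idea (the table and column names here are made up, and the audit table has to exist first):

-- Copy deleted rows into an audit table so accidental deletes can be
-- recovered without a restore (hypothetical tables and columns)
CREATE TRIGGER trg_Orders_AuditDelete ON dbo.Orders
AFTER DELETE
AS
BEGIN
    INSERT INTO dbo.Orders_Deleted (OrderID, CustomerID, DeletedAt, DeletedBy)
    SELECT d.OrderID, d.CustomerID, GETDATE(), SUSER_SNAME()
    FROM deleted AS d;
END;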
All SQL scripts without exception should be code-reviewed just like other code. All SQL scripts should be tested on QA and passed before moving to production. This will greatly reduce the possibility of error. No developer should have write rights to production; only DBAs should have that. Therefore each script should be written so that it can just be run as a whole, not run one chunk at a time where you could accidentally forget to highlight the WHERE clause. Train your developers to use transactions correctly in the scripts as well.
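For example, a delete script written so it can be run in one go, with a sanity check before committing (the table name, key value, and expected row count are hypothetical):

-- The whole script runs as one unit; nothing to highlight by hand
BEGIN TRANSACTION;

DELETE FROM dbo.Orders
WHERE OrderID = 12345;           -- hypothetical target row

IF @@ROWCOUNT <> 1               -- expected exactly one row
BEGIN
    RAISERROR('Unexpected row count, rolling back.', 16, 1);
    ROLLBACK TRANSACTION;
END
ELSE
    COMMIT TRANSACTION;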
Your concern is bad data happening to the database. The solution is to use full logging of all transactions so you can back out of transactions that you want to. This would usually be used in a context of full backups/incremental backups/full logging.
SQL Server, for instance, allows you to restore to a point in time (http://msdn.microsoft.com/en-us/library/ms190982(v=sql.105).aspx), assuming you have full logging.
If you are creating and dropping tables, this could be an expensive solution, in terms of space needed for the log. However, it might meet your needs for development.
You may find that full-logging is too expensive for such an application. In that case, you might want to make periodic backups (daily? hourly?) and just keep these around. For this purpose, I've found LightSpeed to be a good product for fast and efficient backups.
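Picking up the point-in-time restore mentioned above, a hedged sketch of what it looks like in T-SQL (the database name, file paths, and timestamp are placeholders):

-- Restore the last full backup without recovering yet
RESTORE DATABASE MyAppDb
FROM DISK = N'D:\Backups\MyAppDb_full.bak'
WITH NORECOVERY, REPLACE;

-- Roll the log forward to just before the bad statement ran
RESTORE LOG MyAppDb
FROM DISK = N'D:\Backups\MyAppDb_log.trn'
WITH STOPAT = '2012-05-01T10:15:00', RECOVERY;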
One strategy that is commonly adopted is to log incremental SQL statements rather than a collective schema generation, so you can control the change at a much more granular level:
For example:
change 1:
    UP:   add column
    DOWN: remove column
change 2:
    UP:   add trigger
    DOWN: remove trigger
Once the changes are incrementally captured like this, you can have a simple but efficient script to upgrade (UP) from any version to any version without having to worry about which changes are involved. When the change numbers are linked to builds, it becomes even more effective: when you deploy a build, the database is automatically upgraded (UP) or downgraded (DOWN) to that specific build.
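For illustration, "change 1" above might be captured as a pair of plain SQL scripts along these lines (the table, column, and file names are made up):

-- change_001_up.sql (hypothetical file name)
ALTER TABLE dbo.Customers ADD LoyaltyPoints INT NULL;

-- change_001_down.sql
ALTER TABLE dbo.Customers DROP COLUMN LoyaltyPoints;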
We have a pipeline app which does that at CloudMunch.

Recommended approach how to modify schema of a production SQL database?

Say there is a database with 100+ tables and a major feature is added, which requires 20 existing tables to be modified and 30 more to be added. The changes were done over a long time (6 months) by multiple developers on the development database. Let's assume the changes do not make any existing production data invalid (e.g. there are default values/nulls allowed on added columns, and there are no new relations or constraints that could not be fulfilled).
What is the easiest way to publish these changes in schema to the production database? Preferably, without shutting the database down for an extended amount of time.
Write a T-SQL script that performs the needed changes. Test it on a copy of your production database (restore from a recent backup to get the copy). Fix the inevitable mistakes that the test will discover. Repeat until script works perfectly.
Then, when it's time for the actual migration: lock the DB so only admins can log in. Take a backup. Run the script. Verify results. Put DB back online.
The longest part will be the backup, but you'd be crazy not to do it. You should know how long backups take; the overall process won't take much longer than that, so that's roughly how long your downtime will need to be. The middle of the night works well for most businesses.
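One way to do the "lock the DB so only admins can log in" step, sketched out (the database name is a placeholder):

-- Kick out non-admin connections for the migration window
ALTER DATABASE MyAppDb SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;

-- ... take the backup, run the migration script, verify results ...

-- Put the database back online for everyone
ALTER DATABASE MyAppDb SET MULTI_USER;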
There is no generic answer on how to make 'changes' without downtime. The answer really depends from case to case, based on exactly what the changes are. Some changes have no impact on downtime (e.g. adding new tables), some have minimal impact (e.g. adding columns to existing tables with no data size change, like a new nullable column that does not increase the null bitmap size), and other changes will wreak havoc on downtime (any operation that changes data size will force an index rebuild and lock the table for the duration). Some changes are impossible to apply without *significant* downtime. I know of cases where the changes were applied in parallel: a copy of the database is created, replication is set up to keep it current, then the copy is changed and kept in sync, and finally operations are moved to the modified copy, which becomes the master database. There is a presentation from PASS 2009 given by Michelle Ufford that mentions how GoDaddy went through such a change that lasted weeks.
But, at a lesser scale, you must apply the changes through a well-tested script and measure the impact during the test runs.
But the real question is: is this going to be the last change you ever make to the schema? Have you finally discovered the perfect schema for the application, so the production database will never change again? Congratulations, once you pull this off you can rest. But realistically, you will face the very same problem in 6 months. The real problem is your development process, with developers making changes from SSMS or from the VS Server Explorer straight into the database. Your development process must make a conscious effort to adopt a schema change strategy based on versioning and T-SQL scripts, like the one described in Version Control and your Database.
Use a tool to create a diff script and run it during a maintenance window. I use RedGate SQL Compare for this and have been very happy with it.
I've been using dbdeploy successfully for a couple of years now. It allows your devs to create small sql change deltas which can then be applied against your database. The changes are tracked by a changelog table within database so that it knows what to apply.
http://dbdeploy.com/
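Purely as an illustration of the changelog-table idea mentioned above (dbdeploy ships its own changelog DDL, which may differ in detail from this sketch):

-- Illustrative sketch only; the real dbdeploy changelog table may differ
CREATE TABLE changelog (
    change_number INT          NOT NULL PRIMARY KEY,  -- matches the delta script number
    complete_dt   DATETIME     NOT NULL,              -- when the delta was applied
    applied_by    VARCHAR(100) NOT NULL,
    description   VARCHAR(500) NOT NULL
);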

What's the easiest way to add an index on a live myISAM table?

I have a MyISAM table running in production on MySQL, and by doing a few tests, we've found we can tremendously speed up a query by adding a certain compound index. So far so good. However, I am not really sure about the best way to add this index in a production environment without locking the table for a long time (it's got 27 GB of data, so not that much, but it does take a while).
Do you have any tips? If this was a more sophisticated setup of course we'd have a live replica of all of the data on another machine, and we could safely switch. Unfortunately, we're not there yet, and I would like to speed up this query as soon as possible (it's causing big customer headaches). Is there some simple way to replicate the data and then do a swap-out trick? Some other tricks that I am missing?
UPDATE: Reading about "Online Index Operations" in SQL Server makes me very jealous http://msdn.microsoft.com/en-us/library/ms191261.aspx :)
Thanks!
You can use replication to get downtime on the order of a couple of minutes, instead of the hours it might take to create an index on that table.
To set up the slave, see http://dev.mysql.com/doc/refman/5.0/en/replication-howto-existingdata.html
A recommendation I can make to help speed up the process: in step 2, follow the "Creating a Data Snapshot Using Raw Data Files" method, but instead of copying over the wire to the slave, copy to a different location on the master, and bring the master back up as soon as the copy is done and you've made the necessary changes to the config file (set server-id and enable binary logging). This will minimize your downtime to just a minute or two. Once the server is back up, you can copy the copied files to the slave box.
Once you have the slave up and running and you have verified everything is replicating properly, you can pause the slave. Create the index on the slave. When the index creation is complete, resume the slave; this will catch the slave up to the master. On the master, use FLUSH TABLES WITH READ LOCK. Check the slave status to make sure the log position on the master and the slave match. If they do, shut down the slave and copy the files for that table back to the master.
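The slave-side steps, roughly, as MySQL statements (the table and index names are placeholders):

-- On the slave: pause replication and build the index there
STOP SLAVE;
ALTER TABLE mytable ADD INDEX idx_col1_col2 (col1, col2);
START SLAVE;

-- Once the slave has caught up, on the master:
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;   -- compare the position with SHOW SLAVE STATUS on the slave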
I'm with Randy. We've been in a similar situation, and there are two ways in MySQL to accomplish something like this:
Take down the server while it runs. This is what you'll probably do. It's simple, it's easy, it works. Time to do? Maybe a half hour/45 minutes, dependent on disk bandwidth. See below.
Make a new table with the new index, copy all the data over, pause the server, delete the first table, rename the new one to the old name, start the server. Downtime? 10 minutes, maybe, but really complicated.
Option two works, and saves you the downtime of creating the index (if it takes a long time). But it takes more space and it's more complicated (since you have to deal with the new records inserted into the main table in the meantime, and it will probably lock on MyISAM while copying the data out). Deleting a table will take some time, and renaming the table will take some time. It's just really complicated. If you had a 2TB table this might be useful, but for 27GB it's probably overkill.
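Option two, sketched out (the names are placeholders, and you'd still need to reconcile rows inserted while the copy ran):

-- Build the new table with the extra index, then swap names
CREATE TABLE mytable_new LIKE mytable;
ALTER TABLE mytable_new ADD INDEX idx_col1_col2 (col1, col2);
INSERT INTO mytable_new SELECT * FROM mytable;

-- During the short pause: atomically swap the tables
RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;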
Do you have a second server that is close in specifications to your production server? Load up your most recent backup and do the index there, so you know about how long it will take to add. Then plan for downtime.
InnoDB is better about many things, but new indexes still lock the table. The abilities that MSSQL (and I think PostgreSQL) has to do those kinds of things without locking would be great.
Find your low usage window and take your application offline during the index build. Since you don't have replication or a multimaster or whatever, you're just going to have to bite the bullet on this one. See you at 1am. :-)
Not much you can do with one server here.
If you copy the table and do a dry run, at least you'll find out how long it's going to take without locking the live table, so you can schedule some maintenance time if necessary, or make a decision whether you can just push the button and leave users hanging for a couple of minutes :)
Or schedule it for a quiet time...
echo "/usr/bin/mysql -uXXX -pXXX -e 'alter table mytable add key(col1, col2)'" | at 04:00

SQL Server slow down after duplicating database

I recently moved a bunch of tables from an existing database into a new database due to the database getting rather large. After doing so I noticed a dramatic decrease in performance of my queries when running against the new database.
The process I took to recreate the new database is this:
Generate table CREATE scripts using SQL Server's automatic script generator.
Run the CREATE TABLE scripts.
Insert all data into the new database using INSERT INTO with a SELECT from the existing database.
Run all the ALTER scripts to create the foreign keys and any indexes.
Does anyone have any ideas of possible problems with my process, or some key step I'm missing that is causing this performance issue?
Thanks.
First, I would at a minimum make sure that auto create statistics is enabled; you can also set auto update statistics to true.
After that, I would update the stats by running
sp_updatestats
or
UPDATE STATISTICS
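For example, something along these lines (the database and table names are placeholders):

-- Make sure the new database creates and maintains statistics itself
ALTER DATABASE NewDb SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE NewDb SET AUTO_UPDATE_STATISTICS ON;

-- Then refresh everything once, database-wide
EXEC sp_updatestats;

-- Or per table, with a full scan
UPDATE STATISTICS dbo.MyTable WITH FULLSCAN;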
Also realize that the first time you hit the queries they will be slower because nothing will be cached in RAM. The second hit should be much faster.
Did you script the indexes from the tables in the original database? Missing indexes could certainly account for poor performance.
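A quick way to sanity-check that, assuming you can query both databases, is to run something like this in each one and compare the results:

-- List all non-heap indexes per table; run in the old and new database and diff
SELECT OBJECT_NAME(i.object_id) AS table_name, i.name AS index_name, i.type_desc
FROM sys.indexes AS i
WHERE i.index_id > 0              -- skip heap entries
ORDER BY table_name, index_name;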
Have you tried looking at the execution plans on each server when running these queries - that should allow you to easily see if they're doing something different e.g. table scanning due to a missing index, poor statistics etc.
Are both DBs sat on the same box with their data files on the same drive arrays?
Can you tell what about those queries got slower? New access plans? Same plans but they perform slower? Do they execute slower or are they suspended more? Did all queries get slower or just some? And last but not least, how do you know, i.e. what exactly did you measure and how?
Some of the usual suspects could be:
The new storage is much slower (.mdf on slow disk, or on a busy disk)
You changed the data structure during the move (i.e. some indexes did not get ported)
You changed the data size (i.e. compression options), resulting in more pages for the same data
Did anything else change at the same time, such as new app code or anything like that?
By extending the data size (you do not mention deleting the old tables) you are now thrashing the buffer pool (did the page life expectancy counter decrease in the performance counters? see the query sketch just after this list)
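One way to check that counter from T-SQL, as a rough sketch:

-- Buffer pool pressure check: a low Page life expectancy suggests the
-- buffer pool is being churned by the extra data
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy';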
Look at how you set up the initial size and growth options. If you didn't give the database enough space to begin with, or if you are growing by 1MB at a time, that could be a cause of performance issues.
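To check and adjust that, something along these lines (the logical file name and sizes are placeholders, not recommendations):

-- See the current size and growth settings in the new database
SELECT name, size, growth, is_percent_growth
FROM sys.database_files;

-- Pre-grow the data file and use a sensible growth increment instead of 1MB
ALTER DATABASE NewDb
MODIFY FILE (NAME = NewDb_Data, SIZE = 20GB, FILEGROWTH = 512MB);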