We have a table that has more than 20 million records and has more than 50 columns. I recently added a new column to it of type bit. After my change was done, some of the stored procedures that used this table were performing poorly. The DBA asked me to run the SP_Recompile 'tableName' command to update the table statistics. After I did that, the procedures were performing well. Could someone please explain what happens when a table is altered and a new column is added? How does it affect the performance?
This is actually explained in the documentation.
Firstly, sys.sp_recompile N'{Table Name}'; doesn't update the statistics of the table. From the documentation:
If object is the name of a table or view, all the stored procedures, triggers, or user-defined functions that reference the table or view will be recompiled the next time that they are run.
Recompiling means that the next time the query runs, its plan is regenerated; the old cached one is not used. Why you would want to do this is also discussed in the documentation:
The queries used by stored procedures, or triggers, and user-defined functions are optimized only when they are compiled. As indexes or other changes that affect statistics are made to the database, compiled stored procedures, triggers, and user-defined functions may lose efficiency. By recompiling stored procedures and triggers that act on a table, you can reoptimize the queries.
When you altered the table, that can have affected the statistics. However, so does ordinary day-to-day usage, where rows are inserted, updated, and deleted. It appears that this was the case here, and the plans the procedures were using were no longer the most efficient. Forcing them to recompile means that they can use the new statistics and a new (hopefully more efficient) plan.
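For reference, a minimal sketch of the command in question, with dbo.Orders as a placeholder table name:

-- Mark every stored procedure, trigger, and user-defined function that
-- references dbo.Orders for recompilation on its next execution.
EXEC sys.sp_recompile N'dbo.Orders';

-- sp_recompile does not update statistics; if the statistics themselves
-- are stale, refresh them separately.
UPDATE STATISTICS dbo.Orders;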
Related
Problem statement
I have a view that recursively collects and aggregates information from 3 different large to very large tables. The view itself takes quite some time to execute, but it is needed in many SELECT statements and is executed quite often.
The resulting view, however, is very small (a few dozen results in 2 columns).
All updating actions typically start a transaction, execute many thousands of INSERTs and then commit the transaction. This does not occur very frequently, but if something is written to the database it is usually a large amount of data.
What I tried
As the view is small, does not change frequently, and is read often, I thought of creating an indexed view. Sadly, however, you cannot create an indexed view that uses CTEs, let alone recursive CTEs.
To 'emulate' an indexed or materialized view, I thought about writing a trigger that executes the view and stores the results into a table every time one of the base tables is modified. However, I guess this would take forever if a large number of entries are UPDATEd or INSERTed and the trigger runs for each INSERT/UPDATE statement on those tables, even if they are inside a single transaction.
Actual question
Is it possible to write a trigger that runs once, before committing and after the last INSERT/UPDATE statement of a transaction has finished, and only if any of the statements changed any of the three tables?
No, there's no direct way to make a trigger that runs right before the end of a transaction. DML Triggers run once per triggering DML statement (INSERT, UPDATE, DELETE), and there's no other kind of trigger related to data modification.
Indirectly, you could have all of your INSERTs go into a temporary table and then INSERT them all together from the #temp table into the real table, resulting in one trigger firing for that table. But if you are writing to multiple tables, you would still have the same problem.
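A minimal sketch of that batching pattern (table and column names are hypothetical):

-- Stage the rows first; no triggers fire on the temp table.
CREATE TABLE #Staging (Col1 int, Col2 varchar(50));

INSERT INTO #Staging (Col1, Col2) VALUES (1, 'a');
INSERT INTO #Staging (Col1, Col2) VALUES (2, 'b');

-- A single INSERT into the real table, so its trigger fires only once.
INSERT INTO dbo.RealTable (Col1, Col2)
SELECT Col1, Col2 FROM #Staging;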
The SOP (Standard Operating Practice) way to address this is to have a stored procedure handle everything up front instead of a trigger trying to catch everything on the back side.
If data consistency is important, then I'd recommend that you follow the SOP approach based on a stored procedure that I mentioned above. Here's a high-level outline of this approach, with a code sketch after the list:
1. Use a stored procedure that dumps all of the changes into #temp tables first,
2. then start a transaction,
3. then make the changes, moving data/changes from your #temp table(s) into your actual tables,
4. then do the follow-up work you wanted in a trigger; if these are consistency checks and they fail, roll back the transaction,
5. otherwise, commit the transaction.
This is almost always how something like this is done correctly.
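A skeletal sketch of such a procedure, with all object names as placeholders and a made-up consistency check:

CREATE PROCEDURE dbo.usp_LoadBatch
AS
BEGIN
    SET NOCOUNT ON;

    -- 1. Stage the incoming changes (however they arrive) into a #temp table.
    CREATE TABLE #Incoming (Col1 int, Col2 varchar(50));
    -- ... populate #Incoming ...

    BEGIN TRANSACTION;

    -- 2. Move the staged rows into the real table(s).
    INSERT INTO dbo.RealTable (Col1, Col2)
    SELECT Col1, Col2 FROM #Incoming;

    -- 3. Do the follow-up work that would otherwise live in a trigger,
    --    e.g. refresh a summary table or run consistency checks.
    IF EXISTS (SELECT 1 FROM dbo.RealTable WHERE Col1 IS NULL)
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR('Consistency check failed; batch rolled back.', 16, 1);
        RETURN;
    END;

    COMMIT TRANSACTION;
END;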
If your view is small and queried frequently, and your underlying tables rarely change, you don't need a "view". Instead you need a summary table that holds the same result as the view and is updated by triggers on each underlying table.
A trigger fires on every data modification (INSERT, DELETE, and UPDATE), but one modification statement fires it only once, whether it touches one row or one million rows. You don't need to worry about the size of each update; the frequency of updates is the real concern.
If you have a procedure that periodically inserts a large number of rows, or updates a large number of rows one by one, you can change that procedure to disable the triggers before the update, so the summary table is refreshed only once near the end of the procedure, where you call the same "summary" procedure and re-enable those triggers.
If you HAVE TO keep the "summary" up to date all the time, even during a large number of transactions (I doubt that is very helpful or practical if your view is slow to execute), you can disable those triggers and instead recalculate and update the summary table yourself after each transaction, in your procedure.
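A sketch of that disable-then-refresh pattern, with hypothetical trigger, table, and procedure names:

-- Turn off the per-statement summary maintenance for the bulk load.
DISABLE TRIGGER trg_BaseTable_SyncSummary ON dbo.BaseTable;

BEGIN TRANSACTION;
    -- ... many thousands of INSERT/UPDATE statements against dbo.BaseTable ...
COMMIT TRANSACTION;

-- Rebuild the summary table once instead of once per statement.
EXEC dbo.usp_RefreshSummary;

-- Restore normal trigger-based maintenance.
ENABLE TRIGGER trg_BaseTable_SyncSummary ON dbo.BaseTable;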
In my database I use stored procedures that delete rows from certain tables. What I want to know is whether those deletes affect the future performance of my database, considering that they are used quite often.
I use SQL Server.
The indexes might degrade (become fragmented) with every change.
Having the DELETE (or INSERT or UPDATE) execute inside a stored procedure does not change anything, for better or worse.
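If index fragmentation is the concern, a sketch of checking and addressing it (dbo.MyTable is a placeholder name):

-- Report fragmentation for every index on the table.
SELECT i.name, ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.MyTable'), NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id;

-- Reorganize (or REBUILD, if fragmentation is heavy) the table's indexes.
ALTER INDEX ALL ON dbo.MyTable REORGANIZE;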
I am currently trying to clean up some stored procedures. There are about 20 of them that look very similar and do many of the same things, but take in slightly different parameters and filter data differently.
For the most part, all of the stored procs start by loading some data into one or two table variables (which is generally where the procs differ). After that, the code for each sproc is more or less the same. They perform some logging and apply some additional common filters.
I wanted to at least turn the common pieces into stored procs so that the code would be easier to read and we wouldn't have to open 20 procedures to update the same line of SQL, but the use of table variables prevents it. We are using SQL Server 2005, and to my knowledge we cannot use table-valued parameters in our stored procedures.
We could, however, change all of the table variables to temp tables and reference them in the new common stored procedures. I am assuming that is a fairly common practice, but wanted to know if it was actually a good idea.
In the nested stored procedures, should I assume that a temp table has already been created elsewhere and just query off of it? I can test whether or not the table exists, but what if it doesn't? Is there a good alternative to this in 2005? Is it confusing for other developers who open one of the nested stored procs and see a temp table that is created elsewhere? Do I just need to add lots of informative comments?
In your nested proc, to be safe, you can check whether the table exists or not. If it does not exist, RAISERROR with some error message. Since you are using SQL Server 2005, a #temp table is the option you have. Commenting your code for ease of readability is never a bad practice.
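A minimal sketch of that existence check in the nested proc (the temp table name #WorkingSet and the message text are made up):

IF OBJECT_ID('tempdb..#WorkingSet') IS NULL
BEGIN
    -- The caller was supposed to create and fill #WorkingSet before calling us.
    RAISERROR('Expected temp table #WorkingSet does not exist.', 16, 1);
    RETURN;
END;

-- Safe to use the caller's temp table from here on.
SELECT COUNT(*) FROM #WorkingSet;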
As for naming conventions, follow whatever convention fits best in your organization (I just give the SP a proper name that reflects its purpose). If the code changes, then updating the comments accordingly is the developer's responsibility. Other than this, whatever you are doing looks correct to me.
I'm looking for a way to get all table creation and alteration queries attached to a database, in SQL Server 2000. Is this stored in a system table, or is there a built-in method to remake them?
Goal: to extract the schema for customizable backups.
My research so far turned up nothing. My Google-Fu is weak...
Note that I don't know that there's a way to specify which filegroup a stored procedure is on (other than the default). So what you may consider, in order to at least keep the script repository backup small, is:
1. create a filegroup called non_data_objects, and make it the default (instead of PRIMARY).
2. create a filegroup for each set of tables, and create those tables there.
3. back up each set of tables by filegroup, and always include a backup of non_data_objects so that you have the current set of procedures, functions etc. that belong to those tables (even though you'll also get the others). Because 1. will only contain the metadata for non-data objects, it should be relatively small.
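A sketch of such a filegroup backup (the database name, second filegroup name, and path are placeholders; non_data_objects is from step 1):

BACKUP DATABASE MyDb
    FILEGROUP = 'TableSetA',
    FILEGROUP = 'non_data_objects'
TO DISK = N'D:\Backups\MyDb_TableSetA.bak';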
You might also consider just using a different database for each set of tables. Other than using three-part naming in your scripts that need to reference the different sets, there really is no performance difference. And this makes your backup/recovery plan much simpler.
I have the following Oracle SQL:
Begin
-- tables
for c in (select table_name from user_tables) loop
execute immediate ('drop table '||c.table_name||' cascade constraints');
end loop;
-- sequences
for c in (select sequence_name from user_sequences) loop
execute immediate ('drop sequence '||c.sequence_name);
end loop;
End;
It was given to me by another dev, and I have no idea how it works, but it drops all tables in our database.
It works, but it takes forever!
I don't think dropping all of my tables should take that long. What's the deal? And, can this script be improved?
Note: There are somewhere around 100 tables.
"It works, but it takes forever!"
Forever in this case meaning less than three seconds a table :)
There is more to dropping a table than just dropping the table. There are dependent objects to drop as well: constraints, indexes, triggers, LOB or nested table storage, etc. There are views, synonyms, and stored procedures to invalidate. There are grants to be revoked. The table's space (and that of its indexes, etc.) has to be de-allocated.
All of this activity generates recursive SQL, queries which select from or update the data dictionary, and which can perform badly. Even if we don't use triggers, views, stored procs, the database still has to run the queries to establish their absence.
Unlike normal SQL, we cannot tune recursive SQL, but we can shape the environment to make it run quicker.
I'm presuming that this is a development database, in which objects get built and torn down on a regular basis, and that you're using 10g or higher.
Clear out the recycle bin.
SQL> purge recyclebin;
Gather statistics for the data dictionary (will require DBA privileges). These may already be gathered, as that is the default behaviour in 10g and 11g. Find out more.
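For example, connected as a suitably privileged user, one way to gather them is:
SQL> exec dbms_stats.gather_dictionary_stats;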
Once you have dictionary stats ensure you're using the cost-based optimizer. Ideally this should be set at the database level, but we can fix it at the session level:
SQL> alter session set optimizer_mode=choose;
I would try changing the DROP TABLE statement to use the Purge keyword. Since you are dropping all tables, you don't really need to cascade the constraints at the same time. This action is probably what is causing it to be slow. I don't have an instance of Oracle to test this with though, so it may throw an error.
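Untested, as noted, but the modified loop body might look like this; PURGE bypasses the recycle bin, though dropping without CASCADE CONSTRAINTS can fail if another table references the one being dropped:

for c in (select table_name from user_tables) loop
  execute immediate ('drop table '||c.table_name||' purge');
end loop;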
If it does throw an error, or not go faster, I would remove the Sequence drop commands to figure out which command is taking so much time.
Oracle's documentation on the DROP TABLE command is here.
One alternative is to drop the user instead of the individual tables etc., and recreate it if needed. It's generally more robust, as it drops all of the tables, views, procedures, sequences etc., and would probably be faster.
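A sketch of that approach (the schema name, password, and grants are placeholders, and it requires a more privileged account to run):

drop user app_schema cascade;
create user app_schema identified by some_password;
grant connect, resource to app_schema;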