In my database I am running DELETEs on certain tables through stored procedures. What I want to know is whether those deletes affect the future performance of my database, considering that I use them quite often.
I use SQL Server.
The indexes may degrade (e.g. become fragmented) with every change.
Having the DELETE (or INSERT or UPDATE) execute inside a stored procedure does not change anything, for better or worse.
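If fragmentation is the concern, here is a minimal sketch of how to check it and rebuild; the table name dbo.MyTable is a placeholder, not from the question:

-- Check fragmentation of all indexes on a (hypothetical) table.
SELECT i.name AS index_name,
       s.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.MyTable'), NULL, NULL, 'LIMITED') AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id;

-- Rebuild them if fragmentation is high (e.g. above roughly 30 percent).
ALTER INDEX ALL ON dbo.MyTable REBUILD;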
Related
We have a table that has more than 20 million records and has more than 50 columns. I recently added a new column to it of type bit. After my change was done, some of the stored procedures that used this table were performing poorly. The DBA asked me to run the SP_Recompile 'tableName' command to update the table statistics. After I did that, the procedures were performing well. Could someone please explain what happens when a table is altered and a new column is added? How does it affect the performance?
This is actually explained in the documentation.
Firstly, sys.sp_recompile N'{Table Name}'; doesn't update the statistics of the table. From the documentation:
If object is the name of a table or view, all the stored procedures, triggers, or user-defined functions that reference the table or view will be recompiled the next time that they are run.
Recompiling means that the next time the procedure runs, the query plan is regenerated; the old cached plan is not used. Why you would want to do this is also discussed in the documentation:
The queries used by stored procedures, or triggers, and user-defined functions are optimized only when they are compiled. As indexes or other changes that affect statistics are made to the database, compiled stored procedures, triggers, and user-defined functions may lose efficiency. By recompiling stored procedures and triggers that act on a table, you can reoptimize the queries.
When you altered the table, that can have affected the statistics. So, however, does ordinary day-to-day usage where rows are inserted, updated, and deleted. It appears that this was the case here, and the plans the procedures were using were no longer the most efficient. Forcing them to recompile means that they can use the new statistics and a new (hopefully more efficient) plan.
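As a quick illustration (the table name dbo.BigTable is a placeholder), note the difference between marking plans for recompilation and actually refreshing statistics:

-- Marks every procedure, trigger, and function that references the table
-- for recompilation on its next execution; it does not touch statistics.
EXEC sys.sp_recompile N'dbo.BigTable';

-- Refreshing the statistics themselves is a separate command.
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;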
Working on redesigning some databases in my SQL Server 2012 instance.
I have databases where I put my raw data (from vendors) and then I have client databases where I will (based on client name) create a view that only shows data for a specific client.
Because this data is volatile (Google Adwords & Google DFA), I typically just delete the last 6 days and insert 7 days, every day, from the vendor databases. Doing this gives me comfort in knowing that Google has had time to solidify its data.
The question I am trying to answer is:
1. Instead of using views, would it be better to use a 'SELECT INTO' statement and DROP the table every day in the client database?
I'm afraid that automating my process using the 'DROP TABLE' method (sketched below) will not scale well long term. While testing it myself, it seems that performance is improved because it does not have to scan the entire table for the date range. I've also tested this with an index on the 'date' column, and performance still seemed better with the 'DROP TABLE' method.
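For reference, a sketch of that method; the database, table, and column names are placeholders, not my real schema:

-- Rebuild the client table from the vendor data each day.
IF OBJECT_ID('ClientDb.dbo.ClientData', 'U') IS NOT NULL
    DROP TABLE ClientDb.dbo.ClientData;

SELECT KeyId, Value1, ReportDate
INTO ClientDb.dbo.ClientData
FROM VendorDb.dbo.RawData
WHERE ClientName = 'SomeClient';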
I am looking for best practices here.
NOTE: This is my first post. So I am not too familiar with how to format correctly. :)
Deleting rows from a table is a time-consuming process. All the deleted records get logged, and performance of the server suffers.
Instead, databases offer truncate table. This removes all the rows of the table without logging the rows, but keeps the structure intact. Also, triggers, indexes, constraints, stored procedures, and so on are not affected by the removal of rows.
In some databases, if you delete all rows from a table, the operation is really a truncate table. However, SQL Server is not one of those databases. In fact, the documentation lists truncate as a best practice for deleting all rows:
To delete all the rows in a table, use TRUNCATE TABLE. TRUNCATE TABLE is faster than DELETE and uses fewer system and transaction log resources. TRUNCATE TABLE has restrictions, for example, the table cannot participate in replication. For more information, see TRUNCATE TABLE (Transact-SQL).
You can drop the table. But then you lose auxiliary metadata as well -- all the things listed above.
I would recommend that you truncate the table and reload the data using insert into or bulk insert.
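A minimal sketch of that truncate-and-reload pattern; the table and column names are placeholders, and the 7-day window follows the question:

-- Removes all rows; only the page deallocations are logged.
TRUNCATE TABLE ClientDb.dbo.ClientData;

-- Reload the last 7 days from the vendor data.
INSERT INTO ClientDb.dbo.ClientData (KeyId, Value1, ReportDate)
SELECT KeyId, Value1, ReportDate
FROM VendorDb.dbo.RawData
WHERE ReportDate >= DATEADD(DAY, -7, CAST(GETDATE() AS date))
  AND ClientName = 'SomeClient';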
I have a big table with around 70 columns in SQL Server 2008. A multithreaded .NET application is calling a stored proc on the database to insert into / update the table. The frequency is around 3 times a second.
I have made weekly partitions on the table, since almost every query has a datetime constraint on it.
Sometimes it takes a long time to insert/update the table. I suspect that sometimes an INSERT makes an UPDATE wait, and sometimes an UPDATE makes an INSERT wait. Is that possible?
How can I design the table to avoid such locks? Performance is the main issue here.
You're right that you're probably hitting lock contention that causes things to wait. A couple of things to check first:
Are your indexes correct?
If your DB is in 'Full' recovery mode, do you need it? Simple recovery really speeds up inserts/updates, but you lose point-in-time restores for backups.
Are you likely to have multiple threads writing the same record? If not, NOLOCK might be your friend here, but it would mean your data might be inconsistent for a second or two on occasion.
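A rough sketch of those last two points; MyDb, dbo.BigTable, and EventTime are placeholders, and READ_COMMITTED_SNAPSHOT is an alternative I'd consider rather than something from the question:

-- Simple recovery reduces log overhead if point-in-time restore isn't needed.
ALTER DATABASE MyDb SET RECOVERY SIMPLE;

-- Dirty reads with NOLOCK: readers don't block writers, but may see uncommitted data.
SELECT TOP (100) *
FROM dbo.BigTable WITH (NOLOCK)
WHERE EventTime >= DATEADD(HOUR, -1, GETDATE());

-- Alternative: row-versioned reads, so readers never block writers at all.
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;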
I have an application, written by another team in our company, that inserts data into one table.
Let's say they write data into table Log1 with fields:
Id (auto-generated primary key);
KeyId;
Value1;
Value2;
Value3.
Now I need an additional record in another table (Log2) that holds only part of their data:
Id (it will be my own auto-generated Id);
KeyId;
Value1.
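For concreteness, a sketch of what the two tables might look like; the column types are my assumptions, not the real schema:

CREATE TABLE dbo.Log1
(
    Id     int IDENTITY(1,1) PRIMARY KEY,  -- their auto-generated key
    KeyId  int          NOT NULL,
    Value1 varchar(100) NULL,
    Value2 varchar(100) NULL,
    Value3 varchar(100) NULL
);

CREATE TABLE dbo.Log2
(
    Id     int IDENTITY(1,1) PRIMARY KEY,  -- my own auto-generated Id
    KeyId  int          NOT NULL,
    Value1 varchar(100) NULL
);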
I see 2 ways to do that:
Create a trigger that, on inserting records into Log1, will automatically create a record in Log2 with the required data;
Implement a stored procedure that accepts all the required data for the Log1 table and creates records in both tables, then ask the application's authors to use the stored procedure instead of a direct INSERT query.
What do you think is the best way in this case and why?
Thank you very much for your help.
P.S. I'm using MS SQL 2005
Go with option 1.
It means that the tables will be synchronised properly even if the "correct" stored procedure interface isn't used, and it will be easier and more efficient to insert multiple rows. (How would you do this with a stored procedure in SQL Server 2005? Call it multiple times? Convert all the data to XML format first?)
If you use a trigger, be aware that since both Log1 and Log2 appear to use identity columns, you can't use SELECT @@IDENTITY to return the PK of Log1 - you will need to use SCOPE_IDENTITY().
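A minimal sketch of such a trigger, using the column names from the question (the trigger name is illustrative):

CREATE TRIGGER trg_Log1_Insert ON dbo.Log1
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- Copy only the columns Log2 needs; handles multi-row inserts as well.
    INSERT INTO dbo.Log2 (KeyId, Value1)
    SELECT KeyId, Value1
    FROM inserted;
END;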
On the other hand, if you use a SPROC, what you can do is revoke INSERT privileges on your table from (just about) everyone, and instead grant EXEC on your SPROC. This way, access to your table should be fairly well guarded.
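A sketch of that arrangement; the procedure name, parameter types, and the LoggingApps role are illustrative, not from the question:

CREATE PROCEDURE dbo.InsertLog
    @KeyId  int,
    @Value1 varchar(100),
    @Value2 varchar(100),
    @Value3 varchar(100)
AS
BEGIN
    SET NOCOUNT ON;
    -- One call writes both tables, so they cannot drift apart.
    INSERT INTO dbo.Log1 (KeyId, Value1, Value2, Value3)
    VALUES (@KeyId, @Value1, @Value2, @Value3);

    INSERT INTO dbo.Log2 (KeyId, Value1)
    VALUES (@KeyId, @Value1);
END;
GO

-- Callers may execute the procedure but not write to the tables directly.
DENY INSERT ON dbo.Log1 TO LoggingApps;
DENY INSERT ON dbo.Log2 TO LoggingApps;
GRANT EXECUTE ON dbo.InsertLog TO LoggingApps;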
The only way to really guarantee your data integrity is with a trigger. There is always a chance that someone will execute an operation (bulk operation, sql insert statement, etc.) that will bypass your SP.
Go with option 2.
Triggers should be avoided whenever possible.
One not-so-obvious reason: have you ever used SQL Server's replication facilities? Triggers won't be very straightforward to replicate (i.e. it is not as easy as a couple of clicks, as it is for tables, for instance). But I'm going off topic ... bottom line: triggers are evil; avoid them when you can.
EDIT
More reasons: triggers are not as easy to see as other objects in the DBMS. On the application side they are invisible, and if not well documented, they tend to be forgotten. If there are changes to the schema ... oh well, it's just easier to maintain stuff with stored procedures.
I'm running the following SAS command:
Proc SQL;
Delete From Server003.CustomerList;
Quit;
Which is taking over 8 minutes... when it takes only a few seconds to read that file. What could be causing the delete to take so long, and what can I do to make it go faster?
(I do not have access to drop the table, so I can only delete all rows)
Thanks,
Dan
Edit: I also apparently cannot Truncate tables.
This is NOT regular SQL. SAS's PROC SQL does not support the TRUNCATE statement. Ideally, you want to figure out what's going on with the performance of the DELETE FROM; but if what you really need is truncate functionality, you could always just use pure SAS and not mess with SQL at all.
data Server003.CustomerList;
set Server003.CustomerList (obs=0);
run;
This effectively performs and operates like a TRUNCATE would. It maintains the dataset/table structure but does not populate it with any data (due to the OBS= option).
Are there a lot of other tables which have foreign keys to this table? If those tables don't have indexes on the foreign key column(s), then it could take a while for SQL to determine whether or not it's safe to delete the rows, even if none of the other tables actually has a value in the foreign key column(s).
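A small sketch of the kind of supporting index meant here; the table and column names are made up for illustration:

-- Without this index, every delete in the parent table scans Orders
-- to verify that no child row still references the deleted key.
CREATE INDEX IX_Orders_CustomerId ON dbo.Orders (CustomerId);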
Try adding this to your LIBNAME statement:
DIRECT_EXE=DELETE
According to SAS/ACCESS(R) 9.2 for Relational Databases: Reference,
Performance improves significantly by using DIRECT_EXE=, because the SQL delete statement is passed directly to the DBMS, instead of SAS reading the entire result set and deleting one row at a time.
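As a sketch of where that option goes; the ODBC engine and DSN are placeholders for whatever connection Server003 actually uses:

/* Engine and connection details are assumptions; DIRECT_EXE= is the point. */
libname Server003 odbc dsn="MyDSN" DIRECT_EXE=DELETE;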
I would also mention that, in general, SQL commands run slower in SAS PROC SQL. Recently I did a project and moved the TRUNCATE TABLE statements into a stored procedure to avoid the penalty of having them inside SAS and being handled by its SQL optimizer and surrounding execution shell. In the end this increased the performance of the TRUNCATE TABLE substantially.
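On the database side that can be as small as this; the procedure name is illustrative, not from my project:

-- Stored in SQL Server and called from SAS, so the truncate is not
-- rewritten or row-processed by the SAS SQL layer.
CREATE PROCEDURE dbo.ResetCustomerList
AS
BEGIN
    SET NOCOUNT ON;
    TRUNCATE TABLE dbo.CustomerList;
END;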
It might be slower because disk writes are typically slower than reads.
As for a way around it without dropping/truncating, good question! :)
You could also consider the elegant:
proc sql; create table libname.tablename like libname.tablename; quit;
It will produce a new table with the same name and the same metadata as your previous table, and delete the old one in the same operation.