Stored procs breaking overnight (SQL Server 2005)

We are running MS SQL 2005 and we have been experiencing a very peculiar problem the past few days.
I have two procs: one creates an hourly report of data; the other calls it, puts its results in a temp table, does some aggregations, and returns a summary.
They work fine...until the next morning.
The next morning, the calling proc suddenly complains about an invalid column name.
The fix is simply a recompile of the calling proc, and all works well again.
How can this happen? It's happened three nights in a row since moving these procs into production.
EDIT: It appears that it's not a recompile of the caller (summary) proc that is needed. I was just able to fix the problem by executing the callee (hourly) proc on its own, then executing the summary proc. This makes less sense than before.
EDIT2:
The hourly proc is rather large, and I'm not posting it here in its entirety. But at the end, it does a SELECT INTO, then conditionally returns the appropriate result(s) from the created temp table.
SELECT [large column list]
INTO #tmpResults
FROM #DailySales8
WHERE DATEPART(hour, RowStartTime) >= @StartHour
  AND DATEPART(hour, RowStartTime) < @EndHour
  AND DATEPART(hour, RowStartTime) <= @LastHour

IF @UntilHour IS NOT NULL
   AND EXISTS (SELECT * FROM #tmpResults WHERE DATEPART(hour, RowEndTime) = @UntilHour) BEGIN
    SELECT *
    FROM #tmpResults
    WHERE DATEPART(hour, RowEndTime) = @UntilHour
END ELSE IF @JustLastFullHour = 1 BEGIN
    DECLARE @MaxHour INT
    SELECT @MaxHour = MAX(DATEPART(hour, RowEndTime)) FROM #tmpResults
    IF @LastHour > 24 SELECT @LastHour = @MaxHour

    SELECT *
    FROM #tmpResults
    WHERE DATEPART(hour, RowEndTime) = @LastHour

    IF @@ROWCOUNT = 0 BEGIN
        SELECT *
        FROM #tmpResults
        WHERE DATEPART(hour, RowEndTime) = @MaxHour
    END
END ELSE BEGIN
    SELECT * FROM #tmpResults
END
Then it drops all temp tables and ends.
The caller (Summary)
It first creates a temp table, #tmpTodaysSales, to store the results; the column list DOES MATCH the definition of #tmpResults in the other proc. Then it ends up calling the hourly proc a couple of times:
INSERT #tmpTodaysSales
EXEC HourlyProc @LocationCode, @ReportDate, NULL, 1

INSERT #tmpTodaysSales
EXEC HourlyProc @LocationCode, @LastWeekReportDate, @LastHour, 0
I believe it is these calls that fail, but recompiling the summary proc, or executing the hourly proc on its own and then calling the summary proc, fixes the problem.
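A stopgap while diagnosing is to force fresh plans before the morning run. A minimal sketch (sp_recompile only flags the cached plan for recompilation; 'SummaryProc' is a stand-in for whatever the calling proc is actually named):

EXEC sp_recompile 'HourlyProc';
EXEC sp_recompile 'SummaryProc'; -- hypothetical name for the summary (calling) proc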

Two questions:
Does the schema of #DailySales8 vary at all? Does it have any direct/indirect dependence on the date of execution, or on any of the parameters supplied to HourlyProc?
Which execution of INSERT #tmpTodaysSales EXEC HourlyProc ... in the summary fails - first or second?

What do the overnight maintenance plans look like, and are there any other scheduled overnight jobs that run between 2230 and 1000 the next day? It's possible that a step in the maintenance plan or another agent job is causing some kind of corruption that's breaking your SP.
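To inventory what runs overnight, a sketch against msdb may help (this assumes SQL Server Agent jobs; maintenance plans surface here as jobs too):

-- list enabled Agent jobs and their next scheduled run
SELECT j.name, s.next_run_date, s.next_run_time
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobschedules s ON s.job_id = j.job_id
WHERE j.enabled = 1
ORDER BY s.next_run_date, s.next_run_time;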


Optimize delete SQL query with unordered table

I am attempting a mass delete of old data from a huge table with 80,000,000 rows; about 50,000,000 rows will be removed. This will be done in batches of 50k to avoid database log overflow. Also, the rows of the table are not sorted chronologically. I've come up with the following script:
BEGIN
    DECLARE @START_TIME DATETIME,
            @END_TIME DATETIME,
            @DELETE_COUNT NUMERIC(10,0),
            @TOTAL_COUNT NUMERIC(10,0),
            @TO_DATE DATETIME,
            @FROM_DATE DATETIME,
            @TABLE_SIZE INT

    SELECT @START_TIME = GETDATE()
    PRINT 'Delete script Execution START TIME = %1!', @START_TIME

    SELECT @TABLE_SIZE = COUNT(*) FROM HUGE_TABLE
    PRINT 'Number of rows in HUGE_TABLE = %1!', @TABLE_SIZE

    SELECT @DELETE_COUNT = 1,
           @TOTAL_COUNT = 0,
           @TO_DATE = DATEADD(yy, -2, GETDATE())

    CREATE TABLE #TMP_BATCH_FOR_DEL (REQUEST_DT DATETIME)

    WHILE (@DELETE_COUNT > 0)
    BEGIN
        DELETE FROM #TMP_BATCH_FOR_DEL

        INSERT INTO #TMP_BATCH_FOR_DEL (REQUEST_DT)
        SELECT TOP 50000 REQUEST_DT
        FROM HUGE_TABLE
        WHERE REQUEST_DT < @TO_DATE
        ORDER BY REQUEST_DT DESC

        SELECT @FROM_DATE = MIN(REQUEST_DT), @TO_DATE = MAX(REQUEST_DT)
        FROM #TMP_BATCH_FOR_DEL

        PRINT 'Deleting data from %1! to %2!', @FROM_DATE, @TO_DATE

        DELETE FROM HUGE_TABLE
        WHERE REQUEST_DT BETWEEN @FROM_DATE AND @TO_DATE

        SELECT @DELETE_COUNT = @@ROWCOUNT
        SELECT @TOTAL_COUNT = @TOTAL_COUNT + @DELETE_COUNT
        SELECT @TO_DATE = @FROM_DATE

        COMMIT
        CHECKPOINT
    END

    SELECT @END_TIME = GETDATE()
    PRINT 'Delete script Execution END TIME = %1!', @END_TIME
    PRINT 'Total Rows deleted = %1!', @TOTAL_COUNT

    DROP TABLE #TMP_BATCH_FOR_DEL
END
GO
I did a practice run and found the above was deleting around 2,250,000 rows per hour, so it would take 24+ hours of continuous runtime to delete my data.
I know it's that darn ORDER BY clause within the loop that's slowing things down, but storing the ordered table in another temp table would take up too much memory. I can't think of a better way to do this, though.
Thoughts?
It is probably not the query itself. Your code is deleting about 600+ records per second. A lot is going on in that time: logging, locking, and so on.
A faster approach is to load the data you want to keep into a new table, truncate the old table, and reload it:
select *
into temp_huge_table
from huge_table
where request_dt > ?; -- whatever the cutoff is
Then -- after validating the results -- truncate the huge table and reload the data:
truncate table huge_table;
insert into huge_table
select *
from temp_huge_table;
If there is an identity column, you will want to enable identity insert for the reload. You might have to take other precautions if there are triggers that set values in the table, or if there are foreign key references to rows in the table.
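For the identity case, a minimal sketch, assuming huge_table has an identity column and the columns are listed explicitly (IDENTITY_INSERT requires an explicit column list; id and request_dt are placeholder names):

SET IDENTITY_INSERT huge_table ON;
INSERT INTO huge_table (id, request_dt /* , ...remaining columns */)
SELECT id, request_dt /* , ...remaining columns */
FROM temp_huge_table;
SET IDENTITY_INSERT huge_table OFF;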
I would not suggest doing this directly. After you have truncated the table, you should probably partition the table by date: by day, week, month, whatever.
Then, in the future, you can simply drop partitions rather than deleting rows. Dropping partitions is much, much faster.
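A minimal sketch of that pattern (names and boundary dates are placeholders; note that SQL Server doesn't drop a partition directly - you switch it out to a staging table with matching structure and filegroup, truncate that, and merge the empty range):

-- monthly partition function and scheme
CREATE PARTITION FUNCTION pfByMonth (DATETIME)
AS RANGE RIGHT FOR VALUES ('2020-01-01', '2020-02-01', '2020-03-01');

CREATE PARTITION SCHEME psByMonth
AS PARTITION pfByMonth ALL TO ([PRIMARY]);

-- retire the oldest slice
ALTER TABLE huge_table SWITCH PARTITION 1 TO huge_table_stage;
TRUNCATE TABLE huge_table_stage;
ALTER PARTITION FUNCTION pfByMonth() MERGE RANGE ('2020-01-01');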
Note that loading a few tens of millions of rows into an empty table is much, much faster than deleting them, but it still takes time (you can test how much on your system). This requires downtime for the table. However, you hopefully have a maintenance period where this is possible.
And, the downtime can be justified by partitioning the table so you won't have this issue in the future.
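If you do stay with a batched delete, one more option: on SQL Server, DELETE TOP (n) needs neither the ORDER BY nor the staging temp table. A sketch, assuming it's acceptable to remove any 50k qualifying rows per pass:

DECLARE @TO_DATE DATETIME;
SELECT @TO_DATE = DATEADD(yy, -2, GETDATE());

WHILE 1 = 1
BEGIN
    DELETE TOP (50000) FROM HUGE_TABLE
    WHERE REQUEST_DT < @TO_DATE;
    IF @@ROWCOUNT = 0 BREAK;
    CHECKPOINT; -- as in the original script
END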
Maybe you can optimize your query by inserting the 30,000,000 records you want to keep into another table, which becomes your new "huge table", and then dropping the whole old "huge table" altogether.
Best Regards
LK

Stored Procedure for batch delete in Firebird

I need to delete a bunch of records (literally millions), but I don't want to do it in a single statement because of performance issues. So I created a view:
CREATE VIEW V1
AS
SELECT FIRST 500000 *
FROM TABLE
WHERE W_ID = 14
After that I do a bunch of deletes, for example:
DELETE FROM V1 WHERE TS < '2021-01-01'
What I want is to put this logic in a WHILE loop inside a stored procedure. I tried a SELECT COUNT query like this:
SELECT COUNT(*)
FROM TABLE
WHERE W_ID = 14 AND TS < '2021-01-01';
Can I use this count as a condition in the same procedure, and how can I manage that?
This is what I have tried and I get an error
ERROR: Dynamic SQL Error; SQL error code = -104; Token unknown; WHILE
Code:
CREATE PROCEDURE DeleteBatch
AS
DECLARE VARIABLE CNT INT;
BEGIN
    SELECT COUNT(*) FROM TABLE WHERE W_ID = 14 AND TS < '2021-01-01' INTO :cnt;
    WHILE cnt > 0 do
    BEGIN
        IF (cnt > 0) THEN
            DELETE FROM V1 WHERE TS < '2021-01-01';
        END
    ELSE break;
END
I just can't wrap my head around this.
To clarify: in my previous question I wanted to know how to manage garbage collection after many deleted records, and I did what was suggested - SELECT * FROM TABLE; or gfix -sweep - and that worked very well. As mentioned in the comments, the correct statement is SELECT COUNT(*) FROM TABLE;
After that, another, even bigger database was given to me - over 50 million rows - and it was very slow to operate on. I managed to get the server it was on killed with a DELETE statement while cleaning the database.
That's why I wanted to try deleting in batches. The slow-down problem there was purely hardware - the HDD had died, and we replaced it. After that there was no problem executing statements and doing a backup and restore to reclaim disk space.
Provided the data you need to delete never has to be rolled back once the stored procedure is kicked off, there is another way to handle massive DELETEs in a stored procedure.
The example stored procedure will delete the rows 500,000 at a time and loop until there aren't any more rows to delete. The AUTONOMOUS TRANSACTION lets you put each delete statement in its own transaction, which commits immediately after the statement completes. This is effectively an implicit commit inside a stored procedure, which you normally can't do.
CREATE OR ALTER PROCEDURE DELETE_TABLEXYZ_ROWS
AS
DECLARE VARIABLE RC INTEGER;
BEGIN
    RC = 9999;
    WHILE (RC > 0) DO
    BEGIN
        IN AUTONOMOUS TRANSACTION DO
        BEGIN
            DELETE FROM TABLEXYZ ROWS 500000;
            RC = ROW_COUNT;
        END
    END

    -- full scan; also forces garbage collection of the deleted record versions
    SELECT COUNT(*)
    FROM TABLEXYZ
    INTO :RC;
END
because of performance issues
What are those exactly? I do not think you are actually improving performance by just running the delete in loops within the same transaction, or even in different transactions within the same timespan. You seem to be solving the wrong problem. The issue is not how you create "garbage", but how and when Firebird "collects" it.
For example, SELECT COUNT(*) in InterBase/Firebird engines means a natural scan over the whole table, and garbage collection is often triggered by it, which can itself take a long time if a lot of garbage was created (and a massive delete surely creates it, no matter whether it is done as one million-row statement or a million one-row statements).
How to delete large data from Firebird SQL database
If you really want to slow down deletion, you have to spread that activity around the clock and make your client application call a deleting SP, for example, once every 15 minutes. You would have to add some column to the table flagging rows marked for deletion, and then do the job like this:
CREATE PROCEDURE DeleteBatch(CNT INT)
AS
DECLARE ROW_ID INTEGER;
BEGIN
    FOR SELECT ID FROM TABLENAME WHERE MARKED_TO_DEL > 0 INTO :ROW_ID
    DO BEGIN
        CNT = CNT - 1;
        DELETE FROM TABLENAME WHERE ID = :ROW_ID;
        IF (CNT <= 0) THEN LEAVE;
    END
    SELECT COUNT(1) FROM TABLENAME INTO :ROW_ID; /* force GC now */
END
...and every 15 minutes you do EXECUTE PROCEDURE DeleteBatch(1000).
Overall this probably would only be slower, because of single-row "precision targeting" - but at least it would spread the delays.
Use DELETE...ROWS.
https://firebirdsql.org/file/documentation/html/en/refdocs/fblangref25/firebird-25-language-reference.html#fblangref25-dml-delete-orderby
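For example, a sketch reusing the question's filter (the ROWS clause caps how many matching rows a single statement touches):

DELETE FROM TABLE
WHERE W_ID = 14 AND TS < '2021-01-01'
ROWS 500000;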
But as I already said in the answer to the previous question, it is better to spend time investigating the source of the slowdown instead of working around it by deleting data.

How to track changes in development environment in SQL Server

Our development team works on SQL Server and writes stored procedures for our product.
We need something like a version control system for those procedures or any other objects.
Sometimes I change a stored procedure, then someone else on my team changes it, and I don't know anything about it.
Is there any solution for that?
If you want to do it via code you could run this on a daily or hourly basis to get a list of all procs that were changed in the last day:
select *
from sys.objects
where (datediff(dd, create_date, getdate()) < 1
       or datediff(dd, modify_date, getdate()) < 1)
  and type = 'P';
Or you could create a DDL trigger:
Create trigger prochanged On database
For create_procedure, alter_procedure, drop_procedure
As
Begin
    Set nocount on
    Declare @data xml
    Set @data = Eventdata()
    -- save @data to a table...
End
This will allow you to save all kinds of information every time a proc is created, changed or deleted.
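A possible completion of the "-- save @data to a table..." step - a sketch only; the DDL_Log table and its columns are my invention, not a standard schema:

CREATE TABLE DDL_Log (
    EventTime DATETIME NOT NULL DEFAULT GETDATE(),
    LoginName SYSNAME NOT NULL DEFAULT SUSER_SNAME(),
    EventData XML NULL
);

-- inside the trigger body, after Set @data = Eventdata():
INSERT INTO DDL_Log (EventData) VALUES (@data);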

How to force a running t-sql query (half done) to commit?

I have a database on SQL Server 2008 R2.
On that database, a delete query against 400 million records has been running for 4 days, but I need to reboot the machine. How can I force it to commit whatever has been deleted so far? I don't want the deletions already made to be rolled back.
But the problem is that the query is still running and will not complete before the server reboot.
Note: I have not set any isolation level or explicit BEGIN/END TRANSACTION for the query. It is running in SSMS.
If the machine reboots or I cancel the query, the database will go into recovery mode and keep recovering for the next 2 days; then I need to re-run the delete, and it will cost me another 4 days.
I really appreciate any suggestion / help or guidance in this.
I am novice user of sql server.
Thanks in Advance
Regards
There is no way to stop SQL Server from trying to bring the database into a transactionally consistent state. Every single statement is implicitly a transaction itself (if not part of an outer transaction) and executes either completely or not at all. So whether you cancel the query, disconnect, or reboot the server, SQL Server will use the transaction log to write the original values back to the updated data pages.
Next time you delete that many rows, don't do it all at once. Divide the job into smaller chunks (I always use 5,000 as a magic number, meaning I delete 5,000 rows at a time in a loop) to minimize transaction log use and locking:
SET ROWCOUNT 5000
DELETE YourTable
WHILE @@ROWCOUNT = 5000
    DELETE YourTable
SET ROWCOUNT 0
If you are deleting that many rows you may have a better time with TRUNCATE, which removes all rows from a table very efficiently. However, I'm assuming that you would like to keep some of the records. The stored procedure below backs up the data you would like to keep into a temp table, truncates the big table, then re-inserts the records that were saved. This can clean a huge table very quickly.
Note that TRUNCATE doesn't play well with foreign key constraints, so you may need to drop those and recreate them after the clean.
CREATE PROCEDURE [dbo].[deleteTableFast] (
    @TableName VARCHAR(100),
    @WhereClause VARCHAR(1000))
AS
BEGIN
    -- input:
    --   table name: is the table to use
    --   where clause: is the where clause of the records to KEEP
    DECLARE @tempTableName VARCHAR(100);
    SET @tempTableName = @TableName + '_temp_to_truncate';

    -- error checking
    IF EXISTS (SELECT [Table_Name] FROM Information_Schema.COLUMNS WHERE [TABLE_NAME] = @tempTableName) BEGIN
        PRINT 'ERROR: already temp table ... exiting'
        RETURN
    END
    IF NOT EXISTS (SELECT [Table_Name] FROM Information_Schema.COLUMNS WHERE [TABLE_NAME] = @TableName) BEGIN
        PRINT 'ERROR: table does not exist ... exiting'
        RETURN
    END

    -- save wanted records via a temp table to be able to truncate
    EXEC ('select * into ' + @tempTableName + ' from ' + @TableName + ' WHERE ' + @WhereClause);
    EXEC ('truncate table ' + @TableName);
    EXEC ('insert into ' + @TableName + ' select * from ' + @tempTableName);
    EXEC ('drop table ' + @tempTableName);
END
GO
You must first understand the D (Durability) in ACID to see why the database goes into recovery mode.
Generally speaking, you should avoid long-running SQL if possible. Long-running SQL means more lock time on resources, a larger transaction log, and a huge rollback time when it fails.
Consider dividing your task by some ID or time range. For example, if you want to insert a large volume of data from TableSrc into TableTarget, you can write a query like:
DECLARE @BatchCount INT = 1000;
DECLARE @Id INT = 0;
DECLARE @Max INT = ...;

WHILE @Id < @Max
BEGIN
    INSERT INTO TableTarget
    SELECT *
    FROM TableSrc
    WHERE PrimaryKey >= @Id AND PrimaryKey < @Id + @BatchCount;
    SET @Id = @Id + @BatchCount;
END
It's uglier, more code, and more error prone, but it's the only way I know to deal with huge data volumes.

Linking a simple query to a script executing several stored procedures

Sorry I'm a bit new to this so just trying to get my head around linking everything up.
At the moment I have a normal SELECT ... FROM ... WHERE query which finds about 2,000 records that I need to update, linked across several tables.
Can someone tell me how I can link this simple query to something else so I can execute several stored procedures, all in the same script, affecting only the records returned by my simple query?
Apologies, that probably sounds as clear as mud!
EDIT - MORE DETAIL
So here is my Select query:
SELECT [MembershipTermID]
,[MemberStatusProgKey]
,[StartDate]
,[EndDate]
,[AdditionalDiscount]
,[EntryDateTime]
,[UpdateDateTime]
,[MembershipID]
,[AgentID]
,[PlanVersionID]
,[ForceThroughReference]
,[IsForceThrough]
,[NextTermPrePaid]
,[IsBillingMonthly]
,[CICSMEMBERNUM]
,[CICSHISTORY]
,[TMPSeqNoColumn]
,[LastPaymentDate]
,[PaidToDate]
,[IsIndeterminate]
,DATEDIFF(MONTH, PaidToDate, GETDATE()) as MonthsDifference
,dbo.FullMonthsSeparation (PaidToDate, GETDATE())
FROM [Apollo].[dbo].[MembershipTerm]
WHERE MemberStatusProgKey='DORMANT'
AND IsBillingMonthly=1
AND dbo.FullMonthsSeparation (PaidToDate, GETDATE()) >= 2
So, using the rows that this returns, I want to exec several stored procedures to update everything I need to in the database that would be affected by changing these rows. An example of one stored procedure is below; I think I will need to execute about 10 of these, if not more:
USE [Apollo]
GO
/****** Object: StoredProcedure [dbo].[spCancellationDetailInsert] Script Date: 01/10/2012 10:21:50 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
/* ************************* INSERT *************************/
/* Auto Generated 11/29/2006 7:28:53 PM by Object Builder */
/* ************************* INSERT *************************/
ALTER Procedure [dbo].[spCancellationDetailInsert]
    @StampUser char (10),
    @CancellationDetailID int,
    @RefundAmount float,
    @OldEndDate datetime,
    @EffectiveDate datetime,
    @CancelDate datetime,
    @ReasonCodeProgKey nvarchar (50)
As
/* insert CancellationDetail record */
Insert [CancellationDetail]
(
    RefundAmount,
    OldEndDate,
    EffectiveDate,
    CancelDate,
    ReasonCodeProgKey
)
Values
(
    @RefundAmount,
    @OldEndDate,
    @EffectiveDate,
    @CancelDate,
    @ReasonCodeProgKey
)
If @@Error <> 0 GoTo InsertErrorHandler

/* save the key of the new row created by the insert */
Select @CancellationDetailID = Scope_Identity()

/* add audit record */
Insert CancellationDetailAudit
(
    StampUser,
    StampDateTime,
    StampAction,
    CancellationDetailID,
    RefundAmount,
    OldEndDate,
    EffectiveDate,
    CancelDate,
    ReasonCodeProgKey
)
Values
(
    @StampUser,
    GetDate(),
    'I',
    @CancellationDetailID,
    @RefundAmount,
    @OldEndDate,
    @EffectiveDate,
    @CancelDate,
    @ReasonCodeProgKey
)
If @@Error <> 0 GoTo AuditInsertErrorHandler

Select CancellationDetailID = @CancellationDetailID
Return (0)

InsertErrorHandler:
    Raiserror ('SQL Error whilst inserting CancellationDetail record: Error Code %d', 17, 1, @@Error)
    With Log
    Return (99)

AuditInsertErrorHandler:
    Raiserror ('SQL Error whilst inserting audit record for CancellationDetailInsert: Error Code %d', 17, 1, @@Error)
    With Log
    Return (99)
If you're asking what I think you are -
Stored procedures can contain (pretty much) any valid SQL statement. This includes returning multiple results sets, performing multiple updates and calling other stored procedures.
For example:
CREATE PROCEDURE usp_Sample AS
SELECT * FROM INFORMATION_SCHEMA.COLUMNS
SELECT * FROM INFORMATION_SCHEMA.TABLES
UPDATE Users SET Active = 0 WHERE ExpiredDate < GetDate()
SELECT Active, COUNT(*) FROM Users GROUP BY Active
EXEC usp_Sample2
GO
Obviously that's a rather artificial example, but assuming all the objects existed it'd run perfectly well.
In order to run more queries in the same script you just need to append them after your SELECT.
So you can do
Select *
From table1
Select *
From table2
Select *
From table3
as many times as you want and they'll all execute independently.
If you want to UPDATE based on a SELECT you usually do something like:
UPDATE table1
SET SomeColumn = SomeValue -- placeholder column and value
WHERE ID IN (SELECT ID FROM table2)
With regard to your stored procedures, it would help if you posted more details.
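That said, given the MembershipTerm query in your edit, one common pattern is to loop over the keys your SELECT returns with a cursor and call each proc once per row. A sketch (spSomeUpdateProc and its parameter are stand-ins for your actual procs and their signatures):

DECLARE @MembershipTermID INT;

DECLARE term_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT MembershipTermID
    FROM [Apollo].[dbo].[MembershipTerm]
    WHERE MemberStatusProgKey = 'DORMANT'
      AND IsBillingMonthly = 1
      AND dbo.FullMonthsSeparation(PaidToDate, GETDATE()) >= 2;

OPEN term_cursor;
FETCH NEXT FROM term_cursor INTO @MembershipTermID;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- EXEC dbo.spSomeUpdateProc @MembershipTermID; -- hypothetical call, repeat per proc
    FETCH NEXT FROM term_cursor INTO @MembershipTermID;
END

CLOSE term_cursor;
DEALLOCATE term_cursor;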