Run long-running sproc (that doesn't need to return) from ASP.NET page - sql-server-2000

I would like to know how you would run a stored procedure from a page and just "let it finish" even if the page is closed. It doesn't need to return any data.

A database-centric option would be:
Create a table that will contain a list (or queue) of long-running jobs to be performed.
Have the application add an entry to the queue if, when, and as desired. That's all it does; once logged and entered, no web session or state data need be maintained.
Have a SQL Agent job configured to check every 1, 2, 5, whatever minutes to see if there are any jobs to run.
If there are as-yet unstarted items, mark the most recent one as started, and start it.
When it's completed, mark it as completed, or just delete it.
Check if there are any other items to run. If there are, repeat; if not, exit the job.
Depending on capacity, you could have several (differently named) copies of this job running, concurrently processing items from the list.
(I've used this method for very long-running methods. It's more an admin-type trick, but it may be appropriate for your situation.)
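The queue-table steps above can be sketched roughly as follows. This is only an illustration, not the answerer's implementation: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table name job_queue, its columns, and the helper functions are all hypothetical. In the real setup, the loop in process_all would be the body of the SQL Agent job.

```python
import sqlite3

def create_queue(conn):
    # Hypothetical queue table: one row per long-running job request.
    conn.execute(
        """CREATE TABLE job_queue (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               proc_name TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    conn.commit()

def enqueue(conn, proc_name):
    # This is all the web page does: log the request and return.
    conn.execute("INSERT INTO job_queue (proc_name) VALUES (?)", (proc_name,))
    conn.commit()

def claim_next(conn):
    # Pick the most recent unstarted item (as described above) and mark it started.
    # Assumes a single worker; concurrent copies would need an atomic claim.
    row = conn.execute(
        "SELECT id, proc_name FROM job_queue "
        "WHERE status = 'pending' ORDER BY id DESC LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE job_queue SET status = 'started' WHERE id = ?", (row[0],))
    conn.commit()
    return row

def complete(conn, job_id):
    # Mark the item as completed (deleting the row would work as well).
    conn.execute("UPDATE job_queue SET status = 'completed' WHERE id = ?", (job_id,))
    conn.commit()

def process_all(conn, run_job):
    # Body of the scheduled job: keep going until nothing is left, then exit.
    while True:
        job = claim_next(conn)
        if job is None:
            return
        job_id, proc_name = job
        run_job(proc_name)
        complete(conn, job_id)
```

Swapping ORDER BY id DESC for ORDER BY id would process the oldest request first instead, if that ordering suits the workload better.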

Prepare the command first, then queue it in the thread pool. Just make sure the thread does not depend on the HTTP context or any other HTTP intrinsic object. If your request finishes before the thread does, the context might be gone.
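As a rough sketch of that idea (in Python rather than ASP.NET, so every name here is hypothetical): the handler copies what it needs out of the request context first, then hands only those plain values to the pool. RequestContext stands in for the HTTP context, and run_long_job stands in for the call that executes the stored procedure.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared pool, created once for the whole process (hypothetical size).
_pool = ThreadPoolExecutor(max_workers=2)

class RequestContext:
    """Stand-in for the per-request HTTP context, which dies with the request."""
    def __init__(self, user_id):
        self.user_id = user_id
        self.closed = False

    def close(self):
        self.closed = True

def run_long_job(user_id, log):
    # Placeholder for executing the stored procedure; uses only plain values.
    log.append(f"processed user {user_id}")

def handle_request(ctx, log):
    # Prepare everything up front: copy values out of the context...
    user_id = ctx.user_id
    # ...and queue the work without passing the context itself.
    return _pool.submit(run_long_job, user_id, log)
```

The background call keeps working after the context is closed because it never touches it. Note that work queued this way still lives only in the web process's memory, so it is lost if that process is recycled.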

See Asynchronous procedure execution. This is the only method that guarantees the execution even if the ASP process crashes. It is also self-tuning and can handle spikes of load: requests are queued up and processed as resources become available.
The gist of the solution is leveraging the SQL Server Activation concept, which allows you to run a stored procedure in a background thread in SQL Server without a client connection.
Solutions based on SqlClient async methods or on the CLR thread pool are unreliable: the calls are lost when the ASP process is recycled, and besides, they build up in-memory queues of requests that can themselves trigger a process recycle due to memory consumption.
Solutions based on tables and Agent jobs are better, as they are reliable, but they lack the self-tuning of Activation-based solutions.

Related

Service to accept SQL queries and run in the background

Is there a service to accept large numbers of SQL queries and run them in the background with retries and logging?
I have multiple clients running large numbers of queries directly against a SQL Server database but because they’re only inserts it would be far more efficient to post the queries to some service which can run them offline in transactions freeing the clients from having to wait for the queries to finish and reducing the connections to the database.
Because the result isn’t needed by the application, I’d like to “fire and forget” the SQL statements knowing they’ll eventually complete, even if they need to retry due to timeouts or network issues.
Does such a service exist?
There is no such service out of the box. As suggested by Gordon Linhoff, you can SEND the batches into a Service Broker queue, or INSERT them into a regular table, and have a background process run them.
In the case of Service Broker, the setup, programming, and troubleshooting are a bit trickier, but you get Internal Activation to trigger a stored procedure you write when messages appear on the queue.
With a regular table you would just write a SQL Agent job (or similar) that runs in a loop and looks for new rows in the target table, runs the batches it finds, and deletes (or marks) the batches as complete. You don't get the low latency and automatic scale-out that Service Broker Activation provides, but it's much simpler to implement.
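Whichever queue is used, the background process needs some retry and logging logic around each batch, as the question asks. A minimal sketch of that part, in Python and with hypothetical names throughout (run_with_retries, the execute callback, and the log tuple layout are assumptions, not part of any SQL Server feature):

```python
import time

def run_with_retries(execute, batch, max_attempts=3, delay_seconds=0.0, log=None):
    """Run one queued batch, retrying on failure and recording every attempt.

    `execute` is whatever callable actually sends the batch to the database.
    Returns True if some attempt succeeded, False if all attempts failed.
    """
    log = log if log is not None else []
    for attempt in range(1, max_attempts + 1):
        try:
            execute(batch)
            log.append((batch, attempt, "ok"))
            return True
        except Exception as exc:  # e.g. a timeout or a network error
            log.append((batch, attempt, f"failed: {exc}"))
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # wait before the next attempt
    return False
```

A batch that still fails after the last attempt would typically stay in the table marked as failed rather than being deleted, so that it can be inspected or re-run later.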

TransactedReceiveScope - when does the Transaction Commit?

Scenario:
We have a wcf workflow with a client that does NOT use transactionflow.
The workflow contains several sequential TransactedReceiveScopes (using content-based correlation).
The TransactedReceiveScopes contain custom db operations.
Observations:
When we run SQL profiler against the first call, we see all the custom db calls, and the SaveInstance call in the profile trace.
We've noticed that, even though the SendReply is at the very end of the TransactedReceiveScope, sometimes the SendReply occurs a good 10 seconds before the transaction gets committed.
We tried changing the TimeToPersist and TimeToUnload to zero, but that had no effect. (The trace shows the SaveInstance happening immediately anyway; it is the commit that seems to be delayed.)
Questions:
Are our observations correct?
At what point is the transaction committed? Is this like garbage collection - i.e. it commits some time later when it's not busy?
Is there any way to control the commit delay, or is the only way to do this to use transactionflow from the client (and then it should all commit when the client commits, including the persist)?
The TransactedReceiveScope commits the transaction when the body is completed, but as all execution is done through the scheduler, that could be some time later. It is not related to garbage collection, and there is no real way to influence it other than to avoid a busy machine and a lot of other parallel activities that could also be in the execution queue.

Continuously checking database from a Windows service

I am making a Windows service which needs to continuously check for database entries that can be added at any time to tell it to execute some code. It is looking to see if its status is set to pending, and its execute time entry is greater than the current time. Is the only way to do this to just run select statements over and over? It might need to execute the code every minute, which means I need to run the select statement every minute looking for entries in the database. I'm trying to avoid unnecessary CPU time because I'm probably going to end up paying for CPU cycles on the hosting provider.
Be aware that Notification Services is only for SQL 2005, and has been dropped from SQL 2008.
Rather than polling the database for changes, I would recommend writing a CLR stored procedure that is called from a trigger, which is raised when an appropriate change occurs (e.g. insert or update). The CLR sproc alerts your service which then performs its work.
Sending the service alert via a TCP/IP or HTTP channel is a good choice since you can deploy your service anywhere, just by modifying some configuration parameter that is read by the sproc. It also makes it easy to test the service.
I would use an event driven model in your service. The service waits on an auto-reset event, starting a block of work when the event is raised. The sproc communications channel runs on another thread and sets the event on each incoming request.
Assuming the service is doing a block of work and a set of multiple pending requests are outstanding, this design ensures that those requests trigger just 1 more block of work when the current one is finished.
You can also have multiple workers waiting on the same event if overlapping processing is desired.
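The event-driven design described above can be sketched like this. It is only an illustration in Python, not the service itself: Python's threading.Event is not auto-reset, so the worker clears it by hand right after waking, and the class name CoalescingWorker and its methods are hypothetical. In the real design, signal() would be called by the thread that receives alerts from the sproc.

```python
import threading

class CoalescingWorker:
    """Runs one block of work per wake-up, however many signals arrived."""

    def __init__(self, work):
        self._work = work                  # callable doing one block of work
        self._wake = threading.Event()     # stands in for an auto-reset event
        self._stop = False
        self.blocks_run = 0
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self._thread.start()

    def signal(self):
        # Called for each incoming request; repeated calls collapse into one.
        self._wake.set()

    def stop(self):
        self._stop = True
        self._wake.set()                   # wake the loop so it can exit
        self._thread.join()

    def _loop(self):
        while True:
            self._wake.wait()              # sleep until something signals
            self._wake.clear()             # manual reset, before working
            if self._stop:
                return
            self._work()                   # should handle everything pending
            self.blocks_run += 1
```

Because the event is cleared before the work starts, any signals that arrive while a block is running leave the event set, so exactly one more block runs afterwards, which matches the behaviour described above.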
Note: for external network access the CREATE ASSEMBLY statement will require the PERMISSION_SET option to be set to EXTERNAL_ACCESS.
Given you talk about the service provider, I suspect one of the main alternatives will not be open to you, which is notification services. It allows you to register for data changed events and be notified, without the need to poll the database. It does, however, require Service Broker to be enabled for it to work, and that could potentially be a problem if it is hosted, since some companies keep it switched off.
The question is not tagged with a specific database, just SQL; notification services is a SQL Server facility.
If you're using SQL Server and open to a different approach, check out SQL Server Notification Services.
Oracle also provides notifications; they call it Database Change Notification.

Can Sql Server 2008 Stored Procedures (or Triggers) manually parallel or background some logic?

If I have a stored procedure or a trigger in SQL Server 2008, can it do some SQL calculations 'in another non-blocking thread', i.e. something in the background?
Also, can two SQL code blocks be run in parallel? Or two stored procs be run in parallel?
For example, imagine we are given the job of calculating the score for each Stack Overflow user after a user does some 'action' (and please leave all the 'do that elsewhere/in a service/in a batch/overnight/etc.' suggestions elsewhere).
So we have a trigger on the Post table, so that when a new post is INSERTED, the trigger fires off and, as part of that logic, it calculates the user's latest score. Instead of waiting for the stored proc to finish and blocking the current SQL thread / execution, can we ask it to calc the score in the background OR in parallel?
cheers!
SQL Server does not have parallel or deferred execution: each block of running code in a connection is serial, one line after the other.
To decouple processing, you usually have to use SQL Server Agent jobs or use Service Broker. These start executing in a new connection, new session, etc.
This makes sense:
What if you want to rollback your changes? What does the background thread do and how does it know?
What data does it use? New, Old, lock wait, snapshot?
What if it gets ahead of the main thread and uses stale data?
No, but you could write the request to a queue. Service Broker, a SQL Server component, provides support for this kind of thing. It's probably the best option available for asynchronous processing.

Start stored procedures sequentially or in parallel

We have a stored procedure that runs nightly that in turn kicks off a number of other procedures. Some of those procedures could logically be run in parallel with some of the others.
How can I indicate to SQL Server whether a procedure should be run in parallel or in serial, i.e. kicked off asynchronously or blocking?
What would be the implications of running them in parallel, keeping in mind that I've already determined that the processes won't be competing for table access or locks- just total disk io and memory. For the most part they don't even use the same tables.
Does it matter if some of those procedures are the same procedure, just with different parameters?
If I start a pair of procedures asynchronously, is there a good system in SQL Server to then wait for both of them to finish, or do I need to have each of them set a flag somewhere and check and poll the flag periodically using WAITFOR DELAY?
At the moment we're still on SQL Server 2000.
As a side note, this matters because the main procedure is kicked off in response to the completion of a data dump into the server from a mainframe system. The mainframe dump takes all but about 2 hours each night, and we have no control over it. As a result, we're constantly trying to find ways to reduce processing times.
I had to research this recently, so found this old question that was begging for a more complete answer. Just to be totally explicit: TSQL does not (by itself) have the ability to launch other TSQL operations asynchronously.
That doesn't mean you don't still have a lot of options (some of them mentioned in other answers):
Custom application: Write a simple custom app in the language of your choice, using asynchronous methods. Call a SQL stored proc on each application thread.
SQL Agent jobs: Create multiple SQL jobs, and start them asynchronously from your proc using sp_start_job. You can check to see if they have finished yet using the undocumented procedure xp_sqlagent_enum_jobs as described in this excellent article by Gregory A. Larsen. (Or have the jobs themselves update your own JOB_PROGRESS table as Chris suggests.) You would literally have to create a separate job for each parallel process you anticipate running, even if they are running the same stored proc with different parameters.
OLE Automation: Use sp_oacreate and sp_oamethod to launch a new process calling the other stored proc as described in this article, also by Gregory A. Larsen.
DTS Package: Create a DTS or SSIS package with a simple branching task flow. DTS will launch tasks in individual spids.
Service Broker: If you are on SQL2005+, look into using Service Broker.
CLR Parallel Execution: Use the CLR commands Parallel_AddSql and Parallel_Execute as described in this article by Alan Kaplan (SQL2005+ only).
Scheduled Windows Tasks: Listed for completeness, but I'm not a fan of this option.
I don't have much experience with Service Broker or CLR, so I can't comment on those options. If it were me, I'd probably use multiple Jobs in simpler scenarios, and a DTS/SSIS package in more complex scenarios.
One final comment: SQL already attempts to parallelize individual operations whenever it can*. This means that running 2 tasks at the same time instead of after each other is no guarantee that it will finish sooner. Test carefully to see whether it actually improves anything or not.
We had a developer that created a DTS package to run 8 tasks at the same time. Unfortunately, it was only a 4-CPU server :)
*Assuming default settings. This can be modified by altering the server's Maximum Degree of Parallelism or Affinity Mask, or by using the MAXDOP query hint.
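For the "custom application" option in the list above, a minimal sketch might look like the following. It is written in Python with the built-in sqlite3 module standing in for SQL Server, so the table proc_log and the functions make_demo_db, run_procedure, and run_in_parallel are all hypothetical; the INSERT is just a placeholder for calling a real stored procedure. Each worker opens its own connection so that the calls do not serialize on a shared one.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

def make_demo_db():
    # Throwaway database file so several connections can share it.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE proc_log (name TEXT, parameter TEXT)")
    conn.commit()
    conn.close()
    return path

def run_procedure(db_path, name, parameter):
    # One connection per call; a stand-in for executing a stored procedure.
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO proc_log (name, parameter) VALUES (?, ?)",
                (name, parameter),
            )
        return name, parameter
    finally:
        conn.close()

def run_in_parallel(db_path, calls):
    # Start every call on its own thread, then block until all have finished.
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = [pool.submit(run_procedure, db_path, n, p) for n, p in calls]
        return [f.result() for f in futures]  # re-raises any worker error
```

Waiting on the futures replaces the flag-and-poll approach: run_in_parallel only returns once every call has completed, and the same procedure can appear several times with different parameters.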
Create a couple of SQL Server agent jobs where each one runs a particular proc.
Then from within your master proc kick off the jobs.
The only way of waiting that I can think of is if you have a status table that each proc updates when it's finished.
Then yet another job could poll that table for total completion and kick off a final proc. Alternatively, you could have a trigger on this table.
The memory implications are completely up to your environment.
UPDATE:
If you have access to the task system, then you could take the same approach. Just have Windows execute multiple tasks, each responsible for one proc. Then use a trigger on the status table to kick off something when all of the tasks have completed.
UPDATE2:
Also, if you're willing to create a new app, you could house all of the logic in a single exe...
You do need to move your overnight sprocs to jobs. SQL Server job control will let you do all of the scheduling you are asking for.
You might want to look into using DTS (which can be run from the SQL Agent as a job). It will allow you pretty fine control over which stored procedures need to wait for others to finish and what can run in parallel. You can also run the DTS package as an EXE from your own scheduling software if needed.
NOTE: You will need to create multiple copies of your connection objects to allow calls to run in parallel. Two calls using the same connection object will still block each other even if you don't explicitly put in a dependency.