Can Sql Server 2008 Stored Procedures (or Triggers) manually parallel or background some logic? - sql

If i have a stored procedure or a trigger in Sql Server 2008, can it do some sql calculations 'in another non-blocking thread'? ie. something in the background
also, can two sql code blocks be ran in parallel? or two stored procs be ran in parallel?
for example. Imagine we are given the job calculating the scores for each Stack Overflow user (and please leave all 'do that elsehwere/service/batch/overnight/etc, elswhere) after a user does some 'action'.
so we have a trigger on the Post table, so when a new post is INSERTED, the trigger fires off and part of that logic, it calculates the user's latest score. Instead of waiting for the stored proc to finish and block the current sql thread / executire, can we ask it to calc the score in the background OR parallel.
cheers!

SQL Server does not have parallel or deferred execution: each block of running code in a connection is serial, one line after the other.
To decouple processing, you usually have to use SQL Server Agent jobs or use Service broker. These start executing in a new connection, new session etc
This makes sense:
What if you want to rollback your changes? What does the background thread do and how does it know?
What data does it use? New, Old, lock wait, snapshot?
What if it gets ahead of the main thread and uses stale data?

No, but you could write the request to a queue. Service Broker, a SQL Server component, provides support for this kind of thing. It's probably the best option available for asynchronous processing.

Related

Run long-running sproc (that doesn't need to return) from ASP.NET page

I would like to know how you would run a stored procedure from a page and just "let it finish" even if the page is closed. It doesn't need to return any data.
A database-centric option would be:
Create a table that will contain a list (or queue) of long-running jobs to be performed.
Have the application add an entry to the queue if, when, and as desired. That's all it does; once logged and entered, no web session or state data need be maintained.
Have a SQL Agent job configured to check every 1, 2, 5, whatever minutes to see if there are any jobs to run.
If there are as-yet unstarted items, mark the most recent one as started, and start it.
When it's completed, mark it as completed, or just delete it
Check if there are any other items to run. If there are, repeat; if not, exit the job.
Depending on capacity, you could have several (differently named) copies of this job running, concurrently processing items from the list.
(I've used this method for very long-running methods. It's more an admin-type trick, but it may be appropriate for your situation.)
Prepare the command first, then queue it in the threadpool. Just make sure the thread does not depend on any HTTP Context or any other http intrinsic object. If your request finishes before the thread; the context might be gone.
See Asynchronous procedure execution. This is the only method that guarantees the execution even if the ASP process crashes. It also self tuning and can handle spikes of load, requests are queued up and processed as resources become available.
The gist of the solution is leveraging the SQL Server Activation concept, which allows you to run a stored procedure in a background thread in SQL Server without a client connection.
Solutions based on SqlClient asynch methods or on CLR thread pool are unreliable, the calls are lost as the ASP process is recycled, and besides they build up in-memory queues of requests that actually trigger a process recycle due to memory consumption.
Solutions based on tables and Agent jobs are better, as they are reliable, but they lack the self tuning of Activation based solutions.

Database Job Scheduling

I have a procedure written in PLJava that sends out updates over JMS in my postgres database.
What I would like to do is have that function called on an interval (every 15 seconds) internally in the database (preferably not from an outside process). Is this possible? Any ideas?
If you need no external access, you are presumably able to modify the database design so that you don't need the update at all. Can you explain more about what the update is doing?
As depesz said, you could use either cron or pgAgent, but they are only able to go down to a one minute granularity, not 15 seconds. Considering sleeping inside the stored procedure until the next iteration is not a good idea, because you will have an open transaction for all that time which is a really bad idea.
Strict answer: it is not possible. Since you don't want outside process, and PostgreSQL doesn't support jobs - you are out of luck.
If you'll reconsider using outside processes, then you're most likely want something like cron, or better yet pgagent.
On absolutely other hand - what do you need to do that has to happen every 30 seconds? this seems like a problem with design.
First, you'll spend the least amount of effort if you just go with a cron job.
However, if you were starting from scracth: You are trying to periodically replicate rows from your database. I think you are looking at a replication queue.
The PGQ project (used for Londiste replication, both from Skype's SkyTools) has a queue that you can use independently. When configuring it, you set a maximum event count, and a loop delay, before batched events are generated. You can get batches spaced by no more than 15 seconds that way. You now have to produce the events that will be batched, using a trigger that calls pgq.insert_event; and consume the queues. The consumer can call your PL/Java stored proc; you'll have to rewrite the procedure to send everything in the batch instead of scanning the base table for new events.
As far as I know postgresql doesn't support scheduled tasks. You'll need to use a script with cron or at (depending on your operating system.)
Sounds like you're doing sort of replication? Every 15s sounds like a lot of updates. Could you setup a trigger (or a number of triggers) instead of polling?
If you are using JMS why not just have th task wait for input on the queue?
Per your depesz comment, you have a PL/Java stored procedure that "flushes out database tables (updates) as java objects". Since you want it to run in 15 second intervals, it must be processing a batch of updates each time. Rather than processing a batch of updates in a stored procedure every 15 seconds, why not process them one at a time when they happen via an after update trigger and eliminate the need for a timed interval. If you are aggregrating data from multiple tables to build your objects than add the triggers to you upper most tables only.
In my case the problem was that agent couldn't authorize to database so after I've made all connections trusted from localhost the service started successfully and job works fine
for more information about error you should see into windows event viewer or eq in unix based system. see my config file C:\Program Files\PostgreSQL\10\data\pg_hba.conf

Can SQL CLR triggers do this? Or is there a better way?

I want to write a service (probably in c#) that monitors a database table. When a record is inserted into the table I want the service to grab the newly inserted data, and perform some complex business logic with it (too complex for TSQL).
One option is to have the service periodically check the table to see if new records have been inserted. The problem with doing it that way is that I want the service to know about the inserts as soon as they happen, and I don't want to kill the database performance.
Doing a little research, it seems like maybe writing a CLR trigger could do the job. I could write trigger in c# that fires when an insert occurs, and then send the newly inserted data to a Windows or WCF service.
What do you think, is that a good (or even possible) use of SQL CLR triggers?
Any other ideas on how to accomplish this?
Probably you should de-couple postprocessing from inserting:
In the Insert trigger, add the record's PK into a queue table.
In a separate service, read from the queue table and do your complex operation. When finished, mark the record as processed (together with error/status info), or delete the record from the queue.
What you are describing is sometimes called a Job Queue or a Message Queue. There are several threads about using a DBMS table (as well as other techniques) for doing this that you can find by searching.
I would consider doing anything iike this with a Trigger as being an inappropriate use of a database feature that's easy to get into trouble with anyway. Triggers are best used for low-overhead dbms structural functionality (e.g. fine-grained referential integrity checking) and need to be lightweight and synchronous. It could be done, but probably wouldn't be a good idea.
I would suggest having a trigger on the table that calls the SQL Server Service Broker, that then (asynchronously) executes a CLR stored procedure that does all your work in a different thread.
I have a service that polls the database every minute, it doesn't cause that much performance problems and it is a clean solution. Plus if your service or other wcf endpoint is not there your trigger will fail or be lost and you will have to poll anyways later.
I would not recommend using a CLR trigger, or any sort of trigger for this. You are opening yourself to having serious maintainability and potential locking issues. (A very simple trigger that chucks stuff into an audit/queue table may be acceptable IF you don't care about ##identity after inserts and you will never lock the audit/queue table up)
Instead, from your application/orm you should trigger inserting stuff into a queue table and have this queue processed on a regular basis. This can be done by having a transaction in your ORM or kicking off a stored proc the starts a transaction commits the change and audit/queue atomically. (be careful with locking here)
If you need immediate action, look at spawning a job to clear the queue after you do a insert/update/delete on the table and
Also ensure you are double checking the queue once a minute or so in case the background process was not kicked off properly. If its a web app and you want to avoid spawning threads you could communicate with a background process to clear up the queue.
Why not implement the insert in a stored procedure, and do the business logic in the procedure after the insert? What is so complicated about it that it can't be written in T-SQL?

What is the best way to execute an MS SQL Server 2000 DTS package from a trigger?

I looked around and found some ideas about how to do this, but no definitive best way. One of the ideas was to use sp_start_job to kick off an SQL Server Agent job that runs the DTS package. If this is the best way to do it, then the next question would be, "How do I schedule a DTS package from a job and make it non-recurring?"
Thanks,
Tim
xp_cmdshell would allow you to execute dtsrun.
I wouldn't suggest tying this kind of functionality to a trigger. Triggers are supposed to be fast. I don't think there is any way to launch a DTS package that will be as fast as I would want a trigger to be. If this resonates with you, then I would suggest having your trigger simply insert a row into a special table, and then have a job that executes as often as you need for your purpose (every minute? every 10 seconds?) that monitors this table and kicks off the appropriate DTS package as needed.
Instead of using xp_cmdshell, I did this:
When a certain value in a table changes, the trigger uses msdb.sp_start_job to start a job. This job should not run on a schedule, only when initiated by a user. I set the job schedule to run one time, which is now in the past, and I unchecked the enabled box.
This job has one step, which is DTSRun /~Z0xHEXENCRYPTEDVALUE. The DTS package copies some rows from this server to another server on a different platform and on success resets values in the table with the trigger for next time. The trigger checks a table value before calling sp_start_job, so that the job starts only under certain conditions, not every time.
Since sp_start_job runs asyhchronously the trigger completes quickly. The only drawback to this is that I need to poll the value that was reset on success and either let the user know it worked, or after some time out period, it did not work.
The alternative would be to use xp_cmdshell if I needed synchronous operation, which might not be a good idea from inside of a trigger.

Start stored procedures sequentially or in parallel

We have a stored procedure that runs nightly that in turn kicks off a number of other procedures. Some of those procedures could logically be run in parallel with some of the others.
How can I indicate to SQL Server whether a procedure should be run in parallel or serial — ie: kicked off of asynchronously or blocking?
What would be the implications of running them in parallel, keeping in mind that I've already determined that the processes won't be competing for table access or locks- just total disk io and memory. For the most part they don't even use the same tables.
Does it matter if some of those procedures are the same procedure, just with different parameters?
If I start a pair or procedures asynchronously, is there a good system in SQL Server to then wait for both of them to finish, or do I need to have each of them set a flag somewhere and check and poll the flag periodically using WAITFOR DELAY?
At the moment we're still on SQL Server 2000.
As a side note, this matters because the main procedure is kicked off in response to the completion of a data dump into the server from a mainframe system. The mainframe dump takes all but about 2 hours each night, and we have no control over it. As a result, we're constantly trying to find ways to reduce processing times.
I had to research this recently, so found this old question that was begging for a more complete answer. Just to be totally explicit: TSQL does not (by itself) have the ability to launch other TSQL operations asynchronously.
That doesn't mean you don't still have a lot of options (some of them mentioned in other answers):
Custom application: Write a simple custom app in the language of your choice, using asynchronous methods. Call a SQL stored proc on each application thread.
SQL Agent jobs: Create multiple SQL jobs, and start them asynchronously from your proc using sp_start_job. You can check to see if they have finished yet using the undocumented function xp_sqlagent_enum_jobs as described in this excellent article by Gregory A. Larsen. (Or have the jobs themselves update your own JOB_PROGRESS table as Chris suggests.) You would literally have to create separate job for each parallel process you anticipate running, even if they are running the same stored proc with different parameters.
OLE Automation: Use sp_oacreate and sp_oamethod to launch a new process calling the other stored proc as described in this article, also by Gregory A. Larsen.
DTS Package: Create a DTS or SSIS package with a simple branching task flow. DTS will launch tasks in individual spids.
Service Broker: If you are on SQL2005+, look into using Service Broker
CLR Parallel Execution: Use the CLR commands Parallel_AddSql and Parallel_Execute as described in this article by Alan Kaplan (SQL2005+ only).
Scheduled Windows Tasks: Listed for completeness, but I'm not a fan of this option.
I don't have much experience with Service Broker or CLR, so I can't comment on those options. If it were me, I'd probably use multiple Jobs in simpler scenarios, and a DTS/SSIS package in more complex scenarios.
One final comment: SQL already attempts to parallelize individual operations whenever it can*. This means that running 2 tasks at the same time instead of after each other is no guarantee that it will finish sooner. Test carefully to see whether it actually improves anything or not.
We had a developer that created a DTS package to run 8 tasks at the same time. Unfortunately, it was only a 4-CPU server :)
*Assuming default settings. This can be modified by altering the server's Maximum Degree of Parallelism or Affinity Mask, or by using the MAXDOP query hint.
Create a couple of SQL Server agent jobs where each one runs a particular proc.
Then from within your master proc kick off the jobs.
The only way of waiting that I can think of is if you have a status table that each proc updates when it's finished.
Then yet another job could poll that table for total completion and kick off a final proc. Alternatively, you could have a trigger on this table.
The memory implications are completely up to your environment..
UPDATE:
If you have access to the task system.. then you could take the same approach. Just have windows execute multiple tasks, each responsible for one proc. Then use a trigger on the status table to kick off something when all of the tasks have completed.
UPDATE2:
Also, if you're willing to create a new app, you could house all of the logic in a single exe...
You do need to move your overnight sprocs to jobs. SQL Server job control will let you do all of the scheduling you are asking for.
You might want to look into using DTS (which can be run from the SQL Agent as a job). It will allow you pretty fine control over which stored procedures need to wait for others to finish and what can run in parallel. You can also run the DTS package as an EXE from your own scheduling software if needed.
NOTE: You will need to create multiple copies of your connection objects to allow calls to run in parallel. Two calls using the same connection object will still block each other even if you don't explicitly put in a dependency.