Concurrent access problem in a MySQL database

Hi, I'm coding a Python script that will create a number of child processes that fetch and execute tasks from the database.
The tasks are inserted into the database by a PHP website running on the same machine.
What's a good (and fast) way to select those tasks and mark them as "in progress", so that they aren't picked up by more than one of the Python workers?
Edit: the database is MySQL.
Thanks in advance

Use an InnoDB table Tasks, then:
select TaskId, ... from Tasks where State="New" limit 1;
update Tasks set State="In Progress" where TaskId=<from above> and State="New";
If the update reports exactly one affected row, you own the task and can work on it. Otherwise, another worker claimed it first; try again.
You'll want an index on TaskId and State.
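For illustration, here is a minimal sketch of that claim pattern in plain MySQL; the task id 42 is just an example value, and ROW_COUNT() is what your Python client library reports as the affected-row count:
select TaskId from Tasks where State = 'New' limit 1;   -- say this returns 42
update Tasks set State = 'In Progress'
where TaskId = 42 and State = 'New';                    -- 42 is the id from the select above
select row_count();  -- 1 = this worker owns the task; 0 = someone else won the race, retry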

Without knowing more about your architecture, I suggest the following method (a sketch in SQL follows the steps):
1) Lock the Process table
2) Select ... from the Process table where State = "New"
3) processlist = [list of process IDs from step 2]
4) Update the Process table: set State = "In Progress" where ProcessId in [processlist]
5) Unlock the Process table
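A minimal MySQL sketch of those five steps, assuming the table is named Process with ProcessId and State columns as above (the IDs in the IN list are illustrative):
lock tables Process write;
select ProcessId from Process where State = 'New';
-- collect the returned IDs client-side, then mark them all in one statement:
update Process set State = 'In Progress' where ProcessId in (1, 2, 3);
unlock tables;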

A way to speed things up is to put the whole process into a stored procedure and return the selected rows from it; that way there's only one round trip to the database server.
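One caveat if you go the stored procedure route: MySQL does not allow LOCK TABLES inside stored routines, so a procedure version would use an InnoDB transaction with SELECT ... FOR UPDATE instead. A hedged sketch, with the procedure name and ORDER BY column as assumptions:
DELIMITER //
CREATE PROCEDURE PickTask()
BEGIN
    DECLARE claimed INT DEFAULT NULL;
    START TRANSACTION;
    -- FOR UPDATE row-locks the candidate so no other worker can claim it
    SELECT ProcessId INTO claimed
    FROM Process WHERE State = 'New'
    ORDER BY ProcessId LIMIT 1
    FOR UPDATE;
    UPDATE Process SET State = 'In Progress' WHERE ProcessId = claimed;
    COMMIT;
    -- return the claimed row to the caller (empty result if no task was free)
    SELECT * FROM Process WHERE ProcessId = claimed;
END //
DELIMITER ;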

Related

Atomicity of a job execution in SQL Server

I would like to find the proper documentation to confirm my thinking about a SQL Server job I recently wrote. My fear is that the data could be inconsistent for a few milliseconds (between the start of the job's execution and its end).
Let's say the job is set up to run every 30 minutes. It has only one step, with the following SQL statement:
DELETE FROM myTable
INSERT INTO myTable
SELECT *
FROM myTableTemp
Could it happen that a SELECT query is executed exactly between the DELETE statement and the INSERT statement, and thus returns empty results?
And what if I had created 2 steps in my job, one for the DELETE query and another for the INSERT INTO? Is atomicity protected by SQL Server between the steps of one job?
Thanks for your help on this one
No, there is no automatic atomic handling of jobs, whether they consist of multiple statements or multiple steps.
Use this:
begin transaction
delete...
insert....
... anything else you need to be atomic
commit work
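For example, a single-step job body along those lines might look like this (a sketch against the tables from the question; SET XACT_ABORT makes sure an error rolls back the whole swap):
SET XACT_ABORT ON;
BEGIN TRANSACTION;
    DELETE FROM myTable;
    INSERT INTO myTable
    SELECT * FROM myTableTemp;
COMMIT TRANSACTION;
Under SQL Server's default locking READ COMMITTED isolation level, a concurrent SELECT against myTable will block until this transaction commits, rather than ever seeing the table empty.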

Using PowerShell - how to prevent SQL from accessing the same record

I want to run multiple instances of PowerShell to collect data from Exchange. From PowerShell, I use Invoke-Sqlcmd to run various SQL commands.
SELECT TOP 1 SmtpAddress FROM [MMC].[dbo].[EnabledAccounts]
WHERE [location] = '$Location' AND Script1 = 'Done'
AND (Script2 = '' OR Script2 IS NULL)
When running more than one script, I see both scripts accessing the same record. I know there's a way to update the record to lock it, but I'm not sure how to write it out. TIA :-)
The database management system (I'll assume SQL Server) will handle contention for you. Meaning, if two sessions try to update the same set of records, SQL Server will block one session while the other completes; you don't need to do anything to make that happen explicitly. That said, it's a good idea to run your updates in a transaction if you are applying multiple updates as a single unit; a single change occurs in an implicit transaction. The following thread talks more about transactions with Invoke-Sqlcmd:
Multiple Invoke-SqlCmd and Sql Server transaction
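Blocking alone won't stop two readers from selecting the same row, though. One common queue pattern (not from the linked thread; the Script2 value here is an assumption) is to claim the row and read it in a single UPDATE with locking hints:
UPDATE TOP (1) [MMC].[dbo].[EnabledAccounts] WITH (UPDLOCK, READPAST)
SET Script2 = 'InProgress'                 -- claim marker; the value is illustrative
OUTPUT inserted.SmtpAddress                -- hands the claimed row back to the script
WHERE [location] = '$Location'             -- $Location is expanded by PowerShell
  AND Script1 = 'Done'
  AND (Script2 = '' OR Script2 IS NULL);
READPAST makes a second instance skip rows the first instance has locked, so each script gets a different record.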

Does SQL Server handle or lock transactions between two or more tables

I have a table like this:
CREATE TABLE Tasks(
Name text,
Date datetime
)
In my application, each person adds his/her tasks to this table.
Every night my robot picks the first task to do, so it calls the following stored procedure:
CREATE PROCEDURE PickTask
AS
begin
select top (1) * from Tasks
delete top (1) from Tasks
end
The robot calls PickTask until there are no rows left in the Tasks table.
My robot is multithreaded, so I want to know what will happen if two or more threads in my application call PickTask at the same time.
At first I thought the SELECT query would execute for both threads, so both threads would pick the same row from Tasks; after that, each one would delete one row, and in the end the robot would do one task twice and delete one unfinished task!!
I tried to use TRAN, but I'm not sure whether my application will have a problem doing the tasks or not.
If you change it to:
CREATE PROCEDURE PickTask
AS
begin transaction
select top (1) * from Tasks
delete top (1) from Tasks
commit
you can be sure that whichever session invokes it first will lock Tasks until both steps are done. The next session to invoke PickTask will then lock Tasks exclusively. You probably want to wrap the whole thing in a BEGIN TRY/BEGIN CATCH block.
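An alternative sketch that sidesteps the select-then-delete race entirely (this uses the OUTPUT clause rather than the transaction above, so it's a different technique): a single DELETE both picks and removes the row atomically, and READPAST lets concurrent threads skip rows another thread has locked:
CREATE PROCEDURE PickTask
AS
BEGIN
    -- atomically remove one task and hand its columns back to the caller
    DELETE TOP (1) FROM Tasks WITH (READPAST)
    OUTPUT deleted.Name, deleted.[Date];
END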
You can use application locks in your T-SQL, as noted in this link: http://www.sqlteam.com/article/application-locks-or-mutexes-in-sql-server-2005
You would acquire the lock in your stored procedure just before the SELECT statement, then release it as soon as you delete the row. You would also be better off deleting the row with a WHERE clause that uses the unique key obtained from that SELECT, since any rows added between the SELECT and the DELETE (however unlikely) could otherwise be deleted without ever having been selected.
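A minimal sketch of that application-lock approach, assuming SQL Server 2005+ (the lock name is illustrative; with the default transaction owner, the lock is also released automatically at commit):
BEGIN TRANSACTION;
EXEC sp_getapplock @Resource = 'PickTaskLock', @LockMode = 'Exclusive';
-- select top (1) ..., capture the unique key, delete the row by that key
EXEC sp_releaseapplock @Resource = 'PickTaskLock';
COMMIT;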

SQL INSERT - how to execute a list of queries automatically

I've never done this, so apologies if I'm being quite vague.
Scenario
I need to run a long series of INSERT SQL queries. The data is inserted into a table to be processed by a web service's client, i.e. the data is uploaded to a different server and the table is cleared as the process progresses.
What I've tried
I have tried adding a delay before each INSERT statement, like so:
WAITFOR DELAY '00:30:00'
INSERT INTO TargetTable (TableName1, Id, Type) SELECT 'tablename1', ID1 , 1 FROM tablename1
WAITFOR DELAY '00:30:00'
INSERT INTO TargetTable (TableName2, Id, Type) SELECT 'tablename2', ID2 , 1 FROM tablename2
But this has the disadvantage of assuming that a query will finish executing in 30 minutes, which may not be the case.
Question
I have run the queries manually in the past, but that's excruciatingly tedious, so I would like to write a program that does it for me.
The program should:
Run each query in the order given
Wait to run the next query until the previous one has been processed, i.e. until the target table is clear.
I'm thinking of a script that I can paste into the command prompt, into SQL itself, or whatever, and run.
How do I go about this? A Windows service application? A PowerShell function?
I would appreciate any pointers to get me started.
You need to schedule a job in SQL Server:
http://www.c-sharpcorner.com/UploadFile/raj1979/create-and-schedule-a-job-in-sql-server-2008/
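Whether you use a job or a script, the "wait until the table is clear" part can live in T-SQL itself. A sketch that polls instead of guessing a fixed delay (table and column names follow the question's examples):
WHILE EXISTS (SELECT 1 FROM TargetTable)
    WAITFOR DELAY '00:00:30';   -- poll every 30 seconds until the client has drained the table
INSERT INTO TargetTable (TableName1, Id, Type)
SELECT 'tablename1', ID1, 1 FROM tablename1;
-- repeat the WHILE/INSERT pair for each source table, in order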

Linked Server Optimization Help

I have this code in a trigger.
if isnull(@d_email,'') <> isnull(@i_email,'')
begin
update server2.database2.dbo.Table2
set email = @i_email
where user_id = (select user_id from server2.database2.dbo.Table1 where login = @login)
end
I would like to update a table on another database server; both servers are MSSQL. The query above works for me, but it takes over 10 seconds to complete. Table2 has over 200k records. When I run the execution plan, it says that the remote scan has a 99% cost.
Any help would be appreciated.
First, the obvious: check the indexes on the tables on the linked server. If I saw this problem without the linked server aspect, that would be the first thing I would check.
Suggestion:
Instead of embedding the UPDATE in the server 1 trigger, create a stored procedure on the linked server and update the records by calling the stored procedure.
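For instance (a sketch only; the procedure name and parameter types are assumptions, and the linked server must have RPC enabled), create the procedure on server2 so the join runs locally there, then have the trigger call it:
-- on server2, in database2:
CREATE PROCEDURE dbo.UpdateEmailByLogin
    @login varchar(100),
    @email varchar(256)
AS
UPDATE t2
SET t2.email = @email
FROM dbo.Table2 t2
INNER JOIN dbo.Table1 t1 ON t1.user_id = t2.user_id
WHERE t1.login = @login;
GO
-- in the trigger on server1:
EXEC server2.database2.dbo.UpdateEmailByLogin @login = @login, @email = @i_email;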
Try to remove the sub-query from the UPDATE:
if isnull(@d_email,'') <> isnull(@i_email,'')
begin
update t2
set email = @i_email
from server2.database2.dbo.Table2 t2
inner join server2.database2.dbo.Table1 t1
on (t1.user_id = t2.user_id)
where t1.login = @login
end
Whoa, bad trigger! Never, and I mean never, write a trigger assuming only one record will be inserted, updated, or deleted. You SHOULD NOT use variables this way in a trigger. Triggers operate on batches of data; if you assume one record, you will create integrity problems in your database.
What you need to do is join to the inserted table rather than using a variable for the value.
Also, updating a remote server from a trigger may not be such a dandy idea. If the remote server goes down, you can't insert anything into the original table. If the data can be somewhat less than real time, the normal technique is to have the trigger write to a table on the same server and have a job pick up the new records every 5-10 minutes. That way, if the remote server is down, the records can still be inserted, and they are stored until the job can pick them up and send them to the remote server.
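A sketch of what this answer describes, with assumed names (trigger trg_Table1_EmailSync, staging table EmailSyncQueue): join to the inserted and deleted pseudo-tables instead of using variables, and queue the change locally for a job to forward later:
CREATE TRIGGER trg_Table1_EmailSync ON dbo.Table1
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- queue every row whose email actually changed; handles multi-row updates correctly
    INSERT INTO dbo.EmailSyncQueue (user_id, email)
    SELECT i.user_id, i.email
    FROM inserted i
    INNER JOIN deleted d ON d.user_id = i.user_id
    WHERE ISNULL(d.email, '') <> ISNULL(i.email, '');
END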