StoredProcedure msdb.dbo.sp_help_job causes SQL Server Deadlocks

I'm currently making minor changes to some of our SQL Server Agent jobs (adding new steps, renaming schedules, etc.).
From time to time, a deadlock occurs when I save the changes to an Agent job. We have a tool called SQLSentry, which sends me an e-mail about these deadlocks with the following information:
[Deadlock Victim Information]:
SPID [ecid]: 89 [0]
Host: [...]
Application: Microsoft SQL Server Management Studio
Database: msdb
Login: [...]
Log Used: 0
Deadlock Priority: 0
Wait Time: 2985
Transaction Start Time: 15.03.2016 15:56:19
Last Batch Start Time: 15.03.2016 15:56:19
Last Batch Completion Time: 15.03.2016 15:56:19
Mode/Type: S
Status: suspended
Isolation Level: read committed (2)
Text Data:
at msdb.dbo.sp_get_composite_job_info line 78
INSERT INTO #job_execution_state
SELECT xpr.job_id,
xpr.last_run_date,
xpr.last_run_time,
xpr.job_state,
sjs.step_id,
sjs.step_name,
xpr.current_retry_attempt,
xpr.next_run_date,
xpr.next_run_time,
xpr.next_run_schedule_id
FROM #xp_results xpr
LEFT OUTER JOIN msdb.dbo.sysjobsteps sjs ON ((xpr.job_id = sjs.job_i
Transaction Count: 0
SPID: [...]
Transaction Name: INSERT
Transaction ID: [...]
Database ID: [...]
Wait Resource: KEY: 4:72057594057392128 (b17adeaeb74a)
Lock Timeout: 4294967295
Input Buffer:
exec msdb.dbo.sp_help_job @job_id='0435b309-1178-4b60-a937-435cb9e299ff'
ECID: 0
I don't know what to do about these deadlocks. It doesn't seem that any data is lost because of them. They also don't appear immediately after I press the "Save" button: I received the deadlock information above about 15 minutes later.
Any ideas or similar experiences? It's really painful, because I can't reproduce the error on purpose.
Greetings, nDust
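A note on reading the report above: a KEY wait resource has the form database_id:hobt_id (key hash). Database ID 4 is normally msdb, and the hobt_id can be looked up in sys.partitions in that database to find the index the lock belongs to. A minimal Python sketch that splits the value into its parts; the class and function names are made up for illustration:

```python
import re
from typing import NamedTuple


class KeyWaitResource(NamedTuple):
    database_id: int  # 4 is normally msdb
    hobt_id: int      # matches sys.partitions.hobt_id
    key_hash: str     # hash of the locked index key


# Matches e.g. "KEY: 4:72057594057392128 (b17adeaeb74a)"
_KEY_RE = re.compile(r"KEY:\s*(\d+):(\d+)\s*\(([0-9a-fA-F]+)\)")


def parse_key_wait_resource(text: str) -> KeyWaitResource:
    """Extract the parts of a KEY wait resource from a deadlock report line."""
    match = _KEY_RE.search(text)
    if match is None:
        raise ValueError(f"not a KEY wait resource: {text!r}")
    return KeyWaitResource(int(match.group(1)), int(match.group(2)), match.group(3))


res = parse_key_wait_resource("Wait Resource: KEY: 4:72057594057392128 (b17adeaeb74a)")
print(res.database_id, res.hobt_id, res.key_hash)
```

With the hobt_id in hand, joining sys.partitions to sys.indexes in msdb should show which index of which table the two sessions were contending for.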

Related

SQL Server backup jobs failing. The error seems to be related to MaximumErrorCount, but I am not sure how to fix the value or whether changing it will resolve the problem

My backup jobs are failing with the following error in the job history. I have already checked this link regarding the MaximumErrorCount value.
https://stackoverflow.com/questions/3250648/sql-server-2008-change-the-maximumerrorcount-or-fix-the-errors
At this point I'm not sure what to do if maximumerrorcount is only masking the actual problem.
Log Job History (DB_TaxExemption_BKP.Subplan_1)
Step ID 1 Server INHOUSE-DB Job Name DB_TaxExemption_BKP.Subplan_1
Step Name Subplan_1 Duration 00:00:00 Sql Severity 0 Sql Message
ID 0 Operator Emailed Operator Net sent Operator Paged Retries
Attempted 0 Message Executed as user: NT AUTHORITY\LOCAL SERVICE.
Microsoft (R) SQL Server Execute Package Utility Version
14.0.1000.169 for 64-bit Copyright (C) 2017 Microsoft. All rights reserved. Started: 10:12:00 PM Error: 2022-03-30 22:12:00.47
Code: 0xC00291EC Source: {4D6AAF94-D3FC-4873-9F66-E35E323A6BEE}
Execute SQL Task Description: Failed to acquire connection "Local
server connection". Connection may not be configured correctly or you
may not have the right permissions on this connection. End Error
Warning: 2022-03-30 22:12:00.47 Code: 0x80019002 Source:
OnPreExecute Description: SSIS Warning Code
DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but
the number of errors raised (1) reached the maximum allowed (1);
resulting in failure. This occurs when the number of errors reaches
the number specified in MaximumErrorCount. Change the
MaximumErrorCount or fix the errors. End Warning Error: 2022-03-30
22:12:00.48 Code: 0xC0024104 Source: Back Up Database (Full)
Description: The Execute method on the task returned error code
0x80131904 (Login failed for user 'backupuser'.). The Execute method
must succeed, and indicate the result using an "out" parameter. End
Error Error: 2022-03-30 22:12:00.48 Code: 0xC0024104
Source: {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX} Description: The
Execute method on the task returned error code 0x80131904 (Login
failed for user 'backupuser'.). The Execute method must succeed, and
indicate the result using an "out" parameter. End Error Warning:
2022-03-30 22:12:00.48 Code: 0x80019002 Source: OnPostExecute
Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The
Execution method succeeded, but the number of errors raised (2)
reached the maximum allowed (1); resulting in failure. This occurs
when the number of errors reaches the number specified in
MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
End Warning DTExec: The package execution returned DTSER_FAILURE (1).
Started: 10:12:00 PM Finished: 10:12:00 PM Elapsed: 0.219 seconds.
The package execution failed. The step failed
I know for sure that the user for this backup job is a sysadmin, is not locked out, and has all rights granted. This is production, so I cannot take many risks. What am I missing?

Empty error message when processing SSAS Multidimensional Cube

When processing an SSAS Multidimensional Cube's dimensions from an SSIS package the task repeatedly fails with an empty error message:
If I process the dimensions manually with the default settings, everything seems to work. The default settings are:
Processing order: Parallel
Transaction mode: (Default)
Dimension errors: (Default)
Dimension key error log path: (Default)
Process affected objects: Do not process
The SSIS task runs with the following settings:
Processing order: Sequential
Transaction mode: All in one transaction
Dimension errors: (Default)
Dimension key error log path: (Default)
Process affected objects: Do not process
Today I ran the manual processing with the same settings as the task, and got the result shown in the image.
Can someone help me understand what the empty error message means? And how do the Processing order and Transaction mode settings affect error messages?

ServerXmlHttpRequest hanging sometimes when doing a POST

I have a job that periodically does some work involving ServerXmlHttpRequest to perform an HTTP POST. The job runs every 60 seconds.
And normally it runs without issue. But there's about a 1 in 50,000 chance (every two or three months) that it will hang:
IXMLHttpRequest http = new ServerXmlHttpRequest();
http.open("POST", deleteUrl, false, "", "");
http.send(stuffToDelete); // <-- hangs here
When it hangs, not even the Task Scheduler (with the option enabled to kill the job if it takes longer than 3 minutes to run) can end the task. I have to connect to the remote customer's network, get on the server, and use Task Manager to kill the process.
And then it's good for another month or three.
Eventually I started using Task Manager to create a process dump, so I could analyze where the hang is. After five crash dumps (over the last 11 months or so) I get a consistent picture:
ntdll.dll!_NtWaitForMultipleObjects@20()
KERNELBASE.dll!_WaitForMultipleObjectsEx@20()
user32.dll!MsgWaitForMultipleObjectsEx()
user32.dll!_MsgWaitForMultipleObjects@20()
urlmon.dll!CTransaction::CompleteOperation(int fNested) Line 2496
urlmon.dll!CTransaction::StartEx(IUri * pIUri, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4453 C++
urlmon.dll!CTransaction::Start(const wchar_t * pwzURL, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4515 C++
msxml3.dll!URLMONRequest::send()
msxml3.dll!XMLHttp::send()
Contoso.exe!FrobImporter.TFrobImporter.DeleteFrobs Line 971
Contoso.exe!FrobImporter.TFrobImporter.ImportCore Line 1583
Contoso.exe!FrobImporter.TFrobImporter.RunImport Line 1070
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.HandleFrobImport Line 433
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.CoreExecute Line 71
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.Execute Line 84
Contoso.exe!Contoso.Contoso Line 167
kernel32.dll!@BaseThreadInitThunk@12()
ntdll.dll!__RtlUserThreadStart()
ntdll.dll!__RtlUserThreadStart@8()
So I call ServerXmlHttpRequest.send, and it never returns. It will sit there for days (causing the system to miss financial transactions, until come Sunday night I get a call that it's broken).
It is of no help unless someone knows how to debug code, but the registers in the stalled thread at the time of the dump are:
EAX 00000030
EBX 00000000
ECX 00000000
EDX 00000000
ESI 002CAC08
EDI 00000001
EIP 732A08A7
ESP 0018F684
EBP 0018F6C8
EFL 00000000
Windows Server 2012 R2
Microsoft IIS/8.5
Default timeouts of ServerXmlHttpRequest
You can use serverXmlHttpRequest.setTimeouts(...) to configure the four classes of timeouts:
resolveTimeout: The value is applied to mapping host names (such as "www.microsoft.com") to IP addresses; the default value is infinite, meaning no timeout.
connectTimeout: A long integer. The value is applied to establishing a communication socket with the target server, with a default timeout value of 60 seconds.
sendTimeout: The value applies to sending an individual packet of request data (if any) on the communication socket to the target server. A large request sent to a server will normally be broken up into multiple packets; the send timeout applies to sending each packet individually. The default value is 30 seconds.
receiveTimeout: The value applies to receiving a packet of response data from the target server. Large responses will be broken up into multiple packets; the receive timeout applies to fetching each packet of data off the socket. The default value is 30 seconds.
KB305053 (a server that decides to keep the connection open will cause serverXmlHttpRequest to wait for the connection to close) seems like it could plausibly be the issue. But the 30-second default timeout should have taken care of that.
Possible workaround - Add myself to a Job
The Windows Task Scheduler is unable to terminate the task, even though the option to do so is enabled.
I will look into using the Windows Job API to add my own process to a job, and use SetInformationJobObject to set a time limit on my process:
CreateJobObject
AssignProcessToJobObject
SetInformationJobObject
to limit my process to three minutes of execution time:
PerProcessUserTimeLimit
If LimitFlags specifies
JOB_OBJECT_LIMIT_PROCESS_TIME, this member is the per-process
user-mode execution time limit, in 100-nanosecond ticks. Otherwise,
this member is ignored.
The system periodically checks to determine
whether each process associated with the job has accumulated more
user-mode time than the set limit. If it has, the process is
terminated.
If the job is nested, the effective limit is the most
restrictive limit in the job chain.
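For reference, a three-minute limit expressed in the 100-nanosecond ticks that PerProcessUserTimeLimit expects works out as follows. This is only the arithmetic, sketched in Python; the constant and function names are illustrative and not part of the Windows API:

```python
# One tick is 100 ns, so there are 10^7 ticks in a second.
TICKS_PER_SECOND = 10_000_000


def seconds_to_ticks(seconds: float) -> int:
    """Convert a duration in seconds to 100-nanosecond ticks."""
    return int(seconds * TICKS_PER_SECOND)


three_minutes = seconds_to_ticks(3 * 60)
print(three_minutes)  # 1800000000
```

The value is stored in a LARGE_INTEGER (64-bit) field, so it should be passed as a 64-bit quantity rather than a 32-bit integer.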
Although, since Task Scheduler also uses Job objects to limit a task's time, I'm not hopeful that a Job object can limit my process either.
Edit: Job objects cannot limit a process by elapsed time, only by user-mode CPU time. A process that sits idle waiting on an object accumulates no user time, certainly not three minutes' worth.
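Since the limit that matters here is elapsed time rather than CPU time, one alternative is a small watchdog parent that starts the job as a child process and terminates it once a wall-clock deadline passes. This is a different technique from the Job object API discussed above. A minimal sketch in Python, assuming the job can be launched as a child process; run_with_deadline is a hypothetical helper, and whether termination succeeds against this particular hang is untested:

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_deadline(cmd: Sequence[str], seconds: float) -> Optional[int]:
    """Run cmd and return its exit code, or None if it was killed
    for exceeding `seconds` of wall-clock time."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.kill()   # on Windows this calls TerminateProcess
        proc.wait()   # reap the child so no handle is left behind
        return None


if __name__ == "__main__":
    # Stand-in for the real job: a child that sleeps far longer than allowed.
    hung_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_deadline(hung_job, 1.0))
```

The deadline is measured by the parent, so it does not depend on how much CPU time the waiting child has accumulated.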
Bonus Reading
How can a ServerXMLHTTP GET request hang? (GET, not POST)
KB305053: ServerXMLHTTP Stops Responding When You Send a POST Request (which says the timeout should expire; where mine does not)
MS Forums: oHttp.Send - Hangs (HEAD, not POST)
MS Forums: ASP to test SOAP WebService using MSXML2.ServerXMLHTTP Send hangs
CC to MS Support Forums
Consider switching to a newer, supported API.
msxml6.dll using MSXML2.ServerXMLHTTP.6.0
winhttpcom.dll using WinHttp.WinHttpRequest.5.1.
The msxml3.dll library is no longer supported and is only kept around for compatibility reasons. Plus, there were a number of security and stability improvements included with msxml4.dll (and newer) that you are missing out on.

Doctrine deadlock with ORM updates

I'm trying to figure out what is causing deadlocks in my Symfony 2 application. I'm running a cronjob that does batch-updates on a fairly large dataset and one part of it causes this error:
Doctrine\DBAL\DBALException: An exception occurred while executing
'UPDATE SpotEvent SET ts = ?, current = ? WHERE id = ?' with params
["2015-12-28 00:35:27", 1, 39316]: SQLSTATE[40P01]: Deadlock
detected: 7 ERROR: deadlock detected DETAIL: Process 32030 waits for
ShareLock on transaction 2130787; blocked by process 32029. Process
32029 waits for ShareLock on transaction 2130786; blocked by process
32030. HINT: See server log for query details. CONTEXT: while updating tuple (105,68) in relation "spotevent" (uncaught exception)
at
/home/maf/symfony/vendor/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php
line 91 while running console command
The code causing it is basically this:
check event
if (already in database) {
update timestamp
} else {
create new
}
From what I see in the error, the first branch causes the deadlock, but from what I read about deadlocks, the second should be more likely. In any case I don't understand why I have a deadlock at all.
I should say I am running this job in 6 parallel processes. However, there is no overlap between them (i.e. job one is checking from 1-200, job 2 from 201 to 400, etc.)
I'm using PostgreSQL as the database backend. My "check event" step is done using DQL, everything else is pure ORM.
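PostgreSQL resolves a deadlock by aborting one of the transactions involved, so the aborted unit of work can usually be rolled back and retried. A minimal sketch of such a retry loop, written in Python rather than PHP for brevity; DeadlockDetected and run_with_deadlock_retry are hypothetical names standing in for the driver error that carries SQLSTATE 40P01 and for whatever wraps one batch of updates:

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeadlockDetected(Exception):
    """Hypothetical stand-in for the driver error with SQLSTATE 40P01."""


def run_with_deadlock_retry(
    transaction: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call transaction(); on a deadlock, back off and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise
            # Exponential backoff with jitter so the workers do not collide again in lockstep.
            sleep(base_delay * 2 ** (attempt - 1) * (1 + random.random()))
    raise RuntimeError("unreachable")


# Demo: a fake batch that deadlocks twice before it succeeds.
attempts = []


def flaky_batch() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockDetected("deadlock detected")
    return "committed"


result = run_with_deadlock_retry(flaky_batch, sleep=lambda s: None)
print(result, len(attempts))  # committed 3
```

Retrying only treats the symptom. Updating rows in a consistent order within each transaction (for example sorted by id) reduces the chance that two processes lock the same rows in opposite order.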

SQL Server 2005 agent not working

SQL Server 2005 Service Pack 2, version 9.00.3042.00
All maintenance plans fail with the same error.
The details of the error are:
Execute Maintenance Plan
Execute maintenance plan. test7 (Error)
Messages
Execution failed. See the maintenance plan and SQL Server Agent job history logs for details.
The advanced information section shows the following:
Job 'test7.Subplan_1' failed. (SqlManagerUI)
Program Location:
at Microsoft.SqlServer.Management.SqlManagerUI.MaintenancePlanMenu_Run.PerformActions()
At this point the following appears in the Windows event log:
Event Type: Error
Event Source: SQLISPackage
Event Category: None
Event ID: 12291
Date: 28/05/2009
Time: 16:09:08
User: 'DOMAINNAME\username'
Computer: SQLSERVER4
Description:
Package "test7" failed.
and also this:
Event Type: Warning
Event Source: SQLSERVERAGENT
Event Category: Job Engine
Event ID: 208
Date: 28/05/2009
Time: 16:09:10
User: N/A
Computer: SQLSERVER4
Description:
SQL Server Scheduled Job 'test7.Subplan_1' (0x96AE7493BFF39F4FBBAE034AB6DA1C1F) - Status: Failed - Invoked on: 2009-05-28 16:09:02 - Message: The job failed. The Job was invoked by User 'DOMAINNAME\username'. The last step to run was step 1 (Subplan_1).
There are no entries in the SQL Agent log at all.
Probably no points for this, but you're likely to get more help on this over at ServerFault.com now that they are open.