I am running an Azure DevOps pipeline that calls a SQL stored procedure.
This procedure takes more than 6 hours to complete (around 10 hours).
After the procedure finishes, I take a database backup (around 12 hours).
The problem I'm facing is that the pipeline execution throws an error after 6 hours:
Not received any response, or we stopped hearing from Agent Azure Pipelines 4. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error. For more information
Our network is healthy and there are no issues in the procedure itself.
I have tried the setup below, but the problem remains:
timeout set to 0, and also to 7600
number of retries if task fails set to 5
Please let me know:
how to make the pipeline wait until the SQL procedure completes
how to catch SQL exceptions and stop pipeline execution. Currently the pipeline reports a success status even if an error occurs in SQL.
Regards,
Kumar
It seems you are using a Microsoft-hosted agent, so the pipeline times out after 6 hours. Please refer to Capabilities and limitations for Microsoft-hosted agents:
You can pay for additional capacity per parallel job. Paid parallel jobs remove the monthly time limit and allow you to run each job for up to 360 minutes (6 hours).
To resolve the timeout issue, since your pipeline runs for more than 6 hours, consider using a self-hosted agent. Setting the timeout value to zero means the job can run forever on a self-hosted agent.
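A minimal sketch of what that job could look like in YAML, addressing both questions above. The pool name, variable names, and procedure name are placeholders, not your actual setup:

```yaml
jobs:
- job: RunProcedure
  pool: MySelfHostedPool      # assumption: a self-hosted agent pool you registered
  timeoutInMinutes: 0         # 0 = no limit, honored only on self-hosted agents
  steps:
  - script: sqlcmd -S "$(SqlServer)" -d "$(SqlDatabase)" -Q "EXEC dbo.MyLongProcedure" -b
    displayName: Run long-running procedure
    # -b makes sqlcmd exit with a non-zero code when a SQL error occurs,
    # which fails the step instead of letting the pipeline report success
```

Note that `-b` only reacts to errors of severity 11 or higher, so if the procedure swallows errors internally, it would need to re-raise them (e.g. with RAISERROR or THROW) for the pipeline to see the failure.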
Related
Message [000] Request to run job Process*******DataSetLastTwoDays
(********) refused because the job is already running from a request
by Schedule 34 (Every Minute)
Could anyone help with the above error? This is a SQL Server Agent job that is meant to run every 2 minutes. After working for a couple of days, it stops running and causes other jobs that depend on it to fail.
NB: The job processes a partition in SSAS (Tabular mode).
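When the Agent refuses a request because "the job is already running", it can help to confirm whether a previous run is genuinely still executing or merely recorded as unfinished. A sketch against msdb (SQL Server 2005 or later; the job name is a placeholder for your own):

```sql
-- Runs the Agent believes are still executing: started but never finished
SELECT j.name, ja.start_execution_date, ja.stop_execution_date
FROM msdb.dbo.sysjobactivity AS ja
JOIN msdb.dbo.sysjobs AS j ON j.job_id = ja.job_id
WHERE ja.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
  AND ja.start_execution_date IS NOT NULL
  AND ja.stop_execution_date IS NULL;

-- If a run is genuinely hung, stopping it lets the schedule resume:
-- EXEC msdb.dbo.sp_stop_job @job_name = N'YourJobName';
```

If the hang recurs, the underlying SSAS processing step is the place to look next; stopping the job only clears the symptom.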
I'm running into a weird issue with Pentaho 7.1. I run a job that I created and it runs perfectly and quickly the first time I run it.
The job is an ETL Job consisting of a Start widget, 7 Transformations running in a sequence, and a Success widget.
I'm confused as to why the job runs once, but when I try to run it again it says "Spoon - Starting job..." and then just hangs.
If I delete the job and create a brand-new one, I can run it once and am then stuck again, with the job no longer able to run after that. I don't understand why the job keeps hanging after it has executed once, and is then 100% broken after a successful run...
I turned up the logging in Pentaho 7.1 Spoon, and it shows this continuously...
2018/08/14 17:30:24 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
2018/08/14 17:30:34 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
2018/08/14 17:30:44 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
2018/08/14 17:30:54 - Job1 - Triggering heartbeat signal for ported_jobs at every 10 seconds
I can't seem to put my finger on why this is happening.
Any help is appreciated
Probable answer: check that your transformations are not opening the same database for both input and output. A quick check is to run the first transformation directly (without the job) and see if it locks.
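If the target database happens to be SQL Server, one way to see whether the first run left a transaction holding locks behind is a query like the following (adjust for your DBMS; other databases expose equivalent views):

```sql
-- Sessions still holding locks in the current database after the job "finishes"
SELECT l.request_session_id, l.resource_type, l.request_mode, l.request_status,
       s.host_name, s.program_name          -- identifies the Pentaho connection
FROM sys.dm_tran_locks AS l
JOIN sys.dm_exec_sessions AS s ON s.session_id = l.request_session_id
WHERE l.resource_database_id = DB_ID();
```

A lingering row from the Pentaho client after a completed run would support the same-database input/output theory.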
In my case this happened because the database server being updated was slow to respond, probably due to high CPU and RAM usage. I increased the RAM and CPU for the DB server, and now my job runs fine.
We have a multi-instance WCF service (more than 2 instances) that receives requests from Service Bus topics (there can be more than 10,000 requests in a subscription).
The requests mainly result in inserts into our database, with very minimal processing. Our database is a P1 in SQL Azure.
After some time, we keep running out of connections and receive timeouts. I have increased the pool size to 1000 and the connection timeout to 120 seconds. We have checked, and connections are definitely being disposed of correctly.
Any Idea where we should start digging?
Thanks
The higher latencies and the resulting timeouts could be due to reaching the max write capacity of the database.
You can check if this is the case by querying the view sys.dm_db_resource_stats in the database. It shows the resource utilization in percent for the last hour.
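A minimal query over that view could look like this (run it in the user database itself; the view keeps roughly one hour of history in 15-second intervals):

```sql
-- Resource utilization (percent of the tier's limits) for the last hour
SELECT end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent   -- sustained values near 100 suggest the log write cap
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
```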
If you are indeed reaching the log write limits, you should consider upgrading your server to the latest service version (V12), which will give you higher log write rates. If you are already running V12, you may want to consider upgrading to P2.
I have a dedicated server that has been running for years with no recent code or configuration changes, but about a week ago the MS SQL Server database suddenly started becoming unresponsive, and shortly thereafter the entire site goes down due to memory issues on the server. It is sporadic, which leads me to believe it could be a malicious DDoS-like attack, but I am not sure how to confirm what's going on.
After a reboot, it can stay up for a few days, or only a few hours, before I start seeing rampant occurrences of these Info messages in the Windows logs, shortly before it seizes up and fails. Research has not yielded any actionable information yet. Please help, and thank you.
Process 52:0:2 (0xaa0) Worker 0x07E340E8 appears to be non-yielding on Scheduler 0. Thread creation time: 13053491255443. Approx Thread CPU Used: kernel 280 ms, user 35895 ms. Process Utilization 0%%. System Idle 93%%. Interval: 6505497 ms.
New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 2940 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 0%%. System Idle: 91%%.
Here's a blog about the issue that should help you get started: danieladeniji.wordpress.
It seems unlikely that this is a DDoS.
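Until the root cause is clear, the worker-thread pressure described in the second message can be checked directly. A sketch using the scheduler DMVs (SQL Server 2005 or later):

```sql
-- Configured maximum number of worker threads
SELECT max_workers_count FROM sys.dm_os_sys_info;

-- Per-scheduler worker usage; a growing work_queue_count means tasks
-- are waiting for a worker thread, matching the "not been picked up" message
SELECT scheduler_id, current_tasks_count, current_workers_count,
       active_workers_count, work_queue_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
```

If workers are exhausted while CPU is idle, blocked or long-running queries (rather than the "max worker threads" setting) are usually the thing to chase first.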
We have an app with around 200-400 users and once a day or every other day we get the dreaded sql exception:
"Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding".
Once we get this, it happens several times for different users, and then all users are stuck: they can't perform any operations.
I don't have the full specs of the boxes right in front of me but we have:
IIS and SQL Server running on separate boxes
each box has 64gb of memory with multiple cores
We get nothing in the SQL Server logs (as would be expected), and our application catches the SqlException, so we just see the timeout error there, on an UPDATE. The database has only a few key tables, and the timeout happens on one of the tables with about 30k rows.
We have run Profiler on the queries the UI issues, against a copy of production of the same size, and made sure we have all of the right indexes (clustered/non-clustered). In a local environment (smaller box, same size database) everything runs fast, and for most of the day the system runs fast for users too. The exact same query (which hit the timeout in production) ran in less than a second.
We did change our command timeout from 30 seconds to 300 seconds (I know that 0 is unlimited, and I guess we could use that, but it seems like that would just mask the real problem).
We had Profiler running in production, but unfortunately it wasn't fully enabled the last time the timeout happened. We are setting it up correctly now.
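As a stopgap until the trace is in place, blocking at the moment of a timeout can be spot-checked with the DMVs (a sketch, assuming SQL Server 2005 or later; a fast query that times out in production but not locally often points at blocking rather than the plan):

```sql
-- Requests currently blocked, and which session is blocking them
SELECT r.session_id, r.blocking_session_id, r.wait_type,
       r.wait_time, r.command, t.text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```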
Any ideas on what this might be?