I have a SQL Server database in production and it has been live for 2 months. A while ago the web application associated with it loading takes too long time. And sometimes it says timeout occurred.
Found a quick fix by running a command 'exec sp_updatestats' will fixed the problem. But I need to be run that one consistently (for every 5 minutes).
So I created a Windows service with timer and started on server. My question is what are the root causes and possible permanent solutions? Anyone?
Here is a Most expensive query from Activity Monitor
WAITFOR(RECEIVE TOP (1) message_type_name, conversation_handle, cast(message_body AS XML) as message_body from [SqlQueryNotificationService-2eea594b-f994-43be-a5ed-d9a47837a391]), TIMEOUT #p2;
To diagnose a poorly performing queries you need to:
Identify the poorly performing query, e.g. via application logging, a SQL Profiler trace filtered to show only queries with a longer duration than a certain threshold etc...
Get an execution plan for the query
At that point you can start to try to figure out what the performance issue is.
Given that exec sp_updatestats fixes your issue it could be that statistics on certain tables are out of date (this is a pretty common cause of performance issues). If thats the case then you might be able to tweak your statistics or at least rebuild only those statistics that are causing issues.
Its worth noting that updating statistics will also cause cached execution plans to become invalid, and so its plausible that your issue is unrelated to statistics - you need to collect more information about the poorly performing queries before looking at solutions.
Your most expensive query looks like its waiting for a message, i.e. its in your list of long running queries because its designed to run for a long time, not because its expensive.
Thanks for everyone i found a solution for my issue . Its quite different I've enabled sql dependency module on my sql server by setting up enable broker on , thats the one causing timeout query so by setting it to false everything is fine working now.
Related
SQL Azure issue.
I've got an issue that manifests as the following exception on our (asp.net) site:
Timeout expired. The timeout period elapsed prior to completion of
the operation or the server is not responding. The statement has been
terminated.
It also results in update and insert statements never completing in SMSS. There aren't any X or IX locks present when querying: sys.dm_tran_locks and there are no transactions when querying sys.dm_tran_active_transactions or sys.dm_tran_database_transactions.
The problem is present for every table in the database but other databases on the same instance don't cause the problem. The duration of the issue can be anywhere from 2 minutes to 2 hours and doesn't happen at any specific times of day.
The database is not full.
At one point this issue didn't resolve itself but I was able to resolve the issue by querying sys.dm_exec_connections finding the longest running session, and then killing it. The odd thing is, that the connection was 15 minutes old, but the lock issue had been present for over 3 hours.
Is there anything else I can check?
EDIT
As per Paul's answer below. I'd actually tracked down the problem before he answered. I will post the steps I used to figure this out below, in case they help anyone else.
The following queries were run when a "timeout period" was present.
select * from sys.dm_exec_requests
As we can see, all the WAIT requests are waiting on session 1021 which is the replication request! The TM Request indicates a DTC transaction and we don't use distributed transactions. You can also see the wait_type of SE_REPL_COMMIT_ACK which again implicates replication.
select * from sys.dm_tran_locks
Again waiting on session 1021
SELECT * FROM sys.dm_db_wait_stats ORDER BY wait_time_ms desc
And yes, SE_REPL_CATCHUP_THROTTLE has a total wait time of 8094034
ms, that is 134.9minutes!!!
Also see the following forum for details on this issue.
http://social.technet.microsoft.com/Forums/en-US/ssdsgetstarted/thread/c3003a28-8beb-4860-85b2-03cf6d0312a8
I've been given the following answer in my communication with
Microsoft (we've seen this issue with 4 of our 15 databases in the EU
data center):
Question: Have there been changes to these soft
throttling limits in the last three weeks ie since my problems
started?
Answer: No, there has not.
Question: Are there ways we can
prevent or be warned we are approaching a limit?
Answer: No. The issue
may not be caused by your application but can be caused by other
tenants relying on the same physical hardware. In other words, your
application can have very little load and still run into the problem.
In other words, your own traffic may be a cause of this problem, but
it can just as well be caused by other tenants relying on the same
physical hardware. There's no way to know beforehand that the issue
will soon occur - it can occur at any time without warning. The SQL
Azure operations team does not monitor this type of error, so they
won't automatically try to solve the problem for you. So if you run
into it you have two opitions:
Create a copy of your db and use that and hope the db is placed on another server with less load.
Contact Windows Azure Support and inform the about the problem and let them do Option 1 for you
You might be running into the SE_REPL* issues that are currently plaguing a lot of folks using Sql Azure (my company included).
When you experience the timeouts, try checking your wait requests for wait types of:
SE_REPL_SLOW_SECONDARY_THROTTLE
SE_REPL_COMMIT_ACK
Run the following to check your wait types on current connections:
SELECT TOP 10 r.session_id, r.plan_handle,
r.sql_handle, r.request_id,
r.start_time, r.status,
r.command, r.database_id,
r.user_id, r.wait_type,
r.wait_time, r.last_wait_type,
r.wait_resource, r.total_elapsed_time,
r.cpu_time, r.transaction_isolation_level,
r.row_count
FROM sys.dm_exec_requests r
You can also check a history of sorts for this by running:
SELECT * FROM sys.dm_db_wait_stats
ORDER BY wait_time_ms desc
If you're seeing a lot of SE_REPL* wait types and these are staying set on your connections for any length of time, then basically you're screwed.
Microsoft are aware of the problem, but I've had a support ticket open for a week with them now and they're still working on it apparently.
The SE_REPL* waits happen when the Sql Azure replication slaves fall behind.
Basically the whole db suspends queries while replication catches up :/
So essentially the aspect that makes Sql Azure highly available is causing databases to become randomly unavailable.
I'd laugh at the irony if it wasn't killing us.
Have a look at this thread for details:
http://social.technet.microsoft.com/Forums/en-US/ssdsgetstarted/thread/c3003a28-8beb-4860-85b2-03cf6d0312a8
I have SQL job that runs every night which does various inserts/updates/deletes. The job contains 40 steps which mainly execute stored procedures.
It's been running fine up until a week ago when suddenly the run time went up from 2.5 hours to over 5 hours, sometimes even 8,9,10!
Could one you please give me any pointers?
First of all let me recommend you a valuable resource on Simple-Talk site. Is a detailed methodology of how to troubleshoot performance issues on SQL Server.
Does the insert you say was carried out was a huge bulk insert that could affect performance? Maybe if it was a huge load the query execution plans could be different and you need to re-tune your table structure, indexes, etc.
If the run time suddenlychanged and no changes where done in the queries or your database structure then I would ask myself several questions:
first, does the process is still taking so long or it was only one time it ran so slow? maybe now is running smoothly and the issue only arised once. Nevertheless, try to find what triggered that bad performance, it can happend again and take down your server
is the server a dedicated sql server? if not, check if some new tasks unrelated to the SQL engine had been configured, maybe a new tasks is doing some heavy I/O jobs and therefore your CRUD operations take longer
if it is a dedicated server, then check that no new job has been added and can take down your existing jobs. Check this SO link for details on jobs settled up from the SQL Agent
maybe low memory due to another process on same server?
And there is lot more to check, but before going deeper I would check that no external (non sql server related) was the reason of the delay on the process execution.
First of, I apologize if this is the wrong place for this questions, but I haven't found any other location that might help me out.
I have a query that is running on a sql server that keeps running indefinitely and as a result the version store on SQL Server grows and tempdb grows as well. Currently I don't have the source code.
I would like to get a few pointers for where to search for the cause of this problem.
In activity monitor all I see is a process with a taskstate of SUSPENDED, and Wait Type of ASYNC_NETWORK_IO_WRITELOG. I'm running this on SQL Server 2008.
Again sorry if this is the wrong place for asking this.
/Andy.l
First, I've got no experience with SQL2008, but from my experience with SQL2000, it managed itself sometimes into nasty locking situations running multicore environment. I would try rerun query with Option (MAXDOP 1). Cost you almost nothing to check it out.
At least with older SQL Server versions a SELECT could be blocked by other sessions, so I would suspect something like that in your case (even though your mentioning the "version store" which seems to indicate that you have enabled the new snapshot isolation mode).
Running sp_who2 will give you more details about whether it is a blocking problem or not
I am working on a Flex application that is connecting via Flash Remoting to ColdFusion 8 with a SQL Server 2005 database. Most of the time, everything runs fine. However, from time to time, it will take an exceptionally long time for SQL Server to return data from a stored procedure call to ColdFusion; returning data from CF to Flex is very fast. When this happens, I can run the exact same call from Management Studio on the SQL Server or a ColdFusion page on the CF server and get results immediately. The most recent occurrence of the issue took about 90 seconds to return data to CF. During the 90 second window, I was able to run the stored procedure in Management Studio several times.
I have tried using different drivers and this doesn't seem to matter. I have also kept an eye on server performance and haven't noticed anything unusual while this is happening. Has anyone seen this behavior before? Any ideas as to what I should be looking for.
While it's working slowly, can you run "sp_who2" against your SQL Server? If it's a blocking issue, you'll see rows that have a value in the "BlkBy" column, meaning that they'r waiting for another process to complete before they can continue.
If that's the case, then there's other troubleshooting to do so you can figure out what's causing the blocks. This article provides an overview of locking and troubleshooting blocks. If that's your issue, please update your question and add more details, and we can help you go from there!
Are you absolutely sure that the query being run in the sp is the same every time? For example, is it possible that when it slows down, the query has a different sort order? Possibly 9 times out of 10, the query returns quickly, and that 10th time is slow b/c the data you're getting is being sorted by some column that isn't indexed?
In these situations, I'd try to have a SQL Trace set up (using sql profiler) and let it run for a while. Once the situation happens, let it run through, and then analyze the trace. Confirm beyond doubt that the query being run is the same as other executions of the same sp
I have RO access on a SQL View. This query below times out. How to avoid this?
select
count(distinct Status)
from
[MyTable] with (NOLOCK)
where
MemberType=6
The error message I get is:
Msg 121, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
Your query is probably fine. "The semaphore timeout period has expired" is a Network error, not a SQL Server timeout.
There is apparently some sort of network problem between you and the SQL Server.
edit: However, apparently the query runs for 15-20 min before giving the network error. That is a very long time, so perhaps the network error could be related to the long execution time. Optimization of the underlying View might help.
If [MyTable] in your example is a View, can you post the View Definition so that we can have a go at optimizing it?
Although there is clearly some kind of network instability or something interfering with your connection (15 minutes is possible that you could be crossing a NAT boundary or something in your network is dropping the session), I would think you want such a simple?) query to return well within any anticipated timeoue (like 1s).
I would talk to your DBA and get an index created on the underlying tables on MemberType, Status. If there isn't a single underlying table or these are more complex and created by the view or UDF, and you are running SQL Server 2005 or above, have him consider indexing the view (basically materializing the view in an indexed fashion).
You could put an index on MemberType.
Please check your Windows system event log for any errors specifically for the "Event Source: Dhcp". It's very likely a networking error related to DHCP. Address lease time expired or so. It shouldn't be a problem related to the SQL Server or the query itself.
Just search the internet for "The semaphore timeout period has expired" and you'll get plenty of suggestions what might be a solution for your problem. Unfortunately there doesn't seem to be the solution for this problem.
Do you have an index defined over the Status column and MemberType column?
how many records do you have? are there any indexes on the table? try this:
;with a as (
select distinct Status
from MyTable
where MemberType=6
)
select count(Status)
from a
My team were experiencing these issues intermittently with long running SSIS packages. This has been happening since Windows server patching.
Our SSIS and SQL servers are on separate VM servers.
Working with our Wintel Servers team we rebooted both servers and for the moment, the problem appears to have gone away.
The engineer has said that they're unsure if the issue is the patches or new VMTools that they updated at the same time. We'll monitor for now and if the timeout problems recur, they'll try rolling back the VMXNET3 driver, first, then if that doesn't work, take off the June Rollup patches.
So for us the issue is nothing to do with our SQL Queries (we're loading billions of new rows so it has to be long running).
This is happen because another instance of sql server is running. So you need to kill first then you can able to login to SQL Server.
For that go to Task Manager and Kill or End Task the SQL Server service then go to Services.msc and start the SQL Server service.
While I would be tempted to blame my issues - I'm getting the same error with my query, which is much, much bigger and involves a lot of loops - on the network, I think this is not the case.
Unfortunately it's not that simple. Query runs for 3+ hours before getting that error and apparently it crashes at the same time if it's just a query in SSMS and a job on SQL Server (did not look into details of that yet, so not sure if it's the same error; definitely same spot, though).
So just in case someone comes here with similar problem, this thread:
https://www.sqlservercentral.com/Forums/569962/The-semaphore-timeout-period-has-expired
suggest that it may equally well be a hardware issue or actual timeout.
My loops aren't even (they depend on sales level in given month) in terms of time required for each, so good month takes about 20 mins to calculate (query looks at 4 years).
That way it's entirely possible I need to optimise my query. I would even say it's likely, as some changes I did included new tables, which are heaps... So another round of indexing my data before tearing into VM config and hardware tests.
Being aware that this is old question: I'm on SQL Server 2012 SE, SSMS is 2018 Beta and VM the SQL Server runs on has exclusive use of 132GB of RAM (30% total), 8 cores, and 2TB of SSD SAN.