SQL Server running slow due to thread count increase - sql

On our production server, for some reason at specific amount of time, thread count goes over and over to the certain point that though CPU Utilization is normal(30-50%), but the query starting to run slow we so lot more blocking statements.
I am not sure where to look at it, basically when our site runs normally, the thread count is around 150 threads, but during the specific time in a day(during 1:30 to 2:30) it come up to 270 threads.there are no extra sql transaction goes on, everything as normal as it was before but thread count grows and sql start behaving very very slow.
After restarting the SQL service immediately thread count comes to normal, and our site function fine for another 24 hours.
We are using SQL Server 2005, it is 24 core machine.
any idea?

Blocking statements steal workers (sys.dm_os_workers) so the server will spawn more workers to handle the incoming tasks. At 24 cores you'll have some 700 max worker threads out-of-the-box by default. So seeing 270 'threads' is not an issue, is well within the normal functioning parameters. You real problem must be the blocking, and you have to investigate it accordingly: who is blocking who and why. My bet is that you have a job running between 1:30 and 2:30 that is locking large portions of the database (a delete job perhaps?) and your queries block on locked rows. You'll have to investigate, find the root cause, and act accordingly. Reboot is not a solution, nor is blaming unrelated components (thread count). Use Activity Monitor, use Who Is Active, follow the methodical approach of Waits and Queues methodology. There are plenty of ways to identify the real problem. SQL Server will never appear slow due to thread count. It simply doesn't work like that.

You can control the degree of parallalism using the MAXDOP query hint. For more details please check this article:
http://blog.sqlauthority.com/2010/03/15/sql-server-maxdop-settings-to-limit-query-to-run-on-specific-cpu/

Thanks for your valuable feedback, yes it is true it is nothing which sql is behaving weiredly, it is something our Site which is based on Ektron CMS is responsible for, one of the Functionality of Ektron CMS (which is PageBuilder), while doing operation on this piece of Content was holding of the table badly, we have around 10 million users to our site, and probably since this was blocking the tables SQL Server goes nuts and does not respond very well.
We have finally eliminated the issue.

Related

Timeout expired SQL Server 2008

I have a SQL Server database in production and it has been live for 2 months. A while ago the web application associated with it loading takes too long time. And sometimes it says timeout occurred.
Found a quick fix by running a command 'exec sp_updatestats' will fixed the problem. But I need to be run that one consistently (for every 5 minutes).
So I created a Windows service with timer and started on server. My question is what are the root causes and possible permanent solutions? Anyone?
Here is a Most expensive query from Activity Monitor
WAITFOR(RECEIVE TOP (1) message_type_name, conversation_handle, cast(message_body AS XML) as message_body from [SqlQueryNotificationService-2eea594b-f994-43be-a5ed-d9a47837a391]), TIMEOUT #p2;
To diagnose a poorly performing queries you need to:
Identify the poorly performing query, e.g. via application logging, a SQL Profiler trace filtered to show only queries with a longer duration than a certain threshold etc...
Get an execution plan for the query
At that point you can start to try to figure out what the performance issue is.
Given that exec sp_updatestats fixes your issue it could be that statistics on certain tables are out of date (this is a pretty common cause of performance issues). If thats the case then you might be able to tweak your statistics or at least rebuild only those statistics that are causing issues.
Its worth noting that updating statistics will also cause cached execution plans to become invalid, and so its plausible that your issue is unrelated to statistics - you need to collect more information about the poorly performing queries before looking at solutions.
Your most expensive query looks like its waiting for a message, i.e. its in your list of long running queries because its designed to run for a long time, not because its expensive.
Thanks for everyone i found a solution for my issue . Its quite different I've enabled sql dependency module on my sql server by setting up enable broker on , thats the one causing timeout query so by setting it to false everything is fine working now.

TADOStoredProc/TADOQuery CommandTimeout...?‏

On a client is being raised the error "Timeout" to trigger some commands against the database.
My first test option for correction is to increase the CommandTimeout to 99999 ... but I am afraid that this treatment generates further problems.
Have experienced it ...?
I wonder if my question is relevant, and/or if there is another option more robust and elegant correction.
You are correct to assume that upping the timeout is not the correct approach. Typically, I look for log running queries that are running around the timeouts. They will typically stand out in the areas of duration and reads.
Then I'll work to reduce the query run time using this method:
https://www.simple-talk.com/sql/performance/simple-query-tuning-with-statistics-io-and-execution-plans/
If it's a report causing issues and you can't get it running faster, you may need to start thinking about setting up a reporting database.
CommandTimeout is a time, that the client is waiting for a response from server. If the query is run in the main VCL thread then the whole application is "frozen" and might be marked "not responding" by Windows. So, would you expect your users to wait at frozen app for 99999 sec?
Generally, leave the Timeout values at default and rather concentrate on tunning the queries as Sam suggests. If you happen to have long running queries (ie. some background data movement, calculations etc in Stored Procedures) set the CommandTimeout to 0 (=INFINITE) but run them in a separate thread.

SQL Server running slow

I have SQL job that runs every night which does various inserts/updates/deletes. The job contains 40 steps which mainly execute stored procedures.
It's been running fine up until a week ago when suddenly the run time went up from 2.5 hours to over 5 hours, sometimes even 8,9,10!
Could one you please give me any pointers?
First of all let me recommend you a valuable resource on Simple-Talk site. Is a detailed methodology of how to troubleshoot performance issues on SQL Server.
Does the insert you say was carried out was a huge bulk insert that could affect performance? Maybe if it was a huge load the query execution plans could be different and you need to re-tune your table structure, indexes, etc.
If the run time suddenlychanged and no changes where done in the queries or your database structure then I would ask myself several questions:
first, does the process is still taking so long or it was only one time it ran so slow? maybe now is running smoothly and the issue only arised once. Nevertheless, try to find what triggered that bad performance, it can happend again and take down your server
is the server a dedicated sql server? if not, check if some new tasks unrelated to the SQL engine had been configured, maybe a new tasks is doing some heavy I/O jobs and therefore your CRUD operations take longer
if it is a dedicated server, then check that no new job has been added and can take down your existing jobs. Check this SO link for details on jobs settled up from the SQL Agent
maybe low memory due to another process on same server?
And there is lot more to check, but before going deeper I would check that no external (non sql server related) was the reason of the delay on the process execution.

FirebirdSQL queries gets stuck at 12:00PM

I'm running Firebird 2.5 (and have also tried earlier versions) on Windows. Every day after 12:00PM running insert/update queries on one specific table hang, but complete successfully by 12:35 or so, no matter when started. It does seem that Firebird is doing some kind of maintenance on the table and it takes half an hour to complete, during which time the table cannot be written to (but the reads are fast). The table itself is really small, some 10000 rows, compared to millions of rows we have in other tables - and other tables do not get stuck.
I haven't been able to find any reason or solution. I tried dumping the table and restoring it, which didn't help, I tried switching between superserver and classic, changed versions with no success.
Has anyone experienced a problem like this?
No. Firebird doesn't have any internal maintenance procedures bind to some specified time of a day. Seems, there is some task on your server scheduled to run at 12:00 PM. Or there are network users of the server who start doing some heavy access at 12:00 PM.
The only maintenance FB does is "garbage collection" (geting rid of old record versions) and this is done on "when needed" basis (usually when records were selected, see the GCPolicy in firebird.conf) not on some predefined time.
Do you experience this hang only on during these certain hours or is it always slow to insert to that table? Have you checked the server load during the slowdown (ie in the task manager, is the CPU maxed out)? Anyway, here is some ideas to check:
What constraints / triggers do you have on the table? If they involve some extensive checks (ie against the other tables which contain millions of rows) this could be the reason inserts take so long.
Perhaps there is some other service which is triggered at that time? Ie do you have a cron job to make backup of the DB at that time? Or perhaps some other system service which runs at that time with higher priority slows down the server?
Do you have trace service active for the table? See fbtrace.conf in FireBird root directory. If it is active, extensive logging might be the cause of slowdown, if it isn't active, using it might help you to find the cause.
What are the setings for ForcedWrites / UnflushedWrites (see firebird.conf)? Does changing them make difference?
Is there something logged for this troublesome timeframe in firebird.log?
To me it looks like you have a process which starts at 12:00 and does something which locks the entire table. Use the monitoring table or the trace manager to see if there is any connection or active transaction which looks suspicious.
I also think your own transaction are started with the WAIT clause without a LOCK TIMEOUT, you might want to change this to NO WAIT or WAIT with a LOCK TIMEOUT, so that your transactions either fail immediately or after the timeout.
My suggestion is to use the TRACE API in 2.5 to track down what is happening near or around that time. That should help get you more information as to what is happening.
I use this for debugging http://upscene.com/products.misc.fbtm.php kinda buggy itself, but when it is working it is a god send.
Are some Client-Connections going DOWN at 12:00 PM? I had a similar problem on a 70.000 records sized table:
Client "A" has a permanently open DB Connection like "select * from TABLE". This is a "read only transaction" but reason enough for the server to generate Record-Versions. Why?
Client "B" made massive Updates to this Table, the Server tries to preserve the world like it was when "A" startet her "select". This is normal for Transaction able DB-Servers, and its implemented by creating Record Copies of the record-data before its updated.
So in my case for this TABLE 170.000 Record Versions existed. You can measure this by
gstat -r -t TABLE db.fdb | grep versions
If Client "B" goes down, the count of Record-Versions is NOT growing any more. Client "A" is the guilty one, freezing all this versions, forces the server to hold it. Finally if Client "A" goes down (or for example a firewall rule cuts all pending connections) Firebird is happy to start the process of getting rid of the now useless Record-Versions.
This "sweep"?! is bad programmed (even 2.5.2) cpu is 3% it do only <10.000 Versions / Minute so this TABLE has a performance of about 2%.

Intermittent SQL timeouts

Since a few days ago, the SQL server (Microsoft SQL Server 2005) backing our site has started occasionally timeouting. It is happening at seemingly random times approximately every hour or two. It usually takes about 10 minutes during which we see hundreds of timeouted requests. Under normal circumstances, most of our queries take less than 50ms. A query which takes a significant fraction of a second is an exception.
I have effectively killed a day trying to figure out at least something without any real progress. Normally, the server load is about 10-20%, and when the timeouts happen, we don’t see any increased CPU load. Also, there is nothing special happening during the timeouts; no overzealous web crawler, no heavy background tasks, no increased network traffic, no increased number of connections etc. Simply, everything looks as usual.
Not making any progress, we decided to restart it (and install the latest SP since we were in it) which seems to have fixed the problem. It has been already over six hours without any incident. Also, the CPU load has gone down under 10%.
It almost seems like if the SQL server "deteriorated" overtime. Perhaps, some internal structure (some cache or statistic) got out shape and caused the occasional problems. I don’t have any other explanation.
The only thing I noticed when I was monitoring the server (and got lucky once to be present when the timeouts were happening), I saw several long running queries waiting on CXPACKET. But I learned that this is most likely just a consequence of some other problem. I wrote a script monitoring SQL requests, and so hopefully, next time it happens, I will have more information.
Has anybody had similar experience? I’m not an SQL Server guru. Any suggestions are welcome.
since everything looked normal: CPU, nothing special happening, no overzealous web crawler, no heavy background tasks, no increased network traffic, no increased number of connections etc. I'd look into locking\blocking\race condition. Use this to see what (if anything) is locking when the time-out are happening:
How to find out what SQL queries are being blocked and what's blocking them?