We have an application that has been deployed to 50+ websites. Across these sites we have noticed a piece of strange behaviour, we have now tracked this to one specific query. Very occasionally, once or twice a day usually, one of our debugging scripts reports
2006 : MySQL server has gone away
I know there are a number of reasons this error can be thrown but the thing that is most strange is that every single time it is thrown it happens from the same SQL query being run. There is nothing strange or complex about this query, it looks like this:
SELECT `advert_only` FROM `products` WHERE `id` = '6197'
This query must run tens of thousands of times a day, for various different product IDs so it certainly doesnt fail each time. It fails randomly on seemingly random sites across our 4 servers. There is seemingly no commonality, one small thing we have noticed is that it sometimes will happen on 2 or 3 page loads in a row for 1 specific person as we also track the IP of the person it has happened to.
This is on CentOS 5 servers running MySQL 5.0.81
This is kind of in left field, but you should check your harddisk SMART for any errors. If there are issues reading from "that" sector then there may be issues. If you have a raid unit I wouldn't worry too much about this. I wouldn't give a high probability to this being the problem, but if you are really stumped then it might be worth it.
on http://bugs.mysql.com/bug.php?id=1011 the second comment says that: "the 'MySQL server has gone away' error is caused by a query longer than the max_allowed_packet."
there is some more information on fixing it here: http://bogdan.org.ua/2008/12/25/how-to-fix-mysql-server-has-gone-away-error-2006.html
That means that sql connection was idle for too long. Check if there are some slow operations performed before your sql-query.
Related
Background:
Two nights ago the old-as-hell and very poorly designed website for the company I work for got attacked by a bot that submitted about 5000+ phony orders. In the course of deleting all of those false orders from the database, SQL Management Studio crashed, and the application had to be stopped via task manager and restarted. After that I was getting optimistic concurrence control errors when trying to delete some of the fake records, and had to complete the cleanup via DELETE statement.
(yes, I KNOW it's generally bad practice to delete records from the results pane, but for people like me who aren't actually programmers but get stuck with the IT work because we're the only ones who know how to find the on switch, it makes me less paranoid that I won't delete a record I didn't mean to)
Ever since then, there is a specific page in the admin section of the site that takes a VERY long time to perform a SELECT query for a specific range. The query will complete if you sit there long enough, but here's a screenshot of the ColdFusion error box that comes up with it:
ColdFusion error message
I suspect that between the bot attack and Studio Express crashing in the middle of an DELETE query, part of the table is corrupted, which is why it exceeds the allowable time limit. I don't know if our webhost has a backup of the database (I've been in contact with them the last couple days).
What tools can I use to check for and repair errors on that table?
I have a SQL Server database in production and it has been live for 2 months. A while ago the web application associated with it loading takes too long time. And sometimes it says timeout occurred.
Found a quick fix by running a command 'exec sp_updatestats' will fixed the problem. But I need to be run that one consistently (for every 5 minutes).
So I created a Windows service with timer and started on server. My question is what are the root causes and possible permanent solutions? Anyone?
Here is a Most expensive query from Activity Monitor
WAITFOR(RECEIVE TOP (1) message_type_name, conversation_handle, cast(message_body AS XML) as message_body from [SqlQueryNotificationService-2eea594b-f994-43be-a5ed-d9a47837a391]), TIMEOUT #p2;
To diagnose a poorly performing queries you need to:
Identify the poorly performing query, e.g. via application logging, a SQL Profiler trace filtered to show only queries with a longer duration than a certain threshold etc...
Get an execution plan for the query
At that point you can start to try to figure out what the performance issue is.
Given that exec sp_updatestats fixes your issue it could be that statistics on certain tables are out of date (this is a pretty common cause of performance issues). If thats the case then you might be able to tweak your statistics or at least rebuild only those statistics that are causing issues.
Its worth noting that updating statistics will also cause cached execution plans to become invalid, and so its plausible that your issue is unrelated to statistics - you need to collect more information about the poorly performing queries before looking at solutions.
Your most expensive query looks like its waiting for a message, i.e. its in your list of long running queries because its designed to run for a long time, not because its expensive.
Thanks for everyone i found a solution for my issue . Its quite different I've enabled sql dependency module on my sql server by setting up enable broker on , thats the one causing timeout query so by setting it to false everything is fine working now.
I decided I wanted to try out Microsoft SQL Azure, as many people have talked very highly about it. It should be fast, flexible, cheap and many other things.
I got it up and running, migrated my data to Azure and hooked up the connectionstring. I tried to run some queries on the database, and was shocked about how slow even simple queries were. A "SELECT *" from a table with 700 rows took 7 seconds. My page also seems extremely slow, compared to when I used a localhost managent studio or a database on a shared hosting.
Now, when I setup my server, I couldn't pick a physical position. However, I live in Denmark, and I can see the server is the "South Central US". This might be the issue.
I don't use any stored procedures (so I guess no parameter sniffing).. I can also see my indexes is transfered succesfully.
Any ideas on what to do? Any performance things I am missing?
I ran into this very issue the last few days. Change your database tier from basic to standard and you will see a HUGE increase in performance. I am working on a query intensive dashboard at the moment, it took a 20 sec response time down to 2 sec response.
I've used Azure now for the last many years, and my original question is pretty much solved.
My main take-aways after dealing with Azure databases for a while:
It's extremely important that your application and database is placed in the same region. If not, then you will have a slow application. Recently I had an API and app running on two different regions - it took ~1 second for every response.. After moving it to same, it was instant
If your application has a high load, it's often a good idea to upgrade. This happens earlier than you might expect
Pick the nearest region - it really matters
I have RO access on a SQL View. This query below times out. How to avoid this?
select
count(distinct Status)
from
[MyTable] with (NOLOCK)
where
MemberType=6
The error message I get is:
Msg 121, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
Your query is probably fine. "The semaphore timeout period has expired" is a Network error, not a SQL Server timeout.
There is apparently some sort of network problem between you and the SQL Server.
edit: However, apparently the query runs for 15-20 min before giving the network error. That is a very long time, so perhaps the network error could be related to the long execution time. Optimization of the underlying View might help.
If [MyTable] in your example is a View, can you post the View Definition so that we can have a go at optimizing it?
Although there is clearly some kind of network instability or something interfering with your connection (15 minutes is possible that you could be crossing a NAT boundary or something in your network is dropping the session), I would think you want such a simple?) query to return well within any anticipated timeoue (like 1s).
I would talk to your DBA and get an index created on the underlying tables on MemberType, Status. If there isn't a single underlying table or these are more complex and created by the view or UDF, and you are running SQL Server 2005 or above, have him consider indexing the view (basically materializing the view in an indexed fashion).
You could put an index on MemberType.
Please check your Windows system event log for any errors specifically for the "Event Source: Dhcp". It's very likely a networking error related to DHCP. Address lease time expired or so. It shouldn't be a problem related to the SQL Server or the query itself.
Just search the internet for "The semaphore timeout period has expired" and you'll get plenty of suggestions what might be a solution for your problem. Unfortunately there doesn't seem to be the solution for this problem.
Do you have an index defined over the Status column and MemberType column?
how many records do you have? are there any indexes on the table? try this:
;with a as (
select distinct Status
from MyTable
where MemberType=6
)
select count(Status)
from a
My team were experiencing these issues intermittently with long running SSIS packages. This has been happening since Windows server patching.
Our SSIS and SQL servers are on separate VM servers.
Working with our Wintel Servers team we rebooted both servers and for the moment, the problem appears to have gone away.
The engineer has said that they're unsure if the issue is the patches or new VMTools that they updated at the same time. We'll monitor for now and if the timeout problems recur, they'll try rolling back the VMXNET3 driver, first, then if that doesn't work, take off the June Rollup patches.
So for us the issue is nothing to do with our SQL Queries (we're loading billions of new rows so it has to be long running).
This is happen because another instance of sql server is running. So you need to kill first then you can able to login to SQL Server.
For that go to Task Manager and Kill or End Task the SQL Server service then go to Services.msc and start the SQL Server service.
While I would be tempted to blame my issues - I'm getting the same error with my query, which is much, much bigger and involves a lot of loops - on the network, I think this is not the case.
Unfortunately it's not that simple. Query runs for 3+ hours before getting that error and apparently it crashes at the same time if it's just a query in SSMS and a job on SQL Server (did not look into details of that yet, so not sure if it's the same error; definitely same spot, though).
So just in case someone comes here with similar problem, this thread:
https://www.sqlservercentral.com/Forums/569962/The-semaphore-timeout-period-has-expired
suggest that it may equally well be a hardware issue or actual timeout.
My loops aren't even (they depend on sales level in given month) in terms of time required for each, so good month takes about 20 mins to calculate (query looks at 4 years).
That way it's entirely possible I need to optimise my query. I would even say it's likely, as some changes I did included new tables, which are heaps... So another round of indexing my data before tearing into VM config and hardware tests.
Being aware that this is old question: I'm on SQL Server 2012 SE, SSMS is 2018 Beta and VM the SQL Server runs on has exclusive use of 132GB of RAM (30% total), 8 cores, and 2TB of SSD SAN.
I have a problem that seems like its a result of a deadlock-situation.
Whe are now searching for the root of the problem but meantime we wanted to restart the server and get the customer going.
And now everytime we start the program it just says "SqlConnection does not support parallel transactions". We have not changed anything in the program, its compiled and on the customers server, but after the "possible deadlock"-situation it want go online again.
We have 7 clients (computers) running the program, each client is talking to a webservice on a local server, and the webservice is talking to the sql-server (same machine as webserver).
We have restarted both the sql-server and the iis-server, but not rebooted the server because of other important services running on the server so its the last thing we do.
We can se no locks or anything in the management tab.
So my question is, why does the "SqlConnection does not support parallel transactions" error comming from one time to another without changing anything in the program and it still lives between sql-restart.
It seems like it happens at the first db-request the program does when it start.
If you need more information just ask. Im puzzled...
More information:
I dont think I have "long" running transactions. The scenario is often that I have a dataset with 20-100 rows (ContractRows) in that Ill do a .Update on the tableAdapter. I also loop throug those 20-100 rows and for some of them Ill create ad-hook-sql-querys (for example if a rented product is marked as returned I create a sql-query to mark the product as returned in the database)
So I do this very simplified:
Create objTransactionObject
Create objtableadapter (objTransactionObject)
for each row in contractDS.contractrows
if row.isreturned then
strSQL &= "update product set instock=1 where prodid=" & row.productid & vbcrlf
End if
next
objtableadapter.update(contractDS)
objData.ExecuteQuery(strSQL, objTransactionObject)
if succsesfull
objtransactionobject.commit
else
objtransactionobject.rollback
end if
objTran.Dispose()
And then Im doing commit or rollback depending on if It went well or not.
Edit: None of the answers have solved the problem, but I'll thank you for the good trouble shooting pointers.
The "SqlConnection does not support parallel transactions" dissapeared suddenly and now the sql-server just "goes down" 4-5 times a day, I guess its a deadlock that does that but I have not the right knowledge to find out and are short on sql-experts who can monitor this for me at the moment. I just restart the sql-server and everything works again. 1 of 10 times I also have to restart the computer. Its really bugging me (and my customers of course).
Anyone knowing a person with good knowledge in analyzing troubles with deadlocks or other sql problems in sweden (or everywhere in the world,english speaking) are free to contact me. I know this is'nt a contact site but I take my chanse to ask the question because I have run out of options, I have spent 3 days and nights optimizing the clients to be sure we close connections and dont do too much stupid things there. Without luck.
It seems to be that you are sharing connections and creating new transactions on the same open connection (this is the parallel part of the exception you are seeing).
Your example seems to support this as you have no mention of how you acquire the connection in it.
You should do a review of your code and make sure that you are only opening a connection and then disposing of it when you are done (and by all means, use the using statement to make sure that you close the connection), as it seems like you are leaving one open somewhere.
Yours doesn't appear to be an unusual problem. Google found a lot of hits when I pasted your error string into the query box.
Reading past answers, it sounds like it has something to do with interleaving transactions improperly or isolation level.
How long are connections held open? Do you have long-running transactions?
Do you have implicit transactions turned on somewhere, so that there are some transactions where you wouldn't have expected them? Have you opened Activity Monitor to see if there are any unexpected transactions?
Have you tried doing a backup of your transaction log? That might clear it out as well if I remember a previous, similar experience correctly.