Problem:
One of our clients has SQL Server 2005 running on a Windows 2008 R2 Standard machine. Every once in a while, the server fails with the following error:
SQL Server failed with error code 0xc0000000 to spawn a thread to process a new login or connection. Check the SQL Server error log and the Windows event logs for information about possible related problems. [CLIENT: <local machine>]
The error occurs at a rate of about once per second, with the value for CLIENT: being the only thing that changes (sometimes, instead of <local machine>, it shows the machine's own IP or the IP of other machines belonging to the client). Until SQL Server is restarted, no connections can be made to it; after the restart, it works fine.
The problem happens about once or twice per month. There are no Windows logs for the previous occurrence; I've since increased the maximum size of the Application log.
Machine configuration:
OS: Windows 2008 R2 Standard SP1 (x64)
SQL: Microsoft SQL Server 2005 - 9.00.4035.00 (Intel X86) Nov 24 2008 13:01:59 Copyright (c) 1988-2005 Microsoft Corporation Standard Edition on Windows NT 6.1 (Build 7601: Service Pack 1)
CPU: Intel Xeon E5430 @ 2.66GHz
RAM: 32 GB
Paging file: 32 GB on drive E (System managed), None on all other drives (including drive C)
More info:
The server has 2 databases that are actively used:
One database is used for replication (one publication with about 450 subscribers, most of which synchronize daily, usually more than once per day). The same database is also used by a web application with about 150 subscribers who use it actively during the day.
Both of the databases also have frequent jobs running that mainly do file imports and transfers from one db to the other.
Update:
While checking the logs once again, I've noticed that the AppDomain gets marked for unload due to memory pressure, unloaded, and recreated at a rate of about once every 30 minutes. During the last 2 occurrences of the stated problem, the AppDomain ID went up to 250 and 264, respectively. Could this be a related issue?
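To keep an eye on this, a minimal sketch against the sys.dm_clr_appdomains DMV (assuming SQLCLR is in use, so the view is populated) shows the current AppDomains; the appdomain_id climbing by one on every unload/recreate cycle is what the numbers in the log reflect:
-- current CLR AppDomains; appdomain_id increments each time a domain
-- is unloaded and recreated, so a high ID indicates frequent churn
SELECT appdomain_id, appdomain_name, creation_time, state
FROM sys.dm_clr_appdomains
ORDER BY creation_time DESC;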
This error could be due to a max worker threads setting that is too low. You can reset it as follows ('max worker threads' is an advanced option, so 'show advanced options' must be enabled first):
EXEC sp_configure 'show advanced options', 1
GO
RECONFIGURE
GO
EXEC sp_configure 'max worker threads', 0
GO
RECONFIGURE WITH OVERRIDE
GO
to raise the limit (a value of 0 tells SQL Server to size the worker thread pool automatically based on the number of CPUs).
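To check whether the instance is actually exhausting its workers, a quick sketch against the scheduler DMVs (available since SQL Server 2005) compares the computed ceiling with current usage:
-- the worker-thread ceiling in effect (with 'max worker threads' = 0,
-- this is computed from the CPU count and architecture)
SELECT max_workers_count FROM sys.dm_os_sys_info;
GO
-- workers currently created and actively working across visible schedulers
SELECT SUM(current_workers_count) AS current_workers,
       SUM(active_workers_count)  AS active_workers
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
GO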
It's entirely possible that you are getting the error due to having too many connections open; in other words, the error is the symptom rather than the cause. You should review your application(s) for proper closing of connections.
You can inspect all open connections in SQL Server using sp_who:
Provides information about current users, sessions, and processes in an instance of the Microsoft SQL Server Database Engine. The information can be filtered to return only those processes that are not idle, that belong to a specific user, or that belong to a specific session.
For more information on how to inspect open connections, read this thread on SO.
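If sp_who's output is too verbose, a sketch against sys.dm_exec_sessions (SQL Server 2005 and later) groups open sessions by login and host, which makes a connection-leaking application stand out:
-- open user sessions grouped by who/where they come from;
-- a single host or login with hundreds of sessions suggests a leak
SELECT login_name, host_name, COUNT(*) AS session_count
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY login_name, host_name
ORDER BY session_count DESC;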
I have three Microsoft Windows Server 2012 R2 Standard servers running on Dell PowerEdge machines that serve as Hyper-V hosts for my various virtual machines. Each server has a scheduled backup similar to the following:
Notes:
Each scheduled backup is configured to be a "VSS Full Backup".
Some backups show the VMs as online and some show as offline.
The issue is that the backups will run successfully for several days and then stop running. After the backups begin to fail, I see the following when I open the Windows Server Backup tool.
I also see Event 19 in the event log when the scheduled backup runs:
The backup operation attempted at '2018-06-04T02:00:01.583169900Z' has failed to start, error code '0x8007000E' ('Ran out of memory'). Please review the event details for a solution, and then rerun the backup operation once the issue is resolved.
If I attempt to run the backup using the wbadmin command line, I see the following error:
Not enough storage is available to complete this operation.
If I sign out and then sign on, the issue is immediately resolved and I can see the backup history in the Windows Server Backup tool. The backup will run again for several days until the issue occurs again.
The cycle of the issue on the three machines is very similar. In other words, the backup will run successfully on all three machines for 2 days and then fail on all three machines and continue to fail until I sign in, sign out and then sign in on each machine.
Note: after verifying that the server backup is running following the sign-in, sign-out, and sign-in, I typically sign out again.
Notes
Each Hyper-V server has 16 GB of free memory after the memory dedicated to the VMs is subtracted from the total memory.
Each Hyper-V server has 25 GB of free hard drive space on the C drive and 500 GB or more of free space on the D drive where the VMs are stored.
Each backup drive has 1 TB of free space.
Any ideas?
I originally posted this question on Microsoft TechNet: Technet Question
The root cause of this issue was the Bomgar Jump Client version 17.1.4. After disabling the Jump Client, we have not experienced the issue. See the linked Technet Question for more detail.
We are currently running two instances of SQL Server. For development purposes, we run a local DB on a desktop PC in our office.
The PC has following stats:
8 GB Ram
AMD Athlon 5350 APU with Radeon(tm) R3, 2.05 GHz
64 Bit Windows 8.1
Microsoft SQL Server 2014 - 12.0.2000.8 (X64) Express Edition (64-bit)
HDD Seagate ST1000DM003 1 TB
The server is located in Azure as a Standard Tier A3 VM running the pre-provided Windows Server 2012 R2 Datacenter image.
Now we are facing a problem: the exact same query runs 10 times faster locally on the desktop than on the server.
I connect to the PC with a locally installed Management Studio via TCP/IP over our local network. When I connect to the server, I use a Remote Desktop connection and start a local instance of Management Studio on the server.
I have already changed the connection protocol from the default to TCP/IP on the server, which is what brings it down to 10 times slower; with the default connection it is 20 times slower. With named pipes the performance is even worse.
Rewriting the query and trying different approaches makes no difference: the Express version is always much faster than the server. We did not do any configuration or tuning on either the Express installation or the server.
Any comments are very appreciated!
Best
Simon
You should add the following at the top of the query to see where the differences are:
SET STATISTICS TIME ON
SET STATISTICS IO ON
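A usage sketch, with dbo.MyTable as a stand-in for whatever the real query touches; compare the 'CPU time/elapsed time' and 'logical reads' lines in the Messages tab between the desktop and the Azure VM:
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
GO
-- placeholder for the query being compared
SELECT COUNT(*) FROM dbo.MyTable;
GO
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO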
Does your local machine have an SSD? If so, that would explain it.
Try rebuilding the indexes the query uses (see the sketch below).
Update the database/table statistics. The execution plan can be the same, but with bad statistics I've often seen very low performance, especially if you do a lot of inserts/deletes.
You can see if something is wrong with SET STATISTICS IO ON. Look at the logical reads on tables, the worktables/workfiles, etc. Check if it's different from the local server.
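A minimal sketch of both maintenance steps, assuming dbo.MyTable stands in for the tables the slow query touches:
-- rebuild every index on the table (placeholder name)
ALTER INDEX ALL ON dbo.MyTable REBUILD;
GO
-- refresh optimizer statistics from a full scan rather than a sample
UPDATE STATISTICS dbo.MyTable WITH FULLSCAN;
GO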
I have a server with Windows Server 2012 R2 and SQL Server 2014 with in-memory tables, and when I do multiple deletes and inserts on those tables, I get the following error: "There is insufficient system memory in resource pool 'default' to run this query." But my server has 32 GB of RAM and CPU usage doesn't even reach 70%. This server contains only one database. Resource Governor is enabled with 70% max memory. I also have another server with Windows Server 2008 R2 and SQL Server 2014 running the same inserts and deletes, and I don't have any problem there.
Any idea?
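For reference, the usual guidance for memory-optimized tables is to bind the database to a dedicated resource pool instead of leaving it in 'default', so its memory is capped and accounted for separately. A sketch, with InMemDB and Pool_InMem as placeholder names:
-- dedicated pool capped below total server memory (placeholder names)
CREATE RESOURCE POOL Pool_InMem WITH (MAX_MEMORY_PERCENT = 70);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
-- bind the database that holds the memory-optimized tables to the pool
EXEC sys.sp_xtp_bind_db_resource_pool 'InMemDB', 'Pool_InMem';
GO
-- the binding only takes effect after the database cycles offline/online
ALTER DATABASE InMemDB SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE InMemDB SET ONLINE;
GO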
My company looks after a server which is running SQL Server 2014 Standard Edition. The main application database on the server is replicated out to about 20 subscribers. The server has only been in operation for about a week and at the time of setting it up we moved the existing database and replication across from the old server. At that time we also changed the recovery model from Simple to Full.
The performance of the server hasn't been particularly good so far, even though it has 24 logical processors and 32 GB of RAM. However, according to the event logs it seems to be SQL Server that has been performing badly, due to locks etc.
This morning we couldn't access any tables in Management Studio and after a period of time we decided to try restarting the service. The service did not seem to stop properly and we were forced to restart the server. After restarting the server everything seemed fine until we tried to expand out the 'Databases' tab in Management Studio. It just says 'expanding...' in the UI and after several minutes brings up a timeout error message. After clicking ok on this message we can browse all the system databases as normal but the main application database is not shown.
One of the problems that has come to light is that the log file has grown extremely large (over 1 TB).
This is a major problem now, as you can imagine, since we cannot connect to the database to shrink the log file and no applications can connect to it. Connecting to the database using 'sqlcmd' doesn't work either. We can't 'open' the database.
Look forward to hearing if anyone has come across this before and knows of a solution.
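For reference, once the database is reachable again, a quick sketch against sys.databases shows why the log cannot truncate; a Full-recovery database that never receives log backups will typically report LOG_BACKUP here:
-- what is holding each database's log? LOG_BACKUP means the log
-- keeps growing until a transaction log backup is taken
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases;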
I have an Execute SQL Script package that contains a script to insert about 150K records.
The problem is that when I execute the package in the virtual machine it takes approximately 25 minutes, while the same package on the physical machine takes about 2 minutes.
Question 1: Why does it take that much time to load the same data in the VM?
Question 2: How do I solve this performance issue?
The physical machine has 4 GB RAM and a 250 GB HDD + Windows Server 2008 R2 + SQL Server 2008 R2 Standard Edition.
The virtual machine has the same configuration.
Update: the problem is with SQL Server in the VM.
The database schemas on the physical machine and the VM are identical. The other databases are also the same. No indexing was applied to those tables on either machine. The data types are the same. The hard disk, as I said, has the same configuration.
No RAID is configured on either machine.
The physical machine has a 2.67 GHz quad-core CPU and the virtual machine has a 2.00 GHz quad-core CPU.
SQL version on the physical machine (PM):
Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64) Apr 2 2010 15:48:46 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1)
SQL version on the virtual machine (VM):
Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64) Apr 2 2010 15:48:46 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1) (Hypervisor)
I executed the script and compared the execution plans; both are the same, with no difference in the plan.
The server is an HP ML350 machine.
There are almost 20 VMs on the same physical host, of which 7 are active.
There's an article about properly setting SQL's configuration for a VM implementation here: Best Practices for SQL Server. Below is an excerpt, though the article includes other tips and a good performance testing plan:
Storage configuration problems are the number one cause of SQL performance issues. Usually these problems arise because the DBA requests a virtual disk from the VI admin, and the VI admin places the VMDK on a LUN that may or may not meet the DBA's performance needs. For instance:
VMs' VMDK files placed on VMFS volumes without enough spindles.
Many VMDK files placed on a single VMFS volume which could use more spindles.
Database and log files placed on the same LUN which, you guessed it, could use more spindles.
This may be obvious to some, but this problem occurs again and again. The VI administrator should be aware of a few technical items that can help understand and avoid this problem:
Based on the IO demands of the DB files, a certain number of spindles should be guaranteed to this file. This means that its VMDK must be placed on a VMFS volume with enough spindles to account for the SQL Server's demands and all of the other demands on that volume.
Mixing sequential activity (such as log file update) and random activity (such as database access) results in random behavior. This means that the LUN configuration in the pre-virtual physical environment may not be sufficient for the consolidated environment. This is discussed some in Storage Performance: VMFS and Protocols.
When storage isn't meeting the SQL Server's demands, the device latency or kernel latency (queueing time) will increase. Read up on these counters in Storage Performance Analysis and Monitoring.
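From inside the guest, a sketch against sys.dm_io_virtual_file_stats (SQL Server 2005 and later) yields per-file average latencies, a quick way to confirm that storage, not the plan, is the bottleneck:
-- average read/write latency per database file since instance start
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id;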
The most common cause of this problem is lack of RAM. Having everything set up on a small 4 GB RAM machine is your problem.
When you try to load those 150k rows into memory (remember, everything that happens in SSIS is in memory), a lot of those rows are being handled by your pagefile.
The pagefile on your VM is a lot slower than the one on your physical machine.
To solve this, increase the amount of RAM on your virtual machine.
I have a similar problem.
Two client machines (one physical, one virtual) execute a batch using SQLCMD. This batch calls a stored procedure on a physical server (so it's not a memory problem, since the processing happens only on the server side).
The batch executed from the physical machine takes 20 minutes. The batch executed from the virtual machine takes 1 hour and 20 minutes.
Using SQL Profiler, I noted that in the slow case there is an ASYNC_NETWORK_IO wait type.
Probably the virtualized network layer is not optimized.
Could you run SQL Profiler and check whether you see the ASYNC_NETWORK_IO wait type?
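If Profiler isn't convenient, a sketch against sys.dm_os_wait_stats shows whether ASYNC_NETWORK_IO waits are accumulating (the counters are cumulative since the last instance restart):
-- a large, growing wait_time_ms here means the server spends its time
-- waiting for the client or the network to consume result sets
SELECT wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'ASYNC_NETWORK_IO';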