Why does wbadmin fail after a few days and report "Not enough storage is available to complete this operation"?

I have three Microsoft Windows Server 2012 R2 Standard servers running on Dell PowerEdge machines that serve as Hyper-V hosts for my various virtual machines. Each server has a scheduled backup similar to the following:
Notes:
Each scheduled backup is configured to be a "VSS Full Backup".
Some backups show the VMs as online and some show them as offline.
The issue is that the backups run successfully for several days and then stop running. After the backups begin to fail, I see the following when I open the Windows Server Backup tool.
I also see Event 19 in the event log when the scheduled backup runs:
The backup operation attempted at
'2018-06-04T02:00:01.583169900Z' has failed to start, error code
'0x8007000E' ('Ran out of memory'). Please review the event details
for a solution, and then rerun the backup operation once the issue is
resolved.
If I attempt to run the backup using the wbadmin command line, I see the following error:
Not enough storage is available to complete this operation.
If I sign out and then sign in, the issue is immediately resolved and I can see the backup history in the Windows Server Backup tool. The backup will run again for several days until the issue occurs again.
The cycle of the issue on the three machines is very similar. In other words, the backup will run successfully on all three machines for 2 days and then fail on all three machines and continue to fail until I sign in, sign out and then sign in on each machine.
Note: After verifying that the server backup is running again (by signing in, signing out, and signing in once more), I typically sign out again.
Notes:
Each Hyper-V server has 16 GB of free memory after the memory dedicated to the VMs is subtracted from the total memory.
Each Hyper-V server has 25 GB of free hard drive space on the C drive and 500 GB or more of free space on the D drive where the VMs are stored.
Each backup drive has 1 TB of free space.
Any ideas?
I originally posted this question on Microsoft TechNet.
TechNet Question

The root cause of this issue was the Bomgar Jump Client version 17.1.4. After disabling the Jump Client, we have not experienced the issue. See the linked TechNet Question for more detail.

Related

Recurring SQL Error 17189

Problem:
One of our clients has SQL Server 2005 running on a Windows 2008 R2 Standard machine. Every once in a while, the server fails with the following error:
SQL Server failed with error code 0xc0000000 to spawn a thread to process a new login or connection. Check the SQL Server error log and the Windows event logs for information about possible related problems. [CLIENT: <local machine>]
The error occurs about once per second, and the only thing that changes is the value for CLIENT: (sometimes, instead of <local machine>, it shows the IP of the machine or the IPs of other machines belonging to the client). Until SQL Server is restarted, no connections can be made to it. After the restart, it works fine.
The problem happens about once or twice per month. There are no Windows logs for the previous occurrence; I've since increased the max size for the Application log.
Machine configuration:
OS: Windows 2008 R2 Standard SP1 (x64)
SQL: Microsoft SQL Server 2005 - 9.00.4035.00 (Intel X86) Nov 24 2008 13:01:59 Copyright (c) 1988-2005 Microsoft Corporation Standard Edition on Windows NT 6.1 (Build 7601: Service Pack 1)
CPU: Intel Xeon E5430 @ 2.66GHz
RAM: 32 GB
Paging file: 32 GB on drive E (System managed), None on all other drives (including drive C)
More info:
The server has 2 databases that are actively used:
One database is used for replication (1 Publication with about 450 subscribers, most of which synchronize daily, usually more than once per day). The same database is also used by a web application that has about 150 subscribers that use it actively during the day.
Both of the databases also have frequent jobs running that mainly do file imports and transfers from one db to the other.
Update:
While checking the logs once again, I've noticed that the AppDomain gets marked for unload due to memory pressure, unloaded and recreated at a rate of about once every 30 minutes. During the last 2 occurrences of the stated problem, the AppDomain went up to 250 and 264, respectively. Could this be a related issue?
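This error could be due to a max worker threads setting that is too low. Because it is an advanced option, enable 'show advanced options' first, then set it:
-- 'max worker threads' is only visible once advanced options are shown
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
-- 0 (the default) lets SQL Server size the worker pool automatically
EXEC sp_configure 'max worker threads', 0;
RECONFIGURE WITH OVERRIDE;
GO
A value of 0 lets SQL Server choose the limit from the number of processors; to raise the limit, specify an explicit higher number instead.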
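Before changing the setting, it can help to see how close the instance actually is to its worker limit. The following is a minimal sketch, not a definitive diagnostic. It assumes the sys.dm_os_sys_info and sys.dm_os_schedulers dynamic management views (available from SQL Server 2005 onward) and a login with VIEW SERVER STATE permission. The scheduler_id < 255 filter follows the SQL Server 2005 convention for schedulers that run regular queries and is an assumption that may need adjusting on newer versions.

```sql
-- Ceiling on worker threads that the instance is currently using
SELECT max_workers_count
FROM sys.dm_os_sys_info;

-- Workers allocated, workers busy, and tasks waiting for a worker,
-- summed over the schedulers that run regular queries
SELECT SUM(current_workers_count) AS current_workers,
       SUM(active_workers_count)  AS active_workers,
       SUM(work_queue_count)      AS queued_tasks
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;  -- assumption: SQL Server 2005 convention
```

If current_workers stays near max_workers_count and queued_tasks is above zero while the error is being logged, worker exhaustion is a plausible cause. If not, the limit is probably not the bottleneck.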
It's entirely possible that you are getting the error due to having too many connections open, in other words the error is the symptom rather than the cause. You should review your application(s) for proper closing of connections.
You can inspect all open connections in SQL Server using sp_who:
Provides information about current users, sessions, and processes in an instance of the Microsoft SQL Server Database Engine. The information can be filtered to return only those processes that are not idle, that belong to a specific user, or that belong to a specific session.
For more information on how to inspect open connections, read this thread on SO.
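As a rough sketch of that inspection, the first statement below runs sp_who as described above. The second groups user sessions by client so that an application that leaks connections stands out. It assumes the sys.dm_exec_sessions view (SQL Server 2005 and later) and VIEW SERVER STATE permission; the grouping columns are one reasonable choice, not the only one.

```sql
-- Every session known to the instance, with login, host and status
EXEC sp_who;

-- User sessions grouped by client host and application, to spot a
-- client that keeps opening connections without closing them
SELECT host_name,
       program_name,
       login_name,
       COUNT(*) AS session_count
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY host_name, program_name, login_name
ORDER BY session_count DESC;
```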

Microsoft SQL Server backup physical_device_name

I've configured 2 backup tasks in Microsoft SQL Server, to have a full and incremental backup of our database. The version of Microsoft SQL Server is 2008 R2.
Until a couple of weeks ago, the backups were written to the location I configured and everything went fine. Now the backups are created in a location that I cannot find. The logs report that everything went OK, but there is no trace of the backup file in the target directory.
When I execute the following query, I get a bizarre value in physical_device_name:
SELECT
    m.physical_device_name,
    b.backup_start_date,
    b.backup_finish_date,
    b.backup_size / 1024.0 AS BackupSizeKB
FROM msdb.dbo.backupset b
JOIN msdb.dbo.backupmediafamily m ON b.media_set_id = m.media_set_id
WHERE b.database_name = 'DB_NAME'
ORDER BY b.backup_finish_date DESC
These are the values that I get for physical_device_name:
{4CAE7525-44D7-4DEF-86A7-F9C7C99C013C}3
{EC6FB844-832G-4A8F-BDDE-12D073383139}3
And so on ...
Any idea why this is and how to resolve it? My initial thought was that those directories are readonly. I changed that, but I saw that one of the backups last night failed again because of the same reason.
Good day,
SQL Server supports virtualization-aware backup solutions that use volume shadow copy (VSS) also named volume snapshots. For example, SQL Server supports Hyper-V and VMware backup. For more information check this document.
When the host backs up your system, the SQL Server VSS Writer service is used (it should be running when SQL Server is installed on a virtual machine).
You should notice that these backups have the value 7 in the column "device_type" (7 means virtual device). These rows are actually very useful; for example, they tell you that the virtual machine backups are running full database backups on the SQL Server instance. There's no trace of these backup files because they are outside the scope of the virtual machine (above the level of your machine): they are triggered by the host (Hyper-V or VMware, for example).
More information here.
Yup - If you look at device_type (in the backupmediafamily table), you will see that it is probably a virtual device (7) and is being backed up by your virtual machine software.
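To check this on your own instance, a variation of the query from the question can return device_type alongside the device name. This is a sketch that assumes the standard msdb history tables; 'DB_NAME' is the same placeholder used in the question, and the is_snapshot column is included only as an extra hint that a backup came from a snapshot (VSS) operation.

```sql
SELECT b.database_name,
       b.backup_finish_date,
       b.type AS backup_type,      -- D = full, I = differential, L = log
       b.is_snapshot,              -- 1 when the backup was a snapshot backup
       m.device_type,              -- 2 = disk, 5 = tape, 7 = virtual device
       m.physical_device_name
FROM msdb.dbo.backupset AS b
JOIN msdb.dbo.backupmediafamily AS m
  ON b.media_set_id = m.media_set_id
WHERE b.database_name = 'DB_NAME'  -- placeholder from the question
ORDER BY b.backup_finish_date DESC;
```

Rows with device_type 7 and a GUID-style physical_device_name are the host-triggered backups; rows with device_type 2 should show the file path your own backup tasks wrote to.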

SQL Server 2012, lost access to database

I experienced a weird issue a few hours ago and I cannot figure out what caused it.
I have SQL Server 2012 installed on a Windows Server 2012 virtual machine.
I have Windows services, Windows applications and web sites accessing a database on this server.
All applications lost access to the database for +/- 10 minutes and then it suddenly came back up again.
During those ten minutes I managed to log on to the SQL Server remotely, open Management Studio and access all the databases, but the applications still could not connect.
The database did not go into single-user mode, the CPU and memory were normal, and I could ping the server from my desktop.
I looked at the event log and SQL logs but couldn't find anything related to why the database could not be accessed.
I am baffled. I've been trying to figure this out for the last 2 hours and I am not getting anywhere.
I would appreciate any assistance
Thanks
Run the DBCC CHECKDB command. This will check the database and its tables for corruption and errors.
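A minimal example of that check, assuming a database named YourDatabase (a placeholder, not a name from the question). Without a repair option the command only reads and reports, but it can be I/O intensive, so it may be better to run it outside peak hours.

```sql
-- 'YourDatabase' is a placeholder; substitute the affected database name.
-- NO_INFOMSGS hides informational messages so only problems are listed;
-- ALL_ERRORMSGS shows every error reported for each object.
DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```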

SQL Server 2014 Standard no access and bad performance

My company looks after a server which is running SQL Server 2014 Standard Edition. The main application database on the server is replicated out to about 20 subscribers. The server has only been in operation for about a week and at the time of setting it up we moved the existing database and replication across from the old server. At that time we also changed the recovery model from Simple to Full.
The performance of the server hasn't been particularly good so far, even though it has 24 logical processors and 32 GB of RAM. However it seems to be SQL Server that has been performing badly due to locks etc. according to the event logs.
This morning we couldn't access any tables in Management Studio and after a period of time we decided to try restarting the service. The service did not seem to stop properly and we were forced to restart the server. After restarting the server everything seemed fine until we tried to expand the 'Databases' node in Management Studio. It just says 'expanding...' in the UI and after several minutes brings up a timeout error message. After clicking OK on this message we can browse all the system databases as normal but the main application database is not shown.
One of the problems that has come to light is that the log file has grown extremely large (over 1 TB).
This is a major problem now, as you can imagine, since we cannot connect to the database to shrink the log file and no applications can connect to the database. Connecting to the database using 'sqlcmd' doesn't work either. We can't 'open' the database.
I look forward to hearing if anyone has come across this before and knows of a solution.

SQL Job saying it executed but not showing up in Windows Event Viewer

I have a 2005 SSIS package that runs on a Windows 2003 Server hitting a SQL Server 2005 database on the same server. SQL Server Job Scheduler reports that the package executes successfully, yet on certain days the functionality inside the package does not run. Desperate for answers, I have been searching the application's audit logs and Windows Event Viewer. I noticed that the days this package does not execute coincide with days that another package stops (it does not fail how you would typically expect a package to fail) due to high memory usage. The other thing I noticed by searching through the Windows Event Viewer is that even though SQL Server Job Scheduler claims to have executed the package successfully, there is no record of the event in the Event Viewer.
After all of that, my question is: are there any reports of bugs with the SQL Server / Windows Server combination regarding executing packages after high server memory usage in the same day? Regardless of the first, any suggestions on a workaround?
Other related facts: due to other projects in the works, I am not authorized to modify the package that fails when the memory usage is high. I'm only allowed to restart the package.
Please let me know if I need to provide additional details.
Additional Details 2012.01.30
Recently an 8 GB stick of memory was removed from the server. SQL Server's max allocation was set to 30 GB. Once the stick was removed, only 24 GB remained.
2012.02.10: I was given the approval to rewrite the memory leaking package.
I determined that the execution of the previously mentioned rules in a foreach loop caused the memory leak on the server. I removed this looping package and replaced it with a stored procedure.
Since this change went into effect, no out of memory exceptions have occurred in the problem package and my package is running successfully.