SSISDB database has grown huge and needs a clean-up, but nothing seems to work
My SSISDB log data is currently huge (112 GB). When investigating why it reached this size, I realized that the SSIS maintenance job was not migrated during the server migration. I tried to clean up the log data by using the built-in stored procedure [internal].[cleanup_server_retention_window] and setting the retention_window to 7 days. (The database has not been cleaned up for more than 3 months.)
However, the stored procedure does not seem to decrease the size. Instead, it took many hours to complete and made some tables larger, e.g. [internal].[event_message_context]. Does that mean that while deleting the log entries, the stored procedure also inserted new logs into the table?
Other options that I have found on the internet (see below) do not seem to work either. They took a long time to complete and the size does not seem to decrease.
http://cryptoknight.org/index.php?/archives/1-SSIS-Maintenance-Script.html
I'm hoping to find a solution that drastically reduces the size of my log data and keeps only 3 days of retention.
P.S.: I'm allowed to disable the SQL Server Agent during the clean-up.
It seems there is a problem with SQL Server. For SQL Server 2017, Microsoft has released a fix in CU 17. The issue is described here:
Issue
You can download the latest CU using the following link.
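Separately from the patch, the general idea behind most manual clean-up scripts is to delete old rows in small batches, so that each transaction stays short. The following is only a minimal sketch of that idea, not the SSISDB procedure itself: it uses an in-memory SQLite table as a stand-in, and the table name `log_entries`, the column `created`, and the function `purge_in_batches` are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows older than `cutoff` in small transactions.

    Returns the total number of rows deleted.
    """
    total = 0
    while True:
        # Delete at most `batch_size` rows per statement.
        cur = conn.execute(
            "DELETE FROM log_entries WHERE id IN ("
            "SELECT id FROM log_entries WHERE created < ? LIMIT ?)",
            (cutoff.isoformat(), batch_size),
        )
        conn.commit()  # commit each batch so no single transaction grows large
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Stand-in data: one log row per day for the last 100 days (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_entries (id INTEGER PRIMARY KEY, created TEXT)")
now = datetime(2019, 1, 31)
rows = [((now - timedelta(days=d)).isoformat(),) for d in range(100)]
conn.executemany("INSERT INTO log_entries (created) VALUES (?)", rows)
conn.commit()

# Keep a 3-day retention window, deleting 10 rows at a time.
deleted = purge_in_batches(conn, now - timedelta(days=3), batch_size=10)
remaining = conn.execute("SELECT COUNT(*) FROM log_entries").fetchone()[0]
print(deleted, remaining)
```

On a real server the same loop shape would need testing against a copy of the database first, since the SSISDB internal tables have dependencies between them that this sketch ignores.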
Related
We are seeing some odd behaviour on our SSAS instances. We process our cubes as part of an overnight job on different environments. On our prod environment we process the cube on a separate server and then sync it out to a set of user-facing servers. However, we are seeing this behaviour even on environments where we process and query on a single instance.
The first user that hits any environment with fresh data seems to trigger a reload of the cube data from disk. Given we have 2 cubes that run to some 20 GB, this takes a while. During this we see low CPU utilisation, but we can see the memory footprint of the SSAS instance spooling up. This is very visible if the instance has just been started: it seems to start at a couple of hundred MB and then spools up to 22 GB, at which point it becomes responsive for end users. During the spool-up, DAX Studio, Excel and SSMS all seem to hang as far as the end user is concerned. Profiler isn't showing anything useful other than very slow responses to metadata discover requests.
Is there a setting somewhere that can change this? Or do I have to run some DAX against the cube to "prewarm" it?
Is this something I've missed in the past because all my models were pretty small (sub 1 GB)?
This is SQL Server 2016 SP2 running Tabular models at compatibility level 1200.
Many thanks
Steve
I see that you are suffering from an acute OLAP cube cold. :)
You need to warm it up: as you've guessed, you need to issue a command against it after (re)starting the service.
What you want to do is issue a discover command. A query like this one should be enough:
SELECT * FROM $System.DBSCHEMA_CATALOGS
If you want the full story, and a detailed explanation on how to automate this warming, you can find my post here: https://fundatament.com/2018/11/07/moments-before-disaster-ssas-tabular-is-not-responding-after-a-server-restart/
Hope it helps.
Have fun. :)
We are using a SQL Server Tabular model for self-service BI purposes. On a monthly basis, about 90 distinct people use the model. Recently we encountered some errors in the client tools (Excel and Power BI) that connect to the Tabular model. See screenshots. We have not made any significant changes to the model recently.
We noticed that the errors keep showing up after our incremental load, i.e. a full process of a number of partitions. The process is kicked off by an SSIS job which is scheduled every 15 minutes and processes 5 partitions in 3 tables.
Edit: After some research I figured out that the problem lies in the perspectives. Every time I do a full process on any object, the error appears. This does not happen on the default model view. I still have not found a solution though.
The error occurs when you interact with the Power BI report or the Excel file, for example when you do a refresh or click a filter. If you press refresh multiple times, the connection comes back and everything works as it is supposed to. It seems like the clients lose their connection to the model. After 15 minutes the problem occurs again.
This is very aggravating for the users, especially when they are in the middle of a presentation.
This is what we tried:
We tried searching Google for a solution
Checked that we have the latest SQL Server 2016 update (13.0.5149.0)
SSAS builds from Visual Studio (2015 and 2017)
No full process on tables, only on partitions.
Upgrading the server from 4 to 8 cpu cores.
I hope somebody can help us.
You shouldn't get the error that you are seeing with just a full process of a partition or even the full table. We do this every hour for a number of core tables and we do not see any issues like this (and we would notice).
I am starting from the hypothesis that either:
Your 15 minute process is doing more than just processing the partitions with a refresh command
Something else is happening on the environment (either scheduled or not). Who has permissions to change the schema? Could it be users / developers making changes, deliberately or not?
The only things that should cause that kind of error are Alter, Delete or CreateOrReplace TMSL commands.
So unless that triggers your own ideas for a diagnostic process, I would do the following steps.
Note: I presume that your users also see this issue on your test environment when you run your 15 minute processing routine there. You should do the following on that test environment, where nothing else is running, to eliminate the possibility of someone else interfering with the experiment. If you don't have a representative test environment then you will have to do this on live, but I would do it out of hours or under some kind of change control process, with your 15 minute refresh turned off and admin permissions to the cube heavily locked down to ensure that nothing can interfere with your experiment.
First prove that you can reproduce this issue with the 15 minute routine
Get your sample Power BI report that is known to present the error (I'd prefer Power BI for a repro as it is slightly simpler than Excel)
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Run your 15 minute process
You should now see the problem reported. If you do, great, you have a reproducible issue! If you don't, then it is not quite as you thought it was and you need to find a way of reliably reproducing these errors (perhaps something else is happening that isn't the 15 minute process).
Now that you are sure how to reproduce the issue, you need to isolate whether it is really the processing that is causing the problem
Refresh your Power BI report and explore the data to prove that the error doesn't occur
Execute (via SSMS) a command that processes the entire database
It should look something like this:
{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "yourdbname"
      }
    ]
  }
}
Do the thing that your users do when they see the issue.
If you too see the issue, then I would raise it with Microsoft Support as this shouldn't happen
If you don't see the issue then you can refine this processing to just the partition for a single table. But as we have done a process for the entire db above, it shouldn't change the result
If you still don't see the issue then it isn't the processing that is causing this issue (which I suspect) and it is something else in the 15 minute routine that is causing it. Look deeper into that process and understand what else it is doing.
Alongside this, checking the logs should show whether any other processing tasks or types of XMLA are happening.
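For the narrowing-down step (whole database, then a single table, then a single partition), the refresh command only differs in what the entry under "objects" names. A minimal sketch of generating those variants follows. The helper `build_refresh_command` and the table and partition names are hypothetical; the "database", "table" and "partition" keys are the ones the TMSL refresh command uses to identify its target.

```python
import json


def build_refresh_command(database, table=None, partition=None, refresh_type="full"):
    """Build a TMSL refresh command as a dict, scoped as narrowly as requested."""
    target = {"database": database}
    if table is not None:
        target["table"] = table
    if partition is not None:
        if table is None:
            raise ValueError("a partition can only be named together with its table")
        target["partition"] = partition
    return {"refresh": {"type": refresh_type, "objects": [target]}}


# Placeholder names: substitute your own database, table and partition.
whole_db = build_refresh_command("yourdbname")
one_partition = build_refresh_command("yourdbname", "yourtablename", "yourpartitionname")

# Paste the printed JSON into an XMLA query window in SSMS to run it.
print(json.dumps(one_partition, indent=2))
```

Running each variant in turn, and checking the report after each one, shows at which scope (if any) the error starts to appear.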
I hope these ideas get you closer to finding the actual activity that is causing this experience for your users. It would be great if you could post with how you got on and what you found.
I have the same problem here if I install the latest CU on my SQL Server 2017. My production environment is still running with CU3 (Jan/2018) due to this problem.
Knowing that, I would suggest reverting your installation to a previous release, maybe 13.0.5026.0 (SP2) or even 13.0.4466.4 (Jan/2018).
I am facing the same issue with SQL Server 2017 CU 11 installed.
The issue indeed occurs in case of a 'full refresh' in combination with the use of a 'perspective' in an existing connection. The workaround to use the default 'Model' in the connection does indeed 'solve' the issue.
I have a stored procedure as a source connection in Tableau 8.1. It takes a long time (about 1 min) to fetch and display 40000 records (there are no bar charts, pie charts, etc.).
The stored proc selects 40000 records with some 6-7 table joins.
However, the same stored procedure executes and displays the records in SQL Server Management Studio within 3 seconds.
SQL Server Profiler shows that some 45000 inserts into a Tableau temp table occur, which take a long time. The log file also shows that a high percentage of the time goes to the inserts, while the execution of the stored proc itself takes only about 4-5 seconds. Is this the problem? Any suggestions on how to overcome this issue?
Regards
Gautam S
A few places to start:
First check out the Tableau log file in your Tableau repository directory after trying to access your data. There will be a lot of information in there, but you should be able to see the actual SQL that Tableau sends to your database -- and that may give you some clues about what it is doing that is taking so long. If nothing else, you can cut and paste the SQL into your database tools and try to isolate the problem SQL without Tableau in the mix.
If the log file doesn't give you an idea about how to restructure your system to avoid the long query, then send it along with info about your schema to Tableau support. They may be able to help.
Simplify whatever you can to reduce the problem to its core, get rid of everything in your visualization but a total, and then slowly build it back up to see what causes the behavior. For example, make a test version and remove one table at a time from your query to see what causes the problem.
Avoid using quick filters (or minimize them) if you see performance problems. They are a nice feature, but come with a performance cost.
Try the Tableau performance monitoring (record and analysis) features
Work with a smaller data set during testing so you can more quickly experiment with different approaches
Try replacing your stored procedure with a view. That's usually better if at all possible.
Add indexes to speed up the joins.
If there is no way around the long operation and if updates are infrequent, make a Tableau extract so that you only pay that cost periodically.
If none of these things help, cut the problem down to its simplest version and post a schema and the problem SQL. Otherwise, people can only give you generic advice.
I am dealing with someone else's backup Maintenance Plan and have an issue with the log file. I have a database that sits on one drive with a size of 31 GB and a log file that sits on another server with a size of 20 GB. The database is in the Full recovery model. There is a maintenance plan that runs once a day to do a complete backup and a second plan that backs up the log file every 15 minutes. I have checked the drive that the log file gets backed up to, and there is still plenty of room, but the log file never gets smaller after the backup. Is there something missing from the maintenance plan?
Thanks in advance
The situation as you describe it seems fine.
A transaction log backup does not shrink the log file. However, it does truncate the log, which means that space can be reused:
From Books Online (Transaction Log Truncation):
Log truncation automatically frees space in the logical log for reuse by the transaction log.
Also, from Managing the Transaction Log:
Log truncation, which is automatic under the simple recovery model, is essential to keep the log from filling. The truncation process reduces the size of the logical log file by marking as inactive the virtual log files that do not hold any part of the logical log.
This means that each time the transaction log backup occurs in your scenario, it's creating free space in the file which can be used by subsequent transactions.
Leading on from this, should you shrink the file as well? Generally speaking, the answer is no. Assuming your database does not suddenly have massive one-off spikes in usage, the transaction log will have grown to a size to accommodate the typical workload.
This means that if you start shrinking the log, SQL Server will just need to grow it again. Growing the log is a resource-intensive operation that affects server performance, and no transactions can complete while the log is growing.
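To make the difference between truncating and shrinking concrete, here is a small toy model, not SQL Server's actual implementation. It treats the log file as a list of virtual log files (VLFs); the class names `Vlf` and `LogFile` are made up for illustration, and `backup_log` assumes no open transaction keeps any VLF active.

```python
from dataclasses import dataclass


@dataclass
class Vlf:
    size_mb: int
    active: bool = False


class LogFile:
    """Toy model: a log file made of fixed-size virtual log files (VLFs)."""

    def __init__(self, vlf_count, vlf_size_mb):
        self.vlf_size_mb = vlf_size_mb
        self.vlfs = [Vlf(vlf_size_mb) for _ in range(vlf_count)]

    def file_size_mb(self):
        return sum(v.size_mb for v in self.vlfs)

    def reusable_mb(self):
        return sum(v.size_mb for v in self.vlfs if not v.active)

    def write(self, vlfs_needed):
        """Use inactive VLFs first; grow the file only when none are left."""
        for _ in range(vlfs_needed):
            free = next((v for v in self.vlfs if not v.active), None)
            if free is None:
                free = Vlf(self.vlf_size_mb)  # file growth
                self.vlfs.append(free)
            free.active = True

    def backup_log(self):
        """Truncation: mark VLFs inactive. The file itself is not shrunk."""
        for v in self.vlfs:
            v.active = False


log = LogFile(vlf_count=4, vlf_size_mb=256)
log.write(3)
size_before = log.file_size_mb()
log.backup_log()
size_after_backup = log.file_size_mb()   # unchanged by the backup
reusable_after_backup = log.reusable_mb()
log.write(4)                             # fits in the reusable space
size_after_reuse = log.file_size_mb()
log.write(1)                             # no free VLF left, so the file grows
size_after_growth = log.file_size_mb()
```

In this model the file size on disk only changes on growth (or an explicit shrink, which is not modelled), which is why the file in the question stays at the same size after every log backup.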
The current plan and file sizes all seem reasonable to me.
I don't know if this applies to your situation, but earlier versions of SQL Server 2012 have a bug that crops up when the model database is set to the Simple recovery model. For any database created while model is set to Simple, log files will continue to grow in an attempt to reach the 2,097,152 MB limit. This still applies if you alter the database to Full afterwards. KB article 2830400 states that altering to Full, then altering back to Simple is a workaround -- that was not my experience. Installing CU 7 for SP1 was the only thing that worked for me.
The article provides links for the first updates that resolved this bug: "Cumulative Update 4 for SQL Server 2012 SP1", as well as, "Cumulative Update 7 for SQL Server 2012" (if you haven't installed SP1).
If you change the recovery to full and then back to simple, the shrink will work successfully.
Seems crazy to be doing this at this late date, but...
I am rebuilding some ETL infrastructure with a Rocket Software UniVerse source and an SQL destination. The old destination platform was SQL 2000 on Windows Server 2003; the new platform is SQL 2012 on Windows Server 2012. In both cases, an ODBC driver is used to connect to the source. Everything seems to work fine on the new platform, but the execution time for a package is several times longer. For example, one table with roughly 1.3 million rows and 28 columns takes about an hour using SQL 2000/DTS and over 3.5 hours using SQL 2012/SSIS. Both SQL servers are virtualized on Xen Server. The 2012 server has more RAM and more vCPUs, and neither machine has an advantage in disk infrastructure. No metrics (memory, disk IO, etc.) are red-lining (or really even coming close) on the 2012 server during package execution.
I have read several forum posts describing the same scenario, but none really seemed to have a solution that works for me. Since all of these posts were quite dated (most of these conversions from DTS to SSIS happened in the SQL 2005 days), I was curious if there was any fresher info out there.
The packages are very simple table copies, no transforms. I am using a "SELECT column, column,.. FROM sourcetable" for my source connection and 'Table or View - Fast Load' for my destination. The slow down APPEARS to be on the source side of the equation, though I can't be certain.
Any help appreciated.
One option to investigate is lowering the buffer size in your data flow (the DefaultBufferMaxRows property). By default, it's set at 10k rows. If you have a slow data source, it can take quite a while to fill up the "bucket" of data just to start sending a batch of information down to the destination. While it might seem counterintuitive, lowering that number can increase performance, as 5k, 1k or 100 rows of data fill the bucket much sooner. That data then gets shuffled through the data flow and lands in the destination while buckets 2, 3, etc. are being filled.
If you have a SQL Server source, you can optimize your query by hinting that you'd like a fast N rows (the OPTION (FAST N) query hint), with N aligned to your SSIS package's buffer row count.
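As a rough back-of-the-envelope illustration of the bucket argument, the sketch below estimates how long the destination waits for its first batch at different buffer sizes. It assumes the source delivers rows at a constant rate and that a buffer is only handed downstream once it is full; it ignores the byte-size limit on buffers. The rate is derived from the figures in the question (1.3 million rows in 3.5 hours), and the function name is made up for illustration.

```python
def seconds_to_fill_buffer(buffer_rows, source_rows_per_second):
    """Time until the first full buffer can leave the source component."""
    return buffer_rows / source_rows_per_second


# Observed throughput from the question: 1.3 million rows in 3.5 hours.
source_rate = 1_300_000 / (3.5 * 3600)  # rows per second

for buffer_rows in (10_000, 5_000, 1_000, 100):
    wait = seconds_to_fill_buffer(buffer_rows, source_rate)
    print(f"{buffer_rows:>6} rows per buffer: first batch after {wait:.1f} s")
```

With a source this slow, the default of 10k rows means the destination sits idle for well over a minute before it sees any data, whereas a 100-row buffer starts flowing in about a second. Whether the total run time improves still has to be measured on the real package.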
See Rob Farley's article for more details about that.