I am trying to build an SSAS 2014 cube using Teradata 14.10 as the relational back end. I am using the .NET driver to connect to the database and have set the response buffer size to 65477.
I notice that the cube builds extremely slowly, and one of the reasons for this is of course the sheer amount of data being returned. The query itself executes extremely fast; it's the data transfer that takes hours.
Are there any tips to improve the build time? Has anyone else done this, perhaps by tweaking some settings or using some other method to improve the response time for the data transfer?
Currently I am building a model in Visual Studio for Azure Analysis Services, but I am experiencing very slow performance of the Power Query editor.
I am trying to do a left join on a table of about 1,600,000 rows. The table I am joining with is around 50 million rows. The merge step works, but when I try to expand the columns it downloads all 50 million rows for some reason; at least, the status bar at the bottom indicates this.
This is quite annoying as it will do this every time I try to edit the query sequence.
I have already tried setting several indexes on the SQL table.
The Azure SQL server does not show usage peaks of 100%, at most around 80% sometimes.
Any ideas how to solve this?
I noticed in SSMS that the Power Query editor creates so-called folded queries and also introduces sorting statements which I didn't set up in the editor.
So I fixed my performance issues by enabling legacy data sources in the Visual Studio options. With this I can write my own SQL statements, which are many times faster.
Does anyone know why this is happening in the Power Query editor, and whether using this legacy way of working has drawbacks compared to the editor?
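For example, instead of letting the editor merge and then expand the columns, with a legacy data source I can push the whole join down to the server with a plain T-SQL statement along these lines (table and column names are made up purely for illustration):

    SELECT s.*,                         -- the ~1.6M-row table
           b.SomeAttribute,             -- only the columns I actually need from the big table
           b.AnotherAttribute
    FROM   dbo.SmallTable AS s
    LEFT JOIN dbo.BigTable AS b         -- the ~50M-row table stays on the server
           ON b.JoinKey = s.JoinKey;

The server then returns only the joined result instead of streaming the whole 50M-row table to the editor.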
#alejandro: I need Analysis Services mainly for the fast cache it provides. I tried to load the tables into Power BI directly, but this became totally unresponsive.
I have a stored procedure in Azure DW which runs very slowly. I copied all the tables and the SP to a different server, and there it takes much less time to execute. I have created the tables using HASH distribution on the unique field, but the SP is still running very slowly. Please advise how I can improve the performance of the SP in Azure DW.
From your latest comment, the data sample is way too small for any reasonable tests on SQL DW. Remember SQL DW is MPP while your local on-premises SQL Server is SMP. Even with DWU100, the underlying layout of this MPP architecture is very different from your local SQL Server. For instance, every SQL DW has 60 user databases powering the DW and data is spread across them. Default storage is clustered columnstore, which is optimized for common DW-type workloads.
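For example, a distributed table in SQL DW is typically declared like this (table and column names are illustrative only; pick a hash column with high cardinality that you frequently join or aggregate on, so matching rows land in the same distribution):

    CREATE TABLE dbo.FactSales
    (
        SaleId     BIGINT        NOT NULL,
        CustomerId INT           NOT NULL,
        SaleDate   DATE          NOT NULL,
        SaleAmount DECIMAL(18,2) NOT NULL
    )
    WITH
    (
        DISTRIBUTION = HASH(CustomerId),   -- spreads rows across the 60 distributions by CustomerId
        CLUSTERED COLUMNSTORE INDEX        -- the default storage, good for scan-heavy DW workloads
    );

Hashing on a unique field distributes evenly, but if that column is never used in joins or GROUP BYs the engine still has to move data around at query time.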
When a query is sent to DW, it has to build a distributed query plan that is pushed to the underlying DBs, each of which builds a local plan, then executes it and returns the results back up the stack. This seems like a lot of overhead, and it is for small data sets and simple queries. However, when you are dealing with hundreds of TBs of data with billions of rows and you need to run complex aggregations, this additional overhead is relatively tiny. The benefits you get from the MPP processing power make it inconsequential.
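You can actually inspect that distributed plan by prefixing a query with EXPLAIN, which SQL DW returns as XML, including any data movement steps (the table below is just a placeholder):

    EXPLAIN
    SELECT CustomerId,
           SUM(SaleAmount) AS TotalSales
    FROM   dbo.FactSales
    GROUP  BY CustomerId;

If the plan is full of SHUFFLE_MOVE or BROADCAST_MOVE operations, the distribution keys probably don't line up with your join and aggregation columns, and that is usually where a slow SP is spending its time.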
There's no hard number on the actual size where you'll see real gains, but at least half a TB is a good starting point, and row counts really should be in the tens of millions. Of course, there are always edge cases where your data set might not be huge but the workload naturally lends itself to MPP, so you might still see gains, but that's not common. If your data size is in the tens or low hundreds of GB range and won't grow significantly, you're likely to be much happier with Azure SQL Database.
As for resource class management, workload monitoring, etc., check out the following links (a small resource-class example follows after them):
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-develop-concurrency
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-manage-monitor
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-best-practices
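As a quick illustration of the resource class piece (the user name is hypothetical; largerc is one of the built-in dynamic resource classes), giving a loading or ELT account a bigger memory grant is just a role membership in the SQL DW user database:

    -- Give the load account a larger resource class
    EXEC sp_addrolemember 'largerc', 'LoadUser';

    -- Drop it back down when the heavy work is done
    EXEC sp_droprolemember 'largerc', 'LoadUser';

Keep in mind that bigger resource classes also reduce the number of concurrent queries, so only promote the accounts that really need it.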
I have SQL 2012 tabular and multidimensional models which are currently being processed through SQL Agent jobs. All the models are processed with the 'Process Full' option every day; however, some of the models are taking a long time to process. Can anyone tell me which processing option is best and will not affect the performance of the SQL instance?
Without taking a look at your DB it's hard to know, but maybe I can give you a couple of hints:
It depends on how the data in the DB is being updated. If all the fact table data is deleted and inserted every night, Process Full is probably the best way to go. But maybe you can partition the data by date and process only the affected partitions (see the sketch after these hints).
On a multidimensional model you should check whether aggregations are taking too much time. If so, you should consider redesigning them.
On tabular models I have found that sometimes having unnecessarily big varchar fields can take huge amounts of time and memory to process. I found Kasper de Jonge's server memory analysis very helpful for identifying this kind of problem:
http://www.powerpivotblog.nl/what-is-using-all-that-memory-on-my-analysis-server-instance/bismservermemoryreport/
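To illustrate the partitioning hint above: each measure group partition can be bound to a date-sliced source query, so a nightly job only reprocesses the slice that actually changed instead of doing Process Full on everything. A minimal sketch of such a source query, with hypothetical table and key names:

    -- Source query for the "FactSales 2017-09" partition (names are hypothetical)
    SELECT f.*
    FROM   dbo.FactSales AS f
    WHERE  f.OrderDateKey >= 20170901
      AND  f.OrderDateKey <  20171001;

The other partitions keep their previously processed data, so the nightly window shrinks to roughly the size of the newest slice.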
I am currently working on a project to improve cube processing time. The cube currently consists of 50 facts and 160 dimensions, and it takes about 4 hours to process. What would be the best way to benchmark cube processing performance before embarking on troubleshooting bottlenecks? The largest dimension consists of about nine million records, while the largest fact table consists of about 250 million records. How would you go about finding bottlenecks, and which parameters influence the processing time the most? Any help is highly appreciated.
Having done a lot of SSAS processing optimization, I would start with some monitoring. First, set up some performance counters to monitor available memory, disk, CPU and network on both the SSAS server and the database server. Some good perfmon counters (and recommendations about baselining processing performance) are in section 4.1.1 of the SSAS performance guide.
Second, I would start a Profiler trace connected to SSAS with the default events. Then, when processing is done, use Save As... Trace Table in Profiler, and look for the longest-duration events in the SQL table you saved it to. Then you know where to spend your time optimizing.
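Once the trace is saved to a table, a query like the following surfaces the slowest events (the table name is whatever you chose in Save As... Trace Table; the columns shown are common defaults and depend on what you captured):

    SELECT TOP (20)
           EventClass,      -- numeric event class; match it up against what you saw in Profiler
           TextData,        -- the processing command or SQL query that was issued
           ObjectName,      -- dimension / measure group / partition being processed, if captured
           Duration,        -- how long the event took
           StartTime,
           EndTime
    FROM   dbo.SsasProcessingTrace      -- hypothetical name of the saved trace table
    ORDER  BY Duration DESC;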
Feel free to write back with your specific longest duration events if you need help. Please also specify exactly how you are processing (like ProcessFull on the database or something else).
Hi all, I'm completely new to maintenance tasks on SQL Server. I've set up a data warehouse that basically reads a load of XML files and imports the data into several tables using an SSIS package. I've set indexes on the tables concerned and optimized my SSIS, but I know I should perform some maintenance tasks and I don't really know where to begin. We are talking about quite a bit of data: we keep data for up to 6 months, so far we have 3 months' worth, and the database is currently 147142.44 MB with roughly 57690230 rows in the main table, so it could easily double in size. Just wondering what your recommendations are?
While there are the usual index rebuilds and statistics updates which are part of normal maintenance, I would look at all of the currently long-running queries and try to do some index tuning before the data size grows. Resizing the database also forms part of a normal maintenance plan: if you can predict the growth and allocate enough space between maintenance runs, you can avoid the performance hit of space auto-allocation (which will always happen at the worst possible time).
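As a starting point, the routine part of that can be as simple as the following (database, file and table names are placeholders; tune the sizes and schedule to your environment):

    -- Rebuild indexes and refresh statistics on the big fact table
    ALTER INDEX ALL ON dbo.MainFactTable REBUILD;        -- use REORGANIZE for light fragmentation
    UPDATE STATISTICS dbo.MainFactTable WITH FULLSCAN;

    -- Pre-grow the data file so auto-growth doesn't kick in mid-load
    ALTER DATABASE MyWarehouse
    MODIFY FILE (NAME = N'MyWarehouse_Data', SIZE = 300GB);

Wrapping these in a maintenance plan or SQL Agent job scheduled outside your SSIS load window covers the basics; the index tuning of the long-running queries is the part that needs hands-on analysis.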