How to improve reads from OLTP when reprocessing dimension attributes in SSAS?

I'm working to reduce the time we spend reprocessing our SSAS cube.
The slowest part is processing the dimensions (Process Update or Process Add);
specifically, the bottleneck is when SSAS is reading from the OLTP source.
When I open Profiler I can see the "Read Data" event, and its "Integer Data" value increases in chunks of 10,000. Based on this I have the following questions:
Does SSAS read and write data in chunks of 10,000?
How can I increase that number?
Any suggestions to improve the performance?
PS: I already created indexes on the source tables and increased ThreadPool\Process\MaxThreads.
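For reference, the kind of index I mean looks roughly like the following; the table and column names are purely illustrative. (During a dimension Process Update, SSAS issues roughly one SELECT DISTINCT per attribute against the dimension's source table, so a covering index helps those reads.)

    -- Illustrative covering index for a single dimension attribute.
    -- Process Update reads something like:
    --   SELECT DISTINCT CityKey, CityName FROM dbo.DimCustomer
    -- so an index covering those columns lets SSAS read from the index
    -- instead of scanning the whole source table.
    CREATE NONCLUSTERED INDEX IX_DimCustomer_City
        ON dbo.DimCustomer (CityKey)
        INCLUDE (CityName);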

Related

How to debug and improve memory consumption on PowerBI Embedded service

I am currently using the Power BI Embedded service from Azure with an A1 unit, which is constantly reaching peak memory consumption and thus causing errors in the visualization of production reports.
1) Is there any way to identify which reports/pages/visuals are consuming the largest share of memory?
2) What would be the overall best strategy (on a high-level, general analysis) to reduce required memory? Would that be reducing the amount of data being loaded, reducing the number of pages, reducing the number of visuals, or any other possible strategy?
You can deploy the Power BI Premium Capacity Metrics app; it works for capacities, both Premium and Embedded. It will show dataset memory usage and other metrics on the capacity.
1) Is there any way to identify which reports/pages/visuals are consuming the largest share of memory?
It will give a good overview of memory usage and what's causing it to time out or evict datasets and reports. Check the link for the full metrics list.
2) What would be the overall best strategy (on a high-level, general analysis) to reduce required memory? Would that be reducing the amount of data being loaded, reducing the number of pages, reducing the number of visuals, or any other possible strategy?
Yes: reduce dataset sizes, and look out for reports that pull in a large number of columns but only use a few of them. Look at badly written queries and data models. For visuals, each visual on a page is a query, and each query uses memory. I've had cases where people had 30 visuals on a page, and reducing them made things a lot quicker.
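If your capacity exposes the XMLA read endpoint (or you can test the same model on an Analysis Services instance), a rough way to see which columns hold the most memory is a DMV query along these lines. This is only a sketch run from SSMS against the model; the exact columns and ordering support can vary by version.

    -- Sketch: per-segment memory use by column, largest first.
    -- USED_SIZE is in bytes; COLUMN_ID is the engine's internal column name.
    SELECT DIMENSION_NAME, COLUMN_ID, SEGMENT_NUMBER, USED_SIZE
    FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS
    ORDER BY USED_SIZE DESC;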
Look at the usage: are lots of reports being loaded at once? This can lead to dataset evictions, where a dataset is dumped out of memory because other reports are taking priority. The metrics app will give you some pointers as to what is happening; you'll have to take it from there and determine the root cause.
As it is an A SKU, you can set up an Azure Automation/Logic App to scale the SKU up and down, or even pause it when needed. Also, A1 and A2 are shared capacity rather than dedicated (A3 onwards), so you may have to account for noisy-neighbour issues in the background, which will not show up in the metrics app.
Hope that helps

SSAS Tabular - Other options to update tables rather than Process Full...?

We have a fairly large SSAS Tabular cube with many different tables (some of which contain measures, dimensions, etc.). On occasion we run into scenarios where I have to optimize the cube partitions (break them into smaller parts) or the cube structure so that not as much memory is consumed when it processes (daily). Occasionally we've had to increase the memory limits of the server just to make sure the job doesn't crash.
One of our SQL Server consultants asked if we had considered changing the process mode in the scripted job to 'Default' rather than 'Full' (since every table in the script is set to Full process mode). I said I hadn't considered this, but my concern, based on my research, is that Default won't actually update the data and will really only bring the table's structure up to date if it changes in some way.
I need a processing mode that will just pull in any new rows (and update any rows that have changed) since the last time the partition was processed. Is there any mode which accomplishes this other than Process Full (which obviously wipes the partition it's processing and rebuilds the entire thing = memory intensive)? Anything less memory intensive that will still pull in new rows and update outdated ones?
FYI, all the tables are based on SQL queries.
One option is doing a Process Data instead of a Process Full on the tables in your tabular model. You may also want to consider implementing partitioning in your tables in order to take advantage of SSAS's ability to process partitions in parallel. Since your tables are already based on SQL queries, you'll only need to modify the filters in the queries so that the data is split cleanly across the partitions. Partitioning the tables will also allow for incremental processing, using Process Add to incrementally update a partition. Looking into other ways to reduce unnecessary memory use, such as removing unused columns and replacing calculated columns where possible (read about the cost of calculated columns here), will also help with the memory issues.
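As an illustration of those partition filters (table, column, and date ranges are made up for the example), each partition's source query covers its own date range, so a row never lands in two partitions and only the newest slice needs frequent processing:

    -- Partition 'FactSales 2023' (historical, rarely reprocessed)
    SELECT *
    FROM dbo.FactSales
    WHERE OrderDate >= '2023-01-01' AND OrderDate < '2024-01-01';

    -- Partition 'FactSales Current' (small; the only partition that needs daily processing)
    SELECT *
    FROM dbo.FactSales
    WHERE OrderDate >= '2024-01-01';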

Which SSAS cube processing option gives better performance?

I have SQL Server 2012 tabular and multidimensional models which are currently processed through SQL Agent jobs. All the models are processed with the 'Process Full' option every day. However, some of the models take a long time to process. Can anyone tell me which processing option is best and will not affect the performance of the SQL instance?
Without taking a look at your DB it is hard to know, but maybe I can give you a couple of hints:
It depends on how the data in the DB is being updated. If all the fact table data is deleted and re-inserted every night, Process Full is probably the best way to go. But maybe you can partition the data by date and process only the affected partitions.
On a multidimensional model you should check whether aggregations are taking too much time. If so, you should consider redesigning them.
On tabular models I have found that sometimes having unnecessarily big varchar fields can take huge amounts of time and memory to process. I found Kasper de Jonge's server memory analysis very helpful for identifying this kind of problem:
http://www.powerpivotblog.nl/what-is-using-all-that-memory-on-my-analysis-server-instance/bismservermemoryreport/
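If you just want a quick look without setting up the workbook, the same idea can be approximated with an Analysis Services DMV query run from SSMS against the tabular instance. This is only a sketch; the exact columns available can vary by version.

    -- Sketch: list columns by dictionary size to spot fat varchar columns.
    -- DICTIONARY_SIZE is in bytes; large string dictionaries are usually the culprits.
    SELECT DIMENSION_NAME, ATTRIBUTE_NAME, DICTIONARY_SIZE
    FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS
    WHERE DICTIONARY_SIZE > 0
    ORDER BY DICTIONARY_SIZE DESC;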

Benchmarking Cube Processing in SSAS

I am currently working on a project to improve cube processing time. The cube currently consists of 50 facts and 160 dimensions, and it takes about 4 hours to process. What would be the best way to benchmark cube processing performance before embarking on troubleshooting bottlenecks? The largest dimension consists of about nine million records, while the largest fact table consists of about 250 million records. How would you go about finding bottlenecks, and what parameters influence the processing time the most? Any help is highly appreciated.
Having done a lot of SSAS processing optimization, I would start with some monitoring. First, set up some performance counters to monitor available memory, disk, CPU and network on both the SSAS server and the database server. Some good perfmon counters (and recommendations about baselining processing performance) are in section 4.1.1 of the SSAS performance guide.
Second, I would start a Profiler trace connected to SSAS with the default events. When processing is done, use Save As... > Trace Table in Profiler, then look for the longest-duration events in the SQL table you saved it to (a sample query follows below). That tells you where to spend your time optimizing.
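A sketch of that last step; the table name here is hypothetical (it is whatever you chose in Save As), and EventClass/EventSubclass may come through as numeric codes depending on how the trace was saved:

    -- Longest-running events from the saved processing trace.
    -- Duration units depend on how the trace was captured (ms vs. microseconds),
    -- but the relative ordering is what matters.
    SELECT TOP (20)
           EventClass,
           EventSubclass,
           TextData,
           Duration
    FROM dbo.SSASProcessingTrace   -- hypothetical table name chosen in Save As
    ORDER BY Duration DESC;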
Feel free to write back with your specific longest duration events if you need help. Please also specify exactly how you are processing (like ProcessFull on the database or something else).

SQL Server 2005 Analysis Services data update

I'm new to Analysis Services.
My first cube has been deployed and it seems to work.
The dimension tables are OK and the fact tables are OK.
My question is very simple: if I add a new record to the related data source table,
then when browsing the cube I don't see the new record until I process the cube again.
In my mind, if new records are added, then the cube should reflect the changes.
How do I solve this? Do I need to reprocess the cube every time a new record is added? That is impossible, of course.
You understand that your cube essentially represents a bunch of aggregated measures? That means that when the cube is processed it looks at all the data in your fact tables and processes the measures (according to the dimensions).
The result is that you're able to access the data in the cube quickly and efficiently. The downside, as you have mentioned, is that when new data is added to the fact table the cube isn't updated.
Typically there will be a daily batch job that updates the cube with the latest fact data; depending on the amount of data you have and your "real-time" requirements, this could be done more than once per day. A lot of people do this out of hours.
If you look closely in BIDS you will notice on the Partitions tab that each partition has a Storage Mode which you can define.
I would recommend you read this article: http://sqlblog.com/blogs/jorg_klein/archive/2008/03/27/ssas-molap-rolap-and-holap-storage-types.aspx
Basically, there are a few different modes you can use:
MOLAP (Multidimensional Online Analytical Processing)
MOLAP is the most used storage type. It is designed to offer maximum query performance to users. Data AND aggregations are stored in an optimized format in the cube. The data inside the cube is refreshed only when the cube is processed, so latency is high.
ROLAP (Relational Online Analytical Processing)
ROLAP does not have the high-latency disadvantage of MOLAP. With ROLAP, the data and aggregations are stored in relational format. This means that there is zero latency between the relational source database and the cube.
The disadvantage of this mode is performance: it gives the poorest query performance because no objects benefit from multidimensional storage.
HOLAP (Hybrid Online Analytical Processing)
HOLAP is a storage type between MOLAP and ROLAP. Data is stored in relational format (ROLAP), so there is also zero latency with this storage type.
Aggregations, on the other hand, are stored in multidimensional format (MOLAP) in the cube to give better query performance. SSAS listens for notifications from the source relational database; when changes are made, SSAS gets a notification and processes the aggregations again.
With this mode it's possible to offer zero latency to users, but with medium query performance compared to MOLAP and ROLAP.
To get real-time reporting without having to reprocess your cube you will need to try ROLAP, but beware: performance will suffer (depending on the size of your cube and server!).