SSAS Partition Slice Expression

I am partitioning my cube into the most recent 13 months plus a legacy partition that holds older months.
I have successfully created dynamic partitions, but now I need to add a dynamic slice to each partition.
I thought I could use this in the Partition Slice Expression:
[Dim Date].[Month].&[" + CStr(Month(Now())) + "].lag(8)
but it's failing. Does anyone have any ideas?

I tried all day, but ultimately concluded that partition slice expressions don't accept anything that is not a dimension member value.
To be clear, my goal was to create dynamic partitioning using the 14 partitions described above. Best practice advises also setting slices on the partitions (per Mosha's article), but since my partitions are dynamic, my slices needed to be dynamic as well.
I finally added an attribute to my Date dimension that mimics the dynamic labeling of the 14 partitions I wanted to create. Then I referenced the new attribute's member values in each of the corresponding partition slices, essentially moving the "dynamic" part of the slices into the cube structure.
It works great and gives me another useful dimension attribute. I have also partitioned the fact table in the data warehouse into the same 14 partitions using a partition scheme, filegroups, etc. As an added bonus, since everything is dynamic, my SSIS package is much less complex and does not require DDL tasks to move partitions around.
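As a rough sketch of the relational side of this, assuming the date dimension table is dbo.DimDate with a DateKey date column (all names here are hypothetical), the derived partition label could look like this:

    -- Hypothetical T-SQL: derive a label that matches the 14 cube partitions,
    -- i.e. one 'yyyymm' label per month for the most recent 13 months, and
    -- 'Legacy' for everything older.
    SELECT d.DateKey,
           CASE
               WHEN d.DateKey >= DATEADD(MONTH, -12,
                        DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1))
                   THEN CONVERT(char(6), d.DateKey, 112)  -- 'yyyymm'
               ELSE 'Legacy'
           END AS PartitionLabel
    FROM dbo.DimDate AS d;

Each partition's slice can then reference the corresponding PartitionLabel member, and the labels shift automatically as months roll over.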

Where are you doing this?
You should partition the data warehouse tables behind your cube using T-SQL, not DMX queries.
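For reference, a minimal sketch of what that relational partitioning looks like in T-SQL, with hypothetical names and yyyymm-style integer month keys:

    -- Hypothetical: a partition function that splits rows on month keys, and a
    -- scheme that maps every partition to the PRIMARY filegroup for simplicity.
    CREATE PARTITION FUNCTION pfMonth (int)
    AS RANGE RIGHT FOR VALUES (202301, 202302, 202303);  -- one boundary per month

    CREATE PARTITION SCHEME psMonth
    AS PARTITION pfMonth ALL TO ([PRIMARY]);

    CREATE TABLE dbo.FactSales (
        MonthKey    int   NOT NULL,
        CustomerKey int   NOT NULL,
        Amount      money NOT NULL
    ) ON psMonth (MonthKey);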

Related

How to handle a large dimension in BigQuery

I have a dimension table in my current warehouse (Netezza) which has 10 million records and which is being updated on a daily basis.
Should we keep this dimension table as-is in BigQuery, given that we are planning to migrate to BigQuery?
How can we redesign this large dimension in BigQuery?
Because BigQuery is not intended for updates, it's not that easy to implement a dimension table. The proper answer depends on your use case.
But here are some alternatives:
Have an append-only dimension table with an "UpdatedAt" field. Then use a window function to get the latest version (you can even create a view that exposes only the latest version; see the sketch after this list).
Truncate and reload the dimension table daily with the latest version of your data.
Create an external table based on GCS / Big Table / Cloud SQL, and have the dimensions updated there.
Save your dimension table in a separate database, and use Cloud Dataflow to perform the join
Save the dimension data together with the fact table (Yes, there will be a lot of duplications, but sometimes it's worth the cost)
Simply update the dimension table whenever there is a change (there are quota limits on doing that).
All of these approaches have drawbacks. The solution can even be a mix of more than one approach.
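As an illustration of the first alternative, here is a minimal BigQuery Standard SQL sketch, assuming a hypothetical append-only table my_dataset.dim_customer with customer_id and updated_at columns:

    -- Hypothetical view returning only the latest version of each dimension record.
    CREATE OR REPLACE VIEW my_dataset.dim_customer_current AS
    SELECT * EXCEPT (rn)
    FROM (
      SELECT *,
             ROW_NUMBER() OVER (PARTITION BY customer_id
                                ORDER BY updated_at DESC) AS rn
      FROM my_dataset.dim_customer
    )
    WHERE rn = 1;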

OLAP CUBE Aggregations and Partitions

I've been optimizing many cubes that took a long time to process, approximately 20 minutes per 10 million rows. I created partitions and processing became short, about 4 minutes per 10 million rows. I also created one aggregation design for all partitions with full MOLAP processing and 100% aggregation (the cube is not that big). Is there any reason to create an aggregation design for each partition? Will it work faster when a user tries to refresh a pivot table based on the OLAP cube?
Thanks.
Typically you have one aggregation design shared by all partitions in a measure group. On very large measure groups you might have a second lightweight aggregation design for very old rarely used partitions.
Adding lots of aggregation designs (like a separate one per partition) will likely slow down queries a tiny bit because of all the extra time it takes internally to figure out which aggregation to read from.
If you used the aggregation wizard, don't bother. It knows nothing about how you query your cube and will create stupid useless aggs that waste processing time. Instead, deploy your cube, then go back in a few days after users have run some queries and do Usage Based Optimization.
Creating partitions is a good way to improve the cube processing time.
Aggregations are useful if designed on the correct attributes. By correct I mean the filter selections used most frequently by the users. Usage Based Optimization is a good approach to achieve this.
Also read through the article below to understand the approach used when checking performance.
https://mytechconnect.wordpress.com/2013/08/27/ssas-performance-best-practices-and-performance-optimization/
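Usage Based Optimization is driven by the SSAS query log, so it is worth checking what the log has captured first. A hedged example, assuming query logging is enabled and writes to the default dbo.OlapQueryLog table in a SQL Server database:

    -- Inspect the slowest logged queries that Usage Based Optimization will see.
    -- Table and column names follow the default SSAS query log schema.
    SELECT TOP (20)
           MSOLAP_Database,
           MSOLAP_ObjectPath,
           Dataset,    -- attribute vector describing which attributes the query touched
           Duration    -- milliseconds
    FROM dbo.OlapQueryLog
    ORDER BY Duration DESC;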

SSAS: updating a dimension won't drop aggregations, and ProcessIndexes won't rebuild them

I have an 'Employee' dimension which is modified every day. I made monthly partitions in the cube and only run ProcessFull on the current month's partition. Lately I found that past months' aggregations are not dropped. I tried 'ProcessUpdate' on this dimension and 'ProcessIndexes' on the partitions, but the result stayed the same. I also tried the 'ProcessAffectedObjects' setting and 'ProcessIndexes' again, still the same, and tried LazyProcessing both true and false with no luck.
So my question is: how do I drop the stale aggregations for past months and rebuild them explicitly?
It is a distinct count measure, and no aggregations were designed via the wizard.
I tried dropping the indexes with a 'ProcessClearIndexes' XMLA command; it worked fine, and 'ProcessIndexes' did rebuild the indexes and aggregations. I saw them in the SSMS query execution messages.
So might it be related only to the distinct count, just because it is a non-additive measure?
"Non-additive measures create the following problems on a typical OLAP system:
Roll-ups are not possible. When pre-calculating results during cube processing, the system cannot deduce summaries from other summaries. All results must be calculated from the detail data. This situation places a heavy burden in processing time.
All results must be pre-calculated. With non-additive measures, there is no way to deduce the result for a higher-level summary query from one pre-calculated aggregation. Failure to pre-calculate the results in advance means that the results are not available. It is impossible to perform and maintain incremental updates to the system. A single transaction added to the cube usually invalidates huge portions of previously pre-calculated results. In order to recover from this, a complete recalculation is needed."
"Aggregations
As mentioned before, DISTINCT COUNTs are not additive (and this is the main reason why these measures are so problematic). Therefore, the aggregations, which are all derived from additive operators, are completely useless;"
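A tiny T-SQL illustration of that non-additivity, with hypothetical table and column names:

    -- Distinct counts cannot be rolled up from per-partition summaries.
    DECLARE @FactSales TABLE (MonthKey int, CustomerKey int);
    INSERT INTO @FactSales VALUES (202301, 1), (202301, 2), (202302, 2), (202302, 3);

    -- Per-month distinct counts: 2 and 2.
    SELECT MonthKey, COUNT(DISTINCT CustomerKey) AS DistinctCustomers
    FROM @FactSales
    GROUP BY MonthKey;

    -- Overall distinct count: 3, not 2 + 2, because customer 2 appears in both months.
    SELECT COUNT(DISTINCT CustomerKey) AS DistinctCustomers
    FROM @FactSales;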
Someone answered my question on MSDN:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/7302227f-11b8-4adc-98ff-72b6c395775b/ssas-update-a-dimension-wont-drop-aggregation-process-index-wont-rebuild-aggregation?forum=sqlanalysisservices
If you use materialized reference dimensions, ensure you do a ProcessFull to fully reprocess the fact tables again. The reason is that the join to the intermediate dimension happens in the measure group partition processing query:
http://sqlblog.com/blogs/alberto_ferrari/archive/2009/02/25/ssas-reference-materialized-dimension-might-produce-incorrect-results.aspx

When to rebuild SSAS aggregation designs?

We have an "age" dimension in our SSAS Cube. It's basically just the one attribute that's the person's whole number age at the time an event happened. We've had a requirement to further break it down into adult/child with a sub group of adult/geriatric and pediatric/neonatal.
When adding these new attributes to the dimension and a hierarchy, do I have to go into the aggregation designs and rebuild the ones that reference the dimension?
We aren't changing the key of the patient age, just adding the extra data.
Unfortunately, your aggregations won't include the new level automatically, but they will still help: the engine can use aggregations of the same dimension at a lower level than your new one, which is faster than retrieving from the data files.
Please also remember the '1/3 rule': aggregations should be less than 1/3 the size of the fact table.
You can find the details in the excellent white paper 'Analysis Services 2008 R2 Performance Guide' http://download.microsoft.com/download/6/5/6/6567C845-FC8D-4D62-920F-C027A349C889/SSASPerfGuide2008R2.pdf (3.4 Aggregations, page 60).

How can we use the same partition scheme with a different partition function?

I'm learning table partitioning.
When I read this page, it said that
The TransactionHistoryArchive table must have the same design schema as the TransactionHistory table. There must also be an empty partition to receive the new data. In this case, TransactionHistoryArchive is a partitioned table that consists of just two partitions.
And with the following picture, we can see that TransactionHistory has 12 partitions, but TransactionHistoryArchive just has 2 partitions.
Illustration http://i.msdn.microsoft.com/dynimg/IC38652.gif
How is that possible? Please help me understand it.
As long as two individual partitions have identical schemas and the same boundary values, you can switch them. They don't need to use the same partition scheme or function.
This is because SQL Server ensures that the binary data of those partitions on disk is compatible. That's the magic of partitioning and why you can move arbitrary amounts of data as a quick metadata-only operation.
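A minimal sketch of the switch itself, using the table names from the article (the partition numbers are illustrative):

    -- Move the oldest populated partition of TransactionHistory into the empty
    -- partition of TransactionHistoryArchive. Both sides must have identical
    -- schemas and matching boundary values for the switch to succeed.
    ALTER TABLE Production.TransactionHistory
        SWITCH PARTITION 1
        TO Production.TransactionHistoryArchive PARTITION 2;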