Why is Mondrian pre-aggegating before calculating Average? - mdx

I have a super-simple analyzer report, where all I'm doing is calculating the average of a measure. In the schema, it's default aggregation is AVERAGE. The only other aspect of the report is a filter on date, where I restrict it to being within a list of 3 dates.
What's odd is that it appears that Mondrian is actually calculating the average for each date BEFORE averaging those 3 numbers to get the value displayed in the report. This seems very wrong (the report only has that one average displayed - no other fields).
I don't know MDX that well, but below is what I pulled from the mdx log if that helps:
With
Set [*NATIVE_CJ_SET] as 'Filter([*BASE_MEMBERS_ActivityDate], Not IsEmpty ([Measures].[AveragePosition]))'
Set [*NATIVE_MEMBERS_ActivityDate] as 'Generate([*NATIVE_CJ_SET], {[ActivityDate].CurrentMember})'
Set [*BASE_MEMBERS_Measures] as '{[Measures].[*FORMATTED_MEASURE_0]}'
Set [*BASE_MEMBERS_ActivityDate] as '{[ActivityDate].[2012-09-01 00:00:00.0],[ActivityDate].[2012-09-02 00:00:00.0],[ActivityDate].[2012-09-03 00:00:00.0]}'
Set [*CJ_COL_AXIS] as '[*NATIVE_CJ_SET]'
Member [ActivityDate].[*SLICER_MEMBER] as 'Aggregate ([*NATIVE_MEMBERS_ActivityDate])', SOLVE_ORDER=-400
Member [Measures].[*FORMATTED_MEASURE_0] as '[Measures].[AveragePosition]', FORMAT_STRING = '#,###.00;(#,###.00)', SOLVE_ORDER=400
Select
[*BASE_MEMBERS_Measures] on columns
From [SQLTestCube1_JustResults]
Where ([ActivityDate].[*SLICER_MEMBER])

Related

Simple MDX Calculated Member

In my simple cube, I have a measure = \[Measure\].\[Salary\], I have also \[DimEmpployee\].\[EmployeeLastName\].\[Smith\]. I would like to create calculated measure, where I can display in Axis 0 two measures - \[Measure\].\[Salary\] and calculated measure \[Measure\].\[SmithsSalaries\], to compare difference between Smith's earnings vs Total Salary.
I would like to compare Measure.SmithSalaries with other measures accross all diemensions. Is it possible to create such a measure using SCOPE statement?
I was playing around SCOPE statements, but it was displaying results only if DimEmployee was selected. I am looking for something which is running in blocks to avoid performance issues.
I think you only need a simple calculated measure.
CREATE MEMBER CURRENTCUBE.[Measures].[SmithSalaries]
AS ([DimEmployee].[EmployeeLastName].[Smith], [Measures].[Salary]),
VISIBLE = 1 ;
After that you can combine that with you total salary for example to get a ratio.
CREATE MEMBER CURRENTCUBE.[Measures].[SmithSalaries Ratio]
AS DIVIDE(([DimEmployee].[EmployeeLastName].[Smith], [Measures].[Salary]),[Measures].[Salary])
VISIBLE = 1 ;
SCOPE allows you to have different behaviors when different combinations of Dimensions are into play, like returning a different calculation when the DimEmployee is selected but otherwise just return the normal calculation. Like a Very efficient IF condition to check what are in the Axis of this calculation.

Generic percent of grand total MDX expression returns wrong value with filter

I have this problem to find a generic MDX expression that returns the percent of grand total regardless of the dimension that i drag in the SSAS cube browser.
Now i'm using this expression:
([Measures].[Montant], Axis(1)(0)(Axis(1)(0).Count - 1).dimension.currentmember)
/SUM(([Measures].[Montant], Axis(1)(0)))
it works fine, but when i filter on the inner item of the axis, the expression returns a wrong value
For example :
i have in my rows axis 3 items : Year > Brand > Category
The grand total is 125 for all rows:
SUM(([Measures].[Montant], Axis(1)(0)))
If i filter on the categories , the grand total changes, lets say it is equal to 65 now for the outer items of the axis. But when i drill down to see its value for the categories, i find it still equal to 125. and as a result the value of percent is wrong as well.
Can someone please help me figure out what's wrong with my MDX expression coz i've been stuck at it for too long and i don't seem to find a solution.
screenshot of cube browser
The calculated measure is "test SOB", MDX expression :
([Measures].[Montant], Axis(1)(0)(Axis(1)(0).Count - 1).dimension.currentmember)
/SUM(([Measures].[Montant], Axis(1)(0)))
the grand total is "denominateur", MDX expression:
SUM(([Measures].[Montant], Axis(1)(0)))
as you can see, the value after filtering with Onglet = "DIGITAL" is 182.50 but when I drill down the brand "Beauty" to see "denominateur" per category, i find the value 338.05 which is the value of "denominateur" before applying the filter.
I'm wondering if the use of EXISTING will enforce the filter context in your denominteur calculation?
SUM(
[Measures].[Montant],
EXISTING Axis(1).ITEM(0).ITEM(0).HIERARCHY.MEMBERS
)

Last Available value MDX

I have a requirement where in i am to extract data from a cube, within the SSRS dataset using the query builder ,with the time dimension in the result set, across a range of dates. The conditions are
The measures are to be displayed for each day of the date range.
The sub total row should have the last available measures value for that time range.
There is a time filter (currently a single date filter with a multi select option).
my MDX is as below.
The measure has a 'Sum' as the aggregation type.
I have a calculated measure with the scope defined as below.
SCOPE([MEASURES].[Measure1]);
SCOPE([Date].[Date].MEMBERS);
THIS = TAIL(EXISTING ([Date].[Date].MEMBERS),1).ITEM(0) ;
END SCOPE;
END SCOPE;
This above scope statement works perfectly. however, when i select in more that one date member this query slows WAYYYYYYY down. Performance numbers are
Across 1 date - 4 seconds
Across 2 dates - 22 minutes
Across 3 dates - unknown (in Hours)
This drastic degradation in performance goes away if i remove the scope statement, which makes me thing that there should be a better way to do the same. the final report query is as below.
SELECT
NON EMPTY
{[Measures].[Measure1]} ON COLUMNS
,NON EMPTY
{ [Dimension1].[Dimension1].[Dimension1].ALLMEMBERS*
[Dimension2].[Dimension2].[Dimension2].ALLMEMBERS*
[Dimension3].[Dimension3].[Dimension3].ALLMEMBERS*
[Date].[Date].[Date].ALLMEMBERS
} ON ROWS
FROM (
SELECT {[Date].[Date].&[2014-06-13T00:00:00]
,[Date].[Date].&[2014-06-16T00:00:00] } ON COLUMNS
FROM [Cube]
)
So the question again is, Is there a way to do the last available value part of the scope statement so as to have a better performance? Also, if there is another way to write the final mdx that would help the performance?.
Please let me know if there are anythings unclear regarding the question.
Thanks
Srikanth
The first optimization step would be to change your query to
SELECT
NON EMPTY
{[Measures].[Measure1]} ON COLUMNS
,NON EMPTY
{ [Dimension1].[Dimension1].[Dimension1].ALLMEMBERS*
[Dimension2].[Dimension2].[Dimension2].ALLMEMBERS*
[Dimension3].[Dimension3].[Dimension3].ALLMEMBERS*
{[Date].[Date].&[2014-06-13T00:00:00], [Date].[Date].&[2014-06-16T00:00:00] }
} ON ROWS
FROM [Cube]
Furthermore, I am not sure why you added the SCOPE([Date].[Date].MEMEBER); (probably Date].[Date].MEMBERS, actually). Maybe it helps to omit it and the corresponding END SCOPE.

MDX Query percentile 25th, 50th and 75th

I have a question and I haven't been able to find the answer (neither in this forum nor other) I am looking for:
I need to calculate the 25th Percentile, the median (the 50th percentile) and the 75th percentile.
Putting in another words: I need to write in the MDX query in SSRS for it to tell me which data is the 25th, the median and the 75th
All I was able to find so far was not the exact values of each one of them
thanks
I've been working on the same issue for my own data. The trouble I was having is in figuring out the Median() function. Here's how I interpret the parameters of the function:
Microsoft's definition:
MEDIAN(Set_Expression [, Numeric_Expression])
My interpretation:
Set_Expression is the set of values that define the grain to which the measure is summed before the median is evaluated
Numeric_Expression is the measure that is summed, which set of sums is then sorted and evaluated to find the median
In my case for finding the straight median across the entire data set, I didn't want to sum the values at all. To prevent any sums from being calculated, I used the key attribute for a dimension that had a 1-1 cardinality with the records in the fact table that contains the measure that I'm using. The only flaw I've seen so far is that sometimes the median returns a whole number when there are an even number of records and the mean of the two middle records should result in a number ending in .5. For example, the values of the two middle records are 16 and 17 and the function is returning 17 instead of 16.5. Since this is a minor flaw, I'm willing to overlook it for now.
This is what my calculation with the median function looks like:
WITH MEMBER Measures.[Set Median] AS MEDIAN(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Non-summable Measure]
)
I used a combination of Median and TopCount to get the 75th percentile. I use TopCount to limit the set for the median to the second half of the data since TopCount sorts the data in descending order. I'll explain how I understand TopCount:
Microsoft's definition:
TopCount(Set_Expression, Count [, Numeric_Expression])
My interpretation:
Set_Expression is the set of values from which the desired number of tuples will be returned
Count is the number of tuples to return from the set
Numeric_Expression is the value that will be used to sort the set in descending order
I want the Median function to use the last half of the records in the fact table that are returned in the query, so I again use the key for the dimension table that has a 1-1 cardinality with the fact table and I sort it by the measure from which I want to find the median value.
Here is how I coded the member:
MEMBER Measures.[75th Percentile] AS MEDIAN(
TOPCOUNT(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Fact Table Record Count] / 2
,Measures.[Non-summable Measure]
)
,Measures.[Non-summable Measure]
)
So far, this combination of functions has returned a true 75th percentile from my data set. To get the 25th percentile, I tried replacing TOPCOUNT in my code with BOTTOMCOUNT, which is supposed to do the same thing, only sorting the data in ascending order to use the first half of the records instead of the second half. Unfortunately, I haven't been able to get anything but NULL from this combination of functions, so I'm open to suggestions on how to get the 25th percentile.
This is how my final query looks:
SELECT
{
Measures.[Set Median]
,Measures.[25th Percentile]
,Measures.[75th Percentile]
} ON 0
,[Dimensional row members here] ON 1
FROM [Cube]
WHERE
[Non-axis dimensional filter members here]

Ignore associated NULL values in SSAS calculated member

I'm creating Analysis Services cubes in Visual Studio BIDS, and have a question about summing in calculated members.
The data has to do with commercial real estate transactions. I want to sum square feet of building space involved in sales transactions for each region. I'm going to use that result in a weighted average calculation. However, I only want to sum the square feet of transactions which have non-null values for the corresponding building capitalization rate (cap rate) member.
Here is a drill-down to Athens in the cube browser:
Note that Athens has 15 values for square feet, but only 5 values for cap rate, reflecting my relational data source as shown here:
So, I only want to sum the five square feet values that have associated cap rate values. Doing the math with the relational query result above you can see that this should result in a sum just over 900K, not the 2 million+ sum shown in the BIDS screenshot.
My attempt at this calculation:
sum(
descendants(
[Property].[Property by Region].CurrentMember,
[Property].[Property by Region].[Metro Area]
),
iif([Measures].[Cap Rate] is null or [Measures].[Sq Ft] is null, 0,
[Measures].[Sq Ft])
)
ends up including the square feet values that have no corresponding cap rates, so I still end up with a value in the 2 millions.
Why is my iff() clause not working as one would expect?
I was finally able to create the weighted average calculation using a combination of Named Calculations in the Data Source View (DSV) and a calculated member (in the cube script). First, I went to the DSV and added a named calculation called xWeightedCapRt with a formula as follows:
CASE WHEN CapRate IS Null THEN Null Else CapRate * SqFt END
In the cube, I then added xWeightedCapRt as a New Measure. I set its aggregation function to Sum and left its Visible property set to True temporarily.
I created an additional Named Calculation called "xSqFt", defined as:
CASE WHEN CapRate IS Null THEN Null Else SqFt END
and again created a corresponding measure.
On the Calculation tab (of the cube designer) I created a new calculated member, [WAvg Cap Rate by Sq Ft], with the following formula:
[Measures].[x Weighted Cap Rt] / [Measures].[x Sq Ft]
After deploying and processing the cube, I was able to verify that the weighted average calculation matched my spreadsheet numbers. At that point, I set the Visible property of the two intermediate measures to False and redeployed.
What I've learned is that calculations at the "row-level" are best performed through the DSV. You can then use those to build up more complex calculations within the cube.
(NOTE: One thing that needs to be added to the steps above is logic to handle division by zeros.)
Couldnt you have done a nonempty around the descendants on the cap rate measure?