Calculating percentile values in SSAS

I am trying to calculate a percentile (for example, the 90th percentile of my measure) in a cube, and I think I am almost there. I can return the row number of the 90th percentile, but I do not know how to get the value of my measure at that position.
With
Member [Measures].[cnt] as
Count(NonEmpty(
-- dimensions to find the percentile on (the same set should be repeated below)
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members
,
-- add the measure to group
[Measures].[Profit]))
-- define percentile
Member [Measures].[Percentile] as 90
Member [Measures].[PercentileInt] as Int((([Measures].[cnt]) * [Measures].[Percentile]) / 100)
-- this part finds the tuple in the set at the index of the percentile point; I am using Item(index) to get the necessary info from the tuple, but I am unable to get the measure part
Member [Measures].[PercentileLo] as
(
Order(
NonEmpty(
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members,
[Measures].[Profit]),
[Measures].[Profit].Value, BDESC)).Item([Measures].[PercentileInt]).Item(3)
select
{
[Measures].[cnt],
[Measures].[Percentile],[Measures].[PercentileInt],
[Measures].[PercentileLo],
[Measures].[Profit]
}
on 0
from
[TestData]
I think there must be a way to get the measure of a tuple found through its index in a set. Please help, and let me know if you need any more information. Thanks!

You should extract the tuple at position [Measures].[PercentileInt] from your set and add the measure to it to build a tuple of four elements. Then return the value of that tuple as the measure PercentileLo, i.e. define:
Member [Measures].[PercentileLo] as
(
[Measures].[Profit],
Order(
NonEmpty(
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members,
[Measures].[Profit]),
[Measures].[Profit], BDESC).Item([Measures].[PercentileInt])
)
The way you implemented it, you tried to extract the fourth (as Item() starts counting from zero) item from a tuple containing only three elements. Your ordered set only has three hierarchies.
One unrelated remark: I think you should avoid using the complete hierarchies in [Calendar].[Hierarchy].members, [Region Dim].[Region].members, and [Product Dim].[Product].members. As written, your code appears to include all levels (including the All member) in the calculation. I do not know the structure and names of your cube, though, so I may be wrong about this.
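The indexing logic described above can be checked outside the cube. The following is a minimal Python sketch rather than MDX; the profit figures are made up, `percentile_by_index` is a hypothetical helper name, and the clamping of the index is a choice made for this sketch only. It mirrors the steps in the member definition: drop empty values, sort, compute Int(cnt * Percentile / 100), and read the measure at that position.

```python
import math

def percentile_by_index(values, percentile, descending=True):
    """Return the value found at position Int(count * percentile / 100)."""
    # Analogous to NonEmpty(...): ignore cells that have no value.
    non_empty = [v for v in values if v is not None]
    if not non_empty:
        raise ValueError("no non-empty values")
    # Analogous to Order(..., BDESC) when descending=True.
    ordered = sorted(non_empty, reverse=descending)
    # Analogous to Int(cnt * Percentile / 100); positions start at zero.
    index = math.floor(len(ordered) * percentile / 100)
    # Sketch-only guard so that percentile=100 does not run past the end.
    index = min(index, len(ordered) - 1)
    return ordered[index]

# Made-up sample: one empty cell and ten profit values.
profits = [None, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(percentile_by_index(profits, 90))                    # descending sort
print(percentile_by_index(profits, 90, descending=False))  # ascending sort
```

With this sample the descending sort yields 10 and the ascending sort yields 100. In a descending sort, the position 90% of the way through the list holds one of the lowest values, so if the goal is the value below which 90% of the data fall, an ascending sort (or the complementary index) may be what is wanted.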

An alternative is to find the median of the top 20% of the records in the table. I've used this combination of functions to find the 75th percentile. If you divide the record count by 5, the TopCount function returns the set of tuples that make up the top 20% of the table, sorted in descending order by your target measure. The median of that set should then land on the 90th percentile value without your having to find the record's coordinates. In my own use, I pass the same measure as the last parameter of both the Median and TopCount functions.
Here's my code:
WITH MEMBER Measures.[90th Percentile] AS MEDIAN(
TOPCOUNT(
[set definition]
,Measures.[Fact Table Record Count] / 5
,Measures.[Value by which to sort the set so the first 20% of records are chosen]
)
,Measures.[Value from which the median should be determined]
)
Based on what you've supplied in your problem definition, I would expect your code to look something like this:
WITH MEMBER Measures.[90th Percentile] AS MEDIAN(
TOPCOUNT(
{
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members
}
,Measures.[Fact Table Record Count] / 5
,[Measures].[Profit]
)
,[Measures].[Profit]
)
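The reasoning behind the Median/TopCount combination can be illustrated with plain numbers. This is a minimal Python sketch, not MDX; `median_of_top` is a hypothetical helper and the values 1 to 100 are made up. It keeps the highest 1/divisor of the values, as TopCount would with the record count divided by that number, and takes their median.

```python
import statistics

def median_of_top(values, divisor):
    """Median of the highest len(values) // divisor values."""
    # Analogous to TopCount(set, count / divisor, measure): sort descending, keep the top slice.
    ordered = sorted(values, reverse=True)
    top = ordered[: len(ordered) // divisor]
    # Analogous to Median(set, measure) over the retained slice.
    return statistics.median(top)

values = list(range(1, 101))     # made-up data: 1, 2, ..., 100
print(median_of_top(values, 5))  # top 20% is 81..100
print(median_of_top(values, 2))  # top 50% is 51..100
```

For this sample the results are 90.5 and 75.5: the middle of the top fifth sits at the 90% mark, and the middle of the top half sits at the 75% mark. The sketch raises an error if the slice is empty, which happens when there are fewer values than the divisor.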

Related

Filtering dimensions in MDX inside a SUM

I am new to MDX expressions and I am trying to create one that sums the value of a given measure filtered by dimensions.
In my database I have several different dimensions that have the same name: "Answer". To sum them up, I have created the query below:
WITH MEMBER Measures.Total as SUM ({[Activity].[Activity].&[14], [Activity].[Activity].&[22]},
[Measures].[Activity time])
SELECT NON EMPTY [Measures].[Total] on COLUMNS from [My Analytics]
This query works; however, I had to use the "&[14]" and "&[22]" keys, which correspond to two different "Answer" dimensions.
Since I have more than two dimensions with the same name, is there a way to rewrite the query above in a way that I would select all these dimensions without having to add their unique ID? For example, I would re-write the query as something like this:
WITH MEMBER Measures.Total as SUM ({[Activity].[Activity].&["Answer"]},
[Measures].[Activity time])
SELECT NON EMPTY [Measures].[Total] on COLUMNS from [My Analytics]
Is this possible?
Thanks!
You can use the Filter function as follows:
with
set [my-answers] as
Filter( [Activity].[Activity].members,
[Activity].[Activity].currentMember.name = 'Answer'
)
member [Measures].[Total] as Sum( [my-answers], [Measures].[Activity time] )
...

MDX Query percentile 25th, 50th and 75th

I have a question, and I haven't been able to find the answer I am looking for, either in this forum or elsewhere:
I need to calculate the 25th percentile, the median (the 50th percentile), and the 75th percentile.
In other words, I need an MDX query in SSRS that tells me which values are the 25th percentile, the median, and the 75th percentile.
Nothing I have found so far gives the exact value of each one.
thanks
I've been working on the same issue for my own data. The trouble I was having is in figuring out the Median() function. Here's how I interpret the parameters of the function:
Microsoft's definition:
MEDIAN(Set_Expression [, Numeric_Expression])
My interpretation:
Set_Expression is the set of values that define the grain to which the measure is summed before the median is evaluated
Numeric_Expression is the measure that is summed, which set of sums is then sorted and evaluated to find the median
In my case for finding the straight median across the entire data set, I didn't want to sum the values at all. To prevent any sums from being calculated, I used the key attribute for a dimension that had a 1-1 cardinality with the records in the fact table that contains the measure that I'm using. The only flaw I've seen so far is that sometimes the median returns a whole number when there are an even number of records and the mean of the two middle records should result in a number ending in .5. For example, the values of the two middle records are 16 and 17 and the function is returning 17 instead of 16.5. Since this is a minor flaw, I'm willing to overlook it for now.
This is what my calculation with the median function looks like:
WITH MEMBER Measures.[Set Median] AS MEDIAN(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Non-summable Measure]
)
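For reference, the even-count case mentioned above can be reproduced with made-up numbers. This is a small Python sketch, not MDX, and the sample values are hypothetical. It contrasts the conventional median, which averages the two middle values, with a median that returns the upper of the two, which matches the whole-number result described.

```python
import statistics

# Made-up sample with an even number of values; the two middle values are 16 and 17.
values = [10, 12, 16, 17, 20, 25]

conventional = statistics.median(values)       # averages the two middle values
upper_middle = statistics.median_high(values)  # returns the higher of the two middle values

print(conventional)
print(upper_middle)
```

Here the conventional median is 16.5 and the upper-middle value is 17. Why the cube returns the whole number is not established by this sketch; it only shows what each convention produces.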
I used a combination of Median and TopCount to get the 75th percentile. I use TopCount to limit the set for the median to the top half of the data (the highest values), since TopCount sorts the data in descending order. Here is how I understand TopCount:
Microsoft's definition:
TopCount(Set_Expression, Count [, Numeric_Expression])
My interpretation:
Set_Expression is the set of values from which the desired number of tuples will be returned
Count is the number of tuples to return from the set
Numeric_Expression is the value that will be used to sort the set in descending order
I want the Median function to use the top half of the records in the fact table that are returned in the query, so I again use the key for the dimension table that has a 1-1 cardinality with the fact table, and I sort it by the measure from which I want to find the median value.
Here is how I coded the member:
MEMBER Measures.[75th Percentile] AS MEDIAN(
TOPCOUNT(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Fact Table Record Count] / 2
,Measures.[Non-summable Measure]
)
,Measures.[Non-summable Measure]
)
So far, this combination of functions has returned a true 75th percentile from my data set. To get the 25th percentile, I tried replacing TOPCOUNT in my code with BOTTOMCOUNT, which is supposed to do the same thing, only sorting the data in ascending order to use the first half of the records instead of the second half. Unfortunately, I haven't been able to get anything but NULL from this combination of functions, so I'm open to suggestions on how to get the 25th percentile.
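The bottom-half variant can be sketched with plain numbers as well. This is a minimal Python sketch, not MDX; `median_of_bottom` is a hypothetical helper, the data are made up, and treating empty cells as sorting before every number is an assumption of the sketch. It shows one way a lowest-values slice can end up containing nothing but empty cells, which may or may not be what happens in the cube above.

```python
import statistics

def median_of_bottom(values, divisor, drop_empty):
    """Median of the lowest len // divisor entries; None stands in for an empty cell."""
    pool = [v for v in values if v is not None] if drop_empty else list(values)
    # Assumption of this sketch: empty cells sort before every number.
    ordered = sorted(pool, key=lambda v: (v is not None, 0 if v is None else v))
    bottom = ordered[: len(ordered) // divisor]
    present = [v for v in bottom if v is not None]
    # With no values left in the slice there is nothing to take a median of.
    return statistics.median(present) if present else None

# Made-up data: 60 empty cells and the values 1..40.
values = [None] * 60 + list(range(1, 41))
print(median_of_bottom(values, 2, drop_empty=False))  # slice holds only empty cells
print(median_of_bottom(values, 2, drop_empty=True))   # slice is 1..20
```

With this sample the first call returns None and the second returns 10.5. If the set in the cube contains empty tuples, restricting it to non-empty ones before taking the bottom slice may be worth trying.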
This is how my final query looks:
SELECT
{
Measures.[Set Median]
,Measures.[25th Percentile]
,Measures.[75th Percentile]
} ON 0
,[Dimensional row members here] ON 1
FROM [Cube]
WHERE
[Non-axis dimensional filter members here]

SSAS 2012 Calculated Member for Percentage

Being an SSAS newbie, I was wondering if it's possible to create a calculated member that references an individual row's value as well as the aggregated value in order to create a percentage?
For example, if I have a fact table with ValueA, I'd like to create a calculate member that essentially performed:
[Measures].[ValueA] (for each row I've sliced the data by) / [Measures].[ValueA] (the total)
Also I'd like to keep the total as the sum of whatever's been filtered in the cube browser. I feel certain this must be possible but I'm clearly missing something.
You can use the Axis function. Here is an example:
WITH MEMBER [Measures].[Percentage] AS
[Measures].[ValueA] / (Axis(1).CurrentMember.Parent, [Measures].[ValueA])
SELECT {[Measures].[ValueA], [Measures].[Percentage]} ON 0,
'what you want' ON 1
FROM your cube
(You may need to add a check in the calculated member expression.)

Aggregating MDX query results on existing facts only

I am very new to MDX, so probably I'm missing something very simple.
In my cube, I have a dimension [Asset] and a measure [Visits], calculating (in this case) how many visits an asset has been consumed by. An important thing to note is that not every visit is associated with an asset.
What I need to find out is how many visits there are that consumed at least one asset. I wrote the following query:
SELECT
[Asset].[All] ON COLUMNS,
[Measures].[Visits] ON ROWS
FROM
[Analytics]
But this query just returns the total number of visits in the cube. I tried applying the NON EMPTY modifier to both axes, but that doesn't help.
This query should give you what you expect:
WITH MEMBER [Asset].[Asset Name].[All Assets] AS
AGGREGATE( EXCEPT( [Asset].[Asset Name].MEMBERS, { [Asset].[All] } ) )
SELECT
{ [Asset].[Asset Name].[All Assets] } ON COLUMNS,
[Measures].[Visits] ON ROWS
FROM
[Analytics]
You may need to put {[Asset].[Asset Name].[All]} as second argument of Except if the All member was not excluded.
In the query I create a calculated member [Asset].[Asset Name].[All Assets] that should represent all your existing assets. I assumed that your existing assets are all the members of the level [Asset].[Asset Name] except the All member.
You can find more information about the Aggregate function in the MDX function reference.
This works as well:
SELECT
[Measures].[Visits] ON 0
FROM
[Analytics]
WHERE
DRILLDOWNLEVEL([Asset].[All])
Update: as well as this:
SELECT
[Measures].[Visits] ON 0
FROM
[Analytics]
WHERE
[Asset].[All].CHILDREN

SSAS -> MDX -> Creating a column percentage within my query based on counts

SELECT
NON EMPTY {[Measures].[Fact Order Count]}
ON COLUMNS,
{ ([Front Manager].[Front Manager Id].[Front Manager Id].ALLMEMBERS * [Order Type].[Order Type].[Order Type].ALLMEMBERS ) }
ON ROWS
FROM
[TEST_DW] CELL PROPERTIES VALUE
So, I have three columns in the output:
Front Manager, Order Type, Order Count
The above query shows me the counts for each manager and order type combination. I need a fourth column which would be a percentage of the types of orders for each front manager.
So, if there are four types of orders (A, B, C, D) and a manager had 25 of each, totaling 100, the fourth column would read 25% for each type.
I have scoured the web on how to do this, but have really come up short on this one. Any direction on this would be greatly appreciated, I am definitely new to MDX. Thanks.
What you're looking for are MDX Calculated members.
Let's assume the member for order A is called : [Order Type].[Order Type].[Order A] and we want to calculate the percentage from the total.
WITH
MEMBER [Measures].[Order A] AS ([Order Type].[Order Type].[Order A],[Measures].[Fact Order Count]) / ([Measures].[Fact Order Count]) , FORMAT_STRING = 'Percent'
SELECT
{[Measures].[Fact Order Count],[Measures].[Order A]} on 0
...
What is important about calculated members is that they can evaluate any MDX tuple (e.g. ([Order Type].[Order Type].[Order A],[Measures].[Fact Order Count]) ). The tuple overrides, where needed, the members coming from the query axes (defined with on 0, on 1, and so on). Note that you can add calculated members to the measures as well as to the other dimensions.
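The arithmetic that the requested fourth column needs can be sketched outside the cube. This is a minimal Python sketch, not MDX; `share_within_manager` is a hypothetical helper and the counts are the made-up figures from the question. Each (manager, order type) count is divided by that manager's total across all order types.

```python
from collections import defaultdict

def share_within_manager(counts):
    """Map (manager, order_type) -> that count as a fraction of the manager's total."""
    totals = defaultdict(int)
    for (manager, _order_type), n in counts.items():
        totals[manager] += n
    # Return None instead of dividing by zero when a manager has no orders.
    return {
        key: (n / totals[key[0]] if totals[key[0]] else None)
        for key, n in counts.items()
    }

# Made-up counts: one manager with 25 orders of each of four types.
counts = {("M1", "A"): 25, ("M1", "B"): 25, ("M1", "C"): 25, ("M1", "D"): 25}
shares = share_within_manager(counts)
print(shares[("M1", "A")])
```

For this sample every share is 0.25, i.e. 25%. In the cube, the denominator corresponds to the order count evaluated for the manager across all order types.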