How do I use a percentile (say, the 99th percentile) as a measure in an SSAS cube? I'd like to be able to report on it using many different dimensions and filters, so pre-calculating in SQL is not an option.
If you want to evaluate percentiles over a dimension, then you can do it without having to pre-calculate anything in SQL. This can be done using a combination of the TopPercent and Tail MDX functions.
For example, say you want to find the 25th percentile over a [Date].[Calendar Month] dimension attribute by the [Measures].[Avg Daily Census] (I work in health care) measure. You could use the following query to do this.
WITH
MEMBER [Measures].[25th Percentile] AS
(
Tail(
TopPercent(
[Date].[Calendar Month].[Calendar Month],
100 - 25,
[Measures].[Avg Daily Census]
)
).Item(0),
[Measures].[Avg Daily Census]
)
SELECT
[Measures].[25th Percentile] ON 0
FROM [Census]
This query uses TopPercent to find the top 75 percent of values along the [Calendar Month] dimension attribute, finds the member with the lowest value along that set, and then evaluates the measure at that member.
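For intuition about what the query returns: TopPercent sorts the members in descending order and keeps the shortest leading run whose cumulative total reaches the given percentage of the grand total, so the cut-off is driven by share of the total rather than by member count. Below is a minimal Python sketch of that behaviour. The function name `top_percent` and the monthly values are made up for illustration and are not part of any SSAS API.

```python
def top_percent(values, percentage):
    """Mimic MDX TopPercent on a dict of member -> value (illustrative only)."""
    # Sort members by value, highest first.
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    # The cut-off is a share of the grand total, not a share of the member count.
    target = sum(values.values()) * percentage / 100.0
    kept, running = [], 0.0
    for member, value in ranked:
        kept.append((member, value))
        running += value
        if running >= target:
            break
    return kept


# Hypothetical monthly census values.
census = {"Jan": 40, "Feb": 30, "Mar": 20, "Apr": 10}

top_75 = top_percent(census, 100 - 25)
# Equivalent of Tail(...).Item(0): the lowest-valued member that was kept.
last_member, last_value = top_75[-1]
```

With these values the grand total is 100, so the run stops once the cumulative total reaches 75, and the last member kept is the one the MDX expression evaluates the measure at. When the values are very uneven, this can differ noticeably from a percentile based on member count.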
Now, if you want to generate the entire set of percentile values, you can use a sequence utility dimension (just a dimension containing numbers), or you could create a custom percentile dimension containing 0-99. The following query would generate the values of the measure over all percentages.
WITH
MEMBER [Measures].[Percentile] AS
(
Tail(
TopPercent(
[Date].[Calendar Month].[Calendar Month],
100 - [Sequence].[Ones].CurrentMember.MemberValue,
[Measures].[Avg Daily Census]
)
).Item(0),
[Measures].[Avg Daily Census]
)
SELECT
[Measures].[Percentile] ON 0,
[Sequence].[Ones].&[0] : [Sequence].[Ones].&[99] ON 1
FROM [Census]
However, if you are trying to evaluate percentiles at the grain of the fact table, then I believe you will either need to pre-calculate the percentile values for each fact row or add a degenerate dimension and use the method above.
I'm trying to add a calculated member to my cube, which will return the first fiscal year where there is any data at all in a particular measure.
The purpose is to suppress (i.e. NULLify) various year-on-year calculated measures when the year is this first year: in that year, comparison with the previous year is meaningless.
I've got this so far:
WITH MEMBER Measures.DataStartYear_Sales
AS
HEAD(
NONEMPTY([Calendar].[Fiscal Periods].[Fiscal Year].Members,[Measures].[QuantityOrdered])
,1).Item(0).Properties("NAME")
At the moment:
a. It's a query-scoped measure, as that's easier to experiment with.
b. It returns the first year's Name, as that's easier to see. Eventually I'll just return the member itself, and do an IS comparison against the year hierarchy .CurrentMember in the other calculated member calculations.
The problem I expected, which has happened, is that I only want this measure to be calculated once, over the whole cube. But when I used it in a query, it obviously reacts to the context. For example, if I stick the Products dimension on ROWS, the value of this measure will be different for each row, because each product's earliest order date is different.
That is of course useful, but it's not what I want. Is there some way to force this measure to ignore the query context, and always return the same value?
I looked into SCOPE_ISOLATION and SOLVE_ORDER, but they don't do what I'm trying to do here.
I suppose I could specify a tuple of Dimension1.All, Dimension2.All.... DimensionN.All, covering all dimensions in the cube, but that seems messy and fragile.
I think you might be able to accomplish this with static sets. Here is an example using Adventure Works that produces the same first year regardless of context:
WITH STATIC SET FirstYear AS
HEAD
(
NONEMPTY([Date].[Calendar Year].[Calendar Year].MEMBERS, [Measures].[Internet Sales Amount])
, 1
)
MEMBER FirstYearName AS
FirstYear.ITEM(0).NAME
SELECT
[Measures].[FirstYearName] ON COLUMNS
, [Date].[Calendar Year].[Calendar Year].MEMBERS
//Add as many dimensions as you like here...for example
* [Product].[Product].[Product].MEMBERS
ON ROWS
FROM
[Adventure Works]
;
That should hopefully put you on the right track.
I've been tasked with a rather odd Time intelligence function by my finance group that I'm trying to puzzle out.
I've been tasked with creating a measure within our SSAS cube that shows previous quarter to date based on how far we are into the current quarter. But instead of the standard comparison of days elapsed now versus days elapsed previously, they would like to compare days remaining now versus days remaining previously.
What I mean by that is, take 1/22/2015 for example. We have 48 days remaining in our current quarter, which I have by means of a calculated measure. I need to find the corresponding working day from the previous quarter where it is also at 48 days remaining.
At that point I could create a date range with some aggregate functions off of the first date in the previous quarter to the corresponding date found in the above and come up with what they are looking for.
The best idea I've had so far is to possibly do this in the database section itself, by creating a new column that is essentially the calculated number of days remaining but stored. But at that point I'm not sure how to take a calculated measure in SSAS and filter a previous quarter date member to use that property as it were.
Do you have any utility dimensions in your cube? We have one called TimeCalculations. In there we have things such as CurrentValue, MTDValue, PrevEquivMTD, Past7Days .... I think your new logic would fit in with such a dimension.
Here is an example of PrevEquivQTD against AdvWrks that I just had a play with. Guessing this doesn't really help your scenario but I had fun writing it:
WITH
SET [NonEmptyDates] AS
NonEmpty
(
[Date].[Calendar].[Date].MEMBERS
,[Measures].[Internet Sales Amount]
)
SET [LastNonEmptyDate] AS
Tail([NonEmptyDates])
SET [CurrQ] AS
Exists
(
[Date].[Calendar].[Calendar Quarter]
,[LastNonEmptyDate].Item(0)
)
MEMBER [Measures].[pos] AS
Rank
(
[LastNonEmptyDate].Item(0)
,Descendants
(
[CurrQ]
,[Date].[Calendar].[Date]
)
)
MEMBER [Measures].[PrevEquivalentQTD] AS
Sum
(
Head
(
Descendants
(
[CurrQ].ITEM(0).PrevMember
,[Date].[Calendar].[Date]
)
,[Measures].[pos]
)
,[Measures].[Internet Sales Amount]
)
SELECT
{[Measures].[pos],[Measures].[PrevEquivalentQTD]} ON 0
,[LastNonEmptyDate] ON 1
FROM
(
SELECT
[Date].[Calendar].[Date].&[20050111]
:
[Date].[Calendar].[Date].&[20080611] ON 0
FROM [Adventure Works]
);
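The days-remaining matching described in the question can be prototyped outside the cube before deciding how to store it. The sketch below is only an illustration under stated assumptions: calendar quarters, Monday to Friday as working days, and no holiday calendar. All function names are hypothetical.

```python
from datetime import date, timedelta


def quarter_bounds(d):
    """Return (first_day, last_day) of the calendar quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    start = date(d.year, first_month, 1)
    if first_month == 10:
        end = date(d.year, 12, 31)
    else:
        end = date(d.year, first_month + 3, 1) - timedelta(days=1)
    return start, end


def working_days_remaining(d):
    """Count weekdays strictly after d up to the end of its quarter."""
    _, end = quarter_bounds(d)
    count, day = 0, d + timedelta(days=1)
    while day <= end:
        if day.weekday() < 5:  # Monday=0 .. Friday=4; holidays are ignored
            count += 1
        day += timedelta(days=1)
    return count


def matching_day_in_previous_quarter(d):
    """First weekday of the prior quarter with the same working days remaining."""
    target = working_days_remaining(d)
    start, _ = quarter_bounds(d)
    prev_start, prev_end = quarter_bounds(start - timedelta(days=1))
    day = prev_start
    while day <= prev_end:
        if day.weekday() < 5 and working_days_remaining(day) == target:
            return day
        day += timedelta(days=1)
    return None


today = date(2015, 1, 22)
remaining = working_days_remaining(today)
match = matching_day_in_previous_quarter(today)
```

Under these assumptions 1/22/2015 has 48 working days remaining, which agrees with the figure in the question. Storing the same count as a column or member property on the date dimension would let the cube look up the matching date instead of recomputing it.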
Your date is 1/22/2015. You want the same date in the previous quarter, which would be 10/22/2014.
If this is what you want, use the MDX function ParallelPeriod as shown in the sample below, replacing the dimension and cube names with your own.
Select
ParallelPeriod
(
[Date].[Calendar Date].[Calendar Quarter], -- Level Expression
1, -- Index
[Date].[Calendar Date].[Date].&[20150122] -- Member Expression
) On 0
From [Adventure Works]
If you want the same date in the following quarter, then replace index 1 with -1.
Cheers
I'm trying to find the Median, 25th percentile, and 75th percentile as a calculation in my cube. The values I'm evaluating are non-summable because they represent ages of people, so I'm using the following function to find the median:
WITH MEMBER Measures.[Set Median] AS MEDIAN(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Non-summable Measure]
)
The dimension key and the fact table key have a 1-1 relationship, so the key members as a set allow me to find the median across all the returned records without any summing. I've successfully found the 75th percentile using the following function combination:
MEMBER Measures.[75th Percentile] AS MEDIAN(
TOPCOUNT(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Fact Table Record Count] / 2
,Measures.[Non-summable Measure]
)
,Measures.[Non-summable Measure]
)
Since TopCount sorts the set in descending order, I'm able to find the 75th Percentile by finding the median of the top half of the records. Based on this logic, I'm trying to find the 25th Percentile by using the BottomCount function the same way since it sorts the set in ascending order. However, I'm only getting NULL back in my query for the 25th Percentile calculation. Here is the function combination and my end query:
MEMBER Measures.[25th Percentile] AS MEDIAN(
BOTTOMCOUNT(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Fact Table Record Count] / 2
,Measures.[Non-summable Measure]
)
,Measures.[Non-summable Measure]
)
SELECT
{
Measures.[Set Median]
,Measures.[25th Percentile]
,Measures.[75th Percentile]
} ON 0
,[Date Dimension].[Calendar Hierarchy].Year.&[2011]:[Date Dimension].[Calendar Hierarchy].Year.&[2014] ON 1
FROM [Cube]
WHERE
[Age Dimension].[Age in Years Hierarchy].[Age Year].&[0]:[Age Dimension].[Age in Years Hierarchy].[Age Year].&[5]
I don't understand why I'm getting NULLs back for the 25th Percentile using the Median and BottomCount functions when I'm not having trouble with the opposite situation for the 75th Percentile using the Median and TopCount functions. I've checked my data set in the SQL database and none of my measure values are NULL. If anyone has a better understanding of the BottomCount function, I appreciate any clear explanation or an alternate way to help me find the correct 25th Percentile in MDX. Thanks!
Instead of [Dimension].[Key Attribute].MEMBERS, it seems like this should work:
NONEMPTY(LEAVES([Dimension]))
But when I tried it, the query hung and never returned results, at least not within the 10 minutes I was willing to wait.
So I used this instead, and it worked fine:
FILTER({LEAVES([Dimension])}, Measures.[Non-summable Measure]> 0)
Here is my full query, which returns the correct 25th percentile:
WITH
MEMBER [Measures].[P25] AS
MEDIAN( BOTTOMCOUNT(
FILTER({LEAVES([Dimension])}, Measures.CalculatedRate > 0)
,[Measures].[Dimension Member Distinct Count] /2
,Measures.[CalculatedRate]
)
,Measures.[CalculatedRate]
)
SELECT
{Region.MEMBERS} ON ROWS,
{[Measures].[P25]} ON COLUMNS
FROM
[Cube]
where
( <where clause> )
Hope it helps...
Could you use the Measures.[Set Median] you created in the definitions of the 25th and 75th percentile by putting it into a FILTER clause such that the definition for 25th was something like:
MEDIAN(
FILTER(
[Dimension].[Key Attribute].MEMBERS,
Measures.[Non-summable Measure] < Measures.[Set Median]
),
Measures.[Non-summable Measure]
)
The definition for the 75th would be similar but using the greater than sign. There are some boundary issues here, so you might want <= or >=.
Warning: This query has been nowhere near an MDX parser!
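The idea of taking the median of the values on either side of the overall median can be checked quickly outside MDX. The following is a small Python sketch with made-up ages. It uses strict comparisons, so a value equal to the median is excluded from both halves, which is exactly the boundary choice mentioned above.

```python
from statistics import median

# Hypothetical ages, one per fact row.
ages = [1, 2, 3, 4, 5, 6, 7, 8]

overall = median(ages)

# Strict comparisons: values equal to the overall median fall in neither half.
lower_half = [a for a in ages if a < overall]
upper_half = [a for a in ages if a > overall]

p25 = median(lower_half)
p75 = median(upper_half)
```

Switching to `<=` and `>=` changes the result whenever the median itself is one of the data values, so it is worth deciding which quartile definition the report should follow.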
Beware that there is no single standard definition of percentiles or quartiles (Q1 and Q3 correspond to P25 and P75). This query implements one definition; modify it to match the definition you want to use.
Let's take a set and order it according to a measure.
With
set CUSTOMERS as Order( [Customers].Children, [Measures].[Sales], ASC )
We calculate the Rank of each set item and the total count of elements in the set.
member [Measures].[Rank] as Rank( [Customers].CurrentMember, CUSTOMERS)
member [Measures].[Count] as Count( CUSTOMERS )
Dividing the first by the second we get (one definition of) the percentile.
member [Measures].[Percentile] as [Measures].[Rank] / [Measures].[Count] * 100
To get the 25th percentile, get the first item that has a percentile value of at least 25
select
Head( Filter( CUSTOMERS, [Measures].[Percentile] >= 25) ,1) on Rows,
{ [Measures].[Sales], [Measures].[Rank], [Measures].[Count], [Measures].[Percentile] } on columns
from [MyCube]
The [Measures].[Sales] value of this item is the percentile.
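The same rank-over-count definition can be sketched in Python to see which item it selects. The customer names and sales figures are invented, and `percentile_member` is a hypothetical helper, not an MDX or library function. It returns the first item whose percentile is at least the requested value, as described in the text.

```python
def percentile_member(sales, pct):
    """Return (name, value) of the first item whose rank/count*100 >= pct."""
    # Order ascending by the measure, like Order(..., ASC).
    ordered = sorted(sales.items(), key=lambda kv: kv[1])
    count = len(ordered)
    for rank, (name, value) in enumerate(ordered, start=1):
        if rank / count * 100 >= pct:
            return name, value
    return None  # only reached for an empty input or pct above 100


# Hypothetical sales per customer.
sales = {"A": 50, "B": 10, "C": 40, "D": 20,
         "E": 30, "F": 60, "G": 70, "H": 80}

p25 = percentile_member(sales, 25)
```

With eight customers, the second-lowest one has a percentile of exactly 25, so it is the item returned. With a strict `>` comparison the third-lowest would be returned instead.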
I want to create a member to solve the following problem.
I have a Product A being sold.
I want to find the longest run of consecutive days without a sale.
Example:
On days 1, 2 and 3 the product did not sell; after that it sold for 15 consecutive days; from the 19th day it did not sell for 2 days; and after that it sold every day until the end of the month.
So my maximum number of days without a sale was 3.
The following query delivers in the Microsoft sample cube Adventure Works what you want:
WITH Member Measures.[days without sales] AS
IIf( [Measures].[Internet Sales Amount] > 0
, 0
,(Measures.[days without sales], [Date].[Calendar].CurrentMember.PrevMember) + 1
)
Member Measures.[Max days without sales] AS
Max( [Date].[Calendar].[Date].Members
,Measures.[days without sales]
)
SELECT { [Measures].[Max days without sales] }
ON COLUMNS
FROM [Adventure Works]
WHERE [Product].[Product].&[486]
The measure days without sales is defined recursively, and returns how many consecutive days up to and including the current member of the [Date].[Calendar] hierarchy had no sales. You may need to adapt the criteria for "without sale", bearing in mind that in MDX, numerical comparisons treat NULL as 0, which is different from SQL.
This measure only works correctly if there is a member in this hierarchy for each day, i.e. there are no gaps in the hierarchy. The definition is actually more general than days: if you use months for the [Date].[Calendar].CurrentMember, it gives you the number of months without sales, and so on. It works with each level of the hierarchy.
The measure Max days without sales does not contain the product in its definition, it delivers the maximum days for whatever is in context (in this case the product in the WHERE clause).
Please note that - as actually there is a loop over all days in the [Date].[Calendar] hierarchy when calculating Measures.[Max days without sales], and within that the recursion again iterates along the previous days, and all this for each cell in the result set - this may be slow for large reports.
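The logic of the two measures can be sketched as a single pass in Python: a running counter that resets on a sale, and the maximum that counter ever reaches. This is an illustration only. The function name and the sample month are made up, and missing values are treated like zero to mirror the NULL-as-0 comparison described above.

```python
def max_days_without_sales(daily_sales):
    """Longest run of consecutive days with no sales (None counts as no sale)."""
    longest = current = 0
    for amount in daily_sales:
        if amount and amount > 0:
            current = 0          # a sale resets the running counter
        else:
            current += 1         # extend the current run of empty days
            longest = max(longest, current)
    return longest


# The month from the question: no sales on days 1-3, sales on days 4-18,
# no sales on days 19-20, then sales until the end of the month.
month = [0] * 3 + [100] * 15 + [None] * 2 + [100] * 10

result = max_days_without_sales(month)
```

The single pass is linear in the number of days, whereas the recursive MDX measure may re-walk previous days for every cell, which is the performance concern mentioned above.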
I want to create a bar chart with a bar for each month, showing some measure.
But I also want to filter on a range of days which might not completely overlap some of the months.
When that happens, I would like the aggregate for those months to cover only the days that fall in my date range, not the whole month.
Is that possible with MDX, and if so, what should the query look like?
Create a second time dimension, using a virtual dimension of the original dimension. Use one dimension in the WHERE and another in the SELECT.
This often happens anyway if some people want 'Business Time' of quarters and periods, and others prefer months. Or if you have a financial year which runs April-April.
You can use a subselect, whose behavior is described as follows:
When a member is specified in the axis clause then that member with
its ascendants and descendants are included in the sub cube space for
the subselect; all non mentioned sibling members, in the axis or
slicer clause, and their descendants are filtered from the subspace.
This way, the space of the outer select has been limited to the
existing members in the axis clause or slicer clause, with their
ascendants and descendants as mentioned before.
Here is an example:
SELECT { [Customer].[Customer Geography].[Country].&[Australia]
, [Customer].[Customer Geography].[Country].&[United States]
} ON 1
, {[Measures].[Internet Sales Amount], [Measures].[Reseller Sales Amount]} ON 0
FROM ( SELECT {[Customer].[Customer Geography].[Country].&[Australia]
, [Customer].[State-Province].&[WA]&[US]} ON 0
FROM [Adventure Works]
)
The result will contain one row for Australia and another for the United States. With the subselect, I restricted the value for the United States to Washington state.
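Applied to the original question, the subselect would hold the date range, so each month on the axis only aggregates the days that survive the filter. The effect can be illustrated in Python with made-up daily counts. `monthly_totals` is a hypothetical helper, not something Analysis Services provides.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_totals(daily, start, end):
    """Sum daily values per (year, month), counting only days in [start, end]."""
    totals = defaultdict(int)
    for day, value in daily.items():
        if start <= day <= end:  # plays the role of the subselect's date range
            totals[(day.year, day.month)] += value
    return dict(totals)


# Hypothetical data: a count of 1 for every day from 2015-01-01 to 2015-03-31.
daily = {date(2015, 1, 1) + timedelta(days=i): 1 for i in range(90)}

result = monthly_totals(daily, date(2015, 1, 20), date(2015, 3, 10))
```

January and March are only partially covered by the range, so their bars reflect 12 and 10 days respectively, while February is counted in full.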
One way I found to do it with Mondrian is as follows:
WITH MEMBER [Measures].[Units Shipped2] AS
SUM
(
{
exists([Store].[Store Country].currentmember.children,{[Store].[USA].[WA],[Store].[USA].[OR]})
},[Measures].[Units Shipped]
)
MEMBER [Measures].[Warehouse Sales2] AS
SUM
(
{
exists([Store].[Store Country].currentmember.children,{[Store].[USA].[WA],[Store].[USA].[OR]})
},[Measures].[Warehouse Sales]
)
SELECT
{[Measures].[Units Shipped2],[Measures].[Warehouse Sales2]} ON 0,
NON EMPTY [Store].[Store Country].Members on 1
FROM [Warehouse]
I am not sure whether the filtering will be done in SQL, as below, and give good performance, or be run locally:
select Country, sum(unit_shipped)
where state in ('WA','OR' )
group by Country