using BottomCount() to find 25th percentile median in SSAS - sql

I'm trying to find the Median, 25th percentile, and 75th percentile as a calculation in my cube. The values I'm evaluating are non-summable because they represent ages of people, so I'm using the following function to find the median:
WITH MEMBER Measures.[Set Median] AS MEDIAN(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Non-summable Measure]
)
The dimension key and the fact table key have a 1-1 relationship, so the key members as a set allow me to find the median across all the returned records without any summing. I've successfully found the 75th percentile using the following function combination:
MEMBER Measures.[75th Percentile] AS MEDIAN(
TOPCOUNT(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Fact Table Record Count] / 2
,Measures.[Non-summable Measure]
)
,Measures.[Non-summable Measure]
)
Since TopCount sorts the set in descending order, I'm able to find the 75th Percentile by finding the median of the top half of the records. Based on this logic, I'm trying to find the 25th Percentile by using the BottomCount function the same way since it sorts the set in ascending order. However, I'm only getting NULL back in my query for the 25th Percentile calculation. Here is the function combination and my end query:
MEMBER Measures.[25th Percentile] AS MEDIAN(
BOTTOMCOUNT(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Fact Table Record Count] / 2
,Measures.[Non-summable Measure]
)
,Measures.[Non-summable Measure]
)
SELECT
{
Measures.[Set Median]
,Measures.[25th Percentile]
,Measures.[75th Percentile]
} ON 0
,[Date Dimension].[Calendar Hierarchy].Year.&[2011]:[Date Dimension].[Calendar Hierarchy].Year.&[2014] ON 1
FROM [Cube]
WHERE
[Age Dimension].[Age in Years Hierarchy].[Age Year].&[0]:[Age Dimension].[Age in Years Hierarchy].[Age Year].&[5]
I don't understand why I'm getting NULLs back for the 25th Percentile using the Median and BottomCount functions when I'm not having trouble with the opposite situation for the 75th Percentile using the Median and TopCount functions. I've checked my data set in the SQL database and none of my measure values are NULL. If anyone has a better understanding of the BottomCount function, I appreciate any clear explanation or an alternate way to help me find the correct 25th Percentile in MDX. Thanks!

Instead of [Dimension].[Key Attribute].MEMBERS,
it seems like NONEMPTY(LEAVES([Dimension])) should work,
but when I tried it the query hung and never returned results (I gave up waiting after about 10 minutes).
So I used this instead, and it worked fine:
FILTER({LEAVES([Dimension])}, Measures.[Non-summable Measure] > 0)
Here is my full query, which returns the correct 25th percentile:
WITH
MEMBER [Measures].[P25] AS
MEDIAN( BOTTOMCOUNT(
FILTER({LEAVES([Dimension])}, Measures.CalculatedRate > 0)
,[Measures].[Dimension Member Distinct Count] /2
,Measures.[CalculatedRate]
)
,[Measures].[CalculatedRate]
)
SELECT
{Region.MEMBERS} ON ROWS,
{[Measures].[P25]} ON COLUMNS
FROM
[Cube]
where
( <where clause> )
Hope it helps...
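A plausible explanation for the NULLs, consistent with the filtered version working: BottomCount does not drop empty cells, and empties sort below every real value, so the bottom half of the unfiltered key set can consist entirely of members that have no data in the current slice. The Python sketch below only mimics that behavior outside the cube. The helpers `bottom_count` and `median_ignoring_empty` and the sample ages are hypothetical stand-ins, not SSAS APIs.

```python
from statistics import median

# Hypothetical sample: four key members have no data in the current slice (None).
ages = {"k1": None, "k2": None, "k3": None, "k4": None,
        "k5": 1, "k6": 2, "k7": 3, "k8": 4}

def bottom_count(members, n, values):
    """Mimic BottomCount: sort ascending with empties first, keep the first n."""
    ordered = sorted(members, key=lambda m: (values[m] is not None, values[m] or 0))
    return ordered[:n]

def median_ignoring_empty(members, values):
    """Mimic Median over a set: skip empties, return None if nothing is left."""
    present = [values[m] for m in members if values[m] is not None]
    return median(present) if present else None

record_count = sum(v is not None for v in ages.values())  # number of fact rows
half = record_count // 2

# Unfiltered set: the bottom half is made up of empty members only.
naive_p25 = median_ignoring_empty(bottom_count(ages, half, ages), ages)

# Filtered set: drop the empties first, as the FILTER(...) call does.
non_empty = [m for m in ages if ages[m] is not None]
fixed_p25 = median_ignoring_empty(bottom_count(non_empty, half, ages), ages)

print(naive_p25, fixed_p25)
```

TopCount does not suffer from this because the empties end up at the far end of a descending sort, which would explain why only the 25th percentile came back NULL.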

Could you reuse the Measures.[Set Median] you already created inside the definitions of the 25th and 75th percentiles, by putting it into a FILTER clause? The definition for the 25th percentile would be something like:
MEDIAN(
FILTER(
[Dimension].[Key Attribute].MEMBERS,
Measures.[Non-summable Measure] < Measures.[Set Median]
),
Measures.[Non-summable Measure]
)
The definition for the 75th would be similar but using the greater-than sign. There are some boundary issues here (with strict comparisons, members whose value equals the median fall into neither half), so you might want <= or >=.
Warning: This query has been nowhere near an MDX parser!
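The median-split idea can be checked on a small list outside the cube. This is a minimal Python sketch under the strict-comparison variant (values equal to the median are left out of both halves); the function name and sample data are made up for illustration.

```python
from statistics import median

def quartiles_by_median_split(values):
    """Return (Q1, median, Q3), where Q1/Q3 are the medians of the values
    strictly below / strictly above the overall median."""
    mid = median(values)
    lower = [v for v in values if v < mid]
    upper = [v for v in values if v > mid]
    # statistics.median raises StatisticsError on an empty list, which is
    # the boundary problem that <= / >= is meant to avoid.
    return median(lower), mid, median(upper)

q1, q2, q3 = quartiles_by_median_split([1, 2, 3, 4, 5, 6, 7])
print(q1, q2, q3)
```

Switching the comparisons to `<=` and `>=` changes which rows count toward each half, so the two variants can give different quartiles when many rows share the median value.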

Beware that there is no single standard definition of percentiles or of quartiles (Q1 and Q3, which correspond to P25 and P75). This query implements one definition of percentile; modify it to match the definition you want to use.
Let's take a set and order it according to a measure:
With
set CUSTOMERS as Order( [Customers].Children, [Measures].[Sales], ASC )
We calculate the rank of each set item and the total count of elements in the set.
member [Measures].[Rank] as Rank( [Customers].CurrentMember, CUSTOMERS)
member [Measures].[Count] as Count( CUSTOMERS )
Dividing the first by the second we get (one definition of) the percentile.
member [Measures].[Percentile] as [Measures].[Rank] / [Measures].[Count] * 100
To get the 25th percentile, take the first item that has a percentile value of at least 25:
select
Head( Filter( CUSTOMERS, [Measures].[Percentile] >= 25), 1) on Rows,
{ [Measures].[Sales], [Measures].[Rank], [Measures].[Count], [Measures].[Percentile] } on columns
from [MyCube]
The [Measures].[Sales] value of this item is the 25th percentile.
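The same rank / count definition can be sketched in Python to sanity-check what the query should return. The function name and the sales figures are hypothetical; the logic follows the steps above (order ascending, rank, divide by count, take the first item at or above the target).

```python
def percentile_by_rank(values, p):
    """Return the first value whose rank / count * 100 is at least p."""
    ordered = sorted(values)          # Order(..., ASC)
    count = len(ordered)              # Count(CUSTOMERS)
    for rank, value in enumerate(ordered, start=1):   # Rank(...) is 1-based
        if rank / count * 100 >= p:
            return value
    return None  # empty input

sales = [80, 10, 50, 30, 70, 20, 60, 40]
p25 = percentile_by_rank(sales, 25)
p75 = percentile_by_rank(sales, 75)
print(p25, p75)
```

With eight items, the second-ranked item sits exactly at 25%, so whether the comparison is strict or not decides which item is returned.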

Related

Calculated measure to find the datediff using timedimension

I need to find the number of days in a month based on the Time dimension. For example, when the selected Time member is Jan, it has to return 31.
If you have a Time dimension with a Time hierarchy, this should work:
WITH MEMBER measures.NumOfDays AS
Count
(
Descendants
(
[Time].[Time].CurrentMember,
,LEAVES
)
)
SELECT Measures.NumOfDays ON 0,
[Time].[Time].Month on 1
FROM [MyCube]
The sample below shows how to get the count.
Please note that the query below only shows the idea. Your cube will not have these attributes, so you need to replace them.
with member
measures.t as Count(([Date].[Month of Year].&[1],[Date].[Day of Month].[Day of Month].members))
select {measures.t}
on columns
from [Adventure Works]

MDX - How to skip 0 values in MIN aggregation and how to exclude some percent of results?

In my cube I have
Earnings as measure with MIN aggregation
Dimension: [Localization].[Type].&[center]
Dimension: {[Date].[Year].&[2017], [Date].[Year].&[2018]}
My query is:
What are the minimum earnings of the person who decides to buy
apartments in the city center, excluding 5% of the lowest, within
last 2 years?
Now my mdx query looks like that:
SELECT
[Measures].[MinEarnings] ON COLUMNS
FROM [cube]
WHERE
(
BottomCount ([Localization].[Type].&[center], 95, [Measures].[MinEarnings]),
{[Date].[Year].&[2017], [Date].[Year].&[2018]}
)
I have two problems:
Some earnings are 0 - how can I skip them in calculations?
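Does my query correctly exclude the lowest 5% of earnings?
First of all, you should use TopPercent, not BottomCount. BottomCount(..., 95, ...) returns the 95 lowest members, not a percentage, whereas you want the minimum earnings among the people who are not in the lowest 5%. TopPercent with 95 keeps the highest values whose cumulative total reaches 95% of the overall total, which drops the lowest tail.
Secondly, to filter out 0 you can use the following syntax: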
toppercent (
filter([Localization].[Type].&[center], [Measures].[MinEarnings]>0)
, 95, [Measures].[MinEarnings])
Even now, placing the code in the WHERE clause might not work, but try it. I would suggest that you move the TopPercent to rows, order it ascending, and then take the first member:
topcount(
order(
toppercent (
filter([Localization].[Type].&[center], [Measures].[MinEarnings]>0)
,95, [Measures].[MinEarnings])
,[Measures].[MinEarnings],asc)
,1)
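The filter, TopPercent, order, take-first pipeline above can be sketched in Python to see what it keeps. This assumes TopPercent's documented behavior of selecting the highest values until their cumulative total reaches the given share of the overall total; `top_percent` and the earnings figures are hypothetical.

```python
def top_percent(values, pct):
    """Mimic TopPercent: sort descending and keep values until their running
    total reaches pct percent of the overall total."""
    ordered = sorted(values, reverse=True)
    threshold = sum(ordered) * pct / 100
    kept, running = [], 0
    for v in ordered:
        kept.append(v)
        running += v
        if running >= threshold:
            break
    return kept

earnings = [0, 0, 5, 10, 20, 65]

positive = [e for e in earnings if e > 0]    # Filter(..., MinEarnings > 0)
kept = top_percent(positive, 95)             # TopPercent(..., 95, ...)
min_after_trim = sorted(kept)[0]             # Order(..., ASC) then TopCount(..., 1)
print(kept, min_after_trim)
```

Note that the cut is made on share of total earnings, not on the number of people, so this only approximates "excluding the lowest 5% of people".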
I have an example which gives the minimum sales amount of cities. Notice that I have replaced nulls with 0 to make it as close as possible to your case:
with member [Measures].[Internet Sales Amount2]
as
case when ([Measures].[Internet Sales Amount])=null then 0 else [Measures].[Internet Sales Amount] end
select [Measures].[Internet Sales Amount2]
on columns ,
topcount(order(toppercent(filter([Customer].[City].[City],[Measures].[Internet Sales Amount2]>0),95,[Measures].[Internet Sales Amount2]),[Measures].[Internet Sales Amount2],asc),1)
on rows
from [Adventure Works]
where [Customer].[Country].&[Canada]
[Image: the result set before applying topcount(..., 1)]

MDX Prior Quarter Day Range

I've been tasked with a rather odd Time intelligence function by my finance group that I'm trying to puzzle out.
I've been asked to create a measure within our SSAS cube that shows previous quarter-to-date figures based on how far we are into the current quarter. But instead of the standard comparison of days elapsed currently versus days elapsed previously, they would like to compare days remaining with previous days remaining.
What I mean by that is, take 1/22/2015 for example. We have 48 days remaining in our current quarter, which I have by means of a calculated measure. I need to find the corresponding working day from the previous quarter where it is also at 48 days remaining.
At that point I could create a date range with some aggregate functions off of the first date in the previous quarter to the corresponding date found in the above and come up with what they are looking for.
The best idea I've had so far is to possibly do this in the database section itself, by creating a new column that is essentially the calculated number of days remaining but stored. But at that point I'm not sure how to take a calculated measure in SSAS and filter a previous quarter date member to use that property as it were.
Do you have any utility dimensions in your cube? We have one called TimeCalculations, which contains members such as CurrentValue, MTDValue, PrevEquivMTD, Past7Days, and so on. I think your new logic would fit into such a dimension.
Here is an example of PrevEquivQTD against AdvWrks that I just had a play with. Guessing this doesn't really help your scenario but I had fun writing it:
WITH
SET [NonEmptyDates] AS
NonEmpty
(
[Date].[Calendar].[Date].MEMBERS
,[Measures].[Internet Sales Amount]
)
SET [LastNonEmptyDate] AS
Tail([NonEmptyDates])
SET [CurrQ] AS
Exists
(
[Date].[Calendar].[Calendar Quarter]
,[LastNonEmptyDate].Item(0)
)
MEMBER [Measures].[pos] AS
Rank
(
[LastNonEmptyDate].Item(0)
,Descendants
(
[CurrQ]
,[Date].[Calendar].[Date]
)
)
MEMBER [Measures].[PrevEquivalentQTD] AS
Sum
(
Head
(
Descendants
(
[CurrQ].ITEM(0).PrevMember
,[Date].[Calendar].[Date]
)
,[Measures].[pos]
)
,[Measures].[Internet Sales Amount]
)
SELECT
{[Measures].[pos],[Measures].[PrevEquivalentQTD]} ON 0
,[LastNonEmptyDate] ON 1
FROM
(
SELECT
[Date].[Calendar].[Date].&[20050111]
:
[Date].[Calendar].[Date].&[20080611] ON 0
FROM [Adventure Works]
);
Your date is 1/22/2015. You want the same date in the previous quarter, which would be 10/22/2014.
If this is what you want, use the MDX function ParallelPeriod as shown in the sample below. Please replace the dimension and cube names with your own.
Select
ParallelPeriod
(
[Date].[Calendar Date].[Calendar Quarter], -- Level Expression
1, -- Index
[Date].[Calendar Date].[Date].&[20150122] -- Member Expression
) On 0
From [Adventure Works]
If you want the same date in the following quarter, then replace index 1 with -1.
Cheers
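As a quick check of what "same position in the previous quarter" means for 1/22/2015, here is a Python sketch using plain calendar dates. It assumes calendar quarters starting in January, April, July and October; the helper names are hypothetical, and it does not model working days or the cube's actual Date members.

```python
from datetime import date, timedelta

def quarter_start(d):
    """First day of the calendar quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)

def same_position_in_previous_quarter(d):
    """Keep d's day offset from the start of its quarter and apply it to the
    start of the previous quarter."""
    start = quarter_start(d)
    offset = d - start
    previous_start = quarter_start(start - timedelta(days=1))
    return previous_start + offset

result = same_position_in_previous_quarter(date(2015, 1, 22))
print(result)
```

Quarters differ in length, so an offset near the end of a long quarter can run past the end of a shorter previous quarter; the sketch does not guard against that.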

Using percentile as measure in SSAS

How do I use a percentile (let's say the 99th percentile) as a measure in an SSAS cube? I'd like to be able to report on it using many different dimensions and filters, so pre-calculating in SQL is not an option.
If you want to evaluate percentiles over a dimension, then you can do it without having to pre-calculate anything in SQL. This can be done using a combination of the TopPercent and Tail MDX functions.
For example, say you want to find the 25th percentile over a [Date].[Calendar Month] dimension attribute by the [Measures].[Avg Daily Census] (I work in health care) measure. You could use the following query to do this.
WITH
MEMBER [Measures].[25th Percentile] AS
(
Tail(
TopPercent(
[Date].[Calendar Month].[Calendar Month],
100 - 25,
[Measures].[Avg Daily Census]
)
).Item(0),
[Measures].[Avg Daily Census]
)
SELECT
[Measures].[25th Percentile] ON 0
FROM [Census]
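This query uses TopPercent to find the top 75 percent along the [Calendar Month] dimension attribute, takes the last (lowest-valued) member of that set, and then evaluates the measure at that member. Note that TopPercent selects members by cumulative share of the measure's total rather than by member count, so the result matches a rank-based percentile only approximately.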
Now, if you want to generate the entire set of percentile values, you can use a sequence utility dimension (just a dimension containing numbers), or you could create a custom percentile dimension containing 0-99. The following query would generate the values of the measure over all percentages.
WITH
MEMBER [Measures].[Percentile] AS
(
Tail(
TopPercent(
[Date].[Calendar Month].[Calendar Month],
100 - [Sequence].[Ones].CurrentMember.MemberValue,
[Measures].[Avg Daily Census]
)
).Item(0),
[Measures].[Avg Daily Census]
)
SELECT
[Measures].[Percentile] ON 0,
[Sequence].[Ones].&[0] : [Sequence].[Ones].&[99] ON 1
FROM [Census]
However, if you are trying to evaluate percentiles at the grain of the fact table, then I believe you will either need to pre-calculate the percentile values for each fact row or add a degenerate dimension and use the method above.
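One thing worth checking before relying on the Tail(TopPercent(...)) pattern: TopPercent is documented as keeping the highest values whose cumulative total reaches the given percentage of the overall total, which is not the same as keeping a percentage of the members. The Python sketch below compares the two on made-up numbers; both helper functions are hypothetical illustrations, not SSAS APIs.

```python
def tail_of_top_percent(values, pct):
    """Mimic Tail(TopPercent(set, pct, measure)): last value kept when taking
    the largest values until they cover pct percent of the total."""
    ordered = sorted(values, reverse=True)
    threshold = sum(ordered) * pct / 100
    running = 0
    for v in ordered:
        running += v
        if running >= threshold:
            return v
    return None  # empty input

def percentile_by_rank(values, p):
    """Rank-based percentile: first value whose rank / count * 100 >= p."""
    ordered = sorted(values)
    for rank, v in enumerate(ordered, start=1):
        if rank / len(ordered) * 100 >= p:
            return v
    return None

census = [100, 50, 30, 10, 10]
by_total = tail_of_top_percent(census, 100 - 25)
by_rank = percentile_by_rank(census, 25)
print(by_total, by_rank)
```

On skewed data like this the two answers differ noticeably, so it is worth deciding which definition the report actually needs.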

How to groupby and filter on the same dimension in MDX

I want to create a bar chart with one bar per month for some measure.
But I also want to filter on a range of days which might not completely overlap some of the months.
When that happens, I would like the aggregate for those months to cover only the days that fall in my date range, not the whole month.
Is that possible with MDX, and if so, what should the query look like?
Create a second time dimension, using a virtual dimension of the original dimension. Use one dimension in the WHERE and another in the SELECT.
This often happens anyway if some people want 'Business Time' of quarters and periods, and others prefer months. Or if you have a financial year which runs April-April.
You can use a subselect. Its behavior is described as follows:
When a member is specified in the axis clause then that member with
its ascendants and descendants are included in the sub cube space for
the subselect; all non mentioned sibling members, in the axis or
slicer clause, and their descendants are filtered from the subspace.
This way, the space of the outer select has been limited to the
existing members in the axis clause or slicer clause, with their
ascendants and descendants as mentioned before.
Here is an example:
SELECT { [Customer].[Customer Geography].[Country].&[Australia]
, [Customer].[Customer Geography].[Country].&[United States]
} ON 1
, {[Measures].[Internet Sales Amount], [Measures].[Reseller Sales Amount]} ON 0
FROM ( SELECT {[Customer].[Customer Geography].[Country].&[Australia]
, [Customer].[State-Province].&[WA]&[US]} ON 0
FROM [Adventure Works]
)
The result will contain one row for Australia and another for the United States. With the subselect, I restricted the value for the United States to Washington state.
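To make the intended result concrete for the month / day-range case: the months stay as the grouping, but each month's total only covers the days inside the filter range. Here is a Python sketch of that semantics with made-up daily data; the names are hypothetical, and this only illustrates the expected numbers, not how the cube evaluates a subselect.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical fact data: a count of 1 for every day from 2015-01-01 to 2015-03-31.
first = date(2015, 1, 1)
daily_counts = {first + timedelta(days=i): 1 for i in range(90)}

def monthly_totals(daily, first_day, last_day):
    """Group by (year, month), summing only days inside [first_day, last_day]."""
    totals = defaultdict(int)
    for day, value in daily.items():
        if first_day <= day <= last_day:
            totals[(day.year, day.month)] += value
    return dict(totals)

totals = monthly_totals(daily_counts, date(2015, 1, 20), date(2015, 3, 10))
print(totals)
```

January and March show partial totals while February shows the full month, which is the behavior the subselect on the day range is meant to produce.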
One way I found to do it with Mondrian is as follows:
WITH MEMBER [Measures].[Units Shipped2] AS
SUM
(
{
exists([Store].[Store Country].currentmember.children,{[Store].[USA].[WA],[Store].[USA].[OR]})
},[Measures].[Units Shipped]
)
MEMBER [Measures].[Warehouse Sales2] AS
SUM
(
{
exists([Store].[Store Country].currentmember.children,{[Store].[USA].[WA],[Store].[USA].[OR]})
},[Measures].[Warehouse Sales]
)
SELECT
{[Measures].[Units Shipped2],[Measures].[Warehouse Sales2]} ON 0,
NON EMPTY [Store].[Store Country].Members on 1
FROM [Warehouse]
I am not sure whether the filtering will be pushed down to SQL as below, which would give good performance, or be run locally:
select Country, sum(unit_shipped)
where state in ('WA','OR' )
group by Country