I have a question related to creating a (more efficient) custom Distinct Count Measure using MDX.
Background
My cube has several long many to many relationship chains between Facts and Dimensions and it is important for me to be able to track which members in certain Dimensions do and do not relate to other Dimensions. As such, I have created a "Not Related" record in each of my dimension tables and set those records' ID values to -1. Then in my intermediate mapping fact tables I use the -1 ID to connect to these "Not Related" records.
The issue arises when I try to run a normal out-of-the-box distinct count on any field where the -1 members are present. In the case that a -1 member exists, the distinct count measure will return a result of 1 more than the true answer.
To solve this issue I have written the following MDX:
CREATE MEMBER CURRENTCUBE.[Measures].[Provider DCount]
AS
//Oddly enough MDX seems to require that the PID (Provider ID) field be different from both the linking field and the user sliceable field.
SUM( [Providers].[PID Used For MDX].Children ,
//Don't count the 'No Related Record' item.
IIF( NOT([Providers].[PID Used For MDX].CURRENTMEMBER IS [Providers].[PID Used For MDX].&[-1])
//For some reason this seems to be necessary to calculate the Unknown Member correctly.
//The "Regular Provider DCount Measure" below is the out-of-the-box, non-MDX measure built off the same field, and is not shown in the final output.
AND [Measures].[Regular Provider DCount Measure] > 0 , 1 , NULL )
),
VISIBLE = 1 , DISPLAY_FOLDER = 'Distinct Count Measures' ;
The Issue
This MDX works and always shows the correct answer (yeah!), but it is EXTREMELY slow when users start pulling Pivot Tables with more than a few hundred cells that use this measure. For less than 100 cells, the results are nearly instantaneously. For a few thousand cells (which is not uncommon at all), the results could take up to an hour to resolve (uggghhh!).
Can anyone help show me how to write a more efficient MDX formula to accomplish this task? Your help would be GREATLY appreciated!!
Jon Oakdale
jonoakdale#hotmail.com
Jon
You can use predefined scope to nullify all unnecessary (-1) members and than create your measure.
SCOPE ([Providers].[PID Used For MDX].&[-1]
,[Measures].[Regular Provider DCount Measure]);
THIS = NULL;
END SCOPE;
CREATE MEMBER CURRENTCUBE.[Measures].[Provider DCount]
AS
SUM([Providers].[PID Used For MDX].Children
,[Measures].[Regular Provider DCount Measure]),
VISIBLE = 1;
By the way, I used in my tests [Providers].[PID Used For MDX].[All].Children construction since don't know, what is dimension / hierarchy / ALL-level in your case. It seems like [PID Used For MDX] is ALL-level and [Providers] is name of dimension and hierarchy, and HierarchyUniqueName is set to Hide.
Related
I am a beginner in MDX queries. Can any one tell me how to get the record count that is a result of a MDX query?
The query is following:
select {[Measures].[Employee Department History Count],[Measures].[Rate]} on columns, Non Empty{{Filter([Shift].[Shift ID].[Shift ID].Members, ([Shift].[Shift ID].CurrentMember.name <> "1"))}*{[Employee].[Business Entity ID].[Business Entity ID].Members}} on rows from [Adventure Works2012].
I have tried various methods and I haven't really got a solution for that.
I assume you mean row count when you talk of "record count", as MDX does not know a concept of records, but the result shown from an MDX query is the space built by the tuples on the axes.
I see two possibilities to get the row count:
Just count the rows returned from your above query in the tool from which you call the MDX query.
If you want to count in MDX, then let's state what you want to have:
You want to know the number of members of the set of non empty combinations of [Shift ID]s and [Business Entity ID]s where the Shift ID is not "1" and at least one of the measures [Employee Department History Count] and [Rate] is not null.
To state that different: Let's call the tuples like above for which the first measure is not null "SET1", and the tuples like above for which teh second measure is not null "SET2". Then you you want to know the count of the the tuples which are contained in one of these sets (or in both).
To achieve this, we define these two sets and then a calculated menber (a new measure in our case) containing this calculation in its definition, and then use this calculated member in the select clause to show it:
WITH
SET SET1 AS
NonEmpty({{Filter([Shift].[Shift ID].[Shift ID].Members,
([Shift].[Shift ID].CurrentMember.name <> "1"))}
* {[Employee].[Business Entity ID].[Business Entity ID].Members}},
{[Measures].[Employee Department History Count])
SET SET2 AS
NonEmpty({{Filter([Shift].[Shift ID].[Shift ID].Members,
([Shift].[Shift ID].CurrentMember.name <> "1"))}
* {[Employee].[Business Entity ID].[Business Entity ID].Members}},
{[Measures].[Rate])
MEMBER [Measures].[MyCalculation] AS
COUNT(SET1 + SET 2)
SELECT [Measures].[MyCalculation] ON COLUMNS
FROM [Adventure Works2012]
Please note:
The sets SET1 and SET2 are not absolutely necessary, you could also put the whole calculation in one long and complicated definition of the MyCalculation measure, but splitting it up makes is easier to read. However, the definition of a new member is necessary, as in MDX you can only put members on axes (rows, columns, ...). These members can either already been defined in the cube, or you have to define them in the WITH clause of your query. There is no such thing as putting expressions/calculations on axes in MDX, only members.
The + for sets is a union which removes duplicates, hence this operation gives us the tuples which have an non empty value for at least one of the measures. Alternatively, you could have used the Union function equivalently to the +.
The Nonempty() I used in the definitions of the sets is the NonEmpty function, which is slightly different from the NON EMPTY keyword that you can use on the axes. We use one of the measures as second argument to this function in both set definitions.
I have currently no working SSAS installation available to test my statement, hence there might be a minor error or typo in my above statement, but the idea should work.
I am trying to replicate the following sql statement into MDX so that I can create a calculated member in the cube using the base loaded members instead of having to calculate it outside the cube in the table and then loading it
SUM(CASE WHEN ((A.SALES_TYPE_CD = 1) AND (A.REG_SALES=0))
THEN A.WIN_SALES
ELSE 0
END) AS Z_SALES
I am currently loading SALES_TYPE_CD as a dimension and REG_SALES and WIN_SALES as measures.
I also have a few other dimensions in the cube but for simplicity, lets just say I have 2 other dimensions, LOCATION and ITEM
The dimension has LOCATION has 3 levels, "Region"->"District"->"Store", ordered from top to bottom level.
The dimension has ITEM has 3 levels, "CLASS"->"SUBCLASS"->"SKU", ordered from top to bottom level.
The dimension has SALES TYPE has 2 levels, "SALES_TYPE_GROUP"->"SALES_TYPE_CD", ordered from top to bottom level.
I know that I cannot create a simple calculated member in the cube which crossjoins the "SALES_TYPE" dimension with another dimension to get the answer I want.
I would think that it would be a more complicated MDX statement something like :
CREATE MEMBER CURRENTCUBE.[Measures].[Z_Sales]
AS 'sum(filter(crossjoin(leaves(), [Sales Type].[Sales Type].
[Sales_Type_CD].&[1]), [Measures].[REG_SALES]=0),[Measures].
[WIN_SALES])',
FORMAT_STRING = '#,#',
VISIBLE = 1 ;
But this does not seem to return the desired result.
What would be the proper MDX code to generate the desired result?
I did a bunch of testing with the data and I now know that there is no way I can get the right answer by using MDX alone in this scenario. Like "Greg" and "Tab" suggested, the only way would be to have reg sales as a dimension. Since this is a measure, that is out of the question because of the large number of possibilities for the value which has a data type of decimal (18,2)
Thanks for taking the time to answer the question.
I have a measure [Measures].[myMeasure] that I would like to create several derivatives of based on the related attribute values.
e.g. if the related [Location].[City].[City].Value = "Austin" then I want the new calculated measure to return the value of [Measures].[myMeasure], otherwise, I want the new calculated measure to return 0.
Also, I need the measure to aggregate correctly meaning sum all of the leaf level values to create a total.
The below works at the leaf level or as long as the current member is set to Austin...
Create Member CurrentCube.[Measures].[NewMeasure] as
iif(
[Location].[City].currentmember = [Location].[City].&[Austin],
[Measures].[myMeasure],
0
);
This has 2 problems.
1 - I don't always have [Location].[City] in context.
2. When multiple cities are selected this return 0.
I'm looking for a solution that would work regardless of whether the related dimension is in context and will roll up by summing the atomic values based on a formula similar to above.
To add more context consider a transaction table with an amount field. I want to convert that amount into measures such as payments, deposits, return, etc... based on the related account.
I don't know the answer but just a couple of general helpers:
1 You should use IS rather than = when comparing to a member
2 You should use null rather than 0 - 0/NULL are effecitvely the same but using 0 will slow things up a lot as the calculation will be fired many more times. (this might help with the second section of your question)
Create Member CurrentCube.[Measures].[NewMeasure] as
iif(
[Location].[City].currentmember IS [Location].[City].&[Austin],
[Measures].[myMeasure],
NULL
);
I'm building a cube in MS BIDS. I need to create a calculated measure that returns the weighted-average of the rank value weighted by the number of searches. I want this value to be calculated at any level, no matter what dimensions have been applied to break-down the data.
I am trying to do something like the following:
I have one measure called [Rank Search Product] which I want to apply at the lowest level possible and then sum all values of it
IIf([Measures].[Searches] IS NOT NULL, [Measures].[Rank] * [Measures].[Searches], NULL)
And then my weighted average measure uses this:
IIf([Measures].[Rank Search Product] IS NOT NULL AND SUM([Measures].[Searches]) <> 0,
SUM([Measures].[Rank Search Product]) / SUM([Measures].[Searches]),
NULL)
I'm totally new to writing MDX queries and so this is all very confusing to me. The calculation should be
([Rank][0]*[Searches][0] + [Rank][1]*[Searches][1] + [Rank][2]*[Searches][2] ...)
/ SUM([searches])
I've also tried to follow what is explained in this link http://sqlblog.com/blogs/mosha/archive/2005/02/13/performance-of-aggregating-data-from-lower-levels-in-mdx.aspx
Currently loading my data into a pivot table in Excel is return #VALUE! for all calculations of my custom measures.
Please halp!
First of all, you would need an intermediate measure, lets say Rank times Searches, in the cube. The most efficient way to implement this would be to calculate it when processing the measure group. You would extend your fact table by a column e. g. in a view or add a named calculation in the data source view. The SQL expression for this column would be something like Searches * Rank. In the cube definition, you would set the aggregation function of this measure to Sum and make it invisible. Then just define your weighted average as
[Measures].[Rank times Searches] / [Measures].[Searches]
or, to avoid irritating results for zero/null values of searches:
IIf([Measures].[Searches] <> 0, [Measures].[Rank times Searches] / [Measures].[Searches], NULL)
Since Analysis Services 2012 SP1, you can abbreviate the latter to
Divide([Measures].[Rank times Searches], [Measures].[Searches], NULL)
Then the MDX engine will apply everything automatically across all dimensions for you.
In the second expression, the <> 0 test includes a <> null test, as in numerical contexts, NULL is evaluated as zero by MDX - in contrast to SQL.
Finally, as I interpret the link you have in your question, you could leave your measure Rank times Searches on SQL/Data Source View level to be anything, maybe just 0 or null, and would then add the following to your calculation script:
({[Measures].[Rank times Searches]}, Leaves()) = [Measures].[Rank] * [Measures].[Searches];
From my point of view, this solution is not as clear as to directly calculate the value as described above. I would also think it could be slower, at least if you use aggregations for some partitions in your cube.
I need an MDX query for Mondrian filtered by date, where one or both of the boundry dates may not exist. I'm using the query below that works as long as both 2013-01-01 and 2013-01-08 dimensions exist. If one of the two dates does not exist then it returns no results, even though the dimensions in between do exist. How would I get this query to work even in the case of a missing boundry date dimension?
SELECT
NON EMPTY {Hierarchize({[Measures].[Number of Something]})} ON COLUMNS,
NON EMPTY {[Date].[2013-01-01]:[Date].[2013-01-08]} ON ROWS
FROM [Users]
MDX is built with the assumption that every member that you refer to exists; it is best then to make sure all conceivable date dimension members do exist by having a separate table with these values precomputed.
You could get tricky and implement that table as a stored procedure but date dimensions don't take up a lot of space in the grand scheme of things so you'd hardly ever do this.
I don't know of any other way to solve your problem.
try to eliminate the NON EMPTY
Even I haven't yet understand the reason of implementing this logic, you can hide this by adding . If you add custom member in Mondrian try it.
/* Exclude Missing Member */
Create Set CurrentCube.[MissingMemberSet] As
iif(IsError(StrToMember("[Dimension].[Hierarchy].&[MEMBER]")),
{}, {[Dimension].[Hierarchy].&[MEMBER]});
Create Member CurrentCube.Measures.[Calculation on Missing Member]
AS
IIF ([MissingMemberSet].Count > 0,
([Dimension].[Hierarchy].&[MEMBER],Measures.[X Measure]),
0
)
,
FORMAT_STRING = "Currency",
LANGUAGE = 1033,
NON_EMPTY_BEHAVIOR = { [X Measure] },
VISIBLE = 1 , DISPLAY_FOLDER = 'Display Folder' ;
Also you can implement in using IIF(IsError or IIF(Exists MDX functions.