MDX Calculation: Filtering a measure based on the value in another column - SSAS

The problems I have are all of the nature 'filter one fact table measure based on the value of a column in the same fact table'.
I have a cube with a measure group called Report; 'Calls' and 'Failure' are columns of this measure group. There is also a dimension called 'Trial'. I have to write a few new calculations in the SSAS cube, for example:
Sum([measure].[calls]) only if Failure = 1 and only for [Trial categories].[Trail].&[1].
I didn't get the desired result using Filter, so I created a new column in the fact table, 'calls_if_failure', which equals 'calls' or 0 depending on the value of the Failure column.
Then, in the calculated measure, I used sum([calls_if_failure], [Trial categories].[Trail].&[1]). Is this the only way to do this?
Now I have many more requirements of this nature:
Sum([measure].[calls]) only if [measure].[visit] = 1
Should I take the same approach as before? If yes, this would mean many more columns in the fact table.
Appreciate any help. Thank You.

I'm struggling a little to understand the question, but maybe the following will at least give you some ideas:
SUM(
    //Sum calls across the Trail members, keeping only the &[1] member
    [Trial categories].[Trail].[Trail].MEMBERS,
    IIF(
        [Trial categories].[Trail].CURRENTMEMBER IS [Trial categories].[Trail].&[1],
        [measure].[calls],
        NULL
    )
)
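As a variation: if the goal is always the calls value for [Trial categories].[Trail].&[1], regardless of which Trial the user has selected, a plain tuple may be enough. This is only a sketch using the measure and hierarchy names from the question ([Measures].[Calls Trial 1] is just an example name); the Failure = 1 part would still need either your calls_if_failure column or a Failure attribute exposed on a dimension.
CREATE MEMBER CURRENTCUBE.[Measures].[Calls Trial 1]
AS
    //Pin the Trial member; the tuple still aggregates normally over all other dimensions
    ([Measures].[calls], [Trial categories].[Trail].&[1]),
VISIBLE = 1;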

MDX OLAP Cube Query Optimization

Problem: I'm trying to write an MDX query that will show the first date on which a member has measure values.
Data obstacles:
1. I don't have access to the data warehouse/source data
2. I can't request any physical calcs or CUBE changes
Looking for: I know this goes against what a CUBE should be doing, but is there any way to achieve this result? I'm running into locking conflicts and general runtime issues.
Background: After some trial and error, I have a working query, but sadly it's only practical when filtered to fewer than 10 employees. I've tried some looping, but there are ~60k employee IDs in the cube, each having 10-20 emp keys (one for each change in their employee info).
//must have values for measure 1 or 2
WITH
SET NE_measures AS
{
    [Measures].[measure1],
    [Measures].[measure2]
}
//first date with measure values for each unique emp key
MEMBER [Measures].[changedate] AS
Head(
    NonEmpty([Dim Date].[Date].[Date].ALLMEMBERS, NE_measures)
).Item(0).Member_Name
SELECT NON EMPTY {[Measures].[changedate]} ON COLUMNS,
       NON EMPTY [Dim Employee].[Emp Key].[Emp Key].ALLMEMBERS ON ROWS
FROM [Cube]
Try this:
MEMBER [Measures].[changedate] AS
Min(
[Dim Date].[Date].[Date].allMEMBERS,
IIF(
NOT(ISEMPTY([Measures].[measure1]))
OR NOT(ISEMPTY([Measures].[measure2])),
[Dim Date].[Date].CurrentMember.MemberValue,
NULL
)
);
I'm assuming the KeyColumn or ValueColumn is more likely to sort properly than the name, so if MemberValue doesn't work, try Member_Key.
The most efficient way of accomplishing this would be to add a date column to the fact table that has measure 1 and measure 2, then create an AggregateFunction=Min measure on it. But since you said you can't change the cube, I didn't propose that superior option.
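For reference, here is a sketch of the revised member dropped into the original query (cube, measure and hierarchy names are taken from the question as-is):
WITH
MEMBER [Measures].[changedate] AS
    Min(
        [Dim Date].[Date].[Date].ALLMEMBERS,
        IIF(
            NOT(ISEMPTY([Measures].[measure1]))
            OR NOT(ISEMPTY([Measures].[measure2])),
            //MemberValue should sort as a real date; fall back to Member_Key if it doesn't
            [Dim Date].[Date].CurrentMember.MemberValue,
            NULL
        )
    )
SELECT NON EMPTY {[Measures].[changedate]} ON COLUMNS,
       NON EMPTY [Dim Employee].[Emp Key].[Emp Key].ALLMEMBERS ON ROWS
FROM [Cube]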

Optimizing Summarize in DAX

I have this DAX formula that gives me a count of the IDs that appear in the fact table in a month, averaged over the year. I can put this measure in a table and it breaks down by row with no issues (by adding columns from dimensions):
Measure:=
AVERAGEX(
    SUMMARIZE(
        CALCULATETABLE(fact_table; FILTER('Time_Dimension'; 'Time_Dimension'[Last_month] <> "LAST"));
        Time_Dimension[Month Name];
        "Count"; DISTINCTCOUNT(fact_table[ID])
    );
    [Count]
)
But it's terribly slow (I have 3 measures like this in a single table) and the fact table is big (like 300 million rows big).
I was reading that SUMMARIZE performs really badly with aggregations and that it should be replaced with SUMMARIZECOLUMNS.
So I wrote this formula:
Measure_v2:=
AVERAGEX(
    SUMMARIZECOLUMNS(
        Time_Dimension[Month Name];
        FILTER(Time_Dimension;
            Time_Dimension[Month Name] <> "LAST"
        );
        "Count"; DISTINCTCOUNT(fact_table[ID])
    );
    [Count]
)
And it works when I visualize the measure on its own, but when I try to put it in a context (like the table above) it gives me the error "Can't use SUMMARIZECOLUMN and ADDMISSINGITEMS() in this context". How can I make a sustainable optimization of the original SUMMARIZE function?
Before optimizing SUMMARIZE, I would revisit the overall approach. If your goal is to calculate the average fact count per year-month, there is a simpler (and faster) way:
[ID Count]:=CALCULATE(COUNT('fact_table'[ID]),'Time_Dimension'[Last_month] <> "LAST")
[Average ID Count]:=AVERAGEX(VALUES('Time_Dimension'[Year_Month]), [ID Count])
assuming that:
you have a year-month attribute in your time dimension;
IDs in your fact table are unique (and therefore, a simple count is enough).
If this solution does not solve your problem, then please post your data model - it's hard to optimize without knowing the data structure.
On a side note, I would remove the ID field from the fact table. It adds no value to the model and consumes huge amounts of memory. Your objective can be achieved by simply counting rows:
[Fact Count]:=CALCULATE(COUNTROWS('fact_table'),'Time_Dimension'[Last_month] <> "LAST")
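If you go the COUNTROWS route, the averaged version can be sketched the same way as above (the [Average Fact Count] name is just an example, and it assumes the same [Year_Month] attribute):
[Average Fact Count]:=AVERAGEX(VALUES('Time_Dimension'[Year_Month]), [Fact Count])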

Slow MDX Custom Distinct Count Formula

I have a question related to creating a (more efficient) custom Distinct Count Measure using MDX.
Background
My cube has several long many-to-many relationship chains between Facts and Dimensions, and it is important for me to be able to track which members in certain Dimensions do and do not relate to other Dimensions. As such, I have created a "Not Related" record in each of my dimension tables and set those records' ID values to -1. Then, in my intermediate mapping fact tables, I use the -1 ID to connect to these "Not Related" records.
The issue arises when I try to run a normal out-of-the-box distinct count on any field where the -1 members are present. In the case that a -1 member exists, the distinct count measure will return a result of 1 more than the true answer.
To solve this issue I have written the following MDX:
CREATE MEMBER CURRENTCUBE.[Measures].[Provider DCount]
AS
    //Oddly enough MDX seems to require that the PID (Provider ID) field be different from both the linking field and the user sliceable field.
    SUM(
        [Providers].[PID Used For MDX].Children,
        //Don't count the 'No Related Record' item.
        IIF(
            NOT([Providers].[PID Used For MDX].CURRENTMEMBER IS [Providers].[PID Used For MDX].&[-1])
            //For some reason this seems to be necessary to calculate the Unknown Member correctly.
            //The "Regular Provider DCount Measure" below is the out-of-the-box, non-MDX measure built off the same field, and is not shown in the final output.
            AND [Measures].[Regular Provider DCount Measure] > 0,
            1,
            NULL
        )
    ),
VISIBLE = 1, DISPLAY_FOLDER = 'Distinct Count Measures';
The Issue
This MDX works and always shows the correct answer (yeah!), but it is EXTREMELY slow once users start pulling pivot tables with more than a few hundred cells that use this measure. For fewer than 100 cells, the results are nearly instantaneous. For a few thousand cells (which is not uncommon at all), the results can take up to an hour to resolve (uggghhh!).
Can anyone help show me how to write a more efficient MDX formula to accomplish this task? Your help would be GREATLY appreciated!!
Jon Oakdale
jonoakdale#hotmail.com
Jon
You can use a predefined scope to nullify all unnecessary (-1) members and then create your measure.
SCOPE ([Providers].[PID Used For MDX].&[-1]
,[Measures].[Regular Provider DCount Measure]);
THIS = NULL;
END SCOPE;
CREATE MEMBER CURRENTCUBE.[Measures].[Provider DCount]
AS
SUM([Providers].[PID Used For MDX].Children
,[Measures].[Regular Provider DCount Measure]),
VISIBLE = 1;
By the way, in my tests I used the [Providers].[PID Used For MDX].[All].Children construction, since I don't know what the dimension / hierarchy / ALL level is in your case. It seems like [PID Used For MDX] is the ALL level, [Providers] is the name of both the dimension and the hierarchy, and HierarchyUniqueName is set to Hide.
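If you prefer a single member without the SCOPE assignment, here is a sketch of the same idea using Except to drop the -1 member (hierarchy and measure names are assumed from your post; whether it performs better will depend on the cube):
CREATE MEMBER CURRENTCUBE.[Measures].[Provider DCount]
AS
    SUM(
        //Drop the 'No Related Record' member before summing the per-provider distinct counts
        Except(
            [Providers].[PID Used For MDX].Children,
            {[Providers].[PID Used For MDX].&[-1]}
        ),
        [Measures].[Regular Provider DCount Measure]
    ),
VISIBLE = 1, DISPLAY_FOLDER = 'Distinct Count Measures';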

MDX Query SUM PROD to do Weighted Average

I'm building a cube in MS BIDS. I need to create a calculated measure that returns the weighted-average of the rank value weighted by the number of searches. I want this value to be calculated at any level, no matter what dimensions have been applied to break-down the data.
I am trying to do something like the following:
I have one measure, called [Rank Search Product], which I want to apply at the lowest level possible and then sum all values of:
IIf([Measures].[Searches] IS NOT NULL, [Measures].[Rank] * [Measures].[Searches], NULL)
And then my weighted average measure uses this:
IIf([Measures].[Rank Search Product] IS NOT NULL AND SUM([Measures].[Searches]) <> 0,
SUM([Measures].[Rank Search Product]) / SUM([Measures].[Searches]),
NULL)
I'm totally new to writing MDX queries, so this is all very confusing to me. The calculation should be:
([Rank][0]*[Searches][0] + [Rank][1]*[Searches][1] + [Rank][2]*[Searches][2] ...)
/ SUM([searches])
I've also tried to follow what is explained in this link: http://sqlblog.com/blogs/mosha/archive/2005/02/13/performance-of-aggregating-data-from-lower-levels-in-mdx.aspx
Currently, loading my data into a pivot table in Excel returns #VALUE! for all calculations of my custom measures.
Please help!
First of all, you would need an intermediate measure, let's say Rank times Searches, in the cube. The most efficient way to implement this would be to calculate it when processing the measure group: you would extend your fact table by a column, e.g. in a view, or add a named calculation in the data source view. The SQL expression for this column would be something like Searches * Rank. In the cube definition, you would set the aggregation function of this measure to Sum and make it invisible. Then just define your weighted average as
[Measures].[Rank times Searches] / [Measures].[Searches]
or, to avoid irritating results for zero/null values of searches:
IIf([Measures].[Searches] <> 0, [Measures].[Rank times Searches] / [Measures].[Searches], NULL)
Since Analysis Services 2012 SP1, you can abbreviate the latter to
Divide([Measures].[Rank times Searches], [Measures].[Searches], NULL)
Then the MDX engine will apply everything automatically across all dimensions for you.
In the second expression, the <> 0 test also covers null, since in numerical contexts MDX evaluates NULL as zero - in contrast to SQL.
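Put together as a calculated member in the cube's calculation script, that could look like the following sketch ([Weighted Avg Rank] is just an example name; [Rank times Searches] is the hidden Sum measure described above):
CREATE MEMBER CURRENTCUBE.[Measures].[Weighted Avg Rank]
AS
    //Returns NULL instead of an error when Searches is zero or null
    Divide([Measures].[Rank times Searches], [Measures].[Searches], NULL),
VISIBLE = 1;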
Finally, as I interpret the link in your question, you could leave your measure Rank times Searches at the SQL/data source view level as anything, maybe just 0 or null, and would then add the following to your calculation script:
({[Measures].[Rank times Searches]}, Leaves()) = [Measures].[Rank] * [Measures].[Searches];
From my point of view, this solution is not as clear as calculating the value directly as described above. I would also expect it to be slower, at least if you use aggregations for some partitions in your cube.

Filtering a Measure (or Removing Outliers)

Say I have a measure, foo, in a cube, and I have a reporting requirement that users want to see the following measures in a report:
total foo
total foo excluding instances where foo > 10
total foo excluding instances where foo > 30
What is the best way to handle this?
In the past, I have added Named Calculations which return NULL if foo > 10 or just foo otherwise.
I feel like there has to be a way to accomplish this in MDX (something like Filter([Measures].[foo], [Measures].[foo] > 10)), but I can't for the life of me figure anything out.
Any ideas?
The trick is that you need to apply the filter on your set, not on your measure.
For example, using the usual Microsoft 'warehouse and sales' demo cube, the following MDX will display the sales for all the stores where sales were greater than $2000.
SELECT Filter([Store].[Stores].[Store].members, [Unit Sales] > 2000) ON COLUMNS,
[Unit Sales] ON ROWS
FROM [Warehouse and Sales]
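If you want the filtered figure as a single total rather than one column per store, the same Filter can be wrapped in a calculated member. A sketch against the same demo cube ([Unit Sales Under 2000] is just an example name; substitute foo and whatever grain the 'instances' are in your cube):
WITH MEMBER [Measures].[Unit Sales Under 2000] AS
    SUM(
        //Keep only the stores at or below the threshold, then total their sales
        Filter([Store].[Stores].[Store].members, [Measures].[Unit Sales] <= 2000),
        [Measures].[Unit Sales]
    )
SELECT [Measures].[Unit Sales Under 2000] ON COLUMNS
FROM [Warehouse and Sales]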
I met a similar problem when using Saiku (with a Mondrian backend). As I haven't found any clear solution for "adding a filter on a measure", I'm adding it here; it may be useful for others.
In Saiku 3.8, you can add the filter in the UI: "column" -> "filter" -> "custom"; you will then see a Filter MDX Expression box.
Let's suppose we want ad clicks greater than 1000; then add the following line there:
[Measures].[clicks] > 1000
Save and close, and that filter will then return only the elements with clicks greater than 1000.
The resulting MDX looks like the following (supposing dt is the dimension and clicks is the measure, and we want to find dt members with more than 1000 clicks):
WITH
SET [~ROWS] AS
Filter({[Dt].[dt].[dt].Members}, ([Measures].[clicks] > 1000))
SELECT
NON EMPTY {[Measures].[clicks]} ON COLUMNS,
NON EMPTY [~ROWS] ON ROWS
FROM [OfflineData]
I think you have two choices:
1. Add a column to your fact table (or to a view in the data source view based on the fact table), like:
case when unit_Price > 2000 then 1
else 0
end as Unit_Price_Upper_Or_Under_10
and add a fictitious dimension based on this column's value,
then add a named query for the new dimension (say Range_Dimension) in the data source view:
select 1 as range
union all
select 0 as range
After that, you can use this filter like any other dimension and attribute:
SELECT [Store].[Stores].[Store].members ON COLUMNS,
[Unit Sales] ON ROWS
FROM [Warehouse and Sales]
WHERE [Test_Dimension].[Range].&[1]
The problem is that for every range you must add a WHEN condition, so this is only a good solution if the ranges are static.
For dynamic ranges, it's better to formulate the range (based on a discretizing method).
2. Add a dimension at the granularity of the fact table, based on the fact table itself. For example, if we have a fact table with primary key Sale_Id, we can add a dimension based on the fact table with only one column, Sale_Id. In the Dimension Usage tab we can relate this new dimension to the measure group with relationship type Fact, and after that in MDX we can use something like:
filter([dim Sale].[Sale Id].[Sale Id].members,[Measures].[Unit Price]>2000)
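Building on option 2, that fact-grain filter can also be wrapped in a calculated member to get a single 'excluding outliers' total. A sketch only, reusing the names from the example above and the earlier demo cube:
WITH MEMBER [Measures].[Unit Sales Excl Outliers] AS
    SUM(
        //Keep only the individual sales at or below the threshold
        Filter([dim Sale].[Sale Id].[Sale Id].members, [Measures].[Unit Price] <= 2000),
        [Measures].[Unit Sales]
    )
SELECT [Measures].[Unit Sales Excl Outliers] ON COLUMNS
FROM [Warehouse and Sales]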