MDX Performance Issues on Count & Sum of Filtered Members/Measures - ssas

I Have the following MDX query I need to run on a cube (which I don't have access to change).
This particular query is taking around 1.5 minutes to run, which is just far too long. I've been searching for a way to speed it up, but I'm not having a heap of luck.
Can anyone see a way to improve this query? I've been tearing my hair out for the last couple of days, so any help would be greatly appreciated!
`WITH
MEMBER [Measures].[1-99_Count] AS
COUNT(FILTER ([Scam].[Scam Ref].AllMembers,
([Measures].[Amount Lost]>=1 AND [Measures].[Amount Lost]<=99)))
MEMBER [Measures].[1-99_Amount] AS
SUM(FILTER ([Scam].[Scam Ref].AllMembers,
([Measures].[Amount Lost]>=1 AND [Measures].[Amount Lost]<=99)),
Iif(IsEmpty([Measures].[Amount Lost]),0,[Measures].[Amount Lost]))
SELECT {[Measures].[1-99_Count],
[Measures].[1-99_Amount]} ON COLUMNS,
[First Resolved On Date].[Month].[Month] ON ROWS
FROM [Infocentre]
WHERE ([First Resolved On Date].[Date].[Date].&[20140101]:[First Resolved On Date].[Date].[Date].&[20150623],
[Scam].[Scam Category Level1].&[{d9d6bc38-e73e-e411-9a82-0a713f2121f7}])`

In the end, logic prevailed and I was able to get the cube owners to add a new dimension to the cube, which means I no longer need to try and get this dog's breakfast working.

(just for fun!)
Your first measure can definitely be improved via the following.
Count(Filter is a pattern you can get rid of most of the time.
Mosha blogged about this pattern, which can be improved, here:
http://sqlblog.com/blogs/mosha/archive/2007/11/22/optimizing-count-filter-expressions-in-mdx.aspx
I've also tried to improve the second calculation as well. If this condition is true [Measures].[Amount Lost] >= 1 AND [Measures].[Amount Lost] <= 99 then this must mean that this is false IsEmpty([Measures].[Amount Lost]) so maybe both conditions can be covered by 1 condition. Also replaced your 0 with null - SSAS is a lot happier (& faster) with null:
WITH
MEMBER [Measures].[1-99_Count] AS
Sum
(
[Scam].[Scam Ref].ALLMEMBERS
,IIF
(
[Measures].[Amount Lost] >= 1 AND [Measures].[Amount Lost] <= 99
,1
,null
)
)
MEMBER [Measures].[1-99_Amount] AS
Sum
(
[Scam].[Scam Ref].ALLMEMBERS
,IIF
(
[Measures].[Amount Lost] >= 1 AND [Measures].[Amount Lost] <= 99
,[Measures].[Amount Lost]
,null
)
)
SELECT
{
[Measures].[1-99_Count]
,[Measures].[1-99_Amount]
} ON COLUMNS
,[First Resolved On Date].[Month].[Month] ON ROWS
FROM [Infocentre]
WHERE
(
[First Resolved On Date].[Date].[Date].&[20140101]
:
[First Resolved On Date].[Date].[Date].&[20150623]
,[Scam].[Scam Category Level1].&[{d9d6bc38-e73e-e411-9a82-0a713f2121f7}]
);

Related

custom count measure runs forever MDX

So this question goes off the one here
I've been trying to do a similar count measure and I did the suggested solution but it's still running.... and it's been more than 30 minutes with no results, while without that it runs in under a minute. So am I missing something? Any guidance would help. Here is my query:
WITH
MEMBER [Measures].[IteractionCount] AS
NONEMPTY
(
FILTER
(
([DimInteraction].[InteractionId].[ALL].Children,
[Measures].[Impression Count]),
[DimInteraction].[Interaction State].&[Enabled]
)
).count
SELECT
(
{[Measures].[IteractionCount],
[Measures].[Impression Count]}
)
ON COLUMNS,
(
([DimCampaign].[CampaignId].[CampaignId].MEMBERS,
[DimCampaign].[Campaign Name].[Campaign Name].MEMBERS,
[DimCampaign].[Owner].[Owner].MEMBERS)
,[DimDate].[date].[date].MEMBERS
)
ON ROWS
FROM
(
SELECT
(
{[DimDate].[date].&[2020-05-06T00:00:00] : [DimDate].[date].&[2020-05-27T00:00:00]}
)
ON COLUMNS
FROM [Model]
)
WHERE
(
{[DimCampaign].[EndDate].&[2020-05-27T00:00:00]:NULL},
[DimCampaign].[Campaign State].&[Active],
{[DimInteraction].[End Date].&[2020-05-27T00:00:00]:NULL}//,
//[DimInteraction].[Interaction State].&[Enabled]
)
I don't know if FILTER is affecting it in any way but I tried it with and without and it still runs forever. I do need it specifically filtered to [DimInteraction].[Interaction State].&[Enabled]. I have also tried to instead filter to that option in the WHERE clause but no luck
Any suggestions to optimize this would be greatly appreciated! thanks!
UPDATE:
I end up using this query to load data into a python dataframe. Here is my code for that. I used this script for connecting and loading the data. I had to make some edits to it though to use windows authentication.
ssas_api._load_assemblies() #this uses Windows Authentication
conn = ssas_api.set_conn_string(server='server name',db_name='db name')
df = ssas_api.get_DAX(connection_string=conn, dax_string=query))
The dax_string parameter is what accepts the dax or mdx query to pull from the cube.
Please try this optimization:
WITH
MEMBER [Measures].[IteractionCount] AS
SUM
(
[DimInteraction].[InteractionId].[InteractionId].Members
* [DimInteraction].[Interaction State].&[Enabled],
IIF(
IsEmpty([Measures].[Impression Count]),
Null,
1
)
)
SELECT
(
{[Measures].[IteractionCount],
[Measures].[Impression Count]}
)
ON COLUMNS,
(
([DimCampaign].[CampaignId].[CampaignId].MEMBERS,
[DimCampaign].[Campaign Name].[Campaign Name].MEMBERS,
[DimCampaign].[Owner].[Owner].MEMBERS)
,[DimDate].[date].[date].MEMBERS
)
PROPERTIES MEMBER_CAPTION
ON ROWS
FROM
(
SELECT
(
{[DimDate].[date].&[2020-05-06T00:00:00] : [DimDate].[date].&[2020-05-27T00:00:00]}
)
ON COLUMNS
FROM [Model]
)
WHERE
(
{[DimCampaign].[EndDate].&[2020-05-27T00:00:00]:NULL},
[DimCampaign].[Campaign State].&[Active],
{[DimInteraction].[End Date].&[2020-05-27T00:00:00]:NULL}//,
//[DimInteraction].[Interaction State].&[Enabled]
)
CELL PROPERTIES VALUE
If that doesn’t perform well the please describe the number of rows returned by your query when you comment out IteractionCount (sic) from the columns axis. And please describe how many unique InteractionId values you have.

How To Get All Items Created or Still Open For A Given Time

I am working with a system were items are created (postDate dimension) and closed (endDate dimension). The endDate column is always populated with the last time the item was seen. An item is considered closed in a certain time if its last seen date is before the date you are querying. Each row in the fact table has the item, postDate, endDate, locationID, and some other dimensions used for aggregations. What I am trying to accomplish is getting all items still active for a given time frame. For example I want to know all items posted in November 2008 or before November 2008 that has not yet closed. In SQL it would look something like:
SELECT C.geoCountyArea,TM.CalendarYear,COUNT(DISTINCT a.itemid)
FROM [dbo].[factTable] a
JOIN dbo.dimDate AS TM
ON TM.DateKey BETWEEN postDate AND endDate
JOIN [dbo].[dim_geography] C
ON A.geographyID=C.geographyID
WHERE C.geoCountyArea = '1204000057'
AND TM.CalendarYear = 2008 AND TM.MonthNumberOfYear = 11
GROUP BY C.geoCountyArea,TM.CalendarYear
ORDER BY C.geoCountyArea,TM.CalendarYear
This returns 27,715 which is expected. Now, in MDX this looks like:
WITH MEMBER Measures.[itemCount] AS
AGGREGATE(
{NULL:[PostDate].[Month Name].&[2008]&[11]} * {[EndDate].[Month Name].&[2008]&[11]:NULL},
[Measures].[Fact_itemCount]
)
SELECT NON EMPTY (
Measures.[itemCount]
) ON 0,
NON EMPTY (
{[PostDate].[Month Name].&[2008]&[11]},
{[Geography].[Geo County Area].&[1204000057]}
)ON 1
FROM [Cube];
This returns 27,717 - which is 2 more than the SQL version that could be due to items with no end Date posted. Now, the complication comes when I want to get more than one explicit time - for example item count for all months in 2008 or item count for all years. I looked up methods to link a given param to another one via roll playing dimensions and came across this link. I altered my script so it looks like:
WITH MEMBER Measures.[itemCount] AS
AGGREGATE(
{NULL:LINKMEMBER([DATE].[Calendar].CURRENTMEMBER
,[PostDate].[Calendar])}
* {LINKMEMBER([DATE].[Calendar].CURRENTMEMBER
, [EndDate].[Calendar]):NULL}
, [Measures].[Fact_itemCount]
)
SELECT {Measures.[jobCount]} ON 0,
NON EMPTY (
{[DATE].[Month Name].&[2008]&[11]},
{[Geography].[Geo County Area].&[1204000057]}
)ON 1
FROM [Cube];
This, however, returns only the items created in November 2008 - value of 14,884. If I add in other months I do get individual counts for each month but, again, these are just the items created in those months.
How do I get the "active" item count for a given month/year/quarter without having do explicitly declare the time values in the AGGREGATE?
Can you use NonEmpty?
WITH MEMBER Measures.[itemCount] AS
AGGREGATE(
{NULL:
NONEMPTY(
[PostDate].[Month Name].MEMBERS //<<AMEND TO EXACT STRUCTURE USED IN YOUR CUBE
,[DATE].[Calendar].CURRENTMEMBER
).ITEM(0).ITEM(0)}
* {NONEMPTY(
[EndDate].[Month Name].MEMBERS //<<AMEND TO EXACT STRUCTURE USED IN YOUR CUBE
,[DATE].[Calendar].CURRENTMEMBER
).ITEM(0).ITEM(0): NULL}
, [Measures].[Fact_itemCount]
)
...
This ended up being the solution that provided valid results (tested against SQL calls against the warehouse tables):
WITH MEMBER Measures.[itemCount] AS
AGGREGATE(
{NULL:LINKMEMBER([Post Date].[Calendar],
[Post Date].[Calendar])}
* {LINKMEMBER([Post Date].[Calendar],
[End Date].[Calendar]):NULL},
[Measures].[Fact_itemCount]
)
SELECT {Measures.[itemCount]} ON 0,
NON EMPTY (
{[Post Date].[Month Name].Children},
{[Geography].[Geo County Area].&[1204000057]}
)
FROM [Cube]
Not that I am doing LINKMEMBER against the post and end dates - not against the global Date measure.

Calculated SSAS Member based on multiple dimension attributes

I'm attempting to create a new Calculated Measure that is based on 2 different attributes. I can query the data directly to see that the values are there, but when I create the Calculated Member, it always returns null.
Here is what I have so far:
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS sum
(
Filter([Expense].MEMBERS, [Expense].[Amount Category] = "OS"
AND ([Expense].[Account Number] >= 51000
AND [Expense].[Account Number] < 52000))
,
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
Ultimately, I need to repeat this same pattern many times. A particular accounting "type" (Absorption, Selling & Marketing, Adminstrative, R&D, etc.) is based on a combination of the Category and a range of Account Numbers.
I've tried several combinations of Sum, Aggregate, Filter, IIF, etc. with no luck, the value is always null.
However, if I don't use Filter and just create a Tuple with 2 values, it does give me the data I'd expect, like this:
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS sum
(
{( [Expense].[Amount Category].&[OS], [Expense].[Account Number].&[51400] )}
,
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
But, I need to specify multiple account numbers, not just one.
In general, you should only use the FILTER function when you need to filter your fact table based on the value of some measure (for instance, all Sales Orders where Sales Amount > 10.000). It is not intended to filter members based on dimension properties (although it could probably work, but the performance would likely suffer).
If you want to filter by members of one or more dimension attributes, use tuples and sets to express the filtering:
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS
Sum(
{[Expense].[Account Number].&[51000]:[Expense].[Account Number].&[52000].lag(1)} *
[Expense].[Amount Category].&[OS],
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
Here, I've used the range operator : to construct a set consisting of all [Account Number] members greater than or equal to 51000 and less than 52000. I then cross-join * this set with the relevant [Amount Category] attribute, to get the relevant set of members that I want to sum my measure over.
Note that this only works if you actually have a member with the account number 51000 and 52000 in your Expense dimension (see comments).
An entirely different approach, would be to perform this logic in your ETL process. For example you could have a table of account-number ranges that map to a particular accounting type (Absorption, Selling & Marketing, etc.). You could then add a new attribute to your Expense-dimension, holding the accounting type for each account, and populate it using dynamic SQL and the aforementioned mapping table.
I don't go near cube scripts but do you not need to create some context via the currentmember function and also return some values for correct evaluation against the inequality operators (e.g.>) via the use of say the membervalue function ?
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS sum
(
[Expense].[Amount Category].&[OS]
*
Filter(
[Expense].[Account Number].MEMBERS,
[Expense].[Account Number].currentmember.membervalue >= 51000
AND
[Expense].[Account Number].currentmember.membervalue < 52000
)
,
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
EDIT
Dan has used the range operator :. Please make sure your hierarchy is ordered correctly and that the members you use with this operator actually exist. If they do not exist then they will be evaluated as null:
Against the AdvWks cube:
SELECT
{} ON 0
,{
[Date].[Calendar].[Month].&[2008]&[4]
:
[Date].[Calendar].[Month].&[2009]&[2]
} ON 1
FROM [Adventure Works];
Returns the following:
If the left hand member does not exist in the cube then it is evaluated as null and therefore open ended on that side:
SELECT
{} ON 0
,{
[Date].[Calendar].[Month].&[2008]&[4]
:
[Date].[Calendar].[Month].&[1066]&[2] //<<year 1066 obviously not in our cube
} ON 1
FROM [Adventure Works];
Returns:

How to filter a measure based on another measure

I have a dimension table called DimUser that has 1 row per user. I have a measure based on this dimension called "USER COUNT" which is count of rows of the DimUser table. I also have a calculated measure called "ACTIVE DAYS" which returns no of days a user is active.
Now, I want to create another calculated measure based on this 2 measures, where I count only users who are active more than 5 days. I have total 5 users in my user table and 2 users in my fact table who are active more than 5 days. My MDX expression should return 2.
This is what I wrote
FILTER([Measures].[USER COUNT], [Measures].[ACTIVE DAYS] > 5)
But this gives me 5 as the answer instead of 2. What am I doing wrong?
Next I tried this, but this fails compilation saying two measures in the expression are not allowed
([Measures].[USER COUNT], [Measures].[ACTIVE DAYS] > 5)
Next, I tried creating a new calculated member called [IsActive] that returns true or false if active days > 5 and then use that as filter on the [Measures].[USER COUNT].
How do I go about doing this?
First of all, I don't think it is even possible to filter one measure on another measure, because measures are numeric values, not a set. The concept of filtering applies to a set. So filtering a measure based on another measure is absurd. What can be done though, in terms of MDX, is to create a set of users with the first filter applied(i.e. [Measures].[ACTIVE DAYS] > 5) and then getting a COUNT out of it.
Something like this(NOT TESTED):
WITH SET [ActiveUsersMoreThan5Days] AS
FILTER([User].[UserId].CHILDREN, [Measures].[ACTIVE DAYS] > 5)
//Assuming the User dimension and hierarchy here..
//Fill it in with actual..
MEMBER [Measures].CountActiveUsersMoreThan5Days AS
COUNT(NONEMPTY([ActiveUsersMoreThan5Days], [Measures].[<<Some other measure>>]))
SELECT [Measures].CountActiveUsersMoreThan5Days on 0
//,[Dimension].Hierarchy.CHILDREN ON 1
FROM [Cube]
To avoid using Filter which is iterative and not the best performing function. Also to create a measure that is context aware then the following might help:
WITH
MEMBER [Measures].[CntUsersActive5DaysOrMore] AS
Sum
(
[UserId].[UserId] //<<< or [UserId].[UserId].[UserId]
,IIF
(
[Measures].[ACTIVE DAYS] > 5
,1
,0
)
)
SELECT
[Measures].[CntUsersActive5DaysOrMore] ON 0
FROM [YourCube];

MDX - Count of Filtered CROSSJOIN - Performance Issues

BACKGROUND: I've been using MDX for a bit but I am by no means an expert at it - looking for some performance help. I'm working on a set of "Number of Stores Authorized / In-Stock / Selling / Etc" calculated measures (MDX) in a SQL Server Analysis Services 2012 Cube. I had these calculations performing well originally, but discovered that they weren't aggregating across my product hierarchy the way I needed them to. The two hierarchies predominantly used in this report are Business -> Item and Division -> Store.
For example, in the original MDX calcs the Stores In-Stock measure would perform correctly at the "Item" level but wouldn't roll up a proper sum to the "Business" level above it. At the business level, we want to see the total number of store/product combinations in-stock, not a distinct or MAX value as it appeared to do originally.
ORIGINAL QUERY RESULTS: Here's an example of it NOT working correctly (imagine this is an Excel Pivot Table):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 2,416 2,392 99.01%
[-] Business Two 2,377 2,108 93.39%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 84.06%
-Item 3 2,377 2,108 88.68%
-Item N ... ... ...
FIXED QUERY RESULTS: After much trial and error, I switched to using a filtered count of a CROSSJOIN() of the two hierarchies using the DESCENDANTS() function, which yielded the correct numbers (below):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 215,644 149,301 93.90%
[-] Business Two 86,898 55,532 83.02%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 99.31%
-Item 3 2,377 2,108 99.11%
-Item N ... ... ...
QUERY THAT NEEDS HELP: Here is the "new" query that yields the results above:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
FILTER(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
)
),
[Measures].[Inventory Qty] > 0
)
),
FORMAT_STRING = "#,#",
NON_EMPTY_BEHAVIOR = { [Inventory Qty] },
This query syntax is used in a bunch of other "Number of Stores Selling / Out of Stock / Etc."-type calculated measures in the cube, with only a variation to the [Inventory Qty] condition at the bottom or by chaining additional conditions.
In its current condition, this query can take 2-3 minutes to run which is way too long for the audience of this reporting. Can anyone think of a way to reduce the query load or help me rewrite this to be more efficient?
Thank you!
UPDATE 2/24/2014: We solved this issue by bypassing a lot of the MDX involved and adding flag values to our named query in the DSV.
For example, instead of doing a filter command in the MDX code for "number of stores selling" - we simply added this to the fact table named query...
CASE WHEN [Sales Qty] > 0
THEN 1
ELSE NULL
END AS [Flag_Selling]
...then we simply aggregated these measures as LastNonEmpty in the cube. They roll up much faster than the full-on MDX queries.
It should be much faster to model your conditions into the cube, avoiding the slow Filter function:
If there are just a handful of conditions, add an attribute for each of them with two values, one for condition fulfilled, say "cond: yes", and one for condition not fulfilled, say "cond: no". You can define this in a view on the physical fact table, or in the DSV, or you can model it physically. These attributes can be added to the fact table directly, defining a dimension on the same table, or more cleanly as a separate dimension table referenced from the fact table. Then define your measure as
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
),
{ [Flag dim].[cond].[cond: yes] }
)
)
Possibly, you even could define the measure as a standard count measure of the fact table.
In case there are many conditions, it might make sense to add just a single attribute with one value for each condition as a many-to-many relationship. This will be slightly slower, but still faster than the Filter call.
I believe you can avoid the cross join as well as filter completely. Try using this:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS
CASE WHEN [Product].[Item Name].CURRENTMEMBER IS [Product].[Item Name].[All]
THEN
SUM(EXISTS([Product].[Item Name].[Item Name].MEMBERS,[Business].[Business Name].CURRENTMEMBER),
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
))
ELSE
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
)
END
I tried it using a dimension in my cube and using Area-Subsidiary hierarchy.
The case statement handles the situation of viewing data at Business level. Basically, the SUM() across all members of Item Names used in CASE statement calculates values for individual Item Names and then sums up all the values. I believe this is what you needed.