MDX - Count of Filtered CROSSJOIN - Performance Issues

MDX - Count of Filtered CROSSJOIN - Performance Issues - sql

BACKGROUND: I've been using MDX for a bit but I am by no means an expert at it - looking for some performance help. I'm working on a set of "Number of Stores Authorized / In-Stock / Selling / Etc" calculated measures (MDX) in a SQL Server Analysis Services 2012 Cube. I had these calculations performing well originally, but discovered that they weren't aggregating across my product hierarchy the way I needed them to. The two hierarchies predominantly used in this report are Business -> Item and Division -> Store.
For example, in the original MDX calcs the Stores In-Stock measure would perform correctly at the "Item" level but wouldn't roll up a proper sum to the "Business" level above it. At the business level, we want to see the total number of store/product combinations in-stock, not a distinct or MAX value as it appeared to do originally.
ORIGINAL QUERY RESULTS: Here's an example of it NOT working correctly (imagine this is an Excel Pivot Table):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 2,416 2,392 99.01%
[-] Business Two 2,377 2,108 93.39%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 84.06%
-Item 3 2,377 2,108 88.68%
-Item N ... ... ...
FIXED QUERY RESULTS: After much trial and error, I switched to using a filtered count of a CROSSJOIN() of the two hierarchies using the DESCENDANTS() function, which yielded the correct numbers (below):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 215,644 149,301 93.90%
[-] Business Two 86,898 55,532 83.02%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 99.31%
-Item 3 2,377 2,108 99.11%
-Item N ... ... ...
QUERY THAT NEEDS HELP: Here is the "new" query that yields the results above:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
FILTER(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
)
),
[Measures].[Inventory Qty] > 0
)
),
FORMAT_STRING = "#,#",
NON_EMPTY_BEHAVIOR = { [Inventory Qty] },
This query syntax is used in a bunch of other "Number of Stores Selling / Out of Stock / Etc."-type calculated measures in the cube, with only a variation to the [Inventory Qty] condition at the bottom or by chaining additional conditions.
In its current condition, this query can take 2-3 minutes to run which is way too long for the audience of this reporting. Can anyone think of a way to reduce the query load or help me rewrite this to be more efficient?
Thank you!
UPDATE 2/24/2014: We solved this issue by bypassing a lot of the MDX involved and adding flag values to our named query in the DSV.
For example, instead of doing a filter command in the MDX code for "number of stores selling" - we simply added this to the fact table named query...
CASE WHEN [Sales Qty] > 0
THEN 1
ELSE NULL
END AS [Flag_Selling]
...then we simply aggregated these measures as LastNonEmpty in the cube. They roll up much faster than the full-on MDX queries.

It should be much faster to model your conditions into the cube, avoiding the slow Filter function:
If there are just a handful of conditions, add an attribute for each of them with two values, one for condition fulfilled, say "cond: yes", and one for condition not fulfilled, say "cond: no". You can define this in a view on the physical fact table, or in the DSV, or you can model it physically. These attributes can be added to the fact table directly, defining a dimension on the same table, or more cleanly as a separate dimension table referenced from the fact table. Then define your measure as
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
),
{ [Flag dim].[cond].[cond: yes] }
)
)
Possibly, you even could define the measure as a standard count measure of the fact table.
In case there are many conditions, it might make sense to add just a single attribute with one value for each condition as a many-to-many relationship. This will be slightly slower, but still faster than the Filter call.

I believe you can avoid the cross join as well as filter completely. Try using this:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS
CASE WHEN [Product].[Item Name].CURRENTMEMBER IS [Product].[Item Name].[All]
THEN
SUM(EXISTS([Product].[Item Name].[Item Name].MEMBERS,[Business].[Business Name].CURRENTMEMBER),
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
))
ELSE
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
)
END
I tried it using a dimension in my cube and using Area-Subsidiary hierarchy.
The case statement handles the situation of viewing data at Business level. Basically, the SUM() across all members of Item Names used in CASE statement calculates values for individual Item Names and then sums up all the values. I believe this is what you needed.

Related

MDX where clause in subquery does not slice cube - how to understand?

This query gives me sales of one store:
select
[measures].[sales] on 0
from [MyCube]
where [store].[store].[042]
However, if I move the slicer to inside of the subquery, it gives me sales of all stores.
select
[measures].[sales] on 0
from (select
from [MyCube]
where [store].[store].[042]
)
How to understand the mechanisms behind this difference?
This is also noted in this article, but without much explanation.
----EDIT----:
I tried various things and read around for a while. I'd like to add a question: is there a scenario in which the where clause in sub-select does filter the result?
This query gives me sales of all stores in state MI (store [042] belongs to MI):
select
[measures].[sales] on 0
from (select
[store].[state].[MI] on 0
from [myCube]
where [store].[store].[042]
)
Thinking of 'inner query only filters if the filtered dimension is returned on an axis', the theory is proved wrong if I do this:
select
[measures].[sales] on 0
from (select
[store].[state].members on 0
from [myCube]
where [store].[store].[042]
)
The sub-select still returns one state MI, but the outer query returns sales of all stores (of all states).
----EDIT 4/13----:
Re-phrasing the question in AdventureWorks cube with screenshot.
Query 1: sales of one store
Query 2: it returns sales of all stores if where clause is in the sub-select.
Query 3: the two answers I got suggested that we select the dimension in an axis - here is the result - we get all cities.

select
[measures].[sales] on 0
from (select
from [MyCube]
where [store].[store].[042]
)
The above query reduces the scope of stores just to the member [042]. Make note that sub-select is executed before the actual select. So, when it comes to the select, the engine just sees a cube which has all the members in all the dimensions; but only the member [store].[store].[042] in the store dimension. It's as if the cube has been kept intact every where else but sliced off on the Store dimension.
If you go a step ahead and add the store on to one of the axes, like
select
[measures].[sales] on 0,
[store].[store].members on 1
from (select
from [MyCube]
where [store].[store].[042]
)
you would see that although the member [All] appears in the output, it actually is just comprised of only one store.
In essence, the [All] is a special member which is calculated with respect to scope of the cube. It reflects the combined effect of all the members in the cube.
In SQL terms, it is similar to:
select sales, store as [All] from
(select sales, store from tbl where store = '042') tbl
Even though you see Sales----All, it is but a reflection of sales for store [042]

Here are some other good references concerning sub-select and slicer debate:
http://bisherryli.com/2013/02/08/mdx-25-slicer-or-sub-cube/
https://cwebbbi.wordpress.com/2014/04/07/free-video-on-subselects-in-mdx/
Chris Webb's video being located here:
https://projectbotticelli.com/knowledge/what-is-a-subselect-mdx-video-tutorial?pk_campaign=tt2014cwb
This should still leave an All member:
SELECT
[measures].[sales] ON 0
FROM
(
SELECT
FROM [MyCube]
WHERE
[store].[store].[042]
);
...but the member [All] of the Store hierarchy will only now be made up of [store].[store].[042].
You can see this by adding the Store hierarchy onto ROWS:
SELECT
[measures].[sales] ON 0,
[store].MEMBERS ON 1
FROM
(
SELECT
FROM [MyCube]
WHERE
[store].[store].[042]
);
This is the AdvWorks version similar to the reference in your question:
SELECT
{[Measures].[Order Count]} ON 0
,[Subcategory].MEMBERS ON 1
FROM
(
SELECT
{
[Subcategory].[Subcategory].&[22]
} ON 0
FROM [Adventure Works]
);
It returns the member from the sub-select and the All member adjusted to take account of the subselect:
In the references article why is the [All] less than the sum of the other two - this is not down to the subselect but is in connection with the measure that he has chosen [Measures].[Order Count] which is a distinct count. If you take away the subselect you see exactly the same behaviour of the All member being less than the sum of the other subcategory members (I've marked the point at which the total of the parts becomes higher than the All member):
SELECT
{[Measures].[Order Count]} ON 0
,Order
(
[Subcategory].MEMBERS
,[Measures].[Order Count]
,bdesc
) ON 1
FROM [Adventure Works];
Order Count: on 1 order there might be several Product Subcategories - hence this behaviour.
Edit
This query of yours:
select
[measures].[sales] on 0
from (select
[store].[state].members on 0
from TestCube //<< added this!
where [store].[store].[042]
)
This inner script is not valid? Using the same dimension on an axes and the WHERE clause is not valid:
select
[store].[state].members on 0
from TestCube
where [store].[store].[042]
Edit2
An mdx script returns a cube, which may be sliced or not sliced, but nevertheless it returns a cube. The WHERE clause is used to slice the cube that is returned. If we were using a third party tool then the dimension added to the WHERE clause would go into a combobox - with say Cliffside selected. BUT the user could effectively select Ballard from that combobox - it is just a slicer. The WHERE clause is not changing the cube that is returned by the mdx script, it is just affecting what is displayed in the cellset.
WHERE is valid within a subselect. It is part of the definition:
https://msdn.microsoft.com/en-us/library/ff487138.aspx
I've never found a use case for a subselect's WHERE clause.
Edit3
This link will explain things:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/ccb66ac3-0f9a-4261-8ccc-b6ecc51b6f07/is-where-clause-pointless-inside-a-subselect?forum=sqlanalysisservices
As Darren gosbell says in the answer to this question:
https://msdn.microsoft.com/en-us/library/ff487138.aspx it says that:
The WHERE clause does not filter the subspace.

Calculated SSAS Member based on multiple dimension attributes

I'm attempting to create a new Calculated Measure that is based on 2 different attributes. I can query the data directly to see that the values are there, but when I create the Calculated Member, it always returns null.
Here is what I have so far:
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS sum
(
Filter([Expense].MEMBERS, [Expense].[Amount Category] = "OS"
AND ([Expense].[Account Number] >= 51000
AND [Expense].[Account Number] < 52000))
,
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
Ultimately, I need to repeat this same pattern many times. A particular accounting "type" (Absorption, Selling & Marketing, Adminstrative, R&D, etc.) is based on a combination of the Category and a range of Account Numbers.
I've tried several combinations of Sum, Aggregate, Filter, IIF, etc. with no luck, the value is always null.
However, if I don't use Filter and just create a Tuple with 2 values, it does give me the data I'd expect, like this:
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS sum
(
{( [Expense].[Amount Category].&[OS], [Expense].[Account Number].&[51400] )}
,
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
But, I need to specify multiple account numbers, not just one.

In general, you should only use the FILTER function when you need to filter your fact table based on the value of some measure (for instance, all Sales Orders where Sales Amount > 10.000). It is not intended to filter members based on dimension properties (although it could probably work, but the performance would likely suffer).
If you want to filter by members of one or more dimension attributes, use tuples and sets to express the filtering:
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS
Sum(
{[Expense].[Account Number].&[51000]:[Expense].[Account Number].&[52000].lag(1)} *
[Expense].[Amount Category].&[OS],
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
Here, I've used the range operator : to construct a set consisting of all [Account Number] members greater than or equal to 51000 and less than 52000. I then cross-join * this set with the relevant [Amount Category] attribute, to get the relevant set of members that I want to sum my measure over.
Note that this only works if you actually have a member with the account number 51000 and 52000 in your Expense dimension (see comments).
An entirely different approach, would be to perform this logic in your ETL process. For example you could have a table of account-number ranges that map to a particular accounting type (Absorption, Selling & Marketing, etc.). You could then add a new attribute to your Expense-dimension, holding the accounting type for each account, and populate it using dynamic SQL and the aforementioned mapping table.

I don't go near cube scripts but do you not need to create some context via the currentmember function and also return some values for correct evaluation against the inequality operators (e.g.>) via the use of say the membervalue function ?
CREATE MEMBER CURRENTCUBE.[Measures].[Absorption]
AS sum
(
[Expense].[Amount Category].&[OS]
*
Filter(
[Expense].[Account Number].MEMBERS,
[Expense].[Account Number].currentmember.membervalue >= 51000
AND
[Expense].[Account Number].currentmember.membervalue < 52000
)
,
[Measures].[Amount - Expense]
),
VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = 'Expense';
EDIT
Dan has used the range operator :. Please make sure your hierarchy is ordered correctly and that the members you use with this operator actually exist. If they do not exist then they will be evaluated as null:
Against the AdvWks cube:
SELECT
{} ON 0
,{
[Date].[Calendar].[Month].&[2008]&[4]
:
[Date].[Calendar].[Month].&[2009]&[2]
} ON 1
FROM [Adventure Works];
Returns the following:
If the left hand member does not exist in the cube then it is evaluated as null and therefore open ended on that side:
SELECT
{} ON 0
,{
[Date].[Calendar].[Month].&[2008]&[4]
:
[Date].[Calendar].[Month].&[1066]&[2] //<<year 1066 obviously not in our cube
} ON 1
FROM [Adventure Works];
Returns:

SSAS & OLAP cube: twice same measure

I'm not very experienced in OLAP Cube + MDX, and I'm having a hard time trying to use twice the same measure in a cube.
Let's say that we have 3 Dimensions: D_DATE, D_USER, D_TYPE_OF_SALE_TARGET and 3 tables of Fact: F_SALE, F_MEETING, F_SALE_TARGET
F_SALE is linked to D_USER (who make the sale) and D_DATE (when)
F_SALE_TARGET is linked to D_USER, D_DATE, D_TYPE_OF_SALE_TARGET (meaning: user has to reach various goals/targets for a given month).
I can browse my cube:
Rows = Date * User
Cols = Number of sale, Total amount of sale + the value of 1 target (in the WHERE clause, I filter on [Dim TYPE SALE TARGET].[Code].&[code.numberOfSales])
How can I add other columns for other targets? As all the targets are in the same table, I don't see how to add a second measure from [Measures].[Value - F_SALE_TARGET] linked to a different code, ie. [Dim TYPE SALE TARGET].[Code].&[code.amountOfSale].

your question is not clear to me but it seems like one way to accomplish that is by creating Calculated Members. Basically, select you cube in BIDS, go to the Calculations tab and create Calculated Members. You would be able to insert your MDX query there. For each target type you can create a different calculation such as: ([Measures].[Value - F_SALE_TARGET], [Dim TYPE SALE TARGET].[Code].&[code.amountOfSale])

Many to many dimension - MDX help needed

I’m pretty new to the many-to-many dimensions but I have a scenario to solve, which raised a couple of questions that I can’t solve myself… So your help would be highly appreciated!
The scenario is:
There is a parent-child Categories dimension which has a recursive Categories hierarchy with NonLeafDataVisible set
There is a regular Products dimension, that slices the fact table
There is a bridge many-to-many ProductCategory table which defines the relation between the two. Important to note is that a product can belong to any level of the categories hierarchy – i.e. a particular category can have both – directly assigned products and sub-categories.
There is a fact Transactions table that holds a FK to the Product that has been sold, as well as a FK to its category. The FK is needed, because
I have all this modeled in BIDS, the dimension usage is set between each of the dimensions and the facts, the many-to-many relation between the Categories and the Transactions table is in place is in place. In other words everything seems kind of OK..
I now need to write an MDX which I would use to create a report that shows something like that:
Lev1 Lev2 Lev3 Prod Count
-A
-AA 6
-AA 2
P6 1
P5 1
-AAA 2
P1 1
P2 1
-AAB 2
P3 1
P4 1
+BB
The following MDX almost returns what I need:
SELECT
[Measures].[SALES Count] ON COLUMNS,
NONEMPTYCROSSJOIN(
DESCENDANTS([Category].[PARENTCATEGORY].[Level 01].MEMBERS),
[Product].[Prod KEY].[Prod KEY].MEMBERS,
[Measures].[Measures].[Bridge Distinct Count],
[Measures].[SALES Count],
2) ON ROWS
FROM [Sales]
The problem that I have is that for each of the non-leaf categories, the cross join returns a valid intersection with each of the products that’s been sold for it + all subcategories. Hence the result set contains way too much redundant data and besides I can’t find a way to filter out the redundancies in the SSRS report itself.
Any idea on how to rewrite the MDX so that it only returns the result set above?
Another problem is that if I create a role-playing Category dimension which I set to slice directly the transactions data, then the numbers that I get when browsing the cube are completely off… It seems as SSAS is doing something during processing (but it’s not the SQL statements it shoots to the OLTP, as those remain exactly the same) that causes the problem, but I’ve no idea what. Any ideas?
Cheers,
Alex

I think I found a solution to the problem, using the following query:
WITH
MEMBER [Measures].[Visible] AS
IsLeaf([DIM Eco Res Category].[PARENTCATEGORY].CurrentMember)
MEMBER [Measures].[CurrentProd] AS
IIF
(
[Measures].[Visible]
,[DIM Eco Res Product].[Prod KEY].CurrentMember.Name
,""
)
SELECT
{
[Measures].[Visible]
,[Measures].[CurrentProd]
,[Measures].[FACT PRODSALES Count]
} ON COLUMNS
,NonEmptyCrossJoin
(
Descendants
(
[DIM Eco Res Product].[Prod KEY].[(All)],
,Leaves
)
,Descendants([DIM Eco Res Category].[PARENTCATEGORY].[(All)])
,[Measures].[FACT PRODSALES Count]
,2
)
DIMENSION PROPERTIES
MEMBER_CAPTION
,MEMBER_UNIQUE_NAME
,PARENT_UNIQUE_NAME
,LEVEL_NUMBER
ON ROWS
FROM [Sales];
In the report then I use the [Measures].[CurrentProd] as a source for the product column and that seems to work fine so far.

Filtering a Measure (or Removing Outliers)

Say I have a measure, foo, in a cube, and I have a reporting requirement that users want to see the following measures in a report:
total foo
total foo excluding instances where foo > 10
total foo excluding instances where foo > 30
What is the best way to handle this?
In the past, I have added Named Calculations which return NULL if foo > 10 or just foo otherwise.
I feel like there has to be a way to accomplish this in MDX (something like Filter([Measures].[foo], [Measures].[foo] > 10)), but I can't for the life of me figure anything out.
Any ideas?

The trick is that you need to apply the filter on your set, not on your measure.
For example, using the usual Microsoft 'warehouse and sales' demo cube, the following MDX will display the sales for all the stores where sales were greater than $2000.
SELECT Filter([Store].[Stores].[Store].members, [Unit Sales] > 2000) ON COLUMNS,
[Unit Sales] ON ROWS
FROM [Warehouse and Sales]

I met similar problem when use saiku (backend with Mondrain), as I haven't found any clear solution of "add filter on measure", I added it here, and that may be useful for other guy.
In Saiku3.8, you could add filter on UI: "column"->"filter"->"custom", then you may see a Filter MDX Expression.
Let's suppose we want clicks in Ad greater than 1000, then add the following line there:
[Measures].[clicks] > 1000
Save and close, then that filter will be valid for find elem with clicks greater than 1000.
The MDX likes below (suppose dt as dimension and clicks as measure, we want to find dt with clicks more than 1000)
WITH
SET [~ROWS] AS
Filter({[Dt].[dt].[dt].Members}, ([Measures].[clicks] > 1000))
SELECT
NON EMPTY {[Measures].[clicks]} ON COLUMNS,
NON EMPTY [~ROWS] ON ROWS
FROM [OfflineData]

i think you have two choices:
1- Add column to your fact(or view on data source view that is based on fact table)like:
case when unit_Price>2000 then 1
else 0
end as Unit_Price_Uper_Or_Under_10
and add a fictitious Dimension based on this columns value.
and add named query for New Dimension(say Range_Dimension in datasourceview :
select 1 as range
union all
select 0 as range
and after taht you cant used this filter like other dimension and attribute.
SELECT [Store].[Stores].[Store].members ON COLUMNS,
[Unit Sales] ON ROWS
FROM [Warehouse and Sales]
WHERE [Test_Dimension].[Range].&[1]
the problem is for every range you must add When condition and only if the range is static this solution is a good solution.
and for dynamic range it's better to formulate the range (based on disceretizing method )
2- add dimension with granularity near fact table based on fact table
for example if we have fact table with primary key Sale_id.we can add
dimension based on fact table with only one column sale_Id and in dimension Usage tab
we can relate this new dimension and measure group with relation type Fact and
after that in mdx we can use something like :
filter([dim Sale].[Sale Id].[Sale Id].members,[Measures].[Unit Price]>2000)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

MDX - Count of Filtered CROSSJOIN - Performance Issues - sql

Related

MDX where clause in subquery does not slice cube - how to understand?

Calculated SSAS Member based on multiple dimension attributes

SSAS & OLAP cube: twice same measure

Many to many dimension - MDX help needed

Filtering a Measure (or Removing Outliers)

Categories

Resources