MDX where clause in subquery does not slice cube - how to understand? - ssas

This query gives me sales of one store:
select
[measures].[sales] on 0
from [MyCube]
where [store].[store].[042]
However, if I move the slicer to inside of the subquery, it gives me sales of all stores.
select
[measures].[sales] on 0
from (select
from [MyCube]
where [store].[store].[042]
)
How to understand the mechanisms behind this difference?
This is also noted in this article, but without much explanation.
----EDIT----:
I tried various things and read around for a while. I'd like to add a question: is there a scenario in which the where clause in sub-select does filter the result?
This query gives me sales of all stores in state MI (store [042] belongs to MI):
select
[measures].[sales] on 0
from (select
[store].[state].[MI] on 0
from [myCube]
where [store].[store].[042]
)
Thinking of 'inner query only filters if the filtered dimension is returned on an axis', the theory is proved wrong if I do this:
select
[measures].[sales] on 0
from (select
[store].[state].members on 0
from [myCube]
where [store].[store].[042]
)
The sub-select still returns one state MI, but the outer query returns sales of all stores (of all states).
----EDIT 4/13----:
Re-phrasing the question in AdventureWorks cube with screenshot.
Query 1: sales of one store
Query 2: it returns sales of all stores if where clause is in the sub-select.
Query 3: the two answers I got suggested that we select the dimension in an axis - here is the result - we get all cities.

select
[measures].[sales] on 0
from (select
from [MyCube]
where [store].[store].[042]
)
The above query reduces the scope of stores just to the member [042]. Make note that sub-select is executed before the actual select. So, when it comes to the select, the engine just sees a cube which has all the members in all the dimensions; but only the member [store].[store].[042] in the store dimension. It's as if the cube has been kept intact every where else but sliced off on the Store dimension.
If you go a step ahead and add the store on to one of the axes, like
select
[measures].[sales] on 0,
[store].[store].members on 1
from (select
from [MyCube]
where [store].[store].[042]
)
you would see that although the member [All] appears in the output, it actually is just comprised of only one store.
In essence, the [All] is a special member which is calculated with respect to scope of the cube. It reflects the combined effect of all the members in the cube.
In SQL terms, it is similar to:
select sales, store as [All] from
(select sales, store from tbl where store = '042') tbl
Even though you see Sales----All, it is but a reflection of sales for store [042]

Here are some other good references concerning sub-select and slicer debate:
http://bisherryli.com/2013/02/08/mdx-25-slicer-or-sub-cube/
https://cwebbbi.wordpress.com/2014/04/07/free-video-on-subselects-in-mdx/
Chris Webb's video being located here:
https://projectbotticelli.com/knowledge/what-is-a-subselect-mdx-video-tutorial?pk_campaign=tt2014cwb
This should still leave an All member:
SELECT
[measures].[sales] ON 0
FROM
(
SELECT
FROM [MyCube]
WHERE
[store].[store].[042]
);
...but the member [All] of the Store hierarchy will only now be made up of [store].[store].[042].
You can see this by adding the Store hierarchy onto ROWS:
SELECT
[measures].[sales] ON 0,
[store].MEMBERS ON 1
FROM
(
SELECT
FROM [MyCube]
WHERE
[store].[store].[042]
);
This is the AdvWorks version similar to the reference in your question:
SELECT
{[Measures].[Order Count]} ON 0
,[Subcategory].MEMBERS ON 1
FROM
(
SELECT
{
[Subcategory].[Subcategory].&[22]
} ON 0
FROM [Adventure Works]
);
It returns the member from the sub-select and the All member adjusted to take account of the subselect:
In the references article why is the [All] less than the sum of the other two - this is not down to the subselect but is in connection with the measure that he has chosen [Measures].[Order Count] which is a distinct count. If you take away the subselect you see exactly the same behaviour of the All member being less than the sum of the other subcategory members (I've marked the point at which the total of the parts becomes higher than the All member):
SELECT
{[Measures].[Order Count]} ON 0
,Order
(
[Subcategory].MEMBERS
,[Measures].[Order Count]
,bdesc
) ON 1
FROM [Adventure Works];
Order Count: on 1 order there might be several Product Subcategories - hence this behaviour.
Edit
This query of yours:
select
[measures].[sales] on 0
from (select
[store].[state].members on 0
from TestCube //<< added this!
where [store].[store].[042]
)
This inner script is not valid? Using the same dimension on an axes and the WHERE clause is not valid:
select
[store].[state].members on 0
from TestCube
where [store].[store].[042]
Edit2
An mdx script returns a cube, which may be sliced or not sliced, but nevertheless it returns a cube. The WHERE clause is used to slice the cube that is returned. If we were using a third party tool then the dimension added to the WHERE clause would go into a combobox - with say Cliffside selected. BUT the user could effectively select Ballard from that combobox - it is just a slicer. The WHERE clause is not changing the cube that is returned by the mdx script, it is just affecting what is displayed in the cellset.
WHERE is valid within a subselect. It is part of the definition:
https://msdn.microsoft.com/en-us/library/ff487138.aspx
I've never found a use case for a subselect's WHERE clause.
Edit3
This link will explain things:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/ccb66ac3-0f9a-4261-8ccc-b6ecc51b6f07/is-where-clause-pointless-inside-a-subselect?forum=sqlanalysisservices
As Darren gosbell says in the answer to this question:
https://msdn.microsoft.com/en-us/library/ff487138.aspx it says that:
The WHERE clause does not filter the subspace.

Related

How filter only the children members in MDX?

When I run this mdx query, works fine (get the children members from a hierarchy level):
select {} on columns,
[Dimension].[hierarchy].[level].children on rows
from [Cube]
But, when I add some tuple on rows, doesn't filter filter the children members (shows all the members) :S
select {} on columns,
[Dimension].[hierarchy].[level].children
* [Dimension2].[hierarchy2].[level2].allmembers on rows
from [Cube]
* is a cross join - you will get the Cartesian product of [Dimension].[hierarchy].[level].children and [Dimension2].[hierarchy2].[level2].allmembers because they are different dimensions.
If they were two hierarchies from the same dimension then auto exist behaviour would limit the results e.g. Year2014 crossed with month should just show the months in 2014.
Try using DESCENDANTS function + you might not require NULLs so try the NON EMPTY
SELECT
{} ON COLUMNS,
NON EMPTY
DESCENDANTS(
[Dimension].[hierarchy].[level].[PickAHigherUpMember],
[Dimension].[hierarchy].[PickTheLevelYouWantToDrillTo]
)
*
[Dimension2].[hierarchy2].[level2].allmembers ON ROWS
FROM [Cube]
if you look at the mdx language reference for children, you will also find another example of how to use the function with a hierarchy in stead of a member_expression.
http://msdn.microsoft.com/en-us/library/ms146018.aspx
but it won't work with a hierarchy level.
Maybe the row expression was initialy a hierarchy that you've have changed into a level expression.
in the following a similar working mdx with a hierarchy on rows:
select {} on 0,
[Product].[Model Name].children
*
[Geography].[Country].[All Geographies]
on 1
FROM [Adventure Works
Philip,
I guess you want only those rows where the children have a value on the default measure. In that case you could try the following:
select {} on columns,
Nonempty([Dimension].[hierarchy].[level].children
* [Dimension2].[hierarchy2].[level2].allmembers) on rows
from [Cube]
Now if, for the children, you'd need all the members from Dimension2 then you could try:
select {} on columns,
Nonempty([Dimension].[hierarchy].[level].children, [Dimension2].[hierarchy2].[level2].allmembers)
* [Dimension2].[hierarchy2].[level2].allmembers) on rows
from [Cube]
In the second case the Nonempty function takes a second parameter and the cross join is done with the result of the Nonempty function. For the documentation on Nonempty including the usage of the second parameter see https://learn.microsoft.com/en-us/sql/mdx/nonempty-mdx

MDX - Count of Filtered CROSSJOIN - Performance Issues

BACKGROUND: I've been using MDX for a bit but I am by no means an expert at it - looking for some performance help. I'm working on a set of "Number of Stores Authorized / In-Stock / Selling / Etc" calculated measures (MDX) in a SQL Server Analysis Services 2012 Cube. I had these calculations performing well originally, but discovered that they weren't aggregating across my product hierarchy the way I needed them to. The two hierarchies predominantly used in this report are Business -> Item and Division -> Store.
For example, in the original MDX calcs the Stores In-Stock measure would perform correctly at the "Item" level but wouldn't roll up a proper sum to the "Business" level above it. At the business level, we want to see the total number of store/product combinations in-stock, not a distinct or MAX value as it appeared to do originally.
ORIGINAL QUERY RESULTS: Here's an example of it NOT working correctly (imagine this is an Excel Pivot Table):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 2,416 2,392 99.01%
[-] Business Two 2,377 2,108 93.39%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 84.06%
-Item 3 2,377 2,108 88.68%
-Item N ... ... ...
FIXED QUERY RESULTS: After much trial and error, I switched to using a filtered count of a CROSSJOIN() of the two hierarchies using the DESCENDANTS() function, which yielded the correct numbers (below):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 215,644 149,301 93.90%
[-] Business Two 86,898 55,532 83.02%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 99.31%
-Item 3 2,377 2,108 99.11%
-Item N ... ... ...
QUERY THAT NEEDS HELP: Here is the "new" query that yields the results above:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
FILTER(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
)
),
[Measures].[Inventory Qty] > 0
)
),
FORMAT_STRING = "#,#",
NON_EMPTY_BEHAVIOR = { [Inventory Qty] },
This query syntax is used in a bunch of other "Number of Stores Selling / Out of Stock / Etc."-type calculated measures in the cube, with only a variation to the [Inventory Qty] condition at the bottom or by chaining additional conditions.
In its current condition, this query can take 2-3 minutes to run which is way too long for the audience of this reporting. Can anyone think of a way to reduce the query load or help me rewrite this to be more efficient?
Thank you!
UPDATE 2/24/2014: We solved this issue by bypassing a lot of the MDX involved and adding flag values to our named query in the DSV.
For example, instead of doing a filter command in the MDX code for "number of stores selling" - we simply added this to the fact table named query...
CASE WHEN [Sales Qty] > 0
THEN 1
ELSE NULL
END AS [Flag_Selling]
...then we simply aggregated these measures as LastNonEmpty in the cube. They roll up much faster than the full-on MDX queries.
It should be much faster to model your conditions into the cube, avoiding the slow Filter function:
If there are just a handful of conditions, add an attribute for each of them with two values, one for condition fulfilled, say "cond: yes", and one for condition not fulfilled, say "cond: no". You can define this in a view on the physical fact table, or in the DSV, or you can model it physically. These attributes can be added to the fact table directly, defining a dimension on the same table, or more cleanly as a separate dimension table referenced from the fact table. Then define your measure as
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
),
{ [Flag dim].[cond].[cond: yes] }
)
)
Possibly, you even could define the measure as a standard count measure of the fact table.
In case there are many conditions, it might make sense to add just a single attribute with one value for each condition as a many-to-many relationship. This will be slightly slower, but still faster than the Filter call.
I believe you can avoid the cross join as well as filter completely. Try using this:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS
CASE WHEN [Product].[Item Name].CURRENTMEMBER IS [Product].[Item Name].[All]
THEN
SUM(EXISTS([Product].[Item Name].[Item Name].MEMBERS,[Business].[Business Name].CURRENTMEMBER),
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
))
ELSE
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
)
END
I tried it using a dimension in my cube and using Area-Subsidiary hierarchy.
The case statement handles the situation of viewing data at Business level. Basically, the SUM() across all members of Item Names used in CASE statement calculates values for individual Item Names and then sums up all the values. I believe this is what you needed.

Aggregating MDX query results on existing facts only

I am very new to MDX, so probably I'm missing something very simple.
In my cube, I have a dimension [Asset] and a measure [Visits], calculating (in this case) how many visits an asset has been consumed by. An important thing to note is that not every visit is associated with an asset.
What I need to find out is how many visits there are that consumed at least one asset. I wrote the following query:
SELECT
[Asset].[All] ON COLUMNS,
[Measures].[Visits] ON ROWS
FROM
[Analytics]
But this query just returns the total number of visits in the cube. I tried applying the NON EMPTY modifier to both axes, but that doesn't help.
This query should give you what you expect:
WITH MEMBER [Asset].[Asset Name].[All Assets] AS
AGGREGATE( EXCEPT( [Asset].[Asset Name].MEMBERS, { [Asset].[All] } ) )
SELECT
{ [Asset].[Asset Name].[All Assets] } ON COLUMNS,
[Measures].[Visits] ON ROWS
FROM
[Analytics]
You may need to put {[Asset].[Asset Name].[All]} as second argument of Except if the All member was not excluded.
In the query I create a calculated member [Asset].[Asset Name].[all assets] that should represent all your existing assets. I supposed that your existing assets are all the members of the level [Asset].[Asset Name] but the All member.
You can find more information about the Aggregate function here.
This works as well:
SELECT
[Measures].[Visits] ON 0
FROM
[Analytics]
WHERE
DRILLDOWNLEVEL([Asset].[All])
Update: as well as this:
SELECT
[Measures].[Visits] ON 0
FROM
[Analytics]
WHERE
[Asset].[All].CHILDREN

Why does Order ignore subselect?

I've run into some confusing behavior in Analysis Services 2005 (and 2008 R2) and would appreciate it if someone could explain why this is happening. For the sake of this question, I've reproduced the behavior against the Adventure Works cube.
Given this MDX:
SELECT [Customer].[Education].[(All)].ALLMEMBERS ON COLUMNS,
Order(DrillDownLevel({[Customer].[Customer Geography].[All Customers]}),
([Measures].[Internet Order Count]),
ASC) ON ROWS
FROM (SELECT {[Customer].[Education].&[Partial High School]} ON COLUMNS FROM [Adventure Works])
WHERE [Measures].[Internet Order Count];
The query evaluates with the ordered set on rows:
All Customers: 2, 136
Germany: 269
France: 298
Canada: 304
United Kingdom: 311
United States: 457
Australia: 497
However, if I include the All Member (or defaultmember) for Education in the tuple used in the order statement:
SELECT [Customer].[Education].[(All)].ALLMEMBERS ON COLUMNS,
Order(DrillDownLevel({[Customer].[Customer Geography].[All Customers]}),
([Measures].[Internet Order Count], [Customer].[Education].[All Customers]),
ASC) ON ROWS
FROM (SELECT {[Customer].[Education].&[Partial High School]} ON COLUMNS FROM [Adventure Works])
WHERE [Measures].[Internet Order Count];
Then the the set comes back in a significantly different order:
All Customers: 2, 136
France: 298
Germany: 269
United Kingdom: 311
Canada: 304
Australia: 497
United States: 459
Note that France and Germany are out of order relative to each other. Same with Canada / UK and with USA / Australia. From what I can tell, it's ordering based on the aggregation before the sub-cube is evaluated.
Why does including this member (which should implicitly be in the tuple in the first example?) cause the evaluation of the order statement to look outside of the subcube's visual totals? Filter and TopCount etc functions seem to have the same behavior.
Cheers.
My guess is that since the 2 attributes are related only by key (no direct relationship exists) the resulting crossjoin of the attributes results on the key members being aggregated and this is where the visualtotals is not applied (it's one of the ways how to calculate tho nonvisualtotal with subselects).
I've constructed a query to demonstrate what is happening as a result of that crossjoin:
WITH
MEMBER sortExpr as
AGGREGATE(Descendants([Customer].[Customer Geography].CurrentMember), [Measures].[Internet Order Count])
SELECT
[Customer].[Education].[(All)].ALLMEMBERS
*
{[Measures].[Internet Order Count], sortExpr} ON COLUMNS,
Order
(
DrillDownLevel({[Customer].[Customer Geography].[All Customers]})
,sortExpr
,ASC
) ON ROWS
FROM
(
SELECT
{[Customer].[Education].&[Partial High School]} ON COLUMNS
FROM [Adventure Works]
)
;
EDIT:
Here is another query that shows that the expression works correctly with attribute overwrites once you solve the subselect problem using a query scoped set (well-known solution):
WITH
set sub as
[Customer].[Customer Geography].[Customer]
MEMBER sortExpr2 as
Aggregate(existing sub, [Measures].[Internet Order Count])
SELECT
[Customer].[Education].[(All)].ALLMEMBERS
*
{[Measures].[Internet Order Count], sortExpr2} ON COLUMNS,
Order
(
DrillDownLevel({[Customer].[Customer Geography].[All Customers]})
,
(
-- [Customer].[Customer Geography].CurrentMember,
[Customer].[Education].[All Customers],
sortExpr2
)
,ASC
) ON ROWS
FROM
(
SELECT
{[Customer].[Education].&[Partial High School]} ON COLUMNS
FROM [Adventure Works]
)
;
Thanks to Boyan for tweeting this question.
Regards, Hrvoje
Note: Have a look at http://www.bp-msbi.com/2011/07/mdx-subselects-some-insight/ where I explained the behaviour in more detail.
A great article about subselects is Mosha's 2008 MDX: subselects and CREATE SUBCUBE in non-visual mode
Note what happens when you use a subselect. Implicit Exists and visual totals are applied. The key bit of information here in regards to what you are experiencing is:
2. Applies visual totals to the values of physical measures even within expressions if there are no coordinate overwrites.
In your first query SSAS applies visual totals by default. You can change your query to not do this like this:
SELECT [Customer].[Education].[(All)].ALLMEMBERS ON COLUMNS,
Order(DrillDownLevel({[Customer].[Customer Geography].[All Customers]}),
([Measures].[Internet Order Count]),
ASC) ON ROWS
FROM NON VISUAL (SELECT {[Customer].[Education].&[Partial High School]} ON COLUMNS FROM [Adventure Works])
WHERE [Measures].[Internet Order Count];
The NON VISUAL keyword in a SELECT statement tells SSAS to only apply the implicit Exists, but not the visual totals part. The results of the query will be the same as in the second case, but you will also see the real numbers it is ordering by.
Because you are explicitly overwriting the All member in the second query, SSAS does not apply the visual totals to this expression and orders by the total amounts for all years. However, it still displays the visual totals for the selected on ROWS measure after it evaluates the order on non-visual totals.
I'm not a SSAS specialist but this is related to attribute overwrite.
You're overwriting your sub-select in the Order function evaluation. Note, this behavior is context dependent (axes eval, calculated members and pivot). In you example the context is a function during the evaluation of the axis. The behavior is different from the evaluation of the pivot (once you're axes are known). It's complicated but the way it is.
Note that in icCube and after discussing with some MDX specialist we decided to simplify and not follow this behavior: subselect filter is always applied.

Can I reformulate this MDX query to use sets instead of an "And"?

with member [Measures].[BoughtDispenser] as
Sum(Descendants([Customer].[Customer].CurrentMember, [Customer].[Customer]),
Iif(
(IsEmpty(([Item].[ItemNumber].&[011074], [Measures].[Sale Amount]))
And IsEmpty(([Item].[ItemNumber].&[011069], [Measures].[Sale Amount]))
)
Or IsEmpty([Measures].[Sale Amount]),
0 , 1
)
)
select
{[Measures].[Sale Amount]} on columns,
non empty filter([Customer].[Customer].children, [Measures].[BoughtDispenser])
* {[Item].[ItemNumber].members}
on rows
from [Sales]
where [EnteredDate].[Quarter].&[2010-01-01T00:00:00]
;
The object is to show all the items purchased by customers who also bought either of the two dispensers (011069 and 011074).
I based the calculated member on a query I found to do basket analysis. I feel like there should be a way to write it with the set {[Item].[ItemNumber].&[011074], [Item].[ItemNumber].&[011069]} instead of the two IsEmpty tests. Everything I've tried ended up having every Customer in the result.
My environment is SQL Server Analysis Services 2005.
Yes I can! It just required a slightly different approach to the calculated member:
with member [Measures].[BoughtDispenser] as
Sum(Descendants([Customer].[Customer].CurrentMember, [Customer].[Customer])
* {[Item].[ItemNumber].&[011069], [Item].[ItemNumber].&[011074]},
[Measures].[Quantity Shipped]
)
select
{[Measures].[Sale Amount]} on columns,
non empty filter([Customer].[Customer].children, [Measures].[BoughtDispenser])
* {[Item].[ItemNumber].members}
on rows
from [Sales]
where [EnteredDate].[Quarter].&[2010-01-01T00:00:00]
;