MDX: avg advanced use

I have reached the limit of my basic MDX knowledge on a problem. If anyone has an idea, any help is welcome.
Situation
This is the hierarchy I'd like to deal with. In my fact_table I have a [Measures].[Sales] measure.
[All Management].[TemplateMgt].[CityMgt].[DistricMgt].[StoreMgt]
[All Management].[TMP-00.002].[London].[DistricMgt].[Shoe001]
[All Management].[TMP-00.002].[London].[DistricMgt].[Hat001]
[All Management].[TMP-00.002].[London].[DistricMgt].[Electronic001]
[All Management].[TMP-00.002].[Paris].[DistricMgt].[Shoe001]
[All Management].[TMP-00.002].[Paris].[DistricMgt].[Hat001]
[All Management].[TMP-00.002].[Paris].[DistricMgt].[Electronic001]
[All Management].[TMP-00.002].[Madrid].[DistricMgt].[Shoe001]
[All Management].[TMP-00.002].[Madrid].[DistricMgt].[Hat001]
[All Management].[TMP-00.002].[Madrid].[DistricMgt].[Electronic001]
Problem
For a given CityMgt, I would like to have three values:
[Measures].[Cur]: the sales of each StoreMgt in the given CityMgt (so for Madrid, the values for [Shoe001], [Hat001], [Electronic001])
[Measures].[Avg]: the average sales per StoreMgt across all cities sharing the same TemplateMgt, e.g. AVG([London].[Shoe001], [Paris].[Shoe001], [Madrid].[Shoe001])
[Measures].[Max]: the maximum sales per StoreMgt across all cities sharing the same TemplateMgt, e.g. MAX([London].[Shoe001], [Paris].[Shoe001], [Madrid].[Shoe001])
In other words, I'd like an output with this structure:
  Shoe001   |   Hat001    | Electronic001
-----------------------------------------
CUR|AVG|MAX | CUR|AVG|MAX | CUR|AVG|MAX
-----------------------------------------
What I got so far
WITH MEMBER [Measures].[Cur] AS (...)
MEMBER [Measures].[Avg] AS (...)
MEMBER [Measures].[Max] AS (...)
SELECT {[Measures].[Cur], [Measures].[Avg], [Measures].[Max]} ON COLUMNS,
FILTER(DESCENDANTS([All Management].CurrentMember, [StoreMgt]), [All Management].CurrentMember.Parent.Parent.Name = "Madrid") ON ROWS
FROM [MyCube]
My problem is that I don't know what to put in the Cur/Avg/Max member definitions so that my data is processed per StoreMgt (a kind of GROUP BY).
If anyone can enlighten me, I would appreciate it.
Cordially,

To get the average you can define new hierarchies (attributes if you're on SSAS): one for the country and another for the product type. Once you have them, the statistical calculations are a matter of playing with the CurrentMember and the [All] member.
Another option is SUM(FILTER(..members, condition), value), but this can be very slow.
In general, for this kind of calculation you can use what are called statistical or utility dimensions.

I am not completely sure that the following query will work, but I hope it conveys the idea:
WITH MEMBER [All Management].[Sales_AVG] AS AVG({[All Management].Members},
[Measures].CurrentMember)
MEMBER [All Management].[Sales_MAX] AS MAX({[All Management].Members},
[Measures].CurrentMember)
SELECT {[Measures].[Sales]} ON COLUMNS,
{[All Management].Members, [All Management].[Sales_AVG],
[All Management].[Sales_MAX]} ON ROWS FROM [MYCUBE] WHERE
{DESCENDANTS([All Management].CurrentMember, [StoreMgt])}
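To make the target of the Cur/Avg/Max calculation concrete, here is a minimal Python sketch of the grouping the question asks for. It is not MDX and not a cube query; the sales figures and the `cur_avg_max` helper are invented for illustration, and only the city and store names come from the question.

```python
from statistics import mean

# Made-up sales figures keyed by (city, store); not taken from the question.
sales = {
    ("London", "Shoe001"): 100.0,
    ("Paris", "Shoe001"): 200.0,
    ("Madrid", "Shoe001"): 300.0,
    ("London", "Hat001"): 10.0,
    ("Paris", "Hat001"): 40.0,
    ("Madrid", "Hat001"): 10.0,
}

def cur_avg_max(sales, city):
    """Return {store: (cur, avg, max)} for the stores of the given city."""
    result = {}
    for (c, store), value in sales.items():
        if c != city:
            continue
        # Peers: the same store name in every city under the template.
        peers = [v for (_, s), v in sales.items() if s == store]
        result[store] = (value, mean(peers), max(peers))
    return result

report = cur_avg_max(sales, "Madrid")
print(report)
```

In MDX terms, the `peers` list plays the role of the set of same-named StoreMgt members under the shared TemplateMgt, which is the set the AVG and MAX calculated members would have to iterate over.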

Related

NONEMPTY and CROSSJOIN performance and order in MDX

I was wondering which of the following two queries is more performant?
Query 1:
SELECT NONEMPTY(CROSSJOIN({[Product].[Category].children},
{[Scenario].[Scenario].members}
)
) ON COLUMNS
FROM [Analysis Services Tutorial]
Query 2:
SELECT CROSSJOIN(NONEMPTY({[Product].[Category].children}),
NONEMPTY({[Scenario].[Scenario].members})
) ON COLUMNS
FROM [Analysis Services Tutorial]
I would say Query 2 is more performant/optimized because you first take out all the unnecessary members and then crossjoin them. In the first query you crossjoin everything and then take out the nulls. That is my guess, but I would like somebody to clear this up for me.
Edit 1: In response to comments on an answer
Let's say I add a measure as a second parameter, so it does not fall back to the "default measure". How could the second query return null values? I am specifying a crossjoin between non-empty members, and I really don't see how the two queries can return different results, no matter which dimensions are involved. To me they seem equivalent. What am I not seeing?
Query 1:
SELECT NONEMPTY(CROSSJOIN({[Product].[Category].children},
{[Scenario].[Scenario].members}
), [Total Internet Sales]
) ON COLUMNS
FROM [Analysis Services Tutorial]
Query 2:
SELECT CROSSJOIN(NONEMPTY({[Product].[Category].children},[Total Internet Sales]),
NONEMPTY({[Scenario].[Scenario].members},[Total Internet Sales])
) ON COLUMNS
FROM [Analysis Services Tutorial]
Edit 2
As the answer said, the queries are not the same. I realized this when @GregGalloway presented another scenario.
I made an Excel file with sample data, so maybe someone will find it useful.

They aren't equivalent, since the two queries can return different results. For example, against the real Adventure Works (not some tutorial version) these two queries return different results. Notice that the Clothing/Kentucky column shows null in the second query:
SELECT NONEMPTY(CROSSJOIN({[Product].[Category].children},
{[Customer].[State-Province].[State-Province].Members}
), [Measures].[Internet Sales Amount]
) ON COLUMNS
FROM [Adventure Works]
where [Measures].[Internet Sales Amount]
SELECT CROSSJOIN(NONEMPTY({[Product].[Category].children},[Measures].[Internet Sales Amount]),
NONEMPTY({[Customer].[State-Province].[State-Province].Members},[Measures].[Internet Sales Amount])
) ON COLUMNS
FROM [Adventure Works]
where [Measures].[Internet Sales Amount]
Note that the Scenario dimension doesn't relate to the Internet Sales measure group, I don't think. So that may not be a good example. I chose the Product dimension and the Customer dimension for my example.
As discussed (and as you updated in your question) NonEmpty() should always have a second parameter so it is clear what measure you are doing NonEmpty against. Your query should also mention a measure on one axis or the WHERE clause so that you're not returning some vague "default measure". I've included a WHERE clause with a measure in my examples.
Anyway, to answer your question... assuming the measure is a physical measure or a well optimized calculated measure that runs in block mode I wouldn't be surprised if Query 1 is faster. But it depends on the measure and the size of dimensions and the sparsity of the cube. This question is very theoretical and the two queries don't return equivalent results.
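The difference between the two query shapes can be reproduced outside of a cube. The following Python sketch models sets as lists and the measure as a dictionary of cells; the data is invented for illustration and is not taken from Adventure Works.

```python
from itertools import product

# Hypothetical fact cells: (category, state) -> sales amount.
# Any pair missing from the dictionary is an empty cell.
facts = {
    ("Bikes", "Kentucky"): 500.0,
    ("Bikes", "Ohio"): 250.0,
    ("Clothing", "Ohio"): 40.0,
}
categories = ["Bikes", "Clothing", "Components"]
states = ["Kentucky", "Ohio", "Utah"]

# Query 1 shape: NonEmpty(CrossJoin(A, B), measure).
# Only tuples that actually have data survive.
query1 = [pair for pair in product(categories, states) if pair in facts]

# Query 2 shape: CrossJoin(NonEmpty(A, measure), NonEmpty(B, measure)).
# Each set is filtered on its own, so a member survives if it has data
# anywhere, and the crossjoin can then recombine members into empty tuples.
live_categories = [c for c in categories if any(k[0] == c for k in facts)]
live_states = [s for s in states if any(k[1] == s for k in facts)]
query2 = list(product(live_categories, live_states))

print(query1)
print(query2)
```

Here Clothing has sales somewhere and Kentucky has sales somewhere, so both survive their individual NonEmpty filters, yet the Clothing/Kentucky tuple itself has no data. That is the kind of null column the second query can produce.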

Combining Aggregate Queries

I have Two Tables of related data.
Table 1 - All Trading history by broker
Table 2 - All Trade Breaks (trades which had errors / differences / issues)
I created a Query to Total the number of trades by Broker from Table 1
I created a Query to Total the number of "Breaks" by Broker from Table 2
I then created a Query to combine the two previous Queries and produce some statistics
Example:
Broker    Total Trades    Total Breaks    Break %
Goldman   10              4               40%
Morgan    10              2               20%
Rather than create 3 queries - is there a way to create 1 query which achieves the same result? I want to perform more detailed analysis / reports without inundating the database with tons of individual queries. SQL Code Below
First Query:
SELECT DISTINCTROW [All Breaks].Broker, Sum([All Breaks].TradeCount) AS
[Sum Of TradeCount]
FROM [All Breaks]
GROUP BY [All Breaks].Broker;
Second Query:
SELECT DISTINCTROW [All Trades].Broker, Sum([All Trades].TradeCount) AS
SumOfTradeCount
FROM [All Trades]
GROUP BY [All Trades].Broker;
End result (combining):
SELECT [Broker List].Broker, [All Breaks Query].[Sum Of TradeCount],
[All Trades Query].SumOfTradeCount,
[Sum Of TradeCount]/[SumOfTradeCount] AS Percentage
FROM ([Broker List] INNER JOIN [All Breaks Query]
ON [Broker List].Broker = [All Breaks Query].Broker)
INNER JOIN [All Trades Query]
ON [Broker List].Broker = [All Trades Query].Broker;
Thanks Very Much!

In query design, you can switch to SQL View, where you can freely write SQL. Each saved query then becomes a subquery in a single statement:
SELECT one, two
FROM
(SELECT one, joinfield FROM table1) AS first
INNER JOIN (SELECT two, joinfield FROM table2) AS second
ON first.joinfield = second.joinfield;
Also, see Combining two MS Access queries
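As a runnable illustration of the subquery pattern applied to the trades/breaks case, here is a sketch that uses Python's built-in sqlite3 module instead of Access. The table and column names (`all_trades`, `all_breaks`, `trade_count`) and the rows are assumptions chosen to reproduce the example figures from the question, and SQLite's dialect differs from Access SQL in places (for instance, no DISTINCTROW).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_trades (broker TEXT, trade_count INTEGER);
    CREATE TABLE all_breaks (broker TEXT, trade_count INTEGER);
    INSERT INTO all_trades VALUES ('Goldman', 6), ('Goldman', 4), ('Morgan', 10);
    INSERT INTO all_breaks VALUES ('Goldman', 3), ('Goldman', 1), ('Morgan', 2);
""")

# One statement: each former saved query becomes an aliased subquery.
rows = conn.execute("""
    SELECT t.broker, t.total_trades, b.total_breaks,
           1.0 * b.total_breaks / t.total_trades AS break_pct
    FROM (SELECT broker, SUM(trade_count) AS total_trades
          FROM all_trades GROUP BY broker) AS t
    INNER JOIN (SELECT broker, SUM(trade_count) AS total_breaks
                FROM all_breaks GROUP BY broker) AS b
        ON t.broker = b.broker
    ORDER BY t.broker
""").fetchall()
print(rows)
```

Note that an INNER JOIN drops brokers that have trades but no breaks; if those should appear with zero breaks, a LEFT JOIN from the trades subquery would be the usual alternative.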

MDX - Count of Filtered CROSSJOIN - Performance Issues

BACKGROUND: I've been using MDX for a bit but I am by no means an expert at it - looking for some performance help. I'm working on a set of "Number of Stores Authorized / In-Stock / Selling / Etc" calculated measures (MDX) in a SQL Server Analysis Services 2012 Cube. I had these calculations performing well originally, but discovered that they weren't aggregating across my product hierarchy the way I needed them to. The two hierarchies predominantly used in this report are Business -> Item and Division -> Store.
For example, in the original MDX calcs the Stores In-Stock measure would perform correctly at the "Item" level but wouldn't roll up a proper sum to the "Business" level above it. At the business level, we want to see the total number of store/product combinations in-stock, not a distinct or MAX value as it appeared to do originally.
ORIGINAL QUERY RESULTS: Here's an example of it NOT working correctly (imagine this is an Excel Pivot Table):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 2,416 2,392 99.01%
[-] Business Two 2,377 2,108 93.39%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 84.06%
-Item 3 2,377 2,108 88.68%
-Item N ... ... ...
FIXED QUERY RESULTS: After much trial and error, I switched to using a filtered count of a CROSSJOIN() of the two hierarchies using the DESCENDANTS() function, which yielded the correct numbers (below):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 215,644 149,301 93.90%
[-] Business Two 86,898 55,532 83.02%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 99.31%
-Item 3 2,377 2,108 99.11%
-Item N ... ... ...
QUERY THAT NEEDS HELP: Here is the "new" query that yields the results above:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
FILTER(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
)
),
[Measures].[Inventory Qty] > 0
)
),
FORMAT_STRING = "#,#",
NON_EMPTY_BEHAVIOR = { [Inventory Qty] },
This query syntax is used in a bunch of other "Number of Stores Selling / Out of Stock / Etc."-type calculated measures in the cube, with only a variation to the [Inventory Qty] condition at the bottom or by chaining additional conditions.
In its current condition, this query can take 2-3 minutes to run which is way too long for the audience of this reporting. Can anyone think of a way to reduce the query load or help me rewrite this to be more efficient?
Thank you!
UPDATE 2/24/2014: We solved this issue by bypassing a lot of the MDX involved and adding flag values to our named query in the DSV.
For example, instead of doing a filter command in the MDX code for "number of stores selling" - we simply added this to the fact table named query...
CASE WHEN [Sales Qty] > 0
THEN 1
ELSE NULL
END AS [Flag_Selling]
...then we simply aggregated these measures as LastNonEmpty in the cube. They roll up much faster than the full-on MDX queries.
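The idea behind the flag column can be sketched in Python as follows. The fact rows and the `stores_in_stock` helper are hypothetical, and the sketch only shows the roll-up across products and stores for a single point in time; it does not model how LastNonEmpty behaves along the time dimension.

```python
# Hypothetical fact rows: (business, item, store, inventory_qty).
fact_rows = [
    ("Business Two", "Item 1", "Store A", 5),
    ("Business Two", "Item 1", "Store B", 0),
    ("Business Two", "Item 2", "Store A", 3),
    ("Business Two", "Item 2", "Store B", 7),
]

# Equivalent of the CASE expression: 1 when the condition holds, else None (NULL).
flagged = [(biz, item, store, 1 if qty > 0 else None)
           for biz, item, store, qty in fact_rows]

def stores_in_stock(rows, business, item=None):
    """Sum the flag, skipping NULLs, at item level or rolled up to the business."""
    return sum(flag for biz, it, _, flag in rows
               if biz == business
               and (item is None or it == item)
               and flag is not None)

item_level = stores_in_stock(flagged, "Business Two", "Item 1")
business_level = stores_in_stock(flagged, "Business Two")
print(item_level, business_level)
```

Because the condition is evaluated once per fact row when the data is loaded, the business-level number is a plain sum of store/product combinations, with no per-query FILTER over a crossjoin.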

It should be much faster to model your conditions into the cube, avoiding the slow Filter function:
If there are just a handful of conditions, add an attribute for each of them with two values, one for condition fulfilled, say "cond: yes", and one for condition not fulfilled, say "cond: no". You can define this in a view on the physical fact table, or in the DSV, or you can model it physically. These attributes can be added to the fact table directly, defining a dimension on the same table, or more cleanly as a separate dimension table referenced from the fact table. Then define your measure as
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
),
{ [Flag dim].[cond].[cond: yes] }
)
)
Possibly, you could even define the measure as a standard count measure of the fact table.
In case there are many conditions, it might make sense to add just a single attribute with one value for each condition as a many-to-many relationship. This will be slightly slower, but still faster than the Filter call.

I believe you can avoid the cross join as well as filter completely. Try using this:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS
CASE WHEN [Product].[Item Name].CURRENTMEMBER IS [Product].[Item Name].[All]
THEN
SUM(EXISTS([Product].[Item Name].[Item Name].MEMBERS,[Business].[Business Name].CURRENTMEMBER),
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
))
ELSE
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
)
END
I tried it using a dimension in my cube and using Area-Subsidiary hierarchy.
The case statement handles the situation of viewing data at Business level. Basically, the SUM() across all members of Item Names used in CASE statement calculates values for individual Item Names and then sums up all the values. I believe this is what you needed.

SSAS MDX query, Filter rows by sales people

I am learning how to query Cubes using MDX (SQL Server 2012) queries. I have been presented with a challenge. We have a hierarchy of sales people, a stored procedure returns a table with all sales people working under a manager. I have a classic sales cube where FactSales PK is invoice number and invoice line and a dimension for all our Sales people.
How can I filter the invoices to those whose sales person appears in that table?
Something like this but translated to MDX:
select * from sales where SalesPerson in (select SalesPerson from #salespeople)
The only way I see this could work is by writing the query dynamically and adding each salesperson in a filter, but that is not optimal in my opinion, we can have 200 or 400 people that we want to return sales from.
thanks!

If the dimension containing the sales people contains the hierarchy (who works for whom), you can resolve the challenge without using the stored procedure. Let's say your manager is named "John Doe" and your sales person hierarchy is named [Sales Person].[Sales Person]. Then just use
[Sales Person].[Sales Person].[John Doe].Children
in your query if you want to see sales for the people working directly for John, and you are done. In case you want to see John himself and everybody working for him directly or indirectly, you would use the Descendants function as follows:
Descendants([Sales Person].[Sales Person].[John Doe], 0, SELF_AND_AFTER)
This function has many variants, which are described in the MDX documentation for Descendants.
In the Microsoft sample Adventure Works cube, where a similar hierarchy is called [Employee].[Employees], you could run the following query:
SELECT {
[Measures].[Reseller Sales Amount]
}
*
[Date].[Calendar].[Calendar Year].Members
ON COLUMNS,
Descendants([Employee].[Employees].[Jean E. Trenary], 0, SELF_AND_AFTER)
ON ROWS
FROM [Adventure Works]
to see the sales of employees working directly or indirectly for "Jean E. Trenary".
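For readers unfamiliar with the two set expressions, this small Python sketch mimics what `.Children` and `Descendants(member, 0, SELF_AND_AFTER)` return on a parent-child hierarchy. The reporting lines and the `self_and_after` helper are hypothetical; only the name "John Doe" comes from the answer.

```python
# Hypothetical reporting lines: employee -> direct reports.
reports = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid"],
    "Bob": [],
    "Cid": [],
}

def self_and_after(member, tree):
    """Return the member followed by everyone below it, depth-first."""
    result = [member]
    for child in tree.get(member, []):
        result.extend(self_and_after(child, tree))
    return result

# Analogue of Descendants([John Doe], 0, SELF_AND_AFTER).
team = self_and_after("John Doe", reports)
# Analogue of [John Doe].Children: direct reports only.
children = reports["John Doe"]
print(team)
print(children)
```

The point of the answer is that the cube already knows this tree, so the list of 200 or 400 sales people does not have to be passed into the query one by one.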

Many to many dimension - MDX help needed

I’m pretty new to the many-to-many dimensions but I have a scenario to solve, which raised a couple of questions that I can’t solve myself… So your help would be highly appreciated!
The scenario is:
There is a parent-child Categories dimension which has a recursive Categories hierarchy with NonLeafDataVisible set
There is a regular Products dimension, that slices the fact table
There is a bridge many-to-many ProductCategory table which defines the relation between the two. Important to note is that a product can belong to any level of the categories hierarchy – i.e. a particular category can have both – directly assigned products and sub-categories.
There is a fact Transactions table that holds a FK to the Product that has been sold, as well as a FK to its category. The FK is needed, because
I have all this modeled in BIDS: the dimension usage is set between each of the dimensions and the facts, and the many-to-many relation between the Categories and the Transactions table is in place. In other words, everything seems more or less OK.
I now need to write an MDX which I would use to create a report that shows something like that:
Lev1 Lev2 Lev3 Prod Count
-A
-AA 6
-AA 2
P6 1
P5 1
-AAA 2
P1 1
P2 1
-AAB 2
P3 1
P4 1
+BB
The following MDX almost returns what I need:
SELECT
[Measures].[SALES Count] ON COLUMNS,
NONEMPTYCROSSJOIN(
DESCENDANTS([Category].[PARENTCATEGORY].[Level 01].MEMBERS),
[Product].[Prod KEY].[Prod KEY].MEMBERS,
[Measures].[Measures].[Bridge Distinct Count],
[Measures].[SALES Count],
2) ON ROWS
FROM [Sales]
The problem that I have is that for each of the non-leaf categories, the cross join returns a valid intersection with each of the products that’s been sold for it + all subcategories. Hence the result set contains way too much redundant data and besides I can’t find a way to filter out the redundancies in the SSRS report itself.
Any idea on how to rewrite the MDX so that it only returns the result set above?
Another problem is that if I create a role-playing Category dimension which I set to slice directly the transactions data, then the numbers that I get when browsing the cube are completely off… It seems as SSAS is doing something during processing (but it’s not the SQL statements it shoots to the OLTP, as those remain exactly the same) that causes the problem, but I’ve no idea what. Any ideas?
Cheers,
Alex

I think I found a solution to the problem, using the following query:
WITH
MEMBER [Measures].[Visible] AS
IsLeaf([DIM Eco Res Category].[PARENTCATEGORY].CurrentMember)
MEMBER [Measures].[CurrentProd] AS
IIF
(
[Measures].[Visible]
,[DIM Eco Res Product].[Prod KEY].CurrentMember.Name
,""
)
SELECT
{
[Measures].[Visible]
,[Measures].[CurrentProd]
,[Measures].[FACT PRODSALES Count]
} ON COLUMNS
,NonEmptyCrossJoin
(
Descendants
(
[DIM Eco Res Product].[Prod KEY].[(All)],
,Leaves
)
,Descendants([DIM Eco Res Category].[PARENTCATEGORY].[(All)])
,[Measures].[FACT PRODSALES Count]
,2
)
DIMENSION PROPERTIES
MEMBER_CAPTION
,MEMBER_UNIQUE_NAME
,PARENT_UNIQUE_NAME
,LEVEL_NUMBER
ON ROWS
FROM [Sales];
In the report then I use the [Measures].[CurrentProd] as a source for the product column and that seems to work fine so far.
