How to do an aggregation in MDX similar to a 'sum over partition' statement in sql - mdx

I have a situation where I have a product and a time dimension, with a fact table of sales volume. Over time, various details about the product changes, with the except of the business key for the product. In my flat reporting from the cube, I want to include some aggregration at the 'business key' level, regardless of what other parts of the product dimension are shown.
In sql this would be trivial as something like:
select sum(volume) over (partition by productKey,year) as Total
Regardless of whatever else I had selected, the Total column would be aggregated only on those two fields.
In MDX I have managed to achieve the same result, but it seems like there must be a simpler way.
WITH MEMBER Measures.ProductKeyTotal AS
'SUM(([Product].[ProductKey],[Time].[Year]
,[Product].[Product Name].[Product Name].ALLMEMBERS
,[Volume Type].[Volume Type Id].[Volume Type Id].ALLMEMBERS)
,[Measures].[Volume])'
SELECT {[Measures].[Volume],[Measures].[ProductKeyTotal]} ON COLUMNS,
NONEMPTYCROSSJOIN ([Product].[ProductKey].[ProductKey].ALLMEMBERS
,[Time].[Time].[Year].ALLMEMBERS
,[Product].[Product Name].[Product Name].ALLMEMBERS
,[Volume Type].[Volume Type Id].[Volume Type Id].ALLMEMBERS) ON ROWS
FROM [My Cube]
WHERE ([Product].[Include In Report].&[True])
1) If I don't include the allmembers for the rows I don't want in the calculated member the total is not correct, is there a shortcut to force it to ignore all the dimensions other that what you specify?
Part of the reason I ask is that I need to add a bunch of other calculated members, some of which will be using parameters and if I use the method from the example above I am going to need to duplicate the same stuff in multiple places, and the code will get weighty.

Well, first of all, don't use NonEmptyCrossJoin--it's been deprecated. Use non empty and then the cross join operator (*).
It's important to understand how tuples and tuple sets work to answer your question. Essentially, any dimension not explicitly stated will always get the CurrentMember of a given dimension. Typically, this is DefaultMember, but if you have it set to something else in your query, that will change this up. The reason you have to specify ALLMEMBERS for those dimensions is because it will use CurrentMember, otherwise. You could just use the [All] member in lieu of trying to sum up ALLMEMBERS (especially if they're not flat!), which will give you a bit better performance.
The most performant way to do this is to add another Measure Group to your cube, and then remove the keys that don't apply to the measure from that Measure Group. This way, you get a native calculation for these rather than a run-time calculation (which tend to be slow, especially when you're adding up everything in your cube). Moreover, you can even set up some aggregation design on that Measure Group, and it will be very performant.

Related

Multiple subtotals - Rollup order of fields

I am trying to run a query that aggregates data, groups the results by several different fields, and extract all relevant "SubTotal" permutations. (similar to CUBE() in MSSQL)
When Using Group By Rollup(), I get only permutations according to the order of the Group By fields in the Rollup function.
For example the query below (runs on a public dataset), it returns subtotal by year, or by year and month, or by year, month and medallion... but it doesn't subtotal by medallion.
SELECT
trip_year,
trip_month,
medallion,
SUM(trip_count) AS Sum_trip_count
FROM
[nyc-tlc:yellow.Trips_ByMonth_ByMedallion]
WHERE
medallion IN ("2R76", "8J82", "3B85", "4L79", "5D59", "6H75", "7P60", "8V48", "1H12", "2C69", "2F38", "5Y86", "5j90", "8A75", "8V41", "9J24", "9J55", "1E13", "1J82")
GROUP BY
ROLLUP(trip_year,
trip_month,
medallion)
My question is:
What should I do in order to get all different permutations of "Sub Totals" in a single query results.
Already tried: Union with similar query but with different order, it works, but not elegant (it would require too many unions).
Thanks
You are correct on both counts. In BigQuery, ROLLUP respects the hierarchy treating the listed fields as a strictly ordered list. Their order will not be changed during aggregation.
The CUBE aggregate commonly found in other SQL environments is unordered and in fact aggregates every possible order/subset of its listed fields. At this time, CUBE has not been implemented in BigQuery. The workaround you suggest is also what I would suggest. UNION all result sets from ROLLUP using each permutation of its contained fields. Albeit not ideal, you should get the same results.
In short, UNIONs of several queries with different permutations of ROLLUP fields is the only way to achieve this at the moment. The downsides are as you state that this may be difficult to maintain and can be more expensive in queries.
If you would like to see CUBE implemented in BigQuery, I strongly encourage you to file a feature request on the Big Query public issue tracker. Be sure to include a thorough use case in this request.
UPDATE: To support the feature request filed by the OP, please star it and you'll receive notifications with updates.

SSAS: Two sets specified in the function have different dimensionality

I'm trying to run the following MDX query (I'm newbie in the matter):
WITH MEMBER [Measures].[Not Null SIGNEDDATA] AS IIF( IsEmpty( [Measures].[SIGNEDDATA] ), 0, [Measures].[SIGNEDDATA] )
SELECT
{[Measures].[Not Null SIGNEDDATA]} ON COLUMNS,
{[Cuenta].[818000_001],[Cuenta].[818000_G02]} ON ROWS
FROM [Notas_SIC]
WHERE ([Auditoria].[AUD_NA],[Concepto].[CONCEPTO_NA],[Entidad].[CCB],
[Indicador].[INDICADOR_NA],[Interco].[I_NONE],[Moneda].[COP],
[Tiempo].[2010.01],[Version].[VERSION_NA])
Where 818000_001 is a base member of my 'Cuenta' dimension, and 818000_G02 is a node or aggregation. I receive the following message:
"Two sets specified in the function have different dimensionality"
What am I doing wrong? If I put the query with only base members (many) or only differents aggregations, the result is ok as expected.
Thanks in advance!
By using commas in your WHERE clause, you are trying to create a set out of members from different dimensions. You should use asterisks to make a CROSSJOIN instead.
So replace this:
WHERE ([Auditoria].[AUD_NA],[Concepto].[CONCEPTO_NA],[Entidad].[CCB],
[Indicador].[INDICADOR_NA],[Interco].[I_NONE],[Moneda].[COP],
[Tiempo].[2010.01],[Version].[VERSION_NA])
With this:
WHERE ({[Auditoria].[AUD_NA],[Concepto].[CONCEPTO_NA],[Entidad].[CCB]} *
{[Indicador].[INDICADOR_NA],[Interco].[I_NONE],[Moneda].[COP]} *
{[Tiempo].[2010.01],[Version].[VERSION_NA]})
I added curly braces though I'm not sure if they're absolutely required.
This is maybe your problem:
{[Cuenta].[818000_001],[Cuenta].[818000_G02]} ON ROWS
You have put them as a set but you can only make a set out of members with the same dimensionality. From your description it sounds like [Cuenta].[818000_G02] is being classed as a different hierarchy which is unexpected.
Is [Cuenta].[818000_G02] created using the Aggregate function? Can you swap to using the Sum function? Does the script then work?
(not a great answer - just more questions?)
{[Cuenta].[818000_001],[Cuenta].[818000_G02]}
Your problem probably lies in the above statement. As is clear, they could be from different hierarchies(which is not clear from your example and I come to that below).
Try replacing the above with
{[Cuenta].[Hierarchy1].[818000_001]} * {[Cuenta].[Hierarchy2].[818000_G02]}
where Hierarchy1 and Hierarchy2 are the hierarchies to which these members belong. (If you are not sure, just try {[Cuenta].[818000_001]} * {[Cuenta].[818000_G02]}). I am assuming that the second aggregate members may be built on a different hierarchy and thus the engine is throwing an error.
It is generally a good habit to always use the fully qualified member name because then the SSAS engine has to spend less time in searching for the member.

MDX newbie: How do I filter/except based on an intermediate level which is not unique

I have a dimension ItemSales with a hierarchy of levels like Site->ItemType->Item
ItemType is not unique. Sites can sell the same ItemType.
I want to exclude a certain type from the totals (such as Unknown)
This must be easy but I'm stuck. It seems like Except would work but as far I can work out, except requires me to enumerate every site
Except([ItemSales].[Sites].[ItemType].Members],{[ItemSales].[site1].[unknown],[ItemSales].[site2].[Unknown]})
this also doesn't help if I just want to aggregate at Sites level.
The examples I see for filter focus on numerical filters for measures. Can you filter on the name of a member or whatever we call the key value it gets from the column?
Sorry for asking such an easy question but I'm not getting any less confused the more I read.
Not sure this is the best way to address your problem, but you can do something with the Filter() function using the .name of the members to keep all [ItemType] that are not 'Unknown' :
Filter(
[ItemSales].[Sites].[ItemType].Members,
[ItemSales].[Sites].currentMember.name <> 'Unknown'
)

Is there a way to specify that every cell should be brought back using MDX?

I'm using a query of the form:
Select {
([Measures].[M1], [Time].[2000]),
([Measures].[M2], [Time].[2000]),
([Measures].[M1], [Time].[2001]),
([Measures].[M2], [Time].[2001])
} on 0
From [Cube]
Where
([Some].[OtherStuff])
using Analysis Services 2008. Even though I'm not specifying NON EMPTY or similar, I'm still only getting three of the cells back (one of them is null).
How do I make sure that all the cells are brought back - even null ones?
Additional thoughts:
The query above isn't actually the one I'm running (surprisingly enough:) ). The real one has a couple of hierarchies from the same dimension specified as part of the select, and also as part of the where clause. I'm wondering if it's something to do with that, but I can't think what exactly.
Additional additional thoughts:*
This seems to be an AS2005/8 feature called Auto-Exists. Have a look at the relevant section on this MSDN article.
You're pulling back different dimensions, which doesn't fly too well. Try this instead:
with
member [Time].[Calendar].[CY2000] as
([Time].[Calendar].[2000], [Time].[Fiscal].[All Time])
member [Time].[Calendar].[FY2001] as
([Time].[Calendar].[All Time], [Time].[Fiscal].[2001])
select
{[Time].[Calendar].[CY2000], [Time].[Calendar].[FY2001]}*
{[Measures].[M1], [Measures].[M2]}
on columns
from [Cube]
where
{[Some].[OtherOtherStuff]}
Not an answer but this should help narrow down the problem, a query that replicates the issue in AdventureWorks
SELECT
{ ( [Date].[Calendar].[Date].&[1], [Date].[Fiscal].[Date].&[1129] ) } on 0
FROM [Adventure Works]
I'd expect this to bring back the tuple specified and a Null, but instead no tuples are retrieved.
I think this is because the tuple located on two hierarchies of the same dimension where there is no commonality.

MDX WHERE NOT IN() equivalent in many-to-many dimension

I have a many-to-many dimension in my cube (next to other regular dimensions). When I want to exclude fact rows in my row count measure, I usually do something like the following in MDX
SELECT [Measures].[Row Count] on 0
FROM cube
WHERE ([dimension].[attribute].Children - [dimension].[attribute].&[value])
This might seem more complicated than needed in this simple example, but in this case the WHERE can grow sometimes, also including UNIONs.
So this works for regular dimensions, but now I have a many-to-many dimension. If I perform the trick above it does not produce the desired result, namely I want to exclude all rows that have that specific attribute in the many-to-many dimension.
Actually it does exactly what the MDX asks, namely count all rows, but ignore the specified attribute. Since a row in the fact table can have multiple attributes in a many-to-many dimension, the row will still be counted.
That's not what I need, I need it to explicitly exclude rows that have that dimension attribute value. Also, I might exclude multiple values. So what I need is something similar to T-SQL's WHERE .. NOT IN (...)
I realize that I can just subtract the resulting values from [attribute].all and [attribute].&[value], but that won't work any more when UNIONing multiple WHERE statements.
Anybody got a good idea on how to solve this?
Thanks in advance,
Delta
I have not tested this, but I think you could do this if you had an attribute that was at the same level of granularity as the rows (so probably implemented as a fact relationship).
So if you wanted to count the number of orders that did NOT have a product category of bikes (assuming a M2M relationship between OrderID and Category) then something like the following should work. (you can find more info on the EXISTS function in Books Online)
[Orders].[Order ID].[Order ID].Members
- EXISTS([Orders].[Order ID].[Order ID].Members
, [Product].[Category].&[Bikes]
, "Order Facts")
Although it could be quite slow as this sort of query is forcing the SSAS engine to add up a lot of facts from a low level.
Have you tried the EXCEPT command? It's syntax is like the following:
EXCEPT({the set i want}, {a set of members i dont want})
You could use the Filter function:
SELECT [Measures].[Row Count] on 0
FROM [cube]
WHERE Filter([dimension].[attribute].Children, [dimension].CurrentMember.MemberValue <> value)