Arbitrary Shape Of Sets Error when defining Partition Slicers - ssas

I'm trying to adopt the practice of designing Slicers within my Cube for the associated Partitions. This is a practice I have typically avoided because of my familiarity with the Auto-Slice concept, where creating Slicers is treated more as an option than as proper design.
However, this one error is becoming a total annoyance, and I am considering going back to avoiding Slicers altogether if no one can provide a reasonable solution to the infamous "An arbitrary shape of the sets is not allowed in the current context" error.
I am receiving this error when attempting to process my cube with slicers that use my Calendar Date Hierarchy. Here is an example of one of the Partition Slicers:
{[Calendar Dates].[Calendar Dates].[Calendar Year].&[2007].&[QUARTER NUMBER 2].&[APRIL].&[2007-04-01T00:00:00]
,[Calendar Dates].[Calendar Dates].[Calendar Year].&[2007].&[QUARTER NUMBER 2].&[APRIL].&[2007-04-02T00:00:00]
,[Calendar Dates].[Calendar Dates].[Calendar Year].&[2007].&[QUARTER NUMBER 2].&[APRIL].&[2007-04-03T00:00:00]
,[Calendar Dates].[Calendar Dates].[Calendar Year].&[2007].&[QUARTER NUMBER 2].&[APRIL].&[2007-04-04T00:00:00]
,[Calendar Dates].[Calendar Dates].[Calendar Year].&[2007].&[QUARTER NUMBER 2].&[APRIL].&[2007-04-05T00:00:00]
,[Calendar Dates].[Calendar Dates].[Calendar Year].&[2007].&[QUARTER NUMBER 2].&[APRIL].&[2007-04-06T00:00:00]
,[Calendar Dates].[Calendar Dates].[Calendar Year].&[2007].&[QUARTER NUMBER 2].&[APRIL].&[2007-04-07T00:00:00]}
My first grievance is that I have to manually specify every member of the set because the use of the range (:) operator is prohibited. The Cube I am maintaining is enormous, and just creating the required number of Partitions is an extreme task in itself, so losing the range operator is simply a poor restriction IMHO. I saw that there was a request on Microsoft Connect to correct this design issue, but the last response I noted was that it was too late for SQL 2008 R2, with no mention of any intention to address it in later releases.
Please see: https://connect.microsoft.com/SQLServer/feedback/details/339861/automatically-resolve-arbitrary-shaped-sets-to-subcubes
Getting past my gripe session, I cannot see where or why my defined set creates an arbitrary shape. Furthermore, looking at examples of what constitutes an arbitrary shape of sets, I cannot see any correlation to suggest that my set falls into that category.
What do I need to do to circumvent the problem and avoid the annoying error?
Any advice or suggestions are GREATLY welcomed.

Figured this out.
In all the examples I reviewed of what can potentially cause this error, there was an attempt to define a Slice on a Partition that required a cross join across different dimensions, which caused the error "An arbitrary shape of the sets is not allowed in the current context".
Although my defined set doesn't use an explicit cross join across different dimensions, the elements of the set are a product of a User-Defined Hierarchy, so an implicit cross join is required to produce the Slice.
To eliminate the implicit cross join, I re-created the members of my set at the base attribute level and voilà!
So the revised set is now:
{[Calendar Dates].[Dates].&[2007-04-01T00:00:00]
,[Calendar Dates].[Dates].&[2007-04-02T00:00:00]
,[Calendar Dates].[Dates].&[2007-04-03T00:00:00]
,[Calendar Dates].[Dates].&[2007-04-04T00:00:00]
,[Calendar Dates].[Dates].&[2007-04-05T00:00:00]
,[Calendar Dates].[Dates].&[2007-04-06T00:00:00]
,[Calendar Dates].[Dates].&[2007-04-07T00:00:00]}
And the error is resolved.

Related

PowerPivot DAX MAXX Performance Issue

I am building a data model with PowerPivot for Excel 2013 and need to be able to identify the max number of emails sent per person. The DAX formula below gives me the result I am looking for, but performance is incredibly slow. Is there an alternative that will compute a maximum by group without the performance hit?
Maximum Emails per Constituent:
=MAXX(
    SUMMARIZE(
        'Email Data',
        'Email Data'[person_id],
        "MAX Value", [Emails Sent] / [Unique Count People]
    ),
    [MAX Value]
)
So, without the measure definitions for [Emails Sent] or [Unique Count People], it is not possible to give definitive advice on performance. I'm going to assume they are trivial measures based on their names - note that this is an assumption, and its truth will affect the rest of my post. That said, there is an obvious optimization to start with.
Maximum Emails per Constituent :=
MAXX(
    ADDCOLUMNS(
        VALUES('Email Data'[person_id]),
        "MAX Value", [Emails Sent] / [Unique Count People]
    ),
    [MAX Value]
)
I used ADDCOLUMNS() rather than SUMMARIZE() to calculate the new column. See this post for an explanation of the performance implications.
Additionally, since you're not grouping by multiple columns, there's no need to use SUMMARIZE(). The performance impact of using VALUES() instead should be minimal.
The other question that comes to mind is whether this needs to be a measure. Are you going to be slicing by other dimensions? If not, this becomes a static attribute of a [person_id] which could be calculated during ETL, or in a calculated column.
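If it can indeed be static, a minimal sketch of the calculated-column route might look like this (assuming a separate 'People' dimension table related to 'Email Data'; that table name and the relationship are assumptions, not from the question):

```dax
-- Hypothetical calculated column on a 'People' dimension table.
-- Context transition via CALCULATE evaluates the ratio per person,
-- once, at refresh time rather than at query time.
Emails per Person =
CALCULATE ( [Emails Sent] / [Unique Count People] )

-- The measure then collapses to a cheap aggregate over the stored column.
Maximum Emails per Constituent :=
MAX ( People[Emails per Person] )
```

The trade-off is the usual one: the stored column costs memory and refresh time but makes the query-time work trivial, at the price of not responding to slicers on other dimensions.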
A final note - I've also been assuming that your model is optimal as well. Again, we'd need to see it to make comment on whether you could see performance issues from something you're doing there.

Errors in the OLAP storage engine: The attribute key cannot be found when processing

I know this is mainly a design problem. I've read that there is a workaround for this issue by customising errors at processing time, but I am not happy having to ignore errors; also, the cube processing is scheduled, so ignoring errors is not a good option.
This is part of my cube where the error is thrown.
DimTime
PK (int)
MyMonth (int, Example = 201501, 201502, 201503, etc.)
Other columns...
FactBudget
PK (int)
Month (int, Example = 201501, 201502, 201503, etc.)
Other columns...
The relation in DSV is set as follows.
DimTiempo = DimTime, FactPresupuesto=FactBudget, periodo = MyMonth, PeriodoPresupFK = Month
Just translated for understanding.
The relationship in the cube is as follows:
The cube was built without problem; when processing, the error "The attribute key cannot be found when processing" was thrown.
It was thrown because FactBudget has some Month values (201510, 201511, 201512, for example) which DimTime doesn't, so the integrity is broken.
As mentioned in the answer here, this can be solved in the ETL process. I think I can do nothing to fix the relationship if the fact table has foreign keys that have not been inserted into the dimension.
Note: MyMonth can have values 201501, 201502, 201503, etc. The column is the year and month concatenated. DimTime is loaded incrementally and that column is calculated every day, so at this moment DimTime doesn't have values for 201507 onwards.
Is there a workaround or pattern to handle this kind of relationships?
Thanks for considering my question.
I believe that the process you are following is incorrect: you should set up any time-related dimensions via a degenerate/fact dimension methodology. That is, the time dimension would not really be a true dimension - rather, it is populated through the fact table itself, which contains the time. If you look up degenerate dimensions you'll see what I mean.
Is there a reason why you're incrementally populating DimTime? It certainly isn't the standard way to do it. You need the values you're using in your fact to already exist in the dimensions. I would simply script up a full set of data for DimTime and stop the incremental updates of it.
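For example, a one-off script to fully populate the month keys might look like this (a sketch; the table and column names come from the question, while the date range and the assumption that PK is auto-generated are mine):

```sql
-- Generate one row per month over a fixed range, so every YYYYMM key
-- the fact table can ever reference already exists in DimTime.
;WITH months AS (
    SELECT CAST('2010-01-01' AS date) AS d
    UNION ALL
    SELECT DATEADD(MONTH, 1, d) FROM months WHERE d < '2030-12-01'
)
INSERT INTO DimTime (MyMonth)
SELECT YEAR(d) * 100 + MONTH(d)
FROM months
OPTION (MAXRECURSION 0);  -- the recursive CTE exceeds the default 100 levels
```

With the dimension fully populated up front, the incremental daily calculation (and the key-not-found error) goes away entirely.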
I ran into this issue while trying to process a cube in Halo BI. It seems that some datetime conversion styles are supported by SQL Server but not by Halo BI. This statement does not cause an error:
CAST(CONVERT(char(11),table.[col name],113) AS datetime) AS [col name]
however this does not process without an error:
CAST(CONVERT(char(16),table.[col name],120) AS datetime) AS [col name]
However both of these work in SQL Server Management Studio 2012.
Another cause of this error is due to the cube measures being improperly mapped to the fact table.

OLTP variable required as Dimension and Measure in OLAP?

Scenario
Designing a star schema for an OLAP environment for the Incident Management process. Management requests the ability both to filter on SLA status (breached, achieved, or in progress) and to calculate the percentage of SLAs achieved vs. breached. Reporting will be done in Excel/SSRS through SSAS (Tabular).
Question
I'm reasonably inexperienced in designing for an OLAP environment. I know my idea will work, but I'm concerned it is not the best approach.
My idea:
SLA needs to be both a measure and a dimension.
DimSLA
…
(Nullable bool) Sla Achieved -> Yes=True, No=False, and InProgress=NULL
…
FactIncident
…
(Nullable Integer) Sla Achieved Yes=1,No=0 and In Progress=NULL
…
Then, in SSAS, publish a calculated percentage field which averages FactIncident's Sla Achieved column.
Is this the right/advisable way to do it?
As you describe it, "SLA achieved" should be an attribute, as you want to classify by it, not sum it. The only things you want to sum or aggregate would be other measures (maybe an incident count) under the condition that the "SLA achieved" attribute has certain values like "achieved" or "not achieved". This is the main rule in dimensional design: things you use to classify or break down by are attributes, and things that you calculate are measures. There are a few cases where you need a column for both, but not many.
Do not just use a boolean value. Use a string value easily understood by users, like the texts "SLA achieved", "SLA not achieved", "in progress". This makes the cube much easier for non-technical users to use. If you use this in a dimension table, there would be just three records with the strings, and the fact table would reference them with perhaps a byte foreign key, so the more meaningful texts do not use up millions of bytes.
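Since reporting goes through SSAS Tabular, the requested percentage could then be sketched in DAX along these lines (the table and column names here are assumptions for illustration, not from the question):

```dax
-- DimSlaStatus holds three rows: "SLA achieved", "SLA not achieved", "In progress".
-- Percentage achieved among incidents that are no longer in progress.
SLA Achieved % :=
DIVIDE (
    CALCULATE ( COUNTROWS ( FactIncident ),
                DimSlaStatus[SlaStatus] = "SLA achieved" ),
    CALCULATE ( COUNTROWS ( FactIncident ),
                DimSlaStatus[SlaStatus] <> "In progress" )
)
```

Because the status is a dimension attribute, the same three strings also serve directly as the filter/slicer values management asked for.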

What is the path of a member belonging to multiple hierarchies?

I have a dimension (d_orga) with the following structure : http://dongorath.free.fr/d_orga.png.
As you can see, there is a hierarchy for each parallel branch.
My problem is determining the key path of a member at the l_site level, knowing that each member has a parent in every branch. An example member is: [d_orga].[l_site].&[grp]&[p3]&[e3]&[c3]&[eu]&[DE]&[ber]. This tells me it wants all levels in the order l_grp - l_pol - l_ent - l_com - l_reg - l_cou - l_site for my specific case, but these hierarchies can differ from client to client (this example is our "demo" environment, whereas a client could have different levels, or only 2 hierarchies, etc.). How can I determine the order of the wanted levels without having to hardcode it each time? Does it depend on the creation order of the hierarchies? An alphabetical order I failed to see? Another arcane inner working of SSAS?
It has, in fact, nothing to do with the structure of the dimension. The key path of a member is "simply" the key columns (the KeyColumns property) defined on the attribute. They are ordered when defined, and that is the order that must be used.
In the example of the question, I defined the key columns of the l_site attribute to be, in order, grp_code - pol_code - ent_code - com_code - reg_code - cou_code - site_code, thus, it is the order to be used.
Concerning the problem of client-specific hierarchies: since the definition of the key columns is computed by the application, it can safely be re-computed by that same application.

LastChild simulation in MDX - Multiple hierarchies

I have a SSAS cube in which there is a measure that needs to be allocated via a percentage located in another measure. I have all this set up as a Measure Expression in my "Equity Amount" measure and it works great.
My problem is that this "Equity Amount" measure is actually a snapshot, so I need it to aggregate using the LastChild function. It turns out that you cannot have a measure expression on a semi-additive measure, so I'm trying to fake the LastChild function in MDX.
I've seen a lot of examples on the web, but none of them talk about having multiple hierarchies in the date dimension. I have both "Calendar Year" and "Fiscal Year" hierarchies.
My MDX works for one hierarchy but as soon as I scope for the second hierarchy, the first one gets overwritten. I'm guessing I need to treat both hierarchies in a single statement but am having a real tough time getting it to work.
Here is my MDX for one hierarchy. Can anyone help modify it for multiple hierarchies, or is there any other way to solve my problem?
Scope([Measures].[Equity Value]);
    This = IIF(
        IsLeaf([Calendar].[By Calendar Year].CurrentMember),
        [Measures].[Equity Value],
        ([Calendar].[By Calendar Year].CurrentMember.LastChild, [Measures].[Equity Value])
    );
End Scope;
David,
1) You're using scope as a calculated member. You can get rid of your IIF by declaring the scope over a sub-cube only:
scope ([Measures].[Equity Value],[Calendar].[By Calendar Year].levels(0).... )
This = (the expression)
2) Not sure I understand your problem, but a tuple with two members from two hierarchies of the same dimension can be null by construction. As an example, the first day of your calendar (1/1/2010) and the first day of your fiscal calendar (e.g. 1/6/2010) are not the same day, so the tuple is actually null. Two hierarchies of the same dimension are only different ways of representing the same coordinates (here, days); by declaring a tuple you're taking an intersection.
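For what it's worth, here is a sketch of scoping each hierarchy separately so the second assignment doesn't overwrite the first (the hierarchy and level names are assumptions based on the question, and this is untested against your cube):

```mdx
SCOPE ( [Measures].[Equity Value] );
    // Non-leaf members of the calendar hierarchy take their last child's value.
    SCOPE ( Except ( [Calendar].[By Calendar Year].Members,
                     [Calendar].[By Calendar Year].[Date].Members ) );
        THIS = ( [Calendar].[By Calendar Year].CurrentMember.LastChild,
                 [Measures].[Equity Value] );
    END SCOPE;
    // Same pattern again for the fiscal hierarchy: a separate inner SCOPE
    // rather than a second assignment over the same sub-cube.
    SCOPE ( Except ( [Calendar].[By Fiscal Year].Members,
                     [Calendar].[By Fiscal Year].[Date].Members ) );
        THIS = ( [Calendar].[By Fiscal Year].CurrentMember.LastChild,
                 [Measures].[Equity Value] );
    END SCOPE;
END SCOPE;
```

The idea is that each inner SCOPE restricts the assignment to non-leaf members of its own hierarchy, so the two assignments apply to different sub-cubes instead of competing for the same one.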
Not sure I'm helping you...
Thanks for trying! I understand what you mean (I think). MDX is one tough language!
I ended up doing the allocation in the view that is the source for my measure and keeping the LastChild aggregation function on the measure. In the end, it is much easier and also better for query performance.
Thanks anyway :)