I have a rigid relationship defined between two attributes in a dimension. This reflects a business case: we expect that the "parent" attribute will never change. However, we are seeing an intermittent problem where, during a ProcessUpdate of the dimension, the query for the parent attribute is executed first and, before the query for the child attribute is executed, a record is inserted into the underlying database. When the child attribute query then runs, it reads data that wasn't present when the parent attribute was processed, so the new child member is presumably assigned the unknown member as its parent. During the next ProcessUpdate of that dimension, the parent attribute picks up the new data, the child's parent is no longer the "unknown" member but another valid member, and an error is thrown stating that the rigid relationship was violated.
What actions can be taken here?
1. Remove the rigid relationship -- but if rigid relationships are supposed to be defined by business cases, and we have a valid business case, is this just a design flaw in SSAS?
2. Arrange the order in which the attributes are processed -- if the child attribute were processed before the parent attribute, we wouldn't be encountering this issue. Is arranging the processing order of attributes even possible in SSAS?
3. Do a full process on the dimension -- we have other dimensions with rigid relationships; should we set them all to ProcessFull? If so, then, to keep rigid relationships, why even have the other processing options?
Are there other options to consider, like maybe changing the error configuration, or something?
Please let me know what you think would be the best approach.
Thanks,
Greg
I would recommend going with option #2 (arrange the processing order) using the 'sequential' (transaction mode) processing option. You might also want to run the 'impact analysis' to verify object dependencies.
Imagine I have a Parent.hasMany(Child) relationship. If I have an API to query a Parent but also need to surface how many children this parent has, I have two immediate options:
1. Run a query on COUNT(child.id) (I feel this must be very hard to scale as we add more and more children for a given Parent).
2. Define an n_count attribute on the Parent and use a SQL transaction to modify the count on the parent every time a Child is created or deleted.
Which is the better option here, or is there a third and best way?
Storing n_count in the parent is generally considered undesirable because it is redundant information (it can be obtained more reliably by counting the child records). Having said that, if the updates to n_count are controlled so that they are guaranteed to stay in step with the child rows (by database triggers, for example), then this can be called a type of 'controlled de-normalisation' and used for performance improvement. It only improves read queries; updates/inserts will be slower, of course.
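To make the trigger idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, trigger, and function names are assumptions made for illustration, and the triggers only cover inserting and deleting children (moving a child to a different parent would need a third trigger).

```python
import sqlite3

def make_db():
    """Create an in-memory database with a trigger-maintained n_count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parent (
            id INTEGER PRIMARY KEY,
            n_count INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE child (
            id INTEGER PRIMARY KEY,
            parent_id INTEGER NOT NULL REFERENCES parent(id)
        );
        -- An index keeps the plain COUNT(*) query cheap as well.
        CREATE INDEX child_parent_idx ON child(parent_id);

        -- Keep n_count in step with the child rows (assumed trigger names).
        CREATE TRIGGER child_insert AFTER INSERT ON child
        BEGIN
            UPDATE parent SET n_count = n_count + 1 WHERE id = NEW.parent_id;
        END;
        CREATE TRIGGER child_delete AFTER DELETE ON child
        BEGIN
            UPDATE parent SET n_count = n_count - 1 WHERE id = OLD.parent_id;
        END;
    """)
    return conn

def counted(conn, parent_id):
    """Option 1: count the child rows on demand."""
    row = conn.execute(
        "SELECT COUNT(*) FROM child WHERE parent_id = ?", (parent_id,)
    ).fetchone()
    return row[0]

def stored(conn, parent_id):
    """Option 2: read the denormalised counter."""
    row = conn.execute(
        "SELECT n_count FROM parent WHERE id = ?", (parent_id,)
    ).fetchone()
    return row[0]
```

With an index on child.parent_id, the on-demand count is often fast enough, so it may be worth measuring before adding the counter.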
I need to be able to model a single entity that can be in different states. The issue is that in each state the entity has more fields attributed to it.
In my case I have a Match entity which has the states "pending", "ready", "running" and "finished". A pending match holds the date and time the match is to commence and the server it will be hosted on, a ready match holds the teams who will be competing in the match, a running match holds the live scores of the match as it's running, and a finished match holds the time the match finished and the winner.
Summarising, the match entity goes through a sort of pipeline where at different stages, new fields are added to it.
One approach is to have one big Match table with all the fields for all states that get filled in as the match advances a state. This would mean the table would have many null values, but they would all eventually get filled by the time the match is in its final state.
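A minimal sketch of this single-table approach, written against Python's built-in sqlite3 rather than Hibernate to keep it self-contained. The column names, the NEW_FIELDS mapping, and the advance helper are all hypothetical; the idea is only that each transition must supply exactly the fields that its state introduces.

```python
import sqlite3

# States in pipeline order, and the columns each later state fills in.
ORDER = ["pending", "ready", "running", "finished"]
NEW_FIELDS = {
    "ready": ["team_a", "team_b"],
    "running": ["score_a", "score_b"],
    "finished": ["finished_at", "winner"],
}

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE matches (
            id INTEGER PRIMARY KEY,
            state TEXT NOT NULL DEFAULT 'pending',
            starts_at TEXT NOT NULL,
            server TEXT NOT NULL,
            team_a TEXT, team_b TEXT,          -- NULL until 'ready'
            score_a INTEGER, score_b INTEGER,  -- NULL until 'running'
            finished_at TEXT, winner TEXT      -- NULL until 'finished'
        )""")
    return conn

def create_pending(conn, starts_at, server):
    cur = conn.execute(
        "INSERT INTO matches (starts_at, server) VALUES (?, ?)",
        (starts_at, server),
    )
    return cur.lastrowid

def advance(conn, match_id, **fields):
    """Move a match to its next state, filling in that state's columns."""
    (state,) = conn.execute(
        "SELECT state FROM matches WHERE id = ?", (match_id,)
    ).fetchone()
    if state == ORDER[-1]:
        raise ValueError("match is already finished")
    next_state = ORDER[ORDER.index(state) + 1]
    expected = NEW_FIELDS[next_state]
    if set(fields) != set(expected):
        raise ValueError(f"{next_state} requires exactly {expected}")
    # Column names come from NEW_FIELDS, never from user input.
    assignments = ", ".join(f"{col} = ?" for col in expected)
    conn.execute(
        f"UPDATE matches SET state = ?, {assignments} WHERE id = ?",
        [next_state, *(fields[col] for col in expected), match_id],
    )
    return next_state
```

The state column tells readers which of the nullable columns are meaningful, and no row ever has to move between tables.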
Another is to have a table for each of the match's stages (PendingMatch, ReadyMatch, RunningMatch... etc.) with only the fields relevant to each stage. Since I'm using Hibernate, each state would have its own class, so when a match is loaded you can only access the fields relevant to that state. The issue with this method, however, is that I would be repeating columns across tables, and in order to advance the match's state I would have to pop it from its current table and insert it into the next table with the new data for that state.
Neither solution seems ideal, but I can't see any other way of modelling this.
I'm sure this is a common problem, but since the exact description of the problem is such a mouthful, it's very difficult to find anything on it!
If anyone could explain which approach is best or perhaps a totally new approach that would be appreciated.
Thanks in advance!
Yepadee
P.S. I'm using the Hibernate ORM tool to model this.
Using Core Data, I have two entities that have many-to-many relationships. So:
Class A <<---->> Class B
Both relationships are set up as 'ordered' so I can track their order in a UITableView. That works fine, no problem.
I am about to try to implement iCloud with this Core Data model, and have found out that iCloud doesn't support ordered relationships, so I need to reimplement the ordering somehow.
I've done this with another entity that has a one-to-many relationship with no problem: I add an 'order' attribute to the entity and store its order information there. But with a many-to-many relationship I need an unknown number of order attributes.
I can think of two solutions, neither of which seems ideal to me, so maybe I'm missing something:
Option 1. I add an intermediary entity. This entity has a one-to-many relationship with both entities like so:
Class A <<--> Class C <-->> Class B
That means I can have the single order attribute in this helper entity.
Option 2. Instead of an order attribute that stores a single order number, I store a dictionary that holds as many order numbers as I need, probably with the corresponding object (ID?) as the key and the order number as the value.
I'm not necessarily looking for any code so any thoughts or suggestions would be appreciated.
I think your option 1, employing a "join table" with an order attribute is the most feasible solution for this problem. Indeed, this has been done many times in the past. This is exactly the case for which you would use a join table in Core Data although the framework already gives you many-to-many relationships: if you want to store information about the relationship itself, which is precisely your case. Often these are timestamps, in your case it is a sequence number.
You state: "...solutions, neither of which seem ideal to me". To me, the above seems indeed "ideal". I have used this scheme repeatedly with great performance and maintainability.
The only problem (though it is the same as with a to-one relationship) is that when inserting an item out of sequence you have to update many entities to get the order right. That seems cumbersome and could potentially harm performance. In practice, however, it is quite manageable and performs rather well.
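To show what that renumbering involves, here is a rough sketch of the join-entity idea using a plain table in Python's built-in sqlite3 instead of Core Data. The membership table and the insert_at and ordered_bs helpers are assumed names, and only one direction (the order of B objects within an A) is modelled; the reverse ordering would need its own attribute.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    # One row per A-B link; position is the order of b within a.
    conn.execute("""
        CREATE TABLE membership (
            a_id INTEGER NOT NULL,
            b_id INTEGER NOT NULL,
            position INTEGER NOT NULL,
            PRIMARY KEY (a_id, b_id)
        )""")
    return conn

def insert_at(conn, a_id, b_id, position):
    """Link b to a at the given position, shifting later links down by one."""
    with conn:  # commit both statements together, or roll back on error
        conn.execute(
            "UPDATE membership SET position = position + 1 "
            "WHERE a_id = ? AND position >= ?",
            (a_id, position),
        )
        conn.execute(
            "INSERT INTO membership (a_id, b_id, position) VALUES (?, ?, ?)",
            (a_id, b_id, position),
        )

def ordered_bs(conn, a_id):
    """Return the b ids linked to a, in display order."""
    rows = conn.execute(
        "SELECT b_id FROM membership WHERE a_id = ? ORDER BY position",
        (a_id,),
    )
    return [row[0] for row in rows]
```

The shift touches only the links of one A object, which is why the cost stays manageable for table-view-sized lists.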
NB: As for arrays or dictionaries to be stored with the entity to keep track of ordering information: this is possible via so-called "transformable" attributes, but the overhead is daunting. These attributes have to be serialized and deserialized, and in order to retrieve one sequence number you have to get all of them. Hardly an attractive design choice.
For more than 10 years before we had ordered relationships, everyone used a "helper" entity. So that is what you should do.
Additional note 1: This is not really a "helper" entity. It is an entity that models a fact in your model. In my books I always had the same example:
You have a group entity with members. Every member can belong to many groups. The "helper" entity is nothing else than membership.
Additional note 2: It is hard to synchronize such an ordered relationship. This is why it is not done automatically. However, you have to do it. Since CD and synchronizing is no fun, CD and synchronizing a model with ordered relationships is less than no fun.
My query is regarding the setting of the KeyColumn property of a dimension attribute in analysis services (2008). Specifically it boils down to: I have a dimension, there are three attributes which I am currently concerned with: SudoKey, Code and Description.
SudoKey is the most granular, but Code and Description are at the same level, that is to say for every Code member, there is one Description member, and vice versa.
My users want to have access to both individually (some users find codes more efficient, whereas others prefer to work with the descriptions).
I am currently thinking that for efficiency rather than define SudoKey > Code and SudoKey > Description relationships, I should be defining a SudoKey > Code relationship and using Code as the KeyColumn value for Description (with Description for the NameColumn value)... Only I am not confident about what I am doing and success is critical!
Any input would be much appreciated! :)
Edit: What I mean to say is, I don't know if this will work/if it will have the intended effect of reducing the work which Analysis Services has to do.
What you are describing is a typical dimension, and both of the relationships should be to the key column. It would not be more work for SSAS. All attributes in the dimension are potentially viewable and usable by end users, so I don't see why you are trying to change the relationships to the key.
Your dimension key will be the unique attribute, the one that is directly referenced in the fact tables, so if the fact table has SudoKey, use it.
As for browsing, if you configure the dimension relationships correctly, your users will be able to browse the cube by any of the attributes.
You configure the dimension relationships (and this is very important, probably one of the most important configurations you have on the cube) on the second tab of the dimension configuration. In this case you would have your key attribute as the main one and the other two directly related to it.
My application is Core Data based, but there may be a common theory for all relational databases:
I have an Output-Input to-many relationship in my model. There are potentially an unlimited number of links under this relationship for each entity. What is the best way to identify a specific input or output?
The only way I have achieved this so far is to place an intermediate entity in the relationship that can hold an output and input name. Then an entity can cycle through its inputs/outputs to find the right relationship when required. Is there a better way?
Effectively I am trying to provide a generic entity that can have any number of relationships with other generic entities.
Apologies if my description isn't the clearest.
Edit in response to the answer below:
Firstly, thank you for your response. I certainly have a two-way to-many relationship in mind. But if a widget has 2 other widgets linked to its Inputs relationship, what is the best way of determining which input is supplying, say, 'Age' or 'Years Service' when both may have this property but I'm only interested in a specific value from each?
I'm as confused as Joshua - which tells me that it may be that you haven't got a clear picture of what you're trying to achieve or that it is somewhat complex (both?).
My best guess is that you have something like:
Entity Widget
    Attributes:
        identifier
    Relationships:
        outputWidgets <<->> Widget
        inputWidgets <<->> Widget
(where as per standard a ->> is a to-many relationship and <<->> is a to-many relationship with a to-many reverse relationship).
So each widget will be storing the set of widgets that it has as outputs and the set of widgets it has as inputs.
Thus a specific widget maintains a set of inputWidgets and outputWidgets. Each of these relationships is also reversed so you can - for any of the widgets in the input or output - see that your widget is in their list of inputs or outputs.
This is bloody ugly though.
I think your question is how to achieve the above while labelling a relationship. You mention you want to have a string identifier (unique?) for each relationship.
You could do this via:
Where you create a new widgetNamedRelationship for each double sided relationship. Note that I'm assuming that every relationship is double sided.
Then for each widget you have a set of named inputs and named outputs. This also allows for widgets to be attached to themselves, but only if there are separate input and output buses.
So then for your example "age" in your implementation class for Widget instance called aWidget you'd have something like:
// Keep only the input relationships whose name is 'age'.
NSPredicate *agePredicate = [NSPredicate predicateWithFormat:@"name='age'"];
NSSet *ageInputs = [aWidget.inputs filteredSetUsingPredicate:agePredicate];
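Outside Core Data, the same shape can be sketched in a few lines of Python. This is only an illustration of the named-relationship idea under my own assumptions: NamedRelationship, connect, and inputs_named are hypothetical names, and every link is assumed to be double sided.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # compare widgets by identity, like managed objects
class Widget:
    identifier: str
    inputs: list = field(default_factory=list)   # relationships feeding this widget
    outputs: list = field(default_factory=list)  # relationships this widget feeds

@dataclass(eq=False)
class NamedRelationship:
    name: str        # label such as 'age'
    source: Widget   # widget supplying the value
    target: Widget   # widget receiving the value

def connect(source, target, name):
    """Create one labelled link and register it on both ends."""
    rel = NamedRelationship(name, source, target)
    source.outputs.append(rel)
    target.inputs.append(rel)
    return rel

def inputs_named(widget, name):
    """Return the widgets that supply the input with the given label."""
    return [rel.source for rel in widget.inputs if rel.name == name]
```

Because the label lives on the link rather than on either widget, two inputs that both have an 'Age' property can still be told apart by the name of the relationship they arrive through.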
Have I understood the question?
There really is no better way if you want to be able to take full advantage of the conveniences of fast and efficient in-store querying. It's unclear what you're asking in your additional comments, which I suppose is why you haven't gotten any answers yet.
Keep in mind Core Data supports many-to-many relationships without a "join table."
If Widget has many Inputs or Outputs (which I suspect could be the same entity), then a many-to-many, two-way relationship (a relationship with an inverse, in Core Data parlance) between Widget and Input is all you need. Then all you need to do is see if your Input instance is in the Widget instance's -inputs or if a Widget instance is in the Input instance's -widgets.
Is that what you were looking for? If not, please try to clarify your question (by editing it, not by appending comments :-)).