In my SSAS project, I created two cubes (A and B). Cube A has 1,000 rows of data and cube B has 20,000,000.
When I process cube A, the processing sometimes fails to complete.
In my view, only 1,000 rows should never cause SSAS to crash,
so maybe cube B is the cause.
Does anyone know how cubes can affect each other during processing?
Thanks,
There is no connection between different cubes except for possible shared dimensions.
Dimensions can be used by several cubes. If Cube A has one or more dimensions in common with Cube B, Cube B may be left in an unprocessed state after Cube A and its dimensions have been reprocessed.
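If that is the situation, one option is to process the shared dimension and then both cubes in a single sequential batch, so that neither cube is left unprocessed afterwards. A minimal XMLA sketch, assuming a database ID of "My Database", a shared dimension ID of "Dim Date", and cube IDs "Cube A" and "Cube B" (substitute your own object IDs):

<Batch>
  <!-- Commands in a Batch without <Parallel> run sequentially: dimension first, then the cubes -->
  <Process>
    <Object>
      <DatabaseID>My Database</DatabaseID>
      <DimensionID>Dim Date</DimensionID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
  <Process>
    <Object>
      <DatabaseID>My Database</DatabaseID>
      <CubeID>Cube A</CubeID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
  <Process>
    <Object>
      <DatabaseID>My Database</DatabaseID>
      <CubeID>Cube B</CubeID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
</Batch>

A full process of the dimension is what invalidates the dependent cubes, so processing the cubes right afterwards in the same batch keeps everything queryable.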
Related
I have Account and Customer dimensions in the cube that are connected to the same measure groups (there are about 15 - 20 measure groups in the cube).
When I run an XMLA command to ProcessUpdate these two dimensions, like this:
<Batch>
  <Parallel>
    <Process>
      <Object>
        <DatabaseID>My Database</DatabaseID>
        <DimensionID>Dim Customer</DimensionID>
      </Object>
      <Type>ProcessUpdate</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>
In the case of the Account dimension it finishes in a couple of minutes, because it doesn't trigger processing of all partitions of all measure groups. But in the case of the Customer dimension it does trigger processing of all partitions of all measure groups, so a ProcessUpdate of that dimension takes longer than a full process of the entire cube.
I am not sure why one dimension triggers all this processing and the other does not. For both dimensions, "Process affected objects" is set to "Do not process". Where should I look, what should I check, and can I somehow prevent this reprocessing from happening?
Thanks!
The documentation for ProcessUpdate states:
Forces a re-read of data and an update of dimension attributes. Flexible aggregations and indexes on related partitions will be dropped.
So ProcessUpdate by itself can cause the aggregations to be dropped.
Specifically, there is an MSDN blog post about the different processing options that details when the aggregations can be dropped:
Depending on the nature of the changes in the dimension table, ProcessUpdate can affect dependent partitions. If only new members were added, then the partitions are not affected. But if members were deleted or if member relationships changed (e.g., a Customer moved from Redmond to Seattle), then some of the aggregation data and bitmap indexes on the partitions are dropped.
Chris Webb, one of the key people in the BI world, has blogged about this as well; specifically, he has the following to say:
The act of clearing the indexes/aggregations also shows up as "partition processing operations" in Profiler
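A common follow-up pattern (not from the quoted posts, just general practice) is to rebuild the dropped aggregations and indexes separately, for example off-peak, with a ProcessIndexes command, which rebuilds them without re-reading the fact data. A minimal sketch, assuming a database ID of "My Database" and a cube ID of "My Cube":

<Batch>
  <!-- ProcessIndexes rebuilds aggregations and bitmap indexes for already-processed partitions -->
  <Process>
    <Object>
      <DatabaseID>My Database</DatabaseID>
      <CubeID>My Cube</CubeID>
    </Object>
    <Type>ProcessIndexes</Type>
  </Process>
</Batch>

You can also target individual measure groups or partitions in the <Object> element if only some of them were affected.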
Can anyone please explain database dimensions and cube dimensions in SSAS in simple terms? I have done my research, but I am not sure I completely understand the difference between the two.
I assume you are developing an Analysis Services Multidimensional model in Visual Studio. That project represents a database. In that database are top level objects like data sources, data source views, cubes and dimensions. So the dimensions you put in the Dimensions folder in Solution Explorer are "database dimensions".
Next, when you construct a cube it contains measure groups and cube dimensions. The bottom left of the first tab of the cube designer lists the cube dimensions. Also the rows on the Dimension Usage tab list cube dimensions.
You can add a database dimension to a cube one or more times. For example, if you have a Date dimension defined as a database dimension then you can add it multiple times to the cube to make Order Date and Ship Date cube dimensions. These multiple copies of the dimension are called "role-playing dimensions".
The database dimension takes up storage space. The cube dimensions don't take up duplicate storage space for the dimension data. But measure groups are indexed by cube dimensions so each additional cube dimension will increase the index size in your cube.
The cube dimension is the role the data plays in your cube. The most frequently used is Date. You'll have multiple dates in your cube, Order Date, Hire Date, Termination Date, etc. Each of these 'role playing' dimensions acts the same and can contain the same attributes, but they differ in how they describe the relationship to the event you are looking at.
The database dimension is the storage for the cube dimension. There is only one Date database dimension and all of your cube Date dimensions will use the exact same copy of the data. All of the attributes used by your cube dimension are contained in the database dimension, even if the various cube role-playing dimensions do not each make use of all of the attributes. Changing data in the database dimension will change the data in all of the cube dimensions when the cube is next processed.
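To make the role-playing idea concrete, here is a rough sketch of how the cube dimension entries look inside the cube's ASSL definition (the IDs "Dim Date", "Order Date" and "Ship Date" are just placeholders, and a real definition carries more elements, such as the cube attribute list):

<Dimensions>
  <!-- Two cube dimensions referencing the same database dimension (role-playing) -->
  <Dimension>
    <ID>Order Date</ID>
    <Name>Order Date</Name>
    <DimensionID>Dim Date</DimensionID>
  </Dimension>
  <Dimension>
    <ID>Ship Date</ID>
    <Name>Ship Date</Name>
    <DimensionID>Dim Date</DimensionID>
  </Dimension>
</Dimensions>

Both cube dimensions point at the same DimensionID, which is why the dimension data itself is stored only once.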
I have an SSAS cube in which one of the dimensions has 5 million records. When I try to view data for that dimension, the report or Excel pivot becomes unwieldy and performance is poor. I can't categorize that particular dimension's data. The only way I can think of to restrict the data is to select the top 10K rows of the dimension that have metric values. Apart from restricting it to the top 10K dimension records, can anyone suggest other possibilities?
Have you set up aggregations? I would venture to guess that the majority of the time being spent getting your data to a viewing point has to do with your measures. If I were you, I would try adding aggregations, or upping the aggregation percentage, to relieve some of the pressure at query time by shifting that workload to the processing time of the dimension/cube.
Generally, people set their aggregation levels at about 30% to start.
If you have done this already, I would think about upgrading the hardware on the server your cube sits on (depending on what you already have).
These are just suggestions, as it could also be an issue in your cube design that is causing the lengthy runtime.
I would suggest creating a hierarchy for showing the 5 million records: group by a substring at Level 1 (and, if required, a few more characters at Level 2), with the data falling under that group. For example:
Level 1    Value
A          Apple
A          Ant
This way you won't be showing all 5 million records at once, and it also becomes much more effective to use aggregations.
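If it helps, here is a rough ASSL sketch of such a user hierarchy inside the dimension definition. It assumes you have already added a grouping attribute called "Initial" (for example, from a named calculation that takes the first letter of the name) alongside a "Customer" key attribute; all the names here are placeholders:

<Hierarchies>
  <Hierarchy>
    <ID>Customers By Initial</ID>
    <Name>Customers By Initial</Name>
    <Levels>
      <!-- Level 1: the grouping attribute, e.g. the first letter -->
      <Level>
        <ID>Initial</ID>
        <Name>Initial</Name>
        <SourceAttributeID>Initial</SourceAttributeID>
      </Level>
      <!-- Level 2: the detail members under each group -->
      <Level>
        <ID>Customer</ID>
        <Name>Customer</Name>
        <SourceAttributeID>Customer</SourceAttributeID>
      </Level>
    </Levels>
  </Hierarchy>
</Hierarchies>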
I have a date dimension that has identical attributes in several cubes.
How should I set this up?
Have the dimension repeated in each cube?
Make it a linked dimension from one cube to all the other cubes?
Make a standalone cube with just the date dimension and then have all the other cubes link to that one instance?
Something else?
If the cubes are in the same database, you should just be able to add the dimension to each one. Do you have a single database holding all the cubes, or do they live in different databases?
I have a shared dimension in SQL Server Analysis Services 2000. The dimension is used in two cubes and is marked as a non-changing dimension.
The names of some members in the database have changed; however, no structural changes have been made. I process the dimension incrementally, but the member names stored in the dimension don't change.
Is fully processing the dimension my only option? I would rather not, because it would require reprocessing two very large cubes (30 GB of cube data and 300 GB of raw fact data).
Any suggestions?
EDIT: I have a similar solution working on SSAS 2005, and there the ProcessUpdate option works; that is, the member names do change.