I'm testing a Multidimensional model by using Excel. Simplifying, I've two dimension tables, products and categories, and a sales fact table. There is a relationship between products and categories and between sales and products.
When I analyze this model in Excel and I put as row label products and categories, without putting the sales amount as a measure, it seems it occurs a cartesian product between product values and category values without the corresponding relationship has any effects.
This is for me an undesirable behaviour respect to the user's point of view. An user could want to navigate the structure of the model without initially to select any measures. So if a category is linked to one or more products, the selection of these two dimension tables must show the rigth data combination and not a cartesian products.
Now, how can I solve this issue, please? Thanks
You should consider uniting Products and Categories into one dimension. As you said, Categories are related to Products itself, and facts are linked to Products. By uniting, you can view sales divided by Category-Product hierarchy, and have a tree-like view on Excel filter.
On your [Dim Product] add Category table and draw a relation between tables. Add fields from Category table and create hierarchy Category - Product. Here is a sample of similar design.
Two different dimensions should be used if objects are really independent which is not your case.
Related
I’ve been asked create our analysis cube and have a design question.
We sell ‘widgets’ and ‘parts’ to go with those widgets. Each order has many widgets and sometimes a few parts.
What I’m stuck on is – to me, an order is a fact in a measure. But, what are the widgets? Are they a dimension and each fact in the measure will be an entry for every part and widget for the order.
So, if order 123 had widget 1 and widget 2 and part 5, then there will be 3 facts in the measure for the same order? Is that correct?
At its basic level you can consider most facts to be transactions or transaction line items. So, for example, you may have a 'sales' fact table in which each record represents one line item from that sale. Each fact record would have numeric columns representing metrics and other columns joining to dimension tables. The combination of those dimensions would describe that line item. So, in your case, you likely have something like:
1) A 'date' dimension detailing the date of the transaction
2) A 'widget' dimension detailing the widget sold on that transaction
3) A 'customer' dimension detailing the customer who bought that item (almost certainly the same customer would appear on every line item for this transaction)
4) ... determined by what information you have and what business problem you're trying to solve.
Now, the dimension tables contain further details. For example, your widget dimension table likely contains things like the name of the widget, the color, the manufacturer, etc. Every time your company sells one of these widgets, the record in the fact table links to that same dimension record for that name, color, manufacturer, etc. combination (i.e. you don't create a new dimension record every time you sell the same item - this is a one-to-many relationship - each dimension record may have many related fact records).
You other dimension tables would similarly describe their dimensions. For example, the customer dimension might give the customer's name, their address, ...
So, the short answer to your question is that widget likely is a dimension, items and widgets may (or may not) actually be the same dimension (in a school class I suspect that they are), and that you would have 3 fact records for that one transaction.
This is probably along the same lines as the prior answer but....
If you try and model "many widgets per order" you'll have issues because you end up with a many (order fact) to many (widgets) relationship. In a cube / star schema design, many to many relationship usually need to be moddeled out to be many to one in some way.
So what you do is try and identify what special thing identifies an "order" (as opposed to a bunch of widgets in an order). Usually that is simply stuff like order date, customer, order number, tax
An example way to model this is:
If you have a single order with five widgets, you model that as a fact table with five records that happens to have a repeating widget, customer, date etc. in it
Then you have to work out how you spread an order header tax amount over five records. The two obvious solutions are:
Create a widget that represents tax and add that as another record
Spread the tax over five records, either evenly or weighted by something
Modelling "parts" just takes these concepts further.
It is important to understand what the end user wants to see, why they want to see parts. What do they want to measure by parts, how do you assign higher level values (like tax) down to lower levels like parts.
I have a DimPerson table and a DimPersonDecileOutrigger Table which stores decile data. The way the outrigger is structured is that a customer is given a decile for current year and previous year (if they have bought in the period)- which means a customer might have TY and NOT LY and vice versa. Some customers are both.
In ssis when I picked the columns in dimension structure- I initially only picked columns from DimPerson and not the outrigger. That way in the browser it showed all the id's starting from 1. But when I dragged some columns from outrigger- then in the browser it doesnt show all personID's. I want to see all customers regardless of them having a decile or not.
Pic attached to show what it looks like in dimension structure tab. Also the relationship is between OutriggerID as primary and OutriggerID in person as foreign.
If you just want to solve the problem, you can create a View in your underlying relational database that uses LEFT OUTER JOIN to link the two tables, so that the view will return all rows from DimPerson, even if they don't have a Decile.
Then use the view as the source for your dimension instead of the tables.
I have a problem in my SSAS cube:
There are two fact tables: OrderFact and PaymentFact, when I filter a date, I want to see payments related to filtered date orders. I designed a cube as follows, but I don’t get the desired result, can anyone help me out of this:
You will need to setup a many-to-many Date dimension. Basically you will have two measure groups in the cube. Then on the PaymentFact measure group you will go to the Dimension Usage tab of the cube designer and setup DateDim as a many-to-many relationship type using OrderFact as the intermediate measure group.
For more background about many-to-many dimensions in SSAS, I would highly recommend this whitepaper:
http://www.sqlbi.com/articles/many2many/
The other option is to copy DateKey to PaymentFact in your ETL then make it a regular relationship. If a Payment only relates to one order, then that's feasible. If a payment relates to multiple orders, then use the many-to-many relationship.
I was hoping someone could explain the appropriate use of the 'FACT Relationship Type' under the Dimension Usage tab. Is it simply to create a dimension out of your fact table to access attribute on the fact table itself?
Thanks in advance!
Yes, if your fact table has attributes that you would like to slice by (create a dimension from), you would use this relationship type.
Functionally, to the users it behaves no differently than a regular relationship.
After you create your dimensions and cubes you need to define how each dimension is related to each measure group. A measure group is a set of measures exposed by a single fact table.
Each cube can contain multiple fact tables and multiple dimensions. However, not every dimension will be related to every fact table.
To define relationships right click the cube in BIDS and choose open; then navigate to the Dimension Usage tab. If you click the ellipsis button next to each dimension you will see a screen that allows you to change dimension usage for a particular measure group. You can choose from the following options:
Regular default option; the dimension is joined directly to the fact table
No relationship the dimension is not related to the current measure group
Fact the dimension and fact are derived from a single table. If this is the case your dimensional warehouse has poor design and isn't likely to perform well. Consider separating fact and dimension tables.
Referenced the dimension is joined to an intermediate table prior to being joined to the fact table. Referenced relationship resembles a snowflake dimension, but is slightly different. Suppose you have a customer dimension and a sales fact; you'd like to examine total sales by customer, but you also want to examine line item sales by customer. Instead of duplicating the customer key in the line item fact table you can treat the sales fact as an intermediate table to join customer to line item.
Many-to-many this option involves two fact tables and two dimension tables. Dimension A is joined to an intermediate fact A, which in turn joins to dimension B to which the fact B is joined. Much like with fact option if you need to use many-to-many option your design could probably use some improvement. This type of relationship is sometimes necessary if you are building cubes on top of a relational database that is in 3rd normal form. It is strongly advisable to use a dimensional model with star schema for all cubes. For example you could have two fact tables: vehicles and options; each vehicle can come with a number of options. You're likely to examine vehicle sales by customer, and options by the items that are included in each option. Therefore you would have a customer dimension and item dimension. You could also want to examine vehicles sales by included item. If so the vehicle fact would be joined to the options fact and customer dimension; the options fact would also join to items' dimension.
Data mining target dimension is based on a mining model which is built from a source dimension. Both source dimension and target dimension must be included in the cube.
I have one data warehouse table that contain one row for each item sold.
Each row contains the item's type.
What MDX request could show the number of items sold for each item type?
What (dimensions,levels,etc) would it suppose to create?
In case it is relevant, I am using Pentaho/Mondrian/Spoon/Schema Workbench.
When you build a cube from the data warehouse you would typically aggregate rows for each product sale into totals for groups of one hour, or one day, per product. Few big cubes would support drilling down to individual product sales.
After creating a [Product] hierarchy/dimension you would create a virtual dimension based on that, using the item types, to give another way of breaking the information down.