My goal is to generate a report that has by day: order count, quantity sum, receive sum (we ship our product out and get it back). I have two cubes now: one for the orders and one for the receives. I've created a new cube that includes both dimensions but they are not in sync by date. It's as if there is not a join between the two fact tables by the date. Any ideas?
Thanks
In one cube, include both your orders and receives Measure Groups, and one instance of your Date dimension. In the cube's Dimensions Relationships tab, relate both Measure Groups to the Date dimension using their relevant date keys.
Related
I’ve been asked create our analysis cube and have a design question.
We sell ‘widgets’ and ‘parts’ to go with those widgets. Each order has many widgets and sometimes a few parts.
What I’m stuck on is – to me, an order is a fact in a measure. But, what are the widgets? Are they a dimension and each fact in the measure will be an entry for every part and widget for the order.
So, if order 123 had widget 1 and widget 2 and part 5, then there will be 3 facts in the measure for the same order? Is that correct?
At its basic level you can consider most facts to be transactions or transaction line items. So, for example, you may have a 'sales' fact table in which each record represents one line item from that sale. Each fact record would have numeric columns representing metrics and other columns joining to dimension tables. The combination of those dimensions would describe that line item. So, in your case, you likely have something like:
1) A 'date' dimension detailing the date of the transaction
2) A 'widget' dimension detailing the widget sold on that transaction
3) A 'customer' dimension detailing the customer who bought that item (almost certainly the same customer would appear on every line item for this transaction)
4) ... determined by what information you have and what business problem you're trying to solve.
Now, the dimension tables contain further details. For example, your widget dimension table likely contains things like the name of the widget, the color, the manufacturer, etc. Every time your company sells one of these widgets, the record in the fact table links to that same dimension record for that name, color, manufacturer, etc. combination (i.e. you don't create a new dimension record every time you sell the same item - this is a one-to-many relationship - each dimension record may have many related fact records).
You other dimension tables would similarly describe their dimensions. For example, the customer dimension might give the customer's name, their address, ...
So, the short answer to your question is that widget likely is a dimension, items and widgets may (or may not) actually be the same dimension (in a school class I suspect that they are), and that you would have 3 fact records for that one transaction.
This is probably along the same lines as the prior answer but....
If you try and model "many widgets per order" you'll have issues because you end up with a many (order fact) to many (widgets) relationship. In a cube / star schema design, many to many relationship usually need to be moddeled out to be many to one in some way.
So what you do is try and identify what special thing identifies an "order" (as opposed to a bunch of widgets in an order). Usually that is simply stuff like order date, customer, order number, tax
An example way to model this is:
If you have a single order with five widgets, you model that as a fact table with five records that happens to have a repeating widget, customer, date etc. in it
Then you have to work out how you spread an order header tax amount over five records. The two obvious solutions are:
Create a widget that represents tax and add that as another record
Spread the tax over five records, either evenly or weighted by something
Modelling "parts" just takes these concepts further.
It is important to understand what the end user wants to see, why they want to see parts. What do they want to measure by parts, how do you assign higher level values (like tax) down to lower levels like parts.
I have a problem in my SSAS cube:
There are two fact tables: OrderFact and PaymentFact, when I filter a date, I want to see payments related to filtered date orders. I designed a cube as follows, but I don’t get the desired result, can anyone help me out of this:
You will need to setup a many-to-many Date dimension. Basically you will have two measure groups in the cube. Then on the PaymentFact measure group you will go to the Dimension Usage tab of the cube designer and setup DateDim as a many-to-many relationship type using OrderFact as the intermediate measure group.
For more background about many-to-many dimensions in SSAS, I would highly recommend this whitepaper:
http://www.sqlbi.com/articles/many2many/
The other option is to copy DateKey to PaymentFact in your ETL then make it a regular relationship. If a Payment only relates to one order, then that's feasible. If a payment relates to multiple orders, then use the many-to-many relationship.
I have a fact table called "FactActivity" and a few dimension tables like users, clients, actions, date and tenants.
I create measure groups corresponding to each of them as follows
FactActivity => Sum of ActivityCount colums
DimUser => Count of rows
DimTenant => Count of rows
DimDate => Count of rows and distinct count of weekofyear column
Each user can do multiple actions using multiple clients. A tenant is logical grouping of users. So a tenant contain multiple users but a user can't belong to more than 1 tenant. All the dimension tables and fact tables are connected to DimDate via regular relationship.
The cube structure is as follows.
Now I want to defined the dimension relationships to each of the measure group. Some of them are Many-Many relationsip (to enable distinct count calculation). The designer is showing me multiple options to choose from for many of the intersections. I'm confused as to which one to select as intermediate measure group. Should I always pick the measure group whose total # rows is the least ex: DimDate? Or what is the right logic to determine the intermediate measure group.
This is what I got. IS this right? If no, what is wrong?
For more information to hep choose the right answer.
FactActivity = 1 billion rows
DimUser = 35 million rows
DimTenant = 1 million rows
DimDate = 1000 rows
The correct way to choose the intermediate measure group depends on how you want to evaluate your measures with respect to the dimension related:
Let's start with Activity measure group to Tenant dimension: The question is: How should Analysis Services determine the activity count (or any other measure in the Activity measure group) of a tenant? The only reasonable way to determine this would be to go from the activity fact table through the user table to the tenant table. And actually, the last relationship is not a many-to-many relationship, but a many-to one relationship. I. e. you could optimize away the tenant dimension by integrating it into the user dimension. However, using a many-to-many relationship will work as well, just be a little less efficient. You might also consider using a reference relationship from user to tenant instead of a many-to-many relationship. And there may be other considerations why you may have chosen to have them two separate dimensions, thus I do not discuss this any further.
Now let us continue with the next one: Tenant measure group to User dimension: The way you have configured it (using the date measure group) means that for each date that a tenant and a user have in common, the tenant count of a user adds one to the count. This is probably not what you want. I would assume you want to relate tenant measures to user dimension by the user measure group. However, I am not sure what the purpose of the DateKey in the user and tenant dimension tables is at all. Thus, your relationship may be correct.
Let's continue with the relationships from the Date measure group to the Tenant and User dimensions. I would assume there should be no relationship at all, as the week of the year and the date count do not depend on tenants or users. Please note that it is absolutely ok to have no relationship between some measure groups and some dimensions. If you look at the Microsoft sample cube "Adventure Works", it has more gray cells (i. e. measure group and dimension being unrelated) in the Dimension Usage than white ones (i. e. there is some kind of relationship between measure group and dimension, of whichever type). In the default setting of IgnoreUnrelatedDimensions = true of a measure group, this means that the measure value will be the same for all members of the dimension. This should be the case for date count and week of year. However, again, as I do not know the purpose of the DateKey in the user and tenant dimension tables, I am not sure if this assumption is correct for your data.
And after these examples, I would hope you can continue with the rest of relationships yourself.
I have a Sales fact table, an Orders fact table (both line level detail), and two date roleplaying dimensions (from the Date dimension) for Order Date and Transaction Date.
I'm trying to get to a point where you can view sales measures by order date and order measures by transaction date.
The Sales table has the key for the related Order line if the sale was from an order and null if it was a non-order sale. The Order table doesn't have any links to the related transaction.
I've been trying to wrap my head around how to model a relationship based on the link between the two fact tables and the only method I can get to work would be to create a dimension based on the Orders table which contains only the key, then use many-to-many relationships... which somehow seems completely wrong, but I'm not sure what would be the "right" approach to this situation.
If at all possible I'd like the non-order sales to show as "unknown" order dates when viewing Sales Measures by Order date, so you can see the complete picture rather than just sales from orders. Using the above approach this isn't happening.
Any suggestions about what needs to be changed to get this to work?
You were on the right track. I would create a view in the relational database or a named query in the DSV containing as the single column the distinct non-null order IDs, maybe call it "DimOrderId". Then build a dimension from it, setting the "Null processing" property (you have to click the "plus" two times for the "Key Columns" property of the attribute in BIDS to access this property) to "UnknownMember".
And then use this dimension for the many-to-many relationship.
You should use the Order ID to lookup the Order Date and put an Order Date dimension key in the Sales Transaction fact table. Since there may be multiple transactions per order, the other way around probably just doesn't make sense. If it is 1:1 you could do the reverse, but it would mean updating order facts once the transaction occurs which could be a load-time complexity and performance hit. Make sure you really NEED order by Transaction Date.
I have one data warehouse table that contain one row for each item sold.
Each row contains the item's type.
What MDX request could show the number of items sold for each item type?
What (dimensions,levels,etc) would it suppose to create?
In case it is relevant, I am using Pentaho/Mondrian/Spoon/Schema Workbench.
When you build a cube from the data warehouse you would typically aggregate rows for each product sale into totals for groups of one hour, or one day, per product. Few big cubes would support drilling down to individual product sales.
After creating a [Product] hierarchy/dimension you would create a virtual dimension based on that, using the item types, to give another way of breaking the information down.