I am having a tough time figuring out how to implement parent-child relationships between my measure groups in SSAS. Essentially I am just trying to recreate a standard SQL join so that I can look up measures in my Parent group while using keys in my Child group.
For example, let's say I have the following dimensions:
Parent
Child (FK: ParentID)
Time
And I have the following measure groups:
ParentFact (Keys: ParentID, TimeID - Measures Related to the parent over time)
ChildFact (Keys: ChildID, TimeID - Measures related to the child over time)
I have not created any specific dimensions for any of my fact tables.
In the end I just want to run a query like this to list out measures in my Child table joined to measures in my Parent table:
SELECT
  {
    [ChildMeasure]
  , [ParentMeasure]
  } ON COLUMNS
, [Child].Children ON ROWS
FROM
  [MyCube]
WHERE
  [Time].[100]
When this query runs, the Child rows are correctly listed, alongside the appropriate measure values for Time ID 100. Unfortunately, ParentMeasure is all the same, and appears to be an aggregate for this value over all Parents at Time ID 100. I would expect this column to show the value from each child's associated parent at Time ID 100.
What am I doing wrong here? Do I need to create FactDimensions for each FactTable, and somehow relate those? Should I create an association between Parent and Child in my data source view? Would that make it a snowflake schema, which I think I am supposed to avoid?
As a side note, my ChildFact table actually contains ParentID as one of the measures, because it is on the relational table in the datasource (probably due to some previous denormalization effort by the DB developer). Should I remove any measures that are actually FKs in my fact table, or is that somehow required for what I am trying to do?
I don't think you should attempt to create a single Fact table due to the obvious granularity issue.
I think you should add a Dimension Relationship in your Cube definition, between your Child Fact and Parent Dimension, using the existing FK column.
Probably the only valid use of ParentID as a Measure would be to get a Distinct Count.
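To make the expected result concrete, here is a minimal Python sketch of the lookup the question describes. All IDs and values are invented for illustration, and this is plain Python rather than anything SSAS does internally. It contrasts the per-child parent value reached through the ParentID foreign key with the single all-parents total the asker is currently seeing.

```python
# Hypothetical sample data; every ID and value here is made up.
child_dim = {"c1": "p1", "c2": "p1", "c3": "p2"}  # ChildID -> ParentID (the FK)
parent_fact = {("p1", 100): 10.0, ("p2", 100): 25.0, ("p1", 101): 7.0}
child_fact = {("c1", 100): 1.0, ("c2", 100): 2.0, ("c3", 100): 3.0}


def report(time_id):
    """One row per child: (ChildID, ChildMeasure, ParentMeasure of that child's parent)."""
    rows = []
    for child_id, parent_id in sorted(child_dim.items()):
        rows.append((
            child_id,
            child_fact.get((child_id, time_id)),
            parent_fact.get((parent_id, time_id)),  # follow the FK to the parent
        ))
    return rows


def unrelated_total(time_id):
    """What repeats on every row when no relationship exists: the total over all parents."""
    return sum(value for (_, t), value in parent_fact.items() if t == time_id)
```

With this data, report(100) gives each child its own parent's value, while unrelated_total(100) is the one number that would otherwise repeat on every row.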
There is no need to create two facts here.
Create a single fact table (ParentID, ChildID, TimeID).
Create three dimensions: DimParent, DimChild and DimTime.
Each FK in the fact table will refer to its respective dimension table.
Also, check the attribute relationships to see how the facts are sliced by each dimension.
Background...
As part of an existing data conversion we need to populate ad hoc hierarchies from limited information.
Currently a handful of members of the CXO house are the users of these ad hoc hierarchies.
Each of them chooses their own hierarchical combination of employees for some purpose.
These hierarchies have only a parent-child relation, and a link can go from any level of the org hierarchy to any other level.
In other words, a child never becomes a parent unless he has subordinates.
We have an employee table, OrgEmpHierarchy (OH), that holds the organizational hierarchies.
It can't be used directly, but we can use its columns to build our logic. This table is in no way related to the current model.
We have a few other tables: HeadofDepartment (HOD), HierarchyDetails (HD) and a stage table with the same structure as HD.
OrgEmpHierarchy (OH) Has:
OH_ID - Organizational HierarchyID (DB Sequence)
OH_PID - (Parent ID) one of the values from previous column.
OH_EmpID - Organizational EmpID.
HeadofDepartment (HOD) Has:
HOD_ID - Head of Dept. ID (DB Sequence)
HOD_EmpID - Organizational EmpID.
HierarchyDetails (HD) Has:
HD_ID - Hierarchy Details ID (DB Sequence)
HD_PID - (ParentID) one of the values from previous column.
HD_HOD_ID - (Foreign Key) HOD_ID from HOD.
HD_EmpID - Organizational EmpID.
We need to populate the Hierarchy for each HOD_ID from Head of Department (HOD) in Hierarchy Details (HD) table.
We are able to populate values in HD for HD_ID, HD_HOD_ID and HD_EmpID. HD_PID is populated with NULL.
Now, with the help of OH and HOD, I need to populate the hierarchies in HD_PID in the HD table.
Can someone give me an Oracle SQL/PLSQL query which updates HD_PID?
Since this is an assignment I am not going to "give you the SQL".
You will need to use a hierarchical query which has the basic format:
SELECT some_columns
FROM a_table
START WITH some_condition
CONNECT BY PRIOR some_column = some_other_column
So, which table to select from? Well, looking at your tables, OrgEmpHierarchy contains a relationship between OH_PID (the Parent ID) and OH_EmpID, so you should be using this table and connecting on those columns.
So, what should you start with? Without any data it is difficult to say, but presumably the top of the hierarchy is either an Owner/Director or the Heads of Department. If it is the latter case, then you can look at starting with those employees that are IN the HeadofDepartment table.
What to select?
You're using the OH_PID and OH_EmpID columns to connect the query so you can select these as they define the hierarchical relationship.
You can select a sequence using ROWNUM or use an actual sequence YOUR_SEQUENCE.NEXTVAL.
You need to know the Head of Department - assuming you start the hierarchy with those heads of department, there is a very simple operator that lets you "connect by the root" of the hierarchy - I'm sure a simple web search will inform you of its syntax and usage.
That just about covers it all except you'll want to insert it into a table. So, use:
INSERT INTO HierarchyDetails ( columns )
SELECT ...
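If the START WITH / CONNECT BY semantics are hard to picture, here is a rough Python sketch of the same walk. This is not the SQL. It assumes, going by the column descriptions in the question, that OH_PID holds the OH_ID of the parent row and that the data has no cycles; the function name and sample rows are made up.

```python
from collections import defaultdict


def walk_hierarchy(oh_rows, head_emp_ids):
    """oh_rows: (oh_id, oh_pid, emp_id) tuples. Returns (root_emp, parent_emp, emp, level) rows."""
    children = defaultdict(list)  # parent OH_ID -> child OH_IDs
    emp_of = {}                   # OH_ID -> EmpID
    for oh_id, oh_pid, emp_id in oh_rows:
        children[oh_pid].append(oh_id)
        emp_of[oh_id] = emp_id

    out = []

    def visit(oh_id, parent_emp, root_emp, level):
        out.append((root_emp, parent_emp, emp_of[oh_id], level))
        # "CONNECT BY PRIOR": step from the current row to the rows that name it as parent
        for child_id in children[oh_id]:
            visit(child_id, emp_of[oh_id], root_emp, level + 1)

    # "START WITH": one tree per employee who is a head of department
    for oh_id, _, emp_id in oh_rows:
        if emp_id in head_emp_ids:
            visit(oh_id, None, emp_id, 1)
    return out


# Hypothetical sample: E1 manages E2 and E3, E2 manages E4
sample = [(1, None, "E1"), (2, 1, "E2"), (3, 1, "E3"), (4, 2, "E4")]
```

Each output row carries the root of its tree, which is the piece you need to tie a hierarchy row back to its head of department.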
I have a report I am trying to make that displays parent information and all children in one household on ONE row.
There is no "parent" table that stores the information on parents and there is no ID that links parents to child and no ID that links sibling to sibling. The only way to tell if they are siblings is if they have the same address (logic being that if they have the same address, they live together, and are part of the same household). All the information is pulled from a "student" table or a custom field in the student table that stores the parent information, address they live at, etc.
Instead of displaying parent info twice I want to display the information like this:
Parent_name, address, phone, child1_name, child1_schoolname, child1_age, child2_name, child2_schoolname, child2_age, etc. (for every child in that household)
The problem is that not every household will have the same number of children and I can only link siblings by their address.
How can I display all information for each household on ONE row? Is this possible and how? I've tried a pivot table but to no avail.
This is a classic 'you shouldn't be doing reports in the database' question. A database is for data retrieval, not data formatting. But let's assume you know this and need to do it anyway for some reason.
The algorithm I'd use for this would be
Create some windowed queries across the data; partition by address (the joinable value) and sort by age desc.
Create a query that uses this window and returns the first item in each group.
Create additional queries that return the second, the third, the fourth item in each group, etc.
Outer join these together.
This is going to be far easier if you define some maximum number of siblings (five?) as opposed to dynamically building these siblings.
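The steps above can be sketched in Python like this, outside the database. The field names, the sample rows and the cap of three siblings are assumptions for illustration only. Ranking within each address plays the role of the window, and the fixed number of slots plays the role of the outer joins.

```python
from itertools import groupby

MAX_SIBLINGS = 3  # assumed cap on children per household


def households(students, max_siblings=MAX_SIBLINGS):
    """Return one tuple per address: parent info followed by one slot per sibling."""
    # Equivalent of the window: order by address, then by age descending
    ordered = sorted(students, key=lambda s: (s["address"], -s["age"]))
    rows = []
    for address, group in groupby(ordered, key=lambda s: s["address"]):
        kids = list(group)
        row = [kids[0]["parent_name"], address, kids[0]["phone"]]
        for i in range(max_siblings):
            if i < len(kids):
                row += [kids[i]["name"], kids[i]["school"], kids[i]["age"]]
            else:
                row += [None, None, None]  # empty slot, like an unmatched outer join
        rows.append(tuple(row))
    return rows


# Hypothetical rows from the "student" table
students = [
    {"address": "1 Elm St", "parent_name": "Pat", "phone": "555-0100",
     "name": "Ann", "school": "North", "age": 9},
    {"address": "1 Elm St", "parent_name": "Pat", "phone": "555-0100",
     "name": "Bob", "school": "North", "age": 12},
    {"address": "2 Oak Ave", "parent_name": "Sam", "phone": "555-0101",
     "name": "Cy", "school": "South", "age": 7},
]
```

Households with more children than the cap simply lose the extra children here, which is the trade-off of fixing the maximum up front.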
If the parents are in the same table, how do you know which items are parents and which are children?
If you have two tables, one for Parent and one for Child, you can do something like this in your data model:
select Parent.NAME as parent_name,
       Parent.ADDRESS as parent_address,
       Parent.PHONE as phone,
       (
         select listagg(Child.NAME, ',')
                within group (order by Child.NAME)
         from CHILD Child
         where Child.ADDRESS = Parent.ADDRESS
       ) as children_names,
       (
         select listagg(Child.AGE, ',')
                within group (order by Child.NAME)
         from CHILD Child
         where Child.ADDRESS = Parent.ADDRESS
       ) as children_ages
from PARENT Parent
The result has one row per parent, with the children's names and ages as comma-separated lists.
LISTAGG is your solution here: it does what you want by collapsing multiple rows into one.
However, LISTAGG is only available from Oracle Database 11g Release 2 onwards,
so if you have an older version this is not going to work.
Hope this helps.
I have 2 fact tables with a measure group each, Production and Production Orders. Production has production information at a lower granularity (at the component level); Production Orders has information at a higher level (order level, with header quantities etc.).
I have created a surrogate key link between the two tables on productionorderid. As soon as I add Prod ID (from productiondetailsdim) to the pivot table it blats out the actual qty (from prod order measure group) and I cannot combine the qty's from the two measure groups.
How can I design the correct relationship between the two? Please see my dim usage diagram. Production Details is the dim that links the two fact tables, at the moment DimProductionDetails is in a fact relationship with Production. I'm not sure what the relationship should be with Production Order (it is currently many to many).
Please see example data between the two tables:
I have to be able to duplicate this behaviour:
Do you want the full actual qty from prod order measure group to repeat next to each product? If so a many-to-many relationship is right. I suspect once I explain how that many-to-many works you will spot the problem.
When you slice full actual qty from prod order measure group by product from the Production Details dimension it does a runtime join between the two measure groups on the common dimensions. So for example, if order 245295 has a date of 1/1/2015 while the production details for order 245295 have dates of 1/8/2015, then the runtime join will lose rows for that order and actual qty will show as null. So compare all the dimensions used on both measure groups and ensure all rows for the same order have the same dimension keys for those common dimensions. If, for example, dates differ, then create a named query in the DSV that selects just the dimension columns from the production fact table which match the order fact table. Then create a new measure group off that named query and use the new measure group as the intermediate measure group in your many-to-many dimension. (The current many-to-many cell in the dimension usage tab should say the name of the new measure group, not the existing Production measure group.)
Edit: if you want the actual qty measure to only show when you are at the order level and be null at the product level then try the following. Change the many-to-many relationship to a regular relationship and, in the dialog where you choose how the fact table joins to the dimension, change the dimension attribute to ProductionOrder_SK (which is not the key of the dimension) and choose the corresponding column in the fact table. Then left click on the Production Order measure group, go to the Properties window and set IgnoreUnrelatedDimensions to false. That way slicing actual qty by work center or by an attribute that is below grain in the Production Details dimension will show as null.
I have a fact table that has 4 date columns CreatedDate, LoginDate, ActiveDate and EngagedDate. I have a dimension table called DimDate whose primary key can be used as foreign key for all the 4 date columns in fact table. So the model looks like this.
But the problem comes when I want to do sub-filtering for the measures based on the date columns. For example: count all users who were created in the last month and are engaged in this month. This is not possible with this design, because when I filter the measure by created date, I can't further filter on a different time window for engaged date. Since they are all connected to the same dimension, they do not work independently.
However, If I create a separate date dimension table for each of the columns, and join them like this then it works.
But this looks very cumbersome when I have 20 different date columns in the fact table in a real-world scenario, where I have to create 20 different dimensions and connect them one by one. Is there any other way I can achieve my scenario without creating multiple duplicated date dimensions?
This concept is called a role-playing dimension. You don't have to add the table to the DSV or the actual dimensions one time for each date. Instead add the date once, then go to the dimension usage tab. Click Add Cube Dimension, and then choose the date dim. Right-click and rename it. Then update the relationship to use the correct fields.
There's a good article on MSSQLTips.com that covers this topic.
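As a relational analogy only (this is not how you configure it in SSAS), a role-playing dimension behaves like one date table joined several times under different aliases. The sketch below uses Python's sqlite3 with made-up table names, column names and rows to run the "created last month, engaged this month" count from the question against a single DimDate.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with one DimDate table and a small fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, YearMonth TEXT);
        CREATE TABLE FactUser (UserID INTEGER, CreatedDateKey INTEGER, EngagedDateKey INTEGER);
        INSERT INTO DimDate VALUES
            (20150115, '2015-01'), (20150210, '2015-02'), (20150220, '2015-02');
        INSERT INTO FactUser VALUES
            (1, 20150115, 20150210),
            (2, 20150115, 20150115),
            (3, 20150210, 20150220);
    """)
    return conn


def count_created_then_engaged(conn, created_month, engaged_month):
    """Count users created in one month and engaged in another, using one DimDate twice."""
    sql = """
        SELECT COUNT(*)
        FROM FactUser f
        JOIN DimDate created ON created.DateKey = f.CreatedDateKey  -- role: Created Date
        JOIN DimDate engaged ON engaged.DateKey = f.EngagedDateKey  -- role: Engaged Date
        WHERE created.YearMonth = ? AND engaged.YearMonth = ?
    """
    return conn.execute(sql, (created_month, engaged_month)).fetchone()[0]
```

Each alias filters independently, which is what the renamed cube dimensions give you without duplicating the underlying table.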
I wish to change the default aggregation from SUM to SUM on Distinct ID Values.
This is the current behaviour
ID Amount
1 $10
1 $10
2 $20
3 $30
3 $30
Sum Total = $100
By default, I am getting a sum of $100. I wish to do the sum on distinct ids and get a value of $60. How would I modify the default Aggregation Behavior to achieve this result?
Design your data as a many-to-many relationship. Create one table/view having one record per ID plus the amount column from the data shown in your question (this is the main fact table), and one table/view having one record per record of your data as shown in your question, presumably with another column (otherwise it would not make any sense to have the data as shown in your question). This will be the m2m dimension table. Then create a bridge table/view having the id of the m2m dimension table and your ID column.
Then create the following AS objects: a measure group from the main fact table, and a dimension on column ID of the same table (that is, if there is no other column that would make a dimension table meaningful; otherwise you had better have a separate dimension table with ID as the primary key). Create a dimension from the m2m dimension table, and a measure group having only the invisible measure "count" from the bridge table. Finally, on the "Dimension Usage" tab of Cube Designer, set the relationship between the m2m dimension and the main measure group to be many-to-many via the bridge measure group.
See http://technet.microsoft.com/en-us/library/ms170463.aspx for a tutorial on many-to-many relationships.
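As a quick check of the arithmetic behind this design, here is a small Python sketch using the rows from the question. It assumes the amount is identical on every row that shares an ID, as in the sample. Reducing the data to one record per ID, which is what the main fact table above holds, turns the plain sum into the desired distinct-ID sum.

```python
# Rows from the question as (ID, Amount)
rows = [(1, 10), (1, 10), (2, 20), (3, 30), (3, 30)]


def plain_sum(rows):
    """Default SUM over every row, duplicates included."""
    return sum(amount for _, amount in rows)


def distinct_id_sum(rows):
    """SUM over one record per ID, like the one-row-per-ID main fact table."""
    per_id = {}
    for id_, amount in rows:
        # Assumption: all rows with the same ID carry the same amount, so keep the first
        per_id.setdefault(id_, amount)
    return sum(per_id.values())
```

Here plain_sum(rows) is 100 and distinct_id_sum(rows) is 60.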