Problems creating a Parent Child reference in SSAS - ssas

In my data warehouse, I have the following dimension that I want to create a Parent Child hierarchy. My problem is this. The Primary key is OfficerPeopleID, which is NOT either the parent or child. The Parent is MgrPeopleID, and the child is PeopleID.
If I change the default key when creating a dimension to PeopleID, it appears as if it will work, but then I receive errors while processing because of it seeing multiple copies of PeopleID. The reason there are multiples is because it is a SCD type 2 and the Primary Key, (OfficerPeopleID) is the surrogate key for the table. I know I am not the only one that has tried creating a parent child reference on fields other than the Primary key?
Thank you!

I don't think you would want to do that. If I understand you correctly, PeopleID is your natural key or your source system key and OfficerPeopleID is your DW surrogate key. In this case you would need to have a column which stores the Parent surrogate key not the parent natural key. In other words, you should be able to create a foreign key for the table to itself. Based on what you have right now, you could have more than one record for the manager which would make it ambiguous as to which record is the correct one. Also, for the parent child to work you, the child has to be the key for the table.
If you want to do it properly you should populate the MgrOfficerPoepleID (new column) in your ETL process. If you are going to do that make sure you update the manager key value when you have a new row because of SCD2. However, if you still wish to do it as a named query in SSAS DSV, you can do something like this
SELECT
OffcerPeopleID,
-- ... insert other columns here
PeopleID,
MgrPeopleID,
(SELECT OfficerPeopleID
FROM dbo.Employee
WHERE(e.MgrPeopleID = PeopleID) AND (IsCurrent = 1)) AS MgrOfficerPoepleID
FROM dbo.OfficerPeopleDim AS e
WHERE IsCurrent = 1 -- this is your SCD2 flag. you could also use two date range columns

you cant do that if the PeopleID contains duplicate records, either you make it unique or you create the relationship using both fields.
I also advise you to create two separate entries on the DSV, one for Managers and another for Employees, with queries like this:
Manager:
select PeopleID as ManagerID, name as Name from OfficerPeopleDim
Employee:
select PeopleID as EmployeeID, name, MgrPeopleId as Manager
from OfficerPeopleDim
where MgrPeopleId is not null
So it will look like this(left) and produce the result on the right:

Related

How to make 2 relations to the same table(using diferent columns) on dataset.relations?

I have a table clients and a table country, clients has 2 columns referencing country id (id of the country the client is from, and id from the counry the client wants its packages sent to), while dataset.relations is useful to fuse tables using 1 column(or more as long as they are diferent) i cant figure how to display a table that contains the info of the client, and the NAME of both countires corresponding to the ids.
For just the country of the client i go ass follows
billOrderDataset.Relations.Add("clientCountryRelation",countriesTableCopy.Columns("id"), clientTableCopy.Columns("countryId"))
clientTableCopy.Columns.Add("countryName",GetType(String), "Parent.countryName")
but after that i dont know how would i also add the name of the country to the column "deliveryCountryId" since there is already a relation using both tables and the id column from country, so bascially i would need something like
billOrderDataset.Relations.Add("clientCountryRelation2",countriesTableCopy.Columns("id"), clientTableCopy.Columns("DeliveryCountryId"))
clientTableCopy.Columns.Add("DeliveryCountryName",GetType(String), "Parent.countryName")
The answer is hidden in a paragraph inside the Remarks section in the documentation of the DataColumn.Expression property.
Parent/Child Relation Referencing
....
When a child has more than one
parent row, use Parent(RelationName).ColumnName. For example, the
Parent(RelationName).Price references the parent table’s column named
Price via the relation.
So, in other words, when you have more than one relation on the child table you need to explicitly give the name of the relation in the expression syntax
clientTableCopy.Columns.Add("countryName",GetType(String), "Parent(clientCountryRelation).countryName")
clientTableCopy.Columns.Add("DeliveryCountryName",GetType(String), "Parent(clientCountryRelation2).countryName")

Extending table with another table ... sort of

I have a DB about renting cars.
I created a CarModels table (ModelID as PK).
I want to create a second table with the same primary key as CarModels have.
This table only contains the number of times this Model was searched on my website.
So lets say you visit my website, you can check a list that contains common cars rented.
"Most popular Cars" table.
It's not about One-to-One relationship, that's for sure.
Is there any SQL code to connect two Primary keys together ?
select m.ModelID, m.Field1, m.Field2,
t.TimesSearched
from CarModels m
left outer join Table2 t on m.ModelID = t.ModelID
but why not simply add the field TimesSearched to table CarModels ?
Then you dont need another table
Easiest is to just use a new primary key on the new table with a foreign key to the CarModels table, like [CarModelID] INT NOT NULL. You can put an index and a unique constraint on the FK.
If you reeeealy want them to be the same, you can jump through a bunch of hoops that will make your life Hell, like creating the table from the CarModels table, then setting that field as the primary key, then whenever you add a new CarModel you'll have to create a trigger that will SET IDENTITY_INSERT ON so you can add the new one, and remember to SET IDENTITY_INSERT OFF when you're done.
Personally, I'd create a CarsSearched table that holds ThisUser selected ThisCarModel on ThisDate: then you can start doing some fun data analysis like [are some cars more popular in certain zip codes or certain times of year?], or [this company rents three cars every year in March, so I'll send them a coupon in January].
You are not extending anything (modifying the actual model of the table). You simply need to make INNER JOIN of the table linking with the primary keys being equal.
It could be outer join as it has been suggested but if it's 1:1 like you said ( the second table with have exact same keys - I assume all of them), inner will be enough as both tables would have the same set of same prim keys.
As a bonus, it will also produce fewer rows if you didn't match all keys as a nice reminder if you fail to match all PKs.
That being said, do you have a strong reason why not to keep the said number in the same table? You are basically modeling 1:1 relationship for 1 extra column (and small one too, by data type)
You could extend (now this is extending tables model) with the additional attribute of integer that keeps that number for you.
Later is preferred for simplicity and lower query times.

Inserting Data into Tables and Integrity

This is my first post, so please excuse me for any obvious or simple questions as I am very new to programming and all my projects are a first to me.
I am currently working on my first database project. A relational database using Oracle sql. I'm new on my course, so I am not sure on all the concepts yet, but working at it.
I have used some modelling software to help me construct a 13 table database. I have setup all my columns and assigned primary and foreign keys to all 13 tables. What I am looking to do now is insert 10 rows of test data into each table. I have done the parent tables but am confused about the child tables. When I assign ID numbers to all the parent tables primary keys, will the child tables foreign keys be populated at the same time?
I have not used sequences yet as I'm not 100% how to make them work, but instead inputted my own values like 100, 101, 102 etc. I know those values need to be in the foreign key, but wouldn't manually inserting them into many tables get confusing?
Is there an easier approach to this or am I over complicating the process?
I will need to use some queries later but I just want to be happy that the data is sound.
Thanks for your help
Rob
No, the child table data won't be populated automatically-- if there is a child table, that implies that there is a 0 or 1 to m relationship between the two. One row in the parent table may have 0 rows in the child table or it may have dozens so nothing could possibly be populated automatically.
If you are manually assigning primary key values, you'd need to hard code those same values as the foreign key values when you insert data into the child tables. In the real world, you wouldn't manually insert data into many tables at once, you'd have an application that did so and that knew what keys to use based on parameters passed in or by getting the currval of the sequence used to populate the primary key after inserting into the parent table.
Its necessary that data for foreign key should be present in parent table, but not the other way around.
If you want to create test data, i suggest you use something like below query.
insert into child_table(fk_column,column1,column2....)
select pk_column,'#dummy_value1#','#dummy_value2#',..
from parent_table
if you have 10 rows in parent, this will add 10 rows in child.
If you want more rows, e.g. 100 for each parent value you need to duplicate the parent data. for that use below query.
insert into child_table(fk_column,column1,column2....)
select pk_column,'#dummy_value1#','#dummy_value2#',..
from parent_table
join (select level from dual connect by level<10)
this will add 100 child values for 10 parent values..

SQL Server foreign key to multiple tables

I have the following database schema:
members_company1(id, name, ...);
members_company2(id, name, ...);
profiles(memberid, membertypeid, ...);
membertypes(id, name, ...)
[
{ id : 1, name : 'company1', ... },
{ id : 2, name : 'company2', ... }
];
So each profile belongs to a certain member either from company1 or company2 depending on membertypeid value
members_company1 ————————— members_company2
———————————————— ————————————————
id ——————————> memberid <——————————— id
name membertypeid name
/|\
|
|
profiles |
—————————— |
memberid ————————+
membertypeid
I am wondering if it's possible to create a foreign key in profiles table for referential integrity based on memberid and membertypeid pair to reference either members_company1 or members_company2 table records?
A foreign key can only reference one table, as stated in the documentation (emphasis mine):
A foreign key (FK) is a column or combination of columns that is used
to establish and enforce a link between the data in two tables.
But if you want to start cleaning things up you could create a members table as #KevinCrowell suggested, populate it from the two members_company tables and replace them with views. You can use INSTEAD OF triggers on the views to 'redirect' updates to the new table. This is still some work, but it would be one way to fix your data model without breaking existing applications (if it's feasible in your situation, of course)
Operating under the fact that you can't change the table structure:
Option 1
How important is referential integrity to you? Are you only doing inner joins between these tables? If you don't have to worry too much about it, then don't worry about it.
Option 2
Ok, you probably have to do something about this. Maybe you do have inner joins only, but you have to deal with data in profiles that doesn't relate to anything in the members tables. Could you create a job that runs once per day or week to clean it out?
Option 3
Yeah, that one may not work either. You could create a trigger on the profiles table that checks the reference to the members tables. This is far from ideal, but it does guarantee instantaneous checks.
My Opinion
I would go with option 2. You're obviously dealing with a less-than-ideal schema. Why make this worse than it has to be. Let the bad data sit for a week; clean the table every weekend.
No. A foreign key can reference one and only one primary key and there is no way to spread primary keys across tables. The kind of logic you hope to achieve will require use of a trigger or restructuring your database so that all members are based off a core record in a single table.
Come on you can create a table but you cannot modify members_company1 nor members_company2?
Your idea of a create a members table will require more actions when new records are inserted into members_company tables.
So you can create triggers on members_company1 and members_company2 - that is not modify?
What are the constraints to what you can do?
If you just need compatibility on selects to members_company1 and members_company2 then create a real members table and create views for members_company1 and members_company2.
A basic select does not know it is a view or a table on the other end.
CREATE VIEW dbo.members_company1
AS
SELECT id, name
FROM members
where companyID = 1
You could possible even handle insert, updates, and deletes with instead-of
INSTEAD OF INSERT Triggers
A foreign key cannot reference two tables. Assuming you don't want to correct your design by merging members_company1 and members_company2 tables, the best approach would be to:
Add two columns called member_company1_id and member_company2_id to your profiles table and create two foreign keys to the two tables and allow nulls. Then you could add a constraint to ensure 1 of the columns is null and the other is not, at all times.

How to Delete all data from a table which contain self referencing foreign key

I have a table which has employee relationship defined within itself.
i.e.
EmpID Name SeniorId
-----------------------
1 A NULL
2 B 1
3 C 1
4 D 3
and so on...
Where Senior ID is a foreign key whose primary key table is same with refrence column EmpId
I want to clear all rows from this table without removing any constraint. How can i do this?
Deletion need to be performed like this
4, 3 , 2 , 1
How can I do this
EDIT:
Jhonny's Answer is working for me but which of the answers are more efficient.
I don't know if I am missing something, but maybe you can try this.
UPDATE employee SET SeniorID = NULL
DELETE FROM employee
If the table is very large (cardinality of millions), and there is no need to log the DELETE transactions, dropping the constraint and TRUNCATEing and recreating constraints is by far the most efficient way. Also, if there are foreign keys in other tables (and in this particular table design it would seem to be so), those rows will all have to be deleted first in all cases, as well.
Normalization says nothing about recursive/hierarchical/tree relationships, so I believe that is a red herring in your reply to DVK's suggestion to split this into its own table - it certainly is viable to make a vertical partition of this table already and also to consider whether you can take advantage of that to get any of the other benefits I list below. As DVK alludes to, in this particular design, I have often seen a separate link table to record self-relationships and other kinds of relationships. This has numerous benefits:
have many to many up AND down instead of many-to-one (uncommon, but potentially useful)
track different types of direct relationships - manager, mentor, assistant, payroll approver, expense approver, technical report-to - with rows in the relationship and relationship type tables instead of new columns in the employee table
track changing hierarchies in a temporally consistent way (including terminated employee hierarchy history) by including active indicators and effective dates on the relationship rows - this is only fully possible when normalizing the relationship into its own table
no NULLs in the SeniorID (actually on either ID) - this is a distinct advantage in avoiding bad logic, but NULLs will usually appear in views when you have to left join to the relationship table anyway
a better dedicated indexing strategy - as opposed to adding SeniorID to selected indexes you already have on Employee (especially as the number of relationship types grows)
And of course, the more information you relate to this relationship, the more strongly is indicated that the relationship itself merits a table (i.e. it is a "relation" in the true sense of the word as used in relational databases - related data is stored in a relation or table - related to a primary key), and thus a normal form for relationships might strongly indicate that the relationship table be created instead of a simple foreign key relationship in the employee table.
Benefits also include its straightforward delete scenario:
DELETE FROM EmployeeRelationships;
DELETE FROM Employee;
You'll note a striking equivalence to the accepted answer here on SO, since, in your case, employees with no senior relationship have a NULL - so in that answer the poster set all to NULL first to eliminate relationships and then remove the employees.
There is a possibly appropriate usage of TRUNCATE depending upon constraints (EmpployeeRelationships is typically able to be TRUNCATEd since its primary key is usually a composite and not a foreign key in any other table).
Try this
DELETE FROM employee;
Inside a loop, run a command that deletes all rows with an unreferenced EmpID until there are zero rows left. There are a variety of ways to write that inner DELETE command:
DELETE FROM employee WHERE EmpID NOT IN (SELECT SeniorID FROM employee)
DELETE FROM employee e1 WHERE NOT EXISTS
(SELECT * FROM employee e2 WHERE e2.SeniorID = e.EmpID
and probably a third one using a JOIN, but I'm not familiar with the SQL Server syntax for that.
One solution is to normalize this by splitting out "senior" relationship into a separate table. For the sake of generality, make that second table "empID1|empID2|relationship_type".
Barring that, you need to do this in a loop. One way is to do it:
declare #count int
select #count=count(1) from table
while (#count > 0)
BEGIN
delete employee WHERE NOT EXISTS
(select 1 from employee 'e_senior'
where employee.EmpID=e_senior.SeniorID)
select #count=count(1) from table
END