Developing Slowly Changing Dimension in SSAS - ssas

I have searched the internet extensively without finding anything useful. I have 3 tables and want to implement SCD Type 2 in an SSAS cube.
1- DimCompanies
2- DimDate
3- FactTable
FactTable:
Val
CompanyId
DateId
DimCompanies has this information :
CompanyId
CompanyName
I tried many methods found online, such as adding a surrogate key and a business key, but to no avail.
My problem is that a company has one name for a period, for example 2000 to 2005, and a new name from 2006 onward. Therefore, the new information must be displayed when the cube is loaded, but whatever I do, I cannot get this to work. I added YearId and IsCurrent to DimCompanies, but I don't know how to use them. I also do not know how to connect DimDate to DimCompanies.

Your DimDate and DimCompanies tables should each have a unique key (primary key) that the fact table columns DateId and CompanyId reference as foreign keys. This is how the two dimensions are connected: they do not link to each other directly, because they are already related through the fact table.
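As a minimal sketch of how the surrogate key and business key mentioned in the question could fit together for SCD Type 2, the following uses Python's built-in sqlite3 module as a stand-in for the relational source of the cube. The column names CompanyKey, ValidFromYear, ValidToYear and CalendarYear, and all sample values, are assumptions made for illustration; only CompanyId, CompanyName, IsCurrent, DateId and Val come from the question. The idea is that each name version gets its own surrogate key, and the fact row stores the surrogate key that was valid at the date of the fact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (DateId INTEGER PRIMARY KEY, CalendarYear INTEGER NOT NULL);
CREATE TABLE DimCompanies (
    CompanyKey    INTEGER PRIMARY KEY,  -- surrogate key (assumed name): one per name version
    CompanyId     INTEGER NOT NULL,     -- business key from the source system
    CompanyName   TEXT NOT NULL,
    ValidFromYear INTEGER NOT NULL,     -- assumed validity columns
    ValidToYear   INTEGER,              -- NULL means "still valid"
    IsCurrent     INTEGER NOT NULL
);
CREATE TABLE FactTable (
    Val        REAL NOT NULL,
    CompanyKey INTEGER NOT NULL REFERENCES DimCompanies(CompanyKey),
    DateId     INTEGER NOT NULL REFERENCES DimDate(DateId)
);
-- Illustrative data: one company (business key 100) with two name versions.
INSERT INTO DimDate VALUES (20030101, 2003), (20070101, 2007);
INSERT INTO DimCompanies VALUES
    (1, 100, 'Old Name', 2000, 2005, 0),
    (2, 100, 'New Name', 2006, NULL, 1);
""")

def surrogate_key(company_id, year):
    """Find the dimension row that was valid for this business key in this year."""
    row = conn.execute(
        """SELECT CompanyKey FROM DimCompanies
           WHERE CompanyId = ? AND ValidFromYear <= ?
             AND (ValidToYear IS NULL OR ValidToYear >= ?)""",
        (company_id, year, year),
    ).fetchone()
    return row[0]

# Source rows carry the business key; the load step swaps it for the surrogate key.
source_rows = [(10.0, 100, 20030101, 2003), (20.0, 100, 20070101, 2007)]
for val, company_id, date_id, year in source_rows:
    conn.execute(
        "INSERT INTO FactTable VALUES (?, ?, ?)",
        (val, surrogate_key(company_id, year), date_id),
    )

report = conn.execute("""
    SELECT d.CalendarYear, c.CompanyName, f.Val
    FROM FactTable AS f
    JOIN DimCompanies AS c ON c.CompanyKey = f.CompanyKey
    JOIN DimDate AS d ON d.DateId = f.DateId
    ORDER BY d.CalendarYear""").fetchall()
print(report)  # [(2003, 'Old Name', 10.0), (2007, 'New Name', 20.0)]
```

With this layout DimDate and DimCompanies still meet only in the fact table. The cube dimension would presumably be keyed on the surrogate key rather than on CompanyId, so each fact shows the name that applied at the time.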

Related

SQL Table Relationships

I am attempting to create some table relationships for a SQL database that handles service activity on a work order ticket. I have one-to-many relationships for these tables and have a question.
I have WOT (Work Order Ticket) table for the creation of a work order ticket.
The WOT can have multiple service activities on different dates associated with it.
Each service activity can have multiple parts used (for a repair).
This is just the basic idea, but I have created the following tables and relationships:
WOT Table:
wot>wotnum – PK
Service Activity Table:
service_activity>serviceid – PK
service_activity>wotnum – FK (link to PK in WOT table)
Part Used Table:
partused>partusedid – PK
partused>serviceid – FK (link to PK in Service Activity table)
Each of the tables above has other columns as well (not shown), but they are unique to the table, such as date fields, part numbers, etc.
The service_activity>serviceid (PK) field is an autoincrement field and so is the partused>partusedid (PK) field as well.
My question is, during data entry, how do I ensure the partused>serviceid field (FK) stays in sync with the service_activity>serviceid field (PK) without having to enter the partused>serviceid field (FK) manually?
Although I have a decent understanding of tables and relationships (critical to get this correct now), I am a bit of a neophyte as to the process of thinking through how the tables will interact during actual data input. I think the answer to this may be simple, but I am just not grasping it yet. If my current solution does not seem adequate, I would welcome a suggestion. I need some help to get going in the right direction.
If you have added the FK correctly, any value you enter in the partused.serviceid column during data entry is automatically checked against the service_activity.serviceid column.
For example: suppose you have only one service_activity row, with serviceid = 456.
When you attempt to insert data into the table partused with partused.serviceid = 500, it will throw an error and the insertion will not be allowed.
When you attempt to insert data in the table partused with partused.serviceid = 456, it will be successful as the referenced row in the other table indeed exists.
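The behaviour described above can be sketched with Python's built-in sqlite3 module, used here only as a stand-in for the asker's database. It also shows one common way to avoid typing the FK by hand: read back the key the database generated for the service activity and pass it to the partused insert. The partnumber column and its values are hypothetical. In SQL Server the generated key would typically be read with SCOPE_IDENTITY() rather than lastrowid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
CREATE TABLE wot (wotnum INTEGER PRIMARY KEY);
CREATE TABLE service_activity (
    serviceid INTEGER PRIMARY KEY AUTOINCREMENT,
    wotnum    INTEGER NOT NULL REFERENCES wot(wotnum)
);
CREATE TABLE partused (
    partusedid INTEGER PRIMARY KEY AUTOINCREMENT,
    serviceid  INTEGER NOT NULL REFERENCES service_activity(serviceid),
    partnumber TEXT                      -- hypothetical extra column
);
""")

conn.execute("INSERT INTO wot (wotnum) VALUES (1)")

# Insert the parent row and capture the auto-generated primary key.
cur = conn.execute("INSERT INTO service_activity (wotnum) VALUES (1)")
service_id = cur.lastrowid

# Reuse the captured key as the foreign key; nobody types it manually.
conn.execute(
    "INSERT INTO partused (serviceid, partnumber) VALUES (?, ?)",
    (service_id, "P-100"),
)

# A serviceid that does not exist in service_activity is rejected.
try:
    conn.execute(
        "INSERT INTO partused (serviceid, partnumber) VALUES (?, ?)",
        (500, "P-200"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

part_count = conn.execute("SELECT COUNT(*) FROM partused").fetchone()[0]
```

In a data-entry application, the form for parts would normally be opened from a specific service activity, so the application already knows which serviceid to pass along.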

SQL Database Recursive Relationship

So I’m attempting to teach myself databases and SQL, and I’m trying to play around with making a database in management studio, and I have a question regarding recursive relationships in a table. Say I have a table called ‘Customers’ and in that table I have an int called Customer_ID as the primary key that is also an identity incrementing by 1, an nchar(125) called ‘Customer_Name’, and another int called Customer_Parent_ID (I don’t know whether I should make this an identity or not). How do I go about forming the relationships that there are customers, and I want to track that some of those customers may be parents of other customers (Think of companies, for example say both Microsoft and LinkedIn are customers, but Microsoft is also LinkedIn’s parent company and I want to show that relationship). I attached a picture of what I THINK it should look like… but again, total newbie here and any recommendations would be much appreciated.
Thank you so much!
EDIT: Added SQL Code and removed accidental mysql tag.
I think your question is similar to this one, and the simple answer is that you set a foreign key on the Customer_Parent_ID column that refers to the Customer_ID column, so any ID that appears in Customer_Parent_ID must also be present in the Customer_ID column.
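A minimal sketch of that self-referencing foreign key, using Python's built-in sqlite3 module as a stand-in for SQL Server. Customer_Parent_ID is an ordinary nullable integer here, not an identity column, since its values are copied from existing Customer_ID values rather than generated; NULL is assumed to mean "no parent".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.execute("""
CREATE TABLE Customers (
    Customer_ID        INTEGER PRIMARY KEY AUTOINCREMENT,
    Customer_Name      TEXT NOT NULL,
    -- Self-reference: a parent must itself be a row in Customers; NULL = no parent.
    Customer_Parent_ID INTEGER REFERENCES Customers(Customer_ID)
)""")

microsoft_id = conn.execute(
    "INSERT INTO Customers (Customer_Name) VALUES ('Microsoft')"
).lastrowid
conn.execute(
    "INSERT INTO Customers (Customer_Name, Customer_Parent_ID) VALUES ('LinkedIn', ?)",
    (microsoft_id,),
)

# Join the table to itself to list each customer next to its parent.
pairs = conn.execute("""
    SELECT child.Customer_Name, parent.Customer_Name
    FROM Customers AS child
    LEFT JOIN Customers AS parent
           ON parent.Customer_ID = child.Customer_Parent_ID
    ORDER BY child.Customer_ID""").fetchall()
print(pairs)  # [('Microsoft', None), ('LinkedIn', 'Microsoft')]

# A parent ID that does not exist is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO Customers (Customer_Name, Customer_Parent_ID) VALUES ('Orphan', 999)"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```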

Data Warehouse - duplicate dimension members for multiple divisions

I am fairly new to data warehousing and SSIS, but I have been tasked with populating a data warehouse with sales transaction records from 2 different divisions of the parent company. My issue...I am modifying the SSIS package that populates the Product (SKUs) dimension to accommodate for the Products that pertain to the two divisions and I have ended up with a few Product names that exist in both divisions. I need a solution to accommodate the Product list for each division in the SAME dimension table. Is this possible??
To illustrate:
https://www.dropbox.com/s/hkda4n1bfs5o178/Capture.JPG?dl=0
Here 'widget_3' and 'widget_4' are named the same in both divisions, but they are NOT the same product; they just happen to share a name. I imagine this is a common problem, but I am reluctant to make any changes to the dimension table schema before consulting with someone first.
I am working with a Product dimension table that has [MemberID] as the primary key and [Product] as a unique non clustered constraint with IGNORE_DUP_KEY = OFF. My first instinct was to modify the table schema to change the IGNORE_DUP_KEY to ON and rely on having a [Division] attribute to help populate the data in the fact table; use [Product] and [Division] to identify the [MemberID] on update.
Something like this??:
https://www.dropbox.com/s/fjzvsh80mtp3ozs/Capture2.JPG?dl=0
Am I going down the wrong path?
Notes:
- Using SQL 2008
At the end of the day, this is a business problem. If there is a name conflict between two departments, it should be resolved before the data is presented together; otherwise a department will see sales against its product that do not belong to it.
Once it is agreed how to treat this at the global level (for example, adding a short department prefix in case of a clash), the problem is solved automatically.
If the departments cannot be reached or do not agree on a solution, you could have two product name columns, one for each department, and use them together as the PK (I would not include a division column, or at least would not show it, because it is confusing for end users). But I do recommend finding a business solution, not a technical one.
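If a technical stopgap is needed while the naming question is being settled, the idea from the question (identifying a member by [Product] together with [Division]) could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for SQL Server 2008, moves the uniqueness rule from [Product] alone to the pair ([Division], [Product]), and uses made-up division codes 'A' and 'B'. This is an illustration of the asker's alternative under those assumptions, not the approach recommended in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE DimProduct (
    MemberID INTEGER PRIMARY KEY AUTOINCREMENT,
    Division TEXT NOT NULL,            -- 'A' / 'B' are made-up division codes
    Product  TEXT NOT NULL,
    UNIQUE (Division, Product)         -- same name allowed once per division
)""")
conn.executemany(
    "INSERT INTO DimProduct (Division, Product) VALUES (?, ?)",
    [("A", "widget_3"), ("B", "widget_3"), ("A", "widget_4"), ("B", "widget_4")],
)

def member_id(division, product):
    """Look up the dimension key used when loading the fact table."""
    row = conn.execute(
        "SELECT MemberID FROM DimProduct WHERE Division = ? AND Product = ?",
        (division, product),
    ).fetchone()
    return row[0]

id_a = member_id("A", "widget_3")
id_b = member_id("B", "widget_3")

# Genuine duplicates within one division are still rejected rather than ignored.
try:
    conn.execute("INSERT INTO DimProduct (Division, Product) VALUES ('A', 'widget_3')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Keeping duplicates as an error rather than silently ignoring them means a real load mistake still surfaces instead of being dropped.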

Database Design - Foreign Key and Primary Key relationships across 3 tables

As a personal project to myself, I am trying to redesign one of our existing Access database tools at work into VB.net. This is including a database redesign from scratch as the current one is an absolute mess.
Here is the database as it stands at the moment with my current re-design in SQL Server:
Now to make the relationships clear:
On the Client table, Client_ID is the primary key. This has a relationship to Contracts.Client_ID as its foreign key. This also has a relationship to Sites.Client_ID as its foreign key.
On the Sites table, Site_ID is the primary key. This has a relationship to Contracts.Site_ID as its foreign key.
Each primary key in every table auto-increments by one on each record creation.
The idea here is a simple Client/Site/Contract structure. For example, Client: Microsoft, Site: Reading Head Office, Contracts (any type of contract that could apply to either the company as whole or an individual site).
You can't have a site without a client. You should be able to link a contract to a client or a site. At present I have allowed Nulls for both Site_ID and Client_ID in contracts to facilitate this as I can't find any way to ensure at least one is filled in.
Does this design look reasonable and follow best practice? I've tried to follow best practice as per a number of different suggestions found across the web, namely separating tables for different types of data. Any input would be gratefully received.
I recommend you look into creating and altering CHECK constraints. A simple condition such as (Client_ID IS NOT NULL OR Site_ID IS NOT NULL) ensures that at least one of the two columns is filled in.
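A minimal sketch of that CHECK constraint, using Python's built-in sqlite3 module as a stand-in for SQL Server. The Contract_ID column name and the sample values are assumptions; the condition itself is the one given above. On an existing SQL Server table the same condition would be added with ALTER TABLE ... ADD CONSTRAINT ... CHECK (...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Contracts (
    Contract_ID INTEGER PRIMARY KEY AUTOINCREMENT,  -- assumed column name
    Client_ID   INTEGER,                            -- nullable
    Site_ID     INTEGER,                            -- nullable
    -- At least one of the two links must be filled in.
    CHECK (Client_ID IS NOT NULL OR Site_ID IS NOT NULL)
)""")

def try_insert(client_id, site_id):
    """Return True if the row is accepted, False if the CHECK constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO Contracts (Client_ID, Site_ID) VALUES (?, ?)",
            (client_id, site_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

client_only = try_insert(1, None)     # contract for the client as a whole
site_only = try_insert(None, 7)       # contract for one site
neither = try_insert(None, None)      # violates the CHECK constraint
```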
As a side note, the structure should work for the business rules. Can a client have a contract without a site to work at or on? Does that make sense? If so, then go with what you have. If not, I'd suggest requiring site information for contracts (where will you be sending the invoice?), in which case you can probably eliminate Client_ID from Contracts.
Given that simplicity is the goal here, I won't dive into other considerations such as whether a phone number should be attached to sites or to contacts. I would say contacts, as the number for the renewal person is likely very different from that of the person for whom the contract was written.
I would do something like
Clients
ClientID PK, Name
only the attributes specific to a client
Client_Contracts
ClientID FK (Clients[ClientID]), ContractID FK (Contracts[ContractID])
Contracts
ContractID PK, StartDate, EndDate, SiteID FK (Sites[SiteID])
Only the attributes that a contract must have.
Sites
SiteID PK, all the columns for a site.

Designing an SQL Table and getting it right the first time

I am currently working on an issue tracker for my company to help them keep track of problems that arise with the network. I am using C# and SQL.
Each issue has about twenty things we need to keep track of (status, work loss, who created it, who's working on it, etc.). I need to attach a list of teams affected by the issue to each entry in my main issue table. The list of teams affected ideally contains some sort of link to a unique table instance, just for that issue, that shows the list of teams affected and what percentage of each team's labs are affected.
So my question is: what is the best way to implement this "link" between an entry in the issue table and a unique table for that issue? Or am I thinking about this problem wrong?
What you are describing is called a "many-to-many" relationship. A team can be affected by many issues, and likewise an issue can affect many teams.
In SQL database design, this sort of relationship requires a third table, one that contains a reference to each of the other two tables. For example:
CREATE TABLE teams (
    team_id INTEGER PRIMARY KEY
    -- other attributes
);
CREATE TABLE issues (
    issue_id INTEGER PRIMARY KEY
    -- other attributes
);
CREATE TABLE team_issue (
    issue_id INTEGER NOT NULL,
    team_id INTEGER NOT NULL,
    FOREIGN KEY (issue_id) REFERENCES issues(issue_id),
    FOREIGN KEY (team_id) REFERENCES teams(team_id),
    PRIMARY KEY (issue_id, team_id)
);
This sounds like a classic many-to-many relationship...
You probably want three tables,
One for issues, with one record (row) per each individual unique issue created...
One for the teams, with one record for each team in your company...
And one table called, say, "IssueTeams" or "TeamIssueAssociations" or "IssueAffectedTeams" to hold the association between the two...
This last table will have one record (row) for each team an issue affects... This table will have a 2-column composite primary key, on the columns IssueId, AND TeamId... Every row will have to have a unique combination of these two values... Each of which is individually a Foreign Key (FK) to the Issue table, and the Team Table, respectively.
For each team, there may be zero to many records in this table, for each issue the team is affected by,
and for each Issue, there may be zero to many records each of which represents a team the issue affects.
If I understand the question correctly I would create....
ISSUE table containing the 20 or so items
TEAM table containing a list of teams.
TEAM_ISSUES table containing the link between the two
The TEAM_ISSUES table needs to contain foreign keys to the ISSUE and TEAM tables (i.e. it should contain an ISSUE_ID and a TEAM_ID), so it acts as an intersection between the two "master" tables. It sounds like this is also the place to put the percentage.
Does that make sense?
There are so many good free open source issue trackers available that you should have pretty good reasons for implementing your own. You could use your time much better in customizing an existing tracker.
We are using Bugtracker.NET in the team I work for. It's been customized quite a bit, but there was no point in developing a system from the beginning. The reason we chose that product was that it runs on .NET and works great with SQL Server, but there are many other alternatives.
We can see those entities in your domain:
The "Issue"
"Teams" affected by that issue, in a certain percentage
So, having identified those two items, you can represent that with two tables, and the relationship between them is another table, that could track the percentage impact too.
Hope this helps.
I wouldn't create a unique table for each issue. I would do something like this
Table: Issue
IssueId primary key
status
workLoss
createdby
etc
Table: Team
TeamID primary key
TeamName
etc
Table: IssueTeam
IssueID (foreign key to issue table)
TeamID (foreign key to team table)
PercentLabsAffected
Unless I'm understanding wrong what you're trying to do, you should not have a unique table for each instance of an issue.
Your database should have three tables: an Issues table, a Teams table, and an IssueTeams joining table. The IssueTeams table would include foreign keys (i.e. TeamID and IssueID) that reference the respective team in Teams and issue in Issues. So IssueTeams might have records like (Issue1, Team1), (Issue1, Team3). You could keep the affected percentage of each team's labs in the joining table.
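To make the joining-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the asker's SQL database. The Status and TeamName columns, the PercentLabsAffected name, and all sample values are illustrative assumptions. It stores the (Issue1, Team1) and (Issue1, Team3) records mentioned above, each with its own percentage, and then lists the teams affected by one issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
CREATE TABLE Issues (IssueID INTEGER PRIMARY KEY, Status TEXT NOT NULL);
CREATE TABLE Teams  (TeamID  INTEGER PRIMARY KEY, TeamName TEXT NOT NULL);
CREATE TABLE IssueTeams (
    IssueID INTEGER NOT NULL REFERENCES Issues(IssueID),
    TeamID  INTEGER NOT NULL REFERENCES Teams(TeamID),
    PercentLabsAffected REAL NOT NULL
        CHECK (PercentLabsAffected BETWEEN 0 AND 100),
    PRIMARY KEY (IssueID, TeamID)      -- one row per (issue, team) pair
);
-- Illustrative data only.
INSERT INTO Issues VALUES (1, 'open');
INSERT INTO Teams  VALUES (1, 'Team1'), (2, 'Team2'), (3, 'Team3');
INSERT INTO IssueTeams VALUES (1, 1, 50.0), (1, 3, 25.0);
""")

# All teams affected by issue 1, with the share of each team's labs affected.
affected = conn.execute("""
    SELECT t.TeamName, it.PercentLabsAffected
    FROM IssueTeams AS it
    JOIN Teams AS t ON t.TeamID = it.TeamID
    WHERE it.IssueID = ?
    ORDER BY t.TeamID""", (1,)).fetchall()
print(affected)  # [('Team1', 50.0), ('Team3', 25.0)]

# The composite primary key prevents listing the same team twice for one issue.
try:
    conn.execute("INSERT INTO IssueTeams VALUES (1, 1, 10.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

A new issue then only adds rows to Issues and IssueTeams; no table is ever created per issue.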
Well, just to be all modern and agile-y, 'getting it right the first time' is less trendy than 'refactorable.' But to work through your model:
You have Issues (heh heh). You have Teams.
An Issue affects many Teams. A Team is affected by many Issues. So just for the basic problem, you seem to have a classic Many:Many relationship. A join table containing two columns, one to Issue PK and one to Team PK takes care of that.
Then you have the question of what % of teams. There's a dynamic aspect to that, of course, so to do it right, you'll need to specify a trigger. But the obvious place to put it is a column in Issue ("Affected_Team_Percentage").
If I understand you correctly, you want to create a new table of teams affected for each issue. Creating tables as part of normal operations rings my relational database design alarm bell. Don't do it!
Instead, use one affected_teams table with a foreign key to the issues table and a foreign key to the teams table. That will do the trick.