Analysis Services Cube - ssas

I am trying to understand the concept of an Analysis Services cube. Please see the DDL below (Dim is short for Dimension):
CREATE TABLE DIMCustomer (ID INT identity, Name varchar(100), primary key (ID))
CREATE TABLE DIMSupplier (ID INT identity, Name varchar(100), primary key (ID))
CREATE TABLE DIMSalesman (ID INT identity, Name varchar(100), primary key (ID))
CREATE TABLE DIMDeliveryDriver (ID INT identity, Name varchar(100), primary key (ID))
CREATE TABLE DIMDate (ID INT identity, month varchar(100), day varchar(100), year varchar(100), primary key (ID))
CREATE TABLE FactTable (CustomerID int, SupplierID INT, SalesmanID INT, DeliveryDriverID int, DateID INT)
Is this an example of a scenario that supports a five-dimensional cube, given that the fact table contains five foreign keys (CustomerID, SupplierID, SalesmanID, DeliveryDriverID and DateID)?

An OLAP cube is a technology that stores data in an optimized way to provide a quick response to various types of complex queries by using dimensions and measures.
OLAP cubes can be considered the final piece of the puzzle for a data warehousing solution. A useful feature of an OLAP cube is that it can hold the data in aggregated form.
In your case, if you include a sales amount in the fact table, you can create sales summaries by customer, supplier, salesman, delivery driver, and so on.
For a beginner-level walkthrough with a sample, I suggest reading this:
https://www.codeproject.com/Articles/658912/Create-First-OLAP-Cube-in-SQL-Server-Analysis-Serv

You'd get a count measure for free in SSAS for the measure group, which may suffice for prototyping, but, agreed, where are the other facts?

With this relational database schema, you'll be able to create a basic cube with 5 dimensions and 1 fact table.
The potential problem is that your fact table has no facts. I'd expect something such as salesAmount or another numeric value in it. At the moment it is a factless fact table. Factless fact tables are a great way to model relationships between dimensions, but that is usually not what you are trying to do in the first hours of cube design.
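To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the relational source (it is not SSAS). The SalesAmount column and the sample rows are assumptions added for illustration and are not part of the schema in the question. Without a numeric column, a row count is the only measure available; with one, there is something to sum per dimension member.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DIMCustomer (ID INTEGER PRIMARY KEY, Name VARCHAR(100));
-- SalesAmount is a hypothetical measure, not in the original schema
CREATE TABLE FactTable (CustomerID INT REFERENCES DIMCustomer (ID),
                        SalesAmount REAL);
INSERT INTO DIMCustomer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO FactTable VALUES (1, 10.0), (1, 15.0), (2, 7.5);
""")

# COUNT(*) is all a factless fact table can offer; SUM needs a numeric fact.
rows = conn.execute("""
    SELECT c.Name, COUNT(*) AS FactCount, SUM(f.SalesAmount) AS TotalSales
    FROM FactTable f
    JOIN DIMCustomer c ON c.ID = f.CustomerID
    GROUP BY c.Name
    ORDER BY c.Name
""").fetchall()
print(rows)  # [('Alice', 2, 25.0), ('Bob', 1, 7.5)]
```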

Related

Multiple columns from the same table in another table as foreign keys

I'm trying to make a game called "Odds On". You may have played this before.
Two people decide on a range of numbers and a challenge/dare
They both say their chosen number within the range at the same time.
If the choices are the same, then whoever is 'it' has to complete the challenge/dare
I'm trying to learn and create an Ubuntu server with WinForm clients to make this game online.
With my current plan, I've run into a problem. I would like the game table to hold two user.username(s) as foreign keys as well as two round.roundID(s).
I've tried making the database in Access, and Access said it could not enforce referential integrity.
Can this be implemented in MySQL? Perhaps using linking tables?
UML of what I'm trying to create
I would avoid MS-Access since it has several limitations that could hamper your learning of relational databases.
You can have many relationships between each pair of tables. Your example -- in PostgreSQL -- can look like:
create table users (
username varchar(20) primary key not null,
password varchar(20),
currently_logged_in int
);
create table round (
round_id int primary key not null,
user1_result int,
user2_result int
);
create table game (
game_id int primary key not null,
user1 varchar(20) references users (username),
user2 varchar(20) references users (username),
round1_id int references round (round_id),
round2_id int references round (round_id),
lower_bound int,
upper_bound int
);
See this example at db<>fiddle.
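As a quick check that two foreign keys to the same table are enforced independently, here is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL. The schema is a trimmed version of the one above, and the user names are made-up sample data. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users (username VARCHAR(20) PRIMARY KEY NOT NULL);
CREATE TABLE game (
    game_id INT PRIMARY KEY NOT NULL,
    user1 VARCHAR(20) REFERENCES users (username),
    user2 VARCHAR(20) REFERENCES users (username)
);
INSERT INTO users VALUES ('ann'), ('ben');
INSERT INTO game VALUES (1, 'ann', 'ben');
""")

# A game whose second player does not exist should be rejected.
try:
    conn.execute("INSERT INTO game VALUES (2, 'ann', 'nobody')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```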

Best practice for verifying correctness of data in MS SQL

We have multiple tables with different data (for example masses, heights, widths, ...) that need to be verified by employees. To keep track of already verified data, we are thinking about designing the following table:
TableName varchar
ColumnName varchar
ItemID varchar
VerifiedBy varchar
VerificationDate date
This table links the different product IDs, tables and columns that will be verified, for example:
Table dbo.Chairs
Column dbo.Chairs.Mass
ItemId 203
VerifiedBy xy
VerificationDate 10.09.2020
While creating foreign keys, we were able to link the ItemID to the central ProductsID table. We wanted to create two more foreign keys for database tables and columns. We were unable to do this, since "sys.tables" and "INFORMATION_SCHEMA.COLUMNS" are views.
How can I create foreign keys to the available database tables/columns?
Is there a better way to do this kind of data verification?
Thanks.
You can add a CHECK constraint to verify the correctness of the data that is inserted or updated in the TableName and ColumnName columns, like this:
CREATE TABLE Products (
ItemID VARCHAR(10) PRIMARY KEY,
ItemName NVARCHAR(50) UNIQUE
)
CREATE TABLE Chairs (
ItemID VARCHAR(10) PRIMARY KEY,
FOREIGN KEY (ItemID) REFERENCES dbo.Products,
Legs TINYINT NOT NULL
)
CREATE TABLE Sofas (
ItemID VARCHAR(10) PRIMARY KEY,
FOREIGN KEY (ItemID) REFERENCES dbo.Products,
Extendable BIT NOT NULL
)
CREATE TABLE Verifications (
TableName sysname NOT NULL,
ColumnName sysname NOT NULL,
ItemID VARCHAR(10) REFERENCES dbo.Products,
VerifiedBy varchar(30) NOT NULL,
VerificationDate date NOT NULL,
CHECK (COLUMNPROPERTY(OBJECT_ID(TableName),ColumnName,'ColumnId') IS NOT NULL)
)
You need to grant VIEW DEFINITION on the tables to the users who have rights to insert/update the data.
This will not entirely prevent wrong data, because the check constraints will not be verified when you drop a table or a column.
However, I don't think this is necessarily a good idea. A better (and more conventional) way would be to add VerifiedBy and VerificationDate columns to the Products table (if you can force the user to verify all the properties at once), or to create separate columns for each verified column (for example, LegsVerifiedBy and LegsVerificationDate in the Chairs table, ExtendableVerifiedBy and ExtendableVerificationDate in the Sofas table, and so on) if the verification really needs to be done separately for each column.
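The per-column alternative could look roughly like the following sketch, written with Python's built-in sqlite3 module instead of SQL Server so that it runs anywhere. The CHECK that forces the verifier and the date to be set together is an assumption of this sketch, not something the answer requires, and the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Chairs (
    ItemID VARCHAR(10) PRIMARY KEY,
    Legs INT NOT NULL,
    LegsVerifiedBy VARCHAR(30),
    LegsVerificationDate DATE,
    -- assumption: verifier and date are either both set or both NULL
    CHECK ((LegsVerifiedBy IS NULL) = (LegsVerificationDate IS NULL))
);
INSERT INTO Chairs VALUES ('A1', 4, 'xy', '2020-01-15');
INSERT INTO Chairs VALUES ('A2', 3, NULL, NULL);
""")

# Items still waiting for verification of the Legs column.
unverified = conn.execute(
    "SELECT ItemID FROM Chairs WHERE LegsVerifiedBy IS NULL"
).fetchall()
print(unverified)  # [('A2',)]
```

With this layout no reference to catalog views is needed, because the verified column is identified by the name of the verification columns themselves.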

Combine multiple outrigger tables into one?

I have a dimension table and several outriggers.
create table dimFoo (
FooKey int primary key,
......
)
create table triggerA (
FooKey int references dimFoo (FooKey),
Value varchar(255),
primary key nonclustered (FooKey, Value)
)
create table triggerB (
FooKey int references dimFoo (FooKey),
Value varchar(255),
primary key nonclustered (FooKey, Value)
)
create table triggerC (
FooKey int references dimFoo (FooKey),
Value varchar(255),
primary key nonclustered (FooKey, Value)
)
Should these outrigger tables be merged into one table?
create table Triggers (
FooKey int references dimFoo (FooKey),
TriggerType varchar(20), -- triggerA, triggerB, triggerC, etc.
Value varchar(255),
primary key nonclustered (FooKey, TriggerType, Value)
)
To handle this kind of scenario, such as a dimCustomer where each customer can have multiple hobbies, the typical Kimball approach is to use a bridge table between the dimensions (dimCustomer and dimHobby).
This link provides a summary of how bridge tables could solve this problem and also alternatives which may work better for you.
Without knowing more about your specific scenario, including what the business requirements are, how many of these value types you have, how 'uniform' the various value types and values are, and the BI technology you'll be using to access the data, it's hard to give a definitive answer on whether you should combine the bridges into one uber-bridge that caters for the various many-to-many relationships. All of the above influence the answer to some extent.
Typically the 'generic tables' approach is more useful behind the scenes for administration than it is for presenting for analytics. My default approach would be to have specific bridge tables until/unless this became unmanageable from an ETL perspective or perceived as much more complex from a user query perspective. I wouldn't look to 'optimise' to a combined table from the get-go.
If your situation is outside the usual norms (do you have three as per your example, or ten?), combining could well be a good idea. This would make it more like a factless fact, with dimensions of dimCustomer, dimValueType and dimValue, and would be a perfectly reasonable solution.
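For a feel of what the combined table means for queries, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the warehouse database. The table follows the combined Triggers definition from the question; the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimFoo (FooKey INT PRIMARY KEY);
CREATE TABLE Triggers (
    FooKey INT REFERENCES dimFoo (FooKey),
    TriggerType VARCHAR(20),
    Value VARCHAR(255),
    PRIMARY KEY (FooKey, TriggerType, Value)
);
INSERT INTO dimFoo VALUES (1), (2);
INSERT INTO Triggers VALUES
    (1, 'triggerA', 'x'), (1, 'triggerB', 'y'), (2, 'triggerA', 'z');
""")

# Every query for one value type has to filter on TriggerType.
a_values = conn.execute("""
    SELECT FooKey, Value
    FROM Triggers
    WHERE TriggerType = 'triggerA'
    ORDER BY FooKey
""").fetchall()
print(a_values)  # [(1, 'x'), (2, 'z')]
```

The extra filter is the kind of query-side complexity that the specific-bridge-table default avoids.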

Storing updates from different updaters in the History table

I have a table "AvailableProducts" with the following fields:
StoreID int,
ProductID int,
ProductPrice decimal,
IsAvailable bit
The ProductPrice can be changed either by a salesperson in the store or by a price update from the brand.
Now to store the history of price changes, I've created a history table as follows:
Table ProductPriceHistory
UpdateID int,
StoreID int,
ProductID int,
ProductPrice decimal,
IsAvailable bit,
UpdatedBy int,
UpdatedAt datetime
The problem I am facing is that storing either a BrandID or a SalesPersonID (whichever made the price change) in the same UpdatedBy field is the wrong design.
I can modify it to something like this:
Table ProductPriceHistory
UpdateID int,
StoreID int,
ProductID int,
ProductPrice decimal,
IsAvailable bit,
BrandId int,
SalesPersonID int,
UpdatedAt datetime
This would allow me to reference the updating entity by a foreign key in the Brand and SalesPerson Tables using the Id fields. But it would also lead to many empty or null column values since only one entity i.e. either brand or SalesPerson can update the price at given time.
I could also create two different history tables to save updates made by SalesPerson and Brands separately but this solution doesn't look appealing.
Are there any suggestions for improving this design? I would like the history for this table to be maintained in a single table. Thanks :)
You could create an ObjectType table with 2 items:
CREATE TABLE [dbo].[ObjectType](
[ObjectTypeId] [int] NOT NULL,
[ObjectTypeName] [nvarchar](100) NULL)
GO
INSERT INTO dbo.ObjectType VALUES (1, 'Brand')
INSERT INTO dbo.ObjectType VALUES (2, 'SalesPerson')
Then add a new column ObjectTypeId to the ProductPriceHistory table:
ALTER TABLE ProductPriceHistory
ADD ObjectTypeId int
This way you can log changes made by many kinds of updaters, not only SalesPerson and Brand.
This is a common question - it's often asked in relation to the object orientation concept of polymorphism.
There are 3 standard solutions - you've identified two of them; the final one is to model the common fields in a single table, and have separate tables for the variant data. That would have tables "sales_update" and "brand_update", with foreign keys on updateID back to the update history table.
There is no elegant solution - the relational model simply doesn't support this use case particularly nicely. You need to look at the rest of your system, and pick the solution that's easiest in your case. Usually, that's the "one table stores everything" model - but your situation may be different.
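The third option (common fields in one table, variant data in separate tables) could be sketched as follows. This uses Python's built-in sqlite3 module rather than SQL Server, keeps only some of the history columns, and fills in made-up sample rows; the table names sales_update and brand_update come from the description above, while their exact columns are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE ProductPriceHistory (
    UpdateID INTEGER PRIMARY KEY,
    StoreID INT, ProductID INT, ProductPrice DECIMAL, UpdatedAt DATETIME
);
-- one variant table per kind of updater, keyed by the shared UpdateID
CREATE TABLE sales_update (
    UpdateID INT PRIMARY KEY REFERENCES ProductPriceHistory (UpdateID),
    SalesPersonID INT NOT NULL
);
CREATE TABLE brand_update (
    UpdateID INT PRIMARY KEY REFERENCES ProductPriceHistory (UpdateID),
    BrandID INT NOT NULL
);
INSERT INTO ProductPriceHistory VALUES (1, 10, 100, 9.99, '2020-01-01 09:00');
INSERT INTO ProductPriceHistory VALUES (2, 10, 100, 8.99, '2020-01-02 09:00');
INSERT INTO sales_update VALUES (1, 7);
INSERT INTO brand_update VALUES (2, 3);
""")

# Reassemble "who updated" from whichever variant table has a row.
rows = conn.execute("""
    SELECT h.UpdateID,
           CASE WHEN s.UpdateID IS NOT NULL THEN 'SalesPerson' ELSE 'Brand' END,
           COALESCE(s.SalesPersonID, b.BrandID)
    FROM ProductPriceHistory h
    LEFT JOIN sales_update s ON s.UpdateID = h.UpdateID
    LEFT JOIN brand_update b ON b.UpdateID = h.UpdateID
    ORDER BY h.UpdateID
""").fetchall()
print(rows)  # [(1, 'SalesPerson', 7), (2, 'Brand', 3)]
```

The history itself stays in a single table with no nullable updater columns, at the cost of two extra joins when the updater is needed.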

SQL One-to-Many Table vs. multiple one-to-one relationships

I'm working on a project with the following objective: a User can create a Challenge and select an optional Rival to take part in this challenge. The Challenge generates daily entries and will track stats on these.
The basic User and Entry entities look like this:
CREATE TABLE users (
id INT,
PRIMARY KEY (id)
);
CREATE TABLE entries (
challengeId INT,
userId INT,
entryDate DATE,
entryData VARCHAR,
PRIMARY KEY (challengeId, userId, entryDate)
)
The piece I'm having trouble with is the Challenge piece with the Rival concept. I can see two approaches.
-- Hard-code the concept of a Challenge Owner and Rival:
CREATE TABLE challenges (
id INT,
name VARCHAR,
ownerId INT,
rivalId INT NULL,
PRIMARY KEY (id),
UNIQUE KEY (ownerId, name)
);
-- Create a many-to-one relationship:
CREATE TABLE challenges (
id INT,
name VARCHAR,
PRIMARY KEY (id),
UNIQUE KEY (name)
)
CREATE TABLE participant (
challengeId INT,
userId INT,
isOwner BIT,
PRIMARY KEY (challengeId, userId)
)
The problem with the first approach is that referential integrity is tough since now there are two columns where userIds reside (ownerId and rivalId). I'd have to create two tables for everything (owner_entries, rival_entries, owner_stats, etc.) in order to set up foreign keys.
The second approach solves this and has some advantages, like allowing multiple rivals in the future. However, one thing I can't do anymore with that approach is enforce Challenge name uniqueness per user instead of across the whole Challenge table. Additionally, tasks like finding a Challenge's owner are now trickier.
What's the right approach to the Challenges table? Is there any way to set up these tables in a developer-friendly manner, or should I just jump all the way to Class Table Inheritance and manage the concept of Owner/Rivals there?
I think the way I would set this up is as follows (using the second approach):
CREATE TABLE challenges (id INT,
name VARCHAR,
owner_id INT,
PRIMARY KEY (id),
UNIQUE KEY (name, owner_id))
CREATE TABLE participant (challengeId INT,
userId INT,
PRIMARY KEY (challengeId, userId))
This allows easy tracking of who owns the challenge, yet extracts out the individual participants.
This would also allow you to safely make the challenge name unique per owner, and foreign keys on the userId in participant are easy. 'Rivals' are then all participants that are not the challenge owner.
I consider the first approach the right one.
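A minimal sketch of that layout, using Python's built-in sqlite3 module as a stand-in for MySQL. It assumes the owner is also listed in participant, gives VARCHAR an arbitrary length, and uses made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE challenges (id INT PRIMARY KEY, name VARCHAR(50), owner_id INT,
                         UNIQUE (name, owner_id));
CREATE TABLE participant (challengeId INT, userId INT,
                          PRIMARY KEY (challengeId, userId));
INSERT INTO challenges VALUES (1, 'pushups', 10);
INSERT INTO participant VALUES (1, 10), (1, 20), (1, 30);
""")

# Rivals are the participants who are not the challenge owner.
rivals = conn.execute("""
    SELECT p.userId
    FROM participant p
    JOIN challenges c ON c.id = p.challengeId
    WHERE p.userId <> c.owner_id
    ORDER BY p.userId
""").fetchall()
print(rivals)  # [(20,), (30,)]

# The same name under a different owner is fine; under the same owner it is not.
conn.execute("INSERT INTO challenges VALUES (2, 'pushups', 11)")
try:
    conn.execute("INSERT INTO challenges VALUES (3, 'pushups', 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```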
You could have one table for users and one for challenges.
Are you aware that you can reference one table twice like below?
SELECT * FROM CHALLENGES
INNER JOIN USERS AS OWNERS ON OWNERS.ID = CHALLENGES.OWNERID
-- LEFT JOIN because RIVALID is nullable; an INNER JOIN would drop challenges without a rival
LEFT JOIN USERS AS RIVALS ON RIVALS.ID = CHALLENGES.RIVALID
In this case you can reference both rivals and owners without creating new tables.
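Here is a runnable sketch of joining the users table twice, using Python's built-in sqlite3 module as a stand-in for MySQL. The name column on users and the sample rows are assumptions for illustration. Because rivalId is nullable in the question's schema, the sketch uses a LEFT JOIN for the rival so that challenges without one still appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE challenges (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    ownerId INT REFERENCES users (id),
    rivalId INT REFERENCES users (id)  -- optional, may be NULL
);
INSERT INTO users VALUES (1, 'ann'), (2, 'ben');
INSERT INTO challenges VALUES (1, 'run', 1, 2), (2, 'swim', 2, NULL);
""")

# The same table is joined under two aliases, one per role.
rows = conn.execute("""
    SELECT c.name, owners.name, rivals.name
    FROM challenges c
    INNER JOIN users AS owners ON owners.id = c.ownerId
    LEFT JOIN users AS rivals ON rivals.id = c.rivalId
    ORDER BY c.id
""").fetchall()
print(rows)  # [('run', 'ann', 'ben'), ('swim', 'ben', None)]
```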