My data warehouse has a snowflake schema and I am trying to create a cube for it. But I don't know how to add a sub-dimension. I am new to this and I am really confused by hierarchies and levels. This is my data warehouse:
ratings-fact_table ----> books_dm ----------> authors
  rating_id                book_id              author_id
  book_id                  author_id
  user_id                  publisher_id ----> publishers
     |                                          publisher_id
     |
users_dm --------------> cities ------------> countries
  user_id                  city_id              country_id
  city_id                  country_id
Please help!
I guess your data warehouse is built on top of some relational database (MySQL, etc.). You can solve this problem by converting the snowflake schema to a star schema manually, by creating SQL views in your database. Then in your OLAP schema you would use these tables:
Fact table (cube): ratings-fact_table
Dimension table: books_dm_view
Dimension table: users_dm_view
Where:
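books_dm_view is a SQL view:
-- Note: MySQL rejects a view whose SELECT * produces duplicate column names
-- (here author_id and publisher_id); if that happens, list the columns explicitly.
CREATE VIEW books_dm_view AS
SELECT * FROM books_dm b
LEFT JOIN authors a ON b.author_id = a.author_id
LEFT JOIN publishers p ON p.publisher_id = b.publisher_id;
users_dm_view is a SQL view:
-- The same caveat applies to city_id and country_id.
CREATE VIEW users_dm_view AS
SELECT * FROM users_dm u
LEFT JOIN cities c ON c.city_id = u.city_id
LEFT JOIN countries n ON n.country_id = c.country_id;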
This way your dimensions have no sub-dimensions and you don't need any extra joins in your OLAP schema.
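To make the flattening concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The columns title, author_name and publisher_name are assumptions added for illustration (the question only shows the key columns), and the view lists its columns explicitly so that each name appears once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- title, author_name and publisher_name are hypothetical columns.
CREATE TABLE authors    (author_id INTEGER PRIMARY KEY, author_name TEXT);
CREATE TABLE publishers (publisher_id INTEGER PRIMARY KEY, publisher_name TEXT);
CREATE TABLE books_dm   (book_id INTEGER PRIMARY KEY, title TEXT,
                         author_id INTEGER, publisher_id INTEGER);

INSERT INTO authors    VALUES (1, 'Ann Author');
INSERT INTO publishers VALUES (1, 'Pub House');
INSERT INTO books_dm   VALUES (10, 'First Book', 1, 1),
                              (11, 'Orphan Book', NULL, NULL);

-- Flattened dimension: one row per book, each column named once.
CREATE VIEW books_dm_view AS
SELECT b.book_id, b.title,
       a.author_id, a.author_name,
       p.publisher_id, p.publisher_name
FROM books_dm b
LEFT JOIN authors a    ON a.author_id = b.author_id
LEFT JOIN publishers p ON p.publisher_id = b.publisher_id;
""")

rows = conn.execute(
    "SELECT book_id, title, author_name, publisher_name "
    "FROM books_dm_view ORDER BY book_id"
).fetchall()
for row in rows:
    print(row)
```

Because the view uses LEFT JOIN, a book with no author or publisher still appears in the dimension, with NULLs in the missing columns.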
This is what I wrote but I am not getting the correct answer:
select top 1 customer.first_name
from customer, claims
where customer.id = claims.id
order by claims.amount_of_claim desc
GO
You are joining claims.id and customer.id. Those will never match except by accident, just as the number on a claims document will never match the number on your insurance card or policy except by accident.
When you write a query, the table rows are matched based only on the query expressions, not any constraints between tables. This means that the following query will try to match unrelated things:
select top 1 customer.first_name
from customer inner join claims
on customer.id = claims.id
order by claims.amount_of_claim desc
The diagram shows that a Claim is related to a Policy which in turn is related to a customer:
Claims(customer_policy_id) -> Customer_Policy(customer_id) -> Customer
You'll have to join these 3 tables in your query:
select top 1 cust.first_name
from customer cust
inner join customer_policy pol on cust.id=pol.customer_id
inner join claims cl on pol.id=cl.customer_policy_id
order by cl.amount_of_claim desc
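The difference between the two joins can be checked on a tiny data set. This is only a sketch with made-up rows, run through Python's built-in sqlite3 module, so it uses LIMIT 1 where the SQL Server queries above use TOP 1. The table and column names follow the queries above; the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer        (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE customer_policy (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE claims          (id INTEGER PRIMARY KEY, customer_policy_id INTEGER,
                              amount_of_claim REAL);

INSERT INTO customer        VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO customer_policy VALUES (10, 1), (11, 2);
-- Claim 1 (the largest) belongs to policy 11, which is Bob's.
INSERT INTO claims          VALUES (1, 11, 900.0), (2, 10, 100.0);
""")

# Joining on the two unrelated id columns pairs claim 1 with customer 1.
wrong = conn.execute("""
    SELECT customer.first_name
    FROM customer INNER JOIN claims ON customer.id = claims.id
    ORDER BY claims.amount_of_claim DESC LIMIT 1
""").fetchone()[0]

# Going through customer_policy follows the real relationship.
right = conn.execute("""
    SELECT cust.first_name
    FROM customer cust
    INNER JOIN customer_policy pol ON cust.id = pol.customer_id
    INNER JOIN claims cl ON pol.id = cl.customer_policy_id
    ORDER BY cl.amount_of_claim DESC LIMIT 1
""").fetchone()[0]

print(wrong, right)
```

Both queries run without error, which is why the first one is easy to miss: it simply returns the wrong customer.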
In a relational database relations are represented by tables, not foreign key constraints. A Foreign Key Constraint is used to ensure that the FK values stored in a table are valid.
On top of that, an ER diagram is not a database diagram. The entities and relations shown in an ER diagram don't map directly to tables. For example, a many-to-many relation between, e.g., Customer and Address in an ER diagram would have to be translated to a bridge table, CustomerAddresses(CustomerId,AddressId). After all, a table is a relation in relational theory, and CustomerAddresses defines the relation between Customer and Address.
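A bridge table like the CustomerAddresses example could look roughly as follows. This is a sketch in Python's built-in sqlite3 module; the Name and City columns and all sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Name and City are hypothetical columns.
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Address  (AddressId  INTEGER PRIMARY KEY, City TEXT);

-- The bridge table holds one row per (customer, address) pair.
CREATE TABLE CustomerAddresses (
    CustomerId INTEGER REFERENCES Customer(CustomerId),
    AddressId  INTEGER REFERENCES Address(AddressId),
    PRIMARY KEY (CustomerId, AddressId)
);

INSERT INTO Customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Address  VALUES (1, 'Paris'), (2, 'Oslo');
-- Alice has two addresses; Bob shares one of them.
INSERT INTO CustomerAddresses VALUES (1, 1), (1, 2), (2, 2);
""")

pairs = conn.execute("""
    SELECT c.Name, a.City
    FROM Customer c
    JOIN CustomerAddresses ca ON ca.CustomerId = c.CustomerId
    JOIN Address a            ON a.AddressId   = ca.AddressId
    ORDER BY c.Name, a.City
""").fetchall()
print(pairs)
```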
I've just started learning SQL and need help with an assignment question. I am asked to look through a dataset about Kickstarter campaigns. I'm asked to find the top 3 categories by amount of backers.
Here is the ER diagram:
ER diagram
In the 'Campaign' Table, there's the 'backers' column, but the 'Category' Table is only related with the Campaign through the 'Sub-Category' Table.
So far, I have been able to join sub_category.category_id with the category name, but I'm not sure how to take this new table and join it with Campaign:
SELECT C.name AS category_name, SC.category_id, SC.id AS SC_id
FROM Category AS C
JOIN sub_category AS SC ON C.id = SC.category_id
Screenshot
I am hoping to have a table where there is a column for 'Category Name' and 'Backers' and then simply sort it by the number of backers
How should I go about this? Am I on the right track?
SELECT C.name AS category_name, CA.backers
FROM campaign AS CA
JOIN sub_category AS SC
ON CA.sub_category_id =SC.Id
JOIN Category AS C
ON C.id = SC.category_id
order by CA.backers
You can have multiple joins together in one query.
First, there is a connection between the Campaign and Sub_Category tables, which lets you join these two tables.
Then you can join the Category table, because Sub_Category and Category are connected through Category_Id, which is a foreign key in the Sub_Category table.
Finally, you can just order by Backers.
Let me know in the comments if you have any issues or doubts.
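The query above returns one row per campaign. If the assignment wants the top 3 categories by total backers, one possible next step is to aggregate per category. The sketch below uses Python's built-in sqlite3 module with made-up rows; it assumes "amount of backers" means the sum of backers over all campaigns in a category, and uses LIMIT, which some databases spell differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sub_category (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE campaign     (id INTEGER PRIMARY KEY, sub_category_id INTEGER,
                           backers INTEGER);

-- Sample rows are invented for illustration.
INSERT INTO category     VALUES (1, 'Games'), (2, 'Music'), (3, 'Art'), (4, 'Food');
INSERT INTO sub_category VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 4);
INSERT INTO campaign     VALUES (1, 1, 100), (2, 2, 50), (3, 3, 120),
                                (4, 4, 30),  (5, 5, 10), (6, 3, 5);
""")

top3 = conn.execute("""
    SELECT C.name AS category_name, SUM(CA.backers) AS total_backers
    FROM campaign AS CA
    JOIN sub_category AS SC ON CA.sub_category_id = SC.id
    JOIN category     AS C  ON C.id = SC.category_id
    GROUP BY C.id, C.name
    ORDER BY total_backers DESC
    LIMIT 3
""").fetchall()
print(top3)
```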
And just to take Magnus's answer and rewrite visually, you can better see the hierarchy of the query. See how it closely resembles that of your table relationships
SELECT
      C.name category_name,
      CA.backers
   FROM
      campaign CA
         JOIN sub_category SC
            ON CA.sub_category_id = SC.Id
            JOIN Category C
               ON SC.category_id = C.id
   order by
      CA.backers
Notice how each table is indented under the table its ID comes from. This way you know which column you are connecting FROM and which you are connecting TO. I have found that listing the tables in the FROM clause first, to show HOW the tables are related and ON which foreign key : primary key relationships, is the hardest part. After that, it's just a matter of pulling the columns you want.
I was tasked with creating a complex query that includes all of the data from all of the tables minus the keys. I am having an issue with the dead-end tables and how to circle back around to include the data of the connecting table. I need to select the columns DivisionName, ProgramName, ProgramChairFName, ProgramChairLName, CourseID, OutcomeDescription from the listed tables.
SQL Diagram
The 'dead-ends' aren't really dead-ends. When you join all the tables by the appropriate keys, you'll get an assembly of the information you want.
Consider a really simple example:
table person
id name
1 Alice
table pet
id person_id animal
1 1 cat
table hobby
id person_id activity
1 1 dancing
Here, the two tables pet and hobby link to the person table via the person_id key.
In your thinking, "pet" could be considered a "dead-end" because it doesn't link to hobby. But it doesn't need to. The query:
SELECT name, animal, activity
FROM person
JOIN pet ON person.id = pet.person_id
JOIN hobby ON person.id = hobby.person_id;
creates the correct joins back to the person table. It's not a linear path (person -> pet -> hobby). The nature of each join is specified by the "ON" part of the query. You can see this simple example working here: http://sqlfiddle.com/#!9/02c94b/1
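The same example can also be run locally. This is a small sketch using Python's built-in sqlite3 module with exactly the three tables and rows shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pet    (id INTEGER PRIMARY KEY, person_id INTEGER, animal TEXT);
CREATE TABLE hobby  (id INTEGER PRIMARY KEY, person_id INTEGER, activity TEXT);

INSERT INTO person VALUES (1, 'Alice');
INSERT INTO pet    VALUES (1, 1, 'cat');
INSERT INTO hobby  VALUES (1, 1, 'dancing');
""")

# Both pet and hobby join back to person; neither needs a link to the other.
result = conn.execute("""
    SELECT name, animal, activity
    FROM person
    JOIN pet   ON person.id = pet.person_id
    JOIN hobby ON person.id = hobby.person_id
""").fetchall()
print(result)
```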
So, in your case, you can have a series of JOINs:
SELECT [all the columns you want]
FROM Division d JOIN Program p
ON d.DivisionKey = p.DivisionKey
JOIN ProgramChairMap pcm
ON p.ProgramKey = pcm.ProgramKey
JOIN ProgramChair pc
ON pcm.ProgramChairKey = pc.ProgramChairKey
JOIN Course c
ON p.ProgramKey = c.ProgramKey
JOIN CourseOutcome co
ON c.CourseKey = co.CourseKey
JOIN Outcome o
ON co.OutcomeKey = o.OutcomeKey
I'm using MariaDB ColumnStore and in ColumnStore, Circular joins are not supported.
In my database I have data regarding measurements sent from different countries and customers.
So for different roles I need to be able to filter the data inside the view.
So this is the structure I have right now
TABLE Measurements:
Country Customer Measurement
a 1 150
a 2 200
b 3 250
I have a table which maps users to the roles
TABLE UsersToRoles:
Users Roles
x role1
y role2
I have a table which maps Roles to the data it is allowed to see
TABLE RolesToData
Roles VariableType VariableValue
role1 Country a
role1 Customer 1
role1 Customer 2
role2 Country b
role2 Customer 3
I created the following RoleView
CREATE VIEW RoleView AS (
SELECT UsersToRoles.Users AS User,
Country.VariableValue AS Country,
Customer.VariableValue AS Customer
FROM UsersToRoles
JOIN RolesToData Country ON (UsersToRoles.Roles = Country.Roles AND
Country.VariableType = 'Country')
JOIN RolesToData Customer ON (UsersToRoles.Roles = Customer.Roles AND
Customer.VariableType = 'Customer'))
Which returns the following VIEW
User Country Customer
x a 1
x a 2
y b 3
I would then like to join the Measurements table with the RoleView on both Country and Customer such as
CREATE VIEW FinalView AS (SELECT measurement.* FROM measurement JOIN RoleView ON
(measurement.country = RoleView.country AND measurement.customer = RoleView.customer))
The problem is that MariaDB ColumnStore does not support circular joins.
Is there a workaround that achieves the result of a circular join without actually doing a circular join?
Perhaps through creating several views, and doing left or right joins on each view?
Would be really grateful if I found a solution to this.
EDIT: UPDATE
I did manage to find a quick fix, but I am not sure whether it will have any severe impact.
If you use CONCAT on one of the join conditions it works.
CREATE VIEW FinalView AS (SELECT measurement.* FROM measurement JOIN
RoleView ON
(measurement.country = CONCAT(RoleView.country,"") AND measurement.customer =
RoleView.customer))
This is based on a comment by Andrew in this thread
https://jira.mariadb.org/browse/MCOL-1205
As you correctly stated, the simplest workaround is to replace a circular join condition with a functional condition that effectively does nothing to the arguments of the function.
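As a sanity check that such a no-op function leaves the result unchanged, here is a sketch using Python's built-in sqlite3 module. SQLite is not ColumnStore and has no circular-join restriction, so this only shows that the two join conditions are logically equivalent; it says nothing about ColumnStore's planner or performance. The `|| ''` concatenation stands in for CONCAT(x, ""), and storing Customer as text is an assumption made to keep the comparison simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Measurements (Country TEXT, Customer TEXT, Measurement INTEGER);
CREATE TABLE UsersToRoles (Users TEXT, Roles TEXT);
CREATE TABLE RolesToData  (Roles TEXT, VariableType TEXT, VariableValue TEXT);

INSERT INTO Measurements VALUES ('a', '1', 150), ('a', '2', 200), ('b', '3', 250);
INSERT INTO UsersToRoles VALUES ('x', 'role1'), ('y', 'role2');
INSERT INTO RolesToData  VALUES
    ('role1', 'Country', 'a'), ('role1', 'Customer', '1'), ('role1', 'Customer', '2'),
    ('role2', 'Country', 'b'), ('role2', 'Customer', '3');

CREATE VIEW RoleView AS
SELECT u.Users AS User, co.VariableValue AS Country, cu.VariableValue AS Customer
FROM UsersToRoles u
JOIN RolesToData co ON u.Roles = co.Roles AND co.VariableType = 'Country'
JOIN RolesToData cu ON u.Roles = cu.Roles AND cu.VariableType = 'Customer';
""")

plain = conn.execute("""
    SELECT m.* FROM Measurements m
    JOIN RoleView r ON m.Country = r.Country AND m.Customer = r.Customer
    ORDER BY m.Customer
""").fetchall()

# Same join, but one condition goes through a function that changes nothing.
functional = conn.execute("""
    SELECT m.* FROM Measurements m
    JOIN RoleView r ON m.Country = (r.Country || '') AND m.Customer = r.Customer
    ORDER BY m.Customer
""").fetchall()

print(plain == functional)
```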
I need help in creating a stored procedure that pulls out multiple data from different tables.
My current stored procedure is as follows:
@partnername nvarchar(120)
as
select ProjectDetails.Project, ProjectDetails.Id
from ProjectDetails
join ProjectPartners on ProjectPartners.ProjectDetailsId = ProjectDetails.Id
join Partners on Partners.Id = ProjectPartners.PartnersId
where Partners.PartnerName = @partnerName
This stored procedure lets a user enter a partner name and then displays the projects that partner is linked to.
But now I'm wishing to display more data within the stored procedure from other tables such as the following:
Table (ProjectFinance) columns ID, ProjectValue, FundingAgency and AgencyValue
Table (Partnership) Columns ID, PartnershipLevel, PartnershipType.
The tables are linked by foreign keys: both the ProjectFinance table and the Partnership table have a foreign key column named ProjectDetailsID.
Any help will be greatly appreciated!
You need to add the tables to your joins and add their columns to your select list:
select ProjectDetails.Project, ProjectDetails.Id, pf.*, p.*
from ProjectDetails
join ProjectPartners on ProjectPartners.ProjectDetailsId = ProjectDetails.Id
join Partners on Partners.Id = ProjectPartners.PartnersId
join ProjectFinance pf on pf.ProjectDetailsId = ProjectDetails.Id
join Partnership p on p.ProjectDetailsId = ProjectDetails.Id
where Partners.PartnerName = @partnerName
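The extended join can be tried out on sample data. The sketch below uses Python's built-in sqlite3 module, which has no stored procedures, so a hypothetical helper function projects_for_partner with a bound parameter stands in for the procedure. The table and column names follow the question; all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProjectDetails  (Id INTEGER PRIMARY KEY, Project TEXT);
CREATE TABLE Partners        (Id INTEGER PRIMARY KEY, PartnerName TEXT);
CREATE TABLE ProjectPartners (ProjectDetailsId INTEGER, PartnersId INTEGER);
CREATE TABLE ProjectFinance  (ID INTEGER PRIMARY KEY, ProjectDetailsId INTEGER,
                              ProjectValue REAL, FundingAgency TEXT, AgencyValue REAL);
CREATE TABLE Partnership     (ID INTEGER PRIMARY KEY, ProjectDetailsId INTEGER,
                              PartnershipLevel TEXT, PartnershipType TEXT);

INSERT INTO ProjectDetails  VALUES (1, 'Bridge'), (2, 'Tunnel');
INSERT INTO Partners        VALUES (1, 'Acme'), (2, 'Other');
INSERT INTO ProjectPartners VALUES (1, 1), (2, 2);
INSERT INTO ProjectFinance  VALUES (1, 1, 1000.0, 'AgencyA', 400.0),
                                   (2, 2, 500.0, 'AgencyB', 100.0);
INSERT INTO Partnership     VALUES (1, 1, 'Gold', 'Research'),
                                   (2, 2, 'Silver', 'Industry');
""")


def projects_for_partner(conn, partner_name):
    """Hypothetical stand-in for the stored procedure; the name is bound as a parameter."""
    return conn.execute("""
        SELECT pd.Project, pd.Id,
               pf.ProjectValue, pf.FundingAgency, pf.AgencyValue,
               p.PartnershipLevel, p.PartnershipType
        FROM ProjectDetails pd
        JOIN ProjectPartners pp ON pp.ProjectDetailsId = pd.Id
        JOIN Partners pa        ON pa.Id = pp.PartnersId
        JOIN ProjectFinance pf  ON pf.ProjectDetailsId = pd.Id
        JOIN Partnership p      ON p.ProjectDetailsId = pd.Id
        WHERE pa.PartnerName = ?
    """, (partner_name,)).fetchall()


acme_rows = projects_for_partner(conn, "Acme")
print(acme_rows)
```

Note that inner joins drop any project that has no matching ProjectFinance or Partnership row; if such projects should still be listed, LEFT JOIN on those two tables is one option.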