Cognos Analytics join multiple tables - sql

I am working in Cognos Analytics 11.1.7.
I have a data module with three tables. Table 1 and 2 contain all the transactions we do, where table 1 contains the remitter's part of the transaction and table 2 contains the beneficiary's part of the transaction. I.e. one transaction is divided into two tables. Table 3 holds account numbers.
I want to create a report that shows the remitter's customer ID and account number. However, in some cases, the the remitter's account is missing. These customer ID's have unique customer ID's (Y instead of X). In those cases, I want the beneficiary's customer ID and account number. Consider the following three tables
Table 1: REMITTER
CUSTOMER_ID
ORDER_NR
1X
123
2Y
456
1X
789
Table 2: BENEFICIARY
CUSTOMER_ID
ORDER_NR
4X
123
6X
456
6X
789
Table 3: ACCOUNTS
CUSTOMER_ID
ACCOUNT_NR
1X
1111
2Y
3X
3333
4X
4444
5X
5555
6X
6666
What I want is basically the following report:
REPORT OF ALL TRANSACTIONS TODAY
CUSTOMER_ID
ACCOUNT_NR
ORDER_NR
1
1111
123
6
6666
456
1
1111
789
I have solved the CUSTOMER_ID column with a switch case:
CASE
WHEN REMITTER.CUSTOMER_ID CONTAINS 'Y'
THEN BENEFICIARY.CUSTOMER_ID
ELSE REMITTER.CUSTOMER_ID
END
Now here's the problem, I can't create a join (relationship) between the column created above and the ACCOUNTS table since my own column lies directly "under" the data module (on the same level as the tables in the index list to the left). However, if I create a column "under" REMITTER table, I can't use the case calculation from above. Cognos gives med the following error:
The expression is not valid.
XQE-MSR-0008 In module "STACKOVERFLOW", the following query
subjects are not joined: "REMITTER", "BENEFICIARY".
I have tried to circumvent the error by creating all kinds of joins between REMITTER and BENEFICIARY on ORDER_NR but Cognos keeps giving me this error.
I have also tried to make a "triangle" of joins, where REMITTER and BENEFICIARY are joined on ORDER_NR, REMITTER and ACCOUNTS are joined on CUSTOMER_ID and BENEFICIARY and ACCOUNTS are joined on CUSTOMER_ID. This doesn't work. However, when I delete either the REMITTER/ACCOUNT or BENEFICIARY/ACCOUNTS join, it works with the table I keep joined.
I am slowly losing my sanity here. Thanks!

What is the nature of the relationships between these entities?
That is a question which you should ask for everything in your model.
The pattern of that relationship drives the relationship between the objects in the model, which in turn drives what decisions you need to make in your modelling.
For example, is this a Bridge table situation? If so, you need to be aware of it so you can model appropriately.
In the end it falls back on Kimball:
Identify the facts
Identify the dimensions
I am assuming that the cardinality is beneficiary to remitter to account or
remitter to beneficiary to account.
Put beneficiary and remitter into a view in the module, create a relationship between it and account, and delete the relationship between the middle table and account (so that the SQL will use the relationship which you created ).
I think putting the calculation into the table which is in the middle would also do the trick.
I can not say that I can map between your described 'triangle' of joined tables and a business purpose so I could not use that information to understand the entity relationship. Such a pattern of relationship is specifically identified as one to be identified and, as part of the Cognos proven practices, corrected. Because I can not identify if there truly is a business purpose to have such a triangle or not, I can not, and will not, describe the appropriate modelling actions as they are dependent on the business purpose of the relationships between the entities, which takes us back to St. Ralph.

Use queries.
In a report, create a query to resolve the remitter/beneficiary problem, then join that to another query that gets the accounts data.
If this functionality must be canned -- because many unknown report developers will use it to produce many unknown reports -- you can still do the same thing, but in a data set.
As C'est Moi alludes to, you can also do this in the data module by joining Remitter and Beneficiary into a table that computes the Customer_Id.

Related

Avoiding nullable FK - Database Design

I am trying to figure out how to best represent my data in my database with preferably no null FK(sql server).
I have a site where a user has to buy a membership(monthly subscription), can buy ads, buy credits to do more stuff on the site.
So I was thinking something like having these tables
Company
-normal company columns
Plan
- Limitations (json, that contains all the limitations of what membership allows, ie can do x amount of seraches)
- Name (ie Membership lv 1)
- Price
- Qty
- Unit (Monthly, Credit, etc)
PlanTypes
- Type (membership, addon, ad)
CompanyPlans (junction table)
- PlanId
- CompanyId
- Limitations (Json) - this would store like when their membership expires or how many credits they have left. If they would extend membership or buy credits the rows would be updated, so there would be business rules to basically make sure 1 row per plan per company, though this would not work really for ads as they can buy more than 1 ad.
Ads
- normal add columns
- Start
- End
So my problem I run into is, that the Plan table is the table that keeps track of if their membership subscription, credits which I think is fine.
However when it gets to ads it gets weird, as right now I was planning to put the relationship with Company. So now all of sudden for ads the checking of if they are still active or not is done in the ads table where everything else is in the CompanyPlan table.
To make it worse does the start and end date of the ad still get duplicated in the CompanyPlan table for consistency?
I really don't want to try to break up the Plan table into something like Subscription Table, Addon table and Ad Table as I am planning to an order history & order history line table that links to the product bought and I don't want to have 3 different relationships to the Order history line table and have always 2 of FK relationships null, as I think that is bad as well.
Another option I was thinking but again I am not sure if it is bad practice is to put the Ad relationship on the CompanyPlan Junction Table and keep the start/end date in the Company Table and other info in the Ad table.
Example Data
Company
Id Name
1 My Compnay
PlanTypes
Id Type
1 Membership
2 Addon
3 Ad
Plans
Id PlanTypeId Limitations Name Price Value Unit
1 1 {Searches: 100} Plan 1 $30 1 Month
2 2 {} Extra Searches $10 100 Credits
3 3 {} Ad1 $100 1 Month
CompanyPlan
Id PlanId Limitations CompnyId
1 1 {Start: "2018-01-01", End: {2018-02-01}, 1
2 2 {ExtraSearches: 100}, 1
3 3 {AdStart: "2018-01-01", AdEnd: {2018-02-01}, 1
Ads
Id CompanyId Start End
1 1 2018-01-01 2018-02-01
Order History
Id CompanyId
1 1
Order Lines
Id OrderHistoryId PlanId
1 1 1
2 1 2
3 1 3
For the start like The Impaler already mentioned in his comments, nullable FK aren't a bad thing if used correctly. I would design your tables as follow:
Company
-Id
-Name
Plan
-Id
-Name
-Details
Addon
-Id
-Value
DefaultPlanAddon
-Id
-Plan_Id
-Addon_Id
Subscription
-Id
-Company_Id
-Plan_Id
-Start
-End
SubscriptionAddons
-Id
-Subscription_Id
-Addon_Id
Advertisement
-Id
-Subscription_Id
-Content
-Start
-End
Most of the tables are pretty straight forward I guess.
Company
All data to indentify the company/customer.
Plan
Basic details about a plan.
Addon
Here it gets a bit more interesting. In this table are all addons saved. In my understanding addons are useable for all different plans. If this is not the case, you could add something like a PlanAddons table that holds the information which plans can have which addons. Value contains what this addon is about. Maybe about 100 extra adds or something other. As for what I know you can use a simple string as the type since a addon is basically an Id and a description of what this addon is.
Subscription
I would seperate make a subscription table. You can also save the start and end date for each subscription. With this you also have a pruchase history and dont need an OrderHistory table. If the start date does not equals the purchase date you can easily add a PurchaseDate column.
SubscriptionAddon
This table includes all addons for a specific subscription since one subcription can have many addons.
DefaultPlanAddon
This table stores the default addons for each plan. This informations are only used if a company buys a subscription. Your logic looks up what default addons a plan has and put int, together with other wanted addons, into the SubscriptionAddon table.
Advertisement
Here are all adds stored. Also with start and end. Like with a subscription you can use this table as well for the OrderHistrory.
After reading your case, this is what I think:
I think the advertisement table (ads) should store the begin/end date of the ad.
The relationship between company and ads is 1:n. So ads has a foreign key pointing to company.
The company table should not store the begin/end date of the ad, since the same company could buy more ads in the future. When it does, the new ad gets a begin/end date, without affecting the old ad, that may be obsolete at this point.
This should keep you model clean, without (dangerous) redundancy.

Data warehouse FactTable

I am trying to use this to learn how about data warehousing and having trouble understanding the concept of the fact table.
http://www.codeproject.com/Articles/652108/Create-First-Data-WareHouse
What would be some queries that I could run to find information from the faceable, and what questions do they answer.
A fact table is used in the dimensional model in data warehouse design. A fact table is found at the center of a star schema or snowflake schema surrounded by dimension tables.
A fact table consists of facts of a particular business process e.g., sales revenue by month by product. Facts are also known as measurements or metrics. A fact table record captures a measurement or a metric.
Example of fact table -
In the schema below, we have a fact table FACT_SALES that has a grain which gives us a number of units sold by date, by store and by product.
All other tables such as DIM_DATE, DIM_STORE and DIM_PRODUCT are dimensions tables. This schema is known as the star schema.
Let's translate this a bit.
Firstly, in a fact table we usually enter numeric values ( rarely Strings , char's ,or other data types).
The purpose of a fact table is to connect with the KEYS of dimensional tables ,other fact tables (fact tables more rarely, and also not a good practice) AND measurements ( and by measurements I mean numbers that change frequently like Prices , Quantities , etc.).
Let's take an example:
Think about a row from a fact table as an product from a supermarket when you pass it by the check out and it get's scanned. What will be displayed in the check out row in your database fact table? Possibly:
Product_ID | ProductName | CustomerID | CustomerName | InventoryID | StoreID | StaffID | Price | Quantity ... etc.
So all those Keys and measurements are bringed together in one fact table, having some big performance and understandability advantage.
Fact Table Contain all Primary key of Dimension table and Measure like "Sales Amount"
A Fact table is a table that stores your measurements of a business process. Here you would record numeric values that apply to an event like a sale in a store. It is surrounded by dimension tables which give the measurement context (which store? which product? which date?).
Using the dimensions you can ask lots of questions of your facts, like how many of a particular product have been sold each month in a region.
Some further info
All dimension keys in a fact should be a FK to the dimension
if a key is unknown it should point to a zero key in the dimension detaialing this.
All joins from a fact to a dim are 1 to 1. bridge tables are is a technique to cater for many to many's, but this is more advanced
All measurements in a fact are numeric, but can contain NULLS if unknown (never put 0 to represent unknown)
When joining facts to dimensions there is no need to do an outer join, due to FKS applied above.
if you have 999 rows in a fact, no matter what dimensions you join to, you should always have 999 rows returned.

Turn two database tables into one?

I am having a bit of trouble when modelling a relational database to an inventory managament system. For now, it only has 3 simple tables:
Product
ID | Name | Price
Receivings
ID | Date | Quantity | Product_ID (FK)
Sales
ID | Date | Quantity | Product_ID (FK)
As Receivings and Sales are identical, I was considering a different approach:
Product
ID | Name | Price
Receivings_Sales (the name doesn't matter)
ID | Date | Quantity | Type | Product_ID (FK)
The column type would identify if it was receiving or sale.
Can anyone help me choose the best option, pointing out the advantages and disadvantages of either approach?
The first one seems reasonable because I am thinking in a ORM way.
Thanks!
Personally I prefer the first option, that is, separate tables for Sales and Receiving.
The two biggest disadvantage in option number 2 or merging two tables into one are:
1) Inflexibility
2) Unnecessary filtering when use
First on inflexibility. If your requirements expanded (or you just simply overlooked it) then you will have to break up your schema or you will end up with unnormalized tables. For example let's say your sales would now include the Sales Clerk/Person that did the sales transaction so obviously it has nothing to do with 'Receiving'. And what if you do Retail or Wholesale sales how would you accommodate that in your merged tables? How about discounts or promos? Now, I am identifying the obvious here. Now, let's go to Receiving. What if we want to tie up our receiving to our Purchase Order? Obviously, purchase order details like P.O. Number, P.O. Date, Supplier Name etc would not be under Sales but obviously related more to Receiving.
Secondly, on unnecessary filtering when use. If you have merged tables and you want only to use the Sales (or Receving) portion of the table then you have to filter out the Receiving portion either by your back-end or your front-end program. Whereas if it a separate table you have just to deal with one table at a time.
Additionally, you mentioned ORM, the first option would best fit to that endeavour because obviously an object or entity for that matter should be distinct from other entity/object.
If the tables really are and always will be identical (and I have my doubts), then name the unified table something more generic, like "InventoryTransaction", and then use negative numbers for one of the transaction types: probably sales, since that would correctly mark your inventory in terms of keeping track of stock on hand.
The fact that headings are the same is irrelevant. Seeking to use a single table because headings are the same is misconceived.
-- person [source] loves person [target]
LOVES(source,target)
-- person [source] hates person [target]
HATES(source,target)
Every base table has a corresponding predicate aka fill-in-the-[named-]blanks statement describing the application situation. A base table holds the rows that make a true statement.
Every query expression combines base table names via JOIN, UNION, SELECT, EXCEPT, WHERE condition, etc and has a corresponding predicate that combines base table predicates via (respectively) AND, OR, EXISTS, AND NOT, AND condition, etc. A query result holds the rows that make a true statement.
Such a set of predicate-satisfying rows is a relation. There is no other reason to put rows in a table.
(The other answers here address, as they must, proposals for and consequences of the predicate that your one table could have. But if you didn't propose the table because of its predicate, why did you propose it at all? The answer is, since not for the predicate, for no good reason.)

Fact Table Recommendation

I have a data mart which only needs to capture a serial number of a product, the date of the activity, and where the activity took place (which account).
There are five possible activities. The issue I have is this. Two of the activities take place at a warehouse level. The remaining three take place at the account-level (WH does not apply). Ultimately however every warehouse rolls up to a master account.
So if I had one fact table, I would essentially need two FK and you would have to traverse the fact table to build the WH > Account hierarchy which seems hard to maintain. I'd like one dimension table.
Or is it then recommended I split this into two fact tables, even though the only different characteristic of either table is whether the activity took place at the warehouse or not.
The goal of the reporting will be at the account level, but having the WH information may be useful at some point. And I need to check for duplicates, etc which is why I was leaning towards the first, but don't know how to appropriately handle the hierarchies.
Single Fact Table Design
Item: 1
Account: 14
Warehouse:2
ActivityType:3
Date: 20130204
SerialNumber:123456
Count:1
Dual Fact Table Design
Table 1
Item: 1
Warehouse:2
ActivityType:3
Date: 20130204
SerialNumber:123456
Count:1
Table 2
Item: 1
Account:2
ActivityType:3
Date: 20130204
SerialNumber:123456
Count:1
Ive interpreted you situation as:
ALL activities require an account
Some activities involve a
warehouse.
The selection of warehouse implies an account. the
accounts mentioned in the two point above are of the same type (there
is only 1 account dimension table)
In which case you should be OK with the single FACT table design:
[ACTIVITY_FACT]
SK (Optional, i find unique surrogate PKs useful)
ITEM_SK (Link to your ITEM_DIM table)
ACCOUNT_SK (Link to your ACCOUNT_DIM table)
WAREHOUSE_SK (Link to your WAREHOUSE_DIM table, -1 for no warehouse activities)
ACTIVITY_TYPE_SK (Link to your ACTIVITY_TYPE_DIM table)
ACTIVITY_DATE_SK (Link to your DATE_DIM table)
ITEM_SERIAL_NUMBER
ITEM_COUNT
Have a record in your WAREHOUSE dimension for NONE or NOT APPLICABLE and allocate it a nice obvious special condition SK value of -1 or -9 or whatever your shop is using for such things.
For activity records that reference a warehouse, put the appropriate warehouse sk AND the account sk that belong to that warehouse.
For activities that do not involve a warehouse, populate the warehouse sk with the NONE / NOT APPLICABLE warehouse dimension record and the appropriate Account SK.
Now your fact table can be joined to your Account and Warehouse dimension tables without having to worry about outer join or null condition handling. This should allow you and your users to play about with warehouse dimension data as required and your not having to faff about with managing two tables that contain essentially the same date.
A possibility is to define the hierarchy in a single dimension table. Guessing at what you’re dealing with, I came up with the following.
Outline of dimension table:
TABLE: Account
Account_ID <surrogate key>
Account <Account name, identifier>
Warehouse (Warehouse name, identifier)
Sample data:
Account_ID Account Warehouse
1 A n/a
2 B n/a
3 C n/a
4 W wh1
5 W wh2
6 Z wh3
7 Z n/a
Account_ID is just a surrogate key, having no intrinsic meaning or value
Account lists the accounts. Here, I shows five, A, B, C, W and Z. Select distinct to get the list of accounts; join to a fact table by Account_ID where Account = “W” gets all data for that account (for however many warehouses, if applicable).
Warehouse lists all warehouses and the account they are associated with; here, “W” is the account for two separate warehouses (wh1, wh2); Z is associated with warehouse wh3, but could also be used by a fact table with “no” warehouse. Join to a fact table by Account_ID where Warehouse = “wh1” gets all data for that warehouse.
Using this, with Account_ID in a fact table you could drill down for all entries for any given Account or for a specific warehouse (or for no warehouse, if there is value in that).
There are lots of variations and permutations possible with this kind of approach.

Entity relation (normal forms)

Let's assume we have table USERS like:
..ID..username
user1
user2
user3
Users can have Bills (user -> have_many -> bill relation).
Table BILLS like:
..ID..user_id
1
2
2
Also we have Products so every Product can be associated to ONLY ONE bill (product -> has_one -> bill relation).
Table PRODUCTS like:
..ID..bill_id
2
3
1
So, as you can see, our user can have lot of products (through bills).
My question:
Would it be correct DUE TO Database normalization to add second foreign key to PRODUCTS table named user_id to quickly select all user's Products from PRODUCTS table, or it's not correct and I should use JOIN statement to select all User's Products?
P.S. Sorry for dirty tables drawing )
I would rather go with the normalized view (where you DO NOT have a user_id in the products table).
The only time I would ever consider this, as a last option, is if the performance REALY requires it.
It is usually better to normalize data so as not to duplicate information and retain data consistency.
However there are exceptions required in real life systems, often for performance reasons when dealing with huge volumes of data.