OBIEE: Cannot find logical table source coverage for logical columns - repository

I am trying to make some changes in the repository of an OBIEE 11g installation.
I have a fact table that counts the positions in the departments.
I have a department table and a departmentDH table.
Both department and departmentDH are joined to the fact table on deptId.
In the logical layer I have one logical table and two LTSs (logical table sources): one for the department data and one for the departmentDH data.
When I build an analysis with department data, departmentDH data and the fact, everything works fine. When I remove the fact from the analysis I get the error: "Cannot find logical table source coverage for logical columns".
Did I do something wrong?

Counter question: how do you expect the engine to navigate between those two dimensions? I do not know how your business model is built in detail, but your dimensions probably have nothing to do with each other and the query goes astray since a non-applicable fact table is used.
You can never do pure dimension-only queries across multiple dimensions!
If that specific fact table is supposed to be used in all cases where there is a doubt then you must set an implicit fact column for your Subject Area:
https://gerardnico.com/wiki/dat/obiee/obis/implicit_fact_column


Understanding a table's structure/schema in SQL

I wanted to reach out to ask if there is a practical way of finding out a given table's structure/schema, e.g. the column names and some example rows inserted into the table (like the head function in Python), if you only have the table name. I have access to several tables in my current role; however, the person who developed the tables has left my team. I was interested in examining the tables more closely via SQL Assistant in Teradata (these tables often contain hundreds of thousands of rows, hence the issues of hitting CPU exception criteria errors).
I have tried the following SELECT statement, but I run into internal CPU exception criteria limits.
SELECT TOP 10 * FROM dbc.table1;
Thank you in advance for any tips/advice!
You can use one of these commands to get a table's structure details in Teradata:
SHOW TABLE Database_Name.Table_Name;
or
HELP TABLE Database_Name.Table_Name;
Both show the table's structure details.
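If you also want the metadata as a normal result set rather than DDL text, the data dictionary views can be queried directly. A minimal sketch, assuming a release where the DBC.ColumnsV view is available (view and column names quoted from memory):
SELECT ColumnName, ColumnType, ColumnLength, Nullable
FROM DBC.ColumnsV
WHERE DatabaseName = 'Database_Name'
  AND TableName = 'Table_Name'
ORDER BY ColumnId;
For a cheap peek at example rows, SELECT TOP 10 * FROM Database_Name.Table_Name; (or Teradata's SAMPLE clause) is usually enough without scanning the whole table.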

Relational Database join between two tables having unknown number of intermediate tables

I have a large database for which I want to set up a generalized query method for subtables (or a join between subtables). However, the tables I'm interested in are sub-tables of a parent table, an unknown number of relationship levels deep from that parent, depending on the table I'm querying.
Is there a means by which you can get SQL to automatically join all of the interim tables between the two tables of interest? Or to narrow a query to only a subset of the parent table?
For example this set of relationships:
Folder_Table->System_Table->Items_Table->Items_Class->Items_attributes->Items_Methods->Method_Data->Method_History
I want to be able to generically do searches or joins on any of the sub-tables, where the results are restricted to a single folder of Folder_Table, without having to write a series of explicit joins X table levels deep, which would significantly increase the complexity of building generic query interfaces at runtime.
No, there is not.
What you're asking for is the famous "figure out what I want done and do it" function, which would be the golden panacea of programming languages or databases.
SQL is explicit. You need to specify the path by explicitly listing the tables to join and how to join them.
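To make that concrete, here is roughly what the explicit version looks like for the chain in the question; the join column names (folder_id, system_id and so on) are assumptions, since the real foreign keys aren't shown:
SELECT mh.*
FROM Folder_Table f
JOIN System_Table s ON s.folder_id = f.folder_id
JOIN Items_Table i ON i.system_id = s.system_id
JOIN Items_Class ic ON ic.item_id = i.item_id
JOIN Items_attributes ia ON ia.class_id = ic.class_id
JOIN Items_Methods im ON im.attribute_id = ia.attribute_id
JOIN Method_Data md ON md.method_id = im.method_id
JOIN Method_History mh ON mh.data_id = md.data_id
WHERE f.folder_id = 42; -- the single folder of interest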
Now, could you make such a function for your specific case? Sure. You would build into it the knowledge of either your specific table structures, or the way to obtain the information needed to automatically find the path between table A and table B. However, there is no such built-in function that already exists, just waiting for you to use it. So if you want such a function, you're going to have to write it yourself.
Bonus question:
What if there are multiple paths between A and B?

Building a MySQL database that can take an infinite number of fields

I am building a MySQL-driven website that will analyze customer surveys distributed by a variety of clients. Generally, these surveys are structured fairly consistently, and most of our clients' data can be reduced to the same normalized database structure.
However, every client inevitably ends up including highly specific demographic questions for their customers that are irrelevant to every other one of our clients. For instance, although all of our clients will ask about customer satisfaction, only our auto clients will ask whether the customers know how to drive manual transmissions.
Up to now, I have been adding columns to a respondents table for all general demographic information, with a lot of default NULLs mixed in. However, as we add more clients, it's clear that this will end up with a massive number of columns which are almost always NULL.
Is there a way to do this consistently? I would rather keep as much of the standardized data as possible in the respondents table, since our import script is already written for that table. One thought of mine is to build a respondent_supplemental_demographic_info table that has the columns response_id, demographic_field, demographic_value (so the manual transmissions example might become: 'ID999', 'can_drive_manual_indicator', true). This could hold an infinite number of demographic_fields, but would be incredibly painful to work with from both a processing and programming perspective. Any ideas?
Your solution to this problem is called entity-attribute-value (EAV). This "unpivots" columns so they are rows in a table and then you tie them together into a single view.
EAV structures are a bit tricky to learn how to deal with. They require many more joins or aggregations to get a single view out. Also, the types of the values become challenging. Generally there is one value column, so everything is stored as a string. You can, of course, have a type column with different types.
They also take up more space, because the entity id is repeated on each row (I think that is the response_id in your case).
Although not ideal in all situations, they are appropriate in a situation such as you describe. You are adding attributes indefinitely, and with one column per attribute you would quickly run into the maximum number of columns allowed in a single table (typically between 1,000 and 4,000 depending on the database). You can also keep track of each value in each column separately -- if they are added at different times, for instance, you can keep a timestamp on when they go in.
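A minimal sketch of what this could look like, using the names from the question (the exact types and keys are assumptions):
CREATE TABLE respondent_supplemental_demographic_info (
    response_id        VARCHAR(20)  NOT NULL,
    demographic_field  VARCHAR(64)  NOT NULL,
    demographic_value  VARCHAR(255),             -- everything stored as a string
    PRIMARY KEY (response_id, demographic_field)
);

INSERT INTO respondent_supplemental_demographic_info
VALUES ('ID999', 'can_drive_manual_indicator', 'true');

-- Pulling an attribute back alongside the main table takes one join per attribute:
SELECT r.*, eav.demographic_value AS can_drive_manual_indicator
FROM respondents r
LEFT JOIN respondent_supplemental_demographic_info eav
       ON eav.response_id = r.response_id
      AND eav.demographic_field = 'can_drive_manual_indicator';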
Another alternative is to maintain a separate table for each client, and then use some other process to combine the data into a common data structure.
Do not fall for a table with key-value pairs (field id, field value) as that is inefficient.
In your case I would create a table per customer, and metadata tables (in a separate DB) describing these tables. With that metadata you can generate SQL and so on. That is definitely superior to having many null columns, or to copied and adapted scripts. It requires a bit of programming, where an application uses the metadata to generate SQL, collects the data (without customer-specific semantic knowledge) and generates reports.
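A rough sketch of what such metadata tables might look like (all names here are illustrative, not a prescribed schema):
CREATE TABLE client_tables (
    client_id   INT          NOT NULL PRIMARY KEY,
    table_name  VARCHAR(64)  NOT NULL    -- e.g. 'responses_auto_client'
);

CREATE TABLE client_columns (
    client_id    INT          NOT NULL,
    column_name  VARCHAR(64)  NOT NULL,  -- e.g. 'can_drive_manual_indicator'
    column_type  VARCHAR(32)  NOT NULL,  -- e.g. 'TINYINT(1)'
    PRIMARY KEY (client_id, column_name)
);

-- The application reads these rows and generates the per-client
-- CREATE TABLE / SELECT statements from them.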

When to split a Large Database Table?

I'm working on an 'Employee' database and the fields are beginning to add up (say 20). The database would be populated from different UIs, say:
Personal Information UI: populates fields of the 'Employee' table such as birthday, surname, gender etc
Employment Details UI: populates fields of the 'Employee' table such as employee number, date employed, grade level etc
Having all the fields populated from a single UI (as you would imagine) is messy and results in one very long form that you'd need to scroll.
I'm thinking of splitting this table into several smaller tables, such that each smaller table captures a related information of an employee (i.e. splitting the table logically according to the UI).
The tables will then be joined by the employee id. I understand that splitting tables with a one-to-one relationship is generally not a good idea (multiple-database-tables), but could splitting the table logically help, such that the employee information is captured in several INSERT statements?
Thanks.
Your data model should not abide by any rules imposed by the UI, just for convenience. One way to reduce the column set for a given UI component is to use views (in most databases, you can also INSERT / UPDATE / DELETE through simple views). Another is to avoid SELECT *; you can always select a subset of your table's columns.
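A small sketch of the view idea, with column names guessed from the question:
CREATE VIEW employee_personal AS
SELECT employee_id, surname, birthday, gender
FROM Employee;

CREATE VIEW employee_employment AS
SELECT employee_id, employee_number, date_employed, grade_level
FROM Employee;

-- Each UI form then reads from (and, where the database allows it,
-- writes through) its own narrow view instead of the full table.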
"could splitting the table logically help, such that the employee information is captured in several INSERT statements?"
No.
How could it help?
20 fields is a fairly small number of fields for a relational database. There was a question a while ago on SO where a developer expected to have around 3,000 fields for a single table (which was actually beyond the capability of the RDBMS in question) - under such circumstances, it does make sense to split up the table.
It could also make sense to split up the table if a subset of columns were only ever going to be populated for a small proportion of rows (for example, if there were attributes that were specific to company directors).
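For that kind of case, a hedged sketch (the table and column names are made up for illustration) would be a 1:1 child table holding only the rarely populated columns:
CREATE TABLE employee_director_details (
    employee_id     INT PRIMARY KEY REFERENCES Employee (employee_id),
    board_position  VARCHAR(50),
    date_appointed  DATE
);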
However, from the information supplied so far, there is no apparent reason to split the table.
Briefly put, you want to normalize your data model. Normalization is the systematic restructuring of data into tables that formed the theoretical foundation of the relational data model developed by E. F. Codd forty years ago. There are levels of normalization - non-normalized, and then first, second, third, etc. normal forms.
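As a tiny illustration of what that restructuring means in practice (the tables and columns here are hypothetical, not taken from the question): if every employee row also carried department_name, that value would depend on department_id rather than on the employee key, so third normal form moves it into its own table.
-- Before: department_name repeated on every employee row.
-- After: the dependent attribute lives in its own table.
CREATE TABLE department (
    department_id    INT PRIMARY KEY,
    department_name  VARCHAR(50) NOT NULL
);

CREATE TABLE employee (
    employee_id    INT PRIMARY KEY,
    surname        VARCHAR(50) NOT NULL,
    birthday       DATE,
    department_id  INT REFERENCES department (department_id)
);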
Normalization is now barely an afterthought in many database shops, ostensibly because it is erroneously believed to slow database performance.
I found a terrific summary on an IBM site covering normalization which may help. http://publib.boulder.ibm.com/infocenter/idshelp/v10/topic/com.ibm.ddi.doc/ddi56.htm
The original Codd book, later revised by C. J. Date, is unfortunately not very accessible. I can recommend "Database Systems: Design, Implementation & Management" by Rob, P. & Coronel, C.M. I TA'ed a class on database design that used this textbook and I've kept using it for reference.

Database table with just 1 or 2 optional fields... split into multiple tables?

In a DB I'm designing, there's one fairly central table representing something that's been sold or is for sale. It distinguishes between personal sales (like eBay) and sales from a proper company. This means there are literally one or two fields which are not equally appropriate to both cases... for instance one field is only used in one case, another field is optional in one case but mandatory in the other.
If there were more specialized fields it would be sensible to have a core table and then two tables with the fields relevant to the specific cases. But here, creating two tables just to contain one field plus the reference to the core table seems both aesthetically bad and painful for the query designer and DB software.
What do you think? Is it OK to bend the rules slightly by having a single table with weakened constraints - meaning the DB cannot 100% prevent inconsistent data being added (in a very limited way) - or do I suck it up and create dumb-looking 1-field tables?
What you're describing with one table for common columns and dependent tables for subtype-specific columns is called Class Table Inheritance. It's a perfectly good thing to do.
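For reference, a minimal Class Table Inheritance sketch (the table and column names are assumed, reusing the fields from the example further down):
CREATE TABLE Sales_Core (
    sale_id  SERIAL PRIMARY KEY,
    sold_at  TIMESTAMP NOT NULL
);

CREATE TABLE Personal_Sales (
    sale_id    INT PRIMARY KEY REFERENCES Sales_Core (sale_id),
    paypal_id  VARCHAR(20) NOT NULL     -- mandatory for personal sales
);

CREATE TABLE Business_Sales (
    sale_id  INT PRIMARY KEY REFERENCES Sales_Core (sale_id),
    sku      VARCHAR(20)                -- only meaningful for business sales
);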
What @Scott Ferguson seems to be describing (two distinct tables for the two types of sales) is called Concrete Table Inheritance. It can also be a good solution depending on your needs, but more often it just makes it harder to write queries across both subtypes.
If all you need is one or two columns that apply only to a given subtype, I agree it seems like overkill to create dependent tables. Remember that most brands of SQL database support CHECK constraints or triggers, so you can design data integrity rules into the metadata.
CREATE TABLE Sales (
  sale_id SERIAL,
  is_business INT NOT NULL,  -- 1 for corporate, 0 for personal
  sku VARCHAR(20),           -- only for corporate
  paypal_id VARCHAR(20),     -- mandatory, but only for personal
  CONSTRAINT sales_subtype_chk CHECK (
       (is_business = 1 AND paypal_id IS NULL)
    OR (is_business = 0 AND sku IS NULL AND paypal_id IS NOT NULL)
  )
);
I think the choice of having these fields is not going to hurt you today and would be the choice I would go for. Just remember that as your database evolves, you may need to make the decision to refactor to two separate tables (if you need more fields).
There are some who insist that inapplicable fields should never be allowed, but I think this is one of those rules that someone wrote in a book and now we're all supposed to follow it without questioning why. In the case you're describing, a single table sounds like the simple, intelligent solution.
I would certainly not create two tables. Then all the common fields would be duplicated, and all your queries would have to join or union two tables. So the real question is: one table or three? But you seem to realize that.
You didn't clarify what the additional fields are. If the presence or absence of one field implies the record type, then I sometimes use that fact as the record type indicator rather than creating a redundant type field. For example, if the only difference between a "personal sale" and a "business sale" is that a business sale has a foreign key to a company filled in, then you could simply define a business sale as one with a company filled in, and no ambiguity is possible. But if the situation gets even slightly more complicated, this can be a trap: I've seen applications that say if a is null and b = c then it's record type A, else if b is null and d / 7 = e then it's record type B, etc. If you can't do it with one test on one field, forget it and put in a record type field.
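As a hedged illustration of that one-field test (the column name company_id is assumed):
-- Business sales are simply the rows where the company FK is filled in.
SELECT * FROM Sales WHERE company_id IS NOT NULL;

-- Personal sales are the rest.
SELECT * FROM Sales WHERE company_id IS NULL;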
You can always enforce consistency with code or constraints.
I worry a lot more about redundant data creating consistency problems than about inapplicable fields. Redundant data creates all sorts of problems. Data inapplicable to a record type? In the worst case, just ignore it. If it's a "personal sale" and somehow a company got filled in, ignore it or null it out on sight. Problem solved.
If there are two distinct entities, "Personal Sales" and "Company Sales", then perhaps you ought to have two tables to represent those entities?
News flash: the DB cannot prevent 100% of corrupt data no matter which way you cut it. So far you have only considered what I call level 1 corruption (level 0 corruption is essentially what would happen if you wrote garbage over your database with a hex editor).
I have yet to see a database that could prevent level 2 corruption (syntactically correct records but when taken as a whole mean something perverse).
The PRO for keeping all fields in one table is that you get rid of JOINs, which makes your queries faster.
The CONTRA is that your table grows larger, which makes your queries slower.
Which one impacts you more, totally depends on your data distribution and which queries you issue most often.
In general, splitting is better for OLTP systems, while the joined (single-table) layout is better for data analysis, which tends to scan the tables.
Let's imagine 2 scenarios:
Split fields. There are 1,000,000 rows, the average row size is 20 bytes, and the split field is filled once per 50 rows (i.e. 20,000 records in the split table).
We want to query like this:
SELECT SUM(mainfield + COALESCE(splitfield, 0))
FROM maintable
LEFT JOIN splittable
       ON splitid = mainid
This will require scanning 20,000,000 bytes plus nested loops (or hash lookups) to find the 20,000 matching records.
Each hash lookup is roughly equivalent to scanning 10 rows, so the total time will be equivalent to scanning 20,000,000 + 10 * 20,000 * 20 = 24,000,000 bytes.
Joined fields. There are 1,000,000 rows, the average row size is 24 bytes, so the query will scan 24,000,000 bytes.
As you can see, the times tie.
However, if either parameter changes (the field is filled more or less often, the row size is larger or smaller, etc.), one or the other solution becomes better.