SSAS first cube with a fact table is kinda hard

Link to DCD: http://i48.tinypic.com/29de538.png
I have this DCD to work on, with SQL Server Business Intelligence Development Studio 2008.
I have made a fact table which is a view, because I need to have some measures.
Now how do I connect my fact table to the other tables? I don't know if it should be linked to selftest, instrument or network.
It would be great to hear the correct linking, and how to do it!
Thanks a bunch!

It looks like you have Instrument (first dimension), execution date (second dimension) and a resulting status (third dimension) for each Self Test. You may generate hierarchies of Instrument Type, Group and Network to group together Instruments within SSAS.
The resulting counts (NumOk, NumWarning, NumError) may be your measures. Typically, you would group these together, depending on the grain of your fact (i.e. daily), into a single fact table. There is no need to create separate facts per measure; you would only create separate fact tables if the measures relate to a different set of dimensions.
Try to set your source up as follows (add additional attributes as required) and follow the model through to the DSV. I suggest that you ignore hierarchies until you get the basics working within SSAS.
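A fact view along these lines might be a starting point (a rough sketch; table and column names are guesses based on the description, not taken from the diagram):

    CREATE VIEW dbo.FactSelfTest AS
    SELECT
        st.InstrumentID,    -- links to the Instrument dimension (and its Type/Group/Network hierarchy)
        st.ExecutionDate,   -- links to a Date dimension
        st.StatusID,        -- links to a resulting-status dimension
        st.NumOk,           -- measures
        st.NumWarning,
        st.NumError
    FROM dbo.SelfTest AS st;

With this view in the DSV, each self test row carries its dimension keys plus the three counts as measures.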

Related

SSAS Data Source View Relationship

I need to get all the relationships in my SSAS cube Data Source View between Fact and Dim tables. I have around 15 Fact tables and dimensions linked to them. Is there any MDX query to get the relationships other than doing it manually?
I suspect that you want to export a list of relationships between measure groups and dimensions as they are represented in the Dimension Usage tab of the cube designer. (The relationships in the DSV don't much matter unless SSAS needs to figure out how to join two tables in a SQL query it generates. You can have a cube with no DSV relationships at all. As long as the Dimension Usage tab has the right relationships then the cube will work.)
So please install the free BI Developer Extensions and run the Printer Friendly Dimension Usage Report. I believe it will contain the info you need.
I would recommend the above. If you want to look at the appropriate dynamic management view (DMV), run the MDSCHEMA_MEASUREGROUP_DIMENSIONS DMV. It is harder to use and interpret, but it has what you need in terms of representing the Dimension Usage tab:
Select * from $system.MDSCHEMA_MEASUREGROUP_DIMENSIONS
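If you only want the measure-group-to-dimension pairs for one cube, you can narrow it down; the column names below come from the standard MDSCHEMA_MEASUREGROUP_DIMENSIONS rowset and the cube name is a placeholder, so verify against your server:

    Select [MEASUREGROUP_NAME], [DIMENSION_UNIQUE_NAME], [DIMENSION_GRANULARITY]
    from $system.MDSCHEMA_MEASUREGROUP_DIMENSIONS
    where [CUBE_NAME] = 'YourCubeName'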

SSAS Environment or CUBE creation methodology

Though I have relatively good exposure to SQL Server, I am still a newbie in SSAS.
We are to create a set of reports in SSRS with an SSAS cube as the data source.
Some of the reports involve data from at least 3 or 4 tables and would also involve grouping and all the usual things from the SQL environment (like finding the max record for a day, joining that with 4 more tables and applying filtering logic on top of it).
So the actual question is: should I have this logic implemented in the cubes, or have it processed in the SQL database (using a named query in SSAS) and get the result stored in the cube to be shown in the report? I understand that the latter option would involve creating more cubes depending on each report being developed.
I have been told to create cubes with the data from the transaction tables and do the entire logic creation using MDX queries (as the source in SSRS). I am not sure if that is a viable solution.
Any help in this would be much appreciated; Thanks for reading my note.
Aru
EDIT: We are using SQL Server 2012 version for our development.
OLAP cubes are great at performing aggregations of data, effectively grouping over the majority of columns all at once. You should not strive to implement all the grouping at the named query or relational views level as this will prevent you from being able to drill down through the data in the cube and will result in unnecessary overhead on the relational database when processing the cube.
I would start off by planning to pull in the most granular data from your relational database into your cube and only perform filtering or grouping in the named queries or views if data volumes or processing time are a concern. SSAS will perform some default aggregations of the data to allow for fast queries at the most grouped level.
More complex concerns such as max(someColumn) for a particular day can still be achieved in the cube by using different aggregations, but you do get into complex scenarios if you want to aggregate by one function (MAX) only to the day level and then by another function across other dimensions (e.g. summing the max of each day per country). In that case it may well be worth performing the max-per-day calculation in a named query or view and loading that into its own measure group to be aggregated by SUM after that.
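As an illustration of that last point, the max-per-day named query might look something like this (table and column names are invented for the example):

    SELECT
        CAST(t.TransactionDate AS date) AS DateKey,
        t.CountryKey,
        MAX(t.SomeColumn)               AS MaxPerDay
    FROM dbo.Transactions AS t
    GROUP BY CAST(t.TransactionDate AS date), t.CountryKey;

Loaded into its own measure group, MaxPerDay can then safely be rolled up with SUM across days, countries and so on.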
It sounds like you're at the beginning of the learning path for OLAP, so I'd encourage you to look at resources from the Kimball Group (no affiliation) including, if you have time, the excellent book "The Data Warehouse Toolkit". At a minimum, please look into Dimensional Modelling techniques as your cube design will be a good deal easier if you produce a dimensional model (likely a star schema) in either views or named queries.
I would look at BISM Tabular if your model is not complicated. It compresses and stores data in memory. As for data processing, I would suggest keeping all calculations and grouping in the database layer (create views).
All the calculations and grouping should be done at the database level, at least in the form of views.
There are mainly two ways to store data (MOLAP and ROLAP). Use the MOLAP storage mode to deal with tables that store transactional data.
The customer's expectation for transaction data (in my experience) would be to understand sales along the time dimension, something like: get total sales for the last week or the last quarter, etc.
MDX scripts are basically a kind of SQL script that the cube can understand; no business logic should live there. Based on which parameters are chosen in the SSRS report, the MDX query should be prepared. Small analytical functions such as subtotals and averages can be done in MDX, but not complex calculations.
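As a rough sketch of the kind of parameterised report query this implies (the cube, measure and date hierarchy names are assumptions, not from the question):

    // Total sales for the last 7 days ending at the selected date member
    SELECT
        [Measures].[Sales Amount] ON COLUMNS,
        LastPeriods(7, [Date].[Calendar].[Date].&[20240107]) ON ROWS
    FROM [Sales]

In an SSRS dataset the date member would normally come from a report parameter rather than being hard-coded.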

Wrong link between Fact and Dim table SSAS

We have a problem with a Fact table in our cube.
We know it is not best practice when developing a dimensional db, but we have the dimension and fact table combined in 1 table.
This is because there isn't much dimensional data (5 fields). But moving on to the problem.
We have added this table to our cube and, for testing, we added 1 measure (count of rows). As the image shows, we get the grand total for every sub category, which isn't correct.
Does anyone have an idea where we have to look for the problem?
Kind regards,
Phoenix
You have not defined a relationship between your sub category dimension and your fact table. That has resulted in the full count mapping to all of the sub category attributes, hence the same value repeating.
Add a relation between the cube measure group and your dimension on the second tab (Dimension Usage) in the cube (it's 'regular' and key-level on both sides in most cases).
If this relation already exists, try recreating it. Sometimes this happens after several manual changes in 'advanced' mode.
Check the dimension mapping in the fact table (see the sketch below). If everything is ok, try adding a new dimension with only one level the first time, then add another one, and so on. I know it sounds like shaman tricks, but still...
And always use SQL Server Profiler on both servers (SQL, SSAS) to capture the exact query that returns the wrong value. Maybe the mistake is somewhere else.
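For the dimension mapping check on the relational side, a query along these lines will flag fact rows whose key has no matching dimension member (table and column names are placeholders):

    SELECT f.SubCategoryKey, COUNT(*) AS OrphanRows
    FROM dbo.FactSales AS f
    LEFT JOIN dbo.DimSubCategory AS d
        ON d.SubCategoryKey = f.SubCategoryKey
    WHERE d.SubCategoryKey IS NULL
    GROUP BY f.SubCategoryKey;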

SSAS MOLAP Aggregations not working

When we run the aggregation wizard from Visual Studio for a MOLAP cube, no aggregations are created. The graph stays flat. We tried every different scenario and setting that we could think of with no luck. Any help would be appreciated.
Things to check:
Do you have natural hierarchies in your dimensions? (Needed for optimal aggregation)
Do you have the correct attribute relations for those hierarchies? (Also, needed for optimal aggregation and performance)
Do you have non parent-child dimensions available to aggregate? (In parent-child dimensions, only the key attribute can be aggregated)
Do you have any data in your cube? (the wizard needs data to work)
Do you have partitions? (aggregations can't be created for higher levels than the partition split e.g. you can't have a year aggregation on a monthly partition)
Ensure that at least some dimension attributes have their AttributeHierarchyOptimizedState set to FullyOptimized
Check the AggregationUsage cube dimension attribute property
Also read:
What are the natural hierarchies and why they are a good thing
Influencing Aggregation Candidates (This has much more detail than here)

Setting up Dim and Fact tables for a Data Warehouse

I'm tasked with creating a data warehouse for a client. The tables involved don't really follow the traditional examples out there (product/orders), so I need some help getting started. The client is essentially a processing center for cases (similar to a legal case). Each day, new cases are entered into the DB under the "cases" table. Each column contains some bit of info related to the case. As the case is being processed, additional one-to-many tables are populated with events related to the case. There are quite a few of these event tables; example tables might be: case-open, case-dept1, case-dept2, case-dept3, etc. Each of these tables has a caseid which maps back to the "cases" table. There are also a few lookup tables involved as well.
Currently, the reporting needs relate to exposing bottlenecks in the various stages and the granularity is at the hour level for certain areas of the process.
I may be asking too much here, but I'm looking for some direction as to how I should setup my Dim and Fact tables or any other suggestions you might have.
The fact table is the case event and it is 'factless' in that it has no numerical value. The dimensions would be time, event type, case and maybe some others depending on what other data is in the system.
You need to consolidate the event tables into a single fact table, labelled with an 'event type' dimension. The throughput/bottleneck reports are calculating differences between event times for specific combinations of event types on a given case.
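One possible shape for that consolidated fact, sketched with the example table names from the question (column names are assumed):

    CREATE VIEW dbo.FactCaseEvent AS
    SELECT CaseID, EventTime, 'case-open'  AS EventType FROM dbo.case_open
    UNION ALL
    SELECT CaseID, EventTime, 'case-dept1' AS EventType FROM dbo.case_dept1
    UNION ALL
    SELECT CaseID, EventTime, 'case-dept2' AS EventType FROM dbo.case_dept2;

In the cube, the EventType column would become the key of the event type dimension (or be replaced by a surrogate key).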
The reports should calculate the event-event times and possibly bin them into a histogram. You could also label certain types of event combinations and apply the label to the events of interest. These events could then have the time recorded against them, which would allow slice-and-dice operations on the times with an OLAP tool.
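The event-to-event times can be derived by joining the consolidated fact to itself on the case, for instance (event type values are illustrative):

    SELECT
        o.CaseID,
        DATEDIFF(HOUR, o.EventTime, d.EventTime) AS HoursOpenToDept1
    FROM dbo.FactCaseEvent AS o
    JOIN dbo.FactCaseEvent AS d
        ON d.CaseID = o.CaseID
    WHERE o.EventType = 'case-open'
      AND d.EventType = 'case-dept1';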
If you want to benchmark certain stages in the life-cycle progression you would have a table that goes case type, event type1, event type 2, benchmark time.
With a bit of massaging, you might be able to use a data mining toolkit or even a simple regression analysis to spot correlations between case attributes and event-event times (YMMV).
I suggest you check out Kimball's books, particularly this one, which should have some examples to get you thinking about applications to your problem domain.
In any case, you need to decide if a dimensional model is even appropriate. It is quite possible to treat a 3NF database 'enterprise data warehouse' with different indexes or summaries, or whatever.
Without seeing your current schema, it's REALLY hard to say. Sounds like you will end up with several star models with some conformed dimensions tying them together. So you might have a case dimension as one of your conformed dimensions. The facts from each other table would be in fact tables which link both to the conformed dimension and any other dimensions appropriate to the facts, so for instance, if there is an employee id in case-open, that would link to an employee conformed dimension, from the case-open-fact table. This conformed dimension might be linked several times from several of your subsidiary fact tables.
Kimball's modeling method is fairly straightforward, and can be followed like a recipe. You need to start by identifying all your facts, grouping them into fact tables, identifying individual dimensions on each fact table and then grouping them as appropriate into dimension tables, and identifying the type of each dimension.
Like any other facet of development, you must approach the problem from the end requirements ("user stories" if you will) backwards. The most conservative approach for a warehouse is to simply represent a copy of the transaction database. From there, guided by the requirements, certain optimizations can be made to enhance the performance of certain data access patterns. I believe it is important, however, to see these as optimizations and not assume that a data warehouse automatically must be a complex explosion of every possible dimension over every fact. My experience is that for most purposes, a straight representation is adequate or even ideal for 90+% of analytical queries.

For the remainder, first consider indexes, indexed views, additional statistics, or other optimizations that can be made without affecting the structures. Then, if aggregation or other redundant structures are needed to improve performance, consider separating these into a "data mart" (at least conceptually) which provides a separation between primitive facts and redundancies thereof. Finally, if the requirements are too fluid and the aggregation demands too heavy to efficiently function this way, then you might consider wholesale explosions of data, i.e. a star schema. Again though, limit this to the smallest cross section of the data possible.
Here's what I came up with essentially. Thx NXC
Fact Events
    EventID
    TimeKey
    CaseID

Dim Events
    EventID
    EventDesc

Dim Time
    TimeKey

Dim Regions
    RegionID
    RegionDesc

Cases
    CaseID
    RegionID
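A DDL sketch of that star, with data types and key constraints added purely for illustration:

    CREATE TABLE DimEvents  (EventID  int PRIMARY KEY, EventDesc varchar(100));
    CREATE TABLE DimTime    (TimeKey  int PRIMARY KEY);
    CREATE TABLE DimRegions (RegionID int PRIMARY KEY, RegionDesc varchar(100));
    CREATE TABLE Cases      (CaseID   int PRIMARY KEY,
                             RegionID int REFERENCES DimRegions (RegionID));
    CREATE TABLE FactEvents (EventID  int REFERENCES DimEvents (EventID),
                             TimeKey  int REFERENCES DimTime (TimeKey),
                             CaseID   int REFERENCES Cases (CaseID));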
This may be a case of choosing a solution before you've considered the problem. Not all data warehouses fit the star schema model. I don't see that you are aggregating any data here. So far we have a factless fact table and at least one rapidly changing dimension (cases).
Looking at what I see so far I think the central entity in this database should be the case. Trying to stick the event at the middle doesn't seem right. Try looking at it a different way. Perhaps, case, events, and case events to start.