SSAS MDX Calculation for SUM(DISTINCT COUNT) - ssas

Just to give you an overview I am very new at SSAS cubes and MDX Calculations in general. I'm currently having a hard time grasping the concept of querying an OLAP Cube.
So say I have a Movement Dimension with a unique key MovementID. I'm trying to create a calculation in SSAS that counts The total different MovementIDs in this dimension. So a few questions
Can I create this using the New Measure option in SSAS? It throws me an error if there is a MovementID in the dimension that is not present in my fact table so I'm trying to instead do this in MDX Calculation. Is my understanding here correct?
How do I make this script in MDX SSAS Calculation? The SQL equivalent should simply be SELECT COUNT(PFNo) FROM Dim_Movement. I know that the OLAP query works on axes rather than traditional table format. So how can I achieve a simple DISTINCT COUNT of a column in MDX script?
If I were to make the script to compute the SUM of the DISTINCT COUNT of the column instead. What would the script look like?
Thanks!

Related

Understanding MDX 'ALL'?

I'm new to MDX coming from the world of SQL and I am trying to understand the concept of 'ALL'. I am understanding that 'ALL' is a single member, creating a cube with low granularity as far as that dimension is concerned. Is this correct?
What are some examples of SQL that can help me reason about this concept? Of course, SQL uses tables not cubes, but I'm sure there are some similarities which can help me make the connection? Let me try to create an example.
Let's say I have a table with such a schema representing a cube with 3 dimensions:
myTable (dim1_attribute1,dim1_attribute2, dim2_attribute1,dim2_attribute2,dim2_attribute3,
dim3_attribute1,dim3_attribute2,dim3_attribute3)
What kind of SQL would give me an aggregate with 'ALL' granularity on dim3?
'ALL' means that basically you are taking into account all members of that dimension - you are not slicing cube with some particular member of that dimension. In SQL query equivalent would be to omit dim3 from the where clause altogether, so you are not filtering your resulting aggregation with some particular value from dim3 but you are taking all rows into account.

SSAS Tabular Model: Difference between calculated table and DAX

Just wondering if anyone had encounter this situation, that calculated table returns no row, but when evaluating the DAX it returns correct result.
In one of my tabular models, I created several summarized tables to improve query performance. The summarized table linked to the project table by project ID. I also have a row level security applied on project table.
For some reasons, I was told the report does not work. When I use SSMS to browse the tabular model via user's account, I found that if I evaluate the table directly it returns no row, e.g.
evaluate my_summarized_table
But, if I evaluate the formula directly it returns correct result, e.g.
evaluate Filter(Summarize(mytable, 'Project'[Project Id]...))
I tried to use SQL Profiler and $system.Discover_Sessions but could not figure out what is the difference between the calculated table and the raw DAX query.
Had anyone have this situation before? What could be the possible reason for that?

ssas tabular - measure design

question about measures in ssas tabular. Is it better to have the physical measure in the fact table as a column and then do a simple sum measure? e.g
Imagine the scenario I have a measure called Income in the fact table, but the user wants to see ProductA income, Product B income as individual measures (not using income measure with product dimension, which yes gives the same result)
Or is it better to do a dax calculation with a sum and filter based on the product dim. e.g. Product B Income:= CALCULATE(SUM('fact'[Income]);VALUES('Product Type dim');('Product Type dim'[Product Tye] ="ProductB"))
I have tried both methods and both return the same result... just want to know what would be best practice here. (fact table around 300million rows)
The fewer the measures the more performant your report will be in the end. I would recommend, in this situation, just sticking to a simple sum() measure and applying it each product. This would be the most efficient approach, doing a calculation on 300 million rows whilst filtering and invoking Values will definitely slow you down.
DC07 How are end users viewing your model? Are they using it in PowerPivot (excel) or PowerBI? I would suggest performing those calculations in either the visualization element (PowerBI) in PowerPivot as a method of displaying the calculation.

How are OLAP-cube operations and MDX related?

I would like to understand how OLAP-cube operations (i.e. drilling up/down, slicing/dicing and pivoting) and MDX are related.
My current guess is that OLAP-cube operations to MDX are like relational algebra to SQL. However, I do not see how some basic features of MDX correspond to OLAP-cube operations. For example, consider the following query on the demo "Sales" cube that comes with icCube:
SELECT {([Ottawa],[2009]), ([United States],[Feb 2010])} on Rows,
[Measures].members on Columns
FROM [Sales]
How does the use of tuples (e.g. ([Ottawa],[2009])) correspond to an OLAP-cube operation?
Yes, "OLAP-cube operations are what visualization tools are expected to implement". MDX is the query language that is executed against a cube that produces a result. OLAP clients generally run MDX against a cube. "OLAP cube operations" as described in that wikipedia are usually as a result of a person performing adhoc analysis against a cube in an client application.
Cube provide a structure and an access language that usually makes these types of operations easier (or at least faster)
How does MDX relate to a "drill down" operation? for example?
Firstly some MDX has already been run and yielded some kind of view of the cube (generally some rows, some columns, and a measure in the intersection although the MDX language syntax doesn't limit to two axes only).
So a person sees this information and decides to drill down on a single item in the row (this item was previously returned by some MDX). So the OLAP client generates some MDX that provides the drilled down view on the item
It might just add a children MDX function to the item in question. Or it might do it some other way. It depends on the client.
Heres some introductory info on how you can eavesdrop on the interactions between an OLAP client (which one? doesn't matter) and a SSAS cube
https://learn.microsoft.com/en-us/sql/analysis-services/instances/introduction-to-monitoring-analysis-services-with-sql-server-profiler
You can think of the MDX query specifying areas or regions of space within a cube - the tuple is a primary way of giving the processor co-ordinates which correspond to the part of the cube you are interested in.
It is the intersection of the co-ordinates and slices you specify that give you a result.
MDX is strongly related to set theory as the main types withing the cube are dimensions, sets, tuples, members etc.
An MDX query defines a table and for each of the table cell we've a tuple. In your scenario assuming we've two measures ( Meas1, Meas2 ) :
([Ottawa],[2009],[Meas1]) ([Ottawa],[2009],[Meas2])
([United States],[Feb 2010],[Meas1]) ([United States],[Feb 2010],[Meas2])
On this cell tuples you might add the WHERE clause, the SubQuery and the defaults that might be different than ALL (not advised). Remember all is a 'special' member that is ignored.
A tuple defines a single measure, Meas1 or Meas2, this will select the 'fact table' with a measure column, usually a numerical value. The other members are used to select rows in the table performing on them the aggregation defined by the measure ( sum , min, max.... ) on all rows defined by the tuple members, Ottawa and 2009 for example. As whytheq explains, you've a lot of transformations to 'play' with members as you would with sets.
This is the simple vision, as you can use calculated members that define a transformation instead of a simple row aggregation (e.g. Difference with previous year) and some aggregations are a bit more tricky ( open, close...).
But if you understand this well you got a perfect ground to understand MDX.

SSAS Environment or CUBE creation methodology

Though I have relatively good exposer in SQL Server, but I am still a newbie in SSAS.
We are to create set of reports in SSRS and have the Data source as SSAS CUBE.
Some of the reports involves data from atleast 3 or 4 tables and also would involve Grouping and all possible things from SQL Environment (like finding the max record for a day and joining that with 4 more tables and apply filtering logic on top of it)
So the actual question is should I need to have these logics implemented in Cubes or have them processed in SQL Database (using Named Query in SSAS) and get the result to be stored in Cube which would be shown in the report? I understand that my latter option would involve creation of more Cubes depending on each report being developed.
I was been told to create Cubes with the data from Transaction Tables and do entire logic creation using MDX queries (as source in SSRS). I am not sure if that is a viable solution.
Any help in this would be much appreciated; Thanks for reading my note.
Aru
EDIT: We are using SQL Server 2012 version for our development.
OLAP cubes are great at performing aggregations of data, effectively grouping over the majority of columns all at once. You should not strive to implement all the grouping at the named query or relational views level as this will prevent you from being able to drill down through the data in the cube and will result in unnecessary overhead on the relational database when processing the cube.
I would start off by planning to pull in the most granular data from your relational database into your cube and only perform filtering or grouping in the named queries or views if data volumes or processing time are a concern. SSAS will perform some default aggregations of the data to allow for fast queries at the most grouped level.
More complex concerns such as max(someColumn) for a particular day can still be achieved in the cube by using different aggregations, but you do get into complex scenarios if you want to aggregate by one function (MAX) only to the day level and then by another function across other dimensions (e.g. summing the max of each day per country). In that case it may well be worth performing the max-per-day calculation in a named query or view and loading that into its own measure group to be aggregated by SUM after that.
It sounds like you're at the beginning of the learning path for OLAP, so I'd encourage you to look at resources from the Kimball Group (no affiliation) including, if you have time, the excellent book "The Data Warehouse Toolkit". At a minimum, please look into Dimensional Modelling techniques as your cube design will be a good deal easier if you produce a dimensional model (likely a star schema) in either views or named queries.
I would look at BISM Tabular if your model is not complicated. It compresses and stores data in memory. As for data processing I would suggest to keep all calculations and grouping in database layer (create views).
All the calculations and grouping should be done at database level atleast in form of views.
There are mainly two ways to store data (MOLAP and ROLAP). Use MOLAP storage model for deal with tables that store transactions kind of data.
The customer's expectation from transaction data (from my experience) would be to understand the sales based upon time dimension. Something like Get total sales in last week or last quarter. etc.
MDX scripts are basically kind of SQL scripts that Cube can understand. No logic should be there. based upon what Parameters are chosen in SSRS report, MDX query should be prepared. Small analytical functions such as subtotal, average can be dome by MDX but not complex calculations.