MDX SELECT conditionally - ssas

I am new to MDX querying and to working with multidimensional data, but a project requires me to learn it.
We have a cube with a KPI which performs a Lag() that sums up the number of transactions that have met a requirement, based on a date.
A query would be something like
SELECT
{} on 0,
[Transaction].[Trader].Members HAVING
KPIStatus("REQ") < 0 on 1
FROM
[Cube]
WHERE
[Transaction].[Trade Date].[2014-07-02]
This will output all the traders that fulfill the criteria of the KPI.
From this, I would like to select all the [Transaction].[Trade ID] for a trader in the result from the above query. I only want the [Transaction].[Trade ID] in the same date range as earlier, since they are a part of what triggers the KPIGoal.
I've tried something like
SELECT
[Transaction].[Trader].&[trader_id] HAVING
KPIStatus("REQ") < 0 on 0,
[Transaction].[Trade ID].Members on 1
FROM
[Cube]
WHERE
[Transaction].[Trade Date].[2014-07-02]
but I get a System.OutOfMemoryException. This might be because SSMS eats up the little RAM my computer has, but is there a better way to execute the query than the above example? Also, I believe this would produce a badly formatted result, since the trader can have a great many trades, each with a unique [Transaction].[Trade ID]. What would be a better approach?
Bonus question:
What is the difference between getting the results as row/column "headers" and getting them as values?
The first query gives me one column with the result as "headers" such as how the categories below are encapsulated. What does it really mean?

Just use a cross product on the rows. The autoexists behavior of Analysis Services should ensure that you only see trade IDs belonging to their trader, as both hierarchies are in the same dimension:
SELECT
{} on 0,
[Transaction].[Trader].Members
*
[Transaction].[Trade ID].Members
HAVING KPIStatus("REQ") < 0 on 1
FROM
[Cube]
WHERE
[Transaction].[Trade Date].[2014-07-02]
I am not sure how your KPI is built, or whether it will be affected by the additional hierarchy on the rows. If it is, you could also use Filter:
SELECT
{} on 0,
Filter([Transaction].[Trader].Members, KPIStatus("REQ") < 0)
*
[Transaction].[Trade ID].Members
on 1
FROM
[Cube]
WHERE
[Transaction].[Trade Date].[2014-07-02]
With regard to your bonus question:
An MDX query can have zero, one, two, or even more axes, which build the headers and span the cell space for the result cells (normally numbers). Each axis can have zero, one, two, or more hierarchies from which the member names for the labeling are taken. If, for example, you have three hierarchies on the rows axis, you get three header entries on each row, which are members of the three hierarchies. Most reports use two axes. SQL Server Management Studio complains if you use more than two axes, as it does not know how to display the result, but this is not a restriction of MDX.
In your first query, you state that you want to have zero hierarchies on the columns, as you put the empty set on axis number zero, which is the columns axis. Then you put one hierarchy on the rows (axis number one). If you wanted to put a result into the cells, you could do so with calculated measures like this:
WITH Member Measures.[Id as measure] AS
[Transaction].[Trade ID].CurrentMember.Name
SELECT
{ Measures.[Id as measure] } on 0,
Filter([Transaction].[Trader].Members, KPIStatus("REQ") < 0)
*
[Transaction].[Trade ID].Members
on 1
FROM
[Cube]
WHERE
[Transaction].[Trade Date].[2014-07-02]
Of course, you can do much more with calculated members, which - in contrast to physical measures - can be strings and not only numbers, but this would be another question.

Related

SSAS : Filter Cube

I have a dimension and a fact table. The dimension is PATIENTDBOID, while the fact table is Total_Admissions.
Now I want to filter for patients with Admission > 1.
Can someone show me how?
You can use the Filter function to filter on a measure with a logical expression.
Since I don't know the exact dimension name, you can try something like the query below.
0 - represents COLUMNS
1 - represents ROWS
SELECT [Measures].[Total_Admissions] ON 0,
FILTER(
[Patient].[Patient].[PatientDBOID].MEMBERS
, [Measures].[Total_Admissions]>1)
ON 1
FROM [Cube Name]

MDX Result Count

I am a beginner in MDX queries. Can anyone tell me how to get the record count of the result of an MDX query?
The query is the following:
select {[Measures].[Employee Department History Count],[Measures].[Rate]} on columns, Non Empty{{Filter([Shift].[Shift ID].[Shift ID].Members, ([Shift].[Shift ID].CurrentMember.name <> "1"))}*{[Employee].[Business Entity ID].[Business Entity ID].Members}} on rows from [Adventure Works2012].
I have tried various methods and I haven't really got a solution for that.
I assume you mean row count when you talk of "record count", as MDX has no concept of records; the result of an MDX query is the space built by the tuples on the axes.
I see two possibilities to get the row count:
Just count the rows returned from your above query in the tool from which you call the MDX query.
If you want to count in MDX, then let's state what you want to have:
You want to know the number of members of the set of non empty combinations of [Shift ID]s and [Business Entity ID]s where the Shift ID is not "1" and at least one of the measures [Employee Department History Count] and [Rate] is not null.
To put it differently: let's call the tuples like above for which the first measure is not null "SET1", and the tuples like above for which the second measure is not null "SET2". Then you want to know the count of the tuples which are contained in one of these sets (or in both).
To achieve this, we define these two sets and then a calculated member (a new measure in our case) containing this calculation in its definition, and then use this calculated member in the select clause to show it:
WITH
-- tuples (Shift ID <> "1") x (Business Entity ID) where the first measure is not empty
SET SET1 AS
NonEmpty({{Filter([Shift].[Shift ID].[Shift ID].Members,
([Shift].[Shift ID].CurrentMember.name <> "1"))}
* {[Employee].[Business Entity ID].[Business Entity ID].Members}},
{[Measures].[Employee Department History Count]})
-- the same tuples where the second measure is not empty
SET SET2 AS
NonEmpty({{Filter([Shift].[Shift ID].[Shift ID].Members,
([Shift].[Shift ID].CurrentMember.name <> "1"))}
* {[Employee].[Business Entity ID].[Business Entity ID].Members}},
{[Measures].[Rate]})
-- count of the union of both sets
MEMBER [Measures].[MyCalculation] AS
COUNT(SET1 + SET2)
SELECT [Measures].[MyCalculation] ON COLUMNS
FROM [Adventure Works2012]
Please note:
The sets SET1 and SET2 are not absolutely necessary; you could also put the whole calculation in one long and complicated definition of the MyCalculation measure, but splitting it up makes it easier to read. However, the definition of a new member is necessary, as in MDX you can only put members on axes (rows, columns, ...). These members can either already be defined in the cube, or you have to define them in the WITH clause of your query. There is no such thing as putting expressions/calculations on axes in MDX, only members.
The + for sets is a union which removes duplicates, hence this operation gives us the tuples which have a non-empty value for at least one of the measures. Alternatively, you could have used the Union function, which is equivalent to the +.
The Nonempty() I used in the definitions of the sets is the NonEmpty function, which is slightly different from the NON EMPTY keyword that you can use on the axes. We use one of the measures as second argument to this function in both set definitions.
I have currently no working SSAS installation available to test my statement, hence there might be a minor error or typo in my above statement, but the idea should work.
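To make the counting logic concrete outside of MDX, here is a small Python model of it. This is only a sketch: the cell values and the names `cells`, `non_empty` and `row_count` are made up for illustration and are not part of SSAS or of the Adventure Works cube.

```python
# Hypothetical (shift_id, employee_id) -> measure values; None models an empty cell.
cells = {
    ("2", "E1"): {"history_count": 1, "rate": None},
    ("2", "E2"): {"history_count": None, "rate": 12.5},
    ("3", "E1"): {"history_count": 2, "rate": 20.0},
    ("3", "E3"): {"history_count": None, "rate": None},
    ("1", "E4"): {"history_count": 1, "rate": 9.0},  # dropped below: shift "1"
}


def non_empty(tuples, measure):
    """Rough model of NonEmpty(set, measure): keep tuples whose cell is not empty."""
    return {t for t in tuples if cells[t][measure] is not None}


# Model of the Filter(...) * {...} cross product: every tuple whose shift is not "1".
candidates = {t for t in cells if t[0] != "1"}

set1 = non_empty(candidates, "history_count")
set2 = non_empty(candidates, "rate")

# Set union removes duplicates, like SET1 + SET2 in the MDX above.
row_count = len(set1 | set2)
print(row_count)
```

The tuple ("3", "E1") is in both sets but is counted once, which is why the union is used rather than adding the two counts.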

MDX where clause in subquery does not slice cube - how to understand?

This query gives me sales of one store:
select
[measures].[sales] on 0
from [MyCube]
where [store].[store].[042]
However, if I move the slicer inside the subquery, it gives me sales of all stores.
select
[measures].[sales] on 0
from (select
from [MyCube]
where [store].[store].[042]
)
How should I understand the mechanism behind this difference?
This is also noted in this article, but without much explanation.
----EDIT----:
I tried various things and read around for a while. I'd like to add a question: is there a scenario in which the where clause in sub-select does filter the result?
This query gives me sales of all stores in state MI (store [042] belongs to MI):
select
[measures].[sales] on 0
from (select
[store].[state].[MI] on 0
from [myCube]
where [store].[store].[042]
)
Thinking of 'inner query only filters if the filtered dimension is returned on an axis', the theory is proved wrong if I do this:
select
[measures].[sales] on 0
from (select
[store].[state].members on 0
from [myCube]
where [store].[store].[042]
)
The sub-select still returns one state MI, but the outer query returns sales of all stores (of all states).
----EDIT 4/13----:
Re-phrasing the question in AdventureWorks cube with screenshot.
Query 1: sales of one store
Query 2: it returns sales of all stores if where clause is in the sub-select.
Query 3: the two answers I got suggested that we select the dimension in an axis - here is the result - we get all cities.
select
[measures].[sales] on 0
from (select
from [MyCube]
where [store].[store].[042]
)
The above query reduces the scope of stores to just the member [042]. Note that the sub-select is executed before the actual select. So, when it comes to the select, the engine just sees a cube that has all the members in all the dimensions, but only the member [store].[store].[042] in the store dimension. It's as if the cube had been kept intact everywhere else but sliced on the Store dimension.
If you go a step ahead and add the store on to one of the axes, like
select
[measures].[sales] on 0,
[store].[store].members on 1
from (select
from [MyCube]
where [store].[store].[042]
)
you would see that although the member [All] appears in the output, it actually comprises only one store.
In essence, [All] is a special member that is calculated with respect to the scope of the cube. It reflects the combined effect of all the members in the cube.
In SQL terms, it is similar to:
select sales, store as [All] from
(select sales, store from tbl where store = '042') tbl
Even though you see Sales----All, it is just a reflection of the sales for store [042].
Here are some other good references concerning sub-select and slicer debate:
http://bisherryli.com/2013/02/08/mdx-25-slicer-or-sub-cube/
https://cwebbbi.wordpress.com/2014/04/07/free-video-on-subselects-in-mdx/
Chris Webb's video being located here:
https://projectbotticelli.com/knowledge/what-is-a-subselect-mdx-video-tutorial?pk_campaign=tt2014cwb
This should still leave an All member:
SELECT
[measures].[sales] ON 0
FROM
(
SELECT
FROM [MyCube]
WHERE
[store].[store].[042]
);
...but the member [All] of the Store hierarchy will now be made up only of [store].[store].[042].
You can see this by adding the Store hierarchy onto ROWS:
SELECT
[measures].[sales] ON 0,
[store].MEMBERS ON 1
FROM
(
SELECT
FROM [MyCube]
WHERE
[store].[store].[042]
);
This is the AdvWorks version similar to the reference in your question:
SELECT
{[Measures].[Order Count]} ON 0
,[Subcategory].MEMBERS ON 1
FROM
(
SELECT
{
[Subcategory].[Subcategory].&[22]
} ON 0
FROM [Adventure Works]
);
It returns the member from the sub-select, along with the All member adjusted to take account of the subselect.
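As a rough mental model of that adjustment, here is a Python sketch. Everything in it is an assumption for illustration: the numbers are invented, `run_query` is a hypothetical helper, and the measure is treated as simply additive (which the real [Order Count] is not).

```python
# Hypothetical additive measure per subcategory key (not real Adventure Works data).
sales_by_subcategory = {"22": 100, "23": 40, "24": 60}


def run_query(subselect_members=None):
    """Return rows for [Subcategory].MEMBERS: the All member plus the leaf members.

    subselect_members models the set placed on an axis of the subselect;
    None means there is no subselect at all.
    """
    if subselect_members is None:
        visible = dict(sales_by_subcategory)
    else:
        visible = {k: v for k, v in sales_by_subcategory.items()
                   if k in subselect_members}
    # All is computed from whatever the subselect left visible.
    rows = {"All": sum(visible.values())}
    rows.update(visible)
    return rows


full = run_query()
restricted = run_query({"22"})
print(full)
print(restricted)
```

With the subselect in place, the outer query sees only member 22, and the All row equals that single member's value.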
In the referenced article, why is the [All] less than the sum of the other two? This is not down to the subselect; it is connected with the measure he has chosen, [Measures].[Order Count], which is a distinct count. If you take away the subselect, you see exactly the same behaviour of the All member being less than the sum of the other subcategory members (I've marked the point at which the total of the parts becomes higher than the All member):
SELECT
{[Measures].[Order Count]} ON 0
,Order
(
[Subcategory].MEMBERS
,[Measures].[Order Count]
,bdesc
) ON 1
FROM [Adventure Works];
Order Count is a distinct count of orders: one order can contain several product subcategories, hence this behaviour.
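The distinct-count effect is easy to reproduce with a toy data set. The following Python sketch uses invented order lines (the order ids and subcategory names are hypothetical) to show why the per-subcategory counts add up to more than the overall count:

```python
# Hypothetical order lines: (order_id, subcategory).
order_lines = [
    ("SO1", "Helmets"),
    ("SO1", "Tires"),
    ("SO2", "Helmets"),
    ("SO3", "Tires"),
    ("SO3", "Helmets"),
]


def order_count(lines):
    """Distinct count of order ids, like a distinct-count measure."""
    return len({order_id for order_id, _ in lines})


# Distinct order count per subcategory.
per_subcategory = {}
for sub in sorted({s for _, s in order_lines}):
    per_subcategory[sub] = order_count([l for l in order_lines if l[1] == sub])

all_count = order_count(order_lines)          # what the All member shows
sum_of_parts = sum(per_subcategory.values())  # what adding up the rows gives
print(per_subcategory, all_count, sum_of_parts)
```

Orders SO1 and SO3 each appear under two subcategories, so they are counted twice when the rows are added up but only once in the All member.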
Edit
This query of yours:
select
[measures].[sales] on 0
from (select
[store].[state].members on 0
from TestCube //<< added this!
where [store].[store].[042]
)
Is this inner script even valid? Using the same dimension both on an axis and in the WHERE clause is not valid:
select
[store].[state].members on 0
from TestCube
where [store].[store].[042]
Edit2
An MDX script returns a cube, which may be sliced or not sliced, but nevertheless it returns a cube. The WHERE clause is used to slice the cube that is returned. If we were using a third-party tool, then the dimension added to the WHERE clause would go into a combobox, with say Cliffside selected. BUT the user could effectively select Ballard from that combobox; it is just a slicer. The WHERE clause is not changing the cube that is returned by the MDX script, it is just affecting what is displayed in the cellset.
WHERE is valid within a subselect. It is part of the definition:
https://msdn.microsoft.com/en-us/library/ff487138.aspx
I've never found a use case for a subselect's WHERE clause.
Edit3
This link will explain things:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/ccb66ac3-0f9a-4261-8ccc-b6ecc51b6f07/is-where-clause-pointless-inside-a-subselect?forum=sqlanalysisservices
As Darren Gosbell points out in the answer to that question, the documentation at
https://msdn.microsoft.com/en-us/library/ff487138.aspx says that:
The WHERE clause does not filter the subspace.

MDX Query percentile 25th, 50th and 75th

I have a question for which I haven't been able to find the answer (neither in this forum nor elsewhere):
I need to calculate the 25th percentile, the median (the 50th percentile) and the 75th percentile.
In other words, I need to write an MDX query in SSRS that tells me which values are the 25th percentile, the median and the 75th percentile.
Nothing I have found so far gives the exact value of each of them.
Thanks.
I've been working on the same issue for my own data. The trouble I was having is in figuring out the Median() function. Here's how I interpret the parameters of the function:
Microsoft's definition:
MEDIAN(Set_Expression [, Numeric_Expression])
My interpretation:
Set_Expression is the set of values that define the grain to which the measure is summed before the median is evaluated
Numeric_Expression is the measure that is summed; the resulting set of sums is then sorted and evaluated to find the median
In my case for finding the straight median across the entire data set, I didn't want to sum the values at all. To prevent any sums from being calculated, I used the key attribute for a dimension that had a 1-1 cardinality with the records in the fact table that contains the measure that I'm using. The only flaw I've seen so far is that sometimes the median returns a whole number when there are an even number of records and the mean of the two middle records should result in a number ending in .5. For example, the values of the two middle records are 16 and 17 and the function is returning 17 instead of 16.5. Since this is a minor flaw, I'm willing to overlook it for now.
This is what my calculation with the median function looks like:
WITH MEMBER Measures.[Set Median] AS MEDIAN(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Non-summable Measure]
)
I used a combination of Median and TopCount to get the 75th percentile. I use TopCount to limit the set for the median to the second half of the data since TopCount sorts the data in descending order. I'll explain how I understand TopCount:
Microsoft's definition:
TopCount(Set_Expression, Count [, Numeric_Expression])
My interpretation:
Set_Expression is the set of values from which the desired number of tuples will be returned
Count is the number of tuples to return from the set
Numeric_Expression is the value that will be used to sort the set in descending order
I want the Median function to use the last half of the records in the fact table that are returned in the query, so I again use the key for the dimension table that has a 1-1 cardinality with the fact table and I sort it by the measure from which I want to find the median value.
Here is how I coded the member:
MEMBER Measures.[75th Percentile] AS MEDIAN(
TOPCOUNT(
[Dimension].[Key Attribute].MEMBERS
,Measures.[Fact Table Record Count] / 2
,Measures.[Non-summable Measure]
)
,Measures.[Non-summable Measure]
)
So far, this combination of functions has returned a true 75th percentile from my data set. To get the 25th percentile, I tried replacing TOPCOUNT in my code with BOTTOMCOUNT, which is supposed to do the same thing, only sorting the data in ascending order to use the first half of the records instead of the second half. Unfortunately, I haven't been able to get anything but NULL from this combination of functions, so I'm open to suggestions on how to get the 25th percentile.
This is how my final query looks:
SELECT
{
Measures.[Set Median]
,Measures.[25th Percentile]
,Measures.[75th Percentile]
} ON 0
,[Dimensional row members here] ON 1
FROM [Cube]
WHERE
[Non-axis dimensional filter members here]
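For checking the numbers that come back from the cube, the same "median of each half" idea can be written in a few lines of Python. This is only a sketch under stated assumptions: `half_median_quartiles` is a made-up helper, the sample values are invented, the half size is taken with integer division, and "median of a half" is just one of several common quartile conventions, so other tools may report slightly different values.

```python
from statistics import median


def half_median_quartiles(values):
    """Return (25th, 50th, 75th) using the median of each half of the sorted data.

    Needs at least two values so that each half is non-empty.
    """
    ordered = sorted(values)
    half = len(ordered) // 2   # assumed rounding for [Record Count] / 2
    lower = ordered[:half]     # the rows BottomCount(set, n/2, measure) is meant to keep
    upper = ordered[-half:]    # the rows TopCount(set, n/2, measure) keeps
    return median(lower), median(ordered), median(upper)


sample = [3, 7, 8, 12, 13, 14, 18, 21]  # hypothetical fact-level values
q25, q50, q75 = half_median_quartiles(sample)
print(q25, q50, q75)
```

Note that `statistics.median` averages the two middle values for an even number of records (12 and 13 give 12.5 here), which is the behaviour expected of the 16 / 17 case described above.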

Filtering a Measure (or Removing Outliers)

Say I have a measure, foo, in a cube, and I have a reporting requirement that users want to see the following measures in a report:
total foo
total foo excluding instances where foo > 10
total foo excluding instances where foo > 30
What is the best way to handle this?
In the past, I have added Named Calculations which return NULL if foo > 10 or just foo otherwise.
I feel like there has to be a way to accomplish this in MDX (something like Filter([Measures].[foo], [Measures].[foo] > 10)), but I can't for the life of me figure anything out.
Any ideas?
The trick is that you need to apply the filter on your set, not on your measure.
For example, using the usual Microsoft 'warehouse and sales' demo cube, the following MDX will display the sales for all the stores where sales were greater than $2000.
SELECT Filter([Store].[Stores].[Store].members, [Unit Sales] > 2000) ON COLUMNS,
[Unit Sales] ON ROWS
FROM [Warehouse and Sales]
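The "filter the set, then aggregate the measure over what is left" idea can be sketched in Python as follows. The store names and numbers are invented, and `filter_set` is a hypothetical stand-in for what Filter does to a set of members, not an actual API.

```python
# Hypothetical unit sales per store.
unit_sales = {"Store 1": 2500, "Store 2": 1800, "Store 3": 4100, "Store 4": 900}


def filter_set(members, condition):
    """Rough model of Filter(set, condition): keep members for which the condition holds."""
    return [m for m in members if condition(m)]


# Members whose measure value passes the test (what the query above puts on the axis).
big_stores = filter_set(unit_sales, lambda store: unit_sales[store] > 2000)

# "Total excluding the outliers": filter the set first, then sum the measure over it.
total_without_big = sum(
    unit_sales[store]
    for store in filter_set(unit_sales, lambda store: unit_sales[store] <= 2000)
)
print(big_stores, total_without_big)
```

The condition is evaluated once per member of the set, at whatever grain the set has, so the set must be fine-grained enough for the "instances" you want to exclude.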
I ran into a similar problem when using Saiku (with a Mondrian backend). As I couldn't find any clear solution for "add a filter on a measure", I'm adding one here; it may be useful to others.
In Saiku 3.8, you can add a filter in the UI: "column"->"filter"->"custom", where you will see a Filter MDX Expression.
Let's suppose we want clicks in Ad greater than 1000, then add the following line there:
[Measures].[clicks] > 1000
Save and close; the filter will then keep only the elements with more than 1000 clicks.
The MDX looks like the following (suppose dt is the dimension and clicks is the measure; we want to find the dt members with more than 1000 clicks):
WITH
SET [~ROWS] AS
Filter({[Dt].[dt].[dt].Members}, ([Measures].[clicks] > 1000))
SELECT
NON EMPTY {[Measures].[clicks]} ON COLUMNS,
NON EMPTY [~ROWS] ON ROWS
FROM [OfflineData]
I think you have two choices:
1- Add a column to your fact table (or to a view in the data source view that is based on the fact table), like:
case when unit_Price>2000 then 1
else 0
end as Unit_Price_Uper_Or_Under_10
and add a fictitious dimension based on this column's value.
Then add a named query for the new dimension (say Range_Dimension) in the data source view:
select 1 as range
union all
select 0 as range
After that, you can use this filter like any other dimension and attribute:
SELECT [Store].[Stores].[Store].members ON COLUMNS,
[Unit Sales] ON ROWS
FROM [Warehouse and Sales]
WHERE [Test_Dimension].[Range].&[1]
The problem is that you must add a WHEN condition for every range, so this is a good solution only if the ranges are static.
For dynamic ranges, it is better to derive the ranges with a formula (based on a discretization method).
2- Add a dimension based on the fact table, with a granularity close to that of the fact table.
For example, if we have a fact table with primary key Sale_id, we can add
a dimension based on the fact table with only one column, Sale_Id, and in the Dimension Usage tab
we can relate this new dimension to the measure group with relationship type Fact.
After that, in MDX we can use something like:
filter([dim Sale].[Sale Id].[Sale Id].members,[Measures].[Unit Price]>2000)