Distinctly counting measure results that match a certain criteria

Distinctly counting measure results that match a certain criteria - ssas

We have a measure for feedback scores which I am trying to report on. I need to calculate how many have reached a score >5 to calculate a performance %age. The issue I have is where there is more than one score available for my member which is aggregated in my results.
Here is what I have so far:
with
MEMBER [Client Sat Score] AS ([Measures].[Avg_Score], Linkmember([Bill Period].[Fiscal].[Fiscal Period].&[201409],[Date].[Fiscal]),[Bill Period].[Fiscal].[All],[Time Slice].[KeyTimeSlice].[12M])
MEMBER [Sum Scores] AS([Measures].[Sum of Scores], Linkmember([Bill Period].[Fiscal].[Fiscal Period].&[201409],[Date].[Fiscal]),[Bill Period].[Fiscal].[All],[Time Slice].[KeyTimeSlice].[12M])
MEMBER [Number Scores] AS([Measures].[Number of Scores], Linkmember([Bill Period].[Fiscal].[Fiscal Period].&[201409],[Date].[Fiscal]),[Bill Period].[Fiscal].[All],[Time Slice].[KeyTimeSlice].[12M])
MEMBER [Over 5] As IIF([Client Sat Score]>5,1,NULL)
MEMBER [Scores Over 5] As SUM([Matter].[KeyMatterNumber].[KeyMatterNumber].members,[Over 5])
MEMBER [Percent 6 or 7s] As IIF([Number Scores]=0,NULL,[Scores Over 5] / [Number Scores])
select {[Client Sat Score],[Sum Scores],[Number Scores],[Over 5],[Scores Over 5],[Percent 6 or 7s] }
on columns,
non empty ({[Client].[KeyClientRelatedID].&[XXX] })
on rows
from [Cube]
This returns 4 for "Scores over 5" but there are actually 2 scores of 7 on one of the members. These scores are on different date keys but I am unable to stop them aggregating within the SUM.
Any suggestions/advice please?
EDIT:
I've found that if I run the following I do get the 5 separate results each with a score above 5:
select {[Measures].[Avg_Score] }
on columns,
non empty ({[Matter].[KeyMatterNumber].[KeyMatterNumber].members })*
{[Date].[KeyDate].[KeyDate].&[20130206]:[Date].[KeyDate].[KeyDate].&[20140206]}
on rows
from [Cube]
where (
[Client].[KeyClientRelatedID].&[XXXX]
)
Does this help at all with the amendment of my first query?

It is difficult to answer without a cube schema available. However, it sounds like you are trying to do a row level calculation post-aggregation which results in your attempts to find the combination of dimensions that will cause leaf level data to be used in the calculation. Even if possible these invariably perform badly.
I would suggest pushing this calculation into the underlying database and generating an additional dimension with banded values { [1-5], [6-7] }.
Also, you didn't mention if this Analysis Services (or another vendor) Multidimensional or Tabular - if Tabular just create a row calc for "Over 5" in the cube and use that as your new dimension members.

Related

MDX OLAP Cube Query Optimization

Problem: I'm trying to write a MDX query that will show the first date a member has measure values.
Data obstacles:
1. I don't have access to the data warehouse/source data
2. I can't request any physical calcs or CUBE changes
Looking for: I know this goes against what a CUBE should be doing, but is there any way to achieve this result. I'm running into locking conflicts and general run time issues.
Background: After some trial and error. I have a working query but sadly it's only is practical when filtered for <10 employees. I've tried some looping but there are ~60k employee ids in the cube with each one having 10-20 emp keys (one for each change in their employee info).
//must have values for measure 1 or 2
WITH
set NE_measures as
{
[Measures].[measure1] ,
[Measures].[measure2]
}
//first date with measure values for each unique emp key
MEMBER [Measures].[changedate] AS
Head
(
NonEmpty([Dim Date].[Date].[Date].allMEMBERS, NE_measures)
).Item(0).Member_Name
SELECT non empty {[Measures].[changedate]} ON COLUMNS,
non empty [Dim Employee].[Emp Key].[Emp Key].allmembers ON ROWS
FROM [Cube]

Try this:
MEMBER [Measures].[changedate] AS
Min(
[Dim Date].[Date].[Date].allMEMBERS,
IIF(
NOT(ISEMPTY([Measures].[measure1]))
OR NOT(ISEMPTY([Measures].[measure2])),
[Dim Date].[Date].CurrentMember.MemberValue,
NULL
)
);
I’m assuming the KeyColumn or ValueColumn is more likely to sort properly than the name. So if MemberValue doesn’t work then try Member_Key.
The most efficient way of accomplishing this would be to add a date column in the fact table with measure 1 and measure 2 then create a AggregateFunction=Min measure on it. But you said you couldn’t change the cube so I didn’t propose that superior option.

MDX show all sales until now

i have a huge table of cashflows that means there are +int values for income and -int values for outcome.
I have MeasureGroup for Sum the amount of money.
I now want to display not only the sum of money per month but also the sum of all the past time until the current month so like that:
Month MoneyAmount Total
1 20 20
2 -10 10
3 5 15
4 -10 5
So i know for the first part its just like
select [Measures].[Money] on 0,
[Date].[Month].Members on 1
From MyCube
but how can i add the sum column?
i thought about something like SUM( { NULL : [Date].[Month].CurrentMember } , [Measures].[Money] ) but that didnt work as well :(

In MDX, the total is already there. You do not have to do complex calculations to get it.
But it depends on your exact hierarchy structure how the All member is called. If you have a date user hierarchy named [Date].[Date], and it has a month level named [Date].[Date].[Month], then the all member of the hierarchy would probably be called something like [Date].[Date].[All]. If [Month] is an attribute hierarchy of the Date dimension, then the "all member" would probably be called [Date].[Month].[All]. In the latter case, the all member would already be the first member of the set [Date].[Month].Members. As you are asking the question, I am assuming this is not the case, and you are using a user hierarchy. Then you could change your MDX query to
select [Measures].[Money] on 0,
Union([Date].[Month].Members, { [Date].[Date].[All] }) on 1
From MyCube
Please note that you can change the name of the All member in the property settings of a dimension when designing an Analysis Services dimension, hence I cannot know the definitive name without knowing the details of this setting in your cube. So you might have to adapt the name of the all member.
You can find this name out in SQL Server Management Studio in an MDX window as follows: open the hierarchy that you are using, and then open the "Members" node, below which you should find the "All Member". You can drag this into your MDX statement, and the proper name will appear there.

As in a running sum?
You need a calculated measure, like this:
With Member [Measures].[Running Sum] as Sum( ( [Date].[Months].Members.Item(0) : [Date].[Months].CurrentMember ), [Measures].[Money])
Select [Date].[Months].Members on Rows,
{[Measures].[Money], [Measures].[Running Sum] } on Columns
From [MyCube]

Calculating percentile values in SSAS

I am trying to calculate percentile (for example 90th percentile point of my measure) in a cube and I think I am almost there. The problem I am facing is, I am able to return the row number of the 90th percentile, but do not know how to get my measure.
With
Member [Measures].[cnt] as
Count(NonEmpty(
-- dimensions to find percentile on (the same should be repeated again
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members
,
-- add the measure to group
[Measures].[Profit]))
-- define percentile
Member [Measures].[Percentile] as 90
Member [Measures].[PercentileInt] as Int((([Measures].[cnt]) * [Measures].[Percentile]) / 100)
**-- this part finds the tuple from the set based on the index of the percentile point and I am using the item(index) to get the necessary info from tuple and I am unable to get the measure part
Member [Measures].[PercentileLo] as
(
Order(
NonEmpty(
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members,
[Measures].[Profit]),
[Measures].[Profit].Value, BDESC)).Item([Measures].[PercentileInt]).Item(3)**
select
{
[Measures].[cnt],
[Measures].[Percentile],[Measures].[PercentileInt],
[Measures].[PercentileLo],
[Measures].[Profit]
}
on 0
from
[TestData]
I think there must a way to get measure of a tuple found through index of a set. Please help, let me know if you need any more information. Thanks!

You should extract the tuple at position [Measures].[PercentileInt] from your set and add the measure to it to build a tuple of four elements. Then you want to return its value as the measure PercentileLo, i. e. define
Member [Measures].[PercentileLo] as
(
[Measures].[Profit],
Order(
NonEmpty(
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members,
[Measures].[Profit]),
[Measures].[Profit], BDESC)).Item([Measures].[PercentileInt])
)
The way you implemented it, you tried to extract the fourth (as Item() starts counting from zero) item from a tuple containing only three elements. Your ordered set only has three hierarchies.
Just another unrelated remark: I think you should avoid using complete hierarchies for [Calendar].[Hierarchy].members, [Region Dim].[Region].members, and [Product Dim].[Product].members. Your code looks like you are including all levels (including the all member) in the calculation. But I do not know the structure and names of your cube, hence I may be wrong with this.

An alternate method could be to find the median of the last 20% of the records in the table. I've used this combination of functions to find the 75th percentile. By dividing the record count by 5, you can use the TopCount function to return a set of tuples that make up 20% of the whole table sorted in descending order by your target measure. The median function should then land you at the correct 90th percentile value without having to find the record's coordinates. In my own use, I use the same measure for the last parameter in both the Median and TopCount functions.
Here's my code:
WITH MEMBER Measures.[90th Percentile] AS MEDIAN(
TOPCOUNT(
[set definition]
,Measures.[Fact Table Record Count] / 5
,Measures.[Value by which to sort the set so the first 20% of records are chosen]
)
,Measures.[Value from which the median should be determined]
)
Based on what you've supplied in your problem definition, I would expect your code to look something like this:
WITH MEMBER Measures.[90th Percentile] AS MEDIAN(
TOPCOUNT(
{
[Calendar].[Hierarchy].members *
[Region Dim].[Region].members *
[Product Dim].[Product].members
}
,Measures.[Fact Table Record Count] / 5
,[Measures].[Profit]
)
,[Measures].[Profit]
)

MDX - Count of Filtered CROSSJOIN - Performance Issues

BACKGROUND: I've been using MDX for a bit but I am by no means an expert at it - looking for some performance help. I'm working on a set of "Number of Stores Authorized / In-Stock / Selling / Etc" calculated measures (MDX) in a SQL Server Analysis Services 2012 Cube. I had these calculations performing well originally, but discovered that they weren't aggregating across my product hierarchy the way I needed them to. The two hierarchies predominantly used in this report are Business -> Item and Division -> Store.
For example, in the original MDX calcs the Stores In-Stock measure would perform correctly at the "Item" level but wouldn't roll up a proper sum to the "Business" level above it. At the business level, we want to see the total number of store/product combinations in-stock, not a distinct or MAX value as it appeared to do originally.
ORIGINAL QUERY RESULTS: Here's an example of it NOT working correctly (imagine this is an Excel Pivot Table):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 2,416 2,392 99.01%
[-] Business Two 2,377 2,108 93.39%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 84.06%
-Item 3 2,377 2,108 88.68%
-Item N ... ... ...
FIXED QUERY RESULTS: After much trial and error, I switched to using a filtered count of a CROSSJOIN() of the two hierarchies using the DESCENDANTS() function, which yielded the correct numbers (below):
[FILTER: CURRENT WEEK DAYS]
[BUSINESS] [AUTH. STORES] [STORES IN-STOCK] [% OF STORES IN STOCK]
[+] Business One 215,644 149,301 93.90%
[-] Business Two 86,898 55,532 83.02%
-Item 1 2,242 2,094 99.43%
-Item 2 2,234 1,878 99.31%
-Item 3 2,377 2,108 99.11%
-Item N ... ... ...
QUERY THAT NEEDS HELP: Here is the "new" query that yields the results above:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
FILTER(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
)
),
[Measures].[Inventory Qty] > 0
)
),
FORMAT_STRING = "#,#",
NON_EMPTY_BEHAVIOR = { [Inventory Qty] },
This query syntax is used in a bunch of other "Number of Stores Selling / Out of Stock / Etc."-type calculated measures in the cube, with only a variation to the [Inventory Qty] condition at the bottom or by chaining additional conditions.
In its current condition, this query can take 2-3 minutes to run which is way too long for the audience of this reporting. Can anyone think of a way to reduce the query load or help me rewrite this to be more efficient?
Thank you!
UPDATE 2/24/2014: We solved this issue by bypassing a lot of the MDX involved and adding flag values to our named query in the DSV.
For example, instead of doing a filter command in the MDX code for "number of stores selling" - we simply added this to the fact table named query...
CASE WHEN [Sales Qty] > 0
THEN 1
ELSE NULL
END AS [Flag_Selling]
...then we simply aggregated these measures as LastNonEmpty in the cube. They roll up much faster than the full-on MDX queries.

It should be much faster to model your conditions into the cube, avoiding the slow Filter function:
If there are just a handful of conditions, add an attribute for each of them with two values, one for condition fulfilled, say "cond: yes", and one for condition not fulfilled, say "cond: no". You can define this in a view on the physical fact table, or in the DSV, or you can model it physically. These attributes can be added to the fact table directly, defining a dimension on the same table, or more cleanly as a separate dimension table referenced from the fact table. Then define your measure as
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
CROSSJOIN(
DESCENDANTS(
[Product].[Item].CURRENTMEMBER,
[Product].[Item].[UPC]
),
DESCENDANTS(
[Division].[Store].CURRENTMEMBER,
[Division].[Store].[Store ID]
),
{ [Flag dim].[cond].[cond: yes] }
)
)
Possibly, you even could define the measure as a standard count measure of the fact table.
In case there are many conditions, it might make sense to add just a single attribute with one value for each condition as a many-to-many relationship. This will be slightly slower, but still faster than the Filter call.

I believe you can avoid the cross join as well as filter completely. Try using this:
CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS
CASE WHEN [Product].[Item Name].CURRENTMEMBER IS [Product].[Item Name].[All]
THEN
SUM(EXISTS([Product].[Item Name].[Item Name].MEMBERS,[Business].[Business Name].CURRENTMEMBER),
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
))
ELSE
COUNT(
EXISTS(
[Division].[Store].[Store].MEMBERS,
(
[Business].[Business Name].CURRENTMEMBER,
[Product].[Item Name].CURRENTMEMBER
),
"Measure Group Name"
)
)
END
I tried it using a dimension in my cube and using Area-Subsidiary hierarchy.
The case statement handles the situation of viewing data at Business level. Basically, the SUM() across all members of Item Names used in CASE statement calculates values for individual Item Names and then sums up all the values. I believe this is what you needed.

A simple MDX question

I am new to MDX and I know that this must be a simple question but I haven't been able to find an answer.
I am modeling a a questionnaire that has questions and answers. What I am trying to achieve is to find out the number of people who gave specific answers to questions., e.g. the number of males aged between 20-25
When I run the query below for the questions individually the correct result is returned
SELECT
[Measures].[Fact Demographics Count] ON Columns
FROM
[Dsv All]
WHERE
[Answer].[Dim Answer].&[1]
[Measures].[Fact Demographics Count] is a count of the primary key column
[Answer].[Dim Answer].&[1] is the key for the Male answer
Result for number of people who are male = 150
Result for number of people who are between 20-25 = 12
But when I run the next query below rather than getting the number people who are males and aged between 20-25. I get the sum of the number of people who are males and the number of people who are between 20-25.
SELECT
[Measures].[Fact Demographics Count] ON Columns
FROM
[Dsv All]
WHERE
{[Answer].[Dim Answer].&[1],[Answer].[Dim Answer].&[9]}
result = 162
The structure of the fact table is
FactDemographicsKey,
RespodentKey,
QuestionKey,
AnswerKey
Any help would be greatly appreciated
Thanks

Take a look at the MDX function FILTER - this may give you what you need. A combination of FILTER and Member Properties to filter against the ID's might do it. You're having a problem because what you're trying to do is a little against the grain of how an OLAP cube is structured (from my experience) because Age and Gender are both members of the same dimension (Answers), which means that they each get their own cells for aggregation, but unlike if Age and Gender were each on their own dimension, they don't get aggregated with respect to one another except to get added together. In an OLAP cube, each combination of each member of each dimension with each member of every other dimension gets a "cell" with the value of each measure that is unique to that combination - that is what you want, but members of the same dimension (such as Answers) aren't cross-calculated in that way.
If possible, I would recommend breaking out the individual answers into individual dimensions, i.e. Age and Gender each have their own dimensions with their own members, then what you want to do will naturally flow out of your cube. Otherwise, I'm afraid you will have lots of MDX fiddelry to do. (I am not an MDX expert, though, so I could be completely off base on this one, but that is my understanding)
Also, definitely read the book previously mentioned, MDX Solutions, unless this is the only MDX query you think you'll need to write. MDX and Multidimensional analysis are nothing like SQL, and a solid understanding of the structure of an OLAP database and MDX in general is absolutely essential, and that book does a very, very nice job of getting you where you need to be in that department.

When trying to figure out problems with where-criteria or slices I find it helpful to breakdown the items that you're slicing on into dimensions, rather than measures.
select
[Measures].[Fact Demographics Count] on Columns
from [Dsv All]
where
{
[Answer].[Dim Answer].&[1],
[Dim Age Band].[20-25]
}
Although then you're not really using the power of MDX - you're getting just one value.
select
[Dim Answer].Members on Columns,
[Dim Age Band].Members on Rows
from [Dsv All]
where ( [Measures].[Fact Demographics Count] )
Will give you a pivot table (or crosstable) breaking down gender (on columns) by age-bands (on rows).
BTW - ff you're learning MDX this book: MDX Solutions is far and away the best starting point that I've found.

Firstly thanks to everyone for their replies. This was an interesting one to solve and for anyone new to MDX and coming from SQL its an easy trap to fall into.
So for those interested here is a brief overview of the solution.
I have 3 tables
factDemographics: holds respondents and their answers (who answered what)
dimAnswer: the answers
dimRespondent: the respondents
In the datasource view for the cube I duplicated factDemographics 5 times using Named Queries and I named these fact1, fact2, ..., fact5. (which will create 5 measure groups)
Using VS Studio's create cube wizard I set the following fact tables
fact1, fact2, ... as fact tables
dimRespondent a fact table. I use this table to get the number of respondents.
Removed the original factDemographics table.
Once the cube was created I duplicated the dimAnswer dimension 5 times, naming them filter1, filter2, ...
Finally in the Cube Structure's Dimension Usage tab I linked these together as follows
filter1 many to many dimRespondent
filter2 many to many dimRespondent
filter3 many to many dimRespondent
filter4 many to many dimRespondent
filter5 many to many dimRespondent
filter1 regular relationship fact1
filter2 regular relationship fact2
filter3 regular relationship fact3
filter4 regular relationship fact4
filter5 regular relationship fact5
This now enables me to rewrite the query I used in my original post as
SELECT
[Measures].[Dim Respondent Count] On 0
FROM
[DemographicsCube]
WHERE
(
[Filter1].[Answer].&[Male],
[Filter2].[Answer].&[20-25]
)
My query can now be filtered by up to 5 questions.
Although this works I'm sure that there is a more elegant solution. If anyone knows what that is I'd love to hear it.
Thanks

If you are using MSSQL, you can use the "WITH ROLLUP" to get some extra rows which would have the information you want. Also, you are not using a "GROUP BY" which you will need.
Use the GROUP BY to break up the set into groups and then use aggregate functions to get your counts and other stats.
Example:
select AGE, GENDER, count(1)
from MY_TABLE
group by AGE, GENDER
with rollup
This would give you the number of each gender of person in your table in each age group, and the "rollup" would give you the total number of people in your table, the numbers in each age group regardless of gender, and the numbers of each gender regardless of age. Something like
AGE GENDER COUNT
--- ------ -----
20 M 1245
21 M 1012
20 F 942
21 F 838
M 2257
F 1780
20 2187
21 1850
4037

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas