I would like to use SSAS and a TimeSeries Mining Structure to predict when the predictable value will reach a certain threshold.
For Example:
SELECT [Info Key],
       PredictTimeSeries([Free Space], 200) AS ForecastedSize
FROM [Drive Module Information]
WHERE ForecastedSize < 10000 -- (<< this does not work)
The idea is that this would tell me the date on which the forecast says the drive's free space will drop below 10000.
How do I write an MDX query to accomplish this?
Thanks,
Brian
UPDATE 1:
I think I can accomplish it this way, with some limitations:
SELECT [Drive Module Information].[Info Key],
       (SELECT *
        FROM PredictTimeSeries([Drive Module Information].[Free Space], 5000) AS [FUTURE]
       ) AS T
FROM [Drive Module Information]
WHERE [Info Key] = 'MyMachine C:'
  AND [Free Space] < 10000
The limitation is that I can only look a set number of steps forward without the query getting crazy, which is OK. I am fine with just knowing that the drive will not fill up over the next week or month.
I did not figure out how to use FILTER in this situation, and I am still curious whether there is a way to ask "on what date will this predictable value reach this value".
UPDATE 2: I have come to the conclusion that SSAS was not meant to do this, so until I find out differently, I will mark icCube as the answer since he helped out.
MDX is not SQL; the WHERE clause in MDX is not a real filter. As a quick introduction you can go through this gentle MDX tutorial.
There is an MDX FILTER function you can use.
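For example, a minimal sketch of FILTER against a regular cube (the [Date] hierarchy, measure, and cube names here are placeholders, not your mining model):

SELECT
    [Measures].[Free Space] ON COLUMNS,
    FILTER(
        [Date].[Date].Members,              -- candidate dates
        [Measures].[Free Space] < 10000     -- keep only dates below the threshold
    ) ON ROWS
FROM [Drive Module Information]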
I need some more advanced MDX knowledge than I have.
I need to get the RepoRate_MAX for repo products at book and instrument level, but looking at the Java code I'm replacing, that code always uses the max MurexId.
How can I do the below (I've placed MAX on the dimension here, but this is wrong)? I need the combination of the dimensions and also the MAX MurexId:
[Measures].[RepoRate_VAL] = (([Deal].[ProductType].&[REPO],[Deal].[Book],[Deal].[Instrument],MAX([Deal].[MurexId])),[Measures].[RepoRate_MAX])
I'm sure it's a simple one but my mind is part way between the Java OO and MDX worlds currently haha :D
Thanks
Leigh
So after some experimenting I found out about the TAIL and Item MDX functions.
I think at one point I did get it working, but didn't make a note of what did work. I was playing around with this and variants of it, but most versions ended up with unusable query times:
[Measures].[RepoRate_VAL] = (([Deal].[ProductType].&[REPO],[Deal].[Book],[Deal].[Instrument],TAIL(EXISTING([Deal].[MurexId].[MurexId])).Item(0)),[Measures].[RepoRate_MAX])
So I then decided to push the RepoRate calculation back to the SQL data preparation script. Cleaner/smoother data is always better, and it keeps the calculated members simple.
I used SQL to determine the RepoRate at trade level with MAX(MurexId) and GROUP BY on Book, Instrument, and then updated my main fact table to ensure that the correct RepoRate was set at Book/Instrument level.
Thus the calculated member is then:
[Measures].[RepoRate_VAL] = (([Deal].[Book],[Deal].[Instrument]),[Measures].[RepoRate_MAX])
Fast data prep and a fast calculated member on the Excel/Pivot/UI layer.
I have a problem which looked like a simple requirement ... it turned out it wasn't ... at least for me.
At the moment I feel like I've read half of the MDX internet ...
I'm using the latest Saiku CE (Mondrian 4), and my simplified cube looks like this:
Dimensions:
Machine.Manufacturer
Measures:
Measure.[Msg count]
Measure.[Distinct machines]
Measure.[Distinct days]
Calculated measures:
Measure.[Msg xMxD] which is basically: (Measure.[Msg count] / Measure.[Distinct machines] / Measure.[Distinct days]).
Measure.[Msg xMxd %] which is: (Measure.[Msg xMxD] / SUM(Measure.[Msg xMxD], Machine.[Manufacturer].[All Manufacturers]))
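Written out as query-scoped calculated members, the definitions look roughly like this (a rough sketch only; the cube name and exact hierarchy paths are assumed):

WITH
    MEMBER [Measures].[Msg xMxD] AS
        [Measures].[Msg count] / [Measures].[Distinct machines] / [Measures].[Distinct days]
    MEMBER [Measures].[Msg xMxd %] AS
        [Measures].[Msg xMxD]
        / SUM({[Machine].[Manufacturer].[All Manufacturers]}, [Measures].[Msg xMxD])
SELECT {[Measures].[Msg xMxD], [Measures].[Msg xMxd %]} ON 0,
       [Machine].[Manufacturer].[All Manufacturers].Children ON 1
FROM [MyCube]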
What I want to accomplish is this table:
But as you've probably guessed, I have a problem with the Measure.[Msg xMxd %] measure ...
Because it is calculated on the basis of another calculated measure, the % calculation is done after the summing for a particular Manufacturer, and I don't know how to overcome this.
The closest answer I found was this one: https://forums.pentaho.com/threads/160265-Calculate-members-in-mdx/
... but this concerns only one generated member as a sum of all manufacturers.
I've also found some solutions based on the Axis(...) function, but that is unavailable in Mondrian.
Do you have any ideas? Is there a way to generate a set of calculated members? That would (at least theoretically) let me set the solve order for all child members of [Machine].[Manufacturer].
Any help is much appreciated.
I am using an incident management cube to try to determine how many tickets were opened by a specific set of users. For instance, if there were 80,000 tickets opened - how many were opened by .[Submitter].&[xAutoData] and .[Submitter].&[xAutoVoice]?
Sounds pretty easy, but I'm just learning MDX so this is a bit of an uphill battle. I'm thinking the best way is to use a custom measure, but the closest I've got to any kind of result was with this query.
WITH MEMBER [Measures].[AutoTickets] AS
COUNT({[INC - Incident Management].[Submitter].&[xAutoData],[INC - Incident Management].[Submitter].&[xAutoVoice]})
SELECT
{[Total Incidents],[AutoTickets]} ON 0,
{[INC - Incident Management].[Assigned Group]} ON 1
FROM [Incident Management Cube];
All it ever does is return '2' when there should be quite a bit more.
A point in the right direction would be appreciated and I think would help me learn what's going on behind the scenes.
Thanks!
EDIT:
Over the past few days, I've got a bit closer. I think I need to use a SUM function combined with an IIF.
Something like
WITH Member [Measures].[AutoTickets] AS
SUM(IIF([INC - Incident Management].[Submitter] = [INC - Incident Management].[Submitter].&[na\xData] OR
[INC - Incident Management].[Submitter] = [INC - Incident Management].[Submitter].&[na\xVoice],1,0))
But this returns an error. If I test the query without the SUM, the IIF part does work as expected so I think I'm missing one more piece on how I'm supposed to use the SUM function.
I think you need to clarify your question a bit more in terms of what you want to achieve. In the query you have provided, the calculated member "AutoTickets" only counts the members of the [Submitter] attribute of the [INC - Incident Management] dimension that you listed explicitly, so that number will always be 2 {xAutoData and xAutoVoice} across all dimensions; it is effectively a constant.
If instead you want to know how many [Submitter]s contributed to the Total Incidents measure, then I suggest that you try the following statement:
WITH MEMBER [Measures].[AutoTickets] AS
    COUNT(NonEmpty({[INC - Incident Management].[Submitter].Members},
                   {[Measures].[Total Incidents]}))
SELECT
{[Total Incidents],[AutoTickets]} ON 0,
{[INC - Incident Management].[Assigned Group]} ON 1
FROM [Incident Management Cube];
Supposing that you have a measure [Measures].[TicketCount] which gives the count of tickets, the query below would give you the tickets for [Submitter].&[xAutoData] and [Submitter].&[xAutoVoice]. Just replace it with the relevant measure from your cube.
select [Measures].[TicketCount] on columns,
{[INC - Incident Management].[Submitter].&[xAutoData],[INC - Incident Management].[Submitter].&[xAutoVoice]}
on rows
from [Incident Management Cube]
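If you want this as a calculated member rather than putting the two submitters on an axis (closer to the SUM/IIF attempt in the edit), a sketch along these lines should work; again, substitute your real count measure for [Measures].[TicketCount]:

WITH MEMBER [Measures].[AutoTickets] AS
    -- sum the base count measure over just the two submitters
    SUM({[INC - Incident Management].[Submitter].&[xAutoData],
         [INC - Incident Management].[Submitter].&[xAutoVoice]},
        [Measures].[TicketCount])
SELECT {[Measures].[TicketCount], [Measures].[AutoTickets]} ON 0,
       [INC - Incident Management].[Assigned Group].Members ON 1
FROM [Incident Management Cube]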
Here is my case. I am doing a small training exercise in creating an OLAP cube in SSAS, and as part of it I need to calculate the median time from issue creation to issue resolution.
According to the Microsoft docs I should use the MEDIAN function in MDX, so here is my code:
MEDIAN([Issue].[Issue ID],[Measures].[Hours Resolved])
Short explanation: [Measures].[Hours Resolved] is a measure calculated in the database as "resolved issue time" - "creation issue time" using the DATEDIFF function. Both columns are of the smalldatetime data type.
And it looks like it works properly for the case on the screen below, except for the "Grand Total" value in the Mediana column.
I believe that the Grand Total value should be 12, because that is the correct result according to the way the median should be calculated (I also checked it in Excel). So am I wrong here and this is proper behaviour? Or am I missing something in my calculation or configuration in SSAS?
Second case in this exercise.
When I add, for example, the Group Name column like in the picture below:
In my understanding the value in the Mediana column for, let's say, the CRM part should be 9.
Can you please tell me whether I'm right or wrong? If I'm right, how do I achieve this? If I'm wrong, please point out the mistake in my solution. This is the first time I'm calculating a median.
It's a little embarrassing that no one even looked at it - 16 views and probably all mine - but it doesn't matter anymore because I figured it out myself.
For a proper median calculation across all dimensions I should use the MEDIAN function together with a SCOPE statement in MDX. Here is the code in case someone faces the same problem in the future:
// Placeholder measure; the actual value is assigned in the SCOPE below
CREATE MEMBER CURRENTCUBE.[Measures].[Median]
    AS Null,
    VISIBLE = 1;

// Recalculate the median over individual issues at every level of aggregation
SCOPE([Measures].[Median]);
    THIS = MEDIAN([View Issue Median].[Issue ID].[Issue ID], [Measures].[Hours Resolved]);
END SCOPE;
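With this in place the median is recalculated at every level, so the Grand Total and per-group values come out as expected. A quick test query might look like this (the [Group Name] hierarchy and cube name below are placeholders; use your own):

SELECT [Measures].[Median] ON 0,
       [View Issue Median].[Group Name].Members ON 1
FROM [My Cube]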
I am building a data model with PowerPivot for Excel 2013 and need to be able to identify the max number of emails sent per person. The DAX formula below gives me the result that I am looking for, but performance is incredibly slow. Is there an alternative that will compute a maximum by group without the performance hit?
Maximum Emails per Constituent:
=MAXX(SUMMARIZE('Email Data','Email Data'[person_id],"MAX Value",
([Emails Sent]/[Unique Count People])),[MAX Value])
So, without the measure definitions for [Emails Sent] or [Unique Count People], it is not possible to give definitive advice on performance. I'm going to assume they are trivial measures, though, based on their names - note that this is an assumption and its truth will affect the rest of my post. That being said, there is an obvious optimization to make to start with.
Maximum Emails per Constituent:=
MAXX(
ADDCOLUMNS(
VALUES('Email Data'[person_id])
,"MAX Value"
,[Emails Sent] / [Unique Count People]
)
,[MAX Value]
)
I used ADDCOLUMNS() rather than SUMMARIZE() to calculate the new column. See this post for an explanation of the performance implications.
Additionally, since you're not grouping by multiple columns, there's no need to use SUMMARIZE(). The performance impact of using VALUES() instead should be minimal.
The other question that comes to mind is whether this needs to be a measure. Are you going to be slicing by other dimensions? If not, this becomes a static attribute of a [person_id] which could be calculated during ETL, or in a calculated column.
A final note - I've also been assuming that your model itself is optimal. Again, we'd need to see it to comment on whether you could be seeing performance issues from something you're doing there.