MDX calculation has wrong order of precedence - ssas

I'm having an issue with an MDX query, and I think it boils down to the order of precedence between calculating an aggregate and evaluating a calculated member.
Let me start with the underlying data. It revolves around a valuation (which has a date and some other data such as a member type, a scheme and, crucially for this question, a loading factor) and an associated value.
The data
Valuation Table
Id | Valuation Date | Member Type | Scheme | Loading Factor
=============================================================
1 | 2010-01-01 | TypeA | Scheme X | 0.02
2 | 2010-01-01 | TypeB | Scheme X | 0.02
3 | 2010-01-01 | TypeA | Scheme Y | 0.02
4 | 2010-01-01 | TypeB | Scheme Y | 0.02
ValuationValue table
ValuationId | Value
====================
1 | 1000.0
2 | 2000.0
3 | 3000.0
4 | 4000.0
When loaded into a cube, this gives a Valuation dimension with the attributes MemberType, Scheme and Date, a ValuationValue measure group containing the Value measure, and a Valuation measure group containing the Loading Factor measure, like so:
Cube
- Measure Groups
  - Valuation
    |_Loading Factor
  - ValuationValue
    |_Value
- Dimensions
  - Valuation
    |_MemberType
    |_Scheme
    |_Date
The question
The loading factor is used to load the Value; think of it like a tax, so 0.02 means "Loading amount is 2% of the value". When returning Value from a query, I also need to calculate the amount to load this value by. A typical query might look like:
SELECT
{
[Measures].[Value]
} ON 0,
[Valuation].[Scheme] ON 1
FROM Cube
This would return 2 rows, and as you can see by comparing to the data above it correctly sums across memberType:
Scheme | Value
=================
Scheme X | 3000.0
Scheme Y | 7000.0
Now, if I try to apply my loading factor in that query, it all goes wrong. I'll demonstrate. Given the following query:
WITH MEMBER [Measures].[Loading Value]
AS
(
[Measures].[Value] * [Measures].[Loading Factor]
)
SELECT
{
[Measures].[Value] ,
[Measures].[Loading Value]
} ON 0,
[Valuation].[Scheme] ON 1
FROM Cube
I get the result
Scheme | Value | Loading Value
=================================
Scheme X | 3000.0 | 120.0
Scheme Y | 7000.0 | 280.0
Basically, what is happening is that it is summing my Loading Factor and then multiplying that by the sum of my values. The first row above should be 1000 * 0.02 + 2000 * 0.02 = 60; instead it's calculating 3000 * 0.04 = 120.
This is of course a contrived example, and my actual structure is a bit more complex, but I think this demonstrates the problem. I was under the impression that the calculated member in the example above should be evaluated on a row-by-row basis, instead of after the aggregation of my Value measure.
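The difference between the two orders of evaluation can be reproduced outside the cube. The following is only a minimal Python sketch over the four sample rows above, not a model of how SSAS evaluates cells; the function names are made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (scheme, loading_factor, value).
rows = [
    ("Scheme X", 0.02, 1000.0),
    ("Scheme X", 0.02, 2000.0),
    ("Scheme Y", 0.02, 3000.0),
    ("Scheme Y", 0.02, 4000.0),
]


def multiply_after_aggregation(rows):
    """Sum both measures per scheme first, then multiply the two totals."""
    value_sum = defaultdict(float)
    factor_sum = defaultdict(float)
    for scheme, factor, value in rows:
        value_sum[scheme] += value
        factor_sum[scheme] += factor
    return {s: value_sum[s] * factor_sum[s] for s in value_sum}


def multiply_before_aggregation(rows):
    """Multiply on each row first, then sum the products per scheme."""
    loading = defaultdict(float)
    for scheme, factor, value in rows:
        loading[scheme] += value * factor
    return dict(loading)


for scheme, amount in multiply_after_aggregation(rows).items():
    print(scheme, "after aggregation:", round(amount, 2))
for scheme, amount in multiply_before_aggregation(rows).items():
    print(scheme, "row by row:", round(amount, 2))
```

The first function reproduces the unwanted 120 and 280; the second gives the 60 for Scheme X that the question expects (and 140 for Scheme Y).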
Thanks for any replies.

Your [Measures].[Loading Factor]: how is it aggregated? Is it a SUM?
Calculated members are generally evaluated at the granularity of the rows returned, if I remember correctly, unless you specify otherwise.
If you want an example, take a look at the output of the currency conversion wizard. It does something similar using the LEAVES function. You will need to do this in the MDX script as a SCOPE'd assignment, though.
Given your description, the code could be something like:
CREATE MEMBER CURRENTCUBE.[Measures].[Loading Value] AS NULL;
Scope( { [Measures].[Loading Value] } );
    Scope( Leaves([Valuation]) );
        // Multiply at the leaf level of the Valuation dimension
        This = [Measures].[Value] * [Measures].[Loading Factor];
        Format_String(This) = "#,##0.00;-#,##0.00";
    End Scope;
End Scope;

I'm not sure I follow your example completely, but you might try using SOLVE_ORDER and SCOPE_ISOLATION to manipulate the order of the calculations.
For example,
WITH
MEMBER [Measures].[Custom Calculation] AS
'([Measures].[Sales Count] - [Measures].[Unit Returns])',
SOLVE_ORDER = 65535, SCOPE_ISOLATION = CUBE
SELECT
{[Measures].[Custom Calculation]} ON COLUMNS,
NON EMPTY [Time].[YQMD].[Day].AllMembers ON ROWS
FROM [Waremart]

This one turned out to be REALLY easy.
WITH MEMBER [Measures].[Loading Value]
AS
(
[Measures].[Value] * [Measures].[Loading Factor]
)
MEMBER [Measures].[Total Loading Value]
AS
SUM (
EXISTING [Valuation].[Id].[Id],
[Measures].[Loading Value]
)
SELECT
{
[Measures].[Value] ,
[Measures].[Total Loading Value]
} ON 0,
[Valuation].[Scheme] ON 1
FROM Cube

Related

MDX/SSAS sum of certain values over totals - calculate success/failure rate

I have a simplified example cube used for learning purposes, and to try to figure out a more complex problem.
The cube represents a small web server log, with:
- number of hits as a measure
- hostname as a dimension
- HTTP status code as a dimension
I can get a breakdown on number of hits per host and http status code with the MDX
SELECT NON EMPTY { [Measures].[CNT HITS] } ON COLUMNS,
NON EMPTY { ([DIM NOS STATUSCODE].[Statuscode].[Statuscode].ALLMEMBERS *
[DIM NOS HOST].[HOST].[HOST].ALLMEMBERS ) } ON ROWS
FROM [DW]
Now what I would like is to group the HTTP status codes, e.g. to show the percentage of successful hits (all 2xx status codes) and the percentage of unsuccessful hits (all non-2xx status codes).
I can do this with SQL, but I'm at a loss on how to do it with MDX. e.g. with SQL I'd do:
select HOST,
sum(CNT_HITS) as HITS ,
SUM(CASE WHEN s.statuscode div 100 = 2 THEN CNT_HITS ELSE 0 END)/sum(CNT_HITS) * 100 as success_percent,
SUM(CASE WHEN s.statuscode div 100 = 2 THEN 0 ELSE CNT_HITS END)/sum(CNT_HITS) * 100 as failed_percent,
sum(CASE WHEN s.statuscode = 401 THEN CNT_HITS ELSE 0 END)/sum(CNT_HITS) * 100 as auth_fail_percent
from FACT_NOS_HTTPLOG fact
group by HOST;
And for the data shown in the above screenshot, I'd get
+-----------------+------+-----------------+----------------+-------------------+
| HOST | HITS | success_percent | failed_percent | auth_fail_percent |
+-----------------+------+-----------------+----------------+-------------------+
| www.example.com | 1610 | 93.1677 | 6.8323 | 6.2112 |
| www.test.com | 50 | 0.0000 | 100.0000 | 0.0000 |
+-----------------+------+-----------------+----------------+-------------------+
But how can I accomplish this with MDX?

I think the easiest way to accomplish this is to add a column to your fact table (or view/query) that would contain keys for either success_percent, failed_percent or auth_fail_percent. Then create a new dimension with these 3 members. Join to the fact and you have your solution without the need for any MDX at all.

Add an extra attribute [Status] to your [DIM NOS STATUSCODE] dimension and use MDX for percentage, like this:
([DIM NOS STATUSCODE].[Status].&[Failed],[Measures].[CNT HITS]) / [Measures].[CNT HITS]

It will involve a certain amount of hard coding - although you could add these measures into your cube script.
WITH
MEMBER [Measures].[failed_percent] AS
DIVIDE(
(
[DIM NOS STATUSCODE].[Status].&[Failed]
,[DIM NOS HOST].[HOST].currentmember
,[Measures].[CNT HITS]
)
, (
[DIM NOS STATUSCODE].[Status].[All]
,[DIM NOS HOST].[HOST].currentmember
,[Measures].[CNT HITS]
)
)
SELECT
NON EMPTY
{
[Measures].[CNT HITS]
,[Measures].[failed_percent]
} ON COLUMNS,
NON EMPTY
[DIM NOS HOST].[HOST].[HOST].ALLMEMBERS
ON ROWS
FROM [DW];
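As a cross-check on what the ratio above computes, here is a small Python sketch of the same calculation. The screenshot data is not available, so the per-status hit counts below are hypothetical, chosen only to be consistent with the totals and percentages in the question; the function names are also illustrative.

```python
from collections import defaultdict

# Hypothetical (host, status_code, hits) rows; not the asker's actual data.
hits = [
    ("www.example.com", 200, 1500),
    ("www.example.com", 401, 100),
    ("www.example.com", 404, 10),
    ("www.test.com", 500, 50),
]


def status_group(code):
    """Plays the role of the extra [Status] attribute: 2xx is a success."""
    return "Success" if code // 100 == 2 else "Failed"


def percentages_by_host(rows):
    """Return hits, success_percent and failed_percent for each host."""
    total = defaultdict(int)
    failed = defaultdict(int)
    for host, code, count in rows:
        total[host] += count
        if status_group(code) == "Failed":
            failed[host] += count
    result = {}
    for host, all_hits in total.items():
        # Like DIVIDE in the MDX, avoid dividing by zero.
        failed_pct = failed[host] / all_hits * 100 if all_hits else None
        success_pct = 100 - failed_pct if all_hits else None
        result[host] = {
            "hits": all_hits,
            "success_percent": success_pct,
            "failed_percent": failed_pct,
        }
    return result


for host, stats in percentages_by_host(hits).items():
    print(host, stats)
```

The numerator is the tuple restricted to one status group and the denominator is the same measure over all statuses, which is the shape of the MDX expression in the answer.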

How do I format numbers in an Access crosstab query to show two decimal places?

I have an Access crosstab query that displays the following results:
| SHORE_TYPE | Total Miles | Class 1 | Class 2 | Class 4 |
| ONSHORE | 31.37 | 0.337121212121212 | 12.4617424242424 | 0 |
I'd like it to display the following results instead. Note the 'Class' columns here show two decimal places:
| SHORE_TYPE | Total Miles | Class 1 | Class 2 | Class 4 |
| ONSHORE | 31.37 | 0.34 | 12.46 | 0.00 |
I've been able to configure the 'Total Miles' column by changing the Format and Decimal Places properties (in the Design View) to "Fixed" and "2," respectively. However, the query column (in Design View) that determines the value in the Class column has only a Format property, which I set to "Fixed"; there is not a Decimal Places property for me to adjust.
I have some similar crosstab queries that are showing the results in the way I desire, but I can't determine any differences between this one and those. Also, I've sometimes seen some of my queries display it the wrong way one time, then the desired way the next time.
This makes me wonder if the problem is a bug in Access, or if there is something implicitly defined in my code that I should explicitly define.
Here is my SQL:
TRANSFORM IIf(IsNull(Sum([qryPartL].[MILES_OF_PHYS_LENGTH])),0,
Sum([qryPartL].[MILES_OF_PHYS_LENGTH])) AS SumOfMILES_OF_PHYS_LENGTH
SELECT qryPartL.SHORE_TYPE, Sum(qryPartL.MILES_OF_PHYS_LENGTH) AS [Total Miles]
FROM qryPartL
GROUP BY qryPartL.SHORE_TYPE
PIVOT qryPartL.CLASS_LOC_text In ("Class 1","Class 2","Class 4");
EDIT:
After closing and re-opening this query, the Total Miles column is now displaying 31.3714015..., and the properties I had previously set for this column in the Design View are now blank. So, it looks like Access does not consistently save these property settings. At least not in the context in which I was using them.

The trick is to use a series of nested functions.
CDbl: Converts the data to a Double number data type
FormatNumber: Returns an expression formatted as a number with a specified precision (2)
Nz: Returns the specified value (0) when a field is null
The CDbl function won't work if a value is Null.
I also removed the IIf function from the TRANSFORM clause since Nz works better in this case.
Here is the new SQL that returns the desired results. (I've added new lines and indents to make it easier to read. This is not a necessary step, and may in fact not be remembered by Access.)
TRANSFORM
CDbl(
FormatNumber(
Nz(
Sum([qryPartL].[MILES_OF_PHYS_LENGTH])
,0)
,2)
) AS SumOfMILES_OF_PHYS_LENGTH
SELECT qryPartL.SHORE_TYPE,
CDbl(
FormatNumber(
Nz(
Sum(qryPartL.MILES_OF_PHYS_LENGTH)
,0)
,2)
) AS [Total Miles]
FROM qryPartL
GROUP BY qryPartL.SHORE_TYPE
PIVOT qryPartL.CLASS_LOC_text In ("Class 1","Class 2","Class 4");
Thanks to Allen Browne and a tip on his awesome Access website for leading me to this answer.
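The order of the nesting matters: the null has to be replaced before the value is formatted and converted back to a number. A rough Python analogue of the Nz, FormatNumber, CDbl chain is sketched below. The helper names are stand-ins, not the Access functions themselves, and details such as locale-specific separators and rounding of exact ties may differ from Access.

```python
def nz(value, value_if_null=0):
    """Stand-in for Nz: replace a null (None) with a default."""
    return value_if_null if value is None else value


def format_number(value, digits):
    """Stand-in for FormatNumber: returns text with a fixed number of decimals."""
    return f"{value:,.{digits}f}"


def cdbl(text):
    """Stand-in for CDbl: turn the formatted text back into a float."""
    return float(text.replace(",", ""))


def two_decimals(total):
    """Innermost call runs first: null handling, then formatting, then conversion."""
    return cdbl(format_number(nz(total, 0), 2))


# The two 'Class' values from the question, plus a null cell.
cells = [0.337121212121212, 12.4617424242424, None]
print([two_decimals(c) for c in cells])
```

Skipping the nz step would make the formatting step fail on the None cell, which mirrors the note above that CDbl does not work on a Null.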

How do you replace nulls in a crosstab query with zeroes?

Based on the following SQL in Access...
TRANSFORM Sum([Shape_Length]/5280) AS MILES
SELECT "ONSHORE" AS Type, Sum(qry_CurYrTrans.Miles) AS [Total Of Miles]
FROM qry_CurYrTrans
GROUP BY "ONSHORE"
PIVOT qry_CurYrTrans.QComb IN ('1_HCA_PT','2_HCA_PT','3_HCA_PT','4_HCA_PT');
... my results returned the following datasheet:
| Type | Total Of Miles | 1_HCA_PT | 2_HCA_PT | 3_HCA_PT | 4_HCA_PT |
| ONSHORE | 31.38 | | 0.30 | 7.80 | |
This result is exactly what I want except I want to see zeroes in the cells that are null.
What are some options for doing this? If possible, I'd like to avoid using a subquery. I'd also prefer the query to remain editable in Access' Design View.

I think you have to use the Nz function, which will allow you to convert NULLs to another value. In this case, I used the (optional) part of the function to say, "If Sum([Shape_Length]/5280) is NULL, set it to 0". You may have to use quotes around the 0, I can't recall.
TRANSFORM Nz(Sum([Shape_Length]/5280), 0) AS MILES
SELECT "ONSHORE" AS Type, Sum(qry_CurYrTrans.Miles) AS [Total Of Miles]
FROM qry_CurYrTrans
GROUP BY "ONSHORE"
PIVOT qry_CurYrTrans.QComb IN ('1_HCA_PT','2_HCA_PT','3_HCA_PT','4_HCA_PT');

MDX query and calculated members for two different averages in same query

I have an MDX/calculated member question here. It has been a while since I've done this and have forgotten a lot. I have a cube with the following dimensions and levels:
Sites
  Site Name
Clients
  Client Name
  Industry Name
I have a measure
  Product Count
What I want to show/return from an MDX query is the following:
Site | Prod Count | Avg Prod Count Across All Sites for Current Client | Avg Prod Count Across All Sites in Current Client's Industry
Example Data:
Site | Prod Count | Avg 1 | Avg 2
Site 1 | 100 | 50 | 200
Site 2 | 125 | 50 | 200
Site 3 | 112 | 50 | 200
What I'm trying to figure out is how or if I can use 2 different calculated members to calculate the averages above.
The challenge is that the query has to be in the following format because I'm using a reporting tool and it is generating the MDX.
SELECT
{
[Measures].[Product Count],
[Measures].[Calc Avg 1],
[Measures].[Calc Avg 2]
} ON COLUMNS,
{[Sites].[Site Name].[Site Name].Members} ON ROWS
FROM [Cube]
WHERE ([Clients].[Client Name].&[Client A])
So basically, my question is:
What would be the proper way to define the averages I'm looking for using calculated members?
Whenever I try it out I'm only able to calculate the average product count across all sites for the current client, but I'm not able to get the average across all sites in the current client's industry.

Here's an example using Adventure Works to get you started. The calculated members will need to be ported to the MDX script to use with your tool. Here's the mapping:
City = "Client Site"
State = "Client"
Country = "Client Industry"
WITH
MEMBER Measures.ClientCitiesCount AS
Exists(
[Customer].[City].[City] // represents client sites
,[Customer].[State-Province].CurrentMember // represents client
).Count
MEMBER Measures.ClientCitiesSales AS
SUM(
[Customer].[State-Province].CurrentMember
,[Measures].[Internet Sales Amount]
)
MEMBER Measures.AvgAcrossClientCities AS
ClientCitiesSales/ClientCitiesCount
MEMBER Measures.IndustryCitiesCount AS
Exists(
[Customer].[City].[City] // represents industry sites
,Exists(
[Customer].[Country].[Country] // represents client's industry
,[Customer].[State-Province].CurrentMember // represents client
)
).Count
MEMBER Measures.IndustryCitiesSales AS
SUM(
Exists(
[Customer].[Country].[Country]
,[Customer].[State-Province].CurrentMember
)
,[Measures].[Internet Sales Amount]
)
MEMBER Measures.AvgAcrossIndustryCities AS
IndustryCitiesSales/IndustryCitiesCount
SELECT
{
[Measures].[Internet Sales Amount]
,ClientCitiesCount
,ClientCitiesSales
,AvgAcrossClientCities
,IndustryCitiesCount
,IndustryCitiesSales
,AvgAcrossIndustryCities
} ON 0,
{
[Customer].[City].[City] // represents client sites
} ON 1
FROM
[Adventure Works]
WHERE
[Customer].[State-Province].&[GA]&[US] // represents client
Don't forget to add in some edge-case handling (e.g. IIF the client has 0 "sites" in context) and consider using the "measuregroup" parameter in the EXISTS function.
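To make the two averages concrete, here is a minimal Python sketch of the intended arithmetic. The rows are entirely hypothetical (the question gives no underlying data), and the helper name is made up; it only illustrates "sum over the sites in scope divided by the number of sites in scope" for two different scopes.

```python
from collections import defaultdict

# Hypothetical rows: (site, client, industry, product_count).
sites = [
    ("Site 1", "Client A", "Retail", 100),
    ("Site 2", "Client A", "Retail", 125),
    ("Site 3", "Client A", "Retail", 112),
    ("Site 4", "Client B", "Retail", 63),
    ("Site 5", "Client C", "Banking", 40),
]


def average_by(rows, key_index):
    """Average product count per distinct value of the chosen column."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum, site count]
    for row in rows:
        entry = totals[row[key_index]]
        entry[0] += row[3]
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


client_avg = average_by(sites, 1)    # scope: all sites of the client
industry_avg = average_by(sites, 2)  # scope: all sites in the industry

# One output row per site of the client selected in the WHERE clause.
for site, client, industry, count in sites:
    if client == "Client A":
        print(site, count, client_avg[client], industry_avg[industry])
```

The industry average deliberately includes sites belonging to other clients in the same industry, which is the part the slicer on the client hides in the original query.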

Optimal solution for interview question

Recently in a job interview, I was given the following problem.
Say I have the following table
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
b | 30.00 | 1
c | 20.00 | 1
d | 25.00 | 1
where widget_name holds the name of the widget, widget_costs is the price of a widget, and in_stock is a constant of 1.
Now for my business insurance I have a certain deductible. I am looking for a SQL statement that will tell me every widget, and its price, that exceeds the deductible. So if my deductible is $50.00 the above would just return
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
a | 15.00 | 1
d | 25.00 | 1
since widgets b and c were used to meet the deductible.
The closest I could get is the following
SELECT
*
FROM (
SELECT
widget_name,
widget_price
FROM interview.tbl_widgets
minus
SELECT widget_name,widget_price
FROM (
SELECT
widget_name,
widget_price,
50 - sum(widget_price) over (ORDER BY widget_price ROWS between unbounded preceding and current row) as running_total
FROM interview.tbl_widgets
)
where running_total >= 0
)
;
Which gives me
widget_Name | widget_Costs | In_Stock
---------------------------------------------------------
c | 20.00 | 1
d | 25.00 | 1
because it uses a and b to meet the majority of the deductible
I was hoping someone might be able to show me the correct answer
EDIT: I understood the interview question to be asking this: given a table of widgets and their prices, and given a dollar amount, subtract as many of the widgets as you can up to the dollar amount, and return the widgets (and their prices) that remain.

I'll put an answer up, just in case it's easier than it looks, but if the idea is just to return any widget that costs more than the deductible then you'd do something like this:
Select
Widget_Name, Widget_Cost, In_Stock
From
Widgets
Where
Widget_Cost > 50 -- SubSelect for variable deductibles?
For your sample data my query returns no rows.

I believe I understand your question, but I'm not 100%. Here is what I'm assuming you mean:
Your deductible is, say, $50. To meet the deductible you have to "use" two items. (Is this always two? How high can it go? Can it be just one? What if they don't total exactly $50? There is a lot of missing information.) You then want to return the widgets that aren't being used towards the deductible. I have the following.
CREATE TABLE #test
(
widget_name char(1),
widget_cost money
)
INSERT INTO #test (widget_name, widget_cost)
SELECT 'a', 15.00 UNION ALL
SELECT 'b', 30.00 UNION ALL
SELECT 'c', 20.00 UNION ALL
SELECT 'd', 25.00
SELECT * FROM #test t1
WHERE t1.widget_name NOT IN (
SELECT t1.widget_name FROM #test t1
CROSS JOIN #test t2
WHERE t1.widget_cost + t2.widget_cost = 50 AND t1.widget_name != t2.widget_name)
Which returns
widget_name widget_cost
----------- ---------------------
a 15.00
d 25.00

This looks like a bin packing problem. These are really hard to solve, especially with SQL.
If you search on SO for Bin Packing + SQL, you'll find the question how to find Sum(field) in condition ie “select * from table where sum(field) < 150”, which is basically the same problem except that you want to add a NOT IN to it.
I couldn't get the accepted answer by brianegge to work, but what he wrote about the problem in general was interesting:
"...the problem you describe of wanting the selection of users which would most closely fit into a given size, is a bin packing problem. This is an NP-Hard problem, and won't be easily solved with ANSI SQL. However, the above seems to return the right result, but in fact it simply starts with the smallest item, and continues to add items until the bin is full.
A general, more effective bin packing algorithm is to start with the largest item and continue to add smaller ones as they fit. This algorithm would select users 5 and 4."
So with this advice you could write a cursor to loop over the table to do just this (it just wouldn't be pretty).
Aaron Alton gives a nice link to a series of articles that attempt to solve the bin packing problem with SQL but basically conclude that it's probably best to use a cursor to do it.
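For a table as small as the one in the question, the selection can also be brute-forced outside SQL. The sketch below is not the cursor approach mentioned above; it assumes the goal is to find the set of widgets whose total comes closest to the deductible without exceeding it, then return the rest. The function name is made up, and the search is exponential in the number of widgets, so it only suits tiny inputs.

```python
from itertools import combinations

# Widget costs from the question.
widgets = {"a": 15.00, "b": 30.00, "c": 20.00, "d": 25.00}


def widgets_left_after_deductible(costs, deductible):
    """Try every subset, keep the one whose total is closest to the
    deductible without exceeding it, and return the remaining widgets."""
    names = sorted(costs)
    best, best_total = (), 0.0
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            total = sum(costs[name] for name in combo)
            if best_total < total <= deductible:
                best, best_total = combo, total
    remaining = {name: costs[name] for name in names if name not in best}
    return remaining, best_total


remaining, used = widgets_left_after_deductible(widgets, 50.00)
print(remaining, used)
```

With a deductible of 50 this uses b and c (30 + 20) and leaves a and d, matching the expected result in the question, whereas the smallest-first running total picks a different set.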
