I have an ordinary OLAP cube (MS AS2000) with three dimensions: time, product, and geography. Each dimension has a simple hierarchy, e.g. time - [all][year][quarter][month], product - [all][market][brand][product]. There are two measures: value and units.
Assume that for business reasons I don't want to distribute the cube with data for all product brands. Someone may order/buy sales data for his own brand and a selected competitor. At the market level, however, the cube should contain fully aggregated market data. In other words: there are four brands, B1, B2, B3, B4. A client orders data only for B1 and B2, so his cube should have data for B1 and B2, but the market level should still show the aggregated sum of all four brands.
Is it possible to build such an OLAP cube, where the aggregated data of the lower-level cells doesn't sum up to the parent cell's value?
If yes, how can I find the cells whose values do not equal the aggregate of their lower levels?
I'd probably be looking to do that in the data warehouse rather than the cube. So for your example, where they've bought B1 and B2, I'd create a new product in the product dimension called "Rest of Market" and then replace the B3 and B4 IDs in the fact table(s) with the ID for "Rest of Market".
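A minimal SQL sketch of that replacement, assuming a DimProduct dimension table and a FactSales fact table keyed on ProductKey (all of these names, and the surrogate key 9999, are hypothetical):

-- Create the synthetic "Rest of Market" member in the product dimension
INSERT INTO DimProduct (ProductKey, ProductName, BrandName)
VALUES (9999, 'Rest of Market', 'Rest of Market');

-- Re-point the undisclosed brands' fact rows at the new member
UPDATE FactSales
SET ProductKey = 9999
WHERE ProductKey IN (SELECT ProductKey
                     FROM DimProduct
                     WHERE BrandName IN ('B3', 'B4'));

The market-level aggregate then still includes all four brands, while only B1, B2, and "Rest of Market" are visible below it.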
You may be able to use parent/child hierarchies, because with that option you can have "data members". These are non-leaf members that can contain data. Have a look at this link for more info: Parent Child Dimensions
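For the second part of the question (finding cells whose value differs from the sum of their children), a hedged MDX sketch along these lines may work; the cube name [Sales] and the level names are assumptions:

WITH MEMBER [Measures].[Children Sum] AS
  'Sum([Product].CurrentMember.Children, [Measures].[Value])'
MEMBER [Measures].[Delta] AS
  '[Measures].[Value] - [Measures].[Children Sum]'
SELECT {[Measures].[Value], [Measures].[Children Sum], [Measures].[Delta]} ON COLUMNS,
       Filter([Product].[Market].Members, [Measures].[Delta] <> 0) ON ROWS
FROM [Sales]

Any row returned is a market member whose stored value does not match the sum of its visible brand children.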
I have edited my question to describe exactly what I want.
I have two columns, Actual Units and Future Units, from Fact A and Fact B respectively, but at the same granular level. I also have Demand Units from Fact B.
My requirements are:
1. Projected Units = Coalesce(Actual Units, Future Units)
2. Stock Units = IF(Projected Units > Demand Units, Demand Units, Projected Units)
3. Stock Rate = Stock Units / Demand Units
I cannot join the two facts at the data source view level and do the calculation there, because they are very large tables and I think the performance would be very slow. If doing the calculations at the data source view level is the only option we have, please let me know.
When calculating the grand total, MDX sums up A, sums up B, and then compares the two totals.
If you want the calculation to occur at the row level (checking whether B > A), then edit the Data Source View and add a new calculated column to the table your measure group is based upon. The calculated column should be:
CASE WHEN B>A THEN A ELSE B END
Then create a Sum measure based upon that new column.
This approach will perform much better than any purely MDX approach to calculating this at a very detailed grain. If your fact tables had 500,000 rows or fewer and you had a degenerate dimension at the same grain as the one you need to calculate at, we could possibly do it in MDX. But since you are concerned about SQL query performance, I am assuming the tables are big. Just remember that the SQL runs once, at processing time, while MDX is evaluated in every query, at query time. So do expensive things in SQL when you can.
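Applied to the requirements above, the DSV calculated columns might look like this (a sketch only; it assumes the two facts have already been combined into the measure-group table, and the column names ActualUnits, FutureUnits, and DemandUnits are hypothetical):

-- Projected Units
COALESCE(ActualUnits, FutureUnits)

-- Stock Units
CASE WHEN COALESCE(ActualUnits, FutureUnits) > DemandUnits
     THEN DemandUnits
     ELSE COALESCE(ActualUnits, FutureUnits)
END

Stock Rate would then be a cube-level calculated member, Sum(Stock Units) / Sum(Demand Units), defined in MDX rather than the DSV so that the division happens after aggregation.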
Let's say I have the following hierarchy that I use as a dimension:
Root
  A1
    B11
    B12
    ...
    B1N
    B1Special
  A2
    B21
    B22
    ...
    B2N
    B2Special
  ...
  AM
    BM1
    BM2
    ...
    BMN
    BMSpecial
Under each of the "B" nodes there are several more nodes at different levels. Each leaf of the hierarchy has an associated measure (the SUM of some fact F).
Is it possible with MDX to get the SUM of all, and only, the items that are children of the "Special" nodes?
I have to assume you want to see the sum of all 'Special' nodes only once, at the root level. In other words, you want to see just one number in your result set.
Assuming the hierarchy detailed in your original question was called 'Bob', and you had another dimension called 'Kate', you might try this...
WITH MEMBER [Bob].[Only the special levels] AS
  'Aggregate(
     Filter(
       {[Bob].[Name of level which holds B members].Members},
       InStr(1, [Bob].CurrentMember.Name, "Special") > 0
     )
   )'
SELECT {[Kate].DefaultMember} ON ROWS,
       {[Measures].[Whatever you want to see aggregated]} ON COLUMNS
FROM [Cube name]
WHERE ([Bob].[Only the special levels])
This creates a new, temporary member in the Bob dimension, which is an aggregation of several other members of that dimension. We start with all the members that sit in one particular level, and the Filter keeps only those members whose name contains the word "Special".
Note that InStr is a VBA function supported by Microsoft SSAS. It returns zero if the chosen string is not found. Alternative string-searching functions may be available in other flavours of MDX.
You then use this new member in your WHERE clause, and slap your other dimensions/measures wherever you want.
I have a query to pull clickthrough for a funnel, where if a user hit a page it is recorded as "1", else NULL --
SELECT datestamp
      ,COUNT(visits) AS Visits
      ,COUNT([QE001]) AS firstcount
      ,COUNT([QE002]) AS secondcount
      ,COUNT([QE004]) AS thirdcount
      ,COUNT([QE006]) AS finalcount
      ,user_type
      ,user_loc
FROM dbname.dbo.loggingtable
GROUP BY datestamp, user_type, user_loc
I want to have a column for each ratio, e.g. firstcount/Visits, secondcount/firstcount, etc. as well as a total (finalcount/Visits).
I know this can be done
in an Excel PivotTable by adding a "calculated field"
in SQL by grouping
in PowerPivot by adding a CalculatedColumn, e.g.
=IFERROR(QueryName[finalcount]/QueryName[Visits],0)
BUT I need to give the report consumer the option of slicing by just user_type or just user_loc, etc., and Excel will tend to ADD the proportions, which won't work because
SUM(A/B) != SUM(A)/SUM(B)
Is there a way in DAX/MDX/PowerPivot to add a calculated column/measure, so that it will be calculated as SUM(finalcount)/SUM(Visits), for any user-defined subset of the data (daterange, user type, location, etc.)?
Yes, via calculated measures. Calculated columns are for creating values that you want to see on rows/columns/the report header; calculated measures are for creating values that you want to see in the values section of a pivot table and slice/dice by the columns in the model.
The easiest way would be to create three calculated measures in the calculation area of the PowerPivot sheet.
TotalVisits:=SUM(QueryName[visits])
TotalFinalCount:=SUM(QueryName[finalcount])
TotalFinalCount2VisitsRatio:=[TotalFinalCount]/[TotalVisits]
You can then slice the calculated measure [TotalFinalCount2VisitsRatio] by user_type or just user_loc (or whatever) and the value will be calculated correctly. The difference here is that you are explicitly telling the xVelocity engine to SUM-then-DIVIDE. If you create the calculated column, then the engine thinks you want to DIVIDE-then-SUM.
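If a slice could ever have zero visits, you may also want to guard the division. One option, assuming your PowerPivot version supports the DAX DIVIDE function (which returns an alternate result on division by zero), is:

TotalFinalCount2VisitsRatio:=DIVIDE([TotalFinalCount],[TotalVisits],0)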
Also, you don't have to break the measure down into three separate measures; it's just good practice. If you're interested in learning more, I'd recommend this book...the author is the PowerPivot/DAX guru and the book is very straightforward.
I'm creating Analysis Services cubes in Visual Studio BIDS, and have a question about summing in calculated members.
The data has to do with commercial real estate transactions. I want to sum square feet of building space involved in sales transactions for each region. I'm going to use that result in a weighted average calculation. However, I only want to sum the square feet of transactions which have non-null values for the corresponding building capitalization rate (cap rate) member.
Here is a drill-down to Athens in the cube browser:
Note that Athens has 15 values for square feet, but only 5 values for cap rate, reflecting my relational data source as shown here:
So, I only want to sum the five square-feet values that have associated cap rate values. Doing the math with the relational query result above, you can see that this should give a sum of just over 900K, not the 2-million-plus sum shown in the BIDS screenshot.
My attempt at this calculation:
sum(
  descendants(
    [Property].[Property by Region].CurrentMember,
    [Property].[Property by Region].[Metro Area]
  ),
  iif([Measures].[Cap Rate] is null or [Measures].[Sq Ft] is null,
      0,
      [Measures].[Sq Ft])
)
ends up including the square feet values that have no corresponding cap rates, so I still end up with a value in the 2 millions.
Why is my iif() clause not working as one would expect?
I was finally able to create the weighted average calculation using a combination of Named Calculations in the Data Source View (DSV) and a calculated member (in the cube script). First, I went to the DSV and added a named calculation called xWeightedCapRt with a formula as follows:
CASE WHEN CapRate IS Null THEN Null Else CapRate * SqFt END
In the cube, I then added xWeightedCapRt as a New Measure. I set its aggregation function to Sum and left its Visible property set to True temporarily.
I created an additional Named Calculation called "xSqFt", defined as:
CASE WHEN CapRate IS Null THEN Null Else SqFt END
and again created a corresponding measure.
On the Calculation tab (of the cube designer) I created a new calculated member, [WAvg Cap Rate by Sq Ft], with the following formula:
[Measures].[x Weighted Cap Rt] / [Measures].[x Sq Ft]
After deploying and processing the cube, I was able to verify that the weighted average calculation matched my spreadsheet numbers. At that point, I set the Visible property of the two intermediate measures to False and redeployed.
What I've learned is that calculations at the "row-level" are best performed through the DSV. You can then use those to build up more complex calculations within the cube.
(NOTE: One thing that needs to be added to the steps above is logic to handle division by zero.)
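One way to add that guard (a sketch, reusing the intermediate measure names from the steps above) is to wrap the calculated member in an IIF:

IIF([Measures].[x Sq Ft] = 0,
    NULL,
    [Measures].[x Weighted Cap Rt] / [Measures].[x Sq Ft])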
Couldn't you have done a NonEmpty around the descendants on the cap rate measure?
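Something like this, perhaps (an untested sketch; NonEmpty keeps only the descendants that actually have a Cap Rate value):

sum(
  nonempty(
    descendants(
      [Property].[Property by Region].CurrentMember,
      [Property].[Property by Region].[Metro Area]
    ),
    [Measures].[Cap Rate]
  ),
  [Measures].[Sq Ft]
)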
I am trying to create a pivot table with data from a SQL database. The pivot table is basically a process capability report that requires certain information. I have created an Excel file that sort of does what we are trying to accomplish, but I don't think my calculations are quite accurate. After doing some research I know that there are pivot tables within SQL, but I don't know how to get them to work.
The table that my data is stored in has thousands of records. Each record has the following information: DATE, TIME, PRODUCT_NO, SEQ, DATECODE, DATECODE_IDH, PRODUCT, LINE, SHIFT, SIZE, OPERATOR, SAMPLE_SIZE, WEIGHT (1-12), WEIGHTXR, LSL_WT, TAR_WT, USL_WT, LABEL, MAV, LINE_SIZE
For the report, I need to group the data by product and by line. Since the product isn't consistent, each product can be described by TAR_WT, so the grouping will be a combination of TAR_WT and LINE_SIZE. I need to count how many instances of that product were measured, which will be the number of measurements (each record holds 12 individual weights). I also need to find the minimum, maximum, and average of all of the weights per product (again, 12 weights for every record). After those values are obtained, I have to calculate the Standard Deviation, Cp, and Tz of the values (statistical calculations) and report all the information. A sample record:
Date: 8/3/11
Time: 0:37:54
Product No: 1234567
Seq: 23
DateCode: DateCode
Internal DateCode&ProductNo: Internal DateCode&ProductNo
Product Description: Product Description
Line: L-1A
Size: 50
Weight1: 1575
Weight2: 1566
Weight3: 1569.5
Weight4: 1575.5
WeightXR: 1573.4
LSL_WT: 1550.809
TAR_WT: 1574.623
USL_WT: 1598.437
LABEL: 1564.623
MAV: 1525.507
LINE_SIZE: L-1A_50
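A hedged T-SQL sketch of the grouping, assuming the table is named WeightLog and the weight columns are WEIGHT1 through WEIGHT12 (only four are listed in the UNPIVOT for brevity; extend the list to all twelve):

SELECT TAR_WT,
       LINE_SIZE,
       COUNT(WeightValue) AS Measurements,
       MIN(WeightValue)   AS MinWeight,
       MAX(WeightValue)   AS MaxWeight,
       AVG(WeightValue)   AS AvgWeight,
       STDEV(WeightValue) AS StdDevWeight
FROM WeightLog
UNPIVOT (WeightValue FOR WeightCol IN
         (WEIGHT1, WEIGHT2, WEIGHT3, WEIGHT4 /* ... WEIGHT12 */)) AS u
GROUP BY TAR_WT, LINE_SIZE

If your process definitions match the usual ones, Cp could then be derived from these results as (USL_WT - LSL_WT) / (6 * StdDevWeight).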