icCube: multiple dimensions in MDX output

The documentation of icCube states:
However, a SELECT is not limited to two axes. We could have columns,
rows, pages, chapters, and sections. And you could still continue
beyond these by specifying a number for the axis.
Indeed, when I try using three dimensions on the demo Sales cube, it works:
select
{[paris], [london]} on 0,
{[2005], [2006]} on 1,
product.members on 2
from sales
However, when I try four dimensions:
select
{[paris], [london]} on 0,
{[2005], [2006]} on 1,
product.members on 2,
measures.members on 3
from sales
I get an error message: Unexpected number of axes (4) for the pivot table (expected:0..3)
What am I missing?

There is nothing wrong with using a query with 4 axes. However, it is left up to the client you are using to be able to display it.
For example, Excel accepts 2D results, while the icCube pivot table can display results with up to (and including) 3 axes.
Hope that helps.

Related

Proportions for multiple subcategories

I am trying to calculate proportions with multiple subcategories. As seen in the screenshot below, the series is grouped by ['budget_levels', 'revenue_levels'].
I would like to calculate the proportion for each.
For example,
budget_levels=='low' & revenue_levels=='low' / budget_levels=='low'
budget_levels=='low' & revenue_levels=='medium' / budget_levels=='low'
However, I am not getting the desired output.
Is there any way I could do this calculation for each with a simple one-line code such as .apply(lambda) function?
Use value_counts to get the number of occurrences of each combination. Then group by the column budget_levels and divide the observations in each group by their sum. sort_index makes it easier to compare the groups.
df.value_counts().groupby(level=0).transform(lambda x: x / x.sum()).sort_index()
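For context, here is a minimal runnable sketch with made-up data (the column names budget_levels and revenue_levels come from the question; the values are invented) showing what the chain produces:
import pandas as pd

# Made-up data with the two grouping columns from the question.
df = pd.DataFrame({
    "budget_levels":  ["low", "low", "low", "high", "high"],
    "revenue_levels": ["low", "medium", "medium", "low", "low"],
})

# Count each (budget, revenue) combination, then normalise every count
# by the total of its budget_levels group (level=0 of the counts' index).
proportions = (
    df.value_counts()
      .groupby(level=0)
      .transform(lambda x: x / x.sum())
      .sort_index()
)
print(proportions)
# budget_levels  revenue_levels
# high           low               1.000000
# low            low               0.333333
#                medium            0.666667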

Need explanation on how pandas.drop is working here

I have a data frame, let's say xyz. I have written code to find out the percentage of null values each column possesses in the dataframe. My code is below:
round(100*(xyz.isnull().sum()/len(xyz.index)), 2)
Let's say I got the following results:
abc 26.63
def 36.58
ghi 78.46
I want to drop column ghi because it has more than 70% of null values.
I achieved it using the following code:
xyz = xyz.drop(xyz.loc[:,round(100*(xyz.isnull().sum()/len(xyz.index)), 2)>70].columns, 1)
But I did not understand how this code works. Can anyone please explain it?
The code is doing the following:
xyz.drop( [...], 1)
removes the specified elements along a given axis, either by row or by column. In this particular case, xyz.drop(..., 1) means you're dropping along axis 1, i.e., by column.
xyz.loc[:, ... ].columns
returns the column names (an Index) resulting from your slicing condition.
round(100*(xyz.isnull().sum()/len(xyz.index)), 2)>70
this instruction counts the nulls in each column, adds them up and normalizes by the number of rows, effectively computing the percentage of NaN in each column. The amount is then rounded to 2 decimal places, and finally it returns True if the percentage of NaN is more than 70%. Hence, you get a mapping between columns and a True/False array.
Putting everything together: you first produce a Boolean array that marks which columns have more than 70% NaN; then, using .loc, you apply Boolean indexing to select only the columns you want to drop (NaN % > 70%); then, using .columns, you recover the names of those columns, which are finally passed to the .drop instruction.
Hopefully this clears things up!
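To see how those pieces fit together, here is a minimal sketch on a made-up frame (the column names abc and ghi are just placeholders) that prints each intermediate result:
import pandas as pd

# Made-up frame: "ghi" is 80% null, "abc" only 20% null.
xyz = pd.DataFrame({
    "abc": [1] * 8 + [None] * 2,
    "ghi": [1] * 2 + [None] * 8,
})

null_pct = round(100 * (xyz.isnull().sum() / len(xyz.index)), 2)
print(null_pct)                    # abc 20.0, ghi 80.0
mask = null_pct > 70               # abc False, ghi True
cols = xyz.loc[:, mask].columns    # Index(['ghi'])
cleaned = xyz.drop(columns=cols)   # equivalent to xyz.drop(cols, 1) in older pandas
print(cleaned.columns.tolist())    # ['abc']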
If the code is hard to understand, you can just use dropna with thresh, since pandas already covers this case. thresh is the minimum number of non-null values a column must have to be kept, so requiring 30% non-null values drops any column that is more than 70% null.
df=df.dropna(axis=1,thresh=round(len(df)*0.3))
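As a rough check of the thresh behaviour, here is a self-contained sketch with made-up data (column names are placeholders):
import pandas as pd

# Made-up frame: "ghi" is 80% null, "abc" only 20% null.
df = pd.DataFrame({"abc": [1] * 8 + [None] * 2, "ghi": [1] * 2 + [None] * 8})

# Keep only columns with at least round(10 * 0.3) = 3 non-null values;
# "ghi" has just 2 non-null values, so it is dropped.
print(df.dropna(axis=1, thresh=round(len(df) * 0.3)).columns.tolist())   # ['abc']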

Defining an RDLC chart axis with an aggregate function

The autoaxis for one of my embedded charts isn't behaving well, sometimes only showing one other major value besides top and bottom. So I thought I'd set my own boundaries, which seemed pretty easy given that one of the columns on the chart is always going to be larger than any of the others.
<Maximum>=(((Max(Fields!Entered.Value, "Chart1") + 10) \ 50) + 1) * 50</Maximum>
(the other columns detail what happened to the things that entered this process)
Round up to the nearest 50 with a little overage to put the label on top. Then I can put the intervals at this divided by 5 and I'm gold.
Except I'm not gold. The chart groups records by date and the individual bars are Sum(Fields!Entered.Value) et cetera, so it's drastically underscaling when multiple batches get processed on a single date. But hey, it groups records by date, I can use that:
<ChartCategoryHierarchy>
<ChartMembers>
<ChartMember>
<Group Name="Chart1_CategoryGroup">
<GroupExpressions>
<GroupExpression>=Fields!Date.Value</GroupExpression>
</GroupExpressions>
</Group>
</ChartMember>
</ChartMembers>
</ChartCategoryHierarchy>
as:
<Maximum>=(((Max(Fields!Entered.Value, "Chart1_CategoryGroup") + 10) \ 50) + 1) * 50</Maximum>
and it'll aggregate over the group just fine. Right?
The ValueAxis_Primary.Maximum expression for the chart 'Chart1' has a scope parameter that is not valid for an aggregate function. The scope parameter must be set to a string constant that is equal to either the name of a containing group, the name of a containing data region, or the name of a dataset.
Nope! It works just fine for "Chart1" but not for "Chart1_CategoryGroup"!
So, uh:
what scope are the axis calculations operating in, 'cause it ain't the category scope?
is there some way to provide them an aggregate scope that groups the data by date so they can do their calculations proper?
You Have To Nest The Scope
A little extra work gave me this insight:
Max(Fields!Entered.Value, "Chart1_CategoryGroup") returns the maximum of the entered fields within one single category group, which is not the level the Y axis is concerned with. What you're interested in is the maximum value of the summed calculation (within a group) for the whole chart, so specify the scopes to do that:
<Maximum>
=(((Max(
Sum(Fields!Entered.Value, "Chart1_CategoryGroup")
, "Chart1") + 10) \ 50) + 1) * 50
</Maximum>

Access graph get series on 2 database fields

I have a technical issue with Access graphs: I have a table in an Access database with 4 fields: xValue, yValue, round, partOfRound.
What I want: there are always 2 rounds, each round has 2 parts. I need to get a series per round per part (so from round 1 part 1, round 1 part 2, round 2 part 1, round 2 part 2) with all xValues and yValues in a chart.
But then I have another problem: the xValue isn't a good number to show; it needs to be divided by a number from another table (see this as a number in table3), where the row of table3 matches the identifier I use for my chart (IDtable2 = IDtable3).
The final result will be 4 lines with the data in my graph, so 4 series.
But when I use the wizard for making graphs, I can only set 1 field to the series value, so it will see a round as just 1 series instead of 2.
How do I solve this problem?
Kind regards
Kristof
What type of graph - just a column?
Concatenate the round and partOfRound fields.
Try changing the graph RowSource to:
TRANSFORM Sum(Table2.yValue) AS SumOfyValue
SELECT Table2.xValue
FROM Table2
GROUP BY Table2.xValue
PIVOT [round] & "_" & [partOfRound];
Possible SQL to include table join to calculate the division:
TRANSFORM Sum(Table2.yValue) AS SumOfyValue
SELECT Round([xValue]/[Factor],0) AS x
FROM Table3 INNER JOIN Table2 ON Table3.PK_Table3 = Table2.FK_Table3
GROUP BY Round([xValue]/[Factor],0)
PIVOT [round] & "_" & [partOfRound];
For both queries, I had to open the graph editor (double click the graph) and from the menu click on "By Column" button to get the x values on the x axis.
I do hope round is not an actual field name, as it is a reserved word, and you should not use reserved words as names for anything.

Dividing values from 2 different datasets

I am trying to divide 2 different fields from 2 different datasets. I am also using a lookup in the statement, but for some reason it does the lookup part of the expression and not the division part. Any ideas?
=IIF(Fields!PACKSHORT_DESC.Value = "EA",
    (LOOKUP(TRIM(Fields!PRODUCT_CODE.Value), TRIM(Fields!item.Value), Fields!tcost.Value, "Cost") / Fields!NO_OF_EACHES.Value),
    (LOOKUP(TRIM(Fields!PRODUCT_CODE.Value), TRIM(Fields!item.Value), Fields!tcost.Value, "Cost")))
Get it to output the two numbers you are trying to divide first, to see if they are pulling through correctly; assign them names and then divide them instead.