How to sort this by month in MDX?

I have never written a line of MDX myself. I have been using Pentaho and the CDE Wizard to create some charts, and it generates this code:
select NON EMPTY({Descendants([NOM_MES].[All NOM_MESs] ,[NOM_MES].[NOM_MES])}) on ROWS,
NON EMPTY({[Measures].[TOTAL]}) on Columns
from [museos_md]
where (${select_museoParameter})
I want to sort that result in the correct month sequence, because I am getting the months in alphabetical order. I also have a COD_MES measure that holds the correct order of the months, that is, NOM_MES->COD_MES: January->01, February->02 (could it be useful?)

The quick solution would be to use
Order(Descendants([NOM_MES].[All NOM_MESs] ,[NOM_MES].[NOM_MES]), [Measures].[COD_MES], ASC)
in MDX.
The correct approach would be to do the sorting in the cube design. I am not sure about Pentaho, but in Analysis Services you can configure the month attribute to use, e.g., the month code as the key and the month name as the name column shown to users, and sort on the key. This also fixes the order in which the attribute's members are displayed to users, where alphabetically sorted month names are a bit confusing.
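To see why sorting on the key rather than the name fixes the order, here is a small Python sketch. The name-to-code mapping is a hypothetical stand-in for NOM_MES and COD_MES, built from Python's calendar module rather than read from the cube.

```python
import calendar

# Hypothetical stand-in for the dimension: NOM_MES (name) -> COD_MES (code).
month_code = {calendar.month_name[i]: i for i in range(1, 13)}

# Sorting by name reproduces the alphabetical order seen in the chart.
by_name = sorted(month_code)

# Sorting by the code gives the calendar order the question asks for.
by_code = sorted(month_code, key=month_code.get)

print(by_name[:3])  # ['April', 'August', 'December']
print(by_code[:3])  # ['January', 'February', 'March']
```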

How to populate all possible combination of values in columns, using Spark/normal SQL

I have a scenario, where my original dataset looks like below
Data:
Country,Commodity,Year,Type,Amount
US,Vegetable,2010,Harvested,2.44
US,Vegetable,2010,Yield,15.8
US,Vegetable,2010,Production,6.48
US,Vegetable,2011,Harvested,6
US,Vegetable,2011,Yield,18
US,Vegetable,2011,Production,3
Argentina,Vegetable,2010,Harvested,15.2
Argentina,Vegetable,2010,Yield,40.5
Argentina,Vegetable,2010,Production,2.66
Argentina,Vegetable,2011,Harvested,10
Argentina,Vegetable,2011,Yield,90
Argentina,Vegetable,2011,Production,9
Bhutan,Vegetable,2010,Harvested,7
Bhutan,Vegetable,2010,Yield,35
Bhutan,Vegetable,2010,Production,5
Bhutan,Vegetable,2011,Harvested,2
Bhutan,Vegetable,2011,Yield,6
Bhutan,Vegetable,2011,Production,3
There is also a very small country lookup table listing every country the source data can contain (US, Argentina, Bhutan, India, Nepal and Bangladesh, as in the header of the expected output below).
I want the number of columns in the output data to stay fixed, so that the reporting/visualization tool doesn't get a different number of columns with each day's ingestion, depending on how many distinct countries are present.
So I have to somehow join the source data with the country_lookup csv and populate all those columns with F as the default value. Every country column is binary, with T or F as the only possible values.
The original dataset above has to be converted into the following:
Data (for rows with Type "Derived Yield" I've left the Amount field as an unevaluated expression rather than the calculated value, so that you can match it against the formula):
Country,Commodity,Year,Type,Amount,US,Argentina,Bhutan,India,Nepal,Bangladesh
US,Vegetable,2010,Harvested,2.44,T,F,F,F,F,F
US,Vegetable,2010,Yield,15.8,T,F,F,F,F,F
US,Vegetable,2010,Production,6.48,T,F,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
US,Vegetable,2011,Harvested,6,T,F,F,F,F,F
US,Vegetable,2011,Yield,18,T,F,F,F,F,F
US,Vegetable,2011,Production,3,T,F,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
US,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Argentina,Vegetable,2010,Harvested,15.2,F,T,F,F,F,F
Argentina,Vegetable,2010,Yield,40.5,F,T,F,F,F,F
Argentina,Vegetable,2010,Production,2.66,F,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Argentina,Vegetable,2011,Harvested,10,F,T,F,F,F,F
Argentina,Vegetable,2011,Yield,90,F,T,F,F,F,F
Argentina,Vegetable,2011,Production,9,F,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Bhutan,Vegetable,2010,Harvested,7,F,F,T,F,F,F
Bhutan,Vegetable,2010,Yield,35,F,F,T,F,F,F
Bhutan,Vegetable,2010,Production,5,F,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Bhutan,Vegetable,2011,Harvested,2,F,F,T,F,F,F
Bhutan,Vegetable,2011,Yield,6,F,F,T,F,F,F
Bhutan,Vegetable,2011,Production,3,F,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Formula for populating the Amount field for the Derived type:
Derived Amount = (sum of Harvested over all countries flagged T (True), grouped by Year and Commodity) divided by (sum of Production over all countries flagged T (True), grouped by Year and Commodity).
So the target is to build every combination of the countries in the source and, for each combination, calculate the sums of the respective Harvested and Production values, which are then divided. In the actual scenario a country can have more than one commodity, but that should not matter because the amounts are summed per commodity and year.
Note: Users in the frontend can select any combination of countries. The only reason for doing this in the backend rather than dynamically in the frontend is that AWS QuickSight (our visualisation tool) can sum over the selected column filters but doesn't yet support calculations on those derived sums. Hence the calculation for every combination of countries has to be pre-populated (a very naive approach) so that it is available in the report for whatever countries the user selects.
If you have a better approach than the naive one described in the note, you are most welcome to guide me. I've also posted a question on the same problem without describing my own approach, so that experts can suggest a better way to solve this kind of problem; here is the link to that question.
Any help will be greatly appreciated.
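The naive pre-population described above can be sketched in plain Python before porting it to Spark or SQL. This is only an illustration under stated assumptions: it uses the 2010 Vegetable rows from the sample data, takes the fixed column list from the header of the expected output, and the function name derived_rows is made up for the example.

```python
from itertools import combinations

# 2010 Vegetable rows from the sample data: country -> (Harvested, Production).
rows_2010 = {
    "US": (2.44, 6.48),
    "Argentina": (15.2, 2.66),
    "Bhutan": (7.0, 5.0),
}
# Fixed column list, taken from the header of the expected output.
lookup = ["US", "Argentina", "Bhutan", "India", "Nepal", "Bangladesh"]


def derived_rows(data, all_countries):
    """Return one Derived Yield row per (member country, combination)."""
    out = []
    present = [c for c in all_countries if c in data]
    for size in range(2, len(present) + 1):
        for combo in combinations(present, size):
            harvested = sum(data[c][0] for c in combo)
            production = sum(data[c][1] for c in combo)
            # T for countries in the combination, F for every other lookup country.
            flags = ["T" if c in combo else "F" for c in all_countries]
            # Repeat the row for each member, as in the expected output.
            for country in combo:
                out.append((country, harvested / production, flags))
    return out


result = derived_rows(rows_2010, lookup)
for country, amount, flags in result:
    print(country, round(amount, 4), ",".join(flags))
```

With n countries present there are 2^n - n - 1 combinations of two or more countries, so the pre-populated table grows quickly as countries are added.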

Calculated Attribute - Min and Max Valid Date

We have some data in a dimension table with historical values.
Like this (small example):
ProductId is our primary key (and is therefore unique)
Code is our business key
Color and Type are our historical values
In Analysis Services (Tabular mode), our users want to build a report on those values.
Client usage could be:
(1) If they only want to see the Code ('CAR' in our example), the result would be:
(2) If they want to see the Code and the Color:
The same applies to all the other attributes we can have, and to all combinations of them.
Do you know how to solve this?
Can we add some logic in a calculated attribute?
Thank you,
Arnaud
In essence, you want to aggregate by date? So, for any set of attributes you put in your pivot table, you want to show the earliest ValidFrom date and the latest ValidTo date that applies?
To accomplish this in SSAS Tabular, import the table and hide the columns ValidFrom & ValidTo. (To hide a column, right click it in Visual Studio and select Hide from Client Tools.)
Then, create 2 measures. For example:
Valid From := MIN([ValidFrom])
Valid To := MAX([ValidTo])
Note the extra space in the names to distinguish them from the column names. You could also call them something completely different. (E.g. Earliest Valid From Date)
When people connect to your cube, people will use these 2 measures rather than the columns from the original table. (They won't even see the columns because you've hidden them.)
If their pivot table includes all the attributes above (Product ID, Code, Color, Type), then the table will look exactly like your original table. If they only show Code, then your table will look like your (1). If they only show Code & Color, then your table will look like (2).
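What the two measures compute for a given set of attributes can be mimicked in a few lines of Python. The rows below are hypothetical (the example table from the question was not preserved), and valid_range is a made-up helper that groups by the chosen attributes and takes the earliest ValidFrom and latest ValidTo.

```python
from datetime import date

# Hypothetical rows; the original example table was not preserved.
rows = [
    {"ProductId": 1, "Code": "CAR", "Color": "Red", "Type": "A",
     "ValidFrom": date(2015, 1, 1), "ValidTo": date(2015, 6, 30)},
    {"ProductId": 2, "Code": "CAR", "Color": "Red", "Type": "B",
     "ValidFrom": date(2015, 7, 1), "ValidTo": date(2015, 12, 31)},
    {"ProductId": 3, "Code": "CAR", "Color": "Blue", "Type": "B",
     "ValidFrom": date(2016, 1, 1), "ValidTo": date(2016, 12, 31)},
]


def valid_range(rows, attributes):
    """Group by the chosen attributes; take MIN(ValidFrom) and MAX(ValidTo)."""
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in attributes)
        start, end = groups.get(key, (row["ValidFrom"], row["ValidTo"]))
        groups[key] = (min(start, row["ValidFrom"]), max(end, row["ValidTo"]))
    return groups


print(valid_range(rows, ["Code"]))           # one row for CAR
print(valid_range(rows, ["Code", "Color"]))  # one row per Code/Color pair
```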

Qlikview: Total of calculated metric based on calculated dimension

I started working on Qlikview a week back and I am working on this dashboard.
I have a particular requirement which I am not able to achieve:
I have a calculated dimension "Categories", added in my script, which tags each name as SLEEPERS, STARS, WEAKLINKS, etc. based on certain conditions.
I have also flagged the names based on a certain condition, and that works fine.
The issue is that I want the sum of those flags at the level of the calculated dimension Categories (SLEEPERS, STARS, etc.) and my month field.
I am not able to achieve it because the flag itself is a calculated field, so summing it directly doesn't work. I tried using the Aggr function, in the form sum(Aggr(Flag,MONTH,Categories)), but it returns zero for all rows and I am not sure why.
Can someone suggest a workaround for this? I have attached a screenshot of the report for a better understanding of the requirement.

Rename Attribute value in Time Dimension in SSAS

I am working with SQL Server Analysis Services to provide ad hoc reporting in my application. I have created a time dimension to use in my cube. It has some predefined attributes, e.g. Month of Year, which has values like Month 1, Month 2, etc., whereas I want January for Month 1, February for Month 2, and so on.
Can anyone please suggest a workaround?
As I am a newbie to SSAS, sorry if I am missing something very silly.
When you work with attributes in SSAS, there are two properties that affect the members of that attribute. The first property - which is set by default when you create the attribute - is KeyColumn. The column that you use here determines how many members are in the attribute because processing generates a SELECT DISTINCT statement based on this column. It's a good idea if you use an integer value here for better performance.
It sounds like perhaps you have a month number for your attribute here, which is good. Except that you want to display a month name. In that case, you set the NameColumn property with the column in your data source view that contains the month name. This produces the label that you see when you browse the dimension.
That said, it's usually not a good idea to have just a month number or month name because you probably want to create a hierarchy to roll up months by year and you won't be able to do that with just a month name. I wrote a blog post describing how to set up a date dimension that might help you: http://blog.datainspirations.com/2011/05/11/sqlu-ssas-week-dimension-design-101-2/
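The split between the two properties can be pictured with a short Python sketch. It is only an analogy, not how SSAS is implemented: the source rows are hypothetical, the dictionary keys play the role of KeyColumn (one member per distinct value), and the dictionary values play the role of NameColumn (the label shown when browsing).

```python
# Hypothetical source rows: (month_number, month_name), one per date row.
source = [(2, "February"), (1, "January"), (1, "January"),
          (3, "March"), (2, "February")]

# KeyColumn role: one member per distinct key value.
members = {}
for key, name in source:
    members[key] = name  # NameColumn role: label shown for that key

# Members listed in key order; users only see the names.
labels = [members[k] for k in sorted(members)]
print(labels)  # ['January', 'February', 'March']
```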

Why is data repeating in an SSAS dimension?

I have an SSAS cube with time as one of the dimensions. It contains a hierarchy like Year-Quarter-Month, etc. When I drag and drop it onto the SQL Server Management Studio browse window, the data looks repeated. For example, the years appear as 2002, 2002, 2003, 2003, etc. If I expand the first 2002 I see the 1st quarter under it; if I expand the second 2002 I see the 2nd quarter, and so on. Can anyone tell me the reason, and how I can get each value to appear only once?
First of all, are you using your own date dimension table? If so, make sure you use the correct key for Date, Month, Quarter and Year. For example, a date dimension normally has a YearMonth column used as the key for the Month attribute (e.g. 2012-04). If you don't have such a column, you will need to pick a composite key for Month (Year and Month). Also, a good way to check is to go to the Browser tab of the dimension designer in BIDS and make sure the hierarchies show up correctly.
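One way such duplicates could arise is sketched below in Python. This is an assumption about the cause, not a diagnosis of the asker's cube: the rows and the member_labels helper are hypothetical, and the sketch only shows that the number of members follows the key while the label follows the name.

```python
# Hypothetical date-dimension rows: (year, quarter, month).
rows = [(2002, 1, 1), (2002, 1, 2), (2002, 2, 4), (2003, 1, 1), (2003, 2, 5)]


def member_labels(rows, key, name):
    """One member per distinct key; each member is shown with its name."""
    members = {key(r): name(r) for r in rows}
    return [members[k] for k in sorted(members)]


# Year keyed on (year, quarter): each year shows up once per quarter.
wrong_year = member_labels(rows, key=lambda r: (r[0], r[1]), name=lambda r: r[0])
# Year keyed on the year alone: one member per year.
right_year = member_labels(rows, key=lambda r: r[0], name=lambda r: r[0])
# Month keyed on (year, month): January 2002 and January 2003 stay distinct.
months = member_labels(rows, key=lambda r: (r[0], r[2]),
                       name=lambda r: f"{r[0]}-{r[2]:02d}")

print(wrong_year)  # [2002, 2002, 2003, 2003]
print(right_year)  # [2002, 2003]
print(months)
```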