ParallelPeriod not returning what I expect - ssas

I have yearly sales goals in a measure called "Target" and a date dimension called DimCalendar.
Looking at the underlying data, I think I should get a different value than what the query below returns. It's my understanding that this query gets the value of the "Target" measure where the associated year is 2016 (one year ahead of 2015, because the offset is -1) and for a specific account.
SELECT
{[Measures].[Target]} on columns,
{ParallelPeriod(
[DimCalendar].[Year].[Year]
,-1
,[DimCalendar].[Year].&[2015])} on rows
FROM [MySalesCube]
WHERE { [Account].[Account].&[2025] }
This query returns
1944768
However, the underlying data seems to add up to only 162064.
Nope, looks like there is an issue with the data after all; I found it using the cube browser. Got to go revisit my ETL process.

This is what you have specified:
ParallelPeriod(
[DimCalendar].[Year].[Year]
,-1
,[DimCalendar].[Year].&[2015])
It means the following:
Take the year 2015.
Then jump the specified number of periods (in your case -1) at the level specified (in your case [Year]).
The following should (I think) be a simplified but equivalent version - if the first argument is omitted, it just uses the level of the third argument:
ParallelPeriod(
-1
,[DimCalendar].[Year].&[2015])
Although I think you can just use LAG to make everything more readable:
[DimCalendar].[Year].&[2015].LAG(-1)
Then again, there is no point using 1 or -1 inside LAG: since we have the functions NEXTMEMBER and PREVMEMBER, this can be simplified to the following:
[DimCalendar].[Year].&[2015].NEXTMEMBER
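If you want to sanity-check these equivalences against the cube, a quick comparison query could look like this (a sketch reusing the cube and dimension names from the question; untested):
WITH
MEMBER [Measures].[Target PP] AS
([Measures].[Target], ParallelPeriod([DimCalendar].[Year].[Year], -1, [DimCalendar].[Year].&[2015]))
MEMBER [Measures].[Target Lag] AS
([Measures].[Target], [DimCalendar].[Year].&[2015].LAG(-1))
MEMBER [Measures].[Target Next] AS
([Measures].[Target], [DimCalendar].[Year].&[2015].NEXTMEMBER)
SELECT
{[Measures].[Target PP], [Measures].[Target Lag], [Measures].[Target Next]} ON COLUMNS
FROM [MySalesCube]
WHERE ([Account].[Account].&[2025])
// All three columns should return the same number: the Target value for 2016.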

Unexpected result with ORDER BY

I have the following query:
SELECT
D.[Year] AS [Year]
, D.[Month] AS [Month]
, CASE
WHEN f.Dept IN ('XSD') THEN 'Marketing'
ELSE f.Dept
END AS DeptS
, COUNT(DISTINCT f.OrderNo) AS CountOrders
FROM Sales.LocalOrders AS l
INNER JOIN Sales.FiscalOrders AS f
ON l.ORDER_NUMBER = f.OrderNo
INNER JOIN Dimensions.Date_Dim AS D
ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE)
WHERE YEAR(f.OrderDate) = 2019
AND f.Dept IN ('XSD', 'PPM', 'XPP')
GROUP BY
D.[Year]
, D.[Month]
, f.Dept
ORDER BY
D.[Year] ASC
, D.[Month] ASC
I get the following result; the ORDER BY isn't giving the right result, as we can see the Month column is not in numeric order:
Year Month Depts CountOrders
2019 1 XSD 200
2019 10 PPM 290
2019 10 XPP 150
2019 2 XSD 200
2019 3 XPP 300
The expected output:
Year Month Depts CountOrders
2019 1 XSD 200
2019 2 XSD 200
2019 3 XPP 300
2019 10 PPM 290
2019 10 XPP 150
Your query
It is ordered by month, but your D.[Month] is treated as a text string in the ORDER BY clause, so '10' sorts before '2'.
You could do one of two things to fix this:
Use a two-digit month number (e.g. 01... 12)
Use a data type for the ORDER BY clause that will be recognized as representing a month
A quick fix
You can correct this in your code by quickly changing the ORDER BY clause to analyze those columns as though they are numbers, which is done by converting ("casting") them to an integer data type like this:
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
This will correct your unexpected query results, but does not address the root cause, which is your underlying data (more on that below).
Your underlying data
The root cause of your issue is how your underlying data is stored and/or surfaced.
Your Month seems to be appearing as a default data type (VarChar), rather than something more specifically suited to a month or date.
If you administer or have access to or control over the database, it is a good idea to consider correcting this.
In considering this, be mindful of potential context and change management issues, including:
Is this underlying data, or just a representation of upstream data that is elsewhere? (e.g. something that is refreshed periodically using a process that you do not control, or a view that is redefined periodically)
What other queries or processes rely on how this data is currently stored or surfaced (including data types), that may break if you mess with it?
Might there be validation issues if correcting it? (such as from the way zero, null, non-numeric or non-date data is stored, even if invalid)
What change management practices should be followed in your environment?
Is the data source under high transactional load?
Is it a production dataset?
Are other reporting processes dependent on it?
None of these issues are a good excuse to leave something set up incorrectly forever, which will likely compound the issue and introduce others. However, that is only part of the story.
The appropriate approach (correct it, or leave it) will depend on your situation. In a perfect textbook world, you'd correct it. In your world, you will have to decide.
A better way?
The above solution is a bit of a quick and nasty way to force your query to work.
The fact that the solution CASTs late in the query syntax, after the results have been selected and filtered, hints that it is not the most elegant way to achieve this.
Ideally you can convert data types as early as possible in the process:
If done in the underlying data, not the query, this is the ultimate fix, but it may not suit the situation (see below)
If done in the query, try to do it earlier.
In your case, your GROUP BY and ORDER BY both use columns that duplicate information already in the query results; that is, you are getting a DATE as well as a MONTH and a YEAR. Ideally you would just get a DATE and then derive the MONTH or YEAR from that date. Your issue is that your dates are not actually dates (see "underlying data" above), which:
In the case of DATE, is converted in your INNER JOIN line ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE) (likely to minimise issues with the join)
In the case of D.[year] and D.[month], are not converted (which is why we still need to convert them further down, in ORDER BY)
You could consider ignoring D.[month] and using the month computed from DATE via MONTH or DATEPART, which would avoid the need to use CAST in the ORDER BY clause, as sketched below.
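A trimmed sketch of that variant (the department logic is omitted for brevity; table names and aliases are the ones from the query above):
SELECT
CAST(D.[Year] AS INT) AS [Year]
, MONTH(CAST(D.[Date] AS DATE)) AS [Month] -- month derived from the real date
, COUNT(DISTINCT f.OrderNo) AS CountOrders
FROM Sales.FiscalOrders AS f
INNER JOIN Dimensions.Date_Dim AS D
ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE)
WHERE YEAR(f.OrderDate) = 2019
GROUP BY
CAST(D.[Year] AS INT)
, MONTH(CAST(D.[Date] AS DATE))
ORDER BY
CAST(D.[Year] AS INT) ASC
, MONTH(CAST(D.[Date] AS DATE)) ASC -- both sort numerically; no CAST of the varchar month needed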
In your instance, this approach is a middle ground. The quick fix is included at the top of this answer, and the best fix is to correct the underlying data. This last section considers optimizing the quick fix, but does not correct the underlying issue. It is only mentioned for awareness and to avoid promoting the use of CAST in an ORDER BY clause as the most legitimate way of addressing your issue with good clean query syntax.
There are also potential performance tradeoffs between how many columns you select that you don't need (e.g. all of the ones in D?), whether to compute the month from the date or from a separate month column, whether to cast to date before filtering, etc. These are beyond the scope of this solution.
So:
The immediate solution: use the quick fix
The optimal solution: after it's working, consider the underlying data (in your situation)
The real problem is your object Dimensions.Date_Dim here. As you are simply ordering on the values of D.[Year] and D.[Month] without manipulating them at all, this means the object is severely flawed: you are storing numerical data as a varchar. varchar and numerical data types sort completely differently. For example, 2 is less than 10, but '2' is greater than '10': because '2' is greater than '1', it must also be greater than '10' when compared character by character.
The real solution, therefore, is fixing your object. Assuming that both Month and Year are incorrectly stored as a varchar, don't contain any non-integer values (another, different flaw if so), and are not computed columns, then you could just do:
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Year] int NOT NULL;
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Month] int NOT NULL;
You could, however, also make the columns a PERSISTED computed column, which might well be easier, in my opinion, as DATEPART already returns a strongly typed int value.
ALTER TABLE Dimensions.Date_Dim DROP COLUMN [Month];
ALTER TABLE Dimensions.Date_Dim ADD [Month] AS DATEPART(MONTH,[Date]) PERSISTED;
Of course, for both solutions, you'll need to first DROP and afterwards re-CREATE any indexes and constraints on the columns.
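For example, the sequence for one indexed column might look like this (a sketch; the index name and definition are hypothetical, substitute your own):
DROP INDEX IX_Date_Dim_Month ON Dimensions.Date_Dim;
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Month] int NOT NULL;
CREATE INDEX IX_Date_Dim_Month ON Dimensions.Date_Dim ([Month]);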
As long as your "Month" is always 1-12, you can use
SELECT ..., TRY_CAST(D.[Month] AS INT) AS [Month],...
ORDER BY TRY_CAST(D.[Month] AS INT)
The simplest solution is:
ORDER BY MIN(D.[Date])
or:
ORDER BY MIN(f.OrderDate)
Fiddling with the year and month columns is totally unnecessary when you have a date column that is available.
A very common issue when you store numerical data as a varchar/nvarchar.
Try to cast Year and Month to INT.
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
If you try using the <, > and BETWEEN operators, you will get some really "weird" results.
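You can see the effect directly with a self-contained illustration:
SELECT CASE WHEN '2' > '10' THEN 'yes' ELSE 'no' END AS VarcharCompare; -- 'yes': strings compare character by character
SELECT CASE WHEN  2  >  10  THEN 'yes' ELSE 'no' END AS IntCompare;     -- 'no': numbers compare numerically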

MDX: Make Measure Value 0 based on flag

Would much appreciate any help on this.
I have a measure called "Sales" populated with values; however, I am trying to turn the "Sales" value to 0 whenever the "Sales Flag" is set to 0.
Important note: the Sales Flag is based on Date (the lowest level of detail).
The difficulty that I am really experiencing, and can't get a grip on, is how to display the MDX outcome.
As explained above, I want to make the "Sales" value 0 whenever we have a 0 in the "Sales Flag" (which is based on the Date), but when I run the MDX script I want the ROWS to NOT display the Date, but instead just the Week (the level above Date).
I really have spent hours on this and can't seem to understand how to create this custom Sales measure based on the Sales Flag at the date level, while having the MDX outcome display ROWS at the Week level.
Laz
You need to define the member in the MDX before the SELECT. Something like this:
WITH MEMBER [Measures].[Fixed Sales] as IIF([Sales Flag].currentMember=1,[Sales], 0)
SELECT [Measures].[Fixed Sales] on 0, [Sales Flag] on 1 from [Cube]
I am writing the code without SSAS here so it might not be the 100% correct syntax but you can get the general idea ;)
You can add the IIF in the SELECT part, but I find creating a member to be the cleaner solution.
SELECT IIF([Sales Flag].currentMember=1,[Sales], 0) on 0, [Sales Flag] on 1 from [Cube]
If you have control over the cube in SSAS, you can create a calculated member there and access it more easily.
Glad to hear if Veselin's answer works for you, but if not...
Several approaches are also possible.
Use the Measure Expression property of the Sales measure.
Use a SCOPE command at the Day level (if it's the key level of the Date dimension). If it's not the key level, you have to scope EVERY level (week, year, etc.) to emulate the AggregateFunction of the Sales measure, but with the updated behavior for the flag:
SCOPE([Date].[Your Date Hierarchy].[Day].members,[Measures].[Sales]);
THIS=IIF([Sales Flag].CurrentMember.MemberValue = 1,[Measures].[Sales],0);
END SCOPE;
Update the logic in the DSV to multiply the Sales column by SalesFlag. This is the easiest way from a T-SQL perspective.
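A minimal sketch of that DSV named query, assuming a fact table and columns along these lines (the names are assumptions, not from the question):
SELECT
OrderDate
, SalesFlag
, Sales * SalesFlag AS Sales -- flag 0 zeroes the value, flag 1 leaves it unchanged
FROM dbo.FactSales;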

How to expose total number of days in each month via OLAP cube

I'd like an easy way of getting the total number of days for each month in the date dimension.
Currently this information is not exposed in our cubes. Therefore I need to write custom mdx such as the following:
WITH
SET [13Mth] AS Tail([Date].[Date - Calendar Month].[Calendar Month].MEMBERS ,13)
SET [m] AS Tail([13Mth])
MEMBER [Measures].[TotalNumDaysMth] AS
Datepart
("D",
Dateadd
("M",1,
Cdate(Cstr(VBA!Month([m].Item(0).Item(0).Name)) + "-01-" + Cstr(VBA!Year([m].Item(0).Item(0).Name)))
)
- 1
)
MEMBER [Measures].[TotalNumDaysMth-1] AS
Datepart
("D",
Dateadd
("M",1,
Cdate(Cstr(VBA!Month([m].Item(0).Item(0).Lag(1).Name)) + "-01-" + Cstr(VBA!Year([m].Item(0).Item(0).Lag(1).Name)))
)
- 1
)
I don't believe our users will need this information within our cube-browsing client, but from a developer point of view I could do without having to implement the above every time.
What approach should we use to make the above data more easily available?
I have always added an attribute to the date dimension, "days in month", of type integer. You can hide this attribute if you do not want to expose the attribute hierarchy to your users, but you can still use it in calculations.
So my advice would be: add a proper attribute to your time dimension and base your calculations on that attribute, for example as sketched below.
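One way to populate such an attribute at the source is a persisted computed column (a sketch; the table and column names are assumptions, and EOMONTH needs SQL Server 2012 or later):
-- DAY of the month's last date = number of days in that month
ALTER TABLE dbo.DimDate
ADD DaysInMonth AS DAY(EOMONTH([Date])) PERSISTED;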
Hope this helps somehow.
I am assuming that in your development scenario the month will come from a web service/front end/SSRS report etc. In any such case, you can create a measure that will return the count.
Approach 1: Year-Quarter-Month-Date exists
It is a good practice to have a similar hierarchy in place. The following MDX will work:
with set CurrentMonth as
[Date].[Year-Quarter-Month-Date].currentmember
set DatesInCurrentMonth as
descendants(CurrentMonth, [Date].[Year-Quarter-Month-Date].[Date])
member [Measures].countofdaysinmonth as
count(DatesInCurrentMonth)
select measures.CountOfDaysInMonth on 0
from [MyCube]
where [Date].[Year-Quarter-Month-Date].[Month].&[Feb-2004]
Approach 2: Such hierarchy doesn't exist
with set CurrentMonth as
[Date].[Month].currentmember
set DatesInCurrentMonth as
exists([Date].[Date].children, CurrentMonth, "<<Any Measure group>>")
member measures.countofdaysinmonth as
count(DatesInCurrentMonth)
select measures.countofdaysinmonth on 0
from [MyCube]
where [Date].[Month].&[Feb-2004]
Let me know if they work/have issues.

Wrong result of ParallelPeriod() for DATE in MDX

I want to extract data for the same period last year and last month.
For this I am using ParallelPeriod(), e.g.
PARALLELPERIOD([date].[year],1,[date].[date].[20-Sep-2014]) ,
for which I am getting the output: 21-Sep-2014
and
PARALLELPERIOD([date].[month],1,[date].[date].[20-Sep-2014]) ,
for which I am getting the output: 16-Aug-2014
The same function returns some other unexpected date for other months.
Can you guide me on the issue - where am I going wrong, or is there some other alternative to this?
You must have some dates that do not exist in the cube.
What the PARALLELPERIOD function is doing is saying: OK, we are 262 members into 2014 at the [date] level; then it goes to 2013 and finds the member at the [date] level that is also 262 members in. Therefore, unless you have complete sets of dates in your cube, this function will return surprising results.
Therefore the solution is to ensure that all historical dates are represented in the cube. These extra dates should not cause any extra overhead, as they will just create empty space in the cube, which SSAS deals with very well.
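If the date table is missing days, one way to backfill it is a simple calendar fill (a sketch; the table name, column name, and date range are assumptions):
;WITH AllDates AS
(
SELECT CAST('2013-01-01' AS DATE) AS d
UNION ALL
SELECT DATEADD(DAY, 1, d) FROM AllDates WHERE d < '2014-12-31'
)
INSERT INTO dbo.DimDate ([Date])
SELECT d
FROM AllDates
WHERE d NOT IN (SELECT [Date] FROM dbo.DimDate) -- only insert the missing days
OPTION (MAXRECURSION 0);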

Aggregation of recursive day-level calculations

I'm trying to set up a cube calculation. My current remit is that I will receive exchange rates, and these will be sent at the start of the month. Sometimes they will change during the month, and I will receive a rate to use for that currency from then onwards.
The initial problem is that LastNonEmpty uses the granularity being queried, which means that when I query by day, most days do not have exchange rates, so the calculations fail. I replaced the exchange rate measure, initially with a recursive calculation, but later with the following:
WITH
MEMBER [Measures].[EOD]
AS
Aggregate({NULL:[Date].[Calendar Y M D].CurrentMember}, [Measures].[End Of Day Rate])
So, when viewed on the date dimension, it will get the most recent exchange rate. This allows a daily calculation as follows:
MEMBER [Measures].[Converted]
AS
SUM
(
{[Currency].[Currency].[Currency] * [Date].[Calendar Y M D].CurrentMember},
(
(
[Measures].[Sales]
)
/ [Measures].[EOD]
)
)
Lastly I use this in a query:
SELECT
{
[Measures].[Converted]
}
ON COLUMNS,
NON EMPTY
{
[Date].[Calendar Y M D].[Calendar Day].&[2010-Q3-08-01] :
[Date].[Calendar Y M D].[Calendar Day].&[2010-Q3-08-31]
}
ON ROWS
FROM [Cube]
This all works fine, but ideally I'd prefer to query by year/month, as these are month-end reports. I can sum all this in the code tiers, but I'd rather have MDX that can do it itself.
The problem is that the calculations are done almost recursively at the day level, and then I'd like them rolled up to the month (or even year) level.
I did try the currency conversion wizard, swapping the exchange rate for the time-aggregated one above, but attempting to browse the cube in SSMS locked the server as it tried to do the aggregation for the whole calendar :(
Any suggestions on what approach is best to take?
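For reference, the month-level rollup being asked about is often written as a SUM over the day-level descendants - a sketch using the hierarchy names from the question, untested against this cube:
MEMBER [Measures].[Converted Month] AS
SUM
(
Descendants
(
[Date].[Calendar Y M D].CurrentMember,
[Date].[Calendar Y M D].[Calendar Day]
),
[Measures].[Converted]
)
// Placing months (or years) on ROWS then aggregates the day-level conversions.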
If you used views or named queries in the DSV instead of tables, you could adjust the view or named query to limit the data to days where you have loaded exchange rate data. I'm not sure if this would meet your business need, but you could eliminate the empty-data issues through this approach. After all, there's not much use in loading data that doesn't contain a value or, worse, preloading a guesstimated exchange rate that would have to be overwritten once the actual numbers are available.
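A sketch of such a view (the object and column names are assumptions, not from the thread):
CREATE VIEW dbo.vwEndOfDayRate
AS
SELECT RateDate, CurrencyKey, EndOfDayRate
FROM dbo.FactExchangeRate
WHERE EndOfDayRate IS NOT NULL; -- only days where a rate was actually loaded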