DAX Formula Issue - ssas

This DAX formula issue is frustrating me to no end so I appreciate any help or another way to look at this.
Both of these formulas calculate the same value and that's been confirmed so don't worry about what the LY_Key actually equals. BUT the one with Variables removes the ability to drill down into separate years in the same table. My problem really exists when using weeks, but it's slightly easier to understand using these tables.
Can you see a difference between these 2 formulas that would remove a drilldown capability? Thank you in advance for any assistance.
---Original
Net Sales Trailing 3 Periods LY:=
CALCULATE (
factSales[Net Sales],
ALL(dimDate),
FILTER (
ALL ( 'dimPeriod' ),
'dimPeriod'[PeriodKey] <= MAX ( 'dimPeriod'[PeriodKey] ) - 14
&& 'dimPeriod'[PeriodKey] >= MAX ( 'dimPeriod'[PeriodKey] ) - 16
)
)
---With Variables
Net Sales Trailing 3 Periods LY:=
VAR
LY_FPW = MAX('dimPeriod'[YYYYFP]) - 100
VAR
LY_Key = MAXX(FILTER('dimPeriod', 'dimPeriod'[YYYYFP] = LY_FPW), 'dimPeriod'[PeriodKey])
RETURN
CALCULATE (
factSales[Net Sales],
ALL(dimDate),
FILTER (
ALL ( 'dimPeriod' ),
'dimPeriod'[PeriodKey] <= LY_Key - 1
&& 'dimPeriod'[PeriodKey] >= LY_Key - 3
)
)

Related

Applying advanced filter in Power BI DAX, from a different table

I have the following tables:
Episodes:
Clients:
My DAX calculation sums up [Days_epi] unique values, from Episodes tbl, grouping them by [ProgramID_epi], [EpisodeID_epi], [ClientID_epi].
So, the SUM of [Days_epi] = 3 + 5 + 31 + 8 + 15 + 20 + 10 = 92
Here is my working code for this:
DaysSUM =
CALCULATE (
SUMX (
SUMMARIZE (
'Episodes',
'Episodes'[EpisodeID_epi],
'Episodes'[ProgramID_epi],
'Episodes'[ClientID_epi],
'Episodes'[Days_epi]
),
'Episodes'[Days_epi]
),
FILTER (
'Episodes',
'Episodes'[Category_epi] = "Homeless"
)
)
I need to add two advanced filters to the calculation above:
Filter 1 should ONLY KEEP records in Episodes, if the records in the Clients have the difference between [DischDate_clnt] and [AdmDate_clnt] >= 365.
Filter 1 in SQL statement is
DATEDIFF(DAY, [AdmDate_clnt], [DischDate_clnt]) >= 365
After that, Filter 2 should ONLY KEEP records in Episodes, if the records in the Clients have
[Date_clnt] >= [AdmDate_clnt] + 12 months. (12 month after the Admission Date)
Filter 2 in SQL statement is
[Date_clnt] <= DATEADD(MONTH, 12, [AdmDate_clnt])
So, after applying those two filters, I expect records 6 and 10 of the Episodes tbl to be excluded (filtered out), because records 2 and 3 of the Clients tbl (highlighted in green) do not satisfy my Filter 1 / Filter 2.
Here is the final Episodes dataset I should have (without the 2 records in red):
I was starting to update my DAX code as the following (below).
But keep receiving error "Parameter is not the correct type"
DaysSUM_Filters =
CALCULATE (
SUMX (
SUMMARIZE (
'Episodes',
'Episodes'[EpisodeID_epi],
'Episodes'[ProgramID_epi],
'Episodes'[ClientID_epi],
'Episodes'[Days_epi]
),
'Episodes'[Days_epi]
),
FILTER (
'Episodes',
'Episodes'[Category_epi] = "Homeless"
), TREATAS(DATEDIFF('Clients'[AdmDate_clnt],
'Clients'[DischDate_clnt], DAY)>=365,
'Clients'[Date_clnt])
)
Not exactly sure how to set those 2 filters correctly in DAX Power BI, as I
am relatively new to it.
Please help!
I can't speak to the whole problem, but one thing is clear: you are using TREATAS incorrectly. It works like this: TREATAS({"Red", "White", "Blue"}, 'Product'[Color]).
In your case
DATEDIFF('Clients'[AdmDate_clnt],
'Clients'[DischDate_clnt], DAY)>=365
will return a single TRUE or FALSE value. The first argument of TREATAS must be a table expression (a set of values for one or more columns), not a single scalar value.
You can use the filter like this:
FILTER(
'Clients'
,DATEDIFF(
'Clients'[AdmDate_clnt]
,'Clients'[DischDate_clnt]
,DAY
)>=365
)
This returns a filtered table.
It may work as a filter if your tables are related.
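To make the distinction concrete, here is a minimal Python sketch (not DAX) of what a row-by-row filter does: the predicate is evaluated once per row and yields a single TRUE/FALSE, while the filter as a whole yields the rows that passed. The sample rows and the helper name `stay_at_least` are hypothetical; only the column names come from the question.

```python
from datetime import date

# Hypothetical Clients rows; only the column names come from the question.
clients = [
    {"ClientID_clnt": 1, "AdmDate_clnt": date(2019, 1, 1), "DischDate_clnt": date(2020, 3, 1)},
    {"ClientID_clnt": 2, "AdmDate_clnt": date(2019, 6, 1), "DischDate_clnt": date(2019, 9, 1)},
    {"ClientID_clnt": 3, "AdmDate_clnt": date(2018, 1, 1), "DischDate_clnt": date(2019, 1, 1)},
]

def stay_at_least(row, days):
    """Per-row predicate: one True/False, like the DATEDIFF(...) >= 365 test."""
    return (row["DischDate_clnt"] - row["AdmDate_clnt"]).days >= days

# The filter keeps whole rows, so its result is a table rather than a boolean.
long_stays = [row for row in clients if stay_at_least(row, 365)]
kept_ids = [row["ClientID_clnt"] for row in long_stays]
print(kept_ids)
```

This is why a comparison on its own cannot be handed to a function that expects a table: the comparison only makes sense inside something that iterates rows.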

Frequency table of continuous variable in SQL?

I have a continuous variable SQL table:
x
1 622.108
2 622.189
3 622.048
4 622.758
5 622.191
6 622.677
7 622.598
8 622.020
9 621.228
10 622.690
...
and I try to get a simple frequency table, e.g. with 3 buckets, like this:
bucket n
[621.228-621.738[ 1
[621.738-622.248[ 5
[622.248-622.758] 4
Seems easy but I cannot manage to make it in SQL (I am running it on a Cloudera Impala engine).
I have looked into dense_rank() or ntile() without success.
Any idea ?
You can use window functions to divide the range into three equal parts and then use arithmetic:
select min_x + width * bucket as bucket_lo,
min_x + width * (bucket + 1) as bucket_hi,
count(*) as n
from (select t.*,
-- bucket index 0, 1 or 2, counted from the minimum
floor((x - min_x) / width) as bucket
from (select t.*,
min(x) over () as min_x,
-- one third of the range; the small offset keeps max(x) in the last bucket
(0.000001 + max(x) over () - min(x) over ()) / 3 as width
from t
) t
) t
group by bucket, min_x, width
order by bucket
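The arithmetic behind equal-width buckets (offset from the minimum, divide by the bucket width, take the floor) can be checked outside SQL. The following is a minimal Python sketch using the ten sample values from the question; the function name `bucket_counts` and the `eps` parameter are illustrative choices, not part of any answer above.

```python
import math
from collections import Counter

# The ten sample values shown in the question.
xs = [622.108, 622.189, 622.048, 622.758, 622.191,
      622.677, 622.598, 622.020, 621.228, 622.690]

def bucket_counts(values, n_buckets=3, eps=1e-6):
    """Count values per equal-width bucket; eps keeps the maximum in the last bucket."""
    lo = min(values)
    width = (max(values) - lo + eps) / n_buckets
    counts = Counter(math.floor((v - lo) / width) for v in values)
    # Report (bucket_lo, bucket_hi, n) for every bucket, including empty ones.
    return [(lo + i * width, lo + (i + 1) * width, counts.get(i, 0))
            for i in range(n_buckets)]

result = bucket_counts(xs)
for b_lo, b_hi, n in result:
    print(f"[{b_lo:.3f}, {b_hi:.3f}) {n}")
```

On these values the counts come out as 1, 5 and 4, matching the frequency table the question asks for.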
There are at least two problems with your question:
1. You have not provided any code to show us what you have tried. It really is good sometimes to just work out the problem yourself. Nevertheless, I found the problem interesting and decided to play.
2. Your range blocks overlap. If, for example, you were to have the value 621.738 in your list, which bucket would contain it? [621.228-621.738] or [621.738-622.248]?
There are also at least three problems with my answer, so I don't expect you to accept this. However, maybe it will get you started. Hopefully, this disclaimer will keep me from getting down voted. :-)
1. The answer is in T-SQL. Sorry, it's what I have to work with.
2. The answer is not generic. It always creates three and only three buckets.
3. It only works if the data type limits the result to 3 decimal places.
Remember, this is only one possible solution, and in my mind a very weak one at that.
With those disclaimers, here's what I wrote:
SELECT
'[' + STR( RANGES.RANGESTART, 7, 3 )
+ ' - '
+ STR( RANGES.RANGEEND, 7, 3 ) + ']' AS 'BUCKET'
,COUNT(*) AS 'N'
FROM
( SELECT
VALS.MINVAL + (CAST( CNT.INC AS DECIMAL(7,3) ) * VALS.RANGEWIDTH) AS 'RANGESTART'
,CASE WHEN CNT.INC < 2
THEN VALS.MINVAL + (CAST( CNT.INC + 1 AS DECIMAL(7,3) ) * VALS.RANGEWIDTH) - 0.001
ELSE VALS.MINVAL + (CAST( CNT.INC + 1 AS DECIMAL(7,3) ) * VALS.RANGEWIDTH)
END AS 'RANGEEND'
FROM
( SELECT
MIN(CURVAL) AS 'MINVAL'
,MAX(CURVAL) AS 'MAXVAL'
,(MAX(CURVAL) - MIN(CURVAL)) / 3 AS 'RANGEWIDTH'
FROM
MYVALUE ) VALS
CROSS JOIN (VALUES (0), (1), (2) ) CNT(INC)
) RANGES
INNER JOIN MYVALUE V
ON V.CURVAL BETWEEN RANGES.RANGESTART AND RANGES.RANGEEND
GROUP BY
RANGES.RANGESTART
,RANGES.RANGEEND
ORDER BY 1
;
In the above, your values would be in the CURVAL column of the MYVALUE table.
Good luck. I hope this helps you on your way.

Is there a way I can Query Missing numbers in a table?

I work for a logistics company, and each piece of freight must carry a 7-digit pro number that follows a pre-determined order. We know there are gaps in the numbers, but is there any way I can query the system to find out which ones are missing?
So show me all the numbers from 1000000 to 2000000 that do not exist in the column trace_number.
As you can see below, the sequence goes 1024397, 1024398, then 1051152, so I know there is a substantial gap of about 26k pro numbers, but is there any way to query just the gaps?
Select t.trace_number,
integer(trace_number) as number,
ISNUMERIC(trace_number) as check
from trace as t
left join tlorder as tl on t.detail_number = tl.detail_line_id
where left(t.trace_number,1) in ('0','1','2','3','4','5','6','7','8','9')
and date(pick_up_by) >= current_date - 1 years
and length(t.trace_number) = 7
and t.trace_type = '2'
and site_id in ('SITE5','SITE9','SITE10')
and ISNUMERIC(trace_number) = 'True'
order by 2
fetch first 10000 rows only
I'm not sure what your query has to do with the question, but you can identify gaps using lag()/lead(). The idea is:
select (trace_number + 1) as start_gap,
(next_tn - 1) as end_gap
from (select t.*,
lead(trace_number) over (order by trace_number) as next_tn
from t
) t
where next_tn <> trace_number + 1;
This does not find them within a range. It just finds all gaps.
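The lead() idea is to pair each value with its successor and report a gap whenever the successor is not exactly one higher. A minimal Python sketch of the same logic, using the three pro numbers mentioned in the question (the function name `find_gaps` is hypothetical):

```python
def find_gaps(numbers):
    """Return (start_gap, end_gap) pairs between consecutive sorted values."""
    ordered = sorted(set(numbers))
    gaps = []
    # zip pairs each value with its successor, which is what lead() provides in SQL.
    for current, nxt in zip(ordered, ordered[1:]):
        if nxt != current + 1:
            gaps.append((current + 1, nxt - 1))
    return gaps

# The three pro numbers mentioned in the question.
gaps = find_gaps([1024397, 1024398, 1051152])
print(gaps)
```

Each pair gives the first and last missing number of one gap, so a gap of 26k numbers is reported as a single row rather than 26k rows.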
Try something like this (adapt the WHERE condition, or move it into the ON clause):
with Range (nb) as (
values 1000000
union all
select nb+1 from Range
where nb<2000000
)
)
select *
from range f1 left outer join trace f2
on f2.trace_number=f1.nb
and f2.trace_number between 1000000 and 2000000
where f2.trace_number is null
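The generated-range approach is a left anti-join: build every number in the range, then keep the ones with no match in the table. A minimal Python sketch of that idea follows; the function name and the small sample range are hypothetical, chosen only so the output stays readable.

```python
def missing_in_range(existing, low, high):
    """Numbers in [low, high] that do not appear in existing (a left anti-join)."""
    present = set(existing)
    return [n for n in range(low, high + 1) if n not in present]

# Hypothetical small range so the output stays readable.
missing = missing_in_range([1000000, 1000001, 1000004], 1000000, 1000005)
print(missing)
```

Unlike the lead() approach, this lists every missing number individually and also catches numbers missing at the very start or end of the range.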

Mean time to Failure calculation in DAX

I am trying to calculate the mean time to failure for each asset in a job table. At the moment I calculate as follows;
Previous ID = CALCULATE(MAX('JobTrackDB Job'[JobId]),FILTER('JobTrackDB Job','JobTrackDB Job'[AssetDescriptionID]=EARLIER('JobTrackDB Job'[AssetDescriptionID]) && 'JobTrackDB Job'[JobId]<EARLIER('JobTrackDB Job'[JobId])))
Then I bring back the last finish time for the current job when the JobStatus is 7 (closed);
Finish Time = CALCULATE(MAX('JobTrackDB JobDetail'[FinishTime]),'JobTrackDB JobDetail'[JobId],'JobTrackDB JobDetail'[JobStatus]=7)
Then I bring back the previous jobs finish time where the JobType is 1 (Response rather than comparing it to maintenance calls);
Previous Finish = CALCULATE(MAX('JobTrackDB Job'[Finish Time]),FILTER('JobTrackDB Job','JobTrackDB Job'[AssetDescriptionID]=EARLIER('JobTrackDB Job'[AssetDescriptionID]) && 'JobTrackDB Job'[Finish Time]<EARLIER('JobTrackDB Job'[Finish Time]) && EARLIER('JobTrackDB Job'[JobTypeID])=1))
Then I calculate the Time between failure where I also disregard erroneous values;
Time between failure = IF([Previous Finish]=BLANK(),BLANK(),IF('JobTrackDB Job'[Date Logged]-[Previous Finish]<0,BLANK(),'JobTrackDB Job'[Date Logged]-[Previous Finish]))
Issue is that sometimes the calculation uses previous maintenance jobs even though I specified JobTypeID = 1 in the filter. Also, the current calculation does not take into account the time from the start of records to the first job for that asset and also from the last job till today. I am scratching my head trying to figure it out.
Any ideas???
Thanks,
Brent
Some base measures:
MaxJobID := MAX( Job[JobID] )
MaxLogDate := MAX ( Job[Date Logged] )
MaxFinishTime := MAX (JobDetail[Finish Time])
Intermediate calculations:
ClosedFinishTime := CALCULATE ( [MaxFinishTime], Job[Status] = 7 )
AssetPreviousJobID := CALCULATE (
[MaxJobID],
FILTER(
ALLEXCEPT(Job, Job[AssetDescriptionID]),
Job[JobId] < MAX(Job[JobID])
)
)
PreviousFinishTime := CALCULATE ( [ClosedFinishTime],
FILTER(
ALLEXCEPT(Job, Job[AssetDescriptionID]),
Job[JobId] < MAX(Job[JobID])
&& Job[JobType] = 1
)
)
FailureTime := IF (
ISBLANK([PreviousFinishTime]),
0,
( [MaxLogDate]-[PreviousFinishTime] )
)
This should at least get you started. If you want to set some sort of "first day", you can replace the 0 in FailureTime with a calculation like [MaxLogDate] - [OverallFirstDate], where [OverallFirstDate] could be a calculated measure or a constant.
For something that hasn't failed, you'd want to use an entirely different measure since this one is based on lookback only. Something like [Days Since Last Failure] which would just be (basically) TODAY() - [ClosedFinishTime]
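The lookback logic can be sketched in plain Python to check the intended result before translating it to DAX. The job rows, field names and the function `time_between_failures` below are hypothetical. One thing the sketch makes explicit is that the job-type test is applied to the earlier (candidate) job; in the question's formula that test is wrapped in EARLIER, which refers to the current row instead, and that may be why maintenance jobs slip through.

```python
from datetime import datetime

# Hypothetical jobs for one asset; field names loosely follow the question.
jobs = [
    {"job_id": 1, "asset": "A", "job_type": 1,
     "logged": datetime(2020, 1, 1), "finished": datetime(2020, 1, 2)},
    {"job_id": 2, "asset": "A", "job_type": 2,
     "logged": datetime(2020, 1, 5), "finished": datetime(2020, 1, 6)},
    {"job_id": 3, "asset": "A", "job_type": 1,
     "logged": datetime(2020, 1, 12), "finished": datetime(2020, 1, 13)},
]

def time_between_failures(job, all_jobs):
    """Days from the previous response job's finish to this job's log date, or None."""
    earlier = [j["finished"] for j in all_jobs
               if j["asset"] == job["asset"]
               and j["job_id"] < job["job_id"]
               and j["job_type"] == 1]  # the type test applies to the earlier job
    if not earlier:
        return None
    delta = job["logged"] - max(earlier)
    # Negative intervals are treated as erroneous, as in the question.
    return delta.days if delta.total_seconds() >= 0 else None

tbf = {j["job_id"]: time_between_failures(j, jobs) for j in jobs}
print(tbf)
```

Job 3 is measured against job 1 rather than the maintenance job 2, which is the behaviour the question is after.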

MDX query optimization while using CrossJoin

I am writing an MDX query that selects some measures. In the WHERE clause I cross join two sets, one a date and the other a list of unique ids. I pass around 2000 unique ids, and the query takes around 20 minutes to execute and return the result.
Please find the query below:
SELECT {[Measures].[TOTAL1], [Measures].[TOTAL2], [Measures].[TOAL3]} ON COLUMNS,
" + " {TOPCOUNT(FILTER([ID].[Ids].MEMBERS,
[ID].CurrentMember > 0),
5,[Measures].[TOTAL])} " + "ON ROWS
FROM [CHARTS]
WHERE({[Date].&[2015-09-01 00:00:00.0]}*{[NUM].[1],[NUM].[10],"
+ "[NUM].[18],[NUM].[47],[NUM].[52],[NUM].[105],[NUM].[126],[NUM].[392],"
+ "[NUM].[588],[NUM].[656],[NUM].[995],[NUM].[1005],[NUM].[1010],[NUM].[1061]})";
The straight mdx without the string manipulation operators (+) is as follows:
SELECT
{
[Measures].[TOTAL1]
,[Measures].[TOTAL2]
,[Measures].[TOAL3]
} ON COLUMNS
,{
TopCount
(
Filter
(
[ID].[Ids].MEMBERS
,
[ID].CurrentMember > 0
)
,5
,[Measures].[TOTAL]
)
} ON ROWS
FROM [CHARTS]
WHERE
{[Date].&[2015-09-01 00:00:00.0]}
*
{
[NUM].[1]
,[NUM].[10]
,[NUM].[18]
,[NUM].[47]
,[NUM].[52]
,[NUM].[105]
,[NUM].[126]
,[NUM].[392]
,[NUM].[588]
,[NUM].[656]
,[NUM].[995]
,[NUM].[1005]
,[NUM].[1010]
,[NUM].[1061]
};
Can you please suggest performance optimization techniques for this query?
TopCount is slow if you use the third ordering parameter - it is better to order the data first and then feed your pre-ordered set into TopCount with just 2 parameters:
WITH
SET [S0] AS
Filter
(
[ID].[Ids].MEMBERS
,
[ID].CurrentMember > 0
)
SET [S1] AS
Order
(
[S0]
,[Measures].[TOTAL]
,BDESC
)
SET [S2] AS
TopCount
(
[S1]
,5
)
SELECT
{
[Measures].[TOTAL1]
,[Measures].[TOTAL2]
,[Measures].[TOAL3]
} ON COLUMNS
,[S2] ON ROWS
FROM [CHARTS]
WHERE
{[Date].&[2015-09-01 00:00:00.0]}
*
{
[NUM].[1]
,[NUM].[10]
,[NUM].[18]
,[NUM].[47]
,[NUM].[52]
,[NUM].[105]
,[NUM].[126]
,[NUM].[392]
,[NUM].[588]
,[NUM].[656]
,[NUM].[995]
,[NUM].[1005]
,[NUM].[1010]
,[NUM].[1061]
};
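The three named sets form a simple pipeline: filter, order once, then take the head of the ordered set. A minimal Python sketch of that pipeline is below; the member names and totals are hypothetical stand-ins for [ID].[Ids] members and [Measures].[TOTAL], and it illustrates only the logic, not the engine's performance characteristics.

```python
# Hypothetical (member, total) pairs standing in for [ID].[Ids] and [Measures].[TOTAL].
totals = {"a": 40, "b": 0, "c": 75, "d": 10, "e": 55, "f": 90, "g": 25}

# S0: keep members with a positive value (the Filter step).
s0 = [(m, v) for m, v in totals.items() if v > 0]
# S1: order once, descending by the measure (the Order ... BDESC step).
s1 = sorted(s0, key=lambda pair: pair[1], reverse=True)
# S2: take the first five of the pre-ordered set (two-argument TopCount).
s2 = s1[:5]
top_members = [m for m, _ in s2]
print(top_members)
```

Because the set is already ordered when it reaches the last step, taking the top five is just a slice.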