Beginner MDX Questions

My first question here, so sorry for my ignorance. I'm new to MDX and really struggling to get the results I need. I'm working in a logistics warehouse and would like to calculate some internal lead times.
For this I would need output with the following layout:
Order ID | Run State Name First | Date&Time | Run State Name Last | Date&Time | Sales QTY
The data fields are called
[OrderTable].[Order ID]
[OrderRunState].[Run State Name]
[Time Status History Created Date].[Y-Q-M-D]
[Clock Status History Created Date].[Time Minute]
[Measure].[Sales QTY]
[Time Status History Created Date].[Y-Q-M-D] <- used for filtering
Now there are two 'buts'.
Run State Name First: I need the first RSN based on the date & time, but this RSN can't be equal to "A", "B" or "C".
Run State Name Last needs to be equal to "F".
To make it a bit clearer I've made a small Excel Table to illustrate what I mean.
Excel Table Example
EDIT: Forgot to add what I've done so far (please don't laugh... too hard ;)). It's not what I need, but I'm trying to add complexity step by step.
SELECT
    NON EMPTY [Measures].[Sales QTY] ON COLUMNS,
    NON EMPTY
    ( [OrderTable].[Order ID].MEMBERS,
      [OrderRunState].[Run State Name].MEMBERS,
      [Time Status History Created Date.Y-Q-M-D].[Date].MEMBERS,
      [Clock Status History Created Date].[Time Minute].MEMBERS )
    ON ROWS
FROM
    [Operations]
WHERE
    [Time Status History Created Date].[Yesterday].&[Yes]
I still need to add the date & time and something like TOPCOUNT or FIRSTCHILD, but excluding RSN A, B and C.


When Joining a SQL Table to Itself, Return only One Result. Time Log Entry

I am working with some data from Jira. In Jira there are "Issues" and each issue can go through states such as New, In Progress, Review, etc. I want to measure the time something stays in each state. It is possible for things to move back and forth and return to a state multiple times. Jira produces a log table which logs the event of the issue moving from one state to another. To measure how long it’s been in a state you need to have the entry from the log for that state (end) and the previous change of state (start).
I can't return just one entry per state change: for issues that moved in and out of a state multiple times I get multiple rows.
I tried the gaps-and-islands approach, and also a subselect within the top-level select. Using MIN or MAX in the join was atrociously slow.
The desired result would be a column added to the table in the select which gives the duration for the state in the column ItemFromString. The date difference is between this entry's Created date and the previous state-change entry's Created date, which shows when the issue moved to this state. In the example data below, the first entry for Assessment (HistoryID 436260) would have a duration of 9/19 minus 9/14. When I join, I get multiple entries for this HistoryID since there are multiple Assessments. I filter the join by IssueKey and on matching ItemFromString/ItemToString; however, I also need a filter that looks at any entries created before the current item's Created date and selects the most recent (largest) one. This is where I am hung up.
Fields:
Created - when the log entry was created, i.e. the date/time the issue changed state from ItemFromString to ItemToString.
IssueCreated - when the issue the log is about was created. Issues start in the New state, so we need this date to figure out how long an issue sat in New, as the first log entry will be it moving from New to something else.
IssueKey and IssueID are almost the same thing; they are key IDs for the issue in a different table.
HistoryID is the key for each log entry in this table.
IssueKey | HistoryID | IssueId | Created          | IssueCreatedDate | ItemFromString  | ItemToString
TPP-16   | 434905    | 208965  | 9/14/2022 14:33  | 9/14/2022 8:56   | New             | Assessment
TPP-16   | 436260    | 208965  | 9/19/2022 8:32   | 9/14/2022 8:56   | Assessment      | Internal Review
TPP-16   | 437795    | 208965  | 9/19/2022 16:11  | 9/14/2022 8:56   | Internal Review | New
TPP-16   | 437796    | 208965  | 9/19/2022 16:11  | 9/14/2022 8:56   | New             | Assessment
TPP-16   | 439006    | 208965  | 9/20/2022 15:08  | 9/14/2022 8:56   | Assessment      | New
TPP-16   | 457786    | 208965  | 10/17/2022 11:02 | 9/14/2022 8:56   | New             | Assessment
TPP-16   | 457789    | 208965  | 10/17/2022 11:03 | 9/14/2022 8:56   | Assessment      | Internal Review
TPP-16   | 490205    | 208965  | 10/27/2022 15:15 | 9/14/2022 8:56   | Internal Review | On Hold
TPP-16   | 539391    | 208965  | 1/11/2023 15:24  | 9/14/2022 8:56   | On Hold         | Backlog
This query does not correctly produce the duration as its last column. The query creates a table that is then published and used by BI products for graphing and analysis.
SELECT
    IssueChangelogs.IssueKey IssueKey,
    IssueChangelogs.HistoryId HistoryId,
    IssueChangelogs.IssueId IssueId,
    IssueChangelogs.IssueCreatedDate IssueCreatedDate,
    IssueChangelogs.ItemFromString ItemFromString,
    IssueChangelogs.ItemToString ItemToString,
    ICLPrev.Created PrevCreated, -- For testing
    IssueChangelogs.Created Created,
    ICLPrev.HistoryID PrevHistoryID, -- For testing
    CASE
        -- If the join found a match for a previous status, then we can calculate the Duration it was in that state.
        WHEN ICLPrev.HistoryID IS NOT NULL
            THEN DATEDIFF(hour, ICLPrev.Created, IssueChangelogs.Created) / 24
        -- If the state was New then we need to use the IssueCreatedDate as the start date, as the default state is New for each issue.
        WHEN IssueChangelogs.ItemFromString LIKE '%New%'
            THEN ROUND(DATEDIFF(hour, IssueChangelogs.IssueCreatedDate, IssueChangelogs.Created), 2) / 24
        -- Else, add something easy to identify so when we test and look at the table we know what occurred.
        ELSE 0.01
    END AS Duration
FROM
    TableNameRedacted AS IssueChangelogs
    LEFT JOIN TableNameRedacted AS ICLPrev
        ON ICLPrev.IssueKey = IssueChangelogs.IssueKey
        AND ICLPrev.ItemToString = IssueChangelogs.ItemFromString
WHERE
    IssueChangelogs.IssueKey LIKE '%TPP%'
I used the ROW_NUMBER function to identify the rows I wanted, wrapped the query in a CTE, pulled those row numbers, and added one more criterion to the join to rule out oddities.
First we select what we need from the table, then pull out the previous entry related to the state (status) change so we can calculate the time it spent in that state as Duration. So we create a CTE, ordered so that the ROW_NUMBER function identifies the rows we want, and then select only those rows from the CTE.
WITH CTE AS (
    SELECT
        IssueChangelogs.IssueKey IssueKey,
        IssueChangelogs.HistoryId HistoryId,
        IssueChangelogs.IssueId IssueId,
        IssueChangelogs.AuthorDisplayName AuthorDisplayName,
        IssueChangelogs.IssueCreatedDate IssueCreatedDate,
        IssueChangelogs.ItemFromString ItemFromString,
        IssueChangelogs.ItemToString ItemToString,
        ICLPrev.Created PrevRelatedCreatedDate,
        ICLPrev.HistoryId PrevRelatedHistoryID,
        IssueChangelogs.Created Created,
        CASE
            -- If the join found a match for a previous status, then we can calculate the Duration it was in that state.
            WHEN ICLPrev.HistoryID IS NOT NULL
                -- The funky math is so we get the number of days with 2 decimal places.
                THEN CAST((DATEDIFF(second, ICLPrev.Created, IssueChangelogs.Created) / 86400.00) AS DECIMAL(6,2))
            -- If the state was New then we need to use the IssueCreatedDate as the start date, as the default state is New for each issue.
            WHEN IssueChangelogs.ItemFromString LIKE '%New%'
                -- Again calculate to get 2 decimal places for duration in days.
                THEN CAST((DATEDIFF(second, IssueChangelogs.IssueCreatedDate, IssueChangelogs.Created) / 86400.00) AS DECIMAL(6,2))
            -- Else, NULL so unexpected cases stand out when we inspect the table.
            ELSE NULL
        END AS Duration,
        -- Assign the number 1 to the rows we want to keep.
        ROW_NUMBER() OVER (
            PARTITION BY IssueChangelogs.IssueKey, IssueChangelogs.HistoryID, IssueChangelogs.Created
            -- Since duplicates can exist we only want the most recent match. The join ensures we don't have any start dates
            -- greater than the end date, and the ORDER BY here puts the most recent (DESC) on top so it gets tagged with a 1.
            ORDER BY ICLPrev.Created DESC
        ) AS RowNo
    FROM
        /shared/"Space"/CUI/TandTE/Physical/Metadata/JIRA/CIS/JIRA/IssueChangelogs IssueChangelogs
        -- These are the closest we can get to a key; however, the join will still produce duplicates since an issue can return to a previous state multiple times.
        LEFT JOIN /shared/"Space"/CUI/TandTE/Physical/Metadata/JIRA/CIS/JIRA/IssueChangelogs AS ICLPrev
            ON ICLPrev.IssueKey = IssueChangelogs.IssueKey
            AND ICLPrev.ItemToString = IssueChangelogs.ItemFromString
            -- This removes anything that doesn't make sense in the join, e.g. a state change where the start date is greater
            -- than the end date. It still leaves duplicates, which the row number will fix.
            AND ICLPrev.Created < IssueChangelogs.Created
    WHERE
        IssueChangelogs.IssueKey LIKE '%TPP%'
)
-- Now that the CTE has been created and the rows we want from the join are identified with a 1 in RowNo,
-- we filter by those and pull only the columns we want.
SELECT
    CTE.IssueKey,
    CTE.HistoryId,
    CTE.IssueId,
    CTE.AuthorDisplayName,
    CTE.IssueCreatedDate,
    CTE.ItemFromString,
    CTE.ItemToString,
    CTE.PrevRelatedHistoryID,
    CTE.PrevRelatedCreatedDate,
    CTE.Created,
    CTE.Duration
FROM CTE
WHERE CTE.RowNo = 1
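To sanity-check the pattern, here is a minimal, self-contained reproduction using Python's sqlite3 with a trimmed sample of the data above (ISO date strings are assumed, and SQLite's julianday stands in for DATEDIFF, so durations come out in days):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IssueChangelogs (
    IssueKey TEXT, HistoryId INTEGER, IssueId INTEGER,
    Created TEXT, IssueCreatedDate TEXT,
    ItemFromString TEXT, ItemToString TEXT);
INSERT INTO IssueChangelogs VALUES
 ('TPP-16', 434905, 208965, '2022-09-14 14:33', '2022-09-14 08:56', 'New', 'Assessment'),
 ('TPP-16', 436260, 208965, '2022-09-19 08:32', '2022-09-14 08:56', 'Assessment', 'Internal Review'),
 ('TPP-16', 437796, 208965, '2022-09-19 16:11', '2022-09-14 08:56', 'New', 'Assessment'),
 ('TPP-16', 439006, 208965, '2022-09-20 15:08', '2022-09-14 08:56', 'Assessment', 'New');
""")

# Self-join each entry to every earlier entry that moved INTO its from-state,
# then keep only the most recent candidate per entry (RowNo = 1).
rows = conn.execute("""
WITH CTE AS (
    SELECT cur.HistoryId, cur.ItemFromString,
           ROUND(julianday(cur.Created) - julianday(prev.Created), 2) AS Duration,
           ROW_NUMBER() OVER (PARTITION BY cur.HistoryId
                              ORDER BY prev.Created DESC) AS RowNo
    FROM IssueChangelogs cur
    LEFT JOIN IssueChangelogs prev
      ON prev.IssueKey = cur.IssueKey
     AND prev.ItemToString = cur.ItemFromString
     AND prev.Created < cur.Created
)
SELECT HistoryId, ItemFromString, Duration FROM CTE
WHERE RowNo = 1
ORDER BY HistoryId
""").fetchall()
for r in rows:
    print(r)
```

Note that 439006 is matched to 437796 (the most recent entry into Assessment), not to 434905, which is exactly what the RowNo = 1 filter buys you.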

SAP BO - how to get 1/0 distinct values per week in each row

The problem I am trying to solve is having a SAP Business Objects query calculate a variable for me, because calculating it in a large Excel file crashes the process.
I have a number of columns with daily/weekly data. I would like to get a "1" for the first instance of a Name/Person/certain identifier within a single week and a "0" for all the rest.
So for example, if the item "Glass" was sold 5 times in week 4, the first sale will get a "1" in this variable/column and the next 4 sales will get a "0". This will allow me to count the number of distinct items sold in a particular week.
I am aware there are Count and Count Distinct functions in Business Objects; however, I would prefer to have this 1/0 system for the entire raw table of data, because I am using it as the source for a whole dashboard and there are lots of metrics where distinct counts will be a part/slicer.
The way I did this previously is with an Excel formula: =IF(SUMPRODUCT(($A$2:$A5000=$A2)*($G$2:$G5000=$G2))>1,0,1)
This does the trick: it gives a "1" for the first instance of a value in column G appearing within a certain value range in column A (column A is the week) and a "0" when the same value reappears for the same week value in column A. It gives a "1" again when the week value changes.
Since it compares 2 cells in each row against entire columns of data, this tends to crash as the data gets bigger.
I have so far been unable to emulate this in Business Objects, and I think I have exhausted my abilities and my googling.
Could anyone share their two cents on this please?
Assuming you have an object in the query that uniquely identifies a row, you can do this in a couple of simple steps.
Let's assume your query contains the following objects:
Sale ID
Name
Person
Sale Date
Week #
Price
etc.
You want to show a 1 for the first occurrence of each Name/Week #.
Step 1: Create a variable with the following definition. Let's call it [FirstOne]
=Min([Sale ID]) In ([Name];[Week #])
Step 2: In the report block, add a column with the following formula:
=If [FirstOne] = [Sale ID] Then 1 Else 0
This should produce a 1 in the row that represents the first occurrence of Name within a Week #. If you then wanted to show a 1 on the first occurrence of Name/Person/Week #, you could just modify the [FirstOne] variable accordingly:
=Min([Sale ID]) In ([Name];[Person];[Week #])
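Outside of BO, the same two-step min-per-group idea can be sketched in plain Python (column names are borrowed from the example above; the sample rows are invented for illustration):

```python
# Mark the first occurrence per (Name, Week #) group by comparing each row's
# Sale ID to the minimum Sale ID within its group -- the BO [FirstOne] idea.
rows = [
    {"SaleID": 101, "Name": "Glass", "Week": 4},
    {"SaleID": 102, "Name": "Glass", "Week": 4},
    {"SaleID": 103, "Name": "Glass", "Week": 5},
    {"SaleID": 104, "Name": "Plate", "Week": 4},
]

# Equivalent of: =Min([Sale ID]) In ([Name];[Week #])
first_one = {}
for r in rows:
    key = (r["Name"], r["Week"])
    first_one[key] = min(first_one.get(key, r["SaleID"]), r["SaleID"])

# Equivalent of: =If [FirstOne] = [Sale ID] Then 1 Else 0
flags = [1 if first_one[(r["Name"], r["Week"])] == r["SaleID"] else 0 for r in rows]
print(flags)  # the first sale of each Name/Week gets a 1
```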
I think you want logic around row_number():
select t.*,
       (case when 1 = row_number() over (partition by name, person, week, identifier
                                         order by ??)
             then 1 else 0
        end) as new_indicator
from t;
Note the ??. SQL tables represent unordered sets. There is no "first" row in a table or group of rows, unless a column specifies that ordering. The ?? is for such a column (perhaps a date/time column, perhaps an id).
If you only want one row to be marked, you can put anything there, such as order by (select null) or order by week.
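A minimal sketch of this ROW_NUMBER() pattern, run against SQLite via Python (the table and column names are made up for illustration, and sale_id plays the role of the ?? ordering column):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_id INTEGER, name TEXT, week INTEGER);
INSERT INTO sales VALUES
 (101, 'Glass', 4), (102, 'Glass', 4), (103, 'Glass', 4),
 (104, 'Glass', 5), (105, 'Plate', 4);
""")

# The earliest sale in each name/week group is tagged 1,
# every later duplicate in the same group is tagged 0.
rows = conn.execute("""
SELECT sale_id,
       CASE WHEN 1 = ROW_NUMBER() OVER (PARTITION BY name, week
                                        ORDER BY sale_id)
            THEN 1 ELSE 0 END AS new_indicator
FROM sales
ORDER BY sale_id
""").fetchall()
print(rows)
```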

Processed cube sees no changes in sources, but new rows have been inserted in source db table

A simple AdventureWorks cube has a time dimension with years ranging from 2000 to 2007. Fact data exists for 2001-2006; 2000 and 2007 are empty. OK.
I insert new data into the source fact table for 2001 and process the cube. All changes are immediately reflected in the measure value (it grows). But new data for 2000 is not reflected in the measure value. Nevertheless, during cube processing SSAS sees the new rows in both cases (the row count grows), and SQL Profiler catches identical batches of commands for 2001 and 2000. But the measure value grows only for 2001.
I've cleared the MDX script - it now contains only the CALCULATE command. So now:
a) SELECT SUM(measure value) on the fact tables in the source database reflects the new rows in both cases
b) cube processing sees the changes in both cases
c) the MDX script is clean (there are no scopes which could set the measure value to null)
d) SELECT [sales amount] on the cube reflects the changes only for 2001, not for 2000
Any suggestions?
================================================
One day later... :)
The problem seems to be resolved, but I don't know whether this is a technical/logical feature or a bug.
So, here is what happened. A simple MDX query:
select [sales amount] on 0,
[some date from 2000] on 1
from [cube]
gives a null value for [sales amount], but for [some date from 2001] the value is not empty. In the database fact table both values (the results of sum(salesamount) for both dates) are non-empty. After some research I found that there was one measure ([Temp]) in the [sales amount] measure group whose MeasureExpression was set to [sales amount]*[average rate] ([average rate] being a measure from the [Fact Currency Rate] measure group). The fact table for the [Fact Currency Rate] measure group contained no data for 2000. I inserted it and - voila - everything now works for 2000 as well.
The main question is: why does SSAS evaluate MeasureExpression formulas for measures I did not request in the query, and why does SSAS set the value I did request to null when it does not depend on the MeasureExpression formula of the unrequested measure?
Is this a bug, or protection from the careless? There is another example of such strange SSAS behaviour: the UnaryOperatorColumn property. If it is set for any attribute of any dimension, it influences every query, even if that attribute is not part of the query. For example
select [sales amount] on 0
from [cube]
returns different results depending on the setting of UnaryOperatorColumn for some dimension attribute.
What does all this mean?

Member with Named Set always returns the same value in SSAS

I need to query a cube as a plain table. Also, I need to use a named set for performance reasons (the query takes 10x as long without the set). The issue is that I'm getting the same value in every row for the Date Time calculated member. BTW, I'm using this member because I haven't found any way to query a named set 'on columns'.
The query looks like this:
with
set [CurrentDates] as filter([Time].[Date].Members, not isempty([Measures].[Net Sold Units]))
member [Measures].[Date Time] as [CurrentDates].Item(1).Member_Key
select
{
[Measures].[Date Time],
[Measures].[Song Name]
--other calculated members
}
on columns
,
subset
(
{
order
(
except
(
NONEMPTY([Trend Transaction].[Trend Transaction].Members, [Measures].[Net Sold Units]),
[Trend Transaction].[Trend Transaction].[All]
),
[Measures].[Date Time], basc
)
}
,0, 50)
on rows
from Trends
where
(
{[Time].[Date].&[2012-09-01T00:00:00],[Time].[Date].&[2012-09-02T00:00:00]}
)
And the result I'm getting looks like this:
Date Time Song Name
9/1/2012 Where Have You Been
9/1/2012 We Are Young
9/1/2012 Wide Awake (Origionally Performed by Katy Perry) [Karaoke Version]
9/1/2012 Breathing
9/1/2012 So Sophisticated (Originally Performed by Rick Ross) [Karaoke Version]
.
.
.
The dates for the last songs are obviously wrong; they should be 9/2/2012.
I recognize that I'm kind of an SSAS newbie, so there is probably something I'm missing here :)
Is there any way to do what I need?
Thanks in advance!
Your calculated member, [Measures].[Date Time] as [CurrentDates].Item(1).Member_Key, will always refer to the second item in your set (remember it is 0-based).
What you want is the RANK function: together with CURRENTMEMBER it brings back the item's position in the set as the set is rendered. E.g.:
set [CurrentDates] as filter([Time].[Date].Members, not isempty([Measures].[Net Sold Units]))
member MemberRank As 'RANK([Time].[Date].CURRENTMEMBER, CurrentDates)'
member [Measures].[Date Time] as [CurrentDates].Item(MemberRank - 1).UNIQUE_NAME
Well, the solution was finally to define the set and the member like this:
set CurrentDates as filter([Time].[Date].[Date], not isempty([Measures].[Net Sold Units]))
member [Measures].[Date Time] as CDATE(NONEMPTY(CurrentDates, [Measures].[Net Sold Units]).Item(0).Member_Key)
The key is to have the named set in scope so that it can be referenced in the member.

MDX Calculating Time Between Events

I have a Cube which draws its data from 4 fact/dim tables.
FactCaseEvents (EventID,CaseID,TimeID)
DimEvents (EventID, EventName)
DimCases (CaseID,StateID,ClientID)
DimTime (TimeID,FullDate)
Events would be: CaseReceived,CaseOpened,CaseClientContacted,CaseClosed
DimTime holds an entry for every hour.
I would like to write an MDX statement that will get me two columns: "CaseReceivedToCaseOpenedOver5" and "CaseClientContactedToCaseClosedOver5".
CaseReceivedToCaseOpenedOver5 would hold the number of cases with a time difference of over 5 hours between CaseReceived and CaseOpened.
I'm guessing that "CaseReceivedToCaseOpenedOver5" and "CaseClientContactedToCaseClosedOver5" would be calculated members, but I need some help figuring out how to create them.
Thanks in advance.
This looks like a good place to use an accumulating snapshot type fact table and calculate the time it takes to move from one stage of the pipeline to the next in the ETL process.
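As a rough sketch of that idea (event names come from the question; the timestamps and the per-case lookup are invented for illustration, and in practice this comparison would happen in the ETL, not in Python):

```python
from datetime import datetime

# One row per (case, event). An accumulating-snapshot fact table would instead
# keep one row per case, with one timestamp column per pipeline stage.
events = {
    ("case-1", "CaseReceived"): datetime(2023, 1, 1, 8, 0),
    ("case-1", "CaseOpened"):   datetime(2023, 1, 1, 15, 0),  # 7h later
    ("case-2", "CaseReceived"): datetime(2023, 1, 2, 9, 0),
    ("case-2", "CaseOpened"):   datetime(2023, 1, 2, 11, 0),  # 2h later
}

# Count cases where CaseReceived -> CaseOpened took more than 5 hours.
cases = {c for c, _ in events}
over5 = sum(
    1 for c in cases
    if (events[(c, "CaseOpened")] - events[(c, "CaseReceived")]).total_seconds() / 3600 > 5
)
print(over5)  # the CaseReceivedToCaseOpenedOver5 count
```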
Query for AdventureWorks (DateDiff works in MDX):
WITH
MEMBER Measures.NumDays AS
'iif(ISEMPTY(([Delivery Date].[Date].CurrentMember
,[Ship Date].[Date].CurrentMember
,Measures.[Order Count]))
,null
, Datediff("d",[Ship Date].[Date].CurrentMember.Name
,[Delivery Date].[Date].CurrentMember.Name))'
SELECT
NON EMPTY {[Ship Date].[Date].&[63]
:[Ship Date].[Date].&[92]} ON COLUMNS,
NON EMPTY {[Delivery Date].[Date].&[63]
:[Delivery Date].[Date].&[92]}
* {[Measures].[NumDays]
, [Measures].[Order Count]} ON ROWS
FROM [Adventure Works]
Taken from: http://www.mombu.com/microsoft/sql-server-olap/t-can-i-have-datediff-in-mdx-265763.html
If you'll be using this member a lot, create it as a calculated member in the cube, on the Calculations tab if I remember right.