Counts Based on Multiple Conditions in OLAP MDX in a Single Statement

I am stuck with merging different MDX conditions into one single statement on MDX-SSAS.
Details: I have a cube (Acceptance) with the following data:
Uniq_ID  Acceptance_Type  Responsible_Area
1        Accepted         FrontEnd
2        Denied           BackEnd
3        Accepted         FrontEnd
4        Accepted         FrontEnd
5        Denied           BackEnd
6        Accepted         BackEnd
7        Denied           BackEnd
8        Accepted         FrontEnd
9        Accepted         BackEnd
Logic:
(Count of Uniq_ID where Acceptance_Type = 'Accepted' and Responsible_Area = 'FrontEnd') / Count(ALL)
At this point I have created three calculated members in my SSAS cube to get the acceptance rate:
1. [Count of Accepted]
Here I just take the count of all members where Acceptance_Type = "Accepted".
Code:
([AcceptanceType].[AcceptanceType].[AcceptanceTypeID].&[2],[Measures].[count])
2. [Count of Accepted with Responsible area Frontend]
Here I add one more condition, Responsible_Area = FrontEnd.
Code:
([AcceptanceType].[AcceptanceType].[AcceptanceTypeID].&[2],[Measures].[Count of Accepted])
Note that I am using the measure created in 1.
3. [Acceptance Rate]
Code:
IIF([Measures].[Count] = 0, NULL, [Count of Accepted with Responsible area Frontend] / [Measures].[Count])
Here I'm actually chaining the calculated members.
I want to merge points 1, 2 and 3 into a single statement, and I'm not able to do so. I'm using Microsoft Visual Studio (SSAS).

You can do it in one calc like this. Note I made up the Frontend member reference because you didn't have an example in your question:
DIVIDE(
([AcceptanceType].[AcceptanceType].[AcceptanceTypeID].&[2],[ResponsibleArea].[ResponsibleArea].&[Frontend],[Measures].[count]),
[Measures].[count]
)
Note that DIVIDE is just a shortcut for your IIF([Measures].[Count] = 0, NULL, ...) code: it guards against a zero denominator for you. It should work unless you have an older version of SSAS.
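As a sanity check on what the single calculated member should return, the same ratio can be worked out by hand on the nine sample rows from the question. This is a minimal Python sketch, not MDX; the `rows` list and the `acceptance_rate` helper are illustrative names, and returning `None` for an empty input is an assumption that mirrors the NULL branch of the original IIF guard.

```python
from typing import Optional

# Sample rows from the question: (Uniq_ID, Acceptance_Type, Responsible_Area)
rows = [
    (1, "Accepted", "FrontEnd"),
    (2, "Denied", "BackEnd"),
    (3, "Accepted", "FrontEnd"),
    (4, "Accepted", "FrontEnd"),
    (5, "Denied", "BackEnd"),
    (6, "Accepted", "BackEnd"),
    (7, "Denied", "BackEnd"),
    (8, "Accepted", "FrontEnd"),
    (9, "Accepted", "BackEnd"),
]


def acceptance_rate(data, acceptance_type="Accepted", area="FrontEnd") -> Optional[float]:
    """Rows matching both conditions divided by all rows; None when there are no rows."""
    total = len(data)
    if total == 0:
        return None  # mirrors the NULL branch of the IIF guard
    matching = sum(1 for _, a, r in data if a == acceptance_type and r == area)
    return matching / total


rate = acceptance_rate(rows)
print(rate)
```

For the sample data, four rows (IDs 1, 3, 4 and 8) match both conditions out of nine rows in total, so the calculated member should show 4/9 at the grand total level.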

Related

SSAS MDX Calculated Measure Based on Related Dimension Attribute Value

I have a measure [Measures].[myMeasure] that I would like to create several derivatives of based on the related attribute values.
e.g. if the related [Location].[City].[City].Value = "Austin" then I want the new calculated measure to return the value of [Measures].[myMeasure], otherwise, I want the new calculated measure to return 0.
Also, I need the measure to aggregate correctly meaning sum all of the leaf level values to create a total.
The below works at the leaf level or as long as the current member is set to Austin...
Create Member CurrentCube.[Measures].[NewMeasure] as
iif(
[Location].[City].currentmember = [Location].[City].&[Austin],
[Measures].[myMeasure],
0
);
This has two problems:
1. I don't always have [Location].[City] in context.
2. When multiple cities are selected, this returns 0.
I'm looking for a solution that would work regardless of whether the related dimension is in context and will roll up by summing the atomic values based on a formula similar to above.
To add more context consider a transaction table with an amount field. I want to convert that amount into measures such as payments, deposits, return, etc... based on the related account.

I don't know the full answer, but here are a couple of general pointers:
1. You should use IS rather than = when comparing to a member.
2. You should use NULL rather than 0. The two are effectively the same in the result, but using 0 will slow things down a lot because the calculation will be fired many more times. (This might help with the second part of your question.)
Create Member CurrentCube.[Measures].[NewMeasure] as
iif(
[Location].[City].currentmember IS [Location].[City].&[Austin],
[Measures].[myMeasure],
NULL
);

Slow MDX Custom Distinct Count Formula

I have a question related to creating a (more efficient) custom Distinct Count Measure using MDX.
Background
My cube has several long many to many relationship chains between Facts and Dimensions and it is important for me to be able to track which members in certain Dimensions do and do not relate to other Dimensions. As such, I have created a "Not Related" record in each of my dimension tables and set those records' ID values to -1. Then in my intermediate mapping fact tables I use the -1 ID to connect to these "Not Related" records.
The issue arises when I try to run a normal out-of-the-box distinct count on any field where the -1 members are present. In the case that a -1 member exists, the distinct count measure will return a result of 1 more than the true answer.
To solve this issue I have written the following MDX:
CREATE MEMBER CURRENTCUBE.[Measures].[Provider DCount]
AS
//Oddly enough MDX seems to require that the PID (Provider ID) field be different from both the linking field and the user sliceable field.
SUM( [Providers].[PID Used For MDX].Children ,
//Don't count the 'No Related Record' item.
IIF( NOT([Providers].[PID Used For MDX].CURRENTMEMBER IS [Providers].[PID Used For MDX].&[-1])
//For some reason this seems to be necessary to calculate the Unknown Member correctly.
//The "Regular Provider DCount Measure" below is the out-of-the-box, non-MDX measure built off the same field, and is not shown in the final output.
AND [Measures].[Regular Provider DCount Measure] > 0 , 1 , NULL )
),
VISIBLE = 1 , DISPLAY_FOLDER = 'Distinct Count Measures' ;
The Issue
This MDX works and always shows the correct answer (yeah!), but it is EXTREMELY slow when users start pulling Pivot Tables with more than a few hundred cells that use this measure. For fewer than 100 cells, the results are nearly instantaneous. For a few thousand cells (which is not uncommon at all), the results can take up to an hour to resolve (uggghhh!).
Can anyone help show me how to write a more efficient MDX formula to accomplish this task? Your help would be GREATLY appreciated!!
Jon Oakdale

You can use a predefined SCOPE to nullify all the unnecessary (-1) members and then create your measure.
SCOPE ([Providers].[PID Used For MDX].&[-1]
,[Measures].[Regular Provider DCount Measure]);
THIS = NULL;
END SCOPE;
CREATE MEMBER CURRENTCUBE.[Measures].[Provider DCount]
AS
SUM([Providers].[PID Used For MDX].Children
,[Measures].[Regular Provider DCount Measure]),
VISIBLE = 1;
By the way, in my tests I used the [Providers].[PID Used For MDX].[All].Children construction, since I don't know what the dimension, hierarchy and All level are in your case. It looks as if [PID Used For MDX] is the All level, [Providers] is the name of both the dimension and the hierarchy, and HierarchyUniqueName is set to Hide.
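The intent of the measure, counting distinct providers while ignoring the -1 "Not Related" placeholder, can be illustrated outside the cube. The following is a minimal Python sketch on invented provider IDs; `NOT_RELATED`, `distinct_providers` and `cell_rows` are hypothetical names, and nothing here reflects how SSAS evaluates the SCOPE assignment internally.

```python
NOT_RELATED = -1  # placeholder ID used for the "Not Related" record


def distinct_providers(provider_ids):
    """Count distinct IDs, skipping the placeholder."""
    return len({pid for pid in provider_ids if pid != NOT_RELATED})


# Invented fact rows for one pivot cell: three real providers plus the placeholder
cell_rows = [101, 102, 101, -1, 103, -1]

naive = len(set(cell_rows))                # counts the placeholder as a provider
corrected = distinct_providers(cell_rows)  # ignores it
print(naive, corrected)
```

The naive count is one higher than the corrected count whenever the placeholder is present, which is the off-by-one described in the question.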

How to report MS Access data, calculated columns with group by

I have an Access 2003 database which records fault call help requests in a medium-sized organisation of around 200 users. Calls are logged (and appended into the database) via a Classic ASP page, and a team of systems administrators use a separate Classic ASP web page to view calls, provide a response, etc.
All calls are recorded in one table called tblFaultCall; its structure is below:
tblFaultCall
ID : Autonumber
strName
strPhone
dtmDateOpen : Date/Time (date call logged)
dtmDateClosed : Date/Time (date call closed)
dtmTime : Date/Time (time call logged)
strStatus (always 'Open', 'Pending' or 'Closed')
strCategory (always one of 10 categories, held as a list in tblCatgory, and used in lookup lists in the ASP web page)
strFaultDesc
strResolution
strCallOwner
dtmDatePending : Date/Time (date call set to pending, if it ever was)
For management, I need a way of easily creating a quarterly report that looks like this:
Calls received between dd/mm/yyyy and dd/mm/yyyy
----
Category    Calls received  Of which 'Closed'  Closed within 5 days  Closed within 14 days  Open  Pending
Category X  1052            950                700                   200                    50    50
Category Y  65              60                 50                    5                      0     5
I need an easy way to do this. I need the manager to be able to insert the dates he wants, and then click a button and it all comes up. I cannot work out how to create one query which gives all of this. It's easy to give just the categories and number of Open calls, but then can't work out how to add a further column to show number of Closed calls, or the number closed within x days, etc. I can create individual queries for the harder columns, but not get it all together.
So, options are
Classic ASP - I think would involve a lot of individual SQLs for the calculated fields
Access Report ?
Some kind of export to Excel?
VBA in Excel to link back to prepared queries in Access?
Any advice would be appreciated.

You should be able to get that data in one query. Try this one:
SELECT AllCalls.strCategory, CallsReceived, CallsClosed, ClosedWithin5Days, ClosedWithin14days, CallsOpen, CallsPending
FROM
((
SELECT strCategory,
Count(ID) AS CallsReceived,
Sum(IIF(strStatus='Closed',1,0)) AS CallsClosed,
Sum(IIF(strStatus='Open',1,0)) AS CallsOpen,
Sum(IIF(strStatus='Pending',1,0)) AS CallsPending
FROM tblFaultCall
WHERE dtmDateOpen BETWEEN #6/1/2014# and #6/30/2014#
GROUP BY strCategory
) AS AllCalls
LEFT JOIN
(
SELECT strCategory,
Count(ID) AS ClosedWithin5Days
FROM tblFaultCall
WHERE DateDiff("d", dtmDateOpen, dtmDateClosed) <=5
AND dtmDateOpen BETWEEN #6/1/2014# and #6/30/2014#
GROUP BY strCategory
) AS FiveDay ON AllCalls.strCategory=FiveDay.strCategory)
LEFT JOIN
(
SELECT strCategory,
Count(ID) AS ClosedWithin14Days
FROM tblFaultCall
WHERE DateDiff("d", dtmDateOpen, dtmDateClosed) between 6 and 14
AND dtmDateOpen BETWEEN #6/1/2014# and #6/30/2014#
GROUP BY strCategory
) AS FourteenDay ON AllCalls.strCategory=FourteenDay.strCategory
The classic ASP part should be very similar to your other pages: query the database, loop through the resulting data, output it to the screen. You would use the same approach if you were generating a spreadsheet too.

Each column can be calculated, mostly with IIF expressions:
Total calls = count(calls)
Closed calls = sum(iif(<call is closed>,1,0)) (however you define <call is closed>)
Closed in 5 days = sum(iif(<call is closed in 5 days>,1,0))
and so on
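The conditional-sum idea can be tried end to end in a small script. This is a sketch only, with several assumptions: it uses SQLite through Python's `sqlite3` module rather than Access, so `CASE WHEN` stands in for `IIF` and `julianday()` differences stand in for `DateDiff("d", ...)`; the sample rows are invented; and the 14-day column is treated as the 6 to 14 day bucket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tblFaultCall (
           ID INTEGER PRIMARY KEY,
           strCategory TEXT,
           strStatus TEXT,
           dtmDateOpen TEXT,
           dtmDateClosed TEXT)"""
)
# Invented sample calls (ISO dates so SQLite can compare and subtract them)
conn.executemany(
    "INSERT INTO tblFaultCall (strCategory, strStatus, dtmDateOpen, dtmDateClosed)"
    " VALUES (?, ?, ?, ?)",
    [
        ("Email", "Closed", "2014-06-02", "2014-06-04"),    # closed after 2 days
        ("Email", "Closed", "2014-06-03", "2014-06-13"),    # closed after 10 days
        ("Email", "Open", "2014-06-10", None),
        ("Printer", "Pending", "2014-06-05", None),
        ("Printer", "Closed", "2014-06-20", "2014-06-25"),  # closed after 5 days
        ("Printer", "Closed", "2014-05-30", "2014-06-01"),  # opened outside the range
    ],
)

# One pass over the table: every report column is a conditional sum
query = """
SELECT strCategory,
       COUNT(*) AS CallsReceived,
       SUM(CASE WHEN strStatus = 'Closed' THEN 1 ELSE 0 END) AS CallsClosed,
       SUM(CASE WHEN strStatus = 'Closed'
                 AND julianday(dtmDateClosed) - julianday(dtmDateOpen) <= 5
                THEN 1 ELSE 0 END) AS ClosedWithin5Days,
       SUM(CASE WHEN strStatus = 'Closed'
                 AND julianday(dtmDateClosed) - julianday(dtmDateOpen) BETWEEN 6 AND 14
                THEN 1 ELSE 0 END) AS ClosedWithin14Days,
       SUM(CASE WHEN strStatus = 'Open' THEN 1 ELSE 0 END) AS CallsOpen,
       SUM(CASE WHEN strStatus = 'Pending' THEN 1 ELSE 0 END) AS CallsPending
FROM tblFaultCall
WHERE dtmDateOpen BETWEEN ? AND ?
GROUP BY strCategory
ORDER BY strCategory
"""
report = conn.execute(query, ("2014-06-01", "2014-06-30")).fetchall()
for row in report:
    print(row)
```

Because every column is computed in the same grouped pass, no joins between sub-queries are needed, and the two date parameters are the only values the manager has to supply.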

T-SQL query for SQL Server 2008 : how to query X # of rows where X is a value in a query while matching on another column

Summary:
I have a list of work items that I am attempting to assign to a list of workers. Each worker is allowed a maximum of 100 assigned work items. Each work item specifies the user that should work it (associated as an owner).
For example:
Jim works a total of 5 accounts, each with multiple work items. In total Jim already has 50 items assigned to him, so I am allowed to assign only 50 more.
My plight/goal:
I am using a temp table and a select statement to get the number of items each owner currently has assigned to them; I calculate the available slots for new items and store the values in a new column. I need to be able to select from the items table where the owner matches my list of owners (in the temp table), retrieving for each user only as many rows as that user has available slots. For example, the query would return only 50 rows for Jim even though there may be 200 matching the criteria, while Sam may get 0 rows because he has no available slots even though there are 30 items for him to work in the items table.
I realize I may be approaching this problem wrong. I want to avoid using a cursor.
Edit: Adding some example code
SELECT
nUserID_Owner
, CASE
WHEN COUNT(c.nWorkID) >= 100 THEN 0
ELSE 100 - COUNT(c.nWorkID)
END
,COUNT(c.nWorkID)
FROM tblAccounts cic
LEFT JOIN tblWorkItems c
ON c.sAccountNumber = cic.sAccountNumber
AND c.nUserID_WorkAssignedTo = cic.nUserID_Owner
AND c.nTeamID_WorkAssignedTo = cic.nTeamID_Owner
WHERE cic.nUserID_Collector IS NOT NULL
AND nUserID_CurrentOwner = 5288
AND c.bCompleted = 0
GROUP BY nUserID_Owner
This provides output values of 5288, 50, 50 (in Jim's scenario).

It took longer than I wanted it to but I found a solution.
I did use a sub-query as suggested above to produce the work items with a unique row count by user.
I used PARTITION BY to produce a unique row count for each worker and included in my HAVING clause that the row number must be < the count of available slots. I'd post the code, but it's beyond the character limit and I'd also have a lot of things to change to anonymize the system properly.
Originally I was approaching the problem incorrectly focusing on limiting the results rather than thinking about creating the necessary data to relate the result sets.
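Since the actual query was not posted, here is a minimal Python sketch of the idea described above: number each owner's candidate items and keep only those whose number fits within that owner's free slots. The owner names, slot counts and the `pick_items` helper are all hypothetical. With numbering that starts at 1, the comparison in this sketch is `<=` so that exactly the available number of rows is returned.

```python
from collections import defaultdict

# Hypothetical inputs: free slots per owner and unassigned items as (item_id, owner)
available_slots = {"jim": 2, "sam": 0, "ann": 5}
items = [
    (1, "jim"), (2, "jim"), (3, "jim"),
    (4, "sam"),
    (5, "ann"), (6, "ann"),
]


def pick_items(items, available_slots):
    """Number each owner's items from 1 and keep those whose number fits in the free slots."""
    row_number = defaultdict(int)
    picked = []
    for item_id, owner in items:
        row_number[owner] += 1  # same role as ROW_NUMBER() OVER (PARTITION BY owner)
        if row_number[owner] <= available_slots.get(owner, 0):
            picked.append((item_id, owner))
    return picked


picked = pick_items(items, available_slots)
print(picked)
```

Here the owner with two free slots gets two of three candidate items, the owner with no free slots gets none, and the owner with spare capacity gets everything available.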

Group by run when there is no run number in data (was Show how changing the length of a production run affects time-to-build)

It would seem that there is a much simpler way to state the problem. Please see Edit 2, following the sample table.
I have a number of different products on a production line. I have the date that each product entered production. Each product has two identifiers: item number and serial number. I have the total number of labour hours for each product by item number and by serial number (i.e. I can tell you how many hours went into each object that was manufactured and what the average build time is for each kind of object).
I want to determine how (if) varying the length of production runs affects the average time it takes to build a product (item number). A production run is the sequential production of multiple serial numbers for a single item number. We have historical records going back several years with production runs varying in length from 1 to 30.
I think to achieve this, I need to be able to assign 'run id'. To me, that means building a query that sorts by start date and calculates a new unique value at each change in item number. If I knew how to do that, I could solve the rest of the problem on my own.
So that suggests a series of related questions:
Am I thinking about this the right way?
If I am on the right track, how do I generate those run id values? Calculate and store is an option, although I have a (misguided?) preference for direct queries. I know exactly how I would generate the run numbers in Excel, but I have a (misguided?) preference to do this in the database.
If I'm not on the right track, where might I find that track? :)
Edit:
Table structure (simplified) with sample data:
AutoID Item Serial StartDate Hours RunID (proposed calculation)
1 Legend 1234 2010-06-06 10 1
3 Legend 1235 2010-06-07 9 1
2 Legend 1237 2010-06-08 8 1
4 Apex 1236 2010-06-09 12 2
5 Apex 1240 2010-06-10 11 2
6 Legend 1239 2010-06-11 10 3
7 Legend 1238 2010-06-12 8 3
I have shown that start date, serial, and autoID are mutually unrelated. I have shown the expectation that labour goes down as the run length increases (but this is a 'fact' only via received wisdom, not data analysis). I have shown what I envision as the heart of the solution, that being a RunID that reflects sequential builds of a single item. I know that if I could get that runID, I could group by run to get counts, averages, totals, max, min, etc. In addition, I could do something like hours/ to get percentage change from the start of the run. At that point I could graph the trends associated with different run lengths either globally across all items or on a per item basis. (At least I think I could do all that. I might have to muck about a bit, but I think I could get it done.)
Edit 2: This problem would appear to be: how do I get the 'starting' member (earliest start date) of each run when I don't already have a runID? (The runID shown in the sample table does not exist and I was originally suggesting that being able to calculate runID was a potentially viable solution.)
AutoID Item
1 Legend
4 Apex
6 Legend
I'm assuming that having learned how to find the first member of each run that I would then be able to use what I've learned to find the last member of each run and then use those two results to get all other members of each run.
Edit 3: my version of a query that uses the AutoID of the first item in a run as the RunID for all units in a run. This was built entirely from samples and direction provided by Simon, who has the accepted answer. Using this as the basis for grouping by run, I can produce a variety of run statistics.
SELECT first_product_of_run.AutoID AS runID, run_sibling.AutoID AS itemID, run_sibling.Item, run_sibling.Serial, run_sibling.StartDate, run_sibling.Hours
FROM (SELECT first_of_run.AutoID, first_of_run.Item, first_of_run.Serial, first_of_run.StartDate, first_of_run.Hours
FROM dbo.production AS first_of_run LEFT OUTER JOIN
dbo.production AS earlier_in_run ON first_of_run.AutoID - 1 = earlier_in_run.AutoID AND
first_of_run.Item = earlier_in_run.Item
WHERE (earlier_in_run.AutoID IS NULL)) AS first_product_of_run LEFT OUTER JOIN
dbo.production AS run_sibling ON first_product_of_run.Item = run_sibling.Item AND first_product_of_run.AutoID run_sibling.AutoID AND
first_product_of_run.StartDate product_between.Item AND
first_product_of_run.StartDate

Could you describe your table structure some more? If the "date that each product entered production" is a full time stamp, or if there is a sequential identifier across products, you can write queries to identify the first and last products of a run. From that, you can assign IDs to or calculate the length of the runs.
Edit:
Once you've identified 1,4, and 6 as the start of a run, you can use this query to find the other IDs in the run:
select first_product_of_run.AutoID, run_sibling.AutoID
from first_product_of_run
left join production run_sibling on first_product_of_run.Item = run_sibling.Item
and first_product_of_run.AutoID <> run_sibling.AutoID
and first_product_of_run.StartDate < run_sibling.StartDate
left join production product_between on first_product_of_run.Item <> product_between.Item
and first_product_of_run.StartDate < product_between.StartDate
and product_between.StartDate < run_sibling.StartDate
where product_between.AutoID is null
first_product_of_run can be a temp table, table variable, or sub-query that you used to find the start of a run. The key is the where product_between.AutoID is null. That restricts the results to only pairs where no different items were produced between them.
Edit 2, here's how to get the first of each run:
select first_of_run.AutoID
from
(
select product.AutoID, product.Item, MAX(previous_product.StartDate) as PreviousDate
from production product
left join production previous_product on product.AutoID <> previous_product.AutoID
and product.StartDate > previous_product.StartDate
group by product.AutoID, product.Item
) first_of_run
left join production earlier_in_run
on first_of_run.PreviousDate = earlier_in_run.StartDate
and first_of_run.Item = earlier_in_run.Item
where earlier_in_run.AutoID is null
It's not pretty, and will break if StartDate is not unique. The query could be simplified by adding a sequential and unique identifier with no gaps. In fact, that step will probably be necessary if StartDate is not unique. Here's how it would look:
select first_of_run.AutoID
from production first_of_run
left join production earlier_in_run
on (first_of_run.Sequence - 1) = earlier_in_run.Sequence
and first_of_run.Item = earlier_in_run.Item
where earlier_in_run.AutoID is null
Using outer joins to find where things aren't still twists my brain, but it's a very powerful technique.
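For comparison, the run-assignment logic is easy to check outside the database. The sketch below is Python rather than SQL and is not the accepted query: it sorts the sample rows from the question by StartDate and uses `itertools.groupby` to split them into consecutive groups with the same Item. It assumes StartDate is unique, as the SQL above does; the dictionary keys are illustrative names.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (AutoID, Item, Serial, StartDate, Hours)
production = [
    (1, "Legend", 1234, "2010-06-06", 10),
    (3, "Legend", 1235, "2010-06-07", 9),
    (2, "Legend", 1237, "2010-06-08", 8),
    (4, "Apex", 1236, "2010-06-09", 12),
    (5, "Apex", 1240, "2010-06-10", 11),
    (6, "Legend", 1239, "2010-06-11", 10),
    (7, "Legend", 1238, "2010-06-12", 8),
]

# Order by StartDate, then split into consecutive groups that share the same Item
ordered = sorted(production, key=itemgetter(3))
runs = []
for run_id, (item, members) in enumerate(groupby(ordered, key=itemgetter(1)), start=1):
    members = list(members)
    hours = [m[4] for m in members]
    runs.append({
        "run_id": run_id,
        "item": item,
        "first_auto_id": members[0][0],
        "length": len(members),
        "avg_hours": sum(hours) / len(hours),
    })

for run in runs:
    print(run)
```

The first AutoID of each run comes out as 1, 4 and 6, matching the rows listed under Edit 2 in the question, and the per-run length and average hours are the kind of statistics the question wants to group by.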
