MDX Show all months for a person if any of the months have a value of "Y" in a separate dimension - ssas

I have a funky issue I'm trying to resolve, if you'd be so kind.
Measures: BillableHours
Dimensions:
Personnel (EmployeeId, EmployeeName)
Grouping(EmployeePeriodKey, ActiveFlag)
Period (CalendarYear, CalendarQuarter, CalendarPeriod)
Grouping dimension has flags and calcs that are particular to a person in a period so the PK is the combo of EmployeeId and CalendarPeriod.
Data As Follows:
EmployeeId CalendarPeriod ActiveFlag BillableHours
123 201501 Y 10
123 201502 Y 20
123 201503 N 30
123 201504 Y 40
People are filtering on "Active Flag" = "Y" and missing the "N" row in the results, which is not what is desired. Whatever filter I design needs to be flexible enough that, at an employee level, it can tell whether an employee ever had a value of "Y" in JUST the periods selected by the query.
Scenario 1: user selects employee 123 for periods 201501:201504 and filters hypothetical flag to "Y" - Billable Hours should be 100, not 70.
Scenario 2: user selects employee 123 for periods 201501:201503 and filters hypothetical flag to "Y" - Billable Hours should be 60, not 30.
Scenario 3: user selects employee 123 for period 201503 and filters hypothetical flag to "Y" - Billable Hours should be 0, not 30, since in this selected group of periods this person was not active for any period.
I'm not interested in all siblings, just the ones at a person level. And if a person is not selected, the check still needs to be performed at a person level for the periods filtered on. Suppose they have the following:
ActiveFlag: "Y"
Fiscal Year: 2016
Group BillableHours
IT Consulting 1000
HR Consulting 1500
It would be understood that those total amounts represent the hours for every employee who was active for any part of FY2016, whether all 12 months or only 1. If someone was active the year before but wasn't in 2016, they should not show up, because I only want to interrogate the flags for the periods selected.

Do you want to see Y value when it's non empty? Otherwise N + Y value?
IIF(
[Measures].[BillableHours] > 0,
([Grouping].[ActiveFlag].[All],[Measures].[BillableHours]),
[Measures].[BillableHours]
)

What I needed to accomplish with the above was to have all values for a specific employee evaluated to see if any values for that employee were valid. The key to doing that ended up being EXISTING. Additionally, using NON_EMPTY_BEHAVIOR cut down on evaluation cycles because there was no need to evaluate the rows if they weren't active in a time period. I've posted the MDX below.
CREATE HIDDEN UtilizedFTESummator;
[Measures].[UtilizedFTESummator] = Iif([Measures].[Is Active For Utilization Value] > 0,[Measures].[Period FTE],NULL);
NON_EMPTY_BEHAVIOR([Measures].[UtilizedFTESummator]) = [Measures].[Is Active For Utilization Value];
//only include this measure if the underlying employee has values in their underlying data for active in utilization
CREATE MEMBER CURRENTCUBE.[Measures].[FTE Active Utilization]
AS
SUM
(
EXISTING [Historical Personnel].[Employee Id].[Employee Id],
[Measures].[UtilizedFTESummator]
),VISIBLE=0;
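For anyone wanting to sanity-check the expected numbers outside the cube, here is a minimal Python sketch of the rule from the question (sum an employee's hours over the selected periods only if that employee has at least one "Y" inside those periods). It is not the MDX itself; the function name and the in-memory row layout are hypothetical, and the data is the four sample rows from the question.

```python
# Hypothetical in-memory copy of the sample rows from the question:
# (EmployeeId, CalendarPeriod, ActiveFlag, BillableHours)
ROWS = [
    (123, 201501, "Y", 10),
    (123, 201502, "Y", 20),
    (123, 201503, "N", 30),
    (123, 201504, "Y", 40),
]


def billable_hours_if_ever_active(rows, periods):
    """Sum hours over `periods`, but only for employees that have
    at least one "Y" flag inside those same periods."""
    selected = [r for r in rows if r[1] in periods]
    # Employees with any active period within the selection.
    active_employees = {emp for emp, _, flag, _ in selected if flag == "Y"}
    # Include ALL selected rows (Y and N) for those employees.
    return sum(hours for emp, _, _, hours in selected if emp in active_employees)


scenario_1 = billable_hours_if_ever_active(ROWS, {201501, 201502, 201503, 201504})
scenario_2 = billable_hours_if_ever_active(ROWS, {201501, 201502, 201503})
scenario_3 = billable_hours_if_ever_active(ROWS, {201503})
print(scenario_1, scenario_2, scenario_3)
```

The per-employee set plays roughly the role that iterating over EXISTING employee members plays in the calculated measure: the check is made employee by employee, within the current selection only.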


How to calculate on created fields? Why is the calculation wrong?

I am working on the workforce analysis project. And I did some case when conditional calculations in Google Data Studio. However, when I successfully conducted the creation of the new field, I couldn't do the calculation again based on the fields I created.
Based on my raw data, I generated the start_headcount, new_hires, terminated, end_headcount by applying the Case When conditional calculations. However, I failed in the next step to calculate the Turnover rate and Retention rate.
The formula for Turnover rate is
terms/((start_headcount+end_headcount)/2)
for retention is
end_headcount/start_headcount
However, the result is wrong. Part of my table is as below:
Supervisor sheadcount newhire terms eheadcount turnover Retention
A 1 3 1 3 200% 0%
B 6 2 2 6 200% 500%
C 6 1 3 4 600% 300%
So the result is wrong. The turnover rate for A should be 1/((1+3)/2)=50%; For B should be 2/((6+6)/2)=33.33%.
I don't know why it is going wrong. Can anyone help?
For example, I wrote below for start_headcount for each employee
CASE
WHEN Last Hire Date<'2018-01-01' AND Termination Date>= '2018-01-01'
OR Last Hire Date<'2018-01-01' AND Termination Date IS NULL
THEN 1
ELSE 0
END
which means that an employee who meets the above condition gets 1. The employees are then grouped under a supervisor. I think this might be why the summed turnover rate is wrong: it is calculated on each record and then summed up, rather than on the grouped data.
Most likely you are trying to do both steps within the same query, so newly created fields like start_headcount are not yet visible within the same SELECT statement. Instead, you need to put the first calculation in a subquery, as in the example below:
#standardSQL
SELECT *, terms/((start_headcount+end_headcount)/2) AS turnover
FROM (
<query for your first step>
)
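To illustrate why the ratio has to be taken after aggregation, here is a small Python sketch that applies the two formulas from the question to the per-supervisor totals in the question's table. The dictionary layout and function names are hypothetical; only the numbers and formulas come from the question.

```python
# Per-supervisor totals copied from the question's table:
# supervisor -> (start_headcount, new_hires, terms, end_headcount)
TOTALS = {
    "A": (1, 3, 1, 3),
    "B": (6, 2, 2, 6),
    "C": (6, 1, 3, 4),
}


def turnover(start, terms, end):
    """terms / average headcount, computed on already-aggregated values."""
    return terms / ((start + end) / 2)


def retention(start, end):
    """end_headcount / start_headcount, as defined in the question."""
    return end / start


# supervisor -> (turnover rate, retention rate)
rates = {
    sup: (turnover(s, t, e), retention(s, e))
    for sup, (s, _new, t, e) in TOTALS.items()
}
for sup, (turn, ret) in rates.items():
    print(f"{sup}: turnover={turn:.2%} retention={ret:.2%}")
```

Computed this way, supervisor A comes out at 50% turnover and B at 33.33%, matching the values the question expects.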

How to count employees that have been promoted?

I'm trying to figure out how to come up with a calculation or query to count the number of employees by grade promoted on each pay period.
*count the number of records whose value in grade has increased by pay period.
Sample solution:
Soln:
Year Payroll Period Count
2018 16 2
2019 6 1
2019 10 1
I've tried pivot and queries in access but I think this needs to have an inner join to identify specific employees who got promoted. thanks for the assistance.
Code in Excel that seems to work but needs to be transferred to Access due to the number of records. I think an inner join would make this work. =AND(B2<>B3,C2=C3,D3>D2)
Based on EXCEL, you can derive your solution, assuming that your records are in sequence for columns Year, Payroll, Employee & Grade.
Add another column to determine if there is a grade increase for that particular Payroll Period.
For excel cell reference sake, "Year" is in cell A1
Set formula of 1st cell of this column to false
For the next cell in this new column, set it as such:
=AND(A3=A2,B3<>B2,C3=C2,D3>D2)
The above checks if there is a grade increase for that particular Payroll Period.
The explanation of the formula in sequence is as such: 1. Check if the year is the same (A3=A2), 2. Check if the Payroll Period is different (B3<>B2), 3. Check if the Employee is the same (C3=C2), and finally 4. Check if the grade has increased (D3>D2).
Copy this formula down to the rest of your range.
Next, you can start to pivot.
Add your pivot table from your table/range with the following
Filter Grade Increase to true and also change the values aggregation of Employee from Sum to Count.
You will get the following:
I would rename Count of Employees to make it more meaningful.
One caveat for the above approach is that if the grade was increased at the beginning of the 1st Payroll Period of the year, the increase won't be captured. In that case, you can remove the year check (A3=A2) from the formula.
Edit:
Doing a bit of research, perhaps you can do
select t1.*, (t1.Grade > t2.Grade) as Grade_Increase
from YourTableName t1 left join YourTableName t2 on
t1.Employee = t2.Employee and
(((t1.Year - 2018)*26) + t1.Payroll_Period - 1) =
(((t2.Year - 2018)*26) + t2.Payroll_Period) -- -1 on the t1 side so that t2 is the prior record to compare grades
What the above does is essentially joining the table to itself.
Records that are 'next in sequence' are combined into the same row. And a comparison is done.
This was not verified in Access.
Substitute 2018 with whatever your base year is. I'm using 2018 to calculate the sequence number of the records. Initially I thought of using common table expressions, rank and row_number. But access doesn't seem to support these functions.
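The self-join idea can be tried out in Python before porting it to Access. This is only a sketch under assumptions: the rows below are hypothetical (the question's sample data is not reproduced in the text), and 26 pay periods per year with a base year of 2018 is carried over from the SQL above.

```python
from collections import Counter

# Hypothetical rows: (Year, Payroll_Period, Employee, Grade)
ROWS = [
    (2018, 15, "E1", 3), (2018, 16, "E1", 4),  # E1 promoted in 2018 period 16
    (2018, 15, "E2", 5), (2018, 16, "E2", 5),  # E2 unchanged
    (2018, 26, "E3", 2), (2019, 1, "E3", 3),   # E3 promoted across the year boundary
]
PERIODS_PER_YEAR = 26  # assumption, as in the SQL
BASE_YEAR = 2018       # assumption, as in the SQL


def seq(year, period):
    """Running sequence number of a pay period across years."""
    return (year - BASE_YEAR) * PERIODS_PER_YEAR + period


# Lookup of each employee's grade by sequence number (the "t2" side of the join).
grade_at = {(emp, seq(y, p)): g for y, p, emp, g in ROWS}

# Count employees whose grade is higher than in the immediately prior period.
promotions = Counter()
for y, p, emp, g in ROWS:
    prior = grade_at.get((emp, seq(y, p) - 1))
    if prior is not None and g > prior:
        promotions[(y, p)] += 1

for (y, p), n in sorted(promotions.items()):
    print(y, p, n)
```

Because the comparison uses a running sequence number rather than the year, a promotion in the first pay period of a year is still picked up.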

Calculating the number of new ID numbers per month in powerpivot

My dataset provides a monthly snapshot of customer accounts. Below is a very simplified version:
Date_ID | Acc_ID
------- | -------
20160430| 1
20160430| 2
20160430| 3
20160531| 1
20160531| 2
20160531| 3
20160531| 4
20160531| 5
20160531| 6
20160531| 7
20160630| 4
20160630| 5
20160630| 6
20160630| 7
20160630| 8
Customers can open or close their accounts, and I want to calculate the number of 'new' customers every month. The number of 'exited' customers will also be helpful if this is possible.
So in the above example, I should get the following result:
Month | New Customers
------- | -------
20160430| 3
20160531| 4
20160630| 1
Basically I want to compare distinct account numbers in the selected and previous month, any that exist in the selected month and not previous are new members, any that were there last month and not in the selected are exited.
I've searched but I can't seem to find any similar problems, and I hardly know where to start myself - I've tried using CALCULATE and FILTER along with DATEADD to filter the data to get two months, and then count the unique values. My PowerPivot skills aren't up to scratch to solve this on my own however!
Getting the new users is relatively straightforward - I'd add a calculated column which counts rows for that user in earlier months and if they don't exist then they are a new user:
=IF(CALCULATE(COUNTROWS(data),
FILTER(data, [Acc_ID] = EARLIER([Acc_ID])
&& [Date_ID] < EARLIER([Date_ID]))) = BLANK(),
"new",
"existing")
Once this is in place (assuming the calculated column is named customer_type), you can simply write a measure for new_users:
=CALCULATE(COUNTROWS(data), data[customer_type] = "new")
Getting the cancelled users is a little harder because it means you have to be able to look backwards to the prior month - none of the time intelligence stuff in PowerPivot will work out of the box here as you don't have a true date column.
It's nearly always good practice to have a separate date table in your PowerPivot models and it is a good way to solve this problem - essentially the table should be 1 record per date with a unique key that can be used to create a relationship. Perhaps post back with a few more details.
This is an alternative method to Jacob's which also works. It avoids creating a calculated column, but I actually find the calculated column useful to use as a flag against other measures.
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, LASTDATE('Dates'[Date])
)
) - CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, FIRSTDATE('Dates'[Date]) - 1
)
)
It basically uses the dates table to make a distinct count of all Acc_ID from the beginning of time until the first day of the period of time selected, and subtracts that from the distinct count of all Acc_ID from the beginning of time until the last day of the period of time selected. This is essentially the number of new distinct Acc_ID, although you can't work out which Acc_ID's these are using this method.
I could then calculate 'exited accounts' by taking the previous month's total as 'existing accounts':
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATEADD('Dates'[Date], -1, MONTH)
)
Then adding the 'new accounts', and subtracting the 'total accounts':
=DISTINCTCOUNT('Accounts'[Acc_ID])
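As a cross-check on the expected figures, the month-over-month comparison described in the question can be sketched in Python with plain set differences. This is not DAX and not part of either answer; the function name is hypothetical and the data is the simplified sample from the question.

```python
# Monthly snapshots from the question: Date_ID -> set of Acc_ID present
SNAPSHOTS = {
    20160430: {1, 2, 3},
    20160531: {1, 2, 3, 4, 5, 6, 7},
    20160630: {4, 5, 6, 7, 8},
}


def new_and_exited(snapshots):
    """Return Date_ID -> (new_count, exited_count) versus the prior snapshot."""
    result = {}
    previous = set()  # nothing exists before the first snapshot
    for date_id in sorted(snapshots):
        current = snapshots[date_id]
        new_count = len(current - previous)     # in this month, not in the last
        exited_count = len(previous - current)  # in the last month, not in this
        result[date_id] = (new_count, exited_count)
        previous = current
    return result


changes = new_and_exited(SNAPSHOTS)
for date_id, (new_count, exited_count) in changes.items():
    print(date_id, new_count, exited_count)
```

The new-customer counts come out as 3, 4 and 1, matching the table in the question; accounts 1, 2 and 3 show up as the three exits in 20160630.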

Iterate through table by date column for each common value of different column

Below I have the following table structure:
CREATE TABLE StandardTable
(
RecordId varchar(50),
Balance float,
Payment float,
ProcDate date,
RecordIdCreationDate date,
-- multiple other columns used for calculations
)
And here is what a small sample of what my data might look like:
RecordId Balance Payment ProcDate RecordIdCreationDate
1 1000 100 2005-01-01 2005-01-01
2 5000 250 2008-01-01 2008-01-01
3 7500 350 2006-06-01 2006-06-01
1 900 100 2005-02-01 NULL
2 4750 250 2008-02-01 NULL
3 7150 350 2006-07-01 NULL
The table holds data on a transactional basis and has millions of rows in it. The ProcDate field indicates the month that each transaction is being processed. Regardless of when the transaction occurs throughout the month, the ProcDate field is hard coded to the first of the month that the transaction happened in. So if a transaction occurred on 2009-01-17, the ProcDate field would be 2009-01-01. I'm dealing with historical data, and it goes back to as early as 2005-01-01. There are multiple instances of each RecordId in the table. A RecordId will show up in each month until the Balance column reaches 0. Some RecordId's originate in the month the data starts (where ProcDate is 2005-01-01) and others don't originate until a later date. The RecordIdCreationDate field represents the date on which the RecordId was originated. So that column has millions of NULL values in the table, because it is NULL for every month other than the one in which the RecordId originated.
I need to somehow look at each RecordId, and run a number of different calculations on a month to month basis. What I mean is I have to compare column values for each RecordId where the ProcDate might be something like 2008-01-01, and compare those values to the same column values where the ProcDate would be 2008-02-01. Then after I run my calculations for the RecordId in that month, I have to compare values from 2008-02-01 to values in 2008-03-01 and run my calculations again, etc. I'm thinking that I can do this all within one big WHILE loop, but I'm not entirely sure what that would look like.
The first thing I did was create another table in my database that had the same table design as my StandardTable and I called it ProcTable. In that ProcTable, I inserted all of the data where the RecordIdCreationDate was not equal to NULL. This gave me the first instance of each RecordId in the database. I was able to run my calculations for the first month successfully, but where I'm struggling is how I use the column values in the ProcTable, and compare those to the column values where the ProcDate is the month after that. Even if I could somehow do that, I'm not sure how I would repeat that process to compare the 2nd month's data to the 3rd month's data, and the 3rd month's data to the 4th month's data, etc.
Any suggestions? Thanks in advance.
Seems to me, all you need to do is JOIN the table to itself, on this condition
ON MyTable1.RecordId = MyTable2.RecordId
AND MyTable1.ProcDate = DATEADD(month, -1, MyTable2.ProcDate)
Then you will have all the rows in your table (MyTable1), joined to the same RecordId's row from the next month (MyTable2).
And in each row you can do whatever calculations you want between the two joined tables.
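To show what the joined rows look like, here is a minimal Python sketch of the same pairing: each row is matched to the row for the same RecordId one month later, and a calculation is done across the pair. The balance difference is just a stand-in for whatever calculation is needed, and the variable names are hypothetical; the data is the sample from the question.

```python
from datetime import date

# Sample rows from the question: (RecordId, Balance, Payment, ProcDate)
ROWS = [
    ("1", 1000.0, 100.0, date(2005, 1, 1)),
    ("2", 5000.0, 250.0, date(2008, 1, 1)),
    ("3", 7500.0, 350.0, date(2006, 6, 1)),
    ("1", 900.0, 100.0, date(2005, 2, 1)),
    ("2", 4750.0, 250.0, date(2008, 2, 1)),
    ("3", 7150.0, 350.0, date(2006, 7, 1)),
]


def next_month(d):
    """First day of the month after d (d is always the 1st of a month here)."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


# Index rows by (RecordId, ProcDate) so the "join" is a dictionary lookup.
by_key = {(rid, proc): (bal, pay) for rid, bal, pay, proc in ROWS}

# For each row with a following month, compute the change in Balance.
balance_change = {}
for (rid, proc), (bal, _pay) in by_key.items():
    following = by_key.get((rid, next_month(proc)))
    if following is not None:
        balance_change[(rid, proc)] = following[0] - bal

for key, change in sorted(balance_change.items()):
    print(key, change)
```

Every month-to-month pair falls out of a single pass, so no WHILE loop over months is needed.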

Build a Fact Table to derive measures in SSAS

My goal is to build a fact table which would be used to derive measures in SSAS. The measure I am building is 'average length of employment'. The measure will be deployed in a dashboard and the users will have the ability to select a calendar period and drill-down into month, week and days.
This is what the transactional data looks like :
DeptID EmployeeID StartDate EndDate
--------------------------------------------
001 123 20100101 20120101
001 124 20100505 20130101
What fields should my Fact Table have? on what fields should I be doing the aggregation? How about averaging it? Any kind of help is appreciated.
Whenever you design a fact table, the first set of questions to ask yourself is:
What is the business process you're analysing?
What are relevant facts?
What are the dimensions you'd like to analyse the facts by?
What does the lowest (least aggregated) level of detail in the fact table represent, i.e. what is the grain of the fact table?
The process seems to be Human Resources (HR).
You already know the fact, length of employment, which you can calculate easily: EndDate - StartDate. The obvious dimensions are Department, Employee, Date (two role-playing dimensions for Start and End).
In this case, since you're looking for 'average length of employment' as a measure, it seems that the grain should be individual Employees by Department (your transactional data may have the same EmployeeID listed under a different DeptID when an employee has transferred).
Your star schema will then look something like this:
Fact_HR
DeptKey EmployeeKey StartDateKey EndDateKey EmploymentLengthInDays
-------------------------------------------------------------------------
10001 000321 20100101 20120101 730
10001 000421 20100505 20130101 972
Dim_Department
DeptKey DeptID Name ... (other suitable columns)
------------------------- ...
10001 001 Sales ...
Dim_Employee
EmployeeKey EmployeeID FirstName LastName ... (other suitable columns)
---------------------------------------------- ...
000321 123 Alison Smith ...
000421 124 Anakin Skywalker ...
Dim_Date
DateKey DateValue Year Quarter Month Day ... (other suitable columns)
00000000 N/A 0 0 0 0 ...
20100101 2010-01-01 2010 1 1 1 ...
20100102 2010-01-02 2010 1 1 2 ...
... ... ... ... ... ...
(so on for every date you want to represent)
Every column that ends in Key is a surrogate key. The fact you're interested in is EmploymentLengthInDays, you can derive a measure Avg. Employment Length and you would aggregate using the average across all dimensions.
Now you can ask questions like:
Average employment length by department.
Average employment length for employees starting in 2011, or ending in September 2010.
Average employment length for a given employee (across each department he/she worked for).
BONUS: You can also add another measure to your cube that uses the same column, but instead has a SUM aggregator, this may be called Total Employment Length. Across a given employee this will tell you how long the employee worked for the company, but across a department, it will tell you the total man-days that were available to that department. Just an example of how a single fact can become multiple measures.
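As a quick check of the EmploymentLengthInDays values and the two aggregations, here is a small Python sketch over the transactional rows from the question. The function and variable names are hypothetical; in practice this calculation would sit in the ETL that loads the fact table, with the averaging left to the cube.

```python
from datetime import datetime

# Transactional rows from the question: (DeptID, EmployeeID, StartDate, EndDate)
TRANSACTIONS = [
    ("001", 123, "20100101", "20120101"),
    ("001", 124, "20100505", "20130101"),
]


def days_between(start_key, end_key):
    """EndDate - StartDate in days, for yyyymmdd keys."""
    fmt = "%Y%m%d"
    return (datetime.strptime(end_key, fmt) - datetime.strptime(start_key, fmt)).days


# One fact per row: the employment length in days.
lengths = [days_between(s, e) for _dept, _emp, s, e in TRANSACTIONS]

total_length = sum(lengths)               # the SUM measure
avg_length = total_length / len(lengths)  # the average measure

print(lengths, total_length, avg_length)
```

The two lengths come out as 730 and 972 days, the same values shown in the Fact_HR sample rows.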