Calculating totals per member in an MDX calculated member - mdx

I'm new to MDX and trying to create a calculation that will provide the total of a member. I have tried the Root function, but it gives me the grand total for all the members; I just want the total per member.
This is what I'm trying to achieve:
Team 1 Closed 12 Calls within 1 Day, Team 1 has a total of 30 calls
Team 1 Closed 14 Calls within 2 Days, Team 1 has a total of 30 calls
Team 2 Closed 10 Calls within 1 Day, Team 2 has a total of 15 calls
Team 2 Closed 15 Calls within 2 Days, Team 2 has a total of 15 calls
How can I create a calculated measure in SSAS that stores the total number of calls per team (30 for Team 1 and 15 for Team 2), so that I can use it to calculate the % of calls closed within a number of days? For example:
% Of Calls closed for Team 1 within 1 Day = (12/30)*100 = 40%
% Of Calls closed for Team 1 within 2 Days = (14/30)*100 = 46.67%
% Of Calls closed for Team 2 within 1 Day = (10/15)*100 = 66.67%
% Of Calls closed for Team 2 within 2 Days = (15/15)*100 = 100%
I have tried the Root function, but it gives me the grand total for all the members, i.e. 15 + 30 = 45.
Thanks
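The arithmetic being asked for can be sketched outside of MDX. The Python snippet below only illustrates the intended calculation and is not an SSAS solution: it takes the per-team totals from the question as given (30 and 15) and divides each closed-call count by its own team's total rather than by the grand total. The dictionary names are hypothetical.

```python
# Per-team totals as stated in the question (assumed given, not derived).
team_totals = {"Team 1": 30, "Team 2": 15}

# (team, days) -> calls closed within that many days, from the question.
closed = {
    ("Team 1", 1): 12,
    ("Team 1", 2): 14,
    ("Team 2", 1): 10,
    ("Team 2", 2): 15,
}

# Divide by the team's own total, not by the grand total of 45.
pct_closed = {
    key: round(count / team_totals[key[0]] * 100, 2)
    for key, count in closed.items()
}

for (team, days), pct in pct_closed.items():
    print(f"{team}, within {days} day(s): {pct}%")
```

In MDX terms, the denominator would need to keep the current team in context while ignoring the "days" breakdown, which is the part the Root function was discarding.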

Related

How to calculate monthly counts per season using a dataframe in pandas

I need to calculate the monthly average count per season for the dataset given below:
season  months  daily counts
1       2       280
1       3       360
2       1       290
3       2       750
3       4       360
I tried the code below, but the counts are daily for each month, so I couldn't find the average monthly counts:
import pandas as pd
dataseason = pd.read_csv(path, usecols=['season', 'mnth', 'cnt'])
dataseason['col5'] = dataseason.groupby(dataseason['season'].ne(dataseason['season'].shift()).cumsum())['cnt'].transform('mean')
print(dataseason.drop_duplicates(subset='col5'))
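One possible reading of "monthly average count per season" is: first total the daily counts within each (season, month) pair, then average those monthly totals within each season. The sketch below follows that interpretation, which is an assumption and not something the question confirms. It uses the column names from the attempted code (`season`, `mnth`, `cnt`) and the five sample rows shown above.

```python
import pandas as pd

# Sample rows from the question; column names follow the attempted code.
df = pd.DataFrame({
    "season": [1, 1, 2, 3, 3],
    "mnth":   [2, 3, 1, 2, 4],
    "cnt":    [280, 360, 290, 750, 360],
})

# Step 1: collapse daily rows into one total per (season, month).
monthly = df.groupby(["season", "mnth"], as_index=False)["cnt"].sum()

# Step 2: average the monthly totals within each season.
season_avg = monthly.groupby("season")["cnt"].mean()

print(season_avg)
```

With real data containing many daily rows per month, step 1 is what turns daily counts into monthly ones before any averaging happens.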

Ensuring years and months are running as part of data cleaning

I have 2 datasets:
rainfall per month (mm) from 1982-01 to 2022-08
no. of rainy days per month per year from 1982-01 to 2022-08.
month no_of_rainy_days
0 1982-01 10
1 1982-02 5
2 1982-03 11
3 1982-04 14
4 1982-05 10
month total_rainfall
0 1982-01 107.1
1 1982-02 27.8
2 1982-03 160.8
3 1982-04 157.0
4 1982-05 102.2
Qn 1: As part of ensuring data integrity, how do I ensure that the dates run consecutively, i.e. that 1982-01 is followed by 1982-02 and does not skip to 1982-03?
I am unsure how to perform the check and have searched online. Is it common practice to assume that the years and months run consecutively?
First, separate the year from the month.
df.rename(columns={"month": "ym"}, inplace=True)
df[["year", "month"]] = df["ym"].astype(str).str.split("-", expand=True)
Then you can group the dataframe by year and count the number of observations per year (counts number of rows per year).
observations_per_year = df["year"]\
    .groupby(df["year"])\
    .agg("count")\
    .reset_index(name="observations")
observations_per_year[observations_per_year["observations"] < 12]
If any years have fewer than 12 observations, they will be displayed like so:
year observations
0 1982 11
4 1986 11
5 1987 11
6 1988 10
11 1993 11
Given the lack of detail and no sample data provided, I made some assumptions about your data:
Each data set will not have more than one row for any month of the year (i.e., a maximum of 12 rows/observations per year).
Each dataframe contains a single observation per row, as shown in your examples (so you would do this for each dataframe prior to merging them). As such, counting rows per year-month is an accurate means of counting the number of observations for any given month.
The sorted order of the data is irrelevant (you can later sort by year-month if needed).
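As a complementary sketch, the missing months themselves can be listed by comparing the observed months against a complete monthly range. This assumes the `month` column holds `YYYY-MM` strings as in the samples. The small dataframe here is made up for illustration, with 1982-03 deliberately left out.

```python
import pandas as pd

# Illustrative data only: 1982-03 is intentionally missing.
df = pd.DataFrame({
    "month": ["1982-01", "1982-02", "1982-04", "1982-05"],
    "no_of_rainy_days": [10, 5, 14, 10],
})

# Parse the strings as monthly periods.
months = pd.PeriodIndex(df["month"], freq="M")

# Every month between the first and last observation, inclusive.
expected = pd.period_range(months.min(), months.max(), freq="M")

# Months that should be present but are not.
missing = expected.difference(months)
missing_labels = [str(p) for p in missing]

print(missing_labels)
print("duplicates present:", not months.is_unique)
```

An empty `missing_labels` list together with no duplicates would indicate that the months run consecutively over the observed range.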

SQL : How to create a row per period value

I am currently trying to upskill myself in data analysis using tools such as SQL, Excel, etc. So apologies if what I am asking does not make much sense; I'm happy to expand or clarify where required.
Problem :
I am trying to create a period-by-period line graph showing pay across periods. However, in my current dataset the rows are employee codes and the columns are the individual periods, with each cell holding the value for that period.
To meet the requirements for my line graph, I would need to perform a pivot of some sort to create a row per period for each worker. This would then allow me to group by period for my line graph.
Current dataset :
Code  Name       Period 1  Period 2  Period 3
P1    Worker 1   2740.67   0         0
2     Worker 2   0         0         0
3     Worker 3   0         759.85    607.88
4     Worker 4   0         0         0
5     Worker 5   5000      5000      5000
6     Worker 6   1762.5    1672.5    960
12    Worker 7   6050      7750      5000
7     Worker 8   625.38    748.46    10
1234  Worker 9   2616.67   2616.67   2616.67
8     Worker 10  500       200       0
144   Worker 11  0         0         0
M100  Worker 12  423.08    0         0
M01   Worker 13  1583.33   1583.33   1583.33
M102  Worker 14  5833.33   5833.33   0
2403  Worker 15  8333.33   8333.33   11269.23
So Worker 5, for example, should have 3 rows. The only things I can think of are subqueries per worker that make up the columns, or multiple unions, but that seems rather time-consuming. I was hoping for a quicker, more efficient way of achieving what I need.
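For a fixed, small number of period columns, a `UNION ALL` with one branch per period (not per worker) is a common way to get one row per worker per period. The sketch below runs that query against an in-memory SQLite database from Python. The table name `pay` and the column names `period_1` to `period_3` are hypothetical, and only two workers are loaded for brevity. Some databases also offer dedicated unpivot syntax, which is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names standing in for the wide dataset.
conn.execute(
    "CREATE TABLE pay (code TEXT, name TEXT, "
    "period_1 REAL, period_2 REAL, period_3 REAL)"
)
conn.executemany(
    "INSERT INTO pay VALUES (?, ?, ?, ?, ?)",
    [("5", "Worker 5", 5000, 5000, 5000),
     ("6", "Worker 6", 1762.5, 1672.5, 960)],
)

# One SELECT per period column; each contributes a row per worker.
long_rows = conn.execute("""
    SELECT code, name, 1 AS period, period_1 AS pay FROM pay
    UNION ALL
    SELECT code, name, 2, period_2 FROM pay
    UNION ALL
    SELECT code, name, 3, period_3 FROM pay
    ORDER BY code, period
""").fetchall()

for row in long_rows:
    print(row)
```

The number of branches grows with the number of periods, not with the number of workers, so the query stays the same size as more employees are added.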

How to run a loop on a query that gives the sum of time remaining on tickets so that we get time remaining of individual tickets?

I have a table consisting of Entity_Id, Date_of_Modification, Previous_State, and New_State for tickets we are working on.
Entity_Id  Date_of_Modification  Previous_State   New_State        Time Difference (Days)
1          3/18/2020             Internal Review  Done             0
1          3/18/2020             Open             Internal Review  0
2          6/25/2020             Internal Review  Done             1
2          6/24/2020             Done             Internal Review  0
2          6/21/2020             Testing          Done             3
2          6/18/2020             In Dev           Testing          3
2          4/30/2020             Planned          In Dev           49
2          3/21/2020             Open             Planned          0
3          3/31/2020             Internal Review  Internal Review  6
3          3/25/2020             Analyzing        Internal Review  5
3          3/20/2020             Analyzing        Analyzing        1
3          3/10/2020             Open             Analyzing        0
4          3/25/2020             Internal Review  Done             2
4          3/23/2020             Internal Review  Internal Review  0
4          3/23/2020             Open             Internal Review  5
4          3/18/2020             Open             Open             32
4          3/18/2020             Done             Open             0
4          2/14/2020             Done             Done             17
4          2/14/2020             Internal Review  Done             0
4          1/28/2020             Internal Review  Internal Review  2
4          1/28/2020             Open             Internal Review  0
I have figured out the query for calculating the total amount of time a ticket has already spent.
I have also worked out the time a ticket spends in the 'Internal Review' state, because we want the time spent outside this state, and have written a query to calculate the remaining time.
-------query to find total time remaining for a ticket apart from internal review---------
SELECT M.TotalTime - N.IRTotalTime AS RemainingHours
FROM
    ----------query to find total time spent on a ticket---------
    (SELECT SUM(B.Diff) AS TotalTime
     FROM
        (SELECT
            A.Modification_Id,
            A.Date_of_Modification,
            A.Previous_State,
            A.State AS NewState,
            DATEDIFF(DAY, LAG(Date_of_Modification) OVER (ORDER BY Date_of_Modification), Date_of_Modification)
                AS Diff
         FROM
            (SELECT
                Modification_Id,
                Date_of_Modification,
                Previous_State,
                State
             FROM Book2
            ) AS A)
        AS B) AS M
    ,
    ----------query to find total time spent on internal review---------
    (SELECT SUM(B.Diff) AS IRTotalTime
     FROM
        (SELECT
            A.Modification_Id,
            A.Date_of_Modification,
            A.Previous_State,
            A.State AS NewState,
            DATEDIFF(DAY, LAG(Date_of_Modification) OVER (ORDER BY Date_of_Modification), Date_of_Modification) AS Diff
         FROM
            (SELECT
                Modification_Id,
                Date_of_Modification,
                Previous_State,
                State
             FROM Book2
             WHERE Previous_State = 'Internal Review' AND State <> 'Internal Review'
             UNION
             SELECT
                Modification_Id,
                Date_of_Modification,
                Previous_State,
                State
             FROM Book2
             WHERE Previous_State = 'Internal Review' AND State = 'Internal Review'
            ) AS A
        ) AS B
     WHERE B.Previous_State = 'Internal Review' AND B.NewState <> 'Internal Review') AS N
But for some reason this query only works when I specify the ticket number (i.e. Entity_Id). It does not work when I run it over the entire table. So I thought we could use a loop to get the total remaining time of individual tickets.
However, I am having difficulty running that query through a loop and getting the Entity_Id displayed for each ticket's calculation.
When I run the query I get the value 55, which might be the total remaining time. But I want the total remaining time for individual tickets, like this:
Entity_Id  Remaining Time (Days)
1          NULL
2          95
3          11
4          20
Thank you
Update:
I used PARTITION BY Entity_Id, got the required total time and Internal Review time for individual tickets, and saved the results in separate tables. I now need to subtract the time in the second table from the time in the first. Some rows in both tables have a NULL value in the time-spent column.
Table A (Total time spent):
Entity_Id  Remaining Time (Days)
1          NULL
2          96
3          21
4          21
Table B (Time spent in Internal Review):
Entity_Id  Remaining Time (Days)
2          1
3          15
4          5
Thanks
Update:
I have figured out the query for it. Thank you all for your suggestions.
If the question regarding the Internal Review state was unclear, here is a breakdown of what I require from this query for a particular ticket:
Total: sum of time diff = 58 days
Internal Review State: 19 days
Final result: 39 days
@JonArmstrong
Try adding a PARTITION BY in your lag functions, like so:
DATEDIFF(DAY, LAG(Date_of_Modification) OVER (PARTITION BY Entity_Id ORDER BY Date_of_Modification), Date_of_Modification)
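The effect of the `PARTITION BY` can be checked with a small sketch. The example below uses Python's built-in `sqlite3` module, so it is an approximation of the original SQL Server-style query: SQLite has no `DATEDIFF`, so a `julianday` subtraction stands in for it, and the window function requires SQLite 3.25 or newer. The table is cut down to two hypothetical columns and loaded with the dates for tickets 2 and 3 from the question, rewritten in ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for Book2: only the columns the LAG needs.
conn.execute(
    "CREATE TABLE book2 (entity_id INTEGER, date_of_modification TEXT)"
)
conn.executemany(
    "INSERT INTO book2 VALUES (?, ?)",
    [(2, "2020-03-21"), (2, "2020-04-30"), (2, "2020-06-18"),
     (2, "2020-06-21"), (2, "2020-06-24"), (2, "2020-06-25"),
     (3, "2020-03-10"), (3, "2020-03-20"), (3, "2020-03-25"),
     (3, "2020-03-31")],
)

# PARTITION BY restarts LAG for every ticket, so the first row of each
# ticket gets NULL instead of a gap measured from another ticket.
totals = dict(conn.execute("""
    SELECT entity_id, CAST(SUM(diff) AS INTEGER) AS total_days
    FROM (
        SELECT entity_id,
               julianday(date_of_modification)
               - julianday(LAG(date_of_modification) OVER (
                     PARTITION BY entity_id
                     ORDER BY date_of_modification)) AS diff
        FROM book2
    )
    GROUP BY entity_id
""").fetchall())

print(totals)
```

Grouping by `entity_id` in the outer query is what returns one total per ticket, which removes the need for a loop.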

SQL Teradata - in query create new column that multiplies column by 2 if certain value is true

I have a SQL query I'm running that exports 2 columns, cost and months. The months column has a value of either 6 or 12. I want to create a new column that checks the value in the months column. If the months value is 6, multiply the cost column by 2; if the months value is 12, just copy the number from the cost column. Sample data:
cost months
100 6
200 12
400 6
expected result:
cost months total
100 6 200
200 12 200
400 6 800
A simple case statement should work:
select
    cost,
    months,
    case when months = 6 then cost * 2
         else cost
    end as total
from <your table>
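To see the `CASE` expression applied to the sample data, here is a minimal sketch using Python's built-in `sqlite3` module in place of Teradata. The table name `costs` is hypothetical. `CASE ... WHEN ... THEN ... ELSE ... END` is standard SQL, but this run only demonstrates the behaviour on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the sample rows from the question.
conn.execute("CREATE TABLE costs (cost INTEGER, months INTEGER)")
conn.executemany(
    "INSERT INTO costs VALUES (?, ?)",
    [(100, 6), (200, 12), (400, 6)],
)

# Double the cost for 6-month rows; copy it unchanged otherwise.
result = conn.execute("""
    SELECT cost,
           months,
           CASE WHEN months = 6 THEN cost * 2
                ELSE cost
           END AS total
    FROM costs
    ORDER BY rowid
""").fetchall()

print(result)
```

Because the `ELSE` branch catches every value other than 6, any unexpected months value would also pass through unchanged; an explicit `WHEN months = 12` branch could be added if that should be treated differently.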