QlikSense expression to find average duration over a period of time

I want to find the overall duration in hours over different periods of time, i.e. once I add a filter such as 'October' it should show me the overall hours for that month. I want to count duplicate lessons for multiple attendees as one lesson, i.e. the time spent teaching the subject.
Date       | Duration | Subject | Attendee
---------- | -------- | ------- | ----------
1/10/2019  | 2:00     | Math    | Joe Bloggs
1/10/2019  | 2:00     | Math    | John Doe
2/10/2019  | 3:00     | English | Jane Doe
6/11/2019  | 1:00     | Geog    | Jane Roe
17/12/2019 | 0:30     | History | Joe Coggs
I want the overall hours spent on the subjects. This means the durations above should add up to 6:30, as the two Math lessons should count as only one lesson (2 hours). How can I write an expression that produces a KPI of the overall learning hours and also lets me drill down to month and date? Thanks in advance.

I suggest creating another table that contains the distinct values (I'm presuming that the unique combination is Date <-> Subject).
The script below creates an OverallDuration table that contains the distinct duration values for each Date <-> Subject combination. This gives you one additional field, OverallDuration, which can be used in the KPI.
The OverallDuration table is linked to the RawData table (which is itself linked to the Calendar table), which means that the OverallDuration calculation will respect selections on Subject, LessonYear, LessonMonth etc. (for example, selecting Math gives 02:00).
RawData:
Load
    *,
    // Create a key field with the combination of Date and Subject
    Date & '_' & Subject as DateSubject_Key
;
Load * Inline [
Date, Duration, Subject, Attendee
1/10/2019, 2:00, Math, Joe Bloggs
1/10/2019, 2:00, Math, John Doe
2/10/2019, 3:00, English, Jane Doe
6/11/2019, 1:00, Geog, Jane Roe
17/12/2019, 0:30, History, Joe Coggs
];

// Load the distinct DateSubject_Key and the Duration,
// converting the duration to time.
// This table will link to RawData on the key field
OverallDuration:
Load
    Distinct
    DateSubject_Key,
    time(Duration) as OverallDuration
Resident
    RawData
;

// Create the calendar table from the distinct dates
// in RawData and add two additional fields - Month and Year.
// This table will link to RawData on the Date field
Calendar:
Load
    Distinct
    Date,
    Month(Date) as LessonMonth,
    Year(Date) as LessonYear
Resident
    RawData
;
Once the script above is reloaded, your expression is simply sum( OverallDuration ).
In a pivot table the overall duration comes to 06:30 hours, and to 02:00 hours when Math is selected.
In the data model, OverallDuration links to RawData on DateSubject_Key, and Calendar links to RawData on Date.
I like to keep my calendar data in a separate table, but you can add the month and year fields to the main table if you want.
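
The deduplication idea can also be checked outside Qlik. The following is only a rough Python sketch of the same logic, not Qlik script: it keeps one duration per Date/Subject pair and sums the result. The names `parse_duration`, `lessons` and `overall_hours` are made up for illustration, and the sketch assumes each Date/Subject pair has a single duration.

```python
from datetime import timedelta

# Rows from the question: (date, duration "h:mm", subject, attendee).
rows = [
    ("1/10/2019", "2:00", "Math", "Joe Bloggs"),
    ("1/10/2019", "2:00", "Math", "John Doe"),
    ("2/10/2019", "3:00", "English", "Jane Doe"),
    ("6/11/2019", "1:00", "Geog", "Jane Roe"),
    ("17/12/2019", "0:30", "History", "Joe Coggs"),
]

def parse_duration(text):
    """Turn an 'h:mm' string into a timedelta."""
    hours, minutes = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))

# Keep one duration per (date, subject) key, similar to the distinct load.
# Assumption: a key never appears with two different durations.
lessons = {}
for date, duration, subject, _attendee in rows:
    lessons[(date, subject)] = parse_duration(duration)

def overall_hours(subject=None):
    """Sum the deduplicated durations, optionally filtered by subject."""
    total = sum(
        (d for (_, s), d in lessons.items() if subject in (None, s)),
        timedelta(),
    )
    return total.total_seconds() / 3600

print(overall_hours())        # 6.5
print(overall_hours("Math"))  # 2.0
```

The two Math rows collapse into one entry, so the total is 6.5 hours (6:30) rather than 8.5.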

Related

HR Cube in SSAS

I have to design a cube for student attendance. We have four statuses (Present, Absent, Late, On vacation). The cube has to tell me the number of students who are not present over a period of time (day, month, year, etc.) and what percentage of the total number that is.
I built a fact table like this:
City ID | Class ID | Student ID | Attendance Date | Attendance State | Total Students number
--------------------------------------------------------------------------------------------
1 | 1 | 1 | 2016-01-01 | ABSENT | 20
But in my SSRS project I couldn't use this to get the correct numbers. I have to filter by date, city and attendance status.
For example, I must know that on date X there are 12 students not present, which corresponds to 11% of the total number.
Can anyone suggest a good structure to achieve this?

I assume this is homework.
Your fact table is wrong.
Don't store aggregated data (Total Students) in the fact as it can make calculations difficult.
Don't store text values like 'Absent' in the fact table. Attributes belong in the dimension.
Reading homework for you:
Difference between a Fact and Dimension and how they work together
What is the grain of a Fact and how does that affect aggregations and calculations.
There is a wealth of information on the Kimball Group's pages. Start with the lower-numbered tips, as they get more advanced as you move on.
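
To make the advice a little more concrete, here is a small Python sketch (not SSAS) of a fact table at the grain "one student, one day". All keys and rows are hypothetical. The status text sits in a tiny dimension, and the total number of students is counted from the fact rows instead of being stored in them. The sketch assumes that every status other than Present counts as "not present".

```python
# Hypothetical status dimension: the text attribute lives here, not in the fact.
dim_status = {1: "Present", 2: "Absent", 3: "Late", 4: "On vacation"}

# Hypothetical fact rows, one per student per day:
# (date_key, city_key, class_key, student_key, status_key).
fact_attendance = [
    (20160101, 1, 1, 1, 2),
    (20160101, 1, 1, 2, 1),
    (20160101, 1, 1, 3, 1),
    (20160101, 1, 1, 4, 3),
    (20160102, 1, 1, 1, 1),
    (20160102, 1, 1, 2, 4),
]

def not_present_share(date_key, city_key):
    """Return (students not present, percent of all students) for one slice."""
    day = [r for r in fact_attendance if r[0] == date_key and r[1] == city_key]
    # Assumption: anything other than "Present" counts as not present.
    not_present = [r for r in day if dim_status[r[4]] != "Present"]
    percent = 100 * len(not_present) / len(day) if day else 0.0
    return len(not_present), percent

print(not_present_share(20160101, 1))  # (2, 50.0)
```

Because the denominator is derived by counting rows, it stays correct under any combination of date, city and class filters.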

Calculating the number of new ID numbers per month in powerpivot

My dataset provides a monthly snapshot of customer accounts. Below is a very simplified version:
Date_ID | Acc_ID
------- | -------
20160430| 1
20160430| 2
20160430| 3
20160531| 1
20160531| 2
20160531| 3
20160531| 4
20160531| 5
20160531| 6
20160531| 7
20160630| 4
20160630| 5
20160630| 6
20160630| 7
20160630| 8
Customers can open or close their accounts, and I want to calculate the number of 'new' customers every month. The number of 'exited' customers will also be helpful if this is possible.
So in the above example, I should get the following result:
Month | New Customers
------- | -------
20160430| 3
20160531| 4
20160630| 1
Basically I want to compare the distinct account numbers in the selected month and the previous month: any that exist in the selected month but not in the previous one are new members, and any that were there last month but not in the selected month have exited.
I've searched but I can't seem to find any similar problems, and I hardly know where to start myself - I've tried using CALCULATE and FILTER along with DATEADD to filter the data to get two months, and then count the unique values. My PowerPivot skills aren't up to scratch to solve this on my own however!

Getting the new users is relatively straightforward. I'd add a calculated column (called customer_type in the measure further down) that counts the rows for that user in earlier months; if none exist, they are a new user:
=IF(CALCULATE(COUNTROWS(data),
        FILTER(data, [Acc_ID] = EARLIER([Acc_ID])
            && [Date_ID] < EARLIER([Date_ID]))) = BLANK(),
    "new",
    "existing")
Once this is in place you can simply write a measure for new_users:
=CALCULATE(COUNTROWS(data), data[customer_type] = "new")
Getting the cancelled users is a little harder because it means you have to be able to look backwards to the prior month - none of the time intelligence stuff in PowerPivot will work out of the box here as you don't have a true date column.
It's nearly always good practice to have a separate date table in your PowerPivot models and it is a good way to solve this problem - essentially the table should be 1 record per date with a unique key that can be used to create a relationship. Perhaps post back with a few more details.
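
For readers who want to check the logic of the calculated column against the sample data, here is a rough Python equivalent (plain Python, not DAX). The function name `customer_type` mirrors the column name used in the measure; everything else is illustrative.

```python
from collections import Counter

# Snapshot rows (Date_ID, Acc_ID) from the question.
data = (
    [(20160430, a) for a in (1, 2, 3)]
    + [(20160531, a) for a in (1, 2, 3, 4, 5, 6, 7)]
    + [(20160630, a) for a in (4, 5, 6, 7, 8)]
)

def customer_type(date_id, acc_id):
    """Mirror the calculated column: 'new' if no earlier row has this account."""
    seen_before = any(a == acc_id and d < date_id for d, a in data)
    return "existing" if seen_before else "new"

# Rough equivalent of the new_users measure, split by month.
new_users = Counter(d for d, a in data if customer_type(d, a) == "new")

print(sorted(new_users.items()))
# [(20160430, 3), (20160531, 4), (20160630, 1)]
```

Note that this flags an account as new only the first time it ever appears, so an account that leaves and later returns would count as existing.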

This is an alternative to Jacob's method which also works. It avoids creating a calculated column, although I actually find the calculated column useful as a flag against other measures.
=CALCULATE(
    DISTINCTCOUNT('Accounts'[Acc_ID]),
    DATESBETWEEN(
        'Dates'[Date], 0, LASTDATE('Dates'[Date])
    )
) - CALCULATE(
    DISTINCTCOUNT('Accounts'[Acc_ID]),
    DATESBETWEEN(
        'Dates'[Date], 0, FIRSTDATE('Dates'[Date]) - 1
    )
)
It basically uses the dates table to make a distinct count of all Acc_ID from the beginning of time until the first day of the period of time selected, and subtracts that from the distinct count of all Acc_ID from the beginning of time until the last day of the period of time selected. This is essentially the number of new distinct Acc_ID, although you can't work out which Acc_ID's these are using this method.
I could then calculate 'exited accounts' by taking the previous month's total as 'existing accounts':
=CALCULATE(
    DISTINCTCOUNT('Accounts'[Acc_ID]),
    DATEADD('Dates'[Date], -1, MONTH)
)
Then adding the 'new accounts', and subtracting the 'total accounts':
=DISTINCTCOUNT('Accounts'[Acc_ID])
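
The arithmetic behind these measures can be verified on the sample data with a short Python sketch (sets instead of DAX). The function name `new_and_exited` and the dictionary layout are assumptions made for illustration.

```python
# Accounts present in each monthly snapshot (from the question's sample).
snapshots = {
    20160430: {1, 2, 3},
    20160531: {1, 2, 3, 4, 5, 6, 7},
    20160630: {4, 5, 6, 7, 8},
}

def new_and_exited(snapshots):
    """Return {month: (new, exited)} using the counting identities above."""
    result = {}
    seen = set()      # every account seen before the current month
    previous = set()  # accounts in the previous month's snapshot
    for month in sorted(snapshots):
        current = snapshots[month]
        # New = distinct accounts to date minus distinct accounts before.
        new = len(seen | current) - len(seen)
        # Exited = previous total + new - current total.
        exited = len(previous) + new - len(current)
        result[month] = (new, exited)
        seen |= current
        previous = current
    return result

print(new_and_exited(snapshots))
# {20160430: (3, 0), 20160531: (4, 0), 20160630: (1, 3)}
```

The identity for exited accounts holds as long as accounts never return after leaving; a returning account is not counted as new by the cumulative distinct count, which would throw the subtraction off.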

Access - Query to find changing values

I have a table that contains all the weekly info for each employee by week. I want to create a query that shows me each employee's rates over the year, but I don't want to see duplicates.
For instance, Mary has RateX in Week1, RateY in Week2 through Week15, RateZ in Week 16. I want a query that spits out:
Name Week Rate
Mary 1 RateX
Mary 2 RateY
Mary 16 RateZ
And so forth for each employee. How can I do this? I've tried doing a criteria where the rate can't be the same as the rate for [Week]-1, but that seems to exclude the first week. Maybe some kind of Group By? I welcome any help.
Thanks,
Joel

It was as simple as Group By Employee and Rate, and First on the Week.
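
As a rough illustration of that grouping (in Python, not Access SQL), the sketch below groups by name and rate and keeps the smallest week. It also includes a second, hypothetical helper, `rate_changes`, that compares each week with the previous one; the two agree on Mary's data but differ if an employee ever returns to an earlier rate. The sample rows are reconstructed from the description in the question.

```python
# Weekly rows for one employee, as described in the question: (name, week, rate).
rows = [("Mary", 1, "RateX")]
rows += [("Mary", week, "RateY") for week in range(2, 16)]
rows += [("Mary", 16, "RateZ")]

def first_week_per_rate(rows):
    """Group by (name, rate) and keep the smallest week of each group."""
    first = {}
    for name, week, rate in rows:
        key = (name, rate)
        first[key] = min(week, first.get(key, week))
    return sorted((name, week, rate) for (name, rate), week in first.items())

def rate_changes(rows):
    """Keep a row only when the rate differs from that employee's previous week."""
    changes = []
    last_rate = {}
    for name, week, rate in sorted(rows):
        if last_rate.get(name) != rate:
            changes.append((name, week, rate))
        last_rate[name] = rate
    return changes

print(first_week_per_rate(rows))
# [('Mary', 1, 'RateX'), ('Mary', 2, 'RateY'), ('Mary', 16, 'RateZ')]
```

If Mary went back to RateX in week 17, the grouping would still show only three rows, while `rate_changes` would add a fourth.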

get number of occurrences taking 1 for every 3 days group

I've got a table of employees who are often late for work. I need to send Human Resources a report showing every user who was late, counting at most one warning per user for any 3-day period in which that user was late at least once.
The first figure I need is the total number of warnings to be sent, so the HR manager can evaluate overall "lateness".
Users who were late on just one day receive one warning. If they were late twice or more, the number of warnings depends on whether they already received a warning within the 3-day period counting from the first late day.
Let's see with an example:
Joe monday 9th
Mark monday 9th
Tim monday 9th
Joe tuesday 10th
Joe wednesday 11th
Joe Thursday 12th
Tim Friday 13th
Taking the data above as an example:
Joe will receive 2 warnings: the first for Monday and the second for Thursday. Tuesday and Wednesday are discarded because they belong to the same 3-day period.
Mark will receive just one warning, for Monday.
Tim will receive 2 warnings: the first for Monday and the second for Friday.
Maybe this number can't be obtained with a standard SQL query and some cursors are needed.
Thanks in advance

Ok... some information is missing here, as correctly stated in both comments (Quassnoi and Mr. Llama).
The RDBMS in use affects the solution, since this involves date functions and date arithmetic, and not all RDBMSs share the same extended function set. I have presumed MySQL 5.5, which is quite common and can be tested on SQLFiddle.
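Your 3-day period is also a bit vague. Is it the same for all employees, or what does it depend on? Is it 3 days, or half a week (mon-wed, thu-sat)? What happens after that? Do we use sun-tue, then wed-fri and so on? I have presumed fixed half-week periods that are the same for all employees. Note that FLOOR(DAYOFWEEK(DateLate)/4), as used in the queries below, puts Sunday to Tuesday in period 0 and Wednesday to Saturday in period 1, because MySQL's DAYOFWEEK numbers Sunday as 1. So a period is identified by year, week-of-year, half-week.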
The last point you should clear up is the expected result set. Do you want a list of warning days or a count or what? I have presumed a list of warnings with the date of each (I've taken the first date for each combination of employee & period).
Creating your sample dataset with these statements:
CREATE TABLE LateEntrances (
    Employee VARCHAR(20),
    DateLate DATE
);
INSERT INTO LateEntrances VALUES('Joe' ,'2014.06.09');
INSERT INTO LateEntrances VALUES('Mark','2014.06.09');
INSERT INTO LateEntrances VALUES('Tim' ,'2014.06.09');
INSERT INTO LateEntrances VALUES('Joe' ,'2014.06.10');
INSERT INTO LateEntrances VALUES('Joe' ,'2014.06.11');
INSERT INTO LateEntrances VALUES('Joe' ,'2014.06.12');
INSERT INTO LateEntrances VALUES('Tim' ,'2014.06.13');
The following query solves your problem:
SELECT i.Employee, i.YearLate, i.WeekLate, i.PeriodLate, MIN(i.DateLate)
FROM (
    SELECT Employee, DateLate,
        YEAR(DateLate) AS YearLate,
        WEEKOFYEAR(DateLate) AS WeekLate,
        FLOOR(DAYOFWEEK(DateLate)/4) AS PeriodLate
    FROM LateEntrances
) i
GROUP BY i.Employee, i.YearLate, i.WeekLate, i.PeriodLate;
(SQLFiddle here)
The three columns YearLate, WeekLate and PeriodLate identify the warning period. You could concatenate them in a single period identification column:
SELECT i.Employee, i.PeriodLate, MIN(i.DateLate)
FROM (
    SELECT Employee, DateLate,
        CONCAT_WS('*',
            YEAR(DateLate),
            WEEKOFYEAR(DateLate),
            FLOOR(DAYOFWEEK(DateLate)/4)
        ) AS PeriodLate
    FROM LateEntrances
) i
GROUP BY i.Employee, i.PeriodLate;
... or you could hide them altogether (in the SELECT), even though you must still use them in the GROUP BY:
SELECT i.Employee, MIN(i.DateLate)
FROM (
    SELECT Employee, DateLate,
        CONCAT_WS('*',
            YEAR(DateLate),
            WEEKOFYEAR(DateLate),
            FLOOR(DAYOFWEEK(DateLate)/4)
        ) AS PeriodLate
    FROM LateEntrances
) i
GROUP BY i.Employee, i.PeriodLate;
You can also easily change the period calculation logic to something else, like strict 3 days of year, or 3 days of month periods. There are many possibilities.
All of this rests on the assumptions I made above. Clear up the open points and I'll try to improve the answer; in the meantime this should be enough to get you started.
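
If the 3-day period is instead meant to start on the day of each warning (which is how the question's example reads: Joe is warned on Monday and again on Thursday), that is a different technique from the fixed half-week buckets used in the queries. A minimal Python sketch of that rolling-window interpretation follows. It is an assumption about the intended rule, not a translation of the SQL, and the function name `warnings` is made up.

```python
from datetime import date, timedelta

# Sample rows from the question, using the dates of the INSERT statements.
late_entrances = [
    ("Joe", date(2014, 6, 9)),
    ("Mark", date(2014, 6, 9)),
    ("Tim", date(2014, 6, 9)),
    ("Joe", date(2014, 6, 10)),
    ("Joe", date(2014, 6, 11)),
    ("Joe", date(2014, 6, 12)),
    ("Tim", date(2014, 6, 13)),
]

def warnings(rows, window_days=3):
    """Return {employee: [warning dates]} with one warning per rolling window."""
    result = {}
    for employee, day in sorted(rows):
        dates = result.setdefault(employee, [])
        # A late day starts a new window only once the last window has ended.
        if not dates or day >= dates[-1] + timedelta(days=window_days):
            dates.append(day)
    return result

per_employee = warnings(late_entrances)
total = sum(len(dates) for dates in per_employee.values())
print({name: [d.isoformat() for d in dates] for name, dates in per_employee.items()})
print(total)  # 5
```

Each window depends on the previous warning date, so this rule is sequential by nature, which is why it is harder to express with a plain GROUP BY.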

Build a Fact Table to derive measures in SSAS

My goal is to build a fact table which would be used to derive measures in SSAS. The measure I am building is 'average length of employment'. The measure will be deployed in a dashboard and the users will have the ability to select a calendar period and drill-down into month, week and days.
This is what the transactional data looks like :
DeptID EmployeeID StartDate EndDate
--------------------------------------------
001 123 20100101 20120101
001 124 20100505 20130101
What fields should my fact table have? On which fields should I aggregate, and how should I handle the averaging? Any kind of help is appreciated.

Whenever you design a fact table, the first set of questions to ask yourself is:
What is the business process you're analysing?
What are relevant facts?
What are the dimensions you'd like to analyse the facts by?
What does the lowest (least aggregated) level of detail in the fact table represent, i.e. what is the grain of the fact table?
The process seems to be Human Resources (HR).
You already know the fact, length of employment, which you can calculate easily: EndDate - StartDate. The obvious dimensions are Department, Employee, Date (two role-playing dimensions for Start and End).
In this case, since you're looking for 'average length of employment' as a measure, it seems that the grain should be individual Employees by Department (your transactional data may have the same EmployeeID listed under a different DeptID when an employee has transferred).
Your star schema will then look something like this:
Fact_HR
DeptKey EmployeeKey StartDateKey EndDateKey EmploymentLengthInDays
-------------------------------------------------------------------------
10001 000321 20100101 20120101 730
10001 000421 20100505 20130101 972
Dim_Department
DeptKey DeptID Name ... (other suitable columns)
------------------------- ...
10001 001 Sales ...
Dim_Employee
EmployeeKey EmployeeID FirstName LastName ... (other suitable columns)
---------------------------------------------- ...
000321 123 Alison Smith ...
000421 124 Anakin Skywalker ...
Dim_Date
DateKey DateValue Year Quarter Month Day ... (other suitable columns)
00000000 N/A 0 0 0 0 ...
20100101 2010-01-01 2010 1 1 1 ...
20100102 2010-01-02 2010 1 1 2 ...
... ... ... ... ... ...
(so on for every date you want to represent)
Every column that ends in Key is a surrogate key. The fact you're interested in is EmploymentLengthInDays; from it you can derive a measure, Avg. Employment Length, which aggregates using the average across all dimensions.
Now you can ask questions like:
Average employment length by department.
Average employment length for employees starting in 2011, or ending in September 2010.
Average employment length for a given employee (across each department he/she worked for).
BONUS: You can also add another measure to your cube that uses the same column but with a SUM aggregator; this may be called Total Employment Length. Across a given employee this will tell you how long the employee worked for the company, but across a department it will tell you the total man-days that were available to that department. Just an example of how a single fact can become multiple measures.
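
The EmploymentLengthInDays values and the two measures can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not SSAS; the names `length_in_days`, `fact_hr` and `measures` are illustrative.

```python
from datetime import datetime

# Transactional rows from the question: (DeptID, EmployeeID, StartDate, EndDate).
transactions = [
    ("001", "123", "20100101", "20120101"),
    ("001", "124", "20100505", "20130101"),
]

def length_in_days(start_key, end_key):
    """Days between two yyyymmdd date keys (EndDate - StartDate)."""
    start = datetime.strptime(start_key, "%Y%m%d")
    end = datetime.strptime(end_key, "%Y%m%d")
    return (end - start).days

# One fact row per employee and department, carrying the stored fact.
fact_hr = [
    {"dept": dept, "employee": emp, "days": length_in_days(start, end)}
    for dept, emp, start, end in transactions
]

def measures(rows):
    """Return (Total Employment Length, Avg. Employment Length) for the rows."""
    total = sum(r["days"] for r in rows)
    return total, total / len(rows)

print([r["days"] for r in fact_hr])  # [730, 972]
print(measures(fact_hr))             # (1702, 851.0)
```

Storing the day count in the fact row keeps both the SUM and the average simple to compute over any slice of departments, employees or dates.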
