2nd latest Date - DAX - powerpivot

I have a dataset of users who log into an app. I want to find the number of days between their last two logins. I have the DAX expression to get their last login (latest date):
=CALCULATE(Max([Date]),ALL(Table1),Table1[Name]=EARLIER(Table1[Name]))
But now I'd like to get their 2nd to last login, and subtract the two. I see some posts about the 2nd to last login, but it puts a blank if there are only two logins, whereas I want the number of days between these as well.

dcheney,
this one is tricky, but doable. It might be a bit difficult to understand, but given that you have already used the EARLIER function, you are very close to your desired result: the difference in days between the last and second-to-last login dates.
So assuming your source data look like this:
ID  User  Day
1   1     1-Jan
2   1     10-Jan
3   2     2-Feb
4   2     3-Feb
5   2     7-Feb
I would start by creating a new calculated column that ranks each visit of a specific user by date. This formula should do it:
=CALCULATE (
    COUNTROWS ( 'datatable' ),
    'datatable'[User] = EARLIER ( 'datatable'[User] ),
    'datatable'[Day] < EARLIER ( 'datatable'[Day] ),
    ALL ( 'datatable' )
)
    + 1
This adds the per-user rank to your datatable:
ID  User  Day                     CountLoginNumber
1   1     1/1/2014 12:00:00 AM    1
2   1     1/10/2014 12:00:00 AM   2
3   2     2/2/2014 12:00:00 AM    1
4   2     2/3/2014 12:00:00 AM    2
5   2     2/7/2014 12:00:00 AM    3
With this done, there is one more magic formula for another calculated column (I have named it Date of Last Login, although it actually holds the number of days between the last two logins) that does all the heavy lifting:
=
IF (
    AND (
        [CountLoginNumber] > 1,
        [CountLoginNumber]
            = CALCULATE (
                COUNTROWS ( 'datatable' ),
                'datatable'[User] = EARLIER ( 'datatable'[User] ),
                ALL ( 'datatable' )
            )
    ),
    CALCULATE (
        LASTDATE ( 'datatable'[Day] ),
        'datatable'[User] = EARLIER ( 'datatable'[User] ),
        ALL ( 'datatable' )
    )
        - CALCULATE (
            LASTDATE ( 'datatable'[Day] ),
            'datatable'[User] = EARLIER ( 'datatable'[User] ),
            'datatable'[CountLoginNumber]
                < EARLIER ( 'datatable'[CountLoginNumber] ),
            ALL ( 'datatable' )
        ),
    BLANK ()
)
Honestly, this is one of the longest formulas I have ever written in PowerPivot. You could do it with separate calculated columns, but I am not a big fan of that. This is what the formula basically does:
The IF condition checks whether the user has more than 1 login AND whether the current row is the last known login for that user (I want to calculate the date difference only on the row of the last known date).
If both conditions are TRUE, there are 2 CALCULATE expressions: the first one calculates the last login date for each user; the second one calculates the previous login date for the very same user. If you subtract those two dates, you get the desired result.
Then there is also the BLANK() function, which is returned when the IF conditions are not TRUE. Just in case :-)
The resulting table then looks like this:
ID  User  Day                     CountLoginNumber  Date of Last Login
1   1     1/1/2014 12:00:00 AM    1
2   1     1/10/2014 12:00:00 AM   2                 9
3   2     2/2/2014 12:00:00 AM    1
4   2     2/3/2014 12:00:00 AM    2
5   2     2/7/2014 12:00:00 AM    3                 4
With that, you can then create a simple (Power)pivot table to do all the following (analytic) work that needs to be done.
Check out my source file in Excel (2013) if needed. Hope this helps!
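For readers who want to sanity-check the numbers outside PowerPivot, here is a minimal Python sketch of the same logic: group the logins per user, sort them, and subtract the second-to-last date from the last one. It is not part of the DAX solution; the function name is hypothetical and the year 2014 is taken from the ranked table above.

```python
from datetime import date

# Sample rows mirroring the answer's table: (user, login day).
logins = [
    (1, date(2014, 1, 1)),
    (1, date(2014, 1, 10)),
    (2, date(2014, 2, 2)),
    (2, date(2014, 2, 3)),
    (2, date(2014, 2, 7)),
]


def days_between_last_two_logins(rows):
    """Return {user: days between the two most recent logins}.

    Users with fewer than two logins map to None (the BLANK() case).
    """
    by_user = {}
    for user, day in rows:
        by_user.setdefault(user, []).append(day)

    result = {}
    for user, days in by_user.items():
        days.sort()
        result[user] = (days[-1] - days[-2]).days if len(days) >= 2 else None
    return result


gaps = days_between_last_two_logins(logins)
print(gaps)
```

With the sample rows this gives 9 days for user 1 and 4 days for user 2, matching the Date of Last Login column.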

Related

Using Parameter within timestamp_trunc in SQL Query for DataStudio

I am trying to use a custom parameter within DataStudio. The data is hosted in BigQuery.
SELECT
    timestamp_trunc(o.created_at, #groupby) AS dateMain,
    count(o.id) AS total_orders
FROM `x.default.orders` o
group by 1
When I try this, it returns an error saying that "A valid date part name is required at [2:35]"
I basically need to group the dates using a parameter (e.g. day, week, month).
I have also included a screenshot of how I have created the parameter in Google DataStudio. There is a default value set which is "day".
I am not sure you can pass a Data Studio parameter as a date part like that, so a workaround that might do the trick here is to use a rollup in the group by over the different levels of date aggregation.
See the following example for clarity:
with default_orders as (
    select timestamp'2021-01-01' as created_at, 1 as id
    union all
    select timestamp'2021-01-01', 2
    union all
    select timestamp'2021-01-02', 3
    union all
    select timestamp'2021-01-03', 4
    union all
    select timestamp'2021-01-03', 5
    union all
    select timestamp'2021-01-04', 6
),
final as (
    select
        count(id) as count_orders,
        timestamp_trunc(created_at, day) as days,
        timestamp_trunc(created_at, week) as weeks,
        timestamp_trunc(created_at, month) as months
    from
        default_orders
    group by
        rollup(days, weeks, months)
)
select * from final
The output, then, would be similar to the following:
count | days | weeks | months
------+------------+----------+----------
6 | null | null | null <- this, represents the overall (counted 6 ids)
2 | 2021-01-01| null | null <- this, the 1st rollup level (day)
2 | 2021-01-01|2020-12-27| null <- this, the 1st and 2nd (day, week)
2 | 2021-01-01|2020-12-27|2021-01-01 <- this, all of them
And so on.
When visualizing this in Data Studio, you have two options: set the metric to Avg instead of Sum, because, as you can see, each value of the day column is repeated at every rollup level; or add another step to the query that gets rid of the nulls, like this:
select
    *
from
    final
where
    days is not null and
    weeks is not null and
    months is not null
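To make the idea of "group by a date part chosen at run time" concrete, here is a minimal Python sketch using the same six sample orders. It only illustrates the truncation logic, not anything Data Studio or BigQuery does internally; `truncate` and `count_orders` are hypothetical helper names, and the week is assumed to start on Sunday, as in the rollup output above (2021-01-01 falls in the week of 2020-12-27).

```python
from collections import Counter
from datetime import datetime, timedelta

# (created_at, id): the same sample rows as the default_orders CTE.
orders = [
    (datetime(2021, 1, 1), 1),
    (datetime(2021, 1, 1), 2),
    (datetime(2021, 1, 2), 3),
    (datetime(2021, 1, 3), 4),
    (datetime(2021, 1, 3), 5),
    (datetime(2021, 1, 4), 6),
]


def truncate(ts, part):
    """Truncate a timestamp to the start of its day, week or month."""
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if part == "day":
        return day_start
    if part == "week":
        # weekday() is Monday=0 ... Sunday=6, so this is "days since Sunday".
        return day_start - timedelta(days=(ts.weekday() + 1) % 7)
    if part == "month":
        return day_start.replace(day=1)
    raise ValueError(f"unsupported date part: {part}")


def count_orders(rows, part):
    """Count orders per truncated timestamp for the chosen date part."""
    return Counter(truncate(created_at, part) for created_at, _ in rows)


for part in ("day", "week", "month"):
    print(part, dict(count_orders(orders, part)))
```

The `part` argument plays the role the report parameter was meant to play: one value selects one level of aggregation, which is also what filtering the rollup output to a single level achieves.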

PostgreSQL - Generate series using subqueries

Using PostgreSQL, I need to accomplish the following scenario. I have a table called routine, where I store start_date and end_date columns. I have another table called exercises, where I store all the data related to each exercise, and finally I have a table called routine_exercise, where I create the relationship between the routine and the exercise. Each routine can have seven days of exercises (the day number indicates the day of the week, e.g. 1 means Monday), and each day can have one or more exercises. For example:
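Exercise Table
Exercise ID | Name
------------+-----------
1           | Exercise 1
2           | Exercise 2
3           | Exercise 3
Routine Table
Routine ID | Name
-----------+----------
1          | Routine 1
2          | Routine 2
3          | Routine 3
Routine_Exercise Table
Exercise ID | Routine ID | Day
------------+------------+----
1           | 1          | 1
2           | 1          | 1
3           | 1          | 1
1           | 1          | 2
2           | 1          | 3
3           | 1          | 4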
What I'm trying to do is generate a series from start_date to end_date (e.g. 03-25-2020 to 05-25-2020, two months) and assign to each date the day number the user is supposed to work out on.
For example, using the data in the Routine_Exercise Table, the user should only work out on days 1, 2, 3 and 4, so I would like to attach that number to each date. Something like this:
Expected Result
Date       | Number
-----------+-------
03-25-2020 | 1
03-26-2020 | 2
03-27-2020 | 3
03-28-2020 | 4
03-29-2020 | null
03-30-2020 | null
03-31-2020 | null
04-01-2020 | 1
04-02-2020 | 2
04-03-2020 | 3
04-04-2020 | 4
04-05-2020 | null
Any suggestions or different ideas on how to implement this? Another solution that doesn't require series?
Thanks in advance!
You can generate the dates between start and end input dates using generate_series and then do left join with your routine_exercise table as follows:
SELECT t.d, re.day
FROM generate_series(timestamp '2020-03-25', timestamp '2020-05-25',
                     interval '1 day') AS t(d)
LEFT JOIN (SELECT DISTINCT day FROM Routine_Exercise re WHERE ROUTINE_ID = 1) re
    -- the cast keeps mod() usable where extract() returns double precision
    ON mod(extract(day FROM (t.d - timestamp '2020-03-25'))::int, 7) + 1 = re.day
ORDER BY t.d;
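The join condition boils down to "days since the start date, modulo 7, plus 1". A minimal Python sketch of that mapping, assuming the routine's workout days are 1 to 4 as in the Routine_Exercise sample (the `schedule` helper is a hypothetical name, not something PostgreSQL provides):

```python
from datetime import date, timedelta


def schedule(start, end, workout_days):
    """Yield (date, day_number or None) for every date from start to end inclusive.

    The day number cycles 1..7 starting at `start`; dates whose number is
    not a workout day get None, like the NULL produced by the left join.
    """
    current = start
    while current <= end:
        day_number = (current - start).days % 7 + 1
        yield current, (day_number if day_number in workout_days else None)
        current += timedelta(days=1)


rows = list(schedule(date(2020, 3, 25), date(2020, 5, 25), {1, 2, 3, 4}))
for day, number in rows[:8]:
    print(day.strftime("%m-%d-%Y"), number)
```

The first eight rows reproduce the start of the expected result: 1, 2, 3, 4, then three nulls, then 1 again on 04-01-2020.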

How to run a query for multiple independent date ranges?

I would like to run the below query that looks like this for week 1:
Select week(datetime), count(customer_call) from table where week(datetime) = 1 and week(orderdatetime) < 7
... but for weeks 2, 3, 4, 5 and 6 all in one query and with the 'week(orderdatetime)' to still be for the 6 weeks following the week(datetime) value.
This means that for 'week(datetime) = 2', 'week(orderdatetime)' would be between 2 and 7 and so on.
'datetime' is a datetime field denoting registration.
'customer_call' is a datetime field denoting when they called.
'orderdatetime' is a datetime field denoting when they ordered.
Thanks!
I think you want group by:
Select week(datetime), count(customer_call)
from table
where week(datetime) = 1 and week(orderdatetime) < 7
group by week(datetime);
I would also point out that week doesn't take the year into account, so you might want to include that in the group by or in a where filter.
EDIT:
If you want 6 weeks of cumulative counts, then use:
Select week(datetime), count(customer_call),
       sum(count(customer_call)) over (order by week(datetime)
                                       rows between 5 preceding and current row) as running_sum_6
from table
group by week(datetime);
Note: If you want to filter this to particular weeks, then make this a subquery and filter in the outer query.
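To show what the window frame "rows between 5 preceding and current row" computes, here is a small Python sketch. The weekly counts are made-up illustration data, not from the question, and `rolling_sum` is a hypothetical helper name.

```python
def rolling_sum(weekly_counts, window=6):
    """Add a running sum over the current row and the (window - 1) rows before it.

    weekly_counts: list of (week, count) tuples already sorted by week.
    Returns a list of (week, count, running_sum) tuples.
    """
    out = []
    for i, (week, count) in enumerate(weekly_counts):
        start = max(0, i - window + 1)  # fewer rows are available at the beginning
        total = sum(c for _, c in weekly_counts[start:i + 1])
        out.append((week, count, total))
    return out


# Hypothetical call counts for weeks 1..8.
sample = [(1, 3), (2, 1), (3, 4), (4, 1), (5, 5), (6, 9), (7, 2), (8, 6)]
result = rolling_sum(sample)
for row in result:
    print(row)
```

Note that the frame counts rows, not calendar weeks: if a week has no data and therefore no row, the window reaches further back than six weeks.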

Oracle Database Temporal Query Implementation - Collapse Date Ranges

This is the result of one of my queries:
SURGERY_D
---------
01-APR-05
02-APR-05
03-APR-05
04-APR-05
05-APR-05
06-APR-05
07-APR-05
11-APR-05
12-APR-05
13-APR-05
14-APR-05
15-APR-05
16-APR-05
19-APR-05
20-APR-05
21-APR-05
22-APR-05
23-APR-05
24-APR-05
26-APR-05
27-APR-05
28-APR-05
29-APR-05
30-APR-05
I want to collapse the date ranges which are continuous into intervals. For example,
[01-APR-05, 07-APR-05], [11-APR-05, 16-APR-05] and so on.
In terms of temporal databases, I want to 'collapse' the dates. Any idea how to do that on Oracle? I am using version 11. I searched for it and read a book but couldn't find/understand how to do it. It might be simple, but everyone has their own flaws and Oracle is mine. Also, I am new to SO so my apologies if I have violated any rules. Thank You!
You can take advantage of the ROW_NUMBER analytical function to generate a unique, sequential number for each of the records (we'll assign that number to the dates in ascending order).
Then, you group the dates by difference between the date and the generated number - the consecutive dates will have the same difference:
Date       Number  Difference (date - number)
01-APR-05  1       31-MAR-05  -- MIN(date_val) in group with diff. = 31-MAR-05
02-APR-05  2       31-MAR-05
03-APR-05  3       31-MAR-05
04-APR-05  4       31-MAR-05
05-APR-05  5       31-MAR-05
06-APR-05  6       31-MAR-05
07-APR-05  7       31-MAR-05  -- MAX(date_val) in group with diff. = 31-MAR-05
11-APR-05  8       03-APR-05  -- MIN(date_val) in group with diff. = 03-APR-05
12-APR-05  9       03-APR-05
13-APR-05  10      03-APR-05
14-APR-05  11      03-APR-05
15-APR-05  12      03-APR-05
16-APR-05  13      03-APR-05  -- MAX(date_val) in group with diff. = 03-APR-05
Finally, you select the minimal and maximal date in each of the groups to get the beginning and ending of each range.
Here's the query:
SELECT
    MIN(date_val) start_date,
    MAX(date_val) end_date
FROM (
    SELECT
        date_val,
        row_number() OVER (ORDER BY date_val) AS rn
    FROM date_tab
)
GROUP BY date_val - rn
ORDER BY 1
;
Output:
START_DATE END_DATE
------------ ----------
01-04-2005 07-04-2005
11-04-2005 16-04-2005
19-04-2005 24-04-2005
26-04-2005 30-04-2005
You can check how that works on SQL Fiddle: Date ranges example
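The same "date minus row number" trick can be sketched in a few lines of Python, which may help when reasoning about why the grouping key stays constant within a run of consecutive dates. This is only an illustration of the technique, not Oracle code; `collapse` is a hypothetical name, and duplicates are removed first, which the SQL above does not do.

```python
from datetime import date, timedelta
from itertools import groupby


def collapse(dates):
    """Collapse dates into (start, end) ranges of consecutive days."""
    ordered = sorted(set(dates))
    ranges = []
    # Within a run of consecutive days, date - position is constant.
    numbered = enumerate(ordered, start=1)
    for _, group in groupby(numbered, key=lambda pair: pair[1] - timedelta(days=pair[0])):
        days = [d for _, d in group]
        ranges.append((days[0], days[-1]))
    return ranges


# The SURGERY_D values from the question (April 2005).
april_days = list(range(1, 8)) + list(range(11, 17)) + list(range(19, 25)) + list(range(26, 31))
surgery_dates = [date(2005, 4, d) for d in april_days]

ranges = collapse(surgery_dates)
for start, end in ranges:
    print(start, end)
```

This yields the same four ranges as the query output: 1 to 7, 11 to 16, 19 to 24 and 26 to 30 April 2005.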

Grouping and summing items in a table using SSRS

I have a SSRS report, and I'm trying to sum up rows conditionally. I have:
11/15/2010 12:14:43 AM | Current Rate | Current Speed | Amount used in that minute (Speed*Rate/60)
etc etc etc
I am trying to add all the rows that happened in an hour, so that my report will show:
11/15/2010 | 12 AM - 1 AM | Amount used for that hour (say, 7 gallons)
I cannot find anywhere how to conditionally sum up a row per hour, or how to get my report to say the above.
Thank you in advance!
Using the following table for testing:
CREATE TABLE `log` (
    `entry_date` datetime DEFAULT NULL,
    `amount_used` int(11) DEFAULT NULL
)
With some test data:
entry_date amount_used
2010-11-01 10:00:00, 3
2010-11-01 10:30:00, 1
2010-11-01 11:00:00, 6
Use this query to get the date, hour range and total amount used:
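SELECT DATE(T.entry_date) AS entry_date,
       CONCAT_WS('-', CONVERT(MIN(HOUR(T.entry_date)), CHAR(2)), CONVERT(MAX(HOUR(T.entry_date)), CHAR(2))) hours,
       SUM(T.amount_used) amount_used
FROM (
    -- one row per date and hour
    SELECT MIN(entry_date) AS entry_date, SUM(amount_used) AS amount_used
    FROM log
    GROUP BY DATE(log.entry_date), HOUR(log.entry_date)
) T
-- one output row per day, so that several days are not merged into one row
GROUP BY DATE(T.entry_date);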
All the CONCAT/CONVERT stuff is just to get the range of hours in that particular day, as a string. This is the result:
entry_date hours amount_used
2010-11-01, 10-11, 10
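Since the question asks for one total per hour, here is a minimal Python sketch of that grouping, using the three test rows above. It only illustrates the "group by date and hour" step and is not SSRS or MySQL code; `hourly_totals` is a hypothetical name.

```python
from collections import defaultdict
from datetime import date, datetime

# The test rows from the log table: (entry_date, amount_used).
log = [
    (datetime(2010, 11, 1, 10, 0), 3),
    (datetime(2010, 11, 1, 10, 30), 1),
    (datetime(2010, 11, 1, 11, 0), 6),
]


def hourly_totals(rows):
    """Sum amount_used per (calendar date, hour of day)."""
    totals = defaultdict(int)
    for entry_date, amount_used in rows:
        totals[(entry_date.date(), entry_date.hour)] += amount_used
    return dict(sorted(totals.items()))


totals = hourly_totals(log)
for (day, hour), amount in totals.items():
    print(day, f"{hour}:00 - {hour + 1}:00", amount)
```

For the test data this gives 4 for the 10:00 hour and 6 for the 11:00 hour, which together make up the daily total of 10 shown above.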