Filtered DistinctCount Measure and MDX Not Delivering Same Results as SQL

I have the following two queries, one SQL, one MDX:
SQL:
SELECT t.term_report_year, COUNT(*)
FROM(
SELECT DISTINCT de.term_report_year, fe.student_id
FROM warehouse.FactEnrolments fe
INNER JOIN warehouse.DimDate dd
ON fe.term_record_creation_fk = dd.DateKey
INNER JOIN warehouse.DimTermEnrolments de
ON fe.term_enrolments_fk = de.term_enrolments_pk
WHERE dd.ISOWeekNumberOfYear <= 8 OR dd.ISOYearCode < de.term_report_year
) t
GROUP BY t.term_report_year
ORDER BY term_report_year
MDX:
SELECT
NON EMPTY
Measures.[Enrolments] ON COLUMNS
,NON EMPTY
Filter
(
[Term Enrolments].[Term Year].Children *
[Term Record Creation].[ISO Year Code].children *
[Term Record Creation].[ISO Week Number Of Year].children
,
Cint([Term Record Creation].[ISO Week Number Of Year].CurrentMember.Member_Key) <= 8
OR
Cint([Term Record Creation].[ISO Year Code].CurrentMember.Member_key) < Cint([Term Enrolments].[Term Year].CurrentMember.Member_key)
) ON ROWS
FROM [Enrolments];
In both queries I am trying to express the same idea: "count the number of students in a year who enrolled for that year before or during the 8th week of that year", where year = term_year.
In my SSAS cube the Enrolments measure is a DistinctCount on student_id. In the SQL query, term_report_year is equivalent to Term Year in the MDX.
Could someone please explain why the two queries are not delivering the same numbers e.g. the SQL for 2016 gives 2803 and the MDX 2948?
I think it has something to do with the MDX double counting across the weeks, but I can't work out how to fix it.

Try this. I am hopeful that moving the filter into a subselect will restrict each year's total to the filtered tuples (weeks <= 8, or an earlier ISO year):
SELECT
NON EMPTY
Measures.[Enrolments] ON COLUMNS
,NON EMPTY [Term Enrolments].[Term Year].Children ON ROWS
FROM (
SELECT
Filter
(
[Term Enrolments].[Term Year].Children *
[Term Record Creation].[ISO Year Code].children *
[Term Record Creation].[ISO Week Number Of Year].children
,
Cint([Term Record Creation].[ISO Week Number Of Year].CurrentMember.Member_Key) <= 8
OR
Cint([Term Record Creation].[ISO Year Code].CurrentMember.Member_key) < Cint([Term Enrolments].[Term Year].CurrentMember.Member_key)
) ON COLUMNS
FROM [Enrolments]
);
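A toy illustration of the double-counting suspicion, not taken from the cube: distinct counts are not additive, so per-week distinct counts added up over a year can exceed the distinct count for the year as a whole. The records and student ids below are invented, and Python sets stand in for the DistinctCount measure.

```python
# Hypothetical enrolment records: (term_year, iso_week, student_id).
records = [
    (2016, 3, "s1"),
    (2016, 5, "s1"),   # same student, second record in a different week
    (2016, 5, "s2"),
    (2016, 7, "s3"),
    (2016, 12, "s4"),  # outside the week <= 8 window
]


def distinct_per_week(rows, max_week):
    """Distinct students per (year, week) cell, like a row-per-week result."""
    cells = {}
    for year, week, student in rows:
        if week <= max_week:
            cells.setdefault((year, week), set()).add(student)
    return {cell: len(students) for cell, students in cells.items()}


def distinct_per_year(rows, max_week):
    """Distinct students per year after filtering, like the SQL subquery."""
    cells = {}
    for year, week, student in rows:
        if week <= max_week:
            cells.setdefault(year, set()).add(student)
    return {year: len(students) for year, students in cells.items()}


weekly = distinct_per_week(records, 8)
yearly = distinct_per_year(records, 8)
summed_weeks = sum(weekly.values())

print("sum of weekly distinct counts:", summed_weeks)
print("distinct count for the year:", yearly[2016])
```

Student s1 appears in two weeks, so the weekly cells add up to one more than the yearly distinct count. If the cube total was obtained by adding row values, that could explain a gap in the same direction as 2948 vs 2803.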

Running Total - Create row for months that don't have any sales in the region (1 row for each region in each month)

I am working on the below query that I will use inside Tableau to create a line chart that will be color-coded by year and will use the region as a filter for the user. The query works, but I found there are months in regions that don't have any sales. These sections break up the line chart and I am not able to fill in the missing spaces (I am using a non-date dimension on the X-Axis - Number of months until the end of its fiscal year).
I am looking for help altering my query so that it returns a row for every month and every region in my dataset, giving the running total a value to display in the line chart. If a month has no values in my table, its total should be 0 and the running total for the region should carry forward.
I have a dimDate table and also a Regions table I can use in the query.
My query's results now (sorted in Excel for easier viewing): [image: Results Table Now]
What I want, with the new rows highlighted in yellow: [image: What I want to do]
My Code using SQL Server:
SELECT b.gy,
b.sales_month,
b.region,
b.gs_year_total,
b.months_away,
Sum(b.gs_year_total)
OVER (
partition BY b.gy, b.region
ORDER BY b.months_away DESC) RT_by_Region_GY
FROM (SELECT a.gy,
a.region,
a.sales_month,
Sum(a.gy_total) Gs_Year_Total,
a.months_away
FROM (SELECT g.val_id,
g.[gs year] AS GY,
g.sales_month AS Sales_Month,
g.gy_total,
Datediff(month, g.sales_month, dt.lastdayofyear) AS months_away,
g.value_type,
val.region
FROM uv_sales g
JOIN dbo.dimdate AS dt
ON g.[gs year] = dt.gsyear
JOIN dimvalsummary val
ON g.val_id = val.val_id
WHERE g.[gs year] IN ( 2017, 2018, 2019, 2020, 2021 )
GROUP BY g.valuation_id,
g.[gs year],
val.region,
g.sales_month,
dt.lastdayofyear,
g.gy_total,
g.value_type) a
WHERE a.months_away >= 0
AND sales_month < Dateadd(month, -1, Getdate())
GROUP BY a.gy,
a.region,
a.sales_month,
a.months_away) b
It's tough to pick the best method without sample data or knowing what all those fields mean. Here's a rough sketch of one approach; sorry, it is neither complete nor tested.
Create a table called all_months and insert all the months from oldest to whatever date in the future you need.
01/01/2017
02/01/2017
...
12/01/2049
You may need one query per region, combined with UNION. Select the year and month from the all_months table, left join to your other table on month, and COALESCE your dollar values to 0.
select 'East' as region,
year(m.month) as gy_year, -- SQL Server has no EXTRACT(); YEAR() is the equivalent
m.month as sales_month,
coalesce(g.gy_total, 0) as gy_total,
datediff(month, m.month, dt.lastdayofyear) as months_away
from all_months m
left join uv_sales g on g.sales_month = m.month
--and so on (join dimdate as dt, restrict to the region, repeat per region)
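As a variation on the sketch above, a CROSS JOIN between the month list and the region list can replace the one-query-per-region UNION. The following is a minimal, runnable illustration using Python's sqlite3 module rather than SQL Server; the table names (all_months, regions, sales), the columns, and the data are all made up for the demo, and it assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the month list, the Regions table and the sales data.
    CREATE TABLE all_months (month TEXT);
    CREATE TABLE regions (region TEXT);
    CREATE TABLE sales (region TEXT, month TEXT, total INTEGER);
    INSERT INTO all_months VALUES ('2017-01-01'), ('2017-02-01'), ('2017-03-01');
    INSERT INTO regions VALUES ('East'), ('West');
    INSERT INTO sales VALUES
        ('East', '2017-01-01', 100),
        ('East', '2017-03-01', 50),
        ('West', '2017-02-01', 70);
""")

rows = conn.execute("""
    WITH filled AS (
        -- Every (region, month) pair exists; months without sales become 0.
        SELECT r.region, m.month, COALESCE(SUM(s.total), 0) AS month_total
        FROM all_months m
        CROSS JOIN regions r
        LEFT JOIN sales s ON s.region = r.region AND s.month = m.month
        GROUP BY r.region, m.month
    )
    SELECT region, month, month_total,
           SUM(month_total) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM filled
    ORDER BY region, month
""").fetchall()

for row in rows:
    print(row)
```

Because the zero rows are created before the window function runs, the running total simply carries forward through months with no sales, which is what the line chart needs.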

Translating SQL with join and window function to DAX

I have a working SQL query, but it needs to be translated to DAX and I'm struggling.
This is what I'm trying to achieve:
I have agreements running from a startdate to an enddate in a fact table. An agreement running for a whole year counts as 1, so DATEDIFF(startdate, enddate) / 365.0 gives the "weight" of the agreement. For any given month, I need the sum of the trailing year's total agreement weights. I also have a dimension table, related to the fact table, containing all dates (single days, each with the year/month it belongs to), which lets me run the following SQL query to get exactly what I want:
SELECT
sub.yearMonth
,SUM(sub.[cnt]) OVER (ORDER BY sub.[yearMonth] DESC ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) / 365.0 AS [runningYear]
FROM
(SELECT
d.yearMonth
,COUNT(*) AS [cnt]
FROM Agreements AS a
INNER JOIN Date AS d
ON d.date >= a.startdate AND d.date <= a.enddate
GROUP BY d.yearMonth
) AS sub
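To make the intended arithmetic concrete, here is a small Python sketch of what the prose describes: count the active agreement days per month (the inner subquery), then sum the current month and the 11 months before it and divide by 365. It assumes the sample dates are in dd-mm-yyyy format and that both the start and end day count as active; the function names are invented for the illustration.

```python
from collections import Counter
from datetime import date, timedelta

# The two sample agreements, read as dd-mm-yyyy (an assumption).
agreements = [
    (date(2014, 4, 10), date(2015, 6, 12)),
    (date(2014, 6, 11), date(2014, 7, 11)),
]


def active_days_per_month(items):
    """Count active agreement days per (year, month), like the join on the date table."""
    counts = Counter()
    for start, end in items:
        day = start
        while day <= end:
            counts[(day.year, day.month)] += 1
            day += timedelta(days=1)
    return counts


def trailing_year_weight(counts, year, month):
    """Sum the given month and the 11 months before it, then divide by 365."""
    total = 0
    for back in range(12):
        index = year * 12 + (month - 1) - back
        total += counts[(index // 12, index % 12 + 1)]
    return total / 365.0


counts = active_days_per_month(agreements)
print(counts[(2014, 6)], trailing_year_weight(counts, 2014, 7))
```

Each active day contributes 1/365 to the weight, so a DAX measure that counts agreement rows instead of agreement days would need a different scaling, which may be related to the (result/365)*100 observation.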
I tried to replicate this in a DAX measure ending up with the following:
AgreementWeights:=
CALCULATE (
CALCULATE (
COUNTROWS ( 'Agreements' );
FILTER (
'Agreements';
'Agreements'[startdate] <= MAX ( 'Date'[date] )
&& 'Agreements'[enddate] >= EDATE ( MAX( 'Date'[date] ); -12)
)
);
CROSSFILTER ( 'Agreements'[receiveddate]; 'Date'[sys_date_key]; NONE )
)
The last line cuts the relationship between receiveddate and the date dimension, which is irrelevant here. This DAX measure yields too many rows, but gives the correct result when scaled as (result/365)*100, and I simply cannot figure out why.
An example of the fact table Agreements:
ID  startdate   enddate     receiveddate
0   10-04-2014  12-06-2015  10-03-2014
1   11-06-2014  11-07-2014  11-05-2014
An example of the dimension table Date:
ID  date        yearMonth  sys_date_key
0   10-04-2014  April2014  10042014
1   11-04-2014  April2014  11042014
Thanks :)

Only show the sum for all the ids that have transacted in the past 12 months

select
id
,id_name
, MAX(last_login_date)
, SUM(transaction_count)
, mAX(last_transaction_date)
from sales;
Hi, I want the results to include a transaction count only for sales made in the last 12 months. What can I do?
I use MAX and SUM because the same id appears multiple times, so the ids are not unique.
I don't have individual transaction dates, only a last transaction date field.
You can use the months_between function to restrict the rows to the last 12 months directly:
select id, id_name, MAX(last_login_date), SUM(transaction_count), MAX(last_transaction_date)
from sales
where months_between(trunc(sysdate),last_transaction_date) <= 12
group by id, id_name;
If you instead need whole calendar months (the current month plus the 11 before it), you can use this construction:
select id
, id_name
, Max(last_login_date)
, Sum(transaction_count)
, Max(last_transaction_date)
from sales
where last_transaction_date >= add_months(trunc(sysdate,'mm'),-11)
group by id, id_name;
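To see where the second query's window starts, here is a small Python sketch of the add_months(trunc(sysdate,'mm'),-11) expression: truncate today's date to the first of the month, then step back 11 whole months. The function name and the sample date are made up for the illustration.

```python
from datetime import date


def first_of_month_n_months_back(today: date, months_back: int) -> date:
    """Truncate to the first of the month, then go back months_back whole months."""
    index = today.year * 12 + (today.month - 1) - months_back
    return date(index // 12, index % 12 + 1, 1)


# Hypothetical "today" of 17 March 2021.
cutoff = first_of_month_n_months_back(date(2021, 3, 17), 11)
print(cutoff)
```

For that sample date the cutoff is the first of April 2020, so the filter covers April 2020 through March 2021: 12 calendar months including the current one. The months_between version is instead a rolling window measured back from today.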

Account for missing values in group by month

I'm trying to retrieve the average number of records added to the database each month. However for months that no records were added, the row is missing and therefore not being calculated into the average.
Here is the query:
SELECT AVG(a.count) AS AVG
FROM ( SELECT COUNT(*) AS count, MONTH(InsertedTimestamp) AS Month
FROM Certificates
WHERE InsertedTimestamp >= '9/19/2014'
AND InsertedTimestamp <= '7/1/2015'
GROUP BY MONTH(InsertedTimestamp)
) AS a
When I run just the inner query, only results from months 9,10,11 are showing, because there are no records for months 12,1,2,3,4,5,6,7. How can I add these missing rows to the table in order to get the correct monthly average?
Thanks!
This is easy enough to fix: divide the total row count by the number of months in the range:
SELECT COUNT(*) / (TIMESTAMPDIFF(month, '2014-09-19', '2015-07-01' ) + 1)
FROM Certificates
WHERE InsertedTimestamp >= '2014-09-19' AND
InsertedTimestamp <= '2015-07-01' ;
You don't even need the subquery.
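One thing worth checking is which denominator you actually want. As far as I know, TIMESTAMPDIFF counts completed months between the two dates, which is not necessarily the number of calendar months the range touches. The Python sketch below counts calendar months touched; the function names and the row total of 33 are invented for the example.

```python
from datetime import date


def calendar_months_spanned(start: date, end: date) -> int:
    """Number of calendar months touched by the range, both ends included."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


def average_per_month(total_rows: int, start: date, end: date) -> float:
    """Average rows per calendar month, counting empty months too."""
    return total_rows / calendar_months_spanned(start, end)


months = calendar_months_spanned(date(2014, 9, 19), date(2015, 7, 1))
average = average_per_month(33, date(2014, 9, 19), date(2015, 7, 1))
print(months, average)
```

Here September 2014 through July 2015 is 11 calendar months, so months with no inserted rows still count toward the average.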

SQL/HSQLDB query and sub-query in Aggregate Function

My database looks like this (very simple) and is called "RideDate":
BikeDate Bike Miles
What I am looking to achieve is a query that returns, for each month, a total (SUM) across all years, an average (AVG) across all years, and a total for a specific year
(WHERE YEAR("Date") = '2014'). (I don't have my exact code in front of me: power fluctuations from high winds and wet, heavy snow have pushed me onto an iPad.)
My attempt goes something like this:
SElECT MONTH("BikeDate") AS "Month", SUM("Miles") AS "SMiles", AVG("AMiles") AS "Average",
(SELECT MONTH("BikeDate") SUM("Miles") WHERE YEAR("BikeDate") = '2014') AS "2014"
FROM "RideDate"
GROUP BY MONTH("BikeDate")
ORDER BY MONTH("BikeDate") ASC
The results should be:
(month) (sum of month over all years) (avg of month over all years) (sum of month for '14)
The last column does not follow the GROUP BY month and instead gives the sum for the whole year.
How can I write the sub-query to sum across the iterated month of the main query for the selected year? Is there another way of solving this?
You can try it with a CROSS JOIN
SELECT * FROM
(
(SELECT MONTH("BikeDate") AS "Month", SUM("Miles") AS "SMiles", AVG("Miles") AS "Average"
FROM "RideDate"
GROUP BY MONTH("BikeDate"))a
CROSS JOIN
(SELECT SUM("Miles") as "YearSum"
FROM "RideDate"
WHERE YEAR("BikeDate") = '2014')b
) results
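Note that a CROSS JOIN against a single yearly total attaches the same whole-year figure to every month. Another way to get a per-month total for one year is conditional aggregation: a CASE expression inside SUM. This is a different technique from the CROSS JOIN, sketched here with Python's sqlite3 module rather than HSQLDB, so strftime stands in for MONTH() and YEAR(); the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rides using the BikeDate / Bike / Miles columns.
    CREATE TABLE RideDate (BikeDate TEXT, Bike TEXT, Miles REAL);
    INSERT INTO RideDate VALUES
        ('2013-01-10', 'road', 10),
        ('2014-01-05', 'road', 20),
        ('2014-01-20', 'mtb', 30),
        ('2013-02-15', 'road', 40),
        ('2014-02-01', 'mtb', 60);
""")

rows = conn.execute("""
    SELECT CAST(strftime('%m', BikeDate) AS INTEGER) AS month,
           SUM(Miles) AS total_miles,
           AVG(Miles) AS avg_miles,
           -- Only 2014 rides contribute to this column; other years add 0.
           SUM(CASE WHEN strftime('%Y', BikeDate) = '2014' THEN Miles ELSE 0 END) AS miles_2014
    FROM RideDate
    GROUP BY month
    ORDER BY month
""").fetchall()

for row in rows:
    print(row)
```

Because the CASE sits inside the aggregate, the 2014 column is grouped by month along with the other columns, with no subquery needed.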