SSAS MDX for SUM (DISTINCT Customer's (MAX (Date's Weight))) - ssas

To the MDX gurus,
I have been beating my head against this one for a week and I am nowhere close to solving it. Can it be solved?
Here's the challenge:
Create a calculated member expression in SSAS (BIDS) that computes Weighted_Members, which is defined as follows:
"For any date period chosen, we need to calculate the sum of the Weights that is associated with the most recent visit of a unique member."
In pseudo-code: SUM(DISTINCT Member’s (MAX (Date’s Weight)))
NOTES:
* The WEIGHT is given to a member’s visit to a particular location and is applicable for 1 month.
Here is a sample of the fact table showing:
* Two members (membership id: 100 and 103)
* Visiting 3 different locations (location id: 200, 220 and 230)
* At different dates throughout 2014 and 2015.
Visits_F_ID | Visit_Date | Membership_ID | Location_ID | Weights |
1 | Jan 1, 2014 | 100 | 230 | 3.5 |
2 | Mar 1, 2014 | 100 | 220 | 2.0 |
3 | May 1, 2015 | 100 | 220 | 2.5 |
4 | Apr 1, 2014 | 103 | 200 | 1.0 |
5 | Jul 1, 2014 | 103 | 220 | 1.5 |
6 | Sep 1, 2014 | 103 | 230 | 0.5 |
7 | Nov 1, 2014 | 103 | 220 | 3.0 |
8 | Jan 1, 2015 | 103 | 220 | 1.0 |
9 | Aug 1, 2015 | 103 | 200 | 7.0 |
10 | Sep 1, 2015 | 103 | 230 | 4.5 |
11 | Dec 1, 2015 | 103 | 200 | 1.5 |
Dimensions:
The Visit Date Dimension has the following attributes:
* YEAR
* Quarter
* MONTH
* Date
* Calendar Year->Quarter->Month->Date (calendar_quarter_hierarchy)
* Calendar Year->Month->Date (calendar_month_hierarchy)
The Membership Dimension has the following attributes:
* membership_id (currently visibility set to false (or hidden) as there are >5M records)
* Gender
* Age Cohort
The Location Dimension has the following attributes:
* Location_ID
* Location_Name
* City
* Province
* Province->City->Location_Name (Geographical_hierarchy)
Examples:
Example #1.) The Weighted_Members for the YEAR 2014 would be calculated as follows:
STEP 1: filtering the fact data for activity in YEAR 2014.
Visits_F_ID | Visit_Date | Membership_ID | Location_ID | Weights |
=============================================================================
1 | Jan 1, 2014 | 100 | 230 | 3.5 |
2 | Mar 1, 2014 | 100 | 220 | 2.0 |
4 | Apr 1, 2014 | 103 | 200 | 1.0 |
5 | Jul 1, 2014 | 103 | 220 | 1.5 |
6 | Sep 1, 2014 | 103 | 230 | 0.5 |
7 | Nov 1, 2014 | 103 | 220 | 3.0 |
STEP 2: taking the data with the most recent date for each unique member from the above:
Visits_F_ID | Visit_Date | Membership_ID | Location_ID | Weights |
=============================================================================
2 | Mar 1, 2014 | 100 | 220 | 2.0 |
7 | Nov 1, 2014 | 103 | 220 | 3.0 |
STEP 3: sum the Weights to give the Weighted_Members = 2.0 + 3.0 = 5.0
======
Example #2.) If the cube user slices for the time period of 2015, following the same three steps in example #1 above, the Weighted_Members:
Visits_F_ID | Visit_Date | Membership_ID | Location_ID | Weights |
=============================================================================
3 | May 1, 2015 | 100 | 220 | 2.5 |
11 | Dec 1, 2015 | 103 | 200 | 1.5 |
Weighted_Members = 2.5 + 1.5 = 4.0
======
Example #3.) If the cube user slices for the time period of Mar 2014 to Oct 2014 and is interested in visits to location_id = 220, the Weighted_Members:
Visits_F_ID | Visit_Date | Membership_ID | Location_ID | Weights |
=============================================================================
2 | Mar 1, 2014 | 100 | 220 | 2.0 |
5 | Jul 1, 2014 | 103 | 220 | 1.5 |
Weighted_Members = 2.0 + 1.5 = 3.5
======
Example #4.) If the cube user slices for the time period of July 2015 to Aug 2015, the Weighted_Members:
Visits_F_ID | Visit_Date | Membership_ID | Location_ID | Weights |
=============================================================================
9 | Aug 1, 2015 | 103 | 200 | 7.0 |
Weighted_Members = 7.0

Based on my understanding, you can give this a try:
WITH MEMBER Measures.YourCalcMember AS
    SUM
    (
        GENERATE
        (
            -- Level members only, so the All member is not iterated as if it were a customer
            Customer.CustomerID.CustomerID.MEMBERS AS S,
            S.CURRENT *
            TAIL
            (
                NonEmpty
                (
                    [Date].[Date].[Date].MEMBERS, --The last date for the "current" customer
                    (S.CURRENT, [Measures].[Weight])
                )
            )
        ),
        Measures.[Weight]
    )
SELECT Measures.YourCalcMember ON 0,
    Location.LocationID.MEMBERS ON 1
FROM
(
    SELECT [Date].[Year].&[2014] ON 0 FROM [Your Cube] --The year filter
)
The "generate" function loops through the customers and builds a set of (customer, last date) pairs, where the last date is the final date on which that customer has a non-empty Weight. SUM then adds up the Weight over that set.
All said, further details are needed before this question can be attempted correctly.
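As a cross-check outside the cube, the three steps from the question (filter, keep each member's most recent visit, sum) can be sketched in plain Python against the sample fact rows. This is only an illustration of the expected numbers, not an MDX solution; the function name `weighted_members` and its parameters are made up for this sketch, and the date ranges are assumed to be inclusive.

```python
from datetime import date

# Sample fact rows from the question:
# (visit_id, visit_date, membership_id, location_id, weight)
VISITS = [
    (1, date(2014, 1, 1), 100, 230, 3.5),
    (2, date(2014, 3, 1), 100, 220, 2.0),
    (3, date(2015, 5, 1), 100, 220, 2.5),
    (4, date(2014, 4, 1), 103, 200, 1.0),
    (5, date(2014, 7, 1), 103, 220, 1.5),
    (6, date(2014, 9, 1), 103, 230, 0.5),
    (7, date(2014, 11, 1), 103, 220, 3.0),
    (8, date(2015, 1, 1), 103, 220, 1.0),
    (9, date(2015, 8, 1), 103, 200, 7.0),
    (10, date(2015, 9, 1), 103, 230, 4.5),
    (11, date(2015, 12, 1), 103, 200, 1.5),
]


def weighted_members(visits, start, end, location_id=None):
    """Sum the weight of each member's latest visit within [start, end]."""
    latest = {}  # membership_id -> (visit_date, weight)
    for _visit_id, visit_date, member, location, weight in visits:
        # Step 1: keep only visits inside the slice (dates, optional location).
        if not (start <= visit_date <= end):
            continue
        if location_id is not None and location != location_id:
            continue
        # Step 2: remember the most recent visit per member.
        if member not in latest or visit_date > latest[member][0]:
            latest[member] = (visit_date, weight)
    # Step 3: sum the weights of those most recent visits.
    return sum(weight for _visit_date, weight in latest.values())


print(weighted_members(VISITS, date(2014, 1, 1), date(2014, 12, 31)))       # Example 1: 5.0
print(weighted_members(VISITS, date(2015, 1, 1), date(2015, 12, 31)))       # Example 2: 4.0
print(weighted_members(VISITS, date(2014, 3, 1), date(2014, 10, 31), 220))  # Example 3: 3.5
print(weighted_members(VISITS, date(2015, 7, 1), date(2015, 8, 31)))        # Example 4: 7.0
```

Note that the location filter is applied before the most recent visit is chosen, which is what Example 3 in the question implies.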

Related

Join two columns as a date in sql

I am currently working with a report through Microsoft Query and I ran into this problem where I need to calculate the total amount of money for the past year.
The table looks like this:
Item Number | Month | Year | Amount |
...........PAST YEARS DATA...........
12345 | 1 | 2019 | 10 |
12345 | 2 | 2019 | 20 |
12345 | 3 | 2019 | 15 |
12345 | 4 | 2019 | 12 |
12345 | 5 | 2019 | 11 |
12345 | 6 | 2019 | 12 |
12345 | 7 | 2019 | 12 |
12345 | 8 | 2019 | 10 |
12345 | 9 | 2019 | 10 |
12345 | 10 | 2019 | 10 |
12345 | 11 | 2019 | 10 |
12345 | 12 | 2019 | 10 |
12345 | 1 | 2020 | 10 |
12345 | 2 | 2020 | 10 |
How would you calculate the total amount from 02-2019 to 02-2020 for the item number 12345?
Assuming that you are running SQL Server, you can recreate a date with datefromparts() and use it for filtering:
select sum(amount)
from mytable
where
itemnumber = 12345
and datefromparts(year, month, 1) >= '20190201'
and datefromparts(year, month, 1) < '20200301'
You can use this also
SELECT sum(amount) as Amount
FROM YEARDATA
WHERE ( ( Month >= 2 and year = '2019' )
     or ( Month <= 2 and year = '2020' ) )  -- outer parentheses: AND binds tighter than OR
and ItemNumber = '12345'
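To sanity-check the expected total on the sample rows, here is a minimal Python sketch of the same range filter. It compares (year, month) pairs directly instead of building a date; the function name `total_between` is hypothetical, and both range ends are treated as inclusive.

```python
# Sample rows from the question: (item_number, month, year, amount)
AMOUNTS_2019 = [10, 20, 15, 12, 11, 12, 12, 10, 10, 10, 10, 10]
ROWS = [(12345, m, 2019, a) for m, a in enumerate(AMOUNTS_2019, start=1)]
ROWS += [(12345, 1, 2020, 10), (12345, 2, 2020, 10)]


def total_between(rows, item_number, start, end):
    """Sum amounts for one item where start <= (year, month) <= end."""
    return sum(
        amount
        for item, month, year, amount in rows
        # Tuples compare year first, then month, like a date would.
        if item == item_number and start <= (year, month) <= end
    )


print(total_between(ROWS, 12345, (2019, 2), (2020, 2)))  # 152
```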

How to refer to other columns using a condition when creating a calculated column?

Suppose I have a SQL table as shown below where Min Spend is the minimum spend for each year and is a calculated column created using SQL-Window Function
|------------|-------|--------|----------|
| Year |Month | Spend |Min Spend |
|------------|-------|--------|----------|
| 2018 | Jan | 10 | 10 |
| 2018 | Feb | 20 | 10 |
| 2018 | Oct | 25 | 10 |
| 2019 | Jan | 90 | 45 |
| 2019 | Aug | 60 | 45 |
| 2019 | Nov | 45 | 45 |
|------------|-------|--------|----------|
I would like to create a new column as a calculated field in the table that gives me the month corresponding to the 'Min Spend' for that year, as shown below
|------------|-------|--------|----------|---------------|
| Year |Month | Spend |Min Spend |Min Spend Month|
|------------|-------|--------|----------|---------------|
| 2018 | Jan | 10 | 10 | Jan |
| 2018 | Feb | 20 | 10 | Jan |
| 2018 | Oct | 25 | 10 | Jan |
| 2019 | Jan | 90 | 45 | Nov |
| 2019 | Aug | 60 | 45 | Nov |
| 2019 | Nov | 45 | 45 | Nov |
|------------|-------|--------|----------|---------------|
Can anybody suggest how to approach this?
You can use window functions like this:
select t.*,
min(spend) over (partition by year) as min_spend,
first_value(month) over (partition by year order by spend) as min_spend_month
from t;
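The idea behind the window functions (per year, find the lowest spend and carry its month onto every row of that year) can be sketched in Python on the sample rows. The function name `add_min_spend_month` is made up for this sketch. If two months tie for the minimum, this version keeps the first one encountered, whereas the SQL ordering by spend alone does not guarantee which tied month is returned.

```python
# Sample rows from the question: (year, month, spend)
SPEND = [
    (2018, "Jan", 10),
    (2018, "Feb", 20),
    (2018, "Oct", 25),
    (2019, "Jan", 90),
    (2019, "Aug", 60),
    (2019, "Nov", 45),
]


def add_min_spend_month(rows):
    """Append (min_spend, min_spend_month) for the row's year to each row."""
    best = {}  # year -> (min_spend, month of that spend)
    for year, month, spend in rows:
        if year not in best or spend < best[year][0]:
            best[year] = (spend, month)
    return [(y, m, s, best[y][0], best[y][1]) for y, m, s in rows]


for row in add_min_spend_month(SPEND):
    print(row)
# First row:  (2018, 'Jan', 10, 10, 'Jan')
# Fourth row: (2019, 'Jan', 90, 45, 'Nov')
```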

sql unique mapping of columns

I have a database where there are n products, with m units sold on different dates.
For example, bags are sold daily: some days 5, some days 6, etc.
Sample database :
+---------+----------+-------+
| Product | UnitSold | Date |
+---------+----------+-------+
| bag | 1 | 1 jun |
| wallet | 2 | 2 jun |
| purse | 3 | 3 jun |
| bag | 4 | 4 jun |
| shoes | 3 | 4 jun |
| Shirt | 2 | 1 jun |
| bag | 5 | 2 jun |
| shirt | 6 | 3 jun |
| Purse | 1 | 1 jun |
+---------+----------+-------+
I want a unique combination of results where a particular quantity of a product is sold on particular date. How can I do that?
Example I am looking for :
Result:
+---------+----------+-------+
| Product | UnitSold | Date |
+---------+----------+-------+
| bag | 1 | 1 jun |
| purse | 3 | 3 jun |
| shirt | 6 | 3 jun |
+---------+----------+-------+
I want a specific mapping of columns.
How can I do that? I am using Microsoft SQL Server 2008.
You could add a rank or row number if you don't care which particular row comes back for each product.
I threw your data into a temp table and ran the following. It will give me one result per product. With rank, it will give me number 1 based on unit sold. You can change that if you want, based on date or whatever else.
select *
from (
select *,rank() over(partition by product order by unitsold ) as rnk
from #temp a
)final
where rnk = 1
product unitsold Date rnk
bag 1 2017-06-01 1
purse 3 2017-06-02 1
shirt 2 2017-06-02 1
shoes 3 2017-06-04 1
wallet 2 2017-06-02 1
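For illustration, the rank-1-per-product idea can be sketched in Python on the sample rows from the question (not the answerer's temp table, so the rows returned differ from the output printed above). Two assumptions are baked in: product names are compared case-insensitively, as a case-insensitive SQL Server collation would do, and ties on units sold are all kept, as rank() does. The function name `lowest_units_per_product` is hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (product, units_sold, date)
SALES = [
    ("bag", 1, "1 jun"),
    ("wallet", 2, "2 jun"),
    ("purse", 3, "3 jun"),
    ("bag", 4, "4 jun"),
    ("shoes", 3, "4 jun"),
    ("Shirt", 2, "1 jun"),
    ("bag", 5, "2 jun"),
    ("shirt", 6, "3 jun"),
    ("Purse", 1, "1 jun"),
]


def lowest_units_per_product(rows):
    """Return the row(s) with the fewest units sold for each product."""
    groups = defaultdict(list)
    for product, units, day in rows:
        # Assumption: 'Shirt' and 'shirt' are the same product.
        groups[product.lower()].append((product, units, day))
    result = []
    for key in sorted(groups):
        fewest = min(units for _product, units, _day in groups[key])
        # Like rank(), keep every row tied for first place.
        result.extend(row for row in groups[key] if row[1] == fewest)
    return result


for row in lowest_units_per_product(SALES):
    print(row)
```

Switching to one guaranteed row per product corresponds to row_number() in the SQL; in this sketch that would mean keeping only the first tied row.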

Group By Different Values

I would like to group by the first day of each month and then by the rest of the month. My data spans several years.
I have data like below:
--------------------------------------
DAY MONTH YEAR VISITOR_COUNT
--------------------------------------
1 | 12 | 2014 | 16260
2 | 12 | 2014 | 15119
3 | 12 | 2014 | 14464
4 | 12 | 2014 | 13746
5 | 12 | 2014 | 13286
6 | 12 | 2014 | 14352
7 | 12 | 2014 | 19293
8 | 12 | 2014 | 13338
9 | 12 | 2014 | 13961
10 | 12 | 2014 | 9519
11 | 12 | 2014 | 10204
12 | 12 | 2014 | 9380
13 | 12 | 2014 | 11611
14 | 12 | 2014 | 14839
15 | 12 | 2014 | 10051
16 | 12 | 2014 | 8983
17 | 12 | 2014 | 7348
18 | 12 | 2014 | 7258
19 | 12 | 2014 | 7205
20 | 12 | 2014 | 6113
21 | 12 | 2014 | 5316
22 | 12 | 2014 | 6914
23 | 12 | 2014 | 6880
24 | 12 | 2014 | 6289
25 | 12 | 2014 | 6000
26 | 12 | 2014 | 13328
27 | 12 | 2014 | 10367
28 | 12 | 2014 | 7946
29 | 12 | 2014 | 9042
30 | 12 | 2014 | 9408
31 | 12 | 2014 | 8411
1 | 1 | 2015 | 9965
2 | 1 | 2015 | 10560
3 | 1 | 2015 | 9662
4 | 1 | 2015 | 8735
5 | 1 | 2015 | 12817
6 | 1 | 2015 | 13516
7 | 1 | 2015 | 9800
8 | 1 | 2015 | 10629
9 | 1 | 2015 | 12325
10 | 1 | 2015 | 11899
11 | 1 | 2015 | 11049
12 | 1 | 2015 | 13934
13 | 1 | 2015 | 16833
14 | 1 | 2015 | 13434
15 | 1 | 2015 | 13128
16 | 1 | 2015 | 14660
17 | 1 | 2015 | 11951
18 | 1 | 2015 | 10916
19 | 1 | 2015 | 14126
20 | 1 | 2015 | 16909
21 | 1 | 2015 | 16555
22 | 1 | 2015 | 14726
23 | 1 | 2015 | 14642
24 | 1 | 2015 | 13067
25 | 1 | 2015 | 11738
26 | 1 | 2015 | 15353
27 | 1 | 2015 | 17935
28 | 1 | 2015 | 14448
29 | 1 | 2015 | 15372
30 | 1 | 2015 | 16694
31 | 1 | 2015 | 16763
I would like to be able to group it like below:
--------------------------------------
DAY MONTH YEAR VISITOR_COUNT
--------------------------------------
1 | 12 | 2014 | 16260
2-31| 12 | 2014 | 309971
1 | 1 | 2015 | 9965
2-31| 1 | 2015 | 404176
Microsoft SQL Server 2016. Compatibility level: SQL Server 2005 (90)
Just use case:
select (case when min(day) = 1 then '1'
else concat(min(day), '-', max(day))
end) as day, month, year,
sum(visitor_count)
from t
group by year, month,
(case when day = 1 then 1 else 2 end);
Okay, this is a little tricky. The case in the group by and the case in the select are different. The group by just puts the days into two categories, 1 and others. The select chooses the minimum and maximum days in the month, to construct the range string.
EDIT:
Oy, SQL Server 2005 ???
Of course, you can do the same thing with + and type conversion, or using replace():
select (case when min(day) = 1 then '1'
else replace(replace('#min-#max', '#min', min(day)), '#max', max(day))
end) as day, month, year,
sum(visitor_count)
from t
group by year, month,
(case when day = 1 then 1 else 2 end);
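The two-bucket grouping can be checked against the question's data with a short Python sketch. It mirrors the query's logic: the grouping key only distinguishes day 1 from every other day, while the label is built from the smallest and largest day in each bucket. The function name `group_first_day_and_rest` is made up for this sketch.

```python
# Daily visitor counts from the question, day 1 through day 31.
DEC_2014 = [16260, 15119, 14464, 13746, 13286, 14352, 19293, 13338, 13961,
            9519, 10204, 9380, 11611, 14839, 10051, 8983, 7348, 7258, 7205,
            6113, 5316, 6914, 6880, 6289, 6000, 13328, 10367, 7946, 9042,
            9408, 8411]
JAN_2015 = [9965, 10560, 9662, 8735, 12817, 13516, 9800, 10629, 12325,
            11899, 11049, 13934, 16833, 13434, 13128, 14660, 11951, 10916,
            14126, 16909, 16555, 14726, 14642, 13067, 11738, 15353, 17935,
            14448, 15372, 16694, 16763]

# Rows shaped like the table: (day, month, year, visitor_count)
ROWS = [(d, 12, 2014, c) for d, c in enumerate(DEC_2014, start=1)]
ROWS += [(d, 1, 2015, c) for d, c in enumerate(JAN_2015, start=1)]


def group_first_day_and_rest(rows):
    """Aggregate each month into 'day 1' and 'all other days'."""
    buckets = {}  # (year, month, is_rest) -> [min_day, max_day, total]
    for day, month, year, count in rows:
        key = (year, month, day != 1)  # same role as the CASE in GROUP BY
        if key not in buckets:
            buckets[key] = [day, day, 0]
        bucket = buckets[key]
        bucket[0] = min(bucket[0], day)
        bucket[1] = max(bucket[1], day)
        bucket[2] += count
    result = []
    for (year, month, _is_rest), (lo, hi, total) in sorted(buckets.items()):
        label = "1" if lo == 1 else f"{lo}-{hi}"  # same role as the CASE in SELECT
        result.append((label, month, year, total))
    return result


for row in group_first_day_and_rest(ROWS):
    print(row)
# ('1', 12, 2014, 16260)
# ('2-31', 12, 2014, 309971)
# ('1', 1, 2015, 9965)
# ('2-31', 1, 2015, 404176)
```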

Get data for fiscal year from table without date columns

I'm trying to create a query (purpose: manual DB testing) that would get the rows of the previous/current/next Fiscal Year and then the SUM(turnover)
Given
(1) the below table,
and
(2) Fiscal Year (FY) = March to February
When Previous FY -- Then 2 rows: 2016/1 to 2016/2
When Current FY -- Then 12 rows: from 2016/3 to 2017/2 (year/month)
When Future FY -- Then 1 row: 2017/3
+--------------+---------------+----------+
| Year (num) | Month (num) | Turnover |
+--------------+---------------+----------+
| 2016 | 1 | 1000 |
+--------------+---------------+----------+
| 2016 | 2 | 2000 |
+--------------+---------------+----------+
| 2016 | 3 | 3000 |
+--------------+---------------+----------+
| 2016 | 4 | 4000 |
+--------------+---------------+----------+
| 2016 | 5 | 2000 |
+--------------+---------------+----------+
| 2016 | 6 | 1000 |
+--------------+---------------+----------+
| 2016 | 7 | 2000 |
+--------------+---------------+----------+
| 2016 | 8 | 1000 |
+--------------+---------------+----------+
| 2016 | 9 | 2000 |
+--------------+---------------+----------+
| 2016 | 10 | 3000 |
+--------------+---------------+----------+
| 2016 | 11 | 4000 |
+--------------+---------------+----------+
| 2016 | 12 | 5000 |
+--------------+---------------+----------+
| 2017 | 1 | 6000 |
+--------------+---------------+----------+
| 2017 | 2 | 2000 |
+--------------+---------------+----------+
| 2017 | 3 | 1000 |
+--------------+---------------+----------+
The best solution I came up with is the query below, where I change the Year values to switch between fiscal years. It feels hacky to me because it creates an extra column with sysdate and then checks for NOT NULL. Is there a more elegant way?
WITH CTE AS (
SELECT
CASE
WHEN Month BETWEEN 3 AND 12 AND Year = 2016
THEN sysdate
WHEN Month BETWEEN 1 AND 2 AND Year = 2017
THEN sysdate
END case_statement_date,
year, month, turnover FROM Table
)
SELECT sum(turnover) FROM CTE
WHERE case_statement_date IS NOT NULL
;
Is this what you want?
select year + (case when month >= 3 then 0 else -1 end) as fiscal_year,
sum(turnover)
from t
group by year + (case when month >= 3 then 0 else -1 end) ;
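The fiscal-year mapping in this query (months 3 to 12 belong to the fiscal year named after the calendar year, months 1 and 2 to the previous one) can be checked on the sample table with a small Python sketch. The names `fiscal_year` and `turnover_by_fiscal_year` are made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (year, month, turnover)
TURNOVER = [
    (2016, 1, 1000), (2016, 2, 2000), (2016, 3, 3000), (2016, 4, 4000),
    (2016, 5, 2000), (2016, 6, 1000), (2016, 7, 2000), (2016, 8, 1000),
    (2016, 9, 2000), (2016, 10, 3000), (2016, 11, 4000), (2016, 12, 5000),
    (2017, 1, 6000), (2017, 2, 2000), (2017, 3, 1000),
]


def fiscal_year(year, month):
    """Fiscal year runs March to February and is labelled by its starting year."""
    return year if month >= 3 else year - 1


def turnover_by_fiscal_year(rows):
    """Sum turnover per fiscal year."""
    totals = defaultdict(int)
    for year, month, turnover in rows:
        totals[fiscal_year(year, month)] += turnover
    return dict(totals)


print(turnover_by_fiscal_year(TURNOVER))
# {2015: 3000, 2016: 35000, 2017: 1000}
```

With this labelling, the question's previous, current, and future fiscal years correspond to 2015 (2 rows), 2016 (12 rows), and 2017 (1 row).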