MDX last non empty over multiple dimensions - ssas

I would greatly appreciate it if someone could help me with this
problem. I have the following fact table:
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| EntryNo | ItemNo | CompanyId | BranchId | LocationId | ValuationDate | ValuatedQty | ValuatedAmount |
+=========+========+===========+==========+============+===============+=============+================+
| 1 | Item1 | 1 | 1 | 1 | 2016-03-01 | 0 | 0 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| 2 | Item1 | 1 | 2 | 1 | 2016-03-01 | 4 | 400 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| 3 | Item1 | 1 | 1 | 1 | 2016-03-02 | 10 | 1000 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| 4 | Item2 | 1 | 1 | 2 | 2016-03-02 | 4 | 200 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| 5 | Item2 | 2 | 2 | 2 | 2016-03-02 | 6 | 300 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| 6 | Item1 | 2 | 2 | 1 | 2016-03-03 | 0 | 0 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| 7 | Item3 | 1 | 2 | 3 | 2016-03-03 | 0 | 0 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
| 8 | Item1 | 2 | 2 | 3 | 2016-03-03 | 9 | 450 |
+---------+--------+-----------+----------+------------+---------------+-------------+----------------+
There are two measures that represent "overstocked" items on a particular day.
Is it possible to create a calculated member that allows slicing the data
on all linked dimensions (Items, Companies, etc.)? I guess the LastNonEmpty aggregation
would be useful here, except that it is not available in the Standard edition.
Given the example, the results should be as follows:
By Company:
+---------+-------------+----------------+
| Company | ValuatedQty | ValuatedAmount |
+=========+=============+================+
| 1 | 14 | 1200 |
+---------+-------------+----------------+
| 2 | 15 | 750 |
+---------+-------------+----------------+
By Date:
+------------+-------------+----------------+
| Date | ValuatedQty | ValuatedAmount |
+============+=============+================+
| 2016-03-01 | 4 | 400 |
+------------+-------------+----------------+
| 2016-03-02 | 16 | 1300 |
+------------+-------------+----------------+
| 2016-03-03 | 9 | 450 |
+------------+-------------+----------------+
By Item:
+-------+-------------+----------------+
| Item | ValuatedQty | ValuatedAmount |
+=======+=============+================+
| Item1 | 9 | 450 |
+-------+-------------+----------------+
| Item2 | 6 | 300 |
+-------+-------------+----------------+
| Item3 | 0 | 0 |
+-------+-------------+----------------+

Two functions that come to mind for your requirements are:
Tail: https://msdn.microsoft.com/en-us/library/ms146056.aspx
Bottomcount: https://msdn.microsoft.com/en-us/library/ms144864.aspx
So with Tail something like the following is possible:
WITH SET [LastYearPerSubCat] AS
  GENERATE(
    [Product].[Product Categories].[SubCategory].members AS S,
    S.CURRENTMEMBER
    *
    TAIL(
      NONEMPTY(
        [Date].[Calendar Year].[Calendar Year].MEMBERS,
        S.CURRENTMEMBER
      )
    )
  )
SELECT
  [Measures].[Reseller Gross Profit] ON 0
 ,[LastYearPerSubCat] ON 1
FROM [Adventure Works];

Related

How do I conditionally increase the value of the preceding row number by 1

I need each row to hold the value of the preceding row increased by 1. When a row meets another condition, I need to reset the counter. This is probably easiest explained with an example:
+---------+------------+------------+-----------+----------------+
| Acct_ID | Ins_Date | Acct_RowID | indicator | Desired_Output |
+---------+------------+------------+-----------+----------------+
| 5841 | 07/11/2019 | 1 | 1 | 1 |
| 5841 | 08/11/2019 | 2 | 0 | 2 |
| 5841 | 09/11/2019 | 3 | 0 | 3 |
| 5841 | 10/11/2019 | 4 | 0 | 4 |
| 5841 | 11/11/2019 | 5 | 1 | 1 |
| 5841 | 12/11/2019 | 6 | 0 | 2 |
| 5841 | 13/11/2019 | 7 | 1 | 1 |
| 5841 | 14/11/2019 | 8 | 0 | 2 |
| 5841 | 15/11/2019 | 9 | 0 | 3 |
| 5841 | 16/11/2019 | 10 | 0 | 4 |
| 5841 | 17/11/2019 | 11 | 0 | 5 |
| 5841 | 18/11/2019 | 12 | 0 | 6 |
| 5132 | 11/03/2019 | 1 | 1 | 1 |
| 5132 | 12/03/2019 | 2 | 0 | 2 |
| 5132 | 13/03/2019 | 3 | 0 | 3 |
| 5132 | 14/03/2019 | 4 | 1 | 1 |
| 5132 | 15/03/2019 | 5 | 0 | 2 |
| 5132 | 16/03/2019 | 6 | 0 | 3 |
| 5132 | 17/03/2019 | 7 | 0 | 4 |
| 5132 | 18/03/2019 | 8 | 0 | 5 |
| 5132 | 19/03/2019 | 9 | 1 | 1 |
| 5132 | 20/03/2019 | 10 | 0 | 2 |
+---------+------------+------------+-----------+----------------+
The column I want to create is 'Desired_Output', and it can be derived from the column 'indicator'. Each row should be the previous row's value plus 1, unless its own indicator is 1, in which case the counter resets to 1.
I have tried to use a loop method of some sort but this did not produce the desired results.
Is this possible in some way?
The trick is to identify each group of consecutive rows that starts at a row with indicator = 1 and runs up to the row before the next one. This is achieved by using cross apply to find, for each row, the largest Acct_RowID at or before it with indicator = 1, and using that as Grp_RowID in the partition by of the row_number() window function:
select *,
       Desired_Output = row_number() over (partition by t.Acct_ID, Grp_RowID
                                           order by Acct_RowID)
from   your_table t
       cross apply
       (
           select Grp_RowID = max(Acct_RowID)
           from   your_table x
           where  x.Acct_ID = t.Acct_ID
           and    x.Acct_RowID <= t.Acct_RowID
           and    x.indicator = 1
       ) g
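To try the grouping idea outside SQL Server, here is a minimal sketch in Python using the standard sqlite3 module. It is not the query above verbatim: SQLite has no CROSS APPLY, so the sketch assumes a correlated subquery as a stand-in, and it needs a SQLite build with window functions (3.25 or later). The rows are a subset of account 5841 from the question; the function name reset_counter is made up for illustration.

```python
import sqlite3

# Subset of the question's sample rows: (Acct_ID, Acct_RowID, indicator).
rows = [
    (5841, 1, 1), (5841, 2, 0), (5841, 3, 0), (5841, 4, 0),
    (5841, 5, 1), (5841, 6, 0), (5841, 7, 1), (5841, 8, 0),
]

def reset_counter(rows):
    """Number rows within each run that starts at indicator = 1."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE your_table (Acct_ID INT, Acct_RowID INT, indicator INT)"
    )
    con.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT Acct_ID, Acct_RowID,
               ROW_NUMBER() OVER (PARTITION BY Acct_ID, Grp_RowID
                                  ORDER BY Acct_RowID) AS Desired_Output
        FROM (
            SELECT t.Acct_ID, t.Acct_RowID,
                   -- Correlated subquery standing in for CROSS APPLY:
                   -- the latest row at or before this one with indicator = 1.
                   (SELECT MAX(x.Acct_RowID)
                    FROM your_table x
                    WHERE x.Acct_ID = t.Acct_ID
                      AND x.Acct_RowID <= t.Acct_RowID
                      AND x.indicator = 1) AS Grp_RowID
            FROM your_table t
        ) AS g
        ORDER BY Acct_ID, Acct_RowID
    """
    return con.execute(query).fetchall()

result = reset_counter(rows)
for row in result:
    print(row)
```

Every row in a run shares the same Grp_RowID, so row_number() restarts at 1 whenever a new indicator = 1 row begins.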

SQL - Window Functions with dense_rank()

I have a dataset structured such as the one below stored in Hive, call it df:
+-----+-----+----------+--------+
| id1 | id2 | date | amount |
+-----+-----+----------+--------+
| 1 | 2 | 11-07-17 | 0.93 |
| 2 | 2 | 11-11-17 | 1.94 |
| 2 | 2 | 11-09-17 | 1.90 |
| 1 | 1 | 11-10-17 | 0.33 |
| 2 | 2 | 11-10-17 | 1.93 |
| 1 | 1 | 11-07-17 | 0.25 |
| 1 | 1 | 11-09-17 | 0.33 |
| 1 | 1 | 11-12-17 | 0.33 |
| 2 | 2 | 11-08-17 | 1.90 |
| 1 | 1 | 11-08-17 | 0.30 |
| 2 | 2 | 11-12-17 | 2.01 |
| 1 | 2 | 11-12-17 | 1.00 |
| 1 | 2 | 11-09-17 | 0.94 |
| 2 | 2 | 11-07-17 | 1.94 |
| 1 | 2 | 11-11-17 | 1.92 |
| 1 | 1 | 11-11-17 | 0.33 |
| 1 | 2 | 11-10-17 | 1.92 |
| 1 | 2 | 11-08-17 | 0.94 |
+-----+-----+----------+--------+
I wish to partition by id1 and id2, and then order by date descending within each grouping of id1 and id2, and then rank "amount" within that, where the same "amount" on consecutive days would receive the same rank. The ordered and ranked output I'd hope to see is shown here:
+-----+-----+------------+--------+------+
| id1 | id2 | date | amount | rank |
+-----+-----+------------+--------+------+
| 1 | 1 | 2017-11-12 | 0.33 | 1 |
| 1 | 1 | 2017-11-11 | 0.33 | 1 |
| 1 | 1 | 2017-11-10 | 0.33 | 1 |
| 1 | 1 | 2017-11-09 | 0.33 | 1 |
| 1 | 1 | 2017-11-08 | 0.30 | 2 |
| 1 | 1 | 2017-11-07 | 0.25 | 3 |
| 1 | 2 | 2017-11-12 | 1.00 | 1 |
| 1 | 2 | 2017-11-11 | 1.92 | 2 |
| 1 | 2 | 2017-11-10 | 1.92 | 2 |
| 1 | 2 | 2017-11-09 | 0.94 | 3 |
| 1 | 2 | 2017-11-08 | 0.94 | 3 |
| 1 | 2 | 2017-11-07 | 0.93 | 4 |
| 2 | 2 | 2017-11-12 | 2.01 | 1 |
| 2 | 2 | 2017-11-11 | 1.94 | 2 |
| 2 | 2 | 2017-11-10 | 1.93 | 3 |
| 2 | 2 | 2017-11-09 | 1.90 | 4 |
| 2 | 2 | 2017-11-08 | 1.90 | 4 |
| 2 | 2 | 2017-11-07 | 1.94 | 5 |
+-----+-----+------------+--------+------+
I attempted this with the following SQL query:
SELECT
    id1,
    id2,
    date,
    amount,
    dense_rank() OVER (PARTITION BY id1, id2 ORDER BY date DESC) AS rank
FROM
    df
GROUP BY
    id1,
    id2,
    date,
    amount
But that query doesn't seem to be doing what I'd like it to as I'm not receiving the output I'm looking for.
It seems like a window function using dense_rank, partition by and order by is what I need but I can't quite seem to get it to give me that sample output that I desire. Any help would be much appreciated! Thanks!
This is quite tricky. I think you need to use lag() to see where the value changes and then do a cumulative sum:
select df.*,
       sum(case when prev_amount = amount then 0 else 1 end) over
           (partition by id1, id2 order by date desc) as rank
from (select df.*,
             lag(amount) over (partition by id1, id2 order by date desc) as prev_amount
      from df
     ) df;
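A quick way to check the lag-plus-cumulative-sum idea is to run it on two of the sample groups. The sketch below uses Python's sqlite3 module rather than Hive, so treat it as an illustration of the logic only. It assumes ISO-formatted dates so that string ordering matches date ordering, a SQLite build with window functions (3.25 or later), and the alias rnk instead of rank; the function name change_ranks is made up.

```python
import sqlite3

# Two groups from the question, with dates rewritten as ISO strings.
sample = [
    (1, 2, "2017-11-07", 0.93), (1, 2, "2017-11-08", 0.94),
    (1, 2, "2017-11-09", 0.94), (1, 2, "2017-11-10", 1.92),
    (1, 2, "2017-11-11", 1.92), (1, 2, "2017-11-12", 1.00),
    (2, 2, "2017-11-07", 1.94), (2, 2, "2017-11-08", 1.90),
    (2, 2, "2017-11-09", 1.90), (2, 2, "2017-11-10", 1.93),
    (2, 2, "2017-11-11", 1.94), (2, 2, "2017-11-12", 2.01),
]

def change_ranks(rows):
    """Rank rows per (id1, id2), bumping the rank whenever amount changes."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE df (id1 INT, id2 INT, date TEXT, amount REAL)")
    con.executemany("INSERT INTO df VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT id1, id2, date, amount,
               -- Running count of "amount differs from previous row".
               SUM(CASE WHEN prev_amount = amount THEN 0 ELSE 1 END)
                   OVER (PARTITION BY id1, id2 ORDER BY date DESC) AS rnk
        FROM (
            SELECT df.*,
                   LAG(amount) OVER (PARTITION BY id1, id2
                                     ORDER BY date DESC) AS prev_amount
            FROM df
        ) AS d
        ORDER BY id1, id2, date DESC
    """
    return con.execute(query).fetchall()

ranked = change_ranks(sample)
for row in ranked:
    print(row)
```

Note the group id1 = 2, id2 = 2: the amount 1.94 appears on 2017-11-11 and again on 2017-11-07, and the two rows get different ranks because they are not consecutive. That is the behaviour a plain dense_rank() over amount cannot give.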

SQL Getting Running Count with SUM and OVER

In sql I have a history table for each item we have and they can have a record of in or out with a quantity for each action. I'm trying to get a running count of how many of an item we have based on whether it's an activity of out or in. Here is my final sql:
SELECT itemid,
       activitydate,
       activitycode,
       SUM(quantity) AS quantity,
       SUM(CASE WHEN activitycode = 'IN'
                THEN quantity
                WHEN activitycode = 'OUT'
                THEN -quantity
                ELSE 0 END) OVER (PARTITION BY itemid ORDER BY activitydate rows unbounded preceding) AS runningcount
FROM itemhistory
GROUP BY itemid,
         activitydate,
         activitycode
This results in:
+--------+-------------------------+--------------+----------+--------------+
| itemid | activitydate | activitycode | quantity | runningcount |
+--------+-------------------------+--------------+----------+--------------+
| 1 | 2017-06-08 13:58:00.000 | IN | 1 | 1 |
| 1 | 2017-06-08 16:02:00.000 | IN | 6 | 2 |
| 1 | 2017-06-15 11:43:00.000 | OUT | 3 | 1 |
| 1 | 2017-06-19 12:36:00.000 | IN | 1 | 2 |
| 2 | 2017-06-08 13:50:00.000 | IN | 5 | 1 |
| 2 | 2017-06-12 12:41:00.000 | IN | 4 | 2 |
| 2 | 2017-06-15 11:38:00.000 | OUT | 2 | 1 |
| 2 | 2017-06-20 12:54:00.000 | IN | 15 | 2 |
| 2 | 2017-06-08 13:52:00.000 | IN | 5 | 3 |
| 2 | 2017-06-12 13:09:00.000 | IN | 1 | 4 |
| 2 | 2017-06-15 11:47:00.000 | OUT | 1 | 3 |
| 2 | 2017-06-20 13:14:00.000 | IN | 1 | 4 |
+--------+-------------------------+--------------+----------+--------------+
I want the end result to look like this:
+--------+-------------------------+--------------+----------+--------------+
| itemid | activitydate | activitycode | quantity | runningcount |
+--------+-------------------------+--------------+----------+--------------+
| 1 | 2017-06-08 13:58:00.000 | IN | 1 | 1 |
| 1 | 2017-06-08 16:02:00.000 | IN | 6 | 7 |
| 1 | 2017-06-15 11:43:00.000 | OUT | 3 | 4 |
| 1 | 2017-06-19 12:36:00.000 | IN | 1 | 5 |
| 2 | 2017-06-08 13:50:00.000 | IN | 5 | 5 |
| 2 | 2017-06-12 12:41:00.000 | IN | 4 | 9 |
| 2 | 2017-06-15 11:38:00.000 | OUT | 2 | 7 |
| 2 | 2017-06-20 12:54:00.000 | IN | 15 | 22 |
| 2 | 2017-06-08 13:52:00.000 | IN | 5 | 27 |
| 2 | 2017-06-12 13:09:00.000 | IN | 1 | 28 |
| 2 | 2017-06-15 11:47:00.000 | OUT | 1 | 27 |
| 2 | 2017-06-20 13:14:00.000 | IN | 1 | 28 |
+--------+-------------------------+--------------+----------+--------------+
You want sum(sum()), because this is an aggregation query:
SELECT itemid, activitydate, activitycode,
       SUM(quantity) AS quantity,
       SUM(SUM(CASE WHEN activitycode = 'IN' THEN quantity
                    WHEN activitycode = 'OUT' THEN -quantity
                    ELSE 0
               END)
          ) OVER (PARTITION BY itemid ORDER BY activitydate) AS runningcount
FROM itemhistory
GROUP BY itemid, activitydate, activitycode
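One way to read the nested SUM(SUM(...)) is as two steps: the inner SUM is the GROUP BY aggregate, and the outer SUM ... OVER is a running total across the grouped rows. The sketch below spells those two steps out with a derived table, using Python's sqlite3 module (window functions need SQLite 3.25 or later). It is an equivalent rewrite for illustration, not the query above verbatim; the column name net and the function name running_counts are assumptions, and only the item 1 rows from the question are loaded.

```python
import sqlite3

# Item 1 rows from the question: (itemid, activitydate, activitycode, quantity).
history = [
    (1, "2017-06-08 13:58:00", "IN", 1),
    (1, "2017-06-08 16:02:00", "IN", 6),
    (1, "2017-06-15 11:43:00", "OUT", 3),
    (1, "2017-06-19 12:36:00", "IN", 1),
]

def running_counts(rows):
    """Aggregate per activity first, then take a running total per item."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE itemhistory "
        "(itemid INT, activitydate TEXT, activitycode TEXT, quantity INT)"
    )
    con.executemany("INSERT INTO itemhistory VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT itemid, activitydate, activitycode, quantity,
               -- Step 2: running total of the signed, grouped quantities.
               SUM(net) OVER (PARTITION BY itemid
                              ORDER BY activitydate) AS runningcount
        FROM (
            -- Step 1: the ordinary GROUP BY aggregate.
            SELECT itemid, activitydate, activitycode,
                   SUM(quantity) AS quantity,
                   SUM(CASE WHEN activitycode = 'IN'  THEN quantity
                            WHEN activitycode = 'OUT' THEN -quantity
                            ELSE 0 END) AS net
            FROM itemhistory
            GROUP BY itemid, activitydate, activitycode
        ) AS g
        ORDER BY itemid, activitydate
    """
    return con.execute(query).fetchall()

totals = running_counts(history)
for row in totals:
    print(row)
```

The original attempt windowed over the raw CASE expression inside an aggregation query, which is why the running count did not follow the quantities.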

Count rows each month of a year - SQL Server

I have a table "Product" as follows:
| ProductId | ProductCatId | Price | Date | Deadline |
--------------------------------------------------------------------
| 1 | 1 | 10.00 | 2016-01-01 | 2016-01-27 |
| 2 | 2 | 10.00 | 2016-02-01 | 2016-02-27 |
| 3 | 3 | 10.00 | 2016-03-01 | 2016-03-27 |
| 4 | 1 | 10.00 | 2016-04-01 | 2016-04-27 |
| 5 | 3 | 10.00 | 2016-05-01 | 2016-05-27 |
| 6 | 3 | 10.00 | 2016-06-01 | 2016-06-27 |
| 7 | 1 | 20.00 | 2016-01-01 | 2016-01-27 |
| 8 | 2 | 30.00 | 2016-02-01 | 2016-02-27 |
| 9 | 1 | 40.00 | 2016-03-01 | 2016-03-27 |
| 10 | 4 | 15.00 | 2016-04-01 | 2016-04-27 |
| 11 | 1 | 25.00 | 2016-05-01 | 2016-05-27 |
| 12 | 5 | 55.00 | 2016-06-01 | 2016-06-27 |
| 13 | 5 | 55.00 | 2016-06-01 | 2016-01-27 |
| 14 | 5 | 55.00 | 2016-06-01 | 2016-02-27 |
| 15 | 5 | 55.00 | 2016-06-01 | 2016-03-27 |
I want to create a stored procedure that counts the rows of Product for each month, restricted to the current year, like this:
| Month| SumProducts | SumExpiredProducts |
-------------------------------------------
| 1 | 3 | 3 |
| 2 | 3 | 3 |
| 3 | 3 | 3 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
| 6 | 2 | 2 |
What should I do?
You can use a query like the following:
SELECT MONTH([Date]) AS [Month],
       COUNT(*) AS SumProducts,
       COUNT(CASE WHEN [Date] > Deadline THEN 1 END) AS SumExpiredProducts
FROM mytable
WHERE YEAR([Date]) = YEAR(GETDATE())
GROUP BY MONTH([Date])
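The conditional count works because COUNT ignores NULLs, and a CASE with no ELSE yields NULL for rows that fail the test. Below is a small sketch of the same pattern in Python's sqlite3 module. Several things are assumptions for the sake of a runnable example: SQLite has no MONTH, YEAR or GETDATE, so strftime and an explicit year parameter stand in for them; only a few of the question's rows are loaded; and the function name monthly_counts is made up. The "expired" test is the answer's Date > Deadline comparison, kept as-is.

```python
import sqlite3

# A subset of the question's rows: (ProductId, Date, Deadline).
products = [
    (1, "2016-01-01", "2016-01-27"),
    (7, "2016-01-01", "2016-01-27"),
    (6, "2016-06-01", "2016-06-27"),
    (12, "2016-06-01", "2016-06-27"),
    (13, "2016-06-01", "2016-01-27"),
    (14, "2016-06-01", "2016-02-27"),
]

def monthly_counts(rows, year):
    """Count all rows and 'expired' rows per month of the given year."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (ProductId INT, Date TEXT, Deadline TEXT)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT CAST(strftime('%m', Date) AS INTEGER) AS Month,
               COUNT(*) AS SumProducts,
               -- CASE without ELSE gives NULL, which COUNT skips.
               COUNT(CASE WHEN Date > Deadline THEN 1 END) AS SumExpiredProducts
        FROM mytable
        WHERE strftime('%Y', Date) = ?
        GROUP BY strftime('%m', Date)
        ORDER BY 1
    """
    return con.execute(query, (str(year),)).fetchall()

counts = monthly_counts(products, 2016)
for row in counts:
    print(row)
```

If "expired" should instead mean that the deadline has already passed today, the condition inside the CASE is the only part that would need to change.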

Join Distinct or First

I have a table structure for SalesItems and Sales.
SalesItems is set up something like this:
| SaleItemID | SaleID | ProductID | ProductType |
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 3 | 1 | 15 | 1 |
| 4 | 2 | 5 | 2 |
| 5 | 3 | 1 | 1 |
| 6 | 3 | 8 | 5 |
And Sales is set up something like this:
| Sale | Cash |
| 1 | 1.00 |
| 2 | 10.00 |
| 3 | 28.50 |
I am trying to export a basic 'Daily History' that uses joins to spit out the information like this.
| Date | StoreID | Type1Sales | Type2Sales | ... | Cash Taken |
| 5/2 | 50 | 50 | 40 | ... | 39.50 |
| 5/3 | 50 | 10 | 32.50 | ... | 48.50 |
The issue I'm having is if I do an inner join From Sales to Sales Items, I'll end up with this.
| SaleItemID | SaleID | ProductID | ProductType | Sale | Cash |
| 1 | 1 | 1 | 1 | 1 | 1.00 |
| 2 | 1 | 2 | 2 | 1 | 1.00 |
| 3 | 1 | 15 | 1 | 1 | 1.00 |
| 4 | 2 | 5 | 2 | 2 | 10.00 |
| 5 | 3 | 1 | 1 | 3 | 28.50 |
| 6 | 3 | 8 | 5 | 3 | 28.50 |
So if I do a SUM(Cash), then I'll end up returning $70.00, instead of the correct $39.50. I'm not the best with joins, so I've been researching outer joins and such, but none of those seem to work as it's still matching up. Is there a way to only match on the FIRST instance, and return NULL for the rest? For example, something like this
| SaleItemID | SaleID | ProductID | ProductType | Sale | Cash |
| 1 | 1 | 1 | 1 | 1 | 1.00 |
| 2 | 1 | 2 | 2 | 1 | NULL |
| 3 | 1 | 15 | 1 | 1 | NULL |
| 4 | 2 | 5 | 2 | 2 | 10.00 |
| 5 | 3 | 1 | 1 | 3 | 28.50 |
| 6 | 3 | 8 | 5 | 3 | NULL |
Or do you have any other suggestions for returning back the correct amount of Cash for each particular day?
Use DISTINCT(SaleID) in your SELECT to return a single row for each Sale ID.
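A different approach, which is not the one suggested in the answer above, is to collapse SalesItems to one row per sale before joining, so that each Cash value is matched exactly once. The sketch below tries this on the question's sample data with Python's sqlite3 module. The column names Type1Items and Type2Items and the function name daily_totals are made up for illustration, and the per-type columns count items because the sample SalesItems table has no amount column.

```python
import sqlite3

# Sample rows from the question.
sales_items = [  # (SaleItemID, SaleID, ProductID, ProductType)
    (1, 1, 1, 1), (2, 1, 2, 2), (3, 1, 15, 1),
    (4, 2, 5, 2), (5, 3, 1, 1), (6, 3, 8, 5),
]
sales = [(1, 1.00), (2, 10.00), (3, 28.50)]  # (Sale, Cash)

def daily_totals():
    """Return (naive joined cash sum, (cash, type 1 items, type 2 items))."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE SalesItems "
        "(SaleItemID INT, SaleID INT, ProductID INT, ProductType INT)"
    )
    con.execute("CREATE TABLE Sales (Sale INT, Cash REAL)")
    con.executemany("INSERT INTO SalesItems VALUES (?, ?, ?, ?)", sales_items)
    con.executemany("INSERT INTO Sales VALUES (?, ?)", sales)

    # Joining item rows directly repeats Cash once per item.
    naive = con.execute(
        "SELECT SUM(s.Cash) FROM Sales s "
        "JOIN SalesItems i ON i.SaleID = s.Sale"
    ).fetchone()[0]

    # Aggregating items per sale first keeps one row per sale in the join.
    fixed = con.execute("""
        SELECT SUM(s.Cash), SUM(i.Type1Items), SUM(i.Type2Items)
        FROM Sales s
        JOIN (
            SELECT SaleID,
                   SUM(CASE WHEN ProductType = 1 THEN 1 ELSE 0 END) AS Type1Items,
                   SUM(CASE WHEN ProductType = 2 THEN 1 ELSE 0 END) AS Type2Items
            FROM SalesItems
            GROUP BY SaleID
        ) AS i ON i.SaleID = s.Sale
    """).fetchone()
    return naive, fixed

naive_cash, fixed_totals = daily_totals()
print(naive_cash, fixed_totals)
```

The naive join reproduces the inflated total from the question, while the pre-aggregated join gives the expected cash figure alongside the per-type counts.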