OLAP dimension for boolean, time, selective count

I have just started tinkering with MS SQL Analysis Services. To start, I'm creating one cube from a sales detail table. For the dimensions I have created ProductDim from the product master table, LocationDim from the location tables, and a CalendarDim.
However, I'm stuck on how to provide the following:
boolean: how do I let users filter active/inactive transactions? Should I create a dimension containing two values, TRUE and FALSE?
time: should I create a dimension containing 00:00:00 to 23:59:59, or should I merge time into my calendar dimension?
transaction count: one transaction can have many line items, each with a line item id and a transaction id. How do I set things up so users can see the transaction count? The count on the measure is currently the line item count.

I've been reading about this quite a bit recently, so I will try to answer each one as far as theory suggests:
boolean: you should create something called a 'junk' dimension: it's basically a small dimension that groups low-cardinality flags and indicators, such as your active/inactive flag. http://en.wikipedia.org/wiki/Dimension_(data_warehouse)
time: you probably don't want the time dimension merged with the calendar. You'll end up storing way too many records: at minute granularity, one day is 24 * 60 = 1440 records, and a merged dimension would repeat those for every calendar day. Decide how granular you want to go (per minute, per second?) and then store a single day's worth of time in a separate 'Time' dimension. Your fact tables will then have two keys, one to your calendar dimension and one to your 'Time' dimension.
transaction count: this should be a measure rather than a dimension, I think. I assume the transaction id is repeated because you have multiple line items per transaction. When you set up the measure, you can use a 'distinct count' of the transaction id.
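
The difference between the line item count and a distinct count of transaction ids, and the size of a minute-grain time dimension, can be illustrated outside SSAS. The following is only a minimal sketch in Python with an in-memory SQLite table; the table and column names (sales_detail, transaction_id, line_item_id, is_active) are assumptions for illustration, not the asker's real schema.

```python
import sqlite3

# Hypothetical sales detail table: one row per line item.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_detail ("
    "transaction_id INTEGER, line_item_id INTEGER, is_active INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 10.0),
        (1, 2, 1, 5.0),
        (2, 1, 0, 7.5),
        (3, 1, 1, 2.0),
        (3, 2, 1, 3.0),
        (3, 3, 1, 4.0),
    ],
)

# COUNT(*) counts line items; COUNT(DISTINCT ...) counts transactions.
line_item_count, transaction_count = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT transaction_id) FROM sales_detail"
).fetchone()

# Filtering on the boolean flag, as a user would via a small flag dimension.
active_transactions = conn.execute(
    "SELECT COUNT(DISTINCT transaction_id) FROM sales_detail WHERE is_active = 1"
).fetchone()[0]

# A minute-grain time dimension: one row per minute of a single day,
# keyed as HHMM (e.g. 05:03 -> 503).
time_dim = [(h * 100 + m, h, m) for h in range(24) for m in range(60)]
```

In the cube itself, the equivalent of COUNT(DISTINCT transaction_id) is a measure whose aggregation is set to distinct count on the transaction id column.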

Related

SSAS Date stored as text

I have a measure table for forecast that has a MMM-YY date stored as text;
Period    Forecast
------------------
Jan-20    200
Feb-20    300
I also have some other tables in my model that have similar date formats, e.g. (1/2020) or 2020_1. Hence I created a date dimension that maps each period to an actual datetime and linked it to the fact table:
Period    (Month/Year)    Year_Month    MonthEnd
------------------------------------------------
Jan-20    (1/2020)        2020_1        31/01/2020
Feb-20    (2/2020)        2020_2        29/02/2020
This is causing me two issues:
If I slice the forecast by period I get the right answer, but if I slice by the datetime field 'MonthEnd', SSAS can't allocate the costs across the attributes and I get the total in each month (so 500 in both Jan and Feb in this example). Why?
I can't connect time as a referenced dimension to the date dimension so I can't use any time intelligence features.
I could just swap the period ID for a datetime on ETL to standardise the date fields across the model, but I wondered if there was a standard way to approach this?
https://imgur.com/gallery/onxtvhq

In Analysis Services Multidimensional models you need to standardize on one format for representing a period and have all measure groups use that. I would recommend you change the SQL Query for your Actuals measure group to return values that join to the Period column in your Date table.
Understanding how this works means understanding attribute relationships and the IgnoreUnrelatedDimensions setting. If it is set to true, then slicing by an “unrelated” attribute (one that is below the grain of the measure group, or one from an unrelated dimension) just causes the measure to repeat. If it is set to false, the measure becomes null instead.
I’m unclear why you need Time as a reference dimension. It appears to also contain a Date hierarchy. Typically Date is for days, weeks, months and years, and Time is for hours, minutes and seconds. For processing performance reasons I would avoid reference dimensions. They are more trouble than they are worth. Add the Time dimension key to your fact tables.
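
Standardizing on one period representation could, for instance, be done in the ETL by mapping each text format to a real month-end date. The sketch below is a hedged illustration in Python, not part of SSAS: the function name month_end is hypothetical, it only handles the three formats mentioned in the question, and it assumes two-digit years belong to the 2000s.

```python
import calendar
import re
from datetime import date

# English month abbreviations, hardcoded to avoid locale-dependent lookups.
MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}


def month_end(period: str) -> date:
    """Map 'Jan-20', '(1/2020)' or '2020_1' to the last day of that month."""
    text = period.strip()
    match = re.fullmatch(r"([A-Za-z]{3})-(\d{2})", text)
    if match:
        month = MONTHS[match.group(1).title()]
        year = 2000 + int(match.group(2))  # assumption: years are 20xx
    else:
        match = re.fullmatch(r"\((\d{1,2})/(\d{4})\)", text)
        if match:
            month, year = int(match.group(1)), int(match.group(2))
        else:
            match = re.fullmatch(r"(\d{4})_(\d{1,2})", text)
            if not match:
                raise ValueError(f"unrecognised period format: {period!r}")
            year, month = int(match.group(1)), int(match.group(2))
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)
```

With every measure group keyed on the same date value, all of them join to the date dimension at the same grain.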

The screenshot shows there is a relationship between Date and Forecast, so I do not think that is the root cause. However, you can try GregGalloway's suggestion and set the IgnoreUnrelatedDimensions property to False to test.

Exclude the last fact row of a sum measure

I have a fact table that has two columns, 'timeInEventA' and 'timeInEventB'. These columns store the difference in seconds between the current ticket and the next ticket.
Ex: If I have an 'eventA' ticket at 2020/01/04 05:00:00 and the next ticket is at 2020/01/04 05:01:10, the column 'timeInEventA' in the first ticket will have the value '70'.
It is possible for a ticket to have neither eventA nor eventB, in which case both values in the fact table row will be 0.
This difference is calculated in the ETL and stored in the fact table.
Problem: The client will filter the period by day, for example 'between 2020/01/03 and 2020/01/05, give me the sum of timeInEventA and timeInEventB'. But it was decided that the last ticket in the filtered range must be excluded, because its next event is outside the filter range. So what can I do to exclude the last row from the sum?
My fact table has these two measure columns, a surrogate key to the date dimension (ex: 20200103), a surrogate key for time with minute granularity (ex: an event that occurred at 05:03:22 gets the surrogate key 0503), and a surrogate key to the customer dimension.
PS: In the past I had this problem in another situation, and it was suggested that I create a sum measure on the measure column, a LastValue measure on the same column, and a derived calculation that subtracts the LastValue from the sum ( MDX: Exclude a member that share same dimension property of a measure ).
But this approach doesn't solve the current situation. If my filter is 20200102 to 20200108, it will exclude all 20200108 values from the calculation.
Best Regards,
Luis

You need to define another measure that uses the last non-empty value aggregation (this is done in the SSAS project: Cube Structure tab -> define a new measure). Then subtract its value from your sum measure.
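
The intended arithmetic (sum over the filtered range minus the value of the single last ticket in that range) can be sketched in plain Python. This is only an illustration of the calculation, not MDX and not how SSAS evaluates it; the function name and the tuple layout (date_key, time_key, timeInEventA, timeInEventB) are assumptions based on the keys described in the question.

```python
def sum_excluding_last(rows, start_key, end_key):
    """Sum both measures over [start_key, end_key], excluding the last ticket.

    rows: iterable of (date_key, time_key, time_in_event_a, time_in_event_b),
    where date_key looks like 20200103 and time_key like 503 (HHMM).
    """
    # Sorting the tuples orders tickets by date_key, then time_key.
    in_range = sorted(r for r in rows if start_key <= r[0] <= end_key)
    if not in_range:
        return (0, 0)
    total_a = sum(r[2] for r in in_range)
    total_b = sum(r[3] for r in in_range)
    last = in_range[-1]  # the only ticket whose "next ticket" is out of range
    return (total_a - last[2], total_b - last[3])


tickets = [
    (20200103, 500, 70, 0),
    (20200103, 501, 0, 30),
    (20200105, 2359, 40, 0),
    (20200106, 10, 99, 0),
]
result = sum_excluding_last(tickets, 20200103, 20200105)
```

Note that with a minute-grain time key, two tickets in the same minute cannot be ordered reliably; a finer key or a ticket sequence number would be needed to pick the true last ticket.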

DAX formula calculate dates between first transaction and (first transaction + 6 months)

Background: I have a column in a Customer dimension with a static date (e.g. '2013-01-01').
This column is the result of a calculation that gets the date of the first transaction ever made by that customer. The customer dimension is linked to a fact table containing reportdate, a date column linked to a date dimension.
Goal: I want to make a calculated measure that, based on a sum of amount measure, calculates the result for the period between start_date (first transaction date) and end_date (first transaction date + 6 months).
All I get is "cannot be determined in the current context" warnings, and I cannot get my head around how to fix it.
All help is welcome!
Thanks in advance,
/Blixter

SOLVED: I replicated the logic from the calculated measure found in the Customer table.
=CALCULATE([SumAmount];DATESBETWEEN(DimDate[Date];FIRSTDATE(FactTable[Reportdate]);DATEADD(FIRSTDATE(FactTable[Reportdate]);5;MONTH)))
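
For readers who want to check the window logic outside DAX, here is a rough Python sketch of the same idea: find the first transaction date, add a number of months, and sum the amounts that fall inside that range (both ends inclusive, as with DATESBETWEEN). The helper names add_months and sum_in_first_window are hypothetical, and the number of months is a parameter because the stated goal is 6 months while the formula above passes 5.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def sum_in_first_window(transactions, months=6):
    """Sum amounts from the first transaction date up to that date + months.

    transactions: list of (report_date, amount) pairs for one customer.
    """
    if not transactions:
        return 0
    start = min(d for d, _ in transactions)
    end = add_months(start, months)
    return sum(amount for d, amount in transactions if start <= d <= end)


history = [
    (date(2013, 1, 1), 100),
    (date(2013, 3, 15), 50),
    (date(2013, 7, 1), 25),
    (date(2013, 7, 2), 10),
]
six_month_total = sum_in_first_window(history, months=6)
five_month_total = sum_in_first_window(history, months=5)
```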

Query to find average stock ... with a twist

We are trying to calculate average stock from a movements table in a single SQL statement.
So far, there is no problem with what we thought was a standard approach: instead of adding up the daily stock and dividing by the number of days (we don’t have daily stock), we simply add up (movement quantity * remaining days):
select sum(quantity*(END_DATE-move_date))/(END_DATE-START_DATE)
from move_table
where move_date<=END_DATE
This is a simplified example, in real life we already take care of the initial stock at the starting date. Let’s say there are no movements prior to start_date.
Quantity sign depends on move type (sale, purchase, inventory, etc).
Of course this is done grouping by product, warehouse, ... but you get the idea.
It works as expected and the calculation is fine.
But (there is always a “but”), our customer doesn’t like counting days when there is no stock (all stock sold out). So he doesn’t want
Sum of (daily_stock) / number_of_days (which is what we calculate, using different math)
Instead, he would like
Sum of (daily_stock) / number_of_days_in_which_stock_is_not_zero
For sure we can do this in any programming language without much effort, but I was wondering how to do it using plain SQL, and I wasn’t able to come up with a solution.
Any suggestion?

Consider creating a new table called something like Stock_EndOfDay_History that has the following columns.
stock#
date
stock_count_eod
This table would get a new row for each stock item at the start of a new day for the prior day. Rows could then be purged from this table once the applicable date value went outside the date window of interest.
To get the "number_of_days_in_which_stock_is_not_zero", count only the rows with a non-zero end-of-day count:
SELECT COUNT(*) AS 'Not_Zero_Stock_Days' FROM Stock_EndOfDay_History
WHERE stock# = <stock#_value>
AND stock_count_eod <> 0
AND <date_window_clause>
Other approaches might attempt to just add a new column to the existing stock table to maintain a cumulative sum of the "number_of_days_in_which_stock_is_not_zero". But inevitably, questions will be asked about how the non-zero stock days count was calculated. The new table approach will address those questions better than the new column approach.
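
Assuming such an end-of-day history table exists, both averages can come from a single query. The following is a minimal sketch using Python's sqlite3 module; the table and column names (stock_eod_history, stock_id, day) are stand-ins chosen for illustration, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock_eod_history ("
    "stock_id TEXT, day TEXT, stock_count_eod INTEGER)"
)
conn.executemany(
    "INSERT INTO stock_eod_history VALUES (?, ?, ?)",
    [
        ("A", "2020-01-01", 10),
        ("A", "2020-01-02", 0),
        ("A", "2020-01-03", 20),
        ("A", "2020-01-04", 0),
    ],
)

# avg_all_days:     sum of daily stock / number of days
# avg_nonzero_days: sum of daily stock / number of days with non-zero stock
# NULLIF avoids a division by zero when stock was zero on every day.
query = """
SELECT SUM(stock_count_eod) * 1.0 / COUNT(*) AS avg_all_days,
       SUM(stock_count_eod) * 1.0
         / NULLIF(SUM(CASE WHEN stock_count_eod <> 0 THEN 1 ELSE 0 END), 0)
         AS avg_nonzero_days
FROM stock_eod_history
WHERE stock_id = ? AND day BETWEEN ? AND ?
"""
avg_all, avg_nonzero = conn.execute(
    query, ("A", "2020-01-01", "2020-01-04")
).fetchone()
```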

How to handle monthly and yearly values

I have a fact table that holds what are, more or less, sales goals. The ETL process that populates it generates 12 "weighted" values in separate rows, one per month. Each row, however, also includes a field that holds the yearly value. I do this with unpivot. This all works. Now I'm trying to get at this data in the cube with an SSRS report. The problem seems to be that I can query and see results that include either the yearly goal values or the monthly weighted values, but not both in the same set.
[update for fact table details]
My Fact table looks something like this:
FK_Account
FK_User
Target
Projected
GoalYear
FK_DateKey
FK_Dept
MonthlyWeightedTarget
MonthlyWeightedProjected
When I load this fact table via the ETL, I get the date key associated with each monthly value (MonthlyWeightedTarget). That makes 12 separate records, each with the same yearly value. I'm not including next year's value as a separate column, because there are separate records already associated with that year.
Basically, the users define a set of goals associated with a given year. Then I apply a "weighting" to generate 12 separate "monthly" records, which total up to the yearly target goal. Hope this makes sense.
What I need to see is something like this result:
Account Name
YTDgoal
YearGoal
NextYrGoal
I created a calculated member for the NextYrGoal, but now I'm not sure I even need it.
What would be a good approach for handling the above (getting the YTD, yearly and next year values)?
If I were getting at these values with T-SQL, I would sum the monthly values and just include the associated yearly and next year's values, grouping by account, year goal and next year goal.
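
The T-SQL style approach described above might look roughly like the sketch below, run here against an in-memory SQLite table from Python. It is only an illustration under simplifying assumptions: the table name goal_fact is hypothetical, the user and department keys are left out, there is one row per account per month, and the date key is assumed to be a yyyymmdd integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE goal_fact ("
    "fk_account INTEGER, goal_year INTEGER, fk_date_key INTEGER, "
    "target INTEGER, monthly_weighted_target INTEGER)"
)
rows = []
for month in range(1, 13):
    # 2020: yearly target 1200, spread evenly; 2021: yearly target 2400.
    rows.append((1, 2020, 2020 * 10000 + month * 100 + 1, 1200, 100))
    rows.append((1, 2021, 2021 * 10000 + month * 100 + 1, 2400, 200))
conn.executemany("INSERT INTO goal_fact VALUES (?, ?, ?, ?, ?)", rows)

# The yearly target repeats on every monthly row, so MAX picks it once
# instead of summing it 12 times; the YTD goal sums the monthly values.
query = """
SELECT fk_account,
       SUM(CASE WHEN goal_year = :year AND fk_date_key <= :as_of
                THEN monthly_weighted_target ELSE 0 END) AS ytd_goal,
       MAX(CASE WHEN goal_year = :year THEN target END) AS year_goal,
       MAX(CASE WHEN goal_year = :year + 1 THEN target END) AS next_yr_goal
FROM goal_fact
GROUP BY fk_account
"""
result = conn.execute(query, {"year": 2020, "as_of": 20200331}).fetchall()
```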
