SSAS - calculated column with an If statement - ssas

I have a calculated column that basically calculates an average time on page.
fix(([Measures].[Time on Page] / ([Measures].[Pageviews] - [Measures].[Exits])) / 3600)
The problem is that if [Measures].[Time on Page] is 0, it messes up the calculation due to a divide-by-0 error. Is there a way to test for this, maybe by adding an If statement to the calculation?

If you are using the 2012 version, you can use the new Divide function.
Here is a link describing how to use it: http://cwebbbi.wordpress.com/2013/07/26/new-mdx-divide-function/
Basically, Divide(numerator, denominator [, alternate result])
Alternatively, you can use an IF-style check (IIF in MDX) to test whether the divisor is 0.
But there are cases where the numbers are near 0 as well. For example, if you have adjustment data to correct the actuals, you may end up with an actual value of 5 and an adjustment value of -5, and you expect the result to be 0. That may not happen in every case because of how numbers are aggregated internally, so the number could become 0.00000001 instead of 0.
For such cases, you need to adjust your condition, for example by comparing the absolute value of the divisor against a small tolerance.
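The zero and near-zero checks described above can be sketched outside MDX. The following Python is only an illustration of the logic, not SSAS code: the names `safe_divide` and `average_time_on_page_hours` and the tolerance of 1e-6 are assumptions made for this example.

```python
import math


def safe_divide(numerator, denominator, alternate=None, tolerance=1e-6):
    """Return numerator / denominator, or `alternate` when the divisor
    is zero or so close to zero that it is probably aggregation noise."""
    if abs(denominator) <= tolerance:
        return alternate
    return numerator / denominator


def average_time_on_page_hours(time_on_page, pageviews, exits):
    """Mirror of the calculated column: truncate the per-page time to whole hours."""
    seconds_per_page = safe_divide(time_on_page, pageviews - exits, alternate=0.0)
    return math.trunc(seconds_per_page / 3600)


if __name__ == "__main__":
    print(average_time_on_page_hours(28800, 10, 6))  # divisor is 4
    print(average_time_on_page_hours(100, 5, 5))     # divisor is 0, falls back to 0.0
```

The tolerance should be chosen to suit the scale of the measure; a value that is too large would hide legitimately small divisors.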

Related

How to handle very big numbers in snowflake?

I have a Python program that goes over tables in a DB (not mine) and, for each column of type NUMBER, performs some mathematical operations such as stdev. However, there are some columns with very large numbers, and when I try to execute:
select STDDEV(big_col) from table1;
I'm getting the error:
Number out of representable range: type FIXED[SB16](38,0){not null}, value 3.67864e+38
Any idea how I can handle this? It's OK for me to just ignore these values in this case, but I don't want my query to fail.
Thanks,
Nir.
As @dnoeth mentioned in the comment section, casting the column to DOUBLE before taking the standard deviation should fix the issue: STDDEV(CAST(big_col as DOUBLE)).
The OP then asked: if the resulting standard deviation is significantly smaller than 1e+38 (NUMBER can hold at most 38 digits), why does the column need to be cast to DOUBLE?
The reason for this lies in how the standard deviation is calculated.
The first step in this process is to subtract the mean of the column from each individual value in that column. All those values are then squared. This square determines the upper bound for the values that the NUMBER format needs to be able to handle before further arithmetic operations, like dividing by the number of records (N) and taking a square root, reduce it down to the final answer. The final value of the standard deviation that you get from this process can be significantly smaller than the sum of squares that has to be calculated first.
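To make the argument concrete, here is a small Python sketch with made-up values. It is an illustration of the arithmetic only, not a description of how Snowflake evaluates STDDEV internally; the sample values and helper names are assumptions.

```python
from fractions import Fraction
import math

# Largest value a 38-digit fixed-point NUMBER(38,0) can hold.
NUMBER_38_MAX = 10**38 - 1


def sum_of_squared_deviations(values):
    """Exact sum of (x - mean)^2, using Fractions to avoid rounding."""
    mean = Fraction(sum(values), len(values))
    return sum((Fraction(v) - mean) ** 2 for v in values)


def sample_stddev(values):
    """Sample standard deviation computed from the exact sum of squares."""
    return math.sqrt(sum_of_squared_deviations(values) / (len(values) - 1))


# Hypothetical column: both values fit comfortably in 38 digits.
values = [0, 2 * 10**19]

intermediate = sum_of_squared_deviations(values)
print(intermediate > NUMBER_38_MAX)           # the intermediate does not fit
print(sample_stddev(values) < NUMBER_38_MAX)  # the final answer does
```

With a DOUBLE the intermediate sum of squares fits easily, at the cost of some precision, which is why the cast avoids the error.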

How to identify records which have clusters or lumps in data?

I have a tableau table as follows:
This data can be visualized as follows:
I'd like to flag cases that have lumps/clusters. This would flag items B, C and D because there are spikes only in certain weeks of the 13 weeks. Items A and E would not be flagged as they mostly have a 'flat' profile.
How can I create such a flag in Tableau or SQL to isolate this kind of a case?
What I have tried so far:
I've tried a logic where for each item I calculate the MAX and MEDIAN. Items that need to be flagged will have a larger (MAX - MEDIAN) value than items that have a fairly 'flat' profile.
Please let me know if there's a better way to create this flag.
Thanks!
Agree with the other commenters that this question could be answered in many different ways and you might need a PhD in Stats to come up with an ideal answer. However, given your basic requirements this might be the easiest/simplest solution you can implement.
Here is what I did to get here:
Create a parameter to define your "spike". If it is always going to be a fixed number, you can hardcode it in your formulas. I called mine "Min Spike Value".
Create a formula for the Median Values in each bucket. {fixed [Buckets]: MEDIAN([Values])} . (A, B, ... E = "Buckets"). This gives you one value for each letter/bucket that you can compare against.
Create a formula to calculate the difference of each number against the median. abs(sum([Values])-sum([Median Values])). We use the absolute value here because a spike can either be negative or positive (again, if you want to define it that way...). I called this "Spike to Current Value abs difference"
Create a calculated field that evaluates to a boolean to see if the current value is above the threshold for a spike. [Spike to Current Value abs difference] > min([Min Spike Value])
Set up your viz to use this boolean to highlight the spikes. The beauty of the parameter is that you can change the value for what a spike should be and it will highlight accordingly. Above the value was 4, but if you change it to 8:
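The steps above amount to comparing each value with the median of its bucket. A rough Python sketch of the same logic follows; the weekly numbers are invented for illustration and the function names are assumptions, not part of Tableau.

```python
from statistics import median


def flag_spikes(values_by_bucket, min_spike_value):
    """For each bucket, mark the values whose absolute distance from the
    bucket median exceeds the spike threshold."""
    flags = {}
    for bucket, values in values_by_bucket.items():
        bucket_median = median(values)
        flags[bucket] = [abs(v - bucket_median) > min_spike_value for v in values]
    return flags


def buckets_with_spikes(values_by_bucket, min_spike_value):
    """Return the buckets that contain at least one spike."""
    flags = flag_spikes(values_by_bucket, min_spike_value)
    return [bucket for bucket, marks in flags.items() if any(marks)]


# Hypothetical weekly values: "A" is flat, "B" has one lump.
weekly = {"A": [2, 2, 3, 2, 2], "B": [1, 1, 9, 1, 1]}

print(buckets_with_spikes(weekly, min_spike_value=4))
print(buckets_with_spikes(weekly, min_spike_value=8))
```

As with the parameter in the viz, raising the threshold flags fewer items.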

Clever way to check if value meets threshold in VBA

Disclaimer: Numbers below are randomly generated
What I'm trying to do is, purely in VBA, look at the ratio of [column B]/[column A] and check whether the ratio in row 10 (=1,241/468) is below the minimum or above the maximum of the ratios in rows 1 through 9, comparing only against the rows where there is a 1 in column C.
That is, compare Cell(B10)/Cell(A10) to Cell(B2)/Cell(A2), Cell(B3)/Cell(A3), etc. (only comparing against rows with a 1 in column C).
The workbook I'm working with has a lot more data and columns and I'm not allowed to explicitly edit the cells, so defining a new column is out of the question. Is there a way to do this in VBA such that it essentially returns a boolean depending whether or not the ratio in the last row violates the threshold defined above?
You can retrieve the minimum and maximum ratios (with criteria) easily with the AGGREGATE¹ function's SMALL sub-function and LARGE sub-function.
        
The formulas in D13:E13 are,
=AGGREGATE(15, 6, ((B1:B9)/(A1:A9))/C1:C9, 1)
=AGGREGATE(14, 6, ((B1:B9)/(A1:A9))/C1:C9, 1)
The 6 is the AGGREGATE parameter for ignoring error values. By dividing the ratio by the value in column C we are producing #DIV/0! errors for anything we do not want considered, leaving them ignored. If the values in C were more diverse, we could divide by (C1:C9=1) to produce the same results.
Since we are using the SMALL and LARGE sub-functions, we can easily retrieve the second, third, etc. ratios by increasing the k parameter (the 1 off the back end).
I've modified some of the values in your sample slightly to demonstrate that the min and max with criteria are being picked up correctly.
These can be adapted to VBA with the WorksheetFunction object or Application.Evaluate method.
¹ The AGGREGATE function was introduced with Excel 2010. It is not available in previous versions.
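For readers who want to see the boolean check spelled out, here is a minimal Python sketch of the same idea: take the min and max ratio over the rows flagged with 1, then test the last row against them. The row data and the function name are hypothetical; this is not VBA and not the workbook's real data.

```python
def ratio_outside_range(rows, candidate):
    """rows: iterable of (a, b, flag) tuples; candidate: (a, b).
    Return True when b/a for the candidate is below the minimum or above
    the maximum ratio of the rows whose flag equals 1."""
    ratios = [b / a for a, b, flag in rows if flag == 1 and a != 0]
    if not ratios:
        return False  # nothing to compare against
    candidate_ratio = candidate[1] / candidate[0]
    return candidate_ratio < min(ratios) or candidate_ratio > max(ratios)


# Hypothetical rows (column A, column B, column C).
rows = [(100, 200, 1), (100, 300, 1), (100, 900, 0)]

print(ratio_outside_range(rows, (468, 1241)))  # ratio is about 2.65
print(ratio_outside_range(rows, (100, 900)))   # ratio is 9.0
```

Rows with a 0 in the flag column are skipped outright here, which plays the same role as the #DIV/0! errors that AGGREGATE ignores.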

Converting from excel formula for Using forecast with times

When using forecast, you input a number and it should return a value based on the known X data and Known Y data.
However if you put in a time this does not work.
I need two things.
First of all I need the VBA equivalent of forecast. I suspect this to be application.forecast
Then how to use the date as a value for the forecast to work as it should
The formula is as follows:
=FORECAST(15:00:00,A10:A33,B10:B33)
Currently this equation flags up an error.
Any ideas to get this to work for time values?
I see two potential problem areas. The first is the time. Use the TIME function to get a precise time. Second, in D9:D12, the values are left-aligned. Typically, this means they are text, not true numbers. If you absolutely require the m suffix, use a Custom number Format of General\m in order that they retain their numeric status while displaying an m as an increment suffix. If you type the m in, they become text-that-look-like-numbers and are useless for any maths.
=FORECAST(TIME(15, 0, 0), B10:B33, A10:A33)
That returns 3.401666667 which is either 09:38 AM or 3.4 m (it's been a while since I played with the FORECAST function).
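The reason TIME works is that Excel stores a time as a fraction of a day, so 15:00:00 is the number 0.625, and FORECAST is a straight-line fit evaluated at that number. The Python below is a sketch of that arithmetic with invented data; `excel_time` and `forecast` are illustrative names, with `forecast` following Excel's argument order of x, known y's, known x's.

```python
def excel_time(hours, minutes, seconds):
    """Excel-style time serial: the fraction of a 24-hour day."""
    return (hours * 3600 + minutes * 60 + seconds) / 86400


def forecast(x, known_ys, known_xs):
    """Least-squares straight-line prediction of y at x."""
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in known_xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(known_xs, known_ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * x


# Hypothetical readings taken at 12:00, 14:00 and 16:00.
times = [excel_time(h, 0, 0) for h in (12, 14, 16)]
readings = [1.0, 2.0, 3.0]

print(excel_time(15, 0, 0))
print(forecast(excel_time(15, 0, 0), readings, times))
```

Swapping the known y's and known x's changes what is being predicted, so it is worth double-checking which column holds the times.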

How do I define my own hardcoded Ceilings and Floors in Sql Server?

I have some SQL statements that calculate some numbers. It's possible (with bonus points/penalty points) that a person could get a final score OVER 100 (with bonus points) or UNDER 0 (with penalties).
How can I make sure that if the calculated value exceeds 100, it gets capped at 100? Likewise, if the score ends up below 0, it should be floored at 0.
I'm assuming I could use a simple UDF that does this math - that's not hard. I'm not sure if this already exists?
I checked out CEILING and MAX and both of these are doing other things, not what I'm after.
Thoughts?
Yes, it would be nice if SQL Server had the ANSI-SQL "horizontal" aggregate functions, then you could do exactly what the others have suggested: "MIN(Score, 100)", etc. Sadly, it doesn't, so you cannot do it that way.
The typical way to do this is with CASE expressions:
SELECT
CASE WHEN Score BETWEEN 0 AND 100 THEN Score
WHEN Score < 0 THEN 0
ELSE 100 END as BoundScore
FROM YourTable
Do you mean that the score is stored in a column in a table, or is there some calculated total with these constraints?
In general, probably the easiest would be to put the logic into a trigger that would check to see what value is being inserted, and change it to your max or min if it's out of bounds.
Or, what I would do is have whatever abstraction layer inserts the value do the checking, since it's not a database integrity issue but an application validation issue.
A trigger is indeed the correct (and only) solution.
You need a trigger which fires for both UPDATE and INSERT (or two separate triggers, one for UPDATE and one for INSERT). SQL Server has no BEFORE triggers, so this would be an INSTEAD OF or AFTER trigger. Simply check the proposed new value for that column, and push it up or down if necessary.
http://msdn.microsoft.com/en-us/library/aa258254(SQL.80).aspx
The constraint becomes optional, but is very strongly recommended.
If you don't want to use a trigger then how about
MIN(MAX(XXX, 0), 100) where XXX is whatever existing code you have to calculate a score.
e.g.: MIN(MAX(enemiesKilled+10*LivesLeft+15*sum(PrincessRescued), 0), 100)
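The nested min/max idea is the usual "clamp" pattern. In a language whose min and max accept several arguments it can be written directly, as in this Python sketch; in SQL Server, MIN and MAX are single-argument aggregates, so a CASE expression plays the same role there. The function name `clamp_score` is an assumption for illustration.

```python
def clamp_score(score, floor=0, ceiling=100):
    """Raise scores below `floor` to it and cap scores above `ceiling`."""
    return min(max(score, floor), ceiling)


if __name__ == "__main__":
    for raw in (-5, 42, 120):
        print(raw, clamp_score(raw))
```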