Is there a more concise way to calculate the P value for the Anderson-Darling A-squared statistic with VBA? - vba

I have two pieces of VBA code for Excel. One calculates the A-squared statistic for the Anderson-Darling test; the other, shown below, calculates the P value of that A-squared statistic. Is there a more concise or more efficient way to calculate this value in VBA?
Function AndDarP(AndDar, Elements)
'Calculates P value for level of significance for the
'Anderson-Darling Test for Normality
'AndDar is the Anderson-Darling Test Statistic
'Elements is the count of elements used in the
'Anderson-Darling test statistic.
'based on calculations at
'http://www.kevinotto.com/RSS/Software/Anderson-Darling%20Normality%20Test%20Calculator.xls
'accessed 21 May 2010
'www.kevinotto.com
'kevin_n_otto#yahoo.com
'Version 6.0
'Permission to freely distribute and modify when properly
'referenced and contact information maintained.
'
'"Keep in mind the test assumes normality, and is looking for sufficient evidence to reject normality.
'That is, a large p-value (often p > alpha = 0.05) would indicate normality.
' * * *
'Test Hypotheses:
'Ho: Data is sampled from a population that is normally distributed
'(no difference between the data and normal data).
'Ha: Data is sampled from a population that is not normally distributed"
Dim M As Double
M = AndDar * (1 + 0.75 / Elements + 2.25 / Elements ^ 2)
Select Case M
Case Is < 0.2
AndDarP = 1 - Exp(-13.436 + 101.14 * M - 223.73 * M ^ 2)
Case Is < 0.34
AndDarP = 1 - Exp(-8.318 + 42.796 * M - 59.938 * M ^ 2)
Case Is < 0.6
AndDarP = Exp(0.9177 - 4.279 * M - 1.38 * M ^ 2)
Case Is < 13
AndDarP = Exp(1.2937 - 5.709 * M + 0.0186 * M ^ 2)
Case Else
AndDarP = 0
End Select
End Function
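For comparison, here is a minimal table-driven sketch of the same piecewise approximation, written in Python rather than VBA. The names `anderson_darling_p` and `_BRANCHES` are made up for illustration; the coefficients and thresholds are copied from the function above. Putting the four branches in a table keeps the formula in one place, which may or may not count as more concise.

```python
import math

# Each row: (upper bound on adjusted statistic M, c0, c1, c2, use 1 - exp?)
# Coefficients are taken from the VBA function above.
_BRANCHES = [
    (0.2, -13.436, 101.14, -223.73, True),
    (0.34, -8.318, 42.796, -59.938, True),
    (0.6, 0.9177, -4.279, -1.38, False),
    (13.0, 1.2937, -5.709, 0.0186, False),
]

def anderson_darling_p(a2, n):
    """Approximate P value for A-squared statistic a2 from n elements."""
    # Small-sample adjustment, same as M in the VBA code.
    m = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    for upper, c0, c1, c2, complement in _BRANCHES:
        if m < upper:
            e = math.exp(c0 + c1 * m + c2 * m * m)
            return 1 - e if complement else e
    return 0.0  # M >= 13: P value is treated as zero
```

The first matching row wins, which mirrors how `Select Case` falls through the `Case Is <` tests in order.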

Related

VBA - showing wrong results

I have run into an issue while working in VBA. I am supposed to write a program that numerically integrates the function 100*x^99 with the trapezoidal rule, with lower limit 0 and upper limit 1. Cells(j, 5) contains the numbers of subintervals (10, 30, 100, 300, 1000, 3000, 10000). The code seems to work but gives wrong results. For each number of subintervals the result should be approximately:
10 - 5.000295129200607
30 - 1.786588019299606
100 - 1.0812206997600746
300 - 1.0091505687770146
1000 - 1.0008248693208752
3000 - 1.0000916650530287
10000 - 1.000008249986933
Function F(x)
F = 100 * (x ^ 99)
End Function
Sub calka()
Dim n As Single
Dim xp As Single
Dim dx As Single
Dim xk As Single
Dim ip As Single
Dim pole As Single
xp = 0
xk = 1
For j = 5 To 11
n = Cells(j, 5)
dx = (xk - xp) / n
pole = 0
For i = 1 To n - 1
pole = pole + F(xp + i * dx)
Next i
pole = pole + ((F(xp) + F(xk)) / 2)
pole = pole * dx
Worksheets("Arkusz1").Cells(j, 7) = pole
Next j
End Sub
I implemented the same code in Java and C++ and it worked flawlessly, but VBA always gives me wrong results. I am not sure whether it rounds at some point (and whether I can disable that in the settings) or whether my code is simply not written correctly.
Apologies for the lack of clarity; it is hard for me to translate mathematics into English.
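Use Doubles rather than Singles. A Single carries only about 7 significant decimal digits, so rounding error builds up as thousands of terms are added to pole; a Double carries about 15.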
http://www.techrepublic.com/article/comparing-double-vs-single-data-types-in-vb6/
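As a cross-check, here is a minimal sketch of the same trapezoidal rule in Python, whose floats are double precision. The function names `f` and `trapezoid` are chosen for illustration only. It should reproduce the expected values listed in the question, which suggests the algorithm itself is fine and the precision of the variables is the thing to change.

```python
def f(x):
    # Integrand from the question: 100 * x^99
    return 100 * x ** 99

def trapezoid(func, a, b, n):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    dx = (b - a) / n
    total = (func(a) + func(b)) / 2  # endpoints carry half weight
    for i in range(1, n):
        total += func(a + i * dx)    # interior points carry full weight
    return total * dx

for n in (10, 30, 100, 300, 1000, 3000, 10000):
    print(n, trapezoid(f, 0.0, 1.0, n))
```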

Converting an equation to code

I have an equation that can be used to find the gun elevation for artillery, using the range, muzzle velocity and change in altitude, in a game called Arma 3. The equation looks like this:
\( \theta = \arctan\left( \frac{V^2 \pm \sqrt{V^4 - g\,(g X^2 + 2 Y V^2)}}{g X} \right) \)
With g being the acceleration due to gravity (9.80665), V being the muzzle velocity, X being the range and Y being the change in altitude (called DAlt in my code).
I'm trying to convert it to a line of code so that I can make a program to calculate the elevation based on given coordinates. However I'm having trouble with it. I currently have this:
If rdoLow.Checked = True Then
Elevation = Math.Atan(((Velocity ^ 2) - (Math.Sqrt((Velocity ^ 4) - (G) * (G * (Range ^ 2) + (2 * DAlt * (Velocity ^ 2)))))) / G * Range)
Else
Elevation = Math.Atan(((Velocity ^ 2) + (Math.Sqrt((Velocity ^ 4) - (G) * (G * (Range ^ 2) + 2 * DAlt * (Velocity ^ 2))))) / G * Range)
End If
Which isn't particularly nice looking but as far as I can tell, it should work. However when I put in the values that the video I got the equation from used, I got a different answer. So there must be something wrong with my equation.
I've tried breaking it in to various parts as separate variables and calculating them, then using those variables in the overall equation, and that still didn't work and gave me an answer that was wrong in another way.
So I'm currently at a loss on how to fix it, starting to wonder if the way that vb handles long equations is different or something.
Any help is much appreciated.
You haven't given any sample data, so I can't verify that this gives the correct answer, but the last part of your equation is missing some parentheses.
Elevation = Math.Atan(((Velocity ^ 2) + Math.Sqrt((Velocity ^ 4) - (G * ((G * (Range ^ 2)) + (2 * DAlt * (Velocity ^ 2)))))) / (G * Range))
Note the parentheses around the final G * Range.
Multiplication and division have equal precedence, so they are evaluated from left-to-right. See Operator Precedence in Visual Basic.
You were dividing everything by G, then multiplying the result by Range, whereas you needed to multiply G by Range, then divide everything by the result of that.
You can see the difference in this simple example:
Console.WriteLine(3 / 4 * 5) ' Prints 3.75
Console.WriteLine(3 / (4 * 5)) ' Prints 0.15
Out of curiosity I tried the problem. In order to have test data I found this web site, Range Tables For Mortars. I tested with the '82mm Mortar - Far' that has an initial velocity of 200 m/s. One problem I had, and don't know if I fixed it correctly, was that the first part of the equation was returning negative numbers... I also solved for the ±. To test I created a form with a button to perform the calculation, a textbox to enter the distance, and two labels to show the angles. This is what I came up with.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
'A - launch angle
'Target
' r - range
' y - altitude
'g - gravity 9.80665 m/s^2
'v - launch speed e.g. 50 m/s
'
'
'Formula
'from - https://en.wikipedia.org/wiki/Trajectory_of_a_projectile#Angle_required_to_hit_coordinate_.28x.2Cy.29
'in parts
'parts - px
' p1 = sqrt( v^4 - g * (g * r^2 + 2 * y * v^2) )
' p2 = v^2 ± p1 note plus / minus
' p3 = p2 / (g * r)
'
' A = arctan(p3)
Dim Ap, Am, r, y As Double
Dim g As Double = 9.80665
Dim v As Double
Dim p1, p2p, p2m, p3p, p3m As Double
If Not Double.TryParse(TextBox1.Text, r) Then Exit Sub
y = 0
v = 200 '82mm Mortar - Far velocity
p1 = v ^ 4 - g * (g * r ^ 2 + 2 * y * v ^ 2)
If p1 < 0 Then
Debug.WriteLine(p1)
p1 = Math.Abs(p1) 'a negative value means the target is out of range; Abs only avoids NaN, the angles are then not valid
End If
p1 = Math.Sqrt(p1)
'plus / minus
p2p = v ^ 2 + p1
p2m = v ^ 2 - p1
p3p = p2p / (g * r)
p3m = p2m / (g * r)
Const radiansToDegrees As Double = 180 / Math.PI
Ap = Math.Atan(p3p) * radiansToDegrees
Am = Math.Atan(p3m) * radiansToDegrees
Label1.Text = Ap.ToString("n3")
Label2.Text = Am.ToString("n3")
End Sub
Using the web site to verify the calculations, the code seems correct.
Writing long formulas in a bunch of nested parentheses serves no purpose, unless you are going for confusion.
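To make the precedence fix and the plus/minus branches concrete, here is a small Python sketch of the same formula split into parts. The function name `elevation_angles` and the choice to return `None` when the value under the square root is negative (target beyond maximum range) are my own assumptions, not something taken from the answers above.

```python
import math

G = 9.80665  # acceleration due to gravity, m/s^2

def elevation_angles(v, x, y=0.0, g=G):
    """Return (low, high) launch angles in degrees, or None if out of range.

    v: muzzle velocity, x: horizontal range (> 0), y: change in altitude.
    """
    disc = v ** 4 - g * (g * x ** 2 + 2 * y * v ** 2)
    if disc < 0:
        return None  # no real solution: the target cannot be reached
    root = math.sqrt(disc)
    # Divide by the whole product (g * x), not by g and then multiply by x.
    low = math.degrees(math.atan((v ** 2 - root) / (g * x)))
    high = math.degrees(math.atan((v ** 2 + root) / (g * x)))
    return low, high

print(elevation_angles(200.0, 2000.0))
```

On level ground (y = 0) the two angles should add up to 90 degrees, which is a quick sanity check on any implementation.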

End a while loop in the event a condition is met

I am in the process of converting Mathcad code into VBA and am trying to figure out how to replicate the While loop, which asks the program to run the loop while TGuess < 0. At the end of the loop is an if statement to break the loop if sGuess>1/1.4 (I would attach a picture, but my reputation does not allow me to).
I have written this code in VBA, but am wondering if including the sGuess variable in the original While statement is correct, or if it could influence the output of the loop:
While TGuess < 0 And sGuess <= 1 / 1.4
kterm = (kj ^ (1 / 6)) / 26 'k term in the numerator of depth equation
epw = 3 / 5
FDepth = ((kterm * RainInt * L * CF) / sGuess ^ 0.5) ^ epw
tflow = UW_Wat * g * sGuess * FDepth 'Calc Flow Shear Stress
pflow = 7.853 * UW_Wat * (tflow / UW_Wat) ^ (3 / 2)
TGuess = pflow - pcrit 'Recalc TGuess as E-P
sGuess = sGuess + SlopeInc 'Calc new stable slope
Wend
Any input would be appreciated.
To mitigate your concern, it might be better to replace the while...wend loop with a Do While ... Loop block. You can then put your break condition where you'd have it in the corresponding Mathcad code by using something along the lines of
If sGuess > 1/1.4 Then
Exit Do
End If
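To see whether moving the sGuess test matters, here is a rough Python sketch comparing the two loop shapes. The `update` callback is a hypothetical stand-in for the flow and shear-stress calculation that produces the new TGuess, and both function names are invented for illustration. As far as I can tell, the two forms only differ when sGuess already exceeds the limit before the first pass: the combined condition skips the body entirely, while the break-at-end form runs it once.

```python
def combined_condition(t_guess, s_guess, step, update, s_max=1 / 1.4):
    """Mirrors: While TGuess < 0 And sGuess <= s_max ... Wend"""
    while t_guess < 0 and s_guess <= s_max:
        t_guess = update(s_guess)  # placeholder for the real calculation
        s_guess += step
    return t_guess, s_guess

def break_at_end(t_guess, s_guess, step, update, s_max=1 / 1.4):
    """Mirrors: Do While TGuess < 0 ... If sGuess > s_max Then Exit Do ... Loop"""
    while t_guess < 0:
        t_guess = update(s_guess)
        s_guess += step
        if s_guess > s_max:
            break
    return t_guess, s_guess

def demo_update(s):
    # Toy model only: TGuess turns non-negative once the slope reaches 0.5.
    return s - 0.5

print(combined_condition(-1.0, 0.1, 0.1, demo_update))
print(break_at_end(-1.0, 0.1, 0.1, demo_update))
```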

Calculate APR (annual percentage rate) Programmatically

I am trying to find a way to programmatically calculate APR based on
Total Loan Amount
Payment Amount
Number of payments
Repayment frequency
There is no need to take any fees into account.
It's ok to assume a fixed interest rate, and any remaining amounts can be rolled into the last payment.
The formula below is based on a credit agreement for a total amount of credit of €6000 repayable in 24 equal monthly instalments of €274.11.
(The APR for the above example is 9.4%)
I am looking for an algorithm in any programming language that I can adapt to C.
I suppose you want to compute X from your equation. This equation can be written as
f(y) = y + y**2 + y**3 + ... + y**N - L/P = 0
where
X = APR
L = Loan (6000)
P = Individual Payment (274.11)
N = Number of payments (24)
F = Frequency (12 per year)
y = 1 / ((1 + X)**(1/F)) (substitution to simplify the equation)
Now you need to solve the equation f(y) = 0 to get y. This can be done, for example, with Newton's iteration (pseudo-code):
y = 1 (some plausible initial value)
repeat
dy = - f(y) / f'(y)
y += dy
until abs(dy) < eps
The derivative is:
f'(y) = 1 + 2*y + 3*y**2 + ... + N*y**(N-1)
You would compute f(y) and f'(y) using the Horner rule for polynomials to avoid the exponentiation. The derivative can likely be approximated by its first few terms. After you find y, you get X:
X = y**(-F) - 1
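Here is a minimal Python sketch of the iteration described above, assuming the equation f(y) = y + y**2 + ... + y**N - L/P = 0. The function names, the tolerance and the iteration cap are my own choices. Horner's rule evaluates the polynomial and its derivative in one pass without any calls to a power function.

```python
def f_and_derivative(y, n, ratio):
    """Evaluate f(y) = y + y^2 + ... + y^n - ratio and f'(y) by Horner's rule."""
    p, dp = 1.0, 0.0            # start with the coefficient of y^n
    for _ in range(n - 1):      # coefficients of y^(n-1) ... y^1 are all 1
        dp = dp * y + p
        p = p * y + 1.0
    dp = dp * y + p
    p = p * y - ratio           # constant term is -L/P
    return p, dp

def apr(loan, payment, n_payments, per_year, eps=1e-12, max_iter=100):
    """Annual percentage rate as a fraction (0.094 means 9.4%)."""
    ratio = loan / payment
    y = 1.0                     # plausible initial value
    for _ in range(max_iter):
        fy, dfy = f_and_derivative(y, n_payments, ratio)
        dy = -fy / dfy          # Newton step
        y += dy
        if abs(dy) < eps:
            break
    return y ** (-per_year) - 1

print(apr(6000, 274.11, 24, 12))
```

With the figures from the question (6000 repaid in 24 monthly instalments of 274.11) this should land close to the quoted 9.4%.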
Here is the Objective C code snippet I came up with (which seems to be correct) if anybody is interested:
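// Inputs (initialPaymentAmt, paymentAmt, totalLoanAmt, numPayments, ppa) and
// the working variables fx, dx, z, apr are assumed to be declared elsewhere as double.
// Start near 1 but not at 1: x = 1 makes (x - 1) zero and the first step NaN.
// double rather than float, so that the 1e-9 tolerance is reachable.
double x = 0.99;
do{
fx = initialPaymentAmt+paymentAmt *(pow(x, numPayments+1)-x)/(x-1)+0*pow(x,numPayments)-totalLoanAmt;
dx = paymentAmt *(numPayments * pow( x , numPayments + 1 ) - ( numPayments + 1 )* pow(x,numPayments)+1) / pow(x-1,2)+numPayments * 0 * pow(x,numPayments-1);
z = fx / dx; // Newton step
x=x-z;
} while (fabs(z)>1e-9 );
apr=100*(pow(1/x,ppa)-1); // ppa = payments per annum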

Implementing Wilson Score in SQL

We have a relatively small table that we would like to sort based on rating, using the Wilson interval or a reasonable equivalent. I'm a reasonably smart guy, but my math fu is nowhere near strong enough to understand this:
The above formula, I am told, calculates a score for a positive/negative (thumbs up/thumbs down) voting system. I've never taken a statistics course, and it's been 15 years since I've done any sort of advanced mathematics. I don't have a clue what the little hat that the p is wearing means, or what the backwards Jesus fish beneath z indicates.
I would like to know two things:
Can this formula be altered to accommodate a 5-star rating system? I found this, but the author expresses his doubts as to the accuracy of his formula.
How can this formula be expressed in a SQL function? Note that I do not need to calculate and sort in real-time. The score can be calculated and cached daily.
Am I overlooking something built-in to Microsoft SQL Server?
Instead of trying to adapt the Wilson algorithm to a 5-star rating system, why not look into a different algorithm? This is what IMDb uses for their top 250: Bayesian Estimate
As for explaining the math in the Wilson algorithm, the code below was posted at the first link in your question. It is written in Ruby.
require 'statistics2'
def ci_lower_bound(pos, n, power)
if n == 0
return 0
end
z = Statistics2.pnormaldist(1-power/2)
phat = 1.0*pos/n
(phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
end
If you'd like another example, here is one in PHP:
http://www.derivante.com/2009/09/01/php-content-rating-confidence/
Edit: It seems that derivante.com is no longer around. You can see the original article on archive.org - https://web.archive.org/web/20121018032822/http://derivante.com/2009/09/01/php-content-rating-confidence/ and I've added the code from the article below.
class Rating
{
public static function ratingAverage($positive, $total, $power = '0.05')
{
if ($total == 0)
return 0;
$z = Rating::pnormaldist(1-$power/2,0,1);
$p = 1.0 * $positive / $total;
$s = ($p + $z*$z/(2*$total) - $z * sqrt(($p*(1-$p)+$z*$z/(4*$total))/$total))/(1+$z*$z/$total);
return $s;
}
public static function pnormaldist($qn)
{
$b = array(
1.570796288, 0.03706987906, -0.8364353589e-3,
-0.2250947176e-3, 0.6841218299e-5, 0.5824238515e-5,
-0.104527497e-5, 0.8360937017e-7, -0.3231081277e-8,
0.3657763036e-10, 0.6936233982e-12);
if ($qn < 0.0 || 1.0 < $qn)
return 0.0;
if ($qn == 0.5)
return 0.0;
$w1 = $qn;
if ($qn > 0.5)
$w1 = 1.0 - $w1;
$w3 = - log(4.0 * $w1 * (1.0 - $w1));
$w1 = $b[0];
for ($i = 1;$i <= 10; $i++)
$w1 += $b[$i] * pow($w3,$i);
if ($qn > 0.5)
return sqrt($w1 * $w3);
return - sqrt($w1 * $w3);
}
}
As for doing this in SQL, SQL already has all of these math functions in its library. If I were you I'd do this in your application, though. Have your application update your database every so often (hours? days?) instead of doing this on the fly, or your application will become very slow.
Regarding your first question (adjusting the formula to the 5-stars system) I would agree with Paul Creasey.
conversion formula: [3 +/- i stars -> i up/down-votes] (3 stars -> 0)
example: 4 stars -> +1 up-vote, 5 stars -> +2, 1 -> -2 and so on.
I would note, though, that instead of the lower bound of the interval that both the Ruby and PHP functions compute, I would just compute the much simpler Wilson midpoint:
(x + (z^2)/2) / (n + z^2)
where:
n = Sum(up_votes) + Sum(|down_votes|)
x = number of positive votes = Sum(up_votes)
z = 1.96 (fixed value)
Taking William's link to the PHP solution http://www.derivante.com/2009/09/01/php-content-rating-confidence/ and making your system just positive and negative (5 stars could be 2 positive, 1 star could be 2 negative, perhaps), it would be fairly easy to convert it to T-SQL, but you'd be much better off doing it in the server-side logic.
The author of the first link recently added an SQL implementation to his post.
Here it is:
SELECT widget_id, ((positive + 1.9208) / (positive + negative) -
1.96 * SQRT((positive * negative) / (positive + negative) + 0.9604) /
(positive + negative)) / (1 + 3.8416 / (positive + negative))
AS ci_lower_bound FROM widgets WHERE positive + negative > 0
ORDER BY ci_lower_bound DESC;
Whether this can be accommodated to a 5-star rating system is beyond me too.
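For anyone wondering where the constants in that query come from, here is a short Python sketch. The function names are invented for this example. With z = 1.96, the general lower-bound formula from the Ruby code reduces to the SQL expression because z^2/2 = 1.9208, z^2/4 = 0.9604 and z^2 = 3.8416.

```python
import math

def wilson_lower_bound(positive, negative, z=1.96):
    """General Wilson lower bound, same formula as the Ruby ci_lower_bound."""
    n = positive + negative
    if n == 0:
        return 0.0
    phat = positive / n
    return ((phat + z * z / (2 * n)
             - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n))
            / (1 + z * z / n))

def wilson_lower_bound_sql(positive, negative):
    """The SQL expression above, transcribed with its hard-coded constants."""
    n = positive + negative  # the query only considers rows where n > 0
    return (((positive + 1.9208) / n
             - 1.96 * math.sqrt(positive * negative / n + 0.9604) / n)
            / (1 + 3.8416 / n))

print(wilson_lower_bound(10, 2), wilson_lower_bound_sql(10, 2))
```

One thing to watch when running the SQL itself: if positive and negative are integer columns, some databases will do integer division, so casting to a floating-point type first may be needed.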
I have uploaded an Oracle PL/SQL implementation to https://github.com/mattgrogan/stats_wilson_score
create or replace function stats_wilson_score(
/*****************************************************************************************************************
Author : Matthew Grogan
Website : https://github.com/mattgrogan
Name : stats_wilson_score.sql
Description : Oracle PL/SQL function to return the Wilson Score Interval for the given proportion.
Citation : Wilson E.B. J Am Stat Assoc 1927, 22, 209-212
Example:
select
round(29 / 250, 4) point_estimate,
stats_wilson_score(29, 250, 0.10, 'LCL') lcl,
stats_wilson_score(29, 250, 0.10, 'UCL') ucl
from dual;
******************************************************************************************************************/
x integer, -- Number of successes
m integer, -- Number of trials
alpha number default 0.05, -- Probability of a Type I error (must be 0.10, 0.05 or 0.01)
return_value varchar2 default 'LCL' -- LCL = Lower control limit, UCL = upper control limit
)
return number is
z float(10);
phat float(10) := 0.0;
lcl float(10) := 0.0;
ucl float(10) := 0.0;
begin
if m = 0 then
return(0);
end if;
case alpha
when 0.10 then z := 1.644854;
when 0.05 then z := 1.959964;
when 0.01 then z := 2.575829;
else return(null); -- No Z value for this alpha
end case;
phat := x/m;
lcl := (phat + z*z/(2*m) - z * sqrt( (phat * (1-phat) ) / m + z * z / (4 * (m * m)) ) ) / (1 + z * z / m);
ucl := (phat + z*z/(2*m) + z * sqrt((phat*(1-phat)+z*z/(4*m))/m))/(1+z*z/m);
case return_value
when 'LCL' then return(lcl);
when 'UCL' then return(ucl);
else return(null);
end case;
end;
/
grant execute on stats_wilson_score to public;
The Wilson score is actually not a very good way of sorting items by rating. It's certainly better than just sorting by mean review score, but it still has a lot of problems. For example, an item with 1 negative review (whose quality is still very uncertain) will be sorted below an item with 10 negative reviews and 1 positive review (which we can be fairly certain is bad quality).
I would recommend using an adaptation of the SteamDB rating formula instead (by Reddit user /u/tornmandate). In addition to being better suited to this sort of thing than the Wilson score (for reasons that are explained in the linked article), it can also be adapted to a 5-star rating system much more easily than Wilson.
Original SteamDB formula:
\( \text{Total Reviews} = \text{Positive Reviews} + \text{Negative Reviews} \)
\( \text{Review Score} = \frac{\text{Positive Reviews}}{\text{Total Reviews}} \)
\( \text{Rating} = \text{Review Score} - (\text{Review Score} - 0.5) \cdot 2^{-\log_{10}(\text{Total Reviews} + 1)} \)
5-star version (note the change from 0.5 (a 50% score with up/down votes) to 2.5 (a 50% score with 5-star ratings)):
\( \text{Total Reviews} = \text{total count of all reviews} \)
\( \text{Review Score} = \text{mean star rating of all reviews} \)
\( \text{Rating} = \text{Review Score} - (\text{Review Score} - 2.5) \cdot 2^{-\log_{10}(\text{Total Reviews} + 1)} \)
The formula is also much more understandable by non-mathematicians and easy to translate into code.
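As an illustration of how little code that takes, here is a Python sketch of both variants. The function name `steamdb_rating` and the `midpoint` parameter are my own; passing 0.5 gives the up/down-vote version and 2.5 gives the 5-star version as written above.

```python
import math

def steamdb_rating(score, total, midpoint=0.5):
    """Pull the raw score toward the midpoint; the pull shrinks as reviews grow.

    score: fraction of positive reviews (or mean star rating)
    total: total number of reviews
    midpoint: 0.5 for up/down votes, 2.5 for the 5-star version above
    """
    return score - (score - midpoint) * 2 ** (-math.log10(total + 1))

# The example from this answer: 1 negative review vs. 1 positive and 10 negative.
one_negative = steamdb_rating(0 / 1, 1)
mostly_negative = steamdb_rating(1 / 11, 11)
print(one_negative, mostly_negative)
```

Here the item with a single negative review comes out above the item with 1 positive and 10 negative reviews, which is the ordering argued for above.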