Calculate APR (annual percentage rate) Programmatically - objective-c

I am trying to find a way to programmatically calculate APR based on
Total Loan Amount
Payment Amount
Number of payments
Repayment frequency
There is no need to take any fees into account.
It's ok to assume a fixed interest rate, and any remaining amounts can be rolled into the last payment.
The formula below is based on a credit agreement for a total amount of credit of €6000 repayable in 24 equal monthly instalments of €274.11.
(The APR for the above example is 9.4%)
I am looking for an algorithm in any programming language that I can adapt to C.

I suppose you want to compute X from your equation. This equation can be written as
f(y) = y + y**2 + y**3 + ... + y**N - L/P = 0
where
X = APR
L = Loan (6000)
P = Individual Payment (274.11)
N = Number of payments (24)
F = Frequency (12 per year)
y = 1 / ((1 + X)**(1/F)) (substitution to simplify the equation)
Now you need to solve the equation f(y) = 0 to get y. This can be done, for example, with Newton's iteration (pseudo-code):
y = 1 (some plausible initial value)
repeat
dy = - f(y) / f'(y)
y += dy
until abs(dy) < eps
The derivative is:
f'(y) = 1 + 2*y + 3*y**2 + ... + N*y**(N-1)
You would compute f(y) and f'(y) using Horner's rule for polynomials to avoid the exponentiation. The derivative can likely be approximated by its first few terms. After you find y, you get X:
X = y**(-F) - 1
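The iteration above can be sketched in Python as follows. This is only an illustration under the answer's assumptions (equal payments, no fees); the function name and the eps and max_iter parameters are made up here. f(y) and f'(y) are accumulated together with Horner's rule, using the full derivative rather than a truncated one.

```python
def apr_from_payments(loan, payment, n_payments, per_year, eps=1e-12, max_iter=100):
    """Solve y + y**2 + ... + y**N = L/P by Newton's iteration, return X = y**(-F) - 1."""
    target = loan / payment
    y = 1.0  # plausible initial value, as in the pseudo-code
    for _ in range(max_iter):
        f, df = 0.0, 0.0
        # Horner's rule: after k steps f = y + ... + y**k and df is its derivative
        for _k in range(n_payments):
            df = df * y + (f + 1.0)
            f = (f + 1.0) * y
        dy = -(f - target) / df
        y += dy
        if abs(dy) < eps:
            break
    return y ** (-per_year) - 1.0


apr = apr_from_payments(6000, 274.11, 24, 12)
print(f"APR = {100 * apr:.1f}%")
```

For the example in the question (6000 repaid in 24 monthly instalments of 274.11) this gives about 9.4%.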

Here is the Objective C code snippet I came up with (which seems to be correct) if anybody is interested:
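// Newton's iteration on the per-period discount factor x.
// Start just below 1: the closed form below evaluates to 0/0 at x == 1.
double x = 0.99, fx, dx, z;
do {
    // fx = paymentAmt * (x + x^2 + ... + x^numPayments) - totalLoanAmt;
    // the 0 * pow(...) terms stand for an extra final payment (zero here)
    fx = initialPaymentAmt + paymentAmt * (pow(x, numPayments + 1) - x) / (x - 1) + 0 * pow(x, numPayments) - totalLoanAmt;
    dx = paymentAmt * (numPayments * pow(x, numPayments + 1) - (numPayments + 1) * pow(x, numPayments) + 1) / pow(x - 1, 2) + numPayments * 0 * pow(x, numPayments - 1);
    z = fx / dx;
    x = x - z;
} while (fabs(z) > 1e-9);
// ppa: number of payments per year
apr = 100 * (pow(1 / x, ppa) - 1);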

Related

Numerically stable calculation of invariant mass in particle physics?

In particle physics, we have to compute the invariant mass a lot, which is for a two-body decay
M**2 = m1**2 + m2**2 + 2 * (sqrt(p1**2 + m1**2) * sqrt(p2**2 + m2**2) - p1 . p2)
The momenta (p1, p2) are sometimes very large (by a factor of 1000 or more) compared to the masses (m1, m2). In that case, there is a large cancellation between the last two terms when the calculation is carried out with floating point numbers on a computer.
What kind of numerical tricks can be used to compute this accurately for any inputs?
The question is about suitable numerical tricks to improve the accuracy of the calculation with floating point numbers, so the solution should be language-agnostic. For demonstration purposes, implementations in Python are preferred. Solutions which reformulate the problem and increase the amount of elementary operations are acceptable, but solutions which suggest to use other number types like decimal or multi-precision floating point numbers are not.
Note: The original question presented a simplified 1D problem in the form of a Python expression, but the question is about the general case where the momenta are given in 3D. The question was reformulated accordingly.
With a few tricks listed on Stackoverflow and the transformation described by Jakob Stark in his answer, it is possible to rewrite the equation into a form that does not suffer anymore from catastrophic cancellation.
The original question asked for a solution in 1D, which has a simple solution, but in practice, we need the formula in 3D and then the solution is more complicated. See this notebook for a full derivation.
Example implementation of numerically stable calculation in 3D in Python:
import numpy as np
# numerically stable implementation
@np.vectorize
def msq2(px1, py1, pz1, px2, py2, pz2, m1, m2):
    p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
    p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
    m1_sq = m1 ** 2
    m2_sq = m2 ** 2
    x1 = m1_sq / p1_sq
    x2 = m2_sq / p2_sq
    x = x1 + x2 + x1 * x2
    a = angle(px1, py1, pz1, px2, py2, pz2)
    cos_a = np.cos(a)
    if cos_a >= 0:
        # sqrt(x + 1) - cos_a, rewritten to avoid the cancellation
        y1 = (x + np.sin(a) ** 2) / (np.sqrt(x + 1) + cos_a)
    else:
        y1 = -cos_a + np.sqrt(x + 1)
    y2 = 2 * np.sqrt(p1_sq * p2_sq)
    return m1_sq + m2_sq + y1 * y2
# numerically stable calculation of angle
def angle(x1, y1, z1, x2, y2, z2):
    # cross product (only its norm is used, so the sign of cy does not matter)
    cx = y1 * z2 - y2 * z1
    cy = x1 * z2 - x2 * z1
    cz = x1 * y2 - x2 * y1
    # norm of cross product
    c = np.sqrt(cx * cx + cy * cy + cz * cz)
    # dot product
    d = x1 * x2 + y1 * y2 + z1 * z2
    return np.arctan2(c, d)
The numerically stable implementation can never produce a negative result, which is a commonly occurring problem with naive implementations, even in double precision.
Let's compare the numerically stable function with a naive implementation.
# naive implementation
def msq1(px1, py1, pz1, px2, py2, pz2, m1, m2):
    p1_sq = px1 ** 2 + py1 ** 2 + pz1 ** 2
    p2_sq = px2 ** 2 + py2 ** 2 + pz2 ** 2
    m1_sq = m1 ** 2
    m2_sq = m2 ** 2
    # energies of particles 1 and 2
    e1 = np.sqrt(p1_sq + m1_sq)
    e2 = np.sqrt(p2_sq + m2_sq)
    # dangerous cancellation in third term
    return m1_sq + m2_sq + 2 * (e1 * e2 - (px1 * px2 + py1 * py2 + pz1 * pz2))
For the following image, the momenta p1 and p2 are randomly picked from 1 to 1e5, the values m1 and m2 are randomly picked from 1e-5 to 1e5. All implementations get the input values in single precision. The reference in both cases is calculated with mpmath using the naive formula with 100 decimal places.
The naive implementation loses all accuracy for some inputs, while the numerically stable implementation does not.
If you put e.g. m1 = 1e-4, m2 = 1e-4, p1 = 1 and p2 = 1 into the expression, you get about 4e-8 with double precision but 0.0 with single precision. I assume that your question is about how to get the 4e-8 with single precision as well.
What you can do is a Taylor expansion (around m1 = 0 and m2 = 0) of the expression above.
e ~ e|(m1=0,m2=0) + de/dm1|(m1=0,m2=0) * m1 + de/dm2|(m1=0,m2=0) * m2 + ...
If I calculated correctly, the zeroth and first order terms are 0 and the second order expansion would be
e ~ (p1+p2)/p1 * m1**2 + (p1+p2)/p2 * m2**2
This yields exactly 4e-8 even with single precision calculation. You can of course do more terms in the expansion if you need, until you hit the precision limit of a single float.
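A minimal sketch of this second-order expansion in single precision with NumPy. The function name msq_1d_taylor is made up for illustration, and the approximation is only meant for masses much smaller than the momenta:

```python
import numpy as np


def msq_1d_taylor(p1, p2, m1, m2):
    """Second-order expansion e ~ (p1+p2)/p1 * m1**2 + (p1+p2)/p2 * m2**2 in float32."""
    p1, p2, m1, m2 = (np.float32(v) for v in (p1, p2, m1, m2))
    s = p1 + p2
    # no subtraction of nearly equal numbers occurs, so nothing cancels
    return s / p1 * m1 * m1 + s / p2 * m2 * m2


print(msq_1d_taylor(1.0, 1.0, 1e-4, 1e-4))
```

With the inputs from the example (m1 = m2 = 1e-4, p1 = p2 = 1) the result stays at about 4e-8, even though every intermediate value is a single-precision float.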
Edit
If the mi are not always much smaller than the pi you could further massage the equation to get
e = m1**2 + m2**2 + 2 * p1 * p2 * [sqrt(1 + x) - 1], with x = (m1/p1)**2 + (m2/p2)**2 + (m1*m2/(p1*p2))**2
The complicated part is now the one in the square brackets. It essentially is sqrt(x+1)-1 for a wide range of x values. If x is very small, we can use the Taylor expansion of the square root (e.g. like here). If x is larger, the formula works just fine, because adding and subtracting 1 no longer discards the value of x due to floating point precision. So a threshold for x must be chosen, below which one switches to the Taylor expansion.
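The threshold idea can be sketched as follows for the 1D case. The function names, the three-term series and the threshold of 1e-2 are assumptions chosen for illustration, not values from the answer; the expression for x assumes the form x = x1 + x2 + x1*x2 with xi = (mi/pi)**2.

```python
import numpy as np


def sqrt1p_minus_1(x, threshold=np.float32(1e-2)):
    """sqrt(1 + x) - 1 for x >= 0 in float32, using a Taylor series for small x."""
    x = np.float32(x)
    if x < threshold:
        # sqrt(1 + x) - 1 = x/2 - x**2/8 + x**3/16 - ...
        return x * (np.float32(0.5) - x * (np.float32(0.125) - x * np.float32(0.0625)))
    return np.sqrt(np.float32(1) + x) - np.float32(1)


def msq_1d(p1, p2, m1, m2):
    """m1**2 + m2**2 + 2*p1*p2*(sqrt(1 + x) - 1) for collinear momenta, in float32."""
    p1, p2, m1, m2 = (np.float32(v) for v in (p1, p2, m1, m2))
    x1 = (m1 / p1) ** 2
    x2 = (m2 / p2) ** 2
    x = x1 + x2 + x1 * x2
    return m1 * m1 + m2 * m2 + np.float32(2) * p1 * p2 * sqrt1p_minus_1(x)


print(msq_1d(1.0, 1.0, 1e-4, 1e-4))
```

An alternative that needs no threshold is the algebraically identical form x / (sqrt(1 + x) + 1), which has no subtraction of nearly equal numbers for any x >= 0.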

McDonald's sells Chicken McNuggets only in packages of 6, 9 or 20. What is the largest number of McNuggets that cannot be bought exactly?

Question is from MIT OCW Course Number 6.00, As Taught in Fall 2008:
Here is a theorem:
If it is possible to buy x, x+1,…, x+5 sets of McNuggets, for some x, then it is possible to buy any number of McNuggets >= x, given that McNuggets come in 6, 9 and 20 packs.
Using the theorem above, write an exhaustive search to find the largest number of McNuggets that cannot be bought in exact quantity, i.e. write an iterative program that finds the largest number of McNuggets that cannot be bought in exact quantity. Format of search should follow the outline below:
Hypothesise possible instances of numbers of McNuggets that cannot be purchased exactly, starting with 1.
For each possible instance, called n, test if there exist non-negative integers a, b, and c, such that 6a+9b+20c = n.
If n McNuggets cannot be bought in exact quantity, save n.
When you have found 6 consecutive values of n where 6a+9b+20c = n, the last answer that was saved (not the last value of n that had a solution) is the correct answer, since from the theorem, any amount larger than this saved value of n can also be bought in exact quantity.
The error is in line 14 of the code below and this is the error:
elif(6*a + 9*b + 20*c < n_saved or 6*a + 9*b + 20*c > n_saved):
^
SyntaxError: invalid syntax
Here is the code:
def largest_not(a, b, c, n, n_saved):
a = 0
b = 0
c = 0
n = 0
n_saved = 0
for a in range (10):
for b in range (10):
for c in range (10):
for n in range (10):
for n_saved in range (10):
if (6*a + 9*b + 20*c == n):
print (n)
elif(6*a + 9*b + 20*c < n_saved or 6*a + 9*b + 20*c > n_saved):
print (n_saved)
if (n - n_saved > 5):
print "Largest number of McNuggets that cannot be bought in exact quantity: " + "<" + n_saved + ">"
else :
print "Have not found the largest number of McNuggets that cannot be bought in exact quantity."
a=6
b=9
c=20
largest_not(a, b, c, n, n_saved)
Here is a way to solve this:
def check(n):
    """check if n nuggets can be bought exactly with packs of 6, 9 and 20"""
    for a in range(20):
        for b in range(20):
            for c in range(20):
                if (6*a+9*b+20*c==n):
                    return True
    return False
### look for a series of 6 successive n
### to apply the theorem
nb_i = 0 ## number of successive good n found
sv_i = 0 ## last good n found
bad_i = 0 ## last bad n found
for i in range(1, 100):
    if (check(i)):
        nb_i+=1
        sv_i=i
    else:
        bad_i=i
        nb_i=0
        sv_i=0
    if nb_i==6:
        print("Solved: the biggest number of nuggets you cannot buy exactly is: " + str(bad_i))
        break
The result is:
Solved: the biggest number of nuggets you cannot buy exactly is: 43
Your elif needs to be in line with your if. However, your algorithm will also not work.
I got 43 as the largest number not possible; I'm sure this solution could be optimised.
def check(n):
    # a in [0..max times 6 goes into n]
    for a in range(0, n // 6 + 1):
        # b in [0..max times 9 goes into remainder]
        for b in range((n - 6*a) // 9 + 1):
            # c in [0..max times 20 goes into remainder]
            for c in range(0, (n - 6*a - 9*b) // 20 + 1):
                if 6*a + 9*b + 20*c == n:
                    return (a, b, c)
    return None
def largest_not():
    n = 1
    last_n_not_possible = 1
    while (n - last_n_not_possible <= 6):
        can_purchase = check(n)
        if can_purchase is not None:
            print("Can purchase ", n, ' = ', can_purchase[0],'*6 + ', can_purchase[1],'*9 + ', can_purchase[2], '*20', sep='')
        else:
            print("Can't purchase", n)
            last_n_not_possible = n
        n = n + 1
    return last_n_not_possible
print("Answer: ", largest_not())
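One possible optimisation, sketched here as an assumption rather than as part of either answer, is to avoid the nested loops altogether: n can be bought exactly if n minus one pack size can be bought exactly. The function name largest_unbuyable is made up, and the loop only terminates when the pack sizes have no common divisor greater than 1 (true for 6, 9 and 20).

```python
def largest_unbuyable(packs=(6, 9, 20)):
    """Largest n that is not a non-negative combination of the pack sizes."""
    run_needed = min(packs)  # the theorem needs 6 consecutive buyable values
    buyable = [True]         # buyable[0]: zero nuggets need zero packs
    last_bad, run, n = 0, 0, 0
    while run < run_needed:
        n += 1
        # n is buyable if removing one pack leaves a buyable amount
        ok = any(n >= p and buyable[n - p] for p in packs)
        buyable.append(ok)
        if ok:
            run += 1
        else:
            last_bad, run = n, 0
    return last_bad


print(largest_unbuyable())
```

Each n is decided with at most three list lookups, so the search is linear in the answer instead of cubic.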

Is there a more concise way to calculate the P value for the Anderson-Darling A-squared statistic with VBA?

I have two bits of code in VBA for Excel. One calculates the A-squared statistic for the Anderson-Darling test; the one below calculates the P value of the A-squared statistic. I am curious whether there is a more concise or more efficient way to calculate this value in VBA:
Function AndDarP(AndDar, Elements)
'Calculates P value for level of significance for the
'Anderson-Darling Test for Normality
'AndDar is the Anderson-Darling Test Statistic
'Elements is the count of elements used in the
'Anderson-Darling test statistic.
'based on calculations at
'http://www.kevinotto.com/RSS/Software/Anderson-Darling%20Normality%20Test%20Calculator.xls
'accessed 21 May 2010
'www.kevinotto.com
'kevin_n_otto#yahoo.com
'Version 6.0
'Permission to freely distribute and modify when properly
'referenced and contact information maintained.
'
'"Keep in mind the test assumes normality, and is looking for sufficient evidence to reject normality.
'That is, a large p-value (often p > alpha = 0.05) would indicate normality.
' * * *
'Test Hypotheses:
'Ho: Data is sampled from a population that is normally distributed
'(no difference between the data and normal data).
'Ha: Data is sampled from a population that is not normally distributed"
    Dim M As Double
    M = AndDar * (1 + 0.75 / Elements + 2.25 / Elements ^ 2)
    Select Case M
        Case Is < 0.2
            AndDarP = 1 - Exp(-13.436 + 101.14 * M - 223.73 * M ^ 2)
        Case Is < 0.34
            AndDarP = 1 - Exp(-8.318 + 42.796 * M - 59.938 * M ^ 2)
        Case Is < 0.6
            AndDarP = Exp(0.9177 - 4.279 * M - 1.38 * M ^ 2)
        Case Is < 13
            AndDarP = Exp(1.2937 - 5.709 * M + 0.0186 * M ^ 2)
        Case Else
            AndDarP = 0
    End Select
End Function
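One way to shorten such a function is to move the thresholds and coefficients into a table and loop over it. The sketch below shows the idea in Python rather than VBA, so it illustrates the structure and is not a drop-in replacement; the names are made up, while the coefficients are copied from the function above.

```python
import math

# (upper bound on M, c0, c1, c2, return 1 - exp(...) instead of exp(...))
_SEGMENTS = (
    (0.2, -13.436, 101.14, -223.73, True),
    (0.34, -8.318, 42.796, -59.938, True),
    (0.6, 0.9177, -4.279, -1.38, False),
    (13.0, 1.2937, -5.709, 0.0186, False),
)


def anderson_darling_p(a_squared, n):
    """P value for the Anderson-Darling statistic a_squared computed from n elements."""
    m = a_squared * (1 + 0.75 / n + 2.25 / n ** 2)
    for upper, c0, c1, c2, complement in _SEGMENTS:
        if m < upper:
            e = math.exp(c0 + m * (c1 + m * c2))
            return 1 - e if complement else e
    return 0.0  # same as the Case Else branch


print(anderson_darling_p(0.5, 20))
```

The same table-driven layout could be written in VBA with arrays, at the cost of losing the readability of the Select Case version.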

Find 3D point along the line at given distance

I have a problem and please let me know if my solution is correct.
I have a known point at location A(x1,y1,z1) and the origin O(0,0,0), and I would like to find the coordinates of a point B(x2,y2,z2) that is located on the line OA such that the distance OB is 1.2 times the distance OA.
So, my idea is to obtain the equation of the line formed by points O and A.
The direction of OA is (-x1, -y1, -z1), so the equation of the line is:
x = -x1*t;
y = -y1*t;
z = -z1*t;
Distance OA is sqrt( (x1-0)^2 + (y1-0)^2 + (z1-0)^2). KNOWN
Distance OB is sqrt( (x2-0)^2 + (y2-0)^2 + (z2-0)^2). UNKNOWN
I can substitute the x, y, z expressions from the line equation into the distance OB, and the result should be 1.2 times the distance OA.
So, sqrt( (-x1*t-0)^2 + (-y1*t-0)^2 + (-z1*t-0)^2) = 1.2 * dist(OA).
I find t from here, solving the quadratic equation and I obtain the coordinates of the point by replacing the t in the equation of the line.
Is this correct?
Thank you for your time.
EDIT:
This is my code:
rangeRatio = 1.114;
norm = sqrt((P2(1) - P1(1))^2 + (P2(2) - P1(2))^2 + (P2(3) - P1(3))^2);
P3(1) = P1(1) + ((P2(1,1) - P1(1)) /norm) * rangeRatio;
P3(2) = P1(2) + ((P2(1,2) - P1(2)) /norm) * rangeRatio;
P3(3) = P1(3) + ((P2(1,3) - P1(3)) /norm) * rangeRatio;
I also tried norm = 1, and I get slightly different results, but the points are still not always collinear.
Thank you
It is even a lot easier; you can just multiply the coordinates of A (x1, y1 and z1) by 1.2. This gives B = (1.2*x1, 1.2*y1, 1.2*z1), so OB is 1.2 times as long as OA and lies on the same line.
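The same idea carries over to a line that does not start at the origin: scale the difference vector instead of the point itself. A small sketch in Python, where the function name point_along_line is made up for illustration:

```python
import math


def point_along_line(p1, p2, ratio):
    """Point on the line through p1 and p2 at ratio times the distance p1 -> p2 from p1."""
    return tuple(a + ratio * (b - a) for a, b in zip(p1, p2))


origin = (0.0, 0.0, 0.0)
a = (1.0, 2.0, 2.0)                 # |OA| = 3
b = point_along_line(origin, a, 1.2)
print(b, math.sqrt(sum(c * c for c in b)))
```

Note that the difference vector is not divided by its norm here. Dividing by the norm, as in the code in the question, places the point at an absolute distance of rangeRatio from P1 rather than at rangeRatio times the distance between P1 and P2.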

Implementing Wilson Score in SQL

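We have a relatively small table that we would like to sort based on rating, using the Wilson interval or a reasonable equivalent. I'm a reasonably smart guy, but my math fu is nowhere near strong enough to understand this:
\( \dfrac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1-\hat{p}) + \dfrac{z_{\alpha/2}^2}{4n}}{n}}}{1 + \dfrac{z_{\alpha/2}^2}{n}} \)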
The above formula, I am told, calculates a score for a positive/negative (thumbs up/thumbs down) voting system. I've never taken a statistics course, and it's been 15 years since I've done any sort of advanced mathematics. I don't have a clue what the little hat that the p is wearing means, or what the backwards Jesus fish beneath z indicates.
I would like to know two things:
Can this formula be altered to accommodate a 5-star rating system? I found this, but the author expresses his doubts as to the accuracy of his formula.
How can this formula be expressed in a SQL function? Note that I do not need to calculate and sort in real-time. The score can be calculated and cached daily.
Am I overlooking something built-in to Microsoft SQL Server?
Instead of trying to adapt Wilson's algorithm to a 5-star rating system, why don't you look into a different algorithm? This is what IMDb uses for their top 250: Bayesian Estimate
As for explaining the math in Wilson's algorithm, the code below was posted on the link in your first post. It is written in Ruby.
require 'statistics2'
def ci_lower_bound(pos, n, power)
  if n == 0
    return 0
  end
  z = Statistics2.pnormaldist(1-power/2)
  phat = 1.0*pos/n
  (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
end
If you'd like another example, here is one in PHP:
http://www.derivante.com/2009/09/01/php-content-rating-confidence/
Edit: It seems that derivante.com is no longer around. You can see the original article on archive.org - https://web.archive.org/web/20121018032822/http://derivante.com/2009/09/01/php-content-rating-confidence/ and I've added the code from the article below.
class Rating
{
    public static function ratingAverage($positive, $total, $power = '0.05')
    {
        if ($total == 0)
            return 0;
        $z = Rating::pnormaldist(1-$power/2,0,1);
        $p = 1.0 * $positive / $total;
        $s = ($p + $z*$z/(2*$total) - $z * sqrt(($p*(1-$p)+$z*$z/(4*$total))/$total))/(1+$z*$z/$total);
        return $s;
    }
    public static function pnormaldist($qn)
    {
        $b = array(
            1.570796288, 0.03706987906, -0.8364353589e-3,
            -0.2250947176e-3, 0.6841218299e-5, 0.5824238515e-5,
            -0.104527497e-5, 0.8360937017e-7, -0.3231081277e-8,
            0.3657763036e-10, 0.6936233982e-12);
        if ($qn < 0.0 || 1.0 < $qn)
            return 0.0;
        if ($qn == 0.5)
            return 0.0;
        $w1 = $qn;
        if ($qn > 0.5)
            $w1 = 1.0 - $w1;
        $w3 = - log(4.0 * $w1 * (1.0 - $w1));
        $w1 = $b[0];
        for ($i = 1;$i <= 10; $i++)
            $w1 += $b[$i] * pow($w3,$i);
        if ($qn > 0.5)
            return sqrt($w1 * $w3);
        return - sqrt($w1 * $w3);
    }
}
As for doing this in SQL, SQL has all these math functions already in its library. If I were you I'd do this in your application, though. Make your application update your database every so often (hours? days?) instead of doing this on the fly, or your application will become very slow.
Regarding your first question (adjusting the formula to the 5-stars system) I would agree with Paul Creasey.
conversion formula: [3 +/- i stars -> i up/down-votes] (3 stars -> 0)
example: 4 stars -> +1 up-vote, 5 stars -> +2, 1 -> -2 and so on.
I would note though that instead of the lower bound of the interval that both the Ruby and PHP functions compute, I would just compute the much simpler Wilson midpoint:
(x + (z^2)/2) / (n + z^2)
where:
n = Sum(up_votes) + Sum(|down_votes|)
x = number of positive votes = Sum(up_votes)
z = 1.96 (fixed value)
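For comparison, here is a small Python sketch of both quantities: the lower bound that the Ruby and PHP functions compute, and the midpoint above. The function names are made up, x is taken as the number of positive votes, and z is fixed at 1.96 as suggested.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for positive out of total votes."""
    if total == 0:
        return 0.0
    phat = positive / total
    centre = phat + z * z / (2 * total)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * total)) / total)
    return (centre - margin) / (1 + z * z / total)


def wilson_midpoint(positive, total, z=1.96):
    """Centre of the Wilson score interval: (x + z^2/2) / (n + z^2)."""
    return (positive + z * z / 2) / (total + z * z)


print(wilson_lower_bound(10, 10), wilson_midpoint(10, 10))
```

The midpoint pulls every item towards 0.5, and more strongly so for items with few votes, while the lower bound additionally subtracts the uncertainty margin.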
Taking William's link to the PHP solution http://www.derivante.com/2009/09/01/php-content-rating-confidence/ and making your system such that it just has positive and negative votes (5 stars could be 2 positive, 1 star could be 2 negative perhaps), it would be fairly easy to convert it to T-SQL, but you'd be much better off doing it in the server-side logic.
The author of the first link recently added an SQL implementation to his post.
Here it is:
SELECT widget_id, ((positive + 1.9208) / (positive + negative) -
1.96 * SQRT((positive * negative) / (positive + negative) + 0.9604) /
(positive + negative)) / (1 + 3.8416 / (positive + negative))
AS ci_lower_bound FROM widgets WHERE positive + negative > 0
ORDER BY ci_lower_bound DESC;
Whether this can be accommodated to a 5-star rating system is beyond me too.
I have uploaded an Oracle PL/SQL implementation to https://github.com/mattgrogan/stats_wilson_score
create or replace function stats_wilson_score(
/*****************************************************************************************************************
Author : Matthew Grogan
Website : https://github.com/mattgrogan
Name : stats_wilson_score.sql
Description : Oracle PL/SQL function to return the Wilson Score Interval for the given proportion.
Citation : Wilson E.B. J Am Stat Assoc 1927, 22, 209-212
Example:
select
round(29 / 250, 4) point_estimate,
stats_wilson_score(29, 250, 0.10, 'LCL') lcl,
stats_wilson_score(29, 250, 0.10, 'UCL') ucl
from dual;
******************************************************************************************************************/
  x integer, -- Number of successes
  m integer, -- Number of trials
  alpha number default 0.05, -- Probability of a Type I error
  return_value varchar2 default 'LCL' -- LCL = Lower control limit, UCL = upper control limit
)
return number is
  z float(10);
  phat float(10) := 0.0;
  lcl float(10) := 0.0;
  ucl float(10) := 0.0;
begin
  if m = 0 then
    return(0);
  end if;
  case alpha
    when 0.10 then z := 1.644854;
    when 0.05 then z := 1.959964;
    when 0.01 then z := 2.575829;
    else return(null); -- No Z value for this alpha
  end case;
  phat := x/m;
  lcl := (phat + z*z/(2*m) - z * sqrt( (phat * (1-phat) ) / m + z * z / (4 * (m * m)) ) ) / (1 + z * z / m);
  ucl := (phat + z*z/(2*m) + z * sqrt((phat*(1-phat)+z*z/(4*m))/m))/(1+z*z/m);
  case return_value
    when 'LCL' then return(lcl);
    when 'UCL' then return(ucl);
    else return(null);
  end case;
end;
/
grant execute on stats_wilson_score to public;
The Wilson score is actually not a very good way of sorting items by rating. It's certainly better than just sorting by mean review score, but it still has a lot of problems. For example, an item with 1 negative review (whose quality is still very uncertain) will be sorted below an item with 10 negative reviews and 1 positive review (which we can be fairly certain is bad quality).
I would recommend using an adaptation of the SteamDB rating formula instead (by Reddit user /u/tornmandate). In addition to being better suited to this sort of thing than the Wilson score (for reasons that are explained in the linked article), it can also be adapted to a 5-star rating system much more easily than Wilson.
Original SteamDB formula:
\( \text{Total Reviews} = \text{Positive Reviews} + \text{Negative Reviews} \)
\( \text{Review Score} = \frac{\text{Positive Reviews}}{\text{Total Reviews}} \)
\( \text{Rating} = \text{Review Score} - (\text{Review Score} - 0.5) \cdot 2^{-\log_{10}(\text{Total Reviews} + 1)} \)
5-star version (note the change from 0.5 (a 50% score with up/down votes) to 2.5 (a 50% score with 5-star ratings)):
\( \text{Total Reviews} = \text{total count of all reviews} \)
\( \text{Review Score} = \text{mean star rating of all reviews} \)
\( \text{Rating} = \text{Review Score} - (\text{Review Score} - 2.5) \cdot 2^{-\log_{10}(\text{Total Reviews} + 1)} \)
The formula is also much more understandable by non-mathematicians and easy to translate into code.
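As a rough illustration of how directly the two formulas translate, here is a Python sketch. The function names are made up, and returning the neutral value when there are no reviews is an assumption, since the formulas do not define that case.

```python
import math


def steamdb_rating(positive, negative):
    """Up/down-vote rating: the review score pulled towards 0.5 when reviews are few."""
    total = positive + negative
    if total == 0:
        return 0.5  # assumption: no reviews means a neutral rating
    score = positive / total
    return score - (score - 0.5) * 2 ** (-math.log10(total + 1))


def star_rating(mean_stars, total, neutral=2.5):
    """5-star variant: mean_stars is the mean star rating over total reviews."""
    if total == 0:
        return neutral  # assumption, as above
    return mean_stars - (mean_stars - neutral) * 2 ** (-math.log10(total + 1))


print(steamdb_rating(0, 1), steamdb_rating(1, 10))
```

With these functions, the item with a single negative review ranks above the item with 10 negative reviews and 1 positive review, which is the ordering argued for above.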