Python to fit a linear-plateau curve - pandas

I have a curve where Y initially increases linearly with X and then reaches a plateau at point C.
In other words, the curve can be defined as:
if X < C:
    Y = k * X + b
else:
    Y = k * C + b
The training data is a list of X ~ Y values. I need to determine k, b, and C through a machine-learning approach (or similar), since the data is noisy and the breakpoint C changes over time. I want something more robust than reading C off the current sample data by eye.
How can I do it using sklearn or maybe scipy?

Without loss of generality, you can say the second equation is just
Y = const
where const is the plateau level k*C + b. So you have a linear regression to fit the line, plus a detection problem to find where the constant begins.
You know that at high values of X, i.e., X > C, you are already at the constant. So walk back down the values of X and check how far you keep getting the same constant.
Then do a linear regression on the points with X <= C to find the line. A rough sketch of this recipe is below.
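A minimal sketch, assuming X and Y are 1-D NumPy arrays; the plateau estimate and the noise tolerance are made-up heuristics:

import numpy as np
from sklearn.linear_model import LinearRegression

top = np.sort(Y)[-10:]                      # highest observed responses
plateau = np.median(top)                    # robust estimate of the constant
tol = 3 * np.std(top)                       # made-up noise tolerance
C_est = X[Y < plateau - tol].max()          # last X clearly below the plateau
mask = X <= C_est
reg = LinearRegression().fit(X[mask].reshape(-1, 1), Y[mask])
k, b = reg.coef_[0], reg.intercept_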

Your model is nonlinear.
I think the smartest way to solve this is with these steps:
Find the maximum value of Y, which is equal to k*C + b:
M = max(Y)
Drop this maximum value from your dataset:
df1 = df[df.Y != M]
Then you have a simple dataset to fit X against Y, and you can use sklearn for that. (Since the question also mentions scipy, a direct nonlinear fit is sketched below.)
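Because the model is nonlinear in the breakpoint, scipy can fit all three parameters at once. A minimal sketch with scipy.optimize.curve_fit, assuming X and Y are the training arrays and using a made-up initial guess:

import numpy as np
from scipy.optimize import curve_fit

def linear_plateau(x, k, b, c):
    # k*x + b until the breakpoint c, flat at k*c + b afterwards,
    # which is the same as k * min(x, c) + b
    return k * np.minimum(x, c) + b

p0 = [1.0, 0.0, np.median(X)]    # rough starting guess for k, b, C
(k, b, C), _ = curve_fit(linear_plateau, X, Y, p0=p0)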


Why are higher-order functions not popular in numpy

For example, when you compute x * y + z
with a higher-order function, it could be expressed as something like on3(x, y, z, lambda x, y, z: x * y + z), which in theory could save a lot of computation by fusing the operations into one pass.
My question is why such patterns are rare in numpy.
If you say x * y + z with NumPy arrays, x * y will allocate a temporary array and then + z will allocate the final array. If you need to squeeze out every bit of performance, you can avoid the intermediate temporary array like this:
r = x * y
r += z
That will only allocate a single array, which is nearly optimal assuming you don't want to mutate the inputs. If you do want to mutate them, you could instead do this:
x *= y # or np.multiply(x, y, out=x)
x += z # or np.add(x, z, out=x)
Then you allocate nothing.
The above may still not be optimal if the data do not fit in cache, because you have to traverse x twice. You can solve this problem using Numba, by writing a vectorized function:
import numba

@numba.vectorize
def fma(x, y, z):
    return x * y + z
Now when you run fma(x, y, z) it will visit the first element of each array, run x * y + z on those three elements, detect the type of the result, allocate an output array of that type, and then do the calculation for the rest of the elements.
Putting it all together, doing a single pass over the inputs and allocating nothing:
fma(x, y, z, out=x) # can also use out=y or out=z
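For instance, a quick sanity check (with made-up equal-shape inputs):

import numpy as np

x = np.random.rand(10**6)
y = np.random.rand(10**6)
z = np.random.rand(10**6)

expected = x * y + z           # two passes, two temporaries
result = fma(x, y, z)          # one pass, one output allocation
assert np.allclose(result, expected)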

Tensorflow: ignore a specific dependency during tf.gradients()

Given variables y and z, both of which depend on a tensor x. By the product rule, tf.gradients(y*z, x) gives me y'(x)z(x) + z'(x)y(x). Is there a way I can treat y as a constant with respect to x, so that tf.gradients(y*z, x) only gives me z'(x)y(x)?
I know y_=tf.constant(sess.run(y)) will give me y as a constant, but I cannot use that solution in my code.
You can use tf.stop_gradient() to block backpropagation. To block gradients in your example:
y = function1(x)
z = function2(x)
blocked_y = tf.stop_gradient(y)
product = blocked_y * z
After you backpropagate through product, the backpropagation will continue into z but not into y. A runnable sketch is below.
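A minimal runnable sketch (TF 1.x graph mode; x**2 and x**3 are stand-ins for whatever y and z really are):

import tensorflow as tf

x = tf.constant(2.0)
y = x ** 2                            # y'(x) = 2x
z = x ** 3                            # z'(x) = 3x^2
product = tf.stop_gradient(y) * z

grad = tf.gradients(product, x)[0]    # y(x) * z'(x) only
with tf.Session() as sess:
    print(sess.run(grad))             # 48.0; the full product rule would give 80.0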

Fast algorithm for computing cofactor matrix

I wonder if there is a fast algorithm, say O(n^3), for computing the cofactor matrix (or the adjugate, its transpose) of an N*N square matrix. Yes, one could compute the determinant and the inverse separately and then multiply them together, but what if the square matrix is non-invertible?
I am curious about the accepted answer here: Speed up python code for computing matrix cofactors
What is meant by "This probably means that also for non-invertible matrixes, there is some clever way to calculate the cofactor (i.e., not use the mathematical formula that you use above, but some other equivalent definition)."?
Factorize M = L x D x U, where L is lower triangular with ones on the main diagonal, U is upper triangular with ones on the main diagonal, and D is diagonal.
You can use back-substitution as with Cholesky factorization, which is similar. Then,
M^{-1} = U^{-1} x D^{-1} x L^{-1}
and the transposed cofactor matrix is:
Cof(M)^T = Det(U) x Det(D) x Det(L) x M^{-1}
If M is singular or nearly so, one or more elements of D will be zero or nearly zero. Replace those elements with zero in the matrix product and with 1 in the determinant, and use the above equation for the transposed cofactor matrix.
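The same division-free idea can also be sketched with an SVD in place of the LDU factorization (not the recipe above, but an equivalent O(n^3) route that survives singular M): the adjugate of a diagonal matrix needs only products of the other diagonal entries, never a division.

import numpy as np

def adjugate(M):
    # M = U @ diag(s) @ Vt with U, Vt orthogonal
    U, s, Vt = np.linalg.svd(M)
    # adj(diag(s)) = diag(prod of the other entries); zeros are harmless
    g = np.array([np.prod(np.delete(s, i)) for i in range(len(s))])
    sign = round(np.linalg.det(U)) * round(np.linalg.det(Vt))  # each is +-1
    return sign * (Vt.T @ np.diag(g) @ U.T)

# The cofactor matrix is the transpose: Cof(M) = adjugate(M).T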

Coordinate Descent Algorithm in Julia for Least Squares not converging

As a warm-up to writing my own elastic net solver, I'm trying to get a fast enough version of ordinary least squares implemented using coordinate descent.
I believe I've implemented the coordinate descent algorithm correctly, but when I use the "fast" version (see below), the algorithm is insanely unstable, outputting regression coefficients that routinely overflow a 64-bit float when the number of features is of moderate size compared to the number of samples.
Linear Regression and OLS
If b = A*x, where A is a matrix, x is the vector of unknown regression coefficients, and b is the output, I want to find the x that minimizes
||b - Ax||^2
If A[j] is the jth column of A and A[-j] is A without column j, and the columns of A are normalized so that ||A[j]||^2 = 1 for all j, the coordinate-wise update is then
Coordinate Descent:
x[j] <-- A[j]^T * (b - A[-j] * x[-j])
I'm following along with these notes (page 9-10) but the derivation is simple calculus.
It's pointed out that instead of recomputing A[j]^T(b - A[-j] * x[-j]) all the time, a faster way to do it is with
Fast Coordinate Descent:
x[j] <-- A[j]^T*r + x[j]
where the total residual r = b - A*x is computed outside the loop over coordinates. The equivalence of the two update rules follows from Ax = A[j]*x[j] + A[-j]*x[-j]: substituting gives A[j]^T * (b - A[-j]*x[-j]) = A[j]^T * (b - Ax) + A[j]^T * A[j] * x[j] = A[j]^T*r + x[j], using the normalization ||A[j]||^2 = 1.
My problem is that while the second method is indeed faster, it's wildly numerically unstable for me whenever the number of features isn't small compared to the number of samples. I was wondering if anyone might have some insight as to why that's the case. I should note that the first method, which is more stable, still starts disagreeing with more standard methods as the number of features approaches the number of samples.
Julia code
Below is some Julia code for the two update rules:
function OLS_builtin(A, b)
    x = A \ b
    return x
end

function OLS_coord_descent(A, b)
    N, P = size(A)
    x = zeros(P)
    for cycle in 1:1000
        for j = 1:P
            x[j] = dot(A[:,j], b - A[:, 1:P .!= j] * x[1:P .!= j])
        end
    end
    return x
end
function OLS_coord_descent_fast(A, b)
    N, P = size(A)
    x = zeros(P)
    for cycle in 1:1000
        r = b - A * x
        for j = 1:P
            x[j] += dot(A[:,j], r)
        end
    end
    return x
end
Example of the problem
I generate data with the following:
n = 100
p = 50
σ = 0.1
β_nz = float([i*(-1)^i for i in 1:10])
β = append!(β_nz,zeros(Float64,p-length(β_nz)))
X = randn(n,p); X .-= mean(X,1); X ./= sqrt(sum(abs2(X),1))
y = X*β + σ*randn(n); y .-= mean(y);
Here I use p = 50, and I get good agreement between OLS_coord_descent(X,y) and OLS_builtin(X,y), whereas OLS_coord_descent_fast(X,y) returns exponentially large values for the regression coefficients.
When p is less than about 20, OLS_coord_descent_fast(X,y) agrees with the other two.
Conjecture
Since things agree in the regime p << n, I think the algorithm is formally correct but numerically unstable. Does anyone have any thoughts on whether this guess is correct, and if so, how to correct the instability while retaining (most of) the performance gains of the fast version of the algorithm?
The quick answer: you forgot to update r after each x[j] update. The following is the fixed function, which behaves like OLS_coord_descent:
function OLS_coord_descent_fast(A, b)
    N, P = size(A)
    x = zeros(P)
    for cycle in 1:1000
        r = b - A * x
        for j = 1:P
            s = dot(A[:,j], r)
            x[j] += s
            r -= A[:,j] * s    # the missing line: keep r in sync with x
        end
    end
    return x
end

Plot a third point past the two previously plotted points. Cocos2d

OK, so let me try to explain this the best way that I can.
I have two points plotted, 'A' and 'B', and I am trying to plot a third point 'C' so that it is past point 'B' but along the same slope. I have the angle of the line. I would post some code, but I really have no idea where to begin.
Any help would be awesome!
Here is the little code that I do have:
CGPoint vector = ccpSub(touchedPoint, fixedPoint);
CGFloat rotateAngle = -ccpToAngle(vector);
Assuming that by this you mean you need a third point C such that all the points are collinear, all you need to do is calculate the vector that takes you from A to B, and then generate the new point by adding multiples of this vector to B. Choose the multiple based on how far you want C to be from B.
As an example, say A = (2,2), B = (4,3). Then the vector from A to B is given by (2,1).
All you need to do then is work out how far your new point should be from B and add the multiple K*(2,1) to B, where K is chosen to meet your distance requirement.
I am assuming you are in 2D, but the same method applies in higher dimensions. A quick sketch of the arithmetic is below.
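A sketch of that arithmetic in Python (the thread is cocos2d/CGPoint, but the vector math is the same in any language; K = 1.5 is a made-up multiple):

A = (2.0, 2.0)
B = (4.0, 3.0)
K = 1.5                                    # how many AB-lengths past B
v = (B[0] - A[0], B[1] - A[1])             # vector from A to B: (2.0, 1.0)
C = (B[0] + K * v[0], B[1] + K * v[1])
print(C)                                   # (7.0, 4.5)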
My math is rusty, but the linear equation is generally represented as y=m*x+b, where m is the slope, and b is the y-intercept. You can get m, the slope, by taking the difference of the y values and dividing that by the difference in the x values, e.g., if A = (2,2) and B = (4,3), then m is (3-2)/(4-2) or 0.5. Then, you can solve the linear equation for b, the y-intercept, i.e. b=y-m*x and then plug in either of the data points, e.g. if we plug in the x and y values for point A, you get b = 2 - 0.5 * 2 = 1. Now knowing the slope, m (0.5 in this example), and the y-intercept, b (1 in this example), you can calculate the y for any x value using y=m*x+b, in this case y=0.5*x+1.
So, if touchedPoint and fixedPoint are CGPoint, you can calculate the slope and y-intercept from fixedPoint and touchedPoint like so:
double m = (fixedPoint.y - touchedPoint.y) / (fixedPoint.x - touchedPoint.x);
double b = fixedPoint.y - m * fixedPoint.x;
Now, you don't say how you want to determine where this third point, C, is. But if you, for example, knew the x coordinate for this new point C, you could calculate the y coordinate that falls on the same line as follows. (Note that the slope-intercept form breaks down for vertical lines, where fixedPoint.x == touchedPoint.x; the vector approach above handles that case.)
CGPoint pointC;
pointC.x = 400; // or set this to whatever you want
pointC.y = m * pointC.x + b;