Strange roots using `numpy.roots` - numpy

Is there something wrong with the evaluation of the polynomial (1-alpha*z)**9 using numpy? For
alpha=3/sqrt(2) my list of coefficients is given in the array
psi_t0 = [1.0, -19.0919, 162.0, -801.859, 2551.5, -5412.55, 7654.5, -6958.99, 3690.56, -869.874]
According to the numpy documentation, I have to reverse this array in order to compute the zeros, i.e.
psi_t0 = psi_t0[::-1]
Thus giving
a = np.roots(psi_t0)
[0.62765842+0.06979364j 0.62765842-0.06979364j 0.52672941+0.14448097j 0.52672941-0.14448097j 0.42775926+0.13031547j 0.42775926-0.13031547j 0.36690056+0.07504044j 0.36690056-0.07504044j 0.34454214+0.j]
which is completely wrong, since the roots must all be equal to sqrt(2)/3.

As you take the 9th power, you create a very "wide" zero: if you step eps away from the true zero and evaluate, you get something of O(eps^9). In view of that, numerical inaccuracies are all but expected.
>>> np.set_printoptions(4)
>>> print(C)
[-8.6987e+02 3.6906e+03 -6.9590e+03 7.6545e+03 -5.4125e+03 2.5515e+03
-8.0186e+02 1.6200e+02 -1.9092e+01 1.0000e+00]
>>> np.roots(C)
array([0.4881+0.0062j, 0.4881-0.0062j, 0.4801+0.0154j, 0.4801-0.0154j,
0.4681+0.0172j, 0.4681-0.0172j, 0.458 +0.011j , 0.458 -0.011j ,
0.4541+0.j ])
>>> np.polyval(C,_)
array([1.4622e-13+6.6475e-15j, 1.4622e-13-6.6475e-15j,
1.2612e-13+1.5363e-14j, 1.2612e-13-1.5363e-14j,
1.0270e-13+1.3600e-14j, 1.0270e-13-1.3600e-14j,
1.1346e-13+9.7179e-15j, 1.1346e-13-9.7179e-15j,
1.0936e-13+0.0000e+00j])
As you can see the roots numpy returns are "good" in that the polynomial evaluates to something pretty close to zero at these points.
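To make that flattening concrete, here is a small sketch (the printed magnitudes in the comment are approximate):
import numpy as np

# Stepping eps away from the ninefold root z0 = sqrt(2)/3 only changes the
# polynomial (1 - alpha*z)**9 by about (alpha*eps)**9, so points quite far
# from z0 still evaluate to essentially zero and look like roots.
alpha = 3 / np.sqrt(2)
z0 = np.sqrt(2) / 3
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, abs((1 - alpha * (z0 + eps))**9))
# roughly 9e-07, 9e-16, 9e-25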

Related

Weird numpy matrix values

When I calculate the determinant of a matrix using np.linalg.det(mat1) or compute its inverse, I get odd output values. For example, it gives 1.11022302e-16 instead of 0.
I tried to round the number for the determinant, but I couldn't do the same for the matrix elements.
Floating-point computation is not exact, so multiplications and divisions can produce results that are very close to zero but not exactly equal to it.
You can define a delta that determines whether the result is close enough, and then compute the absolute distance between the result and the expected value.
Maybe like this:
import math
import numpy as np

# wrapped in a helper function so the return statements are valid
def rounded_det(mat):
    res = np.linalg.det(mat)
    delta = 0.0001
    # snap to the nearest integer when the result is within delta of it
    if abs(math.floor(res) - res) < delta:
        return math.floor(res)
    if abs(math.ceil(res) - res) < delta:
        return math.ceil(res)
    return res
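If you also want to tidy up the elements of an inverse (not just the determinant), a hedged sketch using np.round and np.isclose, with a made-up example matrix, could look like this:
import numpy as np

mat1 = np.array([[2.0, 0.0], [0.0, 0.5]])    # hypothetical example matrix
inv = np.linalg.inv(mat1)

print(np.round(inv, 10))                     # snap tiny float noise in the elements
print(np.isclose(np.linalg.det(mat1), 1.0))  # True: equal to 1 within tolerance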

Numpy returning False even though both arrays are the same?

From my understanding of numpy, the np.equal([x, prod]) command compares the arrays element by element and returns True for each pair that is equal. But every time I execute the command, it returns False for the first comparison. On the other hand, if I copy-paste the two arrays into the command, it returns True for both, as you can see in the screenshot. So why is there a difference between the two?
You cannot compare floating-point numbers for exact equality, as they are only approximations. When you compare hardcoded values, they will be equal because they are approximated in exactly the same way. But once you apply some mathematical operation to them, it's no longer safe to check whether two floating-point values are exactly equal.
For example, this
a = 0
for i in range(10):
    a += 1/10
print(a)
print(a == 1)
will give you 0.9999999999999999 and False, even though (1/10) * 10 = 1.
To compare floating-point values, check whether their difference is smaller than some small delta value. In other words, check whether they're just a really small value apart. For example
a = 0
for i in range(10):
    a += 1/10

delta = 0.00000001
print(a)
print(abs(a - 1) < delta)
will give you True.
For numpy, you can use numpy.isclose to get an element-wise mask, or numpy.allclose if you only want a single True or False value.
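As a rough illustration of those two helpers (the arrays here are made up, standing in for x and prod):
import numpy as np

x = np.array([0.1, 0.2, 0.3]) * 3
prod = np.array([0.3, 0.6, 0.9])

print(np.equal(x, prod))     # exact element-wise comparison: all False here
print(np.isclose(x, prod))   # element-wise with tolerance: [ True  True  True]
print(np.allclose(x, prod))  # single True/False for the whole arrays: True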

tf.round() to a specified precision

tf.round(x) rounds the values of x to integer values.
Is there any way to round to, say, 3 decimal places instead?
You can do it easily like this, as long as you don't risk reaching very large numbers:
def my_tf_round(x, decimals=0):
    multiplier = tf.constant(10**decimals, dtype=x.dtype)
    return tf.round(x * multiplier) / multiplier
Note: the value of x * multiplier should not exceed 2^32, so the method above should not be used to round very large numbers.
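A quick usage sketch of the helper above (assuming TensorFlow 2.x with eager execution; the input values are made up):
import tensorflow as tf

def my_tf_round(x, decimals=0):
    multiplier = tf.constant(10**decimals, dtype=x.dtype)
    return tf.round(x * multiplier) / multiplier

x = tf.constant([0.78969, 1.234567], dtype=tf.float32)
print(my_tf_round(x, 3).numpy())  # approximately [0.79  1.235]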
gdelab's solution is very good: it moves the decimal digits you want to keep to the left of the point (e.g. 0.78969 * 100 gives 78.969), lets tf.round turn that into 79, and then divides by 100 again to get 0.79. It is a smart one. There is another workaround I would like to share with the community.
You can just use the NumPy round method: take the NumPy array from the tensor, apply the method, then convert the result back to a tensor.
import numpy as np
import tensorflow as tf

# Creating a tensor
x = tf.random.normal((3, 3), mean=0, stddev=1)
x = tf.cast(x, tf.float64)
x
# Grab the NumPy array from the tensor
x.numpy()
# Use the NumPy round method, then convert the result back to a tensor
value = np.round(x.numpy(), 3)
result = tf.convert_to_tensor(value, dtype=tf.float64)
result

How leave's scores are calculated in this XGBoost trees?

I am looking at the image below.
Can someone explain how the leaf scores are calculated?
I thought it was -1 for a No and +1 for a Yes, but then I can't figure out how the little girl has 0.1. And that doesn't work for tree 2 either.
I agree with @user1808924, but I think it's still worth explaining how XGBoost works under the hood.
What is the meaning of the leaves' scores?
First, the scores you see in the leaves are not probabilities; they are regression values.
Gradient boosting uses only regression trees. To predict whether a person likes computer games or not, the model (XGBoost) treats it as a regression problem: the labels become 1.0 for Yes and 0.0 for No, and XGBoost fits regression trees during training. The trees then return values like +2, +0.1, -1, which we read off at the leaves.
We sum up all the "raw scores" and then convert them to a probability by applying the sigmoid function.
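As a hedged numeric illustration of that last step, here is how the two leaf values for the little boy (+2 from tree1 and +0.9 from tree2, as discussed further below) combine into a probability:
import math

raw_score = 2.0 + 0.9                         # sum of the leaf values from both trees
probability = 1 / (1 + math.exp(-raw_score))  # sigmoid of the summed raw score
print(probability)                            # about 0.948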
How to calculate the score in the leaves?
The leaf score (w) is calculated by this formula:
w = -sum(g_i) / (sum(h_i) + lambda)
where g_i and h_i are the first derivative (gradient) and the second derivative (Hessian) of the loss for sample i.
For the sake of demonstration, let's pick the leaf with the value -1 in the first tree. Suppose our objective function is mean squared error (mse) and we choose lambda = 0.
With mse, we have g = (y_pred - y_true) and h = 1 (I dropped the constant factor 2; you can keep it and the result stays the same). Another note: at the t-th iteration, y_pred is the prediction we have after the (t-1)-th iteration (the best we've got up to that time).
Some assumptions:
The girl, grandpa, and grandma do NOT like computer games (y_true = 0 for each person).
The initial prediction is 1 for all 3 people (i.e., we guess that everyone loves games). Note that I chose 1 on purpose to reproduce the result of the first tree; in practice the initial prediction can be the mean (default for mean squared error), median (default for mean absolute error), ... of all the observations' labels in the leaf.
We calculate g and h for each individual:
g_girl = y_pred - y_true = 1 - 0 = 1. Similarly, we have g_grandpa = g_grandma = 1.
h_girl = h_grandpa = h_grandma = 1
Putting the g, h values into the formula above, we have:
w = -( (g_girl + g_grandpa + g_grandma) / (h_girl + h_grandpa + h_grandma) ) = -1
Last note: in practice, the leaf score we see when plotting the tree is a bit different; it is multiplied by the learning rate, i.e., w * learning_rate.
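For completeness, a tiny numeric sketch of the calculation above (the values follow the assumptions listed, not any real training run):
# g = y_pred - y_true and h = 1 for squared error; lambda = 0 as assumed above
y_true = [0.0, 0.0, 0.0]   # girl, grandpa, grandma do not like computer games
y_pred = [1.0, 1.0, 1.0]   # initial prediction of 1 for all three

g = [p - t for p, t in zip(y_pred, y_true)]   # gradients: [1.0, 1.0, 1.0]
h = [1.0] * len(g)                            # hessians for squared error
lam = 0.0

w = -sum(g) / (sum(h) + lam)
print(w)   # -1.0, matching the -1 leaf of the first tree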
The values of leaf elements (aka "scores") - +2, +0.1, -1, +0.9 and -0.9 - were devised by the XGBoost algorithm during training. In this case, the XGBoost model was trained using a dataset where little boys (+2) appear somehow "greater" than little girls (+0.1). If you knew what the response variable was, then you could probably interpret/rationalize those contributions further. Otherwise, just accept those values as they are.
As for scoring samples: the first addend is produced by tree1 and the second addend by tree2. For little boys (age < 15, is male == Y, and use computer daily == Y), tree1 yields 2 and tree2 yields 0.9.
Read this
https://towardsdatascience.com/xgboost-mathematics-explained-58262530904a
and then this
https://medium.com/#gabrieltseng/gradient-boosting-and-xgboost-c306c1bcfaf5
and the appendix
https://gabrieltseng.github.io/appendix/2018-02-25-XGB.html

sympy subs in matrix doesn't change the values

I have a symbolic matrix that I want to differentiate. I have to substitute numeric values for some of the variables and then solve with respect to 6 unknowns. My problem is that defining the elements of matrix A via a lambda and substituting with subs doesn't change any value in the matrix. When I retrieve the type of the matrix, it is in fact shown as immutable, which seems quite odd. Here's the code:
import numpy as np
import sympy as sym
from sympy import symbols, Matrix, cos, sin, simplify

def optimalF1():
    x,y,z=symbols('x y z', Real=True)
    phi,theta,psi=symbols('phi theta psi')
    b1x,b1y=symbols('b1x b1y')
    b2x,b2y=symbols('b2x b2y')
    b3x,b3y=symbols('b3x b3y')
    b4x,b4y=symbols('b4x b4y')
    b5x,b5y=symbols('b5x b5y')
    b6x,b6y=symbols('b6x b6y')
    bMat=sym.Matrix(([b1x,b2x,b3x,b4x,b5x,b6x],
        [b1y,b2y,b3y,b4y,b5y,b6y],[0,0,0,0,0,0]))
    mov=np.array([[x],[y],[z]])
    Pi=np.repeat(mov,6,axis=1)
    sym.pprint(Pi)
    print 'shape of thing Pi', np.shape(Pi)
    p1x,p1y,p1z=symbols('p1x,p1y,p1z')
    p2x,p2y,p2z=symbols('p2x,p2y,p2z')
    p3x,p3y,p3z=symbols('p3x,p3y,p3z')
    p4x,p4y,p4z=symbols('p4x,p4y,p4z')
    p5x,p5y,p5z=symbols('p5x,p5y,p5z')
    p6x,p6y,p6z=symbols('p6x,p6y,p6z')
    # legs symbolic array
    l1,l2,l3,l4,l5,l6=symbols('l1,l2,l3,l4,l5,l6')
    piMat=Matrix(([p1x,p2x,p3x,p4x,p5x,p6x],[p1y,p2y,p3y,
        p4y,p5y,p6y],[p1z,p2z,p3z,p4z,p5z,p6z]))
    piMat=piMat.subs('p1z',0)
    piMat=piMat.subs('p2z',0)
    piMat=piMat.subs('p3z',0)
    piMat=piMat.subs('p4z',0)
    piMat=piMat.subs('p5z',0)
    piMat=piMat.subs('p6z',0)
    sym.pprint(piMat)
    legStroke=np.array([[l1],[l2],[l3],[l4],[l5],[l6]])
    '''redefine the Eul matrix
    copy values of Pi 6 times by using np.repeat
    '''
    r1=[cos(phi)*cos(theta)*cos(psi)-sin(phi)*sin(psi),
        -cos(phi)*cos(theta)*sin(psi)-sin(phi)*cos(psi),
        cos(phi)*sin(theta)]
    r2=[sin(phi)*cos(theta)*cos(psi)+cos(phi)*sin(psi),
        -sin(phi)*cos(theta)*sin(psi)+cos(phi)*cos(psi),
        sin(phi)*sin(theta)]
    r3=[-sin(theta)*cos(psi),sin(theta)*sin(psi),cos(theta)]
    EulMat=Matrix((r1,r2,r3))
    print(EulMat)
    uvw=Pi+EulMat*piMat
    print 'uvw matrix is:\n', uvw, np.shape(uvw)
    # check this out - more elegant and compact form
    A=Matrix(6,1,lambda j,i:((uvw[0,j]-
        bMat[0,j])**2+(uvw[1,j]-bMat[1,j])**2+
        (uvw[2,j]-bMat[2,j])**2)-legStroke[j]**2)
    print 'A matrix before simplification:\n', A
    B=simplify(A)
    B=B.subs({'x':1.37,'y':0,'z':0,theta:-1.37,phi:0})
    print 'A matrix form after substituting:\n', B
So comparing A and B leads to the same output. I don't understand why!
When you use subs with variables that carry assumptions, you have to pass the symbols themselves, not strings. Using a string causes a new generic symbol to be created, which does not match the symbol carrying the assumptions, so the subs fails.
>>> var('x')
x
>>> var('y',real=True)
y
>>> (x+y).subs('x',1).subs('y',2)
y + 1
Note, too, that to make real symbols you should use real=True, not Real=True (lower-case r).
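A minimal sketch of the fix, assuming you keep references to the symbols created with real=True and pass those symbols (not strings) to subs:
from sympy import symbols

x, y, z, theta, phi = symbols('x y z theta phi', real=True)
expr = (x - theta)**2 + y + z + phi

# substituting with the symbol objects carries the real=True assumption along,
# so the match succeeds and the values are actually replaced
print(expr.subs({x: 1.37, y: 0, z: 0, theta: -1.37, phi: 0}))   # 7.5076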