I can't find an example of this anywhere on the internet.
I would like to learn how to use the exponential distribution to calculate a probability.
This is my exponential rate parameter (lambda): 0.0035
What is the probability that my object becomes defective before 100 hours of work? P(X < 100)
How could I write this with numpy or scipy? Thanks!
Edit : this is the math :
P(X < 100) = 1 - e ** (-0.0035 * 100) ≈ 0.295 ≈ 30%
Edit 2 :
I may have found something useful here:
http://web.stanford.edu/class/archive/cs/cs109/cs109.1192/handouts/pythonForProbability.html
Edit 3 :
This is my attempt with scipy :
from scipy import stats
B = stats.expon(0.0035) # Intended: an exponential random variable with lambda = 0.0035
print(B.pdf(1)) # f(1), the probability density at 1
print(B.cdf(100)) # F(100), which is also P(B < 100)
print(B.rvs()) # Get a random sample from B
But B.cdf is wrong: it prints 1, while it should print 0.30. Please help!
B.pdf prints 0.369: what is this?
Edit 4 : I've done it with the Python math library like this:
import math
lambdaCalcul = - 0.0035 * 100
MyExponentialProbability = 1 - math.exp(lambdaCalcul)
print("My probability is", MyExponentialProbability * 100, "%")
Any other solution with numpy or scipy is appreciated, thank you.
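The expon(..) function takes loc and scale as parameters. loc shifts the distribution, and scale is the inverse of the rate, 1/lambda, which for an exponential distribution is both the mean and the standard deviation. Calling stats.expon(0.0035) therefore sets loc=0.0035 and leaves scale at its default of 1, which is why B.cdf(100) printed 1. We can construct the intended distribution with:
B = stats.expon(scale=1/0.0035)
Then the cumulative distribution function gives P(X < 100):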
>>> print(B.cdf(100))
0.2953119102812866
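As a cross-check, here is a minimal sketch that computes the same probability three ways: the closed form with NumPy, the SciPy CDF, and a simulation. The sample size and the random seed are arbitrary choices for illustration, not part of the question.

```python
import numpy as np
from scipy import stats

lam = 0.0035   # rate parameter (lambda) from the question
t = 100.0      # hours

# Closed form: P(X < t) = 1 - exp(-lambda * t); expm1(x) computes exp(x) - 1
p_numpy = -np.expm1(-lam * t)

# SciPy parameterises the exponential by scale = 1 / lambda
p_scipy = stats.expon(scale=1 / lam).cdf(t)

# Monte Carlo estimate (assumed sample size and seed, for illustration only)
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1 / lam, size=200_000)
p_sim = np.mean(samples < t)

print(p_numpy, p_scipy, p_sim)
```

The first two values should agree to floating-point precision, and the simulated value should land close to them.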
I'm struggling a bit finding a fast algorithm that's suitable.
I just want to minimize:
norm2(x-s)
st
G.x <= h
x >= 0
sum(x) = R
G is sparse and contains only 1s (and zeros obviously).
In the case of iterative algorithms, it would be nice to get the interim solutions to show to the user.
The context is that s is a vector of current results, and the user is saying "well, the sum of these few entries (indicated by a few 1.0's in a row of G) should be less than this value (a row in h)". So we have to remove quantities from the entries the user specified in a least-squares optimal way, but since we have a global constraint on the total (R), the values removed need to be reallocated in a least-squares optimal way amongst the other entries. The entries can't go negative.
All the algorithms I'm looking at are much more general, and as a result are much more complex. Also, they seem quite slow. I don't see this as a complex problem, although mixes of equality and inequality constraints always seem to make things more complex.
This has to be called from Python, so I'm looking at Python libraries like qpsolvers and scipy.optimize. But I suppose Java or C++ libraries could be used and called from Python, which might be good since multithreading is better in Java and C++.
Any thoughts on what library/package/approach to use to best solve this problem?
The size of the problem is about 150,000 rows in s, and a few dozen rows in G.
Thanks!
Your problem is a linear least squares:
minimize_x norm2(x-s)
such that G x <= h
x >= 0
1^T x = R
Thus it fits the bill of the solve_ls function in qpsolvers.
Here is an example of how I imagine your problem matrices would look, given what you specified. Since G is sparse, we should use SciPy CSC matrices for the matrices and regular NumPy arrays for the vectors:
import numpy as np
import scipy.sparse as spa
n = 150_000
# minimize 1/2 || x - s ||^2
R = spa.eye(n, format="csc")
s = np.array(range(n), dtype=float)
# such that G * x <= h
G = spa.diags(
    diagonals=[
        [1.0 if i % 2 == 0 else 0.0 for i in range(n)],
        [1.0 if i % 3 == 0 else 0.0 for i in range(n - 1)],
        [1.0 if i % 5 == 0 else 0.0 for i in range(n - 1)],
    ],
    offsets=[0, 1, -1],
    format="csc",  # the default DIA format does not support the row indexing used below
)
a_dozen_rows = np.linspace(0, n - 1, 12, dtype=int)
G = G[a_dozen_rows]
h = np.ones(12)
# such that sum(x) == 42
A = spa.csc_matrix(np.ones((1, n)))
b = np.array([42.0]).reshape((1,))
# such that x >= 0
lb = np.zeros(n)
Next, we can solve this problem with:
from qpsolvers import solve_ls
x = solve_ls(R, s, G, h, A, b, lb, solver="osqp", verbose=True)
Here I picked OSQP, but there are other open-source solvers you can install, such as CVXOPT, ProxQP or SCS. You can install a set of open-source solvers with: pip install qpsolvers[open_source_solvers]. After some solvers are installed, you can list those that support sparse matrices with:
import qpsolvers
print(qpsolvers.sparse_solvers)
Finally, here is some code to check that the solution returned by the solver satisfies our constraints:
tol = 1e-6 # tolerance for checks
print(f"- Objective: {0.5 * (x - s).dot(x - s):.1f}")
print(f"- G * x <= h: {(G.dot(x) <= h + tol).all()}")
print(f"- x >= 0: {(x + tol >= 0.0).all()}")
print(f"- sum(x) = {x.sum():.1f}")
I just tried it with OSQP (adding the eps_rel=1e-5 keyword argument when calling solve_ls, otherwise the returned solution would be less accurate than the tol = 1e-6 tolerance) and it found a solution in 737 milliseconds on my (rather old) CPU with:
- Objective: 562494373088866.8
- G * x <= h: True
- x >= 0: True
- sum(x) = 42.0
Hoping this helps. Happy solving!
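To sanity-check the formulation on something small enough to solve by hand, here is a sketch of a tiny, made-up instance solved with scipy.optimize.minimize (SLSQP). This is not the qpsolvers approach above and would not scale to 150,000 variables; the four-entry data is hypothetical. The callback collects interim iterates, which is one way to show intermediate solutions to the user.

```python
import numpy as np
from scipy.optimize import minimize

# Tiny hypothetical instance: four entries, one user constraint
s = np.array([3.0, 2.0, 2.0, 3.0])      # current results
G = np.array([[1.0, 1.0, 0.0, 0.0]])    # the user selects the first two entries
h = np.array([3.0])                     # their sum must not exceed 3
R = s.sum()                             # global total to preserve (10)

history = []  # interim iterates, e.g. to display progress to the user

result = minimize(
    fun=lambda x: 0.5 * np.sum((x - s) ** 2),
    x0=s.copy(),
    jac=lambda x: x - s,
    method="SLSQP",
    bounds=[(0.0, None)] * s.size,                        # x >= 0
    constraints=[
        {"type": "ineq", "fun": lambda x: h - G @ x},     # G x <= h
        {"type": "eq", "fun": lambda x: np.sum(x) - R},   # sum(x) = R
    ],
    callback=lambda xk: history.append(np.copy(xk)),
)
x_small = result.x
print(x_small)
```

By hand, the two selected entries each give up 1 and the other two each gain 1, so the result should be close to [2, 1, 3, 4].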
I have a dataset which looks roughly as follows (and is sinusoidal in nature):
TW-240-run1.txt
Point Number Temperature
0 51.504781
1 51.487722
2 51.487722
3 51.828893
4 51.828893
5 51.436547
6 51.368312
7 51.726542
8 51.368312
9 51.317137
10 51.317137
11 51.283020
12 51.590073
.
.
.
9599 51.675366
I am tasked with finding the fundamental (first) Fourier coefficients, a_n and b_n, for this dataset by means of a numerical integration technique. In this case, I am simply using numpy.trapz, which implements the trapezium rule. The Fourier coefficients a_n and b_n can be calculated with the following formulae:
a_n = \frac{2}{\tau} \int_0^{\tau} T(t) \cos\left(\frac{2 \pi n t}{\tau}\right) dt
b_n = \frac{2}{\tau} \int_0^{\tau} T(t) \sin\left(\frac{2 \pi n t}{\tau}\right) dt
where tau (𝛕) is the time period of the sine function. For my case, 𝛕 = 240 seconds (referring to the point number 240 on the data sheet), and thus the bounds of integration are 0 to 240. T(t) from the above formulae is the data set and n = 1.
My current code for trying to calculate the fourier coefficients is as follows:
# Packages
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp

# Input data from the datasheet; the loadtxt below takes in the data from t = 0 s to t = 240 s
x1, y1 = np.loadtxt(r'C:\Users\Sidharth\Documents\y2python\y2python\thermal_4min_a.txt', unpack=True, skiprows=3)
tau_4min = 240.0

def cosine(period, t, n):  # defines the cos term for the a_n formula
    return np.cos((2*np.pi*n*t)/(period))

def sine(period, t, n):  # defines the sin term for the b_n formula
    return np.sin((2*np.pi*n*t)/(period))

# The two terms used below; definitions assumed from the helpers above, with n = 1
cos_term_4min = cosine(tau_4min, x1, 1)
sin_term_4min = sine(tau_4min, x1, 1)

a_1_4min = (2/tau_4min)*np.trapz((y1*cos_term_4min), x1)  # a_n formula (trapezium rule for T(t)*cos)
print('a_1 is', a_1_4min)
b_1_4min = (2/tau_4min)*np.trapz((y1*sin_term_4min), x1)  # b_n formula (trapezium rule for T(t)*sin)
print('b_1 is', b_1_4min)
Essentially what this is doing is, it takes in the data, but only up to the row index 241 (point number 240), and then multiplies it by the sine/cosine term from each of the above formulae. However, I realise this isn't calculating the fourier coefficients properly.
My question(s) are as follows:
Will my code work if I import the entire data set and find a way to set limits of integration for np.trapz (0 and 240 are supposed to be my limits of integration), instead of importing only the data points from 0 to 240, multiplying them by the cos or sine term, and then using np.trapz on that product, as I am currently doing?
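One way to impose limits of integration on a full data set is to select the window with a boolean mask before calling the trapezium rule. The sketch below uses synthetic data in place of the temperature file, assumes the points are 1 s apart, and uses a hypothetical helper name (fourier_coefficients); none of these come from the original post.

```python
import numpy as np

# np.trapz was renamed np.trapezoid in NumPy 2.0; use whichever is available
trapezoid = getattr(np, "trapezoid", None) or np.trapz

tau = 240.0                          # period in seconds
t = np.arange(9600, dtype=float)     # point numbers, assumed 1 s apart
# Synthetic stand-in for the temperature column (hypothetical values)
T = 50.0 + 2.0 * np.cos(2 * np.pi * t / tau) + 3.0 * np.sin(2 * np.pi * t / tau)

def fourier_coefficients(t, T, period, n, t_start=0.0):
    """Trapezium-rule estimate of a_n and b_n over [t_start, t_start + period]."""
    window = (t >= t_start) & (t <= t_start + period)   # limits of integration
    tw, Tw = t[window], T[window]
    phase = 2 * np.pi * n * tw / period
    a_n = (2 / period) * trapezoid(Tw * np.cos(phase), tw)
    b_n = (2 / period) * trapezoid(Tw * np.sin(phase), tw)
    return a_n, b_n

a_1, b_1 = fourier_coefficients(t, T, tau, n=1)
print(a_1, b_1)
```

For this synthetic signal the estimates should recover the cosine and sine amplitudes used to build it (2 and 3). Changing t_start moves the window to a later period of the same data set.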
I would like to vectorize a function with a condition, meaning to calculate its values with array arithmetic. np.vectorize handles vectorization, but it does not work with array arithmetic, so it is not a complete solution.
The solution given in the question "How to vectorize a function which contains an if statement?" did not prevent the warnings here; see the MWE below.
import numpy as np
def myfx(x):
    return np.where(x < 1.1, 1, np.arcsin(1 / x))
x = np.linspace(0, 2, 5)  # example input (assumed); includes 0 and other values below 1.1
y = myfx(x)
This runs but raises the following warnings:
<stdin>:2: RuntimeWarning: divide by zero encountered in true_divide
<stdin>:2: RuntimeWarning: invalid value encountered in arcsin
What is the problem, or is there a better way to do this?
I think this could be done by
Getting the indices ks of x for which x[k] > 1.1 for each k in ks.
Applying np.arcsin(1 / x[ks]) to the slice x[ks], and using 1 for the rest of the elements.
Recombining the arrays.
I am not sure about the efficiency, though.
The statement np.where(x < 1.1, 1, np.arcsin(1 / x)) is equivalent to
mask = x < 1.1
a = 1
b = np.arcsin(1 / x)
np.where(mask, a, b)
Notice that you're calling np.arcsin on all the elements of x, regardless of whether 1 / x <= 1 or not. Your basic plan is correct. You can do the operations in-place on an output array using the where keyword of np.arcsin and np.reciprocal, without having to recombine anything:
def myfx(x):
    mask = (x >= 1.1)
    out = np.ones(x.shape)
    np.reciprocal(x, where=mask, out=out) # >= 1.1 implies != 0
    return np.arcsin(out, where=mask, out=out)
Using np.ones ensures that the unmasked elements of out are initialized correctly. An equivalent method would be
out = np.empty(x.shape)
out[~mask] = 1
You can always find an arithmetic expression that prevents the "divide by zero".
Example:
def myfx(x):
    return np.where(x < 1.1, 1, np.arcsin(1 / np.maximum(x, 1.1)))
The values of the third argument where x < 1.1 are never selected, so it is not an issue that np.arcsin(1/1.1) is computed there.
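A quick way to confirm that both approaches avoid the warnings and agree with each other is to turn warnings into errors while calling them. This is only a sketch: the function names myfx_masked and myfx_clipped and the test input are made up for the comparison.

```python
import warnings
import numpy as np

def myfx_masked(x):
    """Compute arcsin(1/x) only where x >= 1.1; other entries stay at 1."""
    mask = x >= 1.1
    out = np.ones(x.shape)
    np.reciprocal(x, where=mask, out=out)
    return np.arcsin(out, where=mask, out=out)

def myfx_clipped(x):
    """Clamp x to at least 1.1 so that 1/x always lies in arcsin's domain."""
    return np.where(x < 1.1, 1.0, np.arcsin(1.0 / np.maximum(x, 1.1)))

x = np.array([0.0, 0.5, 1.0, 1.1, 2.0, 10.0])  # hypothetical test input

with warnings.catch_warnings():
    warnings.simplefilter("error")   # any RuntimeWarning would raise here
    y_masked = myfx_masked(x)
    y_clipped = myfx_clipped(x)

print(y_masked)
print(np.allclose(y_masked, y_clipped))
```

Note that the masked version needs a floating-point input array, since np.reciprocal on integers performs integer division.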
I am new to scipy minimize. I want to minimize a function. There are 2 vectors in play :
x : 4 element vector of spending
y : 4 element vector of cost per customer
Each element in y depends on the corresponding spend x and is defined something like this: 50 from 0 to 100000, and 0.0005 * x from 100000 to infinity.
The objective function is to minimize the spend :
def objective(x):
    x1=x[0]
    x2=x[1]
    x3=x[2]
    x4=x[3]
    return x1+x2+x3+x4
As the constraint, I have the number of users I have to sign up, like this:
def constraint1(x,y):
    x[0]/y[0]+x[1]/y[1]+x[2]/y[2]+x[3]/y[3]>5035
The bounds and definitions look like this:
from scipy.optimize import minimize
b=(0,1000000)
bnds=(b,b,b,b)
con1={'type': 'ineq', 'fun': constraint1}
x0=[20000,20000,20000,20000]
sol= minimize(objective,x0, method= "SLSQP",bounds=bnds,constraints=con1)
I simply do not know how to define the y vector properly. Any feedback or help would be very much appreciated.
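One possible way to set this up, sketched under assumptions: since y depends on x, it can be computed inside the constraint instead of being passed in as a separate vector. The sketch assumes the same piecewise cost applies to all four channels and that it can be written as max(50, 0.0005 * x); the names cost_per_customer and users_constraint are made up. Note also that SLSQP expects an 'ineq' constraint function to return a number that is non-negative when the constraint holds, not a boolean.

```python
import numpy as np
from scipy.optimize import minimize

TARGET_USERS = 5035

def cost_per_customer(x):
    """Assumed piecewise cost: 50 up to a spend of 100000, then 0.0005 * x."""
    return np.maximum(50.0, 0.0005 * x)

def objective(x):
    return np.sum(x)  # total spend across the four channels

def users_constraint(x):
    # SLSQP treats an 'ineq' constraint as fun(x) >= 0
    return np.sum(x / cost_per_customer(x)) - TARGET_USERS

bounds = [(0, 1_000_000)] * 4
x0 = np.full(4, 20_000.0)

sol = minimize(objective, x0, method="SLSQP", bounds=bounds,
               constraints={"type": "ineq", "fun": users_constraint})

print(sol.x, sol.fun)
print(np.sum(sol.x / cost_per_customer(sol.x)))
```

Under these assumptions each channel signs up at most 2000 users (reached at a spend of 100000), so the cheapest way to reach 5035 users costs 50 * 5035 = 251750 in total, and the solver's objective value should be close to that. The kink at 100000 makes the constraint non-smooth, which a gradient-based method such as SLSQP may handle poorly if the optimum sits near it.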
I just started learning optimization and I have some issues finding the optimal value for the problem below.
Note: This is just a random problem that came to my mind and has no real application.
Problem: maximize \sum_{i=0}^{y} \frac{x^i}{i!}
where x can be any value from the list ([2,4,6]) and y is an integer between 1 and 3.
My attempt:
from gekko import GEKKO
import numpy as np
import math
def prob(x,y,sel):
    z = np.sum(np.array(x)*np.array(sel))
    cst = 0
    i=0
    while i <= y.VALUE:
        fact = 1
        for num in range(2, i + 1): # find the factorial value
            fact *= num
        cst += (z**i)/fact
        i+=1
    return cst
m = GEKKO(remote=False)
sel = [2,4,6] # list of possible x values
x = m.Array(m.Var, 3, **{'value':1,'lb':0,'ub':1, 'integer': True})
y = m.Var(value=1,lb=1,ub=3,integer=True)
# switch to APOPT
m.options.SOLVER = 1
m.Equation(m.sum(x) == 1) # restrict choice to one selection
m.Maximize(prob(x,y,sel))
m.solve(disp=True)
print('Results:')
print(f'x: {x}')
print(f'y : {y.value}')
print('Objective value: ' + str(m.options.objfcnval))
Results:
----------------------------------------------------------------
APMonitor, Version 0.9.2
APMonitor Optimization Suite
----------------------------------------------------------------
--------- APM Model Size ------------
Each time step contains
Objects : 0
Constants : 0
Variables : 4
Intermediates: 0
Connections : 0
Equations : 2
Residuals : 2
Number of state variables: 4
Number of total equations: - 1
Number of slack variables: - 0
---------------------------------------
Degrees of freedom : 3
----------------------------------------------
Steady State Optimization with APOPT Solver
----------------------------------------------
Iter: 1 I: 0 Tm: -0.00 NLPi: 2 Dpth: 0 Lvs: 0 Obj: -7.00E+00 Gap: 0.00E+00
Successful solution
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 0.024000000000000004 sec
Objective : -7.
Successful solution
---------------------------------------------------
Results:
x: [[0.0] [0.0] [1.0]]
y : [1.0]
Objective value: -7.0
x should be [0,0,1] (i.e. 6) and y should be 3 to get the maximum value (61). The x value I get is correct, but for some reason the y value I get is wrong. What is causing this issue? Is there something wrong with my formulation? It would also be very helpful if you could point me towards more information about the notation (Tm, NLPi, etc.) in the APOPT solver output.
Here is a solution in gekko:
x=6.0
y=3.0
You'll need to use the gekko functions to build the functions and pose the problem in a way so that the equations don't change as the variable values change.
from gekko import GEKKO
import numpy as np
from scipy.special import factorial
m = GEKKO(remote=False)
x = m.sos1([2,4,6])
yb = m.Array(m.Var,3,lb=0,ub=1,integer=True)
m.Equation(m.sum(yb)==1)
y = m.sum([yb[i]*(i+1) for i in range(3)])
yf = factorial(np.linspace(0,3,4))
obj = x**0/yf[0]
for j in range(1,4):
    obj += x**j/yf[j]
    m.Maximize(yb[j-1]*obj)
m.solve()
print('x='+str(x.value[0]))
print('y='+str(y.value[0]))
print('Objective='+str(-m.options.objfcnval))
For your problem, I used a Special Ordered Set (type 1) to get the options of 2, 4, or 6. To select y as 1, 2, or 3 I calculated all possible values and then used a binary selector yb to choose one. There is a constraint that only one of them can be used with m.sum(yb)==1. There are gekko examples, documentation, and a short course available if you need additional resources.
Here is the solver output:
----------------------------------------------------------------
APMonitor, Version 0.9.2
APMonitor Optimization Suite
----------------------------------------------------------------
--------- APM Model Size ------------
Each time step contains
Objects : 1
Constants : 0
Variables : 11
Intermediates: 1
Connections : 4
Equations : 10
Residuals : 9
Number of state variables: 11
Number of total equations: - 7
Number of slack variables: - 0
---------------------------------------
Degrees of freedom : 4
----------------------------------------------
Steady State Optimization with APOPT Solver
----------------------------------------------
Iter: 1 I: 0 Tm: 0.00 NLPi: 6 Dpth: 0 Lvs: 0 Obj: -6.10E+01 Gap: 0.00E+00
Successful solution
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 0.047799999999999995 sec
Objective : -61.
Successful solution
---------------------------------------------------
x=6.0
y=3.0
Objective=61.0
Here is more information on the APOPT solver options. The iteration summary describes the branch-and-bound progress: Iter=iteration number, Tm=time to solve the NLP, NLPi=NLP iterations, Dpth=depth in the branching tree, Lvs=number of candidate leaves, Obj=NLP solution objective, and Gap=gap between the integer solution and the best non-integer solution.
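Because this toy problem has only nine (x, y) combinations, the solver result can be checked independently by enumerating them. The sketch below uses plain Python, not gekko, and assumes the objective is the partial sum computed by prob() in the question.

```python
from itertools import product
from math import factorial

x_choices = [2, 4, 6]   # allowed values of x
y_choices = [1, 2, 3]   # allowed integer values of y

def objective(x, y):
    """Partial sum of x**i / i! for i = 0..y, as in the question's prob()."""
    return sum(x**i / factorial(i) for i in range(y + 1))

# Evaluate every combination and keep the largest objective value
best_value, best_x, best_y = max(
    (objective(x, y), x, y) for x, y in product(x_choices, y_choices)
)
print(best_x, best_y, best_value)
```

For x = 6 and y = 3 the sum is 1 + 6 + 18 + 36 = 61, which matches the gekko result above.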
Equation to be minimized
Hey, how do I solve this type of problem?
Problem:
Minimize
summation(xij*yij)
i = from 0 to 4000
j = from 0 to 100
y is the given cost matrix
from gekko import GEKKO
m = GEKKO(remote=False)
dem_var = m.Array(m.Var,(4096,100),lb=0)
for i,j in s_d:
    m.Minimize(sum([dem_var[i][j]*coast_new[i][j]]))
Here y = coast_new and x = dem_var.