I have a 2d numpy array, and I would like to evaluate the following equation.
Is there a numpy like(vectorized) way to implement this evaluation rather than just iterating over the values?
Bonus question, is there a way(vectorized again) to save each result of the product inside a new 1d numpy array?
Thanks in advance
btw, The formula was found from this site: https://www.seas.upenn.edu/~ese502/lab-content/extra_materials/Polygon%20Area%20and%20Centroid.pdf
You can use numpy.roll as follows:
import numpy as np
N = 5
a = np.random.rand(N, 2)
x, y = a[:,0], a[:,1]
area = 1/2 * np.sum(x*np.roll(y, 1) - y*np.roll(x, 1))
Here an implementation in SQF language:
private _n0 = count(_polygon);
private _surface = 0;
for "_i0" from 0 to (_n0 - 1) do {
private _p1 = _polygon select _i0;
private _p2 = _polygon select ((_i0 + 1) % _n0);
private _p1x = _p1 select 0;
private _p1y = _p1 select 1;
private _p2x = _p2 select 0;
private _p2y = _p2 select 1;
_surface = _surface + ((_p1x * _p2y) - (_p2x * _p1y)) / 2;
You just need to translate it in python
einsum is handy to implement the shoelace formula
A # --- sample array representing a polygon ordered clockwise
array([[ 0.00, 0.00],
[ 0.00, 10.00],
[ 10.00, 10.00],
[ 10.00, 0.00],
[ 0.00, 0.00]])
# --- can be implemented in a list comprehension of for loop
def _bit_area_(a):
"""Mini e_area, used by `areas` and `centroids`."""
x0, y1 = (a.T)[:, 1:]
x1, y0 = (a.T)[:, :-1]
e0 = np.einsum('...i,...i->...i', x0, y0)
e1 = np.einsum('...i,...i->...i', x1, y1)
return np.sum((e0 - e1) * 0.5)
_bit_area_(A)
100.0 # --- result for a 10 x 10 unit square,
Related
I'm setting up a linear programming optimization model using CPLEX and am wondering if it's possible to accomplish a modification of the cost function dependent upon which binary decision variables are 'active' in an arbitrary solution. This is mostly a question about how to formulate the LP model (if it's even possible), but responses in the context of CPLEX are welcome or even preferred.
Say I have an LP problem in canonical form:
minimize cTx
s.t. Ax <= b
With cost function:
c = [c_1, c_2,...,c_100]
All variables are binary. I have this basic setup modeled and running effectively in CPLEX.
Now say I have a subset of variables:
efficiency_set = [x_1, x_2,...,x_5]
With the condition:
if any x_n in efficiency_set == 1
then c_n for all other x_n in the set = 0.9 * c_n
Essentially there is a dependency where if any x_n in the efficiency set is 'active', it becomes 10% less expensive for other variables in the set to appear in the solution.
I thought that CPLEX indicator constraints were what I was looking for, but after reading through documentation, I don't think I can enforce an on-the-fly change to cost function with them (I could be wrong). So I feel like it needs to be done through formulation of the LP, but I can't reason how to accomplish it. Any ideas?. Thanks.
In CPLEX you have many APIs, let me answer you with the easiest one OPL.
Your canonical form can be written
int n=3;
int m=4;
range N=1..n;
range M=1..m;
float A[N][M]=[[1,4,9,6],[8,5,0,8],[2,9,0,2]];
float B[M]=[3,1,3,0];
float C[N]=[1,1,1];
dvar boolean x[N];
minimize sum(i in N) C[i]*x[i];
subject to
{
forall(j in M) sum(i in N) A[i,j]*x[i]>=B[j];
}
and then you can you write logical constraints:
int n=3;
int m=4;
range N=1..n;
range M=1..m;
float A[N][M]=[[1,4,9,6],[8,5,0,8],[2,9,0,2]];
float B[M]=[3,1,3,0];
float C[N]=[1,1,1];
{int} efficiencySet={1,2};
dvar boolean activeEfficiencySet;
dvar boolean x[N];
minimize sum(i in N) C[i]*x[i]*(1-0.1*activeEfficiencySet*(i not in efficiencySet));
subject to
{
forall(j in M) sum(i in N) A[i,j]*x[i]>=B[j];
activeEfficiencySet==(1<=sum(i in efficiencySet) x[i]);
}
Using Alex's data, I have written the program in docplex (cplex python API)
from docplex.mp.model import Model
n = 3
m = 4
A = {}
A[0, 0] = 1
A[0, 1] = 4
A[0, 2] = 9
A[0, 3] = 6
A[1, 0] = 8
A[1, 1] = 5
A[1, 2] = 0
A[1, 3] = 8
A[2, 0] = 2
A[2, 1] = 9
A[2, 2] = 0
A[2, 3] = 2
B = {}
B[0] = 3
B[1] = 1
B[2] = 3
B[3] = 0
C = {}
C[0] = 1
C[1] = 1
C[2] = 1
efficiencySet = [0, 1]
mdl = Model(name="")
activeEfficiencySet = mdl.binary_var()
x = mdl.binary_var_dict(range(n), name="x")
# constraint 1:
for j in range(m):
mdl.add_constraint(mdl.sum(A[i, j] * x[i] for i in range(n)) >= B[j])
# constraint 2:
mdl.add(activeEfficiencySet == (mdl.sum(x) >= 1))
# objective function:
# expr = mdl.linear_expr()
lst = []
for i in range(n):
if i not in efficiencySet:
lst.append((C[i] * x[i] * (1 - 0.1 * activeEfficiencySet)))
else:
lst.append(C[i] * x[i])
mdl.minimize(mdl.sum(lst))
mdl.solve()
for i in range(n):
print(str(x[i]) + " : " + str(x[i].solution_value))
activeEfficiencySet.solution_value
this is my code and this is my data, and this is the output of the code. I've tried adding one the values on the x axes, thinking maybe values so little can be interpreted as zeros. I've no idea what true_divide could be, and I cannot explain this divide by zero error since there is not a single zero in my data, checked all of my 2500 data points. Hoping that some of you could provide some clarification. Thanks in advance.
import pandas as pd
import matplotlib.pyplot as plt
from iminuit import cost, Minuit
import numpy as np
frame = pd.read_excel('/Users/lorenzotecchia/Desktop/Analisi Laboratorio/Analisi dati/Quinta Esperienza/500Hz/F0000CH2.xlsx', 'F0000CH2')
data = pd.read_excel('/Users/lorenzotecchia/Desktop/Analisi Laboratorio/Analisi dati/Quinta Esperienza/500Hz/F0000CH1.xlsx', 'F0000CH1')
# tempi_500Hz = pd.DataFrame(frame,columns=['x'])
# Vout_500Hz = pd.DataFrame(frame,columns=['y'])
tempi_500Hz = pd.DataFrame(frame,columns=['x1'])
Vout_500Hz = pd.DataFrame(frame,columns=['y1'])
# Vin_500Hz = pd.DataFrame(data,columns=['y'])
def fit_esponenziale(x, α, β):
return α * (1 - np.exp(-x / β))
plt.xlabel('ω(Hz)')
plt.ylabel('Attenuazioni')
plt.title('Fit Parabolico')
plt.scatter(tempi_500Hz, Vout_500Hz)
least_squares = cost.LeastSquares(tempi_500Hz, Vout_500Hz, np.sqrt(Vout_500Hz), fit_esponenziale)
m = Minuit(least_squares, α=0, β=0)
m.migrad()
m.hesse()
plt.errorbar(tempi_500Hz, Vout_500Hz, fmt="o", label="data")
plt.plot(tempi_500Hz, fit_esponenziale(tempi_500Hz, *m.values), label="fit")
fit_info = [
f"$\\chi^2$ / $n_\\mathrm{{dof}}$ = {m.fval:.1f} / {len(tempi_500Hz) - m.nfit}",]
for p, v, e in zip(m.parameters, m.values, m.errors):
fit_info.append(f"{p} = ${v:.3f} \\pm {e:.3f}$")
plt.legend()
plt.show()
input
output and example of data
There is actually a way to fit this completely linear.
See e.g.here
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import cumtrapz
def fit_exp(x, a, b, c):
return a * (1 - np.exp( -b * x) ) + c
nn = 170
xl = np.linspace( 0, 0.001, nn )
yl = fit_exp( xl, 15, 5300, -8.1 ) + np.random.normal( size=nn, scale=0.05 )
"""
with y = a( 1- exp(-bx) ) + c
we have Y = int y = -1/b y + d x + h ....try it out or see below
so we get a linear equation for b (actually 1/b ) to optimize
this goes as:
"""
Yl = cumtrapz( yl, xl, initial=0 )
ST = [xl, yl, np.ones( nn ) ]
S = np.transpose( ST )
eta = np.dot( ST, Yl )
A = np.dot( ST, S )
sol = np.linalg.solve( A, eta )
bFit = -1/sol[1]
print( bFit )
"""
now we can do a normal linear fit
"""
ST = [ fit_exp(xl, 1, bFit, 0), np.ones( nn ) ]
S = np.transpose( ST )
A = np.dot( ST, S )
eta = np.dot( ST, yl )
aFit, cFit = np.linalg.solve( A, eta )
print( aFit, cFit )
print(aFit + cFit, sol[0] ) ### consistency check
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1 )
ax.plot(xl, yl, marker ='+', ls='' )
## at best a sufficient fit, at worst a good start for a non-linear fit
ax.plot(xl, fit_exp( xl, aFit, bFit, cFit ) )
plt.show()
"""
a ( 1 - exp(-b x)) + c = a + c - a exp(-b x) = d - a exp( -b x )
int y = d x + a/b exp( -b x ) + g
= d x +a/b exp( -b x ) + a/b - a/b + c/b - c/b + g
= d x - 1/b ( a - a exp( -b x ) + c ) + c/b + a/b + g
= d x + k y + h
with k = -1/b and h = g + c/b + a/b.
d and h are fitted but not used, but as a+c = d we can check
for consistency
"""
Here is a working Minuit vs curve_fit example. I scaled the function such that the decay in the exponential is in the order of 1 (generally a good idea for non linear fits ). Eventually, both methods give very similar results.
Note:I leave it open whether the error makes sense like this or not. The starting values equal to zero was definitively a bad idea.
import pandas as pd
import matplotlib.pyplot as plt
from iminuit import cost, Minuit
from scipy.optimize import curve_fit
import numpy as np
def fit_exp(x, a, b, c):
return a * (1 - np.exp(- 1000 * b * x) ) + c
nn = 170
xl = np.linspace( 0, 0.001, nn )
yl = fit_exp( xl, 15, 5.3, -8.1 ) + np.random.normal( size=nn, scale=0.05 )
#######################
### Minuit
#######################
least_squares = cost.LeastSquares(xl, yl, np.sqrt( np.abs( yl ) ), fit_exp )
print(least_squares)
m = Minuit(least_squares, a=1, b=5, c=-7)
print( "grad: ")
print( m.migrad() ) ### needs to be called to get fit values
print( m.values )### gives slightly different output
print("Hesse:")
print( m.hesse() )
#######################
### curve_fit
#######################
opt, cov = curve_fit(
fit_exp, xl, yl, sigma=np.sqrt( np.abs( yl ) ),
absolute_sigma=True
)
print( " curve_fit: ")
print( opt )
print( " covariance ")
print( cov )
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1 )
ax.plot(xl, yl, marker ='+', ls='' )
ax.plot(xl, fit_exp(xl, *m.values), ls="--")
ax.plot(xl, fit_exp(xl, *opt), ls=":")
plt.show()
I am trying to generate a sample of 100 scenarios (X, Y) where both X and Y are normally distributed X=N(50,5^2), Y=N(30,2^2) and X and Y are correlated Cov(X,Y)=0.4.
I have been able to generate 100 scenarios with the Cholesky decomposition:
# We do a Cholesky decomposition to generate correlated scenarios
nScenarios = 10
Σ = [25 0.4; 0.4 4]
μ = [50, 30]
L = cholesky(Σ)
v = [rand(Normal(0, 1), nScenarios), rand(Normal(0, 1), nScenarios)]
X = reshape(zeros(nScenarios),1,nScenarios)
Y = reshape(zeros(nScenarios),1,nScenarios)
for i = 1:nScenarios
X[1, i] = sum(L.U[1, j] *v[j][i] for j = 1:nBreadTypes) + μ[1]
Y[1, i] = sum(L.U[2, j] *v[j][i] for j = 1:nBreadTypes) + μ[2]
end
However I need the probability of each scenario, i.e P(X=k and Y=p). My question would be, how can we get a sample of a certain distribution with the probability of each scenario?
Following the BatWannaBe explanation, normally I would do it like this:
julia> using Distributions
julia> d = MvNormal([50.0, 30.0], [25.0 0.4; 0.4 4.0])
FullNormal(
dim: 2
μ: [50.0, 30.0]
Σ: [25.0 0.4; 0.4 4.0]
)
julia> point = rand(d)
2-element Vector{Float64}:
52.807189619051485
32.693811008760676
julia> pdf(d, point)
0.0056519503173830515
I'm trying to implement RGB to HSV conversion from opencv in pure numpy using formula from here:
def rgb2hsv_opencv(img_rgb):
img_hsv = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2HSV)
return img_hsv
def rgb2hsv_np(img_rgb):
assert img_rgb.dtype == np.float32
height, width, c = img_rgb.shape
r, g, b = img_rgb[:,:,0], img_rgb[:,:,1], img_rgb[:,:,2]
t = np.min(img_rgb, axis=-1)
v = np.max(img_rgb, axis=-1)
s = (v - t) / (v + 1e-6)
s[v==0] = 0
# v==r
hr = 60 * (g - b) / (v - t + 1e-6)
# v==g
hg = 120 + 60 * (b - r) / (v - t + 1e-6)
# v==b
hb = 240 + 60 * (r - g) / (v - t + 1e-6)
h = np.zeros((height, width), np.float32)
h = h.flatten()
hr = hr.flatten()
hg = hg.flatten()
hb = hb.flatten()
h[(v==r).flatten()] = hr[(v==r).flatten()]
h[(v==g).flatten()] = hg[(v==g).flatten()]
h[(v==b).flatten()] = hb[(v==b).flatten()]
h[h<0] += 360
h = h.reshape((height, width))
img_hsv = np.stack([h, s, v], axis=-1)
return img_hsv
img_bgr = cv2.imread('00000.png')
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
img_rgb = img_rgb / 255.0
img_rgb = img_rgb.astype(np.float32)
img_hsv1 = rgb2hsv_np(img_rgb)
img_hsv2 = rgb2hsv_opencv(img_rgb)
print('max diff:', np.max(np.fabs(img_hsv1 - img_hsv2)))
print('min diff:', np.min(np.fabs(img_hsv1 - img_hsv2)))
print('mean diff:', np.mean(np.fabs(img_hsv1 - img_hsv2)))
But I get big diff:
max diff: 240.0
min diff: 0.0
mean diff: 0.18085355
Do I missing something?
Also maybe it's possible to write numpy code more efficient, for example without flatten?
Also I have hard time finding original C++ code for cvtColor function, as I understand it should be actually function cvCvtColor from C code, but I can't find actual source code with formula.
From the fact that the max difference is exactly 240, I'm pretty sure that what's happening is in the case when both or either of v==r, v==g are simultaneously true alongside v==b, which gets executed last.
If you change the order from:
h[(v==r).flatten()] = hr[(v==r).flatten()]
h[(v==g).flatten()] = hg[(v==g).flatten()]
h[(v==b).flatten()] = hb[(v==b).flatten()]
To:
h[(v==r).flatten()] = hr[(v==r).flatten()]
h[(v==b).flatten()] = hb[(v==b).flatten()]
h[(v==g).flatten()] = hg[(v==g).flatten()]
The max difference may start showing up as 120, because of that added 120 in that equation. So ideally, you would want to execute these three lines in the order b->g->r. The difference should be negligible then (still noticing a max difference of 0.01~, chalking it up to some round off somewhere).
h[(v==b).flatten()] = hb[(v==b).flatten()]
h[(v==g).flatten()] = hg[(v==g).flatten()]
h[(v==r).flatten()] = hr[(v==r).flatten()]
I need to write a program to generate and display a piecewise quadratic Bezier curve that interpolates each set of data points (I have a txt file contains data points). The curve should have continuous tangent directions, the tangent direction at each data point being a convex combination of the two adjacent chord directions.
0.1 0,
0 0,
0 5,
0.25 5,
0.25 0,
5 0,
5 5,
10 5,
10 0,
9.5 0
The above are the data points I have, does anyone know what formula I can use to calculate control points?
You will need to go with a cubic Bezier to nicely handle multiple slope changes such as occurs in your data set. With quadratic Beziers there is only one control point between data points and so each curve segment much be all on one side of the connecting line segment.
Hard to explain, so here's a quick sketch of your data (black points) and quadratic control points (red) and the curve (blue). (Pretend the curve is smooth!)
Look into Cubic Hermite curves for a general solution.
From here: http://blog.mackerron.com/2011/01/01/javascript-cubic-splines/
To produce interpolated curves like these:
You can use this coffee-script class (which compiles to javascript)
class MonotonicCubicSpline
# by George MacKerron, mackerron.com
# adapted from:
# http://sourceforge.net/mailarchive/forum.php?thread_name=
# EC90C5C6-C982-4F49-8D46-A64F270C5247%40gmail.com&forum_name=matplotlib-users
# (easier to read at http://old.nabble.com/%22Piecewise-Cubic-Hermite-Interpolating-
# Polynomial%22-in-python-td25204843.html)
# with help from:
# F N Fritsch & R E Carlson (1980) 'Monotone Piecewise Cubic Interpolation',
# SIAM Journal of Numerical Analysis 17(2), 238 - 246.
# http://en.wikipedia.org/wiki/Monotone_cubic_interpolation
# http://en.wikipedia.org/wiki/Cubic_Hermite_spline
constructor: (x, y) ->
n = x.length
delta = []; m = []; alpha = []; beta = []; dist = []; tau = []
for i in [0...(n - 1)]
delta[i] = (y[i + 1] - y[i]) / (x[i + 1] - x[i])
m[i] = (delta[i - 1] + delta[i]) / 2 if i > 0
m[0] = delta[0]
m[n - 1] = delta[n - 2]
to_fix = []
for i in [0...(n - 1)]
to_fix.push(i) if delta[i] == 0
for i in to_fix
m[i] = m[i + 1] = 0
for i in [0...(n - 1)]
alpha[i] = m[i] / delta[i]
beta[i] = m[i + 1] / delta[i]
dist[i] = Math.pow(alpha[i], 2) + Math.pow(beta[i], 2)
tau[i] = 3 / Math.sqrt(dist[i])
to_fix = []
for i in [0...(n - 1)]
to_fix.push(i) if dist[i] > 9
for i in to_fix
m[i] = tau[i] * alpha[i] * delta[i]
m[i + 1] = tau[i] * beta[i] * delta[i]
#x = x[0...n] # copy
#y = y[0...n] # copy
#m = m
interpolate: (x) ->
for i in [(#x.length - 2)..0]
break if #x[i] <= x
h = #x[i + 1] - #x[i]
t = (x - #x[i]) / h
t2 = Math.pow(t, 2)
t3 = Math.pow(t, 3)
h00 = 2 * t3 - 3 * t2 + 1
h10 = t3 - 2 * t2 + t
h01 = -2 * t3 + 3 * t2
h11 = t3 - t2
y = h00 * #y[i] +
h10 * h * #m[i] +
h01 * #y[i + 1] +
h11 * h * #m[i + 1]
y