I have two matrices: quantities and displacements.
The problem is as follows:
[0.011 * x + 0.0295 * y + 0.080 * w + 0.182 * z] = [-4.31, 8.15, 0.83]
[0.011 * x + 0.0220 * y + 0.098 * w + 0.180 * z] = [-3.70, 6.30, 1.03]
[0.013 * x + 0.0230 * y + 0.108 * w + 0.172 * z] = [-3.89, 6.33, 0.52]
[0.013 * x + 0.0230 * y + 0.105 * w + 0.175 * z] = [-3.38, 5.55, 0.54]
In numpy:
quantities = np.matrix([[0.011, 0.0295, 0.080, 0.182], [0.011, 0.022, 0.098, 0.180], [0.013, 0.023, 0.108, 0.172], [0.013, 0.023, 0.105, 0.175]])
displacements = np.matrix([[-4.31, 8.15, 0.83], [-3.7, 6.3, 1.03], [-3.89, 6.33, 0.52], [-3.38, 5.55, 0.54]])
To obtain the displacement [-4.37, 7.44, 1.01], what are the values of x, y, w, z used?
That is:
[ax + by + cw + dz] = [-4.37, 7.44, 1.01]
What are the values of a, b, c and d?
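One way to set this up (a sketch under the assumption that the new displacement is a linear combination of the four measured displacement rows; with three equations and four unknowns the system is underdetermined, so np.linalg.lstsq returns the minimum-norm solution):

import numpy as np

quantities = np.array([[0.011, 0.0295, 0.080, 0.182],
                       [0.011, 0.0220, 0.098, 0.180],
                       [0.013, 0.0230, 0.108, 0.172],
                       [0.013, 0.0230, 0.105, 0.175]])
displacements = np.array([[-4.31, 8.15, 0.83],
                          [-3.70, 6.30, 1.03],
                          [-3.89, 6.33, 0.52],
                          [-3.38, 5.55, 0.54]])
target = np.array([-4.37, 7.44, 1.01])

# Solve displacements.T @ coeffs = target in the least-squares sense;
# coeffs = [a, b, c, d] are the weights of the four measured rows.
coeffs, residuals, rank, _ = np.linalg.lstsq(displacements.T, target, rcond=None)
print(coeffs)

# Under the same linearity assumption, the corresponding quantities row
# (the hypothetical x, y, w, z) would be the same combination of the rows:
print(coeffs @ quantities)

Any vector in the null space of displacements.T can be added to coeffs and still reproduce the target, so this is one solution, not the only one.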
import numpy as np
import time
import torch
import d2lzh_pytorch as d2l  # assumed import: the d2lzh_pytorch helpers used with the PyTorch edition of Dive into Deep Learning (get_data_ch7, linreg, squared_loss, ...)

features, labels = d2l.get_data_ch7()

def init_adam_states():
    v_w, v_b = torch.zeros((features.shape[1], 1), dtype=torch.float32), torch.zeros(1, dtype=torch.float32)
    s_w, s_b = torch.zeros((features.shape[1], 1), dtype=torch.float32), torch.zeros(1, dtype=torch.float32)
    return ((v_w, s_w), (v_b, s_b))

def adam(params, states, hyperparams):
    beta1, beta2, eps = 0.9, 0.999, 1e-6
    for p, (v, s) in zip(params, states):
        v[:] = beta1 * v + (1 - beta1) * p.grad.data
        s = beta2 * s + (1 - beta2) * p.grad.data**2
        v_bias_corr = v / (1 - beta1 ** hyperparams['t'])
        s_bias_corr = s / (1 - beta2 ** hyperparams['t'])
        p.data -= hyperparams['lr'] * v_bias_corr / (torch.sqrt(s_bias_corr) + eps)
    hyperparams['t'] += 1

def train_ch7(optimizer_fn, states, hyperparams, features, labels, batch_size=10, num_epochs=2):
    # Initialize the model
    net, loss = d2l.linreg, d2l.squared_loss
    w = torch.nn.Parameter(torch.tensor(np.random.normal(0, 0.01, size=(features.shape[1], 1)), dtype=torch.float32),
                           requires_grad=True)
    b = torch.nn.Parameter(torch.zeros(1, dtype=torch.float32), requires_grad=True)

    def eval_loss():
        return loss(net(features, w, b), labels).mean().item()

    ls = [eval_loss()]
    data_iter = torch.utils.data.DataLoader(torch.utils.data.TensorDataset(features, labels), batch_size, shuffle=True)
    for _ in range(num_epochs):
        start = time.time()
        print(w)
        print(b)
        for batch_i, (X, y) in enumerate(data_iter):
            l = loss(net(X, w, b), y).mean()  # use the mean loss
            # zero the gradients
            if w.grad is not None:
                w.grad.data.zero_()
                b.grad.data.zero_()
            l.backward()
            optimizer_fn([w, b], states, hyperparams)  # update the model parameters
            if (batch_i + 1) * batch_size % 100 == 0:
                ls.append(eval_loss())  # record the current training error every 100 examples
    # print results and plot
    print('loss: %f, %f sec per epoch' % (ls[-1], time.time() - start))
    d2l.set_figsize()
    d2l.plt.plot(np.linspace(0, num_epochs, len(ls)), ls)
    d2l.plt.xlabel('epoch')
    d2l.plt.ylabel('loss')

train_ch7(adam, init_adam_states(), {'lr': 0.01, 't': 1}, features, labels)
I want to implement the Adam algorithm in the code above, and I am confused about the function named adam.
v = beta1 * v + (1 - beta1) * p.grad.data
s = beta2 * s + (1 - beta2) * p.grad.data**2
When I use this code, the loss function curve is figure 1.
figure 1
v[:] = beta1 * v + (1 - beta1) * p.grad.data
s = beta2 * s + (1 - beta2) * p.grad.data**2
or
v = beta1 * v + (1 - beta1) * p.grad.data
s[:] = beta2 * s + (1 - beta2) * p.grad.data**2
When I use this code, the loss function curve is figure 2.
figure 2
v[:] = beta1 * v + (1 - beta1) * p.grad.data
s[:] = beta2 * s + (1 - beta2) * p.grad.data**2
When I use this code, the loss function curve is figure 3.
figure 3
The loss function curve in case 3 is consistently smoother than that in case 1.
The loss function curve in case 2 sometimes fails to converge.
Why are they different?
To answer the first question,
v = beta1 * v + (1 - beta1) * p.grad.data
is an out-of-place operation. Remember that Python variables are references to objects. By assigning a new value to the variable v, the underlying object that v referred to before the assignment is not changed. Instead, the expression beta1 * v + (1 - beta1) * p.grad.data produces a new tensor, which v then refers to.
On the other hand
v[:] = beta1 * v + (1 - beta1) * p.grad.data
is an in-place operation. After this operation v still refers to the same underlying object, and the elements of that tensor are modified and replaced with the values of the new tensor beta1 * v + (1 - beta1) * p.grad.data.
Take a look at the following three lines to see why this matters:
for p, (v, s) in zip(params, states):
    v[:] = beta1 * v + (1 - beta1) * p.grad.data
    s[:] = beta2 * s + (1 - beta2) * p.grad.data**2
v and s are actually referring to tensors which are stored in states. If we do in-place operations then the values in states are changed to reflect the value assigned to v[:] and s[:].
If out-of-place operations are used then the values in states remain unchanged.
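A minimal, standalone sketch of the difference (not part of the training code above): a tensor stored in a list plays the role of states, and only the in-place write is visible through the list.

import torch

states = [torch.zeros(3)]   # stands in for the tensors kept in `states`
v = states[0]

v = v + 1.0                 # out-of-place: v is rebound to a brand-new tensor
print(states[0])            # tensor([0., 0., 0.]) -- the stored state is unchanged

v = states[0]
v[:] = v + 1.0              # in-place: writes into the tensor the list still holds
print(states[0])            # tensor([1., 1., 1.]) -- the stored state is updated

This is why case 1 (both assignments out-of-place) effectively recomputes v and s from the untouched zero states every step, while case 3 keeps the running moments across iterations.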
This is my code, this is my data, and this is the output of the code. I've tried adding one to the values on the x axis, thinking that maybe values this small could be interpreted as zeros. I have no idea what true_divide could be, and I cannot explain this divide-by-zero error, since there is not a single zero in my data; I checked all of my 2500 data points. I hope some of you can provide some clarification. Thanks in advance.
import pandas as pd
import matplotlib.pyplot as plt
from iminuit import cost, Minuit
import numpy as np

frame = pd.read_excel('/Users/lorenzotecchia/Desktop/Analisi Laboratorio/Analisi dati/Quinta Esperienza/500Hz/F0000CH2.xlsx', 'F0000CH2')
data = pd.read_excel('/Users/lorenzotecchia/Desktop/Analisi Laboratorio/Analisi dati/Quinta Esperienza/500Hz/F0000CH1.xlsx', 'F0000CH1')

# tempi_500Hz = pd.DataFrame(frame, columns=['x'])
# Vout_500Hz = pd.DataFrame(frame, columns=['y'])
tempi_500Hz = pd.DataFrame(frame, columns=['x1'])
Vout_500Hz = pd.DataFrame(frame, columns=['y1'])
# Vin_500Hz = pd.DataFrame(data, columns=['y'])

def fit_esponenziale(x, α, β):
    return α * (1 - np.exp(-x / β))

plt.xlabel('ω(Hz)')
plt.ylabel('Attenuazioni')
plt.title('Fit Parabolico')
plt.scatter(tempi_500Hz, Vout_500Hz)

least_squares = cost.LeastSquares(tempi_500Hz, Vout_500Hz, np.sqrt(Vout_500Hz), fit_esponenziale)
m = Minuit(least_squares, α=0, β=0)
m.migrad()
m.hesse()

plt.errorbar(tempi_500Hz, Vout_500Hz, fmt="o", label="data")
plt.plot(tempi_500Hz, fit_esponenziale(tempi_500Hz, *m.values), label="fit")

fit_info = [
    f"$\\chi^2$ / $n_\\mathrm{{dof}}$ = {m.fval:.1f} / {len(tempi_500Hz) - m.nfit}",
]
for p, v, e in zip(m.parameters, m.values, m.errors):
    fit_info.append(f"{p} = ${v:.3f} \\pm {e:.3f}$")

plt.legend()
plt.show()
(images: input, output and example of data)
There is actually a way to fit this completely linearly.
See e.g. here.
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import cumtrapz
def fit_exp(x, a, b, c):
    return a * (1 - np.exp( -b * x) ) + c
nn = 170
xl = np.linspace( 0, 0.001, nn )
yl = fit_exp( xl, 15, 5300, -8.1 ) + np.random.normal( size=nn, scale=0.05 )
"""
with y = a( 1- exp(-bx) ) + c
we have Y = int y = -1/b y + d x + h ....try it out or see below
so we get a linear equation for b (actually 1/b ) to optimize
this goes as:
"""
Yl = cumtrapz( yl, xl, initial=0 )
ST = [xl, yl, np.ones( nn ) ]
S = np.transpose( ST )
eta = np.dot( ST, Yl )
A = np.dot( ST, S )
sol = np.linalg.solve( A, eta )
bFit = -1/sol[1]
print( bFit )
"""
now we can do a normal linear fit
"""
ST = [ fit_exp(xl, 1, bFit, 0), np.ones( nn ) ]
S = np.transpose( ST )
A = np.dot( ST, S )
eta = np.dot( ST, yl )
aFit, cFit = np.linalg.solve( A, eta )
print( aFit, cFit )
print(aFit + cFit, sol[0] ) ### consistency check
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1 )
ax.plot(xl, yl, marker ='+', ls='' )
## at best a sufficient fit, at worst a good start for a non-linear fit
ax.plot(xl, fit_exp( xl, aFit, bFit, cFit ) )
plt.show()
"""
a ( 1 - exp(-b x)) + c = a + c - a exp(-b x) = d - a exp( -b x )
int y = d x + a/b exp( -b x ) + g
= d x +a/b exp( -b x ) + a/b - a/b + c/b - c/b + g
= d x - 1/b ( a - a exp( -b x ) + c ) + c/b + a/b + g
= d x + k y + h
with k = -1/b and h = g + c/b + a/b.
d and h are fitted but not used, but as a+c = d we can check
for consistency
"""
Here is a working Minuit vs curve_fit example. I scaled the function such that the decay in the exponential is of order 1 (generally a good idea for non-linear fits). Eventually, both methods give very similar results.
Note: I leave it open whether the error makes sense like this or not. Starting values equal to zero were definitely a bad idea.
import pandas as pd
import matplotlib.pyplot as plt
from iminuit import cost, Minuit
from scipy.optimize import curve_fit
import numpy as np
def fit_exp(x, a, b, c):
    return a * (1 - np.exp(- 1000 * b * x) ) + c
nn = 170
xl = np.linspace( 0, 0.001, nn )
yl = fit_exp( xl, 15, 5.3, -8.1 ) + np.random.normal( size=nn, scale=0.05 )
#######################
### Minuit
#######################
least_squares = cost.LeastSquares(xl, yl, np.sqrt( np.abs( yl ) ), fit_exp )
print(least_squares)
m = Minuit(least_squares, a=1, b=5, c=-7)
print( "grad: ")
print( m.migrad() ) ### needs to be called to get fit values
print( m.values )### gives slightly different output
print("Hesse:")
print( m.hesse() )
#######################
### curve_fit
#######################
opt, cov = curve_fit(
    fit_exp, xl, yl, sigma=np.sqrt( np.abs( yl ) ),
    absolute_sigma=True
)
print( " curve_fit: ")
print( opt )
print( " covariance ")
print( cov )
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1 )
ax.plot(xl, yl, marker ='+', ls='' )
ax.plot(xl, fit_exp(xl, *m.values), ls="--")
ax.plot(xl, fit_exp(xl, *opt), ls=":")
plt.show()
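To compare the two uncertainty estimates directly (my addition, reusing the objects created above): Minuit exposes its HESSE errors via m.errors, while for curve_fit the one-sigma errors are the square roots of the covariance diagonal.

# parameter errors from Minuit vs. curve_fit
print(dict(zip(m.parameters, m.errors)))
print(dict(zip(("a", "b", "c"), np.sqrt(np.diag(cov)))))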
def build_gaussian_map(s, point, sigma=25):
    x, y = point[0], point[1]
    gmap = np.zeros(s)
    for row in range(s[0]):
        for col in range(s[1]):
            gmap[row][col] = 1 / (2 * np.pi * sigma * sigma) * np.exp(-((x - row) * (x - row) + (y - col) * (y - col)) / (2 * sigma * sigma))
    return gmap
s - 2D array shape
point - point coordinates
I am computing a Gaussian distance map centered at a given point of the image. Can I do it somehow using matrix operations?
Result map example:
import numpy as np

def build_gaussian_map(s, point, sigma=25):
    x, y = point[0], point[1]
    gmap = np.zeros(s)
    for row in range(s[0]):
        for col in range(s[1]):
            gmap[row][col] = 1 / (2 * np.pi * sigma * sigma) * np.exp(-((x - row) * (x - row) + (y - col) * (y - col)) / (2 * sigma * sigma))
    return gmap

def build_gaussian_map2(shape, point, sigma=25):
    x, y = point[0], point[1]
    row, col = np.indices(shape)
    gmap = 1 / (2 * np.pi * sigma * sigma) * np.exp(-((x - row) * (x - row) + (y - col) * (y - col)) / (2 * sigma * sigma))
    return gmap

def main():
    s = (1000, 1000)
    result1 = build_gaussian_map(s, (100, 100))
    result2 = build_gaussian_map2(s, (100, 100))
    assert np.all(result1 == result2)

main()
Profiling results (per-line timings):
Line #  Hits  Time        Per Hit     % Time  Line Contents
24                                            def main():
25      1     3.0         3.0         0.0         s = (1000, 1000)
26      1     6126705.0   6126705.0   98.2        result1 = build_gaussian_map(s, (100, 100))
27      1     105593.0    105593.0    1.7         result2 = build_gaussian_map2(s, (100, 100))
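For reference, per-line timings in this format can be produced with the line_profiler package (a guess at how the numbers above were obtained; the answer does not say):

# pip install line_profiler
from line_profiler import LineProfiler

lp = LineProfiler()
lp.add_function(build_gaussian_map)    # functions from the snippet above
lp.add_function(build_gaussian_map2)
profiled_main = lp(main)               # wrap main() so its lines are timed
profiled_main()
lp.print_stats()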
def gaussian_map(shape, point, sigma=20):
    a = np.arange(shape[0])
    b = np.arange(shape[1])
    x_grid, y_grid = np.meshgrid(a, b)
    return 1 / (2 * np.pi * sigma * sigma) * np.exp(- ((x_grid - point[0]) * (x_grid - point[0]) + (y_grid - point[1]) * (y_grid - point[1])) / (2 * sigma * sigma))
I came up with this function. It seems to be efficient.
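One caveat worth noting (my observation, not part of the answer above): np.meshgrid defaults to 'xy' indexing, so for non-square shapes the result comes back transposed relative to the np.indices version. Passing indexing='ij' keeps rows and columns in the requested order:

def gaussian_map_ij(shape, point, sigma=20):
    # indexing='ij' makes x_grid vary along rows and y_grid along columns,
    # matching np.indices(shape) and returning an array of the requested shape
    x_grid, y_grid = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), indexing='ij')
    return 1 / (2 * np.pi * sigma * sigma) * np.exp(
        -((x_grid - point[0])**2 + (y_grid - point[1])**2) / (2 * sigma * sigma))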
I am fitting the following function (fit variables A, D, μ and τ; x and E are fixed):
n(t) = A / sqrt(4 π D t) * exp( -(x - μ E t)**2 / (4 D t) - t / τ )
I created some example data using the equation and added some noise. The fit looks very good and has a low chi-squared; however, the errors from the covariance matrix are odd: some are very large whereas others are much smaller. What am I doing wrong?
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# Constants
E_field = 1
x = 1
def function(t, A, D, μ, τ):
    return A/np.sqrt(4*np.pi*D*t) * np.exp(-pow(x-μ*E_field*t, 2)/(4*D*t) - t/τ)

def chi(E, O):
    return np.sum(np.ma.masked_invalid(pow(O-E, 2)/E))

def fit(t, n, m, p0):
    ddof = n.size - m
    popt, pcov = curve_fit(function, t, n, p0=p0)
    fitted_n = function(t, *popt)
    reduced_χ_squared = chi(n, fitted_n) / ddof
    σ = np.sqrt(np.diag(pcov))
    return popt, σ, reduced_χ_squared
# Choose random variables to generate data
x, t = 1, np.linspace(0.01, 5, num=100)
A, D, μ, τ = 1, 0.2, 1, 1
n = function(t, A, D, μ, τ)
n_noise = n + 0.005 * np.random.normal(size=n.size)
n_noise += abs(min(n_noise)) # Shift data to lie on y = 0
p0 = [1, 0.25, 1, 1]
vars, σ, reduced_χ_squared = fit(t, n_noise, 4, p0)
fitted_A, fitted_D, fitted_μ, fitted_τ = vars
σ_A, σ_D, σ_μ, σ_τ = σ
fitted_n = function(t, *vars)
fig, ax = plt.subplots()
ax.plot(t, n_noise)
ax.plot(t, fitted_n)
#ax.text(0.82, 0.75, "χᵣ²={:.4f}".format(reduced_χ_squared), transform = ax.transAxes)
ax.legend(["Observed n", "Expected n"])
print("Fitted parameters: A = {:.4f}, D = {:.4f}, μ = {:.4f}, τ = {:.4f}".format(*vars))
print("Fitted parameter errors: σ_A = {:.4f}, σ_D = {:.4f}, σ_μ = {:.4f}, σ_τ = {:.4f}".format(*σ))
print("Reduced χ² = {:.4f}".format(reduced_χ_squared))
Running this code gives me the following output (fit plot and printed parameter values omitted here).
As mentioned in my comment above, correlation is a big problem here. The biggest problem, though, is that you fit more parameters than required.
Let us transform:
A = exp( alpha ), i.e. alpha = log(A)
delta = 4 * D
epsilon = mu * E
We then get:
1 / sqrt( pi * delta * t ) * exp( -( x**2 + epsilon**2 * t**2 - 2 * x * epsilon * t ) / ( delta * t ) - t / tau + alpha )
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + epsilon**2 * t**2 - 2 * x * epsilon * t ) / ( delta * t ) - delta / tau * t**2 / ( delta * t ) + delta * alpha * t / ( delta * t ) )
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + epsilon**2 * t**2 - 2 * x * epsilon * t + delta / tau * t**2 - delta * alpha * t ) / ( delta * t ) )
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + ( epsilon**2 + delta / tau ) * t**2 - ( 2 * x * epsilon + delta * alpha ) * t ) / ( delta * t ) )
Now renaming:
epsilon**2 + delta / tau -> gamma**2
2 * x * epsilon + delta * alpha -> eta
we get
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + gamma**2 * t**2 - eta * t ) / ( delta * t ) )
So there are actually only 3 parameters to fit and it looks like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# Constants
E_field = 1
x = 1
def function(t, A, D, μ, τ):
    return A/np.sqrt(4*np.pi*D*t) * np.exp(-pow(x-μ*E_field*t, 2)/(4*D*t) - t/τ)

def alt_func( t, gamma, eta, delta ):
    return np.exp( -( x**2 + gamma**2 * t**2 - eta * t ) / ( delta * t ) ) / np.sqrt( np.pi * delta * t )
# Choose random variables to generate data
x, t = 1, np.linspace(0.01, 5, num=100)
A, D, μ, τ = 1, 0.2, 1, 1
n = function(t, A, D, μ, τ)
n_noise = n + 0.005 * np.random.normal(size=n.size)
n_noise += abs(min(n_noise)) # Shift data to lie on y = 0
guess=[1.34, 2, .8]
palt, covalt = curve_fit( alt_func, t, n_noise)
print( covalt )
print( palt )
yt = alt_func( t, *palt )
yg = alt_func( t, *guess )
yorg = function( t, A, D, μ, τ )
fig, ax = plt.subplots()
ax.plot(t, n_noise)
ax.plot(t, yg )
ax.plot(t, yt, ls="--")
ax.plot(t, yorg, ls=":" )
plt.show()
This has a reasonable covariance matrix. One can get the original parameters easily via error propagation.
Alternatively, it should be enough to fix A = 1 and fit only the three remaining parameters in the original function.
Concerning the transformation and back calculation, keep in mind that this goes from R³ to R⁴, so it is naturally not unique. Again, one can simply fix one value, or one might try to spread the error evenly between the parameters, or who knows....
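A sketch of that back calculation (my addition, assuming A is fixed to 1 so that alpha = log(A) = 0, and using x = E_field = 1 as in the example above):

gamma_f, eta_f, delta_f = palt   # fitted values from curve_fit on alt_func

D_rec   = delta_f / 4                                       # delta = 4 * D
mu_rec  = eta_f / (2 * x * E_field)                         # eta = 2 * x * epsilon when alpha = 0, epsilon = mu * E
tau_rec = delta_f / (gamma_f**2 - (mu_rec * E_field)**2)    # gamma**2 = epsilon**2 + delta / tau
print(D_rec, mu_rec, tau_rec)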
I was calculating Y from YCbCr for histogram equalization.
The kb value I want is b / (b + g + r), i.e. the ratio of b to the RGB sum of a pixel.
I thought this was a straightforward calculation, but errors keep occurring in this part.
It is as follows.
RuntimeWarning: overflow encountered in ubyte_scalars
kb[y][x] = b[y][x] / (b[y][x] + g[y][x] + r[y][x])
What should I do to solve this problem?
import numpy as np
import cv2

def transform(img):
    height, width, color = img.shape
    result = np.zeros((height, width), np.uint8)
    b, g, r = cv2.split(img)
    kb = [[0.0] * width] * height
    kg = [[0.0] * width] * height
    kr = [[0.0] * width] * height
    yi = [[0.0] * width] * height
    yo = [[0.0] * width] * height
    list(b), list(g), list(r), list(kb), list(kg), list(kr), list(yi), list(yo)
    for x in range(width):
        for y in range(height):
            kb[y][x] = b[y][x] / (b[y][x] + g[y][x] + r[y][x])
            kg[y][x] = g[y][x] / (b[y][x] + g[y][x] + r[y][x])
            kr[y][x] = b[y][x] / (b[y][x] + g[y][x] + r[y][x])
    # ...
    return result
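The warning itself comes from uint8 arithmetic: b[y][x] + g[y][x] + r[y][x] wraps around at 255 before the division. A minimal sketch of one way to avoid it (my suggestion: cast to float and compute the ratios in one vectorized step instead of per pixel):

import numpy as np
import cv2

def rgb_ratios(img):
    # cast to float so the per-pixel sum cannot overflow uint8
    b, g, r = cv2.split(img.astype(np.float64))
    total = b + g + r
    total[total == 0] = 1.0      # guard against completely black pixels (0 / 0)
    kb = b / total
    kg = g / total
    kr = r / total
    return kb, kg, kr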