How do I obtain Monte Carlo error in R2OpenBugs? - bayesian

Has anyone managed to obtain a Monte Carlo error for a parameter when running bayesian model un R2OpenBugs?
It is provided in a standard output of OpenBugs, but when run under R2OpenBugs, the log file doesn't have MC error.Is there a way to ask R2OpenBugs to calculate MC error? Or maybe there is a way to calculate it manually? Please, let me know if you heard of any way to do that. Thank you!
Here is the standard log output of R2OpenBugs:
$stats
mean sd val2.5pc median val97.5pc sample
beta0 1.04700 0.13250 0.8130 1.03800 1.30500 1500
beta1 -0.31440 0.18850 -0.6776 -0.31890 0.03473 1500
beta2 -0.05437 0.05369 -0.1648 -0.05408 0.04838 1500
deviance 588.70000 7.87600 575.3000 587.50000 606.90000 1500
$DIC
Dbar Dhat DIC pD
t 588.7 570.9 606.5 17.78
total 588.7 570.9 606.5 17.78

A simple way to calculate Monte Carlo standard error (MCSE) is to divide the standard deviation of the chain by the square root of the effective number of samples. The standard deviation is provided in your output, but the effective sample size should be given as n.eff (the rightmost column) when you print the model output - or at least that is the impression I get from:
https://cran.r-project.org/web/packages/R2OpenBUGS/vignettes/R2OpenBUGS.pdf
I don't use OpenBugs any more so can't easily check for you, but there should be something there that indicates the effective sample size (this is NOT the same as the number of iterations you have sampled, as it also takes into account the loss of information due to correlation within the chains).
Otherwise you can obtain it yourself by extracting the raw MCMC chains and then either computing the effective sample size using the coda package (?coda::effectiveSize) or just use LaplacesDemon::MCSE to calculate the Monte Carlo standard error directly. For more information see:
https://rdrr.io/cran/LaplacesDemon/man/MCSE.html
Note that some people (including me!) would suggest focusing on the effective sample size directly rather than looking at the MCSE, as the old "rule of thumb" that MCSE should be less than 5% of the sample standard deviation is equivalent to saying that the effective sample size should be at least 400 (1/0.05^2). But opinions do vary :)

The MCMC-error is named Time-series SE, and can be found in the statistics section of the summary of the coda object:
library(R2OpenBUGS)
library(coda)
my_result <- bugs(...., codaPg = TRUE)
my_coda <- read.bugs(my_result)
summary(my_coda$statistics)

Related

How to perform dynamic optimization for a nonlinear discrete optimization problem with nonlinear constraints, using non-linear solvers like SNOPT?

I am new to the field of optimization and I need help in the following optimization problem. I have tried to solve it using normal coding to make sure that I got he correct results. However, the results I got are different and I am not sure my way of analysis is correct or not. This is a short description of the problem:
The objective function shown in the picture is used to find the optimal temperature of the insulating system that minimizes the total cost over a given horizon.
[This image provides the mathematical description of the objective function and the constraints] (https://i.stack.imgur.com/yidrO.png)
The data of the problems are as follow:
1-
Problem data:
A=1.07×10^8
h=1
T_ref=87.5
N=20
p1=0.001;
p2=0.0037;
This is the curve I want to obtain
2- Optimization variable:
u_t
3- Model type:
The model is a nonlinear cost function with non-linear constraints and it is solved using non-linear solver SNOPT.
4-The meaning of the symbols in the objective and constrained functions
The optimization is performed over a prediction horizon of N years.
T_ref is The reference temperature.
Represent the degree of polymerization in the kth year.
X_DP Represents the temperature of the insulating system in the kth year.
h is the time step (1 year) of the discrete-time model.
R is the ratio of the load loss at the rated load to the no-load loss.
E is the activation energy.
A is the pre-exponential constant.
beta is a linear coefficient representing the cost due to the decrement of the temperature.
I have developed the source code in MATLAB, this code is used to check if my analysis is correct or not.
I have tried to initialize the Ut value in its increasing or decreasing states so that I can have the curves similar to the original one. [This is the curve I obtained] (https://i.stack.imgur.com/KVv2q.png)
I have tried to simulate the problem using conventional coding without optimization and I got the figure shown above.
close all; clear all;
h=1;
N=20;
a=250;
R=8.314;
A=1.07*10^8;
E=111000;
Tref=87.5;
p1=0.0019;
p2=0.0037;
p3=0.0037;
Utt=[80,80.7894736842105,81.5789473684211,82.3684210526316,83.1578947368421,... % The value of Utt given here represent the temperature increament over a predictive horizon.
83.9473684210526,84.7368421052632,85.5263157894737,86.3157894736842,...
87.1052631578947,87.8947368421053,88.6842105263158,89.4736842105263,...
90.2631578947369,91.0526315789474,91.8421052631579,92.6315789473684,...
93.4210526315790,94.2105263157895,95];
Utt1 = [95,94.2105263157895,93.4210526315790,92.6315789473684,91.8421052631579,... % The value of Utt1 given here represent the temperature decreament over a predictive horizon.
91.0526315789474,90.2631578947369,89.4736842105263,88.6842105263158,...
87.8947368421053,87.1052631578947,86.3157894736842,85.5263157894737,...
84.7368421052632,83.9473684210526,83.1578947368421,82.3684210526316,...
81.5789473684211,80.7894736842105,80];
Ut1=zeros(1,N);
Ut2=zeros(1,N);
Xdp =zeros(N,N);
Xdp(1,1)=1000;
Xdp1 =zeros(N,N);
Xdp1(1,1)=1000;
for L=1:N-1
for k=1:N-1
%vt(k+L)=Ut(k-L+1);
Xdq(k+1,L) =(1/Xdp(k,L))+A*exp((-1*E)/(R*(Utt(k)+273)))*24*365*h;
Xdp(k+1,L)=1/(Xdq(k+1,L));
Xdp(k,L+1)=1/(Xdq(k+1,L));
Xdq1(k+1,L) =(1/Xdp1(k,L))+A*exp((-1*E)/(R*(Utt1(k)+273)))*24*365*h;
Xdp1(k+1,L)=1/(Xdq1(k+1,L));
Xdp1(k,L+1)=1/(Xdq1(k+1,L));
end
end
% MATLAB code
for j =1:N-1
Ut1(j)= -p1*(Utt(j)-Tref);
Ut2(j)= -p2*(Utt1(j)-Tref);
end
sum00=sum(Ut1);
sum01=sum(Ut2);
X1=1./Xdp(:,1);
Xf=1./Xdp(:,20);
Total= table(X1,Xf);
Tdiff =a*(Total.Xf-Total.X1);
X22=1./Xdp1(:,1);
X2f=1./Xdp1(:,20);
Total22= table(X22,X2f);
Tdiff22 =a*(Total22.X2f-Total22.X22);
obj=(sum00+(Tdiff));
ob1 = min(obj);
obj2=sum01+Tdiff22;
ob2 = min(obj2);
plot(Utt,obj,'-o');
hold on
plot(Utt1,obj)

How can I order the basic solutions of a min cost flow problem according to their cost?

I was wondering if, given a min cost flow problem and an integer n, there is an efficient algorithm/package or mathematical method, to obtain the set of the
n-best basic solutions of the min cost flow problem (instead of just the best).
Not so easy. There were some special LP solvers that could do that (see: Ralph E. Steuer, Multiple Criteria Optimization: Theory, Computation, and Application, Wiley, 1986), but currently available LP solvers can't.
There is a way to encode a basis using binary variables:
b[i] = 1 if variable x[i] = basic
0 nonbasic
Using this, we can use "no good cuts" or "solution pool" technology to get the k best bases. See: https://yetanothermathprogrammingconsultant.blogspot.com/2016/01/finding-all-optimal-lp-solutions.html. Note that not all solution-pools can do the k-best. (Cplex can't, Gurobi can.) The "no-good" cuts work with any mip solver.
Update: a more recent reference is Craig A. Piercy, Ralph E. Steuer,
Reducing wall-clock time for the computation of all efficient extreme points in multiple objective linear programming, European Journal of Operational Research, 2019, https://doi.org/10.1016/j.ejor.2019.02.042

How do I compare effectiveness of different linear regression models

I have a dataframe which contains three more or less significant correlations between target column and other columns ( LinarRegressionModel.coef_ from sklearn shows 57, 97 and 79). And I don't know what exact model to choose: should I use only most correlated column for regression or use regression with all three predictors. Is there any way to compare models effectiveness? Sorry, I'm very new to data analysis, I couldn't google any tools for this task
Well first at all, you must know that when we are choosing the best model to apply to new data, we are going to choose the best model to fit out of sample data, which is the kind of samples that might are not present in the training process, after all, you want to predict new probabilities or cases. In your case, predict a new number.
So, how can we do this? Well, the best is to use metrics which can help us to choose which model is better for our dataset.
There are so many kinds of metrics for regression:
MAE: Mean absolute error is the mean of the absolute value of the errors. This is the easiest of the metrics to understand since it’s just the average error.
MSE: Mean squared error is the mean of the squared error. It’s more popular than a mean absolute error because the focus is geared more towards large errors.
RMSE: Root means the squared error is the square root of the mean squared error. This is one of the most popular of the evaluation metrics because root means the squared error is interpretable in the same units as the response vector or y units, making it easy to relate its information.
RAE: Relative absolute error, also known as the residual sum of a square, where y bar is a mean value of y, takes the total absolute error and normalizes it by dividing by the total absolute error of the simple predictor.
You can work with any of these, but I highly recommend to use MSE and RMSE.

Standard Errors for Differential Evolution

Is it possible to calculate standard errors for Differential Evolution?
From the Wikipedia entry:
http://en.wikipedia.org/wiki/Differential_evolution
It's not derivative based (indeed that is one of its strengths) but how then so you calculate the standard errors?
I would have thought some kind of bootstrapping strategy might have been applicable but can't seem to find any sources than apply bootstrapping to DE?
Baz
Concerning the standard errors, differential evolution is just like any other evolutionary algorithm.
Using a bootstrapping strategy seems a good idea: the usual formulas assume a normal (Gaussian) distribution for the underlying data. That's almost never true for evolutionary computation (exponential distributions being far more common, probably followed by bimodal distributions).
The simplest bootstrap method involves taking the original data set of N numbers and sampling from it to form a new sample (a resample) that is also of size N. The resample is taken from the original using sampling with replacement. This process is repeated a large number of times (typically 1000 or 10000 times) and for each of these bootstrap samples we compute its mean / median (each of these are called bootstrap estimates).
The standard deviation (SD) of the means is the bootstrapped standard error (SE) of the mean and the SD of the medians is the bootstrapped SE of the median (the 2.5th and 97.5th centiles of the means are the bootstrapped 95% confidence limits for the mean).
Warnings:
the word population is used with different meanings in different contexts (bootstrapping vs evolutionary algorithm)
in any GA or GP, the average of the population tells you almost nothing of interest. Use the mean/median of the best-of-run
the average of a set that is not normally distributed produces a value that behaves non-intuitively. Especially if the probability distribution is skewed: large values in "tail" can dominate and average tends to reflect the typical value of the "worst" data not the typical value of the data in general. In this case it's better the median
Some interesting links are:
A short guide to using statistics in Evolutionary Computation
An Introduction to Statistics for EC Experimental Analysis

Mathematical analysis of a sound sample (as an array of numbers)

I need to find the frequency of a sample, stored (in vb) as an array of byte. Sample is a sine wave, known frequency, so I can check), but the numbers are a bit odd, and my maths-foo is weak.
Full range of values 0-255. 99% of numbers are in range 235 to 245, but there are some outliers down to 0 and 1, and up to 255 in the remaining 1%.
How do I normalise this to remove outliers, (calculating the 235-245 interval as it may change with different samples), and how do I then calculate zero-crossings to get the frequency?
Apologies if this description is rubbish!
The FFT is probably the best answer, but if you really want to do it by your method, try this:
To normalize, first make a histogram to count how many occurrances of each value from 0 to 255. Then throw out X percent of the values from each end with something like:
for (i=lower=0;i< N*(X/100); lower++)
i+=count[lower];
//repeat in other direction for upper
Now normalize with
A[i] = 255*(A[i]-lower)/(upper-lower)-128
Throw away results outside the -128..127 range.
Now you can count zero crossings. To make sure you are not fooled by noise, you might want to keep track of the slope over the last several points, and only count crossings when the average slope is going the right way.
The standard method to attack this problem is to consider one block of data, hopefully at least twice the actual frequency (taking more data isn't bad, so it's good to overestimate a bit), then take the FFT and guess that the frequency corresponds to the largest number in the resulting FFT spectrum.
By the way, very similar problems have been asked here before - you could search for those answers as well.
Use the Fourier transform, it's much more noise insensitive than counting zero crossings
Edit: #WaveyDavey
I found an F# library to do an FFT: From here
As it turns out, the best free
implementation that I've found for F#
users so far is still the fantastic
FFTW library. Their site has a
precompiled Windows DLL. I've written
minimal bindings that allow
thread-safe access to FFTW from F#,
with both guru and simple interfaces.
Performance is excellent, 32-bit
Windows XP Pro is only up to 35%
slower than 64-bit Linux.
Now I'm sure you can call F# lib from VB.net, C# etc, that should be in their docs
If I understood well from your description, what you have is a signal which is a combination of a sine plus a constant plus some random glitches. Say, like
x[n] = A*sin(f*n + phi) + B + N[n]
where N[n] is the "glitch" noise you want to get rid of.
If the glitches are one-sample long, you can remove them using a median filter which has to be bigger than the glitch length. On both sides of the glitch. Glitches of length 1, mean you will have enough with a median of 3 samples of length.
y[n] = median3(x[n])
The median is computed so: Take the samples of x you want to filter (x[n-1],x[n],x[n+1]), sort them, and your output is the middle one.
Now that the noise signal is away, get rid of the constant signal. I understand the buffer is of a limited and known length, so you can just compute the mean of the whole buffer. Substract it.
Now you have your single sinus signal. You can now compute the fundamental frequency by counting zero crossings. Count the amount of samples above 0 in which the former sample was below 0. The period is the total amount of samples of your buffer divided by this, and the frequency is the oposite (1/x) of the period.
Although I would go with the majority and say that it seems like what you want is an fft solution (fft algorithm is pretty quick), if fft is not the answer for whatever reason you may want to try fitting a sine curve to the data using a fitting program and reading off the fitted frequency.
Using Fityk, you can load the data, and fit to a*sin(b*x-c) where 2*pi/b will give you the frequency after fitting.
Fityk can be used from a gui, from a command-line for scripting and has a C++ API so could be included in your programs directly.
I googled for "basic fft". Visual Basic FFT Your question screams FFT, but be careful, using FFT without understanding even a little bit about DSP can lead results that you don't understand or don't know where they come from.
get the Frequency Analyzer at http://www.relisoft.com/Freeware/index.htm and run it and look at the code.