Which optimization techniques can I use to maximize the sum of the minimum distances from each point to the other points in a unit hypercube?

Let's say I have a unit hypercube containing 9 points.
My goal is to maximize the sum, over all points, of each point's minimum distance to the other points.
In the image, Figure 1 is the original data, Figure 2 is the function evaluated on that data, and Figure 3 is the optimized configuration.
I want to know how I can get from Figure 1 to Figure 3.
So far I have tried simulated annealing, but I have not been able to make it work correctly. Any other suggestions would be helpful!
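For what it's worth, a minimal sketch of the simulated-annealing route, using scipy.optimize.dual_annealing with the objective written directly as the sum of nearest-neighbour distances (illustrative code, not the original attempt):

import numpy as np
from scipy.optimize import dual_annealing

n = 9  # number of points

def neg_sum_min_dist(v):
    # negative of: sum over points of the distance to the nearest other point
    pts = v.reshape(n, 2)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # ignore self-distances
    return -dist.min(axis=1).sum()

res = dual_annealing(neg_sum_min_dist, bounds=[(0.0, 1.0)] * (2 * n), seed=0)
print(-res.fun)   # the answer below reports 4.5 as the optimum for 9 points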

You could model this as:
max sum(i, d[i])
d[i] ≤ sqrt( (x[i]-x[j])^2 + (y[i]-y[j])^2 )   for all j ≠ i
x[i], y[i] ∈ [0,1]
This is a non-convex problem and can be solved with a global solver such as Couenne or BARON. (Note: such solvers find good solutions quickly, but proving global optimality is difficult and time-consuming.)
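For concreteness, here is a sketch of that model in Pyomo (my translation, not the original GAMS source; squaring both sides of the distance constraint avoids the sqrt, which is equivalent because d[i] ≥ 0). Couenne can then be used through SolverFactory if its executable is installed:

import pyomo.environ as pyo

n = 9
m = pyo.ConcreteModel()
m.I = pyo.RangeSet(n)
m.x = pyo.Var(m.I, bounds=(0, 1))
m.y = pyo.Var(m.I, bounds=(0, 1))
m.d = pyo.Var(m.I, bounds=(0, 2 ** 0.5))

def min_dist(m, i, j):
    if i == j:
        return pyo.Constraint.Skip
    # squared form of d[i] <= dist(i, j); valid since d[i] >= 0
    return m.d[i] ** 2 <= (m.x[i] - m.x[j]) ** 2 + (m.y[i] - m.y[j]) ** 2

m.min_dist = pyo.Constraint(m.I, m.I, rule=min_dist)
m.obj = pyo.Objective(expr=sum(m.d[i] for i in m.I), sense=pyo.maximize)

# pyo.SolverFactory("couenne").solve(m, tee=True)  # needs couenne on PATH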
This can also be attacked using a multi-start approach with a local solver (I used CONOPT in the test below). The algorithm would be:
bestobj = 0
for k = 1 to N (say N = 50)
    (x, y) = random points in [0,1] x [0,1]
    solve NLP model
    if obj > bestobj
        save solution
        bestobj = obj
end
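A rough Python translation of this loop, with scipy's SLSQP standing in for CONOPT (illustrative; the variable layout and tolerances are mine):

import numpy as np
from scipy.optimize import minimize

n = 9
rng = np.random.default_rng(0)

def neg_obj(v):
    # v packs the n (x, y) pairs followed by d[0..n-1];
    # maximize sum(d) -> minimize its negative
    return -v[2 * n:].sum()

def cons(v):
    # one inequality per ordered pair i != j: dist(i, j) - d[i] >= 0
    pts = v[:2 * n].reshape(n, 2)
    d = v[2 * n:]
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1) + 1e-12)  # eps keeps sqrt smooth
    mask = ~np.eye(n, dtype=bool)
    return (dist - d[:, None])[mask]

bounds = [(0.0, 1.0)] * (2 * n) + [(0.0, np.sqrt(2))] * n

best = None
for k in range(50):   # N = 50 random restarts
    v0 = np.concatenate([rng.random(2 * n), np.zeros(n)])
    res = minimize(neg_obj, v0, bounds=bounds, method="SLSQP",
                   constraints=[{"type": "ineq", "fun": cons}])
    if res.success and (best is None or res.fun < best.fun):
        best = res

print("best objective:", -best.fun)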
Using both approaches (global solver and multistart) I get, for 9 points:
---- VAR x  x-coordinates

         LOWER     LEVEL     UPPER    MARGINAL

i1         .       0.5000    1.0000      EPS
i2         .       1.0000    1.0000      EPS
i3         .       0.5000    1.0000      EPS
i4         .        .        1.0000      EPS
i5         .       0.5000    1.0000      EPS
i6         .        .        1.0000      EPS
i7         .        .        1.0000      EPS
i8         .       1.0000    1.0000      EPS
i9         .       1.0000    1.0000      EPS

---- VAR y  y-coordinates

         LOWER     LEVEL     UPPER    MARGINAL

i1         .        .        1.0000      EPS
i2         .        .        1.0000      EPS
i3         .       0.5000    1.0000      EPS
i4         .       1.0000    1.0000      EPS
i5         .       1.0000    1.0000      EPS
i6         .       0.5000    1.0000      EPS
i7         .        .        1.0000      EPS
i8         .       1.0000    1.0000      EPS
i9         .       0.5000    1.0000      EPS

---- VAR d  min distances from point i

         LOWER     LEVEL     UPPER    MARGINAL

i1         .       0.5000    1.4142      EPS
i2         .       0.5000    1.4142      EPS
i3         .       0.5000    1.4142      EPS
i4         .       0.5000    1.4142      EPS
i5         .       0.5000    1.4142      EPS
i6         .       0.5000    1.4142      EPS
i7         .       0.5000    1.4142      EPS
i8         .       0.5000    1.4142      EPS
i9         .       0.5000    1.4142      EPS

         LOWER     LEVEL     UPPER    MARGINAL

---- VAR z  -INF      4.5000    +INF        .

z  objective

(In GAMS listings a "." entry means zero, so the objective is z = 4.5.)


Negative binomial, Poisson-gamma mixture in WinBUGS

WinBUGS trap error
model
{
  for (i in 1:5323) {
    Y[i] ~ dpois(mu[i])                 # NB model as a Poisson-gamma mixture
    mu[i] ~ dgamma(b[i], a[i])
    a[i] <- b[i] / Emu[i]
    b[i] <- B * X[i]
    Emu[i] <- beta0 * pow(X[i], beta1)  # model equation
  }
  # Priors
  beta0 ~ dunif(0, 10)   # parameter
  beta1 ~ dunif(0, 10)   # parameter
  B ~ dunif(0, 10)       # over-dispersion parameter
}
X[] Y[]
1.5 0
2.9 0
1.49 0
0.39 0
3.89 0
2.03 0
0.91 0
0.89 0
0.97 0
2.16 0
0.04 0
1.12 1
2.26 0
3.6 1
1.94 0
0.41 1
2 0
0.9 0
0.9 0
0.9 0
0.1 0
0.88 1
0.91 0
6.84 2
3.14 3
End
This is just a sample of the data. The model comes from Ezra Hauer, The Art of Regression Modeling in Road Safety, section 8.3.2. WinBUGS stops with an **undefined real result** error.
The aim is a fully Bayesian, one-step model that does not use empirical Bayes.
The results should be similar to the MLE estimates: beta0 = 1.65, beta1 = 0.871, over-dispersion = 0.531.
X is the only covariate and Y is the observed collision count, so X cannot be zero or negative and Y cannot be negative. The model can be fitted as a Poisson-gamma mixture using maximum likelihood.
How can I make this model work, and how do I resolve this error in WinBUGS?
The data are in Excel; the model worked fine when I selected only the biggest 1000 observations.
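For reference (not a fix for the trap itself): the marginal distribution of Y in this Poisson-gamma mixture is negative binomial with shape r = B*X and mean Emu, which is how the MLE figures quoted above can be reproduced. A minimal sketch in Python/scipy, where all names are mine and X, Y stand for the full data set:

import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def nb_negloglik(theta, X, Y):
    beta0, beta1, B = theta
    Emu = beta0 * X ** beta1    # mean: E[Y] = beta0 * X^beta1
    r = B * X                   # gamma shape b[i] = B * X[i]
    p = r / (r + Emu)           # negative binomial success probability
    ll = (gammaln(Y + r) - gammaln(r) - gammaln(Y + 1)
          + r * np.log(p) + Y * np.log1p(-p))
    return -ll.sum()

# X, Y = ...  # load the full data set; the sample above is too small to fit
# res = minimize(nb_negloglik, x0=[1.0, 1.0, 0.5], args=(X, Y),
#                method="L-BFGS-B", bounds=[(1e-6, 10.0)] * 3)
# print(res.x)  # the question quotes roughly (1.65, 0.871, 0.531)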

Making rows into columns and saving them in separate files

I have 4 text files in a folder, and each text file contains many rows of data, as follows:
cat a.txt
10.0000 0.0000 10.0000 0.0000
11.0000 0.0000 11.0000 0.0000
cat b.txt
5.1065 3.8423 2.6375 3.5098
4.7873 5.9304 1.9943 4.7599
cat c.txt
3.5257 3.9505 3.8323 4.3359
3.3414 4.0014 4.0383 4.4803
cat d.txt
1.8982 2.0342 1.9963 2.1575
1.8392 2.0504 2.0623 2.2037
I want to turn the corresponding rows of the text files into columns, like this:
file001.txt
10.0000 5.1065 3.5257 1.8982
0.0000 3.8423 3.9505 2.0342
10.0000 2.6375 3.8323 1.9963
0.0000 3.5098 4.3359 2.1575
file002.txt
11.0000 4.7873 3.3414 1.8392
0.0000 5.9304 4.0014 2.0504
11.0000 1.9943 4.0383 2.0623
0.0000 4.7599 4.4803 2.2037
And finally I want to append the values 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000 to every line, so the final output should be:
file001.txt
10.0000 5.1065 3.5257 1.8982 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
0.0000 3.8423 3.9505 2.0342 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
10.0000 2.6375 3.8323 1.9963 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
0.0000 3.5098 4.3359 2.1575 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
file002.txt
11.0000 4.7873 3.3414 1.8392 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
0.0000 5.9304 4.0014 2.0504 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
11.0000 1.9943 4.0383 2.0623 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
0.0000 4.7599 4.4803 2.2037 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
Finally, I want to prepend some comments at the top of every created file.
So, for example, file001.txt should be:
#
# ascertain thin
# Metamorphs
# pch
# what is that
# 5-r
# Add the thing
# liop34
# liop36
# liop45
# liop34
# M(CM) N(M) O(S) P(cc) ab cd efgh ijkl mnopq rstuv
#
10.0000 5.1065 3.5257 1.8982 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
0.0000 3.8423 3.9505 2.0342 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
10.0000 2.6375 3.8323 1.9963 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
0.0000 3.5098 4.3359 2.1575 5.0000 6.0000 9.0000 0.0000 1.0000 1.0000
import numpy as np

files = ["a.txt", "b.txt", "c.txt", "d.txt"]

# get the number of columns per file, i.e., 4 in the sample data
n_each = np.loadtxt(files[0]).shape[1]

# concatenate transposed data
arrays = np.concatenate([np.loadtxt(file).T for file in files])

# rows are in columns now for easier reshaping; reshape and save
n_all = arrays.shape[1]
for n in range(n_all):
    np.savetxt(f"file{str(n+1).zfill(3)}.txt",
               arrays[:, n].reshape(n_each, len(files)).T,
               fmt="%7.4f")
To append a fixed array of values to the right of the new arrays, you can stack horizontally after tiling the new values n_each times:
# other things same as above
new_values = np.tile([5, 6, 9, 0, 1, 1], (n_each, 1))
for n in range(n_all):
    np.savetxt(f"file{str(n+1).zfill(3)}.txt",
               np.hstack((arrays[:, n].reshape(n_each, len(files)).T,
                          new_values)),
               fmt="%7.4f")
To add the comments, the header and comments parameters of np.savetxt are useful: we pass the string to header, and since it already contains "# ", we suppress the extra "# " prefix that np.savetxt would add by passing comments="":
comment = """\
#
# ascertain thin
# Metamorphs
# pch
# what is that
# 5-r
# Add the thing
# liop34
# liop36
# liop45
# liop34
# M(CM) N(M) O(S) P(cc) ab cd efgh ijkl mnopq rstuv
#"""
# rows are in column now for easier reshaping; reshape and save
n_all = arrays.shape[1]
new_values = np.tile([5, 6, 9, 0, 1, 1], (n_each, 1))
for n in range(n_all):
np.savetxt(f"file{str(n+1).zfill(3)}.txt",
np.hstack((arrays[:, n].reshape(n_each, len(files)).T,
new_values)),
fmt="%7.4f",
header=comment,
comments="")

How to extract every nth row from a numpy array

I have a numpy array and I want to extract every 3rd row from it.
Input:
0.00 1.0000
0.34 1.0000
0.68 1.0000
1.01 1.0000
1.35 1.0000
5.62 2.0000
I need to extract every 3rd row, so the expected output is:
0.68 1.0000
5.62 2.0000
My code:
import numpy as np
a = np.loadtxt('input.txt')
out = a[::3]
But it gives a different result. I hope the experts can guide me. Thanks.
When the start is omitted, a (positive) slice starts at the first item, so a[::3] takes rows 0, 3, 6, and so on.
You need to start the slice at the (N-1)-th item:
N = 3
out = a[N-1::N]
Output:
array([[0.68, 1. ],
[5.62, 2. ]])

TensorFlow XOR implementation, fail to achieve 100% accuracy

I am a newbie in machine learning and TensorFlow. I am trying to implement an XOR gate in TensorFlow, and I have come up with this code:
import numpy as np
import tensorflow as tf

tf.reset_default_graph()

learning_rate = 0.01
n_epochs = 1000
n_inputs = 2
n_hidden1 = 2
n_outputs = 2

arr1, target = [[0, 0], [0, 1], [1, 0], [1, 1]], [0, 1, 1, 0]
X_data = np.array(arr1).astype(np.float32)
y_data = np.array(target).astype(np.int)

X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")
y = tf.placeholder(tf.int64, shape=(None), name="y")

with tf.name_scope("dnn_tf"):
    hidden1 = tf.layers.dense(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    logits = tf.layers.dense(hidden1, n_outputs, name="outputs")

with tf.name_scope("loss"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")

with tf.name_scope("train"):
    optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=0.9)
    training_op = optimizer.minimize(loss)

with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        # evaluate before printing, so acc_train exists on epoch 0
        acc_train = accuracy.eval(feed_dict={X: X_data, y: y_data})
        if epoch % 100 == 0:
            print("Epoch: ", epoch, " Train Accuracy: ", acc_train)
        sess.run(training_op, feed_dict={X: X_data, y: y_data})
The code runs fine, but I get different outputs on each run:
Run 1
Epoch: 0 Train Accuracy: 0.75
Epoch: 100 Train Accuracy: 1.0
Epoch: 200 Train Accuracy: 1.0
Epoch: 300 Train Accuracy: 1.0
Epoch: 400 Train Accuracy: 1.0
Epoch: 500 Train Accuracy: 1.0
Epoch: 600 Train Accuracy: 1.0
Epoch: 700 Train Accuracy: 1.0
Epoch: 800 Train Accuracy: 1.0
Epoch: 900 Train Accuracy: 1.0
Run 2
Epoch: 0 Train Accuracy: 1.0
Epoch: 100 Train Accuracy: 0.75
Epoch: 200 Train Accuracy: 0.75
Epoch: 300 Train Accuracy: 0.75
Epoch: 400 Train Accuracy: 0.75
Epoch: 500 Train Accuracy: 0.75
Epoch: 600 Train Accuracy: 0.75
Epoch: 700 Train Accuracy: 0.75
Epoch: 800 Train Accuracy: 0.75
Epoch: 900 Train Accuracy: 0.75
Run 3
Epoch: 0 Train Accuracy: 1.0
Epoch: 100 Train Accuracy: 0.5
Epoch: 200 Train Accuracy: 0.5
Epoch: 300 Train Accuracy: 0.5
Epoch: 400 Train Accuracy: 0.5
Epoch: 500 Train Accuracy: 0.5
Epoch: 600 Train Accuracy: 0.5
Epoch: 700 Train Accuracy: 0.5
Epoch: 800 Train Accuracy: 0.5
Epoch: 900 Train Accuracy: 0.5
I am unable to understand what I am doing wrong here and why my solution is not converging.
In theory it's possible to solve XOR with one hidden layer of two ReLU units, as you have in your code. However, there is always a crucial difference between a network being able to represent a solution and being able to learn it. I would assume that, due to the small size of the network, you run into the "dead ReLU" problem: due to unfortunate random initialization, one (or both) of your hidden units doesn't activate for any input. Unfortunately, ReLU also has zero gradient where its activation is zero, so a unit that never activates also cannot learn anything.
Increasing the number of hidden units makes it less likely that this happens (i.e. you can have three dead units and the other two will still be enough to solve the problem), which could explain why you are more successful with five hidden units.
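As a hedged, self-contained illustration (TF 1.x APIs, matching the question; the only substantive change is the wider hidden layer, plus a fixed graph seed for reproducibility):

import numpy as np
import tensorflow as tf

tf.reset_default_graph()
tf.set_random_seed(42)   # makes runs reproducible

X_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([0, 1, 1, 0], dtype=np.int64)

X = tf.placeholder(tf.float32, shape=(None, 2), name="X")
y = tf.placeholder(tf.int64, shape=(None), name="y")

hidden1 = tf.layers.dense(X, 5, activation=tf.nn.relu, name="hidden1")  # 5 units, not 2
logits = tf.layers.dense(hidden1, 2, name="outputs")
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
training_op = tf.train.MomentumOptimizer(0.01, momentum=0.9).minimize(loss)
accuracy = tf.reduce_mean(tf.cast(tf.nn.in_top_k(logits, y, 1), tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(1000):
        sess.run(training_op, feed_dict={X: X_data, y: y_data})
    print(accuracy.eval(feed_dict={X: X_data, y: y_data}))  # should typically reach 1.0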
You might want to check out the interactive TensorFlow Playground. It has an XOR dataset available. You can play around with the number of hidden layers, their size, the activation functions, etc., and visualise the decision boundaries the classifier learns as the number of epochs grows.

Group by in MATLAB to find the value corresponding to the minimum, similar to SQL

I have a dataset with columns a, b, c and d.
I want to group the dataset by a and b, and find the value of c for which d is minimum in each group.
I can do the "group by" using grpstats as:
grpstats(M,[M(:,1) M(:,2)],{'min'});
but I don't know how to find the value of M(:,3) that produced the minimum in d.
In SQL I suppose we would use nested queries with the primary keys for that. How can I solve it in MATLAB?
Here is an example:
>> M =[4,1,7,0.3;
2,1,8,0.4;
2,1,9,0.2;
4,2,1,0.2;
2,2,2,0.6;
4,2,3,0.1;
4,3,5,0.8;
5,3,6,0.2;
4,3,4,0.5;]
>> grpstats(M,[M(:,1) M(:,2)],'min')
ans =
2.0000 1.0000 8.0000 0.2000
2.0000 2.0000 2.0000 0.6000
4.0000 1.0000 7.0000 0.3000
4.0000 2.0000 1.0000 0.1000
4.0000 3.0000 4.0000 0.5000
5.0000 3.0000 6.0000 0.2000
But ans(1,3) and ans(4,3) are wrong. The correct answer that I am looking for is:
2.0000 1.0000 9.0000 0.2000
2.0000 2.0000 2.0000 0.6000
4.0000 1.0000 7.0000 0.3000
4.0000 2.0000 3.0000 0.1000
4.0000 3.0000 4.0000 0.5000
5.0000 3.0000 6.0000 0.2000
To conclude: I don't want the minimum of the third column; I want its value in the row where the fourth column is minimal.
grpstats won't do this, and MATLAB doesn't make it as easy as you might hope.
Sometimes brute force is best, even if it doesn't feel like great MATLAB style:
[b, m, n] = unique(M(:,1:2), 'rows');
for i = 1:numel(m)
    idx = find(n == i);
    [~, subidx] = min(M(idx,4));
    a(i,:) = M(idx(subidx), 3:4);
end
>> [b,a]
ans =
2 1 9 0.2
2 2 2 0.6
4 1 7 0.3
4 2 3 0.1
4 3 4 0.5
5 3 6 0.2
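As an aside (Python/pandas rather than MATLAB), the same group-by-argmin pattern is a one-liner, which may make the intent clearer; the DataFrame below is just the question's example matrix M:

import pandas as pd

# columns a-d of the example matrix M
df = pd.DataFrame({"a": [4, 2, 2, 4, 2, 4, 4, 5, 4],
                   "b": [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   "c": [7, 8, 9, 1, 2, 3, 5, 6, 4],
                   "d": [0.3, 0.4, 0.2, 0.2, 0.6, 0.1, 0.8, 0.2, 0.5]})

# idxmin() gives, per (a, b) group, the row label where d is smallest;
# .loc then pulls those whole rows back out
print(df.loc[df.groupby(["a", "b"])["d"].idxmin()].sort_values(["a", "b"]))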
I believe that
temp = grpstats(M(:, [1 2 4 3]), [M(:,1) M(:,2)], {'min'});
result = temp(:, [1 2 4 3]);
would do what you require. If it doesn't, please explain in the comments and we can figure it out...
If I understand the documentation correctly, even
temp = grpstats(M(:, [1 2 4 3]), [1 2], {'min'});
result = temp(:, [1 2 4 3]);
should work (giving column numbers rather than full contents of columns)... Can't test right now, so can't vouch for that.