Variation (complex?) on the shortest path - optimization

I have the following problem. Given a directed graph G=(V,E) with a cost cij on every edge (i,j). We have multiple sources, say s1,...,sk, and one target, say t. The problem is to find the lowest combined cost of going from s1,...,sk to t, where the total number of vertices visited by all the different paths is M. (The sources and the target don't count as visited vertices and 0 <= M <= |V|-k-1, so if M = 0 all paths go directly from source to target.)
I came up with the following, but haven't found a solution yet.
1) The problem is equivalent to one with multiple targets (t1,...,tk) and one source: just reverse all the edges, turn the sources into targets and the target into the source. I thought this could be useful since e.g. Dijkstra computes the shortest paths from one source to all other vertices in the graph.
2) With just one target and one source, one can find the shortest path that visits at most M vertices with the Bellman-Ford algorithm. This is done by increasing the number of visited vertices iteratively.
3) The problem of finding the shortest path from one source to one target while vertices v1,...,vk have to be visited can, for small k, be solved as follows:
i) compute the shortest paths between all pairs of vertices.
ii) check which of the k! permutations is the shortest.
I thought this could be useful when transforming my adjusted problem at 1) into the problem of going from one source to one "supertarget", with mandatory visits at the "old" targets t1=v1,...,tk=vk.
Unfortunately, combining 1, 2 and 3 doesn't provide a solution but it may help. Does anyone know the solution? Can this be solved efficiently?
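For reference, the single-source, single-target building block mentioned above (Bellman-Ford with a cap on the number of visited vertices) could look roughly like the sketch below. It does not solve the combined multi-source problem. The function name, the 0..n-1 vertex numbering and the (u, v, cost) edge tuples are assumptions made for illustration.

```python
import math

def bounded_shortest_path(n, edges, source, target, max_intermediate):
    """Cheapest source->target cost using at most max_intermediate
    intermediate vertices, i.e. at most max_intermediate + 1 edges."""
    dist = [math.inf] * n
    dist[source] = 0.0
    # Each round allows one more edge, as in Bellman-Ford.
    for _ in range(max_intermediate + 1):
        new_dist = dist[:]  # copy so a round extends paths by one edge only
        for u, v, cost in edges:
            if dist[u] + cost < new_dist[v]:
                new_dist[v] = dist[u] + cost
        dist = new_dist
    return dist[target]  # math.inf if no such path exists
```

With edges 0->1 and 1->2 of cost 1 and a direct edge 0->2 of cost 5, the result is 5 for M = 0 and 2 for M = 1.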

Why not do a separate Dijkstra for each s, and later sum the costs?
Something like:
float totalCost = 0;
for (int i = 0; i < k; i++) {
    // add the cost of the shortest path from s[i] to t
    totalCost += Dijkstra(myGraph, s[i], t);
}
I hope I understood the question correctly.

Octave minimization for a many-body Hamiltonian with non-linear constraint

I work in theoretical physics, and I have come upon a problem that requires the minimization of a particular Hamiltonian operator for a system of 8 particles, with one non-linear constraint. Due to the complexity of the system, I cannot define the entire Hamiltonian "in one go", nor the constraint. By this I mean that the quantity I am searching for is defined recursively: it depends on complex summations over quantities calculated for systems of 7 particles, which in turn depend on quantities calculated for systems of 6, and so on, down to a one- or two-particle system, for which said quantities are given as initial values that depend on the elements of a column vector (the argument/minimization parameters). The constraint itself is also of this form, requiring the "overlap" between the states of 8 particles to be exactly 1 (i.e. the state must be normalized). I have been thinking of a way to use fmincon for this, but I've come up short, since my function has an implicit dependence on the parameters, and I can't write the whole thing explicitly. For a better understanding, here is some of the code:
for m=3:npairs+1
  for n=3:npairs+1
    for i=1:nsps
      for j=1:nsps
        overlap(m,n)=overlap(m,n)+x(i)*x(j)*(delta(i,j)*(overlap(m-1,n-1)-N(m-1,n-1,i))+p0p(m-1,n-1,j,i));
        p(m,n,i)=(n-1)*x(i)*overlap(m,n-1)-(n-2)*(n-1)*x(i)*x(i)*((m-1)*x(i)*overlap(m-1,n-1)-(m-2)*(m-1)*x(i)*x(i)*p(m-1,n-1,i));
        N(m,n,i)=2*(n-1)*x(i)*p(n-1,m,i);
        p0p(m,n,i,j)=(m-1)*(n-1)*x(i)*x(j)*overlap(m-1,n-1)-(m-1)*(n-1)*(m-2)*x(i)*x(i)*x(j)*p(m-2,n-1,i)-(m-1)*(n-1)*(n-2)*x(i)*x(j)*x(j)*p0(m-1,n-2,j)-(m-1)*(n-1)*(m-2)*(n-2)*x(i)*x(i)*x(j)*x(j)*(delta(i,j)*(overlap(m-2,n-2)-N(m-2,n-2,i))+p0p(m-2,n-2,j,i));
      endfor
    endfor
  endfor
endfor
function [E]=H(x)
  E=summation over all i and j of N and p0p for m=n=8 %not actual code
endfunction
overlap(9,9)=1 %constraint
It's hard to give a specific answer, but I would advise the following to get you started.
First, note that the inner two levels of the nested loop can be vectorised, since i and j only ever appear as plain indices (whereas m and n refer back to earlier entries, so they cannot be vectorised). So your 4-level loop can be reduced to a 2-level loop containing 4 functions operating on i-by-j matrices.
Second, note that the whole construct can be expressed as a recursive function. If you have suitable base cases for m = 0, n = 0, you can iteratively obtain all i,j matrices for all cases up to m=9,n=9. In particular, you can try to 'memoize' the early steps, and plug them into higher steps, rather than rely on actual recursion.
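As a rough illustration of the "fill the table once, then read off the top entry" idea, here is a small Python sketch. The recurrence is a deliberately simplified toy, not the one from the question, and the names build_table, energy and normalization_residual are made up. The point is only that both the objective and the constraint become ordinary functions of x, which is all a numerical constrained minimizer needs to evaluate.

```python
def build_table(x, size):
    """Fill table[m][n] bottom-up; every entry is computed exactly once."""
    weight = sum(v * v for v in x)
    table = [[0.0] * (size + 1) for _ in range(size + 1)]
    table[0][0] = 1.0  # assumed base case
    for m in range(1, size + 1):
        for n in range(1, size + 1):
            # toy recurrence: only refers back to already-filled entries
            table[m][n] = weight * table[m - 1][n - 1]
    return table

def energy(x, size=8):
    """Toy objective built from the second-highest level of the table."""
    table = build_table(x, size)
    return sum(v * table[size - 1][size - 1] for v in x)

def normalization_residual(x, size=8):
    """Equality constraint written as g(x) = 0: top-level overlap minus 1."""
    return build_table(x, size)[size][size] - 1.0
```

A solver would call energy(x) and normalization_residual(x) with trial vectors x; the table is rebuilt per call, so no explicit closed form is required.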
Assuming you need to sum with the first two indices fixed to 8 (if I understood correctly), you can easily do it with anonymous functions:
https://octave.org/doc/v6.1.0/Anonymous-Functions.html#Anonymous-Functions
# creating some sample data
A=ones(8,8,4,4);
B=2*ones(8,8,4,4);
# defining 2 versions of sums
f = @(A,B) [sum(sum(A(8,8,:,:))), sum(sum(B(8,8,:,:)))];
g = @(A,B) sum(sum(A(8,8,:,:)))+ sum(sum(B(8,8,:,:)));
E1=f(A,B)
E2=g(A,B)
the output will be:
octave:21> E1=f(A,B)
E1 =
16 32
octave:22> E2=g(A,B)
E2 = 48

Understanding Google Code Jam 2013 - X Marks the Spot

I was trying to solve Google Code Jam problems and there is one of them that I don't understand. Here is the question (World Finals 2013 - problem C): https://code.google.com/codejam/contest/2437491/dashboard#s=p2&a=2
And here follows the problem analysis: https://code.google.com/codejam/contest/2437491/dashboard#s=a&a=2
I don't understand why we can use binary search. In order to use binary search the elements have to be sorted. In other words: for a given element e, we can't have any element less than e on its right side. But that is not the case in this problem. Let me give you an example:
Suppose we do what the analysis tells us to do: we start with a left bound angle of 90° and a right bound angle of 0°. Our first search will be at angle of 45°. Suppose we find that, for this angle, X < N. In this case, the analysis tells us to make our left bound 45°. At this point, we can have discarded a viable solution (at, let's say, 75°) and at the same time there can be no more solutions between 0° and 45°, leading us to say that there's no solution (wrongly).
I don't think Google's solution is wrong =P. But I can't figure out why we can use a binary search in this case. Does anyone know?
I don't understand why we can use binary search. In order to use
binary search the elements have to be sorted. In other words: for a
given element e, we can't have any element less than e on its right
side. But that is not the case in this problem.
A binary search works in this case because:
the values vary by at most 1
we only need to find one solution, not all of them
the first and last value straddle the desired value (X .. N .. 2N-X)
I don't quite follow your counter-example, but here's an example of a binary search on a sequence with the above constraints. Looking for 3:
1 2 1 1 2 3 2 3 4 5 4 4 3 3 4 5 4 4
[                                 ]
[               ]
        [       ]
            [   ]
              *
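A minimal sketch of that search, assuming integer values, neighbouring values that differ by at most 1, and end values that straddle the target (the function name is made up for illustration):

```python
def find_value(seq, target):
    """Return an index i with seq[i] == target.

    Assumes adjacent values differ by at most 1 and
    seq[0] <= target <= seq[-1]. The sequence need not be sorted.
    """
    lo, hi = 0, len(seq) - 1
    if not seq[lo] <= target <= seq[hi]:
        raise ValueError("end values must straddle the target")
    while True:
        if seq[lo] == target:
            return lo
        if seq[hi] == target:
            return hi
        # Here seq[lo] < target < seq[hi], so hi - lo >= 2 and mid lies
        # strictly between them: steps of size 1 cannot jump over target.
        mid = (lo + hi) // 2
        if seq[mid] == target:
            return mid
        if seq[mid] < target:
            lo = mid  # a crossing still lies between mid and hi
        else:
            hi = mid  # a crossing still lies between lo and mid
```

On the sequence above, looking for 3, this visits the same four intervals and returns index 7. It skips the earlier 3 at index 5, which is fine because only one solution is needed.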
I have read the problem and in the meantime thought about the solution. When I read the solution I saw that they had mostly done the same as I would have; however, I did not think of some minor optimizations they were using, as I was still digesting the task.
Solution:
Step1: They choose a median so that each of the lines splits the set in half. Therefore there will be two provinces having x mines, while the other two provinces will have 2 * N - x mines, respectively, because the two lines each split the 4 * N mines in half and
2 * x + 2 * (2 * N - x) = 2 * x + 4 * N - 2 * x = 4 * N.
If x = N, then we were lucky and accidentally found a solution.
Step2: They are taking advantage of the "fact" that no three mines are collinear. I believe they are wrong, as the task did not tell us this is the case; they relied on this "fact" because they assumed that the task is solvable, whereas the task clearly asks us to report when it is impossible with the current input. I believe this part is smelly. The task is not necessarily solvable, not to mention the fact that there might be a solution even when three mines are collinear.
Thus, somewhere in between X had to be exactly equal to N!
Not true either, as they have stated in the task that
You should output IMPOSSIBLE instead if there is no good placement of
borders.
Step 3: They are still using the "fact" described as untrue in the previous step.
So let us close the book and think ourselves. Their solution is not bad, but they assume something which is not necessarily true. I believe them that all their inputs contained mines corresponding to their assumption, but this is not necessarily the case, as the task did not clearly state this and I can easily create a solvable input having three collinear mines.
Their idea for the median choice is correct, so we must follow this procedure; the problem gets more complicated if we skip this step. Now, we could search for a solution by modifying the angle until we find a solution or reach the border of the period (this was my idea initially). However, we know which provinces have too many mines and which provinces do not have enough. Also, we know that the period is pi/2, or in other terms 90 degrees: if we move alpha by pi/2 in either the positive (counter-clockwise) or negative (clockwise) direction, we have the same problem, but each child gets a different province, which is irrelevant from our point of view. They will still be rivals, I guess, but this does not concern us.
Now, we try and see what happens if we rotate the lines by pi/4. We will see that some mines might have changed borders. We have either not reached a solution yet, or have gone too far and poor provinces became rich and rich provinces became poor. In either case we know in which half the solution should be, so we rotate back/forward by pi/8. Then, with the same logic, by pi/16, until we have found a solution or there is no solution.
Back to the question: we cannot end up in the situation you describe. If there were a valid solution at 75 degrees, then after rotating by only 45 degrees we would see that we have not rotated the lines enough, because the number of mines which have changed borders tells us which angle interval to continue in. Remember that we have two rich provinces and two poor provinces. Each rich province borders two poor provinces and vice versa. So the poor provinces should gain mines and the rich provinces should lose mines. If, after rotating by 45 degrees, we see that the poor provinces did not get enough mines, then we rotate more until they have gained enough mines. If they have gained too many mines, then we change direction.

Can I run a GA to optimize wavelet transform?

I am running a wavelet transform (cmor) to estimate the damping and frequencies that exist in a signal. cmor has 2 parameters that I can change to get more accurate results: center frequency (Fc) and bandwidth frequency (Fb). If I construct a signal with a few frequencies and dampings, then I can measure the error of my estimation (fig 2). But in the actual case I have a signal and I don't know its frequencies and dampings, so I can't measure the error. So a friend here suggested that I reconstruct the signal and find the error by measuring the difference between the original and the reconstructed signal, e(t)=|x(t)−x̂(t)|.
So my questions are:
1) Does anyone know a better function to find the error between the reconstructed and the original signal than e(t)=|x(t)−x̂(t)|?
2) Can I use a GA to search for Fb and Fc, or do you know a better search method?
Hope this picture shows what I mean; the actual case is the last one, the others are for explanation.
Thanks in advance
You say you don't know the error until after running the wavelet transform, but that's fine. You just run a wavelet transform for every individual the GA produces. Those individuals with lower errors are considered fitter and survive with greater probability. This may be very slow, but conceptually at least, that's the idea.
Let's define a Chromosome datatype containing an encoded pair of values, one for the frequency and another for the damping parameter. Don't worry too much about how they're encoded for now, just assume it's an array of two doubles if you like. All that's important is that you have a way to get the values out of the chromosome. For now, I'll just refer to them by name, but you could represent them in binary, as an array of doubles, etc. The other member of the Chromosome type is a double storing its fitness.
We can obviously generate random frequency and damping values, so let's create say 100 random Chromosomes. We don't know how to set their fitness yet, but that's fine. Just set it to zero at first. To set the real fitness value, we're going to have to run the wavelet transform once for each of our 100 parameter settings.
for Chromosome chr in population
    chr.fitness = run_wavelet_transform(chr.frequency, chr.damping)
end
Now we have 100 possible wavelet transforms, each with a computed error, stored in our set called population. What's left is to select fitter members of the population, breed them, and allow the fitter members of the population and offspring to survive into the next generation.
while not done
    offspring = new_population()
    while count(offspring) < N
        parent1, parent2 = select_parents(population)
        child1, child2 = do_crossover(parent1, parent2)
        mutate(child1)
        mutate(child2)
        child1.fitness = run_wavelet_transform(child1.frequency, child1.damping)
        child2.fitness = run_wavelet_transform(child2.frequency, child2.damping)
        offspring.add(child1)
        offspring.add(child2)
    end while
    population = merge(population, offspring)
end while
There are a bunch of different ways to do the individual steps like select_parents, do_crossover, mutate, and merge here, but the basic structure of the GA stays pretty much the same. You just have to run a brand new wavelet decomposition for every new offspring.
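To make the structure concrete, here is one possible runnable sketch in Python. The function reconstruction_error is a hypothetical stand-in for "run the wavelet transform with these two parameters, reconstruct the signal and measure the error". Tournament selection, blend crossover, Gaussian mutation and the keep-the-best merge are just one choice among the many mentioned above.

```python
import random

def reconstruction_error(fc, fb):
    # Hypothetical stand-in: replace with transform + reconstruction + error.
    return (fc - 1.5) ** 2 + (fb - 3.0) ** 2

def run_ga(error_fn, bounds, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def select(scored):
        # Binary tournament: the individual with the lower error wins.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] <= b[0] else b[1]

    def crossover(p1, p2):
        # Blend crossover: a random weighted average of the two parents.
        w = rng.random()
        return [w * x + (1 - w) * y for x, y in zip(p1, p2)]

    def mutate(ind):
        out = []
        for value, (lo, hi) in zip(ind, bounds):
            if rng.random() < 0.2:
                value += rng.gauss(0, 0.1 * (hi - lo))
            out.append(min(hi, max(lo, value)))  # clamp to the bounds
        return out

    population = [random_individual() for _ in range(pop_size)]
    scored = sorted(((error_fn(*ind), ind) for ind in population),
                    key=lambda pair: pair[0])
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            child = mutate(crossover(select(scored), select(scored)))
            offspring.append((error_fn(*child), child))  # one transform per child
        # Merge: keep the pop_size individuals with the lowest error.
        scored = sorted(scored + offspring, key=lambda pair: pair[0])[:pop_size]
    return scored[0]  # (error, [fc, fb]) of the best individual found
```

A call such as run_ga(reconstruction_error, [(0.1, 5.0), (0.1, 10.0)]) returns the lowest error found together with the corresponding parameter pair. Because the merge always keeps the best individuals, the best error never gets worse from one generation to the next.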

Building an MKPolygon using outer boundary of a set of coordinates - How do I split coordinates that fall on either side of a line?

I'm trying to build a MKPolygon using the outer boundary of a set of coordinates.
From what I can tell, there is no delivered functionality to achieve this in Xcode (the MKPolygon methods would use all points to build the polygon, including interior points).
After some research I've found that a convex-hull solves this problem.
After looking into various algorithms, the one I can best wrap my head around to implement is QuickHull.
This takes the outer lat coords and draws a line between the two. From there, you split your points based on that line into two subsets and process distance between the outer lats to start building triangles and eliminating points within until you are left with the outer boundary.
I can find the outer points just by looking at min/max lat and can draw a line between the two (MKPolyline) - but how would I determine whether a point falls on one side or the other of this MKPolyline?
A follow up question is whether there is a hit test to determine whether points fall within an MKPolygon.
Thanks!
I ended up using a variation of the gift wrap algorithm. Certainly not a trivial task.
Having trouble with formatting of the full code so I'll have to just put my steps (probably better because I have some clean up to do!)
I started with an array of MKPointAnnotations
1) I got the lowest point that is furthest left. To do this, I looped through all of the points and compared lat/lng to get lowest point. This point will definitely be in the convex hull, so add it to a NSMutableArray that will store our convex hull points (cvp)
2) Get all points to the left of the lowest point and loop through them, calculating the angle of the cvp to the remaining points on the left. Whichever has the greatest angle, will be the point you need to add to the array.
atan2(cos(lat1)*sin(lat2)-sin(lat1)*cos(lat2)*cos(lon2-lon1), sin(lon2-lon1)*cos(lat2))
For each point found, create a triangle (by using lat from new point and long from previous point) and create a polygon. I used this code to do a hit test on my polygon:
BOOL mapCoordinateIsInPolygon = CGPathContainsPoint(polygonView.path, NULL, polygonViewPoint, NO);
If anything was found in the hit test, remove it from the comparison array (all those on the left of the original array minus the hull points)
Once you have at least 3 points in your cvp array, build another polygon with all of the cvp's in the array and remove anything within using the hit test.
3) Once you've worked through all of the left points, create a new comparison array of the remaining points that haven't been eliminated or added to the hull
4) Use the same calculations and polygon tests to remove points and add the cvp's found
At the end, you're left with a list of points that make up your convex hull.
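For comparison, a textbook gift-wrapping (Jarvis march) sketch in Python is shown below. It is not the code described in the steps above: it treats coordinates as flat (x, y) = (longitude, latitude) pairs, which is an assumption that only holds for small areas, and it needs no polygon hit test. The cross function also answers the original question: its sign tells you on which side of the line a->b a point lies.

```python
def cross(o, a, b):
    """> 0 if b is left of the directed line o->a, < 0 if right, 0 if on it."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def gift_wrap(points):
    """Convex hull as a counter-clockwise list of (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = pts[0]  # leftmost point (lowest y on ties) is always on the hull
    hull, current = [], start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Take p if it lies to the right of current->candidate,
            # or is collinear with it but farther away.
            if turn < 0 or (turn == 0 and
                            dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            return hull
```

For a square with corners (0,0), (2,0), (2,2), (0,2) plus the extra points (1,1) and (1,0), the result contains only the four corners.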

How can I compare two NSImages for differences?

I'm attempting to gauge the percentage difference between two images.
Having done a lot of reading I seem to have a number of options, but I'm not sure which is the best method to follow for:
Ease of coding
Performance.
The methods I've seen are:
Non language specific - academic: Image comparison - fast algorithm
Mac specific - direct pixel access: http://www.markj.net/iphone-uiimage-pixel-color/
Does anyone have any advice about what solutions make most sense for the above two cases and have code samples to show how to apply them?
I've had success calculating the difference between two images using the histogram technique mentioned here. redmoskito's answer in the SO question you linked to was actually my inspiration!
The following is an overview of the algorithm I used:
Convert the images to grayscale—compare one channel instead of three.
Divide each image into an n * n grid of "subimages". Then, for each subimage pair:
Calculate their colour composition histograms.
Calculate the absolute difference between the two histograms.
The maximum difference found between two subimages is a measure of the two images' difference. Other metrics could also be used (e.g. the average difference between subimages).
As tskuzzy noted in his answer, if your ultimate goal is a binary "yes, these two images are (roughly) the same" or "no, they're not", you need some meaningful threshold value. You could produce such a value by passing images into the algorithm and tweaking the threshold based on its output and how similar you think the images are. A form of machine learning, I suppose.
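A rough sketch of the grid-of-histograms idea in plain Python, assuming both images are already grayscale, have the same size, and are stored as lists of rows with values 0..255. The function names, the bin count and the normalisation to the range 0..1 are illustrative choices, not taken from the app mentioned below.

```python
def histogram(pixels, bins=16):
    """Normalised histogram (fractions summing to 1) of 0..255 values."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // 256] += 1
    return [c / len(pixels) for c in counts]

def image_difference(img_a, img_b, grid=4, bins=16):
    """Largest histogram difference over a grid x grid set of subimages.

    Returns 0.0 for identical histograms and 1.0 for completely
    disjoint ones.
    """
    height, width = len(img_a), len(img_a[0])
    if (height, width) != (len(img_b), len(img_b[0])):
        raise ValueError("images must have the same dimensions")
    worst = 0.0
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell_a = [img_a[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cell_b = [img_b[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            if not cell_a:
                continue  # image smaller than the grid in this direction
            ha, hb = histogram(cell_a, bins), histogram(cell_b, bins)
            diff = sum(abs(p - q) for p, q in zip(ha, hb)) / 2
            worst = max(worst, diff)
    return worst
```

Comparing the result against a threshold chosen by experiment then gives the yes/no answer discussed above.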
I recently wrote a blog post on this very topic, albeit as part of a larger goal. I also created a simple iPhone app to demonstrate the algorithm. You can find the source on GitHub; perhaps it will help?
It is really difficult to suggest something when you don't tell us more about the images or the variations. Are they shapes? Are they the different objects and you want to know what class of objects? Are they the same object and you want to distinguish the object instance? Are they faces? Are they fingerprints? Are the objects in the same pose? Under the same illumination?
When you say performance, what exactly do you mean? How large are the images? All in all it really depends. With what you've said if it is only ease of coding and performance I would suggest to just find the absolute value of the difference of pixels. That is super easy to code and about as fast as it gets, but really unlikely to work for anything other than the most synthetic examples.
That being said I would like to point you to: DHOG, GLOH, SURF and SIFT.
You can use the fairly basic subtraction technique that the lads above suggested. @carlosdc has hit the nail on the head with regard to the type of image this basic technique can be used for. I have attached an example so you can see the results for yourself.
The first shows an image from a simulation at some time t. A second image, taken some (simulation) time later at t + dt, was subtracted from the first. The subtracted image (in black and white for clarity) then shows how the simulation has changed in that time. This was done as described above and is very powerful and easy to code.
Hope this aids you in some way
This is some old nasty FORTRAN, but it should give you the basic approach. It is not that difficult at all. I am doing it on a two-colour palette; for a colour image you would do this operation for each of R, G and B. That is, compute the intensities or values in each cell/pixel and store them in some array. Do the same for the other image, and subtract one array from the other; this will leave you with a colourful subtraction image. My advice would be to do as the lads suggest above: compute the magnitude of the sum of the R, G and B components so you just get one value. Write that to an array, do the same for the other image, then subtract. Then create a new range for either R, G or B and map the resulting subtracted array to it; this will give a much clearer picture as a result.
* =============================================================
      SUBROUTINE SUBTRACT(FNAME1,FNAME2,IOS)
* This routine subtracts the image in FNAME1 from the image in
* FNAME2 and stores the result in si
* =============================================================
* Common :
      INCLUDE 'CONST.CMN'
      INCLUDE 'IO.CMN'
      INCLUDE 'SYNCH.CMN'
      INCLUDE 'PGP.CMN'
* Input :
      CHARACTER fname1*(sznam),fname2*(sznam)
* Output :
      integer IOS
* Variables:
      logical glue
      character fullname*(szlin)
      character dir*(szlin),ftype*(3)
      integer i,j,nxy1,nxy2
      real si1(2*maxc,2*maxc),si2(2*maxc,2*maxc)
* =================================================================
      IOS = 1
      nomap=.true.
      ftype='map'
      dir='./pictures'
      ! reading first image (FNAME2)
      if(.not.glue(dir,fname2,ftype,fullname))then
        write(*,31) fullname
        return
      endif
      OPEN(unit2,status='old',name=fullname,form='unformatted',
     &     err=10,iostat=ios)
      read(unit2,err=11)nxy2
      read(unit2,err=11)rad,dxy
      do i=1,nxy2
        do j=1,nxy2
          read(unit2,err=11)si2(i,j)
        enddo
      enddo
      CLOSE(unit2)
      ! reading second image (FNAME1)
      if(.not.glue(dir,fname1,ftype,fullname))then
        write(*,31) fullname
        return
      endif
      OPEN(unit2,status='old',name=fullname,form='unformatted',
     &     err=10,iostat=ios)
      read(unit2,err=11)nxy1
      read(unit2,err=11)rad,dxy
      do i=1,nxy1
        do j=1,nxy1
          read(unit2,err=11)si1(i,j)
        enddo
      enddo
      CLOSE(unit2)
      ! subtracting images
      if(nxy1.eq.nxy2)then
        nxy=nxy1
        do i=1,nxy1
          do j=1,nxy1
            si(i,j)=si2(i,j)-si1(i,j)
          enddo
        enddo
      else
        print *,'SUBTRACT: Different sizes of image arrays'
        IOS=0
        return
      endif
* normal finishing
      IOS=0
      nomap=.false.
      return
* exceptional finishing
 10   write (*,30) fullname
      return
 11   write (*,32) fullname
      return
 30   format('Cannot open file ',72A)
 31   format('Improper filename ',72A)
 32   format('Error reading from file ',72A)
      end
! =============================================================
Hope this is of some use. All the best.
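For readers who do not use Fortran, the same subtract-and-rescale idea can be sketched in Python as follows. It assumes each image is a list of rows of (r, g, b) tuples and uses the magnitude of the colour vector as the single value per pixel; all names are made up for illustration.

```python
import math

def intensity(image):
    """Collapse each (r, g, b) pixel to one value: the colour vector's magnitude."""
    return [[math.sqrt(r * r + g * g + b * b) for r, g, b in row]
            for row in image]

def subtract_images(first, second):
    """Per-pixel difference second - first of the intensity arrays."""
    a, b = intensity(first), intensity(second)
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return [[vb - va for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]

def rescale(diff, low=0, high=255):
    """Map the differences linearly onto low..high for display."""
    flat = [v for row in diff for v in row]
    smallest, largest = min(flat), max(flat)
    if smallest == largest:
        return [[low for _ in row] for row in diff]  # nothing changed
    scale = (high - low) / (largest - smallest)
    return [[round(low + (v - smallest) * scale) for v in row] for row in diff]
```

The rescaled array can then be written out as a single-channel image, so that unchanged regions and changed regions are easy to tell apart.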
Out of the methods described in your first link, the histogram comparison method is by far the simplest to code and the fastest. However key point matching will provide far more accurate results since you want to know a precise number describing the difference between two images.
To implement the histogram method, I would do the following:
Compute the red, green, and blue histograms of each image
Add up the differences between each bucket
If the difference is above a certain threshold, then the percentage is 0%
Otherwise the colors found in the images are similar. So then do a pixel by pixel comparison and convert the difference into a percentage.
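One way to read these steps as code is sketched below, assuming images of equal size stored as lists of rows of (r, g, b) tuples. The bin count, the threshold value and the way the pixel difference is turned into a percentage are all illustrative guesses that would need tuning.

```python
def channel_histograms(image, bins=8):
    """One histogram (list of bucket counts) per colour channel."""
    hists = [[0] * bins for _ in range(3)]
    for row in image:
        for pixel in row:
            for c in range(3):
                hists[c][pixel[c] * bins // 256] += 1
    return hists

def similarity_percent(img_a, img_b, bins=8, histogram_threshold=0.5):
    """0..100, where 100 means the images are pixel-for-pixel identical."""
    pixel_count = len(img_a) * len(img_a[0])
    ha, hb = channel_histograms(img_a, bins), channel_histograms(img_b, bins)
    # Sum of bucket differences; at most 2 * pixel_count per channel.
    bucket_diff = sum(abs(x - y)
                      for ca, cb in zip(ha, hb) for x, y in zip(ca, cb))
    if bucket_diff / (6 * pixel_count) > histogram_threshold:
        return 0.0  # colour content too different to bother comparing pixels
    # Otherwise compare pixel by pixel and convert to a percentage.
    total = sum(abs(p[c] - q[c])
                for row_a, row_b in zip(img_a, img_b)
                for p, q in zip(row_a, row_b)
                for c in range(3))
    return 100.0 * (1 - total / (255 * 3 * pixel_count))
```

Identical images give 100.0, while an all-black and an all-white image fail the histogram check and give 0.0.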
I don't know any precise algorithms for finding the key points of an image. However once you find them for each image you can do a pixel by pixel comparison for each of the key points.