Island Model in ECJ - genetic-programming

In Genetic Programming (GP), when the island model is used, does it mean that the population size is split between the islands?
For example, if in the parameters file we have
pop.subpop.0.size = 4000
and we have 4 islands, does it mean that each island will have a population of size 1000? What if we put this line of code in parameters file of each island? Is it possible to have different population size for each island?
I'm using Java and ECJ package to implement island models in GP.

No, in your example you have defined only one island of 4000 individuals. The number is never split automatically.
There are two ways to use the island model in ECJ:
Using the InterPopulationExchange class:
A single Java process with shared variables. The islands are the subpopulations of the Population object, so you need to set the size of each subpopulation in the parameter file. In your example you have only set island (subpopulation) 0 to 4000 individuals, but you should also set the other sizes. For example, for 10 islands of 4000 individuals each:
exch = ec.exchange.InterPopulationExchange
pop.subpops = 10
pop.subpop.0.size = 4000
pop.subpop.1.size = 4000
pop.subpop.2.size = 4000
...etc
pop.subpop.9.size = 4000
Using the IslandExchange class:
In this case every island runs in a different Java process, so every islandID.params file (one per island/process) needs to set only one subpopulation:
exch = ec.exchange.IslandExchange
pop.subpop.0.size = 4000
And the number of islands is set in the server.params file:
exch.num-islands = 10
You can see the rest of parameters and more information on page 223 of the ECJ documentation pdf: https://cs.gmu.edu/~eclab/projects/ecj/docs/manual/manual.pdf
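Because each subpopulation has its own size parameter, the sizes do not have to be equal. As a rough illustration for the single-process (InterPopulationExchange) setup, the Python sketch below generates the size lines for islands of different sizes. The helper name and the sizes are made up for the example, and a real parameter file needs the other per-subpopulation settings as well.

```python
def subpop_size_params(sizes):
    """Build ECJ-style size lines, one subpopulation (island) per entry.

    Hypothetical helper for illustration; it only covers the size
    parameters quoted in the answer above.
    """
    lines = ["pop.subpops = %d" % len(sizes)]
    for index, size in enumerate(sizes):
        # Subpopulation indices start at 0 and end at len(sizes) - 1.
        lines.append("pop.subpop.%d.size = %d" % (index, size))
    return lines


# Hypothetical example: four islands with different population sizes.
island_sizes = [1000, 1000, 2000, 4000]
params = subpop_size_params(island_sizes)
print("\n".join(params))
```

The generated lines can be pasted into the parameter file in place of the hand-written pop.subpop.N.size entries.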

I have not studied the ECJ package, but the general idea is that you have a population which is divided into multiple subpopulations.
I don't know why you want subpopulations of different sizes. Is there a benefit compared to fixed-size subpopulations?
Anyway, I did a very simple implementation of a Genetic-Programming variant with multiple subpopulations. You can download it here: http://www.mepx.org/source_code.html
It is written in C++, but it should be very easy to understand by Java programmers.

Is there a way to generate random numbers between 0 and 500, but if first number is 300 not to deviate more than 20 for the next?

Is there a way to generate random numbers between 0 and 500, but if first number for example, is 300, not to deviate more than 20 for the next? I don't want 500 then 0 then 399 then 1. Thanks.
Just plug the first random number back into the "Random Number (Range)" built-in VI.
Bonus
Use a shift register to find a new random number within range of the last random number:
The previous answer requires LabVIEW 2019 or later.
The OpenG Numeric Library has a similar function for generating a random number in a specified range, and it supports earlier versions of LabVIEW.
Also, based on the task description (if I've understood it correctly), the random numbers should always stay in the range 0 to 500, so we need an additional check that the +/- 20 offset does not push the number out of range.
Let me attach a snippet of the solution that implements it. Note that I used Select functions only so that all the code shows in one snippet (instead of a Case Structure with multiple pages).
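For readers without LabVIEW at hand, the same logic can be written out in text form. The Python sketch below is only an approximation of the block diagrams described above (the function name and defaults are my own): the list plays the role of the shift register, and the max/min calls do the range check.

```python
import random


def bounded_random_walk(count, low=0, high=500, max_step=20, rng=random):
    """Return `count` integers in [low, high], each within `max_step` of the previous one."""
    if count <= 0:
        return []
    values = [rng.randint(low, high)]  # first number: anywhere in the full range
    while len(values) < count:
        previous = values[-1]
        # Clamp the window so the next value cannot leave [low, high].
        window_low = max(low, previous - max_step)
        window_high = min(high, previous + max_step)
        values.append(rng.randint(window_low, window_high))
    return values


walk = bounded_random_walk(10, rng=random.Random(1))
print(walk)
```

Passing a seeded random.Random instance makes the sequence repeatable; omit it to use the module-level generator.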

Redeclaring two Medium packages in One system component

I am new to Modelica and don't have much experience with it, but I have the basics. I am trying to model a microfluidic network. The network consists of two sources, one of water and one of oil, controlled by two valves. The two flows meet at a T-junction and then go into a tank or chamber. I don't care about the fluid properties of the mixture because that is not my purpose. My question is: how do I redeclare two medium packages (water and oil) in one system component, such as the T-junction or a tank, in order to simulate the system? In my real model the two media don't meet, because each medium passes through the channels at a different time.
I attached the model with this message. Here's the link.
https://www.dropbox.com/s/yq6lg9la8z211uc/twomediumsv2.zip?dl=0
Thanks for the help .
I don't think you can redeclare a medium during simulation. In your case (where you don't need the mixing of the two fluids) you could create a new medium, for instance called OilWaterMixture, extending from Modelica.Media.Interfaces.PartialMedium.
If you look into the code of PartialMedium you'll see that it contains a lot of partial ("empty") functions that you should fill in in your new medium model. For example, in OilWaterMixture you should extend the function specificEnthalpy_pTX to return the specific enthalpy of your water/oil mixture for a given composition (the mass fraction vector X). This could be done by adding the following function to the OilWaterMixture package:
redeclare function extends specificEnthalpy_pTX "Return specific enthalpy"
protected
  package Oil = Modelica.Media.Incompressible.Examples.Essotherm650;
  package Water = Modelica.Media.Water.StandardWater;
  SpecificEnthalpy h_oil "Specific enthalpy of pure oil";
  SpecificEnthalpy h_water "Specific enthalpy of pure water";
algorithm
  h_oil := Oil.h_pT(p, T);
  h_water := Water.specificEnthalpy_pT(p, T);
  // Modelica arrays are 1-based: X[1] is the oil fraction, X[2] the water fraction
  h := X[1]*h_oil + X[2]*h_water;
end specificEnthalpy_pTX;
The mass fraction vector X is defined in PartialMedium and in OilWaterMixture you must define that it has two elements.
Again, since you are not going to actually use the mixing properties but only mass fraction vectors {0,1} or {1,0} the simple linear mixing equation should be adequate.
When you use OilWaterMixture in the various components, the error log will tell you which medium functions they need. So you probably don't need to extend all the partial functions in PartialMedium.
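To see why the linear rule is enough here, a small numeric check helps. The Python sketch below is not Modelica and the enthalpy values are placeholders rather than real fluid data; it only shows that with mass fractions {1,0} or {0,1} the mixture value reduces to the pure-fluid value.

```python
def mixture_specific_enthalpy(mass_fractions, h_oil, h_water):
    """Mass-fraction-weighted specific enthalpy of an oil/water mixture."""
    x_oil, x_water = mass_fractions
    if abs(x_oil + x_water - 1.0) > 1e-9:
        raise ValueError("mass fractions must sum to 1")
    return x_oil * h_oil + x_water * h_water


# Placeholder values in J/kg, chosen only for illustration (not real data).
H_OIL = 40000.0
H_WATER = 84000.0

h_pure_oil = mixture_specific_enthalpy((1.0, 0.0), H_OIL, H_WATER)
h_pure_water = mixture_specific_enthalpy((0.0, 1.0), H_OIL, H_WATER)
print(h_pure_oil, h_pure_water)
```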

Hardware implementation of Data Encryption Standard (S-boxes /permutations)

I am trying to implement the DES circuit, and according to a lot of papers the S-boxes are usually implemented using an SRL or a LUT. I'm not familiar with SRLs, so I thought I would use 8 LUTs, each with 6 address lines and 4 data lines (two of the address lines carry the first and last bits of the 6-bit block, and the other 4 address lines carry the remaining bits, which define the column).
For example if we take S-box 1 (shown in this figure)
Here is the table that comes with it
That's just for one box, and it seems wrong to me: to do all of the boxes we would have to write 512 lines. My first question is: is a LUT a hardware component? If so, am I using it correctly? And is there a more appropriate implementation or representation?
My second question is: what does hardware wiring mean? I found out that all the permutation functions can be implemented using wire crossing, but I didn't understand it. Should I make a wire for every bit?
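To make the addressing scheme in the question concrete, here is a small software model (Python, not HDL) of S-box 1 as a 64-entry lookup table. The S1 values are the standard DES ones; the function names and the 4-bit toy permutation at the end are assumptions made for the example and are not DES tables. Treat it as a sketch of the idea rather than a hardware design.

```python
# DES S-box 1: 4 rows of 16 four-bit values.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(table, six_bits):
    """Row = first and last bit of the 6-bit input, column = middle four bits."""
    row = ((six_bits >> 4) & 0b10) | (six_bits & 0b1)
    col = (six_bits >> 1) & 0b1111
    return table[row][col]


# Flattened form: one 4-bit output per 6-bit address, i.e. 64 entries per box
# (8 boxes x 64 entries gives the 512 lines mentioned in the question).
s1_lut = [sbox_lookup(S1, address) for address in range(64)]


def permute(bits, wiring):
    """A fixed permutation only reorders bits: output i is wired to input wiring[i]."""
    return [bits[source] for source in wiring]


print(s1_lut[0b011011])                      # row 01, column 1101
print(permute([1, 0, 0, 1], [2, 0, 3, 1]))   # toy 4-bit wiring, not a DES table
```

The permute function contains no logic at all, only index remapping, which is what "wire crossing" amounts to: each output bit is connected directly to one input bit.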

Can I run a GA to optimize wavelet transform?

I am running a wavelet transform (cmor) to estimate the damping and frequencies that exist in a signal. cmor has two parameters that I can change to get more accurate results: the center frequency (Fc) and the bandwidth frequency (Fb). If I construct a signal with a few known frequencies and damping values, I can measure the error of my estimation (fig 2). In the actual case, though, I have a signal whose frequencies and damping I don't know, so I can't measure the error. A friend here suggested reconstructing the signal and measuring the difference between the original and reconstructed signals, e(t) = |x(t) − x̂(t)|.
So my questions are:
Does anyone know a better function for the error between the reconstructed and original signals than e(t) = |x(t) − x̂(t)|?
Can I use a GA to search for Fb and Fc, or do you know a better search method?
Hope this picture shows what I mean, the actual case is last one. others are for explanations
Thanks in advance
You say you don't know the error until after running the wavelet transform, but that's fine. You just run a wavelet transform for every individual the GA produces. Those individuals with lower errors are considered fitter and survive with greater probability. This may be very slow, but conceptually at least, that's the idea.
Let's define a Chromosome datatype containing an encoded pair of values, one for the frequency and another for the damping parameter (in your case, the two cmor parameters Fc and Fb). Don't worry too much about how they're encoded for now; just assume it's an array of two doubles if you like. All that's important is that you have a way to get the values out of the chromosome. For now I'll just refer to them by name, but you could represent them in binary, as an array of doubles, etc. The other member of the Chromosome type is a double storing its fitness.
We can obviously generate random frequency and damping values, so let's create say 100 random Chromosomes. We don't know how to set their fitness yet, but that's fine. Just set it to zero at first. To set the real fitness value, we're going to have to run the wavelet transform once for each of our 100 parameter settings.
for Chromosome chr in population
    chr.fitness = run_wavelet_transform(chr.frequency, chr.damping)
end
Now we have 100 possible wavelet transforms, each with a computed error, stored in our set called population. What's left is to select fitter members of the population, breed them, and allow the fitter members of the population and offspring to survive into the next generation.
while not done
    offspring = new_population()
    while count(offspring) < N
        parent1, parent2 = select_parents(population)
        child1, child2 = do_crossover(parent1, parent2)
        mutate(child1)
        mutate(child2)
        child1.fitness = run_wavelet_transform(child1.frequency, child1.damping)
        child2.fitness = run_wavelet_transform(child2.frequency, child2.damping)
        offspring.add(child1)
        offspring.add(child2)
    end while
    population = merge(population, offspring)
end while
There are a bunch of different ways to do the individual steps like select_parents, do_crossover, mutate, and merge here, but the basic structure of the GA stays pretty much the same. You just have to run a brand new wavelet decomposition for every new offspring.
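As one possible concrete version of the loop above, here is a small runnable Python sketch. Everything specific in it is an assumption made for illustration: reconstruction_error is a stand-in with a made-up optimum (in practice it would run the cmor transform, reconstruct the signal and return a scalar error), the parameter bounds are arbitrary, and tournament selection, blend crossover, Gaussian mutation and keep-the-best merging are just one choice among many. Lower error counts as fitter.

```python
import random

BOUNDS = (0.1, 5.0)  # hypothetical search range for both Fc and Fb


def reconstruction_error(fc, fb):
    """Stand-in for: run the wavelet transform, reconstruct, return a scalar error."""
    return (fc - 1.5) ** 2 + (fb - 2.0) ** 2  # made-up optimum at (1.5, 2.0)


def make_individual(params):
    return (params, reconstruction_error(*params))


def select_parent(population, rng, k=3):
    """Tournament selection: the lowest-error individual of k random picks."""
    return min(rng.sample(population, k), key=lambda ind: ind[1])


def make_child(parent1, parent2, rng):
    weight = rng.random()
    blended = [weight * a + (1 - weight) * b for a, b in zip(parent1[0], parent2[0])]
    # Gaussian mutation, clipped back into the search range.
    mutated = tuple(min(BOUNDS[1], max(BOUNDS[0], g + rng.gauss(0, 0.1))) for g in blended)
    return make_individual(mutated)


def run_ga(pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [
        make_individual((rng.uniform(*BOUNDS), rng.uniform(*BOUNDS)))
        for _ in range(pop_size)
    ]
    history = [min(ind[1] for ind in population)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            offspring.append(
                make_child(select_parent(population, rng), select_parent(population, rng), rng)
            )
        # Merge: keep the pop_size lowest-error individuals of parents and offspring.
        population = sorted(population + offspring, key=lambda ind: ind[1])[:pop_size]
        history.append(population[0][1])
    return population[0], history


best, history = run_ga()
print("best (Fc, Fb):", best[0], "error:", best[1])
```

Because the merge step never discards the current best individual, the best error in history cannot get worse from one generation to the next.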

How do I process enormous numbers? [duplicate]

Closed 10 years ago.
Possible Duplicate:
Most efficient implementation of a large number class
Suppose I needed to calculate 2^150000. Obviously that number is going to exceed the size of an int, float, or double. How can I make a data type that allows normal math functions but exceeds the basic number types?
If this is a "depends which language you use" kind of deal, I will say C#.
See "Most efficient implementation of a large number class" for some leads.
If C# is not cast in stone, and you want something that just works out of the box, then there are several options. The one I know best is Python, but I think that languages like Scheme and Ruby support large numbers, too.
Python: 2**150000. Prints the result after about 1 second.
If you want free mathematics software, look at Maxima or Sage.
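A minimal Python 3 sketch of the one-liner above, for anyone who wants to try it. The only wrinkle I'm aware of is that recent Python versions limit int-to-string conversion by default, so the sketch lifts that limit where the setting exists before counting digits.

```python
import sys

# Newer Python versions cap int <-> str conversion length by default;
# lift the cap where the setting exists so the full number can be formatted.
if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(0)

n = 2 ** 150000              # Python ints grow to arbitrary size automatically
digit_count = len(str(n))    # number of decimal digits

print(digit_count)
print(str(n)[:20] + "...")   # leading digits only, to keep the output short
```

The ordinary operators (+, *, **, %) keep working on n, with no separate big-integer type to switch to.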
You might also consider using Frink, which is a language with the native capability of dealing with measurement units.
It computes 2^150000 without difficulty, deals with fractions (e.g. 1/3+2/5 --> 11/15), computes 3 meters + 2 inch --> 3.0508 m and is a full programming language.
Frink - Copyright 2000-2008 Alan Eliasen, eliasen#mindspring.com
http://futureboy.us/frinkdocs/
Several languages have built in support for arbitrary large numbers. You could use Mathematica, for example. I tried your example in Mathematica, and the result has 45,155 digits. I tried the same example with bc on a Unix machine. bc supports extended precision, but not that extended; it bombed on the example.
Lisp is your friend. Default biginteger numbers.
I find it very frustrating to use a language without arbitrarily large numbers: it seems nonsensical to be able to use ordinary operators like addition on most numbers, but to have to switch to method calls on a BigInt instance simply because of its size.
A whole bunch of languages have more complete numeric towers, and seamlessly coerce when needed; e.g., Allegro Common Lisp evaluates and prints all 45,155 digits of (expt 2 150000) in 1ms.
cl-user(2): (time (expt 2 150000))
; cpu time (non-gc) 0 msec user, 0 msec system
; cpu time (gc) 0 msec user, 0 msec system
; cpu time (total) 0 msec user, 0 msec system
; real time 1 msec
; space allocation:
; 2 cons cells, 18,784 other bytes, 0 static bytes
There is a product in C called calc which is an arbitrary precision calculator. I used it once when working as a researcher and found it fairly straightforward to use...
http://sourceforge.net/projects/calc/
It can be programmed for difficult or long calculations and can accept arguments from the command line. In interactive mode, it accepts one command at a time, and displays the answer.
Ordinarily the commands are simply expressions such as:
3 * (4 + 1)
and calc will print:
15
Calc does the arithmetic operators +, -, /, * as well as ^ (exponentiation), % (modulus) and // (integer divide).
For example:
3 * 19 ^ 43 - 1
will produce:
29075426613099201338473141505176993450849249622191102976
Calc values can be VERY large. For example:
2 ^ 23209 - 1
will print:
402874115778988778181873329071 ... loads of digits ... 3779264511
Hope this helps...
I don't know C#, but I do know the Ruby programming language has the BigDecimal class, which seems to allow numbers of unlimited size.
Python has a bignum library. If you need to implement a bignum library in another language you can at least use the Python one as reference for validating your work. Note that bignums have a few implementation gotchas that aren't immediately obvious if you don't know what you're looking for.