Exponents in Genetic Programming

I want to have real-valued exponents (not just integers) for the terminal variables.
For example, let's say I want to evolve a function y = x^3.5 + x^2.2 + 6. How should I proceed? I haven't seen any GP implementations that can do this.
I tried using the power function, but sometimes the initial solutions have so many exponents that the evaluated value exceeds 'double' bounds!
Any suggestion would be appreciated. Thanks in advance.

DEAP (in Python) implements it; in fact, there is an example for exactly that. By adding math.pow from Python to the primitive set you can achieve what you want.
pset.addPrimitive(math.pow, 2)
But using the pow operator you risk getting something like x^(x^(x^(x))), which is probably not desired. You should add a restriction (by a means I am not sure of) on where in your tree the pow is allowed (just before a leaf, or something like that).
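A common alternative to structural restrictions is a "protected" power primitive, in the spirit of GP's protected division, that falls back to a neutral value whenever the result would overflow or leave the reals. A minimal Python sketch (protected_pow is an illustrative name, not part of DEAP):

import math

def protected_pow(base, exponent):
    # Use abs() on the base so non-integer exponents stay real.
    try:
        result = math.pow(abs(base), exponent)
    except (OverflowError, ValueError):
        return 1.0  # neutral fallback on overflow or domain error
    if math.isinf(result) or math.isnan(result):
        return 1.0
    return result

pset.addPrimitive(protected_pow, 2)  # registered in place of math.pow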
OpenBeagle (in C++) also allows it, but you will need to develop your own primitive using pow from <math.h>; you can use the Sin or Cos primitive as an example.

If only some of the initial population are suffering from the overflow problem then just penalise them with a poor fitness score and they will probably be removed from the population within a few generations.
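As a sketch of that penalty approach, assuming a DEAP-style setup (toolbox, points and the squared-error fitness are illustrative, not from the question):

import math

def evaluate(individual, points):
    func = toolbox.compile(expr=individual)  # tree -> callable
    error = 0.0
    for x, y in points:
        try:
            pred = func(x)
        except OverflowError:
            return (float("inf"),)  # worst fitness: overflow
        if math.isinf(pred) or math.isnan(pred):
            return (float("inf"),)  # worst fitness: non-finite output
        error += (pred - y) ** 2
    return (error,)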
But, if the problem is that virtually all individuals suffer from this problem, then you will have to add some constraints. The simplest thing to do would be to constrain the exponent child of the power function to be a real literal - which would mean powers would not be allowed to be nested. It depends on whether this is sufficient for your needs though. There are a few ways to add constraints like these (or more complex ones) - try looking in to Constrained Syntactic Structures and grammar guided GP.
A few other simple thoughts: can you use a data type with a larger range? Also, you could reduce the maximum depth parameter, so that there is less room for nested exponents. Of course, that's only possible to an extent, and it depends on the complexity of the target function.

Integers have a different binary representation than reals, so you have to use a slightly different bitstring representation and recombination/mutation operator.
For an excellent demonstration, see slide 24 of www.cs.vu.nl/~gusz/ecbook/slides/Genetic_Algorithms.ppt or check out the Eiben/Smith book "Introduction to Evolutionary Computing". It describes how to map a bit string to a real number. You can then create a representation where x only lies within an interval [y,z]. In this case, choose y and z to be of smaller magnitude than the capacity of the data type you are using (e.g. ~10^308 for a double) so you don't run into the overflow issue you describe.
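A minimal sketch of that bit-string-to-real decoding in Python (bits_to_real is an illustrative name):

def bits_to_real(bits, low, high):
    # Interpret the bits as an unsigned integer, then scale into [low, high].
    as_int = int("".join(str(b) for b in bits), 2)
    max_int = 2 ** len(bits) - 1
    return low + (high - low) * as_int / max_int

# e.g. a 16-bit gene decoding into an exponent in [0.0, 10.0]
exponent = bits_to_real([1, 0, 1, 1] * 4, 0.0, 10.0)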

You have to consider that with real-valued exponents and a negative base you will not obtain a real number, but a complex one. For example, the Math.Pow implementation in .NET returns NaN if you attempt to raise a negative base to a non-integer exponent. You have to make sure all your x values are positive. I think that's the problem you're seeing when you "exceed double bounds".
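A quick illustration in Python, where math.pow raises an error rather than returning NaN:

import math

try:
    math.pow(-2.0, 3.5)       # negative base, non-integer exponent
except ValueError as err:
    print(err)                # "math domain error": no real result exists

math.pow(abs(-2.0), 3.5)      # taking abs() first keeps the result real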
By the way, you can try the HeuristicLab GP implementation. It is very flexible, with a configurable grammar.

Related

Principled reasoning about tolerances and formulas for comparing floating-point numbers?

The Python standard library contains the function math.isclose, which is equivalent to:
abs(a - b) <= max(rtol * max(abs(a), abs(b)), atol)
The Numpy library likewise contains numpy.isclose and numpy.allclose, which are equivalent to:
abs(a - b) <= (atol + rtol * abs(b))
Neither documentation page explains why you would want to use one of these formulas over the other, or provides any principled criteria for choosing sensible absolute and relative tolerances, written above as atol and rtol respectively.
I very often end up having to use these functions in tests for my code, but I never learned any principled basis for choosing between these two formulas, or for choosing tolerances that might be appropriate to my use case.
I usually just leave the default values as-is unless I happen to know that I'm doing something that could result in a loss of numerical precision, at which point I hand-tune the tolerances until the results seem right, largely based on gut feeling and checking examples by hand. This is tedious, imperfect, and seems antithetical to the purpose of software testing, particularly property-based testing.
For example, I might want to assert that two different implementations of the same algorithm produce "the same" result, acknowledging that an exact equality comparison doesn't make sense.
What are principled techniques that I can use for choosing a sensible formula and tolerances for comparing floating point numbers? For the sake of this question, I am happy to focus on the case of testing code that uses floating-point numbers.
Instead of a single true/false assessment of "the same" result, consider rating the algorithms' sameness on various properties.
If the assessments are within your tolerances/limits, the functions are the "same".
Given g(x) and r(x) (the reference function):
Absolute difference: Try y = abs(g(x) - r(x)) for various (if not all) x. What is the largest y?
Relative difference: Try y = abs((g(x) - r(x))/r(x)) for various normal r(x) (not zeroes). What is the largest y?
Relative difference near zero: Like above, but for r(x) with sub-normal results. Here the relative difference may be far larger than with normals, so it is handled separately. r(x) == +/-0.0 deserves special assessment.
Range test / edge cases: What is the largest/smallest x that "works"? e.g. y = my_exp(x) and exp(x) may return infinity or 0.0 at different x, but are otherwise nearly the "same".
Total ordering difference: (a favorite). Map all non-NaN floating point values from -inf to +inf to an integer in [-ORDER_N, ORDER_N] with a helper function total_order(), where total_order(+/-0.0) is 0. Find the maximum difference abs(total_order(g(x)) - total_order(r(x))) and use that metric to determine "same"-ness; see the sketch below.
Various functions deserve special handling. This field of study has many further considerations.
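To make the total-ordering idea concrete, here is a minimal Python sketch (total_order and ulp_distance are illustrative names). It maps the 64-bit pattern of a double to an integer that increases monotonically with the float's value:

import math
import struct

def total_order(x):
    # Reinterpret the IEEE-754 bit pattern as a signed 64-bit integer.
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    # Negative floats sort backwards as raw bits; flip them so the map is monotone.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

def ulp_distance(a, b):
    return abs(total_order(a) - total_order(b))

# Two adjacent doubles differ by exactly 1 in this metric (Python 3.9+):
ulp_distance(1.0, math.nextafter(1.0, 2.0))  # -> 1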
One question when using relative tolerance is - relative to what? If you want to know if 90 and 100 are "equal" with a 10% tolerance, you get different answers if you take 10% of 90 vs 10% of 100.
The standard library uses the larger of a or b when defining the "what" in that scenario, so it would use 10% of 100 as the tolerance. It also uses the larger of that relative tolerance or the absolute tolerance as the "ultimate" tolerance.
The numpy method simply uses b as the reference for the relative tolerance and takes the sum of the relative and absolute tolerances as the "ultimate" tolerance.
Which is better? Neither is better or worse; they are different ways of establishing a tolerance. You can choose which one to use based on how you want to define "close enough".
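To make the asymmetry concrete, here is the 90-vs-100 example in Python (numpy's atol defaults to 1e-8, so it is zeroed out here to isolate the relative part):

import math
import numpy as np

# stdlib: symmetric, the larger magnitude (100) is the reference
math.isclose(90.0, 100.0, rel_tol=0.10)        # True: 10 <= 0.10 * 100

# numpy: asymmetric, the second argument is the reference
np.isclose(90.0, 100.0, rtol=0.10, atol=0.0)   # True:  10 <= 0.10 * 100
np.isclose(100.0, 90.0, rtol=0.10, atol=0.0)   # False: 10 >  0.10 * 90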
The tolerances you choose are contextual as well - are you comparing lengths of lumber or the distance between circuit paths in a microprocessor? Is 1% tolerance "good enough" or do you need ultra-precise tolerance? A tolerance too low might yield too many "false positives" depending on the application, while too high a tolerance will yield too many "false negatives" that might let some problems "slip through the cracks".
Note that the standard function is not vectorized, so if you want to use it on arrays you'll either have to use the numpy function or build a vectorized version of the standard one.
Nobody can choose the tolerances for you; they are problem dependent. In real life, the input data you work on has (very) limited accuracy, be it the result of experimental measurement or of numerical computation that introduces truncation errors. So you need to know your data and understand the concepts and methods of error calculus to adjust the tolerances.
As regards the formulas, they were designed to be general-purpose, i.e. not knowing whether the quantities to be compared can be strictly equal or not (when they are strictly equal, the relative error does not work). Again, this should not be a blind choice.

Why do kotlin.math functions not have implementations for Long?

I have been working with Kotlin for a little over 2 years now.
Looking over what I learned in these 2 years, I noticed that I have been using (num.toDouble()).toLong() with kotlin.math functions a bit too much. For example, Math.sqrt(num.toDouble()).toLong(). Two of my projects have an extension function sumByLong() inside a util file created by the team, because the Kotlin libs only have sumBy: Int and sumByDouble: Double, and a lot of the work in the project uses Long.
In short, mathematical operations using Long are more common than those using Double or Float, yet Long has a very small footprint in the Kotlin standard library. And since kotlin.math is different from java.lang.Math, mixed usage is not a recommended practice.
Going over the docs of kotlin.math, all functions except abs, min and max have implementations for Float and Double only.
Can someone explain, like I am 5, the possible reasoning behind this? Something real, not silly stuff like "the devs were lazy" or "more code means more work", which is all I could find in search engine results.
--
Update: Some clarification
1. I can understand that in most cases the return types will contain floating-point numbers. I am also talking about parameters lacking a Long counterpart. Maybe Math.sqrt wasn't the best example; something like math.log or math.cos would be better, where a floating-point return type is expected but the parameters don't even support Int.
2. When I said "Long is more common than using Double", I was not talking about the public at large, but looking over my past two years working with Kotlin. I am sorry if my phrasing wasn't clear.
Disclaimer: this answer may be a little opinionated, but I believe it is according to general consensus and best practices of using maths in computer science.
Mathematics for integers and for real numbers (floats) are really two very different math "sub-worlds". They're pretty separate, they have different uses, and we usually don't mix them.
If we work on some physics, do real-world simulations, or operate on units like temperature or speed, we use doubles. If we have identifiers (a bank account number), we count something (the number of bank accounts), or we operate on discrete values with 100% precision (a bank account balance), we always use integers and never doubles.
Operations like sine, square root or logarithm make perfect sense for physics, but not really for bank account values. They very often produce either very small or very large numbers that can't be safely represented as integers. They operate on approximations and don't really provide 100% precise results. They are continuous by nature, while integers are discrete.
What is the point of using integers with sqrt() or log() if they almost always return a floating-point result? What is the point of passing an integer to sin() if, e.g., there are only 2 distinct angles smaller than a right angle that can be represented as an integer: 0 and 1? Using integers with these functions is unnatural and impractical.
I can't think of a case where we have to frequently convert between longs and doubles. Usually, we operate either on longs or on doubles from top to bottom, and we don't convert between them too often. By converting we lose the advantages of each of these specific "math sub-worlds" and sum their disadvantages. Maybe you should just keep using doubles in your application and not convert to/from longs? Why do you use longs?
BTW, you mentioned that you can't/shouldn't use java.lang.Math in the Kotlin application. Well, if you look into java.lang.Math you will notice that... it supports only doubles :-)
In the case of ceil, it returns a Double because a Double has a bigger range of values than Long. Consider, for example:
ceil(Long.MAX_VALUE.toDouble() * 1000)
What would you expect it to return if it returned a Long? The true result (about 9.2e21) is a thousand times larger than Long.MAX_VALUE (about 9.2e18), so it cannot be represented. For further discussion, see Why does Math.ceil return a double?
In the case of log and trigonometric functions, the use cases requiring Long parameters are rare and the requirements varied. For example, should it round up, down, or to the nearest integral value? These are decisions that should be made for your particular project, and therefore can't be made in the stdlib.
In your project, you can simply define your required functions in a single, small source file, making your project's choice of rounding method, and then use them everywhere instead of converting at each call site, e.g. (with kotlin.math.cos and kotlin.math.roundToLong imported):
fun cos(n: Long): Long = cos(n.toDouble()).roundToLong()

Math.Net Complex32 Division giving NaN

Alright, so I have two large complex values, Top and Bottom:
Top = 4.0107e+030
Bot = 5.46725E26 -2.806428e26i
When I divide these two numbers using Math.Net's Complex32, it gives me NaN for both the real and imaginary parts. I am assuming that it has something to do with the precision.
When I use Matlab I get the following:
Top/Bot = 5.8060e+003 +2.9803e+003i
When I use System.Numerics I get something very close to Matlab's result, at least of the correct order of magnitude:
Top/Bot = +5575.19343780947 +2676.09270239214i System.Numerics.Complex
I wonder, which one is the right one? And why is Math.Net giving me a wrong answer?
I am running simulations and I very much care about the accuracy of the numerics.
Any way to fix this? I will be dealing with a lot of large complex numbers.
Plus, if anyone knows of a good complex library for .NET with support for special functions, such as the complementary error function and the error function of complex parameters, that would be great.
I found out that Math.Net doesn't support cerf for a Complex32.
If you care about accuracy you should obviously use the double precision/64-bit type, not the single precision/32-bit one. Note that we only provide a Complex32 but no Complex (64) type in the normal package because we want you to use the Complex type provided in System.Numerics for compatibility; we only provide an equivalent Complex (64) type in the portable build, as System.Numerics is not available there.
But in this specific case, it is not a problem of precision (or accuracy), but of range. Remember that 32-bit floating-point numbers cannot be larger than ~3.4e+38. Computing a complex division in the normal direct form requires squaring both the real and imaginary components of the denominator, which in your case get out of range, become "infinity", and thus produce NaN in the final result.
Now, it might be possible to implement the division in a form that avoids computing these squares when the denominator is larger than about 1e+19, but we have not done that yet in Math.NET Numerics (as there was no demand for it up to now). This would also not be a problem if the complex type were implemented in polar form, but that is quite uncommon.
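For reference, the overflow-avoiding form alluded to above is usually attributed to Smith (1962). A Python sketch of the idea, working on real/imaginary parts (smith_divide is an illustrative name, not a Math.NET API):

def smith_divide(ar, ai, br, bi):
    # (ar + ai*i) / (br + bi*i) without squaring the raw denominator parts.
    if abs(br) >= abs(bi):
        r = bi / br                 # |r| <= 1, so intermediates stay in range
        denom = br + bi * r
        return (ar + ai * r) / denom, (ai - ar * r) / denom
    else:
        r = br / bi
        denom = bi + br * r
        return (ar * r + ai) / denom, (ai * r - ar) / denom

# The question's numbers: the naive form squares 5.47e26, which overflows
# single precision (~3.4e38); this form yields roughly (5806, 2980).
smith_divide(4.0107e30, 0.0, 5.46725e26, -2.806428e26)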

Why do SSE integer averaging instructions (PAVGB/PAVGW) add 1 to temporary sum before calculating final result?

I have been working on SSE optimization for a video processing algorithm recently. I need to write exactly the same algorithm in C code to cross-check the correctness of the SSE implementation. I forgot about this rounding behavior several times, which made the results of the two implementations differ.
I can modify the C implementation to make them match, since this difference doesn't matter. But why are these instructions designed like this? Is there any mathematical reason behind it?
The Intel instruction set reference only mentions this behavior and doesn't explain why. I also tried googling, but couldn't find anything about it.
UPDATE:
Thanks to Paul's answer, I realize now that it is a rounding/truncation problem. But since both operands are integers, the only possible fraction is 0.5, and it has two "nearest integers". AFAIK there are several rounding methods for this situation. Why do the instructions use rounding up specifically? Do most related applications need rounding up?
It's to give correct rounding, i.e. round to nearest rather than truncation. In general when you divide by N with integer values you need to do this to get correct rounding:
y = (x + N / 2) / N;
If you just do:
y = x / N;
then you will get a truncated (round to zero) result.
Round to nearest is generally preferred for image processing and DSP type applications.
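A quick scalar model of the instruction in Python (pavgb is just an illustrative name, not the intrinsic):

def pavgb(x, y):
    return (x + y + 1) >> 1   # average, rounding 0.5 up

pavgb(4, 5)    # -> 5: true average 4.5 rounds to nearest (up)
(4 + 5) >> 1   # -> 4: without the +1, the result truncates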

How do I perform binary addition on a mod type in Ada?

Very specific issue here… and no, this isn't homework (left that far… far behind). Basically, I need to compute a checksum for code being written to an EPROM, and I'd like to write this function in an Ada program to practice my bit manipulation in the language.
A section of a firmware data file for an EPROM is being changed by me, and that change requires a new valid checksum at the end so that the resulting system will accept the changed code. This checksum starts out by doing a modulo-256 binary sum of all the data it covers, and then other higher-level operations are done to get the final checksum, which I won't go into here.
So now how do I do binary addition on a mod type?
I assumed that if I used the "+" operator on a mod type it would be summed like an integer operation… a result I don't want. I'm really stumped on this one. I don't want to resort to a packed array and perform the bit carry myself if I don't have to, especially if that's considered "old hat". References I'm reading claim you need to use mod types to ensure more portable code when dealing with binary operations, and I'd like to try that if possible. I'm targeting multiple platforms with this program, so portability is what I'm looking for.
Can anyone suggest how I might perform binary addition on a mod type?
Any starting places in the language would be of great help.
Just use a modular type, for which the operators do unsigned arithmetic.
type Word is mod 2 ** 16; for Word'Size use 16;
Addendum: For modular types, the predefined logical operators operate on a bit-by-bit basis. Moreover, "the binary adding operators + and – on modular types include a final reduction modulo the modulus if the result is outside the base range of the type." The function Update_Crc is an example.
Addendum: §3.5.4 Integer Types, ¶19 notes that for modular types, the results of the predefined operators are reduced modulo the modulus, including the binary adding operators + and –. Also, the shift functions in §B.2 The Package Interfaces are available for modular types. Taken together, the arithmetical, logical and shift capabilities are sufficient for most bitwise operations.
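Outside Ada, the same wrap-around behavior can be modelled to cross-check the checksum; the masking step below is exactly the reduction a mod 256 type performs automatically (a Python sketch, checksum_mod256 being an illustrative name):

def checksum_mod256(data):
    total = 0
    for byte in data:
        total = (total + byte) & 0xFF   # reduce modulo 256 after each add
    return total

checksum_mod256([0x12, 0xF0, 0x55])  # -> 0x57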