How do I perform binary addition on a mod type in Ada? - cross-platform

Very specific issue here…and no this isn’t homework (left that far…far behind). Basically I need to compute a checksum for code being written to an EPROM and I’d like to write this function in an Ada program to practice my bit manipulation in the language.
I'm changing a section of a firmware data file for an EPROM, and that change requires a new valid checksum at the end so the resulting system will accept the changed code. The checksum starts with a modulo-256 binary sum of all the data it covers; other higher-level operations are then applied to get the final checksum, which I won't go into here.
So now how do I do binary addition on a mod type?
I assumed that if I used the “+” operator on a mod type, the values would be summed like an ordinary integer operation…a result I don’t want. I’m really stumped on this one. I don’t really want to use a packed array and perform the bit carries myself if I don’t have to, especially if that’s considered “old hat”. References I’m reading claim you need to use mod types to ensure more portable code when dealing with binary operations, and I’d like to try that if it’s possible. I’m targeting multiple platforms with this program, so portability is what I’m looking for.
Can anyone suggest how I might perform binary addition on a mod type?
Any starting places in the language would be of great help.

Just use a modular type, for which the operators do unsigned arithmetic.
type Word is mod 2 ** 16;
for Word'Size use 16;
Addendum: For modular types, the predefined logical operators operate on a bit-by-bit basis. Moreover, "the binary adding operators + and – on modular types include a final reduction modulo the modulus if the result is outside the base range of the type." The function Update_Crc is an example.
Addendum: §3.5.4 Integer Types, ¶19 notes that for modular types, the results of the predefined operators are reduced modulo the modulus, including the binary adding operators + and –. Also, the shift functions in §B.2 The Package Interfaces are available for modular types. Taken together, the arithmetical, logical and shift capabilities are sufficient for most bitwise operations.
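For contrast, here is a rough sketch (in Java, not Ada) of the same modulo-256 running sum; Java has no modular types, so the wrap-around has to be written explicitly with a mask, whereas an Ada type declared as mod 256 gives you exactly this behaviour from the predefined "+" operator. The byte values below are made up for illustration.
static int mod256Sum(byte[] data) {
    int sum = 0;
    for (byte b : data) {
        sum = (sum + (b & 0xFF)) & 0xFF;   // explicit reduction modulo 256 after each add
    }
    return sum;
}
// Example: 0xFF + 0x01 wraps to 0x00, then + 0x7F gives 0x7F (127)
// int checksum = mod256Sum(new byte[] { (byte) 0xFF, 0x01, 0x7F });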


Why do kotlin.math functions not have implementations for Long?

I have been working with Kotlin for a little over 2 years now.
Looking over what I learned in these 2 years, I noticed that I have been using (num.toDouble()).toLong() with kotlin.math functions a bit too much. For example, Math.sqrt(num.toDouble()).toLong(). Two of my projects have an extension function sumByLong() inside a util file created by the team, because the Kotlin libs only have sumBy: Int and sumByDouble: Double, and a lot of the work in those projects uses Long.
In short, mathematical operations using Long are more common for me than operations using Double or Float, yet Long has a very small footprint in the Kotlin standard library. And since kotlin.math is different from java.lang.Math, mixed usage is not a recommended practice.
Going over the docs of kotlin.math, all functions except abs, min and max have implementations only for Float and Double.
Can someone explain, like I am 5, the possible reasoning behind this? Something real, not silly stuff like "devs were lazy" or "more code means more work", which is all I could find in search engine results.
--
Update: Some Clarification
1. I can understand that in most cases the return types will contain floating-point numbers. I am also talking about parameters lacking a Long counterpart. Maybe using Math.sqrt wasn't the best example; something like math.log, math.cos, etc. would be a better example, where a floating-point return type is expected but the parameters don't even support Int.
2. When I said "Long is more common than using Double", I was not talking about the public at large, but was looking over my past two years working with Kotlin. I am sorry if my phrasing wasn't clear.
Disclaimer: this answer may be a little opinionated, but I believe it is according to general consensus and best practices of using maths in computer science.
Mathematics for integers and for real numbers (floats) are really two quite different math "sub-worlds". They're pretty separate, they have different uses and we usually don't mix them.
If we work on some physics, do real-world simulations, or operate on units like temperature or speed, we use doubles. If we have identifiers (a bank account number), we count something (the number of bank accounts) or we operate on discrete values with 100% precision (a bank account value), we always use integers and never doubles.
Operations like sine, square root or logarithm make perfect sense for physics, but not really for bank account values. They very often produce either very small or very large numbers that can't be safely represented as integers. They operate on approximations and don't really provide 100% precise results. They are continuous by nature while integers are discrete.
What is the point of using integers with sqrt() or log() if they almost always return a floating-point result? What is the point of passing an integer to sin() if, for example, there are only 2 distinct angles smaller than a right angle that can be represented as an integer: 0 and 1? Using integers with these functions is unnatural and impractical.
I can't think of a case where we have to frequently convert between longs and doubles. Usually, we operate either on longs or on doubles top to bottom and we don't convert between them too often. By converting we lose the advantages of these specific "math sub-worlds" and we sum their disadvantages. Maybe you should just keep using doubles in your application and not convert to/from longs? Why do you use longs?
BTW, you mentioned that you can't/shouldn't use java.lang.Math in the Kotlin application. Well, if you look into java.lang.Math you will notice that... it supports only doubles :-)
In the case of ceil, it returns a Double because a Double has a bigger range of values than Long. Consider, for example:
ceil(Long.MAX_VALUE.toDouble() * 1000)
What would you expect it to return if it returned a Long? For further discussion, see Why does Math.ceil return a double?
In the case of log and trigonometric functions, the use cases requiring Long parameters are rare and the requirements varied. For example, should it round up, down, or to the nearest integral value? These are decisions that should be made for your particular project, and therefore can't be made in the stdlib.
In your project, you can simply define your required functions in a single, small source file, making your project's choice of rounding method, and then use it everywhere instead of converting at each call site, e.g.:
fun cos(n: Long): Long = cos(n.toDouble()).roundToLong()  // assumes import kotlin.math.*

Choosing serialization frameworks

I was reading about the downsides of using Java serialization and the necessity to use a serialization framework instead. There are so many frameworks, like Avro, Parquet, Thrift and Protobuf.
The question is: what does each framework address, and what are the parameters to consider when choosing a serialization framework?
I would like to get hands-on with a practical use case and compare/choose serialization frameworks based on the requirements.
Can somebody please assist on this topic?
There are a lot of factors to consider. I'll go through some of the important ones.
0) Schema first or Code First
If you have a project that'll involve different languages, code-first approaches are likely to be problematic. It's all very well having a Java class that can be serialised, but it might be a nuisance if it has to be deserialised in C.
Generally I favour schema-first approaches, just in case.
1) Inter-object Demarcation
Some serialisations produce a byte stream that makes it possible to see where one object stops and another begins. Others do not.
So, if you have a message transport / data store that will separate out batches of bytes for you, e.g. ZeroMQ, or a database field, then you can use a serialisation that doesn't demarcate messages. Examples include Google Protocol Buffers. With demarcation done by the transport / store, the reader can get a batch of bytes knowing for sure that it encompasses one object, and only one object.
If your message transport / data store doesn't demarcate between batches of bytes, e.g. a network stream or a file, then either you invent your own demarcation markers, or you use a serialisation that demarcates for you. Examples include ASN.1 BER and XML.
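For the "invent your own demarcation markers" route, a minimal sketch (in Java, with hypothetical names) of length-prefix framing over a plain stream: each serialised payload, say a Protocol Buffers message, is written with a 4-byte length prefix so the reader knows exactly where one object ends and the next begins.
import java.io.*;

class LengthPrefixFraming {
    static void writeFrame(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);   // the demarcation marker: a 4-byte length prefix
        out.write(payload);
    }

    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();      // read the prefix first...
        byte[] payload = new byte[length];
        in.readFully(payload);          // ...then exactly that many bytes
        return payload;
    }
}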
2) Canonical
This is a property of a serialisation which means that the serialised data describes its own structure. In principle the reader of a canonical message doesn't have to know up front what the message structure is; it can simply work that out as it reads the bytes (even if it doesn't know the field names). This can be useful in circumstances where you're not entirely sure where the data is coming from. If the data is not canonical, the reader has to know in advance what the object structure was, otherwise the deserialisation is ambiguous.
Examples of canonical serialisations include ASN.1 BER, ASN.1 canonical PER and XML. Ones that aren't include ASN.1 uPER and possibly Google Protocol Buffers (I may have that wrong).
AVRO does something different - the data schema is itself part of the serialised data, so it is always possible to reconstruct the object from arbitrary data. As you can imagine the libraries for this are somewhat clunky in languages like C, but rather better in dynamic languages.
3) Size and Value Constraints
Some serialisation technologies allow the developer to set constraints on the values of fields and the sizes of arrays. The intention is that code generated from a schema file incorporating such constraints will automatically validate objects on serialisation and on deserialisation.
This can be extremely useful - free, schema-driven content inspection done automatically. It's very easy to spot out-of-specification data.
This is extremely useful in large, heterogeneous projects (lots of different languages in use), as the single source of truth about what's valid and what's not comes from the schema, and only the schema, and is enforced automatically by the auto-generated code. Developers can't ignore / get round the constraints, and when the constraints change everyone can't help but notice.
Examples include ASN.1 (usually done pretty well by tool sets), XML (not often done properly by free / cheap toolsets; MS's xsd.exe purposefully ignores any such constraints) and JSON (down to object validators). Of these three, ASN.1 has by far the most elaborate constraints syntax; it's really very powerful.
An example that doesn't: Google Protocol Buffers. In this regard GPB is extremely irritating, because it doesn't have constraints at all. The only way of having value and size constraints is either to write them as comments in the .proto file and hope developers read them and pay attention, or some other sort of non-source-code approach. With GPB being aimed very heavily at heterogeneous systems (literally every language under the sun is supported), I consider this a very serious omission, because value / size validation code has to be written by hand for each language used in a project. That's a waste of time. Google could add syntactical elements to .proto and to the code generators to support this without changing wire formats at all (it's all in the auto-generated code).
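To illustrate the point (hypothetical message and field names, not real GPB API beyond ordinary generated getters): this is the kind of validation that has to be hand-written in every language a GPB project uses, because the .proto file cannot express the constraints itself.
// SensorReport stands in for a generated GPB class with the usual getters.
static void validate(SensorReport msg) {
    if (msg.getTemperature() < 0 || msg.getTemperature() > 100) {
        throw new IllegalArgumentException("temperature out of range 0..100");
    }
    if (msg.getReadingsCount() > 16) {
        throw new IllegalArgumentException("readings: at most 16 entries allowed");
    }
}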
4) Binary / Text
Binary serialisations will be smaller, and probably a bit quicker to serialise / deserialise. Text serialisations are more debuggable. But it's amazing what can be done with binary serialisations. For example, one can easily add ASN.1 decoders to Wireshark (you compile them up from your .asn schema file using your ASN.1 tools), et voila - on the wire decoding of programme data. The same is possible with GPB I should think.
ASN.1 uPER is extremely useful in bandwidth-constrained situations; it automatically uses the size / value constraints to economise on bits on the wire. For example, a field valid between 0 and 15 needs only 4 bits, and that's what uPER will use. It's no coincidence, I should think, that uPER features heavily in protocols like 3G, 4G and 5G. This "minimum bits" approach is a whole lot more elegant than compressing a text wire format (which is what's done a lot with JSON and XML to make them less bloaty).
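A rough, hand-rolled illustration (in Java) of the "minimum bits" idea: a field constrained to 0..15 needs only 4 bits, so two such fields fit in a single byte. Real uPER encoders are generated from the ASN.1 schema; this sketch only shows the bit-packing principle.
static byte packTwoNibbles(int a, int b) {          // both assumed to be constrained to 0..15
    return (byte) ((a << 4) | (b & 0x0F));          // 4 bits each, 8 bits total
}

static int[] unpackTwoNibbles(byte packed) {
    return new int[] { (packed >> 4) & 0x0F, packed & 0x0F };
}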
5) Values
This is a bit of an oddity. In ASN.1 a schema file can define both the structure of objects and also the values of objects. With the better tools you end up with (in your C++, Java, etc. source code) classes, plus pre-defined objects of those classes already filled in with values.
Why is that useful? Well, I use it a lot for defining project constants, and to give access to the limits on constraints. For example, suppose you'd got an array field with a valid length of 16 in a message. You could have a literal 16 in the field constraint, or you could cite the value of an integer value object in the constraint, with that integer also being available to developers.
--ASN.1 value that, in good tools, is built into the
--generated source code
arraySize INTEGER ::= 16
--A SET that has an array of integers that size
MyMessage ::= SET
{
field [0] SEQUENCE (SIZE(arraySize)) OF INTEGER
}
This is really handy in circumstances where you want to loop over that constraint, because the loop can be
for (int i = 0; i < arraySize; i++) { /* do things with MyMessage.field[i] */ } // arraySize is an integer in the auto-generated code built from the .asn schema file
Clearly this is fantastic if the constraint ever needs to be changed, because the only place it has to be changed is in the schema, followed by a project recompile (where every place it's used will pick up the new value). Better still, if it's renamed in the schema file, a recompile identifies everywhere in the project it was used (because the developer-written source code that uses it is still using the old name, which is now an undefined symbol --> compiler errors).
ASN.1 constraints can get very elaborate. Here's a tiny taste of what can be done. This is fantastic for system developers, but is pretty complicated for the tool developers to implement.
arraySize INTEGER ::= 16
minSize INTEGER ::= 4
maxVal INTEGER ::= 31
minVal INTEGER ::= 16
oddVal INTEGER ::= 63
MyMessage2 ::= SET
{
field_1 [0] SEQUENCE (SIZE(arraySize)) OF INTEGER, -- 16 elements
field_2 [1] SEQUENCE (SIZE(0..arraySize)) OF INTEGER, -- 0 to 16 elements
field_3 [2] SEQUENCE (SIZE(minSize..arraySize)) OF INTEGER, -- 4 to 16 elements
field_4 [3] SEQUENCE (SIZE(minSize<..<arraySize)) OF INTEGER, -- 5 to 15 elements
field_5 [4] SEQUENCE (SIZE(minSize<..<arraySize)) OF INTEGER(0..maxVal), -- 5 to 15 elements valued 0..31
field_6 [5] SEQUENCE (SIZE(minSize<..<arraySize)) OF INTEGER(minVal..maxVal), -- 5 to 15 elements valued 16..31
field_7 [6] SEQUENCE (SIZE(minSize<..<arraySize)) OF INTEGER(minVal<..maxVal), -- 5 to 15 elements valued 17..31
field_8 [7] SEQUENCE (SIZE(arraySize)) OF INTEGER(minVal<..<maxVal), -- 16 elements valued 17..30
field_9 [8] INTEGER (minVal..maxVal | oddVal), -- valued 16 to 31, and also 63
f8_indx [10] INTEGER (0..<arraySize) -- index into field 8, constrained to be within the bounds of field 8
}
So far as I know, only ASN.1 does this, and then it's only the more expensive tools that actually pick these elements up out of a schema file. This makes it tremendously useful in a large project, because literally everything related to the data, its constraints and how to handle it is defined in the .asn schema only, and nowhere else.
As I said, I use this a lot, for the right type of project. Once one has got it pervading an entire project, the amount of time and risk saved is fantastic. It changes the dynamics of a project too; one can make late changes to a schema knowing that the entire project picks those up with nothing more than a recompile. So, protocol changes late in a project go from being high risk to something you might be content to do every day.
6) Wireformat Object Type
Some serialisation wireformats will identify the type of an object in the wireformat byte stream. This helps the reader in situations where objects of many different types may arrive from one or more sources. Other serialisations won't.
ASN.1 varies from wireformat to wireformat (it has several, including a few binary ones as well as XML and JSON). ASN.1 BER uses type, value and length fields in its wireformat, so it is possible for the reader to peek at an object's tag up front and decode the byte stream accordingly. This is very useful.
Google Protocol Buffers doesn't quite do the same thing, but if all message types in a .proto are bundled up into one final oneof, and it's only that that's ever serialised, then you can achieve the same thing.
7) Tools cost.
ASN.1 tools range from very, very expensive (and really good), to free (and less good). A lot of others are free, though I've found that the best XML tools (paying proper attention to value / size constraints) are quite expensive too.
8) Language Coverage
If you've heard of it, it's likely covered by tools for lots of different languages. If not, less so.
The good commercial ASN.1 tools cover C/C++/Java/C#. There are some free C/C++ ones of varying completeness.
9) Quality
It's no good picking up a serialisation technology if the quality of the tools is poor.
In my experience, GPB is good (it generally does what it says it will). The commercial ASN.1 tools are very good, eclipsing GPB's toolset comprehensively. AVRO works. I've heard of some occasional problems with Cap'n Proto, but having not used it myself you'd have to check that out. XML works with good tools.
10) Summary
In case you can't tell, I'm quite a fan of ASN.1.
GPB is incredibly useful too for its widespread support and familiarity, but I do wish Google would add value / size constraints to fields and arrays, and also incorporate a value notation. If they did this, it'd be possible to have the same project workflow as can be achieved with ASN.1. With just these two features added, I'd consider GPB to be pretty much "complete", needing only an equivalent of ASN.1's uPER to finish it off for those with little storage space or bandwidth.
Note that quite a lot of it is focused on what a project's circumstances are, as well as how good / fast / mature the technology actually is.

Why does the Java API use int instead of short or byte?

Why does the Java API use int, when short or even byte would be sufficient?
Example: The DAY_OF_WEEK field in class Calendar uses int.
If the difference is too minimal, then why do those datatypes (short, int) exist at all?
Some of the reasons have already been pointed out. For example, the fact that "...(Almost) All operations on byte, short will promote these primitives to int". However, the obvious next question would be: WHY are these types promoted to int?
So to go one level deeper: The answer may simply be related to the Java Virtual Machine Instruction Set. As summarized in the Table in the Java Virtual Machine Specification, all integral arithmetic operations, like adding, dividing and others, are only available for the type int and the type long, and not for the smaller types.
(An aside: The smaller types (byte and short) are basically only intended for arrays. An array like new byte[1000] will take 1000 bytes, and an array like new int[1000] will take 4000 bytes)
Now, of course, one could say that "...the obvious next question would be: WHY are these instructions only offered for int (and long)?".
One reason is mentioned in the JVM Spec mentioned above:
If each typed instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than could be represented in a byte
Additionally, the Java Virtual Machine can be considered an abstraction of a real processor. And introducing a dedicated arithmetic logic unit for smaller types would not be worth the effort: it would need additional transistors, yet it could still only execute one addition per clock cycle. The dominant architecture when the JVM was designed was 32 bits, just right for a 32-bit int. (Operations that involve a 64-bit long value are implemented as a special case.)
(Note: The last paragraph is a bit oversimplified, considering possible vectorization etc., but should give the basic idea without diving too deep into processor design topics)
EDIT: A short addendum, focussing on the example from the question, but in a more general sense: one could also ask whether it would be beneficial to store fields using the smaller types. For example, one might think that memory could be saved by storing Calendar.DAY_OF_WEEK as a byte. But here the Java class file format comes into play: all the fields in a class file occupy at least one "slot", which has the size of one int (32 bits). (The "wide" fields, double and long, occupy two slots.) So explicitly declaring a field as short or byte would not save any memory either.
(Almost) All operations on byte, short will promote them to int, for example, you cannot write:
short x = 1;
short y = 2;
short z = x + y; //error
Arithmetic is easier and more straightforward when using int; there is no need to cast.
In terms of space, it makes very little difference. byte and short would complicate things, and I don't think this micro-optimization is worth it, since we are talking about a fixed number of variables.
byte is relevant and useful when you program for embedded devices or deal with files/networks. Also, these primitives are limited; what if the calculations exceed their limits in the future? Try to think about an extension of the Calendar class that might need bigger numbers.
Also note that on 64-bit processors, locals will be kept in registers and won't use any resources, so using int, short or other primitives won't make any difference at all. Moreover, many Java implementations align variables* (and objects).
* byte and short occupy the same space as int if they are local variables, class variables or even instance variables. Why? Because in (most) computer systems, variable addresses are aligned, so for example if you use a single byte, you'll actually end up with two bytes - one for the variable itself and another for the padding.
On the other hand, in arrays, bytes take 1 byte, shorts take 2 bytes and ints take 4 bytes, because in arrays only the start and maybe the end have to be aligned. This will make a difference if you want to use, for example, System.arraycopy(); then you'll really notice a performance difference.
Because arithmetic operations are easier when using integers compared to shorts. Assume that the constants were indeed modeled by short values. Then you would have to use the API in this manner:
short month = Calendar.JUNE;
month = (short) (month + 1); // is July
Notice the explicit casting. short values are implicitly promoted to int values when they are used in arithmetic operations. (On the operand stack, shorts are even expressed as ints.) This would be quite cumbersome to use, which is why int values are often preferred for constants.
Compared to that, the gain in storage efficiency is minimal, because there only exists a fixed number of such constants. We are talking about 40 constants. Changing their storage from int to short would save you 40 * 16 bits = 80 bytes. See this answer for further reference.
The design complexity of a virtual machine is a function of how many kinds of operations it can perform. It's easier to have four implementations of an instruction like "multiply" - one each for 32-bit integer, 64-bit integer, 32-bit floating-point, and 64-bit floating-point - than to have, in addition to the above, versions for the smaller numerical types as well. A more interesting design question is why there should be four types, rather than fewer (performing all integer computations with 64-bit integers and/or doing all floating-point computations with 64-bit floating-point values). The reason for using 32-bit integers is that Java was expected to run on many platforms where 32-bit types could be acted upon just as quickly as 16-bit or 8-bit types, but operations on 64-bit types would be noticeably slower. Even on platforms where 16-bit types would be faster to work with, the extra cost of working with 32-bit quantities would be offset by the simplicity afforded by only having 32-bit types.
As for performing floating-point computations on 32-bit values, the advantages are a bit less clear. There are some platforms where a computation like float a=b+c+d; could be performed most quickly by converting all operands to a higher-precision type, adding them, and then converting the result back to a 32-bit floating-point number for storage. There are other platforms where it would be more efficient to perform all computations using 32-bit floating-point values. The creators of Java decided that all platforms should be required to do things the same way, and that they should favor the hardware platforms for which 32-bit floating-point computations are faster than longer ones, even though this severely degraded both the speed and the precision of floating-point math on a typical PC, as well as on many machines without floating-point units. Note, btw, that depending upon the values of b, c, and d, using higher-precision intermediate computations when computing expressions like the aforementioned float a=b+c+d; will sometimes yield results which are significantly more accurate than would be achieved if all intermediate operands were computed at float precision, but will sometimes yield a value which is a tiny bit less accurate. In any case, Sun decided everything should be done the same way, and they opted for using minimal-precision float values.
Note that the primary advantages of smaller data types become apparent when large numbers of them are stored together in an array; even if there were no advantage to having individual variables of types smaller than 64 bits, it's worthwhile to have arrays which can store smaller values more compactly; having a local variable be a byte rather than a long saves seven bytes; having an array of 1,000,000 numbers hold each number as a byte rather than a long saves 7,000,000 bytes. Since each array type only needs to support a few operations (most notably read one item, store one item, copy a range of items within an array, or copy a range of items from one array to another), the added complexity of having more array types is not as severe as the complexity of having more types of directly-usable discrete numerical values.
If you used the philosophy where integral constants are stored in the smallest type they fit in, then Java would have a serious problem: whenever programmers write code using integral constants, they have to pay careful attention to check whether the type of the constants matters, and if so look up the type in the documentation and/or do whatever type conversions are needed.
So now that we've outlined a serious problem, what benefits could you hope to achieve with that philosophy? I would be unsurprised if the only runtime-observable effect of that change would be what type you get when you look the constant up via reflection. (and, of course, whatever errors are introduced by lazy/unwitting programmers not correctly accounting for the types of the constants)
Weighing the pros and the cons is very easy: it's a bad philosophy.
Actually, there'd be a small advantage. If you have a
class MyTimeAndDayOfWeek {
byte dayOfWeek;
byte hour;
byte minute;
byte second;
}
then on a typical JVM it needs as much space as a class containing a single int. The memory consumption gets rounded up to the next multiple of 8 or 16 bytes (IIRC, that's configurable), so the cases where there are real savings are rather rare.
This class would be slightly easier to use if the corresponding Calendar methods returned a byte. But there are no such Calendar methods, only get(int), which must return an int because of other fields. Each operation on smaller types promotes to int, so you need a lot of casting.
Most probably, you'll either give up and switch to an int or write setters like
void setDayOfWeek(int dayOfWeek) {
this.dayOfWeek = checkedCastToByte(dayOfWeek);
}
Then the type of DAY_OF_WEEK doesn't matter, anyway.
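checkedCastToByte above is not a JDK method; a minimal sketch of such a helper might look like this:
static byte checkedCastToByte(int value) {
    if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
        throw new IllegalArgumentException("value out of byte range: " + value);
    }
    return (byte) value;
}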
Using variables smaller than the bus size of the CPU means more cycles are necessary. For example when updating a single byte in memory, a 64-bit CPU needs to read a whole 64-bit word, modify only the changed part, then write back the result.
Also, using a smaller data type incurs overhead when the variable is stored in a register, since the behavior of the smaller data type has to be accounted for explicitly. Since the whole register is used anyway, there is nothing to be gained by using a smaller data type for method parameters and local variables.
Nevertheless, these data types might be useful for representing data structures that require specific widths, such as network packets, or for saving space in large arrays, sacrificing speed.

Math.Net Complex32 Division giving NaN

Alright, so I have two large complex values, Top and Bot:
Top = 4.0107e+030
Bot = 5.46725E26 -2.806428e26i
When I divide these two numbers using Math.NET's Complex32, it gives me NaN for both the real and imaginary parts. I am assuming that it has something to do with the precision.
When I use MATLAB I get the following:
Top/Bot = 5.8060e+003 +2.9803e+003i
When I use System.Numerics I get something very close to MATLAB's, at least in the correct order of magnitude:
Top/Bot = +5575.19343780947 +2676.09270239214i System.Numerics.Complex
I wonder, which one is the right one? And why is Math.NET giving me a wrong answer?
I am running simulations and I very much care about the accuracy of the numerics.
Any way to fix this? I will be dealing with a lot of large complex numbers.
Plus, if anyone knows of a good complex library for .NET with support for special functions such as the complementary error function and the error function of complex parameters, that would be great.
As I found out, Math.NET doesn't support cerf of a Complex32.
If you care about accuracy you should obviously use the double precision/64 bit type, not the single precision/32 bit one. Note that we only provide a Complex32 but no Complex (64) type in the normal package because we want you to use the Complex type provided in System.Numerics for compatibility - we only provide an equivalent Complex (64) type in the portable build as System.Numerics is not available there.
But in this specific case, this is not a problem of precision (or accuracy), but of range. Remember that 32-bit floating-point numbers cannot be larger than ~3.4e+38. Computing a complex division in the normal direct form requires computing the squares of both the real and imaginary components of the denominator, which in your case will go out of range and become "infinity", and thus NaN in the final result.
Now, it might be possible to implement the division in a form that avoids computing the squares when the denominator is larger than about 1e+19, but we have not done that yet in Math.NET Numerics (as there was no demand for it until now). This would also not be a problem if the complex type implemented the polar form, but that is quite uncommon.
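To make the range problem concrete, here is a small sketch in Java using 32-bit float arithmetic to stand in for Complex32 (the exact behaviour of Math.NET itself may differ in detail): the naive division formula squares the denominator components and overflows, while Smith's algorithm divides through by the larger component first and stays in range.
float a = 4.0107e30f, b = 0f;                   // Top
float c = 5.46725e26f, d = -2.806428e26f;       // Bot

// Naive form: c*c + d*d is about 3.8e53, far beyond float's ~3.4e38 limit,
// so the denominator becomes Infinity and the result degenerates to NaN.
float denom = c * c + d * d;
float naiveRe = (a * c + b * d) / denom;        // NaN
float naiveIm = (b * c - a * d) / denom;        // NaN

// Smith's algorithm: never squares the large components.
float r = d / c;                                // here |c| >= |d|
float s = c + d * r;
float re = (a + b * r) / s;                     // roughly 5.8e3
float im = (b - a * r) / s;                     // roughly 2.98e3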

Exponents in Genetic Programming

I want to have real-valued exponents (not just integers) for the terminal variables.
For example, let's say I want to evolve the function y = x^3.5 + x^2.2 + 6. How should I proceed? I haven't seen any GP implementations that can do this.
I tried using the power function, but sometimes the initial solutions have so many exponents that the evaluated value exceeds 'double' bounds!
Any suggestion would be appreciated. Thanks in advance.
DEAP (in Python) implements it. In fact there is an example for that. By adding math.pow from Python to the primitive set you can achieve what you want.
pset.addPrimitive(math.pow, 2)
But using the pow operator you risk getting something like x^(x^(x^(x))), which is probably not desired. You should add a restriction (by some means I'm not sure of) on where in your tree pow is allowed (just before a leaf, or something like that).
OpenBeagle (in C++) also allows it, but you will need to develop your own primitive using pow from <math.h>; you can use the Sin or Cos primitive as an example.
If only some of the initial population are suffering from the overflow problem then just penalise them with a poor fitness score and they will probably be removed from the population within a few generations.
But, if the problem is that virtually all individuals suffer from this problem, then you will have to add some constraints. The simplest thing to do would be to constrain the exponent child of the power function to be a real literal - which would mean powers could not be nested. It depends on whether this is sufficient for your needs though. There are a few ways to add constraints like these (or more complex ones) - try looking into Constrained Syntactic Structures and grammar-guided GP.
A few other simple thoughts: can you use a data-type with a larger range? Also, you could reduce the maximum depth parameter, so that there will be less room for nested exponents. Of course that's only possible to an extent, and it depends on the complexity of the function.
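One common way to combine the "penalise it" and "constrain it" ideas is a protected power primitive; the sketch below (in Java, though the same idea works in DEAP or OpenBeagle) clamps the result to a finite value whenever the operation overflows or produces NaN, so offending individuals simply score badly instead of breaking the evaluation. The penalty value is arbitrary and only for illustration.
static double protectedPow(double base, double exponent) {
    double result = Math.pow(Math.abs(base), exponent);   // abs() avoids NaN for negative bases with real exponents
    if (Double.isNaN(result) || Double.isInfinite(result)) {
        return 1.0e6;                                      // arbitrary finite penalty value
    }
    return result;
}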
Integers have a different binary representation than reals, so you have to use a slightly different bitstring representation and recombination/mutation operator.
For an excellent demonstration, see slide 24 of www.cs.vu.nl/~gusz/ecbook/slides/Genetic_Algorithms.ppt or check out the Eiben/Smith book "Introduction to Evolutionary Computing" (Genetic Algorithms chapter). This describes how to map a bit string to a real number. You can then create a representation where x only lies within an interval [y, z]. In this case, choose y and z to be of smaller magnitude than the capacity of the data type you are using (e.g. 10^308 for a double) so you don't run into the overflow issue you describe.
You have to consider that with real-valued exponents and a negative base you will not obtain a real, but a complex number. For example, the Math.Pow implementation in .NET says that you get NaN if you attempt to calculate the power of a negative base to a non-integer exponent. You have to make sure all your x values are positive. I think that's the problem that you're seeing when you "exceed double bounds".
Btw, you can try the HeuristicLab GP implementation. It is very flexible with a configurable grammar.