I don't understand why the following does not work:
Main> 2.1 + 2.3
Error: Can't find an implementation for FromDouble Integer.
(Interactive):1:1--1:4
1 | 2.1 + 2.3
^^^
It works with integers, and if I understand the docs correctly, it should work with doubles too!
I get the same error with 2.1 * 2.3, but 2.1 - 2.3 and abs 2.1 work fine, which gives me the impression that there is a problem with the Num interface. However, I'm a total beginner in Idris, so maybe I'm missing something very silly! I could not find any example of summing doubles on the web…
(I compiled Idris2 from the latest commit on github — 69f680e10a336c4f33414cfd55c13d41b68d735b.)
Edited to add: 2.1 + 2.3 actually works in a file (defining a constant), the problem only occurs in the REPL. I guess I'll have to file a bug.
I want to allow yarn to update to all newer minor + patch versions, so I am using the caret.
If the latest version of "my-package" is "3.3.3", what are the behavioral differences between "my-package": "^3" and "my-package": "^3.2.1"?
If there are none, is there a reasonable argument in favor of one or the other?
This post on perlgeek gives an example of currying:
my &add_two := * + 2;
say add_two(5); # 7
Makes sense. But if I swap the + infix operator for the min infix operator:
my &min_two := * min 2;
say min_two(5); # Type check failed in binding; expected 'Callable' but got 'Int'
Even trying to call + via the infix syntax fails:
>> my &curry := &infix:<+>(2, *);
Method 'Int' not found for invocant of class 'Whatever'
Do I need to qualify the Whatever as a numeric value, and if so how? Or am I missing the point entirely?
[Edited with responses from newer rakudo; Version string for above: perl6 version 2014.08 built on MoarVM version 2014.08]
Your Rakudo version is somewhat ancient. If you want to use a more recent cygwin version, you'll probably have to compile it yourself. If you're fine with the Windows version, you can get a binary from rakudo.org.
That said, the current version also does not transform * min 2 into a lambda, but from a cursory test, seems to treat the * like Inf. My Perl6-fu is too weak to know if this is per spec, or a bug.
As a workaround, use
my &min_two := { $_ min 2 };
Note that * only auto-curries (or rather 'auto-primes' in Perl6-speak - see S02) with operators, not function calls, i.e. your third example should be written as
my &curry := &infix:<+>.assuming(2);
This is because the meaning of the Whatever-* depends on context: it is supposed to DWIM.
In case of function calls, it gets passed as an argument to let the callee decide what it wants to do with it. Even operators are free to handle Whatever explicitly (eg 1..*) - but if they don't, a Whatever operand transforms the operation into a 'primed' closure.
It seems that on a modern Pentium it is no longer possible to give branching hints to the processor. Assuming that a profiling compiler such as gcc with profile-guided optimization gains information about likely branching behavior, what can it do to produce code that will execute more quickly?
The only option I know of is to move unlikely branches to the end of a function. Is there anything else?
Update.
http://download.intel.com/products/processor/manual/325462.pdf volume 2a, section 2.1.1 says
"Branch hint prefixes (2EH, 3EH) allow a program to give a hint to the processor about the most likely code path for
a branch. Use these prefixes only with conditional branch instructions (Jcc). Other use of branch hint prefixes
and/or other undefined opcodes with Intel 64 or IA-32 instructions is reserved; such use may cause unpredictable
behavior."
I don't know if these actually have any effect however.
On the other hand section 3.4.1. of http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf says
"
Compilers generate code that improves the efficiency of branch prediction in Intel processors. The Intel
C++ Compiler accomplishes this by:
keeping code and data on separate pages
using conditional move instructions to eliminate branches
generating code consistent with the static branch prediction algorithm
inlining where appropriate
unrolling if the number of iterations is predictable
With profile-guided optimization, the compiler can lay out basic blocks to eliminate branches for the most
frequently executed paths of a function or at least improve their predictability. Branch prediction need
not be a concern at the source level. For more information, see Intel C++ Compiler documentation.
"
http://cache-www.intel.com/cd/00/00/40/60/406096_406096.pdf says in "Performance Improvements with PGO"
"
PGO works best for code with many frequently executed branches that are difficult to
predict at compile time. An example is the code with intensive error-checking in which
the error conditions are false most of the time.
The infrequently executed (cold) error-handling code can be relocated so the branch is rarely predicted incorrectly. Minimizing
cold code interleaved into the frequently executed (hot) code improves instruction cache
behavior."
There are two possible sources for the information you want:
1. There's the Intel 64 and IA-32 Architectures Software Developer's Manual (3 volumes). This is a huge work which has evolved for decades. It's the best reference I know on a lot of subjects, including floating-point. In this case, you want to check volume 2, the instruction set reference.
2. There's the Intel 64 and IA-32 Architectures Optimization Reference Manual. This will tell you in somewhat brief terms what to expect from each microarchitecture.
Now, I don't know what you mean by a "modern Pentium" processor, this is 2013, right? There aren't any Pentiums anymore...
The instruction set does support telling the processor whether a branch is expected to be taken or not taken, via a prefix on the conditional branch instructions (such as JC, JZ, etc.). See volume 2A of (1), section 2.1.1 (of the version I have), Instruction Prefixes. The 2E and 3E prefixes mean not taken and taken, respectively.
As to whether these prefixes actually have any effect: if that information is available, it will be in the Optimization Reference Manual, in the section for the microarchitecture you want (and I'm sure it won't be the Pentium).
Apart from using those, there is an entire section of the Optimization Reference Manual on that subject: section 3.4.1 (of the version I have).
It makes no sense to reproduce that here, since you can download the manual for free.
Briefly:
Eliminate branches by using conditional instructions (CMOV, SETcc),
Consider the static prediction algorithm (3.4.1.3),
Inlining
Loop unrolling
Also, some compilers, GCC, for instance, even when CMOV is not possible, often perform bitwise arithmetic to select one of two distinct things computed, thus avoiding branches. It does this particularly with SSE instructions when vectorizing loops.
Basically, the static conditions are:
Unconditional branches are predicted to be taken (... kind of expectable...)
Indirect branches are predicted not to be taken (because of a data dependency)
Backward conditionals are predicted to be taken (good for loops)
Forward conditionals are predicted not to be taken
You probably want to read the entire section 3.4.1.
If it's clear that a loop is rarely entered, or that it normally iterates very few times, then the compiler might avoid unrolling the loop, as doing so can add a lot of harmful complexity to handle edge conditions (an odd number of iterations, etc.). Vectorisation, in particular, should be avoided in such cases.
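To make the "edge conditions" point concrete, here is a minimal sketch of what unrolling adds, written at the source level (a compiler would do this on the generated code instead). The names sum_simple and sum_unrolled4 and the unroll factor of 4 are assumptions chosen for illustration only.

```c
#include <stddef.h>

/* Straightforward loop: one add and one loop branch per element. */
long sum_simple(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Hand-unrolled by 4 (arbitrary factor). The main loop handles full
 * groups of four; the tail loop handles the 0 to 3 leftover elements. */
long sum_unrolled4(const int *values, size_t count)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= count; i += 4)
        total += (long)values[i] + values[i + 1] + values[i + 2] + values[i + 3];
    for (; i < count; i++)  /* edge case: count is not a multiple of 4 */
        total += values[i];
    return total;
}
```

If count is usually 1 to 3, the unrolled version never enters its main loop and only pays for the extra check and the larger code, which is the kind of case profile data lets the compiler detect.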
The compiler might rearrange nested tests, so that the one that most frequently results in a short-cut can be used to avoid performing a test on something with a 50% pass rate.
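A rough sketch of the reordering idea, assuming both tests are free of side effects so that swapping them does not change the result. The predicates is_coin_flip and is_rare are hypothetical stand-ins (the modulo tests merely simulate a 50% condition and a rarely true one), and the counter exists only to show how often the hard-to-predict test runs.

```c
#include <stdbool.h>

/* Counts evaluations of the hard-to-predict test (illustration only). */
static unsigned long coin_flip_evals = 0;

/* Hypothetical test that passes about half the time. */
static bool is_coin_flip(int v) { coin_flip_evals++; return v % 2 == 0; }

/* Hypothetical test that almost always fails, so it usually short-circuits. */
static bool is_rare(int v) { return v % 100 == 0; }

/* Original order: the 50% test runs for every value. */
bool check_original(int v)  { return is_coin_flip(v) && is_rare(v); }

/* Reordered: the rarely true test goes first and skips the 50% test. */
bool check_reordered(int v) { return is_rare(v) && is_coin_flip(v); }

/* Runs `check` over 0..limit-1, stores the number of matches, and
 * returns how many times the 50% test was evaluated. */
unsigned long count_coin_flip_evals(bool (*check)(int), int limit, int *matches)
{
    coin_flip_evals = 0;
    *matches = 0;
    for (int v = 0; v < limit; v++)
        if (check(v))
            (*matches)++;
    return coin_flip_evals;
}
```

Both orderings accept the same values; only the number of times the 50% test is reached differs.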
Register allocation can be optimised to avoid having a rarely-used block force register spill in the common case.
These are just some examples. I'm sure there are others I haven't thought of.
Off the top of my head, you have two options.
Option #1: Inform the compiler of the hints and let the compiler organize the code appropriately. For example, GCC supports the following ...
__builtin_expect((long)!!(x), 1L) /* GNU C to indicate that <x> will likely be TRUE */
__builtin_expect((long)!!(x), 0L) /* GNU C to indicate that <x> will likely be FALSE */
If you put them in macro form such as ...
#if <some condition to indicate support>
#define LIKELY(x) __builtin_expect((long)!!(x), 1L)
#define UNLIKELY(x) __builtin_expect((long)!!(x), 0L)
#else
#define LIKELY(x) (x)
#define UNLIKELY(x) (x)
#endif
... you can now use them as ...
if (LIKELY (x != 0)) {
    /* DO SOMETHING */
} else {
    /* DO SOMETHING ELSE */
}
This leaves the compiler free to organize the branches according to static branch prediction algorithms, and/or if the processor and compiler support it, to use instructions that indicate which branch is more likely to be taken.
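For a fuller picture, here is a small self-contained sketch using the macros above. Testing __GNUC__ as the support condition is an assumption on my part (GCC and Clang both define it); pick whatever check fits your toolchains. The function sum_valid is a hypothetical example in which negative entries are assumed to be rare.

```c
#include <stddef.h>

/* Assumption: __GNUC__ serves as the support check for __builtin_expect. */
#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect((long)!!(x), 1L)
#define UNLIKELY(x) __builtin_expect((long)!!(x), 0L)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Sums the non-negative entries and reports how many negative
 * entries were skipped. */
long sum_valid(const int *values, size_t count, size_t *skipped)
{
    long total = 0;
    *skipped = 0;
    for (size_t i = 0; i < count; i++) {
        if (UNLIKELY(values[i] < 0)) {
            (*skipped)++;      /* cold path: rare invalid entry */
            continue;
        }
        total += values[i];    /* hot path */
    }
    return total;
}
```

The hint does not change the result, only how the compiler is allowed to lay out the two paths.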
Option #2: Use math to avoid branching.
if (a < b)
    y = C;
else
    y = D;
This could be re-written as ...
x = -(a < b); /* x = -1 if a < b, x = 0 if a >= b */
x &= (C - D); /* x = C - D if a < b, x = 0 if a >= b */
x += D; /* x = C if a < b, x = D if a >= b */
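Wrapped up as a function, the same trick might look like the sketch below. The names select_branchless and select_branching are hypothetical. The unsigned arithmetic is my addition, so that C - D cannot overflow a signed int; the final conversion back to int is implementation-defined for values above INT_MAX, though GCC and Clang wrap as expected.

```c
/* Returns c when a < b, otherwise d, with no conditional in the source. */
int select_branchless(int a, int b, int c, int d)
{
    /* Unsigned arithmetic so that wraparound is well defined. */
    unsigned int mask = -(unsigned int)(a < b);  /* all ones if a < b, else 0 */
    unsigned int diff = (unsigned int)c - (unsigned int)d;
    return (int)((mask & diff) + (unsigned int)d);
}

/* Reference version for comparison. */
int select_branching(int a, int b, int c, int d)
{
    return (a < b) ? c : d;
}
```

Note that an optimizing compiler may well turn the plain ternary into a CMOV on its own, so measure before committing to the manual version.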
Hope this helps.
It can make the fall-through (ie the case where a branch is not taken) the most used path. That has two big effects:
only 1 branch can be taken per clock, or on some processors even per 2 clocks, so if there are any other branches (there usually are, most code that matters is in a loop), a taken branch is bad news, a non-taken branch less so.
when the branch predictor is wrong, the code that it does have to execute is more likely to be in the code cache (or µop cache, where applicable). If it wasn't, that would have been a double-whammy of restarting the pipeline and waiting for a cache miss. This is less of an issue in most loops, since both sides of the branch are likely to be in the cache, but it comes into play in big loops and other code.
It can also decide whether to do if-conversion based on better data than a heuristic guess. If-conversions may seem like "always a good idea", but they're not; they're only "often a good idea". If the branch in the branching implementation is very well-predicted, the if-converted code can well be slower.
I need a compile time check for what version of glibc will be used.
The only compile time checks (ie #defines) I can find return the glibc date (__GLIBCXX__) and correspondence between the date and version seems iffy. How do you check at compile time for the version of glibc that will be used?
My code will compile and run on several systems, including a very old one. In particular I am interested in using malloc_info (see http://man7.org/linux/man-pages/man3/malloc_info.3.html). This was added to glibc in version 2.10. The program will be used on the same system (or an identical one) it was built on.
I think what you're looking for is __GLIBC__ and __GLIBC_MINOR__, which expand to integers giving the major and minor version numbers of the GNU C Library. Have a look at this (archive link) for more details.
So if __GLIBC__ is greater than 2, or __GLIBC__ is equal to 2 and __GLIBC_MINOR__ is greater than or equal to 10, then malloc_info() should work.
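That condition can be sketched as a preprocessor check like the one below. HAVE_MALLOC_INFO and dump_malloc_info are hypothetical names introduced here for illustration. The check reflects the headers seen at compile time, which fits the case where the program runs on the system it was built on.

```c
#include <stdio.h>
#include <stdlib.h>  /* glibc headers pull in <features.h>, which defines __GLIBC__ */

#if defined(__GLIBC__) && \
    (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 10))
#include <malloc.h>
#define HAVE_MALLOC_INFO 1  /* hypothetical macro name */
#else
#define HAVE_MALLOC_INFO 0
#endif

/* Writes allocator statistics to `out` when malloc_info is available.
 * Returns 0 on success, -1 if malloc_info is unavailable or fails. */
int dump_malloc_info(FILE *out)
{
#if HAVE_MALLOC_INFO
    return malloc_info(0, out) == 0 ? 0 : -1;
#else
    (void)out;  /* unused when malloc_info is not available */
    return -1;
#endif
}
```

On a system whose glibc is older than 2.10, or that does not use glibc at all, the function compiles to the fallback and simply returns -1.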