How well can Ada optimize static constant arrays? - optimization

Let's say I declare an array of constant values for use as a lookup table:
function Square(num : Integer) return Integer is
use Ada.Text_IO;
type LookupItemsRange is range 1..10;
type LookupTable is array(LookupItemsRange) of integer;
A : constant LookupTable := (2312,2,3,4,5,6,7,8,9, 10);
begin
Sqr := A(1);
Sqr := 2 * A(2);
return Sqr;
end Square;
The LookupTable shouldn't use any space. The compiler should be able to substitute array access with the actual inlined values. At the highest optimization setting, here's what Ada yields:
_ada_square:
li a0,4 #,
ret
I am planning on substituting a large amount of redundant MCU code with a cleaner lookup table implementation. Since the microcontroller's resources are constrained, I would like to avoid needless allocation and indexing.
Under what circumstances will Ada allocate the array or am I completely safe?

This depends on your compiler and compiler options.
As you have demonstrated, whatever compiler you are using does an excellent job of optimization.
In Ada Outperforms Assembly: A Case Study, presented in 1992, an Ada (83) compiler produced faster code than assembler hand optimized by a team of experts. Presumably optimizers have gotten even better in the past 30+ years, so worrying about this kind of thing, in the absence of measurements showing otherwise, is A Waste of Time.

So for the Ada language specification, you won't get a guarantee for array objects. Ada is starting to catch up to constexpr in the newest upcoming release (Ada2022), but will only be providing it's analogous property (Static in Ada) to expression functions (think of them analogous to constexpr functions but with slightly different limitations).
So if you don't mind relying on your compiler vendor, then the array will most likely work fine (but is not guaranteed). If you want a more language supported guarantee on the optimization, then you will need to mimic the array using expression functions with the Static aspect set (in a recent compiler). For a static input, a static expression function will give a static result (again Static in Ada, not static in c++, more like constexpr).
function Square return Integer is
type Lookup_Items_Range is range 1..10;
function A(Index : Lookup_Items_Range) return Integer is
(case Index is
when 1 => 2312,
when 2 => 2,
when 3 => 3,
when 4 => 4,
when 5 => 5,
when 6 => 6,
when 7 => 7,
when 8 => 8,
when 9 => 9,
when 10 => 10
)
with Static;
Sqr : Integer;
begin
Sqr := A(1);
Sqr := 2 * A(2);
return Sqr;
end Square;
It's slightly more verbose than using an array, but has the language backing, which is really important in certain embedded programming areas. Note, most recent compilers will need a switch to enable this feature since it is so new, but they generally tell you that switch in the error message. I tested on godbolt with GNAT 12.1 and it compiled with the correct switch.

Related

Does Perl 6 have an infinite Int?

I had a task where I wanted to find the closest string to a target (so, edit distance) without generating them all at the same time. I figured I'd use the high water mark technique (low, I guess) while initializing the closest edit distance to Inf so that any edit distance is closer:
use Text::Levenshtein;
my #strings = < Amelia Fred Barney Gilligan >;
for #strings {
put "$_ is closest so far: { longest( 'Camelia', $_ ) }";
}
sub longest ( Str:D $target, Str:D $string ) {
state Int $closest-so-far = Inf;
state Str:D $closest-string = '';
if distance( $target, $string ) < $closest-so-far {
$closest-so-far = $string.chars;
$closest-string = $string;
return True;
}
return False;
}
However, Inf is a Num so I can't do that:
Type check failed in assignment to $closest-so-far; expected Int but got Num (Inf)
I could make the constraint a Num and coerce to that:
state Num $closest-so-far = Inf;
...
$closest-so-far = $string.chars.Num;
However, this seems quite unnatural. And, since Num and Int aren't related, I can't have a constraint like Int(Num). I only really care about this for the first value. It's easy to set that to something sufficiently high (such as the length of the longest string), but I wanted something more pure.
Is there something I'm missing? I would have thought that any numbery thing could have a special value that was greater (or less than) all the other values. Polymorphism and all that.
{new intro that's hopefully better than the unhelpful/misleading original one}
#CarlMäsak, in a comment he wrote below this answer after my first version of it:
Last time I talked to Larry about this {in 2014}, his rationale seemed to be that ... Inf should work for all of Int, Num and Str
(The first version of my answer began with a "recollection" that I've concluded was at least unhelpful and plausibly an entirely false memory.)
In my research in response to Carl's comment, I did find one related gem in #perl6-dev in 2016 when Larry wrote:
then our policy could be, if you want an Int that supports ±Inf and NaN, use Rat instead
in other words, don't make Rat consistent with Int, make it consistent with Num
Larry wrote this post 6.c. I don't recall seeing anything like it discussed for 6.d.
{and now back to the rest of my first answer}
Num in P6 implements the IEEE 754 floating point number type. Per the IEEE spec this type must support several concrete values that are reserved to stand in for abstract concepts, including the concept of positive infinity. P6 binds the corresponding concrete value to the term Inf.
Given that this concrete value denoting infinity already existed, it became a language wide general purpose concrete value denoting infinity for cases that don't involve floating point numbers such as conveying infinity in string and list functions.
The solution to your problem that I propose below is to use a where clause via a subset.
A where clause allows one to specify run-time assignment/binding "typechecks". I quote "typecheck" because it's the most powerful form of check possible -- it's computationally universal and literally checks the actual run-time value (rather than a statically typed view of what that value can be). This means they're slower and run-time, not compile-time, but it also makes them way more powerful (not to mention way easier to express) than even dependent types which are a relatively cutting edge feature that those who are into advanced statically type-checked languages tend to claim as only available in their own world1 and which are intended to "prevent bugs by allowing extremely expressive types" (but good luck with figuring out how to express them... ;)).
A subset declaration can include a where clause. This allows you to name the check and use it as a named type constraint.
So, you can use these two features to get what you want:
subset Int-or-Inf where Int:D | Inf;
Now just use that subset as a type:
my Int-or-Inf $foo; # ($foo contains `Int-or-Inf` type object)
$foo = 99999999999; # works
$foo = Inf; # works
$foo = Int-or-Inf; # works
$foo = Int; # typecheck failure
$foo = 'a'; # typecheck failure
1. See Does Perl 6 support dependent types? and it seems the rough consensus is no.

Fortran serialization using C_LOC and C_F_POINTER

I'm looking for a Fortran Library or preferred method of serializing data to a memory buffer in Fortran.
After researching the topic, I found examples using the EQUIVALENCE statement and the TRANSFER intrinsic function. I wrote code to test them and they both worked. In my limited testing, the transfer function seems to be quite a bit slower than the equivalence statement. However, I have found several references stating that in general not to use the equivalence statement.
So, I've been trying to come up with another way to serialize data efficiently. After ready up on the Fortran 2003 spec I discovered that I could use the C_LOC and C_F_POINTER together to cast my "byte" array into the desired data type (int, real, etc..). Initial testing shows that it is working and is faster than the transfer function. An example program is listed below. I was wondering if this is valid use of the C_LOC and C_F_POINTER functions.
Thanks!
program main
use iso_c_binding
implicit none
real(c_float) :: a, b, c
integer(c_int8_t), target :: buf(12)
a = 12345.6789_c_float
b = 4567.89123_c_float
c = 9079.66788_c_float
call pack_float( a, c_loc(buf(1)) )
call pack_float( b, c_loc(buf(5)) )
call pack_float( c, c_loc(buf(9)) )
print '(A,12I5)', 'Bin: ', buf
contains
subroutine pack_float( src, dest )
implicit none
real(c_float), intent(in) :: src
type(c_ptr), intent(in) :: dest
real(c_float), pointer :: p
call c_f_pointer(dest, p)
p = src
end subroutine
end program
Output:
Bin: -73 -26 64 70 33 -65 -114 69 -84 -34 13 70
I also coded this up in Python to double check the answer above. The code and output is listed below.
import struct
a = struct.pack( '3f', 12345.6789, 4567.89123, 9079.66788)
b = struct.unpack('12b', a)
print b
Output:
(-73, -26, 64, 70, 33, -65, -114, 69, -84, -34, 13, 70)
Practically, use of C_LOC and C_F_POINTER in this way is likely to work, but formally it is not standard conforming. The type and type parameters of the Fortran pointer passed to C_F_POINTER must either be interoperable with the object nominated by the C address or be the same as the type and type parameters of the object that was originally passed to C_LOC (see the description of the FPTR argument in F2008 15.2.3.3). Depending on what you are trying to serialize, you may also find that formal restrictions on the C_LOC argument (which are progressively less restrictive in later standards than F2003) come into play.
(The C equivalent requires use of unsigned char for this sort of trick - which is not necessarily the same thing as int8_t.)
There are constraints on the items in an EQUIVALENCE set that make that approach not generally applicable (see constraints C591 through C594 in F2008). Interpreting the internal representation of an object through equivalence is also formally subject to the rules around definition and undefinition of variables - see F2008 16.6.6 item 1 in particular.
The conforming way to access the representation of one object as a different type in Fortran is through TRANSFER. Be mindful that serialization of the internal representation of derived types with allocatable or pointer components (is that what you mean by dynamic fields?) may not be useful.
Depending on circumstance, it may be simpler and more robust to simply store your real time results in an array of the type of the thing to be stored.

What are Free and Bound variables?

I've been programming for a long time (too long, actually), but I'm really struggling to get a handle on the terms "Free Variables" and "Bound Variables".
Most of the "explanations" I've found online start by talking about topics like Lambda calculus and Formal logic, or Axiomatic Semantics. which makes me want to reach for my revolver.
Can someone explain these two terms, ideally from an implementation perspective. Can they exist in compiled languages, and what low-level code do they translate to?
A free variable is a variable used in some function that its value depends on the context where the function is invoked, called or used. For example, in math terms, z is a free variable because is not bounded to any parameter. x is a bounded variable:
f(x) = x * z
In programming languages terms, a free variable is determined dynamically at run time searching the name of the variable backwards on the function call stack.
A bounded variable evaluation does not depend on the context of the function call. This is the most common modern programming languages variable type. Local variables, global variables and parameters are all bounded variables.
A free variable is somewhat similar to the "pass by name" convention of some ancient programming languages.
Let's say you have a function f that just prints some variable:
def f():
print(X)
This is Python. While X is not a local variable, its value follows the Python convention: it searches upwards on the chain of the blocks where the function is defined until it reaches the top level module.
Since in Python the value of X is determined by the function declaration context, we say that X is a bounded variable.
Hypothetically, if X were a free variable, this should print 10:
X = 2
def f():
print(X)
def g():
# X is a local variable to g, shadowing the global X
X = 10
f()
In Python, this code prints 2 because both X variables are bounded. The local X variable on g is bounded as a local variable, and the one on f is bounded to the global X.
Implementation
The implementation of a programming language with free variables needs to take care the context on where each function is called, and for every free variable use some reflection to find which variable to use.
The value of a free variable can not be generally determined at compile time, since is heavily determined by the run time flow and the call stack.
Whether a variable is free or bound is relative; it depends on the fragment of code you are looking at.
In this fragment, x is bound:
function(x) {return x + x;};
And here, x occurs free:
return x + x;
In other words, freeness is a property of the context. You don't say "x is a free variable" or "x is a bound variable," but instead identify the context you are talking about: "x is free in the expression E." For this reason, the same variable x can be either free or bound depending on which fragment of code you are talking about. If the fragment contains the variable's binding site (for example, it is listed in the function arguments) then it is bound, if not, it is free.
Where the free/bound distinction is important from an implementation perspective is when you are implementing how variable substitution works (e.g. what happens when you apply arguments to a function.) Consider the evaluation steps:
(function(x) {return x + x;})(3);
=> 3 + 3
=> 6
This works fine because x is free in the body of the function. If x was bound in the body of the function, however, our evaluation would need to be careful:
(function(x) {return (function(x){return x * 2;})(x + x);})(3);
=> (function(x){return x * 2;})(3 + 3); // careful to leave this x alone for now!
=> (function(x){return x * 2;})(6);
=> 6 * 2
=> 12
If our implementation didn't check for bound occurrences, it could have replaced the bound x for 3 and given us the wrong answer:
(function(x) {return (function(x){return x * 2;})(x + x);})(3);
=> (function(x){return 3 * 2;})(3 + 3); // Bad! We substituted for a bound x!
=> (function(x){return 3 * 2;})(6);
=> 3 * 2
=> 6
Also, it should be clarified that free vs. bound is a property of the syntax (i.e. the code itself), not a property of how a the code is evaluated at run-time. vz0 talks about dynamically scoped variables, which are somewhat related to, but not synonymous with, free variables. As vz0 correctly describes, dynamic variable scope is a language feature that allows expressions that contains free variables to be evaluated by looking at the run-time call-stack to find the value of a variable that shares the same name. However, It still makes sense to talk about free occurrences of variables in languages that don't allow dynamic scope: you would just get an error (like "x is not defined") if you were to try to evaluate an such expression in these languages.
And I can't restrain myself: I hope one day you can find the strength to put your revolvers away when people mention lambda calculus! Lambda calculus is a good tool for thinking about variables and bindings because it is an extremely minimal programming language that supports variables and substitution and nothing else. Real-world programming languages contain a lot of other junk (like dynamic scope, for example) that obscures the essence.

How do I access an integer array within a struct/class from in-line assembly (blackfin dialect) using gcc?

Not very familiar with in-line assembly to begin with, and much less with that of the blackfin processor. I am in the process of migrating a legacy C application over to C++, and ran into a problem this morning regarding the following routine:
//
void clear_buffer ( short * buffer, int len ) {
__asm__ (
"/* clear_buffer */\n\t"
"LSETUP (1f, 1f) LC0=%1;\n"
"1:\n\t"
"W [%0++] = %2;"
:: "a" ( buffer ), "a" ( len ), "d" ( 0 )
: "memory", "LC0", "LT0", "LB0"
);
}
I have a class that contains an array of shorts that is used for audio processing:
class AudProc
{
enum { buffer_size = 512 };
short M_samples[ buffer_size * 2 ];
// remaining part of class omitted for brevity
};
Within the AudProc class I have a method that calls clear_buffer, passing it the samples array:
clear_buffer ( M_samples, sizeof ( M_samples ) / 2 );
This generates a "Bus Error" and aborts the application.
I have tried making the array public, and that produces the same result. I have also tried making it static; that allows the call to go through without error, but no longer allows for multiple instances of my class as each needs its own buffer to work with. Now, my first thought is, it has something to do with where the buffer is in memory, or from where it is being accessed. Does something need to be changed in the in-line assembly to make this work, or in the way it is being called?
Thought that this was similar to what I was trying to accomplish, but it is using a different dialect of asm, and I can't figure out if it is the same problem I am experiencing or not:
GCC extended asm, struct element offset encoding
Anyone know why this is occurring and how to correct it?
Does anyone know where there is helpful documentation regarding the blackfin asm instruction set? I've tried looking on the ADSP site, but to no avail.
I would suspect that you could define your clear_buffer as
inline void clear_buffer (short * buffer, int len) {
memset (buffer, 0, sizeof(short)*len);
}
and probably GCC is able to optimize (when invoked with -O2 or -O3) that cleverly (because GCC knows about memset).
To understand assembly code, I suggest running gcc -S -O -fverbose-asm on some small C file, then to look inside the produced .s file.
I would have take a guess, because I don't know Blackfin assembler:
That LC0 sounds like "loop counter", LSETUP looks like a macro/insn, which, well, setups a loop between two labels and with a certain loop counter.
The "%0" operands is apparently the address to write to and we can safely guess it's incremented in the loop, in other words it's both an input and output operand and should be described as such.
Thus, I suggest describing it as in input-output operand, using "+" constraint modifier, as follows:
void clear_buffer ( short * buffer, int len ) {
__asm__ (
"/* clear_buffer */\n\t"
"LSETUP (1f, 1f) LC0=%1;\n"
"1:\n\t"
"W [%0++] = %2;"
: "+a" ( buffer )
: "a" ( len ), "d" ( 0 )
: "memory", "LC0", "LT0", "LB0"
);
}
This is, of course, just a hypothesis, but you could disassemble the code and check if by any chance GCC allocated the same register for "%0" and "%2".
PS. Actually, only "+a" should be enough, early-clobber is irrelevant.
For anyone else who runs into a similar circumstance, the problem here was not with the in-line assembly, nor with the way it was being called: it was with the classes / structs in the program. The class that I believed to be the offender was not the problem - there was another class that held an instance of it, and due to other members of that outer class, the inner one was not aligned on a word boundary. This was causing the "Bus Error" that I was experiencing. I had not come across this before because the classes were not declared with __attribute__((packed)) in other code, but they are in my implementation.
Giving Type Attributes - Using the GNU Compiler Collection (GCC) a read was what actually sparked the answer for me. Two particular attributes that affect memory alignment (and, thus, in-line assembly such as I am using) are packed and aligned.
As taken from the aforementioned link:
aligned (alignment)
This attribute specifies a minimum alignment (in bytes) for variables of the specified type. For example, the declarations:
struct S { short f[3]; } __attribute__ ((aligned (8)));
typedef int more_aligned_int __attribute__ ((aligned (8)));
force the compiler to ensure (as far as it can) that each variable whose type is struct S or more_aligned_int is allocated and aligned at least on a 8-byte boundary. On a SPARC, having all variables of type struct S aligned to 8-byte boundaries allows the compiler to use the ldd and std (doubleword load and store) instructions when copying one variable of type struct S to another, thus improving run-time efficiency.
Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question. This means that you can effectively adjust the alignment of a struct or union type by attaching an aligned attribute to any one of the members of such a type, but the notation illustrated in the example above is a more obvious, intuitive, and readable way to request the compiler to adjust the alignment of an entire struct or union type.
As in the preceding example, you can explicitly specify the alignment (in bytes) that you wish the compiler to use for a given struct or union type. Alternatively, you can leave out the alignment factor and just ask the compiler to align a type to the maximum useful alignment for the target machine you are compiling for. For example, you could write:
struct S { short f[3]; } __attribute__ ((aligned));
Whenever you leave out the alignment factor in an aligned attribute specification, the compiler automatically sets the alignment for the type to the largest alignment that is ever used for any data type on the target machine you are compiling for. Doing this can often make copy operations more efficient, because the compiler can use whatever instructions copy the biggest chunks of memory when performing copies to or from the variables that have types that you have aligned this way.
In the example above, if the size of each short is 2 bytes, then the size of the entire struct S type is 6 bytes. The smallest power of two that is greater than or equal to that is 8, so the compiler sets the alignment for the entire struct S type to 8 bytes.
Note that although you can ask the compiler to select a time-efficient alignment for a given type and then declare only individual stand-alone objects of that type, the compiler's ability to select a time-efficient alignment is primarily useful only when you plan to create arrays of variables having the relevant (efficiently aligned) type. If you declare or use arrays of variables of an efficiently-aligned type, then it is likely that your program also does pointer arithmetic (or subscripting, which amounts to the same thing) on pointers to the relevant type, and the code that the compiler generates for these pointer arithmetic operations is often more efficient for efficiently-aligned types than for other types.
The aligned attribute can only increase the alignment; but you can decrease it by specifying packed as well. See below.
Note that the effectiveness of aligned attributes may be limited by inherent limitations in your linker. On many systems, the linker is only able to arrange for variables to be aligned up to a certain maximum alignment. (For some linkers, the maximum supported alignment may be very very small.) If your linker is only able to align variables up to a maximum of 8-byte alignment, then specifying aligned(16) in an __attribute__ still only provides you with 8-byte alignment. See your linker documentation for further information.
.
packed
This attribute, attached to struct or union type definition, specifies that each member (other than zero-width bit-fields) of the structure or union is placed to minimize the memory required. When attached to an enum definition, it indicates that the smallest integral type should be used.
Specifying this attribute for struct and union types is equivalent to specifying the packed attribute on each of the structure or union members. Specifying the -fshort-enums flag on the line is equivalent to specifying the packed attribute on all enum definitions.
In the following example struct my_packed_struct's members are packed closely together, but the internal layout of its s member is not packed—to do that, struct my_unpacked_struct needs to be packed too.
struct my_unpacked_struct
{
char c;
int i;
};
struct __attribute__ ((__packed__)) my_packed_struct
{
char c;
int i;
struct my_unpacked_struct s;
};
You may only specify this attribute on the definition of an enum, struct or union, not on a typedef that does not also define the enumerated type, structure or union.
The problem which I was experiencing was specifically due to the use of packed. I attempted to simply add the aligned attribute to the structs and classes, but the error persisted. Only removing the packed attribute resolved the problem. For now, I am leaving the aligned attribute on them and testing to see if I find any improvements in the efficiency of the code as mentioned above, simply due to their being aligned on word boundaries. The application makes use of arrays of these structures, so perhaps there will be better performance, but only profiling the code will say for certain.

How can I write like "x == either 1 or 2" in a programming language? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Why do most programming languages only have binary equality comparison operators?
I have had a simple question for a fairly long time--since I started learning programming languages.
I'd like to write like "if x is either 1 or 2 => TRUE (otherwise FALSE)."
But when I write it in a programming language, say in C,
( x == 1 || x == 2 )
it really works but looks awkward and hard to read. I guess it should be possible to simplify such an or operation, and so if you have any idea, please tell me. Thanks, Nathan
Python allows test for membership in a sequence:
if x in (1, 2):
An extension version in C#
step 1: create an extension method
public static class ObjectExtensions
{
public static bool Either(this object value, params object[] array)
{
return array.Any(p => Equals(value, p));
}
}
step 2: use the extension method
if (x.Either(1,2,3,4,5,6))
{
}
else
{
}
While there are a number of quite interesting answers in this thread, I would like to point out that they may have performance implications if you're doing this kind of logic inside of a loop depending on the language. A far as for the computer to understand, the if (x == 1 || x == 2) is by far the easiest to understand and optimize when it's compiled into machine code.
When I started programming it seemed weird to me as well that instead of something like:
(1 < x < 10)
I had to write:
(1 < x && x < 10)
But this is how most programming languages work, and after a while you will get used to it.
So I believe it is perfectly fine to write
( x == 1 || x == 2 )
Writing it this way also has the advantage that other programmers will understand easily what you wrote. Using a function to encapsulate it might just make things more complicated because the other programmers would need to find that function and see what it does.
Only more recent programming languages like Python, Ruby etc. allow you to write it in a simpler, nicer way. That is mostly because these programming languages are designed to increase the programmers productivity, while the older programming languages' main goal was application performance and not so much programmer productivity.
It's Natural, but Language-Dependent
Your approach would indeed seem more natural but that really depends on the language you use for the implementation.
Rationale for the Mess
C being a systems programming language, and fairly close to the hardware (funny though, as we used to consider a "high-level" language, as opposed to writing machine code), it's not exactly expressive.
Modern higher-level languages (again, arguable, lisp is not that modern, historically speaking, but would allow you to do that nicely) allow you to do such things by using built-in constructs or library support (for instances, using Ranges, Tuples or equivalents in languages like Python, Ruby, Groovy, ML-languages, Haskell...).
Possible Solutions
Option 1
One option for you would be to implement a function or subroutine taking an array of values and checking them.
Here's a basic prototype, and I leave the implementation as an exercise to you:
/* returns non-zero value if check is in values */
int is_in(int check, int *values, int size);
However, as you will quickly see, this is very basic and not very flexible:
it works only on integers,
it works only to compare identical values.
Option 2
One step higher on the complexity ladder (in terms of languages), an alternative would be to use pre-processor macros in C (or C++) to achieve a similar behavior, but beware of side effects.
Other Options
A next step could be to pass a function pointer as an extra parameter to define the behavior at call-point, define several variants and aliases for this, and build yourself a small library of comparators.
The next step then would be to implement a similar thing in C++ using templates to do this on different types with a single implementation.
And then keep going from there to higher-level languages.
Pick the Right Language (or learn to let go!)
Typically, languages favoring functional programming will have built-in support for this sort of thing, for obvious reasons.
Or just learn to accept that some languages can do things that others cannot, and that depending on the job and environment, that's just the way it is. It mostly is syntactic sugar, and there's not much you can do. Also, some languages will address their shortcomings over time by updating their specifications, while others will just stall.
Maybe a library implements such a thing already and that I am not aware of.
that was a lot of interesting alternatives. I am surprised nobody mentioned switch...case - so here goes:
switch(x) {
case 1:
case 2:
// do your work
break;
default:
// the else part
}
it is more readable than having a
bunch of x == 1 || x == 2 || ...
more optimal than having a
array/set/list for doing a
membership check
I doubt I'd ever do this, but to answer your question, here's one way to achieve it in C# involving a little generic type inference and some abuse of operator overloading. You could write code like this:
if (x == Any.Of(1, 2)) {
Console.WriteLine("In the set.");
}
Where the Any class is defined as:
public static class Any {
public static Any2<T> Of<T>(T item1, T item2) {
return new Any2<T>(item1, item2);
}
public struct Any2<T> {
T item1;
T item2;
public Any2(T item1, T item2) {
this.item1 = item1;
this.item2 = item2;
}
public static bool operator ==(T item, Any2<T> set) {
return item.Equals(set.item1) || item.Equals(set.item2);
}
// Defining the operator== requires these three methods to be defined as well:
public static bool operator !=(T item, Any2<T> set) {
return !(item == set);
}
public override bool Equals(object obj) { throw new NotImplementedException(); }
public override int GetHashCode() { throw new NotImplementedException(); }
}
}
You could conceivably have a number of overloads of the Any.Of method to work with 3, 4, or even more arguments. Other operators could be provided as well, and a companion All class could do something very similar but with && in place of ||.
Looking at the disassembly, a fair bit of boxing happens because of the need to call Equals, so this ends up being slower than the obvious (x == 1) || (x == 2) construct. However, if you change all the <T>'s to int and replace the Equals with ==, you get something which appears to inline nicely to be about the same speed as (x == 1) || (x == 2).
Err, what's wrong with it? Oh well, if you really use it a lot and hate the looks do something like this in c#:
#region minimizethisandneveropen
public bool either(value,x,y){
return (value == x || value == y);
}
#endregion
and in places where you use it:
if(either(value,1,2))
//yaddayadda
Or something like that in another language :).
In php you can use
$ret = in_array($x, array(1, 2));
As far as I know, there is no built-in way of doing this in C. You could add your own inline function for scanning an array of ints for values equal to x....
Like so:
inline int contains(int[] set, int n, int x)
{
int i;
for(i=0; i<n; i++)
if(set[i] == x)
return 1;
return 0;
}
// To implement the check, you declare the set
int mySet[2] = {1,2};
// And evaluate like this:
contains(mySet,2,x) // returns non-zero if 'x' is contained in 'mySet'
In T-SQL
where x in (1,2)
In COBOL (it's been a long time since I've even glanced briefly at COBOL, so I may have a detail or two wrong here):
IF X EQUALS 1 OR 2
...
So the syntax is definitely possible. The question then boils down to "why is it not used more often?"
Well, the thing is, parsing expressions like that is a bit of a bitch. Not when standing alone like that, mind, but more when in compound expressions. The syntax starts to become opaque (from the compiler implementer's perspective) and the semantics downright hairy. IIRC, a lot of COBOL compilers will even warn you if you use syntax like that because of the potential problems.
In .Net you can use Linq:
int[] wanted = new int{1, 2};
// you can use Any to return true for the first item in the list that passes
bool result = wanted.Any( i => i == x );
// or use Contains
bool result = wanted.Contains( x );
Although personally I think the basic || is simple enough:
bool result = ( x == 1 || x == 2 );
Thanks Ignacio! I translate it into Ruby:
[ 1, 2 ].include?( x )
and it also works, but I'm not sure whether it'd look clear & normal. If you know about Ruby, please advise. Also if anybody knows how to write this in C, please tell me. Thanks. -Nathan
Perl 5 with Perl6::Junction:
use Perl6::Junction 'any';
say 'yes' if 2 == any(qw/1 2 3/);
Perl 6:
say 'yes' if 2 == 1|2|3;
This version is so readable and concise I’d use it instead of the || operator.
Pascal has a (limited) notion of sets, so you could do:
if x in [1, 2] then
(haven't touched a Pascal compiler in decades so the syntax may be off)
A try with only one non-bitwise boolean operator (not advised, not tested):
if( (x&3) ^ x ^ ((x>>1)&1) ^ (x&1) ^ 1 == 0 )
The (x&3) ^ x part should be equal to 0, this ensures that x is between 0 and 3. Other operands will only have the last bit set.
The ((x>>1)&1) ^ (x&1) ^ 1 part ensures last and second to last bits are different. This will apply to 1 and 2, but not 0 and 3.
You say the notation (x==1 || x==2) is "awkward and hard to read". I beg to differ. It's different than natural language, but is very clear and easy to understand. You just need to think like a computer.
Also, the notations mentioned in this thread like x in (1,2) are semantically different then what you are really asking, they ask if x is member of the set (1,2), which is not what you are asking. What you are asking is if x equals to 1 or to 2 which is logically (and semantically) equivalent to if x equals to 1 or x equals to 2 which translates to (x==1 || x==2).
In java:
List list = Arrays.asList(new Integer[]{1,2});
Set set = new HashSet(list);
set.contains(1)
I have a macro that I use a lot that's somewhat close to what you want.
#define ISBETWEEN(Var, Low, High) ((Var) >= (Low) && (Var) <= (High))
ISBETWEEN(x, 1, 2) will return true if x is 1 or 2.
Neither C, C++, VB.net, C#.net, nor any other such language I know of has an efficient way to test for something being one of several choices. Although (x==1 || x==2) is often the most natural way to code such a construct, that approach sometimes requires the creation of an extra temporary variable:
tempvar = somefunction(); // tempvar only needed for 'if' test:
if (tempvar == 1 || tempvar == 2)
...
Certainly an optimizer should be able to effectively get rid of the temporary variable (shove it in a register for the brief time it's used) but I still think that code is ugly. Further, on some embedded processors, the most compact and possibly fastest way to write (x == const1 || x==const2 || x==const3) is:
movf _x,w ; Load variable X into accumulator
xorlw const1 ; XOR with const1
btfss STATUS,ZERO ; Skip next instruction if zero
xorlw const1 ^ const2 ; XOR with (const1 ^ const2)
btfss STATUS,ZERO ; Skip next instruction if zero
xorlw const2 ^ const3 ; XOR with (const2 ^ const3)
btfss STATUS,ZERO ; Skip next instruction if zero
goto NOPE
That approach require two more instructions for each constant; all instructions will execute. Early-exit tests will save time if the branch is taken, and waste time otherwise. Coding using a literal interpretation of the separate comparisons would require four instructions for each constant.
If a language had an "if variable is one of several constants" construct, I would expect a compiler to use the above code pattern. Too bad no such construct exists in common languages.
(note: Pascal does have such a construct, but run-time implementations are often very wasteful of both time and code space).
return x === 1 || x === 2 in javascript