How are "special" return values that signal a condition called? - error-handling

Suppose I have a function which calculates a length and returns it as a positive integer, but could also return -1 for timeout, -2 for "cannot compute", and -3 for invalid arguments.
Notwithstanding any discussion on best practices, proper exceptions, and whatnot, this occurs regularly in legacy codebases. What is the name for this practice or for return values which are outside of the normal output value range, -1 being the most common?

Exceptions vs. status returns article refers to them as return status codes:
Broadly speaking, there are two ways to handle errors as they pass
from layer to layer in software: throwing exceptions and returning
status codes...
With status returns, a valuable channel of communication (the return
value of the function) has been taken over for error handling.
Personally I would also call them status codes, similarly to HTTP status codes (if we pretend HTTP response is like a function return).
As a side note, beside exceptions and return status codes, there also exists a monadic approach to error handling, which in some sense combines the former two approaches. For example, in Scala, Either monad may be used to specify a return value that can express both an error status code and a regular happy value without having to block out part of the domain for status codes:
def divide(a: Double, b: Double): Either[String, Double] =
if (b == 0.0) Left("Division by zero") else Right(a / b)
divide(4,0)
divide(4,2)
which outputs
res0: Either[String,Double] = Left(Division by zero)
res1: Either[String,Double] = Right(2.0)

That's an example of a "magic" value. I don't know of any more specific term for when this idea is applied to a function's return value.

Related

I/O in pure Fortran procedures

I'm trying to incorporate error checking within a pure procedure I am writing. I would like something like:
pure real function func1(output_unit,a)
implicit none
integer :: a, output_unit
if (a < 0) then
write(output_unit,*) 'Error in function func1: argument must be a nonnegative integer. It is ', a
else
func1 = a/3
endif
return
end function func1
However, pure functions are not allowed to have IO statements to external files, so I tried passing a unit number to the function, e.g. output_unit = 6, which is the default output. gfortran still regards this as illegal. Is there a way around this? Is it possible to make the function a derived type (instead of intrinsic type real here) which outputs a string when there is an error?
You are not the first person to have this problem, and I'm happy to say that this flaw in the standard will be remedied in Fortran 2015. As stated in this document (page 6, header "Approved changes to the standard"), "the restriction on the appearance of an error stop statement in a pure procedure should be removed".
The Fortran 2008 standard included the error stop statement in the context of some new parallel computing features. It signals an error and makes all processes stop as soon as is practicable. Currently, neither stop nor error stop statements are allowed in pure procedures, because they're obviously not thread-safe. In practice this is unnecessarily restrictive in cases where an internal error occurs.
Depending on your compiler, you may have to wait patiently for the implementation. I know that Intel has implemented it in their ifort compiler. ("F2015: Lift restriction on STOP and ERROR STOP in PURE/ELEMENTAL procedures")
alternative
For an alternative approach, you could have a look at this question, though in you case this is probably slightly trickier as you have to change the do concurrent keyword, not just pure.
(end of proper answer)
if getting dirty hands is an option ...
In the meantime you could do something brutal like
pure subroutine internal_error(error_msg)
! Try hard to produce a runtime error, regardless of compiler flags.
! This is useful in pure subprograms where you want to produce an error,
! preferably with a traceback.
!
! Though far from pretty, this solution contains all the ugliness in this
! single subprogram.
!
! TODO: replace with ERROR STOP when supported by compiler
implicit none
character(*), intent(in) :: error_msg
integer, dimension(:), allocatable :: molested
allocate(molested(2))
allocate(molested(2))
molested(3) = molested(4)
molested(1) = -10
molested(2) = sqrt(real(molested(1)))
deallocate(molested)
deallocate(molested)
molested(3) = molested(-10)
end subroutine internal_error
Should anyone ask, you didn't get this from me.
I've found an answer myself, detailed here. It uses what is considered "obsolescent", but still does the trick; it is called alternate return. Write the procedure as a subroutine as it doesn't work on functions.
pure real subroutine procA(arg1)
implicit none
integer :: arg1
if (arg < 0) then
return 1 ! exit the function and go to the first label supplied
! when function was called. Also return 2, 3 etc.
else
procA = ... ! whatever it should do under normal circumstances
endif
endsubroutine procA
....
! later on, procedure is called
num = procA(a, *220)
220 write(6,*) 'Error with func1: you've probably supplied a negative argument'
What would probably be better is what eriktous suggested--get the procedure to return a status, perhaps as a logical value or an integer, and get the program to check this value every time after it calls the procedure. If all's well, carry on. Otherwise, print a relevant error message.
Comments welcome.

matlab subsref: {} with string argument fails, why?

There are a few implementations of a hash or dictionary class in the Mathworks File Exchange repository. All that I have looked at use parentheses overloading for key referencing, e.g.
d = Dict;
d('foo') = 'bar';
y = d('foo');
which seems a reasonable interface. It would be preferable, though, if you want to easily have dictionaries which contain other dictionaries, to use braces {} instead of parentheses, as this allows you to get around MATLAB's (arbitrary, it seems) syntax limitation that multiple parentheses are not allowed but multiple braces are allowed, i.e.
t{1}{2}{3} % is legal MATLAB
t(1)(2)(3) % is not legal MATLAB
So if you want to easily be able to nest dictionaries within dictionaries,
dict{'key1'}{'key2'}{'key3'}
as is a common idiom in Perl and is possible and frequently useful in other languages including Python, then unless you want to use n-1 intermediate variables to extract a dictionary entry n layers deep, this seems a good choice. And it would seem easy to rewrite the class's subsref and subsasgn operations to do the same thing for {} as they previously did for (), and everything should work.
Except it doesn't when I try it.
Here's my code. (I've reduced it to a minimal case. No actual dictionary is implemented here, each object has one key and one value, but this is enough to demonstrate the problem.)
classdef TestBraces < handle
properties
% not a full hash table implementation, obviously
key
value
end
methods(Access = public)
function val = subsref(obj, ref)
% Re-implement dot referencing for methods.
if strcmp(ref(1).type, '.')
% User trying to access a method
% Methods access
if ismember(ref(1).subs, methods(obj))
if length(ref) > 1
% Call with args
val = obj.(ref(1).subs)(ref(2).subs{:});
else
% No args
val = obj.(ref.subs);
end
return;
end
% User trying to access something else.
error(['Reference to non-existant property or method ''' ref.subs '''']);
end
switch ref.type
case '()'
error('() indexing not supported.');
case '{}'
theKey = ref.subs{1};
if isequal(obj.key, theKey)
val = obj.value;
else
error('key %s not found', theKey);
end
otherwise
error('Should never happen')
end
end
function obj = subsasgn(obj, ref, value)
%Dict/SUBSASGN Subscript assignment for Dict objects.
%
% See also: Dict
%
if ~strcmp(ref.type,'{}')
error('() and dot indexing for assignment not supported.');
end
% Vectorized calls not supported
if length(ref.subs) > 1
error('Dict only supports storing key/value pairs one at a time.');
end
theKey = ref.subs{1};
obj.key = theKey;
obj.value = value;
end % subsasgn
end
end
Using this code, I can assign as expected:
t = TestBraces;
t{'foo'} = 'bar'
(And it is clear that the assignment work from the default display output for t.) So subsasgn appears to work correctly.
But I can't retrieve the value (subsref doesn't work):
t{'foo'}
??? Error using ==> subsref
Too many output arguments.
The error message makes no sense to me, and a breakpoint at the first executable line of my subsref handler is never hit, so at least superficially this looks like a MATLAB issue, not a bug in my code.
Clearly string arguments to () parenthesis subscripts are allowed, since this works fine if you change the code to work with () instead of {}. (Except then you can't nest subscript operations, which is the object of the exercise.)
Either insight into what I'm doing wrong in my code, any limitations that make what I'm doing unfeasible, or alternative implementations of nested dictionaries would be appreciated.
Short answer, add this method to your class:
function n = numel(obj, varargin)
n = 1;
end
EDIT: The long answer.
Despite the way that subsref's function signature appears in the documentation, it's actually a varargout function - it can produce a variable number of output arguments. Both brace and dot indexing can produce multiple outputs, as shown here:
>> c = {1,2,3,4,5};
>> [a,b,c] = c{[1 3 5]}
a =
1
b =
3
c =
5
The number of outputs expected from subsref is determined based on the size of the indexing array. In this case, the indexing array is size 3, so there's three outputs.
Now, look again at:
t{'foo'}
What's the size of the indexing array? Also 3. MATLAB doesn't care that you intend to interpret this as a string instead of an array. It just sees that the input is size 3 and your subsref can only output 1 thing at a time. So, the arguments mismatch. Fortunately, we can correct things by changing the way that MATLAB determines how many outputs are expected by overloading numel. Quoted from the doc link:
It is important to note the significance of numel with regards to the
overloaded subsref and subsasgn functions. In the case of the
overloaded subsref function for brace and dot indexing (as described
in the last paragraph), numel is used to compute the number of
expected outputs (nargout) returned from subsref. For the overloaded
subsasgn function, numel is used to compute the number of expected
inputs (nargin) to be assigned using subsasgn. The nargin value for
the overloaded subsasgn function is the value returned by numel plus 2
(one for the variable being assigned to, and one for the structure
array of subscripts).
As a class designer, you must ensure that the value of n returned by
the built-in numel function is consistent with the class design for
that object. If n is different from either the nargout for the
overloaded subsref function or the nargin for the overloaded subsasgn
function, then you need to overload numel to return a value of n that
is consistent with the class' subsref and subsasgn functions.
Otherwise, MATLAB produces errors when calling these functions.
And there you have it.

Reliable clean-up in Mathematica

For better or worse, Mathematica provides a wealth of constructs that allow you to do non-local transfers of control, including Return, Catch/Throw, Abort and Goto. However, these kinds of non-local transfers of control often conflict with writing robust programs that need to ensure that clean-up code (like closing streams) gets run. Many languages provide ways of ensuring that clean-up code gets run in a wide variety of circumstances; Java has its finally blocks, C++ has destructors, Common Lisp has UNWIND-PROTECT, and so on.
In Mathematica, I don't know how to accomplish the same thing. I have a partial solution that looks like this:
Attributes[CleanUp] = {HoldAll};
CleanUp[body_, form_] :=
Module[{return, aborted = False},
Catch[
CheckAbort[
return = body,
aborted = True];
form;
If[aborted,
Abort[],
return],
_, (form; Throw[##]) &]];
This certainly isn't going to win any beauty contests, but it also only handles Abort and Throw. In particular, it fails in the presence of Return; I figure if you're using Goto to do this kind of non-local control in Mathematica you deserve what you get.
I don't see a good way around this. There's no CheckReturn for instance, and when you get right down to it, Return has pretty murky semantics. Is there a trick I'm missing?
EDIT: The problem with Return, and the vagueness in its definition, has to do with its interaction with conditionals (which somehow aren't "control structures" in Mathematica). An example, using my CleanUp form:
CleanUp[
If[2 == 2,
If[3 == 3,
Return["foo"]]];
Print["bar"],
Print["cleanup"]]
This will return "foo" without printing "cleanup". Likewise,
CleanUp[
baz /.
{bar :> Return["wongle"],
baz :> Return["bongle"]},
Print["cleanup"]]
will return "bongle" without printing cleanup. I don't see a way around this without tedious, error-prone and maybe impossible code-walking or somehow locally redefining Return using Block, which is heinously hacky and doesn't actually seem to work (though experimenting with it is a great way to totally wedge a kernel!)
Great question, but I don't agree that the semantics of Return are murky; They are documented in the link you provide. In short, Return exits the innermost construct (namely, a control structure or function definition) in which it is invoked.
The only case in which your CleanUp function above fails to cleanup from a Return is when you directly pass a single or CompoundExpression (e.g. (one;two;three) directly as input to it.
Return exits the function f:
In[28]:= f[] := Return["ret"]
In[29]:= CleanUp[f[], Print["cleaned"]]
During evaluation of In[29]:= cleaned
Out[29]= "ret"
Return exits x:
In[31]:= x = Return["foo"]
In[32]:= CleanUp[x, Print["cleaned"]]
During evaluation of In[32]:= cleaned
Out[32]= "foo"
Return exits the Do loop:
In[33]:= g[] := (x = 0; Do[x++; Return["blah"], {10}]; x)
In[34]:= CleanUp[g[], Print["cleaned"]]
During evaluation of In[34]:= cleaned
Out[34]= 1
Returns from the body of CleanUp at the point where body is evaluated (since CleanUp is HoldAll):
In[35]:= CleanUp[Return["ret"], Print["cleaned"]];
Out[35]= "ret"
In[36]:= CleanUp[(Print["before"]; Return["ret"]; Print["after"]),
Print["cleaned"]]
During evaluation of In[36]:= before
Out[36]= "ret"
As I noted above, the latter two examples are the only problematic cases I can contrive (although I could be wrong) but they can be handled by adding a definition to CleanUp:
In[44]:= CleanUp[CompoundExpression[before___, Return[ret_], ___], form_] :=
(before; form; ret)
In[45]:= CleanUp[Return["ret"], Print["cleaned"]]
During evaluation of In[46]:= cleaned
Out[45]= "ret"
In[46]:= CleanUp[(Print["before"]; Return["ret"]; Print["after"]),
Print["cleaned"]]
During evaluation of In[46]:= before
During evaluation of In[46]:= cleaned
Out[46]= "ret"
As you said, not going to win any beauty contests, but hopefully this helps solve your problem!
Response to your update
I would argue that using Return inside If is unnecessary, and even an abuse of Return, given that If already returns either the second or third argument based on the state of the condition in the first argument. While I realize your example is probably contrived, If[3==3, Return["Foo"]] is functionally identical to If[3==3, "foo"]
If you have a more complicated If statement, you're better off using Throw and Catch to break out of the evaluation and "return" something to the point you want it to be returned to.
That said, I realize you might not always have control over the code you have to clean up after, so you could always wrap the expression in CleanUp in a no-op control structure, such as:
ret1 = Do[ret2 = expr, {1}]
... by abusing Do to force a Return not contained within a control structure in expr to return out of the Do loop. The only tricky part (I think, not having tried this) is having to deal with two different return values above: ret1 will contain the value of an uncontained Return, but ret2 would have the value of any other evaluation of expr. There's probably a cleaner way to handle that, but I can't see it right now.
HTH!
Pillsy's later version of CleanUp is a good one. At the risk of being pedantic, I must point out a troublesome use case:
Catch[CleanUp[Throw[23], Print["cleanup"]]]
The problem is due to the fact that one cannot explicitly specify a tag pattern for Catch that will match an untagged Throw.
The following version of CleanUp addresses that problem:
SetAttributes[CleanUp, HoldAll]
CleanUp[expr_, cleanup_] :=
Module[{exprFn, result, abort = False, rethrow = True, seq},
exprFn[] := expr;
result = CheckAbort[
Catch[
Catch[result = exprFn[]; rethrow = False; result],
_,
seq[##]&
],
abort = True
];
cleanup;
If[abort, Abort[]];
If[rethrow, Throw[result /. seq -> Sequence]];
result
]
Alas, this code is even less likely to be competitive in a beauty contest. Furthermore, it wouldn't surprise me if someone jumped in with yet another non-local control flow that that this code will not handle. Even in the unlikely event that it handles all possible cases now, problematic cases could be introduced in Mathematica X (where X > 7.01).
I fear that there cannot be a definitive answer to this problem until Wolfram introduces a new control structure expressly for this purpose. UnwindProtect would be a fine name for such a facility.
Michael Pilat provided the key trick for "catching" returns, but I ended up using it in a slightly different way, using the fact that Return forces the return value of a named function as well as control structures like Do. I made the expression that is being cleaned up after into the down-value of a local symbol, like so:
Attributes[CleanUp] = {HoldAll};
CleanUp[expr_, form_] :=
Module[{body, value, aborted = False},
body[] := expr;
Catch[
CheckAbort[
value = body[],
aborted = True];
form;
If[aborted,
Abort[],
value],
_, (form; Throw[##]) &]];

Why ifTrue and ifFalse are not separated by ; in Smalltalk?

a > b
ifTrue:[ 'greater' ]
ifFalse:[ 'less or equal' ]
My understanding is that Boolean a > b receives the message ifTrue:[ 'greater' ], and then ifFalse:[ 'less or equal' ] complying to the generalization:
objectInstance selector; selector2
But there a semicolon is needed to specify that the receiver of selector2 is not (objectInstance selector) but objectInstance. Is not the same with the above conditional execution?
The selector of the method is Boolean>>ifTrue:ifFalse:, which means it is one method with two parameters, not two methods with one parameter.
Ergo, to invoke the method, you send it the message ifTrue:ifFalse: with two block arguments.
Note that for convenience reasons, there are also methods Boolean>>ifFalse:ifTrue:, Boolean>>ifTrue: and Boolean>>ifFalse:.
Everything relevant has already been sayd, but just for your amusement:
As already told,
rcvr ifTrue:[...] ifFalse:[...]
is the one and single message #'ifTrue:ifFalse:' with 2 args sent to rcvr. The value of that expression is the one from that message send.
In contrast:
rcvr ifTrue:[...]; ifFalse:[...]
is a cascade of 2 sequential messages (#'ifTrue:' and #'ifFalse:'), each with 1 arg sent to rcvr. The value of the expression is the one returned from the last send.
Now the funny thing is that booleans do understand ifTrue: / ifFalse: (each with 1 arg),
so your code works for the side effect (evaluating those blocks), but not for its value.
This means that:
a > b ifTrue:[Transcript showCR:'gt'] ; ifFalse:[Transcript showCR:'le']
generates the same output as:
a > b ifTrue:[Transcript showCR:'gt'] ifFalse:[Transcript showCR:'le']
but:
msg := a > b ifTrue:['gt'] ; ifFalse:['le']
will generate different values in msg than:
msg := a > b ifTrue:['gt'] ifFalse:['le']
depending on the values of a and b. Try (a b)=(1 2) vs. (a b)=(2 1)...
The problem of many Smalltalk beginners is that they think of ifXXX: as syntax, where it is actually a message send which generates value. Also, the semi is not a statement separator as in many previously learned languages, but a sequencing message send construct.
A bad trap for beginners, because the code seems to work for some particular value combinations, whereas it generates funny results for others.
Let's hope your unit tests cover these ;-)
edit: to see where the bad value comes from, take a look at what is returned by the Boolean >> ifFalse: method for a true receiver...

== Operator and operands

I want to check whether a value is equal to 1. Is there any difference in the following lines of code
Evaluated value == 1
1 == evaluated value
in terms of the compiler execution
In most languages it's the same thing.
People often do 1 == evaluated value because 1 is not an lvalue. Meaning that you can't accidentally do an assignment.
Example:
if(x = 6)//bug, but no compiling error
{
}
Instead you could force a compiling error instead of a bug:
if(6 = x)//compiling error
{
}
Now if x is not of int type, and you're using something like C++, then the user could have created an operator==(int) override which takes this question to a new meaning. The 6 == x wouldn't compile in that case but the x == 6 would.
It depends on the programming language.
In Ruby, Smalltalk, Self, Newspeak, Ioke and many other single-dispatch object-oriented programming languages, a == b is actually a message send. In Ruby, for example, it is equivalent to a.==(b). What this means, is that when you write a == b, then the method == in the class of a is executed, but when you write b == a, then the method in the class of b is executed. So, it's obviously not the same thing:
class A; def ==(other) false end; end
class B; def ==(other) true end; end
a, b = A.new, B.new
p a == b # => false
p b == a # => true
No, but the latter syntax will give you a compiler error if you accidentally type
if (1 = evaluatedValue)
Note that today any decent compiler will warn you if you write
if (evaluatedValue = 1)
so it is mostly relevant for historical reasons.
Depends on the language.
In Prolog or Erlang, == is written = and is a unification rather than an assignment (you're asserting that the values are equal, rather then testing that they are equal or forcing them to be equal), so you can use it for an assertion if the left hand side is a constant, as explained here.
So X = 3 would unify the variable X and the value 3, whereas 3 = X would attempt to unify the constant 3 with the current value of X, and be equivalent of assert(x==3) in imperative languages.
It's the same thing
In general, it hardly matters whether you use,
Evaluated value == 1 OR 1 == evaluated value.
Use whichever appears more readable to you. I prefer if(Evaluated value == 1) because it looks more readable to me.
And again, I'd like to quote a well known scenario of string comparison in java.
Consider a String str which you have to compare with say another string "SomeString".
str = getValueFromSomeRoutine();
Now at runtime, you are not sure if str would be NULL. So to avoid exception you'll write
if(str!=NULL)
{
if(str.equals("SomeString")
{
//do stuff
}
}
to avoid the outer null check you could just write
if ("SomeString".equals(str))
{
//do stuff
}
Though this is less readable which again depends on the context, this saves you an extra if.
For this and similar questions can I suggest you find out for yourself by writing a little code, running it through your compiler and viewing the emitted asembler output.
For example, for the GNU compilers, you do this with the -S flag. For the VS compilers, the most convenient route is to run your test program in the debugger and then use the assembeler debugger view.
Sometimes in C++ they do different things, if the evaluated value is a user type and operator== is defined. Badly.
But that's very rarely the reason anyone would choose one way around over the other: if operator== is not commutative/symmetric, including if the type of the value has a conversion from int, then you have A Problem that probably wants fixing rather than working around. Brian R. Bondy's answer, and others, are probably on the mark for why anyone worries about it in practice.
But the fact remains that even if operator== is commutative, the compiler might not do exactly the same thing in each case. It will (by definition) return the same result, but it might do things in a slightly different order, or whatever.
if value == 1
if 1 == value
Is exactly the same, but if you accidentally do
if value = 1
if 1 = value
The first one will work while the 2nd one will produce an error.
They are the same. Some people prefer putting the 1 first, to void accidentally falling into the trap of typing
evaluated value = 1
which could be painful if the value on the left hand side is assignable. This is a common "defensive" pattern in C, for instance.
In C languages it's common to put the constant or magic number first so that if you forget one of the "=" of the equality check (==) then the compiler won't interpret this as an assignment.
In java, you cannot do an assignment within a boolean expression, and so for Java, it is irrelevant which order the equality operands are written in; The compiler should flag an error anyway.