Passed array with more elements that expected in subroutine - api

I have a subroutine in a shared library:
SUBROUTINE DLLSUBR(ARR)
IMPLICIT NONE
INTEGER, PARAMETER :: N = 2
REAL ARR(0:N)
arr(0) = 0
arr(1) = 1
arr(2) = 2
END
And let's assume I will call it from executable by:
REAL ARR(0:3)
CALL DLLSUBR(ARR)
Note: The code happily compiles and runs (DLLSUBR is inside a module) without any warning or error in Debug + /check:all option switched on.
Could this lead to memory corruption or some weird behaviour? Where I can find info about passing array with different size in the Fortran specification?

It is actually allowed for explicit shape arrays by the rules of sequence association, if you make the dummy argument element count to be smaller or equal. It is prohibited when the subroutine expects more elements then it gets.
The explicit shape arrays often require the arguments to be passed by a copy. This happens when the compiler cannot prove the array is contiguous (a pointer or an assumed shape array dummy argument). If smaller number of elements was passed, the subroutine could then access some garbage after the copy of the portion of the array.
In your case everything will be OK, because you are passing more to a subroutine expecting less.
Fortran 2008 12.5.2.11.4:
4 An actual argument that represents an element sequence and
corresponds to a dummy argument that is an array is sequence
associated with the dummy argument if the dummy argument is an
explicit-shape or assumed-size array. The rank and shape of the actual
argument need not agree with the rank and shape of the dummy argument,
but the number of elements in the dummy argument shall not exceed the
number of elements in the element sequence of the actual argument. If
the dummy argument is assumed-size, the number of elements in the
dummy argument is exactly the number of elements in the element
sequence.

Related

Fortran arrays argument [duplicate]

I'm going through a Fortran code, and one bit has me a little puzzled.
There is a subroutine, say
SUBROUTINE SSUB(X,...)
REAL*8 X(0:N1,1:N2,0:N3-1),...
...
RETURN
END
Which is called in another subroutine by:
CALL SSUB(W(0,1,0,1),...)
where W is a 'working array'. It appears that a specific value from W is passed to the X, however, X is dimensioned as an array. What's going on?
This is non-uncommon idiom for getting the subroutine to work on a (rectangular in N-dimensions) subset of the original array.
All parameters in Fortran (at least before Fortran 90) are passed by reference, so the actual array argument is resolved as a location in memory. Choose a location inside the space allocated for the whole array, and the subroutine manipulates only part of the array.
Biggest issue: you have to be aware of how the array is laid out in memory and how Fortran's array indexing scheme works. Fortran uses column major array ordering which is the opposite convention from c. Consider an array that is 5x5 in size (and index both directions from 0 to make the comparison with c easier). In both languages 0,0 is the first element in memory. In c the next element in memory is [0][1] but in Fortran it is (1,0). This affects which indexes you drop when choosing a subspace: if the original array is A(i,j,k,l), and the subroutine works on a three dimensional subspace (as in your example), in c it works on Aprime[i=constant][j][k][l], but in Fortran in works on Aprime(i,j,k,l=constant).
The other risk is wrap around. The dimensions of the (sub)array in the subroutine have to match those in the calling routine, or strange, strange things will happen (think about it). So if A is declared of size (0:4,0:5,0:6,0:7), and we call with element A(0,1,0,1), the receiving routine is free to start the index of each dimension where ever it likes, but must make the sizes (4,5,6) or else; but that means that the last element in the j direction actually wraps around! The thing to do about this is not use the last element. Making sure that that happens is the programmers job, and is a pain in the butt. Take care. Lots of care.
in fortran variables are passed by address.
So W(0,1,0,1) is value and address. so basically you pass subarray starting at W(0,1,0,1).
This is called "sequence association". In this case, what appears to be a scaler, an element of an array (actual argument in caller) is associated with an array (implicitly the first element), the dummy argument in the subroutine . Thereafter the elements of the arrays are associated by storage order, known as "sequence". This was done in Fortran 77 and earlier for various reasons, here apparently for a workspace array -- perhaps the programmer was doing their own memory management. This is retained in Fortran >=90 for backwards compatibility, but IMO, doesn't belong in new code.

Assign a Seq(Seq) into arrays

What is it the correct syntax to assign a Seq(Seq) into multiple typed arrays without assign the Seq to an scalar first? Has the Seq to be flattened somehow? This fails:
class A { has Int $.r }
my A (#ra1, #ra2);
#create two arrays with 5 random numbers below a certain limit
#Fails: Type check failed in assignment to #ra1; expected A but got Seq($((A.new(r => 3), A.n...)
(#ra1, #ra2) =
<10 20>.map( -> $up_limit {
(^5).map({A.new( r => (^$up_limit).pick ) })
});
TL;DR Binding is faster than assignment, so perhaps this is the best practice solution to your problem:
:(#ra1, #ra2) := <10 20>.map(...);
While uglier than the solution in the accepted answer, this is algorithmically faster because binding is O(1) in contrast to assignment's O(N) in the length of the list(s) being bound.
Assigning / copying
Simplifying, your non-working code is:
(#listvar1, #listvar2) = list1, list2;
In Raku infix = means assignment / copying from the right of the = into one or more of the container variables on the left of the =.
If a variable on the left is bound to a Scalar container, then it will assign one of the values on the right. Then the assignment process starts over with the next container variable on the left and the next value on the right.
If a variable on the left is bound to an Array container, then it uses up all remaining values on the right. So your first array variable receives both list1 and list2. This is not what you want.
Simplifying, here's Christoph's answer:
#listvar1, #listvar2 Z= list1, list2;
Putting the = aside for a moment, Z is an infix version of the zip routine. It's like (a physical zip pairing up consecutive arguments on its left and right. When used with an operator it applies that operator to the pair. So you can read the above Z= as:
#listvar1 = list1;
#listvar2 = list2;
Job done?
Assignment into Array containers entails:
Individually copying as many individual items as there are in each list into the containers. (In the code in your example list1 and list2 contain 5 elements each, so there would be 10 copying operations in total.)
Forcing the containers to resize as necessary to accommodate the items.
Doubling up the memory used by the items (the original list elements and the duplicates copied into the Array elements).
Checking that the type of each item matches the element type constraint.
Assignment is in general much slower and more memory intensive than binding...
Binding
:(#listvar1, #listvar2) := list1, list2;
The := operator binds whatever's on its left to the arguments on its right.
If there's a single variable on the left then things are especially simple. After binding, the variable now refers precisely to what's on the right. (This is especially simple and fast -- a quick type check and it's done.)
But that's not so in our case.
Binding also accepts a standalone signature literal on its left. The :(...) in my answer is a standalone Signature literal.
(Signatures are typically attached to a routine without the colon prefix. For example, in sub foo (#var1, #var2) {} the (#var1, #var2) part is a signature attached to the routine foo. But as you can see, one can write a signature separately and let Raku know it's a signature by prefixing a pair of parens with a colon. A key difference is that any variables listed in the signature must have already been declared.)
When there's a signature literal on the left then binding happens according to the same logic as binding arguments in routine calls to a receiving routine's signature.
So the net result is that the variables get the values they'd have inside this sub:
sub foo (#listvar1, #listvar2) { }
foo list1, list2;
which is to say the effect is the same as:
#listvar1 := list1;
#listvar2 := list2;
Again, as with Christoph's answer, job done.
But this way we'll have avoided assignment overhead.
Not entirely sure if it's by design, but what seems to happen is that both of your sequences are getting stored into #ra1, while #ra2 remains empty. This violates the type constraint.
What does work is
#ra1, #ra2 Z= <10 20>.map(...);

Perl 6 reports "Cannot unbox a type object" when typing an array

I suspect this may be a bug in Rakudo, but I just started playing with Perl 6 today, so there's a good chance I'm just making a mistake. In this simple program, declaring a typed array inside a sub appears to make the Perl 6 compiler angry. Removing the type annotation on the array gets rid of the compiler error.
Here's a simple prime number finding program:
#!/usr/bin/env perl6
use v6;
sub primes(int $max) {
my int #vals = ^$max; # forcing a type on vals causes compiler error (bug?)
for 2..floor(sqrt($max)) -> $i {
next if not #vals[$i];
#vals[2*$i, 3*$i ... $max-1] = 0;
}
return ($_ if .Bool for #vals)[1..*];
}
say primes(1000);
On Rakudo Star 2016.07.1 (from the Fedora 24 repos), this program gives the following error:
[sultan#localhost p6test]$ perl6 primes.p6
Cannot unbox a type object
in sub primes at primes.p6 line 8
in block <unit> at primes.p6 line 13
If I remove the type annotation on the vals array, the program works correctly:
...
my #vals = ^$max; # I removed the int type
...
Am I making a mistake in my usage of Perl 6, or is this a bug in Rakudo?
There's a potential error in your code that's caught by type checking
The error message you got draws attention to line 8:
#vals[2*$i, 3*$i ... $max-1] = 0;
This line assigns the list of values on the right of the = to the list of elements on the left.
The first element in the list on the left, #vals[2*$i], gets a zero.
You didn't define any more values on the right so the rest of the elements on the left are assigned a Mu. Mus work nicely as placeholders for elements that do not have a specific type and do not have a specific value. Think of a Mu as being, among other things, like a Null, except that it's type safe.
You get the same scenario with this golfed version:
my #vals;
#vals[0,1] = 0; # assigns 0 to #vals[0], Mu to #vals[1]
As you've seen, everything works fine when you do not specify an explicit type constraint for the elements of the #vals array.
This is because the default type constraint for array elements is Mu. So assigning a Mu to an element is fine.
If you felt it tightened up your code you could explicitly assign zeroes:
#vals[2*$i, 3*$i ... $max-1] = 0 xx Inf;
This generates a (lazy) infinite list of zeroes on the RHS so that zero is assigned to each of the list of elements on the LHS.
With just this change your code will work even if you specify a type constraint for #vals.
If you don't introduce the xx Inf but do specify an element type constraint for #vals that isn't Mu, then your code will fail a type check if you attempt to assign a Mu to an element of #vals.
The type check failure will come in one of two flavors depending on whether you're using object types or native types.
If you specify an object type constraint (eg Int):
my Int #vals;
#vals[0,1] = 0;
then you get an error something like this:
Type check failed in assignment to #vals; expected Int but got Mu (Mu)
If you specify a native type constraint (eg int rather than Int):
my int #vals;
#vals[0,1] = 0;
then the compiler first tries to produce a suitable native value from the object value (this is called "unboxing") before attempting a type check. But there is no suitable native value corresponding to the object value (Mu). So the compiler complains that it can not even unbox the value. Finally, as hinted at at the start, while Mu works great as a type safe Null, that's just one facet of Mu. Another is that it's a "type object". So the error message is Cannot unbox a type object.

Resizing matrix using pointer attribute

I have a Fortran program which uses a routine in a module to resize a matrix like:
module resizemod
contains
subroutine ResizeMatrix(A,newSize,lindx)
integer,dimension(:,:),intent(inout),pointer :: A
integer,intent(in) :: newSize(2)
integer,dimension(:,:),allocatable :: B
integer,optional,intent(in) :: lindx(2)
integer :: i,j
allocate(B(lbound(A,1):ubound(A,1),lbound(A,2):ubound(A,2)))
forall (i=lbound(A,1):ubound(A,1),j=lbound(A,2):ubound(A,2))
B(i,j)=A(i,j)
end forall
if (associated(A)) deallocate(A)
if (present(lindx)) then
allocate(A(lindx(1):lindx(1)+newSize(1)-1,lindx(2):lindx(2)+newSize(2)-1))
else
allocate(A(newSize(1),newSize(2)))
end if
do i=lbound(B,1),ubound(B,1)
do j=lbound(B,2), ubound(B,2)
A(i,j)=B(i,j)
end do
end do
deallocate(B)
end subroutine ResizeMatrix
end module resizemod
The main program looks like:
program resize
use :: resizemod
implicit none
integer,pointer :: mtest(:,:)
allocate(mtest(0:1,3))
mtest(0,:)=[1,2,3]
mtest(1,:)=[1,4,5]
call ResizeMatrix(mtest,[3,3],lindx=[0,1])
mtest(2,:)=0
print *,mtest(0,:)
print *,mtest(1,:)
print *,mtest(2,:)
end program resize
I use ifort 14.0 to compile the codes. The issue that I am facing is that sometimes I don't get the desired result:
1 0 0
1 0 5
0 0 -677609912
Actually I couldn't reproduce the issue (which is present in my original program) using the minimal test codes. But the point that I noticed was that when I remove the compiler option -fast, this problem disappears.
Then my question would be
If the pieces of code that I use are completely legal?
If any other method for resizing the matrices would be recommended which is better than the one presented in here?
The relevance of the described issue and the compiler option "-fast".
If I've read the code right it's legal but incorrect. In your example you've resized a 2x3 array into 3x3 but the routine ResizeMatrix doesn't do anything to set the values of the extra elements. The strange values you see, such as -677609912, are the interpretation, as integers. of whatever bits were lying around in memory when the memory location corresponding to the unset array element was read (so that it's value could be written out).
The relevance of -fast is that it is common for compilers in debug or low-optimisation modes, to zero-out memory locations but not to bother when higher optimisation is switched on. The program is legal in the sense that it contains no compiler-determinable syntax errors. But it is incorrect in the sense that reading a variable whose value has not been initialised or assigned is not something you regularly ought to do; doing so makes your program, essentially, non-deterministic.
As for your question 2, it raises the possibility that you are not familiar with the intrinsic functions reshape or (F2003) move_alloc. The latter is almost certainly what you want, the former may help too.
As an aside: these days I rarely use pointer on arrays, allocatable is much more useful and generally easier and safer too. But you may have requirements of which I wot not.

matlab subsref: {} with string argument fails, why?

There are a few implementations of a hash or dictionary class in the Mathworks File Exchange repository. All that I have looked at use parentheses overloading for key referencing, e.g.
d = Dict;
d('foo') = 'bar';
y = d('foo');
which seems a reasonable interface. It would be preferable, though, if you want to easily have dictionaries which contain other dictionaries, to use braces {} instead of parentheses, as this allows you to get around MATLAB's (arbitrary, it seems) syntax limitation that multiple parentheses are not allowed but multiple braces are allowed, i.e.
t{1}{2}{3} % is legal MATLAB
t(1)(2)(3) % is not legal MATLAB
So if you want to easily be able to nest dictionaries within dictionaries,
dict{'key1'}{'key2'}{'key3'}
as is a common idiom in Perl and is possible and frequently useful in other languages including Python, then unless you want to use n-1 intermediate variables to extract a dictionary entry n layers deep, this seems a good choice. And it would seem easy to rewrite the class's subsref and subsasgn operations to do the same thing for {} as they previously did for (), and everything should work.
Except it doesn't when I try it.
Here's my code. (I've reduced it to a minimal case. No actual dictionary is implemented here, each object has one key and one value, but this is enough to demonstrate the problem.)
classdef TestBraces < handle
properties
% not a full hash table implementation, obviously
key
value
end
methods(Access = public)
function val = subsref(obj, ref)
% Re-implement dot referencing for methods.
if strcmp(ref(1).type, '.')
% User trying to access a method
% Methods access
if ismember(ref(1).subs, methods(obj))
if length(ref) > 1
% Call with args
val = obj.(ref(1).subs)(ref(2).subs{:});
else
% No args
val = obj.(ref.subs);
end
return;
end
% User trying to access something else.
error(['Reference to non-existant property or method ''' ref.subs '''']);
end
switch ref.type
case '()'
error('() indexing not supported.');
case '{}'
theKey = ref.subs{1};
if isequal(obj.key, theKey)
val = obj.value;
else
error('key %s not found', theKey);
end
otherwise
error('Should never happen')
end
end
function obj = subsasgn(obj, ref, value)
%Dict/SUBSASGN Subscript assignment for Dict objects.
%
% See also: Dict
%
if ~strcmp(ref.type,'{}')
error('() and dot indexing for assignment not supported.');
end
% Vectorized calls not supported
if length(ref.subs) > 1
error('Dict only supports storing key/value pairs one at a time.');
end
theKey = ref.subs{1};
obj.key = theKey;
obj.value = value;
end % subsasgn
end
end
Using this code, I can assign as expected:
t = TestBraces;
t{'foo'} = 'bar'
(And it is clear that the assignment work from the default display output for t.) So subsasgn appears to work correctly.
But I can't retrieve the value (subsref doesn't work):
t{'foo'}
??? Error using ==> subsref
Too many output arguments.
The error message makes no sense to me, and a breakpoint at the first executable line of my subsref handler is never hit, so at least superficially this looks like a MATLAB issue, not a bug in my code.
Clearly string arguments to () parenthesis subscripts are allowed, since this works fine if you change the code to work with () instead of {}. (Except then you can't nest subscript operations, which is the object of the exercise.)
Either insight into what I'm doing wrong in my code, any limitations that make what I'm doing unfeasible, or alternative implementations of nested dictionaries would be appreciated.
Short answer, add this method to your class:
function n = numel(obj, varargin)
n = 1;
end
EDIT: The long answer.
Despite the way that subsref's function signature appears in the documentation, it's actually a varargout function - it can produce a variable number of output arguments. Both brace and dot indexing can produce multiple outputs, as shown here:
>> c = {1,2,3,4,5};
>> [a,b,c] = c{[1 3 5]}
a =
1
b =
3
c =
5
The number of outputs expected from subsref is determined based on the size of the indexing array. In this case, the indexing array is size 3, so there's three outputs.
Now, look again at:
t{'foo'}
What's the size of the indexing array? Also 3. MATLAB doesn't care that you intend to interpret this as a string instead of an array. It just sees that the input is size 3 and your subsref can only output 1 thing at a time. So, the arguments mismatch. Fortunately, we can correct things by changing the way that MATLAB determines how many outputs are expected by overloading numel. Quoted from the doc link:
It is important to note the significance of numel with regards to the
overloaded subsref and subsasgn functions. In the case of the
overloaded subsref function for brace and dot indexing (as described
in the last paragraph), numel is used to compute the number of
expected outputs (nargout) returned from subsref. For the overloaded
subsasgn function, numel is used to compute the number of expected
inputs (nargin) to be assigned using subsasgn. The nargin value for
the overloaded subsasgn function is the value returned by numel plus 2
(one for the variable being assigned to, and one for the structure
array of subscripts).
As a class designer, you must ensure that the value of n returned by
the built-in numel function is consistent with the class design for
that object. If n is different from either the nargout for the
overloaded subsref function or the nargin for the overloaded subsasgn
function, then you need to overload numel to return a value of n that
is consistent with the class' subsref and subsasgn functions.
Otherwise, MATLAB produces errors when calling these functions.
And there you have it.