Fortran Functions with a pointer result in a normal assignment - oop

After some discussion on the question found here, Correct execution of Final routine in Fortran, I thought it would be useful to know when a function with a pointer result is appropriate to use with a normal or a pointer assignment. For example, given this simple function
function pointer_result(this)
implicit none
type(test_type), intent(in), pointer :: this
type(test_type), pointer :: pointer_result
allocate(pointer_result)
end function
I would normally do test=>pointer_result(test), where test has been declared with the pointer attribute. While the normal assignment test=pointer_result(test) is legal, it means something different.
What does the normal assignment imply compared to the pointer assignment?
When does it make sense to use one or the other assignment?

A normal assignment
test = pointer_result()
means that the value of the current target of test will be overwritten by the value pointed to by the resulting pointer. If test points to some invalid address (is undefined or null) the program will crash or produce undefined results. The anonymous target allocated by the function will have no pointer to it any more and the memory will be leaked.
There is hardly any legitimate use for this, but it is likely to happen when one makes a typo and writes = instead of =>. It is a very easy mistake to make, and several style guides recommend never using pointer functions.
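For readers more used to C, the difference can be sketched roughly as follows. This is only an analogy, not Fortran semantics: test_type, live_objects, release, normal_assignment and pointer_assignment are hypothetical names made up for the illustration, and the counter exists only to make the leak visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type standing in for test_type. */
typedef struct { int value; } test_type;

/* Number of objects created by pointer_result() that are still allocated. */
static int live_objects = 0;

/* Analogue of the Fortran function: allocates an anonymous target. */
static test_type *pointer_result(void) {
    test_type *p = malloc(sizeof *p);
    assert(p != NULL);
    p->value = 42;
    live_objects++;
    return p;
}

/* Frees an object obtained from pointer_result(). */
static void release(test_type *p) {
    free(p);
    live_objects--;
}

/* Roughly "test = pointer_result()": the value is copied into the
   existing target. Nothing refers to the new object afterwards,
   so it can never be released. */
static void normal_assignment(test_type *test) {
    *test = *pointer_result();
}

/* Roughly "test => pointer_result()": the pointer is re-bound to the
   new target, so the caller can still release it later. */
static void pointer_assignment(test_type **test) {
    *test = pointer_result();
}
```

After normal_assignment the counter stays at one with no way to bring it back down, which is the leak described above. After pointer_assignment the caller still holds the pointer and can call release.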

Related

Fortran constructor returning pointer to allocated object

In this question: Fortran Functions with a pointer result in a normal assignment, it is stated that functions returning pointers are not recommended.
My question concerns constructors of user defined types. Consider the code below:
program PointTest
use PointMod, only: PointType
implicit none
class(PointType), allocatable :: TypeObject
TypeObject = PointType(10)
end program PointTest
module PointMod
implicit none
type PointType
real(8), dimension(:), allocatable :: array
contains
final :: Finalizer
end type PointType
interface PointType
procedure NewPointType
end interface PointType
contains
function NewPointType(n) result(TypePointer)
implicit none
integer, intent(in) :: n
type(PointType), pointer :: TypePointer
allocate(TypePointer)
allocate(TypePointer%array(n))
end function NewPointType
subroutine Finalizer(this)
implicit none
type(PointType) :: this
print *, 'Finalizer called'
end subroutine Finalizer
end module PointMod
In the code, I have defined a type with a constructor that allocates the object and then allocates an array in the object. It then returns a pointer to the object.
If the constructor just returned the object, the object and the array would be copied and then deallocated (at least with standard-compliant compilers). This could cause overhead and interfere with our memory tracking.
Compiling the above code with ifort gives no warnings with -warn all (except unused variable in the finalizer) and the code behaves the way I expect. It also works fine with gfortran, except I get a warning when using -Wall
TypeObject = PointType(10)
1
Warning: POINTER-valued function appears on right-hand side of assignment at (1) [-Wsurprising]
What are the risks of using constructors like these? As far as I can tell, there will be no dangling pointers and we will have more control on when objects are allocated. One workaround that would achieve the same result is to explicitly allocate the object and turn the constructor into a subroutine that sets variables and does the allocation of array, but it looks a lot less elegant. Are there other solutions? Our code is in the Fortran 2008 standard.
Do not use pointer-valued functions. As a rule, I never write functions that return pointers. They are bad and confusing, and they lead to nasty bugs, especially when one confuses => and =.
What the function does is allocate a new object and create a pointer that points to that object.
What
TypeObject = PointType(10)
does is copy the value of the object that the pointer points to. The pointer is then forgotten, and the memory it pointed to is leaked and lost forever.
You write "As far as I can tell, there will be no dangling pointers and we will have more control on when objects are allocated." However, I do not see a way to avoid the dangling pointer allocated inside the function. Not even a finalizer can help here. I also do not see how you have more control. The memory you explicitly allocated is just lost. You have a different memory for TypeObject (likely on the main program's stack) and the array inside the type will get allocated again during the copy at the intrinsic assignment TypeObject = PointType(10).
The finalizer could take care of the array component so the array allocated inside the function does not have to be lost. However, the type itself, to which the pointer TypePointer points, with its non-allocatable non-pointer components and descriptors and so on, cannot be deallocated from the finalizer and will remain dangling and the memory will be leaked.
Do not be afraid of functions that return objects as values. That is not a problem. Compilers are smart and are able to optimize away an unnecessary copy. A compiler can often see that you are just assigning the function result, so it can use the memory location of the assignment target for the function result variable (if it does not have to be allocatable).
Many other optimizations exist.
function NewPointType(n) result(TypePointer)
integer, intent(in) :: n
type(PointType) :: TypePointer
allocate(TypePointer%array(n))
end function NewPointType
is simpler and should work just fine. With optimizations it could even be faster. If using a non-pointer non-allocatable result is not possible, use allocatable. Do not use pointers for function results.

What is the required syntax to pass a block to a pure C function?

I have a pure C function, to which I would like to pass a block (a closure?). As per Apple, the block should always be the last parameter to a function.
double pureCfunctionWithABlockParameter( int ignore, double ignore2, void (^myVoidBlockWithoutParameters)(void) ) {
myVoidBlockWithoutParameters();
return 0.0;
}
Next is the Objective C code to call the C function:
- (void) testBlockFunctionality {
declare and define the block:
void (^myBlock1)(void) ;
myBlock1=^(void){ NSLog(@"myBlock1 just logs this message to the console");};
Attempt to invoke the block directly, without parentheses. This doesn't work. Xcode warns result is unused. Block's message is NOT logged to console.
myBlock1;
Now attempt to invoke the block directly, this time with parentheses. This works as intended. No Xcode warnings, and the block's message IS logged to console.
myBlock1();
Now call the function, passing the block as parameter, WITHOUT parentheses. This works as intended, but the syntax is not consistent with the previous invocation of the block.
double someNumber;
someNumber= pureCfunctionWithABlockParameter(0, 1, myBlock1 );
Now call the function, again passing the block as a parameter, this time WITH parentheses. This doesn't work, it won't even compile, as Xcode gives a: "Passing 'void' to parameter of incompatible type 'void (^)(void)'" message.
someNumber= pureCfunctionWithABlockParameter(0, 1, myBlock1());
At the end of it all, I am actually looking to have a block defined that gets passed an int parameter, like this:
void(^block)(int)
But I cannot progress to that because of what I think is a syntax issue.
I've looked in Apple's Block Programming Topics, and even K&R C, but no luck.
The question has caused some confusion, because blocks (in the question's sense) are not a feature of standard C. Apple added them as an extension to its C and C++ compilers when it added them to Objective C, but they are not a C thing outside the Apple ecosystem. I confess that I've no experience actually using them, but as far as I can tell from the docs, such as these, the syntax was chosen so as to be the same for C, C++, and Objective C. Indeed, some sources claim that details of the syntax were chosen specifically to avoid the possibility of conflict with C++.
From a C perspective, accepting a block as a parameter and calling a block received that way are thoroughly analogous to accepting a function pointer and calling the pointed-to function, respectively. Your example C function appears to be correct.
The same applies to declaring and working with blocks, in all three languages -- it is analogous to declaring and working with function pointers. I am confident that this was an intentional design consideration. Thus
void (^myBlock1)(void) ;
indeed declares myBlock1 as a block taking no parameters and returning nothing, but does not define its value. Having elsewhere set a valid value for it, such as is demonstrated in the question, the OP observes
Attempt to invoke the block directly, without parentheses. This
doesn't work. Xcode warns result is unused. Block's message is NOT
logged to console.
myBlock1;
, as indeed should be expected. That's a statement expression evaluating to the value of the block, not to the result of executing the block. It is analogous to
int myInt = 1;
myInt; // specifically, analogous to this
To execute a block, one provides a postfix argument list in parentheses (even if the list is empty), just like when calling a function through a function pointer:
Now attempt to invoke the block directly, this time with parentheses.
This works as intended. No Xcode warnings, and the block's message IS
logged to console.
myBlock1();
The presence or absence of an argument list is what disambiguates whether one is accessing the block's value or calling it.
The confusion is about passing a block to a function (or method):
Now call the function, passing the block as parameter, WITHOUT
parentheses. This works as intended, but the syntax is not consistent
with the previous invocation of the block.
double someNumber;
someNumber= pureCfunctionWithABlockParameter(0, 1, myBlock1 );
Yet, contrary to the assertion in the question, that syntax is completely consistent, both internally with other aspects of block syntax and usage, and with analogous function pointer syntax and usage. That passes the block to the function, identifying the block by its name. The block itself is passed, not the result of executing it, because no argument list for it is provided.
At the end of it all, I am actually looking to have a block defined
that gets passed an int parameter, like this:
void (^block)(int)
But I cannot progress to that because of what I think is a syntax
issue.
A C function accepting and using such a block might look like this
void pass_2(void (^do_something)(int)) {
do_something(2);
}
Given variable block declared as shown above, and assigned a valid block as its value, that function could be called like so:
pass_2(block);
Just as we recognize that function pass_2 is called by the presence of an argument list, we recognize that the value of variable block is passed as an argument -- not called -- by the absence of an argument list.
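Since blocks are an extension, the same distinction can be sketched in standard C with the function pointers that the analogy refers to. This is a minimal illustration rather than block code: remember, last_value, pass_2_fp and demo are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Records the last value received, so the effect of a call is observable. */
static int last_value = 0;

static void remember(int x) {
    last_value = x;
}

/* Function-pointer counterpart of pass_2: receives a callable, then calls it. */
static void pass_2_fp(void (*do_something)(int)) {
    assert(do_something != NULL);
    do_something(2);                   /* argument list present: this is a call */
}

static void demo(void) {
    void (*callback)(int) = remember;  /* no argument list: just the pointer value */
    (void)callback;                    /* evaluates the pointer, calls nothing */
    pass_2_fp(callback);               /* passes the pointer itself, uncalled */
}
```

Only the do_something(2) line inside pass_2_fp runs remember. Naming callback on its own, or passing it as an argument, merely evaluates the pointer, which mirrors what happens with myBlock1; and pass_2(block).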

Why Microsoft CRT is so permissive regarding a BSTR double free

This is a simplified version of the question I asked here. I'm using VS2010 (CRT v100) and it doesn't complain, in any way ever, when I double free a BSTR.
BSTR s1=SysAllocString(L"test");
SysFreeString(s1);
SysFreeString(s1);
Ok, the question is highly hypothetical (actually, the answer is :).
SysFreeString takes a BSTR, which is a pointer, which actually is a number which has a specific semantic. This means that you can provide any value as an argument to the function, not just a valid BSTR or a BSTR which was valid moments ago. In order for SysFreeString to recognize invalid values, it would need to know all the valid BSTRs and to check against all of them. You can imagine the price of that.
Besides, it is consistent with other C, C++, COM or Windows APIs: free, delete, CloseHandle, IUnknown::Release... all of them expect YOU to know whether the argument is eligible for releasing.
In a nutshell, your question is: "I am calling SysFreeString with an invalid argument. Why does the compiler allow this?"
The Visual C++ compiler allows the call and does not issue a warning because the call itself is valid: the argument type matches, the API function exists, and the call can be converted to binary code that executes. The compiler has no knowledge of whether your argument is valid or not; you are responsible for tracking this yourself.
The API function on the other hand expects that you pass valid argument. It might or might not check its validity. Documentation says about the argument: "The previously allocated string". So the value is okay for the first call, but afterward the pointer value is no longer a valid argument for the second call and behavior is basically undefined.
Nothing to do with the CRT, this is a winapi function. Which is C based, a language that has always given programmers enough lengths of rope to hang themselves by invoking UB with the slightest mistake. Fast and easy-to-port has forever been at odds with safe and secure.
SysFreeString() doesn't win any prizes, clearly it should have had a BOOL return type. But it can't, the IMalloc::Free() interface function was fumbled a long time ago. Nothing you can't fix yourself:
BOOL SafeSysFreeString(BSTR* str) {
if (str == NULL) {
SetLastError(ERROR_INVALID_PARAMETER);
return FALSE;
}
SysFreeString(*str);
*str = NULL;
return TRUE;
}
Don't hesitate to yell louder, RaiseException() gives a pretty good bang that is hard to ignore. But writing COM code in C is cruel and unusual punishment, outlawed by the Geneva Convention on Programmers Rights. Use the _bstr_t or CComBSTR C++ wrapper types instead.
But do watch out when you slice the BSTR out of them, they can't help when you don't or can't use them consistently. Which is how you got into trouble with that VARIANT. Always pay extra attention when you have to leave the safety of the wrapper, there are C sharks out there.
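The same "free it, then forget the pointer" idea can be tried without any Windows headers. The sketch below uses plain malloc and free as a stand-in for SysAllocString and SysFreeString; safe_free and duplicate are hypothetical helpers, not part of any real API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees *p and resets it to NULL.
   Returns 0 if p itself is NULL, 1 otherwise. */
static int safe_free(char **p) {
    if (p == NULL) {
        return 0;
    }
    free(*p);      /* free(NULL) is defined to do nothing */
    *p = NULL;     /* a second call now sees NULL, not a stale address */
    return 1;
}

/* Hypothetical helper: returns a heap copy of s, or NULL on failure. */
static char *duplicate(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}
```

Calling safe_free twice on the same variable is harmless, because the second call only ever frees NULL. It does not help if another copy of the pointer exists elsewhere, which is the same limitation as with a BSTR sliced out of a wrapper.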
See this quote from MSDN:
Automation may cache the space allocated for BSTRs. This speeds up
the SysAllocString/SysFreeString sequence.
(...)if the application allocates a BSTR and frees it, the free block
of memory is put into the BSTR cache by Automation(...)
This may explain why calling SysFreeString(...) twice with the same pointer does not produce a crash, since the memory is still available (kind of).

Does fortran permit inline operations on the return value of a function?

I am trying to design a data structure composed of objects which contain, as instance variables, objects of another type.
I'd like to be able to do something like this:
CALL type1_object%get_nested_type2_object()%some_type2_method()
Notice I am trying to immediately use the getter, get_nested_type2_object() and then act on its return value to call a method in the returned type2 object.
As it stands, gfortran v4.8.2 does not accept this syntax and thinks get_nested_type2_object() is an array reference, not a function call. Is there any syntax that I can use to clarify this or does the standard not allow this?
To give a more concrete example, here is some code illustrating this:
furniture_class.F95:
MODULE furniture_class
IMPLICIT NONE
TYPE furniture_object
INTEGER :: length
INTEGER :: width
INTEGER :: height
CONTAINS
PROCEDURE :: get_length
END TYPE furniture_object
CONTAINS
FUNCTION get_length(self)
IMPLICIT NONE
CLASS(furniture_object) :: self
INTEGER :: get_length
get_length = self%length
END FUNCTION
END MODULE furniture_class
Now a room object may contain one or more furniture objects.
room_class.F95:
MODULE room_class
USE furniture_class
IMPLICIT NONE
TYPE :: room_object
CLASS(furniture_object), POINTER :: furniture
CONTAINS
PROCEDURE :: get_furniture
END TYPE room_object
CONTAINS
FUNCTION get_furniture(self)
USE furniture_class
IMPLICIT NONE
CLASS(room_object) :: self
CLASS(furniture_object), POINTER :: get_furniture
get_furniture => self%furniture
END FUNCTION get_furniture
END MODULE room_class
Finally, here is a program where I attempt to access the furniture object inside the room (but the compiler won't let me):
room_test.F95
PROGRAM room_test
USE room_class
USE furniture_class
IMPLICIT NONE
CLASS(room_object), POINTER :: room_pointer
CLASS(furniture_object), POINTER :: furniture_pointer
ALLOCATE(room_pointer)
ALLOCATE(furniture_pointer)
room_pointer%furniture => furniture_pointer
furniture_pointer%length = 10
! WRITE(*,*) 'The length of furniture in the room is', room_pointer%furniture%get_length() - This works.
WRITE(*,*) 'The length of furniture in the room is', room_pointer%get_furniture()%get_length() ! This line fails to compile
END PROGRAM room_test
I can of course directly access the furniture object if I don't use a getter to return the nested object, but this ruins the encapsulation and can become problematic in production code that is much more complex than what I show here.
Is what I am trying to do not supported by the Fortran standard or do I just need a more compliant compiler?
What you want to do is not supported by the syntax of the standard language.
(Variations on the general syntax (not necessarily this specific case) that might apply for "dereferencing" a function result could be ambiguous - consider things like substrings, whole array references, array sections, etc.)
Typically you [pointer] assign the result of the first function call to a [pointer] variable of the appropriate type, and then apply the binding for the second function to that variable.
Alternatively, if you want to apply an operation to a primary in an expression (such as a function reference) to give another value, then you could use an operator.
Some, perhaps rather subjective, comments:
Your room object doesn't really contain a furniture object - it holds a reference to a furniture object. Perhaps you use that reference in a manner that implies the parent object "containing" it, but that's not what the component definition naturally suggests.
(Use of a pointer component suggests that you want the room to point at (i.e. reference) some furniture. In terms of the language, the object referenced by a pointer component is not usually considered part of the value of the parent object of the component - consider how intrinsic assignment works, restrictions around modifying INTENT(IN) arguments, etc.
A non-pointer component suggests to me that the furniture is part of the room. In a Fortran language sense, an object that is a non-pointer component is always part of the value of the parent object of the component.
To highlight - pointer components in different rooms could potentially point at the same piece of furniture; a non-pointer furniture object is only ever directly part of one room.)
You need to be very careful using functions with pointer results. In the general case, is it:
p = some_ptr_function(args)
(and perhaps I accidentally leak memory) or
p => some_ptr_function(args)
Only one little character difference, both valid syntax, quite different semantics. If the second case is what is intended, then why not just pass the pointer back via a subroutine argument? An inconsequential difference in typing and it is much safer.
A general reminder applicable to some of the above - in the context of an expression, evaluation of a function reference yields a value. Values are not variables and hence you are not permitted to vary [modify] them.

Handle declarations

Can anyone tell me what the difference is between these 2 lines of code, and which one is better to use?
System::String ^MyStr = gcnew System::String(MyStr);
System::String ^MyStr;
Those lines are not equivalent. In the first one, you will get an exception because you're trying to create a String from an uninitialized tracking handle (MyStr). In the second one, MyStr is declared but not initialized; it points to garbage and will throw an exception if you attempt to use it. Which one you should use depends on the rest of the code.
The second one creates a new handle variable. If it's a local variable, then as @dario_ramos says, it's uninitialized, and your program will likely crash if you try to use the handle before assigning it. If it's a member variable or global, then it will be nullptr.
The first one is similar, although it can only be used for locals or globals (member variables use the ctor-initializer syntax in C++/CLI just like plain C++), and does exactly what you're not permitted to do. It reads the brand new uninitialized handle and passes it to the System::String constructor. If by chance the constructor finishes, a handle to the newly constructed String will be placed into the variable as part of initialization. But because the constructor is trying to make a copy of random garbage (if it's a local) or nullptr (if a global), most likely it will simply crash.
It's a bad idea to use the value of any variable in its own initializer (sometimes you need to use the address, never the value).
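The "address yes, value no" remark can be illustrated in plain C, where the same scoping rule applies: the name is already in scope inside its own initializer. This is a minimal sketch, not C++/CLI, and struct node and ring_is_self_linked are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for a circular list. */
struct node {
    int value;
    struct node *next;
};

/* Builds a one-element ring. The initializer uses the ADDRESS of the
   variable being initialized, which is well defined because the object
   already exists even though it has no value yet. Reading its VALUE
   there instead (for example "int x = x + 1;") would read an
   indeterminate value. */
static int ring_is_self_linked(void) {
    struct node head = { 7, &head };
    return head.next == &head && head.next->value == 7;
}
```

The handle case in the question is the value kind of use: the constructor argument reads MyStr before anything has been stored in it.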