Procedure copy in each instance of data type - oop

When we create multiple instances from a data type (class) that has a pass procedure pointer, is the actual procedure (subroutines/functions) copied in each instance? Or is just the pointer copied?
For example consider the following code that compiles and runs correctly.
module mod2
implicit none
private
type class_type
integer :: a, b, c
contains
procedure :: add => add_it
end type class_type
public :: class_type
contains
subroutine add_it(this)
implicit none
class(class_type), intent(inout) :: this
this%c = this%a + this%b
end subroutine add_it
end module mod2
program tester
use mod2
implicit none
type(class_type), dimension(10) :: objs
objs(:) = class_type(1, 2, 0)
end program tester
Is subroutine add_it duplicated in each of the 10 objects created from data type class_type? Or is the instruction-set of subroutine add_it stored somewhere and the pointers to it, i.e. "procedure :: add => add_it" copied in each object?

Typically neither. Note this is very much implementation specific - what I describe below is typical but different processors may do things differently.
Note there are no procedure pointers in your example. The type class_type has a binding. If the class_type had a procedure pointer, things are different.
Typical implementation for bindings is that the compiler creates a table of machine level pointers, with one entry for each specific binding, with the pointer pointing at the code for the procedure. A table (sometimes known as a "vtable", from the similar technique used for virtual member functions in C++ and similar languages) is created for each type in the program.
For polymorphic objects (things declared with CLASS), the compiler then creates a descriptor that has a machine level pointer to the relevant table for the dynamic (runtime) type of the object. This pointer effectively indicates the dynamic type of the object and may be used in constructs such as SELECT TYPE and invocations of things like SAME_TYPE_AS. If you have a polymorphic array, the compiler will typically create only one descriptor for the entire array, as individual elements in the array must all have the same dynamic type.
When you call a binding on a polymorphic object, the compiler follows the pointer to the vtable, then looks up the relevant pointer to the procedure binding.
No such descriptor or pointer dereferencing is required for non-polymorphic objects (things declared with TYPE). Their dynamic and declared types are always the same, so the compiler knows, at compile time, which procedure will be called.
If you have a procedure call where a non-polymorphic actual argument is associated with a polymorphic dummy argument, then the compiler will typically create the necessary descriptor as part of making the procedure call. Similarly for passing a polymorphic array element to a procedure taking a polymorphic scalar.
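A minimal sketch of the two situations described above, assuming hypothetical names (dispatch_sketch, base_t, child_t, describe) that are not part of the question's code. Calls on the type(...) variables can be resolved at compile time, while the call inside describe goes through the polymorphic dummy and is resolved from the dynamic type at run time.

```fortran
module dispatch_sketch
  implicit none
  private
  public :: base_t, child_t, describe

  type :: base_t
  contains
    procedure :: name => base_name
  end type base_t

  type, extends(base_t) :: child_t
  contains
    procedure :: name => child_name   ! overrides the inherited binding
  end type child_t

contains

  function base_name(this) result(s)
    class(base_t), intent(in) :: this
    character(len=5) :: s
    s = 'base'
  end function base_name

  function child_name(this) result(s)
    class(child_t), intent(in) :: this
    character(len=5) :: s
    s = 'child'
  end function child_name

  ! obj is polymorphic, so obj%name() is looked up from its dynamic type.
  function describe(obj) result(s)
    class(base_t), intent(in) :: obj
    character(len=5) :: s
    s = obj%name()
  end function describe

end module dispatch_sketch

program dispatch_demo
  use dispatch_sketch
  implicit none
  type(base_t)  :: b   ! non-polymorphic: declared type = dynamic type
  type(child_t) :: c   ! non-polymorphic: declared type = dynamic type

  print *, b%name()     ! target procedure known at compile time
  print *, describe(b)  ! non-polymorphic actual, polymorphic dummy: base
  print *, describe(c)  ! same call site, different dynamic type: child
end program dispatch_demo
```

How the processor passes the type information into describe is up to the implementation; the sketch only shows the behaviour the language guarantees.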
The main program of your code contains no polymorphic entities, and you call no procedures, so there may not be any machine pointers back to the vtable.
Procedure pointer components (components declared PROCEDURE(xxx), POINTER :: yyy before the CONTAINS of the type declaration) can be different for every object (including being different for every element in an array). In that case typical implementation is to store a machine level pointer to the code for the relevant procedure (or a null pointer if the procedure pointer component has not been associated).
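To make the contrast with bindings concrete, here is a small sketch with hypothetical names (op_sketch, op_t, double_it, negate_it). The component f is per-object data, so each array element can point at a different procedure or at nothing, while the binding apply belongs to the type and is the same for every object.

```fortran
module op_sketch
  implicit none
  private
  public :: op_t, double_it, negate_it

  abstract interface
    integer function int_op(x)
      integer, intent(in) :: x
    end function int_op
  end interface

  type :: op_t
    ! Procedure pointer component: stored in each object, may be null.
    procedure(int_op), pointer, nopass :: f => null()
  contains
    ! Type-bound procedure: a property of the type, not of each object.
    procedure :: apply
  end type op_t

contains

  integer function double_it(x)
    integer, intent(in) :: x
    double_it = 2 * x
  end function double_it

  integer function negate_it(x)
    integer, intent(in) :: x
    negate_it = -x
  end function negate_it

  integer function apply(this, x)
    class(op_t), intent(in) :: this
    integer, intent(in) :: x
    if (associated(this%f)) then
      apply = this%f(x)
    else
      apply = x          ! no target associated: act as the identity
    end if
  end function apply

end module op_sketch

program op_demo
  use op_sketch
  implicit none
  type(op_t) :: ops(3)

  ops(1)%f => double_it
  ops(2)%f => negate_it
  ! ops(3)%f is left disassociated

  print *, ops(1)%apply(5), ops(2)%apply(5), ops(3)%apply(5)   ! 10 -5 5
end program op_demo
```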

Related

Fortran constructor returning pointer to allocated object

In this question: Fortran Functions with a pointer result in a normal assignment, it is stated that functions returning pointers are not recommended.
My question concerns constructors of user defined types. Consider the code below:
program PointTest
use PointMod, only: PointType
implicit none
class(PointType), allocatable :: TypeObject
TypeObject = PointType(10)
end program PointTest
module PointMod
implicit none
type PointType
real(8), dimension(:), allocatable :: array
contains
final :: Finalizer
end type PointType
interface PointType
procedure NewPointType
end interface PointType
contains
function NewPointType(n) result(TypePointer)
implicit none
integer, intent(in) :: n
type(PointType), pointer :: TypePointer
allocate(TypePointer)
allocate(TypePointer%array(n))
end function NewPointType
subroutine Finalizer(this)
implicit none
type(PointType) :: this
print *, 'Finalizer called'
end subroutine Finalizer
end module PointMod
In the code, I have defined a type with a constructor that allocates the object and then allocates an array in the object. It then returns a pointer to the object.
If the constructor just returned the object, the object and the array would be copied and then deallocated (at least with standard compliant compilers). This could cause overhead and mess with our memory tracking.
Compiling the above code with ifort gives no warnings with -warn all (except unused variable in the finalizer) and the code behaves the way I expect. It also works fine with gfortran, except I get a warning when using -Wall
TypeObject = PointType(10)
1
Warning: POINTER-valued function appears on right-hand side of assignment at (1) [-Wsurprising]
What are the risks of using constructors like these? As far as I can tell, there will be no dangling pointers and we will have more control on when objects are allocated. One workaround that would achieve the same result is to explicitly allocate the object and turn the constructor into a subroutine that sets variables and does the allocation of array, but it looks a lot less elegant. Are there other solutions? Our code is in the Fortran 2008 standard.
Do not use pointer-valued functions. As a rule I never write functions that return pointers. They are bad and confusing, and they lead to nasty bugs, especially when one confuses => and =.
What the function does is that it allocates a new object and creates a pointer that points to the object.
What
TypeObject = PointType(10)
does is that it copies the value of the object stored in the pointer. Then the pointer is forgotten and the memory where the pointer had pointed is leaked and lost forever.
You write "As far as I can tell, there will be no dangling pointers and we will have more control on when objects are allocated." However, I do not see a way to avoid the dangling pointer allocated inside the function. Not even a finalizer can help here. I also do not see how you have more control. The memory you explicitly allocated is just lost. You have a different memory for TypeObject (likely on the main program's stack) and the array inside the type will get allocated again during the copy at the intrinsic assignment TypeObject = PointType(10).
The finalizer could take care of the array component so the array allocated inside the function does not have to be lost. However, the type itself, to which the pointer TypePointer points, with its non-allocatable non-pointer components and descriptors and so on, cannot be deallocated from the finalizer and will remain dangling and the memory will be leaked.
Do not be afraid of functions that return objects as values. That is not a problem. Compilers are smart and are able to optimize away an unnecessary copy. A compiler can often see that you are just assigning the function result, so it can use the memory location of the assignment target for the function result variable (if it does not have to be allocatable).
Many other optimizations exist.
function NewPointType(n) result(TypePointer)
integer, intent(in) :: n
type(PointType) :: TypePointer
allocate(TypePointer%array(n))
end function NewPointType
is simpler and should work just fine. With optimizations it could even be faster. If using a non-pointer non-allocatable result is not possible, use allocatable. Do not use pointers for function results.
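For completeness, a self-contained sketch of the by-value constructor in use. The names point_sketch, point_t and new_point are stand-ins for the question's PointMod, PointType and NewPointType, and the finalizer is left out to keep it short. The polymorphic variable is filled with allocate(..., source=...), which avoids relying on Fortran 2008 polymorphic assignment.

```fortran
module point_sketch
  implicit none
  private
  public :: point_t

  type :: point_t
    real, allocatable :: array(:)
  end type point_t

  ! Generic with the same name as the type, so point_t(n) acts as a constructor.
  interface point_t
    module procedure new_point
  end interface point_t

contains

  function new_point(n) result(p)
    integer, intent(in) :: n
    type(point_t) :: p          ! plain value result: no pointer, nothing to leak
    allocate(p%array(n))
    p%array = 0.0
  end function new_point

end module point_sketch

program point_demo
  use point_sketch
  implicit none
  type(point_t) :: a
  class(point_t), allocatable :: b

  a = point_t(10)                  ! intrinsic assignment of the function result
  allocate(b, source=point_t(4))   ! sourced allocation for the polymorphic case

  print *, size(a%array), size(b%array)   ! 10 4
end program point_demo
```

Allocatable components are released automatically when a and b go out of existence, so no explicit cleanup is needed here.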

How to create and use array of type extensions in Fortran? [duplicate]

I am trying to use pointers to create links between objects. I am using Fortran, and here is the code:
module base_pars_module
type,abstract,public :: base_pars
end type
end module
module test_parameters_module
use base_pars_module
type, extends(base_pars) :: test_pars
contains
procedure :: whoami
end type
contains
function whoami(this) result(iostat)
class( test_pars) :: this
write(*,*) 'i am a derived type child of base_pars'
end function
end module
module base_mask_module
use base_pars_module
type, abstract , public :: base_mask
class(base_pars),pointer :: parameters
end type
end module
module test_mask_module
use base_mask_module
implicit none
type, extends(base_mask) :: test_mask
end type
end module
program driver
type(test_pars) , target :: par_Test
type(test_mask) :: mask_test
iostat= par_test%whoami()
mask_test%parameters=>par_test
iostat=mask_test%parameters%whoami()
end program
parameters in base_mask_module is a pointer of class base_pars. I would like to use this pointer to refer to the par_test object, which is of type test_pars, an extension of base_pars. So the pointer and the target have the same class. But when I compile this it gives an error:
driver.f90:17.37:
iostat=mask_test%parameters%whoami()
1
Error: 'whoami' at (1) is not a member of the 'base_pars' structure
Is it a bug or am i doing something wrong?
When you have polymorphism like this there are two things to consider about an object: its dynamic type and its declared type. The parameters component of test_mask (base_mask) is declared as
class(base_pars),pointer :: parameters
Such a component therefore has declared type base_pars.
Come the pointer assignment
mask_test%parameters=>par_test
mask_test%parameters has dynamic type the same as par_test: test_pars. It's of declared type base_pars, though, and it's the declared type that is important when we care about its components and bindings. base_pars indeed has no whoami.
You need, then, something which has declared type test_pars. Without changing the definitions of the derived types you can do this with the select type construct.
select type (pars => mask_test%parameters)
class is (test_pars)
iostat=pars%whoami() ! pars of declared type test_pars associated with mask_test%parameters
end select
That said, things get pretty tedious quite quickly with this approach. Always using select type, distinguishing between numerous extending types, will be quite a bind. An alternative would be to ensure that the declared type base_pars has a binding whoami. Instead of changing the main program as above, we alter the module base_pars_module:
module base_pars_module
implicit none ! Encourage good practice
type,abstract,public :: base_pars
contains
procedure(whoami_if), deferred :: whoami
end type
interface
integer function whoami_if(this)
import base_pars ! Recall we're in a different scope from the module
class(base_pars) this
end function
end interface
end module
So, we've a deferred binding in base_pars that is later overridden by a binding in the extending type test_pars. mask_test%parameters%whoami() in the main program is then valid, and the function called is that offered by the dynamic type of parameters.
Both approaches here address the problem with the binding of the declared type of parameters. Which best suits your real-world problem depends on your overall design.
If you know that your hierarchy of types will all have enough in common with the base type (that is, all will offer a whoami binding) then it makes sense to go for this second approach. Use the first approach rather when you have odd special cases, which I'd suggest should be rare.
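Putting the second approach together, a condensed sketch might look like the following. It folds the question's modules into one (pars_sketch is a hypothetical name), uses a non-abstract mask_t in place of base_mask/test_mask, and names the implementation test_whoami; those simplifications are mine, not part of the original code.

```fortran
module pars_sketch
  implicit none
  private
  public :: base_pars, test_pars, mask_t

  type, abstract :: base_pars
  contains
    procedure(whoami_if), deferred :: whoami   ! every extension must supply this
  end type base_pars

  abstract interface
    integer function whoami_if(this)
      import :: base_pars
      class(base_pars), intent(in) :: this
    end function whoami_if
  end interface

  type, extends(base_pars) :: test_pars
  contains
    procedure :: whoami => test_whoami
  end type test_pars

  type :: mask_t
    class(base_pars), pointer :: parameters => null()
  end type mask_t

contains

  integer function test_whoami(this)
    class(test_pars), intent(in) :: this
    write(*,*) 'i am a derived type child of base_pars'
    test_whoami = 0
  end function test_whoami

end module pars_sketch

program pars_demo
  use pars_sketch
  implicit none
  type(test_pars), target :: par_test
  type(mask_t) :: mask_test
  integer :: iostat

  mask_test%parameters => par_test
  ! Legal now: whoami is a binding of the declared type base_pars,
  ! and the call resolves to test_whoami through the dynamic type.
  iostat = mask_test%parameters%whoami()
  print *, iostat
end program pars_demo
```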

Does fortran permit inline operations on the return value of a function?

I am trying to design a data structure composed of objects which contain, as instance variables, objects of another type.
I'd like to be able to do something like this:
CALL type1_object%get_nested_type2_object()%some_type2_method()
Notice I am trying to immediately use the getter, get_nested_type2_object() and then act on its return value to call a method in the returned type2 object.
As it stands, gfortran v4.8.2 does not accept this syntax and thinks get_nested_type2_object() is an array reference, not a function call. Is there any syntax that I can use to clarify this or does the standard not allow this?
To give a more concrete example, here is some code illustrating this:
furniture_class.F95:
MODULE furniture_class
IMPLICIT NONE
TYPE furniture_object
INTEGER :: length
INTEGER :: width
INTEGER :: height
CONTAINS
PROCEDURE :: get_length
END TYPE furniture_object
CONTAINS
FUNCTION get_length(self)
IMPLICIT NONE
CLASS(furniture_object) :: self
INTEGER :: get_length
get_length = self%length
END FUNCTION
END MODULE furniture_class
Now a room object may contain one or more furniture objects.
room_class.F95:
MODULE room_class
USE furniture_class
IMPLICIT NONE
TYPE :: room_object
CLASS(furniture_object), POINTER :: furniture
CONTAINS
PROCEDURE :: get_furniture
END TYPE room_object
CONTAINS
FUNCTION get_furniture(self)
USE furniture_class
IMPLICIT NONE
CLASS(room_object) :: self
CLASS(furniture_object), POINTER :: get_furniture
get_furniture => self%furniture
END FUNCTION get_furniture
END MODULE room_class
Finally, here is a program where I attempt to access the furniture object inside the room (but the compiler won't let me):
room_test.F95
PROGRAM room_test
USE room_class
USE furniture_class
IMPLICIT NONE
CLASS(room_object), POINTER :: room_pointer
CLASS(furniture_object), POINTER :: furniture_pointer
ALLOCATE(room_pointer)
ALLOCATE(furniture_pointer)
room_pointer%furniture => furniture_pointer
furniture_pointer%length = 10
! WRITE(*,*) 'The length of furniture in the room is', room_pointer%furniture%get_length() - This works.
WRITE(*,*) 'The length of furniture in the room is', room_pointer%get_furniture()%get_length() ! This line fails to compile
END PROGRAM room_test
I can of course directly access the furniture object if I don't use a getter to return the nested object, but this ruins the encapsulation and can become problematic in production code that is much more complex than what I show here.
Is what I am trying to do not supported by the Fortran standard or do I just need a more compliant compiler?
What you want to do is not supported by the syntax of the standard language.
(Variations on the general syntax (not necessarily this specific case) that might apply for "dereferencing" a function result could be ambiguous - consider things like substrings, whole array references, array sections, etc.)
Typically you [pointer] assign the result of the first function call to a [pointer] variable of the appropriate type, and then apply the binding for the second function to that variable.
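A sketch of that two-step form, using a trimmed-down version of the question's types collected in one hypothetical module (room_sketch) and a local pointer named tmp:

```fortran
module room_sketch
  implicit none
  private
  public :: furniture_object, room_object

  type :: furniture_object
    integer :: length = 0
  contains
    procedure :: get_length
  end type furniture_object

  type :: room_object
    class(furniture_object), pointer :: furniture => null()
  contains
    procedure :: get_furniture
  end type room_object

contains

  integer function get_length(self)
    class(furniture_object), intent(in) :: self
    get_length = self%length
  end function get_length

  function get_furniture(self) result(f)
    class(room_object), intent(in) :: self
    class(furniture_object), pointer :: f
    f => self%furniture
  end function get_furniture

end module room_sketch

program room_demo
  use room_sketch
  implicit none
  type(room_object) :: room
  type(furniture_object), target :: sofa
  class(furniture_object), pointer :: tmp

  sofa%length = 10
  room%furniture => sofa

  tmp => room%get_furniture()   ! step 1: pointer-assign the function result
  print *, tmp%get_length()     ! step 2: apply the binding to the variable (10)
end program room_demo
```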
Alternatively, if you want to apply an operation to a primary in an expression (such as a function reference) to give another value, then you could use an operator.
Some, perhaps rather subjective, comments:
Your room object doesn't really contain a furniture object - it holds a reference to a furniture object. Perhaps you use that reference in a manner that implies the parent object "containing" it, but that's not what the component definition naturally suggests.
(Use of a pointer component suggests that you want the room to point at (i.e. reference) some furniture. In terms of the language, the object referenced by a pointer component is not usually considered part of the value of the parent object of the component - consider how intrinsic assignment works, restrictions around modifying INTENT(IN) arguments, etc.
A non-pointer component suggests to me that the furniture is part of the room. In a Fortran language sense an object that is a non-pointer component is always part of the value of the parent object of the component.
To highlight - pointer components in different rooms could potentially point at the same piece of furniture; a non-pointer furniture object is only ever directly part of one room.)
You need to be very careful using functions with pointer results. In the general case, is it:
p = some_ptr_function(args)
(and perhaps I accidentally leak memory) or
p => some_ptr_function(args)
Only one little character difference, both valid syntax, quite different semantics. If the second case is what is intended, then why not just pass the pointer back via a subroutine argument? An inconsequential difference in typing and it is much safer.
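As a sketch of the subroutine alternative (all names here, getter_sketch, node_t and get_payload, are hypothetical): the pointer comes back through an argument, so there is no statement in which = could be typed in place of =>.

```fortran
module getter_sketch
  implicit none
  private
  public :: node_t, get_payload

  type :: node_t
    integer, pointer :: payload => null()
  end type node_t

contains

  ! Hands back the association through a pointer dummy argument.
  subroutine get_payload(node, p)
    type(node_t), intent(in) :: node
    integer, pointer, intent(out) :: p
    p => node%payload
  end subroutine get_payload

end module getter_sketch

program getter_demo
  use getter_sketch
  implicit none
  type(node_t) :: node
  integer, target :: x
  integer, pointer :: p

  x = 42
  node%payload => x

  call get_payload(node, p)   ! p is now associated with x
  p = 7                       ! ordinary assignment through the pointer
  print *, x                  ! 7
end program getter_demo
```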
A general reminder applicable to some of the above - in the context of an expression, evaluation of a function reference yields a value. Values are not variables and hence you are not permitted to vary [modify] them.

How to specify procedures to be executed depending on data type of polymorphic variables

Consider the following sample code:
module mod
implicit none
type :: typeBase1
integer :: A1
end type
type :: typeBase2
integer :: A2
end type
type :: typeBase3
integer :: A3
end type
type, extends(typeBase1) :: typeDerived1
! Void
end type
type, extends(typeBase2) :: typeDerived2
! Void
end type
type, extends(typeBase3) :: typeDerived3
! Void
end type
type, extends(typeBase1) :: typeDerived11
! Void
end type
type, extends(typeBase2) :: typeDerived21
! Void
end type
type, extends(typeBase3) :: typeDerived31
! Void
end type
type :: complexType
class(typeBase1), pointer :: ptrBase1 ! typeBase1, 2 and 3 are extensible
class(typeBase2), pointer :: ptrBase2
class(typeBase3), pointer :: ptrBase3
end type
interface calcul
subroutine calculA(obj1, obj2, obj3)
import
type(typeDerived1) :: obj1 ! typeDerived1, 2 and 3 are derived types of typeBase1, 2 and 3
type(typeDerived2) :: obj2
type(typeDerived3) :: obj3
end subroutine
subroutine calculB(obj1, obj2, obj3)
import
type(typeDerived11) :: obj1 ! typeDerived11, 21 and 31 are derived types of typeBase1, 2 and 3
type(typeDerived21) :: obj2
type(typeDerived31) :: obj3
end subroutine
end interface calcul
contains
subroutine calculComplexType(complex)
type(ComplexType), intent(inout) :: complex
call calcul(complex % ptrBase1, complex % ptrBase2, complex % ptrBase3)
end subroutine
end module mod
What I am trying to do is have the subroutine calculComplexType call a different version of the subroutine calcul, based on the dynamic type of ptrBase1, ptrBase2 and ptrBase3.
The code does not work, because the compiler looks for a subroutine with the following interface:
subroutine calcul(obj1, obj2, obj3)
class(typeBase1) :: obj1
class(typeBase2) :: obj2
class(typeBase3) :: obj3
end subroutine
whatever the dynamic type of ptrBase1, ptrBase2 and ptrBase3 is.
My question is: is there a way in Fortran to write the interface calcul so that a procedure is selected automatically based on the dynamic type of the arguments?
I would like to avoid using a long sequence of select type constructs.
Any suggestion to rewrite the code is welcome!
If you request dispatch based on all three arguments, it cannot be done. Some languages offer so called multimethods for this.
In Fortran you can use normal single dispatch methods (type-bound procedures), but in that case it can choose the subroutine only according to one argument.
Otherwise you have to use the select type construct and make a case for every possible combination, be it inside one single procedure, or to select between more versions of it.
For two arguments, you can also consider the double dispatch pattern.
This is simply not possible in Fortran; the best you can do with polymorphism is to use an overridden type-bound procedure, selecting a function based on the dynamic type of one particular entity.
However, depending on the nature of what calcul does, it may make more sense to define just one version of calcul that takes polymorphic arguments (i.e. class(typeBase1), class(typeBase2), class(typeBase3)), and deal with the dynamic type inside calcul itself. The benefits are twofold:
calcul may be able to test the type of each argument independently from the others. If that is the case, you will still have to write three select type constructs, but they won't be nested or duplicated.
It's likely that you can use single dispatch (with type-bound procedures) to remove the need for a select type construct completely.
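A rough sketch of the single-dispatch variant, under several assumptions of mine: only two of the three hierarchies are shown, calcul is a function that adds up one term per argument, and the binding name term and its implementations are invented for illustration. Each argument resolves its own binding independently, so no combination of types has to be enumerated.

```fortran
module calcul_sketch
  implicit none
  private
  public :: typeBase1, typeDerived1, typeDerived11
  public :: typeBase2, typeDerived2, typeDerived21
  public :: calcul

  type :: typeBase1
    integer :: A1 = 0
  contains
    procedure :: term => term_base1
  end type typeBase1
  type, extends(typeBase1) :: typeDerived1      ! inherits term_base1
  end type typeDerived1
  type, extends(typeBase1) :: typeDerived11
  contains
    procedure :: term => term_derived11         ! overrides
  end type typeDerived11

  type :: typeBase2
    integer :: A2 = 0
  contains
    procedure :: term => term_base2
  end type typeBase2
  type, extends(typeBase2) :: typeDerived2      ! inherits term_base2
  end type typeDerived2
  type, extends(typeBase2) :: typeDerived21
  contains
    procedure :: term => term_derived21         ! overrides
  end type typeDerived21

contains

  integer function term_base1(this)
    class(typeBase1), intent(in) :: this
    term_base1 = this%A1
  end function term_base1

  integer function term_derived11(this)
    class(typeDerived11), intent(in) :: this
    term_derived11 = 10 * this%A1
  end function term_derived11

  integer function term_base2(this)
    class(typeBase2), intent(in) :: this
    term_base2 = this%A2
  end function term_base2

  integer function term_derived21(this)
    class(typeDerived21), intent(in) :: this
    term_derived21 = 100 * this%A2
  end function term_derived21

  ! One calcul for all combinations: each argument dispatches on its own.
  integer function calcul(obj1, obj2)
    class(typeBase1), intent(in) :: obj1
    class(typeBase2), intent(in) :: obj2
    calcul = obj1%term() + obj2%term()
  end function calcul

end module calcul_sketch

program calcul_demo
  use calcul_sketch
  implicit none
  class(typeBase1), pointer :: p1
  class(typeBase2), pointer :: p2
  type(typeDerived1),  target :: d1
  type(typeDerived11), target :: d11
  type(typeDerived2),  target :: d2
  type(typeDerived21), target :: d21

  d1%A1 = 1; d11%A1 = 1; d2%A2 = 2; d21%A2 = 2

  p1 => d1;  p2 => d2
  print *, calcul(p1, p2)   ! 1 + 2 = 3
  p1 => d11; p2 => d21
  print *, calcul(p1, p2)   ! 10 + 200 = 210
end program calcul_demo
```

This only works when the per-argument work can be separated; if the computation genuinely depends on the combination of dynamic types, nested select type constructs or double dispatch are still needed.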
It's difficult for me to think of a situation where the code in this question is really the best design you could use.
If calcul is really doing something completely "different" for each dynamic type, in a way that's relevant to the code that calls it, then the calling code should not be using polymorphic pointers (e.g. perhaps there should be a different complexType for each different calcul).
But if every version of calcul is doing essentially "the same" operation (as far as higher-level code knows/cares about), regardless of dynamic type, then there should only be one version and it should accept arguments that are of the base class.

run-time polymorphism in fortran 2003

I'm writing some code in Fortran 2003 that does a lot of linear algebra with sparse matrices. I'm trying to exploit some of the more abstract features of the new standard so I have simpler programs without too much repeated code.
I have a procedure solver which takes in a matrix, some vectors, the tolerance for the iterative method used etc. I'm passing a pointer to a procedure called matvec to it; matvec is the subroutine we use for matrix-vector multiplications.
The problem is, sometimes matvec is a procedure which takes in extra arguments colorlist, color1, color2 above the usual ones sent to this procedure. I can think of several ways of dealing with this.
First idea: define two different abstract interfaces matvec1, matvec2 and two different solvers. This works but it means duplicating some code, which is just what I'm trying to avoid.
Another idea: keep the same abstract interface matvec, and make the extra arguments colorlist, color1, color2 optional. That means making them optional in every matvec routine -- even ones for which they're not really optional, and for routines where they're not even used at all. Pretty sure I'll go to hell if I do this.
I can think of plenty of other less than optimal solutions. I'd like some input on this -- I'm sure there's some elegant way to do it, I'm just not sure what it is.
The question is really whether the additional arguments must be passed every time the procedure is invoked (because they change between two invocations), or whether they can be initialized at some point and then just used in the function. In the latter case you could create a class with an abstract interface, which defines your subroutine matvec with the essential arguments. You can then extend that class with more specialized ones, which can hold the additional options needed. They will still have to define the same matvec interface as the parent class (with the same argument list), but they can use the additional values stored in them when their matvec procedure is called.
You find a detailed example in this answer for a similar case (look for the second example showing module rechercheRacine).
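The idea of storing the extra arguments in an extended type can be sketched as follows. Everything except the names matvec, colorlist, color1 and color2 is hypothetical (operator_t, plain_op, colored_op, solver_step), and the "coloured" product, which simply zeroes rows whose colour is outside [color1, color2], is invented for illustration. A dense matrix stands in for the sparse one.

```fortran
module matvec_sketch
  implicit none
  private
  public :: operator_t, plain_op, colored_op, solver_step

  type, abstract :: operator_t
  contains
    procedure(matvec_if), deferred :: matvec
  end type operator_t

  abstract interface
    subroutine matvec_if(this, x, y)
      import :: operator_t
      class(operator_t), intent(in) :: this
      real, intent(in)  :: x(:)
      real, intent(out) :: y(:)
    end subroutine matvec_if
  end interface

  type, extends(operator_t) :: plain_op
    real, allocatable :: a(:,:)
  contains
    procedure :: matvec => plain_matvec
  end type plain_op

  ! The extra "arguments" live in the object, not in the argument list.
  type, extends(plain_op) :: colored_op
    integer, allocatable :: colorlist(:)
    integer :: color1 = 0, color2 = 0
  contains
    procedure :: matvec => colored_matvec
  end type colored_op

contains

  subroutine plain_matvec(this, x, y)
    class(plain_op), intent(in) :: this
    real, intent(in)  :: x(:)
    real, intent(out) :: y(:)
    y = matmul(this%a, x)
  end subroutine plain_matvec

  subroutine colored_matvec(this, x, y)
    class(colored_op), intent(in) :: this
    real, intent(in)  :: x(:)
    real, intent(out) :: y(:)
    y = matmul(this%a, x)
    ! Illustrative use of the stored values: drop rows outside the colour range.
    where (this%colorlist < this%color1 .or. this%colorlist > this%color2) y = 0.0
  end subroutine colored_matvec

  ! A single solver body serves every operator type.
  subroutine solver_step(op, x, y)
    class(operator_t), intent(in) :: op
    real, intent(in)  :: x(:)
    real, intent(out) :: y(:)
    call op%matvec(x, y)
  end subroutine solver_step

end module matvec_sketch

program matvec_demo
  use matvec_sketch
  implicit none
  type(plain_op)   :: p
  type(colored_op) :: c
  real :: y(2)

  p%a = reshape([1.0, 0.0, 0.0, 1.0], [2, 2])   ! 2x2 identity
  c%a = p%a
  c%colorlist = [1, 2]
  c%color1 = 2
  c%color2 = 2

  call solver_step(p, [3.0, 4.0], y)
  print *, y    ! 3.0 4.0
  call solver_step(c, [3.0, 4.0], y)
  print *, y    ! 0.0 4.0 (row with colour 1 is masked out)
end program matvec_demo
```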
Instead of passing the procedure pointer as an explicit argument, you could put the various matvec routines behind a generic interface:
interface matvec
module procedure matvec1, matvec2
end interface
Then your solver routine can just use the generic name with or without the extra arguments. The same approach can of course also be taken when using Bálint's suggested approach of defining a solver as a derived type with type-bound procedures:
type :: solver
real, allocatable :: matrix(:,:), v1(:), v2(:)
contains
procedure, pass :: matvec1
procedure, pass :: matvec2
generic :: matvec => matvec1, matvec2
end type
The main difference is that this does not use polymorphism to determine the correct procedure to invoke, but rather the characteristics of the dummy arguments.
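A small sketch of that compile-time resolution with a module-level generic. The bodies of matvec1 and matvec2 are placeholders of mine (a dense matmul, plus an invented masking step for the coloured version), and generic_sketch is a hypothetical module name. The compiler picks the specific procedure from the number and types of the actual arguments.

```fortran
module generic_sketch
  implicit none
  private
  public :: matvec

  interface matvec
    module procedure matvec1, matvec2
  end interface matvec

contains

  subroutine matvec1(a, x, y)
    real, intent(in)  :: a(:,:), x(:)
    real, intent(out) :: y(:)
    y = matmul(a, x)
  end subroutine matvec1

  subroutine matvec2(a, x, y, colorlist, color1, color2)
    real, intent(in)    :: a(:,:), x(:)
    real, intent(out)   :: y(:)
    integer, intent(in) :: colorlist(:), color1, color2
    y = matmul(a, x)
    ! Placeholder for the colour-dependent part of the product.
    where (colorlist < color1 .or. colorlist > color2) y = 0.0
  end subroutine matvec2

end module generic_sketch

program generic_demo
  use generic_sketch
  implicit none
  real :: a(2,2), y(2)

  a = reshape([1.0, 0.0, 0.0, 1.0], [2, 2])   ! 2x2 identity

  call matvec(a, [3.0, 4.0], y)                 ! resolves to matvec1
  print *, y                                    ! 3.0 4.0
  call matvec(a, [3.0, 4.0], y, [1, 2], 2, 2)   ! resolves to matvec2
  print *, y                                    ! 0.0 4.0
end program generic_demo
```

Because the choice is made at compile time, a solver written against the generic name still has to know at each call site whether it has the colour arguments to pass.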
I'm not sure of your intentions for the procedure pointer; if you wish to change its target at runtime (or perhaps assign some special meaning to its 'undefined' status), then pointers are the only way and all targets need to match the same abstract interface. If instead you just need to select one of several procedures based on their arguments, then you can exploit interfacing (my example) or overloading (Bálint's example). Each extension of a type can extend an inherited generic binding with new procedures, or overload an inherited specific binding.