Related
from GNU gawk's page
https://www.gnu.org/software/gawk/manual/html_node/Checking-for-MPFR.html
they have a formula to check arbitrary precision
function adequate_math_precision(n) { return (1 != (1+(1/(2^(n-1))))) }
My question is : wouldn't it be more efficient by staying within integer math domain with a formula such as
( 2^abs(n) - 1 ) % 2 # note 2^(n-1) vs. 2^|n| - 1
Since any power of 2 must also be even, then subtracting 1 must always be odd, then its modulo (%) over 2 becomes indicator function for is_odd() for n >= 0, while the abs(n) handles the cases where it's negative.
Or does the modulo necessitate a casting to float point, thus nullifying any gains ?
Good question. Let's tackle it.
The proposed snippet aims at checking wether gawk was invoked with the -M option.
I'll attach some digression on that option at the bottom.
The argument n of the function is the floating point precision needed for whatever operation you'll have to perform. So, say your script is in a library somewhere and will get called but you have no control over it. You'll run that function at the beginning of the script to promptly throw exception and bail out, suggesting that the end result will be wrong due to lack of bits to store numbers.
Your code stays in the integer realm: a power of two of an integer is an integer. There is no need to use abs(n) here, because there is no point in specifying how many bits you'll need as a negative number in the first place.
Then you subtract one from an even, integer number. Now, unless n=0, in which case 2^0=1 and then your code reads (1 - 1) % 2 = 0, your snippet shall always return 1, because the quotient (%) of an odd number divided by two is 1.
Problem is: you are trying to calculate a potentially stupidly large number in a function that should check if you are able to do so in the first place.
Since any power of 2 must also be even, then subtracting 1 must always
be odd, then its modulo (%) over 2 becomes indicator function for
is_odd() for n >= 0, while the abs(n) handles the cases where it's
negative.
Except when n=0 as we discussed above, you are right. The snippet will tell that any power of 2 is even, and any power of 2, minus 1, is odd. We were discussing another subject entirely thought.
Let's analyze the other function instead:
return (1 != (1+(1/(2^(n-1)))))
Remember that booleans in awk runs like this: 0=false and non zero equal true. So, if 1+x where x is a very small number, typically a large power of two (2^122 in the example page) is mathematically guaranteed to be !=1, in the digital world that's not the case. At one point, floating computation will reach a precision rock bottom, will be rounded down, and x=0 will be suddenly declared. At that point, the arbitrary precision function will return 0: false: 1 is equal 1.
A larger discussion on types and data representation
The page you link explains precision for gawk invoked with the -M option. This sounds like technoblahblah, let's decipher it.
At one point, your OS architecture has to decide how to store data, how to represent it in memory so that it can be accessed again and displayed. Terms like Integer, Float, Double, Unsigned Integer are examples of data representation. We here are addressing Integer representation: how is an integer stored in memory?
A 32-bit system will use 4 bytes to represent and integer, which in turn determines how larger the integer will be. The 32 bits are read from most significative (MSB) to less significative (LSB) and if signed, one bit will represent the sign (the MSB typically, drastically reducing the max size of the integer).
If asked to compute a large number, a machine will try to fit in in the max number available. If the end result is larger than that, you have overflow and end up with a wrong result or an error. Many online challenges typically ask you to write code for arbitrary long loops or large sums, then test it with inputs that will break the 64bit barrier, to see if you master proper types for indexes.
AWK is not a strongly typed language. Meaning, any variable can store data, regardless of the type. The data type can change and it is determined at runtime by the interpreter, so that the developer doesn't need to care. For instance:
$awk '{a="this is text"; print a; a=2; print a; print a+3.0*2}'
-| this is text
-| 2
-| 8
In the example, a is text, then is an integer and can be summed to a floating point number and printed as integer without any special type handling.
The Arbitrary Precision Page presents the following snippet:
$ gawk -M 'BEGIN {
> s = 2.0
> for (i = 1; i <= 7; i++)
> s = s * (s - 1) + 1
> print s
> }'
-| 113423713055421845118910464
There is some math voodoo behind, we will skip that. Since s is interpreted as a floating point number, the end result is computed as floating point.
Try to input that number on Windows calculator as decimal, and it will fail. Although you can compute it as a binary. You'll need the programmer setting and to add up to 53 bits to be able to fit it as unsigned integer.
53 is a magic number here: with the -M option, gawk uses arbitrary precision for numbers. In other words, it commandeers how many bits are necessary, track them and breaks free of the native OS architecture. The default option says that gawk will allocate 53 bits for any given arbitrary number. Fun fact, the actual result of that snippet is wrong, and it would take up to 100 bits to compute correctly.
To implement arbitrary large numbers handling, gawk relies on an external library called MPFR. Provided with an arbitrary large number, MPFR will handle the memory allocation and bit requisition to store it. However, the interface between gawk and MPFR is not perfect, and gawk can't always control the type that MPFR will use. In case of integers, that's not an issue. For floating point numbers, that will result in rounding errors.
This brings us back to the snippet at the beginning: if gawk was called with the -M option, numbers up to 2^53 can be stored as integers. Floating points will be smaller than that (you'll need to make the comma disappear somehow, or rather represent it spending some of the bits allocated for that number, just like the sign). Following the example of the page, and asking an arbitrary precision larger than 32, the snippet will return TRUE only if the -M option was passed, otherwise 1/2^(n-1) will be rounded down to be 0.
This question already has answers here:
Smart printing of integers in fortran90
(1 answer)
Integer output formatting with print statement
(3 answers)
Output formatting: too much whitespace in gfortran
(2 answers)
Closed 5 years ago.
I'm attempting to get a time stamp with DATE_AND_TIME(), but Fortran returns an excessive amount of spaces, and I can't seem to trim these spaces because functions like TRIM() and ADJUST() work on strings only. Fortran appears to not have a function to convert integers to strings or at least I haven't been able to find one so far. I'd like to use the date-time stamp in a file name, so the included spaces are a problem. Can anyone show me how to remove these additional spaces?
Unfortunately I have to use Fortran, although I believe other versions of Fortran are acceptable.
Code:
program
character(8) :: date
character(10) :: time
character(5) :: zone
integer, dimension(8) :: values
zone = "+01"
call date_and_time(ZONE=zone,VALUES=values)
print *, "values(1)", values(1), "-"
Returns (including text before and after values(1) to show spacing):
values(1) 2017 -
As shown above there seem to be about 6 spaces before the number.
I'm compiling this in gfortran, version is: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)
I've got a program, which compute a several variables and then these variables are writing in to the output file.
Is it possilbe, that when my program can't get a correct results for my formula, it does'nt terminate?
To clarify what I do, here is part of my code, where the variable of my interest are compute:
dx=x(1,i)-x(nk,i)
dy=y(1,i)-y(nk,i)
dz=z(1,i)-z(nk,i)
call PBC(dx,dy,dz)
r2i=dx*dx+dy*dy+dz*dz
r2=r2+r2i
r2g0=0.0d0
r2gx=0.0d0
dx=x(1,i)-x(2,i)
call PBC(dx,dy,dz)
rspani=dsqrt(dx*dx)
do ii=1,nk-1
rx=x(ii,i)
ry=y(ii,i)
rz=z(ii,i)
do jj=ii+1,nk
dx=x(jj,i)-rx
dy=y(jj,i)-ry
dz=z(jj,i)-rz
call PBC(dx,dy,dz)
r21=dx*dx+dy*dy+dz*dz
r21x=dx*dx
r2g=r2g+r21
r2gx=r2gx+r21x
r2g0=r2g0+r21
rh=rh+1.0d0/dsqrt(r21)
rh1=rh1+1.0d0
ir21=dnint(dsqrt(r21)/dr)
p(ir21)=p(ir21)+2.0D0
dxs=dsqrt(r21x)
if(dxs.gt.rspani) rspani=dxs
end do
and then in to the output I just write these variables:
write(12,870)r2i,sqrt(r2i),r2g0,r2gx/(nk*nk)
870 FORMAT(3(f15.7,3x),f15.7)
The x, y, z are actully generate via a random number generator.
The problem is that my output contains, correct values for lets say 457 lines, and then a one line is just "*********" when I use mc viewer and then the output continues with correct values, but let's say 12 steps form do cycle which compute these variables is missing.
So my questions are basic:
Is it possible, that my program can't get a correct numbers, and that's why the result is not writing in to the program?
or
Could it this been caused due to wrong output formating or something related with formating?
Thank you for any suggestion
********* is almost certainly the result of trying to write a number too large for the field specified in a format string.
For example, a field specified as f15.7 will take 1 spot for the decimal point, 1 spot for a leading sign (- will always be printed if required, + may be printed if options are set), 7 for the fractional digits, leaving 6 digits for the whole part of the number. There may therefore be cases where the program won't fit the number into the field and will print 15 *s instead.
Programs compiled with an up to date Fortran compiler will write a string such as NaN or -Inf if they encounter a floating-point number which represents one of the IEEE special values
I'm trying to write a subroutine in a module that I can include in various codes to read data from a given file. I have several codes (numerical algorithms) which will be reading the data from the file.
The file has the following format:
First entry: No. of rows and columns of my array of data (e.g. 720)
First n(=720) entries: entire first row of the matrix A
Second n(=720) entries: entire 2nd row of the matrix A
etc.
Last n(=720) entries: all n entries of vector b
Each entry has two columns, one for the REAL part of the number, the other for the COMPLEX part.
In summary, an example basic input file:
2
-0.734192049E+00 0.711486186E+01
0.274492957E+00 0.378855374E+01
0.248391205E-01 0.412154039E+01
-0.632557864E+00 0.195397735E+01
0.289619736E+00 0.895562183E+00
-0.284756160E+00 -0.892163111E+00
where the first entry says its a 2x2 matrix and 2x1 vector
The first 4 lines are the four entries of the matrix A (left column Real, right column Imag.)
The last 2 lines are the two entries of the vector b (left column Real, right column Imag.)
I have written the following code to try and implement this but it simply outputs the wrong results:
n= 2
A= ( 0.0000000 , 1.08420217E-19) (-9.15983229E-16, 3.69024734E+19) ( 1.26116862E-43, 0.0000000 ) ( 0.0000000 , 0.0000000 )
b= ( 0.0000000 , 1.08420217E-19) ( 0.0000000 , 1.08420217E-19)
With the code:
SUBROUTINE matrix_input(n,A,b)
IMPLICIT NONE
!
INTEGER, INTENT(OUT) ::n !size of matrix to be read
COMPLEX, DIMENSION(:,:), INTENT(OUT), ALLOCATABLE ::A !system matrix to be read
COMPLEX, DIMENSION(:), INTENT(OUT), ALLOCATABLE ::b !RHS b vector to be read
DOUBLE PRECISION ::A_Re,A_Im,b_Re,b_Im !
INTEGER ::i,j
!----------------------------------------------------------
! Subroutine outputs 'n'=size of matrix, 'A'=system matrix
! 'b'= RHS vector
!matrix194.txt
OPEN (UNIT = 24, FILE = "matrix_input_test.txt", STATUS="OLD", FORM="FORMATTED", ACTION="READ")
!Read in size of matrix
READ(24,*) n
ALLOCATE(A(n,n))
ALLOCATE(b(n))
!Read matrix A:
DO i=1,n
DO j=1,n
READ(24,*) A_Re, A_Im
A(i,j)=CMPLX(A_Re,A_Im)
END DO
END DO
!Read RHS vector b:
DO i=((n*n)+1),((n*n)+n)
READ(24,*) b_Re, b_Im
b(i)=CMPLX(b_Re,b_Im)
END DO
CLOSE(UNIT=24)
DEALLOCATE(A,b)
END SUBROUTINE matrix_input
EDIT: Following HPC Mark's insights, I have edited my code, and this yields the correct result, however if there are any commands which could lead to issues later on down the line (e.g. with very large arrays that I will be using) I would very grateful to hear about them!
SUBROUTINE matrix_input(n,A,b)
IMPLICIT NONE
!
INTEGER, INTENT(OUT) ::n !size of matrix to be read
COMPLEX, DIMENSION(:,:), INTENT(OUT), ALLOCATABLE ::A !system matrix to be read
COMPLEX, DIMENSION(:), INTENT(OUT), ALLOCATABLE ::b !RHS b vector to be read
!
COMPLEX, DIMENSION(:), ALLOCATABLE ::temp,A_temp !temp vector of matrix A
DOUBLE PRECISION ::A_Re,A_Im,b_Re,b_Im
INTEGER ::i,j,k
!----------------------------------------------------------
! Subroutine outputs 'n'=size of matrix, 'A'=system matrix
! 'b'= RHS vector
!matrix194.txt
OPEN (UNIT = 24, FILE = "matrix_input_test.txt", STATUS="OLD", FORM="FORMATTED", ACTION="READ")
!Read in size of matrix
READ(24,*) n
!Allocate arrays/vectors
ALLOCATE(A(n,n))
ALLOCATE(b(n))
ALLOCATE(temp(n*n+n))
ALLOCATE(A_temp(n*n))
!Read matrix A & vector b:
!16 characters, 9 decimal places, exponent notation, 2 spaces
DO i=1,(n*n)+n
READ(24, FMT="(E16.9, 2X, E16.9)") A_Re, A_Im
temp(i)=CMPLX(A_Re,A_Im)
END DO
!Select A:
DO i=1,n*n
A_temp(i)=temp(i)
END DO
!Reshape
A=RESHAPE(A_temp, (/n,n/))
!Select b:
k=0
DO i=n*n+1,n*n+n
k=k+1
b(k)=temp(i)
END DO
CLOSE(UNIT=24)
!Do not deallocate A & b otherwise won't return anything properly
DEALLOCATE(temp, A_temp)
END SUBROUTINE matrix_input
RESULTS FROM THE EDITED CODE:
n= 2
A= (-0.73419207 , 7.1148620 ) ( 0.27449295 , 3.7885537 ) ( 0.24839121 , 4.1215405 ) (-0.63255787 , 1.9539773 )
b= ( 0.28961974 , 0.89556217 ) (-0.28475615 ,-0.89216310 )
High Performance Mark has already identified the significant issue, some other notes...
Your format specification in your updated code suggests two spaces in between numbers. Your example input suggests one - be mindful of the (optional) leading sign on the second number!
A and b are allocatable dummy arguments. That's a widely supported and very useful Fortran 2003 feature. You are well beyond Fortran 90! If that is unintentional then you will need to seriously redesign things, otherwise...
Being beyond Fortran 90 is a good thing - Fortran 95 (which is the minimum level of standard support offered by all current mainstream Fortran compilers - Fortran 90 is practically obsolete) and Fortran 2003 by extension fixed a serious deficiency in Fortran 90 - as of Fortran 95 local allocatable objects are deallocated automatically when the return or end statement of a procedure is executed. Your deallocate statement at the end is therefore harmless but redundant.
You read in the components of each complex number into double precision (note a common source code style used for modern Fortran tends to avoid the DOUBLE PRECISION type specifier - it is just a synonym for REAL(KIND(0.0D0))). You then store them into default (single precision) complex. Is that intentional? If so, its harmless but a little pointless/inconsistent, otherwise if you intended for the real and imaginary components in the output arrays to be stored at higher precision, then you need to change the declaration of the complex arrays appropriately. All kinds available for REAL must be available for COMPLEX, so your declaration could be COMPLEX(KIND(0.0D0)) (typically you would have the kind as a named constant).
In an input-output list, a complex scalar variable represents two effective items - the real part followed by the imaginary part. Consequently your double precision variables A_Re and A_Im, etc., are somewhat superfluous... READ(24, FMT="(E16.9, 2X, E16.9)") temp(i) is all that is required.
Personally I wouldn't bother with the other temporary arrays - once you know the size of the input data (the first line) allocate your A and B arrays to the necessary size and read directly into them. In an io list an array is expanded into its elements in array element order (first subscript varies fastest) - which appears to be the way your file is arranged. You can combine this with what's known as format reversion to eliminate the need for loops.
Upon return from your subroutine the allocatable arrays "know" their shape - there's no need to return that information separately. Again, harmless but redundant.
Consequently - I think your entire subroutine could look something like:
! Read A and B in from a file that has... etc, etc...
! Assuming a module procedure, where the module already has IMPLICIT NONE.
SUBROUTINE matrix_input(A, b)
! Number of rows and columns of A, elements of B.
INTEGER :: n
! Our output data.
COMPLEX(KIND(0.0D0)), INTENT(OUT), ALLOCATABLE :: A(:,:), b(:)
! Number of the logical unit for IO. In F2008 this becomes a variable
! and you use the NEWUNIT specifier.
INTEGER, PARAMETER :: unit = 24
! The name of the file to read the data from.
CHARACTER(*), PARAMETER :: filename = "matrix_input_test.txt"
! Format for the array and vector component of the data.
CHARACTER(*), PARAMETER :: fmt = "(E16.9, 1X, E16.9)"
!*****************************************************************************
! Connect to the file for sequential formatted reading.
OPEN(unit, FILE=filename, STATUS='OLD', ACTION='READ')
READ (unit, *) n ! Get array dimension.
ALLOCATE(A(n,n), b(n)) ! Allocate result arrays.
READ (unit, fmt) A ! Read in A.
READ (unit, fmt) b ! Read in B.
CLOSE (unit) ! Clean up.
END SUBROUTINE matrix_input
Your code confuses me. SUBROUTINE matrix_input declares arrays A and b to be allocatable with intent(out). Just before the END SUBROUTINE statement you go right ahead and deallocate them. What do you expect the subroutine to return to the calling routine ?
this is my first time trying to program in Fortran. I'm trying to write a program that prints the first 1476 terms of the Fibonacci sequence, then examines the first digit of each term and stores the number of 1s, 2s, 3s, ..., 9s that occur in an array.
The problem that I can't seem to figure out is how to read the first digit of each term. I've tried several things but am having difficulty with my limited knowledge of Fortran techniques. I write the terms to a text file and the idea is to read the first digit of each line and accumulate the respective number in the array. Does anyone have any suggestions of how to do this?
Here is my code so far:
(edit: I included the code I have for reading the file. Right now it just prints out 3.60772951994415996E-313,
which seems like an address of some sort, because it's not one of the Fibonacci numbers. Also, it is the only thing printed, I expected that it would print out every line of the file...)
(edit edit: After considering this, perhaps there's a way to format the writing to the text file to just the first digit. Is there a way to set the number of significant digits of a real number to one? :P)
subroutine writeFib(n)
integer :: i
real*8 :: prev, current, newFib
prev = 0
current = 1
do i = 1, n
newFib = prev + current
prev = current
current = newFib
write(7,*) newFib
end do
return
end subroutine
subroutine recordFirstDigits(a)
integer :: openStat, inputStat
real*8 :: fibNum
open(7, file = "fort.7", iostat = openStat)
if (openStat > 0) stop "*** Cannot open the file ***"
do
read(7, *, iostat = inputStat) fibNum
print *,fibNum
if (inputStat > 0) stop "*** input error ***"
if (inputStat < 0) exit ! end of file
end do
close(7)
end subroutine
program test
integer :: k, a(9)
k = 1476
call writeFib(k)
call recordFirstDigits(a)
end program
Although the suggestions were in place, there were also several things that were forgotten. Range of the REAL kind, and some formatting problems.
Anyways, here's one patched up solution, compiled and working, so try to see if this will work for you. I've took the liberty of choosing my own method for fibonacci numbers calculation.
program SO1658805
implicit none
integer, parameter :: iwp = selected_real_kind(15,310)
real(iwp) :: fi, fib
integer :: i
character(60) :: line
character(1) :: digit
integer :: n0=0, n1=0, n2=0, n3=0, n4=0, n5=0, n6=0, n7=0, n8=0, n9=0
open(unit=1, file='temp.txt', status='replace')
rewind(1)
!-------- calculating fibonacci numbers -------
fi = (1+5**0.5)/2.
do i=0,1477
fib = (fi**i - (1-fi)**i)/5**0.5
write(1,*)fib,i
end do
!----------------------------------------------
rewind(1)
do i=0,1477
read(1,'(a)')line
line = adjustl(line)
write(*,'(a)')line
read(line,'(a1)')digit
if(digit.eq.' ') n0=n0+1
if(digit.eq.'1') n1=n1+1
if(digit.eq.'2') n2=n2+1
if(digit.eq.'3') n3=n3+1
if(digit.eq.'4') n4=n4+1
if(digit.eq.'5') n5=n5+1
if(digit.eq.'6') n6=n6+1
if(digit.eq.'7') n7=n7+1
if(digit.eq.'8') n8=n8+1
if(digit.eq.'9') n9=n9+1
end do
close(1)
write(*,'("Total number of different digits")')
write(*,'("Number of digits 0: ",i5)')n0
write(*,'("Number of digits 1: ",i5)')n1
write(*,'("Number of digits 2: ",i5)')n2
write(*,'("Number of digits 3: ",i5)')n3
write(*,'("Number of digits 4: ",i5)')n4
write(*,'("Number of digits 5: ",i5)')n5
write(*,'("Number of digits 6: ",i5)')n6
write(*,'("Number of digits 7: ",i5)')n7
write(*,'("Number of digits 8: ",i5)')n8
write(*,'("Number of digits 9: ",i5)')n9
read(*,*)
end program SO1658805
Aw, ... I just read you need the number of digits stored in to an array. While I just counted them.
Oh well, ... "left as an exercise for the reader ..." :-)
Can you read with a FORMAT(A1)? It's been 20 years so I don't remember the exact syntax.
I wonder why the open statement succeeds when file 7 hasn't been closed. I think you need an endfile statement and/or a rewind statement in between writing and reading.
Paul Tomblin posted what you have to do after you solve your problem in getting reads to work in the first place.
I am getting an 'end of line' runtime error
You don't show the ! code to read here... which makes it kind of difficult to guess what you are doing wrong :-)
Perhaps you need a loop to read each line and then jump out of the loop to a continue statement when there are no more lines.
Something like this:
do
read(7,*,end=10) fibNumber
end do
10 continue
Better still - look at the more modern style used in this revcomp program.
here are some hints:
You don't need to use characters,
much less file i/o for this problem
(unless you forgot to state that a
file must be created).
Therefore, use math to find your statistics. There are lots of resources on Fibonacci numbers that might provide a simplifying insight or at least a way to independently spot check your answers.
Here is a complicated hint in non-Fortran lingo:
floor(10^(frac(log_10(7214989861293412))))
(Put this in Wolfram Alpha to see what it does.)
A simpler hint (for a different approach) is that you can do very
well in Fortran with simple
arithmetic inside of looping
constructs--at least for a first pass at the solution.
Accumulate your statistics as you
go. This advice would even apply to your character-driven approach. (This problem is ideally suited
for coming up with a cute indexing
scheme for your statistics, but some
people hate cute schemes in
programming. If you don't fear
cuteness ... then you can have associative
arrays in Fortran as long as your
keys are integers ;-)
The most important aspect of this
problem is the data type you will
use to calculate your answers. For
example, here's the last number you
will have to print.
Cheers, --Jared