Can Fortran read bytes directly from a binary file? - file-io

I have a binary file that I would like to read with Fortran. The problem is that it was not written by Fortran, so it doesn't have the record length indicators. So the usual unformatted Fortran read won't work.
I had a thought that I could be sneaky and read the file as a formatted file, byte-by-byte (or 4 bytes by 4 bytes, really) into a character array and then convert the contents of the characters into integers and floats via the transfer function or the dreaded equivalence statement. But this doesn't work: I try to read 4 bytes at a time and, according to the POS output from the inquire statement, the read skips over like 6000 bytes or so, and the character array gets loaded with junk.
So that's a no go. Is there some detail in this approach I am forgetting? Or is there just a fundamentally different and better way to do this in Fortran? (BTW, I also tried reading into an integer*1 array and a byte array. Even though these codes would compile, when it came to the read statement, the code crashed.)

Yes.
Fortran 2003 introduced stream access into the language. Prior to this most processors supported something equivalent as an extension, perhaps called "binary" or similar.
Unformatted stream access imposes no record structure on the file. As an example, to read data from the file that corresponds to a single int in the companion C processor (if any) for a particular Fortran processor:
USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INT
INTEGER, PARAMETER :: unit = 10
CHARACTER(*), PARAMETER :: filename = 'name of your file'
INTEGER(C_INT) :: data
!***
OPEN(unit, filename, ACCESS='STREAM', FORM='UNFORMATTED')
READ (unit) data
CLOSE(unit)
PRINT "('data was ',I0)", data
You may still have issues with endianess and data type size, but those aspects are language independent.
If you are writing to a language standard prior to Fortran 2003 then unformatted direct access reading into a suitable integer variable may work - it is Fortran processor specific but works for many of the current processors.

Related

How to read a binary file with TCL

So I have a function I'm using to read data from a file. It works fine if the file is plain text, but when I try to read a binary file, like a png, it returns a different text (diff confirms that). I opened a hex editor to see what was wrong and found out it is putting some c2 bytes along with the file (I don't know if the position is random or if there are other bytes except this c2 one).
This is my function. I just want it to read and save to a variable.
proc read_file {path} {
set channel [open $path r]
fconfigure $channel -translation binary
set return_string "[read $channel]"
close $channel
return "$return_string"
}
To actually print, I'm doing this:
puts -nonewline [read_file file.png]
When you open a file, it defaults to being in text mode . In text mode (which is really a combination of options) the IO layer translates characters from whatever encoding they are in into Tcl's internal encoding, and does the reverse operation on output. The default encoding scheme is platform specific, but in your case it sounds like it is UTF-8. (Tcl uses a complex internal system of encodings; it doesn't expose those to the outside world.)
By contrast, when you put the channel into binary mode, the bytes on the outside are directly mapped to characters in the range 0-255 (and vice versa on output). You get a perfect copy, provided you put both input and output channels in binary mode. (There are other optimisations for binary mode, but they don't matter here.)
When you only put one of the channels in binary mode, you get what looks like corruption. It isn't random though. In particular, when the input is binary but the output is UTF-8, input bytes in the range 128-255 get converted into multiple output bytes, where the first of those bytes is in the sort of range you observed. There are other combinations that mess things up; the whole range of problems is collectively known as mojibake.
tl;dr Don't mix up binary and text data unless you're very careful. The results of getting it wrong are "surprising".

Saving CSV file with degree symbol and ASCII encoded

I have string variable txt. It contains "°" degree symbol. I would like to save string into CSV file ASCII encoded. I use the procedure below But the "°" symbol is converted to "?". Do you have any idea how to save properly degree symbol?
Public Sub Write_File(ByVal txt As String, ByVal fName As String)
Try
Using OutFile As New StreamWriter(fName, False, Text.Encoding.ASCII)
OutFile.Write(txt)
End Using
Me.Write_Log("Succesfully Exported")
Catch ex As Exception
Me.Write_Log("Write Error during export")
End Try
End Sub
Encoding.ASCII is for the standard 7-bit ASCII encoding, which does not contain a degree symbol at all. In order to get a degree symbol in ASCII, you would have to use one of the many 8-bit ASCII encodings. For English, you'd probably be most interested in using the ISO 8859-1 code page, since that's the most standard-ish one there is of the bunch. For instance, instead of using Encoding.ASCII, you could do something like this:
Using OutFile As New StreamWriter(fName, False, Text.Encoding.GetEncoding("iso-8859-1"))
OutFile.Write(txt)
End Using
For a complete list of available encodings, use the Encoding.GetEncodings method, or look at the list of supported ones in the MSDN documentation.
Of course, none of the various 8-bit ASCII encodings are compatible with each other, so, if you do use that, the degree symbol will be a completely different symbol when viewed on a system that uses a different code page by default. That is precisely why UTF-8 has become the new standard. Usage of 8-bit ASCII is widely discouraged since it is practically unworkable in multi-cultural scenarios. If you can use UTF-8 instead, I would. If you must use ASCII, it's best to stick to the standard 7-bit encoding. If you must use an 8-bit ASCII encoding, please do so sparingly and with full awareness of its drawbacks.
One more thing. You mention the degree symbol as being character 167 (0xA7) in your desired target encoding. If that is the case, you may actually be wanting IBM437 encoding rather than ISO 8859-1. IBM437 is the old code page that was used by default in MS-DOS. If you really need to use that code page, you may have additional trouble for two reasons. As you'll see in the MSDN article, that code page is not well supported in the .NET framework. In my testing, outputting the Unicode string containing the degree symbol using that encoding did not work properly. Therefore, you may find yourself needing to use a byte array to represent the data rather than a String variable (which is Unicode). For instance:
File.WriteAllBytes("Test.txt", {167})
The second problem is that IBM437 is likely not the default code page for your windows OS, so even when it is written to the file as byte value 167, it won't actually look like a degree symbol when you view it in a windows application such as notepad.

Fortran runtime error: Bad integer for item 0 in list input?

How do I fix the Fortran runtime error: Bad integer for item 0 in list input?
Below is the Fortran program which generates a runtime error.
CHARACTER CNFILE*(*)
REAL BOX
INTEGER CNUNIT
PARAMETER ( CNUNIT = 10 )
INTEGER NN
OPEN ( UNIT = CNUNIT, FILE = CNFILE, STATUS = 'OLD' )
READ ( CNUNIT,* ) NN, BOX
The error message received from gdb is :
At line 688 of file MCNPT.f (unit = 10, file = 'LATTICE-256.txt')
Fortran runtime error: Bad integer for item 0 in list input
[Inferior 1 (process 3052) exited with code 02]
(gdb)
I am not sure what options must be specified for READ() to read to numbers from the text file. Does it matter if the two numbers on the same line are specified as either an integer or a real in the text file?
Below is the gdb execution of the program using a break point at the open call
Breakpoint 1, readcn (
cnfile=<error reading variable: Cannot access memory at address 0x7fffffffdff0>,
box=-3.37898272e+33, _cnfile=30) at MCNPT.f:686
Since you did not specify form="unformatted" on the open statement, the unit / file is opened for formatted IO. This is appropriate for a human-readable text file. ("unformatted" would be used for a non-human readable file in computer-native format, sometimes called "binary".) Therefore you should provide a format on the read, or use list-directed read, i.e., read(unit, *). To advise on a particular format we would have to know the layout of the numbers in the file. A possible read with format is: read (CNUINT, '(I4, 2X, F6.2)' ) NN, BOX
P.S. I'm answering the question in your question and not the title, which seems unrelated.
EDIT: now that you are show the text data file, a list-directed read looks easier. That is because the data doesn't line up in columns. It seems that the file has two integers on the first line, then three real numbers on each of the following lines. Most likely you need a different read for the first line. Is the code sample that you are showing us trying to read the first line, or one of the later lines? If the first line, it would seem plausible to read into two integer variables. If a later line, into two or three real variables. Two if you wish to skip the third data item on the line.
EDIT 2: the question has been substantially altered several times, which is very confusing. The first line of the text file that was shown in one version of the question contained integers, with later lines having reals. Since the listed-directed read is reading into an integer and a floating variable, it will have problems if you attempt to use it on the later lines that have two real values.

Reading file line by line in Lua

As per Lua documentation, file:read("*l") reads next line skipping end of line.
Note:- "*l": reads the next line skipping the end of line, returning nil on end of file. This is the default format
Is this documentation right? Because file:read("*l") reads the current line,instead of next line or my understanding is wrong? Pretty confusing...
Lua manages files using the same model of the underlying C implementation (this model is used also by other programming languages and it is fairly common). If you are not familiar with this way of looking at files, the terminology could be unclear, indeed.
In this model a file is represented as a stream of bytes having a so called current position. The current position is a sort of conceptual pointer to the first byte in the file that will be read or written by the next I/O operation. When you open a file for reading, a new stream is set-up so that its current position is the beginning of the file, i.e. the current position "points" to the first byte in the file.
In Lua you manage streams through so-called file handles, which are a sort of intermediaries for the underlying streams. Any operation you perform using the handle is carried over to the corresponding stream.
Lua io.open opens a file, associates a C stream with it and returns a file handle that represents that stream:
local file_handle = io.open( "myfile.txt" ) -- file opened for reading
Therefore, if you perform any operation that reads some bytes (usually interpreted as characters, if you work with text files) those are read from the stream and for each byte read the current position of the stream advances by one, pointing each time to the next byte to be read.
Lua documentation implies this model. Thus when it says next line, it means that the input operation will read all characters in the stream starting from the current position until an end-of-line character is found.
Note that if you look at text files as a sequence of lines you could be misled, since you could think of a "current line" and a "next line". That would be an higher level model compared to the C model. There is no "current line" in C. In C text files are nothing more than a sequence of bytes where some special characters (end-of-line characters) undergo some special treatment (which is mostly implementation-dependent) and are used by some C standard functions as line terminators, i.e. as marks to detect when stop reading characters.
Another source of confusion for newbies or people coming from higher level languages is that in C, for an historical accident, bytes are handled as characters (the basic data type to handle single bytes is char, which is the smallest numeric type in C!). Therefore for people with a C background it is natural to think of bytes as characters and vice versa.
Although Lua is a much higher level language than C, its close relationship with C (it was designed to be easily interfaced with C code) makes it inherit part of this C "bytes-as-characters" approach. In fact, for example, Lua strings can hold arbitrary bytes and can be used to process raw binary data.
Like Lorenso said above, read starts at the current file position and reads from that position some portion of the file. How much of the file it reads depends on read instruction. For reference, in Lua 5.3:
"*all" : reads to the end of the file
"*line" : reads from the current position to the end of the line.
The end of the line is marked by a special character usually denoted
LfCr (Line feed, carriage return )
"*number" : reads a number, that is, it will read up to the end of what
it recognizes in the text as a number, stopping at, for example, a
comma ",".
num : reads a string with up to num characters
Here's an example that reads a file with a list of numbers into an array (a table), then returns the array. (Just change the "*number" to "*line" and it would read a file line by line):
function read_array(file)
local arr = {}
local handle = assert( io.open(file,"r") )
local value = handle:read("*number")
while value do
table.insert( arr, value )
value = handle:read("*number")
end
handle:close()
return arr
end

How to write a custom assembly compiler (sort of) in VB.NET

I've been trying to write a simple script compiler for a custom language used by the Game Boy Advance's Z80 processor.
All I want it to do is look at a human-readable command, take it and its arguments and convert it into a hexadecimal value into a ROM file. That's it. Each command is a byte, and each may take a different number of arguments - arguments can be either 8, 16, or 32 bits and each command has a specific number of arguments that it takes.
All of this sort of code is handled by the game and converted into workable machine code within the game's memory, so I'm not writing a full-on assembly compiler if you will. The game automatically knows how many args a command has, what each command does, exactly how to execute it as it is, etc.
For instance, you have command 0x4E, which takes in one 8-bit argument and another 32-bit argument. In hex that would obviously be 4E XX YY YY YY YY. I want my compiler to read it from text as foo 0xXX 0xYYYYYYYY and directly write it into a file as the former.
My question is, how would I do that in VB.NET? I know it's probably a very simple answer, but I see a lot of different options to write it to a file--some work and most don't for me. Could you give me some sample code as to how I would do this?
Writing an assembly compiler as I understand it is not so simple. I recomed you to use one already written see: Software Development Tools for Z80 Family
If you are still interested in writing it here are instructions:
Write the text you want to translate to some file (or memory stream)
Read it line by line
Parse the line either splitting it to an array or with regular
expressions
Identify command and arguments (as far as I remember it some commands
does not have arguments)
Translate the command to Hex (with a collection or dictionary of
commands)
Write results to an array remembering the references for jump
addresses
When everything is translated resolve addresses and write them to
right places.
I think that the most tricky part is to deal with symbolic addressees.
If you are still interested write a first piece of code (or ask how to do it) and continue with next ones.
This sounds like an assembler, even if it for a 'custom language'.
Start by parsing the command lines. use string.split method to convert the string to an array of strings. the first element in the array is your foo, you can then look that up and output 4E, then convert the subsequent elements to bytes.