GNU Radio text file sink

I'm trying to teach myself the basics of GNU Radio and DSP. I created a flowgraph in GNU Radio Companion that takes a vector that is the binary representation of a single character (the character "1" as "00110001"), modulates, demodulates, and writes to a file sink.
The scope sink after demodulation looks like the values are returned correctly (see below; it appears to be the right pattern of 0s and 1s), but the file sink, although its size is 19 bytes, appears empty, or at least does not contain the correct values (I've looked at it in ASCII and hex editors). I assumed the single character transferred would result in 1 byte (8 bits), not 19 bytes. Changing some of the settings in the Polyphase Clock Sync block and adding a Repack Bits block after the binary slicer puts some characters in the output file, but never the right character.
My questions are:
Can GNU Radio take a single character, modulate/demodulate it, and return the same character?
Are there errors in my flowgraph?
I'd appreciate any insights or suggestions, thank you.

How to read a binary file with TCL

So I have a function I'm using to read data from a file. It works fine if the file is plain text, but when I try to read a binary file, like a PNG, it returns different content (diff confirms that). I opened a hex editor to see what was wrong and found that it is inserting some C2 bytes into the data (I don't know whether their positions are random or whether there are other extra bytes besides this C2 one).
This is my function. I just want it to read and save to a variable.
proc read_file {path} {
    set channel [open $path r]
    fconfigure $channel -translation binary
    set return_string "[read $channel]"
    close $channel
    return "$return_string"
}
To actually print, I'm doing this:
puts -nonewline [read_file file.png]
When you open a file, it defaults to being in text mode. In text mode (which is really a combination of options) the IO layer translates characters from whatever encoding they are in into Tcl's internal encoding, and does the reverse operation on output. The default encoding scheme is platform specific, but in your case it sounds like it is UTF-8. (Tcl uses a complex internal system of encodings; it doesn't expose those to the outside world.)
By contrast, when you put the channel into binary mode, the bytes on the outside are directly mapped to characters in the range 0-255 (and vice versa on output). You get a perfect copy, provided you put both input and output channels in binary mode. (There are other optimisations for binary mode, but they don't matter here.)
When you only put one of the channels in binary mode, you get what looks like corruption. It isn't random though. In particular, when the input is binary but the output is UTF-8, input bytes in the range 128-255 get converted into multiple output bytes, where the first of those bytes is in the sort of range you observed. There are other combinations that mess things up; the whole range of problems is collectively known as mojibake.
tl;dr Don't mix up binary and text data unless you're very careful. The results of getting it wrong are "surprising".
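For illustration, here is a minimal Python sketch of the same mechanism (latin-1 stands in for Tcl's direct byte-to-character mapping; the example bytes are made up):

raw = bytes([0x89, 0x50, 0x4E, 0x47])    # e.g. the first bytes of a PNG signature
as_text = raw.decode("latin-1")          # bytes mapped one-to-one to characters
corrupted = as_text.encode("utf-8")      # written back out as UTF-8 text
print(raw.hex())                         # 89504e47
print(corrupted.hex())                   # c289504e47 -- an extra C2 byte appears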

How do I define the record structure of an EBCDIC file?

I have an EBCDIC file in HDFS that I want to load into a Spark DataFrame, process, and write out as ORC files. I found that there is an open-source solution, Cobrix, that allows reading data from EBCDIC files, but the developer must provide a copybook file, which is a schema definition.
A few lines of my EBCDIC file are presented in the attached image.
I want to work out the copybook format for the EBCDIC file; essentially I want to read vin, whose length is 17, vin_data, whose length is 3, and finally vin_val, whose length is 100.
How do I define a copybook file for EBCDIC data?
You don't.
A copybook may be used as a record definition (i.e., how the data is stored); it has nothing to do with the encoding of the data that may be stored in it.
This leaves the question "How do I define the record structure?"
You'd need the number of fields, their lengths, and their types (it likely is not only USAGE DISPLAY), and then just define them with some fancy names. Ideally you just get the original record definition from the COBOL program writing the file, put that into a copybook if it isn't in one yet, and use that.
Your link has samples that show what a copybook actually looks like. If you struggle with the definition, then please edit your question with the copybook you've defined and we may be able to help.
Based on your comment in the question, and looking at the input file, you could start with this.
01  VIN-RECORD.
    05  VIN        PIC X(17).
    05  VIN-COUNT  PIC S9(5) COMP-3.
    05  VIN-VALUE  PIC X(100).
I'm guessing that the second field is COMP-3 because all six examples end with a C sign nibble in the last byte. C indicates a positive COMP-3 value, D would be a negative value, and F would indicate an unsigned value.
The third field is variable length and right padded with spaces.
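To make the COMP-3 guess concrete, here is a small Python sketch (not part of Cobrix; the field layout is assumed from the copybook above) of how a packed-decimal field unpacks, with the sign carried in the low nibble of the last byte:

def unpack_comp3(raw):
    # Each byte holds two BCD digits, except the last, whose low nibble
    # is the sign: 0xC = positive, 0xD = negative, 0xF = unsigned.
    digits = []
    sign_nibble = 0x0F
    for i, byte in enumerate(raw):
        digits.append(byte >> 4)
        if i == len(raw) - 1:
            sign_nibble = byte & 0x0F
        else:
            digits.append(byte & 0x0F)
    value = int("".join(str(d) for d in digits))
    return -value if sign_nibble == 0x0D else value

# PIC S9(5) COMP-3 occupies 3 bytes; 0x00 0x12 0x3C decodes to +123.
print(unpack_comp3(bytes([0x00, 0x12, 0x3C])))   # 123
print(unpack_comp3(bytes([0x00, 0x12, 0x3D])))   # -123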

SHA256 generation different for file and content of this file

I use online SHA256 converters to calculate a hash for a given file. There, I have seen an effect I don't understand.
For testing purposes, I wanted to calculate the hash for a very simple file. I named it "test.txt", and its only content is the string "abc", followed by a new line (I just pressed enter).
Now, when I put "abc" and newline into a SHA256 generator, I get the hash
edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb
But when I put the complete file into the same generator, I get the hash
552bab6864c7a7b69a502ed1854b9245c0e1a30f008aaa0b281da62585fdb025
Where does the difference come from? I used this generator (in fact, I tried several, and they always yield the same result):
https://emn178.github.io/online-tools/sha256_checksum.html
Note that this difference does not arise without newlines. If the file just contains the string "abc", the hash is
ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
for the file as well as just for the content.
As noted in my comment, the difference is caused by how newline characters are represented across different operating systems (see details here):
On UNIX and UNIX-like systems, newlines are represented by a line feed character (\n).
On DOS and Windows systems, newlines are represented by a carriage return followed by a line feed character (\r\n).
Compare the following two commands and their output, corresponding to the SHA256 values in your question:
echo -en "abc\n" | sha256sum
edeaaff3f1774ad2888673770c6d64097e391bc362d7d6fb34982ddf0efd18cb
echo -en "abc\r\n" | sha256sum
552bab6864c7a7b69a502ed1854b9245c0e1a30f008aaa0b281da62585fdb025
The issue you are having could come from how the newline is encoded.
On Windows the newline is represented as \r\n, and on Linux it is \n.
These two characters have different decimal values (\r is 13 and \n is 10).
More info you can find here:
https://en.wikipedia.org/wiki/Newline
https://en.wikipedia.org/wiki/List_of_Unicode_characters
I faced the same issue, but viewing the data in hex mode helped me understand the actual behavior.
Canonicalization of the data needs to be performed before the SHA calculation, which will eliminate such issues. Canonicalization needs to be performed both at the generation side and at the verification side.
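As a minimal Python sketch of one possible canonicalization (normalizing line endings to \n before hashing; the function name is illustrative, not from any library):

import hashlib

def canonical_sha256(data):
    # Normalize CRLF to LF so both platforms hash the same canonical bytes.
    return hashlib.sha256(data.replace(b"\r\n", b"\n")).hexdigest()

# Both inputs now yield edeaaff3... because they are canonicalized first.
print(canonical_sha256(b"abc\n"))
print(canonical_sha256(b"abc\r\n"))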

Can Fortran read bytes directly from a binary file?

I have a binary file that I would like to read with Fortran. The problem is that it was not written by Fortran, so it doesn't have the record length indicators. So the usual unformatted Fortran read won't work.
I had a thought that I could be sneaky and read the file as a formatted file, byte-by-byte (or 4 bytes by 4 bytes, really) into a character array and then convert the contents of the characters into integers and floats via the transfer function or the dreaded equivalence statement. But this doesn't work: I try to read 4 bytes at a time and, according to the POS output from the inquire statement, the read skips over like 6000 bytes or so, and the character array gets loaded with junk.
So that's a no go. Is there some detail in this approach I am forgetting? Or is there just a fundamentally different and better way to do this in Fortran? (BTW, I also tried reading into an integer*1 array and a byte array. Even though these codes would compile, when it came to the read statement, the code crashed.)
Yes.
Fortran 2003 introduced stream access into the language. Prior to this most processors supported something equivalent as an extension, perhaps called "binary" or similar.
Unformatted stream access imposes no record structure on the file. As an example, to read data from the file that corresponds to a single int in the companion C processor (if any) for a particular Fortran processor:
USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INT
INTEGER, PARAMETER :: unit = 10
CHARACTER(*), PARAMETER :: filename = 'name of your file'
INTEGER(C_INT) :: data
!***
OPEN(unit, FILE=filename, ACCESS='STREAM', FORM='UNFORMATTED')
READ (unit) data
CLOSE(unit)
PRINT "('data was ',I0)", data
You may still have issues with endianness and data type size, but those aspects are language independent.
If you are writing to a language standard prior to Fortran 2003 then unformatted direct access reading into a suitable integer variable may work - it is Fortran processor specific but works for many of the current processors.
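To illustrate the endianness caveat (a Python sketch rather than Fortran, purely for illustration; the example bytes are made up), the same four bytes decode to different integers depending on the assumed byte order:

import struct

raw = b"\x01\x00\x00\x00"               # four bytes as stored in a file
print(struct.unpack("<i", raw)[0])      # read as little-endian int32 -> 1
print(struct.unpack(">i", raw)[0])      # read as big-endian int32    -> 16777216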

VBA - Read file byte by byte on system with Asian locale

I am trying to convert a file from binary to text, by simply replacing each character with the hexadecimal code. For example, character 'c' will be replaced by '63'.
I have a code which is working fine in normal systems, but it breaks down in the PC where I need to use it as it has default locale set to Chinese.
I am using the following statements to read a byte -
ch$ = " "
Get #f%, , ch$
I suspect there is a problem when I am reading the file byte by byte, as it is skipping certain bytes because they form composite characters. It's probably reading the 2 bytes that form an Asian character as a single character. It is thus producing a much smaller file than the expected size.
How can I read the file byte by byte?
Full code is pasted here: http://pastebin.com/kjpSnqzV
Your suspicion is correct. VB file reading automatically converts strings into Unicode from the default code page on the PC. On an Asian code page, some characters are represented as more than one byte.
I advise you to use a Byte variable rather than a string - that will stop VB from being overly helpful.
Dim ch As Byte
Get #f%, , ch
Another possible problem with the original code is that some byte sequences are illegal on Asian code pages (they don't represent valid characters). So your code could experience errors for some input files, but presumably you want it to work with any file.
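For comparison, a minimal Python sketch of the same byte-to-hex conversion, reading the file in binary mode so no code-page translation can merge or drop bytes (the file name is a placeholder):

with open("input.bin", "rb") as f:      # "rb" avoids any text-mode decoding
    data = f.read()
print(data.hex())                       # each byte becomes two hex digits, e.g. 'c' -> 63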