I need to read a Cobol file into VB.net. Here is the description of the data types from the documentation:
All Magnetic tape files are recorded in 9-track, 8OOBPI mode with odd parity. They are created IBM equipment disk operating system. IBM System 360 Standard.
Binary - Data is coded in pure binary code.
BCD - Data is coded in binary coded decimal format. (Primarily
for files created by the IBM 1401 System).
EBCDIC - Data is coded in extended binary coded decimal interchange code. :(An IBM developed code.)
Packed - Data is coded in packed decimal format.
File Format:
1-2 Record Count [Numeric] (Binary)
3-4 Filler (Binary)
5-5 Record Type [B or R] (EBCDIC)
6-10 Sales Location Numeric [9 digit number] (Packed)
11-13 Sales Identifier (3 character Alpha) (EBCDIC]
etc
So, I know I should read the entire file into a byte array and that's about the limit of what I know to do...
A) I saw another post on EBCDIC conversation using
System.Text.Encoding.GetEncoding(37)
but it is for an entire file. If I run the whole file through it I see intelligible text, but of course the other fields are junk. I don't know the language to decode a single field properly.
B) I have no idea what to do with PURE Binary format.
C) I don't know how to read Packed, particularly as a single field
I've tried a variety of decoding options for PURE BINARY, but the number I get for the first field is not consistent with the stated length of the rows in the docs.
Packed decimal format:
For s9(5)V9(4) comp-3, 123.45 is represented in byte format as
00 12 34 50 0c
Each digit is represented by 4 bits, there is a 4 bit sign (c) at the end and an assumed decimal after the 3.
Most languages provide a routine for converting byte/bytes into a string i.e. byte x'34' -->> String '34'. So you can:
Convert the bytes to a String representation
Add the decimal point in
Strip off the sign character from the end and add the appropriate sign to the front
There are other ways:
Create an translation array and do an array lookup. (See https://github.com/bmTas/JRecord/blob/master/Source/JRecord_Project/JRecord_Common/src/main/java/net/sf/JRecord/Types/smallBin/TypePackedDecimal9.java for an example)
Process it 4 bits at a time
Other fields
The first field (binary) might be a big endian binary integer or another packed-decimal. There is probably a utility built in the .net to do this.
Convert the character fields from ebcdic to ascii one field at a time
In VBA you did not need to read the whole file in, you could read it record by record. I would presume you can do the same in vb.net
Useful Utilities
These tools might be useful for testing.
The RecordEditor should be able to display the file. The Layout Wizard should be able determine the format of the file. Alternatively use the Cobol copybook below
The Java program CobolToCsv should be able to convert the file to Csv
01 tape-record.
05 record-count pic s9(3) comp.
05 filler pic x(2).
05 record-type pic x.
05 Sales-Location pic s9(9) comp-3.
05 Sales-Identifier pic x(3).
Related
I have ebcdic file in hdfs I want to load data to spark dataframe, process it and load results as orc files, I found that there is a open source solution which is cobrix cobrix, that allow to get data from ebcdic files, but developer must provide a copybook file which is a schema definition.
A few line of my ebcedic file are presented in the attached image.
I want to get the format of copybook of the ebcdic file, essentially I want to read the vin his length is 17, vin_data the length is 3 and finally vin_val the length is 100.
how to define a copybook file of ebcdic data?
You don't.
A copybook may be used as a record definition (=how the data is stored), it has nothing to do with the encoding of data that may be stored in that.
This leaves the question "How do I define the record structure?"
You'd need the amount of fields, their length and type (it likely is not only USAGE DISPLAY) and then just define it with some fancy names. Ideally you just get the original record definition from the COBOL program writing the file, put that into a copybook if it isn't in one yet, and use that.
Your link has samples that show actually how a copybook looks like, if you struggle on the definition then please edit your question with the copybook you've defined and we may be able to help.
Based on your comment in the question, and looking at the input file, you could start with this.
01 VIN-RECORD.
05 VIN PIC X(17).
05 VIN-COUNT PIC S9(5) COMP-3.
05 VIN-VALUE PIC X(100).
I'm guessing that the second field is COMP-3 based on the six examples all ending with a C byte. This indicates a positive COMP-3 value. A D byte would be a negative COMP-3 value. An F byte would indicate an unsigned COMP-3 value.
The third field is variable length and right padded with spaces.
I am making an app in visualstudios's VB to autoinstall the printer in windows. Problem is, that the printer needs a login and pass. I found registry entry, where this is stored, but the password is stored in REG_BINARY format.
Here is how it looks after manually writing the password into printer settings - see UserPass:
Please could you tell me how to convert password (in string) into the reg_binary (see attachement - red square)?
The password in this case was 09882 and it has been stored as 98 09 e9 4c c3 24 26 35 14 6f 83 67 8c ec c4 90. Is there any function in VB to convert 09882 into this REG_BINARY format please?
REG_BINARY means that it is binary data and binary data in .NET is represent by a Byte array. The values you see in RegEdit are the hexadecimal values of the individual bytes, which is a common representation because every byte can be represented by two digits. You need to convert your String to a Byte array and then save it to the Registry like any other data.
How you do that depends on what the application expects. Maybe it is simply converting the text to Bytes based on a specific encoding, e.g. Encoding.ASCII.GetBytes. Maybe it's a hash. You might need to research and/or experiment to find out exactly what's expected.
I am looking for an online calculator, a tool or at least an understandable article, which lets me define the value of dPermissiions parameter of Ghostscript command line.
Please, advice!
Its documented in the VectorDevices.htm, where it says its a bit field and directs you to the PDF Reference Manual. The actual values are defined by Adobe.
The various access permissions are described under the Standard Security Handler (on p121 of the 1.7 PDF Reference) and the individual bits are described in Table 3.20 (p124 and 124 in the 1.7 PDF Reference Manual).
Bits 1 and 2 (the lowest 2 bits) are always defined as 0, as (currently) are bits 13-32. Bits 7 & 8, annoyingly are reserved and must be 1.
So lets say you want to grant permission to print the document, to do that you need to set bit 3. So bits 1-2 are 0 and bits 4-32 are also 0, bits 7 and 8 must be 1. In binary that corresponds to:
00000000 00000000 00000000 11000100
Which in hex is 00 00 00 C4 which in decimal is 196. So you would set -dPermissions=196
To take a more complex example, we might also want to set bit 12 to allow a high quality print (for revision 3 or better of the security handler). Now we want to set bits 3 and 12, in binary:
00000000 00000000 00001000 11000100
in hex 00 00 08 C4 which is decimal 2244 so you would set -dPermissions=2244
The Windows calculator, when set to programmer mode, has a binary entry configuration. If you enter the bitfield in binary, and then switch to decimal it'll convert it for you. Alternatively there's an online conversion tool here.
Just write out the bits you want set as binary, set bits 7 & 8, then convert to decimal, simples!
--EDIT--
So as Vsevolod Azovsky pointed out, the bits 12-32 should be 1. Using the same tool I pointed at above you can get the decimal signed 2's complement of the binary representation, which you can use as the value for Permissions.
However, if you do that, then Ghostscript's pdfwrite device will produce a warning. The reason is that some of the bits I've set above (anything above bit 8) are only compatible with the revision 3 (or better) security handler, and the default for pdfwrite is to use the revision 2 security encryption.
So if you want to use the bits marked in the Adobe documentation as 'revision 3' then you (obviously) need to set the revision to 3 using -dEncryptionR=3. This requires that the output PDF file be a 1.4 or greater file, you can't use revision 3 with a PDF 1.3 file.
Note that for the revision 2 security handler all the bits 1-2 and 7-32 must be 1.
Hopefully that also answers the questions in the last comment.
I'm trying to teach myself basics of GNU Radio and DSP. I created a flowchart in GNU Radio Companion that takes a vector that is the binary representation of a single character (the character "1" as "00110001"), modulates, demodulates, and writes to a file sink.
The scope sink after demodulation looks like the values are returned (see below; appears to be correct pattern of 0s and 1s), but the file sink, although its size is 19 bytes, appears empty, or at least is not returning the correct values (I've looked at it in ASCII and Hex text editors). I assumed the single character transferred would result in 1 byte (or 8 bits) -- not 19 bytes. Changing some of the settings in the Polyphase Sync and adding a Repack Bits block after the binary slicer results in some characters in the output file, but never the right character.
My questions are:
Can GNU Radio take a single character, modulate/demodulate it, and return the same character?
Are there errors in my flowchart?
I'd appreciate any insights or suggestions, thank you.
Is there a way to detect if the Base 64 string contained in an NSData instance is an image or a text or any other object?
You can't generally just look at the base 64 string and decide, but you can decode the first few bytes of data, look at the hex codes (you can do this by decoding your base-64 string into a NSData and just NSLog it or examining it in the debugger), and draw some conclusions. For example:
Image files generally start with special byte sequences (e.g. JPEG start with the hex bytes FF D8; PNG generally start with hex bytes 89 50 4E 47 0D 0A 1A 0A (e.g. 89 "PNG" CR LF EOF LF, etc.). Note, there are a dizzying number of different image formats, so this is a non-trivial exercise, but sometimes you can get lucky and it will be self-evident that it's one of these common format when you glance at the first few bytes.
NSKeyedArchiver archives generally start with the string "bplist".
ASCII text consists of codes between 20 and 7F (with linefeeds represented by 0A; carriage return and linefeeds represented by OD 0A; tab characters as 09; etc.). Then, again, if it was a text, it's unlikely they'd be base-64 encoding it.
If it was UTF-8 it would conform to the coding pattern outlined here. For example, you can look at the first few high bits of the first byte that might conceivably represent a UTF-8 character, and conclude (a) how many bytes the character is represented by and (b) what high bits will be turned on those subsequent bytes. You can often quickly look at it and confirm whether the data conforms to this UTF-8 pattern or not (especially easy to do for most western languages)
If the first three characters were EF BB BF, that often indicates a UTF-8 byte order mark.
This is, by no means, an exhaustive list of codes, but just a few that leapt out at me.
To do this programmatically and do so exhaustively would be a non-trivial exercise. But if you're just "eye-balling" a base-64 string and trying to draw some logical inferences, decode it and look at the hex bytes and you can quickly narrow down the possibilities, at the very least. If you're unsure about how to interpret it, update your question with the hex representation of the decoded base-64 string (just the first 16-32 bytes, please), and we might be able to point you in the right direction.
It is impossible to clearly distinguish text string and Base64 image encoding string. The only way - check if your string is valid Base 64 encoding string. If it is - probably it is an image. If not - you can be sure it is a text.
How to check if string is valid Base 64 you can ere How to check whether the string is base64 encoded or not.