How to parse ISO8583 message from a text file & write it to a database - iso8583

I have a few ISO8583 logs in a text file. I want to parse these logs and write them to a database along with some descriptive information, such as the message class, message function, message origin, processing code, response code, etc.
I am new to BASE24/ISO8583 and was trying to find a ready-made parser for this. Is there any such parser available? Does jPOS provide such functionality?
EDIT
I have the logs in ISO8583 format in ".log" file as given below:
MTI : 0200
Field-3 : 201234
Field-4 : 000000010000
Field-7 : 0110722180
Field-11 : 123456
Field-44 : A5DFGR
Field-105 : ABCDEFGHIJ 1234567890
This is the same as the format given in the link you shared.
The file also contains a hex dump, but I don't want to parse that.
The code in the link packs and unpacks the message, whereas I am trying to read these logs (in unpacked form) and write them into a database table.
I think I need to write my own code for this and use the jPOS packagers in it.

It really depends on the format of the log file. Are the ISO8583 messages hex strings, a hex dump, an XML representation of ISO8583, or some other application trace file?
Once you know the format (it might require some massaging), you will want to research the ISOMsg.unpack() methods using the appropriate jPOS packager. The packager defines the structure of the various ISO8583 fields and how each field is constructed (lengths, character set, etc.).
A good example can be found in the following blog post, in the "Parse (unpack) ISO Message" section: http://jimmod.com/blog/2011/07/26/jimmys-blog-iso-8583-tutorial-build-and-parse-iso-message-using-jpos-library/
You mention BASE24. jPOS does have a few packagers that might be a close starting point:
https://github.com/jpos/jPOS/blob/master/jpos/src/dist/cfg/packager/base24.xml

Those human-readable log formats are usually difficult to parse without losing information. Moreover, the logs are probably PCI compliant, so a lot of the information in them is masked. You want to ask for a hex dump of the messages.

What is displayed in the log file is an already parsed ISO message, so you do not need jPOS. jPOS is only for packing and unpacking when you transmit the message.
Assign each field to a variable and write it to the DB.
For example, Field 39 is the response code.
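As a rough illustration of the "assign the field to a variable and write it to the DB" idea, here is a minimal Python sketch. It assumes the log lines look exactly like the sample in the question ("MTI : 0200", "Field-3 : 201234"). The table name iso_log, the column names, and the helper functions are hypothetical, and SQLite stands in for whatever database you actually use. The class, function, and origin labels come from the second, third, and fourth digits of the MTI, and only the most common values are listed.

```python
import re
import sqlite3

# Matches lines such as "MTI : 0200" or "Field-39 : 00"; hex dump lines are skipped.
LINE_RE = re.compile(r"^(MTI|Field-(\d+))\s*:\s*(.*)$")

# Meaning of MTI digits 2, 3 and 4 (common values only).
MSG_CLASS = {"1": "authorization", "2": "financial", "4": "reversal",
             "8": "network management"}
MSG_FUNCTION = {"0": "request", "1": "request response",
                "2": "advice", "3": "advice response"}
MSG_ORIGIN = {"0": "acquirer", "1": "acquirer repeat",
              "2": "issuer", "3": "issuer repeat"}


def parse_message(lines):
    """Turn the text lines of one logged message into a dict keyed by 'MTI' or field number."""
    msg = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            key = "MTI" if m.group(1) == "MTI" else int(m.group(2))
            msg[key] = m.group(3).strip()
    return msg


def describe(msg):
    """Build one database row with the descriptive columns derived from the message."""
    mti = msg["MTI"]
    return {
        "mti": mti,
        "msg_class": MSG_CLASS.get(mti[1], "unknown"),
        "msg_function": MSG_FUNCTION.get(mti[2], "unknown"),
        "msg_origin": MSG_ORIGIN.get(mti[3], "unknown"),
        "processing_code": msg.get(3),   # Field 3
        "response_code": msg.get(39),    # Field 39, absent in requests
    }


def store(conn, row):
    """Insert one row into the (hypothetical) iso_log table, creating it if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS iso_log (mti TEXT, msg_class TEXT, "
        "msg_function TEXT, msg_origin TEXT, processing_code TEXT, response_code TEXT)"
    )
    conn.execute(
        "INSERT INTO iso_log VALUES (:mti, :msg_class, :msg_function, "
        ":msg_origin, :processing_code, :response_code)",
        row,
    )
    conn.commit()


SAMPLE = """MTI : 0200
Field-3 : 201234
Field-4 : 000000010000
Field-11 : 123456
Field-105 : ABCDEFGHIJ 1234567890"""

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store(conn, describe(parse_message(SAMPLE.splitlines())))
    print(conn.execute("SELECT * FROM iso_log").fetchall())
```

A real log file will hold many messages, so you would also need a rule for where one message ends and the next begins (for example, every "MTI" line starts a new message); that rule depends on your log layout and is not shown here.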

Using jPOS is a good idea. You should write your own custom packager class.

Related

Reading *.cdpg file with python without knowing structure

I am trying to use Python to read a .cdpg file. It was generated by LabVIEW code. I do not have access to any information about the structure of the file. Using another post I have had some success, but the numbers are not making any sense. I do not know if my code is wrong or if my interpretation of the data is wrong.
The code I am using is:
import struct
with open(file, mode='rb') as f:  # b is important -> binary
    fileContent = f.read()
ints = struct.unpack("i" * ((len(fileContent) - 24) // 4), fileContent[20:-4])
print(ints)
The file is located here. Any guidance would be greatly appreciated.
Thank you,
T
According to the documentation here https://www.ni.com/pl-pl/support/documentation/supplemental/12/logging-data-with-national-instruments-citadel.html
The .cdpg files contain trace data. Citadel stores data in a
compressed format; therefore, you cannot read and extract data from
these files directly. You must use the Citadel API in the DSC Module
or the Historical Data Viewer to access trace data. Refer to the
Citadel Operations section for more information about retrieving data
from a Citadel database.
.cdpg is a closed format containing compressed data. You won't be able to interpret the files properly without knowing the file format structure. You can read the raw binary content, and this is what your example Python code is actually doing.
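To illustrate why the numbers "do not make sense", here is a small sketch showing that the same bytes give completely different values depending on the layout you assume. The eight bytes are made up for the example and have nothing to do with the real .cdpg layout; with compressed data, none of these interpretations would be meaningful.

```python
import struct

# Eight arbitrary bytes standing in for a slice of a binary file.
raw = bytes([0x00, 0x00, 0x80, 0x3F, 0x00, 0x00, 0x00, 0x40])

little_ints = struct.unpack("<2i", raw)    # two little-endian 32-bit ints
big_ints = struct.unpack(">2i", raw)       # the same bytes as big-endian ints
little_floats = struct.unpack("<2f", raw)  # the same bytes as little-endian floats

print(little_ints)    # (1065353216, 1073741824)
print(big_ints)       # (32831, 64)
print(little_floats)  # (1.0, 2.0)
```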

How to convert multiple LCI ecospold files to a custom excel format/ how to use parse_file from pyecospold/ how to read ecospold into brightway

I have multiple ecospold (version 1) files with LCI data that I want to convert to a custom excel format. I need all data given in the ecospold file. For my own convenience I want to use python to complete this task.
My research until now has led me to the following conclusions:
There exist at least two converters (by GLAD and openLCA) to convert ecospold formats (1 and 2) to e.g. the ILCD. But those formats do not get me anywhere, since I need to have all the data accessible in python in order to then write it into my custom excel format.
To get the data in python, the package pyecospold (https://github.com/sami-m-g/pyecospold) seems to be a suitable choice.
According to the README that can be found at the pyecospold github repository,
ecoSpold = parse_file("data/v1/v1_1.xml") # Replace with your own XML file
should do the job. So I implemented the following lines:
import os
from pyecospold import parse_file, save_file, Defaults
from lxml import etree
cd = os.getcwd()
path_input = cd + r'\inputs\ecospold_test.xml'
# Parse the required XML file to EcoSpold class.
es = parse_file('inputs/ecospold_test.xml')
Now I run into the error:
TypeError: parse_file() missing 2 required positional arguments: 'schema_path' and 'ecospold_lookup'
I understood that a schema in xsd format is needed, therefore I got the schema files from the github and amended my last line of code:
es = parse_file('inputs/ecospold_test.xml', 'inputs/schemas/v1/EcoSpold01Dataset.xsd')
Now there is still one argument missing:
TypeError: parse_file() missing 1 required positional argument: 'ecospold_lookup'
Since I have no experience in parsing xml files in python, I have no idea what to do with this. Additionally, I am confused why the README does not say anything about those additionally needed arguments.
My second idea was to use brightway to get the data into python. But since brightway itself is quite an extensive package, I could not find a simple (or any) way to do this. (Sadly, the notebooks linked in the answer of this question Import Ecoinvent 2.2 Ecospold files into Brightway do not exist anymore)
Another option would of course be to write my own parser. But because I am lacking experience and pyecospold does exactly this (at least in my understanding), I would like to avoid this option.
Additionally, in openLCA it is possible to read in ecospold files and then export them to an excel format. From this excel format I could of course make my custom excel format. The problem here is that I have no idea how to automate this, because I do not want to read in and export each file individually and manually in openLCA.
If anyone has an idea on how to solve one of my subproblems or a good alternative on how to solve my general problem, I would be very thankful. :)
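For the "write my own parser" option, one possible starting point is a generic walk over the XML with the standard library, which needs no knowledge of the ecospold schema. The sketch below flattens every attribute and text node into (path, attribute, value) rows and writes them as CSV, which Excel can open. The sample document and its element names are made up for illustration, and the function names flatten and to_csv are hypothetical; this is not the pyecospold API.

```python
import csv
import io
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Yield (path, attribute, value) rows for every attribute and text node."""
    root = ET.fromstring(xml_text)

    def walk(elem, path):
        tag = elem.tag.split("}")[-1]  # drop the "{namespace}" prefix if present
        here = f"{path}/{tag}"
        for name, value in elem.attrib.items():
            yield here, name, value
        if elem.text and elem.text.strip():
            yield here, "#text", elem.text.strip()
        for child in elem:
            yield from walk(child, here)

    yield from walk(root, "")


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["path", "attribute", "value"])
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample; a real ecospold file has many more elements and attributes.
SAMPLE = (
    '<ecoSpold><dataset number="1"><flowData>'
    '<exchange name="water" unit="kg" meanValue="2.5"/>'
    "</flowData></dataset></ecoSpold>"
)

if __name__ == "__main__":
    print(to_csv(flatten(SAMPLE)))
```

To handle many files, you could loop over the input folder (for example with pathlib.Path.glob) and call flatten on the text of each file.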

How to get a RAW16 from CX3

This is my data flow for my system:
Because I could not find a demo that configures RAW16, and the enum type CyU3PMipicsiDataFormat_t does not contain a RAW16 type, I don't know how to transfer my RAW16 data to the host.
I tried using the YUV422 configuration to transfer my raw data to the host, and I did receive data from the CX3 in e-cam, but the image is wrong because e-cam uses the YUY2 format to interpret the raw data. Now I think I can use MATLAB to grab a frame and process it. But when I used MATLAB to take a snapshot, I found that the data has a shape like this: 1280*800*3 (full frame size: 1280x800). Does MATLAB treat it as YUV data? And how can I configure the CX3 to support RAW16, or how should I deal with the data I grab from the CX3 when it is transferred in the YUV format?
Has any other developer had a requirement like mine?
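One way to think about the second option: YUY2 uses 2 bytes per pixel, the same as RAW16, so if the host can hand over the untouched frame bytes (before any YUY2-to-RGB conversion), they can simply be reinterpreted as 16-bit pixels. The sketch below assumes little-endian byte order and a hypothetical helper name; the 1280*800*3 shape seen in MATLAB suggests that frame had already been converted to three channels, in which case this reinterpretation would not apply to it.

```python
import struct


def yuy2_buffer_to_raw16(buf, width, height):
    """Reinterpret an unconverted 2-bytes-per-pixel frame buffer as rows of 16-bit pixels."""
    expected = width * height * 2
    if len(buf) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    # "<H" = little-endian unsigned 16-bit; the byte order is an assumption.
    pixels = struct.unpack(f"<{width * height}H", buf)
    return [list(pixels[r * width:(r + 1) * width]) for r in range(height)]


if __name__ == "__main__":
    # Tiny 2x2 frame instead of 1280x800, just to show the layout.
    frame = bytes([0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0xFF, 0xFF])
    print(yuy2_buffer_to_raw16(frame, 2, 2))  # [[1, 2], [3, 65535]]
```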

Solr pdf index bad request

I'd like to have a simple setup of solr where I can index and search large folders of pdf/docx files. I mostly need just full text search, no need to have fields separated and the original documents do not seem to have well defined structure anyway. I follow https://lucene.apache.org/solr/quickstart.html which is straightforward, however, when I try to index my own folder with some pdf files, some files return error like:
POSTing file G1504225.pdf (application/pdf) to [base]/extract
SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for
url: http://localhost:8983/solr/gettingstarted/update/extract?
resource.name=%2Fhome%2Fsolr%2Fsolr-6.5.1%2F..%2Ftrain_data%2FG1504225.pdf&literal.id=%2Fhome%2Fsolr%2Fsolr-6.5.1%2F..%2Ftrain_data%2FG1504225.pdf
SimplePostTool: WARNING: Response: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int
name="QTime">263</int></lst><lst name="error"><lst name="metadata"><str
name="error-class">org.apache.solr.common.SolrException</str><str
name="root-error-class">java.lang.NumberFormatException</str><str
name="error-class">org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException</str><str name="root-error-class">org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException</str></lst><str name="msg">Async exception during distributed update: Error from server at http://127.0.1.1:8983/solr/gettingstarted_shard2_replica1: Bad Request
request:
http://127.0.1.1:8983/solr/gettingstarted_shard2_replica1/update?update.chain=add-unknown-fields-to-the-schema&update.distrib=TOLEADER&distrib.from=http%3A%2F%2F127.0.1.1%3A8983%2Fsolr%2Fgettingstarted_shard1_replica1%2F&wt=javabin&version=2
Remote error message: ERROR: [doc=/home/solr/solr-6.5.1/../train_data/G1504225.pdf] Error adding field 'title'='United Nations' msg=For input string: "United Nations"</str><int name="code">400</int></lst>
</response>
SimplePostTool: WARNING: IOException while reading response:
java.io.IOException: Server returned HTTP response code: 400 for URL:
http://localhost:8983/solr/gettingstarted/update/extract?
resource.name=%2Fhome%2Fsolr%2Fsolr-6.5.1%2F..%2Ftrain_data%2FG1504225.pdf&literal.id=%2Fhome%2Fsolr%2Fsolr-6.5.1%2F..%2Ftrain_data%2FG1504225.pdf
Most of the files are fine and I can search them. Any ideas?
Solr uses Tika to extract the text from those files. Some types of files, PDF especially, are hard to parse, since the format is complex and Tika is always catching up with edge cases. So it is normal that some files will throw errors. You have to expect that.
See how many instances of NumberFormatException/pdfbox are found... (pdfbox is the library Tika uses for PDF files).
If you really want to get all the text from every PDF, even the ones that error, you can put the failing files in a special folder and process them again, extracting the text yourself with another library. Different libraries produce different results for the same PDF, so you can use the superset of the text that several libraries produce. But you will have to write some glue code for this, unless Tika allows you to plug in specific libraries for specific file types (not sure if it does now; it didn't before).
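As a starting point for that glue code, here is a minimal sketch that scans the output of the post tool and lists the files that were rejected, so they can be moved to a separate folder and reprocessed. The two patterns are based only on the log lines shown in the question ("POSTing file ..." followed by "SimplePostTool: WARNING: Solr returned an error #400"); the function name is hypothetical and other Solr versions may print different text.

```python
import re

# Patterns taken from the log excerpt in the question.
POSTING_RE = re.compile(r"^POSTing file (.+) \([^)]*\) to ")
ERROR_RE = re.compile(r"^SimplePostTool: WARNING: Solr returned an error #(\d+)")


def failed_files(log_text):
    """Return (file name, HTTP status) pairs for files whose POST was rejected."""
    failed = []
    current = None
    for line in log_text.splitlines():
        m = POSTING_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = ERROR_RE.match(line)
        if m and current is not None:
            failed.append((current, int(m.group(1))))
            current = None  # count each file once, even if more warnings follow
    return failed


if __name__ == "__main__":
    sample = (
        "POSTing file ok.pdf (application/pdf) to [base]/extract\n"
        "POSTing file G1504225.pdf (application/pdf) to [base]/extract\n"
        "SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for\n"
        "SimplePostTool: WARNING: IOException while reading response:\n"
    )
    print(failed_files(sample))  # [('G1504225.pdf', 400)]
```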

MSBuild and IgnoreStandardErrorWarningFormat

I'm trying to write a MSBuild project that will generate html documentation using doxygen. I couldn't find anything about that on the net except for one example, which seems incomplete; it doesn't parse doxygen warnings.
I found that MSBuild's Exec task has parameters like IgnoreStandardErrorWarningFormat and CustomWarningRegularExpression. What is the "Standard Error/Warning Format" and what kind of REs are allowed in these properties?
Edit: ah, "Inside the Microsoft Build Engine" wrongly describes it as property in .NET 3.5, where it is actually from 4. No use for me...
The standard msbuild error/warning format is described here.
In a nutshell, the format is:
MSBuild recognizes error messages and warnings that have been specially formatted by many command line tools that typically write to the console. For instance, take a look at the following error messages - they are all properly formatted to be MSBuild and Visual Studio friendly.
Main.cs(17,20): warning CS0168: The variable 'foo' is declared but never used
C:\dir1\foo.resx(2) : error BC30188: Declaration expected.
cl : Command line warning D4024 : unrecognized source file type 'foo.cs', object file assumed
error CS0006: Metadata file 'System.dll' could not be found.
These messages conform to a special format, shown below, that comprises 5 parts. The order of these parts is important and should not change:
Origin (Required)
Origin can be blank. If present, the origin is usually a tool name, like 'cl' in one of the examples. But it could also be a file name, like 'Main.cs' shown in another example. If it is a file name, then it must be an absolute or a relative file name, followed by an optional parenthesized line/column information in one of the following forms:
(line) or (line-line) or (line,col) or (line,col-col) or (line,col,line,col)
Subcategory (Optional)
Subcategory is used to classify the category itself further, and should not be localized.
Category (Required)
Category must be either 'error' or 'warning'. Case does not matter. Like origin, category must not be localized.
Code (Required)
Code identifies an application specific error code / warning code. Code must not be localized and it must not contain spaces.
Text (Optional)
User friendly text that explains the error, and must be localized if you cater to multiple locales.
The format is fully documented in the MSBuild source code here.
I can't find docs on it right now, but I think the standard error format is something like
.*(\(\d+(,\d+(,\d+,\d+)?)?\))?: error .*:.*
.*(\(\d+(,\d+(,\d+,\d+)?)?\))?: warning .*:.*
examples:
c:\somefile.txt(10,20,10,30): error CMD1234: blarg
c:\somefile.txt(10,20): error CMD1234: yadda yadda
c:\somefile.txt: warning ARG5678: blah blah
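To make the five parts concrete, here is a Python sketch that splits a message into origin, location, subcategory, category, code, and text. The regular expression is a simplified approximation written for this example, not the expression MSBuild itself uses, and the function name is hypothetical; it handles the sample messages above but may disagree with MSBuild on unusual input.

```python
import re

# Simplified approximation of the canonical format described above.
CANONICAL_RE = re.compile(
    r"^\s*"
    r"(?:(?P<origin>(?:[A-Za-z]:)?[^:]*?)\s*:\s*)?"  # optional origin, may start with a drive letter
    r"(?P<subcategory>[^:]*?\s)??"                   # optional subcategory, e.g. "Command line "
    r"(?P<category>error|warning)\s+"
    r"(?P<code>[^\s:]+)\s*:\s*"
    r"(?P<text>.*)$",
    re.IGNORECASE,
)

# Splits "Main.cs(17,20)" into the file name and the parenthesized location.
ORIGIN_RE = re.compile(r"^(?P<file>.*?)(?:\((?P<location>[\d,\-]+)\))?$")


def parse_canonical(line):
    """Return the parts of an error/warning line as a dict, or None if it does not match."""
    m = CANONICAL_RE.match(line)
    if not m:
        return None
    parts = m.groupdict()
    origin = ORIGIN_RE.match(parts["origin"] or "")
    return {
        "origin": origin.group("file") or None,
        "location": origin.group("location"),
        "subcategory": (parts["subcategory"] or "").strip() or None,
        "category": parts["category"].lower(),
        "code": parts["code"],
        "text": parts["text"],
    }


if __name__ == "__main__":
    print(parse_canonical(r"C:\dir1\foo.resx(2) : error BC30188: Declaration expected."))
```

With CustomWarningRegularExpression you would supply a .NET regular expression rather than a Python one, so treat this only as a way to see how the parts line up.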