How to extract info from a file - conceptual

this may be a beginner's question. I've tried searching for info but couldn't find anything. Part of my work requires me to convert a specific, proprietary, file type. Unfortunately the software is no longer supported and can't be found. I have no idea where to start on this. I would like to write a little utility to basically convert the file for me to a standard file. Question is where do I start? Conceptually what am I looking at here? Is this even possible?

You could start by understanding what is stored in the file. Is there a pattern to the data, what is the pattern, how it is repeated, etc.
Then open the file in binary mode and try to find if there is indeed a pattern. If there is one, you should be able to see it, even if in binary mode.
And lots of patience :-)

Related

how-to-parse-a-ofx-version-1-0-2-file-in power BI?

I just read
How to parse a OFX (Version 1.0.2) file in PHP?
I am not a developer. What easy tool can I use to make this code run with no code skill or appetence ? web browser is pretty hard to use for non dev guys.
I need this to use the file into Power BI, which accept M code, json source or xml, but not sgml ofx or PHP.
Thanks in advance
Welcome Didier to StackOverflow!
I'm going to try and give you a clue how I'd approach the problem here. But keep in mind that your question really lacks details for us to help you, and I'm asking to update your question with example data that you want to integrate into PowerBI. Also, I'm not too familiar with PowerBI nor PHP, and won't go into making that PHP code you linked run for you.
Rather, I'd suggest to convert your OFX file into XML, and then use PowerBI's XML import on that converted file.
From your linked question, I get that your OFX file is in SGML format. There's a program specifically designed to convert SGML into XML (which is just a restricted form of SGML) called osx. I've detailed how to install it on Linux and Mac OS in another question related to SGML-to-XML down-converting; if you're on Windows, you may have luck by just downloading a really ancient (32bit) version of it from ftp://ftp.jclark.com/pub/sp/win32/sp1_3_4.zip. Alternatively, you can use my sgmljs.net software as explained in Converting HTML to XML though that tutorial is really about the much more complex task of converting HTML to XML/XHTML and will probably confuse you.
Anyway, if you manage to install osx, running it on your OFX file (which I assume to have the name yourfile.ofx just for illustration) is just a matter of invoking (on the Windows or Linux/Mac OS command line):
osx yourfile.ofx > yourfile.xml
to result in yourfile.xml which you can attempt to load with PowerBI.
Chances are your OFX file has additional text at the beginning (lines like XYZ:0001 that come before <ofx>). In that case, you can just remove those lines using a text editor before invoking osx on it. Maybe you also need a .dtd file or additional instructions at the top of the OFX file informing SGML about the grammar of your file; it's really difficult to say without seeing actual test data.
Before bothering with SGML and all that, however, I suggest to remove those first few lines in your OFX file (everything until the first < character) and check if PowerBI can already recognize your changed input file as XML (which, from other OFX example files, has a good chance of succeeding). Be sure to work on a copy of your original file rather than overwriting it. Then come back and update your question with your results and example data.

How to read a Executable(.EXE) file in OpenVMS

When am trying to open any .EXE file am getting information in encoded form. Any idea how to see the content of an .EXE file ????
I need to know what Database tables are used in the particular .EXE.
Ah, now we are getting closer to the real question.
It is probably much more productive to ask the targeted databases about the SQL queries being execute during the run, or a top-ten shortly afterwards.
The table-names might not be hard-coded recognizably as such in the executable.
They might be obtained by a lookup, and some fun pre-fixing or other transformation might be in place.
Admittedly they like are clear text.
Easiest is probably to just transfer to a Unix server and use STRINGS on the image.
I want to include the source here with but that failed, and I cannot find how to attach a file. Below you'll find a link OpenVMS macro program source for a STRINGS like tool. Not sure how long the link will survive.
Just read for instructions, save (strings.mar), compile ($ MACRO strings), link ($link strings), and activate ($ mcr sys$login:strings image_to_test.exe)
OpenVMS Macro String program text
Good luck!
Hein
Use analyze/image to view the contents of an executable image file.
I'm guessing you are trying to look in the EXE because you do not have access to the source. I do something like this:
$ dump/record/byte/hex/out=a.a myexe.exe
Then look at a.a with any text editor (132 columns). The linker groups string literals together, and they are mostly near the beginning of the EXE, so you don't have to look to far into the file. Of course this only helps if the database references are string literals.
The string literal might be broken across a block (512 byte) boundary, so if you use search in your editor, try looking for substrings.
Aksh - you are chasing your tail on this one. Its a false dawn. Even if you could (and you can't) find the database tables, you will need the source of the .exe to do anything sensible with it, or the problem you are trying to solve. Its possible to write a program which just lists all the tables in a database without reading any of 'em. So you could spend and awful lot of effort and get nowhere. Hope this helps

What firmware file format uses the header "Intel_FBF"

I'm trying to identify a file, but it has a header I've never seen before: "Intel_FBF". It's likely an embedded firmware file, but I have no idea.
Sorry, that's not much information to go by. Can you give any other hints about its history, context?
Is it binary, text?
What is the file name, extension?
Where did it come from?
What were you told it's good for?
One thing you could try is the file utility on Linux.

What file type starts with BOSS 7?

I am looking at some files generated in the early 90s. One of them seems to hold references to data packed in some binary format in a number of large files.
The first six bytes of the file are 0x42 0x4f 0x53 0x53 0x20 0x37 which spells BOSS 7.
My searches of various sources of file type information, including /usr/share/file/magic have not turned up anything. Does anyone know what software might have been used to generate files that start with these bytes? Any information on file layout would be great.
It looks like the file might have been generated by VisualWorks Smalltalk:
[BOSS 7.5]
Contains the Binary Object Streaming Service, which supports efficient storage and
retrieval of objects, including code, to and from files.
Note that for code storage, the parcel system now supercedes BOSS.
I tried to load the file using the IDE provided at http://www.cincomsmalltalk.com/ and it generated a meaningful exception:
The identifier MediaCollectionDictionary has no binding
The file does contain:
MediaCollectionDictionary
MediaCollection*
CallMediaVehDict2
etc which means, if I could now figure out what the rest of the files do and learn enough SmallTalk, I could disentangle this mess.
Of course, I am not sure if this analysis is correct. So, please if you have any other ideas, let me know. Thank you.
Much later: So, my initial assessment seems to be correct. I got some useful tips on comp.lang.smalltalk: http://groups.google.com/group/comp.lang.smalltalk/browse_thread/thread/5d55d857e2f80158#
Ask on comp.lang.smalltalk
Ask on the vwnc mailing list

Batch source-code aware spell check

What is a tool or technique that can be used to perform spell checks upon a whole source code base and its associated resource files?
The spell check should be source code aware meaning that it would stick to checking string literals in the code and not the code itself. Bonus points if the spell checker understands common resource file formats, for example text files containing name-value pairs (only check the values). Super-bonus points if you can tell it which parts of an XML DTD or Schema should be checked and which should be ignored.
Many IDEs can do this for the file you are currently working with. The difference in what I am looking for is something that can operate upon a whole source code base at once.
Something like a Findbugs or PMD type tool for mis-spellings would be ideal.
As you mentioned, many IDEs have this functionality already, and one such IDE is Eclipse. However, unlike many other IDEs Eclipse is:
A) open source
B) designed to be programmable
For instance, here's an article on using Eclipse's code formatting functionality from the command line:
http://www.peterfriese.de/formatting-your-code-using-the-eclipse-code-formatter/
In theory, you should be able to do something similar with it's spell-checking mechanism. I know this isn't exactly what you're looking for, and if there is a program for doing spell-checking in code then obviously that'd be better, but if not then Eclipse may be the next best thing.
This seems little old but seems to do a good job
Source Code Spell Checker