Convert .bin to .png for files located in word/embedding of a word.zip file - vba

I have been stuck on this problem for some hours now.
I have a Word file with embedded OLE objects. The OLE objects are PNG files.
I unzipped the Word file by renaming it to .zip and found the files I need via \document.xml and _rels\document.xml.rels: I could identify the needed OLE object in document.xml, and the r:id under <v:imagedata> links to the respective OLE object in document.xml.rels.
Now when I open the OLE-Object via the word file, it is recognized as expected as a PNG. When I look it up under \embeddings\oleObject1.bin, it is no longer a png, but a bin instead.
Now simply renaming the bin to png does not work...
My question would be: How do I turn this bin into a graphics format I can use?
Thank you very much!

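*.bin streams are not raw images but containers themselves (OLE compound files, which archive tools such as 7-Zip can open much like a zip archive). When you extract them you will find their true contents along with some metadata.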
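One rough way to get at the image without extra tooling is to scan the .bin file for the PNG signature and cut out everything up to the closing IEND chunk. This is only a sketch: it assumes the original PNG bytes sit unmodified and contiguous inside the container, which is common but not guaranteed, and the function names carve_png and extract_png are made up for illustration.

```python
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file
PNG_END_MARKER = b"IEND"              # chunk type of the final PNG chunk


def carve_png(data: bytes) -> Optional[bytes]:
    """Return the first PNG found inside data, or None if there is none."""
    start = data.find(PNG_SIGNATURE)
    if start == -1:
        return None
    end = data.find(PNG_END_MARKER, start)
    if end == -1:
        return None
    # The IEND chunk type is followed by a 4-byte CRC that ends the file.
    return data[start:end + len(PNG_END_MARKER) + 4]


def extract_png(bin_path: str, png_path: str) -> bool:
    """Write the carved PNG to png_path; return False if nothing was found."""
    with open(bin_path, "rb") as source:
        image = carve_png(source.read())
    if image is None:
        return False
    with open(png_path, "wb") as target:
        target.write(image)
    return True
```

Calling extract_png("oleObject1.bin", "oleObject1.png") would then write the image next to the .bin file. If the result does not open, the object was probably stored in another form and a proper OLE reader is needed.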

Related

Edit source code of Impress odp file

I want to edit the source code of an Impress file (.odp), but when I open it in a text editor it is just machine code.
I want to do this because when I converted files from PowerPoint to Impress, some parts got mixed up. For example, the footer and numbering can't be changed globally. By editing the source code, I hope to be able to use find/replace in a text editor.
LibreOffice formats are zipped archives primarily containing XML files. So unzip the .odp and then edit content.xml.
When finished, zip it back up, making sure to zip it from the correct directory (the one that contains content.xml).
Documentation: https://help.libreoffice.org/Common/XML_File_Formats#XML_file_structure.
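The unzip, edit, re-zip cycle described above can also be scripted, which avoids zipping from the wrong directory. The following is a minimal sketch, not a definitive tool: the function name rewrite_content_xml and its edit callback are hypothetical. It copies every entry in its original order and with its original compression method, so the mimetype entry stays first and uncompressed as the package format expects.

```python
import zipfile
from typing import Callable


def rewrite_content_xml(src_path: str, dst_path: str,
                        edit: Callable[[str], str]) -> None:
    """Copy an .odp package, passing content.xml through edit()."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "content.xml":
                data = edit(data.decode("utf-8")).encode("utf-8")
            # Reusing the original ZipInfo keeps the entry order, the entry
            # name and the per-entry compression method.
            dst.writestr(info, data)
```

For example, rewrite_content_xml("slides.odp", "slides-fixed.odp", lambda xml: xml.replace("Old footer", "New footer")) would perform a global find/replace; the file names and footer strings here are placeholders.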
If you are using a Mac, do the following:
1. Change the .odp extension to .zip by manually clicking the icon and renaming the file.
2. Unzip the file using something other than the standard Archiver (I used Keka).
3. You will see the folder of contents, including the content.xml, which you can now easily edit.
4. Crucial: Go into the directory with your separate files, select all the files, then choose 'compress' from the right-click options menu.
5. Rename the .zip to .odp and the file will open successfully.
I found that if you don't do step 4 above exactly, the resulting file is slightly different and won't open, failing with a corruption message.

I found a file in raw lyx output, how do I create a readable pdf or txt file from this mess?

https://raw.githubusercontent.com/jarcane/bedroom-wall-press/master/hulks-and-horrors/HnHCompanionI.lyx
I have installed LyX and tried pasting the contents into it. I have also tried pasting into OpenOffice, exporting as plain text, and then importing that plain text into LyX. Whenever I export the file as PDF or text, the format coding is always included.
I just want the human readable portion of the document.
Any help would be appreciated, thank you.
The LyX file you link to is indeed a valid .lyx file. To use it, do the following:
1. Download the file. The easiest way to do this is to just run
   wget "https://raw.githubusercontent.com/jarcane/bedroom-wall-press/master/hulks-and-horrors/HnHCompanionI.lyx"
2. Open the file in LyX.
3. Compile to PDF by clicking on the "eyes" icon, or by going to File > Export > PDF (pdflatex), in which case a .pdf file will be created in the same directory as the .lyx file.
Note that the .lyx file depends on other files. For example, there is an image included in the .lyx file with a path "C:/Users/BearBear/Google Drive/Hulks and Horrors B&W Logo for Print.png".
It is possible that you won't be able to compile the document because of the missing .png or because you do not have a complete TeX installation. In this case, you can simply read the document in LyX. It is not as pretty as in the PDF but it is certainly readable in my opinion.
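If neither compiling nor reading in LyX is an option, a very rough plain-text dump can be made by dropping the header block and every line that starts with a backslash command or a "#" comment. This is only a crude sketch with a hypothetical function name (lyx_to_rough_text). It is not how LyX itself exports text, and parameters of insets such as figures or notes will still leak into the output.

```python
def lyx_to_rough_text(source: str) -> str:
    """Keep only lines of a .lyx file that look like body text."""
    kept = []
    in_header = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == "\\begin_header":
            in_header = True
            continue
        if stripped == "\\end_header":
            in_header = False
            continue
        # Skip the header (it may contain a raw LaTeX preamble) and blanks.
        if in_header or not stripped:
            continue
        # Lines starting with a backslash are LyX commands; "#" lines are
        # comments written at the top of the file.
        if stripped.startswith(("\\", "#")):
            continue
        kept.append(stripped)
    return "\n".join(kept)
```

Reading the downloaded file with open(..., encoding="utf-8") and passing its contents to this function gives text that is usable for skimming, though not pretty.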

Obtain the "absolute path to the workbook" from an .xls file

When I use the Excel "Document Inspector" on a particular .xls file to check for "hidden properties or personal information" it says:
The following document information was found:
* Absolute path to the workbook
How can I obtain the absolute path of the workbook from the file? If it needs to be done programmatically, I could use Java (e.g. Apache POI) or VBA.
I know where the file is currently saved, but what I want to extract is the absolute path to the workbook which is saved in the file I have. This is so I can know where it was saved by the author.
Here's what has happened to the file:
Someone authored it, saving it at some absolute filepath unknown to me
They uploaded it to a website
I downloaded it from the website
Excel indicates that the document contains the absolute path from step 1. I'm after this path, not the place I saved it at step 3 since I know that.
I could reproduce that warning message by simply creating an empty Excel file, adding a formula, and saving it as BIFF8 (.xls). The Document Inspector then warns about the absolute path, but in my case there was no filename inside the file.
A simple way to verify this is to open the file in a hex editor and search for a well-known save location (i.e. the location where a dummy/test file was stored). It would be stored either as ASCII or as a 16-bit string, i.e. every other byte is a character and the bytes in between are zero.
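The hex-editor search can be approximated in a few lines. This sketch looks for a known path in both encodings mentioned above (ASCII and 16-bit little-endian) and reports the byte offsets of each hit. The function name find_path_offsets is hypothetical, and the search is case-sensitive.

```python
from typing import Dict, List


def find_path_offsets(data: bytes, path: str) -> Dict[str, List[int]]:
    """Return byte offsets where path occurs as ASCII or as UTF-16LE."""
    hits: Dict[str, List[int]] = {}
    for label, encoding in (("ascii", "ascii"), ("utf-16le", "utf-16-le")):
        try:
            needle = path.encode(encoding)
        except UnicodeEncodeError:
            # The path has characters this encoding cannot represent.
            continue
        offsets = []
        position = data.find(needle)
        while position != -1:
            offsets.append(position)
            position = data.find(needle, position + 1)
        hits[label] = offsets
    return hits
```

Read the workbook with open("file.xls", "rb").read() and pass it in together with the folder where a test file was saved. Empty lists for both encodings would support the observation that no path is stored in the file.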
If you want to use the POI developer tools, you can use the following:
To list all Excel records:
java -cp poi-3.16-beta1.jar org.apache.poi.hssf.dev.BiffViewer file.xls
To list the document and summary properties:
java -cp poi-3.16-beta1.jar org.apache.poi.hpsf.extractor.HPSFPropertiesExtractor file.xls
To list any embedded objects besides the usual suspects SummaryInformation, DocumentSummaryInformation and Workbook:
java -cp poi-3.16-beta1.jar org.apache.poi.poifs.dev.POIFSLister file.xls
So after running the tools and recording the output, you can then remove the properties via the Excel Document Inspector and execute the tools again. The output can be diffed and you might find the culprit.
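The before/after comparison can be done with any diff tool. As a small sketch, assuming the output of each tool run was captured as text, the following keeps only the lines that were removed or added between the two dumps. The function name diff_dumps is made up for illustration, and ndiff can be slow on very large dumps.

```python
import difflib
from typing import List


def diff_dumps(before: str, after: str) -> List[str]:
    """Return lines only in before ("- ") or only in after ("+ ")."""
    delta = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in delta if line.startswith(("- ", "+ "))]
```

Lines prefixed with "- " are records or properties that the Document Inspector removed, which makes them the first candidates for the stored path.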
Assuming it is an .xlsx file rather than an older-style .xls file, you can:
1. Rename the workbook as a .zip file
2. Look at the xl\workbook.xml "file" within the .zip file
and you will find the absolute path as of when it was last saved from Excel.
This is why it is not a good idea to share work-related spreadsheets with other people, unless you first clear out this sort of information.
I'm not sure how to find it in the binary format files.
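For the .xlsx case, the lookup in xl\workbook.xml can be scripted without renaming anything. This is a sketch under an assumption: that the path is stored in an element whose local name is absPath with a url attribute, which is what recent Excel versions appear to write. The function name read_abs_path is hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile
from typing import Optional


def read_abs_path(xlsx_path: str) -> Optional[str]:
    """Return the saved absolute path from xl/workbook.xml, if present."""
    with zipfile.ZipFile(xlsx_path) as package:
        root = ET.fromstring(package.read("xl/workbook.xml"))
    for element in root.iter():
        # Tags look like "{namespace}absPath"; compare the local name only
        # so the exact namespace URI does not matter (assumed element name).
        if element.tag.rsplit("}", 1)[-1] == "absPath":
            return element.get("url")
    return None
```

A return value of None means the workbook carries no such element, for example because it was already cleaned with the Document Inspector.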

Why replacing the value of xml node in custom xml part in word 2013 document makes it corrupt?

I created a Word 2013 document and did the following:
1. Added a Plain Text Content Control to it at design time.
2. Added a Custom XML Part at design time.
3. Mapped one node of the Custom XML Part to the Plain Text Content Control.
4. The value of the node appeared in the Content Control.
5. I saved and closed the document.
6. Renamed it to .zip and extracted it to a folder.
7. Edited the file customXml/item1.xml in that folder, which is my custom XML part, and changed the value of the node from <Name>John</Name> to <Name>Harry</Name>.
8. Re-archived it as a zip file and renamed it to .docx.
When I opened the document, it was corrupted, and Microsoft Word says:
We're sorry. We can't open XYZ.docx because we found a problem with the contents. Microsoft Office cannot open this file because some parts are missing or invalid.
Reason - You cannot unzip and re-zip your .docx file as you did in Step 8.
Guide - Try this. Create a dummy .docx file from Word. Extract it, re-zip it, and try to open the file. You will get the same error and will not be able to use it as you expect.
Solution - If you want to edit your .docx file, use the Open XML SDK. Here is a link to a good guide - http://msdn.microsoft.com/EN-US/library/office/cc850833(v=office.15).aspx . The Open XML productivity tool will also come in handy - http://dotnet.dzone.com/articles/using-openxml-sdk-productivity
I was zipping and unzipping incorrectly. We don't need any Open XML SDK.
What I was doing was: right-click on XYZ.docx.zip and select Extract to XYZ.docx. Once it was extracted into the folder XYZ.docx, I edited the contents inside and then zipped the entire folder to XYZ.docx, which is wrong because the archive then has the folder itself at its top level.
When I went inside the folder and zipped only its contents to XYZ.docx.zip, it started working.
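The difference between zipping the folder and zipping its contents comes down to the paths stored in the archive. As a minimal sketch (the function name zip_folder_contents is made up), the following stores every file under a path relative to the extracted folder, so [Content_Types].xml ends up at the archive root rather than under XYZ.docx/.

```python
import os
import zipfile


def zip_folder_contents(folder: str, dest_path: str) -> None:
    """Zip the files inside folder without the folder itself as a prefix."""
    with zipfile.ZipFile(dest_path, "w", zipfile.ZIP_DEFLATED) as package:
        for current_dir, _subdirs, files in os.walk(folder):
            for name in files:
                full_path = os.path.join(current_dir, name)
                # Paths are stored relative to folder, so the archive root
                # holds the package parts directly.
                arcname = os.path.relpath(full_path, folder)
                package.write(full_path, arcname)
```

The destination file should live outside the folder being zipped, otherwise the half-written archive would be picked up as one of its own entries.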

Convert .mht files to pdf files using java

Is it possible to convert MHTML (.mht) files to PDF with proper CSS rendering?
I tried using pd4ml, but the external CSS and links referred to in the .mht file fail to load in the generated PDF.
You could try unpacking the MHTML to HTML and separate files, then using your pd4ml method to generate the PDF.
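An .mht file is a MIME multipart message, so the unpacking step can be sketched with a standard MIME parser. The example below uses Python's email module rather than Java, purely as an illustration of the idea. The function name unpack_mhtml is hypothetical, and rewriting the links inside the HTML so that they point at the unpacked files is left out.

```python
import email
from typing import Dict


def unpack_mhtml(raw: bytes) -> Dict[str, bytes]:
    """Map each part's Content-Location to its decoded bytes."""
    message = email.message_from_bytes(raw)
    parts: Dict[str, bytes] = {}
    for index, part in enumerate(message.walk()):
        if part.is_multipart():
            # Containers hold no payload of their own.
            continue
        location = part.get("Content-Location") or "part-%d" % index
        # decode=True undoes base64 / quoted-printable transfer encoding.
        parts[location] = part.get_payload(decode=True)
    return parts
```

Each entry can then be written to disk as a separate HTML, CSS or image file before the HTML is handed to the PDF converter.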
Chilkat Java MHT is one solution you can look into, although after the 30-day trial you will need a license.