Unable to convert Word documents with EMF images to HTML using docx4j

I have tried converting docx (Word) documents to HTML using docx4j. It works fine if the images are PNG or JPG, but when the image type is EMF, the images are not rendered (or converted) in the HTML output. Any alternative solution would be much appreciated!
Thank you

Related

Extract Image from PDF correctly

I have a PDF file that contains an image, and the image is displayed correctly in the PDF. When I extract the image from the PDF file using the itextsharp or pdfsharp libs I get bytes, which I can decode successfully (there is a /Filter/FlateDecode entry). But when I try to convert these bytes to an image using different libs, an exception occurs (it looks like the bytes are not actually an image). As far as I understand, the problem is in how these bytes are processed, because the image in the PDF is not corrupted: it is shown there correctly. PDF is here.
The images are most likely stored in the PDF image format which is documented in the PDF specification.
It is rather simple to convert them to the Windows BMP format. You still have to convert them, though, and add headers built from the image attributes in the PDF file.
In PDF a new image line is byte-aligned, in Windows BMP it is DWORD-aligned.
Don't forget to extract the colour table if there is one.
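The alignment difference above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the samples are already Flate-decoded, the image is 8-bit DeviceRGB with no colour table, and the function names (`pdf_stride`, `bmp_stride`, `rgb8_to_bmp`) are made up for this example and do not belong to any PDF library. It also relies on two BMP conventions not mentioned above: rows are stored bottom-up and pixels in blue-green-red order.

```python
import struct

def pdf_stride(width, bits_per_pixel):
    # PDF image rows are padded to a whole byte.
    return (width * bits_per_pixel + 7) // 8

def bmp_stride(width, bits_per_pixel):
    # BMP rows are padded to a multiple of 4 bytes (one DWORD).
    return ((width * bits_per_pixel + 31) // 32) * 4

def rgb8_to_bmp(samples, width, height):
    """Wrap decoded 8-bit RGB samples (assumed layout) in a 24-bit BMP file."""
    src = pdf_stride(width, 24)
    dst = bmp_stride(width, 24)
    if len(samples) < src * height:
        raise ValueError("not enough sample data for the given dimensions")
    rows = []
    # BMP stores the bottom row first, so walk the source rows in reverse.
    for y in range(height - 1, -1, -1):
        row = samples[y * src:(y + 1) * src]
        bgr = bytearray(row)
        bgr[0::3] = row[2::3]  # blue goes first in BMP
        bgr[2::3] = row[0::3]  # red goes last
        rows.append(bytes(bgr) + b"\x00" * (dst - src))  # DWORD padding
    pixels = b"".join(rows)
    offset = 14 + 40  # file header + BITMAPINFOHEADER
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(pixels), 0, 0, offset)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                              0, len(pixels), 2835, 2835, 0, 0)
    return file_header + info_header + pixels
```

For an indexed image, the colour table from the PDF would additionally have to be written between the info header and the pixel data, which this sketch does not attempt.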

Ghostscript converted PDF not rendered correctly with pdfium

I have a problem with some PDF documents that I convert to PDF/A with Ghostscript. If the original document contains subsetted fonts, the converted document is not displayed correctly in Chrome (pdfium): the characters are displayed as squares.
In Adobe PDF Reader the output is displayed correctly. Maybe the attached files can help you.
original PDF
converted PDF/A

Extracting non jpeg images from pdfs with iTextSharp

I'm able to extract images from PDFs using the RenderListener class, but it seems to support only JPEG; any other type throws an error. Another approach I was considering is looping through the XObjects, but I can't get the actual content from them. Is there any workaround for extracting other image types from a PDF?

Reformat image format in PDF

I have issues rendering a couple of images and texts in a PDF in the Telerik PDF viewer. According to Telerik's documentation, it seems the formats of those texts/images are incompatible.
Is there a way to convert the existing images in a PDF and put them back into the PDF, so that the file becomes compatible with the Telerik PDF viewer?
Many thanks

Is there any easy way to convert images inside a pdf file?

I have a few books that I absolutely MUST be reading; they are a set of calculus textbooks as PDF files. The problem is that the graphs and images in these PDF files are all PNG, which is apparently not supported by my Kindle. Is there any way I can batch-convert these images to JPEG or any other format inside the PDF file? I have tried everything from converting the PDF to other formats (the equation formatting didn't survive) to extracting the images from the PDF file and converting them. I just really need to know if there is any program I can use to help me, or if there is a way I could 'open' the PDF container, swap the PNG images for JPEG images, and replace the png file extensions with jpg. Any help would be greatly appreciated.
The books are:
http://tutorial.math.lamar.edu/pdf/CalcI/CalcI_Complete.pdf
http://tutorial.math.lamar.edu/pdf/CalcII/CalcII_Complete.pdf