Is U3D the only supported 3D file format for PDF?

Can one only add U3D files to a PDF, or are there other formats? And is there a NuGet package or library that I can use to convert my glb or fbx files to U3D in .NET Standard (UWP)?

Can one only add U3D files to a PDF, or are there other formats?
No, U3D is not the only one: the PDF 2.0 ISO specification (ISO 32000-2) added the PRC format to the PDF standard.
https://3dpdfconsortium.org/prc/

Related

Extract xml from ZUGFeRD PDF with Ghostscript

We would like to automate the processing of ZUGFeRD invoices.
Is there a way to extract and save the XML files embedded in the PDF using Ghostscript?
As mentioned by KenS, Ghostscript can help assemble ZUGFeRD files but cannot extract their contents. The image below shows those contents in the source XML (lower part) and in a good(!?) PDF where the plain text is visible (the upper part of the image is the PDF viewed in WordPad), so it can easily be extracted as text. However, nothing about PDF extraction is reliable, since the structure of one PDF is rarely the same as the next unless you make it so.
Many PDF readers can export such attachments as the original file, and many PDF libraries allow the named file to be extracted from a script.
The samples above are from the open-source Java application at https://www.mustangproject.org/, which is currently very up to date.
For very simple cross-platform use there is pdfdetach, which can save a single attachment by name or all attachments at once.
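As a rough sketch of the pdfdetach route, the call can be wrapped in a small Python script. This assumes the pdfdetach binary is installed and on the PATH; the function names `pdfdetach_command` and `extract_attachments` and the example file names are made up for illustration.

```python
import subprocess
from pathlib import Path


def pdfdetach_command(pdf_path, out_dir):
    """Build the pdfdetach call that saves every attachment into out_dir."""
    # -saveall writes all embedded files; with -saveall, -o names the target directory.
    return ["pdfdetach", "-saveall", "-o", str(out_dir), str(pdf_path)]


def extract_attachments(pdf_path, out_dir):
    """Run pdfdetach and return the files found in out_dir afterwards.

    Requires the pdfdetach executable to be available on the PATH.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(pdfdetach_command(pdf_path, out), check=True)
    return sorted(out.iterdir())


if __name__ == "__main__":
    # Only print the command here; call extract_attachments() to actually run it.
    print(" ".join(pdfdetach_command("invoice.pdf", "attachments")))
```

Running `pdfdetach -list invoice.pdf` first shows which attachments are present, which helps when only the invoice XML is wanted.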

linux tools that deal with metadata

There is pdfinfo for PDFs, exifinfo for images, ffprobe for multimedia, and so on. Is there a collection or a standardized toolset for extracting filetype-dependent metadata (PDF, image, doc, odt, and so on) from all files on Linux?
Or are there at least distinct, format-specific tools for the most common file types, such as ppt, epub, and other types we commonly find in internet downloads?
ExifTool can extract metadata from far more than just images. See its Supported File Types list.
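A minimal sketch of scripting this, assuming the exiftool executable is installed and on the PATH. The helper names (`exiftool_command`, `parse_exiftool_json`, `read_metadata`) are hypothetical; the `-json` option makes ExifTool print one JSON object per input file, each with a `SourceFile` key.

```python
import json
import subprocess


def exiftool_command(paths):
    """Build an exiftool call that reports metadata as JSON."""
    return ["exiftool", "-json", *map(str, paths)]


def parse_exiftool_json(text):
    """Index ExifTool's JSON output (a list of objects) by source file name."""
    return {entry["SourceFile"]: entry for entry in json.loads(text)}


def read_metadata(paths):
    """Run exiftool on the given files (requires exiftool on the PATH)."""
    result = subprocess.run(
        exiftool_command(paths), capture_output=True, text=True, check=True
    )
    return parse_exiftool_json(result.stdout)
```

The same call works for a mixed list of files, for example `read_metadata(["report.pdf", "photo.jpg", "book.epub"])`; which tags come back depends on the file type.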

Conversion from PDF to TIFF file using XSLT

Is it possible to convert a PDF to a TIFF file using XSLT? Can someone point me to an article or code I can refer to for image conversion using XSLT?
Thanks!
No, it is not possible using just XSLT. XSLT is for transforming XML to other textual structures (usually XML, HTML, or plain text). Using XSL-FO, you can output a PDF from XML data - but that is a one way process as far as XSL-FO is concerned. Apache FOP does support outputting to TIFF instead of PDF, but again this is a one way process.
Assuming you could get a PDF -> XML conversion working (a quick google suggests such libraries exist, but it's unclear what they'd actually provide), it would be possible to use XSLT to transform that XML into something Apache FOP could render into a TIFF file, but at that point you'd really be better off investigating a direct PDF to TIFF conversion library (perhaps with an OCR library).
Possible? Maybe (but likely not). The real question is why do you even want to try to create a TIFF file from a PDF file using XSLT?
You do not need XSLT.
You want a raster image processor such as Ghostscript (there are many others). It can convert PDF (and PostScript) to image formats like TIFF.
http://ghostscript.com/doc/current/Devices.htm
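A hedged sketch of the Ghostscript route, driven from Python. It assumes the `gs` executable is on the PATH (on Windows the console binary is usually named `gswin64c`); the function names, the 300 dpi default, and the file names are illustrative choices, and `tiff24nc` is one of several TIFF devices listed on the Devices page.

```python
import subprocess


def gs_pdf_to_tiff_command(pdf_path, tiff_pattern, dpi=300, device="tiff24nc"):
    """Build a Ghostscript command that rasterizes a PDF to TIFF."""
    return [
        "gs",
        "-dNOPAUSE", "-dBATCH", "-dSAFER",  # run non-interactively, then exit
        f"-sDEVICE={device}",               # 24-bit RGB TIFF output device
        f"-r{dpi}",                         # rendering resolution in dpi
        f"-sOutputFile={tiff_pattern}",     # %03d yields one file per page
        str(pdf_path),
    ]


def pdf_to_tiff(pdf_path, tiff_pattern, dpi=300):
    """Run the conversion (requires Ghostscript to be installed)."""
    subprocess.run(gs_pdf_to_tiff_command(pdf_path, tiff_pattern, dpi), check=True)


if __name__ == "__main__":
    # Only print the command here; call pdf_to_tiff() to actually convert.
    print(" ".join(gs_pdf_to_tiff_command("input.pdf", "page-%03d.tif")))
```

For black-and-white scans, a bilevel device such as `tiffg4` gives much smaller files than `tiff24nc`.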
The only way to do that is to call a conversion service, e.g. aspose.com or to create another service externally to the DataPower box.
There might be some Node.js modules that could do it running in GatewayScript (GWS) (if you are on firmware 7+) but I believe they are all dependent on external binaries to function and that won't work in GWS.

Merge PDF Files from Fileupload

I want to merge four separate PDF uploads' contents and keep them in database as binary format.
There's no easy built-in way to merge PDF files. If you're using PHP, you can use an open source library to do so.
Link: PDF Merger for PHP
If you're using .NET, follow this tutorial: Simple .NET PDF Merger, which uses the iTextSharp library.

Why are ePub files so much smaller than mobi or PDF files for the same book

When I buy ebooks I download all of the available formats. I've noticed that the file sizes for the various formats can be markedly different and epub is typically much smaller.
For example:
PDF - 5.7 MB;
ePub - 2.7 MB;
Mobi - 8.1 MB.
Or:
PDF - 4.5 MB;
ePub - 1.8 MB;
Mobi - 5.3 MB.
I've flipped through them and tried to confirm that the contents are the same and they seem to be (i.e. no large images missing). Can anyone explain why epub is so much smaller than the other two?
The mobi versions can be larger because they include the legacy mobi format, the newer KF8 format, and a copy of the original epub. This assumes the mobi file was generated with the latest version of kindlegen.
For the PDFs I'm guessing (and that's all it is here) that embedded fonts may be the cause of the larger file size. Image optimisation also comes into play: the optimisation settings used when the PDF was created largely determine the final file size.
Epubs are basically just a bunch of HTML, CSS and image files, with a few XML files defining the book's metadata, chapter order and table of contents navigation. The epub file is really just a zip file with a .epub extension, and since it doesn't contain three copies of the same book like the Kindle version does, it will typically be much smaller.
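Because an epub is a zip archive, Python's standard `zipfile` module can show what is inside and how much each entry shrinks. The sketch below builds a tiny in-memory archive laid out like an epub purely for illustration (it is not a complete, valid book); the function names are made up, and a path to a real .epub file can be passed to `epub_size_report` instead.

```python
import io
import zipfile


def epub_size_report(epub_file):
    """Return (name, stored_bytes, original_bytes) for each entry in an epub."""
    with zipfile.ZipFile(epub_file) as archive:
        return [
            (info.filename, info.compress_size, info.file_size)
            for info in archive.infolist()
        ]


def build_demo_epub():
    """Create a tiny in-memory archive laid out like an epub (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry is stored uncompressed; the chapter is deflated.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        chapter = "<p>It was a dark and stormy night.</p>" * 500
        archive.writestr(
            "OEBPS/chapter1.xhtml", chapter, compress_type=zipfile.ZIP_DEFLATED
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, stored, original in epub_size_report(build_demo_epub()):
        print(f"{name}: {original} bytes -> {stored} bytes in the archive")
```

Text-heavy XHTML compresses very well, while images that are already compressed (JPEG, PNG) barely shrink, so the savings depend on how image-heavy the book is.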
Because epubs are similar to a website. An epub book is made from XHTML and CSS2, plus some CSS3 features; the reading software then interprets that code and renders a visual representation of it.
.epub files are compressed (in fact, they are just zip files).
.mobi files are not compressed. If you zip a mobi file, you may get a smaller file than the epub.
Incidentally, this makes text searching much faster on mobi files than on epub.
That depends on the format of the mobi that you have. As you must be already aware, an epub file can be converted into any ebook format that you choose - you can consider the epub format as the base for any other format.
I am guessing that the mobi file that you have has the original epub embedded inside it. This is to assist editing tools (as direct editing of mobi files is cumbersome). Also, some mobi files contain several versions of the book (mobi-7 and KF8) to maintain backward compatibility with readers that do not support the latest format.
You can find more information about the file formats here