TYPO3 9 LTS: PDF dimensions are not read and are displayed as 0x0

I am having an issue with PDFs in the latest TYPO3 release. If I add a PDF to the Image content element, I get this:
The file info looks like this:
The Image Processing Test in TYPO3 returns no errors. PDF/AI also seems to be fine.
I tested several other PDF and AI files as well; none of them show dimensions either.
I suspect that the 'identify' command does not work when called from within TYPO3, although it still returns perfect results from the shell.
Any idea where to look?

There are several possible reasons:
You just need to reimport the metadata (scheduler task).
Your PDF is encoded in an unusual format (PDF offers more than one way to include the title image).
Missing or wrong access rights:
Maybe a different program is executed from the command line than from PHP.
Maybe the file can't be accessed correctly by Ghostscript when it is started from the web server.
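To test the last two points, you could run the same small script once from your own shell and once as the web server user, then compare the output. This is only a rough sketch under some assumptions: it is written in Python rather than PHP, the file path is hypothetical, and it assumes the binaries are named `identify` and `gs` and are found through PATH (on Windows, Ghostscript usually has a different executable name).

```python
import os
import shutil
import subprocess


def identify_command(pdf_path, page=0):
    """Build an ImageMagick call that prints 'WIDTHxHEIGHT' for one PDF page."""
    return ["identify", "-format", "%wx%h", f"{pdf_path}[{page}]"]


def diagnose(pdf_path):
    """Collect what the current user sees: binaries on PATH, file access, identify result."""
    report = {
        "identify": shutil.which("identify"),
        "gs": shutil.which("gs"),
        "readable": os.access(pdf_path, os.R_OK),
        "dimensions": None,
        "error": None,
    }
    # Only call identify when it was found and the file can be read.
    if report["identify"] and report["readable"]:
        result = subprocess.run(
            identify_command(pdf_path), capture_output=True, text=True
        )
        if result.returncode == 0:
            report["dimensions"] = result.stdout.strip()
        else:
            report["error"] = result.stderr.strip()
    return report


if __name__ == "__main__":
    # Hypothetical path; point it at a PDF inside your fileadmin directory.
    for key, value in diagnose("fileadmin/example.pdf").items():
        print(f"{key}: {value}")
```

If the two runs report different binary paths, or the web user gets `readable: False` or an error from `identify`, that narrows down which of the reasons above applies.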

Ghostscript adds whitespace no matter what bounding box I use

I'm trying to convert a page of a PDF to an image. This works for most PDFs I've tried, but this one in particular always ends up with a lot of whitespace on one side or with strange scaling.
I've tried every combination of every fixed media, fixed resolution, fit page, use crop/bleed/trim/art box, etc. parameter to fix the issue but nothing does it. The best I get is the right content size but offset and chopped off.
Here's what it should look like, according to every PDF reader I've tried:
Here's a link to the PDF (8 MB) for testing.
https://drive.google.com/file/d/1ErS3KxADb1YAdzM7FG7T5dO8QnW4l1AQ/view?usp=sharing
Edit 1:
Here's what it looks like using just -dUseCropBox without a cropbox override:
I'm using Ghostscript.NET with very simple code. I create a rasterizer, call Open(PDF file, Ghostscript DLL in bytes), then GetPage(DPI, page number). To use other flags, I add a custom switch to the rasterizer before calling Open.
using (var rasterizer = new GhostscriptRasterizer())
{
    //rasterizer.CustomSwitches.Add("-dFIXEDMEDIA");
    //rasterizer.CustomSwitches.Add("-dFIXEDRESOLUTION");
    //rasterizer.CustomSwitches.Add("-dPSFitPage");
    //rasterizer.CustomSwitches.Add("-dFitPage");
    //rasterizer.CustomSwitches.Add("-dPDFFitPage");
    //rasterizer.CustomSwitches.Add("-dUseCropBox");
    //rasterizer.CustomSwitches.Add("-dPrinted");
    //rasterizer.CustomSwitches.Add("-dUseBleedBox");
    //rasterizer.CustomSwitches.Add("-dUseTrimBox");
    //rasterizer.CustomSwitches.Add("-dUseArtBox");
    //rasterizer.CustomSwitches.Add("-sPAPERSIZE=letter");
    //rasterizer.CustomSwitches.Add("-dORIENT1=true");
    //etc
    rasterizer.Open(pdfFilePath, ghostscriptDLL);
    img = rasterizer.GetPage(dpi, pageNumber);
    img.Save(pageFilePath, imageFormat);
}
I'll try again with the latest version of just ghostscript (no .NET) and see if that makes a difference.
Edit 2:
Using just gswin64c version 9.55.0 and -dUseCropBox works as KenS said. Since I don't need Ghostscript.NET to do that, that's a good resolution.
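For reference, a call to the plain Ghostscript executable along those lines could be scripted roughly as follows. This is a sketch, not the exact command used above: the file names, the page number, the resolution and the `png16m` output device are assumptions chosen for illustration.

```python
import subprocess


def render_page_command(gs_exe, pdf_path, page, dpi, out_path):
    """Build a Ghostscript call that renders one page, using its CropBox, as a PNG."""
    return [
        gs_exe,
        "-dBATCH",
        "-dNOPAUSE",
        "-dQUIET",
        "-sDEVICE=png16m",          # 24-bit RGB PNG output
        f"-r{dpi}",                 # resolution in dots per inch
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
        "-dUseCropBox",             # size the page from the CropBox, not the MediaBox
        f"-sOutputFile={out_path}",
        pdf_path,
    ]


def render_page(gs_exe, pdf_path, page, dpi, out_path):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    subprocess.run(
        render_page_command(gs_exe, pdf_path, page, dpi, out_path), check=True
    )


if __name__ == "__main__":
    # Hypothetical file names; gswin64c must be on PATH or given as a full path.
    print(" ".join(render_page_command("gswin64c", "input.pdf", 1, 150, "page1.png")))
```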

Ghostscript - create a pdf with multiple identical pages and keep size down

I'm trying to use Ghostscript to create a PDF with multiple identical pages. I will later use this together with another multi-page PDF to stamp unique information onto every page.
Is it possible to use Ghostscript to create such a PDF and keep the size of the final file down? Maybe there is a flag that I have not noticed that can do this in a better way than the script below?
I have tried a regular merge command like the one below, but the size of the resulting PDF grows a lot: the original file size is 2,061MB, and merging it into a 100-page PDF results in a final size of 46,117MB.
"C:\Program Files\gs\gs9.20\bin\gswin64.exe"^
-dBATCH^
-dNOPAUSE^
-q^
-sDEVICE=pdfwrite^
-sOutputFile=outputpdf.pdf^
"inputpdf.pdf"^
"inputpdf.pdf"^
"inputpdf.pdf"(and so on 100 times)
You can construct such a file manually easily enough, which is much smaller, by reusing the page content stream for each page.
However, Ghostscript's pdfwrite device won't do that, not least because it can't. It cannot know in advance that the page it's about to receive is the same as the previous page. As a result, it will create a new page content stream for each page and create new content for it.
Note that resources (forms, patterns, colour spaces, image XObjects etc) which are used on each page will be reused on other pages.
However, it seems to me that you're already getting nearly a 5:1 ratio (2 KB × 100 pages = 200 KB, and the final file is 46 KB), though in fairness a good bit of that 2 KB is 'stuff' around the page.
Without seeing your input file I can't really comment any further, but frankly I doubt it's possible to make it any smaller without hand-crafting the file. What's the problem with a 46 KB file anyway?
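To illustrate what "reusing the page content stream for each page" means, here is a minimal sketch that writes such a file from scratch. It does not read or reuse an existing PDF; the page size (A4), the Helvetica font and the sample text are assumptions for the demonstration. Every page object points at the same content stream (object 3), so the content is stored only once.

```python
def build_repeated_pdf(content: bytes, pages: int) -> bytes:
    """Return a PDF whose pages all reference the same content stream object."""
    # Object numbering: 1 = catalog, 2 = page tree, 3 = shared content stream,
    # 4 = font, 5 and up = one small page object per page.
    kids = " ".join(f"{5 + i} 0 R" for i in range(pages))
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        f"<< /Type /Pages /Count {pages} /Kids [{kids}] >>".encode(),
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]
    page = (
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 3 0 R >>"
    )
    objects.extend([page] * pages)

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # each xref entry is exactly 20 bytes
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    sample = b"BT /F1 24 Tf 72 770 Td (Template page) Tj ET"
    print(len(build_repeated_pdf(sample, 1)), len(build_repeated_pdf(sample, 100)))
```

Each extra page adds only a small page object and one cross-reference entry, regardless of how large the shared content stream is. Doing the same with an existing input file would mean editing its page tree by hand or with a PDF library, which is outside what this sketch covers.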

Can't get PDFBox CreatePDFA example to work - Color profile not found

I'm trying to get the example for creating a PDF/A document with Apache PDFBox up and running (CreatePDFA.java).
For this I copied the example class as is into a project module that includes a Maven dependency on PDFBox version 2.0.0-RC3. I only changed the method signature and used a fixed font, filename and message instead of args[].
When trying to run the code I get an NPE in line 107 because it can't load the color profile (the InputStream is null). When I check the included library in the project details I can see the resources folder, but it does not contain the expected file, namely "pdfa/sRGB Color Space Profile.icm".
Unfortunately, googling the problem only turned up more references to the same example implementation, but after a while I actually found what seems to be the needed file on apache.googlesource.com.
I copied the file to our own resource directory and then used this line of code instead:
InputStream colorProfile = CreatePdfA.class.getResourceAsStream("/pdfa/sRGB Color Space Profile.icm");
This finally stopped the NPE - the file is apparently found - but now I get another exception which says:
java.lang.IllegalArgumentException: Invalid ICC Profile Data
Here, I'm stuck. I had hoped that this would work just out of the box, but it seems like I am missing something. Any ideas?
You already answered one part of the problem yourself: put the file into your resource directory.
The second problem may be a bad repository mirror or a transfer problem (binary to ascii). Here's the official repository URL with the ICC profile from the example:
https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/resources/org/apache/pdfbox/resources/pdfa/
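A quick way to see whether the copied file is really a binary ICC profile is to inspect its header. The sketch below (Python, run outside the Java project) relies on two facts from the ICC format: the first four bytes hold the profile size as a big-endian integer, and bytes 36 to 39 hold the signature `acsp`. A file that was altered during download or transfer would likely fail one of these checks. The path is passed on the command line and is up to you.

```python
import struct
import sys


def check_icc_profile(data: bytes):
    """Return a list of problems found in the ICC header; an empty list means it looks sane."""
    if len(data) < 128:
        return ["shorter than the 128-byte ICC header"]
    problems = []
    # Bytes 0-3: total profile size, big-endian unsigned 32-bit integer.
    declared_size = struct.unpack(">I", data[0:4])[0]
    if declared_size != len(data):
        problems.append(
            f"header declares {declared_size} bytes, file has {len(data)}"
        )
    # Bytes 36-39: fixed profile file signature.
    if data[36:40] != b"acsp":
        problems.append("missing 'acsp' signature at offset 36")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as handle:
        found = check_icc_profile(handle.read())
    print("looks like a valid ICC header" if not found else "\n".join(found))
```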

Phantomjs is duplicating content when exporting to PDF

I've encountered a weird issue with Phantomjs when converting an html file to pdf. My html, resulting pdf, and rasterize.js files are below:
http://401web.com/_pub/2TRTI8E.html
http://401web.com/_pub/2TRTI8E.pdf
http://401web.com/_pub/rasterize.js
You will notice that in the PDF file, at page 6, the content gets cut off and then on page 7, the content is repeated and is then correct all the way to the end of the document.
The html file contains a series of <img> tags with their src attributes set as data:image/png;base64...
The application call to the phantom library is as follows:
phantomJS.Run(@"C:\path\to\directory\rasterize.js",
    new[] { webpath, outFilePdf, "A4", "1", "portrait" }, null, null);
Note that sometimes the rendered pdf file will exhibit the break/repeat behavior at a different location within the document (e.g. page 7 instead of 6), but the same issue always occurs.
Also, I am using phantomjs throughout my application (with the same rasterize.js script) with no other issues. This only happens on this export and only if there are a number of images.
My theory is that there is something going on with the image.onload event, specifically with base64 data but I have no idea how to troubleshoot this.
This is all within a .Net MVC application. I am using the PhantomJS nuget package found here: https://www.nuget.org/packages/PhantomJS/
Help is greatly appreciated.
Update: when running phantomjs locally via command line I was receiving the error below:
[CRITICAL] QNetworkReplyImpl: backend error: caching was enabled after some bytes had been written
libpng error: Read Error
I solved this (though I have no idea how/why) by replacing the cdn references in the html file to font-awesome.css and weather-icons.css files with locally hosted versions. After that, no more error and no more duplicate content.
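If you want to find such references before swapping them for local copies, a small script can list every stylesheet link that points to another host. This is only a sketch in Python using the standard library's HTML parser. It does not explain why the CDN files triggered the error, and the sample URLs in the usage lines are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalStylesheetFinder(HTMLParser):
    """Collect href values of <link rel="stylesheet"> tags that point to another host."""

    def __init__(self):
        super().__init__()
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href") or ""
        # A network location (host) in the URL means the file is not served locally.
        if "stylesheet" in rel and urlparse(href).netloc:
            self.external.append(href)


def find_external_stylesheets(html: str):
    """Return the hrefs of all stylesheets loaded from another host."""
    finder = ExternalStylesheetFinder()
    finder.feed(html)
    return finder.external


if __name__ == "__main__":
    sample = (
        '<link rel="stylesheet" href="https://cdn.example.com/font-awesome.css">'
        '<link rel="stylesheet" href="css/weather-icons.css">'
    )
    print(find_external_stylesheets(sample))
```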

video.js.map throwing a 404 (Not Found)

Playing around with the newest video.js today, I'm noticing that video.js.map is showing up as a 404 when putting the video.js script into a site that I'm working on.
I don't see a source map file in the initial distribution, but it doesn't throw this error locally, only when I put it on a server.
Any ideas on how to solve this?
You have a few options when you don't have access to a source map:
Ignore the message. It generally only gets thrown when your dev tools are open.
Remove the reference in the original file. These are the last characters (comments) at the end of the file.
Generate a source map yourself when you have access to the source code. For video.js, it can be generated from video.dev.js.
Use a public CDN version which might not link to the source map.
There also is a discussion on GitHub about this topic.
I get the same error; everything should still work, though. I think it's an HTML5 or browser bug.
I was seeing this as well, but only in my log files. In my RoR site, three multi-line entries (failures) were written to production.log every time a video was played, which was really bulking up the log file. Here is more info on #smhg's 2nd bullet (remove the references). I'm using video.js 5.4.6 along with some vpaid-vast plugin stuff, and I could see all three files referenced in my log file. Your mileage may vary.
Edit video.js and remove the following entry on line 19694:
//# sourceMappingURL=video.js.map
(for vpaid-vast plugin only...)
Edit videojs_5.vast.vpaid.min.js and remove this line from the very end:
//# sourceMappingURL=videojs_5.vast.vpaid.min.js.map
Edit videojs.vast.vpaid.min.css and remove this line from the very end:
/*# sourceMappingURL=videojs.vast.vpaid.min.css.map */
The entries are no longer appearing in my log file and the player works fine.
Hope it helps!
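If you would rather not edit the files by hand (for example after every update), the same removal could be scripted. Below is a minimal sketch in Python that strips both comment forms shown above from a file's text. The regular expression is my own assumption about how these lines look (the comment sits on a line by itself) and is not something video.js provides.

```python
import re

# Matches a whole line holding "//# sourceMappingURL=..." (JavaScript) or
# "/*# sourceMappingURL=... */" (CSS). The older "@" spelling is accepted too.
SOURCE_MAP_COMMENT = re.compile(
    r"^[ \t]*(?:"
    r"//[#@][ \t]*sourceMappingURL=\S+"
    r"|/\*[#@][ \t]*sourceMappingURL=\S+[ \t]*\*/"
    r")[ \t]*\r?\n?",
    re.MULTILINE,
)


def strip_source_map_comments(text: str) -> str:
    """Return the text with all source map reference lines removed."""
    return SOURCE_MAP_COMMENT.sub("", text)


if __name__ == "__main__":
    js = "var videojs = {};\n//# sourceMappingURL=video.js.map\n"
    css = ".vjs{}\n/*# sourceMappingURL=videojs.vast.vpaid.min.css.map */\n"
    print(repr(strip_source_map_comments(js)))
    print(repr(strip_source_map_comments(css)))
```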