Efficient thumbnail generation of huge PDF files?

In a system I'm working on, we're generating thumbnails as part of the workflow.
Sometimes the PDF files are quite large (print size 3 m²) and can contain huge bitmap images.
Are there thumbnail-generation programs that are optimized for memory footprint when handling such large PDF files?
The resulting thumbnail can be PNG or JPG.

ImageMagick is what I use for all my CLI graphics, so maybe it can work for you:
convert foo.pdf foo-%d.png
This produces one PNG per page; for a three-page PDF:
foo-0.png
foo-1.png
foo-2.png
To create only one thumbnail, treat the PDF as if it were an array ([0] is the first page, [1] is the second, etc.):
convert foo.pdf[0] foo-thumb.png
Since you're worried about memory, you can restrict memory usage with the -cache option:
-cache threshold: megabytes of memory available to the pixel cache. Image pixels are stored in memory until threshold megabytes of memory have been consumed. Subsequent pixel operations are cached on disk. Operations in memory are significantly faster, but if your computer does not have a sufficient amount of free memory you may want to adjust this threshold value.
So to thumbnail a PDF file and resize it, you could run this command, which should have a maximum memory usage of around 20 MB:
convert -cache 20 foo.pdf[0] -resize 10%x10% foo-thumb.png
Or you could use -density to control the resolution the PDF is rendered at (it must come before the input file; the default is 72 DPI, so a lower value such as 36 produces a much smaller image):
convert -cache 20 -density 36 foo.pdf[0] foo-thumb.png
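Note that -cache comes from older ImageMagick releases; on current versions the same idea is expressed with resource limits. Assuming a reasonably recent ImageMagick, a roughly equivalent invocation would be:
convert -limit memory 20MiB -limit map 40MiB -density 36 foo.pdf[0] -resize 10% foo-thumb.png
Once the memory and map limits are exhausted, ImageMagick spills the pixel cache to disk, so the footprint stays bounded at the cost of speed.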

Should you care? Current affordable servers have 512 GB of RAM, which is enough to hold a full-colour uncompressed bitmap roughly 8.7 m on a side at 1200 dpi, far more than a 3 m² print needs. The performance hit you take from using disk is large.
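As a rough sanity check on that figure, assuming 3 bytes per pixel and no compression:
512 GB / (1200 dpi)^2 / 3 bytes per pixel ≈ 118,000 square inches, i.e. a square roughly 344 inches (about 8.7 m) on a side.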

Related

Optimizing 3D models and reusing a single texture

If I use one texture for many objects, where each object picks the colour (RGB) it needs from the 256 available in a 1x256 file, does that count as a resource saving, or does it just create a queue of texture loads?
And secondly, I have an OBJ file with a slightly unoptimized triangle count (for example 3420) that weighs 5 KB; after rebuilding the model in Blender, the triangle count drops (to 2440), but the file size grows to roughly 8-9 KB. In this case, which matters more for optimization (for the GPU, RAM, or CPU): the number of triangles or the file size?
And sorry for the dumb questions :)
Without Blender remeshing:
3420 triangles, 5 KB file size
With Blender remeshing:
2440 triangles, 8-9 KB file size
Environment:
Godot on Vulkan API

What is the maximum extent of compressing a PDF file?

Whenever I try to compress a PDF file to the lowest possible size, using Ghostscript, pdftk, or pdfopt, I end up with a file around half the size of the original. But lately I've been getting files in the 1000 MB range that compress to, say, a few hundred MB. Can we reduce them further?
The PDFs are made from high-resolution JPEG images; can't we reduce the size of those images and squeeze out some more reduction?
As far as I know, without degrading the JPEG streams and losing quality, you can try the special feature offered by
Multivalent
https://rg.to/file/c6bd7f31bf8885bcaa69b50ffab7e355/Multivalent20060102.jar.html
java -cp path/to.../multivalent.jar tool.pdf.Compress -compact file.pdf
The resulting output will be compressed in a special way; the resulting file needs the Multivalent browser to be read again.
It is unpredictable how much space you can save (often you cannot save any further space).
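If you are willing to trade some image quality, Ghostscript can go further by downsampling the embedded JPEGs while rewriting the file; this is a generic pdfwrite recipe, and the 150 DPI target below is only an example:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dDownsampleColorImages=true -dColorImageResolution=150 -dNOPAUSE -dBATCH -dQUIET -sOutputFile=smaller.pdf input.pdf
The -dPDFSETTINGS=/screen and /ebook presets bundle similar downsampling defaults if you prefer a single flag.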

How to increase the file size of a JPG?

My JPG file is 5 KB in size. How can I make it a 15 KB file?
Please go through these links:
http://www.mkyong.com/java/how-to-resize-an-image-in-java/
http://www.java2s.com/Code/Java/2D-Graphics-GUI/Imagesize.htm
JPEG is a compressed format, which is why it's only 5 KB. Save it as a 32-bit or higher bitmap to get the maximum possible size for that image. Your question also has a lot of scope for improvement: give context, reasons, and the methods you have already tried.
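To illustrate the bitmap suggestion concretely (assuming ImageMagick is available, as in the thumbnail answer above), re-encoding the JPEG as an uncompressed BMP will usually push a 5 KB photo well past 15 KB:
convert photo.jpg photo.bmp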

Streaming Jpeg Resizer

Does anyone know of any code that does streaming JPEG resizing? What I mean by this is reading a chunk of an image at a time (depending on the original source and destination size, this would obviously vary), and resizing it, allowing for lower memory consumption when resizing very large JPEGs. Obviously this wouldn't work for progressive JPEGs (or at least it would become much more complicated), but it should be possible for standard (baseline) JPEGs.
The design of JPEG data allows simple resizing to 1/2, 1/4 or 1/8 size. Other variations are possible. These same size reductions are easy to do on progressive jpegs as well and the quantity of data to parse in a progressive file will be much less if you want a reduced size image. Beyond that, your question is not specific enough to know what you really want to do.
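For example, assuming libjpeg's djpeg utility is installed, it exposes exactly this kind of scaled decoding, so the full-resolution image never has to be held in memory:
djpeg -scale 1/8 -outfile small.ppm huge.jpg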
Another simple trick to reduce the data size by 33% is to render the image into a RGB565 bitmap instead of RGB24 (if you don't need the full color space).
I don't know of a library that can do this off the shelf, but it's certainly possible.
Let's say your JPEG is using 8x8 pixel MCUs (the units in which pixels are grouped). Let's also say you are reducing by a factor of 12 to 1. The first output pixel needs to be the average of the 12x12 block of pixels at the top left of the input image. To get to the input pixels with a y coordinate greater than 8, you need to have decoded the start of the second row of MCUs. You can't really get to decode those pixels before decoding the whole of the first row of MCUs. In practice, that probably means you'll need to store two rows of decoded MCUs. Still, for a 12000x12000 pixel image (roughly 150 megapixels) you'd reduce the memory requirements by a factor of 12000/16 = 750. That should be enough for a PC. If you're looking at embedded use, you could horizontally resize the rows of MCUs as you read them, reducing the memory requirements by another factor of 12, at the cost of a little more code complexity.
I'd find a simple JPEG decoder library like Tiny Jpeg Decoder and look at the main loop in the JPEG decode function. In the case of Tiny Jpeg Decoder, the main loop calls decode_MCU; modify from there. :-)
You've got a bunch of fiddly work to do to make the code work for non-8x8 MCUs, and a load more if you want to reduce by a non-integer factor. Sounds like fun though. Good luck.

Create discrepancy between size on disk and actual size in NTFS

I keep finding files which show a size of 10 KB but a size on disk of 10 GB. Trying to figure out how this is done; anyone have any ideas?
You can make sparse files on NTFS, as well as on any real filesystem. :-)
Seek to (10 GB - 10 kB), write 10 kB of data. There, you have a so-called 10 GB file, which in reality is only 10 kB big. :-)
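On NTFS you can get the same effect without writing any data yourself by using fsutil (the file name and the 10 GiB size below are just placeholders):
fsutil file createnew sparse.bin 10737418240
fsutil sparse setflag sparse.bin
fsutil sparse setrange sparse.bin 0 10737418240
The setrange call marks the whole range as sparse, so the logical size stays at 10 GiB while almost nothing is allocated on disk.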
You can create streams in NTFS files. It's like a separate file, but with the same filename. See here: Alternate Data Streams
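As a minimal sketch from a Windows command prompt (the file and stream names are just examples), you can push a large payload into an alternate stream of a small file:
echo tiny > small.txt
type hugefile.bin > small.txt:payload
Explorer's Size field reflects only the main stream, while the allocated clusters also cover the hidden stream, which can produce exactly the small-size versus large-size-on-disk mismatch described in the question.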
I'm not sure about your case (or it might be a mistake in your question), but when you create an NTFS sparse file it will show different sizes for these fields.
When I create a 10 MB sparse file and fill it with 1 MB of data, Windows Explorer will show:
Size: 10 MB
Size on disk: 1 MB
But in your case it's the opposite (or a mistake).