Compressing JPG page to PDF with various compressions/settings [closed]

I would like to take a single-page JPG and experiment with various PDF compression and other settings (to analyse the resultant PDF size and quality). Can anyone point me towards decent tools to do this, and any useful docs/guides?

Adobe Acrobat Distiller, Ghostscript, possibly others. Acrobat has its own manual; Ghostscript documentation can be found at:
http://svn.ghostscript.com/ghostscript/tags/ghostscript-9.02/doc/Ps2pdf.htm
which covers PostScript to PDF conversion for the current version (print your JPEG to a PostScript file before starting).
If your original is a JPEG, then the best quality output will be to do nothing at all to it, simply wrap it up in PDF syntax.
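For that "just wrap it" route, one command-line option (img2pdf, which is not mentioned in this answer; file names are placeholders) embeds the JPEG stream losslessly:
img2pdf page.jpg -o page.pdf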
If you insist on downsampling the image data (which is the only way you are going to reduce the size of a JPEG image) then you would be advised not to use DCT (JPEG) compression in the PDF file, as this will introduce objectionable artefacts.
You should use a lossless compression scheme instead, e.g. Flate.
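If you do go the downsampling route, here is a hedged sketch using Ghostscript's ps2pdf wrapper with distiller parameters from the documentation linked above (file names are placeholders; the JPEG is first printed to PostScript, here via ImageMagick):
convert page.jpg page.ps
ps2pdf -dDownsampleColorImages=true \
       -dColorImageResolution=150 \
       -dAutoFilterColorImages=false \
       -dColorImageFilter=/FlateEncode \
       page.ps page.pdf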
Your best bet would be to go back to an original which has not been stored as a JPEG, downsample in a decent image application and then convert to JPEG and wrap that up in a PDF.

The Docotic.Pdf library can be used to add JPEGs (with or without recompression) to a PDF.
Please take a look at the sample that shows how to recompress images before adding them to a PDF. With the help of the library you can recompress existing images too.
Disclaimer: I work for Bit Miracle, vendor of the library.

If you're OK working with .NET on Windows, my company, Atalasoft, has tools for creating image PDFs. You can tweak the compression very easily using code like this:
public void WriteJpegPdf(AtalaImage image, Stream outStream)
{
    PdfEncoder encoder = new PdfEncoder();
    encoder.JpegQuality = 60; // 0 - 100
    encoder.Save(outStream, image, PdfCompressionType.Jpeg);
}
This is the simplest way of setting the JPEG quality. The encoder will override your setting if the image isn't 24-bit RGB or 8-bit gray.
If you are encoding a batch of files but want fine-grained control over compression, the encoder has an event, SetEncoderCompression, that is invoked before each image is encoded; it lets you see what the encoder chose, and you can override it if you like.
FWIW, I wrote most of the PDF Encoder and the underlying layer that exports the actual PDF.

Related

How would you process an image puzzle webapp on either the server side or the client side? [closed]

I am building a web app game which simply takes an image file as input from the filesystem, divides the image into blocks, then shuffles the blocks randomly so that the client-side user can rearrange them to restore the original image.
The platform I will be using is Elixir/Phoenix on the backend and Vue.js on the front end. When I tried to manipulate images with Erlang, I had some difficulty finding libraries for pixel manipulation. But then I thought: if every user's puzzle image has to be created on the server, it will cost a lot of resources anyway.
Do you think it's better to send the image from the server's filesystem to the client and create the image blocks on the client side with JavaScript, or would you do it with Erlang/Elixir on the server side?
What would be your preferred method, and what are the pros and cons?
This is how I am doing it.
# Args explanation:
# -resize 72x72!  - resizes the passed image; `!` means ignore the aspect ratio
# -crop 24x24     - crops the image into 24x24 tiles
# -scene 1        - start naming the cropped images at index 1
# %02d            - produces zero-padded names such as 01, 02, 03
# image_size      - actual image size
# seg_size        - size of the small segments/cropped images
list = [
  "#{img_path}",
  "-resize",
  "#{image_size}x#{image_size}!",
  "-crop",
  "#{seg_size}x#{seg_size}",
  "-scene",
  "1",
  "#{new_tmp}/%02d.png"
]
System.cmd("convert", list)
I would go with an ImageMagick command-line wrapper.
As per the cropping documentation, something like the following would do:
convert my_image.png -repage 60x40-5-3 \
        -crop 30x20 +repage my_image_tiles_%d.png
Although there is an Elixir wrapper for IM, called Mogrify, it implements a limited set of operations, so I usually go with System.cmd/3.
For each input file you might create tiles, and the next time the file is accessed you might start with a check: if the file was already cropped, just serve the tiles, otherwise crop it before serving them. This would basically be a clean server-side one-liner that, unlike client-side solutions, is not prone to reverse engineering.
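A minimal shell sketch of that check-then-crop idea, reusing the convert arguments shown above (the paths, sizes and the $img_id variable are placeholders):
if [ ! -d "tiles/$img_id" ]; then
  mkdir -p "tiles/$img_id"
  convert "uploads/$img_id.png" -resize 72x72! -crop 24x24 \
          -scene 1 "tiles/$img_id/%02d.png"
fi
# otherwise the tiles already exist and are served as-is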

How to convert scanned document images to a PDF document with high compression?

I need to convert scanned document images to a PDF document with high compression. The compression ratio is very important. Can someone recommend a solution in C# for this task?
Best regards, Alexander
There is a free program called PDFBeads that can do it. It requires Ruby, ImageMagick and optionally jbig2enc.
The PDF format itself will probably add next to no overhead in your case. I mean your images will account for most of the output file size.
So, you should compress your images with the highest possible compression. For black-and-white images you might get the smallest output using the FAX4 or JBIG2 compression schemes (both supported in PDF files).
For other images (grayscale, color), either use the smallest possible size, lowest resolution and quality, or convert the images to black-and-white and use the FAX4/JBIG2 compression scheme.
Please note that you will most probably lose some detail of any image when converting it to black-and-white.
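As a command-line illustration of that approach (not C#): binarize each scan with ImageMagick and feed the bilevel pages to PDFBeads from the earlier answer. The exact pdfbeads invocation may differ between versions; file names are placeholders.
convert scan-001.png -colorspace Gray -threshold 50% -compress Group4 scan-001.tif
pdfbeads *.tif > book.pdf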
If you are looking for a library that can help you with recompression, then have a look at the Docotic.Pdf library (Disclaimer: I am one of the developers of the library).
The Optimize images sample code shows how to recompress images before adding them to a PDF. The sample shows how to recompress with JPEG, but for FAX4 the code will be almost the same.

Open JPEG with Photoshop and just save it, do I lose quality? [closed]

I apologize for what seems a stupid question, but I need to know...
If I open an image with Photoshop, let's say a JPEG, and save the image without actually doing anything (open the image, click save and that's all), and I do this over and over again, do I lose quality?
I believe it loses quality when I make changes and save it, but what about opening the image and just saving it multiple times without applying any changes?
Usually yes, you do. That is because the quantization in JPEG isn't lossless and will reduce quality up to a fixed point, after which no further degradation will occur (depending on the quality you saved at, the result will be more or less visible).
However, there is a special case with Photoshop where quality degrades much faster because they tweaked the algorithm. Neal Krawetz of Hacker Factor has an article about exactly that:
I resaved the image repeatedly at 99% quality. (Load, save at 99%, reload, resave at 99%, repeat.) At 99% quality, the changes stop after 11 resaves. (Since Q99 takes very tiny steps, it hits a local minima quickly.) Resaved files #11 through #500 all have the exact same sha1 checksum. At 75% quality, it stops after 54 resaves (saves #54 through #500 are identical).
[...]
I repeated the experiment manually, using Photoshop. I lost count around 12 (doing it manually and the phone rang) [...] With fewer than two dozen resaves, you can already see parts of the walls getting brighter and darker -- much more than the JPEG algorithm can account for.
[...]
Yes, repeatedly saving a JPEG makes the image worse. But repeatedly saving it with Photoshop makes it much worse.
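A hedged way to reproduce the repeat-save experiment outside Photoshop, using ImageMagick (the quality value and file names are arbitrary; after enough iterations the checksum should stop changing):
cp original.jpg step.jpg
for i in $(seq 1 100); do
  convert step.jpg -quality 75 step.jpg
  sha1sum step.jpg
done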
You will lose quality as Photoshop has a default JPEG save quality that is <100%.
Whenever you save an image as JPEG, it will process and compress the image, losing quality as JPEG is a lossy format.
It would execute the same saving code on the same pixels as before, so, no.

Are all PDF files compressed?

So there are some threads here on PDF compression saying that there is some, but not a lot of, gain in compressing PDFs as PDFs are already compressed.
My question is: Is this true for all PDFs including older version of the format?
Also, I'm sure it's possible for someone (an idiot maybe) to place bitmaps into the PDF rather than JPEG etc. Our company has a lot of PDFs in its DBs (some in older formats maybe). We are considering using gzip to compress during transmission but don't know if it's worth the hassle.
PDFs in general use internal compression for the objects they contain. But this compression is by no means compulsory according to the file format specifications. All (or some) objects may appear completely uncompressed, and they would still make a valid PDF.
There are commandline tools out there which are able to decompress most (if not all) of the internal object streams (even of the most modern versions of PDFs) -- and the new, uncompressed version of the file will render exactly the same on screen or on paper (if printed).
So to answer your question: No, you cannot assume that a gzip compression is adding only hassle and no benefit. You have to test it with a representative sample set of your files. Just gzip them and take note of the time used and of the space saved.
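A hedged sketch of such a test, plus one way to see how much internal compression a PDF already uses (qpdf is one command-line tool that can rewrite a file with its streams uncompressed; it is not named in this answer, and file names are placeholders):
qpdf --stream-data=uncompress sample.pdf sample-uncompressed.pdf
gzip -9 -k sample.pdf
ls -l sample.pdf sample.pdf.gz sample-uncompressed.pdf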
It also depends on the type of PDF producing software which was used...
Instead of applying gzip compression, you would get much better gain by using PDF utilities to apply compression to the contents within the format as well as remove things like unneeded embedded fonts. Such utilities can downsample images and apply the proper image compression, which would be far more effective than gzip. JBIG2 can be applied to bilevel images and is remarkably effective, and JPEG can be applied to natural images with the quality level selected to suit your needs.
In Acrobat Pro, you can use Advanced -> PDF Optimizer to see where space is used and selectively attack those consumers. There is also a generic Document -> Reduce File Size to automatically apply these reductions.
Update:
Ika's answer has a link to a PDF optimization utility that can be used from Java. You can look at their sample Java code there. That code lists exactly the things I mentioned:
Remove duplicated fonts, images, ICC profiles, and any other data stream.
Optionally convert high-quality or print-ready PDF files to small, efficient and web-ready PDF.
Optionally down-sample large images to a given resolution.
Optionally compress or recompress PDF images using JBIG2 and JPEG2000 compression formats.
Compress uncompressed streams and remove unused PDF objects.

Latex using eps images builds slowly [closed]

Working with a latex document with eps images as in the example below...
\documentclass[11pt]{paper}
\usepackage[dvips]{graphicx}
\usepackage{fullpage}
\usepackage{hyperref}
\usepackage{amsmath}
\DeclareMathSizes{8}{8}{8}{8}
\author{Matt Miller}
\title{my paper}
\begin{document}
\begin{figure}[!ht]
\begin{center}
\includegraphics[width=2in] {Figuer.eps}
\end{center}
\caption{Figure\label{fig:myFig}}
\end{figure}
\end{document}
When I go to build my latex document, the time it takes to build increases over time. Are there any tips or tricks to help speed up this process?
latex paper.tex; dvipdf paper.dvi
Some additional ideas:
try making a simpler figure (e.g. if it's a bitmapped figure, make a lower-resolution one or one with a low-resolution preview)
use pdflatex and have the figure be a .jpg, .png, or .pdf source.
I generally take the latter approach (pdflatex).
How big are the eps files? Latex only needs to know the size of the bounding box, which is at the beginning of the file.
dvips (not dvipdf) shouldn't take too much time since it just needs to embed the eps into the postscript file.
dvipdf, on the other hand has to convert the eps into pdf, which is expensive.
Indeed, you can directly use
pdflatex paper.tex
A few changes are required.
Convert your graphics from EPS to PDF before running pdflatex. You need to do it only once:
epstopdf Figuer.eps
It will produce Figuer.pdf, which is suitable for pdflatex. In your example, dvipdf does the conversion on every build.
In the document use
\usepackage[pdftex]{graphicx} % not [dvips]
And to include graphics, omit the extension:
\includegraphics[width=2in] {Figuer} % but {Figuer.pdf} works too
It will choose Figuer.pdf when compiled by pdflatex and Figuer.eps when compiled by latex. So the document remains compatible with legacy latex (only remember to change the \usepackage{graphicx} options accordingly).
Reducing the file size of your EPS files might help. Here are some ideas how to do that.
If you have the original image as JPEG, PNG, TIFF, GIF (or any other sampled image), reduce the original file size with whatever tools you have, then convert to EPS using sam2p. sam2p gives you much smaller EPS file sizes than what you get from most popular converters.
If your EPS is vector graphics, convert it to PDF using ps2pdf14 (part of Ghostscript), then convert back to eps using pdftops -eps (part of xpdf). This may reduce the EPS file size a lot.
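A hedged sketch of that vector round-trip (ps2pdf14 ships with Ghostscript, pdftops with xpdf/poppler; Figuer.eps is the figure from the question, and the output names are placeholders):
ps2pdf14 Figuer.eps Figuer.pdf
pdftops -eps Figuer.pdf Figuer-small.eps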
As a quick fix, try passing the [draft] option to the graphicx package.
Are you using a DVI previewer or going straight to pdf?
If you go all the way to PDF, you'll pay the cost of decoding and re-encoding (I used to have that problem with Visio diagrams). However, if you can generate PostScript most of the time or work straight with the DVI, the experience would be manageable.
Also, some packages will create .pdf files for you from figures, which you can then embed (I do that on my Mac).