How to compress a Pdf file in Windows 8 and Windows 7 - pdf

I have a pdf file of 9mb but I want to convert it into less than 1mb.How can I do that? I have used all the online tools available.

It depends on what you produced it with, what level of PDF compliance you set, what target purpose / amount of images.
What you made it with: Some design software (Ilustrator) gives the option to leave editing information in the PDF. You would want to switch anything like that off.
Level of PDF compliance - compression got better with higher PDF versions. Go for the highest level you can.
Target purpose / content: Target purpose means DPI. Re-sample the images to the appropriate size for the output purpose. If you are for example intending the PDF to be displayed on a phone or tablet, then you do not need those massive images that you took with your 24Gb SLR.
Acrobat Pro tends to be good at compressing and removing junk.
9Mb is quite large for a PDF so I assume it is full of rich images. Reducing the size and colour depth of those images is likely to be the biggest benefit.

Related

How to increase the page turn speed

Are there some general rules how to create pdf documents in order to achieve an optimal page turn speed?
I recently created few pdf documents (without graphics) using Microsoft Word and realized that my ebook reader (SONY) leafs through them at a slower pace compared with some pdf documents containing more pages and graphics.
What features of pdf documents or ebook readers affect the page turn speed?
How these feature should be configured to increase the page turn speed?
Thanks
There are a few major items that could affect PDF document rendering speeds in the general case. These items are, in order of severity:
Vector graphics - this is by far the worst offender. Too many vector graphics could bring document rendering speeds to a crawl
High-resolution images - even if the images in the actual document appear small, the originals, stored in the document, could be high resolution. This leads to slow rendering speeds as well
Font problems - this category is expanded on below
I imagine issues with fonts is what comes into play here. The rendering speed can be affected by the number of fonts in a PDF document. Also the use of multi-byte and cid fonts sometimes could trip up a mobile viewer.
Another problem could be the version of word used. Older versions don't do as good of a job generating PDF documents, as newer versions.

How to convert scanned document images to a PDF document with high compression?

I need to convert scanned document images to a PDF document with high compression. Compression ratio is very important. Can someone recommend any solution on C# for this task?
Best regards, Alexander
There is a free program called PDFBeads that can do it. It requires Ruby, ImageMagick and optionally jbig2enc.
The PDF format itself will probably add next to no overhead in your case. I mean your images will account for most of the output file size.
So, you should compress your images with highest possible compression. For black-and-white images you might get smallest output using FAX4 or JBIG2 compression schemes (both supported in PDF files).
For other images (grayscale, color) either use smallest possible size, lowest resolution and quality, or convert images to black-and-white and use FAX4/JBIG2 compression scheme.
Please note, that most probably you will lose some detail of any image while converting to black-and-white.
If you are looking for a library that can help you with recompression then have a look at Docotic.Pdf library (Disclaimer: I am one of developers of the library).
The Optimize images sample code shows how to recompress images before adding them to PDF. The sample shows how to recompress with JPEG, but for FAX4 the code will be almost the same.

Pdf tools to analyze pdf attributes

Is there any pdf tools that generate information regarding the loading time and memory usage to display pdf in browser, and also total element inside the pdf?
Unfortunately not really. I've done some of this research, not for PDF in a browser but (and perhaps this is what you are looking at as well) PDF on mobile devices.
There are a number of factors that contribute and that to some extent can be tested for:
Whether or not big images exist in the PDF and what resolution they are. This is linked directly to memory usage.
What compression method is used for image compression. Decompressing JPEG-2000 images specifically can increase load time significantly. Even worse, as JPEG-2000 can be progressively decompressed, it can give the appearance of a really bad PDF until the images has been fully decompressed and loaded (this is ugly specifically on somewhat older tablets for example).
How complex the transparency effects are that are used in the document.
How many fonts are used in the document.
How many line-art objects (vector elements) with a large number of nodes (points) are used on a page.
You can test what is in the document using Acrobat Pro to some extent (there is a well-hidden tool when you save an optimised PDF file that can audit what objects use how much of the space in a PDF document). You can also use a preflight solution such as pdfToolbox from callas (I'm affiliated with this company) or pitstop from enfocus; these tools would allow you to get a report with the results of custom checks such as image resolution, compression, vector objects, color spaces etc.

Reducing the size of pdf generated from software using proprietary fonts

I am trying to bring an Indian Magazine online. This magazine is typed in CorelDraw using the proprietary Devenagari font (http://www.modular-infotech.com/html/shreelipi.html). So these guys have provided a USB dongle that you have to have attached to the machine when you want to access the fonts, and this software has been in use for past 10 years.
To put the magazine online, we've tried to convert it to pdf (by printing). The resultant pdf size is of the order of 30-50MB, even when the pdf does not have even a single image. I am guessing it converts the whole text into an image
It would be really difficult for users to read this magazine given its size. Though when I convert it to .swf format (for add flipbook kind of functionality) - the size reduces to 5-6MB. But there are people who like to download the magazine and then read. I have had no luck reducing the size of pdf.
I have done lot of research on web. The postscript, primo pdf do not help much. The best I could get was 30% reduction using DocuCom pdf printer. But it is still 20MB. I have tried to play with resolution, compression and quality but the best I could get was 18MB.
Ideally I would like to reduce it to less than 2MB.
I would be really grateful if you could help me reduce the size of the pdf! Considering that it has no images, I am hopeful that I can get some really good compression.
The (35MB) magazine can be downloaded from: http://merajhola.in/jin-march.pdf
I can't see any easy way to reduce the size of this PDF. There are no embedded fonts and all the text is drawn using vector graphics primitives. No amount of tweaking the resolution, compression and quality will have a significant improvement.
One possible option would be to embed the font as a subset rather than use vector graphics. That will almost certainly make a big difference, however I doubt the proprietary font license will allow it.
I'm sorry, but this Shree-Lipi thing just sounds wrong in 2012. It would be much better to use proper OpenType fonts with modern (say InDesign) or free (say LuaTeX) software.

Are all PDF files compressed?

So there are some threads here on PDF compression saying that there is some, but not a lot of, gain in compressing PDFs as PDFs are already compressed.
My question is: Is this true for all PDFs including older version of the format?
Also I'm sure its possible for someone (an idiot maybe) to place bitmaps into the PDF rather than JPEG etc. Our company has a lot of PDFs in its DBs (some older formats maybe). We are considering using gzip to compress during transmission but don't know if its worth the hassle
PDFs in general use internal compression for the objects they contain. But this compression is by no means compulsory according to the file format specifications. All (or some) objects may appear completely uncompressed, and they would still make a valid PDF.
There are commandline tools out there which are able to decompress most (if not all) of the internal object streams (even of the most modern versions of PDFs) -- and the new, uncompressed version of the file will render exactly the same on screen or on paper (if printed).
So to answer your question: No, you cannot assume that a gzip compression is adding only hassle and no benefit. You have to test it with a representative sample set of your files. Just gzip them and take note of the time used and of the space saved.
It also depends on the type of PDF producing software which was used...
Instead of applying gzip compression, you would get much better gain by using PDF utilities to apply compression to the contents within the format as well as remove things like unneeded embedded fonts. Such utilities can downsample images and apply the proper image compression, which would be far more effective than gzip. JBIG2 can be applied to bilevel images and is remarkably effective, and JPEG can be applied to natural images with the quality level selected to suit your needs. In Acrobat Pro, you can use Advanced -> PDF Optimizer to see where space is used and selectively attack those consumers. There is also a generic Document -> Reduce File Size to automatically apply these reductions.
Update:
Ika's answer has a link to a PDF optimization utility that can be used from Java. You can look at their sample Java code there. That code lists exactly the things I mentioned:
Remove duplicated fonts, images, ICC profiles, and any other data stream.
Optionally convert high-quality or print-ready PDF files to small, efficient and web-ready PDF.
Optionally down-sample large images to a given resolution.
Optionally compress or recompress PDF images using JBIG2 and JPEG2000 compression formats.
Compress uncompressed streams and remove unused PDF objects.