I am wondering if there is any way to compress specific sections of an image and preserve other sections. For example, I want the background of a large image compressed, but the title and description text laid over the background to be crisp.
This would be pretty cool. Short answer: no.
Long answer: JPEG and PNG.
Do the background with JPEG and save this off as a separate file.
Then do the title and description as a PNG with transparency.
In whatever you are making (website, app), you will then be able to overlay these images, and since the PNG has transparency it will appear as part of the original image.
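If you want to preview what the combined result will look like outside the app, a minimal sketch with ImageMagick (assuming the PNG was exported at the same canvas size as the background; the filenames here are just placeholders) would be:
convert background.jpg text_overlay.png -composite preview.png
In the actual website or app you would keep the two files separate and let the layout engine do the overlay, so only the JPEG carries the heavy compression.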
At the end of the day we only have a few technologies we can work with, and that is JPEG, GIF, PNG, TIFF, BMP (and SVG, though some things don't support it) for image decoding on the end user's side.
Neither of these technologies does what you want well. PNG is awesome, but the file size will be pretty huge compared to JPEG. JPEG won't give you crisp text when you have an image in the background.
I wouldn't be surprised if someone has written an encoder for what you want to do, but if you then send that file to someone or something else, they won't be able to decode it easily without your encoder, and that is why we stick to the standard formats.
The direct answers are:
If you are asking "can I . . . . in Photoshop", the answer to your question is NO.
If you are asking "can I . . . . programmatically," the answer to your question is YES, with some compression methods.
However, I sense there is a question behind your question.
You mention blurring. That suggests you are trying to save as JPEG because JPEG is the only major image compression technique that causes blurring.
The solution to this problem depends upon the nature of the image. Is it a photograph? Drawing? How many colors does it have?
Could you use another compression method (e.g., PNG as suggested previously)?
You might be able to get away with JPEG using a so-called "high" quality setting, at the cost of increased file size.
You can choose to save it in a format that has lossless compression, but the compression ratio is significantly lower. Having said that, save your file as JPEG at the highest quality level (12), then open the new file and compare it to the original. At this level the loss of detail is relatively low, and if the text isn't on a single-color background you might find the result acceptable for your needs.
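If you'd rather test this from the command line than in Photoshop, a rough ImageMagick equivalent (with -quality 100 standing in for Photoshop's level 12; filenames are placeholders) would be:
convert original.png -quality 100 test.jpg
compare -metric PSNR original.png test.jpg diff.png
The PSNR value and the difference image give you a quick sense of how much detail the JPEG pass actually lost.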
Related
I want to create a visualization of a matrix for some academic work. I decided to go about this by having the pixels in the image correspond to the values in the matrix. I created the nice small png that follows:
When properly scaled up, you get a very reasonable image:
This is a screenshot from within Inkscape. However, when I export this as a PDF, both Evince and Chrome do a terrible job of upscaling what should be a very trivial image, and instead I get something that looks like:
The PDF itself seems to scale well enough for printing, but unfortunately I do a lot of my editing without printing, and this looks unacceptable. I did find this incredibly old thread about people who seem to have a similar issue with Chrome's PDF viewer, and the "solution" was to just upscale the raster graphics. That is a solution, but it is terribly inefficient.
Is anyone aware of a way to change the pdf so that it gets upscaled appropriately? Maybe a config change in evince or chrome that will render these properly? Even a nice way to go from a raster image to a vector image might be suitable?
The comments aggregated into an answer...
An image dictionary in a PDF has an (optional) boolean entry Interpolate. It is specified as a flag indicating whether image interpolation shall be performed by a conforming reader.
The program used by the OP to create the PDF, Inkscape, seems to have explicitly set this flag to true. Editing the PDF to unset this flag creates a file which looks as desired by the OP.
(An alternative solution, proposed in the Inkscape forum thread eventually found by the OP, is to save the PDF with high-resolution bitmaps embedded: File -> Inkscape Preferences -> Bitmaps -> Resolution for Create Bitmap Copy, and set it to 6000 dpi.)
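If you want to clear the flag yourself rather than re-export from Inkscape, one hedged way to do it is to decompress the PDF with qpdf, flip the entry with a text substitution, and then repair the cross-reference table (this sketch assumes qpdf is installed and that the image dictionary literally contains "/Interpolate true"):
qpdf --qdf --object-streams=disable input.pdf editable.pdf
sed -i 's|/Interpolate true|/Interpolate false|g' editable.pdf
fix-qdf editable.pdf > fixed.pdf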
The fact that interpolation looks different in different viewers and on different output media is by design. The PDF specification states on interpolation:
A conforming Reader may choose to not implement this feature of PDF, or may use any specific implementation of interpolation that it wishes.
A different way to get around this problem (especially as some PDF viewers have the tendency to not really live up to the specification and e.g. interpolate ignoring that flag) would be to use vector graphics here, drawing the bitmap pixels as rectangles. The result should be optimal.
I have a very specific requirement where I must automatically stamp every page of a PDF file (for a faxing application), so here's the process I've set up:
step 1: Convert PDF to PNG, one png file per page
cmd1: gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -r400 -sOutputFile=image_raw.png input.pdf
cmd2: mogrify -resize 31.245% image_raw.png
input.pdf (input): https://www.dropbox.com/s/p2ajqxe99nc0h8m/input.pdf
image_raw.png (output): https://www.dropbox.com/s/4cni4w7mqnmr0t7/image_raw.png
step 2: Stamp every PNG file (using a third party tool ..)
image_stamped.png (output): https://www.dropbox.com/s/3ryiu1m9ndmqik6/image_stamped.png
step 3: Reconvert PNG files into one PDF file
cmd: convert -resize 1240x1753 -units PixelsPerInch -density 150x150 image_stamped.png output.pdf
output.pdf (output): https://www.dropbox.com/s/o9y0jp9b4pm08ci/output.pdf
The output file of the third step should theoretically be the same as the input file in step 1 (plus the stamp on it), but it's not: the file is somehow blurry, and it becomes unreadable for humans after faxing, since blurred pixels don't survive fax transmission. Even if you see no difference between input.pdf and output.pdf at first, try zooming in and you'll find that the text characters are blurred at their edges.
What are the best parameters to play with at input (step 1) or output (step 3)?
Thanks !
You are using anti-aliasing (TextAlphaBits=4). This 'smooths' the edges of text by introducing grey pixels between the black pixels of the text edges. At low resolutions (such as displays) this prevents the 'jaggies' in text and gives a more readable result. At higher resolutions its value is highly debatable.
Fax is a 1-bit monochrome medium, so the grayscale values have to be recreated by dithering. As you have discovered, this is not a good idea in a limited resolution device as it leads to a loss of sharpness.
I believe that if you remove the -dTextAlphaBits=4 you will see an immediate improvement. I would also suggest that you remove the GraphicsAlphaBits as well, since this will have the same effect on linework.
If you believe that you still want anti-aliasing, you could try reducing its aggressiveness; you currently have it set to 4, so try reducing it to 2.
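As a sketch of that suggestion, the first command from the question with the anti-aliasing dialled down (everything else unchanged) would look like:
gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -dGraphicsAlphaBits=2 -dTextAlphaBits=2 -r400 -sOutputFile=image_raw.png input.pdf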
Regarding the other comments:
Kurt is quite correct, as is fourat, and I'm afraid MarcB is mistaken: the -r400 sets the resolution for rendering, in dots per inch. If only one number is given, it is used for both the x and y resolution. It is possible to produce a fixed-size raster using Ghostscript, but to do that you use the -dFIXEDMEDIA and -sPAPERSIZE switches, or the -g switch, which also sets FIXEDMEDIA automatically.
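For example (a sketch only; the pixel dimensions are illustrative, taken from the resize in step 3), rendering straight to a fixed 1240x1753 pixel raster instead of resizing afterwards could look like this, where -dPDFFitPage scales the PDF page to the fixed media:
gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -g1240x1753 -dPDFFitPage -sOutputFile=image_raw.png input.pdf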
While I do agree with yms and Kurt that converting the PDF to a bitmap format (PNG) and then back to PDF will result in a loss of quality, if the final PDF is only used for transmission via fax, it doesn't matter. The PDF must be rendered to a fax-resolution bitmap at some point in the process, and it's not a big problem if that's done before the stamp is applied.
I don't agree with BitBank here: converting a vector representation to a bitmap means rasterizing it at a particular resolution. Once this is done, the resulting image cannot be rescaled without loss of quality, whereas the original vector representation can be, as it is simply rendered again at a different resolution. "Image" in PDF refers to a bitmap; you can't have a vector bitmap. The image posted by yms clearly shows the effect of rendering a vector representation into an image.
One last caveat. I'm not familiar with the other tools being used here, but at least two of the command lines imply a 'resize'. If you 'resize' a bitmap, the chances are that the tool will introduce the same kinds of artefacts (anti-aliasing) that you are having a problem with. Once you have created the bitmap you should not alter it at all. It's important that you create the PNG at the correct size in the first place.
And finally.....
I just checked your original PDF file and I see that the content of the page is already an image. Not only that, it's a DCT (JPEG) image. JPEG is a really poor choice of format for a monochrome image. It's a lossy compression format and always introduces artefacts into the image. If you open your original PDF file in Acrobat (or a similar viewer) and zoom in, you can see that there are faint 'halos' around the text; you will also see that the text is already blurry.
You then render the image, quite probably at a different resolution to the original image resolution, and at the same time introduce more blurring by setting -dGraphicsAlphaBits. You then make further changes to the image data which I can't comment on. In the end you render the image again, to a monochrome bitmap. The dithering required to represent the grey pixels leads to your text being unreadable.
Here are some ways to improve this:
1) Don't convert text into images like this; it instantly leads to a quality loss.
2) Don't compress monochrome images using JPEG.
3) If you are going to work with images, don't keep converting them back and forth, work with the original until you are done, then make a PDF file from that, if you really must.
4) If you really insist on doing all this, don't compound the problem by using more anti-aliasing. Remove the -dGraphicsAlphaBits from the command line. You might as well remove -dTextAlphaBits as well since your files contain no text. Please read the documentation before using switches and understand what it is you are doing.
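Concretely, point 4 amounts to rendering with the anti-aliasing switches removed entirely; a minimal sketch of the adjusted first command would be:
gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -r400 -sOutputFile=image_raw.png input.pdf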
You should really think about your workflow here. Obviously we don't know what you are doing or why, so there may well be good reasons why some things are not possible, but you should try and avoid manipulating images like this. Because these are not vector, every time you make a change to the image data you are potentially losing information which cannot be recovered at a later stage. By making many such transformations (and your workflow as depicted seems to perform as many as 5 transformations from the 'original' image data) you will unavoidably lose quality.
If possible retain everything as vector data. When it is unavoidable to move to image data, create the image data as you need it to be finally used, do not transform it further.
I've had a closer look at the files you provided, see here:
So, already the first image (image_raw), the result of the mogrify resize command, is fairly blurry at 1062x1375. While the blurriness does not get worse in the second image (image_stamped), which is the result of the third-party tool, the third image (extracted from your output.pdf), i.e. the result of that convert command, is even more blurred, which is due to the graphic being resized (something you explicitly tell it to do).
I don't know at which resolution your fax program works, but there is more quality loss still, at least from the transformation of 24-bit color to black-and-white.
If you insist on the workflow (i.e. PDF -> PNG -> stamped PNG -> PDF -> fax) you should:
in the initial rasterization, already use the per-inch resolution your rasterized image will have in all following steps (including fax transmission),
refrain from anti-aliasing and use of alpha bits (cf. KenS' answer), and
restrict the rasterized image to the colorspace available to the fax transmission, i.e. most likely black-and-white (a sketch combining these points follows below).
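A sketch combining these points, assuming the standard "fine" fax resolution of 204x196 dpi and Ghostscript's 1-bit pngmono device (adjust both to whatever your fax software actually expects):
gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pngmono -r204x196 -sOutputFile=page_%03d.png input.pdf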
PS As KenS pointed out, already the original PDF is merely a container for an image (with some blur to start with). Therefore, an alternative way to improve your workflow is to extract that image instead of rendering it, to stamp that original image and only resize it (again without anti-aliasing) when faxing.
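If poppler-utils is available, extracting the embedded JPEG instead of re-rendering it is a one-liner (a sketch; -j writes DCT-encoded images out as .jpg files):
pdfimages -j input.pdf page
This yields files like page-000.jpg, which you can then stamp and pass on without an extra rasterization step.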
I have a 130 KB JPEG image that won't open in anything, and I need to fix it. From the various image recovery programs that I used, all I got was "Image headers corrupted/missing". I don't even get anything when I look at the properties of the file, no dimensions etc., just the file size. Is it possible to recover the image once its headers are lost? I don't want to use any more recovery software. I got one idea from a colleague: parse the JPEG and look for anomalies compared to a working JPEG. Any other ideas?
The only method I can think of is to look at the JPEG using a hex editor, and check if its contents conform to the JPEG spec. Good luck.
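As a starting point for that inspection, you can dump the first and last bytes and check them against the markers the JPEG spec requires: a valid file starts with the SOI marker ff d8 ff and ends with the EOI marker ff d9 (broken.jpg is a placeholder name):
xxd broken.jpg | head -n 2
xxd broken.jpg | tail -n 2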
This is not strictly a programming question, but it's related to a programming task: I need to do this in order to make an iPhone app.
I have a PDF file with a large image (say, a campus map) which I want to store as a PNG image to include as a resource in the app. The image I want is much larger than the screen area (a lot larger, about 4000x4000 px), so I cannot just take a single screenshot of the PDF and save it as a PNG. The only way I know to accomplish this is to take a number of screenshots of different parts of the image and manually stitch them together in an image editor. There would be 8-10 images to stitch together, if not more.
I wonder if anyone knows a more efficient way of doing this? Acrobat Reader does not allow it. Are there any tools or tricks in either Windows or macOS I can use? Googling this did not turn up anything that works.
It would also be an option to use the PDF directly, iOS has pretty good support for reading PDFs, see the ZoomingPDFViewer sample code from Apple for an example.
As for your actual question, I'm not sure if there are existing tools that do exactly what you want here (though I'd guess there are), but it would also be pretty easy to make a small Cocoa command-line tool that converts a PDF to a number of bitmap tiles using Core Graphics.
You could use Ghostscript to convert your pdf to a png.
A command like
gs -sDEVICE=png16m -r600 -o my_Map.png my_Map.pdf
would produce a PNG from the PDF.
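If you need a particular pixel size rather than a particular dpi, just pick -r accordingly; for instance, assuming the PDF page is US Letter (8.5 inches wide), a resolution around 470 dpi gives roughly 4000 pixels across (filenames are placeholders):
gs -sDEVICE=png16m -r470 -o campus_map.png campus_map.pdf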
I am experimenting with a system to scan letters and convert the scanned bitmaps to PDF with the goal to have a high resolution and a small PDF file size.
I am prototyping with scanner, GIMP for bitmap manipulation and ImageMagick for bitmap-to-PDF conversion.
My process looks as follows:
1) Scan in 3x8-bit color at 600 DPI; the LZW-compressed true-color TIFF file size is around 8 MB.
2) Use GIMP to convert the bitmap to an indexed image with a typical color table of 4 to 8 colors. That makes the image better compressible.
3) Use ImageMagick to convert the LZW-compressed indexed TIFF file to PDF, at around 500K per page (an example command follows below).
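As an illustration of step 3 (and, if you want to skip GIMP, of step 2 as well), a hedged ImageMagick one-liner that quantizes to 8 colors and requests Flate ("Zip") compression in the output PDF might look like this (filenames are placeholders; GIMP's indexed conversion may still give you better control over the palette):
convert scan.tiff -colors 8 -compress Zip scan.pdf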
Now in order to make the image even better compressible, I could make the bitmap more compression-friendly. Before experimenting here, I would like to know how PS/PDF stores bitmaps.
Are bitmaps in PS/PDF run-length-encoded? Then I would gain compression by removing single pixels from bitmap rows.
Do you have ideas for further optimizing here?
Do you know references to bitmap storage format in PS/PDF?
PDF supports many types of image compression, see: http://en.wikipedia.org/wiki/Pdf#Raster_images
I think you can specify which one to use with the imagemagick -compress option: http://www.imagemagick.org/script/command-line-options.php#compress
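For example (a sketch, not specific to the OP's files), forcing Flate for an indexed scan or CCITT Group 4 for a purely bitonal one would look like:
convert page.tiff -compress Zip page.pdf
convert page.tiff -monochrome -compress Group4 page_bw.pdf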
A few companies (Luratech and CamiNova are the only ones I know) make a "Mixed Raster Content" model in PDF. The files are viewable in the standard Adobe Reader but are very, very small -- comparable to DjVu.
"Mixed Raster Content" means they segment the image into a high resolution B&W mask (hard edges, lines, letters) and lower resolution smooth tone image (background pictures). The mask gets stored using a bitonal compression algorithm (probably JBIG2) and the smooth tone image gets compressed using JP2K (probably).
For bitmaps, IIRC, PDF uses Deflate. But PDF can also store images with more specific image compression algorithms, such as JPEG (lossy), CCITT (lossless), JBIG2 (lossy and lossless) and JPX (i.e. JPEG 2000; lossy and lossless).
Adobe's PDF reference might be a good place to start. From a very cursory look, it looks like images are stored uncompressed, but that doesn't feel right at all. It can also link to external images, in JPEG for instance.
The compression method is generally selected by the tool creating the PDF and you may have limited control over that.
If you have Acrobat 9.0 there is a really nice 'hidden' feature which allows you to see the object tree inside a PDF (you are interested in the XObjects under Resources). There is a short blog on using it at http://pdf.jpedal.org/java-pdf-blog/bid/10479/Viewing-PDF-objects