PDFBox: converting the color space of an existing PDF

Problem:
I have an existing PDF that contains images and text. It uses a mix of RGB and CMYK color spaces. I want to convert it into a PDF with a single color space, preferably CMYK.
Using: PDFBox 2.0.4
PDFBox has PDPageContentStream, which can be used to set the color space when adding content to an existing PDF, like below:
PDDocument document = PDDocument.load(myPdfWithMixColorSpace);
PDPage page = document.getPage(0);
PDPageContentStream imageContentStream = new PDPageContentStream(document, page);
imageContentStream.setNonStrokingColor(0, 0, 0, 255); // for cmyk color space
imageContentStream.beginText();
imageContentStream.setFont(PDType1Font.TIMES_ROMAN, 20.15f);
imageContentStream.newLineAtOffset(13.55f, 50f);
imageContentStream.showText("this is test");
imageContentStream.endText();
imageContentStream.close();
document.save("D:\\newPDF.pdf");
document.close();
newPDF.pdf has the text added to it in CMYK color. However, I don't want to add text or images in CMYK to the existing PDF; I want to convert the color space of all the content already in the input PDF.
In summary, what I want is:
Given a PDF with mixed color spaces for its images and text,
then, using PDFBox,
create a new PDF or update the existing one so that all its content uses the CMYK color space.
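
As far as I know, PDFBox 2.0.4 has no single call that converts a whole document, so the color operators in each content stream and the image data would have to be rewritten individually. The arithmetic for one color value can be sketched as below. This is the naive DeviceRGB to DeviceCMYK formula, not a color-managed conversion (that would need ICC profiles), and the class and method names are made up for illustration.

```java
// Naive DeviceRGB -> DeviceCMYK arithmetic. Not color-managed: no ICC profile is used,
// so results will differ from what a print workflow would produce.
final class NaiveCmyk {
    private NaiveCmyk() {}

    /** Takes r, g, b in the range 0..1 and returns {c, m, y, k} in the range 0..1. */
    static float[] fromRgb(float r, float g, float b) {
        float k = 1f - Math.max(r, Math.max(g, b));
        if (k >= 1f) {
            // Pure black: avoid dividing by zero and use the black channel only.
            return new float[] {0f, 0f, 0f, 1f};
        }
        float d = 1f - k;
        return new float[] {(1f - r - k) / d, (1f - g - k) / d, (1f - b - k) / d, k};
    }
}
```

For example, NaiveCmyk.fromRgb(1f, 0f, 0f) gives cyan 0, magenta 1, yellow 1, black 0. Applying this to a real PDF still means parsing every content stream and replacing each RGB color operator with its CMYK counterpart, which is not shown here.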

Related

How to convert existing pdf files to A4 size using pdfbox?

I want to set a size (A4) on an existing document.
I am using PDFBox for watermarking. I used the following link to add the watermark. Here I am using another file that contains the watermark text. Later, this layer is simply added as an overlay on the original file.
The problem arises when the file with the watermark text has a different page size than the original document to which the watermark is added. In that case the watermark is not positioned properly.
Version: I am using PDFBox 1.8. I tried 2.0, but I am more comfortable with this version.
Here is the code:
PDDocument originalPdfFile = PDDocument.load(filename);
PDRectangle pdRect = new PDRectangle(595, 842); // Here I am setting width and height in terms of points
List PageList = originalPdfFile.getDocumentCatalog().getAllPages();
int noOfPages = PageList.size();
System.out.println("No of pages in original document=" + noOfPages);
PDPage page = new PDPage();
//PDPage page=new PDPage(PDPage.PAGE_SIZE_A4);
//Here also I tried to add page size
for (int i = 0; i < PageList.size(); i++) {
    page = (PDPage) PageList.get(i);
    System.out.println("Original Document size in page before cropping: " + (i + 1) + ", Page Resolution: " + page.getMediaBox());
    page.setMediaBox(pdRect);
    System.out.println("Original Document size in page after cropping: " + (i + 1) + ", Page Resolution: " + page.getMediaBox());
    //System.out.println("Original Document size in page: "+i+", Height: "+page.getMediaBox().getHeight()+",Width: "+page.getMediaBox().getWidth());
    PDRectangle rec = page.getMediaBox();
    generateWatermarkText(organisationName, rec);
}
HashMap<Integer, String> overlayGuide = new HashMap<Integer, String>();
for (int i = 0; i < originalPdfFile.getNumberOfPages(); i++) {
    overlayGuide.put(i + 1, "C:/drm/final/final.pdf");
    //watermarktext.pdf is the document which is a one page PDF with your watermark image in it.
}
Overlay overlay = new Overlay();
overlay.setInputPDF(originalPdfFile);
overlay.setOutputFile(filename);
overlay.setOverlayPosition(Overlay.Position.FOREGROUND);
overlay.overlay(overlayGuide, false);
//pdf will have the original PDF with watermarks.
The above code adds the watermark successfully, but I am not able to shrink the page.
This line
PDRectangle pdRect=new PDRectangle(595, 842);
crops the page but cuts off its contents, which I don't want. I want the contents to be scaled to fit the page, and the page to have the specified size (A4 in my case).
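
Changing the media box only changes the visible area, so shrinking the contents needs a scaling matrix applied in front of the page's existing content. A minimal sketch of how that matrix could be computed is below. It assumes the source media box starts at (0, 0), the helper name is hypothetical, and how to prepend the resulting matrix to the page content depends on the PDFBox version and is not shown.

```java
// Computes a uniform scale plus centering offsets for fitting one page box into another.
// Assumption: both boxes have their lower-left corner at (0, 0).
final class PageFit {
    private PageFit() {}

    /** Returns {scale, tx, ty} that fits a srcW x srcH box, uniformly scaled and centered, into dstW x dstH. */
    static double[] fit(double srcW, double srcH, double dstW, double dstH) {
        // Use the smaller ratio so the content fits in both directions without distortion.
        double scale = Math.min(dstW / srcW, dstH / srcH);
        double tx = (dstW - srcW * scale) / 2.0;
        double ty = (dstH - srcH * scale) / 2.0;
        return new double[] {scale, tx, ty};
    }
}
```

For a US Letter page (612 x 792 points) going onto A4 (595 x 842 points), PageFit.fit(612, 792, 595, 842) yields a scale of about 0.972 with the content centered vertically. The matrix to prepend would then be scale 0 0 scale tx ty.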

PDF to Black and White TIFF conversion using PDFBox loses quality

I have written a program that reads the pages of a PDF file using the PDFBox API and passes each BufferedImage to the following method, which converts it to black and white. My program then writes the result to TIFF files using FilesUtils.
private BufferedImage toBlacknWhite(BufferedImage imageBuffer) {
    // Check for null before dereferencing the source image
    if (imageBuffer == null) {
        return null;
    }
    BufferedImage bw = new BufferedImage(imageBuffer.getWidth(),
            imageBuffer.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
    Graphics2D g2d = bw.createGraphics();
    g2d.drawImage(imageBuffer, 0, 0, null);
    g2d.dispose();
    return bw;
}
The problem is that the output TIFF files lose major portions of the image and are of very poor quality. Please suggest a way to improve the quality of the output image.
Original Image:
Output Image:
Thank you.
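
One thing that might help is rendering the page at a higher resolution (for example 300 dpi) and doing the black-and-white conversion with an explicit threshold instead of leaving it to drawImage. The sketch below shows only the threshold step. The class name and the threshold parameter are assumptions for illustration, and a good cutoff depends on the document.

```java
import java.awt.image.BufferedImage;

// Explicit luminance threshold: every pixel is decided individually instead of
// relying on the implicit color conversion done by Graphics2D.drawImage.
final class Binarizer {
    private Binarizer() {}

    /** Pixels whose luma is at least the threshold (0..255) become white, all others black. */
    static BufferedImage threshold(BufferedImage src, int threshold) {
        BufferedImage bw = new BufferedImage(src.getWidth(), src.getHeight(),
                BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of the usual 0.299 / 0.587 / 0.114 luma weights.
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                bw.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return bw;
    }
}
```

A call such as Binarizer.threshold(pageImage, 128) would replace the toBlacknWhite method above. Lowering the threshold keeps faint strokes white, raising it makes them black.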

Replacing image in PDF using PDFBox

I have a PDF document which has a blank image (with grey background). I need to replace this blank image with another image which I have as a byte array.
I replace the image using PDFBox as below:
// Convert byte array to BufferedImage
BufferedImage imgFromBytes = ImageIO.read(new ByteArrayInputStream(imgBytes));
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ImageIO.write(imgFromBytes, "jpg", baos);
// Replace empty image in pdf with the image generated from byte array
PDXObjectImage newImage = new PDJpeg(doc, new ByteArrayInputStream(baos.toByteArray()));
blankImg.getCOSStream().replaceWithStream(newImage.getCOSStream());
This replaces the blank image in the PDF. The problem is that the image generated from the byte array is not the same size as the blank image. I guess some auto-scaling happens, which leaves blank space at the bottom that is grey (due to the background of the blank image).
So I tried re-scaling the image as below:
// Convert byte array to BufferedImage
BufferedImage imgFromBytes = ImageIO.read(new ByteArrayInputStream(imgBytes));
//Re-size the image
BufferedImage resizedImage = new BufferedImage(blankImg.getWidth(), blankImg.getHeight(), BufferedImage.TYPE_INT_RGB);
Graphics2D g2d = resizedImage.createGraphics();
g2d.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BILINEAR);
g2d.drawImage(imgFromBytes, 0, 0, blankImg.getWidth(), blankImg.getHeight(), Color.WHITE, null);
g2d.dispose();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ImageIO.write(resizedImage, "jpg", baos);
// Replace empty image in pdf with the image generated from byte array
PDXObjectImage newImage = new PDJpeg(doc, new ByteArrayInputStream(baos.toByteArray()));
blankImg.getCOSStream().replaceWithStream(newImage.getCOSStream());
But this doesn't help me either. I still see the grey background at the bottom. Moreover, this reduces the quality of the image and also chops a bit from the top and bottom of the image generated from the bytes.
What would be the best way to replace the blank image with the image I get from the byte array, so that it fits the size of the blank image perfectly?
Thanks.
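
If the new image and the blank image have different aspect ratios, stretching to blankImg.getWidth() by blankImg.getHeight() will distort the picture. One possible approach is to scale it uniformly and center it on a canvas of exactly the placeholder's pixel size, as sketched below. The class name is hypothetical, and this only covers the pixel side; it may not be the root cause of the grey strip.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Scales an image uniformly and centers it on a white canvas of a fixed size.
final class Letterbox {
    private Letterbox() {}

    /** Returns a boxW x boxH RGB image with src scaled to fit and centered on white. */
    static BufferedImage fit(BufferedImage src, int boxW, int boxH) {
        double scale = Math.min((double) boxW / src.getWidth(), (double) boxH / src.getHeight());
        int w = Math.max(1, (int) Math.round(src.getWidth() * scale));
        int h = Math.max(1, (int) Math.round(src.getHeight() * scale));
        BufferedImage out = new BufferedImage(boxW, boxH, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            // Fill the whole canvas first so uncovered areas are white.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, boxW, boxH);
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, (boxW - w) / 2, (boxH - h) / 2, w, h, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The result of Letterbox.fit(imgFromBytes, blankImg.getWidth(), blankImg.getHeight()) could then be written to JPEG as in the code above.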

Detect if PDF is colored [DATALOGICS][APDFL]

I am using APDFL 10.1.0 to convert PDF to images. This is how I am loading the PDF file and saving a specific page as image:
Document pdfdocument = null;
pdfdocument = new Document(docpath);
Page docpage = pdfdocument.GetPage(pagelist[0]);
Image pageimage = docpage.GetImage(PageRect);
Is there a way to detect from either the docpage variable or the pageimage variable if the specific page is colored or is grayscale?
You can use pageimage.NumberComponents to determine this. Color images will have 3 or 4 components (depending on whether the image is RGB or CMYK), while grayscale images will have 1 component.
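
The component count describes the color space of the image rather than what is drawn in it, so a page rendered into an RGB image may report 3 components even if everything on it is gray. A library-independent alternative is to inspect the pixels. The sketch below does this in Java on a java.awt BufferedImage. It is not APDFL code, and the class name and tolerance parameter are assumptions.

```java
import java.awt.image.BufferedImage;

// Pixel-level check: an image counts as grayscale if the red, green and blue
// channels of every pixel are (nearly) equal.
final class GrayCheck {
    private GrayCheck() {}

    /** Returns true if no pixel has channels that differ by more than the tolerance (0..255). */
    static boolean isGrayscale(BufferedImage img, int tolerance) {
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                if (Math.abs(r - g) > tolerance || Math.abs(g - b) > tolerance
                        || Math.abs(r - b) > tolerance) {
                    return false; // found a colored pixel
                }
            }
        }
        return true;
    }
}
```

A small tolerance (for example 2 or 3) allows for compression noise in scanned pages.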

Paragraph in PDFbox

I have a requirement to migrate PDF generation from iText to PDFBox. I have the following questions:
How do I generate a paragraph in PDFBox? (new paragraph in iText)
How do I set the font color in PDFBox? (Font.BOLD, new Color(79, 129, 189) in iText)
Can someone advise me on how to solve these problems?
Not sure if you found the answer for this yet or not....
As far as I know, PDFBox doesn't handle line breaks for you; you'll have to split the text into lines and position each one yourself with the moveTextPositionByAmount method.
This is how I write something and change the font and color:
PDFont font = PDType1Font.HELVETICA_BOLD;
PDPageContentStream contentStream =
        new PDPageContentStream(document, page, true, true);
contentStream.beginText();
contentStream.setFont(font, size);
contentStream.setNonStrokingColor(Color.BLUE);
contentStream.moveTextPositionByAmount(x,y);
contentStream.drawString(message);
contentStream.endText();
contentStream.close();
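
Since the line breaking has to be done by hand, a paragraph can be approximated by splitting the text into lines that fit a given width and drawing them one below the other. Below is a minimal greedy word-wrap sketch. The class name and the measure parameter are hypothetical. With PDFBox the measure function could be based on font.getStringWidth(text) / 1000 * fontSize, which is left out here so the example runs without the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Greedy word wrap: keep adding words to the current line until the next one would not fit.
final class LineWrapper {
    private LineWrapper() {}

    /** Splits text into lines whose measured width stays within maxWidth where possible. */
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> measure) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            String candidate = line.length() == 0 ? word : line + " " + word;
            if (line.length() > 0 && measure.applyAsDouble(candidate) > maxWidth) {
                // The word does not fit: close the current line and start a new one.
                lines.add(line.toString());
                line.setLength(0);
                line.append(word);
            } else {
                line.setLength(0);
                line.append(candidate);
            }
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

Each returned line can then be drawn with drawString, moving down by the line height with moveTextPositionByAmount(0, -leading) between lines. A single word wider than maxWidth is kept on its own line rather than being split.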