pdfbox error - Colorspace can't be determined at this time, about to return NULL from unhandled branch - pdfbox

I am using the following code to get a list of JPEGs attached to one of my objects, "PrintDocument", which references a "Document", which in turn references "Page" objects. The file path is built from the Document and the Page, but that is not the problem. When I run this code as plain Java on the JVM, the PDF is built without any error...
public File createPDFtoPrint(PrintDocument pdocument)
throws IncompleteDocumentException, IOException, FSException {
File tmp = null;
if(validatePrintDocument(pdocument)){
PDDocument document = new PDDocument();
Iterator<Page> iterator = pdocument.getDocument().getPageList().iterator();
while(iterator.hasNext()){
Page page = iterator.next();
PDPage pdpage = new PDPage(PDPage.PAGE_SIZE_A4);
document.addPage( pdpage );
PDPageContentStream contentStream = null;
try {
contentStream = new PDPageContentStream(document, pdpage);
} catch (IOException e) {
String message = "Error while creating contentStream for Page" + page.getPageNumber();
logger.error(message);
logger.debug(e.toString());
}
PDJpeg ximage;
try {
File imgFile = new File(docDir+"/"+pdocument.getDocument().getName()+"/"+page.getPageNumber()+".jpg");
System.out.println("*************** FILEPATH IS "+imgFile.getPath());
ximage = new PDJpeg(document,new FileInputStream(imgFile));
} catch (FileNotFoundException e) {
String message = "File not found for Document " + pdocument.getDocument().getName() + "! Try uploading the PDF for this product!";
logger.error(message);
throw new FileNotFoundException(message);
} catch (IOException e) {
String message = "Error while reading input file for document " + pdocument.getDocument().getName();
logger.error(message);
throw new IOException(message); }
if(ximage.getWidth()>ximage.getHeight()){
AffineTransform at = new AffineTransform(pdpage.getMediaBox().getWidth(), 0, 0, pdpage.getMediaBox().getHeight(), pdpage.getMediaBox().getWidth(), 0);
at.rotate(Math.toRadians(90));
contentStream.drawXObject(ximage,at);
}
else{
AffineTransform at = new AffineTransform(pdpage.getMediaBox().getWidth(), 0, 0, pdpage.getMediaBox().getHeight(), 0, 0);
contentStream.drawXObject(ximage,at);
}
PDFont font = PDType1Font.HELVETICA_BOLD;
//Pixel per Point
float ppp = page.getPpp();
Iterator<Field> cbIterator = pdocument.getCheckBoxMap().keySet().iterator();
while(cbIterator.hasNext()){
Field field = cbIterator.next();
if(pdocument.getCheckBoxMap().get(field)){
contentStream.beginText();
contentStream.setFont( font, 14 );
contentStream.moveTextPositionByAmount(field.getPosx()/ppp, field.getPosy()/ppp );
contentStream.drawString("x");
contentStream.endText();
}
}
BUT when I use this in an EJB inside the GlassFish container, I get the following errors and the pages are blank:
[2015-05-05T01:57:59.739+0200] [glassfish 4.0] [INFO] [] [] [tid: _ThreadID=22 _ThreadName=Thread-3] [timeMillis: 1430783879739] [levelValue: 800] [[2015-05-05 01:57:59 DEBUG PDXObjectImage:398 - Colorspace can't be determined at this time, about to return NULL from unhandled branch. filter = COSName{DCTDecode}]]
[2015-05-05T01:57:59.740+0200] [glassfish 4.0] [INFO] [] [] [tid: _ThreadID=22 _ThreadName=Thread-3] [timeMillis: 1430783879740] [levelValue: 800] [[2015-05-05 01:57:59 DEBUG PDXObjectImage:400 - Can happen e.g. when constructing PDJpeg from ImageStream]]
Does anybody have any clue, why this is happening and how this can be solved?
Regards

My problem was that the PDFs I was generating appeared to be empty: all white. But the message quoted in the title has nothing to do with that. It is logged at DEBUG level and seems to be something that nobody who doesn't really care about the colorspace needs to worry about. The reason my documents appeared empty was that I was not closing the content stream after finishing each page. Calling PDPageContentStream#close solved the problem in my case.
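The effect described above can be reproduced without PDFBox. The sketch below is a JDK-only illustration, assuming the content stream buffers what is written to it until close() is called (which is what the fix suggests); BufferedOutputStream stands in for the content stream, and the class and method names are made up for this example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class CloseStreamDemo {

    /**
     * Writes the text through a buffering stream and returns how many bytes
     * actually reached the underlying target.
     */
    static int bytesReachingTarget(String text, boolean closeStream) {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(target);
        try {
            buffered.write(text.getBytes(StandardCharsets.US_ASCII));
            if (closeStream) {
                // close() flushes the buffer into the target
                buffered.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.size();
    }

    public static void main(String[] args) {
        System.out.println("without close: " + bytesReachingTarget("page content", false));
        System.out.println("with close:    " + bytesReachingTarget("page content", true));
    }
}
```

Without the close() call the target stays empty, which matches the symptom of a page that exists but shows nothing.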

Related

Why do I get a NullPointerException when decompressing with Apache Commons Compress?

[Image: NullPointerException stack trace]
I generate tar.gz files and send them to others to decompress, but their program fails with the error above (their program was not written by me). Only one file triggers that error.
But when running 'tar -xzvf ***' on my computer and on theirs, no problem occurs...
So I want to know what is wrong in my program below:
public static void archive(ArrayList<File> files, File destFile) throws Exception {
TarArchiveOutputStream taos = new TarArchiveOutputStream(new FileOutputStream(destFile));
taos.setLongFileMode(TarArchiveOutputStream.LONGFILE_POSIX);
for (File file : files) {
//LOG.info("file Name: "+file.getName());
archiveFile(file, taos, "");
}
}
private static void archiveFile(File file, TarArchiveOutputStream taos, String dir) throws Exception {
TarArchiveEntry entry = new TarArchiveEntry(dir + file.getName());
entry.setSize(file.length());
taos.putArchiveEntry(entry);
BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
int count;
byte data[] = new byte[BUFFER];
while ((count = bis.read(data, 0, BUFFER)) != -1) {
taos.write(data, 0, count);
}
bis.close();
taos.closeArchiveEntry();
}
The stack trace looks like a bug in Apache Commons Compress https://issues.apache.org/jira/browse/COMPRESS-223 that has been fixed with version 1.7 (released almost three years ago).

Convert pdf to pdf/a using iText library

I want to export a document with PdfAConformanceLevel.PDF_A_1B conformance, but when I call document.close() I get the error below, and the resulting PDF is not usable.
I use the following iText versions:
<artifactId>itextpdf</artifactId>
<version>5.5.9</version>
<artifactId>itext-pdfa</artifactId>
<version>5.5.9</version>
stack trace:
com.itextpdf.text.pdf.PdfAConformanceException: Real number is out of range.
at com.itextpdf.text.pdf.internal.PdfA1Checker.checkPdfObject(PdfA1Checker.java:259)
at com.itextpdf.text.pdf.internal.PdfAChecker.checkPdfAConformance(PdfAChecker.java:208)
at com.itextpdf.text.pdf.internal.PdfAConformanceImp.checkPdfIsoConformance(PdfAConformanceImp.java:71)
at com.itextpdf.text.pdf.PdfWriter.checkPdfIsoConformance(PdfWriter.java:3480)
at com.itextpdf.text.pdf.PdfWriter.checkPdfIsoConformance(PdfWriter.java:3476)
at com.itextpdf.text.pdf.PdfObject.toPdf(PdfObject.java:174)
at com.itextpdf.text.pdf.PdfArray.toPdf(PdfArray.java:175)
at com.itextpdf.text.pdf.PdfDictionary.toPdf(PdfDictionary.java:149)
at com.itextpdf.text.pdf.PdfStream.superToPdf(PdfStream.java:278)
at com.itextpdf.text.pdf.PRStream.toPdf(PRStream.java:239)
at com.itextpdf.text.pdf.PdfIndirectObject.writeTo(PdfIndirectObject.java:158)
at com.itextpdf.text.pdf.PdfWriter$PdfBody.write(PdfWriter.java:420)
at com.itextpdf.text.pdf.PdfWriter$PdfBody.add(PdfWriter.java:398)
at com.itextpdf.text.pdf.PdfWriter$PdfBody.add(PdfWriter.java:377)
at com.itextpdf.text.pdf.PdfWriter.addToBody(PdfWriter.java:872)
at com.itextpdf.text.pdf.PdfReaderInstance.writeAllVisited(PdfReaderInstance.java:161)
at com.itextpdf.text.pdf.PdfReaderInstance.writeAllPages(PdfReaderInstance.java:177)
at com.itextpdf.text.pdf.PdfWriter.addSharedObjectsToBody(PdfWriter.java:1380)
at com.itextpdf.text.pdf.PdfWriter.close(PdfWriter.java:1264)
at com.itextpdf.text.pdf.PdfAWriter.close(PdfAWriter.java:337)
at com.itextpdf.text.pdf.PdfDocument.close(PdfDocument.java:889)
at com.itextpdf.text.Document.close(Document.java:416)
at si.telekom.erender.ERenderImpl.mergeContentOfItems(ERenderImpl.java:2911)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.sun.xml.ws.api.server.MethodUtil.invoke(MethodUtil.java:83)
at com.sun.xml.ws.api.server.InstanceResolver$1.invoke(InstanceResolver.java:250)
at com.sun.xml.ws.server.InvokerTube$2.invoke(InvokerTube.java:149)
at com.sun.xml.ws.server.sei.SEIInvokerTube.processRequest(SEIInvokerTube.java:88)
at com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:1136)
at com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:1050)
at com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:1019)
at com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:877)
at com.sun.xml.ws.server.WSEndpointImpl$2.process(WSEndpointImpl.java:419)
at com.sun.xml.ws.transport.http.HttpAdapter$HttpToolkit.handle(HttpAdapter.java:868)
at com.sun.xml.ws.transport.http.HttpAdapter.handle(HttpAdapter.java:422)
at com.sun.xml.ws.transport.http.servlet.ServletAdapter.invokeAsync(ServletAdapter.java:225)
at com.sun.xml.ws.transport.http.servlet.WSServletDelegate.doGet(WSServletDelegate.java:161)
at com.sun.xml.ws.transport.http.servlet.WSServletDelegate.doPost(WSServletDelegate.java:197)
at com.sun.xml.ws.transport.http.servlet.WSServlet.doPost(WSServlet.java:81)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:603)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I am producing PDF with following code:
public byte[] mergeContentOfItems(List<MergeItem> items) throws ErenderException {
MessageContext mc = wsCtx.getMessageContext();
HttpServletRequest req = (HttpServletRequest) mc.get(MessageContext.SERVLET_REQUEST);
getLogger().info("Webservice method 'mergeContentOfItems' called from IP:" + req.getRemoteAddr());
if (items.size() < 1) {
String errDescription = "No barcodes specified!";
throw new ErenderException(errDescription, new ErenderExceptionBean("201", errDescription),
new Throwable(errDescription));
}
com.itextpdf.text.Document document = new com.itextpdf.text.Document();
ByteArrayOutputStream baOs = new ByteArrayOutputStream();
PdfWriter writer = null;
List<PdfReader> readers = new ArrayList<PdfReader>();
int totalPages = 0;
try {
// Create a writer for the outputstream
writer = PdfAWriter.getInstance(document, baOs, PdfAConformanceLevel.PDF_A_1B);
writer.setPdfVersion(PdfWriter.PDF_VERSION_1_4);
writer.createXmpMetadata();
//writer = PdfWriter.getInstance(document, baOs);
document.open();
ICC_Profile icc = ICC_Profile
.getInstance(Thread.currentThread().getContextClassLoader().getResourceAsStream("srgb.profile"));
writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
PdfContentByte cb = writer.getDirectContent(); // Holds the PDF
for (int i = 0; i < items.size(); i++) {
String pdfFileName = null;
File urlTempFile = null;
if (items.get(i).getBarcode() != null) {
Template tmpl = TemplatesSynchronizer.getTemplateByBarcode(items.get(i).getBarcode());
String fileName = tmpl.getName();
pdfFileName = fileName.substring(0, fileName.indexOf(".")) + ".pdf";
getLogger().info("\tworking on:" + items.get(i) + " fileName:" + pdfFileName);
if (!new File(pdfFileName).exists()) {
String msg = String.format("Datoteka %s ne obstaja", pdfFileName);
throw new ErenderException("Error", new ErenderExceptionBean("109", msg, new Exception(msg)));
}
} else if (items.get(i).getUrl() != null) {
urlTempFile = File.createTempFile("myTemp", "pdf");
FileUtils.copyURLToFile(new URL(items.get(i).getUrl()), urlTempFile);
}
if (pdfFileName != null || urlTempFile != null) {
PdfReader pdfReader = null;
if (pdfFileName != null)
pdfReader = new PdfReader(pdfFileName);
else if (urlTempFile != null)
pdfReader = new PdfReader(urlTempFile.getAbsolutePath());
if (pdfReader != null) {
// Create Readers for the pdfs.
readers.add(pdfReader);
totalPages += pdfReader.getNumberOfPages();
int pageOfCurrentReaderPDF = 0;
while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
document.newPage();
pageOfCurrentReaderPDF++;
PdfImportedPage page = writer.getImportedPage(pdfReader, pageOfCurrentReaderPDF);
document.setPageSize(pdfReader.getPageSizeWithRotation(pageOfCurrentReaderPDF));
document.newPage();
cb.addTemplate(page, 0, 0);
}
}
if (urlTempFile != null)
urlTempFile.delete();
}
}
} catch (Throwable ex) {
StringWriter errorStringWriter = new StringWriter();
PrintWriter pw = new PrintWriter(errorStringWriter);
ex.printStackTrace(pw);
Logger.getLogger(this.getClass()).error(errorStringWriter.getBuffer().toString());
throw new ErenderException("Error", new ErenderExceptionBean("109", "Napaka v merge metodi.",ex), ex);
} finally {
if (document != null && document.isOpen())
try {
document.close();
} catch (Exception ex) {
StringWriter errorStringWriter = new StringWriter();
PrintWriter pw = new PrintWriter(errorStringWriter);
ex.printStackTrace(pw);
Logger.getLogger(this.getClass()).error(errorStringWriter.getBuffer().toString());
getLogger().error("Unable to close document.\n" + errorStringWriter);
}
if (writer != null && writer.isCloseStream()) {
try {
writer.flush();
writer.close();
} catch (Exception ex) {
getLogger().error("Unable to flush or close writer");
}
}
try {
baOs.flush();
baOs.close();
} catch (Exception ex) {
getLogger().error("Unable to close baOs in mergeContent method.");
}
}
getLogger().info("Webservice method 'mergeContent' called from IP:" + req.getRemoteAddr() + " ended. " + totalPages
+ " merged.");
return baOs.toByteArray();
}
Since I don't get the error with other files, this seems to be specific to the input file. Here is one file that reproduces the error:
http://filebin.ca/2hR2xO1SNlzh/09062009073008005.pdf
First this: iText doesn't convert ordinary PDF documents to PDF/A documents. We have customers who use iText to do this, but their code is much more elaborate than yours.
The reason why iText doesn't convert ordinary PDF documents to PDF/A should be evident: an ordinary PDF might not have all the necessary features that are needed in a PDF/A. You might have a PDF of which the fonts aren't embedded. In that case, someone needs to provide the appropriate font program. iText doesn't ship with any font program, hence the software using iText has to provide this.
In your code, you just copy content streams without checking for any of the issues that make the end result non-compliant with PDF/A. You should be very careful with the resulting PDFs. They will show the blue bar saying that the file claims to be PDF/A, but that doesn't mean the file will validate as PDF/A when you pass it through a validator.
Now for your problem. You want to convert an ordinary PDF to PDF/A-1. PDF/A-1 is based on PDF 1.4 dating from 2001. This means that you can't use any of the new features that were introduced after 2001. In PDF 1.4, there was a limitation with respect to object number. Object numbers in PDF couldn't exceed 32,767. This limitation was removed from PDF in PDF 1.5.
My guess is that the problem you describe is caused by your attempt to create a PDF 1.4 with more objects than is allowed in PDF 1.4. There could be two reasons:
Your original PDF is PDF 1.5 or later,
Your manipulations of the PDF require more than the maximum available number of objects.
This could be fixed by generating PDF/A-2 instead of PDF/A-1, but I'm pretty sure that you'll soon hit other limitations (e.g. missing fonts and other issues caused by creating a file that claims to be PDF/A but isn't). PdfAWriter will throw exceptions when you try to do things that are blatantly wrong, but there's no guarantee that it catches every more subtle PDF/A requirement.
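Note that the exception text itself refers to a real number rather than an object count. The implementation limits listed in the PDF 1.4 reference give roughly ±32767 as the largest real value, so one way to hunt for the offending value in an input file is a range check along the lines of the sketch below. This is not iText's own check; the class name, method name, and sample values are hypothetical.

```java
final class PdfA1RealRangeSketch {

    // Approximate largest real magnitude from the PDF 1.4 implementation limits
    static final double MAX_REAL = 32767.0;

    /** Hypothetical helper: true if the value fits the PDF 1.4 real range. */
    static boolean isRealInRange(double value) {
        return Math.abs(value) <= MAX_REAL;
    }

    public static void main(String[] args) {
        // 595.0 is about the width of an A4 page in points; 40000.5 is an
        // invented out-of-range value such as an oversized coordinate.
        double[] samples = {595.0, -32767.0, 40000.5};
        for (double sample : samples) {
            System.out.println(sample + " in range: " + isRealInRange(sample));
        }
    }
}
```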

Stop Controller from executing again once a request has been made

I'm new to Grails, so I'm hoping someone will be patient and give me a hand. I have a controller that creates a PDF. If the user clicks more than once before the PDF is created, I get the following error. Below is the code for the creation of the PDF.
2016-03-09 09:32:11,549 ERROR errors.GrailsExceptionResolver - SocketException occurred when processing request: [GET] /wetlands-form/assessment/f3458c91-3435-4714-a0e0-3b24de238671/assessment/pdf
Connection reset by peer: socket write error. Stacktrace follows:
java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at mdt.wetlands.AssessmentController$_closure11$$EPeyAg3t.doCall(AssessmentController.groovy:300)
at grails.plugin.cache.web.filter.PageFragmentCachingFilter.doFilter(PageFragmentCachingFilter.java:195)
at grails.plugin.cache.web.filter.AbstractFilter.doFilter(AbstractFilter.java:63)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2016-03-09 09:32:11,549 ERROR errors.GrailsExceptionResolver - IllegalStateException occurred when processing request: [GET] /wetlands-form/assessment/f3458c91-3435-4714-a0e0-3b24de238671/assessment/pdf
getOutputStream() has already been called for this response. Stacktrace follows:
org.codehaus.groovy.grails.web.pages.exceptions.GroovyPagesException: Error processing GroovyPageView: getOutputStream() has already been called for this response
at grails.plugin.cache.web.filter.PageFragmentCachingFilter.doFilter(PageFragmentCachingFilter.java:195)
at grails.plugin.cache.web.filter.AbstractFilter.doFilter(AbstractFilter.java:63)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException: getOutputStream() has already been called for this response
at C__MDTDATA_gg_workspace_new_wetlands_grails_app_views_error_gsp.run(error.gsp:1)
... 5 more
2016-03-09 09:32:11,549 ERROR [/wetlands-form].[grails] - Servlet.service() for servlet grails threw exception
java.lang.IllegalStateException: getOutputStream() has already been called for this response
PDF CODE VIA rendering plugin
def pdf = {
def assessment = lookupAssessment()
if (!assessment){
return
}
// Trac 219 Jasper report for PDF output
Map reportParams = [:]
def report = params.report
def printType = params.printType
def mitigationType = params.mitigationType
def fileName
def fileType
fileType = 'PDF'
def reportDir =
grailsApplication.mainContext.servletContext.getRealPath(""+File.separatorChar+"reports"+File.separatorChar)
def resolver = new SimpleFileResolver(new File(reportDir))
reportParams.put("ASSESS_ID", assessment.id)
reportParams.put("RUN_DIR", reportDir+File.separatorChar)
reportParams.put("JRParameter.REPORT_FILE_RESOLVER", resolver)
reportParams.put("_format", fileType)
reportParams.put("_file", "assessment")
println params
def reportDef = jasperService.buildReportDefinition(reportParams, request.getLocale(), [])
def file = jasperService.generateReport(reportDef).toByteArray()
// Non-inline reports (e.g. PDF)
if (!reportDef.fileFormat.inline && !reportDef.parameters._inline)
{
response.setContentType("APPLICATION/OCTET-STREAM")
response.setHeader("Content-disposition", "attachment; filename=" + assessment.name + "." + reportDef.fileFormat.extension);
response.contentType = reportDef.fileFormat.mimeTyp
response.characterEncoding = "UTF-8"
response.outputStream << reportDef.contentStream.toByteArray()
}
else
{
// Inline report (e.g. HTML)
render(text: reportDef.contentStream, contentType: reportDef.fileFormat.mimeTyp, encoding: reportDef.parameters.encoding ? reportDef.parameters.encoding : 'UTF-8');
}
}
This is the WORD code.
def word = {
def assessment = lookupAssessment()
if (!assessment){
return
}
// get the assessment's data as xml
def assessmentXml = g.render(template: 'word', model: [assessment:assessment]).toString()
// open the Word template
def loader = new LoadFromZipNG()
def template = servletContext.getResourceAsStream('/word/template.docx')
WordprocessingMLPackage wordMLPackage = (WordprocessingMLPackage)loader.get(template)
// get custom xml piece from Word template
String itemId = '{44f68b34-ffd4-4d43-b59d-c40f7b0a2880}' // have to pull up part by ID. Watch out - this may change if you muck with the template!
CustomXmlDataStoragePart customXmlDataStoragePart = wordMLPackage.getCustomXmlDataStorageParts().get(itemId)
CustomXmlDataStorage data = customXmlDataStoragePart.getData()
// and replace it with our assessment's xml
ByteArrayInputStream bs = new ByteArrayInputStream(assessmentXml.getBytes())
data.setDocument(bs) // needs java.io.InputStream
// that's it! the data is in the Word file
// but in order to do the highlighting, we have to manipulate the Word doc directly
// gather the list of cells to highlight
def highlights = assessment.highlights()
// get the main document from the Word file as xml
MainDocumentPart mainDocPart = wordMLPackage.getMainDocumentPart()
def xml = XmlUtils.marshaltoString(mainDocPart.getJaxbElement(), true)
// use the standard Groovy tools to handle the xml
def document = new XmlSlurper(keepWhitespace:true).parseText(xml)
// for each value in highlight list - find node, shade cell and add bold element
highlights.findAll{it != null}.each{highlight ->
def tableCell = document.body.tbl.tr.tc.find{it.sdt.sdtPr.alias.'#w:val' == highlight}
tableCell.tcPr.shd[0].replaceNode{
'w:shd'('w:fill': 'D9D9D9') // shade the cell
}
def textNodes = tableCell.sdt.sdtContent.p.r.rPr
textNodes.each{
it.appendNode{
'w:b'() // bold element
}
}
}
// here's a good way to print out xml for debugging
// System.out.println(new StreamingMarkupBuilder().bindNode(document.body.tbl.tr.tc.find{it.sdt.sdtPr.alias.#'w:val' == '12.1.1'}).toString())
// or save xml to file for study
// File testOut = new File("C:/MDTDATA/wetlands-trunk/xmlout.xml")
// testOut.setText(new StreamingMarkupBuilder().bindNode(document).toString())
// get the updated xml back in the Word doc
Object obj = XmlUtils.unmarshallFromTemplate(new StreamingMarkupBuilder().bindNode(document).toString(), null);
mainDocPart.setJaxbElement((Object)obj)
File file = File.createTempFile('wordexport-', '.docx')
wordMLPackage.save(file)
response.setHeader('Content-Type', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document;')
response.setHeader('Content-Disposition', "attachment; filename=${assessment.name.encodeAsURL()}.docx")
response.setHeader('Content-Length', "${file.size()}")
response.outputStream << file.readBytes()
response.outputStream.flush()
file.delete()
}
// for checking XML during development
def word2 = {
def assessment = lookupAssessment()
if (!assessment){
return
}
render template: 'word', model: [assessment:assessment]
}
You need to catch the exception. If you don't want to do anything with it, leave the catch block empty, as below. If there is still no file after the try/catch, we know something has gone wrong, so we render another view (or the same view) with an error message. The action then returns, so it never reaches the later part that checks the report type (PDF or HTML).
..
//declare file (def means it could be any type of object)
def file
//Now when you expect unexpected behaviour capture it with a try/catch
try {
file = jasperService.generateReport(reportDef).toByteArray()
}catch (Exception e) {
//log.warn (e)
//println "${e} ${e.errors}"
}
//in your scenario or 2nd click the user will hit the catch segment
//and have no file produced that would be in the above try block
//this now says if file == null or if file == ''
// in groovy !file means capture if there nothing defined for file
if (!file) {
//render something else
render 'a message or return to page with error that its in use or something gone wrong'
//return tells your controller to stop what ever else from this point
return
}
//so what ever else would occur will not occur since no file was produced
...
A final note: try/catch blocks are expensive and should not be used everywhere. If you know what to expect, handle the data directly. In scenarios like this one, where a third-party API is outside your control and you cannot turn the unexpected into the expected, you fall back on these methods.
1- Client side: it is better to disable the button on the first click and wait for the response from the server.
2- Catch the exception and do nothing, or just log the error.
// get/set parameters
def file
def reportDef
try{
reportDef = jasperService.buildReportDefinition(reportParams, request.getLocale(), [])
file = jasperService.generateReport(reportDef).toByteArray()
}catch(Exception e){
// print log or do nothing
}
if (file){
// render file according to your conditions
}
else {
// render , return appropriate message.
}
Instead of catching Exception, it's better to catch IOException; otherwise you will be swallowing all other exceptions as well. Here is how I handled it.
private def streamFile(File file) {
def outputStream
try {
response.contentType = "application/pdf"
response.setHeader "Content-disposition", "inline; filename=${file.name}"
outputStream = response.outputStream
file.withInputStream {
response.contentLength = it.available()
outputStream << it
}
outputStream.flush()
}
catch (IOException e){
log.info 'Probably User Cancelled the download!'
}
finally {
if (outputStream != null){
try {
outputStream.close()
} catch (IOException e) {
log.info 'Exception on close'
}
}
}
}
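The difference between catching IOException and catching Exception can be shown with plain JDK streams. The following is a minimal sketch, not the Grails code above: BrokenPipeStream is a hypothetical stand-in for a response stream whose client has disconnected, and stream(...) is an invented helper name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamFileSketch {

    /** Hypothetical stand-in for a response stream whose client went away. */
    static final class BrokenPipeStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("Connection reset by peer");
        }
    }

    /** Copies in to out; returns false if the copy was cut short by an I/O error. */
    static boolean stream(InputStream in, OutputStream out) {
        try {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            out.flush();
            return true;
        } catch (IOException e) {
            // Expected when the user cancels the download: report and carry on.
            // Other exceptions (e.g. NullPointerException) are not caught here
            // and propagate to the caller.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] pdf = {1, 2, 3};
        System.out.println(stream(new ByteArrayInputStream(pdf), new ByteArrayOutputStream()));
        System.out.println(stream(new ByteArrayInputStream(pdf), new BrokenPipeStream()));
    }
}
```

With the narrow catch, a cancelled download is handled quietly, while a programming error such as passing a null stream still surfaces as an exception instead of being silently dropped.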

Split a "tagged" PDF document into multiple documents, keeping the tagging

In a project I have to split a PDF document into two documents, one containing all blank pages, and one containing all pages with content.
For this job, I use a PdfReader to read the source file, and two pdfCopy objects (one for the blank pages document, one for the pages with content document) to write the files to.
I use GetImportedPage to read a PdfImportedPage, which is then added to one of the PdfCopy writers.
Now, the problem is the following: the source file is using the "tagged PDF format". To preserve this (which is absolutely required), I use the SetTagged() method on both PdfCopy writers, and use the extra third parameter in GetImportedPage(...) to keep the tagged format. However, when calling the AddPage(...) on the PdfCopy writer, I get an invalid cast exception:
"Unable to cast object of type 'iTextSharp.text.pdf.PdfDictionary' to type 'iTextSharp.text.pdf.PRIndirectReference'."
Does anyone have any ideas on how to solve this? Any hints?
Also: the project currently references version 5.1.0.0 of the iText libraries. In 5.4.4.0 the third parameter to GetImportedPage does not seem to exist anymore.
Below, you can find a code extract:
iTextSharp.text.Document targetPdf = new iTextSharp.text.Document();
iTextSharp.text.Document blankPdf = new iTextSharp.text.Document();
iTextSharp.text.pdf.PdfReader sourcePdfReader = new iTextSharp.text.pdf.PdfReader(inputFile);
iTextSharp.text.pdf.PdfCopy targetPdfWriter = new iTextSharp.text.pdf.PdfSmartCopy(targetPdf, new FileStream(outputFile, FileMode.Create));
iTextSharp.text.pdf.PdfCopy blankPdfWriter = new iTextSharp.text.pdf.PdfSmartCopy(blankPdf, new FileStream(blanksFile, FileMode.Append));
targetPdfWriter.SetTagged();
blankPdfWriter.SetTagged();
try
{
iTextSharp.text.pdf.PdfImportedPage page = null;
int n = sourcePdfReader.NumberOfPages;
targetPdf.Open();
blankPdf.Open();
blankPdf.Add(new iTextSharp.text.Phrase("This document contains the blank pages removed from " + inputFile));
blankPdf.NewPage();
for (int i = 1; i <= n; i++)
{
byte[] pageBytes = sourcePdfReader.GetPageContent(i);
string pageText = "";
iTextSharp.text.pdf.PRTokeniser token = new iTextSharp.text.pdf.PRTokeniser(new iTextSharp.text.pdf.RandomAccessFileOrArray(pageBytes));
while (token.NextToken())
{
if (token.TokenType == iTextSharp.text.pdf.PRTokeniser.TokType.STRING)
{
pageText += token.StringValue;
}
}
if (pageText.Length >= 15)
{
page = targetPdfWriter.GetImportedPage(sourcePdfReader, i, true);
targetPdfWriter.AddPage(page);
}
else
{
page = blankPdfWriter.GetImportedPage(sourcePdfReader, i, true);
blankPdfWriter.AddPage(page);
blankPageCount++;
}
}
}
catch (Exception ex)
{
Console.WriteLine("Exception at LOC1: " + ex.Message);
}
The error occurs in the call to targetPdfWriter.AddPage(page); near the end of the code sample.
Thank you very much for your help.
Koen.

Attachments not showing up in pdf document - created using pdfbox

I'm trying to attach an SWF file to a PDF document. Below is my code (excerpted from the pdfbox-examples). While I can tell from the file size, with and without the attachment, that the file is attached, I can't see or locate it in the PDF document. The text content is displayed correctly. Can someone tell me what I'm doing wrong and help me fix the issue?
doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage( page );
PDFont font = PDType1Font.HELVETICA_BOLD;
String inputFileName = "sample.swf";
InputStream fileInputStream = new FileInputStream(new File(inputFileName));
PDEmbeddedFile ef = new PDEmbeddedFile(doc, fileInputStream );
PDPageContentStream contentStream = new PDPageContentStream(doc, page,true,true);
//embedded files are stored in a named tree
PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
//first create the file specification, which holds the embedded file
PDComplexFileSpecification fs = new PDComplexFileSpecification();
fs.setEmbeddedFile(ef);
//now let's set some of the optional parameters
ef.setSubtype( "swf" );
ef.setCreationDate( new GregorianCalendar() );
//now add the entry to the embedded file tree and set in the document.
Map<String, COSObjectable> efMap = new HashMap<String, COSObjectable>();
efMap.put("My first attachment", fs );
efTree.setNames( efMap );
//attachments are stored as part of the "names" dictionary in the document catalog
PDDocumentNameDictionary names = new PDDocumentNameDictionary( doc.getDocumentCatalog() );
names.setEmbeddedFiles( efTree );
doc.getDocumentCatalog().setNames( names );
After struggling with the same thing, I've discovered this is a known issue. Attachments haven't worked for a while I guess.
Here's a link to the issue on the apache forum.
There is a hack suggested here that you can use.
I tried it and it worked!
The other workaround I found: after you call setNames on your PDEmbeddedFilesNameTreeNode, remove the limits:
((COSDictionary) efTree.getCOSObject()).removeItem(COSName.LIMITS);
It's an ugly hack, but it works without having to recompile PDFBox.
Attachments work fine with the newer PDFBox 2.0:
public static boolean addAtachement(final String fileName, final String... attachements) {
if (Objects.isNull(fileName)) {
throw new NullPointerException("fileName shouldn't be null");
}
if (Objects.isNull(attachements)) {
throw new NullPointerException("attachements shouldn't be null");
}
Map<String, PDComplexFileSpecification> efMap = new HashMap<String, PDComplexFileSpecification>();
/*
* Load PDF Document.
*/
try (PDDocument doc = PDDocument.load(new File(fileName))) {
/*
* Attachments are stored as part of the "names" dictionary in the
* document catalog
*/
PDDocumentNameDictionary names = new PDDocumentNameDictionary(doc.getDocumentCatalog());
/*
* First we need to get all the existed attachments, after that we
* can add new attachments
*/
PDEmbeddedFilesNameTreeNode efTree = names.getEmbeddedFiles();
if (Objects.isNull(efTree)) {
efTree = new PDEmbeddedFilesNameTreeNode();
}
Map<String, PDComplexFileSpecification> existedNames = efTree.getNames();
if (existedNames != null) {
// Carry the existing attachments over so they are not overwritten below
efMap.putAll(existedNames);
}
for (String attachement : attachements) {
/*
* Create the file specification, which holds the embedded file
*/
PDComplexFileSpecification fs = new PDComplexFileSpecification();
fs.setFile(attachement);
try (InputStream is = new FileInputStream(attachement)) {
/*
* This represents an embedded file in a file specification
*/
PDEmbeddedFile ef = new PDEmbeddedFile(doc, is);
/* Set some relevant properties of embedded file */
ef.setCreationDate(new GregorianCalendar());
fs.setEmbeddedFile(ef);
/*
* now add the entry to the embedded file tree and set in
* the document.
*/
efMap.put(attachement, fs);
}
}
efTree.setNames(efMap);
names.setEmbeddedFiles(efTree);
doc.getDocumentCatalog().setNames(names);
doc.save(fileName);
return true;
} catch (IOException e) {
System.out.println(e.getMessage());
return false;
}
}
To 'locate' or see an attached file in the PDF, you can't flip through its pages to find any trace of it there (like, an annotation).
In Acrobat Reader 9.x for example, you have to click on the "View Attachments" icon (looking like a paper-clip) on the left sidebar.