iTextSharp XML ZUGFeRD-invoice.xml attachment - vb.net

I use iTextSharp 5.5.3 and I need to generate a PDF/A with ConformanceLevel = ZUGFeRD. I am having trouble setting the correct XMP schema flags.
The code is working but I always get the exception
ZUGFeRD XMP schema shall contain attachment name
when I close the writer. The PDF was generated before but does not seem to be compliant with ZUGFeRD.
I don't know how to fix this exception and I really hope someone can help me. I've been working on this problem for 2 days and can't find a way.
Dim document As New Document(PageSize.A4, 0, 0, 0, 0)
Dim writer As PdfAWriter = PdfAWriter.GetInstance(document, New FileStream(tmpPDFDatei, FileMode.Create), PdfAConformanceLevel.ZUGFeRD)
writer.SetPdfVersion(PdfWriter.PDF_VERSION_1_7)
writer.CreateXmpMetadata()
Dim PDFbaseFont As BaseFont = BaseFont.CreateFont(Application.StartupPath & "\Courier Prime.ttf", BaseFont.CP1252, BaseFont.EMBEDDED)
document.Open()
Dim icc As ICC_Profile = ICC_Profile.GetInstance(Application.StartupPath & "\sRGB_IEC61966-2-1_black_scaled.icc")
writer.SetOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc)
Dim cb As PdfContentByte = writer.DirectContent
cb.BeginText()
cb.SetFontAndSize(PDFbaseFont, 10)
cb.ShowTextAligned(PdfContentByte.ALIGN_LEFT, "TEST TEXT", 0, 0, 0)
cb.SetHorizontalScaling(100)
cb.EndText()
Dim Params As PdfDictionary = New PdfDictionary
Params.Put(PdfName.MODDATE, New PdfDate)
Dim fileSpec As PdfFileSpecification = PdfFileSpecification.FileEmbedded(writer, tmpXMLDatei, "ZUGFeRD-invoice.xml", Nothing, False, "text/xml", Params)
fileSpec.Put(New PdfName("AFRelationship"), New PdfName("Alternative"))
writer.AddFileAttachment("ZUGFeRD Invoice", fileSpec)
Dim aRR As PdfArray = New PdfArray
aRR.Add(fileSpec.Reference)
writer.ExtraCatalog.Put(New PdfName("AF"), aRR)
writer.XmpWriter.SetProperty(PdfAXmpWriter.zugferdSchemaNS, PdfAXmpWriter.zugferdDocumentFileName, "ZUGFeRD-invoice.xml")
writer.XmpWriter.SetProperty(PdfAXmpWriter.zugferdSchemaNS, PdfAXmpWriter.zugferdDocumentType, "INVOICE")
document.Close()
writer.Close()

You can solve your problem by removing the following line:
writer.Close()
The writer is closed automatically when you close the Document. The XMP metadata is written to the document when the writer is closed for the first time; at that point the data in the XmpWriter is checked, approved and consumed.
When you close the writer a second time, the XMP data you added is gone. Hence the exception: the check runs again and finds that some ZUGFeRD-related information is missing.
Our problem with ZUGFeRD is that we didn't find a finalized version of the standard in German yet. I don't understand what you mean with the extra question in the comments.
I've made a screen shot of the internal structure of a ZUGFeRD PDF:
As far as I can see, the name of the file is stored in the Name tree of the EmbeddedFiles entry. Are you saying this isn't the case for you?


PDF fill in not merging correctly

We are using an asp.net website with iTextSharp.dll version 5.5.13
We can merge multiple PDF files into one using a stream and it works perfectly. However, when we use a PDF that was created in a "fill-in" function the new PDF file does not correctly merge into the other documents. It merges without the filled in values. However, if I open the filled in PDF that it creates the filled in data displays and prints fine.
I have tried merging the new "filled in" PDF at a later time but it still displays the template file as though the filled in data was missing.
The code below fills in the data:
Dim strFileName As String = Path.GetFileNameWithoutExtension(strSourceFile)
Dim strOutPath As String = HttpContext.Current.Server.MapPath("~/Apps/Lifetime/OfficeDocs/Export/")
newFile = strOutPath & strFileName & " " & strRONumber & ".pdf"
If Not File.Exists(newFile) Then
Dim pdfReader As PdfReader = New PdfReader(strSourceFile)
Using pdfStamper As PdfStamper = New PdfStamper(pdfReader, New FileStream(newFile, FileMode.Create))
Dim pdfFormFields As AcroFields = pdfStamper.AcroFields
pdfFormFields.SetField("CUSTOMER NAME", strCustomer)
pdfFormFields.SetField("YR MK MODEL", strVehicle)
pdfFormFields.SetField("RO#", strRONumber)
pdfStamper.FormFlattening = False
pdfStamper.Dispose()
End Using
End If
The code below then merges the PDF files whose paths are passed to it:
Public Shared Sub MergePDFs(ByVal files As List(Of String), ByVal filename As String)
'Gets a list of full path files and merges into one memory stream
'and outputs it to a browser response.
Dim MemStream As New System.IO.MemoryStream
Dim doc As New iTextSharp.text.Document
Dim reader As iTextSharp.text.pdf.PdfReader
Dim numberOfPages As Integer
Dim currentPageNumber As Integer
Dim writer As iTextSharp.text.pdf.PdfWriter = iTextSharp.text.pdf.PdfWriter.GetInstance(doc, MemStream)
doc.Open()
Dim cb As iTextSharp.text.pdf.PdfContentByte = writer.DirectContent
Dim page As iTextSharp.text.pdf.PdfImportedPage
Dim strError As String = ""
For Each strfile As String In files
reader = New iTextSharp.text.pdf.PdfReader(strfile)
numberOfPages = reader.NumberOfPages
currentPageNumber = 0
Do While (currentPageNumber < numberOfPages)
currentPageNumber += 1
doc.SetPageSize(reader.GetPageSizeWithRotation(currentPageNumber))
doc.NewPage()
page = writer.GetImportedPage(reader, currentPageNumber)
cb.AddTemplate(page, 0, 0)
Loop
Next
doc.Close()
doc.Dispose()
If MemStream Is Nothing Then
HttpContext.Current.Response.Write("No Data is available for output")
Else
HttpContext.Current.Response.Clear()
HttpContext.Current.Response.ContentType = "application/pdf"
HttpContext.Current.Response.AppendHeader("Content-Disposition", "inline; filename=" + filename)
HttpContext.Current.Response.BinaryWrite(MemStream.ToArray)
HttpContext.Current.Response.OutputStream.Flush()
HttpContext.Current.Response.OutputStream.Close()
HttpContext.Current.Response.OutputStream.Dispose()
MemStream.Close()
MemStream.Dispose()
End If
End Sub
I expect the "filled in" PDF in the list of files to retain the filled in data but it does not. Even if I try to merge the filled in file later it still comes up missing the filled in data. If I print the filled in file it looks perfect.
PdfWriter.GetImportedPage only returns you a copy of the page contents. This does not include any annotations, in particular not the widget annotations of form fields on the page at hand.
To also copy the annotations of the source pages, use the iText PdfCopy class instead. This class is designed to copy pages including all annotations. Furthermore, it includes methods to copy all pages of a source document in one step.
You have to tell the PdfCopy object to merge fields, otherwise the document-wide form structure won't be built.
As an aside, your code creates many PdfReader objects but does not close them. That may increase your memory requirements substantially.
Thus:
Public Shared Sub MergePDFsImproved(ByVal files As List(Of String), ByVal filename As String)
Using mem As New MemoryStream()
Dim readers As New List(Of PdfReader)
Using doc As New Document
Dim copy As New PdfCopy(doc, mem)
copy.SetMergeFields()
doc.Open()
For Each strfile As String In files
Dim reader As New PdfReader(strfile)
copy.AddDocument(reader)
readers.Add(reader)
Next
End Using
For Each reader As PdfReader In readers
reader.Close()
Next
HttpContext.Current.Response.Clear()
HttpContext.Current.Response.ContentType = "application/pdf"
HttpContext.Current.Response.AppendHeader("Content-Disposition", "inline; filename=" + filename)
HttpContext.Current.Response.BinaryWrite(mem.ToArray)
HttpContext.Current.Response.OutputStream.Flush()
HttpContext.Current.Response.OutputStream.Close()
HttpContext.Current.Response.OutputStream.Dispose()
End Using
End Sub
Actually, I'm not sure whether it is a good idea to Close and Dispose the response output stream here; that shouldn't be the responsibility of a PDF merging method.
This is a related answer for the Java version of iText; you may want to read it for additional information. Unfortunately, many links in that answer have since gone dead.
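One way to act on the remark about the response stream is to let the merge routine return the merged bytes and leave the HTTP response to the caller. The following is a minimal sketch under that assumption. The module name PdfMergeSketch and the function name MergePdfsToBytes are made up for illustration; the iTextSharp calls are the same ones used in MergePDFsImproved above.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Public Module PdfMergeSketch
    ' Hypothetical helper: merges the given files (form fields included)
    ' and returns the result as a byte array. It does not touch HttpContext.
    Public Function MergePdfsToBytes(ByVal files As List(Of String)) As Byte()
        Dim readers As New List(Of PdfReader)
        Using mem As New MemoryStream()
            Try
                Using doc As New Document()
                    Dim copy As New PdfCopy(doc, mem)
                    copy.SetMergeFields()
                    doc.Open()
                    For Each filePath As String In files
                        Dim reader As New PdfReader(filePath)
                        readers.Add(reader)
                        copy.AddDocument(reader)
                    Next
                End Using
            Finally
                ' Close the readers only after the document has been closed,
                ' and also when something went wrong along the way.
                For Each reader As PdfReader In readers
                    reader.Close()
                Next
            End Try
            ' ToArray still works after the stream has been closed.
            Return mem.ToArray()
        End Using
    End Function
End Module
```

The caller could then pass the returned array to Response.BinaryWrite and decide for itself when to flush or end the response.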

Increase page size in PDF documents to fit barcode (itextsharp)

I'm using vb.net to build a workflow where I'm processing a number of PDF files. One of the things I need to do is to place a barcode in the bottom left corner of the first page on each PDF document.
I already worked out the code to place the barcode but the problem is that it may cover existing content on the first page.
I want to increase the page size and add about 40 pixels of white space at the bottom of the first page where I can place the barcode. But I don't know how to do this!
Here is the existing code:
Public Sub addBarcodeToPdf(byval openPDFpath as string, byval savePDFpath as string, ByVal barcode As String)
Dim myPdf As PdfReader
Try
myPdf = New PdfReader(openPDFpath)
Catch ex As Exception
logEvent("LOAD PDF EXCEPTION " & ex.Message)
End Try
Dim stamper As PdfStamper = New PdfStamper(myPDF, New FileStream(savePDFpath, FileMode.Create))
Dim over As PdfContentByte = stamper.GetOverContent(1)
Dim textFont As BaseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED)
Dim BarcodeFont As BaseFont = BaseFont.CreateFont("c:\windows\fonts\FRE3OF9X.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED)
over.SetColorFill(BaseColor.BLACK)
over.BeginText()
over.SetFontAndSize(textFont, 15)
over.SetTextMatrix(30, 3)
over.ShowText(barcode)
over.EndText()
over.BeginText()
over.SetFontAndSize(BarcodeFont, 28)
over.SetTextMatrix(5, 16)
over.ShowText("*" & barcode & "*")
over.EndText()
stamper.Close()
myPdf.Close()
End Sub
Thank you in advance!
/M
Thank you Bruno for pointing me in the right direction. I haven't done volume testing yet, but I managed to get it to work on one PDF sample. Just changing the mediabox was not enough (I could only make the page size smaller), but when I changed the cropbox at the same time I got the results I was looking for.
Code in VB below for reference
Dim myPdf As PdfReader
Try
myPdf = New PdfReader(openPDFpath)
Catch ex As Exception
logEvent("LOAD PDF EXCEPTION " & ex.Message)
End Try
Dim stamper As PdfStamper = New PdfStamper(myPdf, New FileStream(savePDFpath, FileMode.Create))
Dim pageDict As PdfDictionary = myPdf.GetPageN(1)
Dim mediabox As PdfArray = pageDict.GetAsArray(PdfName.MEDIABOX)
Dim cropbox As PdfArray = pageDict.GetAsArray(PdfName.CROPBOX)
Dim pixelsToAdd As Integer = -40
mediabox.Set(1, New PdfNumber(pixelsToAdd))
cropbox.Set(1, New PdfNumber(pixelsToAdd))
stamper.Close()
myPdf.Close()
Thanks
/M
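The code above writes -40 directly into both boxes. That gives the intended result when the lower edge of each box was at 0 and the page actually has a CropBox entry. The sketch below is one possible variant that subtracts from the existing lower edge and skips a box that is missing. The names PageBoxSketch, ExtendBoxDownward and AddBottomMargin are invented for this example. It also assumes that each box is stored as [lower-left x, lower-left y, upper-right x, upper-right y] and that the page is not rotated.

```vbnet
Imports System.IO
Imports iTextSharp.text.pdf

Public Module PageBoxSketch
    ' Moves the lower edge of a page box down by extraPoints.
    ' Does nothing when the box is absent or malformed.
    Private Sub ExtendBoxDownward(ByVal box As PdfArray, ByVal extraPoints As Single)
        If box Is Nothing OrElse box.Size < 4 Then Return
        Dim lowerY As PdfNumber = box.GetAsNumber(1)
        If lowerY Is Nothing Then Return
        box.Set(1, New PdfNumber(lowerY.FloatValue - extraPoints))
    End Sub

    ' Hypothetical wrapper: enlarges page 1 at the bottom and saves a copy.
    Public Sub AddBottomMargin(ByVal openPDFpath As String, ByVal savePDFpath As String, ByVal extraPoints As Single)
        Dim reader As New PdfReader(openPDFpath)
        Try
            Dim pageDict As PdfDictionary = reader.GetPageN(1)
            ExtendBoxDownward(pageDict.GetAsArray(PdfName.MEDIABOX), extraPoints)
            ExtendBoxDownward(pageDict.GetAsArray(PdfName.CROPBOX), extraPoints)
            Using fs As New FileStream(savePDFpath, FileMode.Create, FileAccess.Write, FileShare.None)
                Dim stamper As New PdfStamper(reader, fs)
                stamper.Close()
            End Using
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```

A call such as AddBottomMargin(openPDFpath, savePDFpath, 40) would add 40 points below the existing content. Anything drawn into the new strip then needs y coordinates below the old lower edge.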

iTextSharp PdfStamper add Header/Footer

I am trying to add a header or footer to pages within a pdf document. This is explained in the iTextInAction book as the correct way to add direct content to a page. However when I try to open this document in Adobe I get the following error, and have some issues printing as well. Any ideas?
Dim reader As PdfReader = Nothing
Dim stamper As PdfStamper = Nothing
Try
reader = New PdfReader(inputFile)
stamper = New PdfStamper(reader, New IO.FileStream(outputFile, IO.FileMode.Append))
Dim fontSz As Single = 10.0F
Dim font As New Font(font.FontFamily.HELVETICA, fontSz, 1, BaseColor.GRAY)
Dim chunk As New Chunk(headerText, font)
Dim rect As Rectangle = reader.GetPageSizeWithRotation(1)
Here I am just adjusting the size of the text to make sure it fits within the page boundaries
While chunk.GetWidthPoint() > rect.Width
fontSz -= 1.0F
font = New Font(font.FontFamily.HELVETICA, fontSz, 1, BaseColor.GRAY)
chunk = New Chunk(wm.ToString(), font)
End While
This is where I get the overcontent and add my text to it
For pageNo As Int32 = 1 To reader.NumberOfPages
Dim phrase As New Phrase(chunk)
Dim x As Single = (rect.Width / 2) - (phrase.Chunks(0).GetWidthPoint() / 2)
Dim y As Single = If(wm.WatermarkPosition = "Header", rect.Height - font.Size, 1.0F)
Dim canvas As PdfContentByte = stamper.GetOverContent(pageNo)
canvas.BeginText()
ColumnText.ShowTextAligned(canvas, Element.ALIGN_LEFT, phrase, x, y, 0)
canvas.EndText()
Next
Catch ex As iTextSharp.text.pdf.BadPasswordException
Throw New InvalidOperationException("Page extraction is not supported for this pdf document. It must be allowed in order to add a watermark.")
Finally
reader.Close()
stamper.Close()
End Try
Your problem is probably this line:
stamper = New PdfStamper(reader, New IO.FileStream(outputFile, IO.FileMode.Append))
You are telling .NET to open the file in append mode. If the file doesn't exist, it is created; if it already exists, the new PDF is written after the old contents, which produces a corrupt PDF. You should change this to IO.FileMode.Create.
Also, while you're at it, I usually recommend being even more explicit when you create the FileStream and telling .NET (and thus Windows) what else you intend to do with the stream. In this case you are only ever going to write to it, so you can say FileAccess.Write, and while you are writing to it you want to make sure no one else attempts to read from it (since it would be in an invalid state), so you can say FileShare.None:
stamper = New PdfStamper(reader, New FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
(As an aside, although IO.FileMode.Create is absolutely valid, it is really weird to see. Most people either spell it out as System.IO.FileMode.Create or import System.IO and then just use FileMode.Create.)
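Putting the corrected FileStream into context, a stamping routine could look roughly like the sketch below. The names HeaderStampSketch and StampHeader, the fixed font size and the 20-point offset from the top edge are illustrative assumptions, not part of the original code. The sketch does not wrap ColumnText.ShowTextAligned in BeginText/EndText, on the understanding that ColumnText handles the text object itself.

```vbnet
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Public Module HeaderStampSketch
    ' Hypothetical helper: writes headerText centered near the top of every page.
    Public Sub StampHeader(ByVal inputFile As String, ByVal outputFile As String, ByVal headerText As String)
        Dim reader As New PdfReader(inputFile)
        Try
            ' FileMode.Create replaces an existing file instead of appending to it.
            Using fs As New FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)
                Dim stamper As New PdfStamper(reader, fs)
                Dim headerFont As New Font(Font.FontFamily.HELVETICA, 10.0F, Font.BOLD, BaseColor.GRAY)
                For pageNo As Integer = 1 To reader.NumberOfPages
                    ' Measure each page separately; sizes may differ between pages.
                    Dim rect As Rectangle = reader.GetPageSizeWithRotation(pageNo)
                    Dim canvas As PdfContentByte = stamper.GetOverContent(pageNo)
                    Dim x As Single = (rect.Left + rect.Right) / 2
                    Dim y As Single = rect.Top - 20
                    ColumnText.ShowTextAligned(canvas, Element.ALIGN_CENTER, New Phrase(headerText, headerFont), x, y, 0)
                Next
                ' Closing the stamper writes the PDF and closes the stream.
                stamper.Close()
            End Using
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```

Here the stamper is closed before the reader, and the reader is closed in the Finally block only, so a failure while opening the output file does not leave the input file locked.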

PDF Found but failed to open for iTextSharp

I am using iTextSharp and the below code worked up to last week so I am stumped, I suspect an iTextSharp update.
The PDF file is found but then will not open for editing.
Error line (full error at the bottom):
If System.IO.File.Exists(sourceFile) Then ' found here
reader = New iTextSharp.text.pdf.PdfReader(sourceFile) 'fails here, see error at bottom of query
Sourcefile is from the same website: www.website.com/folder/pdftest.pdf and I have tried local as well i.e. c:'... pdftest.pdf
All code:
Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
Dim stamper As iTextSharp.text.pdf.PdfStamper = Nothing
Dim img As iTextSharp.text.Image = Nothing
Dim img1 As iTextSharp.text.Image = Nothing
Dim underContent As iTextSharp.text.pdf.PdfContentByte = Nothing
Dim overContent As iTextSharp.text.pdf.PdfContentByte = Nothing
Dim rect As iTextSharp.text.Rectangle = Nothing
'Dim X, Y As Single
Dim pageCount As Integer = 0
If System.IO.File.Exists(sourceFile) Then
reader = New iTextSharp.text.pdf.PdfReader(sourceFile)
rect = reader.GetPageSizeWithRotation(1)
stamper = New iTextSharp.text.pdf.PdfStamper(reader, New System.IO.FileStream(outputFile, System.IO.FileMode.Create))
pageCount = reader.NumberOfPages()
For i As Integer = 1 To pageCount
'#############
overContent = stamper.GetOverContent(i) ' can be over or under the existing layers
watermarkFont = iTextSharp.text.pdf.BaseFont.CreateFont(iTextSharp.text.pdf.BaseFont.HELVETICA, iTextSharp.text.pdf.BaseFont.CP1252, iTextSharp.text.pdf.BaseFont.NOT_EMBEDDED)
watermarkFontColor = iTextSharp.text.Basecolor.BLACK
overContent.BeginText() ' black set text first
overContent.SetFontAndSize(watermarkFont, 22)
overContent.SetColorFill(watermarkFontColor)
overContent.ShowTextAligned(Element.ALIGN_CENTER, "This is test", 300, 625, 0)
overContent.ShowTextAligned(Element.ALIGN_CENTER, "Successfully completed", 300, 475, 0)
overContent.ShowTextAligned(Element.ALIGN_CENTER, "A PDF Text", 300, 325, 0)
overContent.ShowTextAligned(Element.ALIGN_CENTER, "on", 300, 275, 0)
overContent.EndText()
Next
stamper.Close()
reader.Close()
Error:
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.IO.IOException: C:\sites\www\gateway\admin\maintenance\admin\blank.pdf not found as file or resource.
Source Error:
Line 229:
Line 230: If System.IO.File.Exists(sourceFile) Then
Line 231: reader = New iTextSharp.text.pdf.PdfReader(sourceFile)
Line 232:
Line 233:
The suggestion from @Chris-Haas was the answer, without changing any settings:
Dim myBytes = System.IO.File.ReadAllBytes(sourceFile)
reader = New iTextSharp.text.pdf.PdfReader(myBytes)
Check to see if the itextsharp.dll file is being blocked in Windows. Right click the itextsharp.dll file and choose properties. At the bottom of the general tab there is probably an Unblock button. Click that button.
This would explain why System.IO can read the file but iTextSharp cannot.

Controlling Image Resolution when converting a PNG image into a PDF using iTextSharp

I have created a PNG image that is 200 DPI, and perfectly sized for a landscape A4 page size. I needed to convert this to a PDF document, so I've used the iTextSharp library with the code below.
This all works, however the image quality has degraded. Any suggestions as to how I might improve this?
Public Sub ConvertPNGtoPDF(ByVal inputFile As String, ByVal outputFile As String)
Using fs As New FileStream(outputFile, FileMode.Create, FileAccess.ReadWrite, FileShare.None)
Dim document As New Document(PageSize.A4.Rotate, 0, 0, 0, 0)
Dim writer As PdfWriter = PdfWriter.GetInstance(document, fs)
document.Open()
Dim cb As PdfContentByte = writer.DirectContent
Using bm As New Bitmap(inputFile)
Dim total As Integer = bm.GetFrameCount(FrameDimension.Page)
For k As Integer = 0 To total - 1
bm.SelectActiveFrame(FrameDimension.Page, k)
Dim img As iTextSharp.text.Image = iTextSharp.text.Image.GetInstance(bm, Nothing, False)
img.SetDpi(200, 200)
img.ScalePercent(72.0F / 200.0F * 100)
img.SetAbsolutePosition(0, 0)
cb.AddImage(img)
document.NewPage()
Next
End Using
document.Close()
writer.Close()
End Using
End Sub
Looking at the code in PngImage, it looks like iText doesn't support PNG compression as a PDF-native filter, so the image has to be decompressed and recompressed as Something Else. Is this because the PDF spec doesn't support it?
Just checked, sure looks that way.
Best fix? JPEG and JPEG2000 are supported as "native" compression types within PDF (and this is echoed in iText[sharp]). So use JPEG[2k] instead, and run your images through your image conversion library of choice if need be.
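As a rough illustration of the JPEG route, the sketch below re-encodes the PNG as a JPEG with System.Drawing and hands the JPEG bytes to iTextSharp. The names PngToPdfSketch and ConvertPngViaJpeg and the quality parameter are assumptions made for this example. JPEG is lossy, so whether the result actually looks better than the original approach needs to be checked on real images. The scale factor follows the question: 72 / 200 * 100 = 36 percent, which maps 200 pixels to one inch on the page.

```vbnet
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.IO
Imports iTextSharp.text.pdf

Public Module PngToPdfSketch
    ' Hypothetical helper: quality is the JPEG encoder quality, 0 to 100.
    Public Sub ConvertPngViaJpeg(ByVal inputFile As String, ByVal outputFile As String, ByVal quality As Long)
        Dim jpegBytes As Byte() = Nothing

        ' Step 1: re-encode the PNG as a JPEG in memory.
        Using bm As New Bitmap(inputFile)
            Dim jpegCodec As ImageCodecInfo = Nothing
            For Each codec As ImageCodecInfo In ImageCodecInfo.GetImageEncoders()
                If codec.FormatID = ImageFormat.Jpeg.Guid Then jpegCodec = codec
            Next
            Using encoderParams As New EncoderParameters(1)
                encoderParams.Param(0) = New EncoderParameter(System.Drawing.Imaging.Encoder.Quality, quality)
                Using ms As New MemoryStream()
                    bm.Save(ms, jpegCodec, encoderParams)
                    jpegBytes = ms.ToArray()
                End Using
            End Using
        End Using

        ' Step 2: place the JPEG on a landscape A4 page without margins.
        Using fs As New FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)
            Dim document As New iTextSharp.text.Document(iTextSharp.text.PageSize.A4.Rotate(), 0, 0, 0, 0)
            PdfWriter.GetInstance(document, fs)
            document.Open()
            Dim img As iTextSharp.text.Image = iTextSharp.text.Image.GetInstance(jpegBytes)
            img.ScalePercent(72.0F / 200.0F * 100.0F)
            img.SetAbsolutePosition(0, 0)
            document.Add(img)
            ' Closing the document also closes the writer.
            document.Close()
        End Using
    End Sub
End Module
```

The iTextSharp types are written out in full here because System.Drawing and iTextSharp.text both define an Image class.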