How to highlight text or a word in a PDF file using iTextSharp?

I need to search for a word in an existing PDF file, highlight that text or word, and save the PDF file.
I have an idea: using PdfAnnotation.CreateMarkup, we could find the position of the text and add a background color to it, but I don't know how to implement it :(
Please help me out.

This is one of those "sounds easy but is actually really complicated" things. See Mark's posts here and here. Ultimately you'll probably be pointed to LocationTextExtractionStrategy. Good luck! If you actually find out how to do it, post it here; there are several people wondering exactly what you are wondering!

I've found how to do this. In case someone needs to get words or sentences with their locations (coordinates) from a PDF document, you'll find an example project here. I used VB.NET 2010 for this. Remember to add a reference to your iTextSharp DLL in the project.
I added my own text extraction strategy class, based on the class LocationTextExtractionStrategy. I focused on TextChunks, because they already have these coordinates.
There are some known limitations:
No multi-line searches (phrases); only characters, words, or a one-line sentence are allowed.
It won't work with rotated text.
I didn't test on PDFs with landscape page orientation, but I assume some modifications may be required for this.
In case you need to draw these highlights/rectangles over a watermark, you'll need to add or modify some code, but only in the Form; this is not related to the text/location extraction process.

@Jcis, I actually managed a workaround for handling multiple searches using your example as a starting point. I use your project as a reference in a C# project, and altered what it does. Instead of just highlighting, I have it draw a white rectangle around the search term and then, using the rectangle coordinates, place a form field. I also had to switch the content byte to GetOverContent so that the searched text is blocked out entirely. I created a string array of search terms and then, in a loop, create as many text fields as I need.
Test.Form1 formBuilder = new Test.Form1();
string[] fields = new string[] { "%AccountNumber%", "%MeterNumber%", "%EmailFieldHolder%", "%AddressFieldHolder%", "%EmptyFieldHolder%", "%CityStateZipFieldHolder%", "%emptyFieldHolder1%", "%emptyFieldHolder2%", "%emptyFieldHolder3%", "%emptyFieldHolder4%", "%emptyFieldHolder5%", "%emptyFieldHolder6%", "%emptyFieldHolder7%", "%emptyFieldHolder8%", "%SiteNameFieldHolder%", "%SiteNameFieldHolderWithExtraSpace%" };
// Ping-pong between two files: each pass reads the previous pass's output
// and writes to the other file.
string source = htmlToPdf;
string destination = finalhtmlToPdf;
foreach (string field in fields)
{
    // "%EmailFieldHolder%" -> "EmailFieldHolder" -> "Email"
    string placeholder = field.Split('%')[1];
    string fieldName = Regex.Split(placeholder, "Field")[0];
    formBuilder.PDFTextGetter(field, StringComparison.CurrentCultureIgnoreCase, source, destination, fieldName);
    File.Delete(source);
    // Swap roles for the next search term.
    string previousSource = source;
    source = destination;
    destination = previousSource;
}
// After the loop, `source` holds the path of the finished PDF.
It bounces the PDFTextGetter function from your example back and forth between two files until I have the finished product. It works really well, and it would not have been possible without your initial project, so thank you for that. I also altered your VB to do the text field mapping, like so:
For Each rect As iTextSharp.text.Rectangle In MatchesFound
    'Add a rectangle over the match area, slightly taller than the match itself
    cb.Rectangle(rect.Left, rect.Bottom + 1, rect.Width, rect.Height + 4)
    'Create a text field over the match; the running counter keeps the field names unique
    Dim field As New TextField(stamper.Writer, rect, FieldName & Fields)
    stamper.AddAnnotation(field.GetTextField(), page)
    Fields += 1
Next
Just figured I would share what I managed to do with your project as a backbone. It even increments the field names as I need them to. I also had to add a new parameter to your function, but that's not worth listing here. Thank you again for this great head start.

Thanks, Jcis!
After a couple of hours of research and thinking, I found your solution, which helped me solve my problems.
There were two little bugs.
First: the stamper needs to be closed before the reader, otherwise it throws an exception.
Public Sub PDFTextGetter(ByVal pSearch As String, ByVal SC As StringComparison, ByVal SourceFile As String, ByVal DestinationFile As String)
Dim stamper As iTextSharp.text.pdf.PdfStamper = Nothing
Dim cb As iTextSharp.text.pdf.PdfContentByte = Nothing
Me.Cursor = Cursors.WaitCursor
If File.Exists(SourceFile) Then
Dim pReader As New PdfReader(SourceFile)
stamper = New iTextSharp.text.pdf.PdfStamper(pReader, New System.IO.FileStream(DestinationFile, FileMode.Create))
PB.Value = 0 : PB.Maximum = pReader.NumberOfPages
For page As Integer = 1 To pReader.NumberOfPages
Dim strategy As myLocationTextExtractionStrategy = New myLocationTextExtractionStrategy
'cb = stamper.GetUnderContent(page)
cb = stamper.GetOverContent(page)
Dim state As New PdfGState()
state.FillOpacity = 0.3F
cb.SetGState(state)
'Send some data contained in PdfContentByte. The first value always looks like zero for me and the second 100, but I'm not sure if this could change in some cases
strategy.UndercontentCharacterSpacing = cb.CharacterSpacing
strategy.UndercontentHorizontalScaling = cb.HorizontalScaling
'It's not really needed to get the text back, but we have to call this line ALWAYS,
'because it triggers the process that will get all chunks from PDF into our strategy Object
Dim currentText As String = PdfTextExtractor.GetTextFromPage(pReader, page, strategy)
'The real getter process starts in the following line
Dim MatchesFound As List(Of iTextSharp.text.Rectangle) = strategy.GetTextLocations(pSearch, SC)
'Set the fill color of the shapes. I don't use a border because it would make the rect bigger,
'but a thin border could be a solution if the current rect is not big enough to cover all the text it should cover
cb.SetColorFill(BaseColor.PINK)
'MatchesFound contains all text with locations, so do whatever you want with it. The loop below highlights each match in YELLOW (overriding the PINK set above):
For Each rect As iTextSharp.text.Rectangle In MatchesFound
' cb.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height)
cb.SaveState()
cb.SetColorFill(BaseColor.YELLOW)
cb.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height)
cb.Fill()
cb.RestoreState()
Next
'cb.Fill()
PB.Value = PB.Value + 1
Next
stamper.Close()
pReader.Close()
End If
Me.Cursor = Cursors.Default
End Sub
Second: your solution doesn't work when the searched text is in the last line of the extracted text.
Public Function GetTextLocations(ByVal pSearchString As String, ByVal pStrComp As System.StringComparison) As List(Of iTextSharp.text.Rectangle)
Dim FoundMatches As New List(Of iTextSharp.text.Rectangle)
Dim sb As New StringBuilder()
Dim ThisLineChunks As List(Of TextChunk) = New List(Of TextChunk)
Dim bStart As Boolean, bEnd As Boolean
Dim FirstChunk As TextChunk = Nothing, LastChunk As TextChunk = Nothing
Dim sTextInUsedChunks As String = vbNullString
' For Each chunk As TextChunk In locationalResult
For j As Integer = 0 To locationalResult.Count - 1
Dim chunk As TextChunk = locationalResult(j)
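'A line is processed when the next chunk starts a new line, or when we reach the last chunk.
'Note: the current chunk is appended only after this check (see ThisLineChunks.Add below),
'so when j is the final index the final chunk itself is not part of the line being searched.
If ThisLineChunks.Count > 0 AndAlso (Not chunk.SameLine(ThisLineChunks.Last) OrElse j = locationalResult.Count - 1) Then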
If sb.ToString.IndexOf(pSearchString, pStrComp) > -1 Then
Dim sLine As String = sb.ToString
'Check how many times the Search String is present in this line:
Dim iCount As Integer = 0
Dim lPos As Integer
lPos = sLine.IndexOf(pSearchString, 0, pStrComp)
Do While lPos > -1
iCount += 1
If lPos + pSearchString.Length > sLine.Length Then Exit Do Else lPos = lPos + pSearchString.Length
lPos = sLine.IndexOf(pSearchString, lPos, pStrComp)
Loop
'Process each match found in this Text line:
Dim curPos As Integer = 0
For i As Integer = 1 To iCount
Dim sCurrentText As String, iFromChar As Integer, iToChar As Integer
iFromChar = sLine.IndexOf(pSearchString, curPos, pStrComp)
curPos = iFromChar
iToChar = iFromChar + pSearchString.Length - 1
sCurrentText = vbNullString
sTextInUsedChunks = vbNullString
FirstChunk = Nothing
LastChunk = Nothing
'Get first and last Chunks corresponding to this match found, from all Chunks in this line
For Each chk As TextChunk In ThisLineChunks
sCurrentText = sCurrentText & chk.text
'Check if we entered the part where we had found a matching String then get this Chunk (First Chunk)
If Not bStart AndAlso sCurrentText.Length - 1 >= iFromChar Then
FirstChunk = chk
bStart = True
End If
'Keep getting Text from Chunks while we are in the part where the matching String had been found
If bStart And Not bEnd Then
sTextInUsedChunks = sTextInUsedChunks & chk.text
End If
'If we get out the matching String part then get this Chunk (last Chunk)
If Not bEnd AndAlso sCurrentText.Length - 1 >= iToChar Then
LastChunk = chk
bEnd = True
End If
'If we already have first and last Chunks enclosing the Text where our String pSearchString has been found
'then it's time to get the rectangle, GetRectangleFromText Function below this Function, there we extract the pSearchString locations
If bStart And bEnd Then
FoundMatches.Add(GetRectangleFromText(FirstChunk, LastChunk, pSearchString, sTextInUsedChunks, iFromChar, iToChar, pStrComp))
curPos = curPos + pSearchString.Length
bStart = False : bEnd = False
Exit For
End If
Next
Next
End If
sb.Clear()
ThisLineChunks.Clear()
End If
ThisLineChunks.Add(chunk)
sb.Append(chunk.text)
Next
Return FoundMatches
End Function
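
The function above first counts the occurrences of the search string in a line and then looks each one up again. As a minimal sketch of that step in isolation, the helper below collects the start index of every non-overlapping match in a single pass. The names MatchFinder and FindMatchPositions are hypothetical and not part of Jcis's project or of iTextSharp; only String.IndexOf from the .NET base library is used.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module MatchFinder
    ' Hypothetical helper: returns the start index of each non-overlapping
    ' occurrence of search inside line, using the given comparison.
    Public Function FindMatchPositions(ByVal line As String, ByVal search As String, ByVal comparison As StringComparison) As List(Of Integer)
        Dim positions As New List(Of Integer)
        If String.IsNullOrEmpty(line) OrElse String.IsNullOrEmpty(search) Then Return positions

        Dim pos As Integer = line.IndexOf(search, 0, comparison)
        Do While pos > -1
            positions.Add(pos)
            ' Continue after the end of this match so matches never overlap.
            Dim nextStart As Integer = pos + search.Length
            If nextStart >= line.Length Then Exit Do
            pos = line.IndexOf(search, nextStart, comparison)
        Loop
        Return positions
    End Function
End Module
```

Each returned index could then play the role of iFromChar, with iToChar being that index plus the search length minus one.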

Related

vb- How to assign a different tag property to different array words?

This is my code for inserting from textfile to array to labels, but I want to be able to assign a tag property onto some of the words or perhaps use the 'Answer' which is located a line below on my text file??
IndexNo = 0
Dim FileTerm As String = "D:\soccer.txt"
Dim FileNum As Integer = FreeFile()
FileOpen(FileNum, FileTerm, OpenMode.Input)
Do
Term(IndexNo) = LineInput(FileNum)
Answer(IndexNo) = LineInput(FileNum)
IndexNo = IndexNo + 1
Loop Until EOF(FileNum)
FileClose(FileNum)
Dim Obj As Object, Count As Integer = 0
For Each Obj In Me.Controls
If TypeOf Obj Is Label Then
MyLabels(Count) = Obj
Count = Count + 1
End If
Next
Dim Random1, Random2 As Integer
Dim TempTerm, TempAnswer As Object
For Count = 0 To 15
Randomize()
Random1 = Val(Int(16 * Rnd()))
Random2 = Val(Int(16 * Rnd()))
TempTerm = Term(Random1)
Term(Random1) = Term(Random2)
Term(Random2) = TempTerm
TempAnswer = Answer(Random1)
Answer(Random1) = Answer(Random2)
Answer(Random2) = TempAnswer
Count = Count + 1
Next
For Count = 0 To 15
MyLabels(Count).Text = Term(Count)
Next
If anyone has any ideas, the help is appreciated. Thanks

Your question is not very clear, but from what I understand you want some way to keep your Labels and the corresponding Terms and Answers in sync.
There are many ways to do this, but I would prefer to do it as below: the idiomatic VB.NET way, as opposed to using legacy VB6 techniques.
First, declare a class that keeps a Term and its Answer in one object. This is the equivalent of Type in VB6.
Public Class TermAnswer
Public Term As String
Public Answer As String
' you may add more fields/properties here if you wish to...
End Class
Now it is easy to code our solution.
' declare a Dictionary object with Label as key and the corresponding Term and Answers as values.
Dim TermAnswers As New Dictionary(Of Label, TermAnswer)
' this is a temporary List to hold our Term and Answers read from file until we randomize them.
Dim tempTermAnswers As New List(Of TermAnswer)
' Our Labels array... yes it is this easy :)
Dim myLabels() As Label = Me.Controls.OfType(Of Label)().ToArray
' read our file into the tempTermAnswers List
Dim FileTerm As String = "D:\soccer.txt"
Using reader As New IO.StreamReader(FileTerm)
    While Not reader.EndOfStream
        Dim ta As New TermAnswer
        ta.Term = reader.ReadLine
        ta.Answer = reader.ReadLine
        tempTermAnswers.Add(ta)
    End While
    ' no explicit reader.Close() is needed: End Using disposes the reader
End Using
' pick Term Answers from our tempTermAnswers List randomly and add them to our TermAnswers Dictionary
' we also set our Label text here, though you can loop separately too
Dim randomNumbers As New Random
Dim tempTerm As TermAnswer, randomNumber As Integer
For Each label In myLabels
randomNumber = randomNumbers.Next(0, tempTermAnswers.Count)
tempTerm = tempTermAnswers(randomNumber)
TermAnswers.Add(label, tempTerm)
tempTermAnswers.Remove(tempTerm)
label.Text = tempTerm.Term
Next
' now you have your term answers in a Dictionary, indexed by Label.
' you can get any of them by providing a Label on your form as key and get the corresponding Term and Answer as value.
' e.g. let us list the Label Name, Term and Answer in our debug window...
For Each label In myLabels
Debug.WriteLine(label.Name & " ... " & TermAnswers(label).Term & " ... " & TermAnswers(label).Answer)
Next
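
As an alternative to picking a random item and removing it from the temporary list, the list could be shuffled once and then assigned to the labels in order. The sketch below uses a Fisher-Yates shuffle, which is a different technique from the one in the answer above; the module name Shuffler and the Shuffle routine are hypothetical names chosen for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module Shuffler
    ' Fisher-Yates shuffle: walks from the end of the list and swaps each
    ' element with a randomly chosen element at or before its position.
    Public Sub Shuffle(Of T)(ByVal items As IList(Of T), ByVal rng As Random)
        For i As Integer = items.Count - 1 To 1 Step -1
            Dim j As Integer = rng.Next(0, i + 1) ' upper bound is exclusive, so j is in 0..i
            Dim tmp As T = items(i)
            items(i) = items(j)
            items(j) = tmp
        Next
    End Sub
End Module
```

With this in place, one would call Shuffle(tempTermAnswers, New Random()) once and then pair myLabels(i) with tempTermAnswers(i), provided there are at least as many terms as labels.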

Search line in text file and return value from a set starting point vb.net

I'm currently using the following to read the contents of all text files in a directory into an array
Dim allLines() As String = File.ReadAllLines(txtfi.FullName)
Within the text files are only 6 lines that all follow the same format and will read something like
forecolour=black
I'm then trying to search for the word "forecolour" and retrieve the information after the "=" sign (black) so I can populate the code below
AllDetail(numfiles).uPath = ' this needs to be the above result
I've only posted parts of the code, but if it helps I can post the rest. I just need a little guidance if possible.
Thanks
This is the full code:
Dim numfiles As Integer
ReDim AllDetail(0 To 0)
numfiles = 0
lb1.Items.Clear()
Dim lynxin As New IO.DirectoryInfo(zMailbox)
lb1.Items.Clear()
For Each txtfi In lynxin.GetFiles("*.txt")
Dim allLines() As String = File.ReadAllLines(txtfi.FullName)
ReDim Preserve AllDetail(0 To numfiles)
AllDetail(numfiles).uPath = 'Needs to be populated
AllDetail(numfiles).uName = 'Needs to be populated
AllDetail(numfiles).uCode = 'Needs to be populated
AllDetail(numfiles).uOps = 'Needs to be populated
lb1.Items.Add(IO.Path.GetFileNameWithoutExtension(txtfi.Name))
numfiles = numfiles + 1
Next
End Sub
AllDetail(numfiles).uPath = Would be the actual file path
AllDetail(numfiles).uName = Would be the detail after “unitname=”
AllDetail(numfiles).uCode = Would be the detail after “unitcode=”
AllDetail(numfiles).uOps = Would be the detail after “operation=”
Within the text files that are being read there will be the following lines
Unitname=
Unitcode=
Operation=
Requirements=
Dateplanned=
For the purpose of this array I just need the unitname, unitcode & operation. Going forward I will need the dateplanned as when this is working I want to try and work out how to only display the information if the dateplanned matches the date from a datepicker. Hope that helps and any guidance or tips are gratefully received

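If your file is not very big, you could simply loop over the lines:
Dim allLines() As String = File.ReadAllLines(txtfi.FullName)
For Each line In allLines
Dim parts = line.Split("="c)
' Compare case-insensitively, because the file may contain "Unitname=" rather than "unitname="
If parts.Length = 2 AndAlso parts(0).Equals("unitname", StringComparison.OrdinalIgnoreCase) Then
AllDetails(numFiles).uName = parts(1)
Exit For
End If
Next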
If you are absolutely sure of the format of your input file, you could also use LINQ to remove the explicit For Each:
Dim line = allLines.Where(Function(x) (x.StartsWith("unitname"))).SingleOrDefault()
if line IsNot Nothing then
AllDetails(numFiles).uName = line.Split("="c)(1)
End If
EDIT
Looking at the last details added to your question I think you could rewrite your code in this way, but still a critical piece of info is missing.
What kind of object is supposed to be stored in the array AllDetails?
I suppose you have a class named FileDetail like this:
Public Class FileDetail
Public uPath As String
Public uName As String
Public uCode As String
Public uOps As String
End Class
....
numfiles = 0
lb1.Items.Clear()
Dim lynxin As New IO.DirectoryInfo(zMailbox)
' Get the FileInfo array here and dimension the array for the size required
Dim allfiles = lynxin.GetFiles("*.txt")
' The array should contains elements of a class that have the appropriate properties
' VB array declarations take the upper bound, not the element count
Dim AllDetails(allfiles.Length - 1) As FileDetail
lb1.Items.Clear()
For Each txtfi In allfiles
Dim allLines() As String = File.ReadAllLines(txtfi.FullName)
AllDetails(numFiles) = new FileDetail()
AllDetails(numFiles).uPath = txtfi.FullName
Dim line = allLines.Where(Function(x) (x.StartsWith("unitname="))).SingleOrDefault()
if line IsNot Nothing then
AllDetails(numFiles).uName = line.Split("="c)(1)
End If
line = allLines.Where(Function(x) (x.StartsWith("unitcode="))).SingleOrDefault()
if line IsNot Nothing then
AllDetails(numFiles).uCode = line.Split("="c)(1)
End If
line = allLines.Where(Function(x) (x.StartsWith("operation="))).SingleOrDefault()
if line IsNot Nothing then
AllDetails(numFiles).uOps = line.Split("="c)(1)
End If
lb1.Items.Add(IO.Path.GetFileNameWithoutExtension(txtfi.Name))
numfiles = numfiles + 1
Next
Keep in mind that this code could be really simplified if you start using a List(Of FileDetails)
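
A minimal sketch of what the List(Of FileDetail) version might look like, under some assumptions: the FileDetail fields mirror the uPath/uName/uCode/uOps members used in the question, and DetailLoader, ValueOf and LoadDetails are hypothetical names introduced only for this illustration. Keys are matched case-insensitively, since the question shows both "unitname=" and "Unitname=".

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Public Class FileDetail
    Public uPath As String
    Public uName As String
    Public uCode As String
    Public uOps As String
End Class

Public Module DetailLoader
    ' Returns the text after "key=" from the first line that starts with it,
    ' or Nothing when no such line exists.
    Public Function ValueOf(ByVal lines As IEnumerable(Of String), ByVal key As String) As String
        Dim prefix As String = key & "="
        Dim line As String = lines.FirstOrDefault(
            Function(x) x.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        If line Is Nothing Then Return Nothing
        Return line.Substring(prefix.Length).Trim()
    End Function

    ' Reads every *.txt file in the folder into one FileDetail each.
    ' The list grows as needed, so no ReDim or manual counter is required.
    Public Function LoadDetails(ByVal folder As String) As List(Of FileDetail)
        Dim details As New List(Of FileDetail)
        For Each txtfi As FileInfo In New DirectoryInfo(folder).GetFiles("*.txt")
            Dim allLines() As String = File.ReadAllLines(txtfi.FullName)
            Dim detail As New FileDetail
            detail.uPath = txtfi.FullName
            detail.uName = ValueOf(allLines, "unitname")
            detail.uCode = ValueOf(allLines, "unitcode")
            detail.uOps = ValueOf(allLines, "operation")
            details.Add(detail)
        Next
        Return details
    End Function
End Module
```

In the form, this would replace the array handling with something like Dim AllDetails = LoadDetails(zMailbox), after which the list box can be filled from AllDetails.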

How to check if lines in string are separated by space?

I'm building a program that gets the publisher of a book by scanning its title page and using OCR. Since publishers are always at the bottom of the title page, I'm thinking that detecting lines that are separated by an empty line is a solution, but I don't know how to test for that. Here is my code:
Dim builder As New StringBuilder()
Dim reader As New StringReader(txtOCR.Text)
Dim iCounter As Integer = 0
While True
Dim line As String = reader.ReadLine()
If line Is Nothing Then Exit While
'i want to put the condition here
End While
txtPublisher.Text = builder.ToString()

Do you mean empty lines? Then you can do this:
Dim bEmpty As Boolean
And then inside the loop:
If line.Trim().Length = 0 Then
bEmpty = True
Else
If bEmpty Then
'...
End If
bEmpty = False
End If

Why not do the following: from the bottom, go up until you find the first non-empty line (no idea how the OCR works … maybe the bottom-most line is always non-empty, in which case this is redundant). In the next step, go up until the first empty line. The text in the middle is the publisher.
You don’t need the StringReader for that:
Dim lines As String() = txtOCR.Text.Split({Environment.NewLine}, StringSplitOptions.None)
Dim bottom As Integer = lines.Length - 1
' Find bottom-most non-empty line (stop at the first line so the index stays valid).
Do While bottom > 0 AndAlso String.IsNullOrWhiteSpace(lines(bottom))
bottom -= 1
Loop
' Find empty line above that (or run off the top if there is none).
Dim top As Integer = bottom - 1
Do Until top < 0 OrElse String.IsNullOrWhiteSpace(lines(top))
top -= 1
Loop
' Lines top + 1 .. bottom make up the publisher block (bottom - top lines).
Dim publisherSubset(bottom - top - 1) As String
Array.Copy(lines, top + 1, publisherSubset, 0, bottom - top)
Dim publisher As String = String.Join(Environment.NewLine, publisherSubset)
But to be honest I don’t think this is a particularly good approach. It’s inflexible and doesn’t cope well with unexpected formatting. I’d instead use a regular expression to describe what the publisher string (and its context) looks like. And maybe even that isn’t enough and you have to put some thought into describing the whole page to extrapolate which of the bits is the publisher.

Assuming the publisher is always on the last line and always comes after an empty line, then perhaps the following?
Dim lines As New List(Of String)
Dim currentLine As String = ""
Dim previousLine As String = ""
Using reader As New StringReader(txtOCR.Text)
currentLine = reader.ReadLine()
Do While currentLine IsNot Nothing
' Keep only non-empty lines that directly follow an empty line (or start the text).
If String.IsNullOrWhiteSpace(previousLine) AndAlso Not String.IsNullOrWhiteSpace(currentLine) Then lines.Add(currentLine)
previousLine = currentLine
currentLine = reader.ReadLine()
Loop
End Using
txtPublisher.Text = lines.LastOrDefault()
To ignore whether the previous line is blank/empty and simply take the last line:
Dim lines As New List(Of String)
Using reader As New StringReader(txtOCR.Text)
Dim line As String = reader.ReadLine()
Do While line IsNot Nothing
lines.Add(line)
line = reader.ReadLine()
Loop
End Using
txtPublisher.Text = lines.LastOrDefault()

Create Table of Contents using iTextSharp

I'm working on some code that I can't get to work.
I have a program that takes multiple PDFs and merges them into one file. Now I need to create a table of contents on the first page. You can see examples of the documents below.
I would like to outsource this to someone who is an expert with iTextSharp. I don't think this will take more than an hour or two at the most.
The requirements are:
The TOC will be based on the bookmarks.
The TOC text will be linked to the proper page so the user can click on the text to go to the page.
The existing bookmarks in sample1.pdf must remain.
The page numbers are already calculated, so you don't have to worry about that.
The working code must be part of the VB.NET project files I give you. I've tried several snippets without luck; I would like it to just work without me having to adapt the code.
The file I generate looks like this: http://gamepacks.org/sample1.pdf
The file with toc should look like this (layout, not the font style): http://gamepacks.org/sample2.pdf
I would appreciate anyone who can help me out.
The code I used to generate sample1.pdf looks like this to give you an idea what you need to work with.
Public Sub MergePdfFiles(ByVal docList As List(Of Portal.DocumentRow), ByVal outputPath As String)
'
' http://www.vbforums.com/showthread.php?475920-Merge-Pdf-Files-and-Add-Bookmarks-to-It-(Using-iTextSharp)
'
If docList.Count = 0 Then Exit Sub
Dim tmpFile As String = "c:\STEP_1_Working.pdf"
Dim OutlineList As List(Of PdfOutline) = New List(Of PdfOutline)
Dim FirstPageIndex As Integer = 1 ' Tracks which page to link the bookmark
Dim result As Boolean = False
Dim pdfCount As Integer = 0 'total input pdf file count
Dim fileName As String = String.Empty 'current input pdf filename
Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
Dim pageCount As Integer = 0 'current input pdf page count
Dim doc As iTextSharp.text.Document = Nothing 'the output pdf document
Dim writer As PdfWriter = Nothing
Dim cb As PdfContentByte = Nothing
'Declare a variable to hold the imported pages
Dim page As PdfImportedPage = Nothing
Dim rotation As Integer = 0
'Now loop thru the input pdfs
For Each row As Portal.DocumentRow In docList
reader = New iTextSharp.text.pdf.PdfReader(row.FilePath)
' Is this the first pdf file
If (row.Name = docList(0).Name) Then
doc = New iTextSharp.text.Document(reader.GetPageSizeWithRotation(1), 18, 18, 18, 18)
writer = PdfWriter.GetInstance(doc, New IO.FileStream(tmpFile, IO.FileMode.Create))
' Always show the bookmarks
writer.ViewerPreferences = PdfWriter.PageModeUseOutlines
'Set metadata and open the document
With doc
.AddAuthor("Sample Title")
.AddCreationDate()
.Open()
End With
'Instantiate a PdfContentByte object
cb = writer.DirectContentUnder
End If
For i As Integer = 1 To reader.NumberOfPages
'Get the input page size
doc.SetPageSize(reader.GetPageSizeWithRotation(i))
'Create a new page on the output document
doc.NewPage()
'If it is the 1st page, we add bookmarks to the page
If i = 1 Then
If row.Parent = "" Then
Dim oline As PdfOutline = New PdfOutline(cb.RootOutline, PdfAction.GotoLocalPage(FirstPageIndex, New PdfDestination(FirstPageIndex), writer), row.Name)
Else
Dim parent As PdfOutline = Nothing
For Each tmp As PdfOutline In cb.RootOutline.Kids
If tmp.Title = row.Parent Then
parent = tmp
End If
Next
' Create new group outline
If parent Is Nothing Then
parent = New PdfOutline(cb.RootOutline, PdfAction.GotoLocalPage(FirstPageIndex, New PdfDestination(FirstPageIndex), writer), row.Parent)
End If
' Add to new parent
Dim oline As PdfOutline = New PdfOutline(parent, PdfAction.GotoLocalPage(FirstPageIndex, New PdfDestination(FirstPageIndex), writer), row.Name)
OutlineList.Add(oline)
End If
FirstPageIndex += reader.NumberOfPages
End If
'Now we get the imported page
page = writer.GetImportedPage(reader, i)
'Read the imported page's rotation
rotation = reader.GetPageRotation(i)
'Then add the imported page to the PdfContentByte object as a template based on the page's rotation
If rotation = 90 Then
cb.AddTemplate(page, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(i).Height)
ElseIf rotation = 270 Then
cb.AddTemplate(page, 0, 1.0F, -1.0F, 0, reader.GetPageSizeWithRotation(i).Width + 60, -30)
Else
cb.AddTemplate(page, 1.0F, 0, 0, 1.0F, 0, 0)
End If
Next
Next
doc.Close()
End Sub

iTextSharp Adding Background Color to Watermark Text

I am adding watermark text to PDFs in a class library I have created. The code I posted below works fine, however the watermark is sometimes difficult to read because it overlays with content on the PDF. How would I go about adding a white background color around the watermark text? I basically would like the watermark text to be surrounded inside a white rectangle the size of the text. Thanks
Public Function AddWatermarkText(ByVal tempDirectory As String) As String
' Just return the full path of the PDF if we don't need to add a watermark.
If Me.Document.RevRank <> 0 OrElse Me.Document.ReleaseDate Is Nothing Then Return Me.FullPath
Dim reader As iTextSharp.text.pdf.PdfReader = Nothing
Dim stamper As iTextSharp.text.pdf.PdfStamper = Nothing
Dim gstate As New iTextSharp.text.pdf.PdfGState()
Dim overContent As iTextSharp.text.pdf.PdfContentByte = Nothing
Dim rect As iTextSharp.text.Rectangle = Nothing
Dim watermarkFont As iTextSharp.text.pdf.BaseFont = Nothing
Dim folderGuid As Guid = Guid.NewGuid()
Dim outputFile As String = tempDirectory & System.IO.Path.DirectorySeparatorChar & folderGuid.ToString() & System.IO.Path.DirectorySeparatorChar _
& Me.Document.Prefix & Me.Document.BaseNumber & Me.Document.Revision & ".pdf"
' Create the temp directory to place the new PDF in.
If Not My.Computer.FileSystem.DirectoryExists(tempDirectory) Then My.Computer.FileSystem.CreateDirectory(tempDirectory)
My.Computer.FileSystem.CreateDirectory(tempDirectory & System.IO.Path.DirectorySeparatorChar & folderGuid.ToString())
reader = New iTextSharp.text.pdf.PdfReader(Me.FullPath)
rect = reader.GetPageSizeWithRotation(1)
stamper = New iTextSharp.text.pdf.PdfStamper(reader, New System.IO.FileStream(outputFile, IO.FileMode.Create))
watermarkFont = iTextSharp.text.pdf.BaseFont.CreateFont(iTextSharp.text.pdf.BaseFont.HELVETICA_BOLD, _
iTextSharp.text.pdf.BaseFont.CP1252, _
iTextSharp.text.pdf.BaseFont.NOT_EMBEDDED)
gstate.FillOpacity = 0.9F
gstate.StrokeOpacity = 1.0F
' Add the watermark to each page in the document.
For i As Integer = 1 To reader.NumberOfPages()
overContent = stamper.GetOverContent(i)
With overContent
.SaveState()
.SetGState(gstate)
.SetColorFill(iTextSharp.text.BaseColor.BLUE)
.Fill()
.BeginText()
.SetFontAndSize(watermarkFont, 8)
.SetTextMatrix(30, 30)
If Me.Document.RevRank = 0 AndAlso Me.Document.ReleaseDate IsNot Nothing Then
.ShowTextAligned(iTextSharp.text.Element.ALIGN_LEFT, UCase(String.Format("CONTROLLED DOCUMENT – THIS COPY IS THE LATEST REVISION AS OF {0}" _
, Date.Now.ToString("ddMMMyyyy"))), 10, rect.Height - 15, 0)
End If
.Fill()
.EndText()
.RestoreState()
End With
Next
stamper.Close()
reader.Close()
Return outputFile
End Function

I usually like to have code that you can just plop in, but unfortunately your code is a little too domain-specific to provide a direct answer (lots of Me.* that we have to guess at). I can still get you there with a little code refactoring.
To do what you want to do you have to measure the string that you are drawing and then draw a rectangle to those dimensions. The PDF spec doesn't have a concept of "background color" for text and any implementation that makes it look like it does is really just drawing rectangles for you. (Yes, you can highlight text but that's an Annotation which is different.)
So first I'm going to pull things out into variables so that we can reuse and adjust them easier:
''//Text to measure and draw
Dim myText As String = UCase(String.Format("CONTROLLED DOCUMENT – THIS COPY IS THE LATEST REVISION AS OF {0}", Date.Now.ToString("ddMMMyyyy")))
''//Font size to measure and draw with
Dim TextFontSize As Integer = 8
''//Original X,Y positions that we were drawing the text at
Dim TextX As Single = 10
Dim TextY As Single = rect.Height - 15
Next we need to calculate the width and height. The former is easy but the latter requires us to first get the Ascent and Descent of the text and then calculate the difference.
''//Calculate the width
Dim TextWidth As Single = watermarkFont.GetWidthPoint(myText, TextFontSize)
''//Calculate the ascent and descent
Dim TextAscent As Single = watermarkFont.GetAscentPoint(myText, TextFontSize)
Dim TextDescent As Single = watermarkFont.GetDescentPoint(myText, TextFontSize)
''//The height is the difference between the two
Dim TextHeight As Single = TextAscent - TextDescent
(NOTE: I'm not sure if GetWidthPoint(), GetAscentPoint() and GetDescentPoint() work as desired with multi-line text.)
Then you probably want to have some padding between the box and text:
''//Amount of padding around the text when drawing the box
Dim TextPadding As Single = 2
Lastly, somewhere before you setup and draw the text you want to first draw the rectangle:
''//Set a background color
.SetColorFill(BaseColor.YELLOW)
''//Create a rectangle
.Rectangle(TextX - TextPadding, TextY - TextPadding, TextWidth + (TextPadding * 2), TextHeight + (TextPadding * 2))
''//Fill it
.Fill()
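
To make the arithmetic explicit, here is a small sketch that computes the background rectangle from the measured values without touching iTextSharp at all. The Box structure and the BackgroundBox function are hypothetical names for illustration. It assumes the descent is passed in the way iTextSharp's GetDescentPoint reports it (zero or negative, that is, below the baseline), so the bottom edge is lowered by the descent as well as by the padding; for all-uppercase text the descent is typically close to zero and the result matches the rectangle above.

```vbnet
Imports System

Public Structure Box
    Public X As Single
    Public Y As Single
    Public Width As Single
    Public Height As Single
End Structure

Public Module WatermarkBox
    ' Computes a padded rectangle around text drawn left-aligned at (baselineX, baselineY).
    ' Assumption: descent is zero or negative (distance below the baseline).
    Public Function BackgroundBox(ByVal baselineX As Single, ByVal baselineY As Single,
                                  ByVal textWidth As Single, ByVal ascent As Single,
                                  ByVal descent As Single, ByVal padding As Single) As Box
        Dim b As Box
        b.X = baselineX - padding
        b.Y = baselineY + descent - padding          ' bottom edge sits below any descenders
        b.Width = textWidth + padding * 2
        b.Height = (ascent - descent) + padding * 2  ' same height formula as in the answer
        Return b
    End Function
End Module
```

The four values would then be passed to .Rectangle(b.X, b.Y, b.Width, b.Height) before the text is drawn.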
