Here is the code I have so far:
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports System.IO
Module Module1
    Sub Main()
        AddjImage("C:\test.png", "c:\pdfTemplate.pdf", "C:\output.pdf")
    End Sub
    Private Function AddjImage(ByVal strImageFileName As String, ByVal pdfTemplateFile As String, ByVal outputPdf As String) As Boolean
        Try
            Dim iPdfReader As PdfReader = New PdfReader(pdfTemplateFile)
            Dim iPdfStamper As PdfStamper = New PdfStamper(iPdfReader, New FileStream(outputPdf, FileMode.Create))
            Dim imgjImage As iTextSharp.text.Image
            Dim bytContent As PdfContentByte
            'Insert Image
            imgjImage = iTextSharp.text.Image.GetInstance(strImageFileName)
            imgjImage.Alignment = iTextSharp.text.Image.ALIGN_TOP
            imgjImage.ScalePercent(78)
            imgjImage.SetAbsolutePosition(445, 0)
            bytContent = iPdfStamper.GetOverContent(1)
            bytContent.AddImage(imgjImage)
            iPdfStamper.FormFlattening = True
            iPdfStamper.Close()
            Return True
        Catch ex As Exception
            Return False
        End Try
    End Function
End Module
The PDF is in landscape layout and the page size is A4. I am trying to insert the image on the right side of the page, positioned at x=445 and y=0.
I have a couple of images in two sizes:
image 1 with width=500px; height=910px;
image 2 with width=500px; height=400px;
The problem is that both images are aligned to the bottom of the page instead of the top. Because of that, the top portion of image 1 is cut off.
I tried your code (with modifications) in a button click event in a WPF app. The y coordinate passed to SetAbsolutePosition is measured from the bottom of the page, because PDF coordinates start at the lower-left corner, so a y of 0 puts the bottom edge of the image on the bottom edge of the page. To move the image up, change
imgjImage.SetAbsolutePosition(445, 0)
to
imgjImage.SetAbsolutePosition(445, 200)
The 200 is only an example; adjust it to the actual (scaled) height of your image.
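The offset does not have to be found by trial and error. As a minimal sketch, assuming the page height and the scaled image height are both known in points, the y coordinate that puts the top edge of the image at the top of the page is the page height minus the scaled image height. The helper name TopAlignedY is hypothetical, 595 is the height of a landscape A4 page in points, and 312 is the height of image 2 (400) after ScalePercent(78), assuming iTextSharp treats one image pixel as one point.

```vbnet
Imports System

Module TopAlignSketch
    ' Hypothetical helper: returns the y coordinate (in points, measured from
    ' the bottom of the page) that places the top edge of an image at the top
    ' edge of the page. A negative result means the image is taller than the page.
    Function TopAlignedY(ByVal pageHeight As Single, ByVal scaledImageHeight As Single) As Single
        Return pageHeight - scaledImageHeight
    End Function

    Sub Main()
        Dim pageHeight As Single = 595.0F      ' landscape A4 height in points (assumed)
        Dim image2Height As Single = 312.0F    ' 400 px scaled to 78 % (assumed 1 px = 1 pt)
        Console.WriteLine(TopAlignedY(pageHeight, image2Height))
    End Sub
End Module
```

In the question's code this would be used after ScalePercent, roughly as imgjImage.SetAbsolutePosition(445, TopAlignedY(iPdfReader.GetPageSizeWithRotation(1).Height, imgjImage.ScaledHeight)). For image 1 the result is negative under the same assumptions (910 scaled to 78 % is about 710 points, more than the 595-point page height), so that image cannot fit at this scale wherever it is placed; a smaller scale, for example via ScaleToFit, would be needed.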
Related
I want my code to find the x/y position of text in a PDF or an image so that I can crop a region out of it. The goal is to include any diagrams that belong to a question; each question consists of an image with text placed on top of it. I am currently using EJ2.PdfViewer from Syncfusion, but I am happy to switch to another package if it suits this purpose better.
My test code for reference if it will help:
Imports System
Imports System.Collections.Generic
Imports Syncfusion.EJ2.PdfViewer
Module Program
    Sub Main(args As String())
        Dim extraction As PdfRenderer = New PdfRenderer()
        extraction.Load("C:\math.pdf")
        Dim textCollection As List(Of TextData) = New List(Of TextData)
        Dim text As String = extraction.ExtractText(44, textCollection)
        Console.WriteLine(text)
    End Sub
End Module
To get the position of text in a PDF, you can use one of these libraries:
iText7: https://itextpdf.com/resources/api-documentation
Spire PDF: https://www.e-iceblue.com/Introduce/pdf-for-net-introduce.html
To get the position of text in an image:
Google Vision API: https://cloud.google.com/vision/docs/ocr
I have a PDF which is created by scanning software. One image per page and hidden OCR'ed text.
I want to remove the images and make the text visible.
I found information on how to remove images (by replacing them with another image), but found no way to make the invisible text visible.
Sample PDF with image and hidden text
I tried the method below, but it does not work:
Public Shared Sub UnhideText(ByVal strFileName As String)
    Dim pdf As iTextSharp.text.pdf.PdfReader = New iTextSharp.text.pdf.PdfReader(strFileName)
    Dim stp As iTextSharp.text.pdf.PdfStamper = New iTextSharp.text.pdf.PdfStamper(pdf, New IO.FileStream("e:\out.pdf", IO.FileMode.Create))
    'This does not work; the text remains invisible. I guess SetTextRenderingMode applies only to newly added text.
    For pageNumber As Integer = 1 To pdf.NumberOfPages
        Dim cb As iTextSharp.text.pdf.PdfContentByte = stp.GetOverContent(pageNumber)
        cb.SetTextRenderingMode(iTextSharp.text.pdf.PdfContentByte.TEXT_RENDER_MODE_FILL)
    Next
    stp.Close()
End Sub
I have an application that shows a splash screen, which loads a picture and displays it centered for a set length of time.
The issue is that the image file used by the splash screen stays locked: I cannot modify or replace it until the application is unloaded.
I am using the default VB.NET splash screen mechanism, and the code I use for the image is below:
Dim Advert As System.Drawing.Image = Image.FromFile("Y:\Test\TestPic.jpg")
Dim width As Integer = Advert.Width
Dim height As Integer = Advert.Height
Me.BackgroundImage = Advert
Me.BackgroundImageLayout = ImageLayout.Stretch
Me.Size = New Size(width, height)
Me.CenterToScreen()
and this code in the application events to override the display time:
Protected Overrides Function OnInitialize(ByVal commandLineArgs As System.Collections.ObjectModel.ReadOnlyCollection(Of String)) As Boolean
    Me.MinimumSplashScreenDisplayTime = 7000
    Return MyBase.OnInitialize(commandLineArgs)
End Function
Is there a way to release the image file when the splash screen closes?
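One possible approach, offered as a sketch rather than a confirmed fix for the default splash screen: Image.FromFile keeps the file open until the image is disposed, so reading the bytes first and copying them into an independent bitmap avoids holding the file at all. LoadImageUnlocked is a hypothetical helper name, not part of the framework.

```vbnet
Imports System.Drawing
Imports System.IO

Module SplashImageLoader
    ' Hypothetical helper: loads an image without keeping the source file locked.
    Function LoadImageUnlocked(ByVal fileName As String) As Image
        ' ReadAllBytes opens the file, reads it completely, and closes it again.
        Dim bytes As Byte() = File.ReadAllBytes(fileName)
        Using ms As New MemoryStream(bytes)
            Using tmp As Image = Image.FromStream(ms)
                ' Copy the pixels into a new bitmap so that the result does not
                ' depend on the stream, which is disposed when this block ends.
                Return New Bitmap(tmp)
            End Using
        End Using
    End Function
End Module
```

In the splash screen code the first line would then become Dim Advert As System.Drawing.Image = LoadImageUnlocked("Y:\Test\TestPic.jpg"). The copy is a plain in-memory bitmap, so metadata of the original file is not carried over.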
I’m getting unwanted results in Adobe Reader DC when generating or regenerating a multi-selection list box with iTextSharp in an Acroform PDF document.
Problem: When the modified PDF is viewed in Adobe Reader DC, unselected items at the beginning of the list box are missing. For example, if “One”, “Two”, “Three”, “Four”, and “Five” are the list items and “Two” and “Four” are selected, then the preceding items such as “One” are missing from the top of the list box, and the first item displayed is the first selection, in this case “Two”. (See Adobe Reader DC Screenshot)
FYI: In Adobe Reader DC, when I change the selection in the list box and then click outside it, the list box reverts to its normal appearance with all items shown. I can’t reproduce the problem when opening the modified PDF in Adobe Acrobat Professional 8; there, all the field items are visible and correctly selected. The missing-items behavior can also be reproduced in Ghostscript when converting the PDF to BMP or PNG.
My question: if this is an iTextSharp problem, or if my syntax is incorrect, can you provide a resolution? Could you also let me know whether this behavior can be reproduced in your copy of Adobe Reader DC?
Thank you for your support!
Modified Acroform PDF Document with issue:
http://www.nk-inc.com/listbox-error.pdf
Adobe Reader DC Screenshot:
(source: nk-inc.com)
ADDITIONAL INFORMATION:
iTextSharp.dll Version: 5.5.6.0
Adobe Reader DC Version: 2015.008.20082
Adobe Acrobat Pro Version: 8.x
Form Type: Acroform PDF
VB.NET CODE (v3.5 – Windows Application):
Imports iTextSharp.text.pdf
Imports iTextSharp.text
Imports System.IO
Public Class listboxTest
    Private Sub RunTest()
        Dim cList As New listboxTest()
        Dim fn As String = Application.StartupPath.ToString.TrimEnd("\") & "\listbox-error.pdf"
        Dim b() As Byte = cList.addListBox(System.IO.File.ReadAllBytes(fn), New iTextSharp.text.Rectangle(231.67, 108.0, 395.67, 197.0), "ListBox1", "ListBox1", 1)
        File.WriteAllBytes(fn, b)
        Process.Start(fn)
    End Sub
    Public Function addListBox(ByVal pdfBytes() As Byte, ByVal newRect As Rectangle, ByVal newFldName As String, ByVal oldfldname As String, ByVal pg As Integer) As Byte()
        Dim pdfReaderDoc As New PdfReader(pdfBytes)
        Dim m As New System.IO.MemoryStream
        Dim b() As Byte = Nothing
        Try
            With New PdfStamper(pdfReaderDoc, m)
                Dim txtField As iTextSharp.text.pdf.TextField
                txtField = New iTextSharp.text.pdf.TextField(.Writer, newRect, newFldName)
                txtField.TextColor = BaseColor.BLACK
                txtField.BackgroundColor = BaseColor.WHITE
                txtField.BorderColor = BaseColor.BLACK
                txtField.FieldName = newFldName 'ListBox1
                txtField.Alignment = 0 'LEFT
                txtField.BorderStyle = 0 'SOLID
                txtField.BorderWidth = 1.0F 'THIN
                txtField.Visibility = TextField.VISIBLE
                txtField.Rotation = 0 'None
                txtField.Box = newRect '231.67, 108.0, 395.67, 197.0
                Dim opt As New PdfArray
                Dim ListBox_ItemDisplay As New List(Of String)
                ListBox_ItemDisplay.Add("One")
                ListBox_ItemDisplay.Add("Two")
                ListBox_ItemDisplay.Add("Three")
                ListBox_ItemDisplay.Add("Four")
                ListBox_ItemDisplay.Add("Five")
                Dim ListBox_ItemValue As New List(Of String)
                ListBox_ItemValue.Add("1X")
                ListBox_ItemValue.Add("2X")
                ListBox_ItemValue.Add("3X")
                ListBox_ItemValue.Add("4X")
                ListBox_ItemValue.Add("5X")
                txtField.Options += iTextSharp.text.pdf.TextField.MULTISELECT
                Dim selIndex As New List(Of Integer)
                Dim selValues As New List(Of String)
                selIndex.Add(CInt(1)) ' SELECT #1 (index)
                selIndex.Add(CInt(3)) ' SELECT #3 (index)
                txtField.Choices = ListBox_ItemDisplay.ToArray
                txtField.ChoiceExports = ListBox_ItemValue.ToArray
                txtField.ChoiceSelections = selIndex
                Dim listField As PdfFormField = txtField.GetListField
                If Not String.IsNullOrEmpty(oldfldname & "") Then
                    .AcroFields.RemoveField(oldfldname, pg)
                End If
                .AddAnnotation(listField, pg)
                .Writer.CloseStream = False
                .Close()
                If m.CanSeek Then
                    m.Position = 0
                End If
                b = m.ToArray
                m.Close()
                m.Dispose()
                pdfReaderDoc.Close()
            End With
            Return b.ToArray
        Catch ex As Exception
            Err.Clear()
        Finally
            b = Nothing
        End Try
        Return Nothing
    End Function
End Class
The visible list starts with the second entry because iTextSharp starts drawing the list at the first selected entry.
This is an optimization for lists that have more (possibly many more) entries than fit in the fixed text box area: it ensures that the displayed entries include at least one interesting, i.e. selected, entry.
Unfortunately, this optimization does not consider whether it leaves lines empty at the bottom, and for lists that fit completely into the text box there aren't even scroll bars to reveal the hidden entries.
iTextSharp does offer a way to disable this optimization, though: you can explicitly set the first visible item:
txtField.ChoiceSelections = selIndex
txtField.VisibleTopChoice = 0 ' Top visible choice is start of list!
Dim listField As PdfFormField = txtField.GetListField
Adding this middle line makes the generated appearance start at the first list value.
I am using an ajaxfileupload control to upload a pdf file to the server. On the server side, I'd like to convert the pdf to jpg. Using the PDFsharp Sample: Export Images as a guide, I've got the following:
Imports System
Imports System.Drawing
Imports System.Drawing.Imaging
Imports PdfSharp.Pdf
Imports System.IO
Imports PdfSharp.Pdf.IO
Imports PdfSharp.Pdf.Advanced
Namespace Tools
    Public Module ConvertImage
        Public Sub pdf2JPG(pdfFile As String, jpgFile As String)
            pdfFile = System.Web.HttpContext.Current.Request.PhysicalApplicationPath & "upload\" & pdfFile
            Dim document As PdfSharp.Pdf.PdfDocument = PdfReader.Open(pdfFile)
            Dim imageCount As Integer = 0
            ' Iterate pages
            For Each page As PdfPage In document.Pages
                ' Get resources dictionary
                Dim resources As PdfDictionary = page.Elements.GetDictionary("/Resources")
                If resources IsNot Nothing Then
                    ' Get external objects dictionary
                    Dim xObjects As PdfDictionary = resources.Elements.GetDictionary("/XObject")
                    If xObjects IsNot Nothing Then
                        Dim items As ICollection(Of PdfItem) = xObjects.Elements.Values
                        ' Iterate references to external objects
                        For Each item As PdfItem In items
                            Dim reference As PdfReference = TryCast(item, PdfReference)
                            If reference IsNot Nothing Then
                                Dim xObject As PdfDictionary = TryCast(reference.Value, PdfDictionary)
                                ' Is external object an image?
                                If xObject IsNot Nothing AndAlso xObject.Elements.GetString("/Subtype") = "/Image" Then
                                    ExportImage(xObject, imageCount)
                                End If
                            End If
                        Next
                    End If
                End If
            Next
        End Sub
        Private Sub ExportImage(image As PdfDictionary, ByRef count As Integer)
            Dim filter As String = image.Elements.GetName("/Filter")
            Select Case filter
                Case "/DCTDecode"
                    ExportJpegImage(image, count)
                    Exit Select
                Case "/FlateDecode"
                    ExportAsPngImage(image, count)
                    Exit Select
            End Select
        End Sub
        Private Sub ExportJpegImage(image As PdfDictionary, ByRef count As Integer)
            ' Fortunately JPEG has native support in PDF and exporting an image is just writing the stream to a file.
            Dim stream As Byte() = image.Stream.Value
            Dim fs As New FileStream([String].Format("Image{0}.jpeg", System.Math.Max(System.Threading.Interlocked.Increment(count), count - 1)), FileMode.Create, FileAccess.Write)
            Dim bw As New BinaryWriter(fs)
            bw.Write(stream)
            bw.Close()
        End Sub
        Private Sub ExportAsPngImage(image As PdfDictionary, ByRef count As Integer)
            Dim width As Integer = image.Elements.GetInteger(PdfImage.Keys.Width)
            Dim height As Integer = image.Elements.GetInteger(PdfImage.Keys.Height)
            Dim bitsPerComponent As Integer = image.Elements.GetInteger(PdfImage.Keys.BitsPerComponent)
            ' TODO: You can put the code here that converts from PDF internal image format to a Windows bitmap
            ' and use GDI+ to save it in PNG format.
            ' It is the work of a day or two for the most important formats. Take a look at the file
            ' PdfSharp.Pdf.Advanced/PdfImage.cs to see how we create the PDF image formats.
            ' We don't need that feature at the moment and therefore will not implement it.
            ' If you write the code for exporting images I would be pleased to publish it in a future release
            ' of PDFsharp.
        End Sub
    End Module
End Namespace
As I debug, it blows up on Dim filter As String = image.Elements.GetName("/Filter") in ExportImage. The message is:
Unhandled exception at line 336, column 21 in ~:46138/ScriptResource.axd?d=LGq0ri4wlMGBKd-1vxLjtxNH_pd26HaruaEG_1eWx-epwPmhNKVpO8IpfHoIHzVj2Arxn5804quRprX3HtHb0OmkZFRocFIG-7a-SJYT_EwYUd--x9AHktpraSBgoZk4VJ1RMtFNwl1mULDLid5o5U9iBcuDi4EQpbpswgBn_oI1&t=ffffffffda74082d
0x800a139e - JavaScript runtime error: error raising upload complete event and start new upload
Any thoughts on what the issue might be? It seems an issue with the ajaxfileupload control, but I don't understand why it would be barking here. It's neither here nor there, but I know I'm not using jpgFile yet.
PDFsharp cannot create JPEG images from PDF pages:
http://pdfsharp.net/wiki/PDFsharpFAQ.ashx#Can_PDFsharp_show_PDF_files_Print_PDF_files_Create_images_from_PDF_files_3
The sample you refer to can only extract JPEG images that are already embedded in PDF files, and it does not cover all possible cases.
Long story short: the code you are showing seems unrelated to the error message, and it also seems unrelated to your intended goal.