OpenPop.NET: retrieving an attachment to a specific folder - vb.net

I am trying to retrieve an email attachment and save it to a specific directory in the filesystem using the code below.
Dim objMail As Message = New Message(Encoding.ASCII.GetBytes(strMessage))
For Each att In objMail.FindAllAttachments()
Dim Stream as FileStream = New FileStream("D:\XX\" & att.FileName.ToString(),
FileMode.Create)
Dim BinaryStream As New BinaryWriter(Stream)
BinaryStream.Write(BitConverter.ToString(att.Body))
BinaryStream.Close()
Next
I have also tried att.GetBodyAsText().
Using this code I am able to save the attachment file in the desired folder. But when opening the file, I get this error:
File is not in proper Format or not Decoded Properly.
I am new to MIME encoding/decoding.

I am a developer of OpenPop.NET.
There are multiple issues with the code you are using to instantiate the Message class:
Where do the contents of strMessage come from?
How do you know it is encoded in only ASCII?
These are two major issues that will likely make a big difference.
You should NOT be using a string to contain the message, instead you should be using raw bytes!
For example (in C#):
byte[] byteMessage = File.ReadAllBytes(messagePath); // messagePath: hypothetical path to a file holding the raw message
Message message = new Message(byteMessage);
In this way, you will not destroy the message by using a wrong encoding on the bytes. Typically the email will include a header which tells how to decode the bytes to a string, which is what the OpenPop Message class will do for you.
Now let me explain attachments. Attachments are typically raw bytes; for example, a PNG picture is a sequence of bytes that a PNG picture reader will understand. For the PNG picture reader to understand the picture, the attachment's raw bytes must be saved to a file. You can get the raw bytes using att.Body.
There are also attachments where the raw bytes do not make sense. For example, a text attachment encoded in BASE64 is not very useful for a text reader program, and such an attachment must be converted to text before being saved. You can get the text using att.GetBodyAsText().
What you are doing is taking the raw bytes for the attachment and then using a BitConverter to convert it into hexadecimal numbers - which I cannot make any meaning of. Instead, you should change your:
BinaryStream.Write(BitConverter.ToString(att.Body))
to
BinaryStream.Write(att.Body)
in case your attachment is a picture or some more complex file.
I hope this can help with your problem.
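To see why the saved file would not open, here is a minimal sketch that uses only the .NET base class library. No OpenPop types are involved; the sample bytes stand in for att.Body, and SaveAttachmentBytes is a hypothetical helper name. BitConverter.ToString turns the bytes into hyphen-separated hexadecimal text, while File.WriteAllBytes writes them unchanged.

```vbnet
Imports System
Imports System.IO

Module AttachmentBytesDemo
    ' Hypothetical helper: writes attachment bytes unchanged into a target folder.
    ' Path.GetFileName drops any directory part from the sender-supplied name.
    Function SaveAttachmentBytes(targetFolder As String, fileName As String, body As Byte()) As String
        Dim target As String = Path.Combine(targetFolder, Path.GetFileName(fileName))
        File.WriteAllBytes(target, body)
        Return target
    End Function

    Sub Main()
        ' First four bytes of a PNG file, standing in for att.Body.
        Dim body As Byte() = {&H89, &H50, &H4E, &H47}

        ' This is what the original code produced: text, not the image data.
        Console.WriteLine(BitConverter.ToString(body)) ' 89-50-4E-47

        ' Writing the bytes as they are keeps the file readable by other programs.
        Dim saved As String = SaveAttachmentBytes(Path.GetTempPath(), "picture.png", body)
        Console.WriteLine(File.ReadAllBytes(saved).Length) ' 4
    End Sub
End Module
```

In the loop from the question, the equivalent call would presumably pass att.FileName and att.Body to such a helper.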

Related

iText7 Merge of 2 PDF MemoryStreams Not Working

I am upgrading some older iTextSharp code to the new iText 7 libraries. I am having a lot of trouble determining the proper way to merge 2 PDF MemoryStreams into one PDF MemoryStream that contains all the pages from both source PDF MemoryStreams. It seems simple and I think the code below is set up properly but the resulting PDF memory stream only contains the first file. The second PDF file is never present and never concatenated to the first.
I have found multiple ways documented on the Internet as the "proper" way to do the merge. The actual sample code for iText 7 seems unusually complex: it repeatedly mixes multiple concepts into one sample instead of reducing each concept to the simplest possible code, and so fails to demonstrate the simple cases. For instance, the PdfMerger documentation has no sample code at all (nor does anything else I looked at in the class documentation). The examples online always mix merging from files (not MemoryStreams) with other concepts such as adding page numbers or a table of contents, so they never show just one concept and they never start with anything other than files.
My PDFs are coming out of a database and we just need to merge them into one PDF memory stream and save it back out. My concern is that maybe I am not creating the MemoryStream properly when I initialize the PdfWriter. As none of the samples ever initialize it with anything but an actual file, I was unable to confirm this was done properly. I also fully qualified all objects in the code because I want to leave the old iTextSharp code in place while I am upgrading to iText 7; this makes sure an iTextSharp object of the same name isn't used inadvertently.
Also, in the interest of making the source as easy as possible to read, I removed some of the declarations and initialization of the objects being used. I traced through everything and all objects are loaded with proper values. The only problem is that the second PdfMerger.Merge call doesn't seem to do anything. I am assuming the problem is that I didn't prepare the PDF objects properly or that I have to do something special with the PdfWriter on the destination PDF document (p_pdfDocument) before the second PDF is written out with the PdfMerger object.
Dim p_bResult As Boolean = False
Dim p_bArray As Byte() = Nothing
Dim p_memStream As New System.IO.MemoryStream
Dim p_pdfWriter As New iText.Kernel.Pdf.PdfWriter(p_memStream)
Dim p_pdfDocument As New iText.Kernel.Pdf.PdfDocument(p_pdfWriter)
Dim p_pdf1Stream As New System.IO.MemoryStream(CType(p_cImage1.ImageFile, Byte()))
Dim p_pdf2Stream As New System.IO.MemoryStream(CType(p_cImage2.ImageFile, Byte()))
Dim p_pdf1Reader As New iText.Kernel.Pdf.PdfReader(p_pdf1Stream)
Dim p_pdf2Reader As New iText.Kernel.Pdf.PdfReader(p_pdf2Stream)
Dim p_pdf1Document As New iText.Kernel.Pdf.PdfDocument(p_pdf1Reader)
Dim p_pdf2Document As New iText.Kernel.Pdf.PdfDocument(p_pdf2Reader)
Dim p_pdfMerger As New iText.Kernel.Utils.PdfMerger(p_pdfDocument)
p_pdfMerger.Merge(p_pdf1Document, 1, p_pdf1Document.GetNumberOfPages())
p_pdfMerger.Merge(p_pdf2Document, 1, p_pdf2Document.GetNumberOfPages())
'Problem is here... the array only has the first PDF in it
'The second p_pdfMerger.Merge didn't seem to do anything
p_bArray = p_memStream.ToArray
p_pdf1Document.Close()
p_pdf2Document.Close()
p_pdfDocument.Close()
I expected the 2 source PDF MemoryStreams to be present in the destination MemoryStream but it only contains the first PDF in it.
Edit:
I changed the ending to...
p_pdfMerger.Merge(p_pdf1Document, 1, p_pdf1Document.GetNumberOfPages())
p_pdfMerger.Merge(p_pdf2Document, 1, p_pdf2Document.GetNumberOfPages())
p_cImage1.PageCount = p_pdfDocument.GetNumberOfPages()
p_pdfDocument.Close()
p_bArray = p_memStream.ToArray
p_pdf1Document.Close()
p_pdf2Document.Close()
Thing is that the p_pdfDocument.GetNumberOfPages() is correct but bytes are still just first PDF document when saved to database and viewed.
I tested your use case, condensing your code a bit, reading the input memory streams from files, and writing the output memory stream to a file as I don't have your database environment:
Using MemoryStream As New MemoryStream,
Pdf1MemoryStream As New MemoryStream(File.ReadAllBytes(MY_FIRST_PDF_FILE)),
Pdf2MemoryStream As New MemoryStream(File.ReadAllBytes(MY_SECOND_PDF_FILE))
Using PdfDocument As New PdfDocument(New PdfWriter(MemoryStream)),
Pdf1 As New PdfDocument(New PdfReader(Pdf1MemoryStream)),
Pdf2 As New PdfDocument(New PdfReader(Pdf2MemoryStream))
Dim Merger As New PdfMerger(PdfDocument)
Merger.Merge(Pdf1, 1, Pdf1.GetNumberOfPages)
Merger.Merge(Pdf2, 1, Pdf2.GetNumberOfPages)
End Using
Dim PdfBytes As Byte() = MemoryStream.ToArray()
Using FileStream As Stream = File.Create("TwoPdfsMergedInMemoryStream.pdf")
FileStream.Write(PdfBytes, 0, PdfBytes.Length)
End Using
End Using
As a result I got the contents of both source files in TwoPdfsMergedInMemoryStream.pdf, as it should be. Concerning your observation
Thing is that the p_pdfDocument.GetNumberOfPages() is correct but bytes are still just first PDF document when saved to database and viewed.
therefore, I would assume that p_bArray does contain a PDF with the contents of both your source PDFs but that there is an issue in saving to database or viewing.
To test this you could save the contents of the byte array to a file somewhere like I do above; then you can check what really is in the array.
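One way to run that check, sketched with base class library calls only: write the array to a file and confirm that it at least starts with the %PDF- signature. LooksLikePdf and DumpForInspection are hypothetical helper names, and the sample bytes stand in for p_bArray.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module PdfBytesCheck
    ' Hypothetical helper: True when the data begins with the "%PDF-" signature.
    Function LooksLikePdf(data As Byte()) As Boolean
        If data Is Nothing OrElse data.Length < 5 Then Return False
        Return Encoding.ASCII.GetString(data, 0, 5) = "%PDF-"
    End Function

    ' Hypothetical helper: writes the bytes to a temp file so they can be opened in a viewer.
    Function DumpForInspection(data As Byte(), fileName As String) As String
        Dim target As String = Path.Combine(Path.GetTempPath(), fileName)
        File.WriteAllBytes(target, data)
        Return target
    End Function

    Sub Main()
        ' Stand-in for p_bArray; a real check would use the merged output.
        Dim sample As Byte() = Encoding.ASCII.GetBytes("%PDF-1.7 sample")
        Console.WriteLine(LooksLikePdf(sample))
        Console.WriteLine(DumpForInspection(sample, "merged-check.pdf"))
    End Sub
End Module
```

If the dumped file shows both documents in a viewer, the merge itself worked and the problem lies in the database round trip. As in the tested code above, the array should be taken from the MemoryStream only after the destination PdfDocument has been closed.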

Visual Basic Reading Saving Images Into Access

Can someone please explain this code? For context, I am trying to save a picture that is displayed in a picture box to a Microsoft Access database. I don't understand what any of it means, especially the 0.
If Not PictureBox1.Image Is Nothing Then
Dim fsreader As New FileStream(OpenFileDialog1.FileName, FileMode.Open, FileAccess.Read)
Dim breader As New BinaryReader(fsreader)
Dim imgbuffer(fsreader.Length) As Byte
breader.Read(imgbuffer, 0, fsreader.Length)
fsreader.Close()
Your snippet reads the bytes of the picture file into the imgbuffer variable in VB.NET. It does not save the image to MS Access. But...
I will try to explain how this code works:
If Not PictureBox1.Image Is Nothing Then 'Checks whether the picture box contains an image; the following lines run only if it does.
Dim fsreader As New FileStream(OpenFileDialog1.FileName, FileMode.Open, FileAccess.Read) 'Opens the file selected in OpenFileDialog1 for reading. `fsreader` is the open file stream, and `fsreader.Length` is the file size in bytes.
Dim breader As New BinaryReader(fsreader) 'Wraps `fsreader` in a BinaryReader so the file can be read as raw bytes.
Dim imgbuffer(fsreader.Length) As Byte 'Declares the byte array that will hold the image data. In VB.NET the number in the brackets is the upper bound, so this array has `fsreader.Length + 1` elements, one more than needed.
breader.Read(imgbuffer, 0, fsreader.Length) 'Reads `fsreader.Length` bytes from the file into `imgbuffer`. The number `0` is the index in `imgbuffer` at which the first byte read is stored; it is not a position in the file. Can you use a value larger than `0`? Yes, but then the bytes are stored shifted inside the buffer (and the read fails if they no longer fit), so the image will be corrupt or differ from the original. Use `0` so the buffer is filled from its start.
fsreader.Close() 'Closes the file so it is no longer held open.
So, what's next?
From that snippet, you can continue by saving the imgbuffer variable to MS Access using a data access library available in VB.NET, such as OleDb. Lastly, I would prefer to wrap the image buffer in a MemoryStream and load it into a Bitmap variable.
Hope this explains your snippet.
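The role of the offset argument can be seen in a small sketch that uses only base class library types. ReadAllBytesFrom is a hypothetical helper, and a MemoryStream stands in for the FileStream opened on the image file. The array is declared with an upper bound of Length - 1 so that it holds exactly as many bytes as the stream.

```vbnet
Imports System
Imports System.IO

Module ReadImageBytesDemo
    ' Hypothetical helper: reads every byte of a stream into an array of exactly the right size.
    Function ReadAllBytesFrom(source As Stream) As Byte()
        ' VB.NET array bounds are upper bounds, so Length - 1 gives Length elements.
        Dim bytes(CInt(source.Length) - 1) As Byte
        Dim total As Integer = 0
        ' Read may return fewer bytes than requested, so continue where the last read stopped.
        ' The second argument is the index in the array at which new bytes are stored.
        While total < bytes.Length
            Dim got As Integer = source.Read(bytes, total, bytes.Length - total)
            If got = 0 Then Exit While
            total += got
        End While
        Return bytes
    End Function

    Sub Main()
        Using source As New MemoryStream(New Byte() {10, 20, 30})
            Dim data As Byte() = ReadAllBytesFrom(source)
            Console.WriteLine(data.Length)            ' 3
            Console.WriteLine(String.Join(",", data)) ' 10,20,30
        End Using
    End Sub
End Module
```

For a file on disk, File.ReadAllBytes(OpenFileDialog1.FileName) returns the same kind of array in a single call.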

How to extract text from PDF file with IDENTITY-H fonts using VB.NET

I have a PDF file.
I am reading text from the PDF file programmatically using the iTextSharp classes.
It reads text in ANSI encoding, but it does not read text in IDENTITY-H encoding.
My problem is how to read IDENTITY-H text from a PDF file using VB.NET.
Below is my code:
Public Function ReadPDFFile(ByVal strSource As String) As String
Dim sbPDFText As New StringBuilder() 'StringBuilder Object To Store read Text
If File.Exists(strSource) Then 'Does File Exist?
Dim pdfFileReader As New PdfReader(strSource) 'read File
For intCurrPage As Integer = 1 To pdfFileReader.NumberOfPages 'Loop Through All Pages
Dim lteStrategy As LocTextExtractionStrategy = New LocTextExtractionStrategy 'Read PDF File Content Blocks
'Get Text
Dim strCurrText As String = PdfTextExtractor.GetTextFromPage(pdfFileReader, intCurrPage, lteStrategy)
sbPDFText.Append(strCurrText) 'Add Text To String Builder
Next
pdfFileReader.Close() 'Close File
End If
Return sbPDFText.ToString() 'Return
End Function
Public Overridable Sub RenderText(ByVal renderInfo As TextRenderInfo) Implements ITextExtractionStrategy.RenderText
Dim segment As LineSegment = renderInfo.GetBaseline()
Dim location As New TextChunk(renderInfo.GetText(), segment.GetStartPoint(), segment.GetEndPoint(), renderInfo.GetSingleSpaceWidth())
If renderInfo.GetText = "" Then
Console.WriteLine(GetResultantText())
End If
With location
'Chunk Location:
Debug.Print(renderInfo.GetText)
.PosLeft = renderInfo.GetDescentLine.GetStartPoint(Vector.I1)
.PosRight = renderInfo.GetAscentLine.GetEndPoint(Vector.I1)
.PosBottom = renderInfo.GetDescentLine.GetStartPoint(Vector.I2)
.PosTop = renderInfo.GetAscentLine.GetEndPoint(Vector.I2)
'Chunk Font Size: (Height)
.curFontSize = .PosTop - segment.GetStartPoint()(Vector.I2)
'Use Font name and Size as Key in the SortedList
Dim StrKey As String = renderInfo.GetFont.PostscriptFontName & .curFontSize.ToString
'Add this font to ThisPdfDocFonts SortedList if it's not already present
If 1 = 1 Then
If Not ThisPdfDocFonts.ContainsKey(StrKey) Then ThisPdfDocFonts.Add(StrKey, renderInfo.GetFont)
'Store the SortedList index in this Chunk, so we can get it later
.FontIndex = ThisPdfDocFonts.IndexOfKey(StrKey)
Console.WriteLine(renderInfo.GetFont.ToString & "-->" & StrKey)
Else
'pcbContent.SetFontAndSize(BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED), 9)
.FontIndex = 3
.curFontSize = 8
End If
End With
locationalResult.Add(location)
End Sub
Thank you for sharing the PDF document. It helped us to determine that the problem you describe is not an iTextSharp problem. Instead it is a problem with the PDF document itself.
This problem doesn't have a solution, but I'm providing this answer to explain how you can discover for yourself that the problem also exists when iTextSharp isn't involved.
Open the document in Adobe Reader. Select the text "Muy señores nuestros" and copy/paste it into a text editor. You get "Muy señores nuestros". This is text that can be extracted using iTextSharp (it works correctly).
Now do the same with the text "GUARDIAN GLASS EXPRESS, S.L.". You get the following result: "􀀪􀀸􀀤􀀵􀀧􀀬􀀤􀀱􀀃􀀪􀀯􀀤􀀶􀀶􀀃􀀨􀀻􀀳􀀵􀀨􀀶􀀶􀀏􀀃􀀶􀀑􀀯􀀑". As you can see, you can not copy/paste the text correctly from Adobe Reader. This is due to the way the text is stored in the PDF. If you can not copy/paste the text from Adobe Reader, you should not expect to be able to extract the text using iTextSharp. The PDF is created in a way that doesn't allow extraction.
Please take a look at this video to find out some possible causes: https://www.youtube.com/watch?v=wxGEEv7ibHE
I'm sorry that it took so long to figure this out and that it turns out that you're asking something that isn't possible. Your question narrowed the problem down too much, as if the problem was caused by the "IDENTITY-H" encoding and iTextSharp. In reality, you're trying to extract text that can't be extracted.
If you look at the page dictionary inside the PDF, you'll find three font resources for the first (and only) page:
In the content stream, you see two strings (in hexadecimal notation) that are shown using fonts referenced by the names C2_0 and C2_1. Incidentally, these fonts are stored as composite fonts with /Subtype /Type0 and /Encoding /Identity-H. This means that the characters used in the hexadecimal string should correspond with the UNICODE values of the glyphs. If that's not the case, you're out of luck.
There seems to be no problem with the font for which the name /TT0 is used.
The fact that /TT0 uses WinAnsiEncoding and the other fonts use Identity-H is irrelevant. There are plenty of PDF files with fonts that use Identity-H of which the text can be copy/pasted or extracted using iTextSharp. Unfortunately, there is probably something wrong with the way your PDF was constructed. It would take too much time to analyze what went wrong, so your best shot is to contact the person who gave you the PDF and to ask him/her to fix the PDF.

Issues with VB and Aspose

I have the following lines of code:
Dim ms As New MemoryStream(my_memory_stream)
Dim workbook As New Workbook(ms)
workbook.Save("C:\book1.xlsx")
My purpose is to save the stream contained in my_memory_stream into an xlsx file named "book1". The problem is that when I run this code an exception occurs (Invalid Excel2007Xlsx file format).
Does anyone know what I'm doing wrong?
Thanks so much!
Most probably, the problem is the base64 format. It is in the form of ASCII characters. You need to convert the Base64 data to binary. And then load the binary data in Workbook class.
' Decode base64 (base64Data holds the Base64 characters as a Char array)
Dim binaryBytes As Byte() = Convert.FromBase64CharArray(base64Data, 0, base64Data.Length)
' Load in MemoryStream
Dim binaryStream As New MemoryStream(binaryBytes)
' Pass the decoded memory stream (not the original Base64 stream) to Workbook
Dim workbook As New Aspose.Cells.Workbook(binaryStream)
workbook.Save(dataDir + "workbook-out.xlsx")
Also note that, if your decoding is correct, you can directly save the binary byte or stream to disk and verify that the Excel is saved correctly.
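A quick way to tell the two cases apart, sketched with base class library calls only: an .xlsx file is a ZIP archive, so correctly decoded data begins with the bytes "PK", while data that is still Base64 text does not. StartsWithZipSignature is a hypothetical helper, and the sample input is made up for illustration.

```vbnet
Imports System
Imports System.Text

Module XlsxBytesCheck
    ' Hypothetical helper: True when the data begins with the ZIP signature bytes "PK".
    Function StartsWithZipSignature(data As Byte()) As Boolean
        Return data IsNot Nothing AndAlso data.Length >= 2 AndAlso
               data(0) = &H50 AndAlso data(1) = &H4B
    End Function

    Sub Main()
        ' Sample input: the text "PK demo" encoded as Base64, standing in for the real data.
        Dim base64Text As String = Convert.ToBase64String(Encoding.ASCII.GetBytes("PK demo"))

        ' The Base64 text itself is just ASCII characters and fails the check...
        Console.WriteLine(StartsWithZipSignature(Encoding.ASCII.GetBytes(base64Text))) ' False

        ' ...while the decoded bytes pass it.
        Dim decoded As Byte() = Convert.FromBase64String(base64Text)
        Console.WriteLine(StartsWithZipSignature(decoded)) ' True
    End Sub
End Module
```

Passing the check does not prove the workbook is valid; it only shows that the bytes are no longer Base64 text.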
PS. I am a Developer Evangelist at Aspose.
It seems that you are trying to save a stream directly to an Excel file. Excel files have a fixed format, and an arbitrary stream will not adhere to it.
I think you would like to use the Workbook object of the Microsoft.Office.Interop.Excel namespace.

VB.net will not read text file correctly

I've been trying to use StreamReader to read a log file. I cannot verify what it is encoded in, as when I open it in notepad++ and select ANSI encoding, I get this result:
I'm getting the characters needed when using ANSI but they are followed by things like [NULL][EOT][SOH][NUL][SI]
When I try and read the file in VB (using StreamReader or ReadAll) with ANSI encoding selected the resulting string I get back is completely wrong.
How could I read a file like this in VB.net?
You could use the IO.File.ReadAllText(path As String, encoding As System.Text.Encoding) method:
Dim textFromFile As String = IO.File.ReadAllText("C:\Users\Jason\Desktop\login20130417.rdb", System.Text.Encoding.ASCII) 'Or Unicode, UTF32, UTF8, UTF7, BigEndianUnicode or Default. Default is ANSI.
If you still don't get the text you need with the default encoding (ANSI), you can try each of the other encodings.
Update...
It appears that your file is corrupt. Using the code below, I was able to get a binary representation of whatever is in the file.

The massive amount of null data would suggest that the file is corrupt, which would also explain why we are not getting a lot of data whenever we try to read the file.
The code:
Dim fileData As String = IO.File.ReadAllText("C:\Users\Jason\Desktop\login20130417.rdb")
Dim i As Integer = 0
Dim binaryData As String = ""
Dim ch As Char
Do Until i = fileData.Length
ch = fileData.Chars(i)
'Append the character code as a binary string, padded to at least 8 digits
binaryData = binaryData & System.Convert.ToString(AscW(ch), 2).PadLeft(8, "0"c)
i = i + 1
Loop
As @Daniel A. White suggested in his comment, that file does not appear to be encoded like a "normal" text file. A StreamReader will not work in this situation. I would attempt to use a BinaryReader.
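Before choosing a reader, it can help to look at the raw bytes without any text decoding. The sketch below uses only base class library calls; DescribeBytes is a hypothetical helper, and the sample bytes stand in for the result of File.ReadAllBytes on the log file (a BinaryReader over a FileStream would give the same bytes).

```vbnet
Imports System
Imports System.Text

Module BinaryPeek
    ' Hypothetical helper: formats the first maxBytes bytes as hex, with printable ASCII alongside.
    Function DescribeBytes(data As Byte(), maxBytes As Integer) As String
        Dim count As Integer = Math.Min(maxBytes, data.Length)
        Dim hexPart As New StringBuilder()
        Dim textPart As New StringBuilder()
        For i As Integer = 0 To count - 1
            hexPart.Append(data(i).ToString("X2")).Append(" "c)
            ' Show control bytes such as NUL, SOH and EOT as dots.
            textPart.Append(If(data(i) >= 32 AndAlso data(i) < 127, ChrW(data(i)), "."c))
        Next
        Return hexPart.ToString().TrimEnd() & " | " & textPart.ToString()
    End Function

    Sub Main()
        ' Sample bytes: "A", NUL, EOT, SOH, "B".
        Dim sample As Byte() = {&H41, &H0, &H4, &H1, &H42}
        Console.WriteLine(DescribeBytes(sample, 16)) ' 41 00 04 01 42 | A...B
    End Sub
End Module
```

Readable text interleaved with control bytes, as in the question, is what a binary record format typically looks like in such a dump.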
Rdb file? Never heard of it. A quick Google search makes it less clear: n64 database file, Darkbot, etc.
However, considering the file name you have, and the general look of the opened file, I would say it's a binary file.
If you want to read the file in VB.NET you'll need a library of some sort, and I can't help you with one until you are able to shed some light on what the file may be, or what it was created with.