Comparing files not working as intended - vb.net

Hi guys, could someone explain to me why this does not work?
I basically have two text files called Books and NewBooks.
The text files are populated from a web request, and the info is then parsed into the text files. When I start the program, Books and NewBooks are identical, pretty much a copy of each other.
More web requests are done to update the NewBooks text file, and when I compare them, any line in NewBooks that is not in Books is added to a third text file called myNewBooks. My initial code, shown here, works as I expected:
Dim InitialBooks = File.ReadAllLines("Books.json")
Dim TW As System.IO.TextWriter
'Create a Text file and load it into the TextWriter
TW = System.IO.File.CreateText("myNewBooks.JSON")
Dim NewBooks = String.Empty
Using reader = New StreamReader("NewBooks.json")
Do Until reader.EndOfStream
Dim current = reader.ReadLine
If Not InitialBooks.Contains(current) Then
NewBooks = current & Environment.NewLine
TW.WriteLine(NewBooks)
TW.Flush()
'Close the File
End If
Loop
End Using
TW.Close() : TW.Dispose()
But part of each line in my text files is a URL, and sometimes I find the same book with a different URL, so I was getting duplicate entries of books because the URL was the only difference. So I thought I would split the string before the URL so that I just compare the title, description and region. FYI, a line in my text files looks similar to this:
{ "Title": "My Title Here", "Description": "My Description Here", "Region": "My Region Here", "Url": "My Url Here", "Image": "My Image Here" };
So a fellow today helped me figure out how to split my line so it looks more like this:
{ "Title": "My Title Here", "Description": "My Description Here", "Region": "My Region Here", "Url"
Which is great, but now when I compare, it does not see that the first line contains the split line, and I don't understand why. Here is the code after it was modified:
Dim InitialBooks = File.ReadAllLines("Books.json")
Dim TW As System.IO.TextWriter
'Create a Text file and load it into the TextWriter
TW = System.IO.File.CreateText("myNewBooks.JSON")
Dim NewBooks = String.Empty
Using reader = New StreamReader("NewBooks.json")
Do Until reader.EndOfStream
Dim current = reader.ReadLine
Dim splitAt As String = """Url"""
Dim index As Integer = current.IndexOf(splitAt)
Dim output As String = current.Substring(0, index + splitAt.Length)
If Not InitialBooks.Contains(output) Then
NewBooks = current & Environment.NewLine
TW.WriteLine(NewBooks)
TW.Flush()
'Close the File
End If
Loop
End Using
TW.Close() : TW.Dispose()
Your wisdom would be appreciated!!

Your OP is confusing.
If I understood correctly:
You have 3 files: Books, NewBooks and MyBooks.
You download data from the web; if that data is not located in Books, you add it to NewBooks, otherwise to MyBooks (duplicates).
Seeing that you are working with JSON, I would do it the following way.
Load the Books; when downloading data, check it and compare it with Books. Then write to the proper file.
Imports System.Web.Script.Serialization ' for reading of JSON (+add the reference to System.Web.Extensions library)
Dim JSONBooks = New JavaScriptSerializer().DeserializeObject(Books_string)
Inspect JSONBooks with a breakpoint to see how it looks.
When downloading data you can simply check whether a book exists in it, by title, URL or whatever you want.
Since you have shown only one book:
Debug.Print(JSONBooks("Title")) 'returns >>>My Title Here
When you have more
JSONBooks(x)("Title") 'where x is book number.
So you can loop over all books and check what you need.
JSON array looks like this (if you need to construct it)
[{book1},{book2},...]
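On the comparison in the question itself: calling Contains on a String array asks whether some whole line equals the argument, which can no longer be true once the line has been cut off at "Url". A minimal sketch of comparing the truncated parts instead, assuming every line has the layout shown in the question; BookKey and FindNewBooks are names made up for this example, not part of the original code:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module BookCompare
    ' Hypothetical helper: returns the part of a line up to and including "Url",
    ' or the whole line when "Url" is missing (IndexOf returns -1 in that case).
    Function BookKey(line As String) As String
        Dim splitAt As String = """Url"""
        Dim index As Integer = line.IndexOf(splitAt, StringComparison.Ordinal)
        If index < 0 Then Return line
        Return line.Substring(0, index + splitAt.Length)
    End Function

    ' Hypothetical helper: returns the lines of newBooks whose key does not
    ' match the key of any line in initialBooks.
    Function FindNewBooks(initialBooks As IEnumerable(Of String),
                          newBooks As IEnumerable(Of String)) As List(Of String)
        Dim knownKeys As New HashSet(Of String)(initialBooks.Select(Function(line) BookKey(line)))
        Return newBooks.Where(Function(line) Not knownKeys.Contains(BookKey(line))).ToList()
    End Function
End Module
```

With File.ReadAllLines("Books.json") and File.ReadAllLines("NewBooks.json") as the two inputs, the returned lines could then be written out with File.WriteAllLines("myNewBooks.JSON", ...). Both files are truncated the same way before comparing, so a book that differs only in its URL is treated as already known.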

Related

How do I know if the PdfTextExtractor produced reliable results?

I am using the code below to extract text from PDFs for bookkeeping purposes.
How would I know if the PDF was "well readable" and produced accurate results, or if it produced "garbage" output which would require using an OCR solution?
Currently I have to inspect each result manually and see if it resulted in
"Iin voicE #Ajk 932 2"
or
"Invoice #8793201".
Using nReader As iTextSharp.text.pdf.PdfReader = New iTextSharp.text.pdf.PdfReader(fileName)
For page As Integer = 1 To nReader.NumberOfPages
Dim strategy As New iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy
Dim currentText As String = iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(nReader, page, strategy)
currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)))
sb.Append(currentText)
Next
nReader.Close()
End Using

vb.net How to save File as Word & Open Office Document?

I have a small program and I want to save its data to a file so I can read it back in later when I open the program.
How can I save the file? I must save 5 variables and read them back into the tool, and if possible I want to use the file in Word or OpenOffice too.
My variables:
Title - Pieces - SinglePrice - TotalPrice
Please give me examples that point me in the right direction.
Thanks everyone!
If all you want to do is store four variables from your program in a file that can also be read by Word and OpenOffice, you can do that easily enough with a text file. The following code assumes that Title is a String, Pieces is an Integer, SinglePrice and TotalPrice are Decimal.
Dim folder As String = My.Computer.FileSystem.SpecialDirectories.MyDocuments
Dim saveFile As String = System.IO.Path.Combine(folder, "Save.txt")
Dim vars() As String = {Title, Pieces.ToString, SinglePrice.ToString, TotalPrice.ToString}
System.IO.File.WriteAllLines(saveFile, vars)
If you need to read the file to restore the values of the variables, you can do it like this. Note that this code assumes that the file was written by the first snippet of code; otherwise it would be necessary to validate the contents of the file before using it.
Dim folder As String = My.Computer.FileSystem.SpecialDirectories.MyDocuments
Dim saveFile As String = System.IO.Path.Combine(folder, "Save.txt")
Dim vars() As String = System.IO.File.ReadAllLines(saveFile)
Title = vars(0)
Pieces = CInt(vars(1))
SinglePrice = CDec(vars(2))
TotalPrice = CDec(vars(3))

How to read a text file from a url line by line

I am using VB.NET (I'm an amateur) and I am trying to make my program download files from my FTP server.
It should read a text file from a link, line by line; each line contains one word.
The text file is something like:
first
second
third
It must add each line's content to a link and then download it. After it downloads the first line it must go to the second, and so on.
I don't know if it's possible and I really hope someone can help me. Thank you.
This code does what you need:
Imports System.IO
Imports System.Net

Dim FileLink As String = "yourserver/file.txt" 'Link to your text file
Dim BaseLink As String = "yourserver/" 'Assumption: the link that each word is appended to
Dim Destination As String = "C:\" 'Download directory
Using Client As New WebClient(), Reader As New StreamReader(Client.OpenRead(FileLink))
    Do Until Reader.EndOfStream 'Read the list one line at a time
        Dim Word As String = Reader.ReadLine().Trim()
        If Word <> "" Then 'Skip blank lines
            'Link, destination file, id, password, show UI or not, timeout in ms, overwrite or not
            My.Computer.Network.DownloadFile(BaseLink & Word, Path.Combine(Destination, Word), "", "", False, 9800, True)
        End If
    Loop
End Using

How to make textfile to save in program directory in Visual Studio

As I have abandoned the array approach to the problem, I need to know how to save the ListBox contents to a text file that is always in the program's directory, so it can be used to populate a different ListBox. Any ideas? Below is my code.
SaveFileDialog1.Filter = "Text files (*.txt)|*.txt"
SaveFileDialog1.ShowDialog()
If SaveFileDialog1.FileName <> "" Then
Using SW As New IO.StreamWriter(SaveFileDialog1.FileName, False)
For Each itm As String In Me.ListBox1.Items
SW.WriteLine(itm)
Next
End Using
End If
A little bit of research on your part would've helped you understand what you are trying to accomplish better.
How do I get Program Data directory? My.Computer.FileSystem.SpecialDirectories.AllUsersApplicationData
How do I Write multiple lines to file? File.WriteAllLines()
How do I Read multiple lines from a file? File.ReadAllLines()
Once you understand the basics you can easily put them together
Create two List boxes, and one button on your WinForm:
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
'Get the Program Data Directory (This is hidden by default by the OS.)
Dim strPath As String = My.Computer.FileSystem.SpecialDirectories.AllUsersApplicationData
Dim fileName As String = "myFile.txt"
Dim fullPath = Path.Combine(strPath, fileName)
Dim data As String() = {"Item 1", "Item 2", "Item 3", "Item 4", "Item 5"}
'Save the items to ListBox1 First
For Each item As String In data
ListBox1.Items.Add(item)
Next
'Now write the items to the textfile, line by line.
File.WriteAllLines(fullPath, data)
'Read all lines we just saved and load them onto an array of strings.
Dim tempAllLines() As String = File.ReadAllLines(fullPath)
'Display each on ListBox2 by iterating the array.
For Each line As String In tempAllLines
ListBox2.Items.Add(line)
Next
End Sub
Here, I created this form so you can get an idea of what i'm referring to.
You can get the path to the current executable's folder like this:
folderPath = Path.GetDirectoryName(Assembly.GetEntryAssembly().Location)
However, that will only work if the executable is a .NET assembly. Otherwise, you could use the first argument in the command line (which is the full executable file path), like this:
folderPath = Path.GetDirectoryName(Environment.GetCommandLineArgs()(0))
If, on the other hand, you want to get the path of the current assembly (which may be different than the executable that loaded it) you could do this:
folderPath = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)
Or, if you want to just get the current directory, you could use this:
folderPath = Directory.GetCurrentDirectory()
Once you have the folder path, you can add the file name to it with Path.Combine, like this:
filePath = Path.Combine(folderPath, fileName)
However, it's not recommended that you write data directly to the program's running path, since the user may not have permission to write to that folder. Using the program data folder would certainly be better, but even that can be risky:
folderPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData), "MyAppName")
The recommended place to store data from .NET apps is Isolated Storage.
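As a rough sketch of what using Isolated Storage could look like for the ListBox items, here is one way to write and read lines through System.IO.IsolatedStorage. The module and the SaveLines/LoadLines names are made up for this example, and the per-user, per-assembly store is just one of several scopes:

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.IsolatedStorage

Module IsolatedStorageSketch
    ' Hypothetical helper: writes each string as one line of a file in isolated storage.
    Sub SaveLines(fileName As String, lines As IEnumerable(Of String))
        Using store As IsolatedStorageFile = IsolatedStorageFile.GetUserStoreForAssembly()
            Using stream As New IsolatedStorageFileStream(fileName, FileMode.Create, store)
                Using writer As New StreamWriter(stream)
                    For Each line As String In lines
                        writer.WriteLine(line)
                    Next
                End Using
            End Using
        End Using
    End Sub

    ' Hypothetical helper: reads the lines back; returns an empty list if the file is missing.
    Function LoadLines(fileName As String) As List(Of String)
        Dim result As New List(Of String)
        Using store As IsolatedStorageFile = IsolatedStorageFile.GetUserStoreForAssembly()
            If Not store.FileExists(fileName) Then Return result
            Using stream As New IsolatedStorageFileStream(fileName, FileMode.Open, store)
                Using reader As New StreamReader(stream)
                    Do Until reader.EndOfStream
                        result.Add(reader.ReadLine())
                    Loop
                End Using
            End Using
        End Using
        Return result
    End Function
End Module
```

The items of one ListBox could be passed to SaveLines, and the list returned by LoadLines added to the other ListBox, without the program needing to know or have write access to a specific folder.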

How do I search through a string for a particular hyperlink in Visualbasic.net?

I have written a program which downloads a webpage's source, but now I want to search the source for a particular link. I know the link is written like this:
<a href="/internet/A2"><b>Geographical Survey Work</b></a>
Is there any way of using "Geographical Survey Work" as the criterion to retrieve the link? The code I am using to download the source to a string is this:
Dim sourcecode As String = ((New Net.WebClient).DownloadString("http://examplesite.com"))
So just to clarify, I want to type "Geographical Survey Work" into an input box, for instance, and have "/internet/A2" pop up in a message box. I think it can be done using a regex, but that's a bit beyond me. Any help would be great.
With HTMLAgilityPack:
Dim vsPageHTML As String = "<html>... your webpage HTML code ...</html>"
Dim voHTMLDoc As New HtmlAgilityPack.HtmlDocument()
voHTMLDoc.LoadHtml(vsPageHTML) : vsPageHTML = ""
Dim vsURI As String = ""
'Select every <a> element that has an href attribute
Dim voNodes As HtmlAgilityPack.HtmlNodeCollection = voHTMLDoc.DocumentNode.SelectNodes("//a[@href]")
If Not IsNothing(voNodes) Then
    For Each voNode As HtmlAgilityPack.HtmlNode In voNodes
        'Compare the link's inner markup case-insensitively
        If voNode.InnerHtml.ToLower() = "<b>geographical survey work</b>" Then
            vsURI = voNode.GetAttributeValue("href", "")
            Exit For
        End If
    Next
End If
voNodes = Nothing : voHTMLDoc = Nothing
Do whatever you want with vsURI.
You might need to tweak the code a bit as I'm writing free-hand.