Finding and removing duplicates from a text file using VB.NET

I have a huge text file that contains a large number of duplicated entries. The duplicates look like this:
Total Posts 16
Pin Code = GFDHG
TITLE = Shop Signs/Projection Signs/Industrial Signage/Restaurant signs/Menu Boards&Box in London
DATE = 12-09-2012
Tracking Key # 85265E712050-15207427406854753
Total Posts 16
Pin Code = GFDHG
TITLE = Shop Signs/Projection Signs/Industrial Signage/Restaurant signs/Menu Boards&Box in London
DATE = 12-09-2012
Tracking Key # 85265E712050-15207427406854753
Total Posts 2894
Pin Code = GFDHG
TITLE = Shop Signs/Projection Signs/Industrial Signage/Restaurant signs/Menu Boards&Box in London
DATE = 15-09-2012
Tracking Key # 85265E712050-152797637654753
Total Posts 2894
Pin Code = GFDHG
TITLE = Shop Signs/Projection Signs/Industrial Signage/Restaurant signs/Menu Boards&Box in London
DATE = 15-09-2012
Tracking Key # 85265E712050-152797637654753
and so on; there are up to 4000 Total Posts entries in this text file. I want my program to match a Total Posts entry (e.g. Total Posts 6) against every other Total Posts entry in the file and, wherever it finds a duplicate, programmatically remove that duplicate and also delete the next 7 lines that follow it. Thank you.

Assuming the formatting is consistent (i.e. each logged event in the file uses 6 total lines of text), then if you're looking to remove duplicates from the file you just need to do something like this:
Sub DupClean(ByVal fpath As String) 'fpath is the FULL file path, i.e. C:\Users\username\Documents\filename.txt
Dim OrigText As String = ""
Dim CleanText As String = ""
Dim CText As String = ""
Dim SReader As New System.IO.StreamReader(fpath, System.Text.Encoding.UTF8)
Dim TxtLines As New List(Of String)
Dim i As Long = 0
Dim writer As New System.IO.StreamWriter(Left(fpath, fpath.Length - 4) & "_clean.txt", False) 'to overwrite the text inside the same file simply use StreamWriter(fpath)
Try
'Read in the text
OrigText = SReader.ReadToEnd
SReader.Close() 'release the file handle once the text is in memory
'Parse the text at new lines to allow selecting groups of 6 lines
TxtLines.AddRange(Split(OrigText, Chr(10))) 'may need to change the Chr # to look for depending on if 10 or 13 is used when the file is generated
Catch ex As Exception
MsgBox("Encountered an error while reading in the text file contents and parsing them. Details: " & ex.Message, vbOKOnly, "Read Error")
End
End Try
Try
'Now we iterate through blocks of 6 lines
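Do While i + 5 < TxtLines.Count 'only process complete blocks of 6 so i + 5 never runs past the end (e.g. when the file ends with a newline)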
'Set CText to the next 6 lines of text
CText = TxtLines.Item(i) & Chr(10) & TxtLines.Item(i + 1) & Chr(10) & TxtLines.Item(i + 2) & Chr(10) & TxtLines.Item(i + 3) & Chr(10) & TxtLines.Item(i + 4) & Chr(10) & TxtLines.Item(i + 5)
'Check if CText is already present in CleanText
If Not (CleanText.Contains(CText)) Then
'Add CText to CleanText
If CleanText.Length = 0 Then
CleanText = CText
Else
CleanText = CleanText & Chr(10) & CText
End If
End If 'else the text is already present and we don't need to do anything
i = i + 6
Loop
Catch ex As Exception
MsgBox("Encountered an error while cleaning duplicates from the text that was read in. The application was on line " & i & " of the text when the following error was thrown: " & ex.Message, _
vbOKOnly, "Comparison Error")
End
End Try
Try
'Write out the clean text
writer.Write(CleanText)
Catch ex As Exception
MsgBox("Encountered an error writing the cleaned text. Details: " & ex.Message & Chr(10) & Chr(10) & "The cleaned text was " & CleanText, vbOKOnly, "Write Error")
Finally
'Close the writer so the buffered text is actually flushed to disk
writer.Close()
End Try
End Sub
If the format isn't consistent you'll need to get fancier and define rules to tell which lines to add to CText on any given pass through the loop, but without context I wouldn't be able to give you any ideas as to what those may be.
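If the number of lines per entry turns out not to be fixed, one possible approach is to treat every line that starts with "Total Posts" as the start of a new record and remember the records already seen in a HashSet. The following is only a sketch under that assumption; the names RemoveDuplicateRecords and FlushRecord are made up for illustration and are not part of the code above.

```vbnet
Imports System
Imports System.Collections.Generic

Module DedupDemo
    ' Groups lines into records that each begin with a "Total Posts" line
    ' (assumed record marker) and keeps only the first copy of each record.
    Function RemoveDuplicateRecords(lines As IEnumerable(Of String)) As List(Of String)
        Dim seen As New HashSet(Of String)
        Dim output As New List(Of String)
        Dim current As New List(Of String)

        For Each line As String In lines
            ' A new marker line closes the record collected so far.
            If line.StartsWith("Total Posts") AndAlso current.Count > 0 Then
                FlushRecord(current, seen, output)
            End If
            current.Add(line)
        Next
        FlushRecord(current, seen, output)
        Return output
    End Function

    ' Emits the collected record only if an identical one has not been seen yet.
    Private Sub FlushRecord(current As List(Of String), seen As HashSet(Of String), output As List(Of String))
        If current.Count = 0 Then Return
        Dim key As String = String.Join(vbLf, current)
        If seen.Add(key) Then output.AddRange(current)
        current.Clear()
    End Sub

    Sub Main()
        Dim sample() As String = {"Total Posts 16", "Pin Code = GFDHG",
                                  "Total Posts 16", "Pin Code = GFDHG",
                                  "Total Posts 2894", "Pin Code = GFDHG"}
        For Each line As String In RemoveDuplicateRecords(sample)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

For a real file, the lines could come from File.ReadLines and the result could be written with File.WriteAllLines to a different output path. The HashSet lookup also avoids searching an ever-growing string for each block, which may matter on a file with thousands of entries.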

Related

Read message from text file and display in message box in vb.net

JavaError.128 = "project creation failed. & vbLf & Please try again and if the problem persists then contact the administrator"
I am able to read this message from the text file. The issue is that vbLf is not treated as a newline in the message box; it prints the literal text vbLf instead.
Using sr As System.IO.StreamReader = My.Computer.FileSystem.OpenTextFileReader(errorfilePath)
While ((sr.Peek() <> -1))
line = sr.ReadLine
If line.Trim().StartsWith("JavaError." & output) Then
isValueFound = True
Exit While
End If
End While
sr.Close()
End Using
If isValueFound Then
Dim strArray As String() = line.Split("="c)
MsgBox(strArray(1).Replace("""", "").Trim({" "c}))
End If
You can reduce all of your code to a simpler single statement using File.ReadAllLines and LINQ. This code puts every line starting with JavaError into the textbox, not just the first:
textBox.Lines = File.ReadAllLines(errorFilePath) _
.Where(Function(s) s.Trim().StartsWith("JavaError")) _
.Select(Function(t) t.Substring(t.IndexOf("= ") + 2).Replace(" & vbLf & ", Environment.NewLine)) _
.ToArray()
You need Imports System.IO and Imports System.Linq for this.
This code reads all the lines of the file into an array and uses LINQ to pull out only those starting with JavaError. It then projects a new string from everything after the = with " & vbLf & " replaced by a newline, converts the enumerable projection to an array of strings, and assigns it to the textbox's Lines.
If you don't want all the lines but instead only the first:
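Dim firstMatch As String = File.ReadLines(errorFilePath) _
.FirstOrDefault(Function(s) s.Trim().StartsWith("JavaError"))
'firstMatch is Nothing when no line matches; the ?. then skips Substring and Replace
textBox.Text = firstMatch?.Substring(firstMatch.IndexOf("= ") + 2).Replace(" & vbLf & ", Environment.NewLine)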
This one uses ReadLines instead of ReadAllLines. ReadLines works progressively, so it can stop reading as soon as the first match is found rather than carrying the overhead of reading ALL (say, a million) lines only to pull the first one out and throw 999,999 lines of effort away. It reads line by line and pulls out the first line that starts with "JavaError" (or Nothing if there is no such line). It then checks whether Nothing came out (the ?) and skips the Substring if so; otherwise it takes everything after the = and replaces vbLf with a newline.
For a straight up mod of your original code:
Using sr As System.IO.StreamReader = My.Computer.FileSystem.OpenTextFileReader(errorfilePath)
While ((sr.Peek() <> -1))
line = sr.ReadLine
If line.Trim().StartsWith("JavaError." & output) Then
isValueFound = True
line = line.Replace(" & vbLf & ", Environment.NewLine)
'^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ added code
Exit While
End If
End While
sr.Close()
End Using
If isValueFound Then
Dim strArray As String() = line.Split("="c)
MsgBox(strArray(1).Replace("""", "").Trim({" "c}))
End If
Note that I've always made my replacement on " & vbLf & " with a space at each end, to avoid stray spaces being left behind. If your file sometimes doesn't have them, consider using Regex to do the replace, e.g. Regex.Replace(line, " ?& vbLf & ?", Environment.NewLine)
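As a small standalone sketch of the Regex idea: the pattern below is my own, slightly more tolerant variant (it allows any amount of whitespace around the ampersands), and the sample text is taken from the question. Treat it as an illustration rather than the exact pattern you must use.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module VbLfDemo
    Sub Main()
        ' Text as it would be read from the file, with a literal "vbLf" token in it.
        Dim raw As String = "project creation failed. & vbLf & Please try again"

        ' Replace the token, the ampersands and any surrounding whitespace with a real newline.
        Dim shown As String = Regex.Replace(raw, "\s*&\s*vbLf\s*&\s*", Environment.NewLine)

        ' Prints the message on two lines.
        Console.WriteLine(shown)
    End Sub
End Module
```

The same Regex.Replace call could be applied to the string just before it is passed to MsgBox.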
This could work:
Dim txtFile As String = "project creation failed. & vbLf & Please try again and if the problem persists then contact the administrator"
Dim arraytext() As String = txtFile.Split("&")
Dim txtMsgBox As String = Nothing
For Each row As String In arraytext
If Trim(row) = "vbLf" Then
txtMsgBox = txtMsgBox & vbLf
Else
txtMsgBox = txtMsgBox & Trim(row)
End If
Next
MsgBox(txtMsgBox)

vb.net WPF application combobox not waiting for user to choose

I have a WPF VB.NET application (written in VS 2019 Community) that performs two website searches. The first is a string search that produces either one result, which I then pass to the second search, or more than one result, which I load into a combobox so the user can choose. My problem is getting the application to stop and wait for the user to choose from the combobox. As a workaround I have used a modal form containing a combobox, which lets the user choose and supplies the value to the second search. I have been advised to use the 'change' event for the combobox, but there isn't one available. I have also been advised to use SelectedIndexChanged, but the control doesn't let the dropdown list open to select anything. I have also tried various forms of AddHandler (commented out in the code below).
' Build the 'Search API' URL.
Dim uri = New Uri("https://api.themoviedb.org/3/search/tv?" _
& "api_key=" & TMDBAPIKey _
& "&language=en-US" _
& "&query=" & sLvl1NodeName _
& "&page=1" _
& "&first_air_date_year=" & sFirstXmitYear)
' Retrieve the IMDB ID with an API Search function using the series title
Try
Dim Site = New WebClient()
Answer = Site.DownloadString(uri)
Catch ex As NullReferenceException
Dim messagetext As String = "The 'Search API' from GetDetails popup failed with : " _
& ex.Message & " for: Title=" & sLvl1NodeName
Me.txtErrorMessageBox.Text = messagetext
Exit Sub
End Try
' Deserialise the answer
Dim JsonElem As TMDBtitle = JsonConvert.DeserializeObject(Of TMDBtitle)(Answer)
' If the websearch finds only one result this is the TV series we want, if more than
' one result is found load the results into a combobox and get the user to choose.
If JsonElem.results.Length = 1 Then
TVSeriesID = JsonElem.results(0).id
Else
Me.cmbChooseSeries.BeginUpdate()
Me.lblChooseSeries.Text = Me.lblChooseSeries.Text & "( " & JsonElem.results.Length & " )"
Me.cmbChooseSeries.Items.Clear()
For Each titleresult In JsonElem.results
ComboSeriesChoice = titleresult.name & " | " &
titleresult.id & " | " &
titleresult.first_air_date & " | " &
titleresult.overview
Me.cmbChooseSeries.Items.Add(ComboSeriesChoice)
Next
cmbChooseSeries.DroppedDown = True
Me.cmbChooseSeries.EndUpdate()
If cmbChooseSeries.SelectedIndex <> -1 Then
Dim var1 = cmbChooseSeries.SelectedText
Else
Threading.Thread.Sleep(3000)
End If
'AddHandler cmbChooseSeries.MouseDoubleClick,
'Sub()
'Threading.Thread.Sleep(3000)
'End Sub
TVSeriesID = cmbChooseSeries.SelectedItem
End If
' Build the 'TV Search API' call URL.
Dim urix = New Uri("https://api.themoviedb.org/3/tv/" _
& TVSeriesID & "?" _
& "api_key=" & TMDBAPIKey _
& "&language=en-US")
Try
Dim site = New WebClient()
Answer = site.DownloadString(urix) ' download the JSON from the server.
Catch ex As NullReferenceException
Dim MessageText As String = "The 'TV Search API' from GetDetails popup failed with : " _
& ex.Message & " for: Title=" & sLvl1NodeName & " ID=" & popupid
Me.txtErrorMessageBox.Text = MessageText
Exit Sub
End Try
Dim jsonelemx = JsonConvert.DeserializeObject(Of TVResult)(Answer)
lstDetailItems(0) = "Name"
lstDetailItems(1) = jsonelemx.name
lstDetailItems(2) = (String.Empty)
Dim DelItems = New ListViewItem(lstDetailItems)
Me.lstSeriesDetails.Items.Add(DelItems)
lstDetailItems(0) = "Status"
lstDetailItems(1) = jsonelemx.status
lstDetailItems(2) = (String.Empty)
DelItems = New ListViewItem(lstDetailItems)
Me.lstSeriesDetails.Items.Add(DelItems)
lstDetailItems(0) = "Episode run time"
lstDetailItems(1) = Convert.ToString(jsonelemx.episode_run_time(0))
lstDetailItems(2) = (String.Empty)
DelItems = New ListViewItem(lstDetailItems)
Me.lstSeriesDetails.Items.Add(DelItems)

Prevent the `vbLf` being included in a `String.Length` calculation

I have a Collection of servers which I take from a multiline textbox.
I have some simple validation which should be trimming whitespace (to prevent an entry being created for blank line), but it's not working.
For example, if Me.formServers.txtServers.Text is as follows, the lengths of the lines are returned as 5, 5, 5 and 4. How can I correctly calculate the length of each line and thus avoid erroneous items being added to my Collection? Thanks.
TTSA
TTSB
TTSC
TTSD
Here is my code
Me.Servers = New Collection ' Reset 'Servers' to ensure only the correct servers are included
For Each Server As String In Me.formServers.txtServers.Text.Split(vbLf)
If Not Server.Trim.Length = 0 Then Me.Servers.Add(Server)
MsgBox(Server.Length)
Next
This test was interesting:
Dim s As String = "TTSA" & vbCrLf
s &= "TTSB" & vbLf
s &= "TTSC" & vbCr
s &= "TTSD" & Environment.NewLine
s &= "TTSE" & vbNewLine
Dim Excluded() As String
Excluded = s.Split({vbCrLf, vbLf}, StringSplitOptions.None)
For Each s In Excluded
Debug.Print(s & " " & s.Length)
Next
result:
TTSA 4
TTSB 4
TTSC ' vbCr was not in list so is still in the string
TTSD 9
TTSE 4
0 ' last separator honored
Corrected to:
Excluded = s.Split({vbCrLf, vbLf, vbCr}, _
StringSplitOptions.RemoveEmptyEntries)
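Applied to the original question: a multiline textbox typically separates lines with vbCrLf, so splitting on vbLf alone leaves a trailing vbCr on every line but the last, which matches the reported lengths of 5, 5, 5 and 4. The original loop also tests Server.Trim but adds the untrimmed Server. A minimal sketch of a fix, assuming a hypothetical helper named ParseServers and a plain List(Of String) in place of the Collection:

```vbnet
Imports System
Imports System.Collections.Generic

Module ServerListDemo
    ' Hypothetical helper: splits multiline text on any line-ending style
    ' and returns only the trimmed, non-blank entries.
    Function ParseServers(text As String) As List(Of String)
        Dim result As New List(Of String)
        Dim separators() As String = {vbCrLf, vbLf, vbCr}
        For Each entry As String In text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            Dim name As String = entry.Trim()
            ' Lines that contain only spaces survive the split, so check again after trimming.
            If name.Length > 0 Then result.Add(name)
        Next
        Return result
    End Function

    Sub Main()
        Dim sample As String = "TTSA" & vbCrLf & "TTSB" & vbCrLf & "  " & vbCrLf & "TTSD"
        For Each server As String In ParseServers(sample)
            Console.WriteLine(server & " " & server.Length)
        Next
    End Sub
End Module
```

With this sample the whitespace-only line is dropped and each remaining entry reports a length of 4.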

VB Saving Listview Items Error

The code itself works. The problem occurs when an item is missing one of its sub-items (one without text): the program crashes. I'm looking for a way to avoid this error.
My Code:
If ComboBox1.Text = "Everything" Then
Dim SetSave As SaveFileDialog = New SaveFileDialog
SetSave.Title = ".txt"
SetSave.Filter = ".txt File (*.txt)|*.txt"
If SetSave.ShowDialog = Windows.Forms.DialogResult.OK Then
Dim s As New IO.StreamWriter(SetSave.FileName, False)
For Each myItem As ListViewItem In Form1.ListView1.Items
s.WriteLine(myItem.Text & TextBox1.Text & myItem.SubItems(1).Text & TextBox1.Text & myItem.SubItems(2).Text & TextBox1.Text & myItem.SubItems(3).Text & TextBox1.Text & myItem.SubItems(4).Text & TextBox1.Text & myItem.SubItems(5).Text & TextBox1.Text & myItem.SubItems(6).Text & TextBox1.Text & myItem.SubItems(7).Text) '// write Item and SubItem.
Next
s.Close()
End If
Error: (this points at the ListView sub-item without text; it can be any number from 1 up to 7, and in the example below it is 5)
InvalidArgument=Value of '5' is not valid for 'index'.
Parameter name: index
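Your indexing is starting at 1. VB indexing starts at 0, so for 5 items you would have index values of 0 to 4. For a ListViewItem, SubItems(0) is the item's own text, and any index beyond the sub-items that were actually added throws this error.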
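One way to sidestep the fixed indexes is to join whatever cells a row actually has and pad the rest with empty strings. The sketch below uses plain string arrays so that it runs without a form; JoinRow and its parameters are hypothetical names. In the WinForms code the cells could presumably be obtained with myItem.SubItems.Cast(Of ListViewItem.ListViewSubItem)().Select(Function(si) si.Text).

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module RowJoinDemo
    ' Hypothetical helper: joins however many cells a row really has and pads
    ' with empty strings up to columnCount, so every output line has the same shape.
    Function JoinRow(cells As IEnumerable(Of String), delimiter As String, columnCount As Integer) As String
        Dim padded As List(Of String) = cells.Take(columnCount).ToList()
        While padded.Count < columnCount
            padded.Add(String.Empty)
        End While
        Return String.Join(delimiter, padded)
    End Function

    Sub Main()
        ' A row with only 5 of the expected 8 cells filled in.
        Dim row() As String = {"a", "b", "c", "d", "e"}
        Console.WriteLine(JoinRow(row, ";", 8))
    End Sub
End Module
```

Here the short row is written as a;b;c;d;e;;; instead of raising an index error.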

Vb.net Journal Program Issue

For an internship project I'm making a journal program with StreamWriters and StreamReaders.
I have it to where you can create an account with a name, username, and password, and it creates a .txt file in that person's name when the account is created. The user then logs in and is taken to the journal page. The journal page has, for the most part, a date for the journal entry, the title of the entry, and the entry text itself.
The problem I am having is with the button that creates/edits a journal entry. When you click it, it runs a subroutine that checks whether a journal entry already exists for that date. If it doesn't exist, it should create a new one at the bottom of the text file. If it does exist, it should edit the lines in the text file where that entry is stored.
Code:
Private Sub CreateBtn_Click(sender As System.Object, e As System.EventArgs) Handles CreateBtn.Click
Errors = ""
Dim TempCounter As Integer = 0
If TitleTxt.Text = "" Then
Errors = "You must enter a title." & vbCrLf
End If
If JournalTextRtxt.Text = "" Then
Errors &= "You must enter an entry for the journal."
End If
If Errors <> "" Then
MessageBox.Show("There's an error in creating/editing your journal." & vbCrLf & "Error(s):" & vbCrLf & Errors, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error)
Else
JournalDate = DateTimePicker1.Value
JournalTitle = TitleTxt.Text
JournalText = JournalTextRtxt.Text
arrJournalEntries(TempCounter).TheDate = JournalDate
arrJournalEntries(TempCounter).Title = JournalTitle
arrJournalEntries(TempCounter).JournalEntry = JournalText
CheckAndWrite()
End If
End Sub
Private Sub CheckAndWrite()
Dim Reader As New StreamReader(MyName & ".txt", False)
Dim Sline As String = Reader.ReadLine
Counter = 0
Do Until (Sline Is Nothing) 'Perform the code until the line in the text file is blank
If Not Sline Is Nothing Then 'If the line in the text file is NOT blank then
For i As Integer = 1 To 3
Select Case i
Case 1
arrJournalEntries(Counter).TheDate = Sline
Sline = Reader.ReadLine
Case 2
arrJournalEntries(Counter).Title = Sline
Sline = Reader.ReadLine
Case 3
arrJournalEntries(Counter).JournalEntry = Sline
Sline = Reader.ReadLine
End Select
Next
End If
JournalDate = arrJournalEntries(Counter).TheDate
Time = DateTimePicker1.Value
MsgBox("Journal Date = " & JournalDate & vbCrLf & "Today's Date = " & Time)
If Time = JournalDate Then
JournalFound = True
Else
Counter += 1
JournalFound = False
End If
Loop
Reader.Close()
Try
If Sline Is Nothing Or JournalFound = False Then
MsgBox("Your journal is now going to be created.")
JournalDate = DateTimePicker1.Value
JournalTitle = TitleTxt.Text
JournalText = JournalTextRtxt.Text
arrJournalEntries(Counter).TheDate = JournalDate
arrJournalEntries(Counter).Title = JournalTitle
arrJournalEntries(Counter).JournalEntry = JournalText
Dim Writer As New StreamWriter(MyName & ".txt", True)
Do Until (arrJournalEntries(Counter).TheDate = Nothing)
Writer.WriteLine(arrJournalEntries(Counter).TheDate)
Writer.WriteLine(arrJournalEntries(Counter).Title)
Writer.WriteLine(arrJournalEntries(Counter).JournalEntry)
Counter += 1
Loop
Writer.Close()
End If
If JournalFound = True Then
MsgBox("Your journal is now going to be edited.")
JournalDate = DateTimePicker1.Value
JournalTitle = TitleTxt.Text
JournalText = JournalTextRtxt.Text
arrJournalEntries(Counter).TheDate = JournalDate
arrJournalEntries(Counter).Title = JournalTitle
arrJournalEntries(Counter).JournalEntry = JournalText
Dim Writer As New StreamWriter(MyName & ".txt", True)
Do Until (arrJournalEntries(Counter).TheDate = Nothing)
Writer.WriteLine(arrJournalEntries(Counter).TheDate)
Writer.WriteLine(arrJournalEntries(Counter).Title)
Writer.WriteLine(arrJournalEntries(Counter).JournalEntry)
Counter += 1
Loop
Writer.Close()
End If
Catch ex As Exception
MessageBox.Show("An error has occured" & vbCrLf & vbCrLf & "Original Error:" & vbCrLf & ex.ToString)
End Try
End Sub
The problem is that it's not only writing wrongly the first time. When it's supposed to say it's going to edit, it doesn't; it just says it is creating, and then it simply adds on to the file. This is what occurred after pressing the button 3 times with the current date, the title "Test title", and the journal entry text "Test text".
It should just be
7/10/2012 3:52:08 PM
Test title
Test text
7/10/2012 3:52:08 PM
Test title
Test text
the whole way through, but of course if it's the same date then it should just overwrite the entry. Can anybody please help me?
You are only filtering your array by the date, so it looks like you have an object with a date but no title or text:
Do Until (arrJournalEntries(Counter).TheDate = Nothing)
The "quick" fix:
Do Until (arrJournalEntries(Counter).TheDate = Nothing)
If arrJournalEntries(Counter).Title <> String.Empty Then
Writer.WriteLine(arrJournalEntries(Counter).TheDate)
Writer.WriteLine(arrJournalEntries(Counter).Title)
Writer.WriteLine(arrJournalEntries(Counter).JournalEntry)
End If
Counter += 1
Loop
Do consider getting rid of the array and using a List(Of JournalEntry) instead. Your code looks difficult to maintain in its current state.
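A rough sketch of what the List(Of JournalEntry) version might look like. The JournalEntry class and the AddOrUpdate helper are hypothetical, and the sketch assumes that "same date" means the same calendar day, which is why it compares .Date rather than the full timestamp.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical type standing in for the structure used in the question.
Public Class JournalEntry
    Public Property TheDate As Date
    Public Property Title As String
    Public Property Text As String
End Class

Module JournalDemo
    ' Replaces the entry for the same calendar day if one exists, otherwise appends.
    Sub AddOrUpdate(entries As List(Of JournalEntry), entry As JournalEntry)
        Dim index As Integer = entries.FindIndex(Function(e) e.TheDate.Date = entry.TheDate.Date)
        If index >= 0 Then
            entries(index) = entry
        Else
            entries.Add(entry)
        End If
    End Sub

    Sub Main()
        Dim entries As New List(Of JournalEntry)
        AddOrUpdate(entries, New JournalEntry With {.TheDate = #7/10/2012 3:52:08 PM#, .Title = "Test title", .Text = "Test text"})
        ' Same day, later time: this edits the existing entry instead of adding a second one.
        AddOrUpdate(entries, New JournalEntry With {.TheDate = #7/10/2012 4:10:00 PM#, .Title = "Edited title", .Text = "Edited text"})
        Console.WriteLine(entries.Count & " " & entries(0).Title)
    End Sub
End Module
```

When saving, the whole list would then be written back with the StreamWriter opened in overwrite mode (False as the second argument). The code in the question opens it with True in both branches, which appends to the file and would explain why an "edit" only adds more lines.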