Reading multiple text files and copying all duplicate lines to a new text file in VB.NET

FIRST PROBLEM:
I am looking for the easiest way to read multiple text files at once, extract all the duplicate lines from each one, and copy those duplicate lines to a new text file under headings that name the parent text file. I am dealing with more than 20 text files at a time, and it is a mess to go through them one by one. The files are also large (roughly 30,000 lines each). I am currently using a program based on StreamReader and StreamWriter, and I would prefer the same approach so that I can follow it, but any new and easy way is welcome.
SECOND PROBLEM:
I want to compare 2 text files and copy the duplicate lines to a new text file. I don't want to remove, delete or overwrite anything, just copy the duplicates across to a new text file.
Please use openfiledialog and savefiledialog for the files.
Thanks a lot in advance.
Best Regards
VB_Learner
Dim optxtfile As New OpenFileDialog
optxtfile.RestoreDirectory = True
optxtfile.Multiselect = False
optxtfile.Filter = "txt files (*.txt)|*.txt"
optxtfile.FilterIndex = 1
optxtfile.ShowDialog()
If (Not optxtfile.FileName = Nothing) Then
    'Collect every distinct line once (this drops the duplicates instead of extracting them)
    Dim lines As New List(Of String)
    Using sr As New System.IO.StreamReader(optxtfile.FileName)
        While sr.Peek <> -1
            Dim line As String = sr.ReadLine()
            Dim isNew As Boolean = True
            For Each dupl As String In lines
                If (dupl = line) Then isNew = False
            Next
            'Add the line that was just read; calling ReadLine again here would skip a line
            If (isNew) Then lines.Add(line)
        End While
    End Using
    'Default to the original file so svDir is always assigned
    Dim svDir As String = optxtfile.FileName
    If (My.Computer.FileSystem.FileExists(optxtfile.FileName)) Then
        My.Computer.FileSystem.DeleteFile(optxtfile.FileName)
        Dim svtxtfile As New SaveFileDialog
        svtxtfile.RestoreDirectory = True
        svtxtfile.Filter = "txt files (*.txt)|*.txt"
        svtxtfile.FilterIndex = 1
        svtxtfile.ShowDialog()
        If (svtxtfile.FileName = Nothing) Then
            svDir = optxtfile.FileName
        Else
            svDir = svtxtfile.FileName
        End If
    End If
    Using write2text As New System.IO.StreamWriter(svDir)
        For Each line As String In lines
            write2text.WriteLine(line)
        Next
    End Using
End If
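One possible way to approach both problems is sketched below, keeping the StreamReader/StreamWriter style the asker prefers. The names FindDuplicateLines, WriteDuplicateReport and FindCommonLines, as well as the "=====" heading format, are made up for illustration. The sketch assumes that "duplicate" means an exact, case-sensitive match of the whole line, and that each file fits comfortably in memory (30,000 lines should).

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module DuplicateLineReport
    ' First problem: returns every line that occurs more than once in the file.
    ' Each duplicate is listed once, in the order its first repeat is met.
    Function FindDuplicateLines(filePath As String) As List(Of String)
        Dim seen As New HashSet(Of String)()
        Dim reported As New HashSet(Of String)()
        Dim duplicates As New List(Of String)()
        Using sr As New StreamReader(filePath)
            Dim line As String = sr.ReadLine()
            While line IsNot Nothing
                ' HashSet.Add returns False when the value is already present
                If Not seen.Add(line) AndAlso reported.Add(line) Then
                    duplicates.Add(line)
                End If
                line = sr.ReadLine()
            End While
        End Using
        Return duplicates
    End Function

    ' Writes one heading per input file, followed by that file's duplicate lines.
    Sub WriteDuplicateReport(inputPaths As IEnumerable(Of String), outputPath As String)
        Using sw As New StreamWriter(outputPath, False)
            For Each inputPath As String In inputPaths
                sw.WriteLine("===== " & Path.GetFileName(inputPath) & " =====")
                For Each dup As String In FindDuplicateLines(inputPath)
                    sw.WriteLine(dup)
                Next
            Next
        End Using
    End Sub

    ' Second problem: lines of the first file that also occur in the second file.
    Function FindCommonLines(firstPath As String, secondPath As String) As List(Of String)
        Dim inSecond As New HashSet(Of String)(File.ReadLines(secondPath))
        Dim added As New HashSet(Of String)()
        Dim common As New List(Of String)()
        For Each line As String In File.ReadLines(firstPath)
            If inSecond.Contains(line) AndAlso added.Add(line) Then common.Add(line)
        Next
        Return common
    End Function
End Module
```

In a form, WriteDuplicateReport could be called with the FileNames array of an OpenFileDialog whose Multiselect is True and the FileName of a SaveFileDialog. None of the input files is modified, but the output path should not be one of the input files. The HashSet lookups avoid the nested loop over all previous lines, which gets slow at 30,000 lines per file.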

Related

VB.net: overwrite everything in a text file

In my VB.NET application I'd like to overwrite a text file and add new content to it.
What code do I need to use?
Thanks
First, read (i.e. load) everything in the TXT file into your program.
'Requires Imports System.IO at the top of the file
Dim sFullPathToFile As String = Application.StartupPath & "\Sample.txt"
Dim sAllText As String = ""
Using xStreamReader As StreamReader = New StreamReader(sFullPathToFile)
    sAllText = xStreamReader.ReadToEnd
End Using
Dim arNames As String() = Split(sAllText, vbCrLf)
'Just for fun, display the found entries in a ListBox
For iNum As Integer = 0 To UBound(arNames)
    If arNames(iNum) <> "" Then lstPeople.Items.Add(arNames(iNum))
Next iNum
Because you wanted to overwrite everything in the file, we now use a StreamWriter (not a StreamReader like before).
'Pass True to append to the existing file
'or False to open the file in overwrite mode
Using xStreamWRITER As StreamWriter = New StreamWriter(sFullPathToFile, False)
    'End each entry with a line break, or else every entry ends up on the same line
    xStreamWRITER.Write("I have overwritten everything!" & vbCrLf)
End Using
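Since the question asks to both overwrite the file and add new content, the two StreamWriter modes can be wrapped in small routines. This is only a sketch: OverwriteFile and AppendLine are hypothetical names, not framework methods.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module TextFileWriting
    ' Replaces the entire content of the file with the given lines.
    Sub OverwriteFile(filePath As String, lines As IEnumerable(Of String))
        ' False = overwrite mode; the file is created if it does not exist
        Using writer As New StreamWriter(filePath, False)
            For Each line As String In lines
                writer.WriteLine(line)
            Next
        End Using
    End Sub

    ' Adds one line at the end of the file, keeping what is already there.
    Sub AppendLine(filePath As String, line As String)
        ' True = append mode
        Using writer As New StreamWriter(filePath, True)
            writer.WriteLine(line)
        End Using
    End Sub
End Module
```

Calling OverwriteFile first and AppendLine afterwards leaves the new lines followed by the appended one. The Using block closes the writer even if an exception is thrown while writing.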

Dynamically edit Contents of a CSV file using VB.net

I have this code
Dim fileReader As System.IO.StreamReader
fileReader = My.Computer.FileSystem.OpenTextFileReader("Filepath")
Dim stringReader As String
'read csv file from first to last line
'(Peek is used because testing ReadLine in the loop condition would consume a line)
While fileReader.Peek() >= 0
    'get data of line
    stringReader = fileReader.ReadLine()
    'check the number of commas in the line
    Dim meh As String() = stringReader.Split(",")
    If meh.Length > 14 Then
        My.Computer.FileSystem.WriteAllText("Filepath", "asd", True)
    ElseIf meh.Length < 14 Then
        My.Computer.FileSystem.WriteAllText("Filepath", "asd", True)
    ElseIf meh.Length = 14 Then
        MsgBox("This line of the file has " & stringReader & meh.Length & "commas")
    End If
End While
End
End Sub
To explain, the above code checks EACH line of a CSV file to see whether it contains 14 commas (,). If a line has more commas, the code should reduce them to 14, and if it has fewer, it should add commas until there are 14. Those conditions are not implemented yet, so the code is just for testing. I read something about WriteAllText, and this code gives me the error:
The process cannot access the file 'filepath' because it is being used by another process.
Which, I think, means that I can't edit the CSV file because I'm currently reading its data.
My question is: how can I edit the contents of the CSV file while I am still checking its contents?
Please disregard this code
My.Computer.FileSystem.WriteAllText("Filepath", "asd", True)
as I use it just for testing whether I can manage to write to the CSV file at all.
Thank you for all your help.
If your CSV is not too big, you can read it into memory and work with it there. When you have finished, you can write it back to disk.
'Declare 2 lists (or you can work directly with one)
Dim ListLines As New List(Of String)
Dim ListLinesNEW As New List(Of String)
'Read the file
Using MyCSVread As New IO.FileStream("C:\MyCSV.csv", IO.FileMode.Open, IO.FileAccess.Read)
    'Read all the lines and put them in a List(Of String)
    Using sReader As IO.StreamReader = New IO.StreamReader(MyCSVread)
        Do While sReader.Peek >= 0
            ListLines.Add(sReader.ReadLine)
        Loop
    End Using
End Using
'Your code for working with the lines. Add every line you want to keep to the NEW list (or work directly in the first one)
For L As Integer = 0 To ListLines.Count - 1
    'Split returns the fields, i.e. one more element than the number of commas
    Dim meh As String() = ListLines(L).Split(","c)
    If meh.Length > 14 Then
        'your code ...
    ElseIf meh.Length < 14 Then
        'your code ...
    ElseIf meh.Length = 14 Then
        MessageBox.Show("Line " & (L + 1) & " of the file has " & meh.Length & " fields", "MyApp", MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
    End If
Next
'Open the file again for writing; FileMode.Create truncates the old content
Using MyCSVwrite As New IO.FileStream("C:\MyCSV.csv", IO.FileMode.Create, IO.FileAccess.Write)
    'Write back the file with the new lines
    Using sWriter As IO.StreamWriter = New IO.StreamWriter(MyCSVwrite)
        For Each sLine In ListLinesNEW
            sWriter.WriteLine(sLine)
        Next
    End Using
End Using
The "Using" block automatically closes the FileStream, the StreamReader/StreamWriter, or whatever you use, so you will not run into problems like "file already in use".
Hope this helps.
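For the 'your code ... placeholders, one way to bring a line to a fixed number of fields is sketched below. NormalizeFieldCount is a hypothetical helper name. The sketch assumes a plain CSV without quoted fields (a comma inside quotes would be split incorrectly) and that cutting off surplus fields is acceptable. Note that a line with 14 commas has 15 fields.

```vbnet
Imports System

Module CsvNormalizer
    ' Returns the line cut down or padded to exactly fieldCount comma-separated fields.
    Function NormalizeFieldCount(line As String, fieldCount As Integer) As String
        Dim fields As String() = line.Split(","c)
        ' Array.Resize drops surplus elements or pads the array with Nothing
        Array.Resize(fields, fieldCount)
        ' String.Join writes Nothing elements as empty strings
        Return String.Join(",", fields)
    End Function
End Module
```

Inside the For loop of the answer, ListLinesNEW.Add(NormalizeFieldCount(ListLines(L), 15)) would then cover the "too many" and "too few" branches in one call, with 15 fields corresponding to 14 commas.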

How can I open more than 1000 files using OpenFileDialog? Or is there any other method I can use?

I am writing a small program in which the user selects a set of files (mostly .CSV files), and my program searches through them to find the required data.
It works for up to 300 files, but beyond that it stops working and gives me this error:
InvalidOperationException was unhandled
Too many files are selected. Select fewer files and try again.
What should I do?
Like Eric Walker said, the open file dialog works with thousands and thousands of files, so there is no reason why it would stop at 300.
The best way to loop through files/folders to get info is to use the .GetFiles() method:
'Requires Imports System.IO at the top of the file
Dim OFD As New FolderBrowserDialog
If OFD.ShowDialog = DialogResult.OK Then
    For Each f In Directory.GetFiles(OFD.SelectedPath)
        'The two extension checks must be combined with OrElse
        If Path.GetExtension(f) = ".txt" OrElse Path.GetExtension(f) = ".csv" Then
            Dim reader As String() = File.ReadAllLines(f)
            For Each line As String In reader
                DoSomethingAwesome(line)
            Next
        End If
    Next
End If
This will cycle through every file in a certain Directory.
Now if you would like to cycle through every file in a file dialog, then you would use this.
Dim OFD As New OpenFileDialog()
OFD.Multiselect = True
If OFD.ShowDialog = DialogResult.OK Then
    For Each f In OFD.FileNames
        'The two extension checks must be combined with OrElse
        If Path.GetExtension(f) = ".txt" OrElse Path.GetExtension(f) = ".csv" Then
            Dim reader As String() = File.ReadAllLines(f)
            For Each line As String In reader
                DoSomethingAwesome(line)
            Next
        End If
    Next
End If
Give one of these a try depending on your preference.
As a side note, for future posts: please post your attempted code or more details on what you are trying to accomplish and where you are having trouble. Frankly, I'm surprised you weren't downvoted (very surprised, people on SO can be ruthless). Just a friendly tip.

Count Specific lines in a text file (that start with a number)

As part of an application that I am building in VB.NET, I am trying to import multiple txt files, count how many lines of each file start with a specific number (for example 1), and show the result in a message box.
Here is my code so far:
OpenFileDialog1.DefaultExt = "txt"
OpenFileDialog1.Filter = "txt files (*.txt)|*.txt|All files (*.*)|*.*"
If OpenFileDialog1.ShowDialog = Windows.Forms.DialogResult.OK Then
    For Each File In OpenFileDialog1.FileNames
        My.Computer.FileSystem.ReadAllText(OpenFileDialog1.FileName)
        For Each fileName In OpenFileDialog1.FileNames
            For Each line As String In System.IO.File.ReadLines(fileName)
                Dim Linecount = line.count
                If line.StartsWith("1") Then
                    MsgBox(LineCount)
                End If
            Next
        Next
    Next
End If
The above code does not work, as it gives me the wrong number of lines. In my txt file I have only one line that starts with "1".
You are just showing the number of characters in each line here:
For Each line As String In System.IO.File.ReadLines(fileName)
Dim Linecount = line.Count ' Number of characters
You could use LINQ to get the number of lines that start with one:
Dim lineWithOne = File.ReadLines(fileName).Count(Function(l) l.StartsWith("1"))
If you don't want or can't use LINQ, this is the classic way:
Dim lineWithOne = 0
For Each line As String In System.IO.File.ReadLines(fileName)
If line.StartsWith("1") Then lineWithOne += 1
Next

Merging 2 or more text files after edit in VB.net

Hello there!
It seems that I am facing a problem with my code in VB.NET. Please be patient, as I am a complete beginner in programming. I am trying to write a program that loads 2 or more txt files, finds and excludes specific lines (those starting with or containing certain characters), and then merges everything that is left, from all the files, into a single saved file.
I am using an OpenFileDialog and I have set Multiselect to True. Below is the code for the OpenFileDialog:
If OpenFileDialog1.ShowDialog = Windows.Forms.DialogResult.OK Then
    For Each File In OpenFileDialog1.FileNames
        My.Computer.FileSystem.ReadAllText(OpenFileDialog1.FileName)
    Next
End If
If I am correct, it loads the file names and reads all the text from the files. For the editing I am using the following code:
Dim outputLines As New List(Of String)()
For Each line As String In System.IO.File.ReadLines(OpenFileDialog1.FileName)
    Uline1 = line.StartsWith("text1")
    Uline2 = line.StartsWith("text2")
    Uline3 = line.StartsWith("text3")
    Uline4 = line.StartsWith("text4")
    Uline5 = line.StartsWith("text5")
    Uline6 = line.StartsWith("text7")
    Uline7 = line.StartsWith("sometext")
    Trash = line.Contains("^")
    If Uline1 Or Uline2 Or Uline3 Or Uline4 Or Uline5 Or Uline6 Or Uline7 Or Trash Then
        outputLines.Remove(line)
    Else
        outputLines.Add(line)
    End If
Next
For the output I am using a SaveFileDialog with the following code:
SaveFileDialog1.DefaultExt = "txt"
SaveFileDialog1.Filter = "txt files (*.txt)|*.txt|All files (*.*)|*.*"
SaveFileDialog1.RestoreDirectory = True
If (SaveFileDialog1.ShowDialog() = DialogResult.OK) Then
    System.IO.File.WriteAllLines(SaveFileDialog1.FileName, outputLines)
End If
Although the files are being loaded correctly, the edit seems to happen in only one file (the last one selected), and the program saves the content of only that one file.
Could you please point me in the right direction?
You need to nest your loops through the file names returned by the open file dialog, and the lines returned by the ReadLines calls. You also don't need to remove lines from the outputLines list, since they are never added. Something like:
If OpenFileDialog1.ShowDialog = Windows.Forms.DialogResult.OK Then
    Dim outputLines As New List(Of String)()
    For Each fileName In OpenFileDialog1.FileNames
        For Each line As String In System.IO.File.ReadLines(fileName)
            Uline1 = line.StartsWith("text1")
            Uline2 = line.StartsWith("text2")
            Uline3 = line.StartsWith("text3")
            Uline4 = line.StartsWith("text4")
            Uline5 = line.StartsWith("text5")
            Uline6 = line.StartsWith("text7")
            Uline7 = line.StartsWith("sometext")
            Trash = line.Contains("^")
            If Not (Uline1 Or Uline2 Or Uline3 Or Uline4 Or Uline5 Or Uline6 Or Uline7 Or Trash) Then
                outputLines.Add(line)
            End If
        Next
    Next
End If
If the files are very large you will start to run into memory issues, and will need to write the data out as it is read instead of keeping it all in memory.
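Writing the data out as it is read could look roughly like the sketch below. MergeFiltered and ShouldKeep are hypothetical names, and ShouldKeep contains only a shortened stand-in for the Uline/Trash checks above. The sketch assumes the output file is not one of the input files.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module StreamingMerge
    ' Stand-in filter; extend it with the remaining prefixes as needed.
    Function ShouldKeep(line As String) As Boolean
        Return Not (line.StartsWith("text1") OrElse
                    line.StartsWith("sometext") OrElse
                    line.Contains("^"))
    End Function

    ' Copies every kept line from all input files into one output file.
    Sub MergeFiltered(inputPaths As IEnumerable(Of String), outputPath As String)
        Using writer As New StreamWriter(outputPath, False)
            For Each inputPath As String In inputPaths
                ' File.ReadLines enumerates lazily, so only the current line is held in memory
                For Each line As String In File.ReadLines(inputPath)
                    If ShouldKeep(line) Then writer.WriteLine(line)
                Next
            Next
        End Using
    End Sub
End Module
```

With the dialogs from the question, the call would be MergeFiltered(OpenFileDialog1.FileNames, SaveFileDialog1.FileName) after both dialogs have returned OK.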
UPDATE
Based on the comments, if you want to check the following line to determine if the current line should be written, you could use something like this. Note the use of ReadAllLines and the For loop.
Dim outputLines As New List(Of String)()
For Each fileName In OpenFileDialog1.FileNames
    Dim lines() As String = System.IO.File.ReadAllLines(fileName)
    For i As Integer = 0 To lines.Length - 1
        Dim line As String = lines(i)
        'i < lines.Length - 1 guarantees that lines(i + 1) exists, including for the last pair
        If line.StartsWith("19") AndAlso i < lines.Length - 1 AndAlso lines(i + 1).StartsWith("15") Then
            outputLines.Add(line)
            outputLines.Add(lines(i + 1))
            'Skip the following line, since it has already been written
            i += 1
        End If
    Next
Next