How to count duplicate lines in text files - vb.net

How can I get the count of duplicate lines?
Source data, one line for example: user_id;name;surname;3400;44711;30.05.2022 7:00:00;30.05.2022 15:30:00;0;480;0;1;682;10000120;9
Private Sub remove_duplicite(sender As Object, e As EventArgs)
    Dim sFiles() As String
    sFiles = Directory.GetFiles(filesPath1, remove_dupl)
    Dim path As String = String.Join("", sFiles)
    'MessageBox.Show(path)
    Dim lines As New HashSet(Of String)()
    'Read from file
    Using sr As StreamReader = New StreamReader(path)
        Do While sr.Peek() >= 0
            lines.Add(sr.ReadLine())
        Loop
    End Using
    'Write to file
    Using sw As StreamWriter = New StreamWriter(path)
        For Each line As String In lines
            sw.WriteLine(line)
        Next
    End Using
    Close()
End Sub
I tried some answers without success, but I think it should be easy.
Thank you

Dim sList As New List(of String)
sList.Add("1")
sList.Add("2")
sList.Add("2")
sList.Add("3")
Dim sListDistinct As List(Of String) = sList.Distinct().ToList()
Dim iCount as Integer = sList.Count - sListDistinct.Count
Depending on the size of your file, this may not be the most efficient approach.
Alternatively, check your HashSet with .Contains before adding each line and increment a counter whenever the entry already exists.
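The single-pass idea can be sketched as follows. This is only an illustration: the module and function names are made up for the example, and it relies on HashSet(Of T).Add returning False when the value is already in the set, which makes a separate .Contains call unnecessary.

```vbnet
Imports System.Collections.Generic

Module DuplicateLineCounter
    ' Hypothetical helper (not from the question): returns how many lines
    ' repeat a line that was already seen earlier in the sequence.
    Function CountDuplicateLines(lines As IEnumerable(Of String)) As Integer
        Dim seen As New HashSet(Of String)()
        Dim duplicates As Integer = 0
        For Each line As String In lines
            ' Add returns False when the line is already in the set.
            If Not seen.Add(line) Then
                duplicates += 1
            End If
        Next
        Return duplicates
    End Function
End Module
```

It could be called as CountDuplicateLines(File.ReadLines(path)), which enumerates the file lazily instead of loading it all at once. After the call, the set inside the function holds the same distinct lines that the question's code writes back to the file.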

Related

How to combine all csv files from the same folder into one data

I want to merge multiple CSV files with the same layout from the same folder.
example :
csv1.csv
ID, PINA,PINB,PCS
1,100,200,450
2,99,285,300
csv2.csv
ID, PINA,PINB,PCS
1,100,200,999
2,99,285,998
out.csv (the file I want to create with VB.net)
ID, PINA,PINB,PCS,PCS
1,100,200,450,999
2,99,285,300,998
My problematic code:
Dim FileReader As StreamReader
Dim i As Integer = 0
Dim temp As String
For i = 0 To LstFiles.Items.Count - 1
FileReader = File.OpenText(LstFiles.Items.Item(i))
temp = FileReader.ReadToEnd
File.AppendAllText(SaveFileDialog1.FileName, temp)
Next
Please guide me.
Thanks a lot !
Looks to me like each line in the input files has an identifier based on the first value in that row. You want to combine all the numbers after that identifier, from all the files in your ListBox, into one list of numbers that is sorted and has no duplicates. Then you want to generate an output file that has all those identifiers followed by each set of sorted, unique numbers.
If that is correct, then try this out:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    If SaveFileDialog1.ShowDialog = DialogResult.OK Then
        Dim header As String = ""
        Dim combinedLines As New SortedList(Of Integer, List(Of Integer))
        For Each filename As String In LstFiles.Items
            Dim lines = File.ReadLines(filename)
            If header = "" Then
                header = lines.First
            End If
            lines = lines.Skip(1)
            For Each line As String In lines
                Dim strValues = line.Split(",").AsEnumerable
                Try
                    Dim lineNumber As Integer = Integer.Parse(strValues.First)
                    strValues = strValues.Skip(1)
                    Dim numbers = strValues.ToList.ConvertAll(Of Integer)(Function(x) Integer.Parse(x))
                    If Not combinedLines.ContainsKey(lineNumber) Then
                        combinedLines.Add(lineNumber, New List(Of Integer)(numbers))
                    Else
                        combinedLines(lineNumber).AddRange(numbers)
                    End If
                Catch ex As Exception
                    MessageBox.Show("Error Parsing Line: " & line)
                End Try
            Next
        Next
        Using sw As New StreamWriter(SaveFileDialog1.FileName, False)
            sw.WriteLine(header)
            For Each numberSet In combinedLines
                Dim numbers = numberSet.Value.Distinct.ToList
                numbers.Sort()
                sw.WriteLine(numberSet.Key & "," & String.Join(",", numbers))
            Next
        End Using
    End If
End Sub

Reading Unknown Number Of Lines In A File

I have about 20 files, each file has a short description that starts on line 7 and goes to the 3rd to last line of the file.
For example, one file has the description started at line 7, and ends at line 10, but the file has a total of 13 lines.
How can I import JUST the description, for example line 7 - 10?
This is the example code I have so far.
Public Class Form1
Dim MyDir As String = "..\GoodFils\"
Dim MyFiles() As String = IO.Directory.GetFiles(MyDir)
Dim Count As Integer = 0
Public Function ReadLine(lineNumber As Integer, lines As List(Of String)) As String
Return lines(lineNumber - 1)
End Function
Private Sub btnDo_Click(sender As Object, e As EventArgs) Handles btnDo.Click
Dim reader As New System.IO.StreamReader(MyDir & "gucci.hcs")
Dim allLines As List(Of String) = New List(Of String)
Dim i As Integer
Dim strTemp As String
Do Until reader.EndOfStream = True
allLines.Add(reader.ReadLine())
Loop
lblName.Text = ReadLine(2, allLines)
lblPrice.Text = ReadLine(5, allLines)
lblDesc.Text = EOF(1) - 3
reader.Close()
FileOpen(1, MyDir & "gucci.hcs", OpenMode.Input) 'May be able to use MyDir & lblName & ".hcs"
For i = 7 To reader.EndOfStream
Input(1, strTemp)
Next
lblDesc.Text += i
FileClose(1)
End Sub
End Class
You can load the contents of each individual file into an Array by using IO.File.ReadAllLines and then you can use LINQ to Skip to jump to line 7 and then Take up to the 3rd to last line.
Here is a quick example:
'Create a collection to store all of the file's descriptions
Dim descriptions As New List(Of String)
'Placeholder variable for the upcoming iteration
Dim lines() As String
'Iterate through each file
For Each file As IO.FileInfo In New IO.DirectoryInfo("GoodFils").GetFiles("*.txt")
    'Read the file
    lines = IO.File.ReadAllLines(file.FullName)
    'Get only lines 7 to n-3: skip the first 6 lines and leave out the last 3,
    'which keeps n-9 lines (lines 7 to 10 of a 13-line file)
    descriptions.Add(String.Join(Environment.NewLine, lines.Skip(6).Take(lines.Length - 9).ToArray()))
Next
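The Skip/Take arithmetic is easy to get wrong by one, so it may help to spell it out in a small helper. This is a sketch only: SliceLines and its parameter names are invented for the example, and it assumes the description is everything from a given 1-based line up to a fixed number of trailing lines, as in the question's 13-line example.

```vbnet
Imports System.Linq

Module DescriptionSlice
    ' Hypothetical helper: returns the lines from firstLine (1-based) up to,
    ' but not including, the last trailingLines lines.
    Function SliceLines(lines As String(), firstLine As Integer, trailingLines As Integer) As String()
        Dim skipCount As Integer = firstLine - 1
        Dim takeCount As Integer = lines.Length - skipCount - trailingLines
        ' Take yields an empty sequence when the count is zero or negative,
        ' so a file that is too short simply produces no description lines.
        Return lines.Skip(skipCount).Take(takeCount).ToArray()
    End Function
End Module
```

For a 13-line file, SliceLines(lines, 7, 3) skips 6 lines and takes 13 - 6 - 3 = 4, which gives lines 7 to 10.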

Read random txt line, split strings by : and write in textboxes

I've got two textboxes and I want to make an account generator that reads a random line from a txt file on a website and writes it into the textboxes. I want to read one random line from the text file, where email and password are separated by : so a line looks like email#site.com:password, then write the part before the : into textbox1 (email) and the part after the : from the same line into textbox2.
.txt file looks like this:
email1#example.com:password1
email2#example.com:password2
email3#example.com:password3 etc....
I cannot figure out how to split these strings. Any help will be appreciated, thanks anyway :)
There you go.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    tbxEmail.Text = String.Empty
    tbxPassword.Text = String.Empty
    Dim lines As String() = getData("URL_OF_FILE")
    Dim lineCount As Integer = lines.Length
    Dim randomValue As Integer = CInt(Math.Floor((lineCount) * Rnd()))
    Dim line As String = lines(randomValue)
    Dim parts As String() = line.Split(New Char() {":"c})
    Dim email As String = parts(0)
    Dim password As String = parts(1)
    tbxEmail.Text = email
    tbxPassword.Text = password
End Sub
Function getData(url As String) As String()
    Dim client As System.Net.WebClient = New System.Net.WebClient()
    Dim data As String = client.DownloadString(url)
    Dim returnValue As String() = data.Split(New String() {Environment.NewLine},
        StringSplitOptions.RemoveEmptyEntries)
    Return returnValue
End Function
Please note that this is a synchronous request, meaning it will "freeze" your application for the duration of the request.
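If the freeze is a problem, one possible variation is to download asynchronously. The sketch below is not part of the original answer and makes some assumptions: the module and helper names (AccountFetcher, GetDataAsync, SplitCredentials, PickRandom) are invented, the file may use either Windows or Unix line endings, and a password may itself contain a colon. It uses WebClient.DownloadStringTaskAsync and the Random class instead of Rnd, since Rnd repeats the same sequence on every run unless Randomize is called first.

```vbnet
Imports System
Imports System.Net
Imports System.Threading.Tasks
Imports Microsoft.VisualBasic

Module AccountFetcher
    Private ReadOnly Rng As New Random()

    ' Hypothetical async counterpart of getData: downloads the file without
    ' blocking the UI thread and splits it into non-empty lines.
    Async Function GetDataAsync(url As String) As Task(Of String())
        Using client As New WebClient()
            Dim data As String = Await client.DownloadStringTaskAsync(url)
            ' Split on both Windows (CRLF) and Unix (LF) line endings.
            Return data.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
        End Using
    End Function

    ' Hypothetical helper: splits "email:password" at the first colon only,
    ' so a password containing ":" stays intact. Returns Nothing if no colon.
    Function SplitCredentials(line As String) As String()
        Dim parts As String() = line.Split(New Char() {":"c}, 2)
        If parts.Length < 2 Then
            Return Nothing
        End If
        Return parts
    End Function

    ' Hypothetical helper: picks one random element from a non-empty array.
    Function PickRandom(lines As String()) As String
        Return lines(Rng.Next(lines.Length))
    End Function
End Module
```

To use it, the click handler would be declared as Private Async Sub Button1_Click(...) and would call Await GetDataAsync("URL_OF_FILE"), then PickRandom and SplitCredentials on the result, checking for Nothing before filling the textboxes.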

Create a search bar for hex values

My current code requires me to edit the search value while the project is still in VB. I have not been able to figure out how to code the input value to use a textbox for search. I would really like to be able to build this project and use it without having VB open. Below is my code:
Dim filePath As String = Me.TextBox1.Text 'The path for the file you want to search
Dim fInfo As New FileInfo("C:\MyFile.File")
Dim numBytes As Long = fInfo.Length
Dim fStream As New FileStream("C:\MyFile.File", FileMode.Open, FileAccess.Read)
Dim br As New BinaryReader(fStream)
Dim data As Byte() = br.ReadBytes(CInt(numBytes))
Dim pos As Integer = -1
Dim searchItem As String = "b6" 'The hex values of what you want to search
Dim searchItemAsInteger As Integer
Dim locationsFound As New List(Of Integer)
MessageBox.Show("Wait while I Scan?")
br.Close()
fStream.Close()
Integer.TryParse(searchItem, Globalization.NumberStyles.AllowHexSpecifier, CultureInfo.CurrentCulture, searchItemAsInteger)
For Each byteItem As Byte In data
pos += 1
If CInt(byteItem) = searchItemAsInteger Then
locationsFound.Add(pos)
Me.ListBox1.Items.Add(Hex(pos))
End If
Next
For i As Integer = 0 To Me.ListBox1.Items.Count - 1
Me.ListBox1.SetSelected(i, True)
Next
End Sub
Place a textbox named "txtHexValueToSearch" on Form1, then replace the commented-out line below with the line that follows it:
' Dim searchItem As String = "b6" 'The hex values of what you want to search
Dim searchItem As String = Me.txtHexValueToSearch.Text 'The hex values of what you want to search
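Because the value now comes from the user, it may be worth validating it before scanning. The following is a minimal sketch, not part of the original answer: TryParseHexByte is a made-up helper name, and it assumes the search value is a single byte written as one or two hex digits without a "0x" or "&H" prefix, which NumberStyles.AllowHexSpecifier does not accept.

```vbnet
Imports System
Imports System.Globalization

Module HexInput
    ' Hypothetical helper: parses text such as "b6" or "B6" into a byte.
    ' Returns False for empty input or anything that is not valid hex.
    Function TryParseHexByte(text As String, ByRef value As Byte) As Boolean
        If text Is Nothing Then
            Return False
        End If
        Return Byte.TryParse(text.Trim(), NumberStyles.AllowHexSpecifier,
                             CultureInfo.InvariantCulture, value)
    End Function
End Module
```

In the form, the result of TryParseHexByte(Me.txtHexValueToSearch.Text, searchByte) could decide whether to start the scan or show a message asking for a valid hex value. The question's code ignores the return value of Integer.TryParse, so invalid input would silently search for 0.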

Replace Memory Issue VB.net

I was kindly helped with this code before, but I have hit a stumbling block and I'm not sure of the correct way to go. The code below performs over 120k find-and-replace operations. The problem is that the text file is HUGE, easily over 5 GB of log files, so I get a memory issue, which is no surprise. Should I load the data in blocks, if that is even possible? If so, how?
Private Sub CmdBtnTestReplace_Click(sender As System.Object, e As System.EventArgs) Handles CmdBtnTestReplace.Click
Dim fName As String = "c:\backup\logs\master.txt"
Dim wrtFile As String = "c:\backup\logs\masterUserFormatted.txt"
Dim strRead As New System.IO.StreamReader(fName)
Dim strWrite As New System.IO.StreamWriter(wrtFile)
Dim s As String
s = strRead.ReadToEnd()
'runs through over 120k of find and replaces
For Each row As DataGridViewRow In DataGridView1.Rows
If Not row.IsNewRow Then
Dim Find1 As String = row.Cells(0).Value.ToString
Dim Replace1 As String = row.Cells(1).Value.ToString
Cursor.Current = Cursors.WaitCursor
'replace using string from 1st column and replaces with string from 2nd column.
s = s.Replace(Find1, Replace1)
End If
Next
strWrite.Write(s)
strRead.Close()
strWrite.Close()
Cursor.Current = Cursors.Default
MessageBox.Show("Finished Replacing")
End Sub
If the input file is a simple multi-line text file, where no individual line is too big to load into memory at once, and the search string is never going to span multiple lines, then reading only one line at a time will be the simplest solution. For instance:
Dim fName As String = "c:\backup\logs\master.txt"
Dim wrtFile As String = "c:\backup\logs\masterUserFormatted.txt"
Dim strRead As New System.IO.StreamReader(fName)
Dim strWrite As New System.IO.StreamWriter(wrtFile)
Cursor.Current = Cursors.WaitCursor
While True
    Dim line As String = strRead.ReadLine()
    If line IsNot Nothing Then
        For Each row As DataGridViewRow In DataGridView1.Rows
            If Not row.IsNewRow Then
                Dim Find1 As String = row.Cells(0).Value.ToString
                Dim Replace1 As String = row.Cells(1).Value.ToString
                line = line.Replace(Find1, Replace1)
            End If
        Next
        strWrite.WriteLine(line)
    Else
        Exit While
    End If
End While
strRead.Close()
strWrite.Close()
Cursor.Current = Cursors.Default
MessageBox.Show("Finished Replacing")
It's worth mentioning that StreamReader and StreamWriter both implement IDisposable. As such, it would be preferable to enclose them in a Using block rather than explicitly calling Close yourself.
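A version with Using blocks might look like the sketch below. It is an illustration under assumptions rather than a drop-in replacement: ReplaceInFile is a made-up name, and the find/replace pairs are assumed to have been copied out of the DataGridView into a list beforehand, so the grid is not re-read for every line of a multi-gigabyte file.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module StreamingReplace
    ' Hypothetical helper: copies source to target one line at a time,
    ' applying every find/replace pair to each line in order.
    Sub ReplaceInFile(source As String, target As String,
                      replacements As IEnumerable(Of KeyValuePair(Of String, String)))
        ' Using disposes both streams even if an exception is thrown.
        Using reader As New StreamReader(source)
            Using writer As New StreamWriter(target)
                Dim line As String = reader.ReadLine()
                While line IsNot Nothing
                    For Each pair As KeyValuePair(Of String, String) In replacements
                        ' String.Replace throws if the search text is empty,
                        ' so empty keys should be filtered out beforehand.
                        line = line.Replace(pair.Key, pair.Value)
                    Next
                    writer.WriteLine(line)
                    line = reader.ReadLine()
                End While
            End Using
        End Using
    End Sub
End Module
```

The same limitation as above applies: a search string that spans a line break will not be found, because each line is processed on its own.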