VB.NET: efficiently finding a byte sequence in a byte array

I am writing a piece of software that holds a list of original byte sequences and the new sequences those bytes need to be changed into. In text form an entry looks like this: "original location (currently irrelevant, as the sequence can be in different places) $ 56,69,71,73,75,77 : 56,69,71,80,50,54"
I already have code that works. However, there can be 600 or more of these sequences to find and change, and in some cases a run takes 15 minutes or longer, which makes the tool unusable. I think most of the time goes into finding the sequences, so I am looking for a faster way to do that.
I have copied the whole function below in the hope that someone can take a look and help.
Dim originalbytes() As Byte
Dim fd As OpenFileDialog = New OpenFileDialog()
fd.Title = "Select the file"
fd.Filter = "All files (*.*)|*.*|All files (*.*)|*.*"
fd.FilterIndex = 2
If fd.ShowDialog() = DialogResult.OK Then
TextBox2.Text = fd.FileName
originalbytes = File.ReadAllBytes(fd.FileName)
End If
Dim x As Integer = 0
Dim y As Integer = 0
Dim textbox1array() = TextBox1.Lines
Dim changedbytes() = originalbytes
Dim startvalue As Integer = 0
Dim databoxarray() As String
Dim databoxarray2() As String
While x < textbox1array.Length - 1
'for each change to make
databoxarray = textbox1array(x).Replace(" $ ", vbCr).Replace(" : ", vbCr).Split
databoxarray2 = databoxarray(1).Replace(",", vbCr).Split
Dim databox2bytes() As String = databoxarray2
'copy original bytes line to databox2 lines
y = 0
While y < (originalbytes.Length - databox2bytes.Length)
'repeat for all bytes in ori file - size of data to find
If originalbytes(y) = databox2bytes(0) Then
startvalue = y
Dim z As String = 1
Dim samebytecounter As Integer = 1
While z < databox2bytes.Length
'repeat for all ori bytes
If originalbytes(y + z) = databox2bytes(z) Then
samebytecounter = samebytecounter + 1
End If
z = z + 1
End While
If samebytecounter = databox2bytes.Length Then
'same original data found, make changes
Dim bytestoinsert() As String = databoxarray(2).Replace(",", vbCr).Split
Dim t As Integer = 0
While t < bytestoinsert.Length
changedbytes(startvalue + t) = bytestoinsert(t)
t = t + 1
End While
End If
End If
y = y + 1
End While
x = x + 1
End While
File.WriteAllBytes(TextBox2.Text & " modified", changedbytes)

Let's take a look at the inner While loop in your code; several things can be optimized.
There is no need to recompute the upper bound on every iteration:
Dim length as Integer = originalbytes.Length - databox2bytes.Length
While y < length
'repeat for all bytes in ori file - size of data to find
If originalbytes(y) = databox2bytes(0) Then
startvalue = y
z is not necessary; samebytecounter does exactly the same job:
Dim samebytecounter As Integer = 1
This While loop is the real bottleneck: it always checks the full length of databox2bytes. Quit the loop as soon as the bytes stop matching instead:
While samebytecounter < databox2bytes.Length AndAlso originalbytes(y + samebytecounter) = databox2bytes(samebytecounter)
samebytecounter = samebytecounter + 1
End While
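The replacement step is fine, but it splits databoxarray(2) again on every match. Split it once, right after databoxarray2 is built at the top of the outer loop (Dim bytestoinsert() As String = databoxarray(2).Replace(",", vbCr).Split), and reuse that array here. Note that databoxarray2 holds the bytes to find, not the replacement, so it cannot be reused for the write:
If samebytecounter = databox2bytes.Length Then
'same original data found, make changes
Dim t As Integer = 0
While t < bytestoinsert.Length
changedbytes(startvalue + t) = bytestoinsert(t)
t = t + 1
End While
End If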
End If
y = y + 1
End While
Beyond that, I agree that the algorithm is hugely inefficient. In theory the whole search could be handed to String.Replace, for example (untested; stringToFind and stringToSet are the byte arrays to find and to write). Use a single-byte encoding such as ISO-8859-1, which maps every byte value to exactly one character; UTF8 would corrupt byte sequences that are not valid UTF-8:
Dim latin1 = System.Text.Encoding.GetEncoding("iso-8859-1")
Dim text = latin1.GetString(originalbytes)
Dim findText = latin1.GetString(stringToFind)
Dim replaceWith = latin1.GetString(stringToSet)
text = text.Replace(findText, replaceWith)
Dim outbytes = latin1.GetBytes(text)
which would probably be a huge time saver.
Finally, the code is written in such a way that nobody, including yourself, will really understand it after it has been lying around for a month or so.
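The same idea can also be applied to the bytes directly, without going through strings. The following is a minimal sketch, not the asker's code: the names ParseBytes, IndexOfSequence and PatchAll are made up for illustration, and it assumes the replacement has the same length as the sequence it overwrites, as in the example from the question. Each pattern is parsed once, and the comparison stops at the first mismatching byte.

```vbnet
Imports System
Imports System.Linq

Module BytePatchSketch
    ' Parses "56,69,71" into a Byte array (decimal values, as in the question's example).
    Function ParseBytes(csv As String) As Byte()
        Return csv.Split(","c).Select(Function(s) Byte.Parse(s.Trim())).ToArray()
    End Function

    ' Returns the index of the first occurrence of pattern at or after start, or -1.
    Function IndexOfSequence(data As Byte(), pattern As Byte(), start As Integer) As Integer
        If pattern.Length = 0 Then Return -1
        Dim i As Integer = Array.IndexOf(data, pattern(0), start)
        While i >= 0 AndAlso i <= data.Length - pattern.Length
            Dim k As Integer = 1
            ' Stop comparing at the first byte that differs.
            While k < pattern.Length AndAlso data(i + k) = pattern(k)
                k += 1
            End While
            If k = pattern.Length Then Return i
            i = Array.IndexOf(data, pattern(0), i + 1)
        End While
        Return -1
    End Function

    ' Overwrites every occurrence of find with replacement, in place.
    ' Assumption: both arrays have the same length. Returns the number of patches.
    Function PatchAll(data As Byte(), find As Byte(), replacement As Byte()) As Integer
        If find.Length <> replacement.Length Then
            Throw New ArgumentException("find and replacement must have the same length")
        End If
        Dim count As Integer = 0
        Dim pos As Integer = IndexOfSequence(data, find, 0)
        While pos >= 0
            Array.Copy(replacement, 0, data, pos, replacement.Length)
            count += 1
            pos = IndexOfSequence(data, find, pos + find.Length)
        End While
        Return count
    End Function

    Sub Main()
        Dim data As Byte() = {1, 56, 69, 71, 73, 75, 77, 2}
        Dim n = PatchAll(data, ParseBytes("56,69,71,73,75,77"), ParseBytes("56,69,71,80,50,54"))
        Console.WriteLine("{0} patched: {1}", n, String.Join(",", data.Select(Function(b) b.ToString())))
    End Sub
End Module
```

In the question's loop this would mean one PatchAll call per line of TextBox1, with the file read into a Byte array once before the loop.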


Does the textbox that is generated when creating a detail view not count as a textbox?

My code searches for certain numbers in textboxes and replaces them. However, it does not change the number if it is in a textbox that was created from a detail view (see Figure 1). Do these not count as textboxes?
Figure 1
Dim Totalsheets As Integer
Dim target_text As String
Dim FirstPage As Integer
Dim replace_text As String
Dim result As String
Dim n As Integer 'count No. of text frames changed
Dim i As Integer 'count views for the sheet
Dim x As Integer 'takes the value of the first page of the old config
Dim y As Integer 'takes the value of the total number of sheets of the old config
Dim z As Integer 'takes the value of the number that needs to be added to update the zoning
Dim a As String 'takes the value of the letter found in the zoning box
Dim b As Integer
n = 0
Set osheets = odoc.Sheets
Set osheets = osheets.Item("DRAFT") 'makes sure only sheet "DRAFT" is edited
Set oViews = osheets.Views
Totalsheets = Totalsheets1.Value 'draws value from the textbox
FirstPage = FirstPage1.Value 'draws value from the textbox
For i = 3 To oViews.Count 'scans through all views in sheet
Set oView = oViews.Item(i)
Set oTexts = oView.Texts
For Each SrcText In oTexts 'scans through all text in view
x = FirstPage
y = Totalsheets
b = x + y
Do Until x = b + 1
z = x + Totalsheets
a = "A"
Do Until a = "[" 'goes from A to Z
result = SrcText.Text
target_text = " " & x & a 'gets space in front and letter at back to ensure only zone box are updated
replace_text = " " & z & a
If InStr(result, target_text) Then
result = Replace(result, target_text, replace_text)
SrcText.Text = result
n = n + 1
End If
a = Chr(Asc(a) + 1)
Loop
x = x + 1
Loop
Next
Next
Although the detail view identifier is a DrawingText, it does not belong to the view's DrawingTexts collection, so looping over oView.Texts never reaches it.
You could access the DrawingText by searching in the view.
A better option would be to change the corresponding property of the view itself.
EDIT:
Example for using the (slower) selection:
Set oSel = oDoc.Selection
oSel.Clear
oSel.Add oView
oSel.Search "CATDrwSearch.DrwText,sel"
for i = 1 to oSel.Count2
Set oDrwText = oSel.Item2(i).Value
'do something with the text
next

Splitting string every 100 characters not working

I can't get my message to split, or even to display. The Message variable is set in another part of my code, and I have debugged to make sure that the value comes through. I want the text to go onto a new line every 100 characters, and every new message to start on a new line as well.
y = y - 13
messagearray.AddRange(Message.Split(ChrW(100)))
Dim k = messagearray.Count - 1
Dim messagefin As String
messagefin = ""
While k > -1
messagefin = messagefin + vbCrLf + messagearray(k)
k = k - 1
End While
k = 0
Label1.Text = Label1.Text & vbCrLf & messagefin
Label1.Location = New Point(5, 398 + y)
You can use a regular expression. It produces a collection of matches in which every match contains 100 characters; if fewer than 100 characters remain, the last match takes all of them. Note that . does not match a line feed unless RegexOptions.Singleline is passed.
Dim input = New String("A"c, 310)
Dim mc = Regex.Matches(input, ".{1,100}")
For Each m As Match In mc
'// Do something
MsgBox(m.Value)
Next
You can use LINQ to do that.
When you do a Select you can get the index of the item by including a second parameter. Then group the characters by that index divided by the line length so, the first character has index 0, and 0 \ 100 = 0, all the way up to the hundredth char which has index 99: 99 \ 100 = 0. The next hundred chars have 100 \ 100 = 1 to 199 \ 100 = 1, and so on (\ is the integer division operator in VB.NET).
Dim message = New String("A"c, 100)
message &= New String("B"c, 100)
message &= New String("C"c, 99)
Dim lineLength = 100
Dim q = message.Select(Function(c, i) New With {.Char = c, .Idx = i}).
GroupBy(Function(a) a.Idx \ lineLength).
Select(Function(b) String.Join("", b.Select(Function(d) d.Char)))
TextBox1.AppendText(vbCrLf & String.Join(vbCrLf, q))
It is easy to see how to change the line length because it is in a variable with a meaningful name, for example I set it to 50 to get the output
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
You can use String.Substring to do that, like this:
Dim Message As String = "your message here"
Dim MessageList As New List(Of String)
'stop at Length - 1 so that no empty chunk is added when the length is a multiple of 100
For i As Integer = 0 To Message.Length - 1 Step 100
If Message.Length < i + 100 Then
MessageList.Add(Message.Substring(i, Message.Length - i))
Exit For
Else
MessageList.Add(Message.Substring(i, 100))
End If
Next
Dim k = MessageList.Count - 1
...
Here is what your code produced with a bit of clean up. I ignored the new position of the label.
Private Sub OpCode()
Dim messagearray As New List(Of String) 'I guessed that messagearray was a List(Of T)
messagearray.AddRange(Message.Split(ChrW(100))) 'ChrW(100) is lowercase d
Dim k = messagearray.Count - 1
Dim messagefin As String
messagefin = ""
While k > -1
messagefin = messagefin + vbCrLf + messagearray(k)
k = k - 1
End While
k = 0 'Why reset k? It falls out of scope at End Sub
Label1.Text = Label1.Text & vbCrLf & messagefin
End Sub
Splitting a string on a lowercase d has nothing to do with getting 100 characters: ChrW(100) is the character with code 100, not a length. As you can see, the code also reversed the order of the list items and added a blank line between the existing text in the label (in this case Label1) and the new text.
To accomplish your goal, I first created a List(Of String) to store the chunks. The For loop starts at the beginning of the input string and keeps going to the end, increasing by 10 on each iteration (use 100 for your case).
The If guards against an index out of range at the end. Say only 6 characters were left from the start index: trying to retrieve 10 characters would throw an ArgumentOutOfRangeException.
At the end we join the elements of the list with a new line as the separator.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
BreakInto10CharacterChunks("The quick brown fox jumped over the lazy dogs.")
End Sub
Private Sub BreakInto10CharacterChunks(input As String)
Dim output As New List(Of String)
Dim chunk As String
For StartIndex = 0 To input.Length - 1 Step 10 'Length - 1 avoids an empty trailing chunk
If StartIndex + 10 > input.Length Then
chunk = input.Substring(StartIndex, input.Length - StartIndex)
Else
chunk = input.Substring(StartIndex, 10)
End If
output.Add(chunk)
Next
Label1.Text &= vbCrLf & String.Join(vbCrLf, output)
End Sub
Be sure to look up String.SubString and String.Join to fully understand how these methods work.
https://learn.microsoft.com/en-us/dotnet/api/system.string.substring?view=netframework-4.8
and https://learn.microsoft.com/en-us/dotnet/api/system.string.join?view=netframework-4.8
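The If/Else in the loops above can be folded into a single Substring call by capping the chunk length with Math.Min. This is a small sketch under that assumption; the function name SplitIntoChunks is made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module ChunkSketch
    ' Splits input into pieces of at most size characters; the last piece may be shorter.
    Function SplitIntoChunks(input As String, size As Integer) As List(Of String)
        If size <= 0 Then Throw New ArgumentOutOfRangeException("size")
        Dim chunks As New List(Of String)
        ' Stopping at Length - 1 means an empty input yields an empty list.
        For start As Integer = 0 To input.Length - 1 Step size
            chunks.Add(input.Substring(start, Math.Min(size, input.Length - start)))
        Next
        Return chunks
    End Function

    Sub Main()
        Dim parts = SplitIntoChunks(New String("A"c, 250), 100)
        Console.WriteLine(String.Join(Environment.NewLine, parts))
    End Sub
End Module
```

For the label in the question, the result could be appended with Label1.Text &= vbCrLf & String.Join(vbCrLf, parts).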

Best way to optimise For Loops and Do Until Loops [closed]

I have the following code that searches through folder directories in a DataGridView table, and puts all files of the wanted format into a list, it also gathers a list of their last modified date for later use in the application.
The code works, but it is sore on the eyes. I want to tidy up the following loops to improve efficiency - what I mean is that I have a For loop within a For loop that creates the list of filenames, then I have two separate Do Until loops that search through the list from start to finish to pick out file names that need adjustment.
I would be very interested to learn a better way of achieving the same result, as my knowledge of efficiency in coding is quite elementary. Basically, can this be done in one or two loops, as the idea of looping through the Lists twice seems inefficient?
Public Class
Private Sub btnDirectory_Click(sender As Object, e As EventArgs) Handles btnDirectory.Click
Dim FileNames As New List(Of String)
Dim FileDates As New List(Of Date)
Dim DocNo As String
Dim rowCheck As String
Dim ProjectNo As String = "1111"
Dim FileNameCheck As String
Dim str As String
Dim k As Integer = 0
Dim i As Integer
Dim j As Integer
Dim CorrectType As Boolean = False
'The first loop grabs all files of the wanted format from a datagridview table containing all directories to be checked
For Each rw In Background.Table1.Rows
rowCheck = Background.Table1(0, k).Value
If Not String.IsNullOrEmpty(rowCheck) Then
For Each file As String In My.Computer.FileSystem.GetFiles(Background.Table1(0, k).Value)
CorrectType = False
FileNameCheck = IO.Path.GetFileNameWithoutExtension(file)
If FileNameCheck.Contains(ProjectNo) AndAlso FileNameCheck.Contains("-") AndAlso Not String.IsNullOrEmpty(FileNameCheck) AndAlso FileNameCheck.Contains(" ") Then
DocNo = FileNameCheck.Substring(0, FileNameCheck.IndexOf(" "))
If FileNameCheck.Substring(0, FileNameCheck.IndexOf("-")) = ProjectNo AndAlso CountLetters(DocNo) = 3 Then
CorrectType = True
End If
End If
If CorrectType = True Then
FileNames.Add(FileNameCheck)
FileDates.Add(IO.File.GetLastWriteTime(file))
End If
Next
End If
k += 1
Next
'The next loop tidies up the file formats that contain a "-00-" in their names
j = FileNames.Count
i = 0
Do
str = FileNames(i)
If str.Contains("-00-") Then
FileNames(i) = RemoveChar(str, "-00-") ' RemoveChar is a function that replaces "-00-" with a "-"
End If
i += 1
Loop Until i = j
i = 0
j = FileNames.Count
'Finally, this loop checks that no two files have the exact same name, and gets rid of one of them if that is the case
Do
Dim st1 As String = FileNames(j - 1)
Dim st2 As String = FileNames(j - 2)
If st1 = st2 Then
FileNames.RemoveAt(j - 1)
FileDates.RemoveAt(j - 1)
End If
j -= 1
Loop Until j = 1
End Sub
End Class
The code is certainly hard on the eyes.
The For Each rw loop does not use rw. Since k is used as a zero-based row index, you could replace it with an index loop such as:
For k = 0 To Background.Table1.Rows.Count - 1
' Do things here
Next k
You assign rowCheck and use it once, but you missed the opportunity to reuse it in the For Each file line.
Instead of setting CorrectType = True and testing it afterwards, you can place the corresponding code directly inside the If.
If FileNameCheck.Substring(0, FileNameCheck.IndexOf("-")) = ProjectNo AndAlso CountLetters(DocNo) = 3 Then
CorrectType = True
End If
End If
If CorrectType = True Then
FileNames.Add(FileNameCheck)
FileDates.Add(IO.File.GetLastWriteTime(file))
End If
becomes:
If FileNameCheck.Substring(0, FileNameCheck.IndexOf("-")) = ProjectNo AndAlso CountLetters(DocNo) = 3 Then
FileNames.Add(FileNameCheck)
FileDates.Add(IO.File.GetLastWriteTime(file))
End If
I must admit, the next two loops made my eyes bleed (figuratively, not literally).
j = FileNames.Count
i = 0
Do
str = FileNames(i)
If str.Contains("-00-") Then
FileNames(i) = RemoveChar(str, "-00-") ' RemoveChar is a function that replaces "-00-" with a "-"
End If
i += 1
Loop Until i = j
becomes
For i = 0 To FileNames.Count - 1 ' list indexes are zero-based
str = FileNames(i)
If str.Contains("-00-") Then
FileNames(i) = RemoveChar(str, "-00-") ' RemoveChar is a function that replaces "-00-" with a "-"
End If
Next i
And
i = 0
j = FileNames.Count
'Finally, this loop checks that no two files have the exact same name, and gets rid of one of them if that is the case
Do
Dim st1 As String = FileNames(j - 1)
Dim st2 As String = FileNames(j - 2)
If st1 = st2 Then
FileNames.RemoveAt(j - 1)
FileDates.RemoveAt(j - 1)
End If
j -= 1
Loop Until j = 1
becomes
'Finally, this loop checks that no two files have the exact same name, and gets rid of one of them if that is the case
For j = FileNames.Count - 1 To 1 Step -1 ' stop at 1 so that j - 1 is still a valid index
Dim st1 As String = FileNames(j)
Dim st2 As String = FileNames(j - 1)
If st1 = st2 Then
FileNames.RemoveAt(j)
FileDates.RemoveAt(j)
End If
Next j
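To the original question of whether the two clean-up passes can be avoided: one option is to normalize and check each name at the moment it is added. The following is only a sketch under several assumptions. It keeps name and date together in a KeyValuePair instead of two parallel lists, it uses String.Replace in place of the asker's RemoveChar helper (which is not shown), and the sample names are invented. Like the original, it only drops a name that equals the one kept immediately before it.

```vbnet
Imports System
Imports System.Collections.Generic

Module FileListSketch
    ' Normalizes each name and skips a name equal to the one kept just before it.
    Function Collect(candidates As IEnumerable(Of KeyValuePair(Of String, Date))) As List(Of KeyValuePair(Of String, Date))
        Dim kept As New List(Of KeyValuePair(Of String, Date))
        For Each item In candidates
            ' Normalize first, so the duplicate check sees the final name.
            Dim name = item.Key.Replace("-00-", "-")
            If kept.Count > 0 AndAlso kept(kept.Count - 1).Key = name Then Continue For
            kept.Add(New KeyValuePair(Of String, Date)(name, item.Value))
        Next
        Return kept
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the names and dates read from disk.
        Dim candidates = New List(Of KeyValuePair(Of String, Date)) From {
            New KeyValuePair(Of String, Date)("1111-ABC-00-01 Plan", #1/1/2020#),
            New KeyValuePair(Of String, Date)("1111-ABC-01 Plan", #2/1/2020#),
            New KeyValuePair(Of String, Date)("1111-XYZ-02 Section", #3/1/2020#)
        }
        For Each entry In Collect(candidates)
            Console.WriteLine("{0}  {1:d}", entry.Key, entry.Value)
        Next
    End Sub
End Module
```

In the asker's code the body of the For Each in Collect would sit where FileNames.Add and FileDates.Add are called today, which removes both Do loops.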

Speed up large string data parser function

I currently have a file with 1 million characters (about 1 MB in size). I am trying to parse its data with this old function, which still works but is very slow.
start0end
start1end
start2end
start3end
start4end
start5end
start6end
The code takes about 5 painful minutes to process the whole data.
Any pointers and suggestions are appreciated.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim sFinal = ""
Dim strData = textbox.Text
Dim strFirst = "start"
Dim strSec = "end"
Dim strID As String, Pos1 As Long, Pos2 As Long, strCur As String = ""
Do While InStr(strData, strFirst) > 0
Pos1 = InStr(strData, strFirst)
strID = Mid(strData, Pos1 + Len(strFirst))
Pos2 = InStr(strID, strSec)
If Pos2 > 0 Then
strID = Microsoft.VisualBasic.Left(strID, Pos2 - 1)
End If
If strID <> strCur Then
strCur = strID
sFinal += strID & ","
End If
strData = Mid(strData, Pos1 + Len(strFirst) + 3 + Len(strID))
Loop
End Sub
The reason it is so slow is that you keep destroying and recreating a 1 MB string. Strings are immutable, so strData = Mid(strData... creates a new string and copies the remainder of the 1 MB of data on every iteration. Interestingly, even VB6 allowed for a progressive index (the optional start argument of InStr).
I would have processed the disk file line by line and plucked out the info as it was read (see StreamReader.ReadLine) to avoid working with a 1 MB string. Pretty much the same method could be used there.
' 1 MB textbox data (!?)
Dim sData As String = TextBox1.Text
' start/stop - probably fake
Dim sStart As String = "start"
Dim sStop As String = "end"
' result
Dim sbResult As New StringBuilder
' progressive index
Dim nNDX As Integer = 0
' shortcut at least as far as typing and readability
Dim MagicNumber As Integer = sStart.Length
' NEXT index of start/stop after nNDX
Dim i As Integer = 0
Dim j As Integer = 0
' loop as long as string remains
Do While nNDX < sData.Length
i = sData.IndexOf(sStart, nNDX) ' start index
If i < 0 Then Exit Do ' no further start marker
j = sData.IndexOf(sStop, i + MagicNumber) ' stop index
If j < 0 Then Exit Do ' start marker without a stop marker
' Extract and append bracketed substring
sbResult.Append(sData.Substring(i + MagicNumber, j - (i + MagicNumber)))
' add a cute comma
sbResult.Append(",")
nNDX = j + sStop.Length ' where we start next time
Loop
' remove last comma, if anything was appended
If sbResult.Length > 0 Then sbResult.Length -= 1
' show my work
Console.WriteLine(sbResult.ToString)
EDIT: Small mod for the ad hoc test data
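The line-by-line variant mentioned above could look roughly like this. It is a sketch that assumes one start/end pair per line, as in the sample data; the function name ExtractIds is made up, and a StringReader stands in for a StreamReader opened on the real file. Unlike the original function, it does not skip an ID that repeats the previous one.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module LineParserSketch
    ' Reads lines from reader and collects the text between startMarker and endMarker on each line.
    Function ExtractIds(reader As TextReader, startMarker As String, endMarker As String) As String
        Dim result As New StringBuilder()
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            Dim i = line.IndexOf(startMarker, StringComparison.Ordinal)
            If i >= 0 Then
                Dim valueStart = i + startMarker.Length
                Dim j = line.IndexOf(endMarker, valueStart, StringComparison.Ordinal)
                If j >= 0 Then
                    ' Separate entries with a comma, without a trailing one.
                    If result.Length > 0 Then result.Append(","c)
                    result.Append(line, valueStart, j - valueStart)
                End If
            End If
            line = reader.ReadLine()
        End While
        Return result.ToString()
    End Function

    Sub Main()
        ' StringReader stands in for New StreamReader(path) on the real file.
        Using reader As New StringReader("start0end" & Environment.NewLine & "start1end")
            Console.WriteLine(ExtractIds(reader, "start", "end"))
        End Using
    End Sub
End Module
```

Only one line is held in memory at a time, so the 1 MB string is never built or copied.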

Reading a file bug in VB.NET?

The file is laid out as follows: a null buffer, then a user checksum, then a byte that gives the letter count of the user name, then a byte for how many bytes to skip to reach the next user, and a byte for which user file the user keeps their settings in.
The If branch (where usersm = users) sets up the whole file stream for extraction. However, with almost exactly the same code, the str.Read(xnl, 0, usn - 1) call in the Else branch appears to read from the very beginning of the file, despite the position of the file stream having been set earlier. Does anyone know what is happening here?
This is in VB 2005.
Private Sub readusersdata(ByVal userdatafile As String)
ListView1.BeginUpdate()
ListView1.Items.Clear()
Using snxl As IO.Stream = IO.File.Open(userdatafile, IO.FileMode.Open)
Using str As New IO.StreamReader(snxl)
str.BaseStream.Position = 4
Dim usersm As Integer = str.BaseStream.ReadByte()
Dim users As Integer = usersm
While users > 0
If usersm = users Then
Dim trailtouser As Integer = 0
str.BaseStream.Position = 6
Dim ust As Integer = str.BaseStream.ReadByte()
str.BaseStream.Position = 8
Dim snb(ust - 1) As Char
str.ReadBlock(snb, 0, ust)
Dim bst = New String(snb)
If usersm = 1 Then
str.BaseStream.Position = 16
Else
str.BaseStream.Position = 15
End If
cLVN(ListView1, bst, str.BaseStream.ReadByte)
str.BaseStream.Position = 8 + snb.Length
str.BaseStream.Position += str.BaseStream.ReadByte + 1
Else
Dim usn As Integer = str.BaseStream.ReadByte
str.BaseStream.Position += 2
Dim chrpos As Integer = str.BaseStream.Position
Dim xnl(usn - 1) As Char
str.Read(xnl, 0, usn - 1)
Dim skpbyte As Integer = str.BaseStream.ReadByte
str.BaseStream.Position += 3
Dim udata As Integer = str.BaseStream.ReadByte
End If
users -= 1
End While
End Using
End Using
ListView1.EndUpdate()
End Sub
When you change the position of the underlying stream, the StreamReader doesn't know you've done that. If it's previously read "too much" data (deliberately, for the sake of efficiency - it tries to avoid doing lots of little reads on the underlying stream) then it will have buffered data that it'll use instead of talking directly to the repositioned stream. You need to call StreamReader.DiscardBufferedData after repositioning the stream to avoid that.
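A minimal sketch of that call sequence, using a MemoryStream with invented contents in place of the user data file (the function name ReadAfterSeek is also made up):

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module DiscardBufferSketch
    ' Reads count characters starting at the given byte position,
    ' after the reader has already buffered data from the start of the stream.
    Function ReadAfterSeek(data As Byte(), position As Long, count As Integer) As String
        Using ms As New MemoryStream(data)
            Using reader As New StreamReader(ms, Encoding.ASCII)
                reader.Read() ' the first read fills the reader's internal buffer
                ms.Position = position ' reposition the underlying stream
                reader.DiscardBufferedData() ' drop the stale buffered data
                Dim buffer(count - 1) As Char
                Dim read = reader.ReadBlock(buffer, 0, count)
                Return New String(buffer, 0, read)
            End Using
        End Using
    End Function

    Sub Main()
        Dim data = Encoding.ASCII.GetBytes("HEADERpayload")
        Console.WriteLine(ReadAfterSeek(data, 6, 7))
    End Sub
End Module
```

Without the DiscardBufferedData call, the ReadBlock would continue from the characters the reader buffered on its first read rather than from the new position. Since the code in the question mixes BaseStream.ReadByte with reader calls throughout, reading everything through a single BinaryReader may be less fragile, though that is a design suggestion rather than part of the answer above.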