Longest Common substring breaking issue - vb.net

Hi I have a function that finds the longest common substring between two strings. It works great except it seems to break when it reaches any single quote mark: '
This causes it to not truly find the longest substring sometimes.
Could anyone help me adjust this function so it includes single quotes in the substring? I know it needs to be escaped someplace I'm just not sure where.
Example:
String 1: Hi there this is jeff's dog.
String 2: Hi there this is jeff's dog.
After running the function the longest common substring would be:
Hi there this is jeff
Edit: seems to also happen with "-" as well.
It will not count anything after the single quote as part of the substring.
Here is the function:
Public Shared Function LongestCommonSubstring(str1 As String, str2 As String, ByRef subStr As String)
Try
subStr = String.Empty
If String.IsNullOrEmpty(str1) OrElse String.IsNullOrEmpty(str2) Then
Return 0
End If
Dim num As Integer(,) = New Integer(str1.Length - 1, str2.Length - 1) {}
Dim maxlen As Integer = 0
Dim lastSubsBegin As Integer = 0
Dim subStrBuilder As New StringBuilder()
For i As Integer = 0 To str1.Length - 1
For j As Integer = 0 To str2.Length - 1
If str1(i) <> str2(j) Then
num(i, j) = 0
Else
If (i = 0) OrElse (j = 0) Then
num(i, j) = 1
Else
num(i, j) = 1 + num(i - 1, j - 1)
End If
If num(i, j) > maxlen Then
maxlen = num(i, j)
Dim thisSubsBegin As Integer = i - num(i, j) + 1
If lastSubsBegin = thisSubsBegin Then
subStrBuilder.Append(str1(i))
Else
lastSubsBegin = thisSubsBegin
subStrBuilder.Length = 0
subStrBuilder.Append(str1.Substring(lastSubsBegin, (i + 1) - lastSubsBegin))
End If
End If
End If
Next
Next
subStr = subStrBuilder.ToString()
Return subStr
Catch e As Exception
Return ""
End Try
End Function

I tried the code you posted on dotnetfiddle and it works there. Please enable warnings in your project: the function declares no return type, yet it returns an integer in one place and a string in another. This is not correct. How are you calling your function?
Here is my example I tested for you:
https://dotnetfiddle.net/mVBDQp
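To illustrate the return-type warning, here is a minimal sketch of the same algorithm with an explicit `As Integer` return type. It returns the length and hands the substring back through `subStr`. Dropping the StringBuilder and the Try/Catch, and the names `LcsDemo` and `endIndex`, are my own simplifications and not part of the posted code.

```vbnet
Imports System

Module LcsDemo
    ' Returns the length of the longest common substring; the substring
    ' itself is handed back through subStr.
    Public Function LongestCommonSubstring(str1 As String, str2 As String, ByRef subStr As String) As Integer
        subStr = String.Empty
        If String.IsNullOrEmpty(str1) OrElse String.IsNullOrEmpty(str2) Then Return 0

        ' num(i, j) = length of the common run ending at str1(i) and str2(j)
        Dim num(str1.Length - 1, str2.Length - 1) As Integer
        Dim maxLen As Integer = 0
        Dim endIndex As Integer = -1

        For i As Integer = 0 To str1.Length - 1
            For j As Integer = 0 To str2.Length - 1
                If str1(i) = str2(j) Then
                    ' The If operator short-circuits, so num(i - 1, j - 1) is never read on the first row or column
                    num(i, j) = If(i = 0 OrElse j = 0, 1, num(i - 1, j - 1) + 1)
                    If num(i, j) > maxLen Then
                        maxLen = num(i, j)
                        endIndex = i
                    End If
                End If
            Next
        Next

        If maxLen > 0 Then subStr = str1.Substring(endIndex - maxLen + 1, maxLen)
        Return maxLen
    End Function

    Sub Main()
        Dim result As String = ""
        Dim length As Integer = LongestCommonSubstring("Hi there this is jeff's dog.", "Hi there this is jeff's dog.", result)
        Console.WriteLine(length & ": " & result)
    End Sub
End Module
```

Characters such as ' and - are compared like any other character here, so no escaping is involved in the function itself.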

Your code works perfectly. As far as I can see, there is nothing wrong with it.
I even tested it with a harsher case:
Public Sub Main()
Dim a As String = ""
Dim str1 As String = "Hi there this is jeff''s dog.-do you recognize this?? This__)=+ is m((a-#-&&*-ry$##! <>Hi:;? the[]{}re this|\ is jeff''s dog." 'Try to trick the logic!
Dim str2 As String = "Hi there this is jeff''s dog. ^^^^This__)=+ is m((a-#-&&*-ry$##! <>Hi:;? the[]{}re this|\ is jeff''s dog."
LongestCommonSubstring(str1, str2, a)
Console.WriteLine(a)
Console.ReadKey()
End Sub
Note that I included all of '-$#^_)=+&|\{}[]?!;:.<> in the inputs. I also tried to trick your code by placing a shorter match early in the strings.
But the result is excellent!
Could you post actual inputs that give you problems? Otherwise, please describe the environment in which you run or deploy the code. The problem may lie elsewhere and not in this function.

The quickest way to solve this would be to pick an escape code and replace every ' with it.

Related

For Loop: changing the loop condition while it is looping

What I want to do is replace every 'A' in a string with "Bb", one at a time. But the loop only runs over the length of the original string, not the new string.
for example:
AAA
BbAA
BbBbA
It stops there because the original string has a length of 3: the loop only reads up to the third character and not the rest.
Dim txt As String
txt = output_text.Text
Dim a As String = a_equi.Text
Dim index As Integer = txt.Length - 1
Dim output As String = ""
For i = 0 To index
If (txt(i) = TextBox1.Text) Then
output = txt.Remove(i, 1).Insert(i, a)
txt = output
TextBox2.Text += txt + Environment.NewLine
End If
Next
End Sub
I think this leaves us looking for a String.ReplaceFirst function. Since there isn't one, we can just write that function. Then the code that calls it becomes much more readable because it's quickly apparent what it's doing (from the name of the function.)
Public Function ReplaceFirst(searched As String, target As String, replacement As String) As String
'This input validation is just for completeness.
'It's not strictly necessary.
'If the searched string is "null", throw an exception.
If (searched Is Nothing) Then Throw New ArgumentNullException("searched")
'If the target string is "null", throw an exception.
If (target Is Nothing) Then Throw New ArgumentNullException("target")
'If the searched string doesn't contain the target string at all
'then just return it - we're done.
Dim foundIndex As Integer = searched.IndexOf(target)
If (foundIndex = -1) Then Return searched
'Build a new string that replaces the target with the replacement.
Return String.Concat(searched.Substring(0, foundIndex), replacement, _
searched.Substring(foundIndex + target.Length, searched.Length - (foundIndex + target.Length)))
End Function
Notice how when you read the code below, you don't even have to spend a moment trying to figure out what it's doing. It's readable. While the input string contains "A", replace the first "A" with "Bb".
Dim input As String = "AAA"
While input.IndexOf("A") > -1
input = ReplaceFirst(input, "A", "Bb")
'If you need to capture individual values of "input" as it changes
'add them to a list.
End While
You could optimize or completely replace the function. What matters is that your code is readable, someone can tell what it's doing, and the ReplaceFirst function is testable.
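As a sketch of what "testable" could mean here, the checks below exercise ReplaceFirst with a few inputs. The function body is a condensed copy of the one above. The `Check` helper and the module name are hypothetical, and a real project would more likely use a unit-test framework.

```vbnet
Imports System

Module ReplaceFirstChecks
    ' Condensed copy of ReplaceFirst from the answer above.
    Public Function ReplaceFirst(searched As String, target As String, replacement As String) As String
        If searched Is Nothing Then Throw New ArgumentNullException("searched")
        If target Is Nothing Then Throw New ArgumentNullException("target")
        Dim foundIndex As Integer = searched.IndexOf(target)
        If foundIndex = -1 Then Return searched
        Return searched.Substring(0, foundIndex) & replacement & searched.Substring(foundIndex + target.Length)
    End Function

    ' Hypothetical helper: reports whether actual matches expected.
    Private Sub Check(actual As String, expected As String)
        Console.WriteLine(If(actual = expected, "ok:   ", "FAIL: ") & actual)
    End Sub

    Sub Main()
        Check(ReplaceFirst("AAA", "A", "Bb"), "BbAA") ' only the first match changes
        Check(ReplaceFirst("xyz", "A", "Bb"), "xyz")  ' no match: input returned as-is
        Check(ReplaceFirst("BbA", "A", "Bb"), "BbBb") ' match at the end of the string
    End Sub
End Module
```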
Then, let's say you wanted another function that gave you all of the "versions" of your input string as the target string is replaced:
Public Function GetIterativeReplacements(searched As String, target As String, replacement As String) As List(of string)
Dim output As New List(Of String)
While searched.IndexOf(target) > -1
searched = ReplaceFirst(searched, target, replacement)
output.Add(searched)
End While
Return output
End Function
If you call
Dim output As List(Of String) = GetIterativeReplacements("AAAA", "A", "Bb")
It's going to return a list of strings containing
BbAAA, BbBbAA, BbBbBbA, BbBbBbBb
It's almost always good to keep methods short. If they start to get too long, just break them into smaller methods with clear names. That way you're not trying to read and follow and test one big, long function. That's difficult whether or not you're a new programmer. The trick isn't being able to create long, complex functions that we understand because we wrote them - it's creating small, simpler functions that anyone can understand.
Check your comments for a better solution, but for future reference you should use a while loop instead of a for loop if your condition will be changing and you're wanting to take that change into account.
I've made a simple example below to help you understand. If you tried the same with a for loop, you'd only get "one" "two" and "three" printed because the for loop doesn't 'see' that vals was changed
Dim vals As New List(Of String)
vals.Add("one")
vals.Add("two")
vals.Add("three")
Dim i As Integer = 0
While i < vals.Count
Console.WriteLine(vals(i))
If vals(i) = "two" Then
vals.Add("four")
vals.Add("five")
End If
i += 1
End While
If you do want to replace one by one instead of using the Replace function, you could use a while loop to look for the index of your search character/string, and then replace/insert at that index.
Sub Main()
Dim a As String = String.Empty
Dim b As String = String.Empty
Dim c As String = String.Empty
Dim d As Int32 = -1
Console.Write("Whole string: ")
a = Console.ReadLine()
Console.Write("Replace: ")
b = Console.ReadLine()
Console.Write("Replace with: ")
c = Console.ReadLine()
d = a.IndexOf(b)
While d > -1
a = a.Remove(d, b.Length)
a = a.Insert(d, c)
d = a.LastIndexOf(b)
End While
Console.WriteLine("Finished string: " & a)
Console.ReadLine()
End Sub
Output would look like this:
Whole string: This is A string for replAcing chArActers.
Replace: A
Replace with: Bb
Finished string: This is Bb string for replBbcing chBbrBbcters.
I was going to write a while loop to answer your question, but realized (with assistance from others) that you could just .replace(x,y)
Output.Text = Input.Text.Replace("A", "Bb")
'Input = N A T O
'Output = N Bb T O
Edit: There is probably a better alternative, but I quickly jotted this loop down. I hope it helps.
You've said you're new and don't fully understand while loops. If you don't understand functions or how to pass arguments to them either, I'd suggest looking that up too.
This is your event. It can be a button click or a textbox text change.
'Cut & Paste into an Event (Change textboxes to whatever you have input/output)
Dim Input As String = textbox1.Text
Do While Input.Contains("A")
Input = ChangeString(Input, "A", "Bb")
' Do whatever you like with each return of ChangeString() here
Loop
textbox2.Text = Input
This is your Function, with 3 Arguments and a Return Value that can be called in your code
' Cut & Paste into Code somewhere (not inside another sub/Function)
Private Function ChangeString(Input As String, LookFor As Char, ReplaceWith As String) As String
Dim Output As String = Nothing
Dim cFlag As Boolean = False
For i As Integer = 0 To Input.Length - 1
Dim c As Char = Input(i)
If (c = LookFor) AndAlso (cFlag = False) Then
Output += ReplaceWith
cFlag = True
Else
Output += c
End If
Next
Console.WriteLine("Output: " & Output)
Return Output
End Function

Increment character in a string

I have a 2 character string composed only of the 26 capital alphabet letters, 'A' through 'Z'.
We have a way of knowing the "highest" used value (e.g. "IJ" in {"AB", "AC", "DD", "IH", "IJ"}). We'd like to get the "next" value ("IK" if "IJ" is the "highest").
Function GetNextValue(input As String) As String
Dim first = input(0)
Dim last = input(1)
If last = "Z"c Then
If first = "Z"c Then Return Nothing
last = "A"c
first++
Else
last++
End If
Return first & last
End Function
Obviously char++ is not valid syntax in VB.NET, although C# apparently allows it. Is there something shorter or less ugly than this that would increment a letter? (Note: Option Strict is on.)
CChar(CInt(char)+1).ToString
Edit: As noted in comment/answers, the above line won't even compile. You can't convert from Char -> Integer at all in VB.NET.
The tidiest so far is simply:
Dim a As Char = "a"c
a = Chr(Asc(a) + 1)
This still needs handling for the "z" boundary condition though, depending on what behaviour you require.
Interestingly, converting char++ through developerfusion suggests that char += 1 should work. It doesn't. (VB.Net doesn't appear to implicitly convert from char to int16 as C# does).
To make things really nice you can do the increment in an Extension by passing the char byref. This now includes some validation and also a reset back to a:
<Extension>
Public Sub Inc(ByRef c As Char)
'Remember if input is uppercase for later
Dim isUpper = Char.IsUpper(c)
'Work in lower case for ease
c = Char.ToLower(c)
'Check input range
If c < "a"c OrElse c > "z"c Then Throw New ArgumentOutOfRangeException("c")
'Do the increment
c = Chr(Asc(c) + 1)
'Check not left alphabet
If c > "z"c Then c = "a"c
'Check if input was upper case
If isUpper Then c = Char.ToUpper(c)
End Sub
Then you just need to call:
Dim a As Char = "a"c
a.Inc() 'a is now = "b"
My answer will support up to 10 characters, but can easily support more.
Private Sub Test
'Shows "AB"
MsgBox(ConvertBase10ToBase26(ConvertBase26ToBase10("AA") + 1))
End Sub
'Converts a positive number to letters (1 = "A", 26 = "Z", 27 = "AA").
Public Function ConvertBase10ToBase26(ToConvert As Long) As String
Dim result As String = ""
While ToConvert > 0
'Shift to a 0-based value so that 26 maps to "Z" rather than to a zero digit
ToConvert -= 1
result = Chr(CInt(ToConvert Mod 26) + 65) & result
ToConvert \= 26
End While
Return result
End Function
'Converts letters to a number, reading the leftmost letter as the most significant.
Public Function ConvertBase26ToBase10(ToConvert As String) As Long
Dim result As Long = 0
For pos As Integer = 0 To ToConvert.Length - 1
result = result * 26 + (Asc(ToConvert(pos)) - 64)
Next
Return result
End Function
Unfortunately, there's no easy way -- even CChar(CInt(char)+1).ToString doesn't work. It's even uglier:
CChar(Char.ConvertFromUtf32(Char.ConvertToUtf32(myCharacter, 0) + 1))
but of course you could always put that in a function with a short name or, like Jon E. pointed out, an extension method.
Try this
Private Function IncBy1(input As String) As String
Static ltrs As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
Dim first As Integer = ltrs.IndexOf(input(0))
Dim last As Integer = ltrs.IndexOf(input(1))
last += 1
If last = ltrs.Length Then
last = 0
first += 1
End If
If first = ltrs.Length Then Return Nothing
Return ltrs(first) & ltrs(last)
End Function
This DOES assume that the code is only two chars, and are A-Z only.
Dim N As String = "A" 'Must contain exactly one character
Dim ch As Char = Convert.ToChar(N)
Dim a As String = Char.ConvertFromUtf32(Char.ConvertToUtf32(ch, 0) + 1) 'a is now "B"

how to find the number of occurrences of a substring within a string vb.net

I have a string (for example: "Hello there. My name is John. I work very hard. Hello there!") and I am trying to find the number of occurrences of the string "hello there". So far, this is the code I have:
Dim input as String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase as String = "hello there"
Dim Occurrences As Integer = 0
If input.toLower.Contains(phrase) = True Then
Occurrences = input.Split(phrase).Length
'REM: Do stuff
End If
Unfortunately, what this line of code seems to do is split the string every time it sees the first letter of phrase, in this case, h. So instead of the result Occurrences = 2 that I would hope for, I actually get a much larger number. I know that counting the number of splits in a string is a horrible way to go about doing this, even if I did get the correct answer, so could someone please help me out and provide some assistance?
Yet another idea:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "Hello there"
Dim Occurrences As Integer = (input.Length - input.Replace(phrase, String.Empty).Length) \ phrase.Length
You just need to make sure that phrase.Length > 0.
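The guard mentioned above can be folded into a small helper. This is only a sketch: the name `CountOccurrences` and the `ignoreCase` parameter are my own additions, and it counts non-overlapping matches with IndexOf instead of the Replace arithmetic, so that a case-insensitive search does not need a lowered copy of the input.

```vbnet
Imports System

Module CountDemo
    ' Counts non-overlapping occurrences of phrase in input.
    ' Returns 0 for a null or empty input or phrase, which avoids dividing by zero or looping forever.
    Public Function CountOccurrences(input As String, phrase As String, Optional ignoreCase As Boolean = False) As Integer
        If String.IsNullOrEmpty(input) OrElse String.IsNullOrEmpty(phrase) Then Return 0

        Dim comparison As StringComparison = If(ignoreCase, StringComparison.OrdinalIgnoreCase, StringComparison.Ordinal)
        Dim count As Integer = 0
        Dim position As Integer = input.IndexOf(phrase, 0, comparison)

        While position >= 0
            count += 1
            ' Continue after the match so the same text is not counted twice
            position = input.IndexOf(phrase, position + phrase.Length, comparison)
        End While

        Return count
    End Function

    Sub Main()
        Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
        Console.WriteLine(CountOccurrences(input, "hello there", ignoreCase:=True)) ' prints 2
        Console.WriteLine(CountOccurrences(input, "hello there"))                   ' prints 0
    End Sub
End Module
```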
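One way to do it is with Regex.Split. Escape the search string first so that characters such as "." or "(" are matched literally:
Public Function countString(ByVal inputString As String, ByVal stringToBeSearchedInsideTheInputString As String) As Integer
Return System.Text.RegularExpressions.Regex.Split(inputString, System.Text.RegularExpressions.Regex.Escape(stringToBeSearchedInsideTheInputString)).Length - 1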
End Function
str="Thisissumlivinginsumgjhvgsum in the sum bcoz sum ot ih sum"
b= LCase(str)
array1=Split(b,"sum")
l=Ubound(array1)
msgbox l
The output gives you the number of occurrences of one string within another.
You can create a Do Until loop that stops once a cursor variable reaches the length of the string you're checking. If the phrase is found, increment your occurrence count and advance the cursor by the position at which the phrase was found plus the length of the phrase. If the phrase cannot be found, there are no more results, so set the cursor to the length of the target string to end the loop. To avoid counting the same occurrence more than once, search only from the cursor to the end of the target string on each pass (strCheckThisString).
Dim input As String = "hello there. this is a test. hello there hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer = 0
Dim intCursor As Integer = 0
Do Until intCursor >= input.Length
Dim strCheckThisString As String = Mid(LCase(input), intCursor + 1, (Len(input) - intCursor))
Dim intPlaceOfPhrase As Integer = InStr(strCheckThisString, phrase)
If intPlaceOfPhrase > 0 Then
Occurrences += 1
intCursor += (intPlaceOfPhrase + Len(phrase) - 1)
Else
intCursor = input.Length
End If
Loop
You just have to change the input of the Split function into a string array and then declare the StringSplitOptions.
Try out this line of code:
Occurrences = input.Split({phrase}, StringSplitOptions.None).Length
I haven't checked this, but I think you'll also have to account for the result being one too high: splitting on the phrase gives you the number of pieces, not the number of times the phrase appears. So I think you need Occurrences = Occurrences - 1.
Hope this helps
You could create a recursive function using IndexOf. Passing the string to be searched and the string to locate, each recursion increments a Counter and sets the StartIndex to +1 the last found index, until the search string is no longer found. Function will require optional parameters Starting Position and Counter passed by reference:
Function InStrCount(ByVal SourceString As String, _
ByVal SearchString As String, _
Optional ByRef StartPos As Integer = 0, _
Optional ByRef Count As Integer = 0) As Integer
If SourceString.IndexOf(SearchString, StartPos) > -1 Then
Count += 1
InStrCount(SourceString, _
SearchString, _
SourceString.IndexOf(SearchString, StartPos) + 1, _
Count)
End If
Return Count
End Function
Call function by passing string to search and string to locate and, optionally, start position:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer
Occurrences = InStrCount(input.ToLower, phrase.ToLower)
Note the use of .ToLower, which makes the comparison ignore case. Omit it if you want the comparison to be case-sensitive.
One more solution, based on the InStr(i, str, substr) function, which searches for substr in str starting from position i:
Function findOccurancesCount(baseString, subString)
occurancesCount = 0
i = 1
Do
foundPosition = InStr(i, baseString, subString) 'searching from i position
If foundPosition > 0 Then 'substring is found at foundPosition index
occurancesCount = occurancesCount + 1 'count this occurance
i = foundPosition + 1 'searching from i+1 on the next cycle
End If
Loop While foundPosition <> 0
findOccurancesCount = occurancesCount
End Function
As soon as no substring is found (InStr returns 0 instead of the position of the substring in the base string), the search is over and the occurrence count is returned.
Looking at your original attempt, I found that this should do the trick, since Split creates an array. Note that the Split function used here takes the whole phrase as the delimiter:
Occurrences = UBound(Split(input, phrase))
This is case-sensitive, so in your case the phrase should be "Hello there", as there is no "hello there" in the input.
Expanding on Sumit Kumar's simple solution, here it is as a one-line working function:
Public Function fnStrCnt(ByVal str As String, ByVal substr As String) As Integer
fnStrCnt = UBound(Split(LCase(str), substr))
End Function
Demo:
Sub testit()
Dim thePhrase As String
thePhrase = "Once upon a midnight dreary while a man was in a house in the usa."
If fnStrCnt(thePhrase, " a ") > 1 Then
MsgBox("Found " & fnStrCnt(thePhrase, " a ") & " occurrences.")
End If
End Sub 'testit()
I don't know if this is more obvious?
Starting from the beginning of longString, check the next characters, up to the number of characters in phrase. If phrase is not found, start looking from the second character, and so on. If it is found, start again from the current position plus the number of characters in phrase, and increment the value of occurences.
Module Module1
Sub Main()
Dim longString As String = "Hello there. My name is John. I work very hard. Hello there! Hello therehello there"
Dim phrase As String = "hello There"
Dim occurences As Integer = 0
Dim n As Integer = 0
Do Until n >= longString.Length - (phrase.Length - 1)
If longString.ToLower.Substring(n, phrase.Length).Contains(phrase.ToLower) Then
occurences += 1
n = n + (phrase.Length - 1)
End If
n += 1
Loop
Console.WriteLine(occurences)
End Sub
End Module
I used this in Vbscript, You can convert the same to VB.net as well
Dim str, strToFind
str = "sdfsdf:sdsdgs::"
strToFind = ":"
MsgBox GetNoOfOccurranceOf( strToFind, str)
Function GetNoOfOccurranceOf(ByVal subStringToFind, ByVal strReference)
Dim iTotalLength, newString, iTotalOccCount
iTotalLength = Len(strReference)
newString = Replace(strReference, subStringToFind, "")
'Divide by the length of the search string so that multi-character strings are counted correctly
iTotalOccCount = (iTotalLength - Len(newString)) \ Len(subStringToFind)
GetNoOfOccurranceOf = iTotalOccCount
End Function
I know this thread is really old, but I got another solution too:
Function countOccurencesOf(needle As String, s As String) As Integer
Dim count As Integer = 0
For i As Integer = 0 to s.Length - 1
If s.Substring(i).Startswith(needle) Then
count = count + 1
End If
Next
Return count
End Function

get string between other string vb.net

I have code below. How do I get strings inside brackets? Thank you.
Dim tmpStr() As String
Dim strSplit() As String
Dim strReal As String
Dim i As Integer
strWord = "hello (string1) there how (string2) are you?"
strSplit = Split(strWord, "(")
strReal = strSplit(LBound(strSplit))
For i = 1 To UBound(strSplit)
tmpStr = Split(strSplit(i), ")")
strReal = strReal & tmpStr(UBound(tmpStr))
Next
Dim src As String = "hello (string1) there how (string2) are you?"
Dim strs As New List(Of String)
Dim start As Integer = 0
Dim [end] As Integer = 0
While start < src.Length
start = src.IndexOf("("c, start)
If start <> -1 Then
[end] = src.IndexOf(")"c, start)
If [end] <> -1 Then
Dim subStr As String = src.Substring(start + 1, [end] - start - 1)
If Not subStr.StartsWith("(") Then strs.Add(src.Substring(start + 1, [end] - start - 1))
End If
Else
Exit While
End If
start += 1 ' Increment start to skip to next (
End While
This should do it.
Dim result = Regex.Matches(src, "\(([^()]*)\)").Cast(Of Match)().Select(Function(x) x.Groups(1))
Would also work.
This is what regular expressions are for. Learn them, love them:
' Imports System.Text.RegularExpressions
Dim matches = Regex.Matches(input, "\(([^)]*)\)").Cast(of Match)()
Dim result = matches.Select(Function (x) x.Groups(1))
Two lines of code instead of more than 10.
In the words of Stephan Lavavej: “Even intricate regular expressions are easier to understand and modify than equivalent code.”
Use String.IndexOf to get the position of the first opening bracket (x).
Use IndexOf again the get the position of the first closing bracket (y).
Use String.Substring to get the text based on the positions from x and y.
Remove beginning of string up to y+1.
Loop as required
That should get you going.
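The steps above can be sketched as follows. The names `GetBracketed`, `remaining`, and `BracketDemo` are hypothetical, and the sketch assumes brackets are not nested.

```vbnet
Imports System
Imports System.Collections.Generic

Module BracketDemo
    ' Collects the text between each "(" and the next ")" after it.
    Public Function GetBracketed(source As String) As List(Of String)
        Dim results As New List(Of String)
        Dim remaining As String = source

        While True
            Dim x As Integer = remaining.IndexOf("("c)        ' first opening bracket
            If x = -1 Then Exit While
            Dim y As Integer = remaining.IndexOf(")"c, x + 1) ' first closing bracket after it
            If y = -1 Then Exit While

            results.Add(remaining.Substring(x + 1, y - x - 1))
            remaining = remaining.Substring(y + 1)            ' drop everything up to and including ")"
        End While

        Return results
    End Function

    Sub Main()
        For Each item As String In GetBracketed("hello (string1) there how (string2) are you?")
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

Run against the string from the question, this prints string1 and string2 on separate lines.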
This may also work:
Dim myString As String = "Hello (FooBar) World"
Dim finalString As String = myString.Substring(myString.IndexOf("("), (myString.LastIndexOf(")") - myString.IndexOf("(")) + 1)
Also 2 lines.

Help Visual Basic mixing characters

I'm making an application that swaps each pair of adjacent characters in every word of a text.
Imports System.IO
Module Module1
Sub Main()
Dim str As String = File.ReadAllText("File.txt")
Dim str2 As String() = Split(str, " ")
For i As Integer = 0 To str2.Length - 1
Dim arr As Char() = CType(str2(i), Char())
For ia As Integer = 0 To arr.Length() - 1 Step 2
Dim pa As String
pa = arr(ia + 1)
arr(ia + 1) = arr(ia)
arr(ia) = pa
Next ia
For ib As Integer = 0 To arr.Length - 1
Console.Write(arr(ib))
File.WriteAllText("File2.txt", arr(ib))
Next ib
File.WriteAllText("File2.txt", " ")
Console.Write(" ")
Next i
Console.Read()
End Sub
End Module
For example:
Input: ab
Output: ba
Input: asdasd asdasd
Output: saadds saadds
The program mixes the characters correctly and writes the text to the console, but it doesn't write the text to the file.
Note: The program only works with words whose length is divisible by 2, but that's not a problem.
Also, it does not show any error message.
Your code is overwriting the file that you have already written with a single space (" ") each time round.
You should only open the file once, and append to it using a stream writer:
Using output = File.CreateText("file2.txt")
' Put the for loop here.
End Using
There are some other things wrong with your code. Firstly, use For Each instead of For where you can; this makes your code much simpler and more readable. Secondly, try to avoid loops altogether where possible. For instance, instead of iterating over the characters to output them one at a time, just create a new string from the char array, and write that:
Dim shuffledWord As New String(arr)
output.Write(shuffledWord)
Some of your types are plain wrong, i.e. you are using String in places instead of Char. You should always use Option Strict On. Then the compiler will not tolerate such code.
You should also prefer to use framework methods over VB-specific methods. This makes it easier to understand for C# programmers, and also makes it easier to translate and change (that is, use the Split method of strings instead of a free function, use ToCharArray instead of a cast to Char() …).
Finally, use meaningful variable names. str, str2 and arr are particularly cryptic because they don’t tell the reader of the code anything of interest about the variables.
Sub Main()
Dim text As String = File.ReadAllText("File.txt")
Dim words As String() = text.Split(" "c)
Using output = File.CreateText("file2.txt")
For Each word In words
Dim wordChars = word.ToCharArray()
'Swap each adjacent pair; stopping at Length - 2 leaves the last character of an odd-length word in place
For i As Integer = 0 To wordChars.Length - 2 Step 2
Dim tmp As Char = wordChars(i + 1)
wordChars(i + 1) = wordChars(i)
wordChars(i) = tmp
Next
Dim shuffledWord As New String(wordChars)
output.Write(shuffledWord + " ")
Console.Write(shuffledWord + " ")
Next
End Using
Console.Read()
End Sub