VB.NET find and replace within paragraph

What would be an efficient way to read a paragraph and replace anything between square brackets []? In my case, the following paragraph,
I agree to [Terms of Use|https://www.google.com/terms-use], [Privacy Statement|https://www.google.com/privacy-statement] and [Misc Term|https://www.google.com/misc-terms]
should be parsed as,
I agree to Terms of Use, Privacy Statement and Misc Term

You can use the following regular expression pattern to match the brackets:
\[[^]]+]
This is what the pattern means:
\[ match an open bracket
[^]]+ match anything but a closed bracket one or more times
] match a closed bracket
Here is an example fiddle: https://regex101.com/r/Iyk9kR/1
Once you have the matches, you would:
1. Use Regex.Replace (documentation)
2. Use the MatchEvaluator overload
3. In the MatchEvaluator, use String.Split (documentation) on the pipe character
4. Dynamically build your anchor tag by setting the href attribute to the second element of the split and the innerHTML to the first element of the split
It might be worth adding some conditional checking in between steps 2 and 3, but I'm not sure of your exact requirements.
Here is an example:
Imports System
Imports System.Text.RegularExpressions
Public Module Module1
    Public Sub Main()
        Dim literal = "I agree to [Terms of Use|https://www.google.com/terms-use], [Privacy Statement|https://www.google.com/privacy-statement] and [Misc Term|https://www.google.com/misc-terms]"
        Dim regexPattern = "\[[^]]+]"
        Dim massagedLiteral = Regex.Replace(literal, regexPattern, AddressOf ConvertToAnchor)
        Console.WriteLine(massagedLiteral)
    End Sub
    Private Function ConvertToAnchor(m As Match) As String
        ' Drop the surrounding brackets, then split "text|url" on the pipe
        Dim parts = m.Value.Substring(1, m.Value.Length - 2).Split("|"c)
        ' No pipe inside the brackets: leave the match unchanged
        If parts.Length < 2 Then Return m.Value
        Return "<a href=""" & parts(1) & """>" & parts(0) & "</a>"
    End Function
End Module
Fiddle: https://dotnetfiddle.net/xUE8St

Dim rtnStr = "String goes here"
Dim pattern As String = "\[.*?\]"
' Regex.Matches returns an empty collection when nothing matches, so no separate IsMatch check is needed
For Each m As Match In Regex.Matches(rtnStr, pattern)
    Dim token = m.Value
    Dim splitBracket = token.Split("|"c)
    ' Only convert tokens that contain a pipe, i.e. [name|url]
    If splitBracket.Length >= 2 Then
        Dim linkName = splitBracket(0).Replace("[", "")
        Dim linkUrl = splitBracket(1).Replace("]", "")
        ' The data-url value is quoted so the attribute stays valid
        Dim linkHtml = "<a class=""terms"" href=""javascript: void(0);"" data-url=""" & linkUrl & """>" & linkName & "</a>"
        rtnStr = rtnStr.Replace(token, linkHtml)
    End If
Next
@Html.Raw(rtnStr)
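If only the link text is needed as plain text, as in the expected output shown in the question, a capture-group replacement may be enough. The following is a minimal sketch under that assumption; the module name LinkTextDemo and the helper StripLinks are hypothetical names used for illustration only.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module LinkTextDemo
    ' Hypothetical helper: replaces each [text|url] token with just "text".
    ' Bracketed tokens without a pipe are left untouched.
    Public Function StripLinks(input As String) As String
        ' Group 1 captures everything up to the pipe; the URL part is matched but discarded.
        Return Regex.Replace(input, "\[([^|\]]+)\|[^\]]*\]", "$1")
    End Function

    Public Sub Main()
        Dim literal = "I agree to [Terms of Use|https://www.google.com/terms-use], [Privacy Statement|https://www.google.com/privacy-statement] and [Misc Term|https://www.google.com/misc-terms]"
        ' Prints: I agree to Terms of Use, Privacy Statement and Misc Term
        Console.WriteLine(StripLinks(literal))
    End Sub
End Module
```

Because the replacement string refers to the group as $1, no MatchEvaluator is needed for this simpler case.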

Related

Get a specific value from the line in brackets (Visual Studio 2019)

I would like to ask for your help with my problem. I want to create a module for my program that reads a .txt file, finds a specific value and inserts it into a text box.
As an example I have a text file called system.txt which contains single line text. The text is something like this:
[Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]
What I want to do is get only the last name value "xxx_xxx", which can be different every time, and insert it into my form's text box.
I'm totally new to programming. I was looking at other examples but couldn't find anything that fits my situation exactly.
Here is what I have written so far, but I have no idea whether there is any logic in my code:
Dim field As New List(Of String)
Private Sub readcrnFile()
For Each line In File.ReadAllLines(C:\test\test_1\db\update\network\system.txt)
For i = 1 To 3
If line.Contains("Last Name=" & i) Then
field.Add(line.Substring(line.IndexOf("=") + 2))
End If
Next
Next
End Sub
You can get this down to a function with a single line of code:
Private Function readcrnFile(fileName As String) As IEnumerable(Of String)
    Return File.ReadLines(fileName).Where(Function(line) Regex.IsMatch(line, "[[[]Last Name=(?<LastName>[^]]+)]")).Select(Function(line) Regex.Match(line, "[[[]Last Name=(?<LastName>[^]]+)]").Groups("LastName").Value)
End Function
But for readability/maintainability and to avoid repeating the expression evaluation on each line I'd spread it out a bit:
Private Function readcrnFile(fileName As String) As IEnumerable(Of String)
Dim exp As New RegEx("[[[]Last Name=(?<LastName>[^]]+)]")
Return File.ReadLines(fileName).
Select(Function(line) exp.Match(line)).
Where(Function(m) m.Success).
Select(Function(m) m.Groups("LastName").Value)
End Function
See a simple example of the expression here:
https://dotnetfiddle.net/gJf3su
Dim strval As String = " [Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]"
Dim strline() As String = strval.Split(New String() {"[", "]"}, StringSplitOptions.RemoveEmptyEntries) _
.Where(Function(s) Not String.IsNullOrWhiteSpace(s)) _
.ToArray()
Dim lastnameArray() = strline(1).Split("=")
Dim lastname = lastnameArray(1).ToString()
Using your sample data...
I read the file and trim off the first and last bracket symbols. The small c following each of the two strings tells the compiler that it is a Char. The braces enclose an array of Char, which is what the Trim method expects.
Next we split the file text into an array of strings with the .Split method. We need to use an overload that accepts a String. Although the docs show Split(String, StringSplitOptions), I could only get it to work with a string array with a single element, Split(String(), StringSplitOptions); the single-String overload is only available in .NET Core 2.0 and later, not in .NET Framework.
Then I loop through the string array called splits, checking for an element that starts with "Last Name=". As soon as we find it, we return the substring that starts at index 10 (indexes are zero-based), which is the length of "Last Name=".
If no match is found, an empty string is returned.
Private Function readcrnFile() As String
Dim LineInput = File.ReadAllText("system.txt").Trim({"["c, "]"c})
Dim splits = LineInput.Split({"]["}, StringSplitOptions.None)
For Each s In splits
If s.StartsWith("Last Name=") Then
Return s.Substring(10)
End If
Next
Return ""
End Function
Usage...
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
TextBox1.Text = readcrnFile()
End Sub
You can easily split that line in an array of strings using as separators the [ and ] brackets and removing any empty string from the result.
Dim input As String = "[Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]"
Dim parts = input.Split(New Char() {"["c, "]"c}, StringSplitOptions.RemoveEmptyEntries)
At this point you have an array of strings, and you can loop over it to find the entry that starts with the last name key. When you find it, you can split it at the = character and take the second element of the resulting array.
For Each p As String In parts
If p.StartsWith("Last Name") Then
Dim data = p.Split("="c)
field.Add(data(1))
Exit For
End If
Next
Of course, if you are sure that the second entry in each line is the Last Name entry then you can remove the loop and go directly for the entry
Dim data = parts(1).Split("="c)
A more compact way to remove the For Each loop is to use the IEnumerable extension methods available in the System.Linq namespace.
So, for example, the loop above could be replaced with
field.Add((parts.FirstOrDefault(Function(x) x.StartsWith("Last Name"))).Split("="c)(1))
As you can see, it is a lot more obscure, and probably not a good way to do it anyway, because there is no check for the case where the Last Name key is missing from the input string: FirstOrDefault then returns Nothing and the call to Split throws a NullReferenceException.
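One way to keep the LINQ lookup while guarding against a missing key is sketched below. The module name LastNameDemo and the helper FindLastName are hypothetical, and returning Nothing for a missing key is an assumption about what the caller wants.

```vbnet
Imports System
Imports System.Linq

Public Module LastNameDemo
    ' Hypothetical helper: returns the Last Name value, or Nothing when the key is absent.
    Public Function FindLastName(input As String) As String
        Dim parts = input.Split(New Char() {"["c, "]"c}, StringSplitOptions.RemoveEmptyEntries)
        Dim entry = parts.FirstOrDefault(Function(x) x.StartsWith("Last Name="))
        ' FirstOrDefault yields Nothing when no entry matches, so check before using it.
        If entry Is Nothing Then Return Nothing
        ' Take everything after the key, so values that contain "=" are preserved.
        Return entry.Substring("Last Name=".Length)
    End Function

    Public Sub Main()
        ' Prints: xxx_xxx
        Console.WriteLine(FindLastName("[Name=John][Last Name=xxx_xxx][Address=xxxx][Age=22][Phone Number=8454845]"))
        ' Prints: True
        Console.WriteLine(FindLastName("[Name=John]") Is Nothing)
    End Sub
End Module
```

Using Substring rather than a second Split is a design choice here: it avoids truncating a value that happens to contain an equals sign.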
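You should first know the difference between File.ReadAllLines() and File.ReadLines(): ReadAllLines() loads the whole file into a String array before returning, while ReadLines() returns a lazy enumerable that reads one line at a time, so you can stop as soon as you find the value.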
Then, here's an example using only two simple string manipulation functions, String.IndexOf() and String.Substring():
Sub Main(args As String())
Dim entryMarker As String = "[Last Name="
Dim closingMarker As String = "]"
Dim FileName As String = "C:\test\test_1\db\update\network\system.txt"
Dim value As String = readcrnFile(entryMarker, closingMarker, FileName)
If Not IsNothing(value) Then
Console.WriteLine("value = " & value)
Else
Console.WriteLine("Entry not found")
End If
Console.Write("Press Enter to Quit...")
Console.ReadKey()
End Sub
Private Function readcrnFile(ByVal entry As String, ByVal closingMarker As String, ByVal fileName As String) As String
Dim entryIndex As Integer
Dim closingIndex As Integer
For Each line In File.ReadLines(fileName)
entryIndex = line.IndexOf(entry) ' see if the marker is in our line
If entryIndex <> -1 Then
closingIndex = line.IndexOf(closingMarker, entryIndex + entry.Length) ' find first "]" AFTER our entry marker
If closingIndex <> -1 Then
' calculate the starting position and length of the value after the entry marker
Dim startAt As Integer = entryIndex + entry.Length
Dim length As Integer = closingIndex - startAt
Return line.Substring(startAt, length)
End If
End If
Next
Return Nothing
End Function

How to find a phrase in a string based on a character

Let's say I have the following string:
sdfhahsdfu^asdhfhasdf^asd7f8asdfh^asdfhasdf^testemail#email.com^asdhfausdf^asodfuasdufh^alsdfhasdh
What's the best way of extracting the email from that string? I thought of maybe split(string, "#") but then I'm not sure where to go from there.
Note: the email will always be flanked by ^ on either side, but the position in the string will be different depending on the string.
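You can use Regex to find your string. Regex.Match takes the input first and the pattern second, and group 1 holds the email without the surrounding carets. Try something like:
System.Text.RegularExpressions.Regex.Match(myString, "\^([^\^]+#[^\^]+)\^").Groups(1).Value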
I would split over ^ and then loop through all items to find something containing a #
'1 form with:
' 1 command button: name=Command1
Option Explicit
Private Sub Command1_Click()
Dim lngItem As Long
Dim strString As String
Dim strItem() As String
strString = "sdfhahsdfu^asdhfhasdf^asd7f8asdfh^asdfhasdf^testemail#email.com^asdhfausdf^asodfuasdufh^alsdfhasdh"
strItem = Split(strString, "^")
For lngItem = 0 To UBound(strItem)
If InStr(strItem(lngItem), "#") > 0 Then
DoEmail strItem(lngItem)
End If
Next lngItem
End Sub
Private Sub DoEmail(strEmail As String)
Print strEmail
End Sub
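The example above is written in classic VB style. For VB.NET, the same split-and-search idea might look like the sketch below. The module name EmailDemo and the helper FindField are hypothetical, and the marker character is passed in as a parameter so it can match whatever separator the real data uses.

```vbnet
Imports System
Imports System.Linq

Public Module EmailDemo
    ' Hypothetical helper: returns the first ^-delimited field that contains the marker,
    ' or Nothing when no field contains it.
    Public Function FindField(input As String, marker As Char) As String
        Return input.Split("^"c).FirstOrDefault(Function(item) item.IndexOf(marker) >= 0)
    End Function

    Public Sub Main()
        Dim strString = "sdfhahsdfu^asdhfhasdf^asd7f8asdfh^asdfhasdf^testemail#email.com^asdhfausdf^asodfuasdufh^alsdfhasdh"
        ' Prints: testemail#email.com
        Console.WriteLine(FindField(strString, "#"c))
    End Sub
End Module
```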

How to filter anything but numbers from a string

I want to filter out other characters from a string as well as split the remaining numbers with periods.
This is my string: major.number=9minor.number=10revision.number=0build.number=804
and this is the expected output: 9.10.0.804
Any suggestions?
As per my comment, if your text is going to be constant you can use String.Split to remove the text and String.Join to add your delimiters. Here is a quick example using your string.
Sub Main()
Dim value As String = "major.number=9minor.number=10revision.number=0build.number=804"
Dim seperator() As String = {"major.number=", "minor.number=", "revision.number=", "build.number="}
Console.WriteLine(String.Join(".", value.Split(seperator, StringSplitOptions.RemoveEmptyEntries)))
Console.ReadLine()
End Sub
If your string does not always follow a specific pattern, you could use Regex.Replace:
Sub Main()
    Dim value As String = "major.number=9minor.number=10revision.number=0build.number=804"
    Dim version As String = Regex.Replace(value, "\D*(\d+)\D*", "$1.") ' Keep each run of digits, followed by a dot
    version = version.Substring(0, version.Length - 1) ' Trim the trailing dot
    Console.WriteLine(version)
End Sub
Note that you need Imports System.Text.RegularExpressions.
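An alternative that avoids trimming a trailing dot is to collect every run of digits and join them. This is a minimal sketch of that idea; the module name VersionDemo and the helper ExtractVersion are hypothetical.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Public Module VersionDemo
    ' Hypothetical helper: joins every run of digits in the input with periods.
    Public Function ExtractVersion(value As String) As String
        ' MatchCollection is cast explicitly so this also works on older frameworks.
        Dim numbers = Regex.Matches(value, "\d+").Cast(Of Match)().Select(Function(m) m.Value)
        Return String.Join(".", numbers)
    End Function

    Public Sub Main()
        ' Prints: 9.10.0.804
        Console.WriteLine(ExtractVersion("major.number=9minor.number=10revision.number=0build.number=804"))
    End Sub
End Module
```

If the input contains no digits at all, this returns an empty string instead of failing.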

vb2008 match text between html tags

Hello, I'm using Visual Basic 2008 Express Edition.
How is it possible to match text between tags?
For example, I have a string: <data>Text</data>more text... How can I get the Text which is inside <data></data>? (.Replace won't help.)
Thanks
My solution :
Public Function parseText(ByVal str As String, ByVal tag As String) As String
Dim match As Match = Regex.Match(str, "<" & tag & "\b[^>]*>(.*?)</" & tag & ">")
If match.Success Then
Return match.Groups(1).Value
Else
Return "0"
End If
End Function
I use this because in my case the tags will be always without id, class, width, href, src, style .... just tag name (ex:<data><str><text>...)
You can use RegularExpressions.
Dim s As String = "<data>Hello world</data>"
Dim match As Match = Regex.Match(s, "<data\b[^>]*>(.*?)</data>")
Dim text As String
If match.Success Then
text = match.Groups(1).Value
End If
Use the HTML Agility Pack to parse the HTML string and then query the resulting object for the values you want.
The source download comes with many example projects.
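As a rough sketch of what that could look like, assuming the HtmlAgilityPack NuGet package is referenced by the project (the module name AgilityDemo and the sample string are only for illustration):

```vbnet
Imports System
Imports HtmlAgilityPack

Public Module AgilityDemo
    Public Sub Main()
        Dim html = "<data>Text</data>more text..."
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)
        ' SelectSingleNode takes an XPath expression and returns Nothing when no node matches.
        Dim node = doc.DocumentNode.SelectSingleNode("//data")
        If node IsNot Nothing Then
            Console.WriteLine(node.InnerText)
        Else
            Console.WriteLine("No <data> element found")
        End If
    End Sub
End Module
```

A parser is usually a safer choice than a regular expression once the tags can carry attributes or be nested.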
This may help you
Dim findtext2 As String = "(?<=<data>)(.*?)(?=</data>)"
Dim myregex2 As String = TextBox1.Text 'Your HTML code
Dim doregex2 As MatchCollection = Regex.Matches(myregex2, findtext2)
Dim matches2 As String = ""
For Each match2 As Match In doregex2
matches2 = matches2 + match2.ToString + Environment.NewLine
Next
MsgBox(matches2) 'Results
Don't forget Imports System.Text.RegularExpressions.
The code above gets all the text between two strings, in this case <data> and </data>. You can use whatever delimiters you want (they don't need to be tags, or even HTML).

Remove special characters from a string

These are valid characters:
a-z
A-Z
0-9
-
/
How do I remove all other characters from my string?
Dim cleanString As String = Regex.Replace(yourString, "[^A-Za-z0-9\-/]", "")
Use either regex or Char class functions like IsControl(), IsDigit() etc. Get a list of these functions here: http://msdn.microsoft.com/en-us/library/system.char_members.aspx
Here's a sample regex example:
(Import this before using RegEx)
Imports System.Text.RegularExpressions
In your function, write this
Regex.Replace(strIn, "[^\w\\-]", "")
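This statement removes any character that is not a word character (a letter, digit, or underscore), a backslash, or a hyphen. For example, aa-b#c becomes aa-bc. Note that it keeps backslashes rather than the forward slash listed in the question.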
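The Char-based route mentioned above can be sketched as a whitelist filter. The module name CleanDemo and the helpers IsAllowed and Clean are hypothetical. Explicit ASCII range checks are used instead of Char.IsLetterOrDigit because that method also accepts non-ASCII letters and digits, which the question's list of valid characters does not include.

```vbnet
Imports System
Imports System.Linq

Public Module CleanDemo
    ' Hypothetical helper: True only for a-z, A-Z, 0-9, "-" and "/".
    Private Function IsAllowed(c As Char) As Boolean
        Return (c >= "a"c AndAlso c <= "z"c) OrElse
               (c >= "A"c AndAlso c <= "Z"c) OrElse
               (c >= "0"c AndAlso c <= "9"c) OrElse
               c = "-"c OrElse c = "/"c
    End Function

    ' Hypothetical helper: keeps only the allowed characters, in their original order.
    Public Function Clean(input As String) As String
        Return New String(input.Where(Function(c) IsAllowed(c)).ToArray())
    End Function

    Public Sub Main()
        ' Prints: aa-bc/1
        Console.WriteLine(Clean("aa-b#c/1"))
    End Sub
End Module
```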
Dim txt As String = yourString
txt = Regex.Replace(txt, "[^a-zA-Z0-9/-]", "")
Function RemoveCharacter(ByVal stringToCleanUp As String) As String
    ' Blacklist of characters to strip. "-" and "/" are left out because the question treats them as valid.
    Dim charactersToRemove As String = Chr(34) & "#$%&'()*+,.\~"
    ' Iterate over every character in the list, including the first one (the double quote)
    For Each c As Char In charactersToRemove
        stringToCleanUp = stringToCleanUp.Replace(c.ToString(), "")
    Next
    Return stringToCleanUp
End Function
I've used the first solution from LukeH, but then realized that this code replaces the dot for extension, therefore I've just upgraded the code slightly:
Dim fileNameNoExtension As String = Path.GetFileNameWithoutExtension(fileNameWithExtension)
Dim cleanFileName As String = Regex.Replace(fileNameNoExtension, "[^A-Za-z0-9\-/]", "") & Path.GetExtension(fileNameWithExtension)
cleanFileName will be the file name with no special characters, followed by its original extension.