VB.NET HTML converter error giving no change?

I am using the VB.NET language, and I would like to replace the HTML tags while keeping the structure of the text. I have tried the code below, taken from the website https://beansoftware.com/ASP.NET-Tutorials/Convert-HTML-To-Plain-Text.aspx
Dim html As String = "<div class='WordSection1'><p class='MsoNormal'>"
Dim final_result As String
Dim sbhtml As StringBuilder = New StringBuilder(html)
Dim OldWords() As String = {"&nbsp;", "&amp;", "&quot;", "&lt;", "&gt;", "&reg;", "&copy;", "&bull;", "&trade;"}
Dim NewWords() As String = {" ", "&", """", "<", ">", "®", "©", "•", "™"}
For i As Integer = 0 To i < OldWords.Length
sbhtml.Replace(OldWords(i), NewWords(i))
Next i
Console.WriteLine($"result after loop : {sbhtml}")
sbhtml.Replace("<br>", "\n<br>")
sbhtml.Replace("<br ", "\n<br ")
sbhtml.Replace("<p ", "\n<p ")
final_result = Regex.Replace(sbhtml.ToString(), "<[^>]*>", "")
Console.WriteLine(final_result)
However, the output comes back the same as the input string.

The for-statement is wrong. It should be
For i As Integer = 0 To OldWords.Length - 1
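Probably some C# syntax leaked over. In VB, i < OldWords.Length is a Boolean expression; with Option Strict Off, True converts to -1, so the loop runs from 0 To -1 and its body never executes.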
Why didn't you append the sbhtml.Replace("<br>", "\n<br>") and following lines to the OldWords and NewWords? They are technically not any different.
By using tuples, you can put the old and new words into the same array and use a For Each loop.
I suggest the following approach:
Dim html As String =
    "<div class='WordSection1'>aaa<br>bbb<p class='MsoNormal'>"
Dim final_result As String
Dim sbhtml As StringBuilder = New StringBuilder(html)
' VB has no "\n" escape sequence, so vbLf is used for the line breaks.
Dim Substitutions() As (old As String, repl As String) = {
    ("&nbsp;", " "), ("&amp;", "&"), ("&quot;", """"), ("&lt;", "<"),
    ("&gt;", ">"), ("&reg;", "®"), ("&copy;", "©"), ("&bull;", "•"),
    ("&trade;", "™"), ("<br>", vbLf & "<br>"), ("<br ", vbLf & "<br "), ("<p ", vbLf & "<p ")}
For Each subst In Substitutions
    sbhtml.Replace(subst.old, subst.repl)
Next
Console.WriteLine($"result after loop : {sbhtml}")
final_result = Regex.Replace(sbhtml.ToString(), "<[^>]*>", "")
Console.WriteLine(final_result)
The HtmlAgilityPack does a very good job in HTML manipulation and is very reliable. You would do something like
Dim plainText As String = HtmlUtilities.ConvertToPlainText(html)
See "Install and manage packages in Visual Studio using the NuGet Package Manager" for an easy way to install the HtmlAgilityPack.
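If adding a package is not an option, the substitution approach can also be reordered so that tags are stripped before entities are decoded. Decoding first turns escaped text such as &lt;b&gt; into a real tag, which the tag-stripping regex then removes. The following is only a minimal sketch: the module name HtmlText and the function ToPlainText are hypothetical, and it assumes WebUtility.HtmlDecode from System.Net is available (.NET Framework 4 or later).

```vbnet
Imports Microsoft.VisualBasic
Imports System.Net
Imports System.Text.RegularExpressions

Module HtmlText
    ' Hypothetical helper, not taken from the tutorial or from HtmlAgilityPack.
    Function ToPlainText(html As String) As String
        ' Put a real line break in front of every opening <br> and <p> tag.
        Dim withBreaks As String =
            Regex.Replace(html, "<(br|p)\b", vbLf & "<$1", RegexOptions.IgnoreCase)
        ' Remove every tag.
        Dim noTags As String = Regex.Replace(withBreaks, "<[^>]*>", "")
        ' Decode entities last, so escaped markup survives as literal text.
        Return WebUtility.HtmlDecode(noTags)
    End Function
End Module
```

Note that HtmlDecode turns &nbsp; into a non-breaking space character rather than a plain space, so a further Replace may be needed if plain spaces are required.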

Related

Use a string as a vb function

I have a Replace call stored in a string, and I want to run it as a real VB.NET function. Is there a way to do this? For example:
Dim str As String = "my task"
Dim func As String = "Replace(str, "" "", ""-"")"
Dim result As String ' here I need to run the func string so that result contains "my-task"
Please help.
This is how to do it:
Dim inputString As String = "my task"
Dim methodName As String = "Replace"
Dim arguments = New String() {" ", "-"}
Dim result = CallByName(inputString, methodName, CallType.Method, arguments)
This is equivalent to:
Dim inputString As String = "my task"
Dim result = inputString.Replace(" ", "-")
Although it is worth noting that there are very likely better ways to organize your code. Executing functions from a string has multiple downsides that you might want to avoid.
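One such alternative, assuming the set of operations is known in advance, is to map names to delegates instead of resolving method names at run time. This is only a sketch: the module NamedOperations, the function Apply and the operation names "dashes" and "upper" are all hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic

Module NamedOperations
    ' Hypothetical lookup table: operation name -> function applied to a string.
    Private ReadOnly Operations As New Dictionary(Of String, Func(Of String, String)) From {
        {"dashes", Function(s) s.Replace(" ", "-")},
        {"upper", Function(s) s.ToUpperInvariant()}
    }

    ' Looks up the operation by name and applies it to the input.
    ' An unknown name raises KeyNotFoundException.
    Function Apply(operationName As String, input As String) As String
        Return Operations(operationName).Invoke(input)
    End Function
End Module
```

With this, Apply("dashes", "my task") yields "my-task", and the compiler still checks every operation.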

Using Regex to match multiple html lines

As the title says, I have HTML source that contains the code below. I am trying to grab the number 25, which can change, so I think I would use the wildcard .*, but I am not sure how to grab it because the number is on its own line. I usually work with a single line of HTML, split it on the double quotes, and take the value. I know how to do the HTML WebRequest and WebResponse, so I have no problem getting the source. Any help with the regex would be appreciated.
<td class="number">
25
</td>
Edit: This is what I used to get the source.
Dim strURL As String = "mysite"
Dim strOutput As String = ""
Dim wrResponse As WebResponse
Dim wrRequest As WebRequest = HttpWebRequest.Create(strURL)
wrResponse = wrRequest.GetResponse()
Using sr As New StreamReader(wrResponse.GetResponseStream())
strOutput = sr.ReadToEnd()
' End Using closes and disposes the StreamReader, so no explicit Close is needed
End Using
Instead of a Regex, it is possible to use a string search to pull out the text, e.g.:
Sub Main()
Dim html = "<td class=""number"">" + vbCrLf + "25" + vbCrLf + "</td>"
Dim startText = "<td class=""number"">"
Dim startIndex = html.IndexOf(startText) + startText.Length
Dim endText = "</td>"
Dim endIndex = html.IndexOf(endText, startIndex)
Dim text = html.Substring(startIndex, endIndex - startIndex)
text = text.Trim({CChar(vbLf), CChar(vbCr), " "c})
Console.WriteLine(text)
End Sub
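If a regex is still preferred, the line breaks around the number can be absorbed with \s*, which matches any whitespace including CR and LF. This is a minimal sketch, assuming the cell always has exactly the attribute class="number" and contains only digits; the names NumberCell and ExtractNumber are hypothetical.

```vbnet
Imports System.Text.RegularExpressions

Module NumberCell
    ' Returns the digits inside <td class="number">...</td>, or Nothing if no match.
    Function ExtractNumber(html As String) As String
        ' \s* skips the whitespace and line breaks around the number;
        ' (\d+) captures the number itself.
        Dim m As Match = Regex.Match(html, "<td class=""number"">\s*(\d+)\s*</td>")
        Return If(m.Success, m.Groups(1).Value, Nothing)
    End Function
End Module
```

For anything beyond a single, predictable cell, an HTML parser is usually more robust than a regex.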

Substring and return list of comma separated characters

Domain\X_User|X_User,Domain\Y_User|Y_User,
I'm using a SSRS report and I'm receiving the above value, I want to write visual basic function in the report ( Custom code) to split the above string and return the following value:
X_User,Y_User
I tried to write this code inside a custom code of the report body:
Public Function SubString_Owner(X As String) As String
Dim OwnerArray() As String = Split(X, ",")
Dim Names As String
Dim i As Integer = 0
While i <= OwnerArray.Length - 1
Dim NamesArr As String() = Split(OwnerArray(0), "|")
Names = NamesArr(1) + ","
i += 1
End While
Return Names
End Function
The problem is when trying to split OwnerArray(i), it gives an error but when using a fixed value, like zero, it builds fine. Can anyone figure out why this is?
Here is a more generic solution that will work with any number of items:
Dim sourceString As String = "Domain\X_User|X_User,Domain\Y_User|Y_User,"
Dim domainsAndUsers As IEnumerable(Of String) = sourceString.Split(","c).Where(Function(s) Not String.IsNullOrEmpty(s))
Dim usersWithoutDomains As IEnumerable(Of String) = domainsAndUsers.Select(Function(s) s.Remove(0, s.IndexOf("\") + 1))
Dim users As IEnumerable(Of String) = usersWithoutDomains.Select(Function(s) s.Remove(s.IndexOf("|")))
Dim result As String = users.Aggregate(Function(s, d) s & "," & d)
Or if you want it as a single-line function, here:
Function Foo(sourceString As String) As String
Return sourceString.Split(","c).Where(Function(s) Not String.IsNullOrEmpty(s)).Select(Function(s) s.Remove(0, s.IndexOf("\") + 1)).Select(Function(s) s.Remove(s.IndexOf("|"))).Aggregate(Function(s, d) s & "," & d)
End Function
EDIT:
You may have to add Imports System.Linq to the top. I am not sure whether SSRS can use LINQ. If not, here is a similar solution without LINQ:
Dim sourceString As String = "Domain\X_User|X_User,Domain\Y_User|Y_User,"
Dim users As String = String.Empty
For Each domainUser As String In sourceString.Split(","c)
    ' Skip the empty entry produced by the trailing comma
    If domainUser.Length = 0 Then Continue For
    ' Drop everything up to the "\", then everything from the "|" on
    Dim user As String = domainUser.Remove(0, domainUser.IndexOf("\") + 1)
    users &= user.Remove(user.IndexOf("|")) & ","
Next
users = users.TrimEnd(","c)
Dim strTest As String = "Domain\X_User|X_User,Domain\Y_User|Y_User"
MsgBox(strTest.Split("|")(0).Split("\")(1) & " " & strTest.Split("|")(1).Split("\")(1))
Here's a simple way that will work with variable data as long as the pattern you've shown is strongly followed:
Imports System.Linq
Dim strtest As String = "Domain\X_User|X_User,Domain\Y_User|Y_User,"
' This splits the string on "|" and ",". Any piece without
' a "\" is a user, and Join combines them with "," as the delimiter
Dim result As String = Join((From s In strtest.Split("|,".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
Where Not s.Contains("\")
Select s).ToArray, ",")
Just in case LINQ is unavailable to you, here's a different way to get the same result without LINQ:
Dim result As String = ""
For Each s As String In strtest.Split("|,".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
If Not s.Contains("\") Then
result += s & ","
End If
Next
result = result.TrimEnd(",".ToCharArray)
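For completeness, the function from the question can also be repaired directly. One plausible reason that OwnerArray(i) fails is the trailing comma: the last entry is empty, so it has no second part after splitting on "|". The sketch below keeps the original function name and guards against that case; the surrounding module OwnerNames is hypothetical and exists only to make the example self-contained (in the report's custom code, presumably only the function itself would be pasted).

```vbnet
Imports Microsoft.VisualBasic

Public Module OwnerNames
    Public Function SubString_Owner(X As String) As String
        Dim Names As String = ""
        For Each Owner As String In Split(X, ",")
            Dim Parts() As String = Split(Owner, "|")
            ' The trailing comma yields an entry without "|"; skip it.
            If Parts.Length > 1 Then
                ' Append instead of overwriting, with a comma between names.
                If Names <> "" Then Names &= ","
                Names &= Parts(1)
            End If
        Next
        Return Names
    End Function
End Module
```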

Format for multiple Replace commands

Let's say I have this in a shell:
"chdir * && whoami.exe >> $$$"
I have this Replace command:
Dim ReplaceCommand as String = sCommand.Replace("*", UserDirect)
I also would like the $$$ to be replaced with a user chosen filepath.
I can get the file path chosen but it never puts it into the shell.
I have tried
Dim ReplaceCommand1, ReplaceCommand2 as String = sCommand.Replace("*" & "$$$", UserDirect & filepath)
Shell("cmd.exe" & ReplaceCommand1 & ReplaceCommand2)
Dim ReplaceCommand as String = sCommand.Replace("*", UserDirect) & ("$$$", filepath)
Shell("cmd.exe" & ReplaceCommand)
also
Dim ReplaceCommand1 as String = sCommand.Replace("*", UserDirect)
Dim ReplaceCommand2 as String = sCommand.Replace("$$$", filepath)
Shell("cmd.exe" & ReplaceCommand1 & ReplaceCommand2)
EDIT:
I get a "path too short" error when I use commas in Shell instead of &:
Dim ReplaceCommand1 as String = sCommand.Replace("*", UserDirect)
Dim ReplaceCommand2 as String = sCommand.Replace("$$$", filepath)
Shell("cmd.exe", ReplaceCommand1 , ReplaceCommand2)
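You can chain the Replace calls together:
Dim ReplaceCommand1 As String = sCommand.Replace("*", UserDirect).Replace("$$$", filepath)
' cmd.exe needs the /c switch and a separating space to run the command that follows
' (assuming sCommand does not already begin with it)
Shell("cmd.exe /c " & ReplaceCommand1)
Some of your examples don't compile because of syntax errors.
You're not using Shell() the way it is meant to be used: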
Public Function Shell(
PathName As String,
Optional Style As Microsoft.VisualBasic.AppWinStyle = MinimizedFocus,
Optional Wait As Boolean = False,
Optional Timeout As Integer = -1
) As Integer
From the examples you gave, it looks like you're just throwing stuff together. Stop and think for a minute :)
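Putting the pieces together, the command line can be built in one place and passed to Shell as its single PathName argument. This is a sketch under the assumption that the template is the string shown in the question and does not already contain the /c switch; the names ShellCommand and BuildCommand are hypothetical.

```vbnet
Module ShellCommand
    ' Fills in both placeholders and prefixes the result with "cmd.exe /c ".
    Function BuildCommand(template As String, directory As String, outputFile As String) As String
        ' Each Replace returns a new string, so the calls can be chained.
        Dim command As String = template.Replace("*", directory).Replace("$$$", outputFile)
        Return "cmd.exe /c " & command
    End Function
End Module
```

The result would then be used as Shell(BuildCommand(sCommand, UserDirect, filepath), AppWinStyle.Hide, True). Paths that contain spaces would additionally need to be quoted inside the template.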

Create strings on the fly using VB.NET

I'm trying to create a number of strings from one long string that I'm passing in.
This is an example of my long string:
StrMain = AL123456 - PR123456 - RD123456 - LO123456
So in this case I want to create 4 separate strings.
Str1 = AL123456
Str2 = PR123456
Str3 = RD123456
Str4 = LO123456
But there aren't always that many, and there may be more, so I need to count how many "-" separators there are and then create the number of strings needed.
Any ideas?
Thanks
Jamie
Suppose we have:
var input = "AL123456 - PR123456 - RD123456 - LO123456";
then
input.Split('-');
will return
{ "AL123456 ", " PR123456 ", " RD123456 ", " LO123456" }
i.e. with leading and trailing spaces.
So you need to trim each:
using System.Collections.Generic;
using System.Linq;
IEnumerable<string> result = input.Split('-').Select(s => s.Trim());
(Select() requires .NET 3.5+)
Or just split by " - ":
var result = input.Split(new string[] { " - " }, StringSplitOptions.None);
or using VB.NET syntax:
Dim result As String() = input.Split({ " - " }, StringSplitOptions.None)
I believe the VB.NET syntax for For Each is:
For Each str As String In result
    Response.Write(str) ' or use str in another context
Next
You could use the Split function:
Dim tokens As String() = "AL123456 - PR123456 - RD123456 - LO123456".Split("-"C)
or if you want to use a string as separator:
Dim tokens As String() = "AL123456 - PR123456 - RD123456 - LO123456".Split({" - "}, StringSplitOptions.RemoveEmptyEntries)
Use the Split function.
The usage I recommend is below. You can adapt it by either:
adding more separators (add strings to the separator array like this: New String() {" - ", " _ "} )
removing empty entries (not necessary, but usually useful)
Dim MyString As String = "value1 - value2 - value3 -  - value4"
Dim results As String() = MyString.Split(New String() {" - "}, StringSplitOptions.RemoveEmptyEntries)
' you get an array with 4 entries containing each individual string
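Since the number of parts varies, the array itself replaces the separate Str1, Str2, ... variables: its Length is the count, and each part is reached by index. A minimal sketch that splits on the dash and trims the surrounding spaces follows; the names CodeSplitter and SplitCodes are hypothetical.

```vbnet
Imports System

Module CodeSplitter
    ' Splits "A - B - C" into {"A", "B", "C"}.
    Function SplitCodes(source As String) As String()
        Dim parts() As String =
            source.Split(New Char() {"-"c}, StringSplitOptions.RemoveEmptyEntries)
        ' Remove the spaces left around each part by splitting on "-" alone.
        For i As Integer = 0 To parts.Length - 1
            parts(i) = parts(i).Trim()
        Next
        Return parts
    End Function
End Module
```

For the example string, SplitCodes(StrMain).Length is 4 and SplitCodes(StrMain)(0) is "AL123456".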