Extract href text from website - vb.net

How can I extract the href text from a website?
<div class="ba by">**I want this text!**</div>
I tried a couple of solutions, but neither works.
Dim myMatches As MatchCollection
Dim myRegex As New Regex("<div.*?class=""ba by"".*?>.*</div>", RegexOptions.Singleline)
Dim wc As New WebClient
Dim html As String = wc.DownloadString("http://somewebaddress.com")
TextBox1.Text = html
myMatches = myRegex.Matches(html)
MsgBox(html)
Dim successfulMatch As Match
For Each successfulMatch In myMatches
MsgBox(successfulMatch.Groups(1).ToString)
Next
or
Dim divs = WebBrowser1.Document.Body.GetElementsByTagName("div")
For Each d As HtmlElement In divs
If d.GetAttribute("class") = "ba by" Then
TextBox1.Text = d.InnerText
End If
Next
Thank you!

Instead of ...
Dim divs = WebBrowser1.Document.Body.GetElementsByTagName("div")
Try...
Dim anchors = WebBrowser1.Document.Body.GetElementsByTagName("a")
That should get you a list of all the "<a>" (anchor) elements.
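As a side note on the regex attempt in the question: the pattern there has no parentheses, so it defines no capture group and Groups(1) is always empty. The following is only a minimal sketch of a capture-group version, assuming the markup looks like the sample div; the html string and the module name are hypothetical stand-ins for the downloaded page.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DivTextSketch
    Sub Main()
        ' Hypothetical sample markup standing in for the downloaded page.
        Dim html As String = "<p>x</p><div class=""ba by"">I want this text!</div><div>other</div>"
        ' The parentheses create capture group 1; .*? is lazy, so it stops at the first </div>.
        Dim pattern As String = "<div[^>]*class=""ba by""[^>]*>(.*?)</div>"
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.Singleline)
            Console.WriteLine(m.Groups(1).Value)
        Next
    End Sub
End Module
```

Regular expressions are fragile on real-world HTML (nested divs, attribute order), so a DOM-based approach is usually the safer choice.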

Related

Extracting Portion of Url using VB.net

I have this URL
https://www.google.com/maps/place/Aleem+Iqbal+SEO/#31.888433,73.263572,17z/data=!3m1!4b1!4m5!3m4!1s0x39221cb7e4154211:0x9cf2bb941cace556!8m2!3d31.888433!4d73.2657607
I am trying to extract 31.888433,73.263572 from the URL
and send 31.888433 to TextBox 1
and 73.263572 to TextBox 2.
Can you give me an example of how I can do this with a regex or anything else?
You can use String.Split(). This method takes an array of chars that act as the separators for the split. A better approach is to split by '/', take the segment that starts with '#', and then split that segment by ','. You'll get an array whose first element is the latitude and whose second is the longitude (the third element, 17z, can be ignored).
This should be straightforward using LINQ.
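The split-based approach described above could be sketched roughly as follows. This is only an illustration under the assumption that the URL always contains a segment starting with '#'; the module and variable names are made up for the example.

```vbnet
Imports System
Imports System.Linq

Module SplitSketch
    Sub Main()
        Dim url As String = "https://www.google.com/maps/place/Aleem+Iqbal+SEO/#31.888433,73.263572,17z/data=!3m1!4b1!4m5!3m4!1s0x39221cb7e4154211:0x9cf2bb941cace556!8m2!3d31.888433!4d73.2657607"
        ' Split on "/" and pick the first segment that starts with "#".
        Dim segment As String = url.Split("/"c).FirstOrDefault(Function(s) s.StartsWith("#"))
        If segment IsNot Nothing Then
            ' Drop the leading "#" and split the rest on ",".
            Dim parts As String() = segment.Substring(1).Split(","c)
            If parts.Length >= 2 Then
                Console.WriteLine(parts(0)) ' latitude, would go to TextBox1
                Console.WriteLine(parts(1)) ' longitude, would go to TextBox2
            End If
        End If
    End Sub
End Module
```

The Nothing and Length checks keep the sketch from throwing when the URL has no such segment.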
The explanation is in the code comments.
Dim strURL As String = "https://www.google.com/maps/place/Aleem+Iqbal+SEO/#31.888433,73.263572,17z/data=!3m1!4b1!4m5!3m4!1s0x39221cb7e4154211:0x9cf2bb941cace556!8m2!3d31.888433!4d73.2657607"
'Find the index of the first occurrence of the # character
Dim index As Integer = strURL.IndexOf("#")
'Get the substring from the next character to the end of the string
Dim firstSubstring As String = strURL.Substring(index + 1)
'Get a Char array of the separators
Dim separators As Char() = {CChar(",")}
'Split the string into an array based on the separator; the separator is not part of the array
Dim strArray As String() = firstSubstring.Split(separators)
'The first and second elements of the array are what you want
Dim strTextBox1 As String = strArray(0)
Dim strTextBox2 As String = strArray(1)
Debug.Print($"{strTextBox1} For TextBox1 and {strTextBox2} for TextBox2")
I finally got working code myself:
' One pattern with two capture groups: latitude, then longitude.
Private _reg As New Regex("#(-?\d+\.\d+),(-?\d+\.\d+)")
Private Sub FlatButton1_Click(sender As Object, e As EventArgs) Handles FlatButton1.Click
' The input string: the URL currently loaded in the browser control.
Dim myString As String = WebBrowser2.Url.ToString()
Dim match As Match = _reg.Match(myString)
If match.Success Then
' Group 1 is the latitude, group 2 is the longitude.
Label1.Text = match.Groups(1).Value
Label2.Text = match.Groups(2).Value
End If
End Sub
Dim url As String = "www.google.com/maps/place/Aleem+Iqbal+SEO/#31.888433,73.263572,17z/data=!3m1!4b1!4m5!3m4!1s0x39221cb7e4154211:0x9cf2bb941cace556!8m2!3d31.888433!4d73.2657607"
Dim temp As String = Regex.Match(url, "#.*,").Value.Replace("#", "")
Dim arrTemp As String() = temp.Split(New String() {","}, StringSplitOptions.None)
Label1.Text = arrTemp(0)
Label2.Text = arrTemp(1)

Replacing strings in multiple text files using regular expressions on vb.net

I'm looking to build a simple text string replacement tool using visual studio 2015 community tool, which will do the below replacements on all *.txt files whose path is given in a textbox:
Find: \<figure (\d+)\>
Replace: <a href id="fig\1">figure \1</a>
Find: \<table (\d+)\>
Replace: <a href id="tab\1">table \1</a>
Find: \<section (\d+)\>
Replace: <a href id="sec\1">section \1</a>
I have coded a small portion of the programme but am struggling to complete it. I'm completely new to programming, and to Visual Basic as well. Can anyone help me complete this programme?
Imports System.IO
Imports System.Text.RegularExpressions
Public Class Form1
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
If FBD.ShowDialog = DialogResult.OK Then
TextBox1.Text = FBD.SelectedPath
End If
End Sub
Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
Dim targetDirectory As String
targetDirectory = TextBox1.Text
Dim Files As String() = Directory.GetFiles(targetDirectory, "*.txt")
For Each file In Files
Dim FileInfo As New FileInfo(file)
Dim FileLocation As String = FileInfo.FullName
Dim input As String = file.ReadAllLines(FileLocation)
Dim pattern1 As String = "\<figure (\d+)\>"
Dim pattern2 As String = "\<table (\d+)\>"
Dim pattern3 As String = "\<section (\d+)\>"
Dim rep1 As String = "<a href id= \""fig\1\"" > figure \1</a>"
Dim rep2 As String = "<a href id= \""tab\1\"" > table \1</a>"
Dim rep3 As String = "<a href id= \""sec\1\"" > section \1</a>"
Dim rgx1 As New Regex(pattern1)
Dim rgx2 As New Regex(pattern2)
Dim rgx3 As New Regex(pattern3)
Dim result1 As String = rgx1.Replace(input, rep1)
Dim result2 As String = rgx2.Replace(result1, rep2)
Dim result3 As String = rgx3.Replace(result2, rep3)
Next
End Sub
End Class
The errors I'm getting are given below
Error BC30456 'ReadAllLines' is not a member of 'String'
For the replace button, once you have the folder path, you need to find every file in it that matches "*.txt". The lines below do this:
targetDirectory = TextBox1.text
Dim txtFilesArray As String() = Directory.GetFiles(targetDirectory,"*.txt")
You can then loop through this array and do your replace logic.
For each txtFile in txtFilesArray
'Here we grab the file's information, as we need its full path
Dim FileInfo As New FileInfo(txtFile)
Dim FileLocation As String = FileInfo.FullName
Dim input() as string = File.ReadAllLines(FileLocation)
'Now that the text file has been read in, you can edit it as the loop goes through each file.
'Apply your regex replacements here,
'then write the result back to the file once you are done, or the changes won't be saved.
'Loop through the input array to change each line.
For x as integer = 0 to (input.length - 1)
Dim pattern1 As String = "\<figure (\d+)\>"
Dim pattern2 As String = "\<table (\d+)\>"
Dim pattern3 As String = "\<section (\d+)\>"
'In .NET replacement strings, captured groups are written $1 (not \1), and "" is a literal quote
Dim rep1 As String = "<a href id=""fig$1"">figure $1</a>"
Dim rep2 As String = "<a href id=""tab$1"">table $1</a>"
Dim rep3 As String = "<a href id=""sec$1"">section $1</a>"
Dim rgx1 As New Regex(pattern1)
Dim rgx2 As New Regex(pattern2)
Dim rgx3 As New Regex(pattern3)
Dim result1 As String = rgx1.Replace(input(x), rep1)
Dim result2 As String = rgx2.Replace(result1, rep2)
Dim result3 As String = rgx3.Replace(result2, rep3)
input(x) = result3
Next
'now you can write your corrected file back to the file
File.WriteAllLines(FileLocation, input)
Next
MsgBox("process complete")
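One detail worth checking when porting find/replace pairs from a text editor: .NET replacement strings refer to captured groups as $1 rather than \1, and a literal double quote inside a VB string is written as two double quotes. A minimal sketch on a single hypothetical input line (the sample text and module name are assumptions for illustration):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ReplaceSketch
    Sub Main()
        ' Hypothetical line of text containing two of the markers.
        Dim line As String = "See <figure 3> and <table 12>."
        ' $1 inserts whatever the (\d+) group captured.
        Dim result As String = Regex.Replace(line, "<figure (\d+)>", "<a href id=""fig$1"">figure $1</a>")
        result = Regex.Replace(result, "<table (\d+)>", "<a href id=""tab$1"">table $1</a>")
        Console.WriteLine(result)
    End Sub
End Module
```

With this input the sketch prints: See <a href id="fig3">figure 3</a> and <a href id="tab12">table 12</a>.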
@Tamal Banerjee, try this for the progress bar: put the code below after the last Next in your code.
Next
ProgressBar1.PerformStep()
ProgressBar1.Value = 100
MessageBox.Show("Process complete")
End Sub
I'm not sure whether this is the right way to do it, but it worked when I tried it on my computer with your code :)

VB.NET get a specific "href" by id

I need help getting started with my app.
I am trying to get an HREF link from a webpage without using a WebBrowser control.
I use HtmlAgilityPack, but I can't get the specific HREF:
<a id="ItemsList_file_0" title="HR9nqJCIqHex8niygKtwUHpdkjRGNaH22Oy54SPBmw.avi" href="http://dsa11.uverload.com/d/a012284e-1fd5-4317-82d3-9bf9e738f0a2/BIfmZ/0lOipBI/HR9nqJCIqHex8niygKtwUHpdkjRGNaH22Oy54SPBmw.avi" target="_blank">HR9nqJCIqHex8niygK..22Oy54SPBmw.avi</a><br />
I tried this code:
Dim webreq As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://www.uverload.com/view/0lOipBI")
Dim webres As System.Net.HttpWebResponse = webreq.GetResponse
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(webres.GetResponseStream)
Dim docStr As String = sr.ReadToEnd
Dim HAPdoc As New HtmlAgilityPack.HtmlDocument
HAPdoc.LoadHtml(docStr)
Dim HAPnode As HtmlAgilityPack.HtmlNode
HAPnode = HAPdoc.GetElementbyId("ItemsList_file_0")
MsgBox(HAPnode.InnerText)
End Sub
It gives me nothing.

How to get title inside table using HtmlAgility pack?

Hello I am trying to get the "title" element out of this table:
<td class="field_domain"><strong>Plantar</strong><strong>Fasciitis</strong>HeelPain.com</td>
Here is my code that almost works but not quite:
Dim web As New HtmlAgilityPack.HtmlWeb()
Dim htmlDoc As HtmlAgilityPack.HtmlDocument = web.Load("https://www.expireddomains.net/domain-name-search/?o=domainpop&r=d&q=plantar+fasciitis")
Dim html As String = htmlDoc.DocumentNode.OuterHtml
Dim tabletag = htmlDoc.DocumentNode.SelectNodes("//td[@class='field_domain']")
For Each t In tabletag
Dim var = t.SelectSingleNode("//td[@class='title']").InnerText
MessageBox.Show(var)
Next
This did the trick:
Dim web As New HtmlAgilityPack.HtmlWeb()
Dim htmlDoc As HtmlAgilityPack.HtmlDocument = web.Load("https://www.expireddomains.net/domain-name-search/?o=domainpop&r=d&q=plantar+fasciitis")
Dim html As String = htmlDoc.DocumentNode.OuterHtml
For Each linkItem As HtmlNode In htmlDoc.DocumentNode.SelectNodes("//td[@class='field_domain']")
Dim name = linkItem.Element("a").InnerText
MessageBox.Show(name)
Next

retrieve unique values from string of numbers

I have this string:
Dim test As String = "12,32,12,32,12,12,32,15,16,15,14,12,32"
and want to retrieve this string:
newstr = 12,32,15,16,14
I tried this much:
Dim test As String = "12,32,12,32,12,12,32,15,16,15,14,12,32"
Dim word As String
Dim uc As String() = test.Split(New Char() {","c})
For Each word In uc
' What can i do here?????????
Next
I want only the unique numbers. How can I do that in VB (ASP.NET)?
The right answer:
Dim test As String = "12,32,12,32,12,12,32,15,16,15,14,12,32"
Dim word As String
Dim uc As String() = test.Split(New Char() {","c}).Distinct.ToArray
'Build the comma-separated result without a leading placeholder value
Dim sb2 As String = ""
For Each word In uc
If sb2 <> "" Then sb2 &= ","
sb2 &= word
Next
MsgBox(sb2)
Dim test As String = "12,32,12,32,12,12,32,15,16,15,14,12,32"
Dim uniqueList As String() = test.Split(New Char() {","c}).Distinct().ToArray()
Dim test As String = "12,32,12,32,12,12,32,15,16,15,14,12,32"
'Split into an array
Dim testArray As String() = test.Split(",")
'remove duplicates
Dim uniqueTestArray As String() = testArray.Distinct().ToArray()
'Concatenate back to string
Dim uniqueString As String = String.Join(",", uniqueTestArray)
Or all in one line:
Dim uniqueString As String = String.Join(",", test.Split(",").Distinct().ToArray())
Update: sorry, I forgot to join the values back into a new string.
Solution:
Dim test As String = "12,32,12,32,12,12,32,15,16,15,14,12,32"
Dim distinctArray = test.Split(",").Distinct()
Dim newStr As String = String.Join(",", distinctArray.ToArray())
Training references: check out this website for a guide on LINQ, which will make these types of programming challenges easier for you: LINQ Tutorial
You forgot to put parentheses after Distinct and ToArray; they are methods.
Dim uc As String() = test.Split(New Char() {","c}).Distinct().ToArray()