Obtaining a part from pagesource using Selenium in VBA - vba

I am a beginner with VBA.
I am struggling to obtain the HTML of a webpage using Selenium and VBA. I cannot store the whole page source because an Excel cell holds at most 32,767 characters. What I am trying to do is obtain just the following line from the page source:
article class="q q-scale q-l0" id="i_67398910"
data-question-number="9"
I have tried everything I could find on the Internet, including looping through the elements by Id and reading .Attribute("innerHTML") to print part of the page source, but it did not seem to work :(
The code is as follows:
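Sub Getting_full_pagesource()
    Dim FindBy As New Selenium.By
    Dim CD As Selenium.ChromeDriver
    ' Declare each counter with its own type; "Dim i, y As Integer" leaves i as a Variant
    Dim i As Long, y As Long
    Set CD = New Selenium.ChromeDriver
    CD.Start
    CD.Get Sheet1.Range("B1").Value
    i = 1 ' worksheet rows are 1-based, so start writing at A1
    y = 1
    Do While y <> 0
        ' Store the current page source in column A, one row per page
        Sheet1.Range("A" & i).Value = CD.PageSource
        ' Stop once the finish button shows up
        If CD.IsElementPresent(FindBy.Class("btn-finish")) Then
            CD.Quit
            Exit Do
        End If
        y = y + 1
        CD.FindElementByTag("button").Click
        i = i + 1
    Loop
End Sub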
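One way around the cell limit described in the question is to avoid storing the whole page source: pull out only the tag of interest, or split the text into cell-sized pieces. The Python sketch below illustrates both ideas on a plain string. It is not Selenium code, the function names `find_opening_tag` and `chunk_text` are made up for illustration, and the regex assumes the id is written with double quotes, as in the line shown in the question.

```python
import re

EXCEL_CELL_LIMIT = 32767  # maximum number of characters an Excel cell can hold


def find_opening_tag(source, element_id):
    """Return the opening tag whose id attribute equals element_id, or None."""
    pattern = r'<[^<>]*\sid="' + re.escape(element_id) + r'"[^<>]*>'
    match = re.search(pattern, source)
    return match.group(0) if match else None


def chunk_text(text, limit=EXCEL_CELL_LIMIT):
    """Split text into consecutive pieces of at most limit characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


# Hypothetical page source built around the line quoted in the question
page = ('<div><article class="q q-scale q-l0" id="i_67398910" '
        'data-question-number="9"><p>text</p></article></div>')
print(find_opening_tag(page, "i_67398910"))
print([len(piece) for piece in chunk_text("x" * 70000)])
```

In the VBA version, each piece could be written to its own cell instead of forcing the full source into one.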

Related

VBA Selenium : How to extract the text data in the unmerged form..as in like Spec1 : Value1 | Spec2 : Value2 | Spec3 : Value3...etc

I am trying to extract part of the data from a list of webpage links that I have in my Excel sheet. With the code below I can extract the data, but the result comes back merged, like this: "ColorOrangeMaterialPolyethyleneDimensions6 x 37 inFor Use With(1 to 3) 36 in Blankets....". For reference, this is the link to the data: "https://www.grainger.com/product/SALISBURY-Blanket-Canister-Orange-3KUX9". Any suggestions on how to export the data in the form Spec1:Value1|Spec2:Value2 would be greatly appreciated. Please advise.
This is the code:
Sub Test()
Dim ResultSections As Selenium.WebElements
Dim ResultSection As Selenium.WebElement
Dim i As Long
Dim lastrow As Long
lastrow = Sheet1.Cells(Rows.Count, "A").End(xlUp).Row
For i = 2 To lastrow
Dim MyUrl As String
MyUrl = Sheet1.Cells(i, 1).Value
Set MB = New Selenium.ChromeDriver
MB.Start
MB.Get MyUrl
MB.Wait 10000
Set ResultSections = MB.FindElementsByClass("P9I57X")
For Each ResultSection In ResultSections
Sheet1.Cells(i, "B").Value = ResultSection.Text
Exit For
Next ResultSection
If i = lastrow Then
MB.Quit
End If
Next i
End Sub
Kindly help me out with this.
I tried to extract the data part of a webpage link, but it comes back in merged form, so I cannot tell which part is the label and which is the value. I need the extracted data in the right format. Kindly advise.
Select the dt and dd elements rather than a parent. That gives you two lists to iterate over in parallel, and you read the text at the level where it appears on screen. Selecting a single element higher up the DOM, as you are doing, is what produces this mangled-looking string.
The code below should get you started. Note that you also don't need to keep creating a new WebDriver instance inside your loop.
Dim specs As Selenium.WebElements, values As Selenium.WebElements, i As Long
Set specs = MB.FindElementsByCss("[data-testid='product-techs'] dt")
Set values = MB.FindElementsByCss("[data-testid='product-techs'] dd")
For i = 1 To specs.Count
Debug.Print Join$(Array(specs.Item(i).Text, values.Item(i).Text), ":")
Next
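To produce the Spec1:Value1|Spec2:Value2 form the question asks for, the two lists can be paired up and joined. The following is a minimal Python sketch of that pairing step only, using plain strings in place of the Selenium elements; `format_specs` is a hypothetical helper name, and the sample labels and values are read off the merged string quoted in the question.

```python
def format_specs(labels, values, pair_sep=":", item_sep="|"):
    """Join parallel label/value lists into 'Label:Value|Label:Value'."""
    return item_sep.join(
        label.strip() + pair_sep + value.strip()
        for label, value in zip(labels, values)
    )


# Text of the dt elements and of the dd elements, in page order
labels = ["Color", "Material", "Dimensions"]
values = ["Orange", "Polyethylene", "6 x 37 in"]
print(format_specs(labels, values))
```

The joined string could then go into column B in place of the Debug.Print output.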

Take list box selection, add value to other list box without allowing duplicates

I have two list boxes on a form I am making. The first list box is linked to a table with various company names. My goal is that double-clicking a company's name inserts that value into the second list box.
It worked fine until I tried to add code to prevent duplicates from appearing in the second list box, so that you can't accidentally insert the same company twice. I have tried several different iterations, but with no luck. Is anyone able to help with this one? My end goal is for a MsgBox to pop up alerting the user that duplicates are not allowed.
Private Sub ContractorLstbx_DblClick(Cancel As Integer)
Dim found As Boolean
found = False
Dim ID As Long
Dim Contractor As String
For Each newItem In Me.ContractorLstbx.ItemsSelected
For j = 0 To Me.SelectedContractorLst.ListCount - 1
If (Me!ContractorLstbx.ItemData(newItem).Column(1) = Me.SelectedContractorLst.ItemData(j).Column(1)) Then
found = True
Exit For
End If
Next j
If found = False Then
ID = Me.ContractorLstbx.ItemData(newItem)
Me.SelectedContractorLst.AddItem ContractorLstbx!.ItemData(newItem).Column(0) & ";" & Me!ContractorLstbx.ItemData(newItem).Column(1)
End If
found = False
Next newItem
End Sub
This is the full code for your solution. I tried it on a test sample and it works fine; just copy and paste the code. If you need the comparison to be case sensitive (A <> a), use Option Compare Binary as in my code below. If it should be case insensitive (A = a), leave the default Option Compare Database or, better, force it with Option Compare Text.
Option Compare Binary
Private Sub ContractorLstbx_DblClick(Cancel As Integer)
Dim found As Boolean
found = False
Dim ID As Long
Dim Contractor As String
For i = 0 To Me.ContractorLstbx.ItemsSelected.Count - 1
For j = 0 To Me.SelectedContractorLst.ListCount - 1
If (Me.ContractorLstbx.Column(1, Me.ContractorLstbx.ItemsSelected(i)) = Me.SelectedContractorLst.Column(1, j)) Then
found = True
Exit For
End If
Next j
If found = False Then
ID = Me.ContractorLstbx.ItemData(Me.ContractorLstbx.ItemsSelected(i))
Me.SelectedContractorLst.AddItem (ContractorLstbx.Column(0, Me.ContractorLstbx.ItemsSelected(i)) & ";" & Me.ContractorLstbx.Column(1, Me.ContractorLstbx.ItemsSelected(i)))
End If
found = False
Next i
End Sub
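The duplicate check above, including the case-sensitivity choice made by the Option Compare setting, can be sketched independently of the form controls. This Python sketch uses a plain list in place of the second list box; `add_unique` and the company names are hypothetical.

```python
def add_unique(selected, name, case_sensitive=True):
    """Append name to selected unless already present; return True if added."""
    def key(text):
        # A binary comparison keeps case; a text comparison ignores it
        return text if case_sensitive else text.casefold()

    if any(key(existing) == key(name) for existing in selected):
        return False
    selected.append(name)
    return True


chosen = []
print(add_unique(chosen, "Acme"))   # first insert
print(add_unique(chosen, "Acme"))   # exact duplicate
print(add_unique(chosen, "acme", case_sensitive=False))  # duplicate when case is ignored
```

A False return is the point at which the form code would show the message box asked for in the question.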

Scraper clicking on the next page button but fetches nothing

I've written some code in VBA in combination with Selenium to parse data from tables spread across multiple pages. When I run my script, I can see that it parses data from the first page and then keeps clicking the next page button until no more button is available. However, I only get the data from the first page; the browser clicks through the other pages without fetching anything from them. I don't understand what I'm doing wrong here. Perhaps the loop I have created has something to do with it. Thanks for taking a look. Here is the full code:
Sub Table_data()
Dim driver As New ChromeDriver
Dim tabl As Object, rdata As Object, cdata As Object
driver.Get "https://toolkit.financialexpress.net/santanderam"
driver.Wait 1000
For Each tabl In driver.FindElementsByXPath("//table[@class='fe-datatable']")
For Each rdata In tabl.FindElementsByXPath(".//tr")
For Each cdata In rdata.FindElementsByXPath(".//td")
y = y + 1
Cells(x + 1, y) = cdata.Text
Next cdata
x = x + 1
y = 0
Next rdata
driver.FindElementByLinkText("Next").Click
driver.Wait 1000
Next tabl
End Sub
Consider pressing the Next button outside of your For Each loops. Put it in an outer loop that terminates when there is no Next button left to press (Run-time Error 7: NoSuchElementError).
The XPath //table[@class='fe-datatable'] also matches the page numbers. You should use the inner table instead: //table[@class='fe-fund-tableBody'] by class name or, if you prefer to search by id, //*[@id='docRows']. Both point to the same element.
You might have noticed that there are 7 occurrences of this element. Your code loops through the empty ones on each page. You can avoid this by looping through the first occurrence only, like this: (//table[@class='fe-fund-tableBody'])[1] or (//*[@id='docRows'])[1].
I would also recommend finding a way to use an implicit or explicit wait instead of a fixed Wait. Without improving anything else, your code should end up looking something like this:
Sub Table_data()
    Dim driver As New ChromeDriver
    Dim tabl As Object, rdata As Object, cdata As Object
    driver.Get "https://toolkit.financialexpress.net/santanderam"
    driver.Wait 1000
    Do
        For Each tabl In driver.FindElementsByXPath("(//*[@id='docRows'])[1]") 'or "(//table[@class='fe-fund-tableBody'])[1]"
            For Each rdata In tabl.FindElementsByXPath(".//tr")
                For Each cdata In rdata.FindElementsByXPath(".//td")
                    y = y + 1
                    Cells(x + 1, y) = cdata.Text
                Next cdata
                x = x + 1
                y = 0
            Next rdata
        Next tabl
        On Error Resume Next
        driver.FindElementByLinkText("Next").Click
        driver.Wait 1000
    Loop Until Err.Number = 7
End Sub
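The control flow recommended here (read the current page, then try to click Next, and stop when the button can no longer be found) can be shown without a browser. The Python sketch below uses stand-ins throughout: `NoSuchElement`, `FakePager`, and `collect_all_pages` are hypothetical names, and `FakePager` simply imitates a site with a fixed number of pages.

```python
class NoSuchElement(Exception):
    """Stand-in for the driver's 'element not found' error."""


class FakePager:
    """Imitates a paged table: read_rows() reads the page, click_next() advances."""

    def __init__(self, pages):
        self.pages = pages
        self.index = 0

    def read_rows(self):
        return list(self.pages[self.index])

    def click_next(self):
        if self.index + 1 >= len(self.pages):
            raise NoSuchElement("Next")  # no Next button on the last page
        self.index += 1


def collect_all_pages(read_rows, click_next):
    """Read every page; the missing Next button ends the outer loop."""
    all_rows = []
    while True:
        all_rows.extend(read_rows())
        try:
            click_next()
        except NoSuchElement:
            break
    return all_rows


pager = FakePager([[["a", "1"]], [["b", "2"]], [["c", "3"]]])
print(collect_all_pages(pager.read_rows, pager.click_next))
```

The important detail is that the click happens once per page, after all rows of that page have been read, rather than once per table.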
Personally I would change the way you are iterating the pages. It should be like this in pseudo code:
function element getNextButton(){
all_buttons = driver.findElementsByXpath("""//*[@id="Price_1_1"]/tfoot/tr/td/div/div/a""");
next_button = all_buttons[all_buttons.Size()-1];
return next_button;
}
main(){
next_button = getNextButton();
while true{
do something with your current table;
next_button.click();
wait(2); // wait some time till the page loads
next_button = getNextButton();
if next_button.text does not contains 'Next'{
break;
}
}
}
I have just tested it on Python:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
def get_next_button():
    # The pager links sit in the table footer; the last one is the candidate "Next" button
    buttons = driver.find_elements(By.XPATH, """//*[@id="Price_1_1"]/tfoot/tr/td/div/div/a""")
    return buttons[-1]
chrome_path = r"chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://toolkit.financialexpress.net/santanderam")
time.sleep(5)
next_button = get_next_button()
while True:
    # Do something with the table
    next_button.click()
    time.sleep(2)
    next_button = get_next_button()
    # Stop once the last pager link is no longer "Next"
    if 'Next' not in next_button.text:
        break
print('End')
I am not familiar with vba, but if you do not understand Python I can try to translate it to vba.
EDIT
An "approximation" to VBA solution should be this (please check syntax errors, I have never used VBA):
Function GetNextElement(driver As Object) As Object
    Dim all_buttons As Object
    Set all_buttons = driver.FindElementsByXPath("//*[@id='Price_1_1']/tfoot/tr/td/div/div/a")
    ' The collection is 1-based, so the last link is at index Count
    Set GetNextElement = all_buttons.Item(all_buttons.Count)
End Function
Sub Table_data()
    Dim driver As New ChromeDriver
    Dim position As Integer
    Dim next_button As Object
    driver.Get "https://toolkit.financialexpress.net/santanderam"
    driver.Wait 1000
    Set next_button = GetNextElement(driver)
    Do While True
        ' Do something with the table
        next_button.Click
        driver.Wait 2000
        Set next_button = GetNextElement(driver)
        ' InStr returns 0 when "Next" does not occur in the link text
        position = InStr(next_button.Text, "Next")
        If position = 0 Then
            Exit Do
        End If
    Loop
End Sub

Searching text file and showing part of the text in a text box

I am working on a VB.net application where I have a very large text file. It is basically a large database of error codes with descriptions of how to clear the code after it. What I would like to do, is on the click of a button, search the text file for the specific code and display all text for just that error code into a text box. I have tried many different ways, but am unable to get it to work properly. I went through the entire text file and added a "|" to the beginning of each fault code so that I could specify where the code starts at.
Here is an example of a couple fault codes:
|ACAL-000 Fail to run DETECT Motn Cause: The AccuCal2 Motion failed to
nm. The AccuCal2 motion cannot be started. Remedy: Clear all the
errors before executing AccuCal2. |ACAL-001 Robot is not ready.
Cause: The robot is not ready. The system cannot issue motion
because it is in an error state. Remedy: Clear all faults, then retry
the operation.
If I search for "ACAL-000", I want it to show everything from the | before ACAL-000 to the bar before ACAL-001.
I would post the code that I have written, but I have tried so many different versions that I don't really know which one to post.
Any help you can provide would be greatly appreciated.
EDIT
Here is my current code after some editing and implementation of what has been recommended. Please see the comments below for more information on how I got to this point. A quick note, I am currently just using "|ACAL-000" for a test search. When this is complete, I have some other (already working) code that will put together a code from a couple of drop down lists.
Function ReadEmbeddedTextFileResource(embeddedResourceName As String) As String
Using stream As Stream = Reflection.Assembly.GetExecutingAssembly().GetManifestResourceStream(embeddedResourceName)
If stream Is Nothing Then
Throw New FileNotFoundException("The resource '" & embeddedResourceName & "' was not found.")
End If
Using reader As StreamReader = New StreamReader(stream, True)
Return reader.ReadToEnd()
End Using
End Using
End Function
Function FindTextBetweenBars(bodyOfTextToSearch As String, textToLookFor As String) As String
Dim i As Integer = bodyOfTextToSearch.IndexOf(textToLookFor)
If i < 0 Then Return Nothing
Dim j As Integer = bodyOfTextToSearch.LastIndexOf("|", i)
If j < 0 Then j = 0
Dim k As Integer = bodyOfTextToSearch.IndexOf("|", i + Len(textToLookFor))
If k < 0 Then k = Len(bodyOfTextToSearch)
Return bodyOfTextToSearch.Substring(j + 1, k - j - 1)
End Function
Private Sub btnShowTroubleshooting_Click(sender As Object, e As EventArgs) Handles btnShowTroubleshooting.Click
Dim allErrorText As String = ReadEmbeddedTextFileResource(My.Resources.FanucCodes)
Dim errorMessage As String = FindTextBetweenBars(allErrorText, "|ACAL-000")
If errorMessage Is Nothing Then errorMessage = "Error code Not found!"
RichTextBoxFanucFaults.Text = errorMessage
End Sub
Here is a function that should do what you want:
Function FindTextBetweenBars(bodyOfTextToSearch As String, textToLookFor As String) As String
Dim i As Integer = bodyOfTextToSearch.IndexOf(textToLookFor)
If i < 0 Then Return Nothing
Dim j As Integer = bodyOfTextToSearch.LastIndexOf("|", i)
Dim k As Integer = bodyOfTextToSearch.IndexOf("|", i + Len(textToLookFor))
If k < 0 Then k = Len(bodyOfTextToSearch)
Return bodyOfTextToSearch.Substring(j + 1, k - j - 1)
End Function
In your button click event handler you can call the function like this:
Dim errorMessage as String = FindTextBetweenBars(My.Resources.FanucCodes, txtErrorCodeToLookFor.Text)
If errorMessage Is Nothing Then errorMessage = "Error code not found!"
txtErrorMessage.Text = errorMessage
where txtErrorMessage is the output textbox to display the error message result,
My.Resources.FanucCodes is your large string resource containing all the error descriptions (with | separators), and txtErrorCodeToLookFor is a textbox that accepts the error code input from the user.
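To see how the three index lookups in FindTextBetweenBars fit together, here is the same logic as a small Python sketch. It mirrors the VB.NET function step by step; the sample text is abridged from the fault codes in the question, and `find_text_between_bars` is just a transliterated name.

```python
def find_text_between_bars(body, code):
    """Return the text between the '|' before code and the next '|', or None."""
    i = body.find(code)
    if i < 0:
        return None
    # Last bar at or before the match; -1 if there is none, so j + 1 becomes 0
    j = body.rfind("|", 0, i + 1)
    # First bar after the matched code; fall back to the end of the text
    k = body.find("|", i + len(code))
    if k < 0:
        k = len(body)
    return body[j + 1:k]


# Abridged sample in the same '|CODE description' layout as the question
codes = ("|ACAL-000 Fail to run DETECT Motn Remedy: Clear all the errors. "
         "|ACAL-001 Robot is not ready. Remedy: Clear all faults.")
print(find_text_between_bars(codes, "ACAL-000"))
print(find_text_between_bars(codes, "ACAL-999"))
```

Searching for "ACAL-000" and for "|ACAL-000" gives the same result, because the backward search for the bar starts at the match position itself.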

Pull data from Website into VBA

This might be a dumb question from a newbie, but I honestly don't know where to start in VBA. I tried a few different approaches from the web to pull data from the site below, and all of them failed miserably. Can someone help me (more or less show me) how to pull the data from this website?
https://rotogrinders.com/projected-stats/nfl?site=fanduel
It wouldn't even let me do Data -> Import. Here is what I have so far. I keep getting stuck on the line For t = 0 To (Table.Length - 1).
Sub test1()
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
Dim Table As Object
Dim t As Integer
Dim r As Integer
Dim c As Integer
With appIE
.Navigate "https://rotogrinders.com/projected-stats/nfl?site=fanduel"
.Visible = True
End With
Do While appIE.Busy
DoEvents
Loop
Set Table = appIE.document.getElementById("proj-stats")
For t = 0 To (Table.Length - 1)
For r = 0 To (Table(t).Rows.Length - 1)
For c = 0 To (Table(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = Table(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
appIE.Quit
Set appIE = Nothing
End Sub
You are close, and there are several ways to get the data. I chose to extract all cell elements (HTML <td>) and step through a simple loop. Since there are six columns, I use two variables (r and c, for row and column) to offset each value into the right cell.
Set Table = appIE.document.getElementsbytagname("td")
r = 0
c = 0
For Each itm In Table
Worksheets(1).Range("A1").Offset(r, c).Value = itm.innertext
c = c + 1
If c Mod 6 = 0 Then
r = r + 1
c = 0
End If
Next itm
One last note: sometimes the browser hadn't finished loading before the script moved on. I cheated by setting a breakpoint before the loop, waiting until the page had loaded, then pressing F5 to continue execution so that it would always run.
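The row and column bookkeeping in the loop above (advance the column, wrap to a new row after every sixth cell) amounts to slicing a flat list into rows of six. A minimal Python sketch of that reshaping, using made-up cell text and a hypothetical helper `to_rows`:

```python
def to_rows(cells, columns=6):
    """Group a flat list of cell texts into rows of `columns` items each."""
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]


# Hypothetical cell text: two full rows of six cells and one leftover cell
flat = ["r1c1", "r1c2", "r1c3", "r1c4", "r1c5", "r1c6",
        "r2c1", "r2c2", "r2c3", "r2c4", "r2c5", "r2c6",
        "r3c1"]
for row in to_rows(flat):
    print(row)
```

If the table layout changes, only the column count needs adjusting.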