PhantomJSDriver Times Out On Button Click Event in Selenium - vba

I have VBA code in Excel that is supposed to log in to a website and download some files using Selenium. My code works with the ChromeDriver, and I am trying to modify it to work with the PhantomJSDriver so I can do something else while the program runs (it runs for ~45 minutes). The issue is that when I have Selenium click the login button, I get a timeout error:
Run-time error '101':
WebRequestTimeout:
No response from the server within 30000 seconds
The interesting thing is that after it times out, I can use the immediate window to take a screenshot and it's clear that the button was clicked and the browser has advanced to the next page.
Dim D As New PhantomJSDriver
With D
.ExecuteScript ("window.resizeTo(1920,1080)")
.SendKeys MyKeys.Control, "0" 'Set zoom to 100% (causes errors if not 100%)
.Get "LoginPage.com"
.FindElementByName("username").SendKeys "UserName"
.FindElementByXPath("/html/body/div[@class='centreContent']/form[@id='loginForm']/input[@id='passwordDummy']").Click
.FindElementByXPath("/html/body/div[@class='centreContent']/form[@id='loginForm']/input[@id='password']").SendKeys "Password"
.TakeScreenShot.SaveAs "C:\Users\110SidedHexagon\Downloads\Capture.png" '<---Takes screenshot of login screen with username and password filled in
.FindElementByName("loginSubmitButton", 0.1).Click '<---Error occurs here
'<---Taking a screenshot from the Immediate window after the error breaks code execution shows the login was successful
End With

It means that after the button is clicked, the newly loaded page doesn't reach a completed state within 30 seconds.
This could be due to a dead resource within the page.
You could try to increase the server timeout:
Dim driver As New PhantomJSDriver
driver.Timeouts.Server = 60000 ' 60 seconds
driver.Get "https://..."
driver.FindElementByName("loginSubmitButton").Click
Or you could define a timeout to load the page and skip the error:
Dim driver As New PhantomJSDriver
driver.Timeouts.PageLoad = 20000 ' 20 seconds
driver.Get "https://..."
On Error Resume Next
driver.FindElementByName("loginSubmitButton").Click
On Error Goto 0
The latest release, which works with the above example, is available at:
https://github.com/florentbr/SeleniumBasic/releases/latest

Related

Using VBA Selenium Chromedriver to get download progress via executescript ...Selector('#progress').value getting error sometimes

I'm using MSAccess VBA with Selenium VBA, Chromedriver
I am able to get to a website, login, find the button to click to start download, and get that download to save to the location I want.
I want to track the progress.
I have opened a new Chrome window (tab), navigated to chrome://downloads/, and switched my driver to that window.
I've used the following code I found on stack overflow, in a loop to monitor the progress.
downloadPercentage = wd.ExecuteScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value")
It returns this error: ... Cannot read properties of null ...
My download is present on the page. I can get the file name.
When I open the developer tab on the downloads page and enter the same query selector path into the console I get the same thing. It returns null.
If I manually (or maybe even using VBA haven't tried) click the button for a second download, then all of a sudden that same code returns a value of 100. (It may catch it at a lower percent. The download is too fast for me to catch that in debug mode.)
What would cause the selector to not be present for one download, but then present for the next?
Here's the code that's in question.
Function getDownloadedFileName(wd as ChromeDriver) As String
Dim startTime As Date
'I'm using this method because opening a second ChromeDriver instance and going to the chrome://downloads/ page returns a clean slate (no downloads) and this method works for me.
wd.ExecuteScript ("window.open()")
wd.SwitchToNextWindow
wd.Get "chrome://downloads/"
startTime = Now()
Dim downloadPercentage As Integer
Do While DateDiff("s", startTime, Now()) < 120 And downloadPercentage < 100
'This is the line that returns the Javascript error ... Cannot read properties of null ...
downloadPercentage = wd.ExecuteScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value")
If (downloadPercentage = 100) Then
getDownLoadedFileName = wd.ExecuteScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content #file-link').text")
Exit Do
End If
Loop
wd.SwitchToPreviousWindow
End Function
I'll appreciate any help on this. Thanks!
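One way to keep the polling loop from raising the error is to have the script return a sentinel when any step of the selector chain is missing, instead of dereferencing null. The sketch below is only an illustration: queryDeep and downloadProgress are hypothetical helper names, and -1 is an arbitrary "not available" value. It does not explain why #progress is absent for the first download; it only makes the absence harmless.

```javascript
// Walk a chain of selectors, descending into each match's shadowRoot
// when it has one. Returns null instead of throwing when a step is missing.
function queryDeep(root, selectors) {
  let node = root;
  for (const selector of selectors) {
    if (!node) return null;
    const scope = node.shadowRoot || node;
    node = scope.querySelector(selector);
  }
  return node || null;
}

// Returns the progress value, or -1 when the #progress element is absent.
function downloadProgress(doc) {
  const el = queryDeep(doc, [
    'downloads-manager',
    '#downloadsList downloads-item',
    '#progress',
  ]);
  return el ? Number(el.value) : -1;
}
```

Assuming ExecuteScript runs the string as a function body, both functions could be passed in one string that ends with return downloadProgress(document); and the VBA loop could treat -1 as "keep waiting".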

Click website button with VBA selenium basic

I'm trying to click a button on a website using VBA, Selenium and Chrome.
I need to log in to the website. It used to work but now fails.
If I open the website manually, enter the login and password and press submit, it works.
If I run the VBA code, the button is not clickable.
Sub Run_Test()
Dim dr As New ChromeDriver
Dim el As WebElement
Dim Login, password As String
Login = "vasilenko12": password = "1204"
dr.Get ("https://www.perevirkaznan.com/")
Sleep 1000
Set el = dr.FindElementByXPath("//a[@class='navigation__enter js-modal']")
el.Click
Sleep 30
Set el = dr.FindElementByXPath("//input[@name='login']")
el.SendKeys Login
Sleep 30
Set el = dr.FindElementByXPath("//input[@name='password']")
el.SendKeys password
Sleep 30
Set el = dr.FindElementByXPath("//label[@class='checkbox']")
el.Click
Sleep 30
Set el = dr.FindElementByXPath("//button[@class='btn btn-blue-transparent modal-submit']")
el.Click
Sleep 1000
dr.Get ("https://www.perevirkaznan.com/account/course")
Sleep 30
End Sub
What is happening
Every time I tested, the page threw some errors, which means that the button is actually being clicked.
What we have here is a communicability issue: the page is not showing an error message.
In their article, Raquel Prates, Clarisse de Souza and Simone Barbosa describe "breakdown situations and user attitudes likely to occur during human-computer interaction". Under the question "Why doesn't it? (What happened?)" they write:
"The alternative scenario (What happened?) is when they [Users] do not get
feedback from the system and are apparently unable to assign meaning
to the function's outcome (halt for a moment)."
Based on that, the page you are attempting to log in to is not showing any error message in the user interface (the form) even though your username and password are correct, and that leads you to think that your code has a problem (it does not; your code is fine). It is, however, showing an error in the console (DevTools).
What you can do
In his article, James Scott explains that:
“Effective error messages inform users that a problem occurred,
explain why it happened, and provide a solution so users can fix the
problem. Users should either perform an action or change their
behavior as the result of an error message.”
Based on this, you can look more closely at the error that is actually shown in the console, which is a 400 error.
Going this way, we can assume that the problem is a bug in loading the page, and we can work around it. I have tested this code several times and found that both errors shown in the console do matter: when the reCAPTCHA API or the Ads API from Google shows an error or fails to load, the Submit button does not work. This is because the developers made reCAPTCHA mandatory for the login form (yes, you are logging in through that object); integrating these resources made the page more buggy, so we need a workaround.
Workaround
This is my approach until the page's bug is fixed. You can wait for an element that is only shown after a successful login; I used the logout button as the reference. If the button is not shown, try again. This is what the code does. Usually it fails once and works the second time.
Sub stackoverflow_test2()
Dim dr As New ChromeDriver
Dim el As webElement
Dim Login, password As String
Login = "vasilenko12": password = "1204"
try:
dr.Get ("https://www.perevirkaznan.com/")
Sleep 1000
Set el = dr.FindElementByCss("a.navigation__enter.js-modal")
el.Click
Sleep 30
Set el = dr.FindElementByCss("input[name=login]")
el.SendKeys Login
Sleep 30
Set el = dr.FindElementByCss("input[type=password]")
el.SendKeys password
Sleep 30
Set el = dr.FindElementByCss("label[class=checkbox]")
el.Click
Sleep 30
Set el = dr.FindElementByCss("button[type=submit]")
el.Click
Sleep 1000
Success = CBool(dr.FindElementsByCss("a.navigation__logout").Count > 0)
If Not Success Then GoTo try:
dr.Get ("https://www.perevirkaznan.com/account/course")
Sleep 30
End Sub
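The loop above retries without limit, so a login that never succeeds would spin forever. A bounded variant could look like the sketch below. It is written in JavaScript only to keep it self-contained; retryUntil, attempt and check are hypothetical names, and in VBA the same shape would be a For loop around the login steps with the logout-button test as the exit condition.

```javascript
// Runs attempt() and then check(), up to maxAttempts times.
// Returns the number of attempts it took, or -1 if every attempt failed.
function retryUntil(attempt, check, maxAttempts) {
  for (let i = 1; i <= maxAttempts; i++) {
    attempt();               // e.g. fill in the form and click submit
    if (check()) return i;   // e.g. the logout button is present
  }
  return -1;                 // gave up after maxAttempts
}
```

With maxAttempts set to 3, the "fails once, works the second time" behaviour described above would return 2, and a permanently broken page would return -1 instead of hanging.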

Don't wait for a page to load using Selenium ChromeDriver in vba excel

I just want to get the final URL from a redirect.
I can read it with url = GC2.Url, but the page takes a long time to finish loading.
How can I make my second Selenium instance not wait for the page to load when it opens a link (that is, use the "none" page load strategy), while my first instance keeps the normal waiting behaviour once I close the second instance and continue?
I have found some approaches here: Don't wait for a page to load using Selenium in Python
Approximate example:
Set GC = New Selenium.ChromeDriver
Set GC2 = New Selenium.ChromeDriver
for c = 1 to 100
GC.Get link(c)
Set Elements = GC.FindElementsByCss("p.TITLE a[href]")
For Each Element In Elements
ReDim Preserve links(a) As String
links(a) = Element.Attribute("href") 'innerHTML/href etc. also work
a = a + 1
GC2.Get links(a) ' this here should not wait so long
redict(a) = GC2.Url
GC2.Close
Next Element
Try working in headless mode; it might be faster. In this mode the browser window doesn't open and everything happens "with eyes closed": everything is virtual.

Unable to let my script keep clicking on Load more button using IE

I've created a script in VBA using IE to keep clicking on the Load more hits button located at the bottom of a webpage until no such button is left.
Here is how my script makes that button appear: on the site's landing page there is a dropdown named Type. The script clicks on Type to unfold the dropdown, then clicks on the corporate bond checkbox among the options. Finally, it clicks on the apply button to populate the data. The Load more hits button is then visible at the bottom.
My script can follow almost all the steps exactly as I described above. The only thing I am struggling to solve is that the script seems to get stuck after clicking on that button 3/4 times.
How can I rectify my script to keep clicking on that Load more hits button until there is no such button is left?
Website link
I've tried so far:
Sub ExhaustLoadMore()
Dim IE As New InternetExplorer, I As Long
Dim Html As HTMLDocument, post As Object, elem As Object
Dim CheckBox As Object, btnSelect As Object
With IE
.Visible = True
.navigate "https://www.boerse-stuttgart.de/en/tools/product-search/bonds"
While .Busy Or .readyState < 4: DoEvents: Wend
Set Html = .document
Do: Loop Until Html.querySelectorAll(".bsg-loader-ring__item").Length = 0
Html.querySelector("#bsg-filters-btn-bgs-filter-3").Click
Do: Set CheckBox = Html.querySelector("#bsg-checkbox-3053"): DoEvents: Loop While CheckBox Is Nothing
CheckBox.Click
Set btnSelect = Html.querySelector("#bsg-filters-menu-bgs-filter-3 .bsg-btn__label")
Do: Loop While btnSelect.innerText = "Close"
btnSelect.Click
Do: Loop Until Html.querySelectorAll(".bsg-loader-ring__item").Length = 0
Do: Set elem = Html.querySelector(".bsg-table__tr td"): DoEvents: Loop While elem Is Nothing
Do
Set post = Html.querySelector(".bsg-searchlist__load-more button.bsg-btn--juna")
If Not post Is Nothing Then
post.ScrollIntoView
post.Click
Application.Wait Now + TimeValue("00:00:05")
Else: Exit Do
End If
Loop
End With
End Sub
I've also tried with Selenium, but that seems to be far slower. It does keep clicking on the load more button, but with a long wait in between, even though there is no hardcoded wait in the code. For Selenium, I would welcome any solution that helps reduce its execution time.
Sub ExhaustLoadMore()
Const Url$ = "https://www.boerse-stuttgart.de/en/tools/product-search/bonds"
Dim driver As New ChromeDriver, elem As Object, post As Object
With driver
.get Url
Do: Loop Until .FindElementsByCss(".bsg-loader-ring__item").count = 0
.FindElementByCss("#bsg-filters-btn-bgs-filter-3", timeOut:=10000).Click
.FindElementByXPath("//label[contains(.,'Corporate Bond')]", timeOut:=10000).Click
.FindElementByXPath("//*[@id='bsg-filters-menu-bgs-filter-3']//button", timeOut:=10000).Click
Do: Loop Until .FindElementsByCss(".bsg-loader-ring__item").count = 0
Set elem = .FindElementByCss(".bsg-table__tr td", timeOut:=10000)
Do
Set post = .FindElementByCss(".bsg-searchlist__load-more button.bsg-btn--juna", timeOut:=10000)
If Not post Is Nothing Then
post.ScrollIntoView
.ExecuteScript "arguments[0].click();", post
Do: Loop Until .FindElementsByCss("p.bsg-searchlist__info--load-more").count = 0
Else: Exit Do
End If
Loop
Stop
End With
End Sub
I have studied your website a bit, and since I could not fit all of this into a single comment, I decided to post an answer (even though it doesn't provide a concrete solution, just an explanation and maybe some tips).
The answer to your question
How can I rectify my script to keep clicking on that Load more hits button until there is no such button is left?
Unfortunately, it's just not your fault. The website you are targeting works through WebSocket communication between the web client (your browser) and the web server providing the prices you are trying to scrape.
Imagine it like this:
When you first load your webpage, the web socket is initialized and the first request is sent (Web client: "Hey server, give me the first X results", Web server: "Sure, here you go").
Every time you click on the "Load more results" button, the Web client (important: re-using the same WS connection) keeps on asking for X new results to the web server.
So, the communication keeps going for some time. At some point, out of your control, the web socket just dies. It's enough to look at the JavaScript console while clicking on the "Load more results" button: you will see the requests going through until, at some point, you see a null-reference exception (a TypeError in JavaScript) raised:
If you click on the last line of the stack before the exception, you will see that it's because of the web socket:
The error speaks clearly: cannot read .send() on null, meaning that _ws (the web socket) is gone.
From this point on, you can forget about the website. When you click on the "Load more results" button, the web client will ask the web socket to deliver the new request to the web server, but the web socket is gone, so goodbye to communication between the two and (unfortunately) goodbye to the rest of your data.
You can verify this by going a bit further up the stack:
As you can see above, we have:
A message logged in the console saying "performSearch params ..." just before posting the new data request
The post of the new data request
A message logged in the console saying "performed search with result ..." just after posting the new data request
While the web socket is still alive, every time you click on "Load more results" you will see these two messages in the console (with other messages printed in between by the rest of their code):
However, after the first crash of the web socket, no matter how many times you try to click on the button you will only get the first message (web client sends the request) but never will get the second message (request gets lost in the void):
Please note this corresponds to your behavior observed in VBA:
the script seems to get stuck after clicking on that button 3/4 times.
It doesn't get stuck, actually your script keeps on executing correctly. It's the website that times out.
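The behaviour described above can be reproduced with a toy model. This is only a sketch: SearchClient and clickLoadMore are hypothetical stand-ins, not the site's code, and only the _ws name and the two log messages are taken from the console output described above. The second message is simplified to be logged right after the send.

```javascript
// Hypothetical stand-in for the page's client; only _ws and the two
// message texts come from the description above.
class SearchClient {
  constructor(ws) {
    this._ws = ws;
    this.log = [];
  }

  performSearch(params) {
    this.log.push('performSearch params');          // first message: always logged
    this._ws.send(JSON.stringify(params));          // throws a TypeError once _ws is null
    this.log.push('performed search with result');  // second message: only if send worked
  }
}

// Simulates a click on "Load more results": the click itself always happens,
// but the search only goes through while the socket is alive.
function clickLoadMore(client, params) {
  try {
    client.performSearch(params);
    return true;
  } catch (err) {
    return false; // the request was never delivered
  }
}
```

Once _ws is set to null, every further call logs only the first message and returns false, which matches the "clicks keep happening but no new data arrives" symptom.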
I have tried to figure out why the web socket crashes, but with no luck. It just seems to be a timeout (I hit it a lot more often while debugging their JavaScript, so my breakpoints were causing the timeout), but I can't assure you it's the only cause. Since you don't control the process between the web client and the web server, all you can do is hope that it doesn't time out.
Also, I believe Selenium automatically sets somewhat longer timeouts (because of the long execution time), and this somehow makes the web socket more tolerant of the timeouts.
The only way I found to restore the connection after a crash of the web socket is to completely reload the web page and restart the process from scratch.
My suggestions
I think you might go with building an XHR request and sending it through JavaScript, because their API (through which the web client/web socket delivers the request to the web server) is pretty exposed in their front-end code.
If you open their file FinderAPI.js, you will see they've left the endpoints and API configuration hardcoded:
var FinderAPI = {
store: null,
state: null,
finderEndpoint: '/api/v1/bsg/etp/finder/list',
bidAskEndpoint: '/api/v1/prices/bidAsk/get',
instrumentNameEndpoint: '/api/products/ProductTypeMapping/InstrumentNames',
nameMappingEndpoint: '/api/v1/bsg/general/namemapping/list',
apiConfig: false,
initialize: function initialize(store, finderEndpoint) {
var apiConfig = arguments.length > 2 && arguments[2] !== undefined ? arguments[2] : false;
this.store = store;
this.state = store.getState();
this.apiConfig = apiConfig;
this.finderEndpoint = finderEndpoint;
},
This means you know the URL to which you should send your POST request.
A request also requires a Bearer token to be validated by the server. Luckily for you, they have also forgotten to protect their tokens, providing (GORSH) a GET endpoint to get the token:
End-point: https://www.boerse-stuttgart.de/api/products
Response:
{"AuthenticationToken":"JgACxn2DfHceHL33uJhNj34qSnlTZu4+hAUACGc49UcjUhmLutN6sqcktr/T634vaPVcNzJ8sHBvKvWz","Host":"frontgate.mdgms.com"}
You'll just have to play around with the website a little to figure out the body of your POST request, then create a new XmlHttpRequest and send those values inside it to retrieve the prices directly in your VBA, without opening the webpage and scraping it robotically.
I suggest you start with a breakpoint in the file FinderAPI.js, line 66 (the line of code is this.post(this.finderEndpoint, params)); params should lead you to the body of the request. As I remember, you can print the object as a string with JSON.stringify(params).
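As a rough sketch of how the pieces above might fit together, the function below only assembles the request and does not send it. The endpoint path is the one quoted from FinderAPI.js; everything else is an assumption: buildFinderRequest is a hypothetical name, baseUrl must be supplied by you (which host actually serves the endpoint is not confirmed here), the exact Authorization header format is a guess based on "Bearer token", and the shape of params still has to be discovered with the breakpoint.

```javascript
// Endpoint path copied from FinderAPI.js as quoted above.
const FINDER_ENDPOINT = '/api/v1/bsg/etp/finder/list';

// Builds the pieces of the POST request without sending it.
// baseUrl, the header format and the shape of params are assumptions.
function buildFinderRequest(baseUrl, token, params) {
  return {
    url: baseUrl.replace(/\/+$/, '') + FINDER_ENDPOINT, // avoid a double slash
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer ' + token,
      },
      body: JSON.stringify(params),
    },
  };
}
```

The result could then be handed to fetch(req.url, req.options) in a browser console, or translated field by field into an XMLHTTP request in VBA.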
Also, please note that they use a pagination of 50 results per request, even though their API supports up to 500. In other words, if you manage to swap the value 500 (instead of 50) into the pagination property sent to the API for the request:
... then you will get 500 results at a time instead of 50, reducing by a factor of 10 the time your code spends scraping the webpage, in case you decide not to go deeper into the XHR solution.
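The factor of 10 follows from simple arithmetic on the number of requests. In the sketch below, requestsNeeded is a hypothetical helper and the total of 5000 results in the usage note is an invented figure for illustration only.

```javascript
// Number of page requests (or "Load more" clicks) needed to fetch everything.
function requestsNeeded(totalResults, pageSize) {
  if (pageSize <= 0) throw new RangeError('pageSize must be positive');
  return Math.ceil(totalResults / pageSize);
}
```

For an assumed 5000 results, requestsNeeded(5000, 50) gives 100 requests while requestsNeeded(5000, 500) gives 10.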
Could you try to change
Do
Set post = Html.querySelector(".bsg-searchlist__load-more button.bsg-btn--juna")
If Not post Is Nothing Then
post.ScrollIntoView
post.Click
Application.Wait Now + TimeValue("00:00:05")
Else: Exit Do
End If
Loop
to:
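Set post = Html.querySelector(".bsg-searchlist__load-more button.bsg-btn--juna")
If Not post Is Nothing Then
post.ScrollIntoView
While Not post Is Nothing
Debug.Print "Clicking"
post.Click
Application.Wait Now + TimeValue("00:00:05")
'Look the button up again so the loop can end once it is gone
Set post = Html.querySelector(".bsg-searchlist__load-more button.bsg-btn--juna")
Wend
Debug.Print "Exited Click"
End If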
(untested)

Use Selenium Driver to load page, dismiss/dispose it and keep browser active

I am using Selenium WebDriver to load a specific feature from an application, a rich text editor (actually a custom release of CKEditor), and the code below works perfectly for that... except that I would like to release the Selenium objects (and the black geckodriver.exe/marionette cmd window) once the desired page has loaded. But the .Close(), .Quit() and .Dispose() methods all wipe out the Firefox window as well...
Is there a way to dismiss Selenium WebDriver and keep Firefox running on its own?
Thank you very much
Private Sub LoadResource()
Dim FFD As New OpenQA.Selenium.Firefox.FirefoxDriver()
'Set timeout of 60 seconds for steps to complete successfully
Dim WDW As New OpenQA.Selenium.Support.UI.WebDriverWait(FFD, TimeSpan.FromSeconds(60))
'navigate to login page
FFD.Navigate.GoToUrl("https://www.myapplication.com/login")
'Wait until application loads main page (this means login was successful)
WDW.Until(Function() FFD.Url = "https://www.myapplication.com/")
'Load built-in rich text editor
FFD.Navigate.GoToUrl("https://www.myapplication.com/editor?document=1080199")
'Wait for successful loading of the editor page
WDW.Until(Function() FFD.Url = "https://www.myapplication.com/editor?document=1080199")
'That's all.
'here I'd like to release Firefox to keep running and get rid of WebDriver's objects and resources, if possible.
End Sub
This is based on Kirhgoph's comment and seems to work well:
Private Sub LoadResource()
Dim FFD As New OpenQA.Selenium.Firefox.FirefoxDriver()
Dim GDP As Process = Process.GetProcessesByName("geckodriver").Last
Dim WDW As New OpenQA.Selenium.Support.UI.WebDriverWait(FFD, TimeSpan.FromSeconds(60))
FFD.Navigate.GoToUrl("https://www.myapplication.com/login")
WDW.Until(Function() FFD.Url = "https://www.myapplication.com/")
FFD.Navigate.GoToUrl("https://www.myapplication.com/editor?document=1080199")
WDW.Until(Function() FFD.Url = "https://www.myapplication.com/editor?document=1080199")
GDP.CloseMainWindow()
GDP.WaitForExit()
FFD.Quit()
End Sub