How does this website detect remote control with Selenium and ChromeDriver?

I’m trying to screen scrape my own credit card information from the Discover website using selenium and chromedriver. In response it returns the error:
Your account cannot currently be accessed.
Outdated browsers can expose your computer to security risks. To get
the best experience on Discover.com, you may need to update your
browser to the latest version and try again.
Interestingly, if I write a script to open a headed browser and type in some random account and password, it works normally. But if the script first touches the web page and then I type, I get the above error message. The script that works is:
import time
from selenium import webdriver
driver = webdriver.Chrome()
driver.execute_script('window.location.href = "https://portal.discover.com/customersvcs/universalLogin/ac_main";')
It fails if I append these lines to the script and type after the sleep finishes:
time.sleep(5)
driver.find_element_by_id('userid-content').click()
I’ve tried other ways to enter data into the page, such as send_keys and executing Javascript to modify the page and they all fail the same way.
How can the website detect the remote control? Is there a way to circumvent it?

I have tried with your concept and your code block, and can confirm that yes, portal.discover.com is able to detect automated logins.
One observation: filling in the User ID and Password fields, and even clicking the Submit button, is still achievable. Here is the relevant code block:
import time
from selenium import webdriver
driver = webdriver.Chrome()
driver.execute_script('window.location.href = "https://portal.discover.com/customersvcs/universalLogin/ac_main";')
time.sleep(5)
driver.find_element_by_css_selector("input#userid-content").send_keys("Harold")
driver.find_element_by_css_selector("input#password-content").send_keys("Harold")
# driver.find_element_by_css_selector("form#login-form-content input#log-in-button").click()
Snapshot with the filled-in User ID and Password fields:
But once you click the Submit button, the loginForm is validated through a JavaScript function, validateForm(this), invoked by the onsubmit event.
Remarkably, even before the user credentials are validated, the website seems to detect the automated login process and sends back:
Your account cannot currently be accessed.
Outdated browsers can expose your computer to security risks. To get the best experience on Discover.com, you may need to update your browser to the latest version and try again.
For questions, please contact us at 1-800-347-7769. We're always available 24 hours a day, 7 days a week.
Snapshot of the error:
Reference
You can find a detailed discussion in:
Can a website detect when you are using selenium with chromedriver?
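One widely known detection signal is the navigator.webdriver flag, which chromedriver sets to true. Below is a minimal sketch that builds the Chrome DevTools Protocol command to override it before any page script runs, to be applied through Selenium 4's Chrome-specific execute_cdp_cmd. This masks only one signal and may well not be enough against sites using full-fingerprinting services such as Discover appears to.

```python
# Sketch: build the CDP command that hides navigator.webdriver from page
# scripts. Masking this single flag may NOT defeat a site that fingerprints
# the browser more thoroughly.

def stealth_cdp_command():
    """Return (method, params) for Page.addScriptToEvaluateOnNewDocument."""
    source = (
        "Object.defineProperty(navigator, 'webdriver', "
        "{get: () => undefined})"
    )
    return "Page.addScriptToEvaluateOnNewDocument", {"source": source}

# With a Selenium 4 Chrome driver, this would be applied as:
#   method, params = stealth_cdp_command()
#   driver.execute_cdp_cmd(method, params)
```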

Related

Using Selenium IDE for a test-case which includes backend call?

I add a new category in the admin panel and want to ensure that the category is available in the dropdown on the user's part of the website. Recorded test in the Selenium IDE works fine. But the thing is, the task that I execute is of course not a pure frontend thing - the category is saved in the database and is loaded from it to show it to the user. So if something goes wrong on the database-side, the test will fail.
My question is: is it bad practice to do such tests that depend on backend-behavior ? Should I go for Selenium Webdriver ?
If you use Selenium WebDriver, your test will not change in the main respect: it will still depend on the database side. Selenium WebDriver is just another testing tool that is more flexible and allows you to build more complex tests than Selenium IDE.
I don't think it is bad practice, because it is just one of the tests that should be executed to ensure that this part of your project works correctly. In this case I would check the back-end part (get all categories from the DB or the admin panel and check that there are no extra or missing ones) and then check the user's panel (all categories are the same as set in the DB and the admin panel).
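The "no extra or missing ones" check described above reduces to a set comparison. Here is a small sketch of that helper; the function name is illustrative, and in a real WebDriver test the dropdown options would be gathered from the page, e.g. with `[o.text for o in Select(dropdown_element).options]` using selenium.webdriver.support.select.Select.

```python
def diff_categories(db_categories, dropdown_options):
    """Compare categories stored in the DB with those shown in the UI dropdown.

    Returns (missing, extra): categories present in the DB but absent from
    the dropdown, and dropdown entries with no DB counterpart. Both lists
    empty means the front end matches the back end.
    """
    db, ui = set(db_categories), set(dropdown_options)
    return sorted(db - ui), sorted(ui - db)
```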

Retrieve current chrome open page in html without saving it

I'm implementing a python script mainly based on pyautogui. One of the things the script does is to open a chrome webpage. After that I would need to access the DOM of this currently open webpage.
Since I've not opened the browser with selenium, I can't use it to analyze the DOM.
However, my question is: is this currently open chrome page available/saved somewhere in the hard drive so that I can access it with selenium? Like an .html file?
I checked many other questions here and users talk about chrome cache, but there are no html files there.
I just need to be able to access the current open page and not all the historical data in the cache.
Opening web browser directly with selenium is not an option either, since most of the websites analyzed have captchas and distil technology.
Thanks.
If you start the original Chrome with the --remote-debugging-port=PORT_NR argument and visit localhost:PORT_NR from another browser, you will have access to the full content of the browser, including the dev console.
Once you have this, you have multiple ways to go:
You can visit http://localhost:PORT_NR with any other browser (or even with the same browser), and you should have full access to the content of the original Chrome. With Selenium you should have a relatively easy time from there.
You can also use the DevTools API (the documentation is... well... there is room for improvement; search for "Chrome DevTools Protocol"). As an example, you can request http://localhost:PORT_NR/json to get the available debugging URIs. Grab the relevant websocket endpoint (webSocketDebuggerUrl), open a websocket connection, and issue a command such as {"method": "DOM.getDocument", "id": 12}. You can find the available DOM-related commands here: https://chromedevtools.github.io/devtools-protocol/1-3/DOM
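The /json step above can be sketched as follows. Since a live browser is needed for the actual HTTP GET, the response body here is a hard-coded sample of the shape Chrome returns (fields abbreviated, IDs invented); against a running browser you would fetch it with e.g. urllib.request.urlopen.

```python
import json

# Illustrative sample of what GET http://localhost:PORT_NR/json returns.
SAMPLE_JSON_RESPONSE = """[
  {"id": "abc123", "type": "page", "title": "Example",
   "url": "https://example.com/",
   "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/abc123"},
  {"id": "ext456", "type": "background_page", "title": "Some Extension",
   "url": "chrome-extension://ext456/bg.html",
   "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/ext456"}
]"""

def page_debugger_urls(body):
    """Extract webSocketDebuggerUrl for entries of type 'page' (normal tabs),
    skipping extensions and other background targets."""
    return [t["webSocketDebuggerUrl"] for t in json.loads(body)
            if t.get("type") == "page"]
```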
Since I had to reinvent the wheel, I may as well give some extra info that I couldn't find anywhere:
Start the Browser with remote debugging enabled (see previous posts)
Connect to the given port on localhost and use these HTTP GET requests to get a very limited control of your browser:
https://chromedevtools.github.io/devtools-protocol/#endpoints
Most important:
GET /json/new?{url}
GET /json/activate/{targetId}
GET /json/close/{targetId}
GET /json or /json/list
To gain full control over the browser, you need to use a websocket connection. Each object in the GET /json or /json/list response has its own ID. Use this ID to interact with the tab. By the way: entries of type "page" are normal tabs; the other entries are extensions and so on. Once you know which tab you want to control, get its "webSocketDebuggerUrl".
Use this URL and connect with something that can speak the Websocket-protocol.
Once connected, you must send valid JSON with the following structure:
{
  "id": 0,
  "method": "Page.navigate",
  "params": {"url": "http://google.com"}
}
Notes:
id is a simple counter (int) that increases with each command you send; it is not the ID of the tab(!)
method is the method described in the docs; params is also described in the docs.
The return values are always JSON.
From now on you can use the official docs:
https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-navigate
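The message structure described above can be generated with a small helper like the following sketch, using a simple counter for the id field. Actually sending the message requires a websocket client (for example the third-party websocket-client package), which is omitted here.

```python
import itertools
import json

# Incrementing command counter, as required by the protocol; this is not
# the tab ID.
_next_id = itertools.count(1)

def cdp_message(method, **params):
    """Serialize one DevTools Protocol command with an auto-incrementing id."""
    return json.dumps({"id": next(_next_id), "method": method,
                       "params": params})

# e.g. cdp_message("Page.navigate", url="http://google.com") yields
# '{"id": 1, "method": "Page.navigate", "params": {"url": "http://google.com"}}'
# which can then be sent over the tab's webSocketDebuggerUrl connection.
```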
Dunno how other people found out about it, but it took me a few hours to get it working. Probably because everyone is just using Python's Selenium to do it.

Can I use Selenium with password-protected URLs?

I'm trying to use Selenium for fetching information automatically from an ADSL modem's status page.
To log in to the modem, it requires certain username + password combination. Unlike all the samples that I have found, this comes before fetching the page, and therefore is not the case of finding the right id and then 'typing' the text into them.
Does Selenium have support for reaching such access controlled pages?
If I understand you correctly, the following URL format will work:
http://username:password@modemstatusurl/bar/foo
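Note that the separator before the host is @, and any special characters in the username or password (such as @ or :) must be percent-encoded or they will break the URL. A small sketch, with placeholder host and path:

```python
from urllib.parse import quote

def basic_auth_url(username, password, host_and_path, scheme="http"):
    """Build a credentials-in-URL address (http://user:pass@host/...),
    percent-encoding characters that would otherwise break the URL."""
    user = quote(username, safe="")
    pw = quote(password, safe="")
    return f"{scheme}://{user}:{pw}@{host_and_path}"

# basic_auth_url("admin", "p@ss:1", "192.168.1.1/status")
# → "http://admin:p%40ss%3A1@192.168.1.1/status"
```

Be aware that modern browsers have restricted credentials-in-URL navigation in some contexts, so this may not work in every Chrome version.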

Accessing pass/failure state of previous row in Selenium

I've made my own user extension for Selenium IDE, and one addition I would like to make requires me to know whether the previous command succeeded or failed.
Is there a way to access this information?

Do not record particular page hit

I am trying to hit a particular web page and record it after it loads, for example:
http://serv1.project.com/page7
but on hitting the above page, only http://serv1.project.com gets recorded. When I play the same script back, http://serv1.project.com is opened without the subsequent page hit.
Note: I am trying to run my scripts using Selenium RC with Java as the base.
Why does it matter what gets recorded? Can't you just do:
selenium.open("page7");
in your Java code?
Of course, I assume you have created the Selenium session with http://serv1.project.com/ as the base URL.