I am using HtmlUnit for web scraping: logging in to a website on behalf of users, setting something in their profile, and then returning.
I am using plain HtmlUnit, without the Selenium framework.
Now my question:
WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_11);
Does this statement create a browser instance on the machine where I am executing the code, or what does it do?
I am using BrowserVersion.INTERNET_EXPLORER_11 because that website accepts this browser.
How is Selenium different from HtmlUnit? I know we can use HtmlUnit as a WebDriver in Selenium. Does Selenium need a native browser instance on the machine where the code is executed? Does Selenium create browser instances?
My use case: multiple users will access this application. I know WebClient in HtmlUnit is not thread-safe (so I have to declare it as a Spring prototype-scoped bean).
Are there any suggestions regarding this?
Any help is greatly appreciated.
HtmlUnit is a headless browser, so no window is created, even when it is used through Selenium. Setting the BrowserVersion just tells HtmlUnit to present itself to the server as the given browser (AFAIK it changes the User-Agent, but it might also perform additional internal processing depending on the version). I think this answers all of the questions except the last one.
As for suggestions on how to implement this: I would try to avoid logging in to a website that way. If the website does not provide an API for this, it is likely that doing so is against its Terms of Service. Assuming it is not, you will have to create a new WebClient instance for each user each time the data needs to be extracted from the other site.
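The "new instance per user, per extraction" advice can be sketched as a small helper. This is only an illustration: PerUserClientSketch and withFreshClient are hypothetical names, not part of HtmlUnit, and the helper assumes the client type is AutoCloseable so it can be closed after each use.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper (not part of HtmlUnit): creates a fresh client for one
// unit of work and closes it afterwards, so no client instance is ever
// shared between users or threads.
final class PerUserClientSketch {

    static <C extends AutoCloseable, R> R withFreshClient(
            Supplier<C> factory, Function<C, R> work) throws Exception {
        // try-with-resources closes the client even if the work throws
        try (C client = factory.get()) {
            return work.apply(client);
        }
    }
}
```

With HtmlUnit on the classpath, the factory could be something like `() -> new WebClient(BrowserVersion.INTERNET_EXPLORER_11)`, assuming a HtmlUnit version in which WebClient implements AutoCloseable. A Spring prototype-scoped bean achieves the same isolation as long as a new bean is requested for every task.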
Please note that I am not talking about Selenium RC or RemoteWebDriver or the old Selenium server or the standalone server. I am also not talking about Selenium 4.0 so let us not get into that because it simply dodges answering the question.
When we say Selenium-WebDriver, as per my understanding, it is an integration of Selenium code + WebDriver project (which is also the name of the W3C WebDriver specification).
Here is my understanding of the difference so far:
Selenium = the library which supports multiple programming languages. In each language the user creates instances of ChromeDriver / FirefoxDriver etc. which are client libraries that allow the user to write high-level code. This code is converted by these client libraries into an API request with JSON body containing javascript commands. This is sent to the Selenium server through API requests.
The API requests sent to the Selenium server are either converted there to the W3C WebDriver format, or the client libraries already do that. I'm not sure which it is.
The selenium server converts the request or simply forwards it to the driver application that is running (Selenium server is supposed to act like a reverse proxy so I think that it simply forwards it) into a format that is understood by the drivers (the W3C specification).
The drivers now interact with the browser. How do they do that? Well, the browsers come built-in with JS libraries to help drive automation. And these drivers are external applications that take REST API requests, and are able to call these JS libraries based on these requests.
When we say WebDriver, it consists of 2 parts. One, the W3C specification. Two, it is also the name that Selenium has kept for its client. You are actually creating an instance of "WebDriver" (or "RemoteWebDriver" to be more specific) while writing code. So "WebDriver" is more than just a W3C specification over here. It is also the name of the Selenium client implementation. (I have no idea why they decided to keep the same name, which made it so confusing; it's like creating a microservice called "REST API microservice".)
Qs A: Is my understanding of the concepts correct? I cannot get a clear answer anywhere because all answers/discussions on the web are simply dodging answering the question by quoting from the official docs (which are pretty vaguely worded).
Qs B: Does the Selenium server modify the response in any way before sending it to the driver? If yes, then can we say that in this case the "Selenium server" has become the client and driver has become the server?
Qs C: I had read somewhere that the browser drivers also act like a RESTful service. Is this true?
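To make Qs C concrete, here is a minimal sketch of the HTTP request a client would send to a driver executable to start a session under the W3C WebDriver protocol. It uses only the JDK (Java 11+) and builds the request without sending it. The class name NewSessionRequestSketch is made up for illustration, and the base URL (chromedriver's usual default port, 9515) is an assumption about a local setup.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: builds (but does not send) a W3C WebDriver
// "New Session" request for a driver listening at driverBaseUrl.
final class NewSessionRequestSketch {

    // Minimal capabilities body asking for a Chrome session.
    static final String BODY =
        "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\"chrome\"}}}";

    static HttpRequest newSession(String driverBaseUrl) {
        return HttpRequest.newBuilder(URI.create(driverBaseUrl + "/session"))
            .header("Content-Type", "application/json; charset=utf-8")
            .POST(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = newSession("http://localhost:9515"); // assumed address
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending this with java.net.http.HttpClient to a running driver should return a JSON response containing a session id, which later commands put in their URL path. Note that the body is a plain JSON command description, not JavaScript.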
I've seen this question has been asked a few times, and lots of solutions get suggested - but none of them seem to work for the RemoteWebDriver (ie: using Selenium Grid). They're usually centered around using the local ChromeDriver/FirefoxDriver/IEDriver classes.
I am using the .NET bindings, by the way :).
What I want to do is fairly simple (in terms of requirement). I have a Selenium Server setup, and am currently using the RemoteWebDriver to perform automated UI tests on various sites. This setup is working fine.
Some sites, though, use NTLM/Windows Authentication, and we need to start writing automated tests for them. As far as I can tell, there is no solution for this.
I have seen the following "solutions", but - unless someone can correct me - they either don't work consistently, or will not work using RemoteWebDriver:
Using the IAlert functionality (like here). However, this isn't implemented in the .NET bindings, and doesn't work for all browsers as far as I can tell.
Using the Robot API to interact with the popup (like here). But this is for running on your local machine, and not supported by RemoteWebDriver.
Using AutoIt to do a similar thing to the Robot API. However, this won't work using RemoteWebDriver.
Passing the credentials in the URL (e.g. http://username:password@example.com). However, this doesn't work for Windows Authentication, only for normal HTTP Basic Authentication.
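For reference, the credentials-in-URL form from the last option can be built with the JDK's java.net.URI, which places the user-info part before an "@" and percent-encodes characters that are not legal there. This is a small sketch only; the class name CredentialUrlSketch and the sample values are made up, and (as noted above) the technique applies to HTTP Basic Authentication, not NTLM.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative only: builds a URL of the form scheme://user:password@host/
final class CredentialUrlSketch {

    static String withCredentials(String user, String password, String host)
            throws URISyntaxException {
        // The 7-argument constructor quotes illegal characters in the user-info part.
        URI uri = new URI("http", user + ":" + password, host, -1, "/", null, null);
        return uri.toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(withCredentials("username", "password", "example.com"));
    }
}
```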
I can't actually see any other solutions, unless anyone else can help?
A workaround currently is to log onto the Selenium server, go to the sites in each browser, and save the credentials. But this isn't ideal, and adds a level of manual interaction to each test.
Any help would be appreciated :).
It appears I have found my own solution: use a proxy which adds the NTLM negotiation/authorisation automatically. Pretty simple to set up :).
http://cntlm.sourceforge.net/
There are many Selenium WebDriver binding packages for Go.
However, I don't want to control the browser through a server.
How can I control a browser with Go and Selenium without the Selenium server?
You can try github.com/fedesog/webdriver which says in its documentation:
This is a pure go library and doesn't require a running Selenium driver.
I would characterize the Selenium webdriver as a client rather than a server. Caveat: I have used the Selenium webdriver (Chrome version) from .Net and I am assuming it is similar for Go.
The way Selenium works is that you will launch an instance of it from within code, and it creates a live version of the selected browser (i.e. Chrome) and your program retains control over it. Then you write code to tell the browser to navigate to a page, inspect the response, and interact with the browser by filling out form data, clicking on buttons, etc. You can see what is happening on the browser as the code runs, so it is easy to troubleshoot when the interaction doesn't go as planned.
I have used Selenium to upload tens of thousands of records to a website that has no API and only a graphical user interface. Give it a chance.
Using Selenium's WebDriver with PhantomJSDriver, I am trying to do headless browser testing. It works fine when connected to the internet WITHOUT a proxy, but it fails when the connection goes through an authenticated proxy. I want to deploy this program to multiple user sites, which might be connected to the internet with or without a proxy, and where a proxy, if present, might be authenticated or unauthenticated.
Is there a way to tell Selenium Webdriver to use the "current" browser's internet connection settings? Please note I am using phantomjs.
Thanks,
abbas
There is a simpler but very effective solution that I used when battling similar issues a year ago.
Do you have these issues when using other *Drivers? If not, my proposal is to use whichever other *Driver works fine and, after it passes authentication, just cast it to PhantomJSDriver. Please note that this is only a possible workaround, and only if your test framework hierarchy is built to support such an action.
In addition, consider the following: when I used such polymorphism, the difference was speed. Between FirefoxDriver and PhantomJSDriver it wasn't much of a pain, and if you use it only for authentication it will not slow you down noticeably.
I'm not sure that I can help you with my solution, but it will not hurt to try it.
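A different angle on the question, offered as a sketch rather than a tested fix: PhantomJS itself accepts the command-line options --proxy, --proxy-type and --proxy-auth, so the per-site connection settings could be turned into an argument list at startup. The class name PhantomProxyArgsSketch and the sample host and credentials are hypothetical, and this does not read the "current" browser's settings, which would still need to come from configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: maps a site's proxy settings to PhantomJS CLI options.
final class PhantomProxyArgsSketch {

    // proxyHostPort: e.g. "proxy.example.com:3128", or null for a direct connection
    // userPass:      e.g. "user:secret", or null for an unauthenticated proxy
    static List<String> cliArgs(String proxyHostPort, String userPass) {
        List<String> args = new ArrayList<>();
        if (proxyHostPort == null) {
            return args; // no proxy: start PhantomJS without extra options
        }
        args.add("--proxy=" + proxyHostPort);
        args.add("--proxy-type=http");
        if (userPass != null) {
            args.add("--proxy-auth=" + userPass);
        }
        return args;
    }
}
```

With the PhantomJSDriver Java bindings, such a list is typically passed as a String[] through the `PhantomJSDriverService.PHANTOMJS_CLI_ARGS` capability; check the version of the bindings in use before relying on that.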
I'm new to Selenium.
Selenium IDE is a user-friendly Firefox plugin, and I have no problem using it. However, I found that the documentation for other Selenium tools such as Selenium RC and Selenium Core is quite confusing for beginners. It seems that the authors assume readers already have deep knowledge of these tools.
For example, when I tried to figure out how to set up Selenium RC to test a web server, the only diagram I could find on the Selenium website was this:
http://www.sparksupport.com/blog/wp-content/uploads/2010/11/selenium-rc.png
From this diagram, I can't even see which one is the web server under test or where I should install the Selenium components.
At first I thought this diagram was a bit odd and that I should be able to find a better one on other websites. I was surprised to find that almost all Selenium RC setup diagrams on the internet are clones of this one. No one seems to have attempted a different diagram or a fuller description of the Selenium RC setup.
I would appreciate guidance on how to set up Selenium RC. The things that I want to know are:
Can I use Selenium RC to test any website on the Internet?
How do I set up Selenium RC?
Is my current setup correct? My current setup is like this: In a LAN with Internet access, I have 3 servers. Server-1 has IE8 and Server-2 has Firefox 3.6. Server-3 will be used as the Selenium RC server, so Selenium RC on Server-3 will remotely control Server-1 and Server-2 to start IE and FF. Server-1 and Server-2 will use Server-3 as the HTTP proxy to connect to any web server on the Internet. If I want to test a website such as yahoo.com, I can write a Selenium script and run it on Server-3 to control IE and FF on Server-1 and Server-2.
This info is related to Selenium 1.
The Selenium system consists of 3 parts:
Selenium Core: a JavaScript library that is used to simulate user actions.
Selenium RC: this is selenium-server.jar, a mediating Jetty server that receives requests from the Selenium client. The Selenium RC (Remote Control) server should run on the same machine as the browser.
Selenium client: a Java/Ruby/... library that you use in your tests to communicate with Selenium RC.
It would be helpful if you provided the language you use for your tests and other technical details.
About your questions:
Yes, you can.
Type in the command line: java -jar selenium-server.jar
Or you can use the SeleniumServer class in your program.
Please use text formatting when asking questions.
Server-1 will have IE8 and a SeleniumServer.
Server-2 will have FF and another SeleniumServer.
Server-3 will have your client tests.
FYI: you can run everything together on one PC.
The diagram below shows a web application test system that I've implemented on numerous occasions. It does not give specific details on installing Selenium RC, but it does show, at a high level, all of the necessary system components and how they interface.
We hope you can use it to get ideas on how to implement your own systems using open source solutions like Selenium, MySQL and Perl.
Our team understands that not all web sites are created equal and that, for any automation initiative to be successful, a thorough analysis must be performed of not only the web application but the business as well. Since our client's QA team, while technically savvy, were not programmers, we decided to implement a page object design pattern in which all of the "magical selenium commands" were abstracted in a class and exposed to the test developers as methods they would call from their test scripts.
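As a rough illustration of the page object idea, here is a minimal sketch in Java (the system described here used Perl; Java is used only for the example). Everything in it is hypothetical: the BrowserCommands interface stands in for whatever Selenium client is used, and the login page, URL path and locators are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the low-level browser automation client.
interface BrowserCommands {
    void open(String url);
    void type(String locator, String text);
    void click(String locator);
}

// Page object: the only place that knows locators and low-level commands.
final class LoginPage {
    private final BrowserCommands browser;

    LoginPage(BrowserCommands browser) {
        this.browser = browser;
    }

    LoginPage open() {
        browser.open("/login");
        return this;
    }

    // What a test developer calls; no locators leak into the test script.
    void logInAs(String user, String password) {
        browser.type("id=username", user);
        browser.type("id=password", password);
        browser.click("id=submit");
    }
}

// Recording fake so the sketch can run without a real browser.
final class RecordingBrowser implements BrowserCommands {
    final List<String> log = new ArrayList<>();

    public void open(String url) { log.add("open " + url); }
    public void type(String locator, String text) { log.add("type " + locator + " " + text); }
    public void click(String locator) { log.add("click " + locator); }
}
```

A test script then reads as `new LoginPage(browser).open().logInAs("alice", "s3cret")`, and a locator change on the page only touches LoginPage.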
The resulting implementation, as seen in the diagram below, is currently deployed and keeping management and interested parties up to date on the status of key functional areas of the web site.
System Diagram
In the coming weeks, we are going to be covering each implementation step in more detail. We look forward to any feedback!
Web and Mobile Automation Blog