Running scraper bot on remote server

Is there a way to run a Python script that uses Selenium on a remote Linux server?
I would like to connect via SSH, upload the script, and have it running 24/7.
But I am not sure whether a script that normally relies on interacting with a browser would work there.
The script logs in to Facebook and then does some things that help me with my work, and I would like to put it on a remote server so I don't have to keep my PC running all the time.
Thanks in advance.

This is definitely doable. What you need boils down to:
your scraper script (Python comes preinstalled on many Unix machines, so the runtime environment is usually not a concern)
a cron job (or a CI server if you need more complex behavior)
Selenium driver binaries
an Xvfb setup
(optional) a deploy.sh that automatically puts new versions of your code on the VM
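As a rough illustration of how the cron job and Xvfb pieces fit together, here is a minimal sketch that assembles a crontab entry running the script inside a virtual display. It assumes the xvfb-run wrapper that usually ships with Xvfb is installed; the script path, log path, schedule, and screen size are hypothetical placeholders, not values from the question.

```python
import shlex


def xvfb_command(script_path, python="python3", screen="1920x1080x24"):
    """Build the command that runs the scraper inside a virtual X display.

    xvfb-run starts Xvfb, sets DISPLAY for the child process, runs it,
    and shuts Xvfb down again when the child exits.
    """
    return [
        "xvfb-run",
        "-a",  # pick a free display number automatically
        f"--server-args=-screen 0 {screen}",
        python,
        script_path,
    ]


def crontab_line(schedule, command, log_path):
    """Render one crontab entry that appends stdout and stderr to a log file."""
    quoted = " ".join(shlex.quote(part) for part in command)
    return f"{schedule} {quoted} >> {shlex.quote(log_path)} 2>&1"


if __name__ == "__main__":
    # Hypothetical paths: adjust to wherever the script lives on the server.
    cmd = xvfb_command("/home/bot/scraper.py")
    print(crontab_line("*/30 * * * *", cmd, "/home/bot/scraper.log"))
```

The printed line can be pasted into `crontab -e`. If the browser is started in headless mode instead, Xvfb is not needed and the script can be called from cron directly.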

Related

Robot Framework executing test on RDP client

I have my Robot Framework setup on my PC.
I would like to connect to a remote Windows client, have it open a browser and access a URL.
Then verify that the page has loaded.
Pretty basic but since I am new to RF, I wanted to know how that would work.
For Linux machines, I would use the SSHLibrary and just execute commands (wget), but for the Windows machine I need to use the browsers.
Do I need RF installed on the destination RDP client?
Do I need the webdrivers for each browser on the RDP client?
How would I go about logging in to the Windows machine through RDP?
After logging in with RDP, do I run the same "Open Browser" with a browser and URL?
Thanks!!!

The use case you describe, a browser opened and controlled on a remote machine, is precisely what Selenium solves.
Though in day-to-day work or debugging we usually start a local browser, Selenium is primarily designed for remote execution. So head to www.selenium.dev and focus on the Grid; that's the component you are after.
With that approach, the answers to your specific questions are:
1. No, you need Robot Framework and the Selenium library on the local machine, and only Selenium and the webdrivers on the remote.
2. You don't need the drivers on the client; the Selenium library is all your code communicates with. The drivers need to be installed on the remote only.
3. On the local machine you get the logs of the webdriver command execution; the actual browser manipulation logs are only on the remote and the hub (but these are really debugging logs; everything high-level for the functional execution is local).
4. You don't really log in over RDP with this approach (RDP is totally out of the picture here), and yes, your code is the same as when running against a local browser (Open Browser, Get Text, etc.), only executed on a remote machine.
If you want to see why for 1) and 2), head to the answer over here (shameless plug 🙂)
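To make the "same code, remote browser" point concrete, here is a minimal Python sketch of a client talking to a Grid hub. It assumes the Selenium 4 Python bindings on the local machine and a hub listening on the default port 4444 with the /wd/hub path; the host name in the usage note is a placeholder, not something from the question.

```python
from urllib.parse import urlunsplit


def grid_url(host, port=4444, path="/wd/hub"):
    """Return the WebDriver endpoint of a Selenium Grid hub."""
    return urlunsplit(("http", f"{host}:{port}", path, "", ""))


def open_remote_browser(hub_host, start_url):
    """Open a Chrome session on a Grid node and load start_url.

    Only the Selenium client library is needed on this machine; the
    browser and its driver live on the remote node.
    """
    # Imported lazily so grid_url() can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    driver = webdriver.Remote(command_executor=grid_url(hub_host), options=options)
    driver.get(start_url)
    return driver
```

Calling `open_remote_browser("grid.example.internal", "https://www.selenium.dev")` would return a driver whose `title`, `find_element`, and `quit` calls behave like those of a local browser. In Robot Framework, the corresponding step is passing the hub address as the `remote_url` argument of Open Browser.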

How can anyone run Selenium test scripts without having to install or run them locally?

I am looking for ways to set up a central 'hub' for Selenium at my workplace that anyone within the company can access. For example, Tester A writes test scripts, and Person B can run them without having to manually copy the test scripts over to their local workstation.
So far, I've only thought of installing Selenium in a VM, which would then execute the tests as normal. But if I run Selenium Grid, would it run VMs within a VM? My only concern with VMs is that they would run slowly.
If anyone can think of a better solution or has a recommendation, please do give me some advice. Thank you in advance.

One idea: you can create an infrastructure combining Jenkins, Selenium and Amazon.
The following is my solution from another post.
You can do it with a grid.
First of all, you need to create a Selenium hub on an EC2 Ubuntu 14.04 AMI without a UI (command line only) and link it to your Jenkins master as a Jenkins slave, or use it directly as the master, whichever you prefer. Download Selenium Server standalone (be careful which version you download; if you download the Selenium 3 beta, things could change). With it you can configure the hub. You can also add the Selenium hub as a service and configure it to run automatically at server start. It's important that you open the Selenium default port (or the one you configured) so the nodes can connect to it. You can do that in the Amazon EC2 console once you have created your instance: you just need to add a security group with an inbound TCP rule for the port you want, for the IPs you want.
Then you can create a Windows Server 2012 instance (for example; that's what I did) and follow the same process. Download the same version of Selenium and the chromedriver (there is no need to download any Firefox driver for Selenium versions before Selenium 3). Create a text file containing the Selenium command that links the machine to the hub as a node, and save it as a *.bat file so you can execute it. If you want to run the .bat at startup, you can create a task with the Task Scheduler or use NSSM (https://nssm.cc/). Don't forget to add the rules to the security groups for this machine too!
Next, create the Jenkins server. You can use the Selenium hub as the Jenkins master or as a slave.
The last step is configuring a job to run on the Jenkins-Selenium machine. This job needs to be linked to your code repository (Git, Mercurial...). Using the Parameterized Build plugin for Jenkins, you can tell that job to pull the revision you want (so every developer can pull the revision with their new changes and new tests) and run the Selenium tests in that build with the current branch/revision against one single Selenium hub. You can use Ant or Maven to run the Selenium tests in Jenkins.
It may be complicated to understand because there are so many concepts here, but it's robust and it works fine!
If you have any doubts, tell me!
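The "text file saved as *.bat" step for the Windows node could look roughly like the sketch below, which generates the batch file from Python. It assumes the Selenium 2.x/3.x standalone server, whose command line accepts -role node and -hub; the jar name, driver path, and hub address are hypothetical placeholders to be replaced with the versions actually downloaded.

```python
def node_command(hub_host, server_jar, chromedriver, hub_port=4444):
    """Command that starts a standalone server as a node and registers it with the hub."""
    return (
        f'java -Dwebdriver.chrome.driver="{chromedriver}" '
        f'-jar "{server_jar}" -role node '
        f"-hub http://{hub_host}:{hub_port}/grid/register"
    )


def write_node_bat(path, command):
    """Write the command into a .bat file using Windows line endings."""
    with open(path, "w", newline="\r\n") as handle:
        handle.write("@echo off\n")
        handle.write(command + "\n")


if __name__ == "__main__":
    # Placeholder values: use the hub's private IP and the real file names.
    cmd = node_command("10.0.0.5", "selenium-server-standalone.jar", "chromedriver.exe")
    write_node_bat("start-node.bat", cmd)
    print(cmd)
```

The resulting start-node.bat is what the Task Scheduler or NSSM would then be pointed at. The hub port in the URL has to match the inbound rule opened in the security group.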

If Internet Explorer is not one of the browsers on which you must run your automation tests, I would recommend that you consider docker selenium.
Selenium provides pre-configured Docker images for both the Selenium Hub and Node (refer here for more information). To make use of docker selenium, all you need to do is find a machine (preferably a Unix machine), install Docker on it by following the instructions detailed here, and then start the hub and node containers. With Docker you can effectively turn a VM or a physical machine into a browser farm without worrying much about slowness, because, as far as I know, Docker runs each container as an isolated process on the host rather than as a full virtual machine.
Resorting to the Amazon cloud for running your Selenium nodes is all fine, but if you have corporate policies that prevent incoming traffic from the internet into your intranet, then I am not sure how useful the Amazon cloud would be.
Also remember that Jenkins is not absolutely required; it is more of a good-to-have part of the setup because it lets anyone run their tests from a web UI. This does, however, require that all your tests are checked in and made available in a central version control system in your organization.
PS: The reason I called out Internet Explorer as an exception is that IE runs only on Windows and there are no Docker images (yet) for Windows. All the Docker images are Unix-based.
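Once the hub and node containers are up, a quick way to confirm that the grid is usable before pointing tests at it is to poll the hub's status endpoint. The sketch below assumes the hub answers with a JSON body containing a "ready" flag under "value", which is how recent Grid versions report status as far as I know; the exact URL (for example http://<host>:4444/status, or /wd/hub/status on older grids) is an assumption to check against the version you run.

```python
import json
import time
from urllib.error import URLError
from urllib.request import urlopen


def hub_is_ready(status_payload):
    """Interpret the JSON body returned by the hub's status endpoint."""
    value = status_payload.get("value") or {}
    return bool(value.get("ready"))


def wait_for_hub(status_url, timeout=60.0, interval=2.0):
    """Poll status_url until the hub reports ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urlopen(status_url, timeout=5) as response:
                if hub_is_ready(json.load(response)):
                    return True
        except (URLError, OSError, ValueError):
            pass  # hub not reachable yet, or the reply was not valid JSON
        time.sleep(interval)
    return False
```

A Jenkins job, or any other runner, could call `wait_for_hub(...)` as its first step and fail fast if it returns False.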

Tests running silently without window appearing

Apologies, I'm still fairly new to selenium so please bear with me as I explain this.
Currently my selenium tests are running on a remote machine but no window opens when I'm remotely logged into that machine!
My setup is:
Remote machine has 2 admin users with Selenium Grid2 and a node as a windows service.
The machine is running windows server 2012 with the services having Interactive services enabled.
I am using Selenium 2.42.1 with IEDriver version 2.42.0
Tests are being built and run remotely on our build server.
I think that's everything. If you need any more information, please let me know, as I'd really like to know why I cannot view my tests running.
Just for clarification, the tests are running and they successfully pass or fail where necessary, but I just can't see it if I'm remote logged into the machine.
UPDATE
There has been some interesting progress on this issue but still no resolution.
I decided to try running a node from the command line and, funnily enough, the tests run with no problems and the browser window is displayed.
So if anyone has any idea why a browser window would appear when running a node from the command line but not when running a node as a service, that would be great.
I'm using Java Service Wrapper to create my service, and I'm using the same nodeconfig.json when running the node.

This can happen when you are running the node as a Windows service. The service's properties include an option that controls whether it runs in the background or may interact with the desktop ("Allow service to interact with desktop" on the Log On tab). Check that and you should see the windows.

How to run GUI tests on a jenkins windows slave without remote desktop connection?

I have a Jenkins agent set up on Windows 7 and a Jenkins server on Linux. I am running GUI tests on the Windows agent. They run fine if I have a remote desktop connection open to it, but fail otherwise. I found this link: Jenkins on Windows and GUI Tests without RDC
But the solution provided there is pretty vague. It seems the only solution is to somehow make the Jenkins server keep a remote desktop connection open at all times, but I can't find an option to do so.
Could anyone please clearly explain how to solve this issue?
Much appreciated!

"Your slave machines have to be at a desktop before the test can run properly. We had the same problem. Solution was to have the test machine start up and auto-logon to the desktop. To ensure that the test would ONLY start after the desktop was available, we added a scheduled task, set to run at user login, that would launch the Jenkins slave via Java Web Start. That way, Jenkins would only see the slave once the desktop was running. After that, everything worked fine."
This is the winning answer to the question you linked to, and it is very clear on what to do. The whole setup is outside of Jenkins. Jason Swager described how he automated a user logging in to a Windows desktop machine and then starting the Jenkins slave in that user's session.
And now, step by step:
1. Make sure you have a GUI available
"Solution was to have the test machine start up and auto-logon to the desktop."
Configure a standard Windows desktop to log in a specific user automatically when Windows starts. This way nobody needs to physically log in to the desktop (see How to turn on automatic logon in Windows 7).
2. Start the Jenkins slave
You need to start the Jenkins slave within this user session. Otherwise, the Jenkins slave won't have access to the Windows UI components (in other words, it cannot interact with the desktop).
"To ensure that the test would ONLY start after the desktop was available, we added a scheduled task, set to run at user login, that would launch the Jenkins slave via Java Web Start."
So you have to create a scheduled task and configure it to start your Jenkins client using Java Web Start.
3. Use it
"That way, Jenkins would only see the slave once the desktop was running. After that, everything worked fine."
When the slave is online, you can use it to run your UI tests.

To solve it, set Windows Auto-Logon as I explain here:
https://serverfault.com/questions/269832/windows-server-2008-automatic-user-logon-on-power-on/606130#606130
Then create a startup batch file for the Jenkins slave (place it in the Jenkins directory), which will launch its console on the desktop and allow GUI jobs to run:
java -jar slave.jar -jnlpUrl http://{Your Jenkins Server}:8080/computer/{Your Jenkins Node}/slave-agent.jnlp
(slave.jar can be downloaded from http://{Your Jenkins Server}:8080/jnlpJars/slave.jar)
EDIT :
If you're getting black screenshots (when using Selenium for example), create a batch file that disconnects Remote Desktop, instead of closing the RDP session with the regular X button:
%windir%\system32\tscon.exe %SESSIONNAME% /dest:console

The following did the trick for me:
In Jenkins, add an "Execute Windows batch command" build step with:
cmdkey /generic:TERMSRV/<servername> /user:<username> /pass:<password>
mstsc /v:<servername> /w:<width> /h:<height>
cd <path to your pom.xml>
<maven command> (e.g. mvn test -Dfiles_to_run=groupLaunch.xml)
cmdkey /delete:TERMSRV/<servername>
This creates a real mstsc session (Win-to-Win) with the specified width and height inside your virtual mstsc session (Jenkins-to-Win) powered by Jenkins.

I tried the solutions provided here, but nothing worked for me. In the end, I came up with a workaround.
From a different VM (VM2), I opened an RDP connection to the test VM.
I left that connection open inside VM2 and then disconnected from VM2.
It worked, but it requires having two virtual machines available.

You still need to use RDP, but in my case we could use an RDP loopback connection within the same VM.
The procedure:
In the VM, create two different accounts and create a Jenkins slave for each account.
Now you'll have two Jenkins slaves for two accounts in one VM:
slave 1 - account 1
slave 2 - account 2
Enable multiple concurrent RDP sessions by following this guide:
https://www.serverwatch.com/server-tutorials/how-to-enable-concurrent-remote-desktop-sessions-in-windows.html
On slave 2 (with account 2), run an RDP command to connect to slave 1 (with account 1), like the following:
Start /b "" "C:\RDP\rdp.exe" /v:127.0.0.2 /domain:\ /u:admin /p:xxxx /fullscreen /w:1920 /h:1200
127.0.0.2 is very important: it is a loopback address, so the RDP connection goes back to the same VM.
Put the above command into a Jenkins job with a name such as "OpenRDP_ToVMXXX", and then you can run any test on slave 1 with the GUI enabled. Enjoy.

As the solutions above seemed a bit overkill, I used this approach:
Disable the Jenkins service
Start Jenkins from the command line using java -jar "C:\Program Files (x86)\Jenkins\jenkins.war"
For some reason I had to install all plugins and everything again when starting it this way, so I recommend creating a backup first (there is a plugin for that). Good luck. :)

How can I run the InternetExplorerDriver through a linux VM?

What are the options here? Our plan is to execute Selenium tests on a Linux (CentOS) VM, using Jenkins to schedule the execution, and we only need to test Internet Explorer 9 at this time.
Has anyone had any luck using Wine with them?
What are my other options?
Thanks.

Hmm, consider this more of a guess than an answer:
I would investigate a Selenium Grid setup in which your main CentOS machine plays the "hub" role and a virtual machine is the "node". All you need to work out is how to make the virtual machine reachable by its IP. I think this should be possible somehow, but I do not know how to set it up.
