Set locale/region while crawling - selenium

I am trying to crawl information from this website on an AWS machine. Because the machine is hosted in the US, the site gives me the product price in USD. How can I get the price in INR, the way I see it when I crawl from my local machine?
I normally use Scrapy to crawl the information, but I am open to using Selenium or any other tool.
I tried using Selenium and setting the browser locale to "en-IN", but that did not help.

I'd use Tor: you can set up a proxy in Selenium and select the desired exit node in Tor.
Changing the locale only helps in some cases, because it's more likely that the site determines your region from your IP address.
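As a rough sketch of that suggestion, assuming a local Tor client listening on its default SOCKS port (9050) and Chrome as the browser: the helper names below are made up for illustration, `ExitNodes` and `StrictNodes` are torrc options, and `--proxy-server` is a Chrome command-line switch. Whether the site then shows INR depends on how it geolocates visitors, and exit nodes in a given country may be scarce.

```python
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050  # Tor's default SOCKS port; adjust if your torrc differs


def torrc_lines(country_code):
    """Return torrc lines that restrict Tor exit nodes to one country."""
    return [
        f"ExitNodes {{{country_code.lower()}}}",  # e.g. "ExitNodes {in}"
        "StrictNodes 1",  # do not fall back to exits in other countries
    ]


def chrome_proxy_argument(host=TOR_SOCKS_HOST, port=TOR_SOCKS_PORT):
    """Return the Chrome switch that routes traffic through the Tor SOCKS proxy."""
    return f"--proxy-server=socks5://{host}:{port}"


def make_driver():
    """Start Chrome behind Tor (needs selenium, Chrome and a running Tor)."""
    from selenium import webdriver  # imported here so the helpers work without it

    options = webdriver.ChromeOptions()
    options.add_argument(chrome_proxy_argument())
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    # Lines to add to torrc before restarting Tor.
    print("\n".join(torrc_lines("IN")))
    print(chrome_proxy_argument())
```

After adding the printed lines to torrc and restarting Tor, `make_driver()` should give a browser whose requests leave through an exit node in the chosen country.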

Related

Selenium test not loading some specific URLs

I am using Selenium through Python on an AWS Linux server. When the test starts, it doesn't load the page. The strange thing is that if I run the test against a URL from Google or Facebook, it works. I used the curl and links commands to check that I have access from the server, and they work, so I'm not sure what the issue could be.
Any help is appreciated.

Using Scrapy Spider results for a website

I've experimented with some crawlers to pull web data from within a Python environment on my local machine. Ideally, I'd like to host a website that can initiate crawlers to aggregate content and display that on the site.
My question is, is it possible to do this from a web environment and not my local machine?
Sure, there are many services that do the task you describe.
Scrapinghub is the best example you can get: https://scrapinghub.com/
You can deploy your spiders there and run them periodically (paid service). Deploy and call the spider via the Scrapinghub API from your website, then use the spider output in your hosted website.
You can also achieve the same thing on your own server, with the website triggering the spider through an API call.
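For the self-hosted variant, one possible sketch (not the Scrapinghub API) is to have the web backend launch the spider as a subprocess with the `scrapy crawl` command and read the exported file afterwards. The function names, spider name, and paths are hypothetical, and a real site would likely run this in a background job queue instead of inside a request handler.

```python
import subprocess


def build_crawl_command(spider_name, output_path):
    """Return the argument list for running one spider and exporting its items."""
    # "-o" appends scraped items to the given file; the format follows the extension.
    return ["scrapy", "crawl", spider_name, "-o", output_path]


def run_spider(spider_name, output_path, project_dir):
    """Run the spider from the Scrapy project directory and return the exit code."""
    command = build_crawl_command(spider_name, output_path)
    completed = subprocess.run(
        command,
        cwd=project_dir,  # must be the directory containing scrapy.cfg
        capture_output=True,
        text=True,
    )
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical spider name and output path, shown for illustration only.
    print(build_crawl_command("prices", "items.json"))
```

An API endpoint on the site could call `run_spider(...)` and then load the exported file to display the results.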

Is it possible to use Selenium from within a web app?

I am building a web site in Django that would scrape data from some site, so people could enter the site, set custom data filters and view scraped data in friendly format.
The problem is that requests and beautiful soup modules will not be enough for the scraping purposes, since I will also need some automation to be done (loading javascript or clicking buttons).
Since Selenium requires a webdriver to be downloaded and put on the path, is it possible to use it from within a web app? For example, by hosting the webdriver somewhere?
I am also open to solutions other than Selenium, if there are any.
I think what you want is a Selenium Grid server.
https://www.seleniumhq.org/docs/07_selenium_grid.jsp
Basically you host it on some remote server and then you can connect to it and spin up web drivers remotely and use them in code as needed. It also comes with a handy interface for checking on current browser instances and even taking screenshots or executing scripts from the web ui.
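A minimal sketch of connecting to such a grid from Python, assuming the hub listens on port 4444 (the Grid default) and has a Firefox node registered. The hostname and helper names are placeholders; the `/wd/hub` path is the Grid 3 endpoint, and newer Grid versions may also accept the bare host and port.

```python
def grid_url(host, port=4444):
    """Build the hub endpoint URL that remote WebDriver sessions connect to."""
    return f"http://{host}:{port}/wd/hub"


def open_remote_browser(host):
    """Start a Firefox session on the grid (needs selenium and a running hub)."""
    from selenium import webdriver  # imported here so grid_url works without it

    options = webdriver.FirefoxOptions()
    return webdriver.Remote(command_executor=grid_url(host), options=options)


if __name__ == "__main__":
    # "grid.example.internal" is a placeholder hostname.
    print(grid_url("grid.example.internal"))
```

The web app would call `open_remote_browser(...)`, drive the page as usual, and call `quit()` on the driver when done so the grid slot is released.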

Selenium testing over multiple domains

I'm pretty new to Selenium. I recently created a bunch of tests with Selenium IDE, and I want to run them through a .bat script against a Selenium standalone server so I can test in IE, Firefox, etc.
When running the tests in Firefox everything goes well and they pass. Internet Explorer (8) is another story. The test suite uses localhost as the domain to test against.
But here's the tricky part: I have a static content provider, running on a different domain than localhost, where my images, CSS and JavaScript are hosted. How can I tell Selenium Server that it's OK to use multiple domains?
I know it is disabled because of the same-origin policy; however, Firefox runs it without problems and shows the correct CSS rules and images.
Well, you would need to use Selenium WebDriver (or the older Selenium RC) to do all that you just listed.
Here is the page which will help you get started.

Selenium: Is there a way to change the hosts file on the machine the server is running on

I want to make the browser open a local URL under the name of the live URL. Meaning that when I do:
sel.open('http://live-url/')
Selenium will actually open the local URL.
One would normally test this by changing the machine's hosts file, but this is impractical when running on many machines.
Any ideas?
No
Selenium cannot change the hosts file, as it can only interact with pages rendered inside a browser.
You could probably set your CI server up to do something like this, but again I have to ask why. Hacking around with a site and then testing it will surely invalidate your tests?