Trying to automate the Goibibo hotel search. After clicking the Search button, the browser navigates to the results page, but the page never finishes loading: the hotel options and other options are not displayed, and it just keeps showing a processing/loading indicator. What could be the possible reason?
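If the hotel cards are fetched by background XHR calls after navigation, the page "load" event can fire long before they exist in the DOM, so an explicit wait for the results is the usual first check. A minimal sketch of that idea in Python Selenium (all locators here are hypothetical placeholders, since the original script isn't shown):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.goibibo.com/hotels/")

# ... fill in destination and dates here ...

# Hypothetical selector for the search button
driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()

# Wait explicitly for the results to render instead of relying on page load alone
wait = WebDriverWait(driver, 30)
hotels = wait.until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".hotel-card"))  # hypothetical selector
)
print(len(hotels), "hotel cards rendered")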
On this page:
https://www.bedbathandbeyond.com/store/product/o-o-by-olivia-oliver-turkish-modal-bath-towel-collection/5469128?categoryId=13434
I can see a button with the "Add to Cart" text, and I can also see it in dev tools.
But when the same page source is retrieved by headless Chrome using Selenium and my script searches for it, the text is not present.
I tried selecting "view page source" in the browser; that source did not have the "Add To Cart" text.
I also used curl to GET the page; "Add To Cart" wasn't in the returned page source either.
What am I doing wrong?
Is the page hiding the button?
How can I check for its presence, as a product availability check?
The elements you are looking for are inside the shadow DOM, so you need to access the shadow root first. It's hard to see exactly what is going on in the DOM without some trial and error, but it's something like this:
// The button lives inside a shadow tree; find its host element first
WebElement shadowHost = driver.findElement(By.cssSelector("#wmHostPdp"));
// getShadowRoot() requires Selenium 4+
SearchContext shadowRoot = shadowHost.getShadowRoot();
// Query within the shadow root, not the main document
WebElement addToCart = shadowRoot.findElement(By.cssSelector(".shipItBtnCont button"));
More info on Shadow DOM & Selenium — https://titusfortner.com/2021/11/22/shadow-dom-selenium.html
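If the script in question is Python rather than Java, a rough equivalent is below. This assumes Selenium 4.1+ with a Chromium-based browser (where a WebElement exposes a shadow_root property) and reuses the same unverified selectors from above:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.bedbathandbeyond.com/store/product/o-o-by-olivia-oliver-turkish-modal-bath-towel-collection/5469128?categoryId=13434")

# Locate the shadow host, then search inside its shadow root
shadow_host = driver.find_element(By.CSS_SELECTOR, "#wmHostPdp")
shadow_root = shadow_host.shadow_root  # Selenium 4.1+, Chromium drivers only
add_to_cart = shadow_root.find_element(By.CSS_SELECTOR, ".shipItBtnCont button")
print(add_to_cart.text)

This also explains the curl result: curl only fetches the raw HTML, and shadow DOM content built by JavaScript never appears there.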
I have been trying to scrape an Angular website using Selenium. To my surprise, it doesn't let you scrape the HTML-rendered contents, as the page renders them dynamically using JavaScript. I want to locate those tags for the purpose of scraping, but I am unable to do so. What is the right way to scrape them? Here is some more context:
They say you can't do it using Python.
Some also tried downloading all the HTML content and then reading it, but again this isn't my use case.
My use case is a lot different:
I want to log in to my Google account, which redirects me to an Angular page where I click a button called Reporting; from there I am redirected to a page where I finally click a Download button to download the report.
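A minimal sketch of that flow in Python Selenium with explicit waits (every locator below is a hypothetical placeholder; the real ones would come from the page's dev tools). Angular inserts elements after page load, so each step waits for its target to become clickable rather than assuming it is already in the DOM:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 30)

driver.get("https://accounts.google.com/")
# ... complete the Google login here ...

# Wait for the Angular-rendered button to exist and be clickable
reporting = wait.until(
    EC.element_to_be_clickable((By.XPATH, "//button[text()='Reporting']"))  # hypothetical
)
reporting.click()

download = wait.until(
    EC.element_to_be_clickable((By.XPATH, "//button[text()='Download']"))  # hypothetical
)
download.click()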
I am trying to load the information about each stock page on investing.com, starting from the drop-down list "Dow Jones Industrial Average" on the page investing.com/equities.
I have been thinking about using scrapy with
options = response.css("select[class=stocksFilter] option[id=166]")
but this does not simulate a selection action.
After the selection action, I will go through the table items one by one in #cross_rate_markets_stocks_1 and crawl those equity pages recursively.
Can you point out how to simulate a click action?
The selection action is user interaction with the browser UI, but scrapy doesn't render the webpage, so we cannot simulate user interaction or run JavaScript with it. However, if you're interested in crawling by simulating user interaction, Selenium might be a good tool for you.
Back to the question: if we are to crawl with scrapy, we should focus on the requests and responses sent to and by the target website; you can log them in your browser's Developer Tools. With the Developer Tools open, click the dropdown menu, and you can see that the corresponding request is sent to this URL:
https://cn.investing.com/equities/StocksFilter?noconstruct=1&smlID=0&sid=&tabletype=price&index_id=166
It's a GET request with index_id set to the selected stock's ID. You can get the stock IDs and names from the HTML of https://investing.com/equities:
XPath of stock ID: //*[@id="stocksFilter"]/option/@id
XPath of stock name: //*[@id="stocksFilter"]/option/text()
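Putting that together, a rough scrapy sketch of the approach (the selectors used to parse the StocksFilter response and the equity pages are assumptions, since that markup isn't shown here):

import scrapy

class EquitiesSpider(scrapy.Spider):
    name = "equities"
    start_urls = ["https://www.investing.com/equities"]

    def parse(self, response):
        # Collect every stock index ID from the dropdown, then issue the
        # same GET request the browser sends when an option is selected
        for stock_id in response.xpath('//*[@id="stocksFilter"]/option/@id').getall():
            url = (
                "https://cn.investing.com/equities/StocksFilter"
                f"?noconstruct=1&smlID=0&sid=&tabletype=price&index_id={stock_id}"
            )
            yield scrapy.Request(url, callback=self.parse_stock_table)

    def parse_stock_table(self, response):
        # Assumed structure: the response contains the #cross_rate_markets_stocks_1
        # table with a link per equity; follow each link recursively
        for href in response.xpath('//*[@id="cross_rate_markets_stocks_1"]//a/@href').getall():
            yield response.follow(href, callback=self.parse_equity)

    def parse_equity(self, response):
        # Placeholder extraction; real fields depend on the equity page layout
        yield {"url": response.url, "title": response.css("h1::text").get()}

Each yielded Request is scheduled and crawled in turn, so the recursion through the equity pages falls out of scrapy's normal request flow.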
Summary: When I try to go to the third page of a web site I'm trying to screen scrape using Selenium and Chrome, I can't find any elements on the third page.
I see the HTML for the 2nd page in Driver.PageSource.
Steps:
Navigate.GoToUrl("LoginPage.aspx")
Find username and password elements.
SendKeys to Username and Password.
Find and click on Submit\Login button.
Web Site displays main menu page with two Link style menu items.
Find desired menu item using FindElement(By.LinkText("New Person System")).
Click on link menu item. This should get me to the "Person Search" page (the 3rd page).
Try to wait using WebDriverWait for an element on the "Person Search" page. This fails to find the element on the new page after 5-10 seconds.
Instead of using WebDriverWait, I then simply wait 5 or 10 seconds for the page to load using Thread.Sleep(5000). (I realize WebDriverWait is the better design option.)
Try to find a link with the text "Person".
Selenium fails to find the link tag.
I see desired "Person Search" page displayed in Chrome.
I see the previous page's HTML in ChromeDriver.PageSource.
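For reference, a Python sketch of the equivalent flow with an explicit wait (the URL and all locators are placeholders reconstructed from the steps above). One thing worth checking with this symptom: if the menu link opens a new window or tab, the driver keeps pointing at the old window, so PageSource keeps showing the previous page until you switch handles:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)

driver.get("https://example.com/LoginPage.aspx")              # placeholder host
driver.find_element(By.ID, "username").send_keys("user")      # placeholder locator
driver.find_element(By.ID, "password").send_keys("secret")    # placeholder locator
driver.find_element(By.ID, "loginButton").click()             # placeholder locator

wait.until(EC.element_to_be_clickable((By.LINK_TEXT, "New Person System"))).click()

# If the menu link opened a new window/tab, page_source still reflects the
# old window until the driver switches to the newest handle
driver.switch_to.window(driver.window_handles[-1])

person_link = wait.until(EC.presence_of_element_located((By.LINK_TEXT, "Person")))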
ChromeDriver (chromedriver.exe). Build 7/21/2017.
NUnit 3.7.1
Selenium 3.4
VB
I used IE for another project with similar environment. Didn't have a problem getting to any page (once I coded Selenium correctly).
The web site I'm trying to screen scrape only supports recent IE version. I'm testing a legacy app for another project that requires IE 8. So using IE is out of the question.
Maybe I should try Firefox...
Is it possible to use Selenium so that my code and the browser are integrated, so that I get the updated HTML page every time I make any change on the web page in the browser?
In other words, I would like to run my app, which automatically starts a browser, and every time I make a change on the web page, Selenium automatically gets the changed HTML in my Java/Python code. Selecting a dropdown item might be a good example.
Thanks!
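A minimal sketch of the idea in Python: driver.page_source re-reads the live DOM on every call, so reading it right after a programmatic action, or polling it to catch manual changes, yields the updated HTML. The URL and dropdown locator below are hypothetical placeholders:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select

driver = webdriver.Chrome()
driver.get("https://example.com/form")  # placeholder URL

# Programmatic change: select a dropdown item, then re-read the DOM
Select(driver.find_element(By.NAME, "country")).select_by_visible_text("Canada")  # placeholders
html_after_select = driver.page_source

# Manual changes made in the browser window can be caught by polling
last = None
for _ in range(30):  # watch the page for roughly 30 seconds
    current = driver.page_source
    if current != last:
        print("DOM changed, new length:", len(current))
        last = current
    time.sleep(1)

Polling is crude, but WebDriver itself exposes no "page changed" event, so some variant of re-reading page_source after each interaction is the usual approach.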