Page source in selenium

How can I get the page source of the current page?
I call driver.get(link) and land on the main page. Then I use Selenium to navigate to another page (by tag and XPath), and once I reach the right page I want to obtain its page source.
I tried driver.page_source() but I get the page source of the main page, not the current one.
from selenium import webdriver
import time

driver = webdriver.Chrome(ccc)  # ccc: path to chromedriver
driver.get('https://aaa.com')
check1 = driver.find_element_by_xpath('/html/body/div[1]/div/div[2]/button')
check1.click()
time.sleep(1)
check2 = driver.find_element_by_xpath('/html/body/div[1]/div[2]/div[2]/div[1]/div/a')
check2.click()
After check2.click() I am on a page with a new link (the link only works via the click, not directly). How can I get the page source for this new page?
I need it so I can switch from Selenium to Beautiful Soup.
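One thing worth checking first: in the Python bindings, page_source is a property, not a method, so driver.page_source() raises a TypeError; read it without parentheses. A minimal sketch of grabbing the source after the clicks and handing it to Beautiful Soup (the URL and XPaths above are the question's placeholders):

from bs4 import BeautifulSoup

# After check2.click() the driver already points at the new page,
# so page_source returns that page's HTML.
# If the click opened a new tab or window, switch to it first:
if len(driver.window_handles) > 1:
    driver.switch_to.window(driver.window_handles[-1])

html = driver.page_source  # property, no parentheses
soup = BeautifulSoup(html, 'html.parser')
print(soup.title)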


Page elements seen in the browser dev tools are not retrieved by ChromeHeadless

On this page:
https://www.bedbathandbeyond.com/store/product/o-o-by-olivia-oliver-turkish-modal-bath-towel-collection/5469128?categoryId=13434
I can see a button with the text "Add to Cart", and I can also see it in the dev tools.
But when the same page source is retrieved by headless Chrome using Selenium and my script searches for it, the text is not present.
I tried selecting "show page source" in the browser; that source did not have the "Add To Cart" text either.
I also used curl to GET the page; "Add To Cart" wasn't in the returned page source either.
What am I doing wrong?
Is the page hiding the button?
How can I check for its presence, to check product availability?
The elements you are looking for are inside the shadow DOM, which is why they don't show up in the regular page source. You need to access the shadow root first. It is hard to see exactly what is going on in the DOM without some trial and error, but something like this:
// Selenium 4+: getShadowRoot() returns a SearchContext scoped to the shadow tree
WebElement shadowHost = driver.findElement(By.cssSelector("#wmHostPdp"));
SearchContext shadowRoot = shadowHost.getShadowRoot();
WebElement addToCart = shadowRoot.findElement(By.cssSelector(".shipItBtnCont button"));
More info on Shadow DOM & Selenium — https://titusfortner.com/2021/11/22/shadow-dom-selenium.html
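For completeness, the same approach in Python, assuming Selenium 4.1+ where WebElement exposes a shadow_root property (selectors taken from the Java snippet above):

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.bedbathandbeyond.com/store/product/o-o-by-olivia-oliver-turkish-modal-bath-towel-collection/5469128?categoryId=13434")

# Find the shadow host, then search inside its shadow root.
shadow_host = driver.find_element(By.CSS_SELECTOR, "#wmHostPdp")
shadow_root = shadow_host.shadow_root
add_to_cart = shadow_root.find_element(By.CSS_SELECTOR, ".shipItBtnCont button")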

How to Get New Page Using Selenium and Chrome

Summary: When I try to go to the third page in a web site I'm trying to screen-scrape using Selenium and Chrome, I can't find any elements on the third page. I see the HTML for the 2nd page in Driver.PageSource.
Steps:
1. Navigate.GoToUrl("LoginPage.aspx").
2. Find the username and password elements.
3. SendKeys to Username and Password.
4. Find and click on the Submit/Login button.
5. The web site displays the main menu page with two link-style menu items.
6. Find the desired menu item using FindElement(By.LinkText("New Person System")).
7. Click on the link menu item. This should get me to the "Person Search" page (the 3rd page).
8. Try to wait using WebDriverWait for an element on the "Person Search" page. This fails to find the element on the new page after 5-10 seconds (see the sketch after this list).
9. Instead of using WebDriverWait, I then simply wait 5 or 10 seconds for the page to load using Thread.Sleep(5000). (I realize WebDriverWait is the better design option.)
10. Try to find the link with text "Person".
11. Selenium fails to find the link tag.
12. I see the desired "Person Search" page displayed in Chrome.
13. I see the last page's HTML in ChromeDriver.PageSource.
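For reference, here is the explicit-wait pattern step 8 refers to, sketched in Python (the question's project is VB, but the API shape is the same; the locator is a placeholder):

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Wait up to 10 seconds for the "Person" link to show up on the new page.
wait = WebDriverWait(driver, 10)
person_link = wait.until(EC.presence_of_element_located((By.LINK_TEXT, "Person")))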
Environment:
Chrome geckodriver.exe, build 7/21/2017
NUnit 3.7.1
Selenium 3.4
VB
I used IE for another project with a similar environment and didn't have a problem getting to any page (once I coded Selenium correctly).
The web site I'm trying to screen-scrape only supports recent IE versions, and I'm testing a legacy app for another project that requires IE 8, so using IE is out of the question.
Maybe I should try Firefox...

retrieving ad urls using scrapy and selenium

I am trying to retrieve the ad URLs for this website:
http://www.appledaily.com
The ad URLs are loaded using JavaScript, so a standard CrawlSpider does not work. The ads also change when you refresh the page.
I found this question here, and what I gathered is that we need to first use Selenium to load the page in the browser and then use Scrapy to retrieve the URL. I have some experience with Scrapy but none with Selenium. Can anyone show me or point me to a resource on how to write a script to do that?
Thank you very much!
EDIT:
I tried the following, but neither attempt opens the ad banner. Can anyone help?
from selenium import webdriver

driver = webdriver.Firefox()
driver.get('http://appledaily.com')
adBannerElement = driver.find_element_by_id('adHeaderTop')
adBannerElement.click()
2nd try:
adBannerElement = driver.find_element_by_css_selector("div[#id='adHeaderTop']")
adBannerElement.click()
A CSS attribute selector should not contain the # symbol: it should be div[id='adHeaderTop'], or, more concisely, div#adHeaderTop.
Actually, on observing and analyzing the site and the event you are trying to trigger, I find that the noscript tag is what should interest you. Just get the HTML source of this node, parse out the href attribute, and fetch that URL.
That will be equivalent to clicking the banner.
<noscript>
<a href="http://adclick.g.doubleclick.net/aclk%253Fsa%...</a>
</noscript>
(This is not the complete node information; just inspect the banner in Chrome and you will find this tag.)
EDIT: Here is a working snippet that gives you the URL without clicking on the ad banner, taken from the noscript tag as described above.
WebDriver driver = new FirefoxDriver();
driver.navigate().to("http://www.appledaily.com");
// The <noscript> fallback inside the ad container holds the click-through link.
WebElement objHidden = driver.findElement(By.cssSelector("div#adHeaderTop_ad_container noscript"));
if (objHidden != null) {
    String innerHTML = objHidden.getAttribute("innerHTML");
    String adURL = innerHTML.split("\"")[1];  // the href is the first quoted token
    System.out.println("** " + adURL);  // the URL you would reach by clicking the ad
} else {
    System.out.println("<noscript> element not found...");
}
Though this is written in Java, the page source won't change.
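A rough Python equivalent of the same idea, assuming the div#adHeaderTop_ad_container noscript structure is still in place:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.appledaily.com")

# Read the click-through URL out of the <noscript> fallback instead of clicking.
hidden = driver.find_element_by_css_selector("div#adHeaderTop_ad_container noscript")
inner_html = hidden.get_attribute("innerHTML")
ad_url = inner_html.split('"')[1]  # the href is the first quoted token
print("**", ad_url)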

Getting StaleElementReferenceException while trying to click menu links in a web page

I am trying to click the main menu items on seleniumhq.org, but after clicking the first link I get a StaleElementReferenceException: Element not found in the cache - perhaps the page was changed since it was looked up.
Please provide a solution to the above problem.
Below is my code:
WebDriver d = new FirefoxDriver();
d.get("http://docs.seleniumhq.org/");
d.manage().timeouts().implicitlyWait(100, TimeUnit.SECONDS);
List<WebElement> l = d.findElements(By.cssSelector("ul>li"));
for (WebElement e : l) {
    e.click();
}
Thanks in advance
If you click on a link and are taken to a different page, or even if you stay on the same page, the DOM is refreshed, and the elements you found earlier are no longer attached to it. You need to write some code to come back to the previous page if the click takes you elsewhere, and in either case you have to re-find the link on the fly each time instead of reusing e.click().
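A minimal sketch of that re-find-on-the-fly pattern in Python (the code above is Java, but the idea carries over): locate the items by index on every iteration instead of holding on to the original references.

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://docs.seleniumhq.org/")

count = len(driver.find_elements_by_css_selector("ul>li"))
for i in range(count):
    # Re-find the list on every pass: the old references go stale
    # as soon as a click changes or reloads the page.
    items = driver.find_elements_by_css_selector("ul>li")
    items[i].click()
    driver.back()  # come back to the menu page before the next click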

How to take screen shots of all page links in a web site automatically in Firefox using Selenium WebDriver?

How do I take screenshots of all page links in a web site automatically in Firefox using Selenium WebDriver?
Tools I am using:
selenium-server-standalone-2.31.0.jar
Eclipse (Juno) for Java coding
Done:
My code takes a screenshot of the home page and then clicks on the first menu item using its element ID.
I have implemented the Java code to load each link and then take the screenshot.
Problem:
After loading that first linked page, it does not take a screenshot of that page, even though the Java program is still running.
If anybody can solve this problem, it would be very helpful for me.
Hope this code works:
// Grab the screenshot into a temp file, then copy it to the target path.
File screenshot = new File("D:\\screenshot1.png");
File tmpScreenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
FileUtils.copyFile(tmpScreenshot, screenshot);
System.out.println("Screenshot saved at: " + screenshot.getAbsolutePath());
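To cover the "all page links" part of the question, a minimal sketch in Python (the snippet above is Java, but the flow is the same): collect the link URLs first, then visit each one and save a screenshot. The starting URL and file names are placeholders.

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://example.com")  # placeholder starting page

# Collect the hrefs up front so navigation doesn't invalidate the references.
urls = [a.get_attribute("href")
        for a in driver.find_elements_by_tag_name("a")
        if a.get_attribute("href")]

for i, url in enumerate(urls):
    driver.get(url)
    driver.save_screenshot("screenshot_%d.png" % i)  # one file per link

driver.quit()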