Scraping iframe using Selenium - selenium

I want to scrape ads in websites but many of them are dynamic and they are DOM objects. For example in
this snippet
I can get the iframe tag with Selenium, but I cannot go any further. I think it is because of the XPath: the XPath of the <html> inside the iframe is /html, which is the same as that of the main page's <html>.
This is the line of code that I use:
element = WebDriverWait(self.driver,20).until(EC.presence_of_all_elements_located((By.XPATH, '/html')))
Any suggestions?

By default, the WebDriver object operates on the top-level page it has loaded. To get the iframe's data, you have to switch to that iframe first.
driver = webdriver.Chrome(executable_path=path_chrome)
# find the frame using id, title etc.
frame = driver.find_elements_by_xpath("//iframe[@title='iframe_to_get']")
# switch the webdriver object to the iframe.
driver.switch_to.frame(frame[i])
Always remember to SWITCH BACK to the default page when iterating over iframes. Otherwise you won't be able to switch to the other iframes in the same code.
driver.switch_to.default_content()
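The switch-in and switch-back pattern above can be wrapped in a loop. The following is a minimal sketch, assuming Selenium 4's find_elements(by, value) signature; the helper name collect_iframe_sources is hypothetical, and the string "tag name" is the value of By.TAG_NAME, written out so that the snippet needs no import of its own.

```python
def collect_iframe_sources(driver):
    """Return the page source of every top-level iframe on the current page."""
    sources = []
    # "tag name" is the value of selenium's By.TAG_NAME constant.
    count = len(driver.find_elements("tag name", "iframe"))
    for index in range(count):
        # Re-locate the iframes on each pass, since references can go stale
        # if the host page re-renders while we are inside a frame.
        frames = driver.find_elements("tag name", "iframe")
        try:
            driver.switch_to.frame(frames[index])
            sources.append(driver.page_source)
        finally:
            # Always return to the top-level page before the next iframe.
            driver.switch_to.default_content()
    return sources
```

The try/finally guarantees the switch back even when reading a frame fails, so one broken iframe does not leave the driver stuck inside it.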
Update
The functions below are now deprecated, so I have updated the answer.
driver.switch_to_frame('Any frame') #deprecated
driver.switch_to_default_content() #deprecated

To switch into an iframe on a page, you should use
driver.switch_to.frame:
iframeElement = driver.find_element_by_tag_name('iframe')
driver.switch_to.frame(iframeElement)
You can now use the driver to find elements within the iframe.
To switch back out of the iframe, use driver.switch_to.default_content()

Related

Unable to switch to iframe in Selenium chromedriver when iframe is directly under body tag [duplicate]

On the portal I am testing now, I ran into the problem that I could not create any working XPath locators. After some time I figured out that it was because of a '#document' node: it cuts the path, so a simple "copy XPath" points to a completely different element.
<iframe id="FRAMENAME" src="/webclient/workspace/launch-task/REMbl?ds=BP" width="100%" height="100%" frameborder="0" data-navitemname="navitemname" style="" xpath="1">
#document
<html>
CODE....
</html>
I found that the solution is simply to add a switchTo like this:
driver.switchTo().frame("FRAMENAME");
This works and makes the rest of the code work properly, but the command takes some extra time to process before execution moves on to the next line.
So I would like to ask: is there a better solution for this? Something smarter or faster?
I am concerned that once I have lots of scripts, the execution time will become too long.
I don't use id locators, for example, because the ids are all dynamic, so sometimes an XPath is required.
Thank you!
To work with elements inside iframe you must switch to this specific iframe.
Your solution .switchTo().frame("FRAMENAME"); is correct. Selenium does not have any other ways to work with iframe wrappers.
inline frames
As per the documentation in Using inline frames, an inline frame is a construct which embeds a document into an HTML document so that the embedded data is displayed inside a subwindow of the browser's window. This does not mean full inclusion: the two documents are independent, and both of them are treated as complete documents, instead of one being treated as part of the other.
iframe structure and details
Generally, an iframe element is in the form of:
<iframe src="URL" more attributes>
alternative content for browsers which do not
support iframe
</iframe>
Browsers which support iframe display the document referred to by the URL in a subwindow, typically with vertical and/or horizontal scroll bars. Such browsers ignore the content of the iframe element (i.e. everything between the start tag <iframe...> and the end tag </iframe>). Browsers which do not support iframe (or have such support disabled) do the opposite, i.e. they process the content as if the <iframe...> and </iframe> tags were not there. Thus, the content matters, despite being ignored by some browsers.
So to summarize, inline frames do not mean an include feature, although it might sometimes serve similar purposes.
Note that, when inline frames are used, the browser (if it supports them) sends a request to the server referred to by the URL in the iframe element, and after getting the requested document displays it inside an inline frame. In this sense inline frames are a joint browser-server issue, but only the browser needs to be specifically iframe-aware; from the server's point of view, there's just a normal HTTP request for a document, and it sends the document without having (or needing) any idea on what the browser is going to do with it.
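To make the independence of the two documents concrete, here is a small sketch using only Python's standard html.parser module. The page markup, the ad.html URL, and the TagCollector class are made up for illustration. It shows that the host document contains just the iframe element and a URL; the embedded document's own elements are not part of it, which is why a locator evaluated against the host page cannot reach inside the frame.

```python
from html.parser import HTMLParser

# Hypothetical host page: the iframe only carries a URL and fallback text.
OUTER_PAGE = """
<html><body>
  <h1>Host page</h1>
  <iframe src="ad.html">fallback text</iframe>
</body></html>
"""


class TagCollector(HTMLParser):
    """Record every start tag and the src attribute of each iframe."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.iframe_sources = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "iframe":
            self.iframe_sources.append(dict(attrs).get("src"))


collector = TagCollector()
collector.feed(OUTER_PAGE)
print(collector.tags)            # ['html', 'body', 'h1', 'iframe']
print(collector.iframe_sources)  # ['ad.html']
```

Nothing from ad.html appears in the tag list: the browser fetches that document separately, and Selenium only sees it after switching into the frame.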
Something Smarter
As per the best practices while switching to an iframe you need to induce WebDriverWait as follows:
Switch through Frame Name (Java Sample Code):
new WebDriverWait(driver, 20).until(ExpectedConditions.frameToBeAvailableAndSwitchToIt(By.name("frame_name")));
Switch through iframe XPath (Python Sample Code):
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@id='ptifrmtgtframe' and @name='TargetContent']")))
Switch through iframe CssSelector (C# Sample Code):
new WebDriverWait(driver, TimeSpan.FromSeconds(20)).Until(ExpectedConditions.FrameToBeAvailableAndSwitchToIt(By.CssSelector("iframe#twitter-widget-0")));
Reference
You can find a couple of relevant discussions in:
Python: How can I select a html element no matter what frame it is in in selenium?
Java: Is it possible to switch to an element in a frame without using driver.switchTo().frame(“frameName”) in Selenium Webdriver Java?
C#: How to wait for a frame to load before locating an element?
tl; dr
Inline frames vs. normal frames

Karate Driver interaction with iframe

I am having difficulty with Karate Driver when inputting data into fields that are inside an iframe.
I have tried using XPath and CSS selectors to reach the iframe so that I could switch into it and interact with it, without any luck. I can find the iframe, but I am unsure how to switch context to it so that the scenario can continue inside the iframe, for example by inputting values into its fields.
Help Please :)
Update:
Can successfully switch into an iframe but now running into an issue with nested iframe.
* switchFrame(0)
* click('.some-checkbox')
* switchFrame(0)
Neither iframe has a good CSS selector. The second switchFrame with an index of 0 does not look into the nested iframe.
I guess you have seen the docs here: https://github.com/intuit/karate/tree/develop/karate-core#switchFrame
I admit this is very tricky. Ideally you have a proper CSS or ID selector to the frame and this is an actual working example from a test I have. Note that the waitFor() may be what you are missing, especially when the <iframe> is some slow loading bloatware.
* waitFor('.some-css-name iframe').switchFrame()
* click('.some-checkbox')
* switchFrame(null)
And unfortunately I have found that this tends to work best on driver type: chrome and chromedriver

Get the xpath of an element inside an iFrame for RobotFramework RIDE

I am trying to get the xpath of a 'Form' element by its id that is inside an iframe.
In the Chrome XPath plugin, when I query
//iframe[contains(@id,'fraModalPopup')]
it gets me the iframe, but when I try to get anything further down the hierarchy it just returns null. E.g. if I try
//iframe[contains(@id,'fraModalPopup')]/html // returns null
or
//iframe[contains(@id,'fraModalPopup')]/form[contains(@id='aspnetForm')]
// not sure if it is a valid XPath statement - also returns null
Could anyone please guide me on how to get hold of the form element? I have to use this XPath inside RIDE (Robot Framework).
An iframe is an element inside the main HTML DOM that contains its own embedded HTML DOM. You don't need to use the iframe as the context node to find the form inside the frame, but you do need to switch to that iframe
select frame id=fraModalPopup
to be able to handle elements inside the embedded HTML DOM (no need to add "//iframe" to the XPath)
xpath=//form[@id='aspnetForm']

How to work with iframe which is part of a webpage but not getting identified through webdriver?

I am trying to automate a webpage using WebDriver. Here I am stuck with an iframe, which I don't know how to handle.
While i choose css for the iframe by selecting with ,it gives me #xEditingArea
But if I then search for the same iframe using that CSS selector or the id, nothing is found.
I have tried everything.
I want to write a message in the message body, which is an iframe.
Can anyone guide me on how to handle this?
Thanks in advance.
If there is only one iframe on the page, you could try to locate it by XPath using the tag name.
Directly accessing elements inside an iframe is not possible: the iframe has its own DOM, so you have to switch to that particular frame and then perform the actions you want.
To select the iframe you want to work with, do:
driver.switchTo().frame("frame1");
Now your driver is set to work with the iframe's DOM.
It's important to remember that switching "back" may be needed. It's done like this:
driver.switchTo().defaultContent();
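Since forgetting to switch back is an easy mistake, the pairing can be enforced in code. Below is a minimal Python sketch of that idea; the helper name inside_frame is hypothetical. It relies on the switch_to.frame and switch_to.parent_frame calls of Selenium's Python bindings. Note that parent_frame moves up one level only, unlike defaultContent, which returns to the top-level page.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block, then step back out.

    frame_reference may be anything switch_to.frame accepts:
    a name or id string, an index, or a frame element.
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Step back up one level, even if the block raised an exception.
        driver.switch_to.parent_frame()
```

Used as `with inside_frame(driver, "frame1"):`, every lookup inside the block runs against the iframe's DOM, and the driver is back in the enclosing document afterwards. Because it steps up one level rather than to the top, the blocks can also be nested.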
Using Python Selenium WebDriver, you can access a frame using:
from selenium import webdriver
browser = webdriver.Firefox()
browser.switch_to.frame("frame_name_or_id")
If you want to confirm that the frame was accessed correctly, just print out the page contents using:
print(browser.page_source)

Selenium has trouble recognizing elements in nested iFrames

I am using Selenium Server to test a widget-based webpage. All of the widgets are contained within an iFrame, and each widget contains its own iFrame. I'm trying to manipulate elements within specific widgets specified by title, but Selenium seems to be unable to recognize anything past the second iFrame.
The title of the widget is within the first iFrame, and I need to use it to determine which iFrame to select next. The code that I'm using to select the frame looks something like this:
selenium.selectFrame("css=div.widgetheader:contains(TITLE)+widgetbody iFrame");
However, when I attempt to access the elements within the iFrame, Selenium is unable to locate them.
Any thoughts on how to get Selenium to recognize the elements within this second frame?
If you want to enter an iframe that is nested inside another iframe, two frame switches must be performed.
For example:
driver.switchTo().frame(driver.findElement(By.id("frameId1"))); //switching driver to first iframe
driver.switchTo().frame(driver.findElement(By.id("frameId2"))); //switching driver to second iframe
//Now, do your stuff.
To return to the top-level document, use:
driver.switchTo().defaultContent();
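The same two-step switch can be generalized to any nesting depth. The following Python sketch assumes Selenium's Python bindings (switch_to.default_content and switch_to.frame); the helper name switch_to_nested_frame is hypothetical, and the frame ids in the usage note are the ones from the Java example above.

```python
def switch_to_nested_frame(driver, frame_path):
    """Walk from the top-level document down through each frame in frame_path.

    frame_path is an ordered list, outermost frame first; each entry may be
    anything switch_to.frame accepts (name or id string, index, or element).
    """
    # Start from a known position so the path is always resolved from the top.
    driver.switch_to.default_content()
    for frame_reference in frame_path:
        driver.switch_to.frame(frame_reference)
```

For the example above this would be called as `switch_to_nested_frame(driver, ["frameId1", "frameId2"])`. Resetting to the top-level document first means the call behaves the same no matter which frame the driver was in before.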
Also, there can be situations where the iframe has no id or name you can use. In that case, you can locate it by tag name:
driver.switchTo().frame(driver.findElements(By.tagName("iframe")).get(0));