Finding a string in an URL's source code with selenium webdriver - selenium

I'm trying to extract a keyword/string from a website's source code using this python 2.7 script:
from selenium import webdriver
keyword = ['googleadservices']
driver = webdriver.Chrome(executable_path=r'C:\Users\Jacob\PycharmProjects\Testing\chromedriver_win32\chromedriver.exe')
driver.get('https://www.vacatures.nl/')
elem = driver.find_element_by_xpath("//*")
source_code = elem.get_attribute("outerHTML")
for searchstring in keyword:
    if searchstring.lower() in str(source_code).lower():
        print(searchstring, 'found')
    else:
        print(searchstring, 'not found')
The browser fortunately opens when the script runs, but I'm not able to extract the desired keywords from its source code. Any help?

As others have said, the issue isn't your code; it's simply that googleadservices isn't present in the source code.
What I want to add is that your code is a bit over-engineered, since all you seem to do is return true or false depending on whether a certain string is present in the source code.
You can achieve that much more easily with a better XPath like //script[contains(text(),'googletagmanager')], then use find_element_by_xpath and catch the possible NoSuchElementException. That might save you time, and you don't need the for loop.
There are other possibilities as well: using ExpectedConditions, or find_elements_by_xpath and then checking whether the returned list is longer than 0.
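The find_elements variant mentioned above can be sketched as a small helper (the function name is my own, not from the answer). Any object exposing find_elements_by_xpath works, so it runs against a real webdriver or a test stub:

```python
def element_present(driver, xpath):
    """Return True if at least one element matches the given XPath.

    `driver` is any object with a find_elements_by_xpath method
    (a real Selenium webdriver, or a stub when testing)."""
    return len(driver.find_elements_by_xpath(xpath)) > 0
```

Used as, e.g., element_present(driver, "//script[contains(text(),'googletagmanager')]") -- no for loop and no exception handling needed.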

I observed that googleadservices is NOT present in the web page source code.
There is NO issue with the code.
I tried with GoogleAnalyticsObject, and it is found.
from selenium import webdriver
keyword = ['googleadservices', 'GoogleAnalyticsObject']
driver = webdriver.Chrome()
driver.get('https://www.vacatures.nl/')
elem = driver.find_element_by_xpath("//*")
source_code = elem.get_attribute("outerHTML")
for searchstring in keyword:
    if searchstring.lower() in str(source_code).lower():
        print(searchstring, 'found')
    else:
        print(searchstring, 'not found')
Instead of using //* to find the source code
elem = driver.find_element_by_xpath("//*")
source_code = elem.get_attribute("outerHTML")
Use the following code:
source_code = driver.page_source
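With page_source in hand, the whole check reduces to a case-insensitive substring search. A minimal sketch (the helper name is my own; it is a plain function, so it runs without a browser):

```python
def find_keywords(source, keywords):
    """Map each keyword to True/False depending on whether it occurs
    (case-insensitively) anywhere in the page source."""
    lowered = source.lower()
    return {kw: kw.lower() in lowered for kw in keywords}
```

Called as find_keywords(driver.page_source, ['googleadservices', 'GoogleAnalyticsObject']).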

Related

Locating elements in section with selenium

I'm trying to enter text into a field (the subject field in the image) in a section using Selenium.
I've tried locating by XPath, ID, and a few others, but it looks like maybe I need to switch context to the section. I've tried the following; errors are in comments after the lines.
from time import sleep
from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options
opts = Options()
browser = Firefox(options=opts)
browser.get('https://www.linkedin.com/feed/')
sign_in = '/html/body/div[1]/main/p/a'
browser.find_element_by_xpath(sign_in).click()
email = '//*[@id="username"]'
browser.find_element_by_xpath(email).send_keys(my_email)
pword = '//*[@id="password"]'
browser.find_element_by_xpath(pword).send_keys(my_pword)
signin = '/html/body/div/main/div[2]/div[1]/form/div[3]/button'
browser.find_element_by_xpath(signin).click()
search = '/html/body/div[8]/header/div[2]/div/div/div[1]/div[2]/input'
name = 'John McCain'
browser.find_element_by_xpath(search).send_keys(name+"\n")#click()
#click on first result
first_result = '/html/body/div[8]/div[3]/div/div[1]/div/div[1]/main/div/div/div[1]/div/div/div/div[2]/div[1]/div[1]/span/div/span[1]/span/a/span/span[1]'
browser.find_element_by_xpath(first_result).click()
#hit message button
msg_btn = '/html/body/div[8]/div[3]/div/div/div/div/div[2]/div/div/main/div/div[1]/section/div[2]/div[1]/div[2]/div/div/div[2]/a'
browser.find_element_by_xpath(msg_btn).click()
sleep(10)
## find subject box in section
section_class = '/html/body/div[3]/section'
browser.find_element_by_xpath(section_class) # no such element
browser.switch_to().frame('/html/body/div[3]/section') # no such frame
subject = '//*[@id="compose-form-subject-ember156"]'
browser.find_element_by_xpath(subject).click() # no such element
compose_class = 'compose-form__subject-field'
browser.find_element_by_class_name(compose_class) # no such class
id = 'compose-form-subject-ember156'
browser.find_element_by_id(id) # no such element
css_selector= 'compose-form-subject-ember156'
browser.find_element_by_css_selector(css_selector) # no such element
wind = '//*[@id="artdeco-hoverable-outlet__message-overlay"]'
browser.find_element_by_xpath(wind) #no such element
A figure showing the developer info for the text box in question is attached.
How do I locate the text box and send keys to it? I'm new to selenium but have gotten thru login and basic navigation to this point.
I've put the page source (as seen by the Selenium browser object at this point) here.
The page source (as seen when I click in the browser window and hit 'copy page source') is here.
Despite the window in focus being the one I wanted, it seems the browser object saw things differently. Using
window_after = browser.window_handles[1]
browser.switch_to_window(window_after)
allowed me to find the element using an Xpath.
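The same idea can be written as a small helper that works out which handle was opened by comparing window_handles before and after the click (the function name is my own; newer Selenium versions spell the switch browser.switch_to.window(...)):

```python
def new_window_handle(before, after):
    """Given the window handles before and after an action that may have
    opened a window, return the newly created handle (None if none opened)."""
    diff = set(after) - set(before)
    return diff.pop() if diff else None
```

Typical use: capture before = set(browser.window_handles), click the link, then browser.switch_to.window(new_window_handle(before, browser.window_handles)).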

Find elements in webpage using Selenium in Jmeter

I want to do testing using Selenium WebDriver in JMeter. I was using By.linkText to find an element and assert whether the element exists or not.
var elements = WDS.browser.findElements(pkg.By.linkText("Tools"));
eval(elements.length != 0);
But it seems that if I replace 'Tools' with any other string, like 'asfasdsa', it will still return True, and my test passes. It seems By.linkText doesn't work in JMeter. Is there any other way to find an element in a webpage other than By.id?
Also, is this a good way to verify whether an element is present?
Selenium works just fine. I'm not sure what you're trying to do with the eval(elements.length != 0); call: it will evaluate to false, but I fail to see where and how you're using that value.
If you want to fail the WebDriver Sampler when the number of returned elements is 0, you need to do this a little bit differently; in particular, conditionally call the WDS.sampleResult.setSuccessful() function.
Suggested code change:
WDS.sampleResult.sampleStart()
WDS.browser.get('http://example.com')
var elements = WDS.browser.findElements(org.openqa.selenium.By.linkText('More information...'))
if (elements.length == 0) {
    WDS.sampleResult.setSuccessful(false)
    WDS.sampleResult.setResponseMessage('Failed to find any element matching the criteria')
}
WDS.sampleResult.sampleEnd()
The above code will pass as long as you don't change More information... to something else.
See The WebDriver Sampler: Your Top 10 Questions Answered for more WebDriver Sampler tips and tricks
You can use xpath:
Using text():
var elements = WDS.browser.findElements(pkg.By.xpath("//*[text()='Tools']"));
eval(elements.length != 0);
Using contains():
var elements = WDS.browser.findElements(pkg.By.xpath("//*[contains(., 'Tools')]"));
eval(elements.length != 0);

Count items, rows, users, etc in Katalon Studio

I am having a problem with Katalon Studio. Can I somehow count items on the page by class or something? I can do it with JavaScript, but I don't know how to do it with the Groovy language in Katalon Studio.
document.getElementsByClassName("").length
I'm trying to convert this JavaScript code into Groovy, but nothing happens.
You can also use WebUiBuiltInKeywords to findWebElements as specified in the following URL. It will return a list of elements matching the locator.
static List<WebElement> findWebElements(TestObject to, int timeOut)
// Internal method to find web elements by test object
Examples
def elements = WebUiBuiltInKeywords.findWebElements(to, 5)
println elements.size()
I think you can use the same size() method as is done for a table:
See documentation.
import org.openqa.selenium.By as By
import org.openqa.selenium.WebDriver as WebDriver
import org.openqa.selenium.WebElement as WebElement
WebDriver driver = DriverFactory.getWebDriver()
'To locate table'
WebElement Table = driver.findElement(By.xpath("//table/tbody"))
'To locate rows of table it will Capture all the rows available in the table'
List<WebElement> rows_table = Table.findElements(By.tagName('tr'))
'To calculate no of rows In table'
int rows_count = rows_table.size()
println('No. of rows: ' + rows_count)
Hope this helps you!
Do this
WebDriver driver = DriverFactory.getWebDriver()
def eleCount = driver.findElements(By.className("your-class")).size()
println eleCount //prints out the number of the elements with "your-class" class

How can I write my own XPath from the HTML code

I have the following HTML code and want an XPath for the text "Analytics & Research":
<div id="LLCompositePageContainer" class="column-wrapper">
<div id="compositePageTitleDiv">
<h1 class="page-header">Analytics & Research</h1>
</div>
I am getting the following XPath using Chrome, but that didn't work.
//*[#id="compositePageTitleDiv"]
this is my code
WebElement header = driver.findElement(By.xpath("//div[@id='LLCompositePageContainer']/div[@id='compositePageTitleDiv']/h1[@class='page-header']"));
String header2 = header.getText();
System.out.println(header2);
and I am getting the following error:
Exception in thread "main" org.openqa.selenium.NoSuchElementException:
Unable to find element with xpath ==
//div[@id='LLCompositePageContainer']/div[@id='compositePageTitleDiv']/h1[@class='page-header']
(WARNING: The server did not provide any stacktrace information)
Command duration or timeout: 10.34 seconds For documentation on this
error, please visit:
http://seleniumhq.org/exceptions/no_such_element.html
Please try to use the below xpath:
driver.findElement(By.xpath(".//div[@id='compositePageTitleDiv']/h1")).getText();
If the element is inside the iframe. Then use the below code:
// Switching to the frame
driver.switchTo().frame(<name>);
// Storing the value of the Analytics & Research
String text = driver.findElement(By.xpath(".//div[@id='compositePageTitleDiv']/h1")).getText();
// Switching back to original window
driver.switchTo().defaultContent();
Hope this helps.
This is how it can be used :
WebElement element = driver.findElement(By.xpath("//*[@id='compositePageTitleDiv']"));
Or in case it is nested, can be accessed like this as well
WebElement element = driver.findElement(By.xpath("//html/body/div[3]/div[3]/"));
this is just a rough syntax.
No need to use XPath here if you can simply locate the element using By.id(). Assuming you are using Java, you should try as below :-
WebElement el = driver.findElement(By.id("compositePageTitleDiv"));
String text = el.getText();
Edited :- If the element is not found, it may be a timing issue; you need to use WebDriverWait to wait until the element is visible on the page, as below :-
WebDriverWait wait = new WebDriverWait(webDriver, implicitWait);
WebElement el = wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("compositePageTitleDiv")));
String text = el.getText();
Note :- if your element is inside any frame, you need to switch to that frame before finding the element :- driver.switchTo().frame("your frame name or id");
Hope it helps..:)
You can also use
//div[@id='LLCompositePageContainer']/div[@id='compositePageTitleDiv']/h1[contains(text(),'Analytics')]
This is the best way to reach the specific web element; using contains() minimizes the chance of error.
The correct xpath is
//div[@id='LLCompositePageContainer']/div[@id='compositePageTitleDiv']/h1[@class='page-header']
But you could have found your answer easily with some research on Google...

Scrapy Running Results

Just getting started with Scrapy, I'm hoping for a nudge in the right direction.
I want to scrape data from here:
https://www.sportstats.ca/display-results.xhtml?raceid=29360
This is what I have so far:
import scrapy
import re
class BlogSpider(scrapy.Spider):
    name = 'sportstats'
    start_urls = ['https://www.sportstats.ca/display-results.xhtml?raceid=29360']

    def parse(self, response):
        headings = []
        results = []
        tables = response.xpath('//table')
        headings = list(tables[0].xpath('thead/tr/th/span/span/text()').extract())
        rows = tables[0].xpath('tbody/tr[contains(@class, "ui-widget-content ui-datatable")]')
        for row in rows:
            result = []
            tds = row.xpath('td')
            for td in enumerate(tds):
                if headings[td[0]].lower() == 'comp.':
                    content = None
                elif headings[td[0]].lower() == 'view':
                    content = None
                elif headings[td[0]].lower() == 'name':
                    content = td[1].xpath('span/a/text()').extract()[0]
                else:
                    try:
                        content = td[1].xpath('span/text()').extract()[0]
                    except:
                        content = None
                result.append(content)
            results.append(result)
        for result in results:
            print(result)
Now I need to move on to the next page, which I can do in a browser by clicking the "right arrow" at the bottom, which I believe is the following li:
<li><a id="mainForm:j_idt369" href="#" class="ui-commandlink ui-widget fa fa-angle-right" onclick="PrimeFaces.ab({s:"mainForm:j_idt369",p:"mainForm",u:"mainForm:result_table mainForm:pageNav mainForm:eventAthleteDetailsDialog",onco:function(xhr,status,args){hideDetails('athlete-popup');showDetails('event-popup');scrollToTopOfElement('mainForm\\:result_table');;}});return false;"></a>
How can I get scrapy to follow that?
If you open the URL in a browser without JavaScript, you won't be able to move to the next page. As you can see inside the li tag, there is some JavaScript to be executed in order to get the next page.
To get around this, the first option is usually to try to identify the request generated by the JavaScript. In your case, it should be easy: just analyze the JavaScript code and replicate it with Python in your spider. If you can do that, you can send the same request from Scrapy. If you can't, the next option is usually to use some package with JavaScript/browser emulation, something like ScrapyJS or Scrapy + Selenium.
You're going to need to perform a callback. Generate the URL from the XPath of the 'next page' button, so url = response.xpath(<xpath to next_page_button>), and then, when you're finished scraping that page, do yield scrapy.Request(url, callback=self.parse_next_page). Finally, create a new function def parse_next_page(self, response):.
A final note: if the content happens to be rendered by JavaScript (and you can't scrape it even if you're sure you're using the correct XPath), check out my repo on using Splash with Scrapy: https://github.com/Liamhanninen/Scrape
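As a rough sketch of the callback pattern described in the answers above: scrapy.Request is replaced here by a plain dict so the snippet runs without scrapy installed, and the next-page XPath is a placeholder assumption, not taken from sportstats.ca (whose next-page link is JavaScript-driven, as noted).

```python
def parse(response):
    """Scrape one results page, then schedule the next page.

    Inside a real spider method you would write
        yield scrapy.Request(next_url, callback=self.parse)
    instead of yielding the stand-in dict below."""
    # 1. scrape the rows of the current page and yield items here (omitted)

    # 2. locate the next-page link and schedule a follow-up request
    next_url = response.xpath("//li/a[contains(@class, 'fa-angle-right')]/@href").get()
    if next_url:
        yield {"follow": next_url, "callback": "parse"}  # stand-in for scrapy.Request
```

The response.xpath(...).get() call mirrors Scrapy's selector API; here it only needs an object providing that shape, which is what makes the pattern easy to exercise without a network.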