Having trouble getting XPath to select a Next button - scrapy

I'm trying to crawl this gem website:
https://www.irocks.com/search?_token=q57It5iOxH0R1TpCusPK781faIVHprh47BexHVkM&code=&collection=&description=&interval=&locality=&max=&min=&mode=advanced&name=&operator=%3E%3D&query=&species=&status%5B0%5D=available&status%5B1%5D=on-hold
There's been some weird stuff going on and I can't figure out how to get certain elements like the href in the Next button.
For example,
response.xpath('//section') yields:
[<Selector xpath='//section' data='<section class="specimen-details">\n\t<...'>,
<Selector xpath='//section' data='<section class="specimen-related hidd...'>,
<Selector xpath='//section' data='<section class="shows hidden-print">\n...'>,
<Selector xpath='//section' data='<section class="blog hidden-print">\n ...'>,
<Selector xpath='//section' data='<section class="navigation">\n ...'>]
But when I look in the console I see an additional <section class="specimen-list"> that does not show up there and contains the navigation buttons within it. I'm not sure what's going on. Any help or advice appreciated!

The XPath to get the href of the next page is //a[@rel="next"]/@href
So you can simply do
response.xpath('//a[@rel="next"]/@href').get()
or, using a CSS selector,
response.css('a[rel="next"]::attr(href)').get()
The get() method is available in newer versions of Scrapy; if it doesn't work in yours, use extract_first() instead.
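As a rough, self-contained illustration of what the rel="next" query selects, here is a sketch that uses only Python's standard library instead of Scrapy. The sample markup, the BASE_URL value, and the next_page_url helper are assumptions made up for this example and are not taken from the site. In a real spider the extracted href would typically be passed to response.urljoin() or response.follow().

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical, well-formed stand-in for the pagination markup.
SAMPLE = """
<section class="specimen-list">
  <ul class="pagination">
    <li><a rel="prev" href="/search?page=1">Previous</a></li>
    <li><a rel="next" href="/search?page=3">Next</a></li>
  </ul>
</section>
"""

BASE_URL = "https://www.irocks.com/"  # assumed base for resolving relative links


def next_page_url(markup, base_url):
    """Return the absolute URL of the rel="next" link, or None if absent."""
    root = ET.fromstring(markup)
    # ElementTree supports the [@attr='value'] predicate used in the answer.
    link = root.find(".//a[@rel='next']")
    if link is None:
        return None
    return urljoin(base_url, link.get("href"))


print(next_page_url(SAMPLE, BASE_URL))
```

Returning None when no link is found gives the crawl a natural stopping condition on the last page.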

Related

How can I get an element without id, name and class attributes in red box using By?

<div class="bInputTab">
<ul>
<li class="onNow">网银支付</li>
<li>账号支付</li>
</ul>
</div>
How can I get the element in the red box using By?
Many thanks!
Try the following XPath:
//a[@onClick='On click Value']
You could use XPath or tagName; it all depends on the HTML structure. You can find a nearby element and navigate from there:
//li[@class='onNow']/following-sibling::li[1]/a
If it's the only link in the DOM: driver.findElement(By.tagName("a"));
Hope this helps,
This XPath should work:
//li[@class='onNow']/following-sibling::li[1]/a
The link text should work as well:
driver.FindElement(By.LinkText("账号支付"));
Actually, I did not show the key part of the HTML structure: the element is not in the default frame. After I added WDS.browser.switchTo().frame("frame_main") to the code, it worked.
Thanks for all of your help.
The reference is The WebDriver Sampler: Your Top 10 Questions Answered
1. The element's locator is invalid
2. The element belongs to another frame
3. The element is not available in DOM yet
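The fix described above (switch into the frame first, then locate the element) can be sketched in Python. The find_in_frame helper is hypothetical; it only assumes a driver object that exposes switch_to.frame(), switch_to.default_content() and find_element(), which is the shape of Selenium's Python bindings. The frame name "frame_main" comes from the post.

```python
def find_in_frame(driver, frame_name, by, value):
    """Locate an element that lives inside a named frame.

    `driver` is assumed to behave like a Selenium WebDriver.
    The driver is left focused on the frame so the returned element
    stays usable; call driver.switch_to.default_content() when done.
    """
    driver.switch_to.default_content()   # start from the top-level document
    driver.switch_to.frame(frame_name)   # e.g. "frame_main"
    return driver.find_element(by, value)
```

With Selenium installed, a call might look like find_in_frame(driver, "frame_main", "xpath", "//li[@class='onNow']/following-sibling::li[1]/a"), where the string "xpath" is the value of By.XPATH.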

How to get the xpath for the span button

I am new to Selenium IDE and am trying to get the target right.
I have tried many combinations to get the right element when there is a span inside the button. I need the XPath for the "Read More" button. Can someone please advise what the target in the IDE should be?
Here is the code:
<div class="is_centered l_bottom_pad">
<a class="btn_teal_outline has_arrow" href="https://test.com/jobs/view.php?id=8">
<span>Read More</span>
</a>
</div>
In some browsers (I think it was mainly MSIE) it is necessary to address the <a> element, not its child <span>, in order to click a button or link. So you should address:
//a[span[text()='Read More']]
Or you go directly for LinkText ("Read More") instead of XPath!
Targeting a span button is very simple. Just see which attribute is unique in that element.
I feel the button text itself is unique. Try this one:
xpath=.//span[text()='Read More']

how to locate element with selenium webdriver for below html

I have an issue clicking on the below HTML:
<div id="P7d2205a39cb24114b60b80b3c14cc45b_1_26iT0C0x0" style="word-wrap:break-word;white-space:pre-wrap;font-weight:500;" class="Ab73b430b430a49ebb0a0e8a49c8d71af3"><a tabindex="1" style="cursor:pointer;" onclick="var rp=$get('ctl00_ContentPlaceHolder1_ReportViewer1_ctl10_ReportControl');if(rp&&rp.control)rp.control.InvokeReportAction('Toggle','26iT0C0x0');return false;" onkeypress="if(event.keyCode == 13 || event.which == 13){var rp=$get('ctl00_ContentPlaceHolder1_ReportViewer1_ctl10_ReportControl');if(rp&&rp.control)rp.control.InvokeReportAction('Toggle','26iT0C0x0');}return false;"><img border="0" src="/Reserved.ReportViewerWebControl.axd?OpType=Resource&Version=10.0.30319.1&Name=Microsoft.ReportingServices.Rendering.HtmlRenderer.RendererResources.TogglePlus.gif" alt="+"></a> 2013</div>
I have used the script below to click the anchor inside the div tag. In the above HTML the id is not fixed; only its ending (here "26iT0C0x0") is fixed. The script that I have used is:
WebElement e1=wait.until(ExpectedConditions.elementToBeClickable(By.xpath("//div[ends-with(@id,'26iT0C0x0')]/a")));
e1.click();
You can use the 'contains' function within an XPath lookup:
driver.findElement(By.xpath("//div[contains(@id,'26iT0C0x0')]"));
I would recommend considering a CSS selector as an alternative, since CSS selectors tend to be faster than XPath.
'contains' on an attribute is written '*=' in CSS (and 'ends with' is '$='). For example,
if we want to find an element whose attribute contains 'CSS', such as <htmlTag A="blablaCSS" >, we do the following:
String CSSselector="htmlTag[A*='CSS']";
and you get the element you searched for.
So for your example the CSS selector would be:
String cssSearched="div[id*='26iT0C0x0'] a";
Also try clicking not on the link (a)
but on the parent div:
String cssSearched="div[id*='26iT0C0x0']";
driver.findElement(By.cssSelector(cssSearched));
Hope this works for you.
As Mark Rwolands already mentioned, the XPath function 'ends-with()' isn't supported in Selenium 2.
Also, if you consider using ChromeDriver in the future, I would recommend clicking the image, not the anchor; see:
https://sites.google.com/a/chromium.org/chromedriver/help/clicking-issues
edit:
Also, your IDs look generated. I wouldn't rely on them for a stable test environment.
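On the 'ends-with()' point: it is an XPath 2.0 function, while browser XPath engines implement XPath 1.0, which is why the original locator fails. One common XPath 1.0 workaround compares the tail of the attribute using substring() and string-length(). The sketch below only builds the expression string; the xpath_ends_with helper name is made up for illustration.

```python
def xpath_ends_with(attribute, suffix):
    """Build an XPath 1.0 predicate equivalent to ends-with(@attribute, suffix)."""
    if "'" in suffix:
        # Keep the sketch simple: no escaping of embedded single quotes.
        raise ValueError("suffix must not contain a single quote")
    return (
        f"substring(@{attribute}, string-length(@{attribute}) "
        f"- string-length('{suffix}') + 1) = '{suffix}'"
    )


# Locator for the anchor inside the div whose id ends with the fixed part.
locator = f"//div[{xpath_ends_with('id', '26iT0C0x0')}]/a"
print(locator)
```

The resulting string could be passed to By.xpath(...) in place of the ends-with() version. If the suffix is all that identifies the element, this is stricter than contains(), which would also match the text in the middle of an id.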

Handle elements that have changing ids all the time through Selenium Webdriver

I am running a script to automate test cases and have run into this unusual problem.
I identified elements by their IDs and used those IDs for clicking and similar actions. However, all of a sudden these IDs have changed and the script no longer works.
Another weird thing is that the IDs match the script when inspected in Chrome but are different in the Firefox driver browser.
Firebug for test driver: -
<p class="description" onclick="selectElementTextListForIE(this,'tile29', 'tile19');selectElementTextList(this,'tile29', '')" id="tile29_span_0_0">
Platinum
</p>
Chrome inspector for same element: -
<p class="description" onclick="selectElementTextListForIE(this,'tile20', 'tile19');selectElementTextList(this,'tile20', '')" id="tile20_span_0_0">
Platinum
</p>
Also, what would be the best strategy for locating elements whose IDs are generated at run time?
I even tried using XPath, but that too contains a reference to the id,
e.g. @id="tile276_input"
Any help will be appreciated.
Thanks.
Abhishek
You can utilize CSS for this. Your element looks like this:
<* id="tile276_input" />
What you need to do is find out what is changing about it. I assume it's the number in between. If it is, then your selector would look something like:
By.cssSelector("*[id^='tile'][id$='input']")
This will look for anything that has an ID that "starts with tile" and "ends with input". In our case, "tile276_input" matches that.
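To make the "starts with" / "ends with" matching concrete, here is a small standard-library sketch that applies the same two tests to element ids. It only mimics the logic of the ^= and $= attribute selectors and is not how a browser evaluates them; the sample markup and the ids_matching helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the numeric part of each id changes between runs.
SAMPLE = """
<form>
  <input id="tile276_input" />
  <p id="tile29_span_0_0">Platinum</p>
  <input id="other_input" />
</form>
"""


def ids_matching(markup, prefix, suffix):
    """Return ids that start with `prefix` and end with `suffix`."""
    root = ET.fromstring(markup)
    return [
        el.get("id")
        for el in root.iter()
        if el.get("id", "").startswith(prefix)
        and el.get("id", "").endswith(suffix)
    ]


print(ids_matching(SAMPLE, "tile", "input"))
```

Only the id that satisfies both conditions is returned, so "tile29_span_0_0" (right prefix, wrong suffix) and "other_input" (right suffix, wrong prefix) are both rejected.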
See this article if you want more information
You can also try contains() and starts-with() for such things:
driver.findElement(By.xpath("//*[contains(@id,'tile')]"))
or
driver.findElement(By.xpath("//*[starts-with(@id,'tile')]"))
WebElement element = driver.findElement(By.cssSelector("[id^='tile']"));
Or
WebElement element = driver.findElement(By.cssSelector("[id*='tile']"));
You can use this element to perform the desired actions.

CSS locator for corresponding xpath for selenium

Part of the HTML of the webpage I'm testing looks like this:
<div id="twoWideCallouts">
<div class="callout">
<a target="_blank" href="http://facebook.com">Facebook</a>
</div>
<div class="callout last">
<a target="_blank" href="http://youtube.com">Youtube</a>
</div>
I have to check with Selenium that when I click on the text, the URL that opens is the one given in the href and not an error page.
Using XPath I've written the following command
//i is iterator
selenium.getAttribute("//div[contains(@class, 'callout')]["+i+"]/a/@href")
However, this is very slow and doesn't work for some of the links. From reading many answers and comments on this site I've learned that CSS locators are faster and cleaner to maintain, so I wrote it again as
css = div:contains(callout)
Firstly, I'm not able to reach to the anchor tag.
Secondly, this page can have any number of divs with class = callout. Using xpathcount I can get the count of these, and I'll iterate over that count and perform the href check. How can something similar be done using a CSS locator?
Any help would be appreciated.
EDIT
I can click on the link using the locator css=div.callout a, but when I try to read the href value using
String str = "css=div.callout a[href]";
selenium.getAttribute(str);
I get the error "element not found". The console output is given below.
19:12:33.968 INFO - Command request: getAttribute[css=div.callout a[href], ] on session
19:12:33.993 INFO - Got result: ERROR: Element css=div.callout a[href not found on session
I tried to get the href attribute using XPath like this
"xpath=(//div[contains(@class, 'callout')])["+1+"]/a/@href" and it worked fine.
Please tell me what should be the corresponding CSS locator for this.
It should be -
css = div:contains(callout)
Did you notice the ":" instead of the "." you used?
For CSSCount this might help -
http://www.eviltester.com/index.php/2010/03/13/a-simple-getcsscount-helper-method-for-use-with-selenium-rc/
On a different note, did you see proposal of new selenium site on area 51 - http://area51.stackexchange.com/proposals/4693/selenium.
To read the attribute I used css=div.callout a@href and it worked. The problem was the square brackets around the attribute name.
For the first part of your question, anchor your identifier on the hyperlink:
css=a[href='http://youtube.com']
For achieving a count of elements in the DOM, based on CSS selectors, here's an excellent article.
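The count-then-iterate check from the question can also be sketched outside Selenium. The following standard-library example collects the href of every link inside a div whose class list includes "callout"; it uses the markup from the question, with the wrapper's closing </div> added so that it parses. The callout_links helper is a made-up name, and this is an illustration of the matching logic rather than a Selenium RC call.

```python
import xml.etree.ElementTree as ET

# Markup from the question; the final </div> is added so it is well-formed.
SAMPLE = """
<div id="twoWideCallouts">
  <div class="callout">
    <a target="_blank" href="http://facebook.com">Facebook</a>
  </div>
  <div class="callout last">
    <a target="_blank" href="http://youtube.com">Youtube</a>
  </div>
</div>
"""


def callout_links(markup):
    """Return (text, href) for each link inside a div with class 'callout'."""
    root = ET.fromstring(markup)
    links = []
    for div in root.iter("div"):
        # Split the class attribute so "callout last" also matches.
        if "callout" in div.get("class", "").split():
            for a in div.iter("a"):
                links.append((a.text, a.get("href")))
    return links


for text, href in callout_links(SAMPLE):
    print(text, href)
```

Here len(callout_links(SAMPLE)) plays the role of the XPath count, and each href can then be compared with the URL that actually opens after the click.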