How to output two response.xpath results in the same JSON field? - scrapy

Is it possible to concatenate two response.xpath results into one? For example, I have one response.xpath for the movie name and another for the director. Instead of outputting them as two separate values in the JSON, I would like to have them in one field.
Like this in the JSON:
{"movie": "Pulp Fiction, Quentin Tarantino", "year": "1994"}
The HTML looks like this:
<div class="movie_col">
<div class="movie_director"><h1>Pulp Fiction</h1></div>
<div class="director-link"><a><span itemprop="director">
Quentin Tarantino
</span></div>
</div>
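A minimal sketch of the idea, using the standard library's ElementTree instead of Scrapy so it runs without a crawler: extract the two values separately, strip them, and join them into one field. The sample markup is the question's HTML with the missing </a> added (an assumption, so that it parses). In a Scrapy callback the same join would presumably be applied to the strings returned by response.xpath(...).get().

```python
import json
import xml.etree.ElementTree as ET

# The question's markup, with the missing </a> added so it parses as XML.
HTML = """
<div class="movie_col">
  <div class="movie_director"><h1>Pulp Fiction</h1></div>
  <div class="director-link"><a><span itemprop="director">
  Quentin Tarantino
  </span></a></div>
</div>
"""

root = ET.fromstring(HTML)

# Two separate lookups, as in the question.
title = root.findtext(".//div[@class='movie_director']/h1", default="")
director = root.findtext(".//span[@itemprop='director']", default="")

# Strip surrounding whitespace, then join both values into one field.
item = {"movie": ", ".join(part.strip() for part in (title, director))}

print(json.dumps(item))
```

Joining after extraction keeps each XPath simple and lets the whitespace around the director name be cleaned up before the values are combined.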

Related

Selenium Python, extract text from node and ALL child nodes

I have the opposite problem described here. I can't get the text more than one layer deep.
HTML is structured in the following manner:
<span class="data">
<p>This text is extracted just fine.</p>
<p>And so is this.</p>
<p>
And this.
<div>
<p>But this text is not extracted.</p>
</div>
</p>
<div>
<p>And neither is this.</p>
</div>
</span>
My Python code looks something like this:
el.find_element_by_xpath(".//span[contains(@class, 'data')]").text
Try the same with child elements:
print(el.find_element_by_xpath(".//span[contains(@class, 'data')]").text)
print(el.find_element_by_xpath(".//span[contains(@class, 'data')]/div").text)
print(el.find_element_by_xpath(".//span[contains(@class, 'data')]/p").text)
I'm not sure what el refers to in your original post, but I was able to get all the text using the following:
driver.find_element_by_xpath("//span[@class='data']").text
Output:
'This text is extracted just fine.\nAnd so is this.\nAnd this.\nBut this text is not extracted.\nAnd neither is this.'
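To check offline which text nodes actually sit below the span, here is a small sketch using the standard library's ElementTree rather than Selenium. It only parses the question's markup as XML and collects the text of the element and all its descendants; it does not reproduce how a browser renders or reparses the page.

```python
import xml.etree.ElementTree as ET

# The markup from the question, parsed as XML.
HTML = """
<span class="data">
  <p>This text is extracted just fine.</p>
  <p>And so is this.</p>
  <p>
    And this.
    <div>
      <p>But this text is not extracted.</p>
    </div>
  </p>
  <div>
    <p>And neither is this.</p>
  </div>
</span>
"""

span = ET.fromstring(HTML)

# itertext() yields the text of the element and of every descendant,
# in document order; blank whitespace-only chunks are dropped here.
lines = [text.strip() for text in span.itertext() if text.strip()]

print("\n".join(lines))
```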
Instead of relying on the WebElement.text property, consider querying the innerText property.
Consider using an Explicit Wait, as it will make your test more robust and reliable in case the element you're looking for is loaded by, e.g., an AJAX call.
Putting the above together:
print(WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//span[@class='data']"))).get_attribute("innerText"))

Selenium access a form field with bad id

Looking for the best approach to enter / read a value from a form field that lacks human readable ids / references.
The basic outline looks like
<div id="form-2143">
<div id="numberfield-1234">
<label id="numberfield-1234-label">
<span class="x-form-label">Field Name 1</span>
</label>
<div id="numberfield-1234-body">
<div id="numberfield-1234-wrap">
<input id="numberfield-1234-input" class="form-field" componentid="numberfield-1234">
</div>
</div>
</div>
...
</div>
There are more class defs and attributes involved, but above is the "basics" I have to work with.
This form has a number of entries, and there are more forms like it, so I am looking for a way to search for the label name, and access the input field within the same container.
I lack control of the site and cannot edit the HTML structure of the site; meaning I cannot give sensible names to the ids, but want to avoid hard referencing the poor names. Any suggestions on how to get Robot Framework & selenium to reference these elements?
Highlighting Andersson's answer in the comments
Using the XPath
//label[span[text()="Field Name 1"]]/following-sibling::div//input
Works for the above example.
The key part that answers the question of how to reference nearby elements is
/following-sibling
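The label-to-input lookup can also be sketched offline with the standard library. ElementTree does not support the following-sibling axis, so this sketch emulates it by finding the container whose label matches and then searching inside that container. The helper name input_for_label is made up for illustration, and the input tag is self-closed so the sample parses as XML.

```python
import xml.etree.ElementTree as ET

# The form outline from the question (input self-closed for XML parsing).
FORM = """
<div id="form-2143">
  <div id="numberfield-1234">
    <label id="numberfield-1234-label">
      <span class="x-form-label">Field Name 1</span>
    </label>
    <div id="numberfield-1234-body">
      <div id="numberfield-1234-wrap">
        <input id="numberfield-1234-input" class="form-field" componentid="numberfield-1234"/>
      </div>
    </div>
  </div>
</div>
"""


def input_for_label(root, label_text):
    """Return the input in the same container as the label with this text."""
    for container in root.iter("div"):
        # Only look at a label that is a direct child of this container.
        span = container.find("label/span")
        if span is not None and (span.text or "").strip() == label_text:
            return container.find(".//input")
    return None


root = ET.fromstring(FORM)
field = input_for_label(root, "Field Name 1")
field_id = field.get("id")
print(field_id)
```

Searching by the visible label text means the generated ids never appear in the lookup itself; they are only read back from the element that was found.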

Xpath Selenium trouble

Can anyone help me? I tried using Firepath to get a correct XPath, but the expression it gives me looks incorrect to me. The first line in each example is the one Firepath provided.
.//*[@id='content']/div/div/div[1]/h2/span
<div id="content" class="article">
<h1>News</h1>
<div>
<div>
<div class="summary">
<h2>
<span>9</span>
// this should be the correct xpath, I think
_driver.findElement(By.xpath("//*div[@id='content']/div/span.getText()"));
Here I want to check whether the number inside the span is greater than or equal to 1.
The other one is:
.//*[@id='content']/div/div/div[3]
<div id="content" class="article">
<h1>News</h1>
<div>
<div>
<div class="summary">
<div class="form fancy">
<div class="common results">
Here I want to check whether the div with class "common results" has been created; each item should produce one "common results" div.
To retrieve the span text you can use this:
String spanText=driver.findElement(By.xpath("//div[@id='content']/div/div/div/h2/span")).getText();
System.out.println(spanText);
Your second question is not entirely clear to me. You can get the class name like this; please explain further if this is not what you need.
String className=driver.findElement(By.xpath("//*[@id='content']/div/div/div/div/div")).getAttribute("class");
System.out.println(className);
I would suggest using:
//div[@id='content']/div/div/div/h2/span/text()
Note: the HTML you shared was not well formed. I would suggest testing the markup and the XPath in advance with http://www.xpathtester.com/xpath (to fix the markup) and http://codebeautify.org/Xpath-Tester (to test your XPath).
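The numeric check the question asks for (is the span's number at least 1?) can be sketched offline with the standard library's ElementTree. This is not the Selenium/Java call from the answers, only an illustration of the path and the comparison; the closing tags in the sample are added as an assumption so the fragment parses.

```python
import xml.etree.ElementTree as ET

# The question's fragment, with closing tags added so it parses.
HTML = """
<div id="content" class="article">
  <h1>News</h1>
  <div>
    <div>
      <div class="summary">
        <h2>
          <span>9</span>
        </h2>
      </div>
    </div>
  </div>
</div>
"""

content = ET.fromstring(HTML)

# Same steps as //div[@id='content']/div/div/div/h2/span,
# taken relative to the content div (the root of this fragment).
span = content.find("./div/div/div/h2/span")

# Convert the text to a number before comparing.
count = int(span.text.strip())
has_items = count >= 1

print(count, has_items)
```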

Full Text search using oracle regex

I am trying to use regexp_like() in Oracle 11g to implement text search.
For example, I have a table called posts whose title column contains the values
'How to create a new post with pictures and videos'
'How to create a new post enter code here'
and if I write a query
select * from posts where regexp_like(title, '.*pictures.*|.*how.*|.*post.*', 'i')
It returns both records, even though the second title does not contain the word pictures. I know this is because I used '|' in the expression.
What I want is for it to return a record only if all the words are present in the column, in any order. Please help me with the regex.
Hope this will help you.
I wrote a plunker in AngularJS. It is just to demonstrate how the regular expression works.
http://plnkr.co/edit/6WrCCAfNwidVxKkRFH9l?p=preview
The regular expression you can use is
/^(?=.*(pictures))(?=.*(post))(?=.*(how))[a-zA-Z ]{1,100}$/
This means that pictures, post, and how must all be present in the given paragraph.
You can adjust the length of the paragraph as per your requirement.
<form role="form" name="myform">
<div>
<div class="col-sm-8">
<input type="text" ng-model="user" name="inputfield" required="true" ng-pattern="/^(?=.*(pictures))(?=.*(post))(?=.*(how))[a-zA-Z ]{1,100}$/">
<span ng-show="myform.inputfield.$dirty && !myform.inputfield.$error.required && myform.inputfield.$error.pattern">
<small class="text-danger">
Not Matching
</small>
</span>
</div>
</div>
</form>
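The lookahead idea can be tried outside the browser with Python's re module. This is a sketch of the pattern only: it drops the [a-zA-Z ]{1,100} length and character restriction, and the helper name all_words_pattern is made up for illustration. The thread does not confirm that Oracle's regexp_like accepts lookahead syntax, so treat this as an illustration of the regex rather than a drop-in SQL solution.

```python
import re


def all_words_pattern(words):
    """Build a regex that matches only if every word occurs, in any order."""
    # One lookahead per word; each one scans the whole string from the start.
    lookaheads = "".join(f"(?=.*{re.escape(word)})" for word in words)
    return re.compile("^" + lookaheads, re.IGNORECASE | re.DOTALL)


pattern = all_words_pattern(["pictures", "how", "post"])

# The two titles from the question.
titles = [
    "How to create a new post with pictures and videos",
    "How to create a new post enter code here",
]

matches = [title for title in titles if pattern.search(title)]
print(matches)
```

Because each lookahead is anchored at the start and consumes nothing, the order of the words in the text does not matter; a single missing word makes the whole match fail.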

Retrieve 5th item's price value using selenium webdriver

Say I have multiple price quotes from multiple retailers. How do I retrieve the 5th value from a particular retailer, say Target or Walmart? I can get to the 5th entry using the matching image logo, but how do I retrieve the value?
I am adding the HTML to make things clearer. I need to retrieve the ratePrice value (198).
<div id="rate-297" class="rateResult standardResult" vendor="15">
<div class="rateDetails">
<h4>Standard Goods
<br>
<img src="http://walmart.com/walmart/ZEUSSTAR999.jpg">
</h4>
<p>
<span class="vendorPart-380">
<img alt="Walmart" src="/cb2048547924/icons/15.gif">
<br>
<strong>
<br>
MNC
</span>
</p>
</div>
<div class="ratePrice">
<h3>
$198
<sup>49</sup>
</h3>
<p>
<strong>$754.49</strong>
<br>
</p>
<a class="button-select" href="https://www.walmart.com/us/order/95134/2013-05-14-10-00/95134/2013-05-17-10-00/297"> </a>
</div>
</div>
If you could provide some HTML it would help. Speaking generally from what you're asking you'd get a locator to the price div or whatever HTML element and then get its text using something like:
_driver.FindElement(locator_of_element).Text
The trick is understanding the HTML in order to target the 5th element. So if you can find the row that has the 5th entry, then it's simply a matter of finding the price div in that row and getting its text.
EDIT based on more info provided by OP in comments
Using the HTML you provided (which isn't well formed, by the way: missing closing strong tag, a tag, etc.), I'd say do the following:
_driver.FindElement(By.XPath("(//div[@class='ratePrice'])[5]/h3")).Text
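As an offline illustration of picking the 5th price block, here is a sketch with the standard library's ElementTree instead of Selenium. The five result blocks and the first four prices are made up for the example; only the last one mirrors the $198 and 49 from the question. Each ratePrice div sits in its own rateResult parent, as in the question's HTML, so the sketch collects all matches first and then indexes into that list.

```python
import xml.etree.ElementTree as ET

# Hypothetical prices for five result blocks; only the fifth mirrors the question.
PRICES = [("101", "10"), ("120", "00"), ("133", "95"), ("150", "25"), ("198", "49")]

# Build five rateResult blocks, each with its own ratePrice div.
blocks = "".join(
    f'<div class="rateResult standardResult">'
    f'<div class="rateDetails"/>'
    f'<div class="ratePrice"><h3>${dollars}<sup>{cents}</sup></h3></div>'
    f"</div>"
    for dollars, cents in PRICES
)
root = ET.fromstring(f"<div>{blocks}</div>")

# Collect every ratePrice div in document order, then take the 5th (index 4).
price_divs = root.findall(".//div[@class='ratePrice']")
fifth = price_divs[4]

h3 = fifth.find("h3")
dollars = h3.text.strip()            # text before the <sup> element
cents = h3.find("sup").text.strip()  # text inside <sup>

print(dollars, cents)
```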