Scrape text under div tag that is in quotes - scrapy

Trying to scrape this part: "Lounge, Showers, Lockers"
https://i.stack.imgur.com/k5mzg.png
<div class="CourseAbout-otherFacilities more">
<h3 class="CourseAbout-otherFacilities-title">Available Facilities</h3> " Lounge, Showers, Lockers "
</div>
Website:
https://www.golfadvisor.com/courses/16929-black-at-bethpage-state-park-golf-course
response.css('.CourseAbout-foodAndBeverage.more::text').get() command returns " \n "
Thank you

There are three text elements in your target div (matched by your CSS expression):
<div class="CourseAbout-otherFacilities more">FIRST
<h3 class="CourseAbout-otherFacilities-title">SECOND</h3>
THIRD</div>
By using .get() you're telling Scrapy to return only the first match, which here is the whitespace before the <h3>.
I recommend using an XPath expression instead and matching your element by its text:
//h3[.="Available Facilities"]/following-sibling::text()[1]
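To make the three text nodes concrete, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree as a stand-in for Scrapy's selectors. That substitution is an assumption made only so the snippet runs without Scrapy, and the markup is a simplified, well-formed copy of the one in the question. In Scrapy itself you would pass the XPath above to response.xpath(...).get().

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed copy of the markup from the question (assumption).
html = (
    '<div class="CourseAbout-otherFacilities more">\n'
    '<h3 class="CourseAbout-otherFacilities-title">Available Facilities</h3>'
    ' Lounge, Showers, Lockers \n'
    '</div>'
)

div = ET.fromstring(html)
h3 = div.find("h3")

first = div.text   # text node before <h3>: whitespace only
second = h3.text   # text node inside <h3>
third = h3.tail    # text node right after </h3>: the one we want

print(repr(first))
print(repr(second))
print(repr(third.strip()))
```

The first node is only whitespace, which matches the " \n " returned by the ::text selector with .get(); the wanted text is the node that follows the <h3>.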

Related

vuejs different v-ifs from a previous select

I am trying to display two different divs according to the option chosen in a select, but only the first v-if ever matches. When I select "de", I get the first div's content, but I also get it when I select "fr", instead of the second div.
I can't get my head around this. Any ideas of what I am getting wrong?
This goes on inside a form:
<select v-model="source" :key="source">
<option :value="de">de</option>
<option :value="fr">fr</option>
</select><br><br>
<div class="characters">
<div v-if="source === de" class="deChars" :key="source">
<h5>Special German characters:</h5>
<li v-for="character in deChars" :key="character">{{ character }}</li>
</div>
<div v-else-if="source === fr" class="frChars" :key="source">
<h5>Special French characters:</h5>
<li v-for="character in frChars" :key="character">{{ character }}</li>
</div>
<br>
</div>
In the script section I am using the Options API with the data property source: "", and two arrays, deChars and frChars.
Note: I added those :key="source" attributes to make sure the elements are re-rendered when the source value changes.
The data properties:
data(){
return {
original: "",
translation: "",
source: "",
target: "en",
deChars: ["ä", "Ä", "é", "ö", "Ö", "ü", "Ü", "ß"],
frChars: ["à", "â", "ç", "é", "è", "ê", "ë", "î", "ï", "ô", "û", "ü", "œ"]
}
},
Thank you so much!
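With :value="de", Vue evaluates de as a JavaScript expression; since no such property is declared, both options end up with the value undefined, so source === de is true whichever one you pick. Try removing the binding sign (:) so that de and fr are plain strings: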
<select v-model="source" >
<option value="de">de</option>
<option value="fr">fr</option>
</select>
Then use v-if="source === 'de'" and v-else-if="source === 'fr'".
It seems that the problem was with the select options: the binding was interfering. Once I got rid of that, it works perfectly, so thanks for that, Boussadjra. However, if I change de and fr to strings in the v-if, it does not work, which is why I am adding this comment as the solution.

How to insert a new empty line at a selected index in an array using \n while looping through it with v-for to create a list

I want to insert a new empty line before a selected array element while looping through the array with v-for to create a list. Trying to do this using \n isn't working.
<!-- this is the template part -->
<ul>
<li v-for = "ninja in ninjas" > {{ninja}}</li>
</ul>
/* this is the script part; notice index 2 in the array */
data() {
return {
ninjas: [
'mati kahe kumhaar sey, tu kya ronday mohey',
' Ik din aisa aayega mai rondungi tohe',
'\n aaye hain toh jaayengay Raja, Rank, fkeer',
' Ik sinhaasan chodi chale, Ik baandhay zanjeer'
]
};
},
This is pretty easy to do using the <pre> tag, which forces the browser to preserve white space, new lines, etc. It is often used to preserve formatting in code examples.
<div id="app">
<ul>
<li v-for = "ninja in ninjas" ><pre>{{ninja}}</pre></li>
</ul>
</div>
Working example: https://jsfiddle.net/skribe/10yL6va8/8/
You will probably want to use CSS to style the text inside <pre>, since most browsers render it in a monospace font by default.

How to select specific text to scrape

I'm trying to scrape the following HTML. I want to get just the Some Header part and not the additional info.
<li class="media">
<div class="media-body">
<h4> Some Header <span class="label label-info"> additional Info </span> </h4> Address info
<br>
</div> </li>
I'm trying the following:
val li: Elements = ul.select("li")
val list: Elements = li.select("a")
val headers: Elements = list.select("h4")
Then, when I try to get the inner text via headers.text(), I'm getting both Some Header and additional Info.
How can I only scrape the Some Header part?
You are almost there. You are probably looking for ownText:
String s = "<li class=\"media\"> \n" +
" <div class=\"media-body\"> \n" +
" <h4> Some Header <span class=\"label label-info\"> additional Info </span> </h4> Address info\n" +
" <br> \n" +
" </div> </li>";
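Document document = Jsoup.parse(s);
// The sample markup contains no <a>, so select the <h4> inside the <li> directly.
Elements headers = document.select("li.media h4");
// ownText() returns only the text owned by the <h4> itself, excluding the <span>.
System.out.println(headers.first().ownText());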
Output:
Some Header
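The difference between an element's full text and its own text can be illustrated with a short sketch. It uses Python's standard-library xml.etree.ElementTree rather than jsoup, and the helper names full_text and own_text are hypothetical; jsoup's exact whitespace handling may differ.

```python
import xml.etree.ElementTree as ET

# The <h4> fragment from the question.
h4 = ET.fromstring(
    '<h4> Some Header <span class="label label-info"> additional Info </span> </h4>'
)

def full_text(el):
    """All text inside the element, including its descendants."""
    return " ".join("".join(el.itertext()).split())

def own_text(el):
    """Only the text nodes that are direct children of the element."""
    parts = [el.text or ""] + [child.tail or "" for child in el]
    return " ".join("".join(parts).split())

print(full_text(h4))
print(own_text(h4))
```

The first call includes the text of the nested <span>, while the second keeps only the text that belongs to the <h4> itself.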

Select an "ol" html tag with Watir Webdriver

I want to create a for loop to get the text from several lis with Watir.
Here is the HTML I'm trying to scrape:
<ol class="shared-connections">
<li class="small-result">
<a class="img-link" href="http://www.google.com"></a>
</li>
<li class="small-result">
<a class="img-link" href="http://www.google.com"></a>
</li>
</ol>
I'm trying to get the href value of the links with a loop, but I can't get the loop to start with this code:
@browser.ol(class: "shared-connections").lis(class: "small-result").each do |connection|
p "is this working?"
end
The "ol" tag prevents the loop from working and gives me this error:
/Library/Ruby/Gems/2.0.0/gems/watir-webdriver-0.9.1/lib/watir-webdriver/elements/element.rb:536:in `assert_element_found': unable to locate element, using {:class=>"shared-connections", :tag_name=>"ol"} (Watir::Exception::UnknownObjectException)
Any idea how to get "ol" to work with Watir? Thanks!
It seems your script didn't open the right page.
The code itself works well. I created a file with the HTML you provided.
In irb:
pp File.readlines('a.html')
["<ol class=\"shared-connections\">",
" <li class=\"small-result\">",
" <a class=\"img-link\" href=\"http://www.google.com\"></a>",
" </li>",
" <li class=\"small-result\">",
" <a class=\"img-link\" href=\"http://www.google.com\"></a>",
" </li>",
"</ol>"]
Then
b = Watir::Browser.new :chrome
b.goto 'file://' + Dir.pwd + '/a.html'
b.ol(class: "shared-connections").lis(class: "small-result").each do |connection|
p "is this working?"
end
"is this working?"
"is this working?"
=> [#<Watir::LI:0x604055b6f2097db6 located=false selector={element: (webdriver element)}>, #<Watir::LI:0x..f5826cdc74313e1a located=false selector={element: (webdriver element)}>]
You can verify this by inspecting @browser.html

How to click on the text field "Quantity" whose id value changes between scenarios

<div id="div_12_1_1_1_3_1_2_1_1_1_2" class="Quantity CoachView CoachView_show" data-eventid="" data-viewid="qty" data-config="config12" data-bindingtype="Decimal" data-binding="local.priceBreak.quantity" data-type="com.ibm.bpm.coach.Snapshot_a30ea40f_cb24_4729_a02e_25dc8e12dcab.Quantity">
<div class="w-decimal w-group clearfix">
<div class="p-label-container span4">
<div class="p-fields-container controls-row span8 l-input fixed-units">
<input id="div_12_1_1_1_3_1_2_1_1_1_2-in" class="p-field span8" type="text" maxlength="16">
<input id="div_12_1_1_1_3_1_2_1_1_1_2-iu" class="p-unit span4" type="text" maxlength="2" style="display: none;">
<select class="p-unit span4" style="display: none;"></select>
<div class="p-unit span4">CM</div>
<div class="p-help-block"></div>
</div>
<div class="p-fields-container span8 l-output" style="display: none;">
</div>
</div>
<div id="div_12_1_1_1_3_1_2_1_1_1_3" class="Quantity CoachView CoachView_show" data-eventid="" data-viewid="Quantity2" data-config="config73" data-bindingtype="Integer" data-binding="local.priceBreak.numberDeliveries" data-type="com.ibm.bpm.coach.Snapshot_a30ea40f_cb24_4729_a02e_25dc8e12dcab.Quantity">
How can I click on the text box whose id is "div_12_1_1_1_3_1_2_1_1_1_2-in"?
In some scenarios the id changes to "div_5_1_1_1_3_1_2_1_1_1_2-in".
I have tried the following:
driver.findElement(By.xpath("//div/input[ends-with(@id,'__1_1_1_3_1_2_1_1_1_2-in')]")).sendKeys("98989998989");
but it is not working.
Output:
org.openqa.selenium.InvalidSelectorException: The given selector //div/input[ends-with(@id,'__1_1_1_3_1_2_1_1_1_2-in')] is either invalid or does not result in a WebElement. The following error occurred:
InvalidSelectorError: Unable to locate an element with the xpath expression //div/input[ends-with(@id,'__1_1_1_3_1_2_1_1_1_2-in')] because of the following error:
[Exception... "The expression is not a legal expression." code: "51" nsresult: "0x805b0033 (NS_ERROR_DOM_INVALID_EXPRESSION_ERR)" location: "file:///C:/Users/SUNIL~1.WAL/AppData/Local/Temp/anonymous4157273428687139624webdriver-profile/extensions/fxdriver@googlecode.com/components/driver_component.js Line: 5956"]
Command duration or timeout: 41 milliseconds
For documentation on this error, please visit: http://seleniumhq.org/exceptions/invalid_selector_exception.html
Build info: version: '2.37.0', revision: 'a7c61cb', time: '2013-10-18 17:15:02'
You can try the following cssSelector:
driver.findElement(By.cssSelector("div.fixed-units > input[id$='in']")).sendKeys("98989998989");
If the 'in' text is always present in the id, you can also use XPath. You can try //input[contains(@id, '-in')]
ends-with is an XPath 2.0 function, and none of the five major browsers actually support XPath 2.0.
Your options are either to use other methods as already suggested, or, for an XPath 1.0 solution, you could use:
//div/input[substring(@id, string-length(@id) - 22) = '_1_1_1_3_1_2_1_1_1_2-in']
Although it's ugly, really.
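The offset 22 in that expression is the suffix length (23 characters) minus one, because XPath's substring() is 1-indexed. As a minimal sketch of that arithmetic, the hypothetical helper ends_with_xpath below (not part of Selenium or any library) builds the same XPath 1.0 predicate for an arbitrary suffix. It assumes the suffix contains no single quote.

```python
def ends_with_xpath(attribute: str, suffix: str) -> str:
    """Build an XPath 1.0 predicate emulating ends-with(@attribute, suffix).

    Assumes the suffix contains no single quote (no escaping is done).
    """
    # substring() is 1-indexed, so the suffix starts at
    # string-length(@attr) - len(suffix) + 1.
    offset = len(suffix) - 1
    return (
        f"substring(@{attribute}, string-length(@{attribute}) - {offset})"
        f" = '{suffix}'"
    )

suffix = "_1_1_1_3_1_2_1_1_1_2-in"
xpath = f"//div/input[{ends_with_xpath('id', suffix)}]"
print(xpath)
```

The resulting string can be passed to By.xpath(...) as in the question; both ids mentioned there end with this suffix.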
I actually used starts-with, but I see you have multiple elements whose id starts with "div".
If your elements all stay in the same place on the page and aren't subject to change, try this out. Here's some Java code:
By by = By.xpath("(//*[starts-with(@" + attributeName + ", '" + attributeValue + "')])[" + n + "]");
In your case, it would look like this:
By by = By.xpath("(//*[starts-with(@id, 'div')])[2]");
What this will do is pick the second element in the DOM whose id starts with "div".
It's a bit of a hack, but it might work out for you.