How can I fetch the 'img' src attribute with Scrapy?

I want to use Scrapy to fetch the links of img elements, so I wrote the line below in my Scrapy crawler:
hxs.select('//dl[@class="clearfix"]//img/@src/text()').extract()
However, it doesn't work. What is the problem?

If you are using CSS selectors instead of XPath, the syntax is ::attr(src)
response.css('.product-list img::attr(src)').extract() # extract_first() to get only one

text() is the text of the element. Just use @src:
hxs.select('//dl[@class="clearfix"]//img/@src').extract()

Related

I can't extract the links with scrapy

I need help extracting the links from this page: https://www.remax.pt/comprar-empreendimentos?searchQueryState={%22page%22:1,%22sort%22:{%22fieldToSort%22:%22PublishDate%22,%22order%22:1}}

You could shorten the selector; you don't have to spell out the whole path from the top element down to your target. A shorter selector is easier to debug.
response.css('div.developments-search-details-component a::attr(href)').get()
You can change this to XPath if you prefer. Usually, when you target an element and get None or an empty list back, it is because of a typo or because the element is rendered dynamically after the page loads.
To debug, I usually start at a higher element in the tree and check whether that exists.
In this case you could try:
response.css('div.developments-search-details-component').get()
first and see if that works.

Get background-image or data-image from an <li>: is it possible with Scrapy?

I need to get only the background-image or data-image value. Is that possible with Scrapy?
'imagem': response.xpath('//li[@id="propertyImageSlide"]').extract()
"imagem": ["<li id="propertyImageSlide" data-image="https://cdn.portugalproperty.com/images/made/property-images/170528/170528_38nwaa0b_1571143267_[size].jpg" class="slide">\n\t\t\t\t\t\t\n\t\t\t\t"]}

'imagem': response.xpath('//li[@id="propertyImageSlide"]/@data-image').extract()
# A bit prettier with CSS selectors instead of XPath
'imagem': response.css('li#propertyImageSlide::attr(data-image)').extract()

How to write an XPath for the text one4

I want to use XPath to locate a link that follows a piece of text. For example, locate "one4" by "what10". The text "what10" is the only thing I can rely on, because the rest of the information on this page will change. What I want to get is the "one4" link node.
<body>
<p>
so
<br>what1 one
<br>what2two
<br>what11one4
<br>what3three
<br>what4one1
<br>what5two2
<br>what6three3
<br>what7one3
<br>what8two3
<br>what9three3
<br>what10one4
<br>just return
<br></p>
</body>
For specific reasons, I need to start from the text what10 and arrive at the one4 link.
Please help me.

You can use the line below:
WebElement loginLink = driver.findElement(By.linkText("one4"));

Selenium doesn't support XPath 2.0; it uses XPath 1.0.
The text what10 lives in a text node rather than an element, and Selenium can only return elements, so you cannot locate that text node directly. As an alternative, if the desired node is always the last-but-one link, you can use the following XPath:
driver.findElement(By.xpath("//body/p//a[position()=last()-1]"));
Update
As per @MosheSlavin's counter question, here is a snapshot demonstrating that the XPath works:

Selenium WebDriver: is there any other way to find the count of web elements present in a webpage without using the "findElements()" method?

Is there any other way to find the count of web elements present in a webpage without using the "findElements()" method?
This question was asked in an interview, so I would like to know whether it is possible to get the count of web elements without using findElements().

An update to @mate's and @custom's answers,
using XPath to extract the number:
driver.executeScript(() => document.evaluate('count(//div)', document, null, XPathResult.NUMBER_TYPE, null).numberValue);
If there are 25 div elements, this returns 25 as a number.

Here's one way:
Go to DevTools
Enter $x('//*').length in the console
You will get the number of HTML elements

You could use WebDriver.prototype.executeScript which lets you execute some JavaScript code. That code can either be a function or a string:
driver.executeScript(() => document.querySelectorAll('*').length);
PS: I'm using the JS flavour of selenium-webdriver.

Unable to locate element in the popup

I am trying to locate the buttons shown here:
And this is the HTML for them:
I have tried every way I could think of, but nothing works.

As per the HTML you have shared, you can locate the Continue with New button with either of the following options:
XPATH
//button[@class='slm-btn slm-btn-AdvanceFlow' and @id='Continue']/span[@class='slm-btn-text']
CSS_SELECTOR
button.slm-btn.slm-btn-AdvanceFlow#Continue > span.slm-btn-text

First, try locating the button by id; it is the best option because an id is unique.
Otherwise, you can use XPath:
//button[@id='ContinueNew']
//button[contains(@id,'ContinueNew')]
