Get background-image or data-imageurl from an <li>: is it possible with Scrapy?

I only need the background-image or data-imageurl value. Is that possible with Scrapy? Currently I use:
'imagem': response.xpath('//li[@id="propertyImageSlide"]').extract()
which returns the whole element:
"imagem": ["<li id="propertyImageSlide" data-image="https://cdn.portugalproperty.com/images/made/property-images/170528/170528_38nwaa0b_1571143267_[size].jpg" class="slide">\n\t\t\t\t\t\t\n\t\t\t\t"]}

Select the attribute itself rather than the element:
'imagem': response.xpath('//li[@id="propertyImageSlide"]/@data-image').extract()
# A bit neater with a CSS selector instead of XPath
'imagem': response.css('li#propertyImageSlide::attr(data-image)').extract()
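The same idea can be tried offline. The sketch below is only an illustration: it uses Python's standard-library ElementTree instead of Scrapy (so it runs without a crawl), and the markup is a simplified, hypothetical version of the element from the question. ElementTree's XPath subset cannot select attributes directly, so the attribute is read from the matched element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup modeled on the <li> shown in the question.
html = """
<ul>
  <li id="propertyImageSlide" class="slide"
      data-image="https://cdn.portugalproperty.com/images/made/property-images/170528/170528_38nwaa0b_1571143267_[size].jpg"></li>
</ul>
"""

root = ET.fromstring(html)

# Match the <li> by id, then read only the attribute value,
# which is what /@data-image or ::attr(data-image) does in Scrapy.
image_urls = [
    li.get("data-image")
    for li in root.findall(".//li[@id='propertyImageSlide']")
]
print(image_urls)
```

In a real spider the Scrapy selectors from the answer do this in one step; the point here is only that the value lives in the attribute, not in the element's text.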

Related

How to read ol and li elements dynamically without using xpath

I'm new to Selenium. Below is my HTML; I want to display the sons of Dhritrashtra and the grandsons of Pandu (without using XPath). I've tried methods like getText and getLinkText, but they are not working for me. Please help. Thanks.
Kuru
Shantanu
Vichitravirya
Dhritrashtra
DuryodhanaDushasanaDussalanJalagandhaSamaSahaVindhaAnuvindhaDurmukhaChitrasenaDurdarshaDurmarshaDussahaDurmadaVikarnaDushkarnaDurdharaVivinsatiDurmarshanaDurvishahaDurvimochanaDushpradharshaDurjayaJaitraBhurivalaRaviJayatsenaSujataSrutavanSrutantaJayatChitraUpachitraCharuchitraChitrakshaSarasanaChitrayudhaChitravarmanSuvarmaSudarsanaDhanurgrahaVivitsuSubaahuNandaUpanandaKrathaVatavegaNishaginKavashinPaasiVikataSomaSuvarchasasDhanurdharaAyobaahuMahabaahuChithraamgaChithrakundalaBheemarathaBheemavegaBheemabelaUgraayudhaKundhaadharaVrindaarakaDridhavarmaDridhakshathraDridhasandhaJaraasandhaSathyasandhaSadaasuvaakUgrasravasUgrasenaSenaanyAparaajithaKundhasaaiDridhahasthaSuhasthaSuvarchaAadithyakethuUgrasaaiKavachyKradhanaKundhyBheemavikraAlolupaAbhayaDhridhakarmaavuDhridharathaasrayaAnaadhrushyaKundhabhedyViraavyChithrakundalaPradhamaAmapramaadhyDeerkharomaSuveeryavaanDheerkhabaahuKaanchanadhwajaKundhaasyVirajas
Pandu
Yudhishtir
Prativindhya
Bhim
Sutasoma
Ghatotkch
Arjun
Srutakirti
Babhruvahan
Nakul
Satanika
Sahadev
Shrutkarma
Here is the solution for your query:
It's not mandatory to always use XPath. As per the Selenium documentation and standards, consider the following locators in sequence: id, name, css, linktext, xpath. If you still cannot locate the element, try css/xpath with multiple attributes such as class, src, etc.
Only once you have identified the element can you retrieve its properties with methods like getText() and getLinkText().
Most importantly, you have provided a copy of the text from the website. It's impossible to identify an element from the website text alone. You need to provide the relevant part of the HTML DOM in order for us to help you. You can look at the page source of any webpage to find the properties (id/name/css/xpath) of its elements: while you are on the page, right-click and select "View Page Source". For Mozilla Firefox you can download and install extensions like Firepath and Firebug to inspect the properties of an element.
Finally, you have to write some code in Java, Python, or C# to open a browser of your choice through Selenium, open a website, and perform actions on the elements present on the page.
Let me know if this answers your question.
I don't know the exact structure of your HTML, but based on my assumptions about it you can use cssSelector as follows for your two queries:
1) ul ol ol ol li:nth-child(n), where n is the element index
2) ul ol ol ol:nth-child(2) li:nth-child(n) li:nth-child(1), where n is the element index

selenium cssSelector vs. tagName

I have a use case where I need to find all iframe and object tags on the page.
Currently I'm using the cssSelector() method. I have noticed that there is also a tagName() method.
What is the difference between these two methods for the above use case?
findElement(By.tagName("a_tag")) finds elements by HTML tag name, such as <iframe> or <div>. You can only give it a tag name, not a CSS class or any other selector.
findElement(By.cssSelector("a_tag")) also finds elements by tag name, but it accepts any CSS selector, so you can add a class, for example findElement(By.cssSelector("div.myClass")).
For your case you can use:
List<WebElement> iframes = driver.findElements(By.tagName("iframe"));
List<WebElement> objects = driver.findElements(By.tagName("object"));
Then loop over each list to run your tests.
The locator type does not affect waiting. If an implicit wait is configured, it applies to findElement and findElements alike, whether you use By.tagName, By.cssSelector, By.id or By.xpath.
Keep in mind that findElements returns as soon as at least one matching element is present, or returns an empty list once the implicit wait times out.
With findElements(By.tagName("table")), Selenium returns the tables that are present at that moment. A "needed" table that is added to the page later will not be in the list, so if elements load dynamically, wait for them explicitly before collecting them.

Convert xpath(with respect to invoking object) to css

Following is the element that I located using xpath:
element2 = element1.find_element(:xpath, ".//a")
xpath is .//a
I want to convert it to CSS. What is the equivalent CSS selector?
Note: the . is important in .//a because I want to search relative to element1.
Similarly, what would be the CSS equivalent of ..//a?
The CSS equivalent of .//a is a preceded by white-space, the descendant combinator (e.g. div a finds every a inside every div). It matches all elements with tag name a at any depth inside the parent element.
So your code will be the following: element2 = element1.find_element(:css, " a")
I'm not sure about :css because I don't know which language you are using.
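To see what "relative to element1" means in practice, here is a minimal sketch using Python's standard-library ElementTree rather than Selenium (an assumption made so the example runs without a browser). The markup and the id element1 are hypothetical. A search started from element1 with .//a only returns links inside it, which is the same scoping a descendant CSS selector gives when called on an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one link outside element1, two inside it.
doc = ET.fromstring("""
<body>
  <a href="/outside">outside</a>
  <div id="element1">
    <a href="/direct">direct child</a>
    <p><a href="/nested">nested</a></p>
  </div>
</body>
""")

element1 = doc.find(".//div[@id='element1']")

# .//a evaluated from element1: descendants of element1 only, at any depth.
scoped = [a.get("href") for a in element1.findall(".//a")]

# .//a evaluated from the document root: every link in the document.
unscoped = [a.get("href") for a in doc.findall(".//a")]

print(scoped)
print(unscoped)
```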

How can i fetch 'img' src attribute by scrapy?

I want to use Scrapy to fetch the links of img elements, so I wrote the following in my Scrapy crawler:
hxs.select('//dl[@class="clearfix"]//img/@src/text()').extract()
However, it doesn't work. What is the problem?
If you are using CSS selectors instead of XPath, the syntax is ::attr(src)
response.css('.product-list img::attr(src)').extract() # extract_first() to get only one
text() selects the text nodes of an element, and an attribute has none. Just use @src:
hxs.select('//dl[@class="clearfix"]//img/@src').extract()
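The difference between an attribute and element text can be checked with a small offline sketch. It uses Python's standard-library ElementTree instead of Scrapy, and the markup and image paths are hypothetical; only the dl class name comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the "clearfix" class is taken from the question.
doc = ET.fromstring("""
<body>
  <dl class="clearfix">
    <dd><img src="/images/a.png" alt="first"/></dd>
    <dd><img src="/images/b.png" alt="second"/></dd>
  </dl>
  <img src="/images/outside.png" alt="not in the dl"/>
</body>
""")

imgs = doc.findall(".//dl[@class='clearfix']//img")

# The link is stored in the src attribute...
srcs = [img.get("src") for img in imgs]

# ...while an <img> element has no text content at all.
texts = [img.text for img in imgs]

print(srcs)
print(texts)
```

This is why appending text() to the attribute step yields nothing: the value to extract is the attribute itself.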

Image Rating using Selenium

I am trying to rate an image using Selenium.
I gave a rating of 4 (clicked on the 4th star) and found the XPath of the rating field using the Firebug add-on. The XPath is
css=img[alt="4"]
So I wrote
selenium.click("css=img[alt="4"]");
but it gives an error.
Any idea?
You didn't find an XPath, but a CSS selector.
XPath solution:
selenium.click("xpath=//img[@alt='4']");
CSS selector solution:
selenium.click("css=img[alt='4']");
Which error does it give? If it is "element not found", use this XPath:
xpath=//img[@alt='4']