Select attribute value by attribute name of element with xpath using scrapy - scrapy

<meta name="GLOBEL:pageid" id="logsss_pageid" content="x-444511621">
Tried response.css('#logsss_pageid').extract() and got:
['<meta name="GLOBEL:pageid" id="logsss_pageid" content="x-444511621">']
All I need is the value x-444511621.

Try with
response.css('#logsss_pageid').xpath('@content').get()
More info: selectors in scrapy using xpath

Related

Nested Span element URL text value - Selenium

I am trying to get the value #2011, which is the text of a link, from the HTML below. I tried the code below, but it didn't work. It says it is unable to locate the class.
driver.find_element(By.XPATH, '//span[@class = "data-issue-and-pr-hovercards-enabled"]').get_attribute('a')
Can anyone help to correct the mistake? I am new to selenium.
<span data-issue-and-pr-hovercards-enabled>
<span><span> · Fixed by #2011</span><span></span></span>
</span>
Here is the link to the website - github.com/mlpack/mlpack/issues/2008 I want to get the #2011 which is next to the Fixed by Text (below the title of the issue). Is it possible to do this?
Try the XPath below.
This relative XPath searches elements of any tag name (*) whose text contains "#2011":
//*[contains(text(),'#2011')]
Or try the one below. It works the same way but searches only within <a> tags:
//a[contains(text(),'#2011')]
Update:
Try the below XPath:
//span[contains(text(),'Fixed by')]//a
Use the .text attribute to fetch the required value. This returns the text #2011.
In the given HTML data-issue-and-pr-hovercards-enabled is an attribute but not the value of class.
To extract the text #2011, ideally wait for the element with WebDriverWait and visibility_of_element_located(), using either of the following locator strategies:
Using CSS_SELECTOR and text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span[data-issue-and-pr-hovercards-enabled] a[data-hovercard-type='pull_request'][data-hovercard-url='/mlpack/mlpack/pull/2011/hovercard']"))).text)
Using XPATH and get_attribute("innerHTML"):
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[@data-issue-and-pr-hovercards-enabled]//a[@data-hovercard-type='pull_request' and @data-hovercard-url='/mlpack/mlpack/pull/2011/hovercard']"))).get_attribute("innerHTML"))
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Link to useful documentation:
get_attribute() method: gets the given attribute or property of the element.
text attribute: returns the text of the element.
Difference between text and innerHTML using Selenium

Selenium XPath: how to get href value of a use attribute

I am trying to get the xpath of a svg that has an attribute <use href= "#icon-map">
So far the path //*[local-name()='svg']/*[local-name()='use'] works, but it finds 84 entries.
How can I modify the xpath in order to select only the use that has the href as "#icon-map"?
You can use this:
//*[local-name()='svg'][use[@href="#icon-map"]]
or
//*[local-name()='svg'][*[local-name()='use'][@href="#icon-map"]]
See example.
If you get more results than you expect, use a more specific path to the element, or wrap the query in parentheses and append the index of the item you want in square brackets, like:
(//*[local-name()='svg'][use[@href="#icon-map"]])[2]
If use is an attribute, then you could do this:
//*[name()='svg']//*[@use and @href='#icon-map']
The above solution also assumes that #icon-map is unique in the HTML DOM.

Get innerHTML of a tag robotframework

I am new to robotframework. I have a requirement where I need the innerHTML of a tag. I tried something like this
Wait Until Page Contains Element    xpath: //div[@id="toast-container"]
${temp_elem} =    Set Variable    xpath: //div[@id="toast-container"]
Log    ${temp_elem}
But this is not working. Please help
What you need is to use the keyword Get Element Attribute, passing the locator to the element, and the target attribute:
${inner html}=    Get Element Attribute    xpath://div[@id="toast-container"]    innerHTML

What is equivalent of value_of_css_property in Scrapy using Selector?

To get the background from this tag:
<body style="background-image: url("http://www.auchandrive.fr/drive/static-media/public2/zones_edit/bannieres/_2016/S49/background_festif2016_boutique.jpg")
I use this code in Selenium
background = driver.find_element_by_css_selector('body').value_of_css_property('background-image')
How can I do this in Scrapy using a CSS selector or XPath?
In scrapy you can use CSS selectors directly:
You can get the node attribute with:
style = response.css('body::attr(style)').extract_first()
After this I am afraid scrapy doesn't offer something like value_of_css_property directly, so you'll have to parse the attribute yourself:
value = response.css("body::attr(style)").re_first('background-image: (.*)$')
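The "parse the attribute yourself" step can be sketched with the standard re module. The helper name background_image_url and the example.com URL below are hypothetical, and the pattern is a simple assumption that will not handle URLs containing a closing parenthesis.

```python
import re


def background_image_url(style):
    """Return the URL inside `background-image: url(...)`, or None if absent."""
    if not style:
        return None
    match = re.search(
        # Optional quotes around the URL are skipped, the URL itself is captured.
        r"background-image\s*:\s*url\(\s*['\"]?(.*?)['\"]?\s*\)",
        style,
        flags=re.IGNORECASE,
    )
    return match.group(1) if match else None


# Hypothetical style string of the kind returned by body::attr(style).
style = "color: #333; background-image: url('http://example.com/bg.jpg'); margin: 0"
print(background_image_url(style))
```

In a spider, the input would be the string returned by response.css('body::attr(style)').get().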

Select checkbox using input tag and non standard attribute values using python selenium

I want to select a checkbox with the HTML code shown below using the attribute bayid:
<input type="checkbox" devid="bay" bayid="10" checked="">
I could get the XPath information - "//*[@id="svbSelectEnc1"]/table/tbody/tr[7]/td[3]/input", but I want to use the bayid for selecting, as there are a lot of checkboxes in the form of a table and only specific checkboxes, read from the config file, have to be selected.
You can achieve it by using CSS Selector or XPath as shown below.
By CSS Selector
driver.findElement(By.cssSelector("input[bayid='10']")).click();
By XPath
//input[@bayid='10']
I would also suggest going through a basic tutorial on how to find a WebElement using CSS selectors and XPath.
Try the following XPath:
//input[@bayid='10']
CSS selector way to do this:
driver.findElement(By.cssSelector("yourTagName[attribute='attributeValue']")).click();
For your specific case:
driver.findElement(By.cssSelector("input[bayid='10']")).click();