How can I get the link of all the posts in the Instagram profile with Selenium? - selenium

I'm trying to get the links of all the posts in an Instagram profile.
How can I get the href="/p/CX067tNhZ8i/" shown in the photo?
What I'm trying to do is find the href value of every post.
All the posts are in class="v1Nh3 kIKUG _bz0w".
I tried to get the href value from this class with get_attribute, but it didn't work.
Thank you for your help.
browser.get("https://www.instagram.com/lightning.mcqueen34/")
links = []
elements = browser.find_element_by_xpath('//*[@id="react-root"]/div/div/section/main/div/div[4]/article/div[1]/div/div[1]/div[3]')
for i in elements:
    links.append(i.get_attribute('href'))
I thought this would work, but elements is not a list, so the loop raised an error.

This should work:
elements = browser.find_elements_by_tag_name('a')
The answer below will not work in all cases; it depends on how the DOM of the page is loaded.
Replace this line:
elements = browser.find_element_by_xpath('//*[@id="react-root"]/div/div/section/main/div/div[4]/article/div[1]/div/div[1]/div[3]')
With:
elements = browser.find_elements_by_xpath("//a[@href]")
This will let you retrieve all links that have an href from the page. Note the plural find_elements, which returns a list you can iterate over.

First try the XPath //a[@href] to get every href on the page. If that returns too much, narrow the XPath to the class or ID of the div that contains the posts.
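A minimal sketch of the approach suggested above, assuming post links can be recognised by a path that starts with /p/ (as in the href from the question). The helper names collect_hrefs and post_links are made up for illustration, and collect_hrefs uses the find_elements(By.XPATH, ...) form from newer Selenium releases rather than find_elements_by_xpath.

```python
from urllib.parse import urlparse


def collect_hrefs(driver):
    """Return the href of every <a> element on the current page."""
    # Imported here so post_links() below can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    # find_elements (plural) returns a list, so it is safe to iterate over.
    return [a.get_attribute("href") for a in driver.find_elements(By.XPATH, "//a[@href]")]


def post_links(hrefs):
    """Keep only unique links whose path starts with /p/, preserving order."""
    seen = set()
    result = []
    for href in hrefs:
        if not href:
            continue  # skip anchors without a usable href
        # Assumption: post URLs look like https://www.instagram.com/p/<id>/
        if urlparse(href).path.startswith("/p/") and href not in seen:
            seen.add(href)
            result.append(href)
    return result
```

With a live browser session this would be called as post_links(collect_hrefs(browser)) after browser.get(...) has loaded the profile.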

Related

How to find the next link after an id with selenium?

I want to return the links to all posts from a specific subreddit on my Reddit homepage. My intuition is to do this by looking for the next link after it finds an href = r/whatever.
I was using https://www.reddit.com/r/programming/
I would recommend scrolling first so that the infinite scroll loads all the posts.
After that, use this to grab all the links:
links = [x.get_attribute("href") for x in driver.find_elements(By.XPATH, "//a[@href and @data-click-id='body']")]
You can find all a tags that have an href attribute and then iterate through that list. Python implementation:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()  # or whichever driver you use
links = driver.find_elements(By.XPATH, "//a[@href]")  # returns every <a> element that has an href
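The infinite-scroll step mentioned above could be sketched as follows. This is only one possible approach: it scrolls until document.body.scrollHeight stops growing. The function name scroll_to_bottom and the pause and max_rounds parameters are assumptions for illustration, not part of the original answer.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll until the page height stops growing or max_rounds is reached.

    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give the page time to load more posts
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so we are at the bottom
        last_height = new_height
    return rounds
```

After scroll_to_bottom(driver) returns, the find_elements call shown above should see every link that the page loaded while scrolling.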

Selenium finding elements returns incorrect elements

I'm using Selenium to try and get some elements on a web page but I'm having trouble getting the ones I want. I'm getting some, but they're not the ones I want.
So what I have on my page are five divs that look like this:
<div class="membershipDetails">
Inside each one is something like this:
<div class="membershipDetail">
<h3>
VIP Membership
</h3>
</div>
They DO all have this same link, but they don't have the same text ('VIP Membership' would be replaced by something else)
So the first thing was to get all the divs above in a list. This is the line I use:
listElementsMembership = driver.find_elements_by_css_selector(div[class^='membershipDetail'])
This gives me five elements, just as I would expect. I checked the 'class' attribute name and they are what I would expect. At this point I should say that they aren't all EXACTLY the same name 'membershipDetail'. Some have variations. But I can see that I have all five.
The next thing is to go through these elements and try and get that element which contains the href ('VIP Membership').
So I did that like this:
for elem in listElementsMembership:
    elemDetailsLink = elem.find_element_by_xpath('//a[contains(@href,"EditMembership")]')
Now this does return something, but it always got me the element from the FIRST of the five elements. It's as if the 'elem.find_element_by_xpath' line is going up a level first before finding these hrefs. I kind of confirmed this by switching this to a 'find_elements_by_xpath' (plural) and getting, you guessed it, five elements.
So is this line:
elemDetailsLink = elem.find_element_by_xpath('//a[contains(@href,"EditMembership")]')
going up a level before getting its results? If it is, how can I make it not do that and just restrict itself to the children?
If you are trying to find an element within an element, start the XPath with a . like below:
listElementsMembership = driver.find_elements_by_css_selector("div[class^='membershipDetail']")
for elem in listElementsMembership:
    elemDetailsLink = elem.find_element_by_xpath('.//a')  # Finds the "a" tag relative to "elem"
Suppose you are looking for VIP Membership:
listElementsMembership = driver.find_elements_by_css_selector("div[class^='membershipDetail']")
for elem in listElementsMembership:
    value = elem.find_element_by_xpath('.//a').get_attribute("innerText")
    if "VIP Membership" in value:
        print(value)
And if you don't want to iterate over all five elements, try an XPath like the ones below (as per the HTML you have shared):
//div[@class='membershipDetail']//a[text()='VIP Membership']
Or
//div[@class='membershipDetail']//a[contains(text(),'VIP Membership')]
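To illustrate the scoping rule, here is a small sketch that uses Python's standard-library ElementTree as a stand-in for the browser DOM, so it runs without Selenium. The markup is hypothetical: the <a> elements and their href values are assumed, since the question's HTML does not show them. The point is that a path starting with .// is evaluated relative to each container.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the <a> elements and href values are assumed for illustration.
PAGE = """
<body>
  <div class="membershipDetail"><a href="/EditMembership?id=1">VIP Membership</a></div>
  <div class="membershipDetail"><a href="/EditMembership?id=2">Basic Membership</a></div>
</body>
"""


def links_per_container(root):
    """Return (text, href) for the first <a> inside each membershipDetail div."""
    results = []
    for div in root.findall(".//div[@class='membershipDetail']"):
        # ".//a" searches only the descendants of this particular div.
        a = div.find(".//a")
        results.append((a.text, a.get("href")))
    return results


root = ET.fromstring(PAGE.strip())
for text, href in links_per_container(root):
    print(text, href)
```

Each container yields its own link here. In Selenium, an XPath that starts with // is evaluated from the document root even when called on an element, which is why the question's loop kept returning the first link.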
You have a few mistakes in that CSS selector.
The quotes around the selector are missing.
^ means starts-with; I am not sure you really need that. If you want partial matching, use * instead of ^.
Also, I do not see any logic for the statement below in your code attempt.
The next thing is to go through these elements and try and get that
element which contains the href ('VIP Membership').
Code :
listElementsMembership = driver.find_elements_by_css_selector("div[class*='membershipDetail']")
for ele in listElementsMembership:
    e = ele.find_element(By.XPATH, ".//descendant::a")
    if "VIP Membership" in e.get_attribute('href'):
        print(e.text, e.get_attribute('href'))
You can give an index using a square bracket like this.
elemDetailsLink = elem.find_element_by_xpath('(//a[contains(@href,"EditMembership")])[1]')
If you are trying to get an element using XPath, the index should start with 1, not 0.

Selenium: How to find text on page containing html tags? (Text Node)

I updated the question after it was answered!
I am trying to find a text in a list on the webpage, where the text itself contains HTML tags, like <p> text </p>.
Here is a screenshot of how it looks on the webpage:
[Screenshot of text to search for]
In the browser's "inspect" panel I used //*[text()='<p> First do this, then this</p>'], which found the text, as seen in the screenshot.
In the code I'm using this line to find the text:
webDriver.FindElement(By.XPath("//*[text()='<p> First do this, then this</p>']"))
But during the test run it gives this error message:
OpenQA.Selenium.NoSuchElementException: no such element: Unable to locate element: {"method":"xpath","selector":"//*[text()='
First do this, then this']"}
As you can see, Selenium somehow ignores the HTML tags <p> </p>.
Answer and solution from cruisepandey:
Thanks to @cruisepandey I now know that my text is inside a text node.
The only way to get the text out is using this code:
var ele = webDriver.FindElement(By.XPath("//table[@class='mud-table-root']//tbody/tr[1]/td[2]"));
Console.WriteLine(ele.Text);
The output of this here is:
<p> First do this, then this</p>
That's a text node; you cannot simply match it with the text() function from XPath 1.0.
You can try the XPath below:
(//table[@class='mud-table-root']//tbody/tr)[1]/td[2]
Code:
var ele = webDriver.FindElement(By.XPath("(//table[@class='mud-table-root']//tbody/tr)[1]/td[2]"));
Console.WriteLine(ele.Text);
Code (with an explicit wait), in case you want to click on it:
var ele = new WebDriverWait(webDriver, TimeSpan.FromSeconds(20)).Until(ExpectedConditions.ElementToBeClickable(By.XPath("(//table[@class='mud-table-root']//tbody/tr)[1]/td[2]")));
Console.WriteLine(ele.Text);
Console.WriteLine(ele.Text);
p is the element's tag name, not part of its text.
There may also be extra spaces in the text. In that case I prefer using contains instead of an exact text match.
Try this instead:
webDriver.FindElement(By.XPath("//p[contains(text(),'First do this, then this')]"))
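A small sketch of the text-node situation described in the accepted solution, written in Python with the standard-library ElementTree rather than Selenium. It assumes the cell stores the tags as escaped characters, which would be consistent with ele.Text printing the literal <p> ... </p>. The CELL markup and the describe_cell helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the table cell holds the tags as escaped text, not as a real <p> element.
CELL = "<td>&lt;p&gt; First do this, then this&lt;/p&gt;</td>"


def describe_cell(markup):
    """Return the cell's text and whether it has a real <p> child element."""
    cell = ET.fromstring(markup)
    return cell.text, cell.find("p") is not None


text, has_p_child = describe_cell(CELL)
print(text)
print(has_p_child)
```

If the page is built this way, a locator such as //p[...] has nothing to match, while reading the text of the enclosing td returns the whole string including the visible tags.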

Selenium driver is not reflecting updates after click on link

There are some posts about this topic but I cannot find any solution for my case, this is the situation:
I click on a link (a next page):
ActionChains(driver).move_to_element(next_el).click().perform()
Then I get the content of the new page (I'm interested in some script sections inside the body):
html = driver.find_element_by_xpath("//*").get_attribute("outerHTML")
But that content is always the same, no matter how long I wait for.
The only way to get the driver with new DOM information is to do a refresh(),
but for this case that is not a valid option.
Thanks and regards.
I am not sure what exactly you are looking for here, but if I understand correctly, you want to capture the content of the script tags from the page.
If that is the case, capture the page source in a string variable:
source_code = driver.page_source
Once you have the string, you can extract the value with any of the available string methods. I hope it helps.
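As one way to do the extraction step, the sketch below pulls the contents of every script tag out of a page-source string using Python's standard-library html.parser instead of manual string slicing. The names ScriptCollector and extract_scripts are made up for illustration.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._buffer = None  # list of text chunks while inside a <script>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.scripts.append("".join(self._buffer))
            self._buffer = None


def extract_scripts(page_source):
    """Return a list with the body of each <script> tag in page_source."""
    parser = ScriptCollector()
    parser.feed(page_source)
    parser.close()
    return parser.scripts
```

With Selenium this would be called as extract_scripts(driver.page_source) after the click. It only parses whatever string the driver returns, so it does not by itself address a stale page.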

How to retrieve the label text as per the html provided?

Sometimes I fail to get the inner text of a web element. Recently, while working on thePersonal insurance, I failed to get the inner text of a label (web element). Here are the script and a screenshot of the inspected webpage:
WebElement detailPostCode = driver.findElement(By.xpath("//label[@for='q_codePostalDetailAU']"));
System.out.println("postcode label text " + detailPostCode.getText());
Could anyone please help me understand the problem? Thank you for your kind concern.
As per the HTML you have shared the <label> with text as Postal code is a child node of the <div> tag with class attribute as q_codePostal.
What went wrong?
As per your code trial you have used:
WebElement detailPostCode = driver.findElement(By.xpath("//label[@for='q_codePostalDetailAU']"));
In this expression //label[@for='q_codePostalDetailAU'] will always refer to the descending <input> tag with id attribute as q_codePostalDetailAU. Hence your code trial didn't work.
Solution
As an alternative you can use the following solution:
WebElement detailPostCode = driver.findElement(By.cssSelector("div.q_codePostal>label"));
Can you try this xpath:
driver.findElement(By.xpath("//div[@class='q_codePostal']/label"));