Can't find next button on LinkedIn - Selenium

I am trying to scrape the LinkedIn website using Selenium and Beautiful Soup.
The idea is simple: I start on the company's LinkedIn page, go to the company's people search, and scroll to the bottom of the page to get all the results on the page.
But because LinkedIn only shows 10 people per page, I need to find the Next button on that page to go to the next 10 people.
I use this code:
browser.find_element_by_class_name('next')
Traceback (most recent call last):
  File "LinkedInWebcrawler2019.py", line 71, in <module>
    nextt = browser.find_element_by_class_name('next')
  File "C:\Users\Afdal\AppData\Roaming\Python\Python37\site-packages\selenium\webdriver\remote\webdriver.py", line 564, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\Users\Afdal\AppData\Roaming\Python\Python37\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "C:\Users\Afdal\AppData\Roaming\Python\Python37\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Afdal\AppData\Roaming\Python\Python37\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".next"}
(Session info: chrome=77.0.3865.75)
Any ideas?

Try this for finding the next button:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

Next = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, "//*[contains(@class, 'button')]"))
)
Next.click()
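If the button still cannot be found, it may simply not have been rendered yet. A rough variant of the same wait that scrolls to the bottom of the results first (the XPath is the same assumption as above, not LinkedIn's actual markup):

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")  # trigger lazy-loaded results and pagination controls
Next = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, "//*[contains(@class, 'button')]"))
)
Next.click()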

Related

Debugging: "Message: no such element: Unable to locate element:"

I am learning Python so please bear with me.
I adopted the LinkedIn-Easy-Apply-Bot from: https://github.com/voidbydefault/LinkedIn-Easy-Apply-Bot
The bot works perfectly fine on my test account, but when I change the email ID/password to my real account (with everything else being the same), I start getting these errors:
Traceback (most recent call last):
File "/home/me/Documents/LinkedIn-Easy-Apply-Bot/linkedineasyapply.py", line 124, in apply_jobs
job_results = self.browser.find_element_by_class_name("jobs-search-results")
File "/home/me/PycharmProjects/Better-LinkedIn-EasyApply-Bot/venv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 766, in find_element_by_class_name
return self.find_element(by=By.CLASS_NAME, value=name)
File "/home/me/PycharmProjects/Better-LinkedIn-EasyApply-Bot/venv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 1251, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "/home/me/PycharmProjects/Better-LinkedIn-EasyApply-Bot/venv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 430, in execute
self.error_handler.check_response(response)
File "/home/me/PycharmProjects/Better-LinkedIn-EasyApply-Bot/venv/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".jobs-search-results"}
(Session info: chrome=102.0.5005.61)
I have tried deleting chromedriver to ensure version conflicts do not exist, I have tried adding time.sleep(5) at line 124 (above), and I have also tried driver.implicitly_wait(10). Unfortunately, the error persists with my real account.
Note that there are no issues with my real account when it is used manually: I am able to apply to all sorts of jobs, whether Easy Apply or otherwise, and the bot works 100% fine on my test account, so the elements the code is looking for do exist.
Please help in fixing the problem.
Thanks.
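One thing worth trying before the failing find_element call is replacing the fixed sleep with an explicit wait, so the results pane has time to render. A minimal sketch, assuming the same self.browser and that the jobs-search-results class from the traceback is still present on the real account's page:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 30 seconds for the results container instead of failing immediately;
# the class name is taken from the traceback above.
job_results = WebDriverWait(self.browser, 30).until(
    EC.presence_of_element_located((By.CLASS_NAME, "jobs-search-results"))
)

If the wait still times out, the real account may simply be served different markup (for example a security checkpoint page), which no amount of waiting will fix.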

Trouble using find element in URL 2

I'm very new at this, so apologies if this is a silly question. I am trying to use Selenium to automate a form-filling process. I have managed to automate the login details on URL 1. URL 1 then redirects me to URL 2, where I'm supposed to input data that will then redirect me to URL 3. The problem I'm facing is that the find element command for the second URL does not seem to be working.
The HTML code for the relevant input on URL 2 is as follows:
<input name="ctl00$ContentPlaceHolder1$txtNofNo" type="text" id="ctl00_ContentPlaceHolder1_txtNofNo" class="standardtextbox" />
Here is the code I've tried so far; I've edited it based on suggestions:
wait = WebDriverWait(driver, 20)
time.sleep(10)
newURl = driver.window_handles[0]
driver.switch_to.window(newURl)
driver.find_element_by_xpath("//*[contains(@id,'NofNo')]").send_keys("Selenium")
And I'm now getting the following error message:
Traceback (most recent call last):
    newURl = driver.window_handles[0]
  File "C:\Users\kg3517\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 724, in window_handles
    return self.execute(Command.W3C_GET_WINDOW_HANDLES)['value']
  File "C:\Users\kg3517\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\kg3517\Anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
WebDriverException: chrome not reachable
(Session info: chrome=90.0.4430.212)
I suspect the problem mainly arises because of the change in URL, but I am not sure how to fix it.
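For what it's worth, here is a sketch of the same step using explicit waits instead of the fixed sleep, assuming the redirect to URL 2 happens in the same window and that the input keeps the NofNo fragment in its id as in the HTML above (the url_contains argument is a placeholder):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 20)
# Wait for the redirect to URL 2 to complete, then for the input to appear
wait.until(EC.url_contains("URL2"))  # placeholder: any unique substring of the second URL
nof_no = wait.until(
    EC.presence_of_element_located((By.XPATH, "//*[contains(@id,'NofNo')]"))
)
nof_no.send_keys("Selenium")

That said, "chrome not reachable" usually means the browser window was closed or the driver session ended before this code ran, so it is also worth checking that nothing quits the driver between URL 1 and URL 2.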

Scrapy: drop item from scraper

I would like to drop an item from the scraper itself, instead of adding this scraper's particular dropping logic to the pipeline, because it is a specific case.
Scrapy has the DropItem exception, which is handled nicely by the pipeline, but it produces an error if it is raised from the scraper:
#...
raise DropItem('Item dropped ' + self.id())
Output:
2019-11-13 13:27:27 [scrapy.core.scraper] ERROR: Spider error processing <GET http://domain.tld/> (referer: http://domain.tld/referer)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/usr/local/core/core/spiders/my_spider.py", line 46, in parse get.photos())
scrapy.exceptions.DropItem: Item dropped 35
Is there a more elegant way to handle this situation?
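Since DropItem is really meant to be raised from item pipelines, one option is simply not to yield the item from the spider callback at all. A minimal sketch (build_item and should_drop are illustrative helpers, not part of the question's code):

def parse(self, response):
    item = self.build_item(response)   # hypothetical: builds the item from the response
    if self.should_drop(item):         # hypothetical: the spider-specific drop condition
        self.logger.info('Item dropped %s', self.id())
        return                         # yield nothing, so nothing reaches the pipelines
    yield item

Nothing is reported as an error this way; the drop is recorded only through the explicit log call.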

AttributeError: 'Context' object has no attribute 'browser'

I am currently experimenting with Behavior-Driven Development. I am using behave_django with Selenium. I get the following output:
Creating test database for alias 'default'...
Feature: Open website and print title # features/first_selenium.feature:1
Scenario: Open website # features/first_selenium.feature:2
Given I open seleniumframework website # features/steps/first_selenium.py:2 0.001s
Traceback (most recent call last):
File "/home/vagrant/newproject3/newproject3/venv/local/lib/python2.7/site-packages/behave/model.py", line 1456, in run
match.run(runner.context)
File "/home/vagrant/newproject3/newproject3/venv/local/lib/python2.7/site-packages/behave/model.py", line 1903, in run
self.func(context, *args, **kwargs)
File "features/steps/first_selenium.py", line 4, in step_impl
context.browser.get("http://www.seleniumframework.com")
File "/home/vagrant/newproject3/newproject3/venv/local/lib/python2.7/site-packages/behave/runner.py", line 214, in __getattr__
raise AttributeError(msg)
AttributeError: 'Context' object has no attribute 'browser'
Then I print the title # None
Failing scenarios:
features/first_selenium.feature:2 Open website
0 features passed, 1 failed, 0 skipped
0 scenarios passed, 1 failed, 0 skipped
0 steps passed, 1 failed, 1 skipped, 0 undefined
Took 0m0.001s
Destroying test database for alias 'default'...
Here is the code:
first_selenium.feature
Feature: Open website and print title
  Scenario: Open website
    Given I open seleniumframework website
    Then I print the title
first_selenium.py
from behave import *

@given('I open seleniumframework website')
def step_impl(context):
    context.browser.get("http://www.seleniumframework.com")

@then('I print the title')
def step_impl(context):
    title = context.browser.title
    assert "Selenium" in title
manage.py
#!/home/vagrant/newproject3/newproject3/venv/bin/python
import os
import sys
sys.path.append("/home/vagrant/newproject3/newproject3/site/v2/features")
import dotenv

if __name__ == "__main__":
    path = os.path.realpath(os.path.dirname(__file__))
    dotenv.load_dotenv(os.path.join(path, '.env'))
    from configurations.management import execute_from_command_line
    # from django.core.management import execute_from_command_line
    execute_from_command_line(sys.argv)
I'm not sure what this error means.
I know it is a late answer, but maybe somebody will profit from it:
You need to declare context.browser (in a before_all/before_scenario/before_feature hook definition, or just in the test method definition) before you use it, e.g.:
context.browser = webdriver.Chrome()
Please note that the hooks must be defined in a separate environment.py module.
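For example, a minimal environment.py, assuming Chrome with chromedriver available on the PATH:

# environment.py -- lives in the features/ directory and is picked up automatically by behave
from selenium import webdriver

def before_all(context):
    # create the browser once for the whole run and expose it to the steps
    context.browser = webdriver.Chrome()

def after_all(context):
    context.browser.quit()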
In my case the browser wasn't installed; that can be a cause too. Also ensure the path to geckodriver is exposed if you are working with Firefox.

Python web crawler using urllib

I am trying to extract the data from a webpage/website. Here's my code:
from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re

webpage=urlopen('http://www.xxxxxxxxx.com').read()

patFinderTitle=re.compile('<title>(.*)</title>')
patFinderLink=re.compile('<link rel.*href="(.*)"/>')

findPatTitle=re.findall(patFinderTitle,webpage)
findPatLink=re.findall(patFinderLink,webpage)

listIterator=[]
listIterator[:]=range(2,16)

for i in listIterator:
    print findPatTitle[i]
    print findPatLink[i]
    print "\n"

    articlepage=urlopen(findPatLink[i]).read()

    divbegin=articlepage.find('<div class="">')
    article=articlepage[divbegin:(divbegin+1000)]

    soup=BeautifulSoup(article)
    paralist=soup.findAll('<p>')
    for i in paralist:
        print i
I want to list the title and all the links in the webpage. When I run the script it throws an error:
Traceback (most recent call last):
File "justdialcrawl.py", line 21, in <module>
print findPatTitle[i]
IndexError: list index out of range
I tried searching Google but I could not find answers.
You forgot one minor thing:
webpage=urlopen('http://www.xxxxxxxxx.com').read()
# this -> ^^^^^^^
Your code just generated an urlopen object and assigned it to webpage. To assign the contents of the page, you need .read().
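As a side note, once the page contents are in hand, the regular expressions can be replaced with BeautifulSoup itself. A rough sketch in the same Python 2 style as the question (the URL is a placeholder, as in the original):

from urllib import urlopen
from BeautifulSoup import BeautifulSoup

soup = BeautifulSoup(urlopen('http://www.xxxxxxxxx.com').read())

# The page has a single <title>, so print it directly instead of indexing a findall result
print soup.title.string

# Print the href of every <link> tag instead of matching it with a regex
for link in soup.findAll('link'):
    if link.get('href'):
        print link.get('href')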