How do you run BeautifulSoup - beautifulsoup

I am taking a course on Coursera, Python for Everybody. I am stuck on the 3rd course, week 4.
My problem is that when I try to run BeautifulSoup, I get a traceback. I am uploading 2 pics, one of my code and one of my traceback. I tried downloading it from the web, then tried it from the course folder, and then tried installing it from the command prompt using python -m pip install beautifulsoup. None of these is working. I really need to know how to run BeautifulSoup because I have to submit my code soon. So if you can explain how to run BeautifulSoup, I would appreciate it. (I would also appreciate it if you explained the traceback!)
Thanks!
my code
import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")
print(soup.find_all("span"))
#tags = soup('a')
#for tag in tags:
#    print('TAG:', tag)
#    print('URL:', tag.get('href', None))
#    print('Contents:', tag.contents[0])
#    print('Attrs:', tag.attrs)

You're having a circular imports issue. I think it's because you're importing bs4 but you also have a folder bs4 next to your Python file. Try removing/renaming that folder.
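One way to check this is to ask Python where it would load bs4 from. This is only a minimal sketch using the standard library; the helper name module_origin is made up for illustration.

```python
import importlib.util

def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None

# A path inside site-packages is what a normal install looks like.
# A path next to your own script suggests a local bs4 folder is shadowing it.
# None means Python found no regular module or package with that name.
print(module_origin("bs4"))
```

Run it from the same folder as your script, since that folder is searched first.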

The problem is not with the code snippet but with your installation of Beautiful Soup. Try reinstalling Beautiful Soup with
pip3 install --ignore-installed beautifulsoup4
as suggested here

Related

Can I install chromedriver on python anywhere and not use it headless?

I am trying to use this code, which works on my local machine, on PythonAnywhere, and I want to understand if that is even possible:
from selenium import webdriver
from bs4 import BeautifulSoup
import time
# Initialize webdriver
driver = webdriver.Chrome(executable_path="/Users/matteo/Downloads/chromedriver")
# Navigate to website
driver.get("https://apnews.com/article/prince-harry-book-meghan-royals-4141be64bcd1521d1d5cf0f9b65e20b5")
time.sleep(5)
# Parse page source
soup = BeautifulSoup(driver.page_source, "html.parser")
# Find desired elements using Beautiful Soup
elements = soup.find_all("p")
# Print element text
for element in elements:
    print(element.text)
# Close webdriver
driver.quit()
Do I need to have Chrome installed to make that work, or is Chromium enough? When I run that code on my local machine, a Chrome window opens up. How does that work on PythonAnywhere? Would it crash?
I am wondering if the code I am using only works for someone on a GUI with Chrome installed, or if it can work on PythonAnywhere too.
The short answer is no. ChromeDriver needs Chrome to be installed. You can run it headless to save time, but Chrome must still be installed.
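To see whether a browser binary is available on a given machine before starting the driver, something like the following could help. It is a rough sketch: the binary names in CANDIDATES are assumptions and differ between platforms and install methods, and find_browser is a made-up helper name.

```python
import shutil

# Assumed binary names; adjust them for your platform.
CANDIDATES = ["google-chrome", "chromium", "chromium-browser", "chrome"]

def find_browser(candidates=CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

print(find_browser() or "No Chrome/Chromium binary found on PATH")
```

A missing binary here does not prove the browser is absent (it may live outside PATH), but a hit tells you where it is.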

ModuleNotFoundError: No module named 'pandas.io' for json_normalize

Please read carefully. In my Python script I have the following:
import json
import pandas
from pandas.io.json import json_normalize
and it returns the following error:
from pandas.io.json import json_normalize ModuleNotFoundError: No
module named 'pandas.io'; 'pandas' is not a package
My steps:
I have uninstalled and installed Pandas
I have upgraded pip and pandas
I have installed io (pip install -U pandas.io)
I have installed data_reader and replaced the pandas.io.json part with that: from pandas_datareader import json_normalize
I have tried every solution I saw on Stack Overflow and GitHub and nothing worked. The only one I have not tried is installing Anaconda, but it should work with what I tried before. Do you think there is a Windows setting I must change?
PS: My Python version is 3.7.4
Try:
Go to ...\Lib\site-packages\pytrends on your local disk and open file request.py
Change
from pandas.io.json._normalize import nested_to_record
to
from pandas.io.json.normalize import nested_to_record
I had the same error, and this fixed it for me.
Also change
from pandas.io.json.normalize
to
from pandas.io.json._normalize
The cause of the problem was the fact that the python file had the name pandas. The filename was pandas.py. After renaming it, the code worked normally without errors.
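The "'pandas' is not a package" part of the error is the hint here: Python found a plain file called pandas before the real package. A small sketch for spotting such files in a project folder might look like this; find_shadowing is a hypothetical helper, and the set of names to check is up to you.

```python
from pathlib import Path

def find_shadowing(directory, module_names):
    """List files or folders in `directory` that would shadow the given modules."""
    hits = []
    for entry in Path(directory).iterdir():
        # pandas.py or a pandas/ folder next to a script is found before
        # the installed package when that script is run.
        is_module_file = entry.suffix == ".py" and entry.stem in module_names
        is_package_dir = entry.is_dir() and entry.name in module_names
        if is_module_file or is_package_dir:
            hits.append(entry.name)
    return sorted(hits)

# Check the current folder for files that clash with installed libraries.
print(find_shadowing(".", {"pandas", "bs4"}))
```

An empty list means nothing in that folder clashes with the names you passed in.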
I had the same problem and solved it by uninstalling the extra Python versions installed on my Windows machine. Now I have only one Python, installed by Anaconda, and everything is working perfectly.

Adding beautifulsoup in pycharm

How to install beautifulsoup4 in pycharm:
Click on settings, now look for the project interpreter. This is where you can add beautifulsoup by clicking on the green cross on the right side of the screen.
I did this in PyCharm, but an error occurs during installation. The error message is:
Traceback (most recent call last):
  File "C:/Users/Rishi/PycharmProjects/untitled2/beautifuls.py", line 3, in <module>
    from bs4 import BeautifulSoup
  File "C:\Python34\lib\site-packages\bs4\__init__.py", line 175
    except Exception, e:
                    ^
SyntaxError: invalid syntax
Process finished with exit code 1
python.org/pypi
This is the repository set in the available packages list.
Do you have a version of Python 2 on the same machine as your Python 3.4 interpreter?
If so you may want to check to make sure that you are using the correct version of pip. I would suggest removing beautiful soup and re-adding it manually as a test.
pip3 install BeautifulSoup4
You also need to look for "beautifulsoup4" (for Python 3.x).
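The "except Exception, e:" line in the traceback is Python 2 syntax, which suggests a Python 2 copy of bs4 ended up under Python 3.4. One way to avoid a pip/interpreter mismatch is to call pip through the interpreter you actually run. The sketch below only prints the command rather than running it; pip_install_command is a made-up helper name.

```python
import sys

def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]

# Show which interpreter is in use and the matching install command.
print("Interpreter:", sys.executable)
print("Version:", sys.version.split()[0])
print("Run:", " ".join(pip_install_command("beautifulsoup4")))
```

Running this from inside PyCharm shows which interpreter the project uses, so the printed command installs into that one.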

"No module named urllib2" after installed "urllib2" using pip in Visual Studio 2015

I installed "urllib2" using pip through the Visual Studio 2015 interface, but the debugger reports an error on the line "import urllib2".
I have solved it for now using:
with urllib.request.urlopen(url) as response:
    html_doc = response.read()
gp = BeautifulSoup(html_doc, 'html.parser')
My job is just to parse an html page. Should those two lines be fine anyway? (I'm using BeautifulSoup4 with Python 3.4.3)
As you discovered, urllib2 has been split up into urllib.request and urllib.error in Python 3.
You can change your import to:
from urllib.request import urlopen
Your code looks fine to me.
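For reference, here is a small sketch of the Python 3 imports in use, with the error class that moved to urllib.error. The helper name fetch_bytes is made up, and the data: URL is only there so the example runs without a network connection; in practice you would pass your page's URL.

```python
from urllib.request import urlopen
from urllib.error import URLError

def fetch_bytes(url):
    """Return the raw response body, or None if the request fails."""
    try:
        with urlopen(url) as response:
            return response.read()
    except URLError as exc:
        print("Request failed:", exc)
        return None

# A data: URL keeps this demo offline; swap in a real http(s) URL as needed.
html_doc = fetch_bytes("data:text/html,<p>hello</p>")
print(html_doc)
```

The bytes returned can be passed to BeautifulSoup(html_doc, 'html.parser') exactly as in the question.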

Py2exe doesn't find bs4

In my original code, I have the line:
from bs4 import BeautifulSoup
When I use py2exe, it builds fine but further up in the output it says:
The following modules appear to be missing
['_scproxy', 'bs4']
I specifically put bs4 in the py2exe options:
"includes": ["bs4.BeautifulSoup"]
Is that how I should be referencing BeautifulSoup in the includes statement?
The fella over here didn't know how to do it either: 3rd Party Libraries and Py2exe
Do I need to use packages instead of includes or something? All regular libraries and some other like mechanize import fine, but I can't get BeautifulSoup to work fine. Any advice is appreciated.
EDIT: I solved part of this by uninstalling BeautifulSoup and re-installing it with the --always-unzip option:
easy_install --always-unzip beautifulsoup4
However, it added 9 new missing modules. One being '_scproxy'.
The first thing to check is that you have your setup.py in the same directory as your module and that you are running it from that directory.
If that doesn't work, you should add your module to your path in setup.py:
import sys
module_path = r'path\to\your\BeautifulSoup\module'
if module_path not in sys.path:
    sys.path.append(module_path)