Is there an alternate way to install Scrapy? - scrapy

I couldn't install Scrapy with pip (pip install Scrapy) from my corporate network.
Is there an alternate way to download and install Scrapy?
I appreciate your help.

Download the zip file here.
Unzip it and, in the terminal, change directory to the unzipped folder.
Run python setup.py install to install Scrapy.
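The unzip-and-install steps above can also be scripted. This is a minimal sketch, assuming the source zip has already been downloaded by some other means. The names find_setup_dir and install_from_zip and the default folder scrapy_src are hypothetical, chosen only for illustration. It installs only the package in the archive, so any dependencies that package needs would still have to be available separately.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def find_setup_dir(root: Path) -> Path:
    """Return the shallowest directory under root that contains setup.py."""
    candidates = list(root.rglob("setup.py"))
    if not candidates:
        raise FileNotFoundError(f"no setup.py found under {root}")
    # GitHub zips usually wrap everything in one top-level folder,
    # so pick the setup.py closest to the extraction root.
    return min(candidates, key=lambda p: len(p.parts)).parent


def install_from_zip(zip_path: str, extract_to: str = "scrapy_src") -> None:
    """Unzip a source archive and run 'python setup.py install' inside it."""
    target = Path(extract_to)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    setup_dir = find_setup_dir(target)
    # Use the current interpreter so the package lands in this environment.
    subprocess.check_call([sys.executable, "setup.py", "install"], cwd=setup_dir)
```

Calling install_from_zip("scrapy-master.zip") (a hypothetical file name) would then extract the archive and run the installer in the extracted folder.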

Related

SSL: CERTIFICATE_VERIFY_FAILED error while downloading python -m spacy download en

I installed spaCy in the Anaconda prompt using conda install -c conda-forge spacy. But when I tried to download en_core_web_sm using python -m spacy download en_core_web_sm, I got an SSL: CERTIFICATE_VERIFY_FAILED error.
Over HTTPS, downloading from a remote host can fail with an SSL connection error in some cases, for example when your computer is behind a proxy that does not let you make SSL connections freely. For those cases, package managers such as pip and conda for Python, or apt-get and yum on Linux, provide options to specify a certificate for the connection or to allow untrusted communication with the remote host.
However, downloading a model via spaCy with python -m spacy download does not provide such options. You cannot add SSL certificates or specify a trusted host for the download.
Fortunately, there is a workaround in two separate steps, downloading and installing: download the model with any other client whose SSL behavior you can control (browser, curl, wget...), then install the downloaded model with pip install.
Find the model you need on https://github.com/explosion/spacy-models/releases and download its tar.gz file, for example:
wget https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz
Then install it:
python -m pip install ./en_core_web_sm-2.2.5.tar.gz
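The release URL in the wget example follows the pattern name-version/name-version.tar.gz. A small helper can build it for other models. This is a sketch only: model_archive_url is a hypothetical name, and it assumes the requested name and version actually exist on the releases page.

```python
def model_archive_url(name: str, version: str) -> str:
    """Build the GitHub release URL for a spaCy model tar.gz archive.

    Assumes the release '<name>-<version>' exists on the spacy-models
    releases page; this function does not check that.
    """
    release = f"{name}-{version}"
    return (
        "https://github.com/explosion/spacy-models/releases/download/"
        f"{release}/{release}.tar.gz"
    )


url = model_archive_url("en_core_web_sm", "2.2.5")
print(url)
```

The printed URL can be passed to wget, curl, or a browser, and the downloaded file can then be installed with python -m pip install as shown above.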
Just download the direct version.
python -m spacy download en_core_web_sm-2.2.0 --direct
I had the same error as you, gave this a try, and it worked. For more information here are some additional details from the model page:
https://spacy.io/usage/models
The answer provided by K. Symbol is helpful. As an alternative, the download and installation can be done in one statement with pip. pip accepts "trusted-host" options, and the "install" target can be a URL, so:
pip --trusted-host github.com --trusted-host objects.githubusercontent.com install https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.4.0/en_core_web_md-3.4.0.tar.gz
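If you need to run this kind of command from a script, the argument list can be assembled in Python. The following is a sketch under stated assumptions: pip_install_command is a hypothetical helper name, and the optional cert parameter maps to pip's --cert option for pointing at a CA bundle, which is only useful if your company provides one.

```python
import sys
from typing import List, Optional, Sequence


def pip_install_command(
    target: str,
    trusted_hosts: Sequence[str] = (),
    cert: Optional[str] = None,
) -> List[str]:
    """Return the argument list for 'python -m pip install <target>'."""
    # Run pip through the current interpreter so the right environment is used.
    cmd = [sys.executable, "-m", "pip", "install"]
    for host in trusted_hosts:
        cmd += ["--trusted-host", host]  # skip certificate checks for this host
    if cert:
        cmd += ["--cert", cert]  # path to a CA bundle (assumed to exist)
    cmd.append(target)
    return cmd


cmd = pip_install_command(
    "https://github.com/explosion/spacy-models/releases/download/"
    "en_core_web_md-3.4.0/en_core_web_md-3.4.0.tar.gz",
    trusted_hosts=["github.com", "objects.githubusercontent.com"],
)
print(" ".join(cmd))
```

The list can be handed to subprocess.check_call(cmd) to perform the installation. Prefer the cert route where a corporate CA bundle is available, since --trusted-host turns off verification for those hosts.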
For me, the issue was that I was running the command "python -m spacy download en" from a location other than "C:\WINDOWS\system32". When I ran the command from "C:\WINDOWS\system32" with "Run as Admin", it worked like a charm. It seems that from other locations it cannot load the correct SSL config.
If you cannot download the model because the certificate cannot be verified behind a company proxy, you can first download the file with requests, telling it not to check certificates, and then install it with pip:
import os
import requests

lang = 'en'  # language code; core_news models exist for e.g. 'de', while English models are named en_core_web_*
file = f'{lang}_core_news_sm-3.0.0-py3-none-any.whl'
url = f'https://github.com/explosion/spacy-models/releases/download/{lang}_core_news_sm-3.0.0/{file}'
r = requests.get(url, verify=False)  # verify=False skips the certificate check
r.raise_for_status()  # stop here if the download failed
with open(file, 'wb') as output_file:
    output_file.write(r.content)  # save the wheel locally
# then install it via pip (the ! prefix works in IPython/Jupyter only)
!pip install {file} --user
os.remove(file)  # remove the downloaded file
First, uninstall spaCy and clean the directories. Then install it with the following command:
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org spacy
Use pip3 for Python 3. Then run the following in a terminal:
python -m spacy download en_core_web_sm
Let me know if you still get errors. See also https://spacy.io/usage/models

Installing pandas without pip

Is it possible to install pandas without installing pip, or is there any other way to use pandas without pip?
Thanks in advance.
pip is a package management system used to install and manage software packages written in Python. Many packages can be found in the default source for packages and their dependencies, the Python Package Index (PyPI).
Here is another way:
Download and unzip the current pandapower distribution to your local hard drive.
Open a command prompt (e.g. Start–>cmd on Windows) and navigate to the folder that contains the setup.py file with the command cd
cd %path_to_pandapower%\pandapower-x.x.x\
Install pandapower by running
python setup.py install
You can get pandas installed using the Anaconda distribution, which includes the Anaconda prompt. After you open an anaconda prompt, you can run the following command:
conda install pandas
which will install the latest version of pandas, or:
conda install pandas=0.20.3
to get a specific version of the package. Another way to do it is to install it with Miniconda, which allows you to avoid downloading the Anaconda installer and hundreds of other packages. More information can be found here: https://pandas.pydata.org/pandas-docs/version/0.23.4/install.html

PIP install pandas not working

I am trying to install pandas from a .whl file on a work computer, but I get a "cannot fetch URL" error. I have an up-to-date version of pip installed. How can I get this to work? I'm using Python 3.5. Any help will be appreciated.
I just used the following which was quite simple. First open a console then cd to where you've downloaded your file like some-package.whl and use
pip install some-package.whl
python -m pip install some-package.whl also works if pip is not found in PATH
Note: if pip.exe is not recognized, you may find it in the "Scripts" directory from where python has been installed. If pip is not installed, this page can help:
How do I install pip on Windows?
Note: for clarification
If you copy the *.whl file to your local drive (ex. C:\some-dir\some-file.whl) use the following command line parameters --
pip install C:/some-dir/some-file.whl
Since you are on a work computer, I guess you are behind a proxy. Try this:
sudo pip --proxy=http://username:password@proxyURL:portNumber install yolk
For example:
sudo pip --proxy=http://202.194.64.89:8000 install elasticsearch
where 202.194.64.89:8000 is my proxy.
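Proxy URLs with credentials take the form scheme://user:password@host:port. If the user name or password contains characters such as @ or :, they generally need to be percent-encoded. Here is a small sketch for building the value passed to --proxy. The helper name proxy_url and the host proxy.example.com are hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def proxy_url(
    host: str,
    port: int,
    username: Optional[str] = None,
    password: Optional[str] = None,
    scheme: str = "http",
) -> str:
    """Build a proxy URL, percent-encoding any credentials."""
    if username is None:
        return f"{scheme}://{host}:{port}"
    # safe="" makes quote() encode every reserved character, including @ and :
    credentials = quote(username, safe="")
    if password is not None:
        credentials += ":" + quote(password, safe="")
    return f"{scheme}://{credentials}@{host}:{port}"


print(proxy_url("202.194.64.89", 8000))
print(proxy_url("proxy.example.com", 8080, "user", "p@ss:word"))
```

The resulting string can be used as pip --proxy=<url> install <package>.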

Has anyone installed Scrapy under Canopy distro

I am new to Canopy. I have some data mining projects that I would like to do using Python. I was wondering if anyone was able to install Scrapy under Canopy? Is it easy to install packages outside of the main repository?
Short answer is to try pip install scrapy from the command line (use the Canopy Command Prompt (Windows) or Canopy Terminal (OSX/Linux) found in the Tools menu; this ensures Canopy's User Python is on the PATH).
See this article in the Enthought Knowledge Base about installing external packages into Canopy. It provides information on the steps required, including the use of pip, as well as other considerations you will want to be aware of when installing external packages.
1) sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 627220E7
2) echo 'deb http://archive.scrapy.org/ubuntu scrapy main' | sudo tee /etc/apt/sources.list.d/scrapy.list
3) sudo apt-get update && sudo apt-get install scrapy-0.24
The three steps above will install Scrapy.
I had the same problem, and the link below helped:
http://doc.scrapy.org/en/latest/topics/ubuntu.html#topics-ubuntu

Scrapy add external library

How can we add an external library to Scrapy? I want to add the following library to Scrapy:
https://github.com/scrapinghub/scrapylib
How can I add it?
I'm not entirely sure why pip can't install scrapylib (I updated pip to version 1.4, but the same issue occurred).
A workaround would be to download a zip of scrapylib directly from Github, extract the zip and then run python setup.py install. I was able to install scrapylib and run import scrapylib from the Python interpreter without any errors.
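To confirm an install like this without opening the interpreter by hand, you can check whether the module can be found. This is a minimal sketch using the standard library; is_importable is a hypothetical helper name.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("scrapy", "scrapylib"):
    status = "found" if is_importable(name) else "not found"
    print(f"{name}: {status}")
```

This only tells you that Python can locate the package. An actual import scrapylib, as described above, remains the more thorough check, because it also runs the package's import-time code.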