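I wanted to write a crawler for practice in my spare time at work, but ran into quite a few problems while installing Scrapy, so I am recording them here.
1. Install a Python environment
macOS ships with Python 2.7, so I skipped this step.
2. Install pip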
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py
3. Install Scrapy
pip install scrapy
Error:
Collecting scrapy
Could not fetch URL https://pypi.python.org/simple/scrapy/: There was a problem confirming the ssl certificate: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:661) - skipping
Could not find a version that satisfies the requirement scrapy (from versions: )
No matching distribution found for scrapy
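It looked like the company's development network was blocking the download, so I turned on Proxifier to route the traffic through a proxy server, which got me past this error.
I ran pip install scrapy again and got another error: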
Collecting scrapy
Using cached https://files.pythonhosted.org/packages/db/9c/cb15b2dc6003a805afd21b9b396e0e965800765b51da72fe17cf340b9be2/Scrapy-1.5.0-py2.py3-none-any.whl
Collecting pyOpenSSL (from scrapy)
Using cached https://files.pythonhosted.org/packages/79/db/7c0cfe4aa8341a5fab4638952520d8db6ab85ff84505e12c00ea311c3516/pyOpenSSL-17.5.0-py2.py3-none-any.whl
Collecting queuelib (from scrapy)
Using cached https://files.pythonhosted.org/packages/4c/85/ae64e9145f39dd6d14f8af3fa809a270ef3729f3b90b3c0cf5aa242ab0d4/queuelib-1.5.0-py2.py3-none-any.whl
Collecting cssselect>=0.9 (from scrapy)
Using cached https://files.pythonhosted.org/packages/7b/44/25b7283e50585f0b4156960691d951b05d061abf4a714078393e51929b30/cssselect-1.0.3-py2.py3-none-any.whl
Collecting PyDispatcher>=2.0.5 (from scrapy)
Using cached https://files.pythonhosted.org/packages/cd/37/39aca520918ce1935bea9c356bcbb7ed7e52ad4e31bff9b943dfc8e7115b/PyDispatcher-2.0.5.tar.gz
Collecting Twisted>=13.1.0 (from scrapy)
Using cached https://files.pythonhosted.org/packages/a2/37/298f9547606c45d75aa9792369302cc63aa4bbcf7b5f607560180dd099d2/Twisted-17.9.0.tar.bz2
Complete output from command python setup.py egg_info:
Download error on https://pypi.python.org/simple/incremental/: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:661) -- Some packages may not be found!
Couldn't find index page for 'incremental' (maybe misspelled?)
Download error on https://pypi.python.org/simple/: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:661) -- Some packages may not be found!
No local packages or working download links found for incremental>=16.10.1
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/c_/gmvr6xh546bcm2xtfmpc0q640000gn/T/pip-install-inDShH/Twisted/setup.py", line 21, in <module>
setuptools.setup(**_setup["getSetupArgs"]())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 111, in setup
_setup_distribution = dist = klass(attrs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/setuptools/dist.py", line 315, in __init__
self.fetch_build_eggs(attrs['setup_requires'])
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/setuptools/dist.py", line 361, in fetch_build_eggs
replace_conflicting=True,
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 850, in resolve
dist = best[req.key] = env.best_match(req, ws, installer)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1122, in best_match
return self.obtain(req, installer)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1134, in obtain
return installer(requirement)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/setuptools/dist.py", line 429, in fetch_build_egg
return cmd.easy_install(req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 659, in easy_install
raise DistutilsError(msg)
distutils.errors.DistutilsError: Could not find suitable distribution for Requirement.parse('incremental>=16.10.1')
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/c_/gmvr6xh546bcm2xtfmpc0q640000gn/T/pip-install-inDShH/Twisted/
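The log shows that the problem occurs at the Collecting Twisted step. Seeing Download error on https://pypi.python.org/simple/: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version, I assumed it was an openssl problem, but reinstalling openssl produced exactly the same error.
So I tried installing Twisted on its own. Same error.
After a frantic round of searching I installed a pile of packages whose purpose I did not even understand, and it was still not fixed.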
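One possible reason reinstalling openssl changed nothing (an assumption, not something verified during this install): Python's ssl module uses the OpenSSL library the interpreter was built against, which is not necessarily the copy that was just reinstalled. PyPI stopped accepting TLS 1.0 and 1.1 connections in 2018, so an interpreter linked against an old OpenSSL would keep failing with this alert. A minimal sketch for checking what the interpreter actually uses; the helper name tls_report is made up for illustration:

```python
import ssl
import sys


def tls_report():
    """Collect facts about the TLS library this interpreter is linked against."""
    return {
        "python": sys.version.split()[0],
        # Version string of the OpenSSL build used by the ssl module,
        # which may differ from the `openssl` binary on the PATH.
        "openssl": ssl.OPENSSL_VERSION,
        # HAS_TLSv1_2 exists on Python 3.7+; on older interpreters fall back
        # to checking for the PROTOCOL_TLSv1_2 constant.
        "has_tls_1_2": getattr(ssl, "HAS_TLSv1_2", hasattr(ssl, "PROTOCOL_TLSv1_2")),
    }


if __name__ == "__main__":
    for key, value in sorted(tls_report().items()):
        print("%s: %s" % (key, value))
```

If has_tls_1_2 comes back False, upgrading the Python installation itself is likely a more direct fix than reinstalling openssl.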
[Solution]:
The next morning, with a clearer head, I found this answer on Stack Overflow: https://stackoverflow.com/questions/42129767/pip-install-twisted-error-1
I had the same issue on a Mac OSX 10.11.6 in a new virtualenv with a fresh install of Python3.6.1. In my case, I had old versions of the Twisted dependency incremental installed, which prevented the installation.
pip install --upgrade incremental
pip install Twisted
Note I: I was installing a whole array of packages from a requirements file where the same incremental version was specified. I really wonder why the upgrade of incremental helped and have no clue what actually went wrong. If someone can clarify, that would be great.
Note II: Installing incremental ahead of Twisted seems to be necessary on fresh installs, too [Experienced when working with CentOS7].
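In other words: the problem arises because an old version of incremental, a library Twisted depends on, is already installed locally (the answerer was also on OS X, so the old incremental quite possibly came with the system), and that old version caused the Twisted installation to fail.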
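To see whether an old incremental really is present, the installed version can be compared with the incremental>=16.10.1 requirement from the log. The following is only a rough sketch: the helper names are hypothetical, the comparison is a simplified numeric one rather than full PEP 440 version handling, and the lookup assumes Python 3.8+ (importlib.metadata), so on the Python 2.7 used here pip show incremental would be the manual equivalent.

```python
import re


def version_tuple(version):
    """Turn '16.10.1' into (16, 10, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="16.10.1"):
    """Return True if the installed version satisfies Twisted's minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(name="incremental"):
    """Return the installed version string, or None if it cannot be determined."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        return None  # interpreter older than 3.8: check with `pip show` instead
    try:
        return version(name)
    except PackageNotFoundError:
        return None
```

For example, meets_minimum("16.9.0") is False while meets_minimum("17.5.0") is True; installed_version() returns None when incremental is not installed at all.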
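So I ran the two commands as the answerer suggested: first upgrade incremental, then install Twisted. Twisted installed successfully.
Then I ran pip install scrapy once more, and it succeeded!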
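The order that finally worked can be written down as a small script. This is just a sketch of the same three pip commands: the names STEPS, build_commands and run_all are invented for illustration, and it calls pip through python -m pip so that the packages go to the same interpreter that runs the script.

```python
import subprocess
import sys

# Order matters: incremental has to be in place before Twisted's setup.py
# runs, and Twisted before Scrapy.
STEPS = [
    ["install", "--upgrade", "incremental"],
    ["install", "Twisted"],
    ["install", "scrapy"],
]


def build_commands(python=sys.executable):
    """Return the full pip command lines, in the order they should run."""
    return [[python, "-m", "pip"] + step for step in STEPS]


def run_all(runner=subprocess.check_call):
    """Run each command in turn; check_call stops at the first failure."""
    for command in build_commands():
        runner(command)
```

Calling run_all() performs the real installs; passing a different runner, such as print, only shows the commands without executing them.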