代码:
# -*- coding:utf-8 -*- from urllib import request resp = request.urlopen('http://www.xxx.com') print(resp.read().decode('utf-8'))
报错:
Traceback (most recent call last): File "F:/workspace/python/py3/test_urllib.py", line 7, in <module> print(resp.read().decode('utf-8')) UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd6 in position 201: invalid continuation byte
原因:
确定要抓取的页面的编码,并不是所有网站的编码都是utf-8的,resp.read().decode()应传入与要抓取的网页一致的编码。