Scrapy: response.text cannot be read correctly in a DownloaderMiddleware

August 5, 2023 · 384 views · Source: 听闻不见

Problem

When I use response.text inside a DownloaderMiddleware, I get an error saying that the response is not text, and response.encoding cannot be read either.

Inspecting the request in Chrome shows encoding=gzip.
gzip is a compression format, so I tried decompressing the body.

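from gzip import GzipFile
from io import BytesIO

def dezip(data):
    # Wrap the raw bytes in an in-memory file object so GzipFile can read them
    buf = BytesIO(data)
    # Decompress and return the original bytes
    with GzipFile(fileobj=buf) as f:
        return f.read()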

Note that response.body is a bytes object, which is why BytesIO is used.
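As a minimal sketch of how the dezip idea could be made a little safer, the helper below only decompresses when the body starts with the gzip magic bytes, then decodes the result to text. The name body_to_text and the UTF-8 default are assumptions for illustration, not part of the original post; in a middleware you would pass response.body to it.

```python
import gzip

# Every gzip stream starts with these two bytes
GZIP_MAGIC = b"\x1f\x8b"

def body_to_text(body: bytes, encoding: str = "utf-8") -> str:
    """Return the body as text, gunzipping first if it looks gzip-compressed.

    Hypothetical helper: the encoding default is an assumption and should
    match the page being scraped.
    """
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return body.decode(encoding)

# Round trip with made-up data: compressed and plain bodies both work
raw = "<html>你好</html>".encode("utf-8")
print(body_to_text(gzip.compress(raw)))
print(body_to_text(raw))
```

Checking the magic bytes first means the same helper can be called on responses that were never compressed, or that were already decompressed elsewhere, without raising an error.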

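When I call get from the requests library directly, this problem does not occur: response.text returns correctly decoded data right away. Why the body has to be decompressed manually inside a DownloaderMiddleware is still unclear to me.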
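One possible explanation, which the original post does not confirm: requests decompresses gzip bodies transparently, while in Scrapy that job belongs to the built-in HttpCompressionMiddleware, registered at order 590 in DOWNLOADER_MIDDLEWARES_BASE. Scrapy calls process_response in decreasing order, so a custom middleware with an order above 590 would see the body while it is still compressed. The settings sketch below assumes that is the cause; the middleware path is hypothetical, and HTTP_COMPRESSION_ORDER is only a local constant for readability, not a setting Scrapy reads.

```python
# settings.py (sketch; "myproject.middlewares.MyDownloaderMiddleware" is a hypothetical path)

# Default order of scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware.
# Local constant for illustration only; Scrapy does not read it.
HTTP_COMPRESSION_ORDER = 590

DOWNLOADER_MIDDLEWARES = {
    # An order below 590 means this middleware's process_response runs
    # after HttpCompressionMiddleware has already decompressed the body.
    "myproject.middlewares.MyDownloaderMiddleware": 543,
}
```

If the middleware has to stay above 590 for other reasons, or HttpCompressionMiddleware is disabled, decompressing manually as shown earlier remains a workable fallback.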

    Original author: 听闻不见
    Original URL: https://www.jianshu.com/p/8ede883ace9a