Completing the URL of a Request in Scrapy

June 27, 2023 · Source: Tim_Lee

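If the URL extracted from a page is only a fragment (a relative URL), it has to be joined with the URL of the current page before it can be requested.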

In Python 3:

from urllib import parse

In Python 2:

import urlparse
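The two imports above can be combined so the same code runs on either interpreter. The following is a minimal sketch of how urljoin treats different kinds of fragments. The page URL and the paths "../3/" and "/110287/" are made-up values used only for illustration.

```python
# Import urljoin in a way that works on both Python 3 and Python 2.
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin      # Python 2

# Hypothetical page URL, standing in for response.url.
base_url = "http://blog.jobbole.com/all-posts/page/2/"

# A path relative to the current page is resolved against the base path.
relative = urljoin(base_url, "../3/")
# A root-relative path keeps only the scheme and host of the base.
root_relative = urljoin(base_url, "/110287/")
# An already absolute URL is returned unchanged.
absolute = urljoin(base_url, "http://blog.jobbole.com/110287/")

print(relative)
print(root_relative)
print(absolute)
```

Because an absolute URL passes through untouched, calling urljoin on every extracted href is safe even when a page mixes relative and absolute links.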

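The base for the join is response.url, the URL of the page currently being parsed. post_url is the fragment extracted from that page: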

Request(url=parse.urljoin(response.url, post_url), callback=self.parse_detail)

That line only constructs the Request object. How is it handed to Scrapy for downloading? Use yield.

yield Request(url=parse.urljoin(response.url, post_url), callback=self.parse_detail)
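To see why yield matters, here is a rough sketch of the join-then-yield flow that runs without Scrapy installed. FakeRequest is a hypothetical stand-in for Scrapy's Request, the final loop only imitates the engine consuming the callback's generator, and the page URL and hrefs are invented for the example. It is not how Scrapy is implemented internally.

```python
from collections import namedtuple
from urllib.parse import urljoin

# Stand-in for scrapy.Request so the sketch runs without Scrapy installed.
FakeRequest = namedtuple("FakeRequest", ["url", "callback"])


def parse_detail(response_url):
    """Placeholder callback; a real one would extract fields from the page."""
    return response_url


def parse(response_url, post_urls):
    """Mimics a spider callback: join each fragment, then yield a request."""
    for post_url in post_urls:
        # Building the object alone schedules nothing; yielding hands it
        # to whoever is iterating over this generator.
        yield FakeRequest(url=urljoin(response_url, post_url),
                          callback=parse_detail)


# Hypothetical inputs: the current page URL and two extracted hrefs.
page_url = "http://blog.jobbole.com/all-posts/"
hrefs = ["/110287/", "page/2/"]

# This loop plays the role of the engine, which consumes the generator
# and would schedule each yielded request for download.
scheduled = [request.url for request in parse(page_url, hrefs)]
print(scheduled)
```

In a real spider the callback receives a response object rather than a URL string, and the yielded objects are genuine Request instances.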

When two classes are needed to locate a single node, for example:

<a class="next page-numbers" href="http://blog.jobbole.com/all-posts/page/3/">下一页 »</a>

In this case, write .next and .page-numbers together, with no space between them.

next_url = response.css(".next.page-numbers::attr(href)").extract_first()
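A compound selector such as .next.page-numbers matches an element whose class attribute contains both names. With a space between them it would instead mean a .page-numbers element nested inside a .next element. The sketch below imitates that matching rule with the standard library only. TwoClassLinkFinder is a hypothetical helper written for this illustration, not part of Scrapy, and the first link in the sample HTML is invented.

```python
from html.parser import HTMLParser


class TwoClassLinkFinder(HTMLParser):
    """Collects href values of <a> tags that carry every required class."""

    def __init__(self, required_classes):
        super().__init__()
        self.required = set(required_classes)
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The class attribute is a whitespace-separated list of names.
        classes = set((attr_map.get("class") or "").split())
        # ".next.page-numbers" means: both names present on one element.
        if self.required <= classes and attr_map.get("href"):
            self.hrefs.append(attr_map["href"])


html = (
    '<a class="page-numbers" '
    'href="http://blog.jobbole.com/all-posts/page/2/">2</a>'
    '<a class="next page-numbers" '
    'href="http://blog.jobbole.com/all-posts/page/3/">Next</a>'
)

finder = TwoClassLinkFinder(["next", "page-numbers"])
finder.feed(html)
# Take the first match, or None when nothing matched.
next_url = finder.hrefs[0] if finder.hrefs else None
print(next_url)
```

The first link is skipped because it has only the page-numbers class, which is the reason for requiring both names when picking the "next page" link.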

    Original author: Tim_Lee
    Original URL: https://www.jianshu.com/p/f30973c016e4