I. Manually updating the IP pool
1. Add an IP pool to the settings.py configuration file:
IPPOOL = [
    {"ipaddr": "61.129.70.131:8080"},
    {"ipaddr": "61.152.81.193:9100"},
    {"ipaddr": "120.204.85.29:3128"},
    {"ipaddr": "219.228.126.86:8123"},
    {"ipaddr": "218.82.33.225:53853"},
    {"ipaddr": "223.167.190.17:42789"},
]
2. Modify the middleware file middlewares.py
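import random

from myproxies.settings import IPPOOL


class MyproxiesSpiderMiddleware(object):
    # Despite its name, this class is registered as a downloader middleware.
    def __init__(self, ip=''):
        self.ip = ip

    def process_request(self, request, spider):
        # Pick a random proxy from the pool for every outgoing request.
        thisip = random.choice(IPPOOL)
        print("this is ip:" + thisip["ipaddr"])
        # HttpProxyMiddleware reads the proxy address from request.meta["proxy"].
        request.meta["proxy"] = "http://" + thisip["ipaddr"]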
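The selection logic in process_request can be tried out without running a crawl. The following is only a minimal sketch: FakeRequest and assign_random_proxy are hypothetical names introduced for illustration (they are not part of Scrapy), and the pool is a shortened copy of IPPOOL.

```python
import random

# Shortened copy of IPPOOL from settings.py, used only for this sketch.
IPPOOL = [
    {"ipaddr": "61.129.70.131:8080"},
    {"ipaddr": "61.152.81.193:9100"},
    {"ipaddr": "120.204.85.29:3128"},
]


class FakeRequest:
    """Hypothetical stand-in for a Scrapy request; it only carries a meta dict."""

    def __init__(self, url):
        self.url = url
        self.meta = {}


def assign_random_proxy(request, pool, rng=random):
    """Pick one pool entry at random and store it in request.meta['proxy']."""
    entry = rng.choice(pool)
    request.meta["proxy"] = "http://" + entry["ipaddr"]
    return request.meta["proxy"]


request = FakeRequest("http://example.com")
proxy = assign_random_proxy(request, IPPOOL, random.Random(0))
print("this is ip:" + proxy)
```

Passing a seeded random.Random makes the choice repeatable while testing; in the real middleware the module-level random.choice is enough.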
3. Set DOWNLOADER_MIDDLEWARES in settings.py
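DOWNLOADER_MIDDLEWARES = {
    # 'myproxies.middlewares.MyCustomDownloaderMiddleware': 543,
    # The old scrapy.contrib.downloadermiddleware path was deprecated in Scrapy 1.0.
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 543,
    # The lower number makes the custom middleware run first, so it sets the proxy in time.
    'myproxies.middlewares.MyproxiesSpiderMiddleware': 125,
}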
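The numbers in DOWNLOADER_MIDDLEWARES decide the order in which process_request is called: lower values run first. As a rough illustration (plain Python, not Scrapy's own loading code), sorting the dictionary by value shows that the custom middleware at 125 runs before HttpProxyMiddleware at 543. The HttpProxyMiddleware path below is the one used by current Scrapy releases.

```python
# Same priorities as in settings.py; the sort only mimics the ordering rule.
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 543,
    'myproxies.middlewares.MyproxiesSpiderMiddleware': 125,
}

# Lower priority values have their process_request called earlier.
request_order = sorted(DOWNLOADER_MIDDLEWARES, key=DOWNLOADER_MIDDLEWARES.get)

for position, path in enumerate(request_order, start=1):
    print(position, path, DOWNLOADER_MIDDLEWARES[path])
```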