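Publish a Scrapy project as a service that exposes an HTTP interface, so that other projects can call it. Install scrapyd and its client first: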
pip install scrapyd
pip install scrapyd-client
- Edit scrapy.cfg:
[settings]
default = compass.settings
[deploy]
url = http://localhost:6800/
project = compass
- Start the service (scrapyd must be running before you deploy to it):
scrapyd
- Deploy the project:
scrapyd-deploy
- Call the service:
curl http://localhost:6800/schedule.json -d project=your_project_name -d spider=your_spider_name
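Since the goal is to let other projects trigger the crawl, the same call can be made from Python instead of curl. The following is a minimal sketch using only the standard library; `build_schedule_body` and `schedule_spider` are hypothetical helper names, and the address is assumed to be the same `http://localhost:6800` used above.

```python
import json
import urllib.parse
import urllib.request

SCRAPYD_URL = "http://localhost:6800"  # assumed: same address as the curl example


def build_schedule_body(project, spider, settings=None, **spider_args):
    """Encode the form fields sent to schedule.json (hypothetical helper)."""
    fields = [("project", project), ("spider", spider)]
    # Each Scrapy setting override goes in its own "setting=NAME=VALUE" field.
    for name, value in (settings or {}).items():
        fields.append(("setting", f"{name}={value}"))
    # Any remaining keyword is passed through as a spider argument.
    fields.extend(spider_args.items())
    return urllib.parse.urlencode(fields).encode("utf-8")


def schedule_spider(project, spider, settings=None, **spider_args):
    """POST the form to schedule.json and return the decoded JSON reply."""
    body = build_schedule_body(project, spider, settings, **spider_args)
    request = urllib.request.Request(f"{SCRAPYD_URL}/schedule.json", data=body)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With scrapyd running, `schedule_spider("your_project_name", "your_spider_name")` should return the server's JSON reply as a dict; check its "status" field before relying on anything else in it.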
- If you are not sure of the project or spider name, look them up with the commands below:
- List projects:
curl http://localhost:6800/listprojects.json
- List spiders:
curl "http://localhost:6800/listspiders.json?project=compass"
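The two lookups above can also be done from a calling project. This sketch assumes the replies are JSON objects with a "status" field plus a "projects" or "spiders" list; the helper names (`extract_names`, `fetch_json`, `list_projects`, `list_spiders`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

SCRAPYD_URL = "http://localhost:6800"  # assumed: same address as the curl examples


def extract_names(payload, key):
    """Return the list stored under `key`, or raise if scrapyd reported an error."""
    if payload.get("status") != "ok":
        raise RuntimeError(f"scrapyd returned an error: {payload}")
    return list(payload.get(key, []))


def fetch_json(endpoint, **params):
    """GET an endpoint with optional query parameters and decode the JSON reply."""
    query = urllib.parse.urlencode(params)
    url = f"{SCRAPYD_URL}/{endpoint}" + (f"?{query}" if query else "")
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def list_projects():
    return extract_names(fetch_json("listprojects.json"), "projects")


def list_spiders(project):
    return extract_names(fetch_json("listspiders.json", project=project), "spiders")
```

Keeping the parsing in `extract_names` separate from the network call makes it possible to check the error handling without a running server.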
- Other commands:
- Schedule a spider:
curl http://localhost:6800/schedule.json -d project=your_project_name -d spider=your_spider_name
- Schedule a spider with a Scrapy setting override and a spider argument:
curl http://localhost:6800/schedule.json -d project=your_project_name -d spider=your_spider_name -d setting=DOWNLOAD_DELAY=2 -d arg1=val1
- Cancel a running job:
curl http://localhost:6800/cancel.json -d project=your_project_name -d job=2bffadcb3218k9abbd23ccf016aa82f02
- List versions:
curl "http://localhost:6800/listversions.json?project=your_project_name"
- List jobs:
curl "http://localhost:6800/listjobs.json?project=your_project_name"
- Delete a version:
curl http://localhost:6800/delversion.json -d project=your_project_name -d version=15419782769
- Delete a project:
curl http://localhost:6800/delproject.json -d project=your_project_name
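A caller that schedules a job usually wants to know whether it has finished. One way is to look the job up in the listjobs.json reply. The sketch below assumes that reply groups jobs under "pending", "running", and "finished" keys, each entry carrying an "id" field; `find_job_state` and `fetch_job_state` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

SCRAPYD_URL = "http://localhost:6800"  # assumed: same address as the curl examples


def find_job_state(listjobs_payload, job_id):
    """Return "pending", "running", or "finished" for job_id, or None if not listed."""
    for state in ("pending", "running", "finished"):
        for job in listjobs_payload.get(state, []):
            if job.get("id") == job_id:
                return state
    return None


def fetch_job_state(project, job_id):
    """Query listjobs.json for the project and report the state of one job."""
    query = urllib.parse.urlencode({"project": project})
    url = f"{SCRAPYD_URL}/listjobs.json?{query}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return find_job_state(json.load(response), job_id)
```

The job id passed in is the same value that cancel.json takes in its job field.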