Python Web Scraping: A Simple Splash + requests Example

Reader submission, 2022-05-30

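Notes:

In these examples, render.html is called with a GET request (arguments in the query string).

In these examples, execute is called with a POST request (arguments in a JSON body).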

render

import requests


def splash_render(url):
    """Fetch the JavaScript-rendered HTML of url through Splash's render.html."""
    splash_url = "http://localhost:8050/render.html"
    args = {
        "url": url,
        "timeout": 5,
        "images": 0,  # skip image downloads
        "proxy": "http://222.95.21.28:8888",
    }
    response = requests.get(splash_url, params=args)
    return response.text


if __name__ == '__main__':
    url = "http://quotes.toscrape.com/js/"
    html = splash_render(url)

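Description of the args parameters:

url: address of the page to render

timeout: timeout for the render, in seconds

proxy: proxy to use

wait: time to wait, in seconds, after the page loads before rendering

images: whether to download images; defaults to 1 (download)

js_source: JavaScript code to run in the page before it is rendered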
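To see how these arguments reach Splash, the sketch below builds the query string for render.html using only the standard library, then decodes it again. The helper name build_render_url and the sample js_source snippet are illustrative assumptions, not part of the original example; the endpoint and argument names are the ones listed above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SPLASH_RENDER = "http://localhost:8050/render.html"  # same endpoint as the example above


def build_render_url(url, wait=0.5, timeout=5, images=0, js_source=None, proxy=None):
    """Return the full GET URL for render.html (hypothetical helper)."""
    args = {"url": url, "wait": wait, "timeout": timeout, "images": images}
    # Optional arguments are only sent when they are set.
    if js_source is not None:
        args["js_source"] = js_source
    if proxy is not None:
        args["proxy"] = proxy
    return SPLASH_RENDER + "?" + urlencode(args)


render_url = build_render_url(
    "http://quotes.toscrape.com/js/",
    wait=1,
    js_source="document.title = 'rendered';",
)
print(render_url)

# Decode the query string again to show what Splash would receive.
received = parse_qs(urlsplit(render_url).query)
print(received["js_source"])
```

Calling requests.get(splash_url, params=args) performs the same encoding internally, so this helper is mainly useful for logging or debugging the exact request.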

execute

import requests


def splash_execute(url):
    """Run a Lua script on Splash's execute endpoint and return the rendered HTML."""
    splash_url = "http://localhost:8050/execute"
    script = """
    function main(splash)
        local url = "{url}"
        splash:set_user_agent("Mozilla/5.0 Chrome/69.0.3497.100 Safari/537.36")
        splash:go(url)
        splash:wait(2)
        return {
            html = splash:html()
        }
    end
    """
    # Substitute the target URL into the Lua source
    script = script.replace("{url}", url)
    data = {
        "timeout": 5,
        "lua_source": script,
    }
    response = requests.post(splash_url, json=data)
    return response.json().get("html")


if __name__ == '__main__':
    url = "http://quotes.toscrape.com/js/"
    html = splash_execute(url)


Description of the parameters:

timeout: timeout, in seconds

lua_source: the Lua script to run

proxy: proxy to use
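The execute payload can carry more than these keys: extra entries are exposed to the script as splash.args (the login script later in this article reads splash.args.url this way). The sketch below builds such a payload so the target URL does not have to be spliced into the Lua source. The helper name build_execute_payload and the wait key are illustrative assumptions.

```python
import json

LUA_SCRIPT = """
function main(splash, args)
    splash:go(args.url)
    splash:wait(args.wait)
    return { html = splash:html() }
end
"""


def build_execute_payload(url, wait=2, timeout=5, proxy=None):
    """Return the JSON body for a POST to /execute (hypothetical helper)."""
    payload = {
        "lua_source": LUA_SCRIPT,
        "timeout": timeout,
        "url": url,    # read in Lua as args.url
        "wait": wait,  # read in Lua as args.wait
    }
    # The proxy is only sent when one is configured.
    if proxy is not None:
        payload["proxy"] = proxy
    return payload


payload = build_execute_payload("http://quotes.toscrape.com/js/")
body = json.dumps(payload)
print(body[:60])
```

The payload can then be sent with requests.post("http://localhost:8050/execute", json=payload). Keeping the URL out of the script text avoids quoting problems when the URL contains characters that are special in a Lua string.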

Simulating a login

The following is a Lua script.

Splash's select method takes a CSS selector and is used much like jQuery's.

function main(splash, args)
    -- jQuery is relatively slow to load
    splash:autoload("https://code.jquery.com/jquery-3.3.1.min.js")
    splash:set_viewport_size(1366, 768)
    splash:set_user_agent("Mozilla/5.0 Chrome/69.0.3497.100 Safari/537.36")

    -- From the home page, click the login button
    splash:go(splash.args.url)
    splash:wait(3)
    splash:runjs("$('#login').click()")
    splash:wait(2)

    -- On the login page, enter the username and password, then submit
    splash:select("#username"):send_text("username")
    splash:select("#password"):send_text("password")
    splash:wait(5)

    -- Splash's built-in mouse click can also be used, with a click position
    local button = splash:select("#button")
    local bounds = button:bounds()
    button:mouse_click{x=bounds.width/3, y=bounds.height/3}
    splash:wait(2)

    -- Return the results
    return {
        html = splash:html(),
        png = splash:png(),
        cookie = splash:get_cookies()
    }
end
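Sent through the execute endpoint, a script like this comes back as JSON. A sketch of unpacking that result follows, assuming Splash base64-encodes binary values such as the PNG when they are returned inside a table, and that each cookie entry carries name and value keys (HAR-style). The function name unpack_login_result and the sample data are made up for illustration.

```python
import base64


def unpack_login_result(result):
    """Split the decoded JSON from /execute into html, PNG bytes and a cookie dict."""
    png_bytes = base64.b64decode(result["png"])
    # Collapse the cookie list into a plain name -> value mapping,
    # a shape that requests accepts for its cookies= argument.
    cookies = {c["name"]: c["value"] for c in result.get("cookie", [])}
    return result["html"], png_bytes, cookies


# Fabricated sample standing in for response.json(); not real Splash output.
sample = {
    "html": "<html><body>Welcome</body></html>",
    "png": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
    "cookie": [{"name": "sessionid", "value": "abc123", "path": "/"}],
}
html, png_bytes, cookies = unpack_login_result(sample)
print(cookies)
```

The resulting cookies mapping could be passed to a later requests.get(url, cookies=cookies) call so that follow-up requests reuse the logged-in session without going through Splash again.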

Reference:

Splash documentation: https://splash.readthedocs.io/en/stable/scripting-ref.html
