Playwright: Difference between revisions

{{了解更多
|[https://playwright.dev/python/docs/intro Playwright Python documentation: Installation]
|[https://playwright.dev/python/docs/library Playwright Python documentation: Getting started]
}}
==Quick start==
=== Synchronous mode ===
<syntaxhighlight lang="python" >
from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
# Use playwright.chromium, playwright.firefox, or playwright.webkit
# Browsers launch headless by default; pass headless=False to show the window
browser = playwright.firefox.launch(headless=False)
page = browser.new_page()
page.goto("https://www.baidu.com")
page.screenshot(path="screenshot.png")
browser.close()
playwright.stop()
</syntaxhighlight>
The <code>with</code> statement is the more common form:
<syntaxhighlight lang="python" >
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.firefox.launch(headless=False)
    page = browser.new_page()
    page.goto("https://www.baidu.com/")
    # Type text into the search box
    # page.locator('//input[@id="kw"]').fill('playwright')
    page.fill('//input[@id="kw"]', 'playwright')
    # Click the search button
    # page.locator('//input[@id="su"]').click()
    page.click('//input[@id="su"]')
    # Pause for 5 seconds (the argument is in milliseconds)
    page.wait_for_timeout(5*1000)
    page.screenshot(path="screenshot.png")
    browser.close()
</syntaxhighlight>
Running this code in [[Jupyter]] raises the error <code>Error: It looks like you are using Playwright Sync API inside the asyncio loop. Please use the Async API instead.</code> Workaround: save the code to <code>test.py</code> and run <code>python test.py</code> in a terminal.
{{了解更多
|[https://playwright.dev/python/docs/library Playwright Python documentation: Getting started]
}}
=== Asynchronous mode ===
Using the <code>async with</code> statement:
<syntaxhighlight lang="python" >
import asyncio
from playwright.async_api import async_playwright
async def main():
    async with async_playwright() as p:
        browser = await p.firefox.launch(headless=False)
        page = await browser.new_page()
        await page.goto("https://www.baidu.com")
        print(await page.title())
        await browser.close()
asyncio.run(main())
</syntaxhighlight>
{{了解更多
|[https://playwright.dev/python/docs/library Playwright Python documentation: Getting started]
}}
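One reason to choose the asynchronous mode is that several pages can be driven concurrently. The following is a sketch under stated assumptions rather than a definitive pattern: the helper names <code>fetch_title</code> and <code>fetch_titles</code> are hypothetical, and <code>browser</code> is assumed to be a browser object already launched through <code>async_playwright()</code>.

```python
import asyncio


async def fetch_title(browser, url):
    """Open a new page, load url, and return the page title."""
    page = await browser.new_page()
    try:
        await page.goto(url)
        return await page.title()
    finally:
        # Close the page even if navigation fails.
        await page.close()


async def fetch_titles(browser, urls):
    """Load all urls concurrently; results keep the order of urls."""
    return await asyncio.gather(*(fetch_title(browser, url) for url in urls))
```

Inside <code>main()</code> from the example above, this could be called as <code>titles = await fetch_titles(browser, ["https://www.baidu.com", "https://example.com"])</code>.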
==Browsers==
===Installation and usage===
{| class="wikitable"
! Name
! Description
|-
| chromium
| Run <code>playwright install chromium</code> to install the browser and its driver.
<syntaxhighlight lang="python" >
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://www.baidu.com/")
    page.wait_for_timeout(5*1000)  # wait 5 seconds
    page.screenshot(path="screenshot.png")
    browser.close()
</syntaxhighlight>
|-
| chrome
| Run <code>playwright install chrome</code> to install the browser and its driver automatically, or install the browser and driver yourself.
<syntaxhighlight lang="python" >
browser = p.chromium.launch(
    channel="chrome",
    headless=False,
    slow_mo=10,
    # Turn off the Blink automation flag to reduce bot detection
    args=['--disable-blink-features=AutomationControlled']
)
</syntaxhighlight>
|-
| firefox
| Run <code>playwright install firefox</code> to install the browser and its driver automatically.
<syntaxhighlight lang="python" >
browser = p.firefox.launch(headless=False)
</syntaxhighlight>
|-
|
|
|}
==Pages==
{| class="wikitable"
! Name
! Description
! Example
|-
| goto()
|
|
|-
|
|
|
|-
| content()
| Returns the full HTML source of the page
| <syntaxhighlight lang="python" >
with open('test.txt', 'w', encoding='utf-8' ) as f:
    f.write(page.content())
</syntaxhighlight>
|-
|
|
|
|}
==Elements==
===Locating===
{| class="wikitable"
! Name
! Description
! Example
|-
|
|
|
|-
|
|
|
|-
|
|
|
|}
{{了解更多
|[https://playwright.dev/python/docs/locators Playwright Python documentation: Locators]
}}
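As a hedged illustration of locators, the sketch below reuses the two elements from the quick-start example: the search box with id <code>kw</code> (addressed here with the CSS selector <code>#kw</code>) and the search button (addressed with the XPath <code>//input[@id="su"]</code>). The function name <code>search_baidu</code> is hypothetical, and <code>page</code> is assumed to be a Playwright page that has already navigated to https://www.baidu.com/.

```python
def search_baidu(page, keyword):
    """Type keyword into the search box and click the search button."""
    # CSS selector: the element whose id is "kw"
    page.locator("#kw").fill(keyword)
    # XPath selector: Playwright treats selectors starting with // as XPath
    page.locator('//input[@id="su"]').click()
```

A locator is resolved at the moment an action such as <code>fill()</code> or <code>click()</code> runs, so the same locator object can be reused after the page changes.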
===Attributes===
{| class="wikitable"
! Name
! Description
! Example
|-
|
|
|
|-
|
|
|
|-
|
|
|
|}
{{了解更多
|[https://playwright.dev/python/docs/api/class-locator Playwright Python API: Locator class]
}}
== Network ==
===Listening to requests and responses===
<code>page.on("request", handler)</code> and <code>page.on("response", handler)</code> register handlers that are called for every request and every response on the page.
{{了解更多
|[https://playwright.dev/python/docs/network#network-events Playwright Python documentation: Network - Network events]
|[https://playwright.dev/python/docs/api/class-page#events Playwright Python API: Page class - events]
|[https://playwright.dev/python/docs/api/class-response Playwright Python API: Response class]
|[https://playwright.dev/python/docs/api/class-request Playwright Python API: Request class]
}}
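The handlers receive the request or response object as their only argument. A minimal sketch of a pair of handlers that record traffic instead of printing it; the class <code>NetworkLog</code> and the function <code>attach</code> are hypothetical names, while <code>request.method</code>, <code>request.url</code>, <code>response.status</code>, and <code>response.url</code> are attributes of Playwright's Request and Response classes.

```python
class NetworkLog:
    """Collects one line per request and per response seen by a page."""

    def __init__(self):
        self.lines = []

    def on_request(self, request):
        self.lines.append(f">> {request.method} {request.url}")

    def on_response(self, response):
        self.lines.append(f"<< {response.status} {response.url}")


def attach(page, log):
    """Register both handlers; call this before page.goto()."""
    page.on("request", log.on_request)
    page.on("response", log.on_response)
```

After <code>attach(page, log)</code> and a <code>page.goto(...)</code>, <code>log.lines</code> holds the traffic in the order the events fired.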
===Handling requests===
<code>page.route()</code> or <code>browser_context.route()</code> can modify or abort requests.
{{了解更多
|[https://playwright.dev/python/docs/network#handle-requests Playwright Python documentation: Network - Handle requests]
|[https://playwright.dev/python/docs/api/class-page#page-route Playwright Python API: Page class - route]
}}
==== Modifying requests ====
<syntaxhighlight lang="python" >
# Modify the headers: drop the "x-secret" key
def handle_route(route):
    headers = route.request.headers
    # pop() with a default avoids a KeyError on requests that lack the header
    headers.pop("x-secret", None)
    route.continue_(headers=headers)
page.route("**/*", handle_route)
# Separate example: continue every request as a POST
page.route("**/*", lambda route: route.continue_(method="POST"))
</syntaxhighlight>
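==== Aborting requests ====
<code>page.route()</code> together with <code>route.abort()</code> aborts matching requests, for example to skip images and other requests that are not needed.
<syntaxhighlight lang="python" >
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    # Abort every request whose URL ends in .png, .jpg, or .jpeg
    page.route("**/*.{png,jpg,jpeg}", lambda route: route.abort())
    page.goto("https://example.com")
    browser.close()
</syntaxhighlight>
Matching with a regular expression:
<syntaxhighlight lang="python" >
import re
# browser is assumed to be launched as in the previous example
page = browser.new_page()
page.route(re.compile(r"(\.png$)|(\.jpg$)"), lambda route: route.abort())
page.goto("https://example.com")
browser.close()
</syntaxhighlight>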
The following code aborts requests based on their resource type and URL:
<syntaxhighlight lang="python" >
from playwright.sync_api import sync_playwright
ABORT_TYPES = ['image', 'font', 'media']
ABORT_URL_NAME = ['bdstatic.com', '/static/superman', '.js']
def handle_route(route):
    if route.request.resource_type in ABORT_TYPES:
        return route.abort()
    elif any(name in route.request.url for name in ABORT_URL_NAME):
        return route.abort()       
    else:
        route.continue_()
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.route("**/*", handle_route)
    page.goto("https://www.baidu.com")
    page.wait_for_timeout(10*1000)   
    browser.close()
</syntaxhighlight>
{{了解更多
|[https://playwright.dev/python/docs/network#abort-requests Playwright Python documentation: Network - Abort requests]
|[https://playwright.dev/python/docs/api/class-page#page-route Playwright Python API: Page class - route]
}}
===Handling responses===
To modify a response, first fetch the original response with <code>APIRequestContext</code>, then pass it to <code>route.fulfill()</code>.
{{了解更多
|[https://playwright.dev/python/docs/network#modify-responses Playwright Python documentation: Network - Modify responses]
}}
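The two steps can be sketched as follows. This is an illustration under assumptions, not the only way to do it: it assumes a Playwright version that provides <code>route.fetch()</code> as a shortcut for fetching the original response, and the helper <code>add_title_prefix</code>, the prefix text, and the URL pattern in the usage note are hypothetical.

```python
def add_title_prefix(html: str, prefix: str) -> str:
    """Insert prefix right after the first <title> tag."""
    return html.replace("<title>", "<title>" + prefix, 1)


def handle_route(route):
    # Step 1: perform the real request and get the original response.
    response = route.fetch()
    # Step 2: change the body, then answer the page with the modified
    # response; status and other headers are taken from the original.
    body = add_title_prefix(response.text(), "My prefix: ")
    route.fulfill(
        response=response,
        body=body,
        headers={**response.headers, "content-type": "text/html"},
    )
```

The handler would be registered with something like <code>page.route("**/title.html", handle_route)</code> before calling <code>page.goto()</code>.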
==Debugging tools==
===Inspector===
In debug mode, running the code opens the Playwright Inspector. To enable debug mode, set the environment variable <code>PWDEBUG=1</code>.
<syntaxhighlight lang="bash" >
# Bash
PWDEBUG=1 pytest -s
# PowerShell
$env:PWDEBUG=1
pytest -s
# Batch
set PWDEBUG=1
pytest -s
</syntaxhighlight>
{{了解更多
|[https://playwright.dev/python/docs/debug#playwright-inspector Playwright Python documentation: Debugging - Playwright Inspector]
}}
===Trace Viewer===
{{了解更多
|[https://playwright.dev/python/docs/debug#trace-viewer Playwright Python documentation: Debugging - Trace Viewer]
}}
==Code generator==
The <code>playwright codegen</code> command starts the code generator. It opens two windows: a browser and the Playwright Inspector. Actions performed in the browser are turned into code in the Inspector window in real time. Run <code>playwright codegen -h</code> for help.
<syntaxhighlight lang="bash" >
# Open www.baidu.com in Firefox
playwright codegen www.baidu.com -b firefox
# Save the generated code to test.py
playwright codegen -o test.py -b firefox
</syntaxhighlight>
{{了解更多
|[https://playwright.dev/python/docs/codegen Playwright Python documentation: Test generator]
}}
==Detection and anti-detection==
===Anti-detection===
{| class="wikitable"
! Name
! Description
|-
| Set launch arguments and run add_init_script
| Removes some of the fingerprints that reveal automation.
<syntaxhighlight lang="python" >
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch(
        channel="chrome",
        headless=False,
        slow_mo=10,
        # Anti-detection: turn off the Blink automation flag
        args=['--disable-blink-features=AutomationControlled']
    )
    page = browser.new_page()
    # Runs before any page script: make navigator.webdriver report undefined
    page.add_init_script("""
    Object.defineProperties(navigator, {webdriver:{get: () => undefined}});
    """)
    page.goto("https://www.baidu.com")
    page.wait_for_timeout(5*1000)
    page.screenshot(path="screenshot.png")
    browser.close()
</syntaxhighlight>
|-
|
|
|-
|
|
|}


==Resources==
===Official sites===
* Playwright website: https://playwright.dev
* Playwright source code: https://github.com/microsoft/playwright
* Playwright Python source code: https://github.com/microsoft/playwright-python
* Playwright Python documentation: https://playwright.dev/python/docs/intro
* Playwright Python API: https://playwright.dev/python/docs/api/class-playwright

===Websites===

===Articles===
*[https://cuiqingcai.com/36045.html Jingmi (Cui Qingcai): Basic usage of Playwright, an emerging web-scraping tool]
*[https://soulteary.com/2022/11/28/playwrights-concise-introductory-tutorial-recording-automated-test-cases-and-using-it-with-docker.html#%E5%86%99%E5%9C%A8%E5%89%8D%E9%9D%A2 Su Yang's blog: A concise introduction to Playwright: recording automated test cases and using it with Docker]

Latest revision as of 14:49, 30 April 2023 (Sunday)

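Playwright is an open-source web testing and automation framework from Microsoft. It supports the Chromium, Firefox, and WebKit browsers, runs on Linux, macOS, and Windows, and is available for several languages, including Python, .NET, and Java.

==Introduction==
===Timeline===
===Installation===
Install the Python version:
<syntaxhighlight lang="bash" >
# Install the Playwright library
pip install playwright
# Alternatively, install the pytest plugin for Playwright
# pip install pytest-playwright

# Install all supported browsers and their drivers
# playwright install
# Install only Chrome and its driver; run playwright install -h for help
# Supported browsers currently include chromium, chrome, chrome-beta, msedge, msedge-beta, msedge-dev, firefox, and webkit
playwright install chrome
</syntaxhighlight>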
