PaddleOCR: Difference between revisions

Revision as of 04:28, 29 April 2023 (Saturday)

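PaddleOCR is Baidu's open-source OCR toolkit, built on PaddlePaddle. It recognizes 80+ languages and supports training and deployment across server, mobile, embedded, and IoT devices.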

Introduction

Timeline

Installation

Install PaddlePaddle with pip:

# CPU only: install paddlepaddle
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

# GPU acceleration: install paddlepaddle-gpu; CUDA 9 or CUDA 10 must be installed first
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple

Install PaddleOCR with pip:

pip install paddleocr
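
To confirm that the installs above took effect, a small check along these lines could be run. This is a minimal sketch and not part of PaddleOCR: the helper name `installed_version` is hypothetical, and it assumes the pip distribution names used in the commands above (`paddlepaddle`, `paddlepaddle-gpu`, `paddleocr`) and Python 3.8 or later.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are taken from the pip commands above.
    for name in ("paddlepaddle", "paddlepaddle-gpu", "paddleocr"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

Only one of `paddlepaddle` and `paddlepaddle-gpu` is expected to be present, depending on which command was used.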

Learn more >> PaddleOCR documentation: PaddleOCR Quick Start

Common installation errors

Error 1: a required piece of software is missing.

error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

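From Microsoft's website, https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist, choose the version you need, install it, and restart the computer. To check that the installation succeeded, query the registry from cmd, for example for the VC++ 14.0 x64 version: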

reg query HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VisualStudio\14.0\VC\Runtimes\X64

Error 2: an installed package version is incompatible.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
paddlepaddle 2.4.2 requires protobuf<=3.20.0,>=3.1.0, but you have protobuf 3.20.3 which is incompatible.

Uninstall the current version of the package and reinstall a compatible one:

pip uninstall -y protobuf
pip install protobuf==3.20.0
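
The conflict message quotes the accepted range `protobuf<=3.20.0,>=3.1.0`. The following minimal sketch shows why 3.20.3 is rejected and 3.20.0 is accepted. The helper names `parse_version` and `in_range` are hypothetical, and the sketch assumes purely numeric dotted versions; pre-release or suffixed versions would need a proper version parser.

```python
def parse_version(text):
    """Turn a purely numeric dotted version such as '3.20.3' into (3, 20, 3)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower="3.1.0", upper="3.20.0"):
    """Check lower <= version <= upper, the range quoted in the error above."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


print(in_range("3.20.3"))  # the version pip complained about -> False
print(in_range("3.20.0"))  # the version pinned above -> True
```

Comparing tuples of integers rather than raw strings matters here: as strings, "3.9.0" would sort after "3.20.0".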

Quick start

Using from Python

On first use, PaddleOCR automatically downloads the ppocr lightweight model as the default model.

from paddleocr import PaddleOCR, draw_ocr

# The languages PaddleOCR currently supports can be switched via the lang parameter,
# for example `ch`, `en`, `fr`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, lang="ch")
img_path = './test.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# Draw the result and save it to result.jpg
# If simfang.ttf is not available locally, it can be downloaded from the doc/fonts directory
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
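
The indexing in the example above (`line[0]`, `line[1][0]`, `line[1][1]`) implies that each recognized line is a pair of a box and a (text, score) tuple. A minimal sketch of filtering such lines by confidence follows. The helper `filter_lines`, the 0.8 threshold, and the sample data are all made up for illustration, and the `None` handling is a precaution under the assumption that an image with no detected text may yield `None`.

```python
def filter_lines(lines, min_score=0.8):
    """Keep (text, score) pairs whose score is at least min_score."""
    kept = []
    for box, (text, score) in lines or []:  # treat None as an empty result (assumption)
        if score >= min_score:
            kept.append((text, score))
    return kept


# Made-up sample in the same shape as one image's result (box, (text, score))
sample = [
    [[[10, 10], [110, 10], [110, 40], [10, 40]], ("hello", 0.98)],
    [[[10, 50], [110, 50], [110, 80], [10, 80]], ("w0rld", 0.42)],
]
print(filter_lines(sample))  # [('hello', 0.98)]
```

With real output, this would be applied to one image's result, such as `result[0]` in the example above.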

Learn more >> PaddleOCR documentation: Quick Start

Resources

Official site

PaddleOCR official site: https://github.com/PaddlePaddle/PaddleOCR
PaddleOCR Chinese documentation: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README_ch.md

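@@ Line 1: / Line 1: @@
-PaddleOCR is Baidu's open-source OCR toolkit, built on [[PaddlePaddle]]. It recognizes 80+ languages and supports training and deployment across server, mobile, embedded, and IoT devices.
+PaddleOCR is Baidu's open-source [[OCR]] toolkit, built on [[PaddlePaddle]]. It recognizes 80+ languages and supports training and deployment across server, mobile, embedded, and IoT devices.
 ==Introduction==
@@ Line 7: / Line 7: @@
 Install PaddlePaddle with [[pip]]: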
 <syntaxhighlight lang="bash" >
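-# If your machine has CUDA 9 or CUDA 10 installed, run the following command to install
+# CPU only: install paddlepaddle
+python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
+# GPU acceleration: install paddlepaddle-gpu; CUDA 9 or CUDA 10 must be installed first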
 python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
-# If your machine is CPU-only, run the following command to install
-python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
 </syntaxhighlight>
 Install PaddleOCR with [[pip]]:
   pip install paddleocr
 {{了解更多
 |[https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/quickstart.md PaddleOCR documentation: PaddleOCR Quick Start]
 }}
 ===Common installation errors===
-  <nowiki>error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/</nowiki>
+* Error 1: a required piece of software is missing.
+<syntaxhighlight lang="text" >
+error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
+</syntaxhighlight>
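 From Microsoft's website, https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist, choose the version you need, install it, and restart the computer. To check that the installation succeeded, query the registry from cmd, for example for the VC++ 14.0 x64 version: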
   reg query HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VisualStudio\14.0\VC\Runtimes\X64
+* Error 2: an installed package version is incompatible.
+<syntaxhighlight lang="text" >
+ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
+paddlepaddle 2.4.2 requires protobuf<=3.20.0,>=3.1.0, but you have protobuf 3.20.3 which is incompatible.
+</syntaxhighlight>
+Uninstall the current version of the package and reinstall a compatible one:
+<syntaxhighlight lang="bash" >
+pip uninstall -y protobuf
+pip install protobuf==3.20.0
+</syntaxhighlight>
+==Quick start==
+===Using from Python===
+On first use, PaddleOCR automatically downloads the ppocr lightweight model as the default model.
+<syntaxhighlight lang="python" >
+from paddleocr import PaddleOCR, draw_ocr
+# The languages PaddleOCR currently supports can be switched via the lang parameter,
+# for example `ch`, `en`, `fr`, `german`, `korean`, `japan`
+ocr = PaddleOCR(use_angle_cls=True, lang="ch")
+img_path = './test.jpg'
+result = ocr.ocr(img_path, cls=True)
+for idx in range(len(result)):
+    res = result[idx]
+    for line in res:
+        print(line)
-==Basics==
+# Draw the result and save it to result.jpg
+# If simfang.ttf is not available locally, it can be downloaded from the doc/fonts directory
+from PIL import Image
+result = result[0]
+image = Image.open(img_path).convert('RGB')
+boxes = [line[0] for line in result]
+txts = [line[1][0] for line in result]
+scores = [line[1][1] for line in result]
+im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
+im_show = Image.fromarray(im_show)
+im_show.save('result.jpg')
+</syntaxhighlight>
+{{了解更多
+|[https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/quickstart.md PaddleOCR documentation: Quick Start]
+}}
 ==Resources==
 ===Official site===
-PaddleOCR official site: https://github.com/PaddlePaddle/PaddleOCR
+* PaddleOCR official site: https://github.com/PaddlePaddle/PaddleOCR
+* PaddleOCR Chinese documentation: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README_ch.md
 ===Websites===