PaddleOCR

PaddleOCR: an OCR toolkit based on PaddlePaddle. It includes an ultra-lightweight Chinese OCR system with a total model size of only 8.6 MB; a single model handles mixed Chinese/English/digit recognition, vertical text, and long text. Multiple training algorithms for text detection and text recognition are also supported.


PaddlePaddle / PaddleOCR




README

Apache-2.0

English | 简体中文 | हिन्दी | 日本語 | 한국인 | Pу́сский язы́к

Introduction

PaddleOCR aims to provide a rich, leading, and practical OCR toolkit that helps developers train better models and put them into production.

近期更新

2023.3.10 PaddleOCR集成了高性能、全场景模型部署方案FastDeploy,欢迎参考指南试用(注意使用dygraph分支)。

2022.12 发布《OCR产业范例20讲》电子书,新增蒙古文、身份证、液晶屏缺陷等7个场景应用范例

2022.11 新增实现4种前沿算法:文本检测 DRRG, 文本识别 RFL, 文本超分Text Telescope,公式识别CAN

2022.10 优化JS版PP-OCRv3模型:模型大小仅4.3M,预测速度提升8倍,配套web demo开箱即用

直播回放:PaddleOCR研发团队详解PP-StructureV2优化策略。微信扫描下方二维码,关注公众号并填写问卷后进入官方交流群,获取直播回放链接与20G重磅OCR学习大礼包(内含PDF转Word应用程序、10种垂类模型、《动手学OCR》电子书等)

2022.8.24 发布 PaddleOCR release/2.6

发布PP-StructureV2,系统功能性能全面升级,适配中文场景,新增支持版面复原,支持一行命令完成PDF转Word;

版面分析模型优化:模型存储减少95%,速度提升11倍,平均CPU耗时仅需41ms;

表格识别模型优化:设计3大优化策略,预测耗时不变情况下,模型精度提升6%;

关键信息抽取模型优化:设计视觉无关模型结构,语义实体识别精度提升2.8%,关系抽取精度提升9.1%。

2022.8 发布 OCR场景应用集合:包含数码管、液晶屏、车牌、高精度SVTR模型、手写体识别等9个垂类模型,覆盖通用,制造、金融、交通行业的主要OCR垂类应用。

2022.8 新增实现8种前沿算法

文本检测:FCENet, DB++

文本识别:ViTSTR, ABINet, VisionLAN, SPIN, RobustScanner

表格识别:TableMaster

2022.5.9 发布 PaddleOCR release/2.5

发布PP-OCRv3,速度可比情况下,中文场景效果相比于PP-OCRv2再提升5%,英文场景提升11%,80语种多语言模型平均识别准确率提升5%以上;

发布半自动标注工具PPOCRLabelv2:新增表格文字图像、图像关键信息抽取任务和不规则文字图像的标注功能;

发布OCR产业落地工具集:打通22种训练部署软硬件环境与方式,覆盖企业90%的训练部署环境需求;

发布交互式OCR开源电子书《动手学OCR》,覆盖OCR全栈技术的前沿理论与代码实践,并配套教学视频。

更多

Features

PaddleOCR supports a variety of cutting-edge OCR algorithms and, on top of them, builds the industrial-grade models PP-OCR and PP-Structure, covering the full pipeline of data production, model training, compression, and inference deployment.

For how to use all of the above, start with the Quick Start section of the documentation.

⚡ Quick Start

Online demo: the ultra-lightweight PP-OCR mobile model can be tried at https://www.paddlepaddle.org.cn/hub/scene/ocr

Mobile demo: installable DEMO packages (based on EasyEdge and Paddle-Lite, supporting iOS and Android)

One-line command usage: see Quick Start (Chinese/English, multilingual, document analysis) — a sketch is given below

《动手学OCR》(Dive into OCR) e-book
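As an illustration of the one-line usage mentioned above, here is a minimal sketch via the Python API. It assumes the paddleocr wheel is installed and uses a placeholder image path; the CLI equivalent is `paddleocr --image_dir <image> --use_angle_cls true --lang ch`, as documented in the whl package section at the end of this page.

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="ch")   # downloads the default PP-OCR models on first use
result = ocr.ocr("img.jpg", cls=True)            # "img.jpg" is a placeholder path
for box, (text, score) in result[0]:             # result holds one list of entries per page/image
    print(text, round(score, 3))
```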

开源社区

项目合作: 如果您是企业开发者且有明确的OCR垂类应用需求,填写问卷后可免费与官方团队展开不同层次的合作。

加入社区: 微信扫描二维码并填写问卷之后,加入交流群领取20G重磅OCR学习大礼包

包括《动手学OCR》电子书 ,配套讲解视频和notebook项目;PaddleOCR历次发版直播课回放链接;

OCR场景应用模型集合: 包含数码管、液晶屏、车牌、高精度SVTR模型、手写体识别等垂类模型,覆盖通用,制造、金融、交通行业的主要OCR垂类应用。

PDF2Word应用程序;OCR社区优秀开发者项目分享视频。

️社区项目:社区项目文档中包含了社区用户使用PaddleOCR开发的各种工具、应用以及为PaddleOCR贡献的功能、优化的文档与代码等,是官方为社区开发者打造的荣誉墙,也是帮助优质项目宣传的广播站。

社区常规赛:社区常规赛是面向OCR开发者的积分赛事,覆盖文档、代码、模型和应用四大类型,以季度为单位评选并发放奖励,赛题详情与报名方法可参考链接。

PaddleOCR官方交流群二维码

️ PP-OCR系列模型列表(更新中)

| Model description | Model name | Recommended scenario | Detection model | Direction classifier | Recognition model |
|---|---|---|---|---|---|
| Chinese & English ultra-lightweight PP-OCRv3 model (16.2M) | ch_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
| English ultra-lightweight PP-OCRv3 model (13.4M) | en_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |

For more ultra-lightweight models (including multilingual ones), see the PP-OCR series model downloads; for document-analysis models, see the PP-Structure series model downloads.


License

This project is released under the Apache 2.0 license; the full license text follows.

Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved

Apache License

Version 2.0, January 2004

http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,

and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by

the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all

other entities that control, are controlled by, or are under common

control with that entity. For the purposes of this definition,

"control" means (i) the power, direct or indirect, to cause the

direction or management of such entity, whether by contract or

otherwise, or (ii) ownership of fifty percent (50%) or more of the

outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity

exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,

including but not limited to software source code, documentation

source, and configuration files.

"Object" form shall mean any form resulting from mechanical

transformation or translation of a Source form, including but

not limited to compiled object code, generated documentation,

and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or

Object form, made available under the License, as indicated by a

copyright notice that is included in or attached to the work

(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object

form, that is based on (or derived from) the Work and for which the

editorial revisions, annotations, elaborations, or other modifications

represent, as a whole, an original work of authorship. For the purposes

of this License, Derivative Works shall not include works that remain

separable from, or merely link (or bind by name) to the interfaces of,

the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including

the original version of the Work and any modifications or additions

to that Work or Derivative Works thereof, that is intentionally

submitted to Licensor for inclusion in the Work by the copyright owner

or by an individual or Legal Entity authorized to submit on behalf of

the copyright owner. For the purposes of this definition, "submitted"

means any form of electronic, verbal, or written communication sent

to the Licensor or its representatives, including but not limited to

communication on electronic mailing lists, source code control systems,

and issue tracking systems that are managed by, or on behalf of, the

Licensor for the purpose of discussing and improving the Work, but

excluding communication that is conspicuously marked or otherwise

designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity

on behalf of whom a Contribution has been received by Licensor and

subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of

this License, each Contributor hereby grants to You a perpetual,

worldwide, non-exclusive, no-charge, royalty-free, irrevocable

copyright license to reproduce, prepare Derivative Works of,

publicly display, publicly perform, sublicense, and distribute the

Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of

this License, each Contributor hereby grants to You a perpetual,

worldwide, non-exclusive, no-charge, royalty-free, irrevocable

(except as stated in this section) patent license to make, have made,

use, offer to sell, sell, import, and otherwise transfer the Work,

where such license applies only to those patent claims licensable

by such Contributor that are necessarily infringed by their

Contribution(s) alone or by combination of their Contribution(s)

with the Work to which such Contribution(s) was submitted. If You

institute patent litigation against any entity (including a

cross-claim or counterclaim in a lawsuit) alleging that the Work

or a Contribution incorporated within the Work constitutes direct

or contributory patent infringement, then any patent licenses

granted to You under this License for that Work shall terminate

as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the

Work or Derivative Works thereof in any medium, with or without

modifications, and in Source or Object form, provided that You

meet the following conditions:

(a) You must give any other recipients of the Work or

Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices

stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works

that You distribute, all copyright, patent, trademark, and

attribution notices from the Source form of the Work,

excluding those notices that do not pertain to any part of

the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its

distribution, then any Derivative Works that You distribute must

include a readable copy of the attribution notices contained

within such NOTICE file, excluding those notices that do not

pertain to any part of the Derivative Works, in at least one

of the following places: within a NOTICE text file distributed

as part of the Derivative Works; within the Source form or

documentation, if provided along with the Derivative Works; or,

within a display generated by the Derivative Works, if and

wherever such third-party notices normally appear. The contents

of the NOTICE file are for informational purposes only and

do not modify the License. You may add Your own attribution

notices within Derivative Works that You distribute, alongside

or as an addendum to the NOTICE text from the Work, provided

that such additional attribution notices cannot be construed

as modifying the License.

You may add Your own copyright statement to Your modifications and

may provide additional or different license terms and conditions

for use, reproduction, or distribution of Your modifications, or

for any such Derivative Works as a whole, provided Your use,

reproduction, and distribution of the Work otherwise complies with

the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,

any Contribution intentionally submitted for inclusion in the Work

by You to the Licensor shall be under the terms and conditions of

this License, without any additional terms or conditions.

Notwithstanding the above, nothing herein shall supersede or modify

the terms of any separate license agreement you may have executed

with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade

names, trademarks, service marks, or product names of the Licensor,

except as required for reasonable and customary use in describing the

origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or

agreed to in writing, Licensor provides the Work (and each

Contributor provides its Contributions) on an "AS IS" BASIS,

WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or

implied, including, without limitation, any warranties or conditions

of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A

PARTICULAR PURPOSE. You are solely responsible for determining the

appropriateness of using or redistributing the Work and assume any

risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,

whether in tort (including negligence), contract, or otherwise,

unless required by applicable law (such as deliberate and grossly

negligent acts) or agreed to in writing, shall any Contributor be

liable to You for damages, including any direct, indirect, special,

incidental, or consequential damages of any character arising as a

result of this License or out of the use or inability to use the

Work (including but not limited to damages for loss of goodwill,

work stoppage, computer failure or malfunction, or any and all

other commercial damages or losses), even if such Contributor

has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing

the Work or Derivative Works thereof, You may choose to offer,

and charge a fee for, acceptance of support, warranty, indemnity,

or other liability obligations and/or rights consistent with this

License. However, in accepting such obligations, You may act only

on Your own behalf and on Your sole responsibility, not on behalf

of any other Contributor, and only if You agree to indemnify,

defend, and hold each Contributor harmless for any liability

incurred by, or claims asserted against, such Contributor by reason

of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following

boilerplate notice, with the fields enclosed by brackets "[]"

replaced with your own identifying information. (Don't include

the brackets!) The text should be enclosed in the appropriate

comment syntax for the file format. We also recommend that a

file or class name and description of purpose be included on the

same "printed page" as the copyright notice for easier

identification within third-party archives.

Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");

you may not use this file except in compliance with the License.

You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software

distributed under the License is distributed on an "AS IS" BASIS,

WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

See the License for the specific language governing permissions and

limitations under the License.



PaddlePaddle/PaddleOCR (GitHub, branch release/2.7) — 6,131 commits

Repository top-level contents: .github, PPOCRLabel, StyleText, applications, benchmark, configs, deploy, doc, ppocr, ppstructure, test_tipc, tools, .clang_format.hook, .gitignore, .pre-commit-config.yaml, .style.yapf, LICENSE, MANIFEST.in, README.md, README_en.md, __init__.py, paddleocr.py, requirements.txt, setup.py, train.sh

README (Apache-2.0 license)

English | 简体中文 | हिन्दी | 日本語 | 한국인 | Pу́сский язы́к

Introduction

PaddleOCR aims to provide a rich, leading, and practical OCR toolkit that helps developers train better models and put them into production.

近期更新

PaddleOCR 算法模型挑战赛 火热开启!报名时间1/15-3/31,30万元奖金池!快来一展身手吧!

2023.11 发布 PP-ChatOCRv2: 一个SDK,覆盖20+高频应用场景,支持5种文本图像智能分析能力和部署,包括通用场景关键信息抽取(快递单、营业执照和机动车行驶证等)、复杂文档场景关键信息抽取(解决生僻字、特殊标点、多页pdf、表格等难点问题)、通用OCR、文档场景专用OCR、通用表格识别。针对垂类业务场景,也支持模型训练、微调和Prompt优化。

2023.8.7 发布 PaddleOCR release/2.7

发布PP-OCRv4,提供mobile和server两种模型

PP-OCRv4-mobile:速度可比情况下,中文场景效果相比于PP-OCRv3再提升4.5%,英文场景提升10%,80语种多语言模型平均识别准确率提升8%以上

PP-OCRv4-server:发布了目前精度最高的OCR模型,中英文场景上检测模型精度提升4.9%, 识别模型精度提升2%

可参考快速开始 一行命令快速使用,同时也可在飞桨AI套件(PaddleX)中的通用OCR产业方案中低代码完成模型训练、推理、高性能部署全流程

发布PP-ChatOCR ,使用融合PP-OCR模型和文心大模型的通用场景关键信息抽取全新方案

2022.11 新增实现4种前沿算法:文本检测 DRRG, 文本识别 RFL, 文本超分Text Telescope,公式识别CAN

2022.10 优化JS版PP-OCRv3模型:模型大小仅4.3M,预测速度提升8倍,配套web demo开箱即用

直播回放:PaddleOCR研发团队详解PP-StructureV2优化策略。微信扫描下方二维码,关注公众号并填写问卷后进入官方交流群,获取直播回放链接与20G重磅OCR学习大礼包(内含PDF转Word应用程序、10种垂类模型、《动手学OCR》电子书等)

2022.8.24 发布 PaddleOCR release/2.6

发布PP-StructureV2,系统功能性能全面升级,适配中文场景,新增支持版面复原,支持一行命令完成PDF转Word;

版面分析模型优化:模型存储减少95%,速度提升11倍,平均CPU耗时仅需41ms;

表格识别模型优化:设计3大优化策略,预测耗时不变情况下,模型精度提升6%;

关键信息抽取模型优化:设计视觉无关模型结构,语义实体识别精度提升2.8%,关系抽取精度提升9.1%。

2022.8 发布 OCR场景应用集合:包含数码管、液晶屏、车牌、高精度SVTR模型、手写体识别等9个垂类模型,覆盖通用,制造、金融、交通行业的主要OCR垂类应用。

更多

Features

PaddleOCR supports a variety of cutting-edge OCR algorithms and, on top of them, builds the industrial-grade models PP-OCR, PP-Structure and PP-ChatOCRv2, covering the full pipeline of data production, model training, compression, and inference deployment.

For how to use all of the above, start with the Quick Start section of the documentation.

⚡ Quick Start

Free online demos:

PP-OCRv4 online demo: https://aistudio.baidu.com/application/detail/7658

PP-ChatOCRv2 online demo: https://aistudio.baidu.com/application/detail/10368

One-line command usage: see Quick Start (Chinese/English, multilingual, document analysis) — a Python sketch is given below

Mobile demo: installable DEMO packages (based on EasyEdge and Paddle-Lite, supporting iOS and Android)
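A minimal Python sketch of that one-line usage with the PP-OCRv4 models. The ocr_version argument is available in recent paddleocr releases (treat its availability as an assumption on older versions, where it can simply be dropped), and the image path is one of the sample images shipped in the repository.

```python
from paddleocr import PaddleOCR

# ocr_version selects the PP-OCRv4 detection/recognition models in recent releases;
# without it, the package's default models are used.
ocr = PaddleOCR(ocr_version="PP-OCRv4", use_angle_cls=True, lang="ch")
result = ocr.ocr("doc/imgs/11.jpg", cls=True)
for box, (text, score) in result[0]:
    print(text, round(score, 3))
```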

技术交流合作

飞桨低代码开发工具(PaddleX)—— 面向国内外主流AI硬件的飞桨精选模型一站式开发工具。包含如下核心优势:

【产业高精度模型库】:覆盖10个主流AI任务 40+精选模型,丰富齐全。

【特色模型产线】:提供融合大小模型的特色模型产线,精度更高,效果更好。

【低代码开发模式】:图形化界面支持统一开发范式,便捷高效。

【私有化部署多硬件支持】:适配国内外主流AI硬件,支持本地纯离线使用,满足企业安全保密需要。

PaddleX官网地址:https://aistudio.baidu.com/intro/paddlex

PaddleX官方交流频道:https://aistudio.baidu.com/community/channel/610

《动手学OCR》电子书

《动手学OCR》电子书

开源共建

加入社区:感谢大家长久以来对 PaddleOCR 的支持和关注,与广大开发者共同构建一个专业、和谐、相互帮助的开源社区是 PaddleOCR 的目标。我们非常欢迎各位开发者参与到飞桨社区的开源建设中,加入开源、共建飞桨。为感谢社区开发者在 PaddleOCR release2.7 中做出的代码贡献,我们将为贡献者制作与邮寄开源贡献证书,烦请填写问卷提供必要的邮寄信息。

社区活动:飞桨开源社区长期运营与发布各类丰富的活动与开发任务,在 PaddleOCR 社区,你可以关注以下社区活动,并选择自己感兴趣的内容参与开源共建:

飞桨套件快乐开源常规赛 | 传送门:OCR 社区常规赛升级版,以建设更好用的 OCR 套件为目标,包括但不限于学术前沿模型训练与推理、打磨优化 OCR 工具与应用项目开发等,任何有利于社区意见流动和问题解决的行为都热切希望大家的参与。让我们共同成长为飞桨套件的重要 Contributor 。

新需求征集 | 传送门:你在日常研究和实践深度学习过程中,有哪些你期望的 feature 亟待实现?请按照格式描述你想实现的 feature 和你提出的初步实现思路,我们会定期沟通与讨论这些需求,并将其纳入未来的版本规划中。

PP-SIG 技术研讨会 | 传送门:PP-SIG 是飞桨社区开发者由于相同的兴趣汇聚在一起形成的虚拟组织,通过定期召开技术研讨会的方式,分享行业前沿动态、探讨社区需求与技术开发细节、发起社区联合贡献任务。PaddleOCR 希望可以通过 AI 的力量助力任何一位有梦想的开发者实现自己的想法,享受创造价值带来的愉悦。

项目合作:如果你有企业中明确的 OCR 垂类应用需求,我们推荐你使用训压推一站式全流程高效率开发平台 PaddleX,助力 AI 技术快速落地。PaddleX 还支持联创开发,利润分成!欢迎广大的个人开发者和企业开发者参与进来,共创繁荣的 AI 技术生态!

️ PP-OCR系列模型列表(更新中)

| Model description | Model name | Recommended scenario | Detection model | Direction classifier | Recognition model |
|---|---|---|---|---|---|
| Chinese & English ultra-lightweight PP-OCRv4 model (15.8M) | ch_PP-OCRv4_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
| Chinese & English ultra-lightweight PP-OCRv3 model (16.2M) | ch_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
| English ultra-lightweight PP-OCRv3 model (13.4M) | en_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |

For more ultra-lightweight models (including multilingual ones), see the PP-OCR series model downloads; for document-analysis models, see the PP-Structure series model downloads.

PaddleOCR scenario application models

| Industry | Category | Highlights | Documentation | Model download |
|---|---|---|---|---|
| Manufacturing | Seven-segment display recognition | Digit-tube data synthesis, tuning for missed detections | Optical power meter digit recognition | download link |
| Finance | General form recognition | Multimodal structured extraction for general forms | Multimodal form recognition | download link |
| Transportation | License plate recognition | Multi-angle image processing, lightweight model, on-device deployment | Lightweight license plate recognition | download link |

For more OCR vertical-application models in manufacturing, finance and transportation (electricity meters, LCD screens, high-accuracy SVTR models, etc.), see the scenario application model downloads.

文档教程

运行环境准备

PP-OCR文本检测识别

快速开始

模型库

模型训练

文本检测

文本识别

文本方向分类器

模型压缩

模型量化

模型裁剪

知识蒸馏

推理部署

基于Python预测引擎推理

基于C++预测引擎推理

服务化部署

端侧部署

Paddle2ONNX模型转化与预测

云上飞桨部署工具

Benchmark

PP-Structure文档分析

快速开始

模型库

模型训练

版面分析

表格识别

关键信息提取

推理部署

基于Python预测引擎推理

基于C++预测引擎推理

服务化部署

前沿算法与模型

文本检测算法

文本识别算法

端到端OCR算法

表格识别算法

关键信息抽取算法

使用PaddleOCR架构添加新算法

场景应用

数据标注与合成

半自动标注工具PPOCRLabel

数据合成工具Style-Text

其它数据标注工具

其它数据合成工具

数据集

通用中英文OCR数据集

手写中文OCR数据集

垂类多语言OCR数据集

版面分析数据集

表格识别数据集

关键信息提取数据集

代码组织结构

效果展示

《动手学OCR》电子书

开源社区

FAQ

通用问题

PaddleOCR实战问题

参考文献

许可证书

效果展示 more

PP-OCRv3 中文模型

PP-OCRv3 英文模型

PP-OCRv3 多语言模型

PP-Structure 文档分析

版面分析+表格识别

SER(语义实体识别)

RE(关系提取)

License

This project is released under the Apache 2.0 license.

About

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

Topics: ocr, db, crnn, ocrlite, chineseocr. License: Apache-2.0. 37.4k stars, 431 watching, 7.1k forks. Latest release: PaddleOCRv2.7.1 (Oct 18, 2023), 8 releases total. Used by 2.3k repositories; 147 contributors. Languages: Python 78.5%, C++ 12.5%, Shell 4.8%, Java 2.5%, CMake 0.4%, Cuda 0.4%, Other 0.9%.



Deploying PaddleOCR locally (installation, usage, model optimization/acceleration)

Author: 吨吨不打野 (CSDN). First published 2021-06-04; last revised 2023-11-23. Original post: https://blog.csdn.net/Castlehe/article/details/117356343 (column: OCR数字仪表识别).

Contents

1. Installation — 1.1 PaddlePaddle is still required; 1.2 Checking packages and environment; 1.3 Can PaddlePaddle be skipped?
2. Usage — 2.1 Camera capture, recognition and display; 2.3 Detection model issues (2.3.1 Trying a different model; 2.3.2 Restricting the detection region)
3. Performance improvements — 3.0 Baseline and own-model vs. all-default speed comparison; 3.1 On-device deployment; 3.2 Acceleration (3.2.1 MKL-DNN on CPU; 3.2.2 Tuning parameters; 3.2.3 Memory leak; 3.2.4 Notes on the memory-leak problem); 3.3 Pruning; 3.4 Other possible approaches (3.4.1 Switching models; 3.4.2 Multi-process); 3.5 CPU usage (3.5.1 Binding Paddle to CPU cores); 3.6 Inference deployment docs

1. Installation

1.1 PaddlePaddle is still required

Following the paddleocr package usage guide, I initially assumed this would be enough:

```
pip install "paddleocr>=2.0.1"
```

It was not: PaddlePaddle itself still has to be installed on the machine, although only the base paddle package is needed. Following the quick-install guide:

```
# on Windows, use python rather than python3
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
```

Running it after installation hit the classic Shapely error (see "Win10 CPU环境,OSError: [WinError 126] 找不到指定的模块 #212"). On Windows, Shapely has to be downloaded as a prebuilt wheel and installed manually:

```
pip uninstall shapely
pip install Shapely-1.7.1-cp37-cp37m-win_amd64.whl
# or
conda install shapely -c conda-forge
```

Another workaround: rename the wheel to Shapely-1.7.0-cp39-cp39-win_amd64.rar, extract it, find geos_c.dll under shapely\DLLs\, and copy geos_c.dll (together with geos.dll) into the conda environment's Library\bin directory, e.g. C:\Users\myusername\Miniconda3\envs\ocr\Library\bin. The simplest fix, though, is to delete the previously installed shapely from Anaconda (both the package and its folder) and reinstall it; see "anaconda3+ paddleOCR安装使用".

1.2 Checking packages and environment

On this machine the default Anaconda environment was used, with the following versions: Python 3.7.6; paddleocr installed via pip reports as paddleocr-2.0.6-py3. The requirements file lists: shapely, scikit-image==0.17.2, imgaug==0.4.0, pyclipper, lmdb, opencv-python==4.2.0.32, tqdm, numpy, visualdl, python-Levenshtein.

1.3 Can PaddlePaddle be skipped?

The PaddleOCR FAQ says (Q3.4.23: "After installing paddleocr, it complains that paddle is missing") that this happens because the GPU and CPU builds of paddlepaddle have different package names, and the whl documentation now explains the installation. However, the Python prediction example makes clear that running any Paddle-series model requires the Paddle inference engine, so PaddlePaddle does need to be installed after all.

2. Usage

Only the paths need to be changed. One thing worth noting is the structure of the output:

```
[[[[72.0, 149.0], [113.0, 151.0], [113.0, 166.0], [72.0, 163.0]], ('40', 0.7172388)],
 [[[62.0, 170.0], [237.0, 175.0], [233.0, 300.0], [58.0, 294.0]], ('1076', 0.9666834)]]
```

Each recognized text region is a list holding the coordinates of its four corner points (a 2-D array) together with a tuple containing the final recognition result (recognized text, confidence). When there are multiple text regions, they are wrapped in an outer list.
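A minimal sketch of unpacking that structure (assuming `result` holds the list shown above; on paddleocr 2.6 and later the per-image list is nested one level deeper, so use `result[0]`):

```python
# Each entry is (box, (text, confidence)); box is a list of four [x, y] points.
for box, (text, score) in result:
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    print(f"{text!r} (conf {score:.3f}) in x:[{min(xs)},{max(xs)}] y:[{min(ys)},{max(ys)}]")
```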

2.1 Camera setup: capture, recognize, display

See the companion post "python opencv调用摄像头识别并绘制结果". One curious finding: if the USB camera is already plugged in when the computer boots, cv2.VideoCapture(0) reaches it at index 0; if it is plugged in after boot, the index becomes 2 (this machine has one front and one rear built-in camera). For adjusting camera parameters, see "Opencv摄像头相关参数".
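Putting the pieces together, a minimal sketch of the capture-and-recognize loop (assuming opencv-python and paddleocr are installed; the camera index and language are placeholders, and on paddleocr 2.6+ the result list is nested one level deeper):

```python
import cv2
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=False, lang="ch")  # models are loaded once
cap = cv2.VideoCapture(0)                        # 0 or 2, depending on when the camera was plugged in
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = ocr.ocr(frame, cls=False)           # ndarray input is accepted
    for box, (text, score) in (result or []):    # on paddleocr 2.6+: iterate result[0] instead
        print(text, round(score, 3))
    cv2.imshow("camera", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```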

2.3 Detection model issues

Because the images come from a webcam, the background is fairly cluttered, which makes detection difficult, and the DB detector did not perform well. (Having not studied detection models much, it is hard to say exactly where the problem lies.)

2.3.1 Trying a different model

According to the PaddleOCR documentation, EAST actually reports higher accuracy than DB, although it tends to over-detect; EAST is efficient and accurate but handles curved text poorly.

In paddleocr.py there is:

```python
parser.add_argument("--det_algorithm", type=str, default='DB')
# changing this to EAST at call time raises an error
```

and further down:

```python
SUPPORT_DET_MODEL = ['DB']
VERSION = 2.0
SUPPORT_REC_MODEL = ['CRNN']
BASE_DIR = os.path.expanduser("~/.paddleocr/")
```

Conclusion: the whl package only whitelists DB for detection, and the EAST model available for download is a pre-trained (training) model rather than an inference model, so the EAST algorithm cannot be used this way.

2.3.2 Restricting the detection region

Bind a key: the OpenCV capture window responds to keyboard input, so an action can be triggered on demand. See "cv2.VideoCapture.get、set详解" for reading camera parameters and "opencv python全屏显示、置窗口大小和位置" for window control:

```python
cap = cv2.VideoCapture(1)
cap.get(3)   # CV_CAP_PROP_FRAME_WIDTH:  frame width of the video stream
cap.get(4)   # CV_CAP_PROP_FRAME_HEIGHT: frame height of the video stream

# besides get() there is also set()
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1080)   # width
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 960)   # height

roi = frame[top:bottom, left:right]       # crop the region of interest
```

See also "python cv2图片剪裁" for image cropping.
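A small sketch of feeding only the cropped region to the recognizer, so detection is restricted to a fixed area of the frame (assuming `frame` and `ocr` from the loop above; the coordinates are placeholders):

```python
# Crop a fixed region of interest and run OCR on it only.
top, bottom, left, right = 100, 400, 200, 600   # placeholder ROI, in pixels
roi = frame[top:bottom, left:right]
result = ocr.ocr(roi, cls=False)                # detected boxes are relative to the ROI
for box, (text, score) in (result or []):       # on paddleocr 2.6+: iterate result[0]
    print(text, round(score, 3))
```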

3. Performance improvements

3.0 Baseline

Detection takes quite long: detection plus recognition is roughly 0.7–1.2 s per frame on a CPU-only machine, which is awkward. First, check the model sizes. The full configuration is printed when detection runs:

```
Namespace(cls_batch_num=6, cls_image_shape='3, 48, 192',
    cls_model_dir='C:\\Users\\huangshan/.paddleocr/2.1/cls', cls_thresh=0.9, det=True,
    det_algorithm='DB', det_db_box_thresh=0.3, det_db_thresh=0.2,
    det_db_unclip_ratio=2.2, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2,
    det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max',
    det_model_dir='C:\\Users\\huangshan/.paddleocr/2.1/det/ch', drop_score=0.5,
    enable_mkldnn=False, gpu_mem=8000, image_dir='', ir_optim=True, label_list=['0',
    '180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN',
    rec_batch_num=6, rec_char_dict_path='C:/shaiic_work/ZhiNengKeJiOCR/digit.txt',
    rec_char_type='ch', rec_image_shape='3, 32, 320',
    rec_model_dir='C:/shaiic_work/ZhiNengKeJiOCR/rec_crnn_digit', use_angle_cls=False,
    use_dilation=False, use_gpu=False, use_pdserving=False, use_space_char=True,
    use_tensorrt=False, use_zero_copy_run=False)
```

The detection model in use is the bundled one, located at det_model_dir='C:\\Users\\yourname/.paddleocr/2.1/det/ch', and it is only about 3 MB. Its source can be found in the code:

```python
'rec': {
    'ch': {
        'url': 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar',
        'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
    },
```

Since the file is ch_ppocr_mobile_v2.0_rec_infer.tar, the default model is presumably already the slimmed mobile one; see also the PP-OCR 2.0 series model list. The recognition model, by contrast, was trained by me and converted to an inference model; at 94 MB it is indeed heavy for a fairly simple digital-meter recognition task.

Comparing my own model against the all-default setup

Timing screenshots for the custom model were shown above; with the default models instead (figures omitted): without MKL-DNN, default detection plus my own recognition runs at roughly 0.7–1.2 s, while default detection plus default recognition runs at roughly 0.7–0.9 s. So even though recognition already takes under 0.2 s, it can be made faster still, leaving detection as the only bottleneck. One guess: the first-stage detection model is pruned/quantized (e.g. 8-bit precision), and if the second-stage recognition model is also 8-bit the pipeline is handled consistently; if the two stages use different precisions, the mismatch might also account for the gap.

3.1 On-device deployment

See the on-device deployment docs. This mainly helps shrink the model size; the docs do not particularly promise faster inference.

3.2 Acceleration

Searching the FAQ for "acceleration" turns up:

Q3.1.73: How do I use TensorRT to speed up PaddleOCR inference? A: The dygraph branch already supports TensorRT prediction for both Python and C++. For Python inference, pass --use_tensorrt=True; for C++ TensorRT inference, use a prediction library built with TRT support and compile with -DWITH_TENSORRT=ON. To backport TensorRT support to another branch, refer to the corresponding PR. Note: TensorRT >= 6.1.0.5 is recommended.

Searching for "speed" also finds:

Q3.4.40: Deployment with hub_serving has high latency; what might cause it? A: First, the first image is always slow, so test several images and look at the later ones. Second, if a server-class model (e.g. a ResNet34 backbone) is deployed on CPU it will be slow; on CPU it is better to deploy the mobile model (e.g. a MobileNetV3 backbone).

So on CPU the mobile model is recommended. The prediction/deployment part of the FAQ adds:

Q3.4.1: How do I pip-install the opt model conversion tool? A: On-device OCR deployment needs operators that only exist in the latest Paddle-Lite develop branch, so the opt tool must be built from source by compiling Paddle-Lite; see section 2.1 "Model optimization" of the lite deployment docs.

Q3.4.2: How do I wrap a PaddleOCR inference model into an SDK? A: In Python, wrap TextSystem from tools/infer/predict_system.py; in C++, use DBDetector and CRNNRecognizer under deploy/cpp_infer/src.

3.2.1 MKL-DNN acceleration on CPU

Since detection is the slow part, optimizing pruning would not help much, so the first thing to try is MKL-DNN:

Q3.1.77: Using mkldnn raises "Please compile with MKLDNN first to use MKLDNN". A: The error means the current environment has no MKL-DNN. Check whether the CPU supports it (it cannot be used on macOS), and make sure the prediction library was built with MKL-DNN support; a CPU prediction library with MKL-DNN can be downloaded from the official site.

Q1.1.10: What are the CPU-side options for speeding up inference, and what does TensorRT-based GPU acceleration require of the input? A: (1) On CPU, MKL-DNN can be used: for Python inference set enable_mkldnn to true; for C++ inference set use_mkldnn 1 in the config file. (2) On GPU, variable-length input needs care; TRT only supports variable-length input from TRT 6 onward.

Setting enable_mkldnn to true in the inference code did make things faster: previously 0.7–1.2 s, now roughly 0.6–0.98 s, never above 1 s and mostly around 0.8 s.
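For reference, a hedged sketch of turning MKL-DNN on from the whl package instead of editing the inference code. The enable_mkldnn flag is the one shown in the Namespace dump above, and the whl package forwards keyword arguments of the same names; treat that forwarding as an assumption on very old versions.

```python
from paddleocr import PaddleOCR

# enable_mkldnn maps to the inference flag of the same name in the Namespace dump;
# it only takes effect for CPU prediction.
ocr = PaddleOCR(use_angle_cls=False, lang="ch", enable_mkldnn=True)
result = ocr.ocr("meter.jpg", cls=False)   # placeholder image path
```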

3.2.2 Tuning parameters

There are also parameters worth adjusting, for example:

```python
parser.add_argument("--det_limit_side_len", type=float, default=960)
```

From the FAQ:

Q3.3.2: Are there detection thresholds in the config file? A: Yes. The main detection parameters are: det_limit_side_len (the long-side size the image is resized to before prediction), det_db_thresh (threshold used to binarize DB's output map), det_db_box_thresh (text boxes scoring below this are discarded), and det_db_unclip_ratio (box expansion factor, which controls box size). Their defaults are in the code and can be overridden on the command line.

det_limit_side_len defaults to 960; it makes sense to keep it a multiple of 32 but smaller, e.g. 320. Reducing it further to 256, with the recognition side set to max_text_length=5 and rec_image_shape=(3, 32, 256) instead of the default (3, 32, 320), gave odd results: the speed started at 0.6 s but gradually settled at 0.8–0.9 s, and the task manager showed memory usage reaching 98% on a 32 GB machine.

For the full list of parameters exposed by the PaddleOCR package, see the parameter table at the end of the paddleocr package usage guide: detection has many tunable parameters, whereas recognition offers few.
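A hedged sketch of overriding those detection parameters through the whl package. The parameter names are the ones listed in the FAQ answer above, and the values are the experimental ones from the Namespace dump in this post, not recommendations.

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    lang="ch",
    det_limit_side_len=320,     # resize the long side to 320 instead of 960
    det_db_thresh=0.2,          # binarization threshold for the DB output map
    det_db_box_thresh=0.3,      # drop text boxes scoring below this
    det_db_unclip_ratio=2.2,    # how much detected boxes are expanded
)
```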

3.2.3 Memory leak

From the FAQ:

Q3.4.43: GPU memory blow-up or memory leak during prediction? A: Turn on the memory optimization switch enable_memory_optim; the relevant code has been merged. See https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/tools/infer/utility.py#L153.

Q3.4.17: Memory leak during prediction. A: 1. A known leak exists when using hubserving; it is expected to be fixed in the official paddle 2.0 release (see the related issue). 2. A C++ prediction leak was fixed in paddle 2.0rc; install paddle 2.0rc and update PaddleOCR to the latest.

Checking that issue on 2021-06-03: the fix had landed 26 days earlier, and the local environment was not 2.0rc, so an upgrade was in order. Paddle 2.0rc is no longer available, but 2.1 can be installed directly:

```
python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
# this reports that 2.0 is already installed, so instead:
pip install --upgrade paddlepaddle -i https://mirror.baidu.com/pypi/simple
# upgrades to the latest version, i.e. 2.1
```

After the upgrade, memory usage stayed high and inference actually got slower (above 1 s), although recognition quality seemed slightly better, even on blurry text; switching back to the bundled recognition model was also slower than before. With paddle 2.1 and MKL-DNN enabled, however, speed improved to roughly 0.3–0.5 s per frame, at the cost of memory usage sitting at essentially 100%: very fast, but extremely memory-hungry.

Final conclusion: the memory leak was caused by enabling MKL-DNN, which therefore needs to be turned off. In addition, set the CPU thread count to 1; otherwise memory usage stays very high, and reducing it to 1 barely affects speed.

3.2.4 Notes on the memory-leak problem

Quite a few Paddle issues report slow speed as well: "使用CPU下进行加速处理,但是识别的速度将近30S,请问有什么方法提高嘛? #2950" (that one uses the server-side model); PPOCRLabel gradually slowing down during auto-labeling, "关于半自动标注工具PPOCRLabel运行速度由快逐渐变慢的问题 #1391", and similarly "PPOCRLabel自动标注跑着跑着就自己闪退了 #2724"; versions getting slower, "2.x版本比1.x版本慢2倍 #2630"; and "识别时内存一直涨 溢出 #303", which is closed but still receives new reports in the comments.

3.3 Pruning (quantization)

While downloading the MKL-DNN-enabled CPU prediction library, a very useful documentation site turned up: https://paddle-inference.readthedocs.io/en/latest/index.html, dedicated to Paddle-series inference, including "Deploying quantized models on X86 CPU".

A brief summary, paraphrased from that page: model quantization can speed up prediction considerably, and PaddleSlim provides strong quantization support. For common image-classification models, INT8 inference is typically 3–3.7x faster than FP32 on Cascade Lake machines (e.g. Intel Xeon Gold 6271, 6248, X2XX) and about 1.5x faster on SkyLake machines (e.g. Intel Xeon Gold 6148, 8180, X1XX). Deploying a quantized model on X86 CPU takes three steps: produce the quantized model with PaddleSlim, convert it into the final deployable form, and deploy it with the Paddle Inference library.

Initially pruning did not look attractive, because the bottleneck is detection and the detection model is already a slimmed one. Comparing the all-default pruned models (detection + recognition) against default detection plus my own recognition model did show some benefit, but not enough to justify the effort of pruning.

3.4 Other possible approaches

3.4.1 Switching models

https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_ch/models_list.md

3.4.2 Multi-process

From the FAQ ("How do I run paddleocr with multiple processes?"):

Q3.4.33: How to run paddleocr with multiple processes? A: Instantiate several paddleocr services, register them with a service registry, and dispatch through the registry; eureka is one option, but any registry works.

Q3.4.44: How to predict with multiple processes? A: PaddleOCR recently added multi-process prediction flags: use_mp controls whether multi-process is used, and total_process_num sets the number of processes. See the docs for usage details.

```python
parser.add_argument("--use_mp", type=str2bool, default=False)
# only available from the command line
```

```
# with the angle classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true

# without the angle classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false

# with multiple processes
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false --use_mp=True --total_process_num=6
```

Incidentally, https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/paddleocr.py is very similar to PaddleOCR/tools/infer/utility.py; the paddleocr.py shipped in the wheel is essentially a simplified subset of utility.py that is easier to call.
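Besides the built-in use_mp flag of predict_system.py, a process pool with one PaddleOCR instance per worker is another way to parallelize on a multi-core CPU. A minimal sketch (image paths are placeholders; each worker loads its own copy of the models, so memory usage scales with the number of processes):

```python
import multiprocessing as mp
from paddleocr import PaddleOCR

_ocr = None

def _init_worker():
    # one PaddleOCR instance per process, created once when the worker starts
    global _ocr
    _ocr = PaddleOCR(lang="ch", use_angle_cls=False)

def _recognize(path):
    return path, _ocr.ocr(path, cls=False)

if __name__ == "__main__":
    images = ["img1.jpg", "img2.jpg", "img3.jpg"]          # placeholder paths
    with mp.Pool(processes=2, initializer=_init_worker) as pool:
        for path, result in pool.imap_unordered(_recognize, images):
            print(path, result)
```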

3.5 CPU usage

Besides the memory leak there is the CPU-hogging problem: by default CPU usage hits 100%. Many people report the same, e.g. on the Baidu AI community ppocr board. The FAQ has nothing on it; see also "移动端arm cpu优化学习笔记第3弹–绑定cpu(cpu affinity)". To check the core count on Windows 10, see "win10系统如何查看cpu核数"; this machine has 8 cores.

3.5.1 Binding Paddle to CPU cores

The inference deployment docs touch on this briefly, mainly in Docs » Python API » Config 类 » 3. 使用 CPU 进行预测.

PaddleOCR's own code is the place to look: the five scripts under tools/infer, and in particular how utility.py fills in the Config object. Rather than hunting by hand, set a breakpoint near the ocr call and inspect the PaddleOCR object in the debugger: it contains a text detector and a text recognizer, and most parameters are configured on the text-detection side, but nothing CPU-related is set there.

The relevant code is in utility.py, around line 133 of the installed package (e.g. C:\software\anaconda\Lib\site-packages\paddleocr\tools\infer):

```python
if args.use_gpu:
    config.enable_use_gpu(args.gpu_mem, 0)
    if args.use_tensorrt:
        config.enable_tensorrt_engine(
            precision_mode=inference.PrecisionType.Half
            if args.use_fp16 else inference.PrecisionType.Float32,
            max_batch_size=args.max_batch_size)
else:
    config.disable_gpu()
    # the key CPU setting
    config.set_cpu_math_library_num_threads(6)
    if args.enable_mkldnn:
        # cache 10 different shapes for mkldnn to avoid memory leak
        # the key mkldnn setting
        config.set_mkldnn_cache_capacity(10)
        config.enable_mkldnn()
        # TODO LDOUBLEV: fix mkldnn bug when bach_size > 1
        # config.set_mkldnn_op({'conv2d', 'depthwise_conv2d', 'pool2d', 'batch_norm'})
        args.rec_batch_num = 1
```

According to Docs » Python API » Config 类 » 3. 使用 CPU 进行预测, when enough CPU cores are available, set_cpu_math_library_num_threads can be raised (the default is 1 thread). So to limit how many cores are used, change that call, for example from 6 to 4:

```python
config.set_cpu_math_library_num_threads(6)
# change 6 to 4
```

Since MKL-DNN is enabled here, the same doc also notes that MKL-DNN only takes effect when CPU prediction is already in use, and that MKL-DNN BF16 additionally requires a CPU with AVX512 support.

```python
# set the MKLDNN cache capacity
config.set_mkldnn_cache_capacity(1)
```

After changing the CPU thread count from 6 to 4 and the MKL-DNN cache capacity from 10 to 5, a reboot was needed for the change to take effect; reloading the library with reload() did not help, nor did restarting PyCharm. Detection also got slower again, and after the reboot CPU usage was held around 50% only on the first run; on the second run it climbed again.

The reason is that Python had cached the ppocr package: delete it from sys.modules and import it again. See also "python3 reload".
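A minimal sketch of flushing that cache without rebooting, along the lines described above (clearing the cached modules from sys.modules before importing again):

```python
import sys

# drop every cached paddleocr module, then import again so the edited
# utility.py is actually re-read
for name in [m for m in list(sys.modules) if m.startswith("paddleocr")]:
    del sys.modules[name]

import paddleocr
```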

3.6 Inference deployment docs

Paddle has a lot of dedicated inference-deployment documentation ("推理部署") worth reading, including a helpful overview diagram. The official site also links to a Python deployment example ("Python预测部署示例") and to a mask-detection walkthrough using Paddle Inference ("(二) 使用 Paddle Inference进行口罩检测").


PaddleOCR,一款文本识别效果不输于商用的Python库! — Zhihu (PaddleOCR: a Python OCR library whose recognition quality rivals commercial services). By 小张Python.

1 Preface

This post introduces a GitHub project for OCR text recognition. Two earlier posts covered OCR with Python: one on Tesseract ("介绍一个Python 包,几行代码可实现 OCR 文本识别"), which is based on traditional machine learning and works well for English characters but poorly for Chinese; and one on the Easy-OCR project ("关于文本OCR检测、分享一个基于深度学习技术的Python库"), which is deep-learning based, outperforms Tesseract, supports 70+ languages, and also detects text regions and draws boxes on the original image — but in testing it was not very precise on certain road signs.

2 About PaddleOCR

PaddleOCR is an OCR project under the Paddle umbrella. It is deep-learning based, so it needs trained weights, but official pretrained weights are provided. In testing it performed very well (two screenshots from the official introduction and a coupon test image are omitted here): even a coupon image with complex text — vertical and slanted lines, mixed Chinese and English, and decimal points — was recognized almost completely. Key points:

PaddleOCR has been released since 2020-05-14 and is still actively improved.
A PaddleOCR run performs three tasks in sequence: detection, direction classification, and text recognition.
Official pretrained weights come in two sizes: a lightweight set (detection + classification + recognition totalling about 9.4 MB), suitable for mobile and server deployment, and a larger set totalling about 143.4 MB for server deployment; either way the quality is comparable to commercial services. The lightweight weights are used below.
80+ languages are supported, and it copes with slanted fonts and text containing decimal points.
Rich OCR tooling is provided (semi-automatic data labeling, data synthesis) for building your own datasets.
It can be installed with pip and is easy to get started with.

3 Using PaddleOCR

3.1 Test environment

OS: Win10; Python: 3.7.9.

3.2 Install PaddlePaddle 2.0

PaddleOCR only runs on PaddlePaddle 2.0, so make sure it is installed first:

```
pip3 install --upgrade pip
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
```

3.3 Clone the PaddleOCR repository

Use git clone (or Download) to fetch the repository:

```
git clone https://github.com/PaddlePaddle/PaddleOCR
```

3.4 Install PaddleOCR's third-party dependencies

```
cd PaddleOCR
pip3 install -r requirements.txt
```

If this step fails, put the project in a virtual environment and install there (remember to install paddlepaddle inside the virtual environment as well).

3.5 Download the weight files

Download each of the following, extract them, create an inference folder containing the three extracted directories, and place inference inside PaddleOCR:

Detection: https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
Direction classification: https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
Recognition: https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar

3.6 Running PaddleOCR

With the environment set up, open a terminal in the PaddleOCR directory and run one of the following:

```
# 1. GPU, single image
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True

# 2. GPU, a folder of images
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True

# 3. CPU only, single image
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False
```

Adjust the paths to your own setup: image_dir is the image (or folder of images) to recognize, and det_model_dir / rec_model_dir / cls_model_dir point at the extracted model directories. Recognizing a single image is quick; even on CPU only it takes two or three seconds.

4 Getting the data and code

The test data and project code are packaged together; after downloading, create a virtual environment and install the dependencies, then use it as described in section 3.6:

```
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
# dependencies
pip3 install -r requirements.txt
```

5 Summary

Paddle-OCR is just one application of the Paddle framework; Paddle offers many other interesting models, and the pretrained weights provided by the developers lower the barrier to entry considerably. Future posts will walk through some of them. (Edited 2021-06-12.)

PaddleOCR史上最全安装教程 — Zhihu (The most complete PaddleOCR installation tutorial). By 水底的土豆zhizai.

1 Introduction

PaddleOCR aims to provide a rich, leading, and practical OCR toolkit that helps developers train better models and put them into production; it is also a good entry point for learning deep learning.
GitHub: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README_ch.md
Official site: 飞桨PaddlePaddle (the open-source deep learning platform)
Mobile trial: https://ai.baidu.com/easyedge/app/openSource?from=paddlelite

2 Features

(See the feature overview in the README.)

3 Environment

Windows 11 Pro; Python (note: Python 3.11 may be problematic — try it first and leave a comment if it works); CUDA 11.7.

4 Installation steps

Official steps: PaddleOCR/doc/doc_en/quickstart_en.md at release/2.6.

4.1 Upgrade pip

```
python -m pip install --upgrade pip
```

4.2 Install paddlepaddle

Choose the build matching your hardware; with an NVIDIA GPU, install the GPU build:

```
python -m pip install paddlepaddle-gpu==2.4.2.post117 -f https://www.paddlepaddle.org.cn/whl/windows/mkl/avx/stable.html
```

4.3 Install Shapely

Download the wheel from https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely and install it:

```
pip install Shapely-1.8.2-cp39-cp39-win_amd64.whl
```

4.4 Install PaddleOCR

```
pip install "paddleocr>=2.0.1"   # version 2.0.1+ is recommended
```

4.5 Install CUDA

Following "【Windows11】Cuda和Cudnn详细安装教程", pick the matching CUDA version from the CUDA Toolkit Archive, install it, and verify the installation.

4.6 Install cuDNN

Download it from https://developer.nvidia.com/rdp/cudnn-download, copy all files into the CUDA installation directory from step 4.5, and verify.

4.7 Install Zlib

Following the NVIDIA installation guide, copy zlibwapi.dll into the bin directory of the CUDA installation from step 4.5.

4.8 Verify

See PaddleOCR/doc/doc_en/quickstart_en.md at release/2.6.

5 Summary

The installation is fairly tedious, but worth it — a good start to the deep-learning journey. (Edited 2023-06-22.)
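After step 4.8, a quick way to confirm that both PaddlePaddle and PaddleOCR are working is a sketch like the following; paddle.utils.run_check() is PaddlePaddle's built-in self-test, and the image path is a placeholder pointing at a sample image in the cloned repository.

```python
import paddle
from paddleocr import PaddleOCR

paddle.utils.run_check()                 # reports whether PaddlePaddle (and the GPU) is usable

ocr = PaddleOCR(use_angle_cls=True, lang="ch")
result = ocr.ocr("doc/imgs/11.jpg", cls=True)
print(result[0][:3])                     # first few (box, (text, confidence)) entries
```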

paddleocr · PyPI


paddleocr 2.7.0.3

pip install paddleocr

Latest version, released Sep 15, 2023.

Awesome OCR toolkits based on PaddlePaddle (8.6M ultra-lightweight pre-trained model, support training and deployment among server, mobile, embeded and IoT devices)

License: Apache License 2.0. Maintainer: zhoujun. Tags: ocr, textdetection, textrecognition, paddleocr, crnn, east, star-net, rosetta, ocrlite, db, chineseocr, chinesetextdetection, chinesetextrecognition. Requires Python 3; OS independent; natural language: Chinese (Simplified).

Paddleocr Package

1 Get started quickly

1.1 install package

install by pypi:

```
pip install "paddleocr>=2.0.1" # Recommend to use version 2.0.1+
```

build own whl package and install:

```
python3 setup.py bdist_wheel
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
```

2 Use

2.1 Use by code

The paddleocr whl package will automatically download the ppocr lightweight model as the default model, which can be customized and replaced according to the section 3 Custom Model.

detection, angle classification and recognition:

```python
from paddleocr import PaddleOCR, draw_ocr

# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
# to switch the language model in order.
ocr = PaddleOCR(use_angle_cls=True, lang='en')  # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

Output will be a list, each item contains bounding box, text and recognition confidence:

```
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```

Visualization of results

detection and recognition:

```python
from paddleocr import PaddleOCR, draw_ocr

ocr = PaddleOCR(lang='en')  # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=False)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```

Output will be a list, each item contains bounding box, text and recognition confidence:

```
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```

Visualization of results

classification and recognition:

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='en')  # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)
```

Output will be a list, each item contains recognition text and confidence:

```
['PAIN', 0.990372]
```

only detection

from paddleocr import PaddleOCR, draw_ocr

ocr = PaddleOCR()  # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, rec=False)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
from PIL import Image

result = result[0]
image = Image.open(img_path).convert('RGB')
im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

The output is a list; each item contains only the bounding box

[[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]

[[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]

[[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]

......

Visualization of results

only recognition

from paddleocr import PaddleOCR

ocr = PaddleOCR(lang='en')  # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=False)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

The output is a list; each item contains the recognized text and its confidence

['PAIN', 0.990372]

only classification

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True)  # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, rec=False, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

The output is a list; each item contains the classification result and its confidence

['0', 0.99999964]

2.2 Use by command line

show help information

paddleocr -h

detection, classification and recognition

paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --lang en

The output is a list; each item contains the bounding box, the text and the recognition confidence

[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]

[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]

[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]

......

PDF files are also supported. You can limit inference to the first few pages with the page_num parameter; the default is 0, which means all pages are inferred.

paddleocr --image_dir ./xxx.pdf --use_angle_cls true --use_gpu false --page_num 2

detection and recognition

paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --lang en

The output is a list; each item contains the bounding box, the text and the recognition confidence

[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]

[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]

[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]

......

classification and recognition

paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --lang en

The output is a list; each item contains the text and the recognition confidence

['PAIN', 0.9934559464454651]

only detection

paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --rec false

The output is a list; each item contains only the bounding box

[[397.0, 802.0], [1092.0, 802.0], [1092.0, 841.0], [397.0, 841.0]]

[[397.0, 750.0], [1211.0, 750.0], [1211.0, 789.0], [397.0, 789.0]]

[[397.0, 702.0], [1209.0, 698.0], [1209.0, 734.0], [397.0, 738.0]]

......

only recognition

paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --det false --lang en

The output is a list; each item contains the text and the recognition confidence

['PAIN', 0.9934559464454651]

only classification

paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --rec false

The output is a list; each item contains the classification result and its confidence

['0', 0.99999964]

3 Use custom model

When the built-in models do not meet your needs, you can use your own trained models.

First, refer to the export doc to convert your detection and recognition models into inference models, then use them as follows.

3.1 Use by code

from paddleocr import PaddleOCR, draw_ocr

# The path of detection and recognition model must contain model and params files
ocr = PaddleOCR(det_model_dir='{your_det_model_dir}',
                rec_model_dir='{your_rec_model_dir}',
                rec_char_dict_path='{your_rec_char_dict_path}',
                cls_model_dir='{your_cls_model_dir}',
                use_angle_cls=True)
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
from PIL import Image

result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

3.2 Use by command line

paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} --rec_char_dict_path {your_rec_char_dict_path} --cls_model_dir {your_cls_model_dir} --use_angle_cls true

4 Use web images or NumPy arrays as input

4.1 Web image

Use by code

from paddleocr import PaddleOCR, draw_ocr, download_with_progressbar

ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory
img_path = 'http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# show result
from PIL import Image

result = result[0]
# download the image locally for visualization, since PIL cannot open a URL directly
download_with_progressbar(img_path, 'tmp.jpg')
image = Image.open('tmp.jpg').convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

Use by command line

paddleocr --image_dir http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg --use_angle_cls=true

4.2 NumPy array

NumPy arrays are supported as input only when using the Python API.

import cv2
from paddleocr import PaddleOCR, draw_ocr

ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
img = cv2.imread(img_path)
# img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # uncomment this line if your own trained model supports grayscale images
result = ocr.ocr(img, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# show result
from PIL import Image

result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
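Any image already held in memory can be passed the same way once it is a NumPy array. A minimal sketch using PIL instead of OpenCV, assuming the same `ocr` instance as above; the channel order is flipped to BGR because the example above uses cv2.imread, which produces BGR arrays:

import numpy as np
from PIL import Image

pil_img = Image.open('PaddleOCR/doc/imgs/11.jpg').convert('RGB')
np_img = np.ascontiguousarray(np.array(pil_img)[:, :, ::-1])  # RGB -> BGR to match the cv2 convention used above
result = ocr.ocr(np_img, cls=True)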

5 PDF file

Use by command line

You can limit inference to the first few pages with the page_num parameter; the default is 0, which means all pages are inferred.

paddleocr --image_dir ./xxx.pdf --use_angle_cls true --use_gpu false --page_num 2

Use by code

from paddleocr import PaddleOCR, draw_ocr

# PaddleOCR supports Chinese, English, French, German, Korean and Japanese.
# Set the parameter `lang` to `ch`, `en`, `fr`, `german`, `korean` or `japan`
# to switch the language model accordingly.
ocr = PaddleOCR(use_angle_cls=True, lang="ch", page_num=2)  # need to run only once to download and load model into memory
img_path = './xxx.pdf'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
import fitz
from PIL import Image
import cv2
import numpy as np

imgs = []
with fitz.open(img_path) as pdf:
    for pg in range(0, pdf.pageCount):
        page = pdf[pg]
        mat = fitz.Matrix(2, 2)
        pm = page.getPixmap(matrix=mat, alpha=False)
        # if width or height > 2000 pixels, don't enlarge the image
        if pm.width > 2000 or pm.height > 2000:
            pm = page.getPixmap(matrix=fitz.Matrix(1, 1), alpha=False)
        img = Image.frombytes("RGB", [pm.width, pm.height], pm.samples)
        img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
        imgs.append(img)

for idx in range(len(result)):
    res = result[idx]
    image = imgs[idx]
    boxes = [line[0] for line in res]
    txts = [line[1][0] for line in res]
    scores = [line[1][1] for line in res]
    im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result_page_{}.jpg'.format(idx))

6 Parameter Description

Parameter | Description | Default value
use_gpu | Whether to use the GPU | TRUE
gpu_mem | GPU memory size used for initialization | 8000M
image_dir | The image path or folder path for prediction when used from the command line |
page_num | Valid when the input is a PDF file; predict only the first page_num pages. All pages are predicted by default | 0
det_algorithm | Type of detection algorithm | DB
det_model_dir | Text detection inference model folder. Two ways to pass it: 1. None: automatically download the built-in model to ~/.paddleocr/det; 2. the path of an inference model converted by yourself, which must contain the model and params files | None
det_max_side_len | Maximum length of the long side of the image. When the long side exceeds this value it is resized to this size and the short side is scaled proportionally | 960
det_db_thresh | Binarization threshold of the DB output map | 0.3
det_db_box_thresh | Threshold of DB output boxes; boxes scoring lower than this value are discarded | 0.5
det_db_unclip_ratio | Expansion ratio of DB output boxes | 2
det_db_score_mode | Controls how the score of a detection box is computed; options are 'fast' and 'slow'. If the text to be detected is curved, 'slow' is recommended | 'fast'
det_east_score_thresh | Binarization threshold of the EAST output map | 0.8
det_east_cover_thresh | Threshold of EAST output boxes; boxes scoring lower than this value are discarded | 0.1
det_east_nms_thresh | NMS threshold of EAST output boxes | 0.2
rec_algorithm | Type of recognition algorithm | CRNN
rec_model_dir | Text recognition inference model folder. Two ways to pass it: 1. None: automatically download the built-in model to ~/.paddleocr/rec; 2. the path of an inference model converted by yourself, which must contain the model and params files | None
rec_image_shape | Image shape of the recognition algorithm | "3,32,320"
rec_batch_num | Batch size of forward images during recognition | 30
max_text_length | Maximum text length the recognition algorithm can recognize | 25
rec_char_dict_path | Path of the character dictionary, which must be changed to your own path when rec_model_dir points to a model converted by yourself (mode 2) | ./ppocr/utils/ppocr_keys_v1.txt
use_space_char | Whether to recognize spaces | TRUE
drop_score | Filter the output by recognition score; results below this score are not returned | 0.5
use_angle_cls | Whether to load the classification model | FALSE
cls_model_dir | Classification inference model folder. Two ways to pass it: 1. None: automatically download the built-in model to ~/.paddleocr/cls; 2. the path of an inference model converted by yourself, which must contain the model and params files | None
cls_image_shape | Image shape of the classification algorithm | "3,48,192"
label_list | Label list of the classification algorithm | ['0','180']
cls_batch_num | Batch size of forward images during classification | 30
enable_mkldnn | Whether to enable MKL-DNN | FALSE
use_zero_copy_run | Whether to run the forward pass with zero_copy_run | FALSE
lang | The supported language; currently only Chinese (ch), English (en), French (french), German (german), Korean (korean) and Japanese (japan) are supported | ch
det | Enable detection when ppocr.ocr is executed | TRUE
rec | Enable recognition when ppocr.ocr is executed | TRUE
cls | Enable classification when ppocr.ocr is executed (in command-line mode, use use_angle_cls to control whether classification runs in the forward pass) | FALSE
show_log | Whether to print logs | FALSE
type | Perform OCR or table structuring; the value must be one of ['ocr','structure'] | ocr
ocr_version | OCR model version. Currently supported: PP-OCRv3 (Chinese and English detection, recognition, multilingual recognition and direction classifier models), PP-OCRv2 (Chinese detection and recognition models), PP-OCR (Chinese detection, recognition and direction classifier, and multilingual recognition models) | PP-OCRv3
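As an illustration of how these parameters are combined in practice, here is a minimal sketch using a few of the parameters listed above (values chosen purely for demonstration):

from paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_angle_cls=True,     # load the direction classification model
    lang='en',              # English recognition model
    det_db_box_thresh=0.6,  # discard detection boxes scoring below 0.6
    drop_score=0.6,         # drop recognition results scoring below 0.6
    use_gpu=False,          # run on CPU
    show_log=False,         # suppress log output
)
result = ocr.ocr('PaddleOCR/doc/imgs_en/img_12.jpg', cls=True)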







doc/doc_ch/quickstart.md (branch release/2.6)

PaddleOCR Quick Start

Note: This document mainly describes how to quickly use the PP-OCR series models through the PaddleOCR wheel package. To try the document-analysis features, please refer to the PP-Structure quick start tutorial.

1. Installation
1.1 Install PaddlePaddle
1.2 Install the PaddleOCR whl package
2. Quick Use
2.1 Command-line usage
2.1.1 Chinese and English models
2.1.2 Multilingual models
2.2 Python script usage
2.2.1 Chinese/English and multilingual usage
3. Summary

1. Installation

1.1 Install PaddlePaddle

If you do not have a basic Python environment, please refer to the environment preparation guide.

If your machine has CUDA 9 or CUDA 10 installed, run the following command:

python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple

If your machine is CPU-only, run the following command:

python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

For other version requirements, follow the instructions in the installation documentation on the PaddlePaddle website.
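To confirm that PaddlePaddle was installed correctly, you can run its built-in self-check. The short sketch below uses paddle.utils.run_check(), which ships with the paddle package:

import paddle

# Print the installed version and run a small program to verify that
# PaddlePaddle can execute on the available device (CPU or GPU).
print(paddle.__version__)
paddle.utils.run_check()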

1.2 Install the PaddleOCR whl package

pip install "paddleocr>=2.0.1" # version 2.0.1 or later is recommended

For Windows users: the shapely library installed directly through pip may fail with a [WinError 126] "The specified module could not be found" error. It is recommended to download the shapely installation package from here and install it from the downloaded file.
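For reference, installing a locally downloaded wheel looks like the sketch below; the filename is a placeholder and depends on your Python version and platform:

python3 -m pip install Shapely-1.8.5-cp39-cp39-win_amd64.whl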

2. Quick Use

2.1 Command-line usage

PaddleOCR provides a set of test images. Click here to download and extract them, then switch to the corresponding directory in the terminal:

cd /path/to/ppocr_img

If you are not using the provided test images, replace the --image_dir argument below with the path to your own test image.

2.1.1 Chinese and English models

Full pipeline (detection + direction classification + recognition): --use_angle_cls true enables the direction classifier, which recognizes text rotated by 180 degrees, and --use_gpu false disables the GPU.

paddleocr --image_dir ./imgs/11.jpg --use_angle_cls true --use_gpu false

The result is a list; each item contains the text box, the recognized text, and the recognition confidence:

[[[28.0, 37.0], [302.0, 39.0], [302.0, 72.0], [27.0, 70.0]], ('纯臻营养护发素', 0.9658738374710083)]

......

In addition, paddleocr also accepts PDF files as input. The page_num parameter controls how many of the leading pages are processed; the default is 0, which processes all pages.

paddleocr --image_dir ./xxx.pdf --use_angle_cls true --use_gpu false --page_num 2

Detection only: set --rec to false

paddleocr --image_dir ./imgs/11.jpg --rec false

The result is a list; each item contains only the text box:

[[27.0, 459.0], [136.0, 459.0], [136.0, 479.0], [27.0, 479.0]]

[[28.0, 429.0], [372.0, 429.0], [372.0, 445.0], [28.0, 445.0]]

......

Recognition only: set --det to false

paddleocr --image_dir ./imgs_words/ch/word_1.jpg --det false

The result is a list; each item contains only the recognized text and its confidence:

['韩国小馆', 0.994467]

Version notes

By default, paddleocr uses the PP-OCRv3 model (--ocr_version PP-OCRv3). To use another version, set the --ocr_version parameter. The versions are described below:

Version     Description
PP-OCRv3    Supports Chinese and English detection and recognition, a direction classifier, and multilingual recognition
PP-OCRv2    Supports Chinese and English detection and recognition and a direction classifier; multilingual models have not yet been updated
PP-OCR      Supports Chinese and English detection and recognition, a direction classifier, and multilingual recognition
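For example, combining the flags shown earlier, the same test image can be run with the previous-generation model by passing --ocr_version explicitly:

paddleocr --image_dir ./imgs/11.jpg --ocr_version PP-OCRv2 --use_angle_cls true --use_gpu false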

To add a model you trained yourself, add its download link and fields in paddleocr and rebuild the package.

For more on using the whl package, see the whl package documentation.
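Alternatively, if you have already exported your own inference models to disk, the whl package can be pointed at local model directories instead of the built-in download links. The sketch below uses the whl package's det_model_dir, rec_model_dir, and cls_model_dir parameters; the directory paths are placeholders for your own exported models:

from paddleocr import PaddleOCR

# Load locally exported inference models instead of downloading the defaults.
# The three directory paths below are placeholders.
ocr = PaddleOCR(
    det_model_dir='./my_models/det_infer/',
    rec_model_dir='./my_models/rec_infer/',
    cls_model_dir='./my_models/cls_infer/',
    use_angle_cls=True,
)
result = ocr.ocr('./imgs/11.jpg', cls=True)
for line in result[0]:
    print(line)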

2.1.2 Multilingual models

PaddleOCR currently supports 80 languages, which can be switched with the --lang parameter; for the English model, specify --lang=en.

paddleocr --image_dir ./imgs_en/254.jpg --lang=en

The result is a list; each item contains the text box, the recognized text, and the recognition confidence:

[[[67.0, 51.0], [327.0, 46.0], [327.0, 74.0], [68.0, 80.0]], ('PHOCAPITAL', 0.9944712519645691)]

[[[72.0, 92.0], [453.0, 84.0], [454.0, 114.0], [73.0, 122.0]], ('107 State Street', 0.9744491577148438)]

[[[69.0, 135.0], [501.0, 125.0], [501.0, 156.0], [70.0, 165.0]], ('Montpelier Vermont', 0.9357033967971802)]

......

Commonly used language abbreviations include:

Language                Abbreviation
Chinese                 ch
English                 en
Traditional Chinese     chinese_cht
French                  fr
German                  german
Italian                 it
Japanese                japan
Korean                  korean
Russian                 ru

For the full list of supported languages and their abbreviations, see the multilingual model tutorial.
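The same abbreviations are used in the Python API through the lang parameter. The sketch below is a minimal example; the image path is a placeholder for an image of your own:

from paddleocr import PaddleOCR

# Switch the recognition model to Korean; the model is downloaded on first use.
ocr = PaddleOCR(lang='korean', use_angle_cls=True)
result = ocr.ocr('./my_korean_image.jpg', cls=True)
for line in result[0]:
    print(line)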

2.2 Python script usage

2.2.1 Chinese/English and multilingual usage

When using the PaddleOCR whl package from a Python script, the package automatically downloads the lightweight PP-OCR models as the defaults.

Full pipeline: detection + direction classification + recognition

from paddleocr import PaddleOCR, draw_ocr

# The supported languages can be switched via the `lang` parameter,
# e.g. `ch`, `en`, `fr`, `german`, `korean`, `japan`.
ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # needs to run only once to download and load the model into memory
img_path = './imgs/11.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# Visualize the results.
# If simfang.ttf is not available locally, it can be downloaded from the doc/fonts directory.
from PIL import Image

result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

The result is a list; each item contains the text box, the recognized text, and the recognition confidence:

[[[28.0, 37.0], [302.0, 39.0], [302.0, 72.0], [27.0, 70.0]], ('纯臻营养护发素', 0.9658738374710083)]

......

The visualized result is saved as result.jpg by the script above.

If the input is a PDF file, the following code can be used for visualization:

from paddleocr import PaddleOCR, draw_ocr

# The supported languages can be switched via the `lang` parameter,
# e.g. `ch`, `en`, `fr`, `german`, `korean`, `japan`.
ocr = PaddleOCR(use_angle_cls=True, lang="ch", page_num=2)  # needs to run only once to download and load the model into memory
img_path = './xxx.pdf'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# Visualize the results.
import fitz
from PIL import Image
import cv2
import numpy as np

imgs = []
with fitz.open(img_path) as pdf:
    # pageCount/getPixmap are legacy PyMuPDF names (page_count/get_pixmap in newer releases).
    for pg in range(0, pdf.pageCount):
        page = pdf[pg]
        mat = fitz.Matrix(2, 2)
        pm = page.getPixmap(matrix=mat, alpha=False)
        # If width or height > 2000 pixels, don't enlarge the image.
        if pm.width > 2000 or pm.height > 2000:
            pm = page.getPixmap(matrix=fitz.Matrix(1, 1), alpha=False)
        img = Image.frombytes("RGB", [pm.width, pm.height], pm.samples)
        img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
        imgs.append(img)

for idx in range(len(result)):
    res = result[idx]
    image = imgs[idx]
    boxes = [line[0] for line in res]
    txts = [line[1][0] for line in res]
    scores = [line[1][1] for line in res]
    im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result_page_{}.jpg'.format(idx))

3. Summary

By working through this section, you should now be familiar with using the PaddleOCR whl package and have obtained your first results.

PaddleOCR is a rich, leading, and practical OCR toolkit that covers the full pipeline of data preparation, model training, compression, and inference deployment. Refer to the documentation and tutorials to start applying PaddleOCR in earnest.
