PaddleOCR
PaddleOCR: an OCR toolkit based on PaddlePaddle, including an ultra-lightweight Chinese OCR whose combined models total only 8.6 MB. A single model handles mixed Chinese/English/digit recognition, vertical text, and long text, and multiple text-detection and text-recognition training algorithms are supported.
PaddlePaddle / PaddleOCR
Branch release/2.6 — 6,035 commits; latest commit b1f6c21 by moe: "update: Usinig intuitive initialization of...".
Top-level contents: .github, PPOCRLabel, StyleText, applications, benchmark, configs, deploy, doc, ppocr, ppstructure, test_tipc, tools, .clang_format.hook, .gitignore, .pre-commit-config.yaml, .style.yapf, LICENSE, MANIFEST.in, README.md, README_ch.md, __init__.py, paddleocr.py, requirements.txt, setup.py, train.sh
README
Apache-2.0
English | 简体中文 | हिन्दी | 日本語 | 한국인 | Pу́сский язы́к
Introduction
PaddleOCR aims to provide a rich, leading, and practical OCR toolkit that helps developers train better models and put them into production.
Recent updates
2023.3.10: PaddleOCR integrates FastDeploy, a high-performance, all-scenario model deployment solution; see the guide to try it (note: use the dygraph branch).
2022.12: Released the e-book "20 OCR Industry Case Studies", adding 7 new scenario applications such as Mongolian script, ID cards, and LCD-screen defect detection.
2022.11: Added 4 cutting-edge algorithms: text detection DRRG, text recognition RFL, text super-resolution Text Telescope, and formula recognition CAN.
2022.10: Optimized the JS version of the PP-OCRv3 model: only 4.3 MB, 8x faster inference, with a ready-to-use web demo.
Live replay: the PaddleOCR team explains the PP-StructureV2 optimization strategy. Scan the QR code below with WeChat, follow the official account, and fill in the questionnaire to join the official group, get the replay link, and receive a 20 GB OCR learning pack (PDF-to-Word application, 10 vertical models, the "Dive into OCR" e-book, and more).
2022.8.24: Released PaddleOCR release/2.6
Released PP-StructureV2 with comprehensive functional and performance upgrades, adapted to Chinese scenarios, new layout-recovery support, and one-command PDF-to-Word conversion;
Layout analysis model optimization: 95% smaller storage, 11x faster, average CPU latency of only 41 ms;
Table recognition model optimization: 3 optimization strategies raise accuracy by 6% at unchanged inference cost;
Key information extraction model optimization: a vision-independent model structure improves semantic entity recognition accuracy by 2.8% and relation extraction accuracy by 9.1%.
2022.8: Released the OCR scene application collection: 9 vertical models covering digital tubes, LCD screens, license plates, a high-accuracy SVTR model, handwriting, and more — the main vertical OCR applications in general, manufacturing, finance, and transportation use cases.
2022.8: Added 8 cutting-edge algorithms
Text detection: FCENet, DB++
Text recognition: ViTSTR, ABINet, VisionLAN, SPIN, RobustScanner
Table recognition: TableMaster
2022.5.9: Released PaddleOCR release/2.5
Released PP-OCRv3: at comparable speed, Chinese scenarios improve by another 5% over PP-OCRv2, English scenarios by 11%, and the average accuracy of the 80-language multilingual models by more than 5%;
Released the semi-automatic annotation tool PPOCRLabel v2: added annotation for table-text images, key-information-extraction tasks, and irregular-text images;
Released an OCR industrial deployment toolset covering 22 training/deployment software and hardware environments, meeting roughly 90% of enterprises' training and deployment needs;
Released the interactive open-source e-book "Dive into OCR", covering frontier theory and code practice across the OCR stack, with accompanying teaching videos.
More
Features
Supports a wide range of cutting-edge OCR algorithms, and on top of them builds the industrial-grade models PP-OCR and PP-Structure, with an end-to-end pipeline covering data production, model training, compression, and inference deployment.
For how to use the above, start from the Quick Start section of the documentation tutorials.
⚡ Quick Start
Online demo of the ultra-lightweight PP-OCR mobile model: https://www.paddlepaddle.org.cn/hub/scene/ocr
Mobile demo: installer DEMO download (based on EasyEdge and Paddle-Lite, supports iOS and Android)
One-command usage: Quick Start (Chinese & English / multilingual / document analysis); a minimal Python sketch is shown below
"Dive into OCR" e-book
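For orientation, a minimal Python sketch of that one-command workflow (not part of the README itself; 'your_image.jpg' is a placeholder, and whether the result list is nested per page depends on the installed paddleocr version):

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='ch')   # downloads the default PP-OCR models on first use
result = ocr.ocr('your_image.jpg', cls=True)     # 'your_image.jpg' is a placeholder path
for line in result[0]:                           # recent releases nest results per page/image
    box, (text, score) = line
    print(text, score)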
Open-source community
Project cooperation: if you are an enterprise developer with a clear vertical OCR need, fill in the questionnaire to start cooperation with the official team at no cost.
Join the community: scan the QR code with WeChat and fill in the questionnaire to join the group and receive the 20 GB OCR learning pack,
including the "Dive into OCR" e-book with accompanying videos and notebook projects, and replay links for past PaddleOCR release live streams;
the OCR scene application model collection: digital tubes, LCD screens, license plates, the high-accuracy SVTR model, handwriting, and other vertical models covering the main vertical OCR applications in general, manufacturing, finance, and transportation use cases;
the PDF2Word application; and videos of outstanding community developer projects.
Community projects: the community-projects document collects tools and applications built with PaddleOCR by community users, plus contributed features, documentation, and code — an honor wall for community developers and a broadcast channel for quality projects.
Community regular competition: a points-based contest for OCR developers covering documentation, code, models, and applications, judged and rewarded quarterly; see the link for topics and registration.
PaddleOCR official group QR code
PP-OCR series model list (continually updated)
| Model description | Model name | Recommended scenario | Detection model | Direction classifier | Recognition model |
| --- | --- | --- | --- | --- | --- |
| Chinese/English ultra-lightweight PP-OCRv3 model (16.2M) | ch_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
| English ultra-lightweight PP-OCRv3 model (13.4M) | en_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
For more ultra-lightweight OCR models (including multilingual ones), see the PP-OCR series model downloads; for document-analysis models, see the PP-Structure series model downloads.
PaddleOCR scene application models
| Industry | Category | Highlights | Documentation | Model download |
| --- | --- | --- | --- | --- |
| Manufacturing | Digital tube recognition | Digital-tube data synthesis, tuning for missed recognitions | Optical power meter digital-tube character recognition | download link |
| Finance | General form recognition | Multimodal structured extraction of general forms | Multimodal form recognition | download link |
| Transportation | License plate recognition | Multi-angle image processing, lightweight model, on-device deployment | Lightweight license plate recognition | download link |
For more vertical OCR application models in manufacturing, finance, and transportation (electric meters, LCD screens, the high-accuracy SVTR model, etc.), see the scene application model downloads.
Documentation tutorials
- Environment setup
- PP-OCR text detection and recognition
  - Quick start
  - Model zoo
  - Model training: text detection, text recognition, text direction classifier
  - Model compression: model quantization, model pruning, knowledge distillation
  - Inference and deployment: Python prediction engine, C++ prediction engine, serving deployment, on-device deployment, Paddle2ONNX conversion and prediction, cloud deployment tools for PaddlePaddle, benchmark
- PP-Structure document analysis
  - Quick start
  - Model zoo
  - Model training: layout analysis, table recognition, key information extraction
  - Inference and deployment: Python prediction engine, C++ prediction engine, serving deployment
- Cutting-edge algorithms and models: text detection algorithms, text recognition algorithms, end-to-end OCR algorithms, table recognition algorithms, key information extraction algorithms, adding a new algorithm with the PaddleOCR architecture
- Scene applications
- Data annotation and synthesis: semi-automatic annotation tool PPOCRLabel, data synthesis tool Style-Text, other annotation tools, other synthesis tools
- Datasets: general Chinese/English OCR, handwritten Chinese OCR, vertical multilingual OCR, layout analysis, table recognition, key information extraction
- Code organization
- Visual results
- "Dive into OCR" e-book
- Open-source community
- FAQ: general questions, PaddleOCR practical questions
- References
- License
Visual results (more)
PP-OCRv3 Chinese model
PP-OCRv3 English model
PP-OCRv3 multilingual model
PP-Structure document analysis
- Layout analysis + table recognition
- SER (semantic entity recognition)
- RE (relation extraction)
License
This project is released under the Apache 2.0 license.
Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
GitHub - PaddlePaddle/PaddleOCR: Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
PaddlePaddle/PaddleOCR
Branch release/2.7 — 6,131 commits. Top-level contents: .github, PPOCRLabel, StyleText, applications, benchmark, configs, deploy, doc, ppocr, ppstructure, test_tipc, tools, .clang_format.hook, .gitignore, .pre-commit-config.yaml, .style.yapf, LICENSE, MANIFEST.in, README.md, README_en.md, __init__.py, paddleocr.py, requirements.txt, setup.py, train.sh
README (Apache-2.0 license). English | 简体中文 | हिन्दी | 日本語 | 한국인 | Pу́сский язы́к
Recent updates
The PaddleOCR algorithm model challenge is now open! Registration runs 1/15–3/31, with a 300,000 RMB prize pool — come show your skills!
2023.11: Released PP-ChatOCRv2: one SDK covering 20+ high-frequency application scenarios, supporting 5 text-image intelligent analysis capabilities and their deployment, including key information extraction for general scenarios (express waybills, business licenses, vehicle registration certificates, etc.), key information extraction for complex documents (rare characters, special punctuation, multi-page PDFs, tables), general OCR, document-scenario OCR, and general table recognition. For vertical business scenarios it also supports model training, fine-tuning, and prompt optimization.
2023.8.7: Released PaddleOCR release/2.7
Released PP-OCRv4 in mobile and server variants
PP-OCRv4-mobile: at comparable speed, Chinese scenarios improve by another 4.5% over PP-OCRv3, English scenarios by 10%, and the average accuracy of the 80-language multilingual models by more than 8%
PP-OCRv4-server: the most accurate OCR model released so far; on Chinese/English scenarios the detection model improves by 4.9% and the recognition model by 2%
See the Quick Start for one-command usage; you can also run the full training/inference/high-performance-deployment workflow with low code via the general OCR industrial solution in the PaddleX AI suite
Released PP-ChatOCR, a new general-scenario key information extraction solution that combines the PP-OCR model with the ERNIE large language model
More
Features
Supports a wide range of cutting-edge OCR algorithms, and on top of them builds the industrial-grade models PP-OCR, PP-Structure, and PP-ChatOCRv2, with an end-to-end pipeline covering data production, model training, compression, and inference deployment.
For how to use the above, start from the Quick Start section of the documentation tutorials.
⚡ Quick Start
Free online demos:
PP-OCRv4: https://aistudio.baidu.com/application/detail/7658
PP-ChatOCRv2: https://aistudio.baidu.com/application/detail/10368
One-command usage: Quick Start (Chinese & English / multilingual / document analysis)
Mobile demo: installer DEMO download (based on EasyEdge and Paddle-Lite, supports iOS and Android)
Technical exchange and cooperation
PaddleX, the PaddlePaddle low-code development tool — a one-stop development tool built on selected PaddlePaddle models for mainstream AI hardware at home and abroad. Core advantages:
- Industrial high-accuracy model library: 40+ selected models covering 10 mainstream AI tasks.
- Featured model pipelines: pipelines that fuse large and small models for higher accuracy and better results.
- Low-code development mode: a graphical interface with a unified development paradigm, convenient and efficient.
- Private deployment with multi-hardware support: adapted to mainstream AI hardware, supports fully offline local use, and meets enterprise security and confidentiality needs.
PaddleX website: https://aistudio.baidu.com/intro/paddlex
PaddleX official channel: https://aistudio.baidu.com/community/channel/610
"Dive into OCR" e-book
Open-source co-construction
Join the community: thank you for your long-standing support of PaddleOCR. Building a professional, friendly, mutually supportive open-source community together with developers is PaddleOCR's goal. Developers are warmly welcome to join the PaddlePaddle open-source community and build it together. To thank community developers for their code contributions to PaddleOCR release/2.7, we will produce and mail open-source contribution certificates; please fill in the questionnaire with the necessary mailing information.
Community activities: the PaddlePaddle open-source community runs a rich variety of activities and development tasks. In the PaddleOCR community you can follow the activities below and join whichever interests you:
- Happy open-source regular competition for PaddlePaddle suites: an upgraded OCR community contest aimed at building a more usable OCR suite, including (but not limited to) training and inference of state-of-the-art models and polishing OCR tools and application projects; any contribution that helps ideas flow and problems get solved is welcome. Let's grow into important contributors to the PaddlePaddle suites together.
- New feature requests: which features do you wish existed in your day-to-day deep-learning research and practice? Describe the feature and your initial implementation idea in the given format; we discuss these requests regularly and fold them into future release plans.
- PP-SIG technical seminars: PP-SIGs are virtual groups of PaddlePaddle community developers who share an interest. Through regular technical seminars they share industry trends, discuss community needs and implementation details, and launch joint community contribution tasks. PaddleOCR hopes to help every developer with a dream realize their ideas through AI and enjoy the pleasure of creating value.
Project cooperation: if your company has a clear vertical OCR need, we recommend PaddleX, the one-stop train-compress-infer development platform, to help AI land quickly. PaddleX also supports co-creation development with profit sharing; individual and enterprise developers are welcome to join and build a thriving AI ecosystem together.
PP-OCR series model list (continually updated)
| Model description | Model name | Recommended scenario | Detection model | Direction classifier | Recognition model |
| --- | --- | --- | --- | --- | --- |
| Chinese/English ultra-lightweight PP-OCRv4 model (15.8M) | ch_PP-OCRv4_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
| Chinese/English ultra-lightweight PP-OCRv3 model (16.2M) | ch_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
| English ultra-lightweight PP-OCRv3 model (13.4M) | en_PP-OCRv3_xx | Mobile & server | inference model / trained model | inference model / trained model | inference model / trained model |
PaddleOCR local deployment (installation, usage, model optimization/acceleration) — CSDN blog
By 吨吨不打野 (CSDN). First published 2021-06-04 10:57:41, last edited 2023-11-23 08:49:26. Part of the column "OCR数字仪表识别" (OCR digital instrument recognition). Original post; reproduction requires the author's permission. Link: https://blog.csdn.net/Castlehe/article/details/117356343
Contents
1. Installation
  1.1 Paddle is still needed
  1.2 Confirming packages and environment
  1.3 Maybe paddle is not needed?
2. Usage
  2.1 Configuring the camera: read, recognize, display
  2.3 Problems with the detection model
    2.3.1 Switching models
    2.3.2 Restricting the detection region
3. Performance improvements
  3.0 Baseline
  X. Speed of my own model vs. all-default models
  3.1 On-device deployment
  3.2 Acceleration
    3.2.1 MKL-DNN acceleration on CPU
    3.2.2 Tuning parameters
    3.2.3 Memory leak
    3.2.4 Notes on the memory-leak issue
  3.3 Pruning
  3.4 Other possible approaches
    3.3 Switching models
    3.4 Multi-process
  3.5 CPU usage
    3.5.1 Binding paddle to CPU cores
  3.6 Inference deployment docs
1. Installation
1.1 Paddle is still needed
Following the paddleocr package usage guide, I first assumed this would be enough:
pip install "paddleocr>=2.0.1"
But of course it was not, so paddle itself still has to be installed locally — only paddle, though. See the quick install guide:
# on Windows use `python` directly instead of `python3`
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
Running after installation hit the classic shapely error (see: "Win10 CPU environment, OSError: [WinError 126] module not found", issue #212). On Windows, shapely has to be downloaded as a wheel from that site and then:
pip uninstall shapely
pip install Shapely-1.7.1-cp37-cp37m-win_amd64.whl
or:
conda install shapely -c conda-forge
or: rename the wheel to Shapely-1.7.0-cp39-cp39-win_amd64.rar, unpack it, find geos_c.dll under shapely\DLLs\, and copy it into the conda environment directory (mine is named ocr): C:\Users\myusername\Miniconda3\envs\ocr\Library\bin. Also copy geos.dll and geos_c.dll into library\bin of your anaconda environment. Problem solved.
The simplest fix of all: delete the previously installed shapely from anaconda (both the folder and the package) and reinstall it.
Reference: anaconda3 + paddleOCR installation and usage.
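A quick sanity check after the installs above (a sketch, not from the original post; paddle.utils.run_check() is PaddlePaddle's built-in installation check):

import paddle
paddle.utils.run_check()            # verifies that PaddlePaddle itself is installed and working

from paddleocr import PaddleOCR     # fails here if shapely or another dependency is still broken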
1.2 Confirming packages and environment
On my machine I used the default anaconda environment, with these versions:
python 3.7.6; paddleocr (installed via pip) reports version paddleocr-2.0.6-py3
Contents of the requirements file:
shapely, scikit-image==0.17.2, imgaug==0.4.0, pyclipper, lmdb, opencv-python==4.2.0.32, tqdm, numpy, visualdl, python-Levenshtein
1.3 Maybe paddle is not needed?
According to the paddleocr FAQ:
Q3.4.23: After installing paddleocr, it complains that paddle is missing. A: This is because the GPU and CPU builds of paddlepaddle have different package names; installation notes have been added to the whl documentation.
However, per the prediction example (Python), using the paddle-family models requires paddle inference anyway.
So just install paddle and be done with it.
2. Usage
Just adjust the paths.
One thing worth noting is the output format:
[[[[72.0, 149.0], [113.0, 151.0], [113.0, 166.0], [72.0, 163.0]], ('40', 0.7172388)],
[[[62.0, 170.0], [237.0, 175.0], [233.0, 300.0], [58.0, 294.0]], ('1076', 0.9666834)]]
Each recognized text line is an array containing the four corner points of its box (a 2-D array) plus a tuple with the final result (recognized text, confidence). When there are multiple text results, one outer array wraps all of them.
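A small sketch of how that structure can be unpacked, assuming the flat list format shown above (older paddleocr whl releases; newer ones wrap the list once more per page):

# `res` is the flat list printed above: [[box, (text, score)], ...]
res = [
    [[[72.0, 149.0], [113.0, 151.0], [113.0, 166.0], [72.0, 163.0]], ('40', 0.7172388)],
    [[[62.0, 170.0], [237.0, 175.0], [233.0, 300.0], [58.0, 294.0]], ('1076', 0.9666834)],
]
for box, (text, score) in res:
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    print(f"text={text!r} score={score:.3f} bbox=({min(xs)},{min(ys)})-({max(xs)},{max(ys)})")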
2.1 Configuring the camera: read, recognize, display
See my other post: calling a camera with Python/OpenCV, recognizing, and drawing the results.
One odd discovery: if the USB camera is already plugged in when the computer boots, cap = cv2.VideoCapture(0) gives it index 0; if it is plugged in after boot, its index becomes 2 (my machine has one front and one rear built-in camera).
For tuning camera parameters, see my other post: OpenCV camera parameters.
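A minimal sketch of the camera loop described above, assuming the 2.0.x whl's flat result format and that ocr.ocr() is given the raw numpy frame; the window name and quit key are arbitrary:

import cv2
import numpy as np
from paddleocr import PaddleOCR

ocr = PaddleOCR(lang='ch', use_angle_cls=False)    # load the models once, outside the loop
cap = cv2.VideoCapture(0)                          # the index may be 0 or 2, as noted above
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = ocr.ocr(frame)                        # the whl package accepts a numpy BGR image
    for box, (text, score) in result:              # flat result format of the 2.0.x package
        pts = np.array(box, dtype=np.int32)
        cv2.polylines(frame, [pts], True, (0, 255, 0), 2)
        print(text, round(score, 3))               # cv2.putText cannot draw Chinese glyphs
    cv2.imshow('ocr', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):          # press q to quit
        break
cap.release()
cv2.destroyAllWindows()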
2.3 Problems with the detection model
Because the images come from a camera, the background is cluttered, which is hard for detection, and DB does not work very well here. (I have not studied detection models much, so it is hard to say exactly where the problem lies.)
2.3.1 Switching models
According to the paddleocr docs, EAST actually has higher accuracy than DB, albeit with some over-detection.
EAST is efficient and accurate, but weaker on curved text.
In paddleocr.py you can see:
parser.add_argument("--det_algorithm", type=str, default='DB')
# changed it to EAST when calling, but got an error
And then in the code:
SUPPORT_DET_MODEL = ['DB']
VERSION = 2.0
SUPPORT_REC_MODEL = ['CRNN']
BASE_DIR = os.path.expanduser("~/.paddleocr/")
Conclusion: the downloaded model is a pre-trained model, not an inference model, so the EAST algorithm cannot be used here (and the whl package only lists DB in SUPPORT_DET_MODEL anyway).
2.3.2 Restricting the detection region
Set up a key handler — the OpenCV camera loop responds to the keyboard, so a key press can trigger the corresponding action. See "cv2.VideoCapture get/set explained" for reading camera parameters, and "opencv python full-screen display, window size and position" for window control:
cap = cv2.VideoCapture(1)
cap.get(cv2.CAP_PROP_FRAME_WIDTH)    # frame width of the video stream
cap.get(cv2.CAP_PROP_FRAME_HEIGHT)   # frame height of the video stream
# besides get, there is also set
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1080)   # width
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 960)   # height
roi = frame[top:bottom, left:right]       # crop the region of interest
Reference: cropping images with python cv2.
3. Performance improvements
3.0 Baseline
Detection takes quite long: detection + recognition together is roughly 0.7–1.2 s on a CPU machine, which is awkward.
First check the model sizes. When detection runs it prints its configuration, which shows the following:
Namespace(cls_batch_num=6, cls_image_shape='3, 48, 192',
cls_model_dir='C:\\Users\\huangshan/.paddleocr/2.1/cls', cls_thresh=0.9, det=True,
det_algorithm='DB', det_db_box_thresh=0.3, det_db_thresh=0.2,
det_db_unclip_ratio=2.2, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2,
det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max',
det_model_dir='C:\\Users\\huangshan/.paddleocr/2.1/det/ch', drop_score=0.5,
enable_mkldnn=False, gpu_mem=8000, image_dir='', ir_optim=True, label_list=['0',
'180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN',
rec_batch_num=6, rec_char_dict_path='C:/shaiic_work/ZhiNengKeJiOCR/digit.txt',
rec_char_type='ch', rec_image_shape='3, 32, 320',
rec_model_dir='C:/shaiic_work/ZhiNengKeJiOCR/rec_crnn_digit', use_angle_cls=False,
use_dilation=False, use_gpu=False, use_pdserving=False, use_space_char=True,
use_tensorrt=False, use_zero_copy_run=False)
The detection model in use is the bundled one, at det_model_dir='C:\\Users\\yourname/.paddleocr/2.1/det/ch'; it is only about 3 MB. The code shows where this default model comes from:
'rec': {
'ch': {
'url':'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar',
'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
},
Since the default is ch_ppocr_mobile_v2.0_rec_infer.tar, it is presumably already a slimmed (mobile) model.
See also the PP-OCR 2.0 series model list documentation.
The recognition model is one I trained myself and exported to an inference model; at 94 MB it really is heavy for recognizing a simple digital instrument panel.
X. Speed of my own model vs. all-default models
The timings with my own model were shown above; below are a few timings with everything at defaults.
Findings: without MKL-DNN acceleration, default detection + my own recognition takes roughly 0.7–1.2 s, while default detection + default recognition takes roughly 0.7–0.9 s. So although recognition itself is already under 0.2 s, it can still get faster, leaving detection as the main cost. My guess: the first-stage detection model is pruned/quantized (say 8-bit); if the second-stage recognition model uses the same precision, the pipeline handles both stages consistently, whereas mismatched precisions between the two stages may also cost something.
3.1 On-device deployment
See the on-device deployment docs. This effectively shrinks model size, but the docs do not really claim it speeds up inference.
3.2 Acceleration
Searching the FAQ for "acceleration" gives the following.
Q3.1.73: How to use TensorRT to speed up PaddleOCR prediction? A: The dygraph branch of paddle already supports TensorRT prediction for both Python and C++. For Python inference, pass --use_tensorrt=True; for C++ TensorRT prediction you need a TRT-enabled prediction library and -DWITH_TENSORRT=ON at build time. To add TensorRT support to other branches, see the referenced PR. Note: TensorRT >= 6.1.0.5 is recommended.
Searching for "speed" gives:
Q3.4.40: Deployment with hub_serving has high latency — what could the causes be? A: First, the first image is always slow; test several images and look at the later ones. Second, if you deploy a server-class model (e.g. ResNet34 backbone) on CPU it will be slow; on CPU, deploy the mobile model (e.g. MobileNetV3 backbone) instead.
So on CPU, deploy the mobile model.
The prediction/deployment FAQ section also has some useful entries:
Q3.4.1: How do I pip-install the opt model conversion tool? A: On-device OCR deployment needs operators that only exist in the latest Paddle-Lite develop branch, so you have to build the opt tool yourself by compiling PaddleLite; see section 2.1 "model optimization" of the lite deployment doc.
Q3.4.2: How do I wrap the PaddleOCR prediction model into an SDK? A: For Python, wrap TextSystem from tools/infer/predict_system.py; for C++, wrap DBDetector and CRNNRecognizer under deploy/cpp_infer/src.
3.2.1 MKL-DNN acceleration on CPU
Since detection is the slow part, optimizing the recognition model further would not help much, so the first thing to try is MKL-DNN acceleration.
Q3.1.77: "Please compile with MKLDNN first to use MKLDNN" when enabling mkldnn. A: The current environment has no MKL-DNN; check whether the CPU supports it (it cannot be used on macOS), or the prediction library may have been built without MKL-DNN — download a CPU prediction library with MKL-DNN support.
Q1.1.10: What are the CPU acceleration options for PaddleOCR prediction, and what does TensorRT-based GPU acceleration require of the input? A: (1) On CPU, MKL-DNN can be used: for Python inference set enable_mkldnn to true (see the referenced code); for C++ inference set use_mkldnn 1 in the config file. (2) On GPU, watch out for variable-length inputs; only TRT 6+ supports variable-length input.
Setting enable_mkldnn to true in the inference code did make things somewhat faster: from 0.7–1.2 s down to roughly 0.6–0.98 s, never above 1 s, mostly around 0.8 s.
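When calling through the whl package rather than the repo scripts, the same switch can be passed to the constructor — a minimal sketch assuming the enable_mkldnn keyword of the 2.x whl releases:

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_gpu=False,
                enable_mkldnn=True,    # CPU acceleration via MKL-DNN / oneDNN
                use_angle_cls=False)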
3.2.2 Tuning parameters
There are more parameters worth adjusting, for example:
parser.add_argument("--det_limit_side_len", type=float, default=960)
According to the FAQ:
Q3.3.2: Are there detection thresholds in the config file? A: Yes; the main detection parameters are: det_limit_side_len (long-side size the image is resized to at prediction time), det_db_thresh (binarization threshold for the output map), det_db_box_thresh (threshold for filtering text boxes; boxes below it are dropped), and det_db_unclip_ratio (expansion factor of the text box, which controls its size). Their defaults are in the code and can be overridden from the command line.
det_limit_side_len defaults to 960; consider a smaller multiple of 32, e.g. 320. Going even smaller to 256, with max_text_length=5 and rec_image_shape=(3,32,256) instead of the default (3,32,320), the speed started at 0.6 s but gradually settled around 0.8–0.9 s. Oddly, the task manager showed memory usage reaching 98% — on a 32 GB machine. Speechless.
Also, the parameter table at the end of the paddleocr package usage guide shows that detection has many tunable parameters, while recognition has very few.
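A sketch of passing those detection/recognition parameters through the whl package; the concrete values below are the hypothetical ones discussed above and need tuning against your own images:

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_gpu=False,
                det_limit_side_len=320,        # resize the long side to 320 instead of the default 960
                det_limit_type='max',
                det_db_box_thresh=0.3,
                det_db_unclip_ratio=2.0,
                max_text_length=5,             # the meter only shows a handful of digits
                rec_image_shape='3, 32, 256')  # instead of the default '3, 32, 320'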
3.2.3 Memory leak
According to the FAQ:
Q3.4.43: GPU memory blow-up / memory leak at prediction time? A: Turn on the memory optimization switch enable_memory_optim; the relevant code has been merged — see the details.
The code is here: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/tools/infer/utility.py#L153
Q3.4.17: Memory leak during prediction. A: 1. The hubserving memory leak is a known issue expected to be fixed in the paddle 2.0 release; see the linked issue. 2. The C++ prediction memory leak was fixed in paddle 2.0rc; install paddle 2.0rc and update PaddleOCR to the latest.
Checking that issue on 2021.6.3: the fix had landed 26 days earlier, and my environment was not 2.0rc, so time to switch. paddle 2.0rc is no longer available, but 2.1 can be installed directly:
python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
# this just reports that 2.0 is already installed, so instead:
pip install --upgrade paddlepaddle -i https://mirror.baidu.com/pypi/simple
# upgrades to the latest, i.e. 2.1
After the change, memory usage was still very high and inference actually got slower (over 1 s), although quality seemed a bit better — even some blurry text improved. Switching back to the bundled recognition model was also slower than before. But after upgrading to 2.1 and turning MKL-DNN on, speed improved to roughly 0.3–0.5 s — with memory usage at basically 100%. Super fast after acceleration, but with extremely high memory usage.
Final conclusion: the memory leak comes from enabling MKL-DNN, so it has to be turned off. Also set the CPU thread count to 1; otherwise memory usage stays very high, and using a single thread barely affects speed.
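Expressed through the whl package, the final configuration would look roughly like this; note that the cpu_threads keyword only exists in later paddleocr releases — on the 2.0.x/2.1 whl you instead edit set_cpu_math_library_num_threads(...) in tools/infer/utility.py as shown in section 3.5.1:

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_gpu=False,
                enable_mkldnn=False,   # MKL-DNN off: avoids the memory growth described above
                cpu_threads=1)         # one CPU math thread; newer releases expose this argument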
3.2.4 Notes on the memory-leak issue
There are many paddle issues complaining about slow speed:
"CPU acceleration, but recognition takes nearly 30 s — any way to speed it up?" #2950 (that one uses the server-side model); PPOCRLabel auto-labelling going from fast to slow: #1391; similarly, "PPOCRLabel auto-labelling crashes after running for a while" #2724; "2.x is twice as slow as 1.x" #2630; and "memory keeps growing / overflows during recognition" #303 — that issue is closed, but people keep reporting the same error underneath it.
3.3 Pruning
While downloading the MKL-DNN-enabled CPU prediction library I found a very useful doc dedicated to paddle inference models: https://paddle-inference.readthedocs.io/en/latest/index.html
Model quantization (which here mostly means slimming) — deploying quantized models on X86 CPUs.
A short summary, copied over:
As is well known, model quantization can effectively speed up inference, and PaddlePaddle provides strong quantization support; that doc covers deploying PaddleSlim-quantized models on X86 CPUs. For common image-classification models, on Cascade Lake machines (e.g. Intel Xeon Gold 6271, 6248, X2XX) INT8 inference is usually 3–3.7x faster than FP32; on SkyLake machines (e.g. Intel Xeon Gold 6148, 8180, X1XX) it is usually about 1.5x faster. The steps: produce a quantized model with PaddleSlim, convert it into the final deployable quantized model, and deploy it with the Paddle Inference library.
I initially did not want to go down this road, because the slowness is in detection and the detection model is already the slimmed one. After comparing the all-default setup (detection + recognition) with default detection + my own recognition model, there is in fact some gain,
but compared with the cost of quantizing, it is not worth it.
3.4 Other possible approaches
3.3 Switching models
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_ch/models_list.md
3.4 Multi-process
FAQ — how to run paddleocr with multiple processes?
Q3.4.33: How to run paddleocr with multiple processes? A: Instantiate several paddleocr services, register them with a registry, and schedule them through the registry; look up eureka (or any other registry) for the details.
Q3.4.44: How to predict with multiple processes? A: PaddleOCR recently added multi-process prediction parameters: use_mp enables multi-process mode and total_process_num sets the number of processes; see the docs for usage.
parser.add_argument("--use_mp", type=str2bool, default=False)
# only usable from the command line
# with the direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --cls_model_dir="./inference/cls/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=true
# without the direction classifier
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false
# with multiple processes
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/00018069.jpg" --det_model_dir="./inference/det_db/" --rec_model_dir="./inference/rec_crnn/" --use_angle_cls=false --use_mp=True --total_process_num=6
Looking closer, https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/paddleocr.py is very similar to PaddleOCR/tools/infer/utility.py; the paddleocr.py inside the wheel package is essentially a simplified subset of utility.py, packaged for convenient calling. A multi-process sketch that uses the whl package directly is given below.
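Following the FAQ's "instantiate several services" advice, a sketch of plain Python multiprocessing with one PaddleOCR instance per worker process ('./imgs/*.jpg' and the process count are placeholders):

import glob
from multiprocessing import Pool

from paddleocr import PaddleOCR

_ocr = None

def _init_worker():
    # One PaddleOCR instance per process; a predictor cannot be shared across processes.
    global _ocr
    _ocr = PaddleOCR(use_angle_cls=False, use_gpu=False)

def _run(path):
    return path, _ocr.ocr(path)

if __name__ == '__main__':
    images = glob.glob('./imgs/*.jpg')        # placeholder image folder
    with Pool(processes=2, initializer=_init_worker) as pool:
        for path, result in pool.imap_unordered(_run, images):
            print(path, len(result))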
3.5 CPU usage
Besides the memory leak, another problem is CPU hogging: by default usage climbs to 100%. Many people report this, e.g. in the Baidu AI community (ppocr section). The FAQ has nothing on it; see also "mobile arm cpu optimization notes, part 3 — cpu affinity".
Checking the core count (how to view CPU cores on Win10): this machine has 8 cores.
3.5.1 Binding paddle to CPU cores
The inference deployment docs (section 3.6) touch on this a little, mainly: Docs » Python API » Config class » 3. Predicting with the CPU.
Then I remembered PaddleOCR's own code: the five scripts in the infer folder; the key is figuring out how the config options in utility.py are set. Searching by hand was tedious, so I debugged instead: set a breakpoint near the ocr call and inspect the variables — there is the ocr object (the PaddleOCR class), which contains a text detector and a text recognizer. Expanding them shows that most parameters are configured on the text detector, which configures many things but nothing CPU-related.
In fact utility.py has the relevant code — find your local paddleocr installation, e.g. C:\software\anaconda\Lib\site-packages\paddleocr\tools\infer, around line 133:
if args.use_gpu:
    config.enable_use_gpu(args.gpu_mem, 0)
    if args.use_tensorrt:
        config.enable_tensorrt_engine(
            precision_mode=inference.PrecisionType.Half
            if args.use_fp16 else inference.PrecisionType.Float32,
            max_batch_size=args.max_batch_size)
else:
    config.disable_gpu()
    # the key CPU setting
    config.set_cpu_math_library_num_threads(6)
    if args.enable_mkldnn:
        # cache 10 different shapes for mkldnn to avoid memory leak
        # the key mkldnn setting
        config.set_mkldnn_cache_capacity(10)
        config.enable_mkldnn()
        # TODO LDOUBLEV: fix mkldnn bug when bach_size > 1
        # config.set_mkldnn_op({'conv2d', 'depthwise_conv2d', 'pool2d', 'batch_norm'})
        args.rec_batch_num = 1
According to Docs » Python API » Config class » 3. Predicting with the CPU:
When enough CPU cores are available, you can raise the thread count with set_cpu_math_library_num_threads; the default is 1.
So to limit how many CPU cores are used, change the line in the code:
config.set_cpu_math_library_num_threads(6)
# change 6 to 4
Also, since MKL-DNN is enabled, the same doc notes:
Enabling MKL-DNN requires that CPU prediction is already in use, otherwise it has no effect; enabling MKL-DNN BF16 additionally requires a CPU with AVX512 support.
# set the MKL-DNN cache capacity
config.set_mkldnn_cache_capacity(1)
In the end I changed the CPU thread count from 6 to 4 and the mkldnn cache from 10 to 5. The change only takes effect after rebooting the machine — reload() did not seem to help, and neither did restarting PyCharm. Detection speed dropped again, though. After the reboot, CPU usage stayed around 50% on the first run, but not on the second.
This is because Python caches the paddleocr package; you have to delete it from sys.modules and import it again (see my other post: python3 reload). A small sketch follows.
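A sketch of the sys.modules trick (whether it is enough also depends on whether a predictor object was already constructed; restarting the interpreter remains the reliable route):

import sys

# Drop every cached paddleocr (sub)module so the edited utility.py is re-read on the next import.
for name in [m for m in sys.modules if m.startswith('paddleocr')]:
    del sys.modules[name]

import paddleocr   # re-imports the package, picking up the modified thread / mkldnn settings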
3.6 Inference deployment docs
Paddle has dedicated documentation on inference deployment that is worth reading — the "inference deployment" section, which also includes a nice overview diagram.
From the official site you can also reach: the Python prediction deployment example, and "(2) Mask detection inference with Paddle Inference".
PaddleOCR, a Python library whose text recognition rivals commercial offerings! — Zhihu
By 小张Python
1. Foreword
Hello everyone, I'm Xiao Zhang. This post introduces a GitHub project for OCR text recognition. I have covered Python OCR twice before: one post on Python with Tesseract ("a Python package that does OCR in a few lines of code") — Tesseract is based on traditional machine learning and is decent on English characters, but its Chinese recognition leaves much to be desired; and one post on the deep-learning project Easy-OCR ("a deep-learning Python library for OCR text detection"), which recognizes 70+ languages and also detects and boxes text regions on the image, but in my tests it was not very accurate on certain road signs.
2. About PaddleOCR
This post introduces another GitHub project for OCR, PaddleOCR, a branch of the Paddle ecosystem. PaddleOCR is built on deep learning, so it needs trained weight files — which the project provides officially, so that is nothing to worry about. (This section is a brief introduction; if you only care about usage, skip ahead to section 3.)
In my tests PaddleOCR's results were excellent. To probe its limits I picked a coupon image from the web with fairly complex text — vertical and slanted lines, mixed Chinese and English, even decimal points — and essentially every piece of text in the image was recognized. Very nice.
Key points about PaddleOCR:
- First released on 2020.5.14 and continuously improved since.
- Recognition runs three tasks in sequence: detection, direction classification, and text recognition.
- The official pre-trained weights come in two classes: a lightweight set (detection + classification + recognition weights totalling only 9.4 MB, suited to mobile and server deployment) and a larger set (143.4 MB in total, for server deployment). Either way, quality is comparable to commercial systems; this tutorial uses the lightweight weights.
- Multilingual: currently 80+ languages are supported; besides Chinese, English, and digits it copes with slanted fonts, text containing decimal points, and other awkward cases.
- A rich set of OCR tooling (semi-automatic annotation, data synthesis) for building your own datasets.
- pip-installable and easy to get started with.
3. Using PaddleOCR
3.1 Environment
OS: Win10; Python: 3.7.9.
3.2 Install PaddlePaddle 2.0
PaddleOCR only runs under PaddlePaddle 2.0; make sure it is installed first:
pip3 install --upgrade pip
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
3.2 Clone the PaddleOCR repository
Use git clone (or Download) to fetch the project locally:
git clone https://github.com/PaddlePaddle/PaddleOCR
I used the git command.
3.3 Install PaddleOCR's third-party dependencies
From the command line, enter the PaddleOCR folder and install the requirements:
cd PaddleOCR
pip3 install -r requirements.txt
If this step errors out, put the project in a virtual environment and install there; remember to install the paddlepaddle package in the virtual environment as well:
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
3.4 Download the weight files
Download each of the following to your machine:
detection weights: https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
direction classification weights: https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
recognition weights: https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
Extract each archive, create an inference folder, put the three extracted folders inside it, and place inference inside PaddleOCR (a scripted version of this step is sketched right after this article).
3.5 Running PaddleOCR
With the environment set up, open a terminal in the PaddleOCR project and run one of the following, depending on your situation, to recognize text:
1. GPU, single image:
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
2. GPU, a folder of images:
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
3. CPU only, single image:
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False
A couple of parameters need adjusting for your setup: image_dir is the image (or folder of images) to recognize, and det_model_dir / rec_model_dir / cls_model_dir point at the extracted detection, recognition, and classification model folders. Recognizing a single image is fast — only a couple of seconds even on CPU alone.
4. Getting the data and source code
For convenience I have packaged the test data and project code together; after downloading, do the following two steps and use it as in section 3.5: create a virtual environment, then install the dependencies with pip:
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
# dependencies
pip3 install -r requirements.txt
5. Wrap-up
Paddle-OCR is just one application within the Paddle framework; Paddle has many other interesting models, and crucially the developers provide pre-trained weights, which lowers the barrier to entry. I plan to pick some of the fun ones later and walk through running them in future posts. That is it for PaddleOCR — if this helped, a like is appreciated. Thanks for reading, see you next time.
(Edited 2021-06-12)
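A scripted version of step 3.4, assuming the three URLs above and the inference/ layout the article uses:

import os
import tarfile
import urllib.request

# The three URLs are the ones given in the article; "inference" matches its directory layout.
urls = [
    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar',
    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar',
    'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar',
]
os.makedirs('inference', exist_ok=True)
for url in urls:
    tar_path = os.path.join('inference', os.path.basename(url))
    urllib.request.urlretrieve(url, tar_path)
    with tarfile.open(tar_path) as tf:
        tf.extractall('inference')   # yields inference/ch_ppocr_mobile_v2.0_*_infer/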
The most complete PaddleOCR installation tutorial ever — Zhihu
By 水底的土豆zhizai
1. Introduction
PaddleOCR aims to provide a rich, leading, and practical OCR toolkit that helps developers train better models and put them into production — and it is a good occasion to learn deep learning along the way.
GitHub: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README_ch.md
Official site: 飞桨PaddlePaddle — the open-source deep learning platform born from industrial practice
Mobile trial version: https://ai.baidu.com/easyedge/app/openSource?from=paddlelite
2. Features
(see the feature overview in the README)
3. Environment
Windows 11 Pro; Python (note that Python 3.11 may have problems — try it first and let me know in the comments if it works); CUDA 11.7.
4. Installation steps
Official steps: PaddleOCR/doc/doc_en/quickstart_en.md at release/2.6
4.1 Upgrade pip
python -m pip install --upgrade pip
4.2 Install paddlepaddle
Choose the matching build; since I have an NVIDIA GPU, I install the GPU version:
python -m pip install paddlepaddle-gpu==2.4.2.post117 -f https://www.paddlepaddle.org.cn/whl/windows/mkl/avx/stable.html
4.3 Install Shapely
Download the wheel from https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely and install it:
pip install Shapely-1.8.2-cp39-cp39-win_amd64.whl
4.4 Install PaddleOCR
pip install "paddleocr>=2.0.1" # Recommend to use version 2.0.1+
4.5 Install CUDA
Pick the matching CUDA version from the CUDA Toolkit Archive, install, and verify. (Reference: a Windows 11 CUDA/cuDNN installation guide on CSDN.)
4.6 Install cuDNN
Download from https://developer.nvidia.com/rdp/cudnn-download, select the matching version, copy all files into the CUDA installation directory from step 4.5, and verify the installation.
4.7 Install Zlib
Follow the NVIDIA installation guide and copy zlibwapi.dll into the bin folder of the CUDA installation directory from step 4.5.
4.8 Verify
Follow the quick-start doc (PaddleOCR/doc/doc_en/quickstart_en.md at release/2.6); a small GPU check in Python is sketched right after this article.
5. Summary
The installation is still rather involved, but I am happy it works — hopefully this is the start of a smooth deep-learning journey!
(Edited 2023-06-22; IP location: Zhejiang)
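A small Python check for step 4.8 (a sketch; it only verifies that the GPU build of PaddlePaddle sees CUDA, not PaddleOCR itself):

import paddle

print(paddle.__version__)                      # e.g. 2.4.2 for the wheel installed above
print(paddle.device.is_compiled_with_cuda())   # True only for the GPU build
print(paddle.device.get_device())              # 'gpu:0' if CUDA and cuDNN are set up correctly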
paddleocr · PyPI
paddleocr 2.7.0.3
pip install paddleocr
Latest version released: Sep 15, 2023
Awesome OCR toolkits based on PaddlePaddle (8.6M ultra-lightweight pre-trained model, support training and deployment among server, mobile, embedded and IoT devices)
Project description
Paddleocr Package
1 Get started quickly
1.1 install package
install by pypi
pip install "paddleocr>=2.0.1" # Recommend to use version 2.0.1+
build own whl package and install
python3 setup.py bdist_wheel
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
2 Use
2.1 Use by code
The paddleocr whl package will automatically download the ppocr lightweight model as the default model, which can be customized and replaced according to the section 3 Custom Model.
detection angle classification and recognition
from paddleocr import PaddleOCR,draw_ocr
# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
# to switch the language model in order.
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
res = result[idx]
for line in res:
print(line)
# draw result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
Output will be a list, each item contains bounding box, text and recognition confidence
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
Visualization of results
detection and recognition
from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR(lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=False)
for idx in range(len(result)):
res = result[idx]
for line in res:
print(line)
# draw result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
Output will be a list, each item contains bounding box, text and recognition confidence
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
Visualization of results
classification and recognition
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=True)
for idx in range(len(result)):
res = result[idx]
for line in res:
print(line)
Output will be a list, each item contains recognition text and confidence
['PAIN', 0.990372]
only detection
from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path,rec=False)
for idx in range(len(result)):
res = result[idx]
for line in res:
print(line)
# draw result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
Output will be a list, each item only contains bounding box
[[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]
[[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
[[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]
......
Visualization of results
only recognition
from paddleocr import PaddleOCR
ocr = PaddleOCR(lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=False)
for idx in range(len(result)):
res = result[idx]
for line in res:
print(line)
Output will be a list, each item contains recognition text and confidence
['PAIN', 0.990372]
only classification
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, rec=False, cls=True)
for idx in range(len(result)):
res = result[idx]
for line in res:
print(line)
Output will be a list, each item contains classification result and confidence
['0', 0.99999964]
2.2 Use by command line
show help information
paddleocr -h
detection classification and recognition
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --lang en
Output will be a list, each item contains bounding box, text and recognition confidence
[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
......
PDF files are also supported. You can restrict inference to the first few pages with the page_num parameter; the default is 0, which means all pages are processed.
paddleocr --image_dir ./xxx.pdf --use_angle_cls true --use_gpu false --page_num 2
detection and recognition
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --lang en
The output is a list; each item contains the bounding box, the text, and the recognition confidence.
[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
......
classification and recognition
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --lang en
The output is a list; each item contains the recognized text and its confidence.
['PAIN', 0.9934559464454651]
only detection
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --rec false
The output is a list; each item contains only the bounding box.
[[397.0, 802.0], [1092.0, 802.0], [1092.0, 841.0], [397.0, 841.0]]
[[397.0, 750.0], [1211.0, 750.0], [1211.0, 789.0], [397.0, 789.0]]
[[397.0, 702.0], [1209.0, 698.0], [1209.0, 734.0], [397.0, 738.0]]
......
only recognition
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --det false --lang en
The output is a list; each item contains the recognized text and its confidence.
['PAIN', 0.9934559464454651]
only classification
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --rec false
The output is a list; each item contains the classification result and its confidence.
['0', 0.99999964]
3 Use custom model
When the built-in models do not meet your needs, you can use your own trained models.
First, refer to the export doc to convert your detection and recognition models into inference models (a typical export invocation is sketched below), and then use them as follows.
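As a rough sketch of the conversion step (the config file and model paths below are placeholders; check the export doc for the exact options that apply to your model):
# export a trained detection model to an inference model (paths and config are illustrative)
python3 tools/export_model.py -c configs/det/det_mv3_db.yml -o Global.pretrained_model=./output/det_db/best_accuracy Global.save_inference_dir=./inference/det_db/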
3.1 Use by code
from paddleocr import PaddleOCR, draw_ocr
# The detection, recognition and classification model directories must contain the model and params files
ocr = PaddleOCR(det_model_dir='{your_det_model_dir}', rec_model_dir='{your_rec_model_dir}', rec_char_dict_path='{your_rec_char_dict_path}', cls_model_dir='{your_cls_model_dir}', use_angle_cls=True)
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)
# draw result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
3.2 Use by command line
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} --rec_char_dict_path {your_rec_char_dict_path} --cls_model_dir {your_cls_model_dir} --use_angle_cls true
4 Use web images or numpy array as input
4.1 Web image
Use by code
from paddleocr import PaddleOCR, draw_ocr, download_with_progressbar
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = 'http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)
# show result
from PIL import Image
result = result[0]
# download the web image locally so that PIL can open it for visualization
download_with_progressbar(img_path, 'tmp.jpg')
image = Image.open('tmp.jpg').convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
Use by command line
paddleocr --image_dir http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg --use_angle_cls=true
4.2 Numpy array
Numpy arrays are supported as input only when using the Python API.
import cv2
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
img = cv2.imread(img_path)
# img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # uncomment this line if your own trained model supports grayscale images
result = ocr.ocr(img, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)
# show result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
5 PDF file
Use by command line
You can restrict inference to the first few pages with the page_num parameter; the default is 0, which means all pages are processed.
paddleocr --image_dir ./xxx.pdf --use_angle_cls true --use_gpu false --page_num 2
Use by code
from paddleocr import PaddleOCR, draw_ocr
# PaddleOCR supports Chinese, English, French, German, Korean and Japanese.
# Set the `lang` parameter to `ch`, `en`, `fr`, `german`, `korean` or `japan`
# to switch the language model accordingly.
ocr = PaddleOCR(use_angle_cls=True, lang="ch", page_num=2) # need to run only once to download and load model into memory
img_path = './xxx.pdf'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)
# draw result
import fitz
from PIL import Image
import cv2
import numpy as np
imgs = []
with fitz.open(img_path) as pdf:
    for pg in range(0, pdf.pageCount):
        page = pdf[pg]
        mat = fitz.Matrix(2, 2)
        pm = page.getPixmap(matrix=mat, alpha=False)
        # if width or height > 2000 pixels, don't enlarge the image
        if pm.width > 2000 or pm.height > 2000:
            pm = page.getPixmap(matrix=fitz.Matrix(1, 1), alpha=False)
        img = Image.frombytes("RGB", [pm.width, pm.height], pm.samples)
        img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
        imgs.append(img)
for idx in range(len(result)):
    res = result[idx]
    image = imgs[idx]
    boxes = [line[0] for line in res]
    txts = [line[1][0] for line in res]
    scores = [line[1][1] for line in res]
    im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result_page_{}.jpg'.format(idx))
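Note: recent PyMuPDF releases rename pageCount to page_count and getPixmap to get_pixmap; if the snippet above raises an AttributeError, switch to the newer names.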
6 Parameter Description
Each parameter is listed below as "name: description. Default: value".
use_gpu: whether to use the GPU. Default: TRUE
gpu_mem: GPU memory size used for initialization. Default: 8000M
image_dir: the image path or folder path to predict when used from the command line.
page_num: valid only when the input is a PDF file; predict only the first page_num pages (all pages are predicted by default). Default: 0
det_algorithm: type of detection algorithm. Default: DB
det_model_dir: the text detection inference model folder. There are two ways to pass it: 1. None: automatically download the built-in model to ~/.paddleocr/det; 2. the path of an inference model you converted yourself, which must contain the model and params files. Default: None
det_max_side_len: the maximum length of the long side of the image. When the long side exceeds this value, it is resized to this size and the short side is scaled proportionally. Default: 960
det_db_thresh: binarization threshold of the DB output map. Default: 0.3
det_db_box_thresh: threshold of the DB output boxes; boxes scoring lower than this value are discarded. Default: 0.5
det_db_unclip_ratio: expansion ratio of the DB output boxes. Default: 2
det_db_score_mode: controls how the score of a detection box is computed; the options are 'fast' and 'slow'. If the text to be detected is curved, 'slow' is recommended. Default: 'fast'
det_east_score_thresh: binarization threshold of the EAST output map. Default: 0.8
det_east_cover_thresh: threshold of the EAST output boxes; boxes scoring lower than this value are discarded. Default: 0.1
det_east_nms_thresh: NMS threshold of the EAST output boxes. Default: 0.2
rec_algorithm: type of recognition algorithm. Default: CRNN
rec_model_dir: the text recognition inference model folder. There are two ways to pass it: 1. None: automatically download the built-in model to ~/.paddleocr/rec; 2. the path of an inference model you converted yourself, which must contain the model and params files. Default: None
rec_image_shape: image shape of the recognition algorithm. Default: "3,32,320"
rec_batch_num: batch size of forward images during recognition. Default: 30
max_text_length: the maximum text length the recognition algorithm can recognize. Default: 25
rec_char_dict_path: the dictionary (alphabet) path, which must be set to your own path when rec_model_dir uses mode 2. Default: ./ppocr/utils/ppocr_keys_v1.txt
use_space_char: whether to recognize spaces. Default: TRUE
drop_score: filter the output by the recognition score; results below this score are not returned. Default: 0.5
use_angle_cls: whether to load the classification model. Default: FALSE
cls_model_dir: the classification inference model folder. There are two ways to pass it: 1. None: automatically download the built-in model to ~/.paddleocr/cls; 2. the path of an inference model you converted yourself, which must contain the model and params files. Default: None
cls_image_shape: image shape of the classification algorithm. Default: "3,48,192"
label_list: label list of the classification algorithm. Default: ['0','180']
cls_batch_num: batch size of forward images during classification. Default: 30
enable_mkldnn: whether to enable MKL-DNN. Default: FALSE
use_zero_copy_run: whether to run the forward pass via zero_copy_run. Default: FALSE
lang: the language to use; currently only Chinese (ch), English (en), French (french), German (german), Korean (korean) and Japanese (japan) are supported. Default: ch
det: enable detection when ppocr.ocr is executed. Default: TRUE
rec: enable recognition when ppocr.ocr is executed. Default: TRUE
cls: enable classification when ppocr.ocr is executed (in command-line mode, use use_angle_cls to control whether classification runs in the forward pass). Default: FALSE
show_log: whether to print logs. Default: FALSE
type: perform OCR or table structuring; the value must be one of ['ocr','structure']. Default: ocr
ocr_version: OCR model version. The currently supported versions are: PP-OCRv3 (Chinese and English detection, recognition, multilingual recognition and direction classifier models), PP-OCRv2 (Chinese detection and recognition models), and PP-OCR (Chinese detection, recognition and direction classifier, plus multilingual recognition models). Default: PP-OCRv3
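As an illustration of how these parameters are passed (the override values below are arbitrary examples, not recommendations), both the Python API and the command line accept them by name:
from paddleocr import PaddleOCR
# illustrative overrides of a few defaults from the table above
ocr = PaddleOCR(use_angle_cls=True,      # load the direction classifier
                lang='en',               # switch to the English model
                det_db_box_thresh=0.6,   # drop low-scoring detection boxes
                drop_score=0.6)          # drop low-confidence recognition results
result = ocr.ocr('PaddleOCR/doc/imgs_en/img_12.jpg', cls=True)
The equivalent command line call would be:
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --lang en --det_db_box_thresh 0.6 --drop_score 0.6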
PaddleOCR Quick Start

Note: This document mainly introduces quick usage of the PP-OCR series models through the PaddleOCR wheel package. To try the document analysis features, please refer to the PP-Structure quick start tutorial.

- 1. Installation
  - 1.1 Install PaddlePaddle
  - 1.2 Install the PaddleOCR whl package
- 2. Quick Use
  - 2.1 Command-line usage
    - 2.1.1 Chinese and English models
    - 2.1.2 Multilingual models
  - 2.2 Use as a Python script
    - 2.2.1 Chinese/English and multilingual usage
- 3. Summary
1. Installation

1.1 Install PaddlePaddle

If you do not yet have a basic Python environment, please refer to the environment preparation guide.

If your machine has CUDA 9 or CUDA 10 installed, run the following command to install the GPU build:

```bash
python3 -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
```

If your machine is CPU-only, run the following command instead:

```bash
python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
```

For other version requirements, please follow the instructions in the installation documentation on the official PaddlePaddle website.
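Before moving on, you can optionally verify the PaddlePaddle installation. The following is a small sanity check that is not part of the original quick start; it only uses PaddlePaddle's built-in `paddle.utils.run_check()`:

```python
# Optional sanity check: confirm that PaddlePaddle is installed and usable.
import paddle

print(paddle.__version__)  # the installed PaddlePaddle version
paddle.utils.run_check()   # reports whether PaddlePaddle works and whether a GPU is visible
```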
1.2 Install the PaddleOCR whl package

```bash
pip install "paddleocr>=2.0.1"  # version 2.0.1 or later is recommended
```

For Windows users: the shapely library installed directly through pip may raise a "[WinError 126] The specified module could not be found" error. In that case, it is recommended to download a shapely installer package from here and install it manually.
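A quick way to confirm that the wheel installed correctly, and to surface the shapely issue mentioned above early on Windows, is to import the package. This check is not part of the original document, just a minimal sketch:

```python
# Optional: importing the package would typically fail here if shapely (or another
# dependency) was not installed correctly, e.g. the WinError 126 case described above.
from paddleocr import PaddleOCR, draw_ocr

print("PaddleOCR wheel imported successfully")
```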
2. Quick Use

2.1 Command-line usage

PaddleOCR provides a set of test images; click here to download and unzip them, then switch to the corresponding directory in a terminal:

```bash
cd /path/to/ppocr_img
```

If you do not use the provided test images, replace the `--image_dir` argument below with the path to your own test image.
2.1.1 Chinese and English models

Full pipeline of detection + direction classification + recognition: `--use_angle_cls true` enables the direction classifier so that text rotated by 180 degrees can be recognized, and `--use_gpu false` disables the GPU.

```bash
paddleocr --image_dir ./imgs/11.jpg --use_angle_cls true --use_gpu false
```

The result is a list; each item contains the text box, the recognized text, and the recognition confidence:

```
[[[28.0, 37.0], [302.0, 39.0], [302.0, 72.0], [27.0, 70.0]], ('纯臻营养护发素', 0.9658738374710083)]
......
```

In addition, paddleocr also accepts PDF files as input. The `page_num` parameter controls how many leading pages are processed; the default is 0, which means all pages are processed.

```bash
paddleocr --image_dir ./xxx.pdf --use_angle_cls true --use_gpu false --page_num 2
```
Detection only: set `--rec` to false.

```bash
paddleocr --image_dir ./imgs/11.jpg --rec false
```

The result is a list; each item contains only a text box:

```
[[27.0, 459.0], [136.0, 459.0], [136.0, 479.0], [27.0, 479.0]]
[[28.0, 429.0], [372.0, 429.0], [372.0, 445.0], [28.0, 445.0]]
......
```

Recognition only: set `--det` to false.

```bash
paddleocr --image_dir ./imgs_words/ch/word_1.jpg --det false
```

The result is a list; each item contains only the recognized text and its confidence:

```
['韩国小馆', 0.994467]
```
Version notes

By default, paddleocr uses the PP-OCRv3 model (`--ocr_version PP-OCRv3`). To use another version, set the `--ocr_version` parameter; the available versions are listed below, and an example command follows the table.

| Version name | Description |
| --- | --- |
| PP-OCRv3 | Supports Chinese and English detection and recognition, a direction classifier, and multilingual recognition |
| PP-OCRv2 | Supports Chinese and English detection and recognition and a direction classifier; the multilingual models have not yet been updated |
| PP-OCR | Supports Chinese and English detection and recognition, a direction classifier, and multilingual recognition |
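For example, the following sketch (reusing the test image from section 2.1.1) switches the command-line tool back to the previous generation of models by setting `--ocr_version` explicitly:

```bash
paddleocr --image_dir ./imgs/11.jpg --ocr_version PP-OCRv2 --use_angle_cls true --use_gpu false
```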
To add a model you trained yourself, add the model link and the corresponding fields in paddleocr and rebuild the package.

For more usage of the whl package, refer to the whl package documentation.
2.1.2 Multilingual models

PaddleOCR currently supports 80 languages, which can be switched with the `--lang` parameter. For the English model, set `--lang=en`.

```bash
paddleocr --image_dir ./imgs_en/254.jpg --lang=en
```

The result is a list; each item contains the text box, the recognized text, and the recognition confidence:

```
[[[67.0, 51.0], [327.0, 46.0], [327.0, 74.0], [68.0, 80.0]], ('PHOCAPITAL', 0.9944712519645691)]
[[[72.0, 92.0], [453.0, 84.0], [454.0, 114.0], [73.0, 122.0]], ('107 State Street', 0.9744491577148438)]
[[[69.0, 135.0], [501.0, 125.0], [501.0, 156.0], [70.0, 165.0]], ('Montpelier Vermont', 0.9357033967971802)]
......
```
Commonly used language abbreviations include:

| Language | Abbreviation | Language | Abbreviation | Language | Abbreviation |
| --- | --- | --- | --- | --- | --- |
| Chinese | ch | French | fr | Japanese | japan |
| English | en | German | german | Korean | korean |
| Traditional Chinese | chinese_cht | Italian | it | Russian | ru |

The full list of supported languages and their abbreviations can be found in the multilingual model tutorial.
2.2 Use as a Python script

2.2.1 Chinese/English and multilingual usage

When the PaddleOCR whl package is used from a Python script, it automatically downloads the lightweight ppocr models as the default models.

Full pipeline of detection + direction classification + recognition:

```python
from paddleocr import PaddleOCR, draw_ocr

# The languages currently supported by PaddleOCR can be switched via the `lang` parameter,
# e.g. `ch`, `en`, `fr`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load the model into memory
img_path = './imgs/11.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# Visualize the results
# If simfang.ttf is not available locally, it can be downloaded from the doc/fonts directory
from PIL import Image

result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
The result is a list; each item contains the text box, the recognized text, and the recognition confidence:

```
[[[28.0, 37.0], [302.0, 39.0], [302.0, 72.0], [27.0, 70.0]], ('纯臻营养护发素', 0.9658738374710083)]
......
```
Result visualization

If the input is a PDF file, the following code can be used as a reference for visualization:
```python
from paddleocr import PaddleOCR, draw_ocr

# The languages currently supported by PaddleOCR can be switched via the `lang` parameter,
# e.g. `ch`, `en`, `fr`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, lang="ch", page_num=2)  # need to run only once to download and load the model into memory
img_path = './xxx.pdf'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# Visualize the results
import fitz  # PyMuPDF, used to render the PDF pages as images
from PIL import Image
import cv2
import numpy as np

imgs = []
with fitz.open(img_path) as pdf:
    for pg in range(0, pdf.pageCount):
        page = pdf[pg]
        mat = fitz.Matrix(2, 2)
        pm = page.getPixmap(matrix=mat, alpha=False)
        # if width or height > 2000 pixels, don't enlarge the image
        if pm.width > 2000 or pm.height > 2000:
            pm = page.getPixmap(matrix=fitz.Matrix(1, 1), alpha=False)
        img = Image.frombytes("RGB", [pm.width, pm.height], pm.samples)
        img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
        imgs.append(img)

for idx in range(len(result)):
    res = result[idx]
    image = imgs[idx]
    boxes = [line[0] for line in res]
    txts = [line[1][0] for line in res]
    scores = [line[1][1] for line in res]
    im_show = draw_ocr(image, boxes, txts, scores, font_path='doc/fonts/simfang.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result_page_{}.jpg'.format(idx))
```
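The command-line section above also demonstrated detection-only and recognition-only modes. The same switches exist in the Python API; the sketch below is not from the original document and assumes that the `det`, `rec`, and `cls` keyword arguments of `ocr.ocr()` behave like the `--det`, `--rec`, and `--use_angle_cls` flags:

```python
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # downloads and loads the models once

# Detection only: each item of the result is a text box
# (assumption: rec=False skips the recognition stage, like `--rec false`)
det_result = ocr.ocr('./imgs/11.jpg', rec=False)
for box in det_result[0]:
    print(box)

# Recognition only: each item is a (text, confidence) pair
# (assumption: det=False skips the detection stage, like `--det false`)
rec_result = ocr.ocr('./imgs_words/ch/word_1.jpg', det=False, cls=True)
for text, score in rec_result[0]:
    print(text, score)
```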
3. Summary

After working through this section, you should be comfortable using the PaddleOCR whl package and have obtained your first results.

PaddleOCR is a rich, leading, and practical OCR tool library that covers the full pipeline of data production, model training, compression, and inference deployment. You can refer to the documentation and tutorials to formally start your journey with PaddleOCR.