Commit 0b91f4d

jzhang533 and ToddBear authored
This is a combination of 59 commits from v2.7.0 to v2.7.1 in release/2.7.1 branch (#11831)

Co-authored-by: ToddBear <43341135+ToddBear@users.noreply.github.com>
1 parent ddaa85d commit 0b91f4d

77 files changed, +4119 -583 lines changed

.github/ISSUE_TEMPLATE/newfeature.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+---
+name: New Feature Issue template
+about: Issue template for new features.
+title: ''
+labels: 'Code PR is needed'
+assignees: 'shiyutang'
+
+---
+
+## Background
+
+Following the requirements collection (https://github.com/PaddlePaddle/PaddleOCR/issues/10334) and the weekly technical seminar discussion (https://github.com/PaddlePaddle/PaddleOCR/issues/10223), we have settled on the XXXX task.
+
+## Steps to resolve
+1. Port the network structure and evaluation metrics from the open-source code. Code link: XXXX
+2. Following the [paper reproduction guide](https://github.com/PaddlePaddle/models/blob/release%2F2.2/tutorials/article-implementation/ArticleReproduction_CV.md), perform forward and backward alignment and related steps to reach the metrics in Table 1 of the paper.
+3. Following the [PR submission guidelines](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/code_and_doc.md), submit a code PR to ppocr.

.pre-commit-config.yaml

Lines changed: 0 additions & 1 deletion
@@ -40,4 +40,3 @@ repos:
     hooks:
       - id: ruff
         args: [--fix, --exit-non-zero-on-fix, --no-cache]
-

PPOCRLabel/gen_ocr_train_val_test.py

Lines changed: 38 additions & 43 deletions
@@ -17,48 +17,43 @@ def isCreateOrDeleteFolder(path, flag):
     return flagAbsPath
 
 
-def splitTrainVal(root, absTrainRootPath, absValRootPath, absTestRootPath, trainTxt, valTxt, testTxt, flag):
-    # Split the data into training, validation, and test sets by the specified ratio
-    dataAbsPath = os.path.abspath(root)
-
-    if flag == "det":
-        labelFilePath = os.path.join(dataAbsPath, args.detLabelFileName)
-    elif flag == "rec":
-        labelFilePath = os.path.join(dataAbsPath, args.recLabelFileName)
-
-    labelFileRead = open(labelFilePath, "r", encoding="UTF-8")
-    labelFileContent = labelFileRead.readlines()
-    random.shuffle(labelFileContent)
-    labelRecordLen = len(labelFileContent)
-
-    for index, labelRecordInfo in enumerate(labelFileContent):
-        imageRelativePath = labelRecordInfo.split('\t')[0]
-        imageLabel = labelRecordInfo.split('\t')[1]
-        imageName = os.path.basename(imageRelativePath)
-
-        if flag == "det":
-            imagePath = os.path.join(dataAbsPath, imageName)
-        elif flag == "rec":
-            imagePath = os.path.join(dataAbsPath, "{}\\{}".format(args.recImageDirName, imageName))
-
-        # Assign to the training, validation, or test set by the preset ratio
-        trainValTestRatio = args.trainValTestRatio.split(":")
-        trainRatio = eval(trainValTestRatio[0]) / 10
-        valRatio = trainRatio + eval(trainValTestRatio[1]) / 10
-        curRatio = index / labelRecordLen
-
-        if curRatio < trainRatio:
-            imageCopyPath = os.path.join(absTrainRootPath, imageName)
-            shutil.copy(imagePath, imageCopyPath)
-            trainTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
-        elif curRatio >= trainRatio and curRatio < valRatio:
-            imageCopyPath = os.path.join(absValRootPath, imageName)
-            shutil.copy(imagePath, imageCopyPath)
-            valTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
-        else:
-            imageCopyPath = os.path.join(absTestRootPath, imageName)
-            shutil.copy(imagePath, imageCopyPath)
-            testTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
+def splitTrainVal(root, abs_train_root_path, abs_val_root_path, abs_test_root_path, train_txt, val_txt, test_txt, flag):
+
+    data_abs_path = os.path.abspath(root)
+    label_file_name = args.detLabelFileName if flag == "det" else args.recLabelFileName
+    label_file_path = os.path.join(data_abs_path, label_file_name)
+
+    with open(label_file_path, "r", encoding="UTF-8") as label_file:
+        label_file_content = label_file.readlines()
+        random.shuffle(label_file_content)
+        label_record_len = len(label_file_content)
+
+        for index, label_record_info in enumerate(label_file_content):
+            image_relative_path, image_label = label_record_info.split('\t')
+            image_name = os.path.basename(image_relative_path)
+
+            if flag == "det":
+                image_path = os.path.join(data_abs_path, image_name)
+            elif flag == "rec":
+                image_path = os.path.join(data_abs_path, args.recImageDirName, image_name)
+
+            train_val_test_ratio = args.trainValTestRatio.split(":")
+            train_ratio = eval(train_val_test_ratio[0]) / 10
+            val_ratio = train_ratio + eval(train_val_test_ratio[1]) / 10
+            cur_ratio = index / label_record_len
+
+            if cur_ratio < train_ratio:
+                image_copy_path = os.path.join(abs_train_root_path, image_name)
+                shutil.copy(image_path, image_copy_path)
+                train_txt.write("{}\t{}\n".format(image_copy_path, image_label))
+            elif cur_ratio >= train_ratio and cur_ratio < val_ratio:
+                image_copy_path = os.path.join(abs_val_root_path, image_name)
+                shutil.copy(image_path, image_copy_path)
+                val_txt.write("{}\t{}\n".format(image_copy_path, image_label))
+            else:
+                image_copy_path = os.path.join(abs_test_root_path, image_name)
+                shutil.copy(image_path, image_copy_path)
+                test_txt.write("{}\t{}\n".format(image_copy_path, image_label))
 
 
 # Delete the file if it exists
@@ -148,4 +143,4 @@ def genDetRecTrainVal(args):
         help="the name of the folder where the cropped recognition dataset is located"
     )
     args = parser.parse_args()
-    genDetRecTrainVal(args)
+    genDetRecTrainVal(args)
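
Reviewer-style note on the `splitTrainVal` change: the new version still parses the ratio with `eval`, recomputes it on every loop iteration, and appears to keep the trailing newline from `readlines()` inside `image_label` before appending another `\n`. The following is a minimal sketch of the same ratio-based partition, not the code in this commit. It assumes the ratio string has the `6:2:2` form, normalises by the sum of the parts instead of the fixed `/ 10`, and leaves out file copying. The helper names `parse_ratio` and `split_records` and the `seed` parameter are hypothetical and are not part of PaddleOCR.

```python
import random


def parse_ratio(ratio_text):
    """Parse a 'train:val:test' string such as '6:2:2' (hypothetical helper).

    Returns (train_cut, val_cut, total) as cumulative integer parts.
    """
    parts = [int(p) for p in ratio_text.split(":")]  # int() instead of eval()
    if len(parts) != 3 or any(p < 0 for p in parts) or sum(parts) == 0:
        raise ValueError("expected three non-negative integers, e.g. '6:2:2'")
    return parts[0], parts[0] + parts[1], sum(parts)


def split_records(lines, ratio_text, seed=None):
    """Shuffle label lines and partition them into train/val/test lists."""
    records = []
    for line in lines:
        line = line.rstrip("\n")  # drop the newline that readlines() keeps
        if not line:
            continue
        # Split on the first tab only, so tabs inside the label survive.
        image_path, label = line.split("\t", 1)
        records.append((image_path, label))

    random.Random(seed).shuffle(records)
    train_cut, val_cut, total = parse_ratio(ratio_text)  # parsed once, not per record
    count = len(records)

    train, val, test = [], [], []
    for index, record in enumerate(records):
        # Same test as index / count < cut / total, kept in integers
        # so float rounding cannot move a record across a boundary.
        if index * total < train_cut * count:
            train.append(record)
        elif index * total < val_cut * count:
            val.append(record)
        else:
            test.append(record)
    return train, val, test


if __name__ == "__main__":
    demo = ["img_{}.jpg\tlabel {}\n".format(i, i) for i in range(10)]
    train, val, test = split_records(demo, "6:2:2", seed=0)
    print(len(train), len(val), len(test))  # 6 2 2
```

Each returned record is an `(image_path, label)` tuple with no trailing newline, so a caller can write it back with `"{}\t{}\n".format(...)` without producing blank lines.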

README.md

Lines changed: 4 additions & 6 deletions
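@@ -68,12 +68,10 @@ PaddleOCR aims to build a rich, leading, and practical OCR tool library, helping
 
 <a name="技术交流合作"></a>
 ## 📖 Technical exchange and cooperation
-
-- PaddlePaddle low-code development tool (PaddleX): a one-stop development tool for curated PaddlePaddle models, targeting mainstream AI hardware in China and abroad. Its core advantages are:
-  - [Industrial high-accuracy model library]: covers 10 mainstream AI tasks with 40+ curated models, rich and comprehensive.
-  - [Featured model pipelines]: provides featured model pipelines that combine large and small models, for higher accuracy and better results.
-  - [Low-code development mode]: a graphical interface supports a unified development paradigm, convenient and efficient.
-  - [Private deployment with multi-hardware support]: adapted to mainstream AI hardware in China and abroad; supports fully offline local use to meet enterprise security and confidentiality needs.
+- The PaddlePaddle AI suite ([PaddleX](http://10.136.157.23:8080/paddle/paddleX)) provides a one-stop, full-workflow, high-efficiency development platform for training, compressing, and running inference with PaddlePaddle models. Its mission is to help AI technology reach production quickly, and its vision is to make everyone an AI Developer!
+- PaddleX currently covers image classification, object detection, image segmentation, 3D, OCR, and time-series forecasting. It has 36 basic single models built in, such as RT-DETR, PP-YOLOE, PP-HGNet, PP-LCNet, and PP-LiteSeg, and integrates 12 practical industrial solutions, such as PP-OCRv4, PP-ChatOCR, PP-ShiTu, PP-TS, vehicle-mounted road litter detection, and identification of prohibited wildlife products.
+- PaddleX provides two AI development modes, "Toolbox" and "Developer". Toolbox mode tunes key hyperparameters with no code; Developer mode supports low-code single-model training, compression, and inference as well as chained multi-model inference. Both cloud and local use are supported.
+- PaddleX also supports joint development with profit sharing! PaddleX is iterating rapidly, and individual and enterprise developers are welcome to take part and build a thriving AI technology ecosystem together!
 
 - PaddleX official website: https://aistudio.baidu.com/intro/paddlex
 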
