Training PaddleOCR on Your Own Private Dataset (Annotation, Dataset Preparation, Training, and Deployment)
Table of Contents
I. Preparing the Dataset
1. Go to the PaddleOCR-release-2.7 directory
2. Launch PPOCRLabel: activate the environment in a terminal
3. Click Auto Annotate in the lower left
4. After confirming, use the File menu in the upper left
5. Create gen_ocr_train_val_test.py
II. Training the Text Detection Model
1. Download the pretrained model
2. Configure the PP-OCR detection model file
3. Start training the text detection model
4. Test the trained model
III. Training the Text Recognition Model
1. Modify the recognition model config file
2. Train the model
3. Test the model
IV. Exporting an Inference Model
1. Export the trained models in the Anaconda terminal
2. Verify with predict_system.py
V. Deploying the Inference Model on the RK3588
This post documents my own workflow.
Prerequisites:
The machine already has the basic setup in place, including:
CUDA, cuDNN, PyTorch, conda, PPOCRLabel, and so on.
If these are not installed yet, see "Installing PPOCRLabel on Ubuntu 22.04" (CSDN blog)
and my earlier article "PaddleOCR Environment Setup, Model Training, Inference, and Deployment End to End (Ubuntu), Notes 1" (CSDN blog).
This post mainly improves and streamlines that earlier write-up.
I. Preparing the Dataset
1. Go to the PaddleOCR-release-2.7 directory
Create a data folder, and inside it a folder named images that holds the pictures to be annotated.
Also create data/test_images under data to hold the test pictures. The folder layout can be set up as in the sketch below.
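A minimal sketch of creating that layout from Python (equivalent to making the folders by hand; the paths are the ones used throughout this guide):

# Create the folder layout used in this guide: data/images for annotation, data/test_images for testing.
import os

for d in ("data/images", "data/test_images"):
    os.makedirs(d, exist_ok=True)
    print("ready:", os.path.abspath(d))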
2. Launch PPOCRLabel: activate the environment in a terminal
conda activate label4
PPOCRLabel --lang ch --kie True
Then open the File menu in the upper left and enable all three of the following options:
automatically export the label results, automatically re-recognize, and automatically save unsubmitted changes.
3. Click Auto Annotate in the lower left
After auto annotation finishes, go through each image, check the detected content, and correct it; when an image is done, click Confirm in the lower right.
4. After confirming, use the File menu in the upper left
and click Export Label Results and Export Recognition Results.
When that is done, several new files appear under /home/sxj/ppocr-1/data/images; the sketch below is a quick way to check them.
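A minimal sketch for checking those outputs, assuming the default PPOCRLabel output names that the split script in the next step also expects (Label.txt, rec_gt.txt, crop_img) plus its fileState.txt status file:

# Peek at the files PPOCRLabel exports into data/images.
# Each line of Label.txt is "<image path>\t<JSON list of boxes and transcriptions>";
# each line of rec_gt.txt is "<cropped image path>\t<text label>".
import os

root = "data/images"
for name in ("Label.txt", "rec_gt.txt", "fileState.txt", "crop_img"):
    print(name, "->", "found" if os.path.exists(os.path.join(root, name)) else "missing")

with open(os.path.join(root, "Label.txt"), encoding="utf-8") as f:
    print(f.readline().rstrip())  # show one sample annotation line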
5. Create gen_ocr_train_val_test.py
Purpose: split the detection and recognition data into training, validation, and test sets.
# coding:utf8
import os
import shutil
import random
import argparse


# Delete any previously split train/val/test folder and recreate it empty
def isCreateOrDeleteFolder(path, flag):
    flagPath = os.path.join(path, flag)
    if os.path.exists(flagPath):
        shutil.rmtree(flagPath)
    os.makedirs(flagPath)
    flagAbsPath = os.path.abspath(flagPath)
    return flagAbsPath


def splitTrainVal(root, absTrainRootPath, absValRootPath, absTestRootPath, trainTxt, valTxt, testTxt, flag):
    # Split into training, validation, and test sets according to the given ratio
    dataAbsPath = os.path.abspath(root)
    if flag == "det":
        labelFilePath = os.path.join(dataAbsPath, args.detLabelFileName)
    elif flag == "rec":
        labelFilePath = os.path.join(dataAbsPath, args.recLabelFileName)
    labelFileRead = open(labelFilePath, "r", encoding="UTF-8")
    labelFileContent = labelFileRead.readlines()
    random.shuffle(labelFileContent)
    labelRecordLen = len(labelFileContent)

    for index, labelRecordInfo in enumerate(labelFileContent):
        imageRelativePath = labelRecordInfo.split('\t')[0]
        imageLabel = labelRecordInfo.split('\t')[1]
        imageName = os.path.basename(imageRelativePath)

        if flag == "det":
            imagePath = os.path.join(dataAbsPath, imageName)
        elif flag == "rec":
            imagePath = os.path.join(dataAbsPath, "{}/{}".format(args.recImageDirName, imageName))

        # Copy each image into train/val/test according to the preset ratio
        trainValTestRatio = args.trainValTestRatio.split(":")
        trainRatio = eval(trainValTestRatio[0]) / 10
        valRatio = trainRatio + eval(trainValTestRatio[1]) / 10
        curRatio = index / labelRecordLen

        if curRatio < trainRatio:
            imageCopyPath = os.path.join(absTrainRootPath, imageName)
            shutil.copy(imagePath, imageCopyPath)
            trainTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
        elif curRatio >= trainRatio and curRatio < valRatio:
            imageCopyPath = os.path.join(absValRootPath, imageName)
            shutil.copy(imagePath, imageCopyPath)
            valTxt.write("{}\t{}".format(imageCopyPath, imageLabel))
        else:
            imageCopyPath = os.path.join(absTestRootPath, imageName)
            shutil.copy(imagePath, imageCopyPath)
            testTxt.write("{}\t{}".format(imageCopyPath, imageLabel))


# Remove the file if it already exists
def removeFile(path):
    if os.path.exists(path):
        os.remove(path)


def genDetRecTrainVal(args):
    detAbsTrainRootPath = isCreateOrDeleteFolder(args.detRootPath, "train")
    detAbsValRootPath = isCreateOrDeleteFolder(args.detRootPath, "val")
    detAbsTestRootPath = isCreateOrDeleteFolder(args.detRootPath, "test")
    recAbsTrainRootPath = isCreateOrDeleteFolder(args.recRootPath, "train")
    recAbsValRootPath = isCreateOrDeleteFolder(args.recRootPath, "val")
    recAbsTestRootPath = isCreateOrDeleteFolder(args.recRootPath, "test")

    removeFile(os.path.join(args.detRootPath, "train.txt"))
    removeFile(os.path.join(args.detRootPath, "val.txt"))
    removeFile(os.path.join(args.detRootPath, "test.txt"))
    removeFile(os.path.join(args.recRootPath, "train.txt"))
    removeFile(os.path.join(args.recRootPath, "val.txt"))
    removeFile(os.path.join(args.recRootPath, "test.txt"))

    detTrainTxt = open(os.path.join(args.detRootPath, "train.txt"), "a", encoding="UTF-8")
    detValTxt = open(os.path.join(args.detRootPath, "val.txt"), "a", encoding="UTF-8")
    detTestTxt = open(os.path.join(args.detRootPath, "test.txt"), "a", encoding="UTF-8")
    recTrainTxt = open(os.path.join(args.recRootPath, "train.txt"), "a", encoding="UTF-8")
    recValTxt = open(os.path.join(args.recRootPath, "val.txt"), "a", encoding="UTF-8")
    recTestTxt = open(os.path.join(args.recRootPath, "test.txt"), "a", encoding="UTF-8")

    splitTrainVal(args.datasetRootPath, detAbsTrainRootPath, detAbsValRootPath, detAbsTestRootPath,
                  detTrainTxt, detValTxt, detTestTxt, "det")

    for root, dirs, files in os.walk(args.datasetRootPath):
        for dir in dirs:
            if dir == 'crop_img':
                splitTrainVal(root, recAbsTrainRootPath, recAbsValRootPath, recAbsTestRootPath,
                              recTrainTxt, recValTxt, recTestTxt, "rec")
            else:
                continue
        break


if __name__ == "__main__":
    # Purpose: split the detection and recognition data into training, validation, and test sets.
    # Note: adjust the arguments to your own paths and needs. Image data is often annotated in batches
    # by several people, with each batch placed in its own folder and labelled with PPOCRLabel, so
    # there is a need to merge several annotated folders and split them into train/val/test sets.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--trainValTestRatio",
        type=str,
        default="6:2:2",
        help="ratio of trainset:valset:testset")
    parser.add_argument(
        "--datasetRootPath",
        type=str,
        default="./data/",
        help="path to the dataset marked by ppocrlabel, E.g, dataset folder named 1,2,3...")
    parser.add_argument(
        "--detRootPath",
        type=str,
        default="./data/det",
        help="the path where the divided detection dataset is placed")
    parser.add_argument(
        "--recRootPath",
        type=str,
        default="./data/rec",
        help="the path where the divided recognition dataset is placed")
    parser.add_argument(
        "--detLabelFileName",
        type=str,
        default="Label.txt",
        help="the name of the detection annotation file")
    parser.add_argument(
        "--recLabelFileName",
        type=str,
        default="rec_gt.txt",
        help="the name of the recognition annotation file")
    parser.add_argument(
        "--recImageDirName",
        type=str,
        default="crop_img",
        help="the name of the folder where the cropped recognition dataset is located")
    args = parser.parse_args()
    genDetRecTrainVal(args)
Run it:
python gen_ocr_train_val_test.py --trainValTestRatio 6:2:2 --datasetRootPath ./data/images
When it finishes, two new folders appear under data: det and rec. A quick check of the split is sketched below.
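A minimal sketch, assuming the default output layout of the script above, that counts the entries in each split so you can confirm the 6:2:2 ratio took effect:

# Count the label entries written for each task and split.
import os

for task in ("det", "rec"):
    counts = {}
    for split in ("train", "val", "test"):
        with open(os.path.join("data", task, split + ".txt"), encoding="utf-8") as f:
            counts[split] = sum(1 for _ in f)
    print(task, counts)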
II. Training the Text Detection Model
For the models you can use, refer to the PaddleOCR model list; here a PP-OCR model is taken as the pretrained model.
(I keep the ones I use most often noted here.)
1. Download the pretrained model
After downloading, create a pretrain_models folder in the PaddleOCR-release-2.7 root directory and extract the pretrained model into it, for example as in the sketch below.
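A minimal sketch of the download-and-extract step; the URL is a placeholder and should be replaced with the link for your chosen model from the model list:

# Download a pretrained model archive and unpack it into pretrain_models/.
import os
import tarfile
import urllib.request

url = "https://example.com/ch_PP-OCRv4_det_train.tar"  # placeholder: use the real model-list link
os.makedirs("pretrain_models", exist_ok=True)
tar_path = os.path.join("pretrain_models", os.path.basename(url))
urllib.request.urlretrieve(url, tar_path)
with tarfile.open(tar_path) as tar:
    tar.extractall("pretrain_models")  # e.g. pretrain_models/ch_PP-OCRv4_det_train/best_accuracy.pdparams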
2. Configure the PP-OCR detection model file
Find the ch_det_res18_db_v2.0.yml config file under configs/det/ch_ppocr_v2.0/.
My ch_det_res18_db_v2.0.yml:
Global:
  use_gpu: true # whether to use the GPU; set to false if there is none
  epoch_num: 50 # number of training epochs
  log_smooth_window: 20
  print_batch_step: 2 # print a training log every this many iterations
  save_model_dir: ./output/ch_db_res18/ # where the output model files are saved
  save_epoch_step: 50 # save a checkpoint every this many epochs
  # evaluation is run every 5000 iterations after the 4000th iteration
  eval_batch_step: [3000, 2000]
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/ch_PP-OCRv4_det_train/best_accuracy.pdparams # path to the pretrained model downloaded above
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./output/det_db/predicts_db.txt

Architecture:
  model_type: det
  algorithm: DB
  Transform:
  Backbone:
    name: ResNet_vd
    layers: 18
    disable_se: True
  Neck:
    name: DBFPN
    out_channels: 256
  Head:
    name: DBHead
    k: 50

Loss:
  name: DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 2
  regularizer:
    name: 'L2'
    factor: 0

PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5

Metric:
  name: DetMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./data/ # training data root
    label_file_list:
      - ./data/det/train.txt # training label file produced by the split script
    ratio_list: [1.0]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - IaaAugment:
          augmenter_args:
            - { 'type': Fliplr, 'args': { 'p': 0.5 } }
            - { 'type': Affine, 'args': { 'rotate': [-10, 10] } }
            - { 'type': Resize, 'args': { 'size': [0.5, 3] } }
      - EastRandomCropData:
          size: [960, 960]
          max_tries: 50
          keep_ratio: true
      - MakeBorderMap:
          shrink_ratio: 0.4
          thresh_min: 0.3
          thresh_max: 0.7
      - MakeShrinkMap:
          shrink_ratio: 0.4
          min_text_size: 8
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask'] # the order of the dataloader list
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 2
    num_workers: 2

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./data/ # evaluation data root
    label_file_list:
      - ./data/det/val.txt # evaluation label file
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
#         image_shape: [736, 1280]
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 2
3. Start training the text detection model
Open an Anaconda terminal, activate the environment, and change to the PaddleOCR-release-2.7 root directory.
Run the following commands to start training:
conda activate label4
python tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml
4. Test the trained model
The trained models are saved under ./output/ch_db_res18/.
Use best_accuracy.pdparams for testing; if it is missing, not enough training iterations have run yet, so test with latest.pdparams instead.
Run one of the following commands in the Anaconda terminal, where Global.pretrained_model is the trained model to test and Global.infer_img is the path of the image to detect (a sketch for visualizing the saved detection results follows the commands):
python tools/infer_det.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.pretrained_model=output/ch_db_res18/latest.pdparams Global.infer_img="./data/test_images/1.jpg"
python tools/infer_det.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.pretrained_model=output/ch_db_res18/best_accuracy.pdparams Global.infer_img="./data/test_images/1.jpeg"
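A minimal sketch for drawing the predicted boxes onto the test image, assuming each line of the result file (the save_res_path set in the config above) is the image path followed by a tab and a JSON list of {"points": ...} entries, which is what tools/infer_det.py writes:

# Visualize detection results saved by tools/infer_det.py.
import json
import os

import cv2
import numpy as np

with open("output/det_db/predicts_db.txt", encoding="utf-8") as f:
    for line in f:
        img_path, boxes_json = line.rstrip("\n").split("\t", 1)
        img = cv2.imread(img_path)
        if img is None:
            continue  # skip entries whose image cannot be found
        for det in json.loads(boxes_json):
            pts = np.array(det["points"], dtype=np.int32).reshape(-1, 1, 2)
            cv2.polylines(img, [pts], isClosed=True, color=(0, 255, 0), thickness=2)
        out_path = "det_vis_" + os.path.basename(img_path)
        cv2.imwrite(out_path, img)
        print("saved", out_path)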
That completes the detection model; next comes training and testing the recognition model!
III. Training the Text Recognition Model
1. Modify the recognition model config file
Text recognition uses the ch_PP-OCRv3_rec.yml config file.
Find ch_PP-OCRv3_rec.yml under configs/rec/PP-OCRv3/.
The changes are similar to those made for text detection:
My ch_PP-OCRv3_rec.yml:
Global:
  debug: false
  use_gpu: true
  epoch_num: 50
  log_smooth_window: 20
  print_batch_step: 1
  save_model_dir: ./output/rec_ppocr_v3
  save_epoch_step: 15
  eval_batch_step: [3000, 2000]
  cal_metric_during_train: true
  pretrained_model: ./pretrain_models/ch_PP-OCRv4_rec_train/student.pdparams # path to the recognition pretrained model
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv4.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
      - ./data/rec/train.txt # recognition training label file
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - RecConAug:
          prob: 0.5
          ext_data_num: 2
          image_shape: [48, 320, 3]
          max_text_length: *max_text_length
      - RecAug:
      - MultiLabelEncode:
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 16
    drop_last: true
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./data/
    label_file_list:
      - ./data/rec/val.txt # recognition evaluation label file
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: false
      - MultiLabelEncode:
      - RecResizeImg:
          image_shape: [3, 48, 320]
      - KeepKeys:
          keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 16
    num_workers: 8
2. Train the model
Open an Anaconda terminal, activate the environment, change to the PaddleOCR-release-2.7 root directory, and run the following command to start training.
python tools/train.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml
Training complete.
3. Test the model
Run the following command in the Anaconda terminal, where Global.pretrained_model is the trained model to test and Global.infer_img is the path of the image to recognize.
python tools/infer_rec.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml -o Global.pretrained_model=output/rec_ppocr_v3/latest.pdparams Global.infer_img="./data/test_images/1.jpeg"
That completes the recognition model.
***************************************************************************
IV. Exporting an Inference Model
1. Export the trained models in the Anaconda terminal
Here Global.pretrained_model is the trained model to export and Global.save_inference_dir is where the inference model will be saved. An inference model can be called directly for detection and recognition. Export the trained text detection model and the text recognition model separately.
python tools/export_model.py -c "./configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml" -o Global.pretrained_model="./output/ch_db_res18/latest.pdparams" Global.save_inference_dir="./inference_model/det/"
The log reports: inference model is saved to ./inference_model/det/inference
python tools/export_model.py -c "./configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml" -o Global.pretrained_model="./output/rec_ppocr_v3/latest.pdparams" Global.save_inference_dir="./inference_model/rec/"
The log reports: inference model is saved to ./inference_model/rec/inference
The det and rec folders are the exported inference models; a quick check of their contents is sketched below.
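A minimal sketch that lists each export directory, assuming tools/export_model.py has written the usual inference.pdmodel / inference.pdiparams files there:

# Confirm both exported inference models are complete.
import os

for d in ("./inference_model/det", "./inference_model/rec"):
    files = sorted(os.listdir(d))
    print(d, files)
    assert "inference.pdmodel" in files and "inference.pdiparams" in files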
2. Verify with predict_system.py
Open an Anaconda terminal and run:
python tools/infer/predict_system.py --det_model_dir="./inference_model/det/" --rec_model_dir="./inference_model/rec" --image_dir="./data/test_images/3.jpeg"
The results are saved in ./inference_results/. As an alternative to the script, the exported models can also be called directly from Python, as sketched below.
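A minimal sketch of calling the two exported models through the paddleocr Python package instead of predict_system.py, assuming the paddleocr 2.x pip package is installed in the same environment:

# Run detection + recognition with the exported inference models.
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    det_model_dir="./inference_model/det/",
    rec_model_dir="./inference_model/rec/",
    use_angle_cls=False,  # no direction classifier was trained in this guide
)
result = ocr.ocr("./data/test_images/3.jpeg", cls=False)
for box, (text, score) in result[0]:
    print(text, round(score, 3), box)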
V. Deploying the Inference Model on the RK3588
To be continued in a future update...