
License Plate Recognition, Part 2: Plate OCR (with free datasets, source code, and model downloads)

Important information up front

Dataset:
https://pan.baidu.com/s/1YayAeqgdqZ0u2vSovd0Z4w
Extraction code: 8888
If the share has been accidentally deleted, download CCPD2019.tar.xz and CCPD2020.zip from the source referenced here instead.

Background

In the previous article, License Plate Recognition, Part 1: Plate Detection (with free datasets, source code, and model downloads), we located the plate with object detection, obtaining both the bounding box and the four corner points. This article looks at OCR recognition of the cropped plate region.

Dependencies

paddlepaddle-gpu==2.6.2
shapely
scikit-image
six
pyclipper
lmdb
tqdm
numpy
rapidfuzz
opencv-python
opencv-contrib-python
cython
Pillow
pyyaml
requests
albumentations==1.4.10
# to be compatible with albumentations
albucore==0.0.13
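
If you save the list above as a requirements file (e.g. requirements.txt; the file name is just a convention), everything can be installed in one step:

pip install -r requirements.txt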

Install PaddleOCR

PaddleOCR is currently regarded as one of the most accurate open-source frameworks for Chinese text recognition, which makes it the natural choice for this project.

git clone https://github.com/PaddlePaddle/PaddleOCR
cd PaddleOCR
pip install -v -e .

The PaddleOCR version tested in this project is 2.9.0.

Download the dataset to data/ccpd and extract it

After extraction, the data directory looks like this (an extraction sketch follows the tree):

data/ccpd
├── CCPD2019
├── CCPD2019.tar.xz
├── CCPD2020
└── CCPD2020.zip
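
If you prefer to script the extraction, here is a minimal Python sketch, assuming both archives are already in data/ccpd as shown in the tree above:

import tarfile
import zipfile

# Extract CCPD2019 (.tar.xz) and CCPD2020 (.zip) alongside the archives.
with tarfile.open('data/ccpd/CCPD2019.tar.xz', 'r:xz') as tar:
    tar.extractall('data/ccpd')
with zipfile.ZipFile('data/ccpd/CCPD2020.zip') as zf:
    zf.extractall('data/ccpd')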

Generate the training and validation sets

Next, crop the plate regions from the extracted data and split them into training and validation sets (the script below assigns roughly 90% of the samples to train and 10% to val). Create gen_paddleocr_format_data.py with the following content:

import os,sys
import glob
import random
import cv2
import numpy as np
from tqdm import tqdm
# Province abbreviations and plate characters indexed by the CCPD labels
provincelist = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑",
                "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤",
                "桂", "琼", "川", "贵", "云", "西", "陕", "甘", "青", "宁", "新"]
wordlist = ["A", "B", "C", "D", "E", "F", "G", "H", "J", "K",
            "L", "M", "N", "P", "Q", "R", "S", "T", "U", "V",
            "W", "X", "Y", "Z", "0", "1", "2", "3", "4", "5",
            "6", "7", "8", "9"]

def gen_paddleocr_format_data(rootpath, dstpath):
    os.makedirs(dstpath, exist_ok=True)
    os.makedirs(os.path.join(dstpath, 'train'), exist_ok=True)
    os.makedirs(os.path.join(dstpath, 'val'), exist_ok=True)
    list_images = glob.glob(f'{rootpath}/**/*.jpg', recursive=True)
    for imgpath in tqdm(list_images):
        if "/ccpd_np/" in imgpath:
            # ccpd_np contains images without a plate, skip them
            continue
        img = cv2.imread(imgpath)
        # the image name encodes the annotation
        imgname = os.path.basename(imgpath).split('.')[0]
        # split the annotation fields from the image name
        _, _, box, points, label, brightness, blurriness = imgname.split('-')
        # --- bounding box
        box = box.split('_')
        box = [list(map(int, i.split('&'))) for i in box]
        box_w = box[1][0] - box[0][0]
        box_h = box[1][1] - box[0][1]
        filename = label
        # --- plate number
        label = label.split('_')
        # province abbreviation
        province = provincelist[int(label[0])]
        # remaining characters
        words = [wordlist[int(i)] for i in label[1:]]
        # full plate string
        label = province + ''.join(words)
        # crop the plate region
        img_plate = img[box[0][1]:box[1][1], box[0][0]:box[1][0]]
        random_number = random.uniform(0, 1)
        if random_number > 0.1:
            # train
            dst_img_path = os.path.join(dstpath, 'train', f"{filename}.jpg")
            labelfileadd = os.path.join(dstpath, 'train.txt')
            labelcontent = f"train/{filename}.jpg\t{label}\n"
        else:
            # val
            dst_img_path = os.path.join(dstpath, 'val', f"{filename}.jpg")
            labelfileadd = os.path.join(dstpath, 'val.txt')
            labelcontent = f"val/{filename}.jpg\t{label}\n"
        cv2.imwrite(dst_img_path, img_plate)
        with open(labelfileadd, 'a') as f:
            f.write(labelcontent)

def gen_license_dict(dict_save_path):
    allwordlist = wordlist + provincelist
    allwordlist.sort()
    print(allwordlist)
    with open(dict_save_path, 'w') as f:
        for word in allwordlist:
            f.write(word + '\n')

if __name__ == '__main__':
    if len(sys.argv) != 3:
        print("Usage: python gen_paddleocr_format_data.py <ccpd_dataset_path> <output_path>")
        exit(1)
    gen_paddleocr_format_data(sys.argv[1], sys.argv[2])
    gen_license_dict("data/ccpd_paddleocr/dict.txt")
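
To make the label encoding concrete: the fields in a CCPD file name are separated by '-', and the fifth field encodes the plate as indices into provincelist and wordlist. A quick check, reusing the lists from the script above (the example label below is the file name of the second test image used in the inference section later):

# Decode a single CCPD label field into a plate string.
from gen_paddleocr_format_data import provincelist, wordlist

label = "22_16_5_0_25_24_25".split('_')
plate = provincelist[int(label[0])] + ''.join(wordlist[int(i)] for i in label[1:])
print(plate)  # 川SFA101 (provincelist[22] = 川, wordlist[16] = S, ...)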

Run the script:

python gen_paddleocr_format_data.py data/ccpd data/ccpd_paddleocr

This produces the following directory layout for the recognition dataset:

ccpd_paddleocr/
├── dict.txt
├── train
├── train.txt
├── val
└── val.txt
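
A quick sanity check of the generated data is to count the label lines and dictionary entries; the dictionary should contain 65 characters (31 province abbreviations plus 24 letters and 10 digits, since I and O are not used on plates). A minimal sketch:

# Count samples per split and verify the character dictionary size.
def count_lines(path):
    with open(path, encoding='utf-8') as f:
        return sum(1 for _ in f)

n_train = count_lines('data/ccpd_paddleocr/train.txt')
n_val = count_lines('data/ccpd_paddleocr/val.txt')
n_dict = count_lines('data/ccpd_paddleocr/dict.txt')
print(f'train: {n_train}  val: {n_val}  ratio ~ {n_train / max(n_val, 1):.1f}:1')
print(f'dict entries: {n_dict}')  # expected: 65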

Download the pre-trained model

Get it here, then extract it into pretrain_models.
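
After extraction, the config below expects the weights at pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy.pdparams, i.e. roughly this layout (the exact file list inside the archive may differ):

pretrain_models/
└── ch_PP-OCRv3_rec_train/
    ├── best_accuracy.pdparams
    └── ...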

Configuration file

Under PaddleOCR/configs/rec/PP-OCRv3, create ch_PP-OCRv3_rec_liecece.yml with the following content:

Global:
  debug: false
  use_gpu: true
  epoch_num: 100
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_licence_v3
  save_epoch_step: 3
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model: pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy.pdparams
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: data/ccpd_paddleocr/dict.txt
  #character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 12
  infer_mode: false
  use_space_char: false
  distributed: true
  save_res_path: ./output/rec/predicts_licence_ppocrv3.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
    last_pool_kernel_size: [2, 2]
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length
Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False
Train:
  dataset:
    name: SimpleDataSet
    data_dir: data/ccpd_paddleocr/
    ext_op_transform_idx: 1
    label_file_list:
    - data/ccpd_paddleocr/train.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: data/ccpd_paddleocr/
    label_file_list:
    - data/ccpd_paddleocr/val.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4

Start training

python tools/train.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_liecece.yml -o Global.pretrained_model=./pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy.pdparams
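
If training is interrupted, it can be resumed from the latest checkpoint via PaddleOCR's standard Global.checkpoints option (the path below is the save_model_dir set in the config above):

python tools/train.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_liecece.yml -o Global.checkpoints=./output/rec_ppocr_licence_v3/latest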

Training log

...
[2024/12/17 18:25:33] ppocr INFO: epoch: [85/100], global_step: 215960, lr: 0.000101, acc: 0.992187, norm_edit_dis: 0.998884, CTCLoss: 0.029224, SARLoss: 0.201802, loss: 0.234332, avg_reader_cost: 0.23257 s, avg_batch_cost: 0.42802 s, avg_samples: 128.0, ips: 299.05106 samples/s, eta: 4:16:11, max_mem_reserved: 9719 MB, max_mem_allocated: 9308 MB
[2024/12/17 18:25:37] ppocr INFO: epoch: [85/100], global_step: 215970, lr: 0.000101, acc: 0.984375, norm_edit_dis: 0.997768, CTCLoss: 0.037204, SARLoss: 0.198890, loss: 0.242042, avg_reader_cost: 0.15294 s, avg_batch_cost: 0.35018 s, avg_samples: 128.0, ips: 365.52281 samples/s, eta: 4:16:07, max_mem_reserved: 9719 MB, max_mem_allocated: 9308 MB
[2024/12/17 18:25:41] ppocr INFO: epoch: [85/100], global_step: 215980, lr: 0.000101, acc: 0.984375, norm_edit_dis: 0.997210, CTCLoss: 0.071525, SARLoss: 0.196448, loss: 0.259408, avg_reader_cost: 0.20332 s, avg_batch_cost: 0.39946 s, avg_samples: 128.0, ips: 320.43009 samples/s, eta: 4:16:03, max_mem_reserved: 9719 MB, max_mem_allocated: 9308 MB
[2024/12/17 18:25:44] ppocr INFO: epoch: [85/100], global_step: 215990, lr: 0.000101, acc: 0.984375, norm_edit_dis: 0.997280, CTCLoss: 0.063012, SARLoss: 0.195158, loss: 0.248105, avg_reader_cost: 0.10683 s, avg_batch_cost: 0.30685 s, avg_samples: 128.0, ips: 417.13550 samples/s, eta: 4:15:59, max_mem_reserved: 9719 MB, max_mem_allocated: 9308 MB
[2024/12/17 18:25:48] ppocr INFO: epoch: [85/100], global_step: 216000, lr: 0.000101, acc: 0.988281, norm_edit_dis: 0.997280, CTCLoss: 0.038435, SARLoss: 0.206021, loss: 0.248680, avg_reader_cost: 0.22180 s, avg_batch_cost: 0.41802 s, avg_samples: 128.0, ips: 306.20233 samples/s, eta: 4:15:56, max_mem_reserved: 9719 MB, max_mem_allocated: 9308 MB
eval model:: 100%|██████████████████████████████████████████████████████████| 284/284 [00:16<00:00, 16.73it/s]
[2024/12/17 18:26:05] ppocr INFO: cur metric, acc: 0.9785529428504617, norm_edit_dis: 0.9959577823762181, fps: 3396.2841101808885
[2024/12/17 18:26:06] ppocr INFO: save best model is to ./output/rec_ppocr_licence_v3/best_accuracy
[2024/12/17 18:26:06] ppocr INFO: best metric, acc: 0.9785529428504617, is_float16: False, norm_edit_dis: 0.9959577823762181, fps: 3396.2841101808885, best_epoch: 85
...

The evaluation accuracy reaches 97.86%, so the recognition model performs quite well.

Saved training artifacts

output/rec_ppocr_licence_v3/
├── best_accuracy.pdopt
├── best_accuracy.pdparams
├── best_accuracy.states
├── best_model
│   ├── model.pdopt
│   └── model.pdparams
├── config.yml
├── latest.pdopt
├── latest.pdparams
├── latest.states
└── train.log

The best-scoring model is saved in output/rec_ppocr_licence_v3/best_model.

Next, let's manually crop a few plates and check the recognition results.

The inference command:

python tools/infer_rec.py -c configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_liecece.yml -o Global.pretrained_model=./output/rec_ppocr_licence_v3/best_model/model Global.infer_img=../license_plate/test_plate.jpg


The output:

infer_img: ../license_plate/test_plate.jpg
result: 皖AD49590	0.9894002079963684
infer_img: ../license_plate/22_16_5_0_25_24_25.jpg
result: 川SFA101

One small regret

For ordinary scenarios this model is good enough for plate recognition. However, the training dictionary contains no characters for special plates, such as double-row plates, embassy plates, and military plates, because those samples are missing from the dataset. Once a more complete dataset is collected, the same procedure can be used to retrain the model.

Trained model

Download it here.

License Plate Recognition, Part 1: Plate Detection (with free datasets, source code, and model downloads)
License Plate Recognition, Part 3: ONNX Deployment of Detection + Recognition (free high-accuracy ONNX model download)

