[Keyboard Recognition] Instance Segmentation
Step 1: Keyboard Detection
Approach 1: Canny edge detection
Canny edge detection is unstable here: complex backgrounds or changes in lighting easily cause it to pick up other objects instead of the keyboard.
The figure shows the four corner points (marked in red) of the keyboard edge found with the Canny method.
Reference: the CSDN blog post "OpenCV实战之三 | 基于OpenCV实现图像校正" (OpenCV in practice, part 3: image rectification with OpenCV).
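A minimal sketch of this Canny-plus-quadrilateral idea (the file name and the thresholds are assumptions, not the original code): detect edges, take the largest external contour, approximate it with four points, and draw them in red.

import cv2
import numpy as np

# Sketch: Canny edges -> largest contour -> quadrilateral corners (thresholds are assumptions)
img = cv2.imread("keyboard.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)

# Largest external contour, approximated with a polygon
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)
approx = cv2.approxPolyDP(largest, 0.02 * cv2.arcLength(largest, True), True)

if len(approx) == 4:
    # Mark the four corner points in red, as in the figure
    for (x, y) in approx.reshape(4, 2):
        cv2.circle(img, (int(x), int(y)), 8, (0, 0, 255), -1)
cv2.imwrite("corners.png", img)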
Approach 2: Mask R-CNN
Paper: arXiv:1703.06870 (Mask R-CNN).
Reference: the CSDN blog post "Mask Rcnn目标分割-训练自己数据集-详细步骤" (Mask R-CNN segmentation: detailed steps for training on your own dataset).
1. Download the code and set up the environment
Remote server + source archive
Download the source archive from Releases · matterport/Mask_RCNN, then upload the zip file to the remote server.
Directory levels in a Linux system
Contents under the root directory /. The root directory / is the top-level directory of the Linux file system; all other directories are subdirectories of it.
root@autodl-container-9403468337-223581eb:~# cd /
root@autodl-container-9403468337-223581eb:/# ls
bin boot dev etc home init lib lib32 lib64 libx32 media mnt NGC-DL-CONTAINER-LICENSE opt proc root run sbin srv sys tmp usr var
Contents under the home directory ~. The home directory is each user's own working directory, where the user can freely create, modify, and delete files and folders. What you see here usually reflects the user's own activity and environment:
root@autodl-container-9403468337-223581eb:/# cd ~
root@autodl-container-9403468337-223581eb:~# ls
autodl-pub autodl-tmp Mask_RCNN-master.zip miniconda3 tf-logs
Unpack the source archive:
root@autodl-container-9403468337-223581eb:~# unzip Mask_RCNN-master.zip
Then install the libraries listed in requirements.txt (e.g., pip install -r requirements.txt).
Local machine + cloned repository
(keyboard) C:\Users\吴伊晴>git clone https://github.com/matterport/Mask_RCNN.git
Cloning into 'Mask_RCNN'...
remote: Enumerating objects: 956, done.
remote: Total 956 (delta 0), reused 0 (delta 0), pack-reused 956 (from 1)
Receiving objects: 100% (956/956), 137.67 MiB | 16.28 MiB/s, done.
Resolving deltas: 100% (558/558), done.
(keyboard) C:\Users\吴伊晴>cd Mask_RCNN
(keyboard) C:\Users\吴伊晴\Mask_RCNN>python setup.py install
setup.py:9: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
WARNING:root:Fail load requirements file, so using default ones.
D:\Env\ANACONDA\envs\keyboard\lib\site-packages\setuptools\dist.py:452: SetuptoolsDeprecationWarning: Invalid dash-separated options
!!
********************************************************************************
Usage of dash-separated 'description-file' will not be supported in future
versions. Please use the underscore name 'description_file' instead.
This deprecation is overdue, please update your project and remove deprecated
calls to avoid build errors in the future.
See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
********************************************************************************
!!
  opt = self.warn_dash_deprecation(opt, section)
D:\Env\ANACONDA\envs\keyboard\lib\site-packages\setuptools\dist.py:452: SetuptoolsDeprecationWarning: Invalid dash-separated options
!!
********************************************************************************
Usage of dash-separated 'license-file' will not be supported in future
versions. Please use the underscore name 'license_file' instead.
This deprecation is overdue, please update your project and remove deprecated
calls to avoid build errors in the future.
See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
********************************************************************************
!!
  opt = self.warn_dash_deprecation(opt, section)
D:\Env\ANACONDA\envs\keyboard\lib\site-packages\setuptools\dist.py:452: SetuptoolsDeprecationWarning: Invalid dash-separated options
!!
********************************************************************************
Usage of dash-separated 'requirements-file' will not be supported in future
versions. Please use the underscore name 'requirements_file' instead.
This deprecation is overdue, please update your project and remove deprecated
calls to avoid build errors in the future.
See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
********************************************************************************
!!
  opt = self.warn_dash_deprecation(opt, section)
INFO:root:running install
D:\Env\ANACONDA\envs\keyboard\lib\site-packages\setuptools\_distutils\cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
  self.initialize_options()
D:\Env\ANACONDA\envs\keyboard\lib\site-packages\setuptools\_distutils\cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` and ``easy_install``.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://github.com/pypa/setuptools/issues/917 for details.
********************************************************************************
!!
  self.initialize_options()
INFO:root:running bdist_egg
INFO:root:running egg_info
INFO:root:creating mask_rcnn.egg-info
INFO:root:writing mask_rcnn.egg-info\PKG-INFO
INFO:root:writing dependency_links to mask_rcnn.egg-info\dependency_links.txt
INFO:root:writing top-level names to mask_rcnn.egg-info\top_level.txt
INFO:root:writing manifest file 'mask_rcnn.egg-info\SOURCES.txt'
INFO:root:reading manifest file 'mask_rcnn.egg-info\SOURCES.txt'
INFO:root:reading manifest template 'MANIFEST.in'
INFO:root:adding license file 'LICENSE'
INFO:root:writing manifest file 'mask_rcnn.egg-info\SOURCES.txt'
INFO:root:installing library code to build\bdist.win-amd64\egg
INFO:root:running install_lib
INFO:root:running build_py
INFO:root:creating build\lib\mrcnn
INFO:root:copying mrcnn\config.py -> build\lib\mrcnn
INFO:root:copying mrcnn\model.py -> build\lib\mrcnn
INFO:root:copying mrcnn\parallel_model.py -> build\lib\mrcnn
INFO:root:copying mrcnn\utils.py -> build\lib\mrcnn
INFO:root:copying mrcnn\visualize.py -> build\lib\mrcnn
INFO:root:copying mrcnn\__init__.py -> build\lib\mrcnn
INFO:root:creating build\bdist.win-amd64\egg
INFO:root:creating build\bdist.win-amd64\egg\mrcnn
INFO:root:copying build\lib\mrcnn\config.py -> build\bdist.win-amd64\egg\mrcnn
INFO:root:copying build\lib\mrcnn\model.py -> build\bdist.win-amd64\egg\mrcnn
INFO:root:copying build\lib\mrcnn\parallel_model.py -> build\bdist.win-amd64\egg\mrcnn
INFO:root:copying build\lib\mrcnn\utils.py -> build\bdist.win-amd64\egg\mrcnn
INFO:root:copying build\lib\mrcnn\visualize.py -> build\bdist.win-amd64\egg\mrcnn
INFO:root:copying build\lib\mrcnn\__init__.py -> build\bdist.win-amd64\egg\mrcnn
INFO:root:byte-compiling build\bdist.win-amd64\egg\mrcnn\config.py to config.cpython-38.pyc
INFO:root:byte-compiling build\bdist.win-amd64\egg\mrcnn\model.py to model.cpython-38.pyc
build\bdist.win-amd64\egg\mrcnn\model.py:2359: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if os.name is 'nt':
INFO:root:byte-compiling build\bdist.win-amd64\egg\mrcnn\parallel_model.py to parallel_model.cpython-38.pyc
INFO:root:byte-compiling build\bdist.win-amd64\egg\mrcnn\utils.py to utils.cpython-38.pyc
INFO:root:byte-compiling build\bdist.win-amd64\egg\mrcnn\visualize.py to visualize.cpython-38.pyc
INFO:root:byte-compiling build\bdist.win-amd64\egg\mrcnn\__init__.py to __init__.cpython-38.pyc
INFO:root:creating build\bdist.win-amd64\egg\EGG-INFO
INFO:root:copying mask_rcnn.egg-info\PKG-INFO -> build\bdist.win-amd64\egg\EGG-INFO
INFO:root:copying mask_rcnn.egg-info\SOURCES.txt -> build\bdist.win-amd64\egg\EGG-INFO
INFO:root:copying mask_rcnn.egg-info\dependency_links.txt -> build\bdist.win-amd64\egg\EGG-INFO
INFO:root:copying mask_rcnn.egg-info\top_level.txt -> build\bdist.win-amd64\egg\EGG-INFO
WARNING:root:zip_safe flag not set; analyzing archive contents...
INFO:root:creating dist
INFO:root:creating 'dist\mask_rcnn-2.1-py3.8.egg' and adding 'build\bdist.win-amd64\egg' to it
INFO:root:removing 'build\bdist.win-amd64\egg' (and everything under it)
INFO:root:Processing mask_rcnn-2.1-py3.8.egg
INFO:root:Copying mask_rcnn-2.1-py3.8.egg to d:\env\anaconda\envs\keyboard\lib\site-packages
INFO:root:Adding mask-rcnn 2.1 to easy-install.pth file
INFO:root:
Installed d:\env\anaconda\envs\keyboard\lib\site-packages\mask_rcnn-2.1-py3.8.egg
INFO:root:Processing dependencies for mask-rcnn==2.1
INFO:root:Finished processing dependencies for mask-rcnn==2.1
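To confirm that the egg actually installed into the keyboard environment, a quick import check can be run (a minimal sketch; it only verifies that the package resolves, not that the TensorFlow/Keras dependencies are working):

# Minimal import check for the installed mask-rcnn egg
import mrcnn
from mrcnn.config import Config  # config only depends on numpy, so no TensorFlow needed here

print(mrcnn.__file__)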
2. Prepare the dataset
To train Mask R-CNN, I prepared an annotated dataset containing keyboards. I first collected keyboard images and then labeled the keyboards manually with the LabelMe tool. (LabelMe did not seem to open on the cloud server, so I annotated the dataset on my local machine.)
The annotated dataset looks like this.
Because the dataset format has to be compatible with what Mask R-CNN expects, the labels need to be converted to the COCO-style format. The conversion script:
import argparse
import base64
import json
import os
import os.path as osp

import imgviz
import PIL.Image

from labelme.logger import logger
from labelme import utils
import glob
# imports added at the top
import yaml


def main():
    logger.warning("This script is aimed to demonstrate how to convert the "
                   "JSON file to a single image dataset.")
    logger.warning("It won't handle multiple JSON files to generate a "
                   "real-use dataset.")
    parser = argparse.ArgumentParser()
    # -------------------- added -------------------- #
    # parser.add_argument("json_file")
    parser.add_argument("--json_dir",
                        default="D:/2021file/Biye/Mask_RCNN-master/samples/Mydata")
    # --------------------- end --------------------- #
    parser.add_argument("-o", "--out", default=None)
    args = parser.parse_args()

    # -------------------- added -------------------- #
    assert args.json_dir is not None and len(args.json_dir) > 0
    # json_file = args.json_file
    json_dir = args.json_dir
    if osp.isfile(json_dir):
        json_list = [json_dir] if json_dir.endswith('.json') else []
    else:
        json_list = glob.glob(os.path.join(json_dir, '*.json'))
    # --------------------- end --------------------- #

    # loop over all JSON files (added)
    for json_file in json_list:
        json_name = osp.basename(json_file).split('.')[0]
        out_dir = args.out if (args.out is not None) else osp.join(osp.dirname(json_file), json_name)

        if not osp.exists(out_dir):
            os.makedirs(out_dir)

        data = json.load(open(json_file))
        imageData = data.get("imageData")

        if not imageData:
            imagePath = os.path.join(os.path.dirname(json_file), data["imagePath"])
            with open(imagePath, "rb") as f:
                imageData = f.read()
                imageData = base64.b64encode(imageData).decode("utf-8")
        img = utils.img_b64_to_arr(imageData)

        label_name_to_value = {"_background_": 0}
        for shape in sorted(data["shapes"], key=lambda x: x["label"]):
            label_name = shape["label"]
            if label_name in label_name_to_value:
                label_value = label_name_to_value[label_name]
            else:
                label_value = len(label_name_to_value)
                label_name_to_value[label_name] = label_value
        lbl, _ = utils.shapes_to_label(img.shape, data["shapes"], label_name_to_value)

        label_names = [None] * (max(label_name_to_value.values()) + 1)
        for name, value in label_name_to_value.items():
            label_names[value] = name

        lbl_viz = imgviz.label2rgb(lbl, imgviz.asgray(img), label_names=label_names, loc="rb")

        PIL.Image.fromarray(img).save(osp.join(out_dir, "img.png"))
        utils.lblsave(osp.join(out_dir, "label.png"), lbl)
        PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, "label_viz.png"))

        with open(osp.join(out_dir, "label_names.txt"), "w") as f:
            for lbl_name in label_names:
                f.write(lbl_name + "\n")

        logger.info("Saved to: {}".format(out_dir))

        # added: also generate info.yaml
        logger.warning('info.yaml is being replaced by label_names.txt')
        info = dict(label_names=label_names)
        with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
            yaml.safe_dump(info, f, default_flow_style=False)
        logger.info('Saved to: {}'.format(out_dir))


if __name__ == "__main__":
    main()
Batch-convert your own .jpg and .json files; each sample gets its own output folder containing 5 files (img.png, label.png, label_viz.png, label_names.txt, and info.yaml).
However, this pipeline never ran successfully for me… so I switched to another approach.
Approach 3: YOLOv8
Dataset
├─images
│ ├─test
│ ├─train
│ └─val
└─labels
   ├─test
   ├─train
   └─val
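The training script in the next section passes data="./keyboard.yaml". A minimal sketch of generating such a dataset config is below; the dataset root path and the single class name keyboard are assumptions based on the directory tree above:

import yaml

# Sketch of the dataset config referenced as data="./keyboard.yaml" in training.
# The root path and the class name are assumptions matching the tree above.
config = {
    "path": "./datasets",      # dataset root (assumed)
    "train": "images/train",
    "val": "images/val",
    "test": "images/test",
    "names": {0: "keyboard"},  # single class
}
with open("keyboard.yaml", "w") as f:
    yaml.safe_dump(config, f, default_flow_style=False)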
Training
from ultralytics import YOLO


def main():
    # Load a model
    model = YOLO("yolov8n-seg.pt")  # load a pretrained model (recommended for training)
    # Train the model
    results = model.train(data="./keyboard.yaml", epochs=100, plots=True, batch=4)


if __name__ == '__main__':
    main()
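By default, Ultralytics writes the trained weights under runs/segment/train*/weights/, and the best.pt loaded by the test script below comes from there. A minimal sketch of validating those weights against the same dataset config (the run directory and the metrics attribute are assumptions about the Ultralytics defaults):

from ultralytics import YOLO

# Sketch: validate the trained segmentation weights on the dataset config above.
# The weights path follows the default Ultralytics layout, which is an assumption;
# adjust it to your actual run directory.
model = YOLO("runs/segment/train/weights/best.pt")
metrics = model.val(data="./keyboard.yaml")
print(metrics.seg.map50)  # mask mAP@0.5 (assumes the SegmentMetrics interface)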
Testing
from ultralytics import YOLO
import numpy as np
from pathlib import Path
import cv2

model = YOLO("best.pt")
results = model(r"E:\0\keyboard\datasets\train_data\images\train")

# Make sure the output directory exists before cv2.imwrite is called
Path("./runs/crop").mkdir(parents=True, exist_ok=True)

for result in results:
    img = np.copy(result.orig_img)
    img_name = Path(result.path).stem  # base name of the source image

    # Blank (transparent) background image with the same size as the original
    transparent_img = np.zeros_like(img, dtype=np.uint8)

    for ci, c in enumerate(result):
        # Name of the detected class
        label = c.names[c.boxes.cls.tolist().pop()]
        # All segmentation mask polygons
        masks = c.masks.xy
        for i, mask in enumerate(masks):
            # Build a binary mask image
            b_mask = np.zeros(img.shape[:2], np.uint8)
            contour = mask.astype(np.int32).reshape(-1, 1, 2)
            cv2.drawContours(b_mask, [contour], -1, (255), cv2.FILLED)
            # Copy the masked region onto the blank background
            transparent_img[b_mask == 255] = img[b_mask == 255]
            # Save the masked image
            mask_img_name = f"./runs/crop/{img_name}_{label}_mask_{i+1}.png"
            cv2.imwrite(mask_img_name, transparent_img)
Step 2: Image Cropping + Perspective Transform + Grayscale Processing
I modified the test script slightly:
from ultralytics import YOLO
import numpy as np
from pathlib import Path
import cv2

model = YOLO("best.pt")

# Get the user's home directory
home_dir = Path.home()
# Build the full path to the test images
source_path = home_dir / 'YOLO' / 'datasets' / 'images' / 'test'

# Run prediction
results = model(str(source_path))

# Make sure the output directory exists before cv2.imwrite is called
Path("./runs/crop").mkdir(parents=True, exist_ok=True)

for result in results:
    img = np.copy(result.orig_img)
    img_name = Path(result.path).stem  # base name of the source image

    for ci, c in enumerate(result):
        # Name of the detected class
        label = c.names[c.boxes.cls.tolist().pop()]
        # All segmentation mask polygons
        masks = c.masks.xy
        for i, mask in enumerate(masks):
            # Build a binary mask image
            b_mask = np.zeros(img.shape[:2], np.uint8)
            contour = mask.astype(np.int32).reshape(-1, 1, 2)
            cv2.drawContours(b_mask, [contour], -1, (255), cv2.FILLED)

            # Find the largest contour in the mask
            contours, _ = cv2.findContours(b_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            if not contours:
                continue
            max_contour = max(contours, key=cv2.contourArea)

            # Approximate the contour with a quadrilateral
            epsilon = 0.02 * cv2.arcLength(max_contour, True)
            approx = cv2.approxPolyDP(max_contour, epsilon, True)
            if len(approx) != 4:
                continue

            # Reorder the vertices: top-left, top-right, bottom-right, bottom-left
            pts = approx.reshape(4, 2)
            rect = np.zeros((4, 2), dtype="float32")
            s = pts.sum(axis=1)
            rect[0] = pts[np.argmin(s)]
            rect[2] = pts[np.argmax(s)]
            diff = np.diff(pts, axis=1)
            rect[1] = pts[np.argmin(diff)]
            rect[3] = pts[np.argmax(diff)]

            # Compute the target rectangle for the perspective transform
            (tl, tr, br, bl) = rect
            widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
            widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
            maxWidth = max(int(widthA), int(widthB))
            heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
            heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
            maxHeight = max(int(heightA), int(heightB))
            dst = np.array([
                [0, 0],
                [maxWidth - 1, 0],
                [maxWidth - 1, maxHeight - 1],
                [0, maxHeight - 1]], dtype="float32")

            # Compute and apply the perspective transform
            M = cv2.getPerspectiveTransform(rect, dst)
            warped_img = cv2.warpPerspective(img, M, (maxWidth, maxHeight))

            # # Convert to grayscale
            # gray_img = cv2.cvtColor(warped_img, cv2.COLOR_BGR2GRAY)
            # # Binarize
            # _, binary_img = cv2.threshold(gray_img, 100, 255, cv2.THRESH_BINARY)

            # Save the cropped, rectified image
            mask_img_name = f"./runs/crop/{img_name}_{label}_mask_{i + 1}.png"
            cv2.imwrite(mask_img_name, warped_img)
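The grayscale step named in this section's title is commented out in the script above. A minimal sketch of re-enabling it on a saved warped crop, using Otsu thresholding instead of the fixed threshold of 100 (the Otsu choice and the file name are assumptions, not the original settings):

import cv2

# Sketch: grayscale + Otsu binarization of a warped keyboard crop.
# The input file name is hypothetical; the original (commented-out) code used a fixed threshold of 100.
warped_img = cv2.imread("./runs/crop/example_keyboard_mask_1.png")
gray_img = cv2.cvtColor(warped_img, cv2.COLOR_BGR2GRAY)
_, binary_img = cv2.threshold(gray_img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("./runs/crop/example_keyboard_binary.png", binary_img)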
Because the training set was put together on short notice and is not very large, and because of lighting issues, the results still differ somewhat from what is required.
Improvements:
1. Expand the dataset
2. Balance the lighting across photos (see the sketch below)
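For improvement 2, one option is to equalize uneven lighting with CLAHE before training or inference. A minimal sketch (the clip limit, tile size, and file names are assumptions):

import cv2

# Sketch: balance uneven lighting with CLAHE on the L channel of the LAB image
img = cv2.imread("keyboard.jpg")
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
l_eq = clahe.apply(l)
balanced = cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)
cv2.imwrite("keyboard_balanced.png", balanced)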