Lightweight Model for Fish Recognition Based on YOLOV5-MobilenetV3 and Sonar Images

Luo Yizhi (罗毅智) 1,3; Lu Huazhong (陆华忠) 4; Zhou Xingxing (周星星) 1; Yuan Yu (袁余) 1,3; Qi Haijun (齐海军) 1,3; Li Bin (李斌) 1; Liu Zhichang (刘志昌) 2

Article Abstract

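Luo Yizhi, Lu Huazhong, Zhou Xingxing, Yuan Yu, Qi Haijun, Li Bin, Liu Zhichang. Lightweight Model for Fish Recognition Based on YOLOV5-MobilenetV3 and Sonar Images[J]. Guangdong Agricultural Sciences, 2023, 50(7): 37-46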

Lightweight Model for Fish Recognition Based on YOLOV5-MobilenetV3 and Sonar Images

DOI: 10.16768/j.issn.1004-874X.2023.07.004

Keywords: forward-looking sonar; fish; lightweight; object detection; cage

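Funding: Collaborative Innovation Center Project of Guangdong Academy of Agricultural Sciences (XT202203); Open Project of the Key Laboratory of the Ministry of Agriculture and Rural Affairs (2011NYZD2205); Guangdong Rural Revitalization Strategy Special Fund - Smart Facility Aquaculture Expert Workstation (2023 工作站 10); Guangdong Rural Revitalization Strategy Special Project (Agricultural Science and Technology Capacity Improvement) (2023TS-1-3)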

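Authors	Affiliations
Luo Yizhi 1,3; Lu Huazhong 4; Zhou Xingxing 1; Yuan Yu 1,3; Qi Haijun 1,3; Li Bin 1; Liu Zhichang 2	1. Institute of Facility Agriculture, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, Guangdong; 2. Institute of Animal Science (Fisheries Research Institute), Guangdong Academy of Agricultural Sciences, Guangzhou 510645, Guangdong; 3. Key Laboratory of Facility Agriculture Equipment and Informatization, Ministry of Agriculture and Rural Affairs, Hangzhou 311000, Zhejiang; 4. Guangdong Academy of Agricultural Sciences, Guangzhou 510640, Guangdong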

Abstract:

【Objective】Identifying and counting the organisms in net cages is one of the key reference factors for aquaculture management in marine ranches. To address interference from reverberation noise and complex backgrounds, a fish detection dataset was constructed under different lighting conditions and, using forward-looking sonar imaging, a lightweight fish recognition model based on YOLOV5-MobilenetV3 and sonar images (LAPR-Net) was proposed to recognize fish in net cages in turbid or dark water.【Method】Tilapia was taken as the research object. Within the framework of the YOLOV5 model, the backbone network adopts the lightweight MobileNetV3 bneck block, which uses an inverted residual structure with a linear bottleneck and depthwise separable convolution to extract fish features from sonar images, and applies the SE-Net attention mechanism to obtain multi-scale semantic features of the sonar images and strengthen the correlation between features. The neck network adopts a path aggregation network structure to fuse target features at multiple scales and enhance feature fusion. The prediction part uses non-maximum suppression to perform a local maximum search, removing redundant detection boxes and keeping the box with the highest confidence, and finally outputs and displays the fish detection results, including the position, category, and probability of each detected target.【Result】Four other mainstream detection models were selected for comparative experiments: YOLOV3-tiny (Darknet53), YOLOV5 (CSPdarknet53), YOLOV5 (Repvgg), and YOLOV5s (Transformer). The proposed model has 3 545 453 parameters, a computational cost of 6.3 G FLOPs, an mAP of 0.957, and an average inference time of 0.08868 s per image. Compared with the YOLOV5 model, the mAP of the improved model increased by 9.7%.【Conclusion】The proposed model speeds up training and recognition, lowers hardware requirements, and can serve as a reference for fish detection models in marine ranch cage culture.
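The non-maximum suppression step described in the Method section can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the IoU threshold of 0.5 are all assumptions chosen for illustration, and the abstract does not state the threshold actually used.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    iou_threshold=0.5 is an illustrative default, not a value from the paper.
    """
    # Visit candidates in order of decreasing confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that has already been kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes and one separate box.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
example_scores = [0.9, 0.8, 0.7]
kept = nms(example_boxes, example_scores)
```

In this hypothetical example the second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed and only the first and third boxes are kept.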
