Article Abstract
Lightweight Model for Fish Recognition Based on YOLOV5-MobilenetV3 and Sonar Images
  
DOI:10.16768/j.issn.1004-874X.2023.07.004
Authors and Affiliations:
LUO Yizhi1,3, LU Huazhong4, ZHOU Xingxing1, YUAN Yu1,3, QI Haijun1,3, LI Bin1, LIU Zhichang2
1. Institute of Facility Agriculture, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, Guangdong, China; 2. Institute of Animal Science (Institute of Fisheries), Guangdong Academy of Agricultural Sciences, Guangzhou 510645, Guangdong, China; 3. Key Laboratory of Facility Agriculture Equipment and Informatization, Ministry of Agriculture and Rural Affairs, Hangzhou 311000, Zhejiang, China; 4. Guangdong Academy of Agricultural Sciences, Guangzhou 510640, Guangdong, China
Abstract:
      【Objective】Biometric statistics of caged fish are one of the key reference factors for marine ranching management. To cope with reverberation noise interference and complex backgrounds, this paper constructs fish detection datasets under different lighting conditions and, using forward-looking sonar imaging, proposes a lightweight fish recognition model based on YOLOV5-MobilenetV3 and sonar images (LAPR-Net) to recognize fish in underwater cages in turbid or dark scenes.【Method】Taking tilapia as the research object and building on the framework of the YOLOV5 model, the backbone network adopts the lightweight MobileNetV3 bneck block, using the linear-bottleneck inverted residual structure and depthwise separable convolution to extract fish features from sonar images, and applying the SE-Net attention mechanism to obtain multi-scale semantic features of the sonar images and enhance the correlation between features. The neck network adopts the path aggregation network structure to perform multi-scale fusion of target features and strengthen the feature fusion ability. The prediction part uses maximum local search based on non-maximum suppression to remove redundant detection boxes and retain the detection box with the highest confidence, and finally outputs and displays the fish detection result, including the position, category, and detection probability of each detected object.【Result】Four other mainstream detection models were selected for comparative experiments: YOLOV3-tiny (Darknet53), YOLOV5 (CSPdarknet53), YOLOV5 (Repvgg), and YOLOV5s (Transformer). The proposed model has 3 545 453 parameters, 6.3 G FLOPs, and a mAP of 0.957, with an average inference speed of 0.08868 s per image. Compared with the YOLOV5 model, the mAP of the improved model increased by 9.7%.【Conclusion】The proposed network improves the speed of training and recognition, reduces hardware requirements, and provides a reference for detection models of cage-cultured fish in marine ranches.
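To make the backbone description above concrete, the following is a minimal PyTorch-style sketch of a MobileNetV3 "bneck" block that combines the linear-bottleneck inverted residual structure, depthwise separable convolution, and SE-Net channel attention mentioned in the abstract. It is an illustrative sketch, not the authors' released code; the layer widths, expansion ratio, and input size are assumed values, not taken from the paper.

```python
# Sketch of a MobileNetV3-style bneck block with SE attention.
# All sizes below are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """SE-Net attention: re-weight channels using globally pooled statistics."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Hardsigmoid(inplace=True),
        )

    def forward(self, x):
        return x * self.fc(x)

class Bneck(nn.Module):
    """Inverted residual block: 1x1 expand -> depthwise conv -> SE -> 1x1 linear project."""
    def __init__(self, in_ch: int, exp_ch: int, out_ch: int, kernel: int = 3, stride: int = 1):
        super().__init__()
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            # pointwise expansion
            nn.Conv2d(in_ch, exp_ch, 1, bias=False),
            nn.BatchNorm2d(exp_ch),
            nn.Hardswish(inplace=True),
            # depthwise convolution (groups == channels)
            nn.Conv2d(exp_ch, exp_ch, kernel, stride, kernel // 2, groups=exp_ch, bias=False),
            nn.BatchNorm2d(exp_ch),
            nn.Hardswish(inplace=True),
            SqueezeExcite(exp_ch),
            # linear bottleneck projection (no activation)
            nn.Conv2d(exp_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_residual else y

if __name__ == "__main__":
    # A 16-channel feature map as a stand-in for an intermediate sonar-image feature.
    x = torch.randn(1, 16, 160, 160)
    print(Bneck(16, 64, 16)(x).shape)  # torch.Size([1, 16, 160, 160])
```

The depthwise convolution (one filter per channel) plus the two pointwise convolutions is what keeps the parameter count low relative to standard convolution, which is the basis of the lightweight backbone described above.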
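The prediction-side filtering described in the abstract, keeping the highest-confidence box in each overlapping group and discarding redundant detections via non-maximum suppression, can likewise be sketched as below. The confidence and IoU thresholds are common defaults assumed for illustration, not values reported in the paper.

```python
# Sketch of confidence filtering followed by non-maximum suppression.
# Thresholds (0.25, 0.45) are assumed defaults, not taken from the paper.
import torch
from torchvision.ops import nms

def filter_detections(boxes: torch.Tensor, scores: torch.Tensor,
                      conf_thres: float = 0.25, iou_thres: float = 0.45):
    """boxes: (N, 4) as (x1, y1, x2, y2); scores: (N,) detection confidences."""
    keep = scores > conf_thres           # drop low-confidence candidates
    boxes, scores = boxes[keep], scores[keep]
    idx = nms(boxes, scores, iou_thres)  # suppress overlapping duplicates
    return boxes[idx], scores[idx]
```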