Application of neural network technologies for underwater munitions detection

Vadym Slyusar

doi:10.3103/S0735272723030020

Authors

Vadym Slyusar, Central Scientific Research Institute of Armament and Military Equipment of Armed Forces of Ukraine, Kyiv, Ukraine. ORCID: http://orcid.org/0000-0002-2912-3149

Keywords:

YOLO3, YOLO4, YOLO5, Object Detection

Abstract

The article presents substantiated proposals for using neural networks of the YOLO family to detect undetonated underwater munitions. The YOLO3, YOLO4 and YOLO5 networks used were pretrained on the MS COCO dataset. YOLO3 and YOLO4 were then retrained on a modified version of the Trash-ICRA19 underwater trash dataset with 13 object classes, 2 of which are fictitious. With YOLO4, the average detection accuracy over the 13 object classes in the mAP50 metric is 75.2%, or 88.9% when the fictitious classes are taken into account. The networks are tested on images taken from video recordings of reservoir demining carried out with remotely operated underwater vehicles (ROVs). An improved neural network is proposed in the form of a cascade of several serially connected YOLO segments with multi-pass image processing, together with a tensor-matrix description of the attention mechanism. Recommendations are developed for further increasing the efficiency of the neural network method of underwater munition selection.
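The two mAP50 figures in the abstract can be related by simple arithmetic. The sketch below is an illustration, not the author's evaluation code. It assumes that the 2 fictitious classes contribute an average precision of 0, so that the mean over all 13 classes is the mean over the 11 real classes scaled by 11/13. This reading is an inference that happens to match the reported numbers; the abstract does not state it. The function names `iou` and `rescale_map` are hypothetical. The `iou` helper only shows the overlap test behind the "50" in mAP50, where a detection counts as correct when its intersection over union with a ground-truth box is at least 0.5.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def rescale_map(map_real, n_real, n_fictitious):
    """Mean AP over all classes, ASSUMING fictitious classes have AP = 0."""
    return map_real * n_real / (n_real + n_fictitious)


# A detection shifted by half a box width overlaps too little for mAP50.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
print(f"IoU = {overlap:.3f}, counts at threshold 0.5: {overlap >= 0.5}")

# 88.9% over 11 real classes, averaged over all 13 classes.
map_all = rescale_map(88.9, n_real=11, n_fictitious=2)
print(f"mAP50 over 13 classes = {map_all:.1f}%")
```

Under this assumption the second print statement gives 75.2%, the lower of the two figures reported in the abstract.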
