SwinGD: A Robust Grape Bunch Detection Model Based on Swin Transformer in Complex Vineyard Environment

Accurate recognition of fruits in the orchard is an important step for robot picking in the natural environment, since many CNN models have a low recognition rate when dealing with irregularly shaped and very dense fruits, such as a grape bunch. It is a new trend to use a transformer structure and a...

Bibliographic Details
Main Authors:	Jinhai Wang, Zongyin Zhang, Lufeng Luo, Wenbo Zhu, Jianwen Chen, Wei Wang
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	grape bunch detection; Swin Transformer; SwinGD; computer vision; Plant culture; SB1-1110
Online Access:	https://doaj.org/article/637f97f3c8cb44399e586660971d04f0

id	oai:doaj.org-article:637f97f3c8cb44399e586660971d04f0
record_format	dspace
spelling	oai:doaj.org-article:637f97f3c8cb44399e586660971d04f0 | 2021-11-25T17:47:37Z | SwinGD: A Robust Grape Bunch Detection Model Based on Swin Transformer in Complex Vineyard Environment | 10.3390/horticulturae7110492 | 2311-7524 | https://doaj.org/article/637f97f3c8cb44399e586660971d04f0 | 2021-11-01T00:00:00Z | https://www.mdpi.com/2311-7524/7/11/492 | https://doaj.org/toc/2311-7524 | Jinhai Wang; Zongyin Zhang; Lufeng Luo; Wenbo Zhu; Jianwen Chen; Wei Wang | MDPI AG | article | grape bunch detection; Swin Transformer; SwinGD; computer vision; Plant culture; SB1-1110 | EN | Horticulturae, Vol 7, Iss 492, p 492 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	grape bunch detection Swin Transformer SwinGD computer vision Plant culture SB1-1110
description	Accurate recognition of fruits in the orchard is an important step for robotic picking in natural environments, yet many CNN models have a low recognition rate when dealing with irregularly shaped and very dense fruits, such as grape bunches. Applying transformer structures to computer vision for image processing is a recent trend. This paper applies Swin Transformer and DETR models to grape bunch detection and compares them with traditional CNN models such as Faster-RCNN, SSD, and YOLO. The optimal number of stages for the Swin Transformer is selected through experiments. Furthermore, the latest YOLOX model is also compared with the Swin Transformer, and the experimental results show that YOLOX has higher accuracy and better detection effect. The above models are trained on red grape datasets collected under natural light, and the dataset is expanded through image data augmentation to improve training. After 200 epochs of training, SwinGD obtained an mAP of 94% at IoU = 0.5. Under overexposure, overdarkness, and occlusion, SwinGD recognizes grape bunches more accurately and robustly than the other models. SwinGD also still performs better when dealing with dense grape bunches. Furthermore, 100 pictures of grapes containing 655 grape bunches were downloaded from Baidu pictures to test the detection effect. The Swin Transformer has an accuracy of 91.5%. To verify the generality of SwinGD, we conducted a test on green grape images. The experimental results show that SwinGD performs well in practical application. The success of SwinGD provides a new solution for precision harvesting in agriculture.
format	article
author	Jinhai Wang Zongyin Zhang Lufeng Luo Wenbo Zhu Jianwen Chen Wei Wang
title	SwinGD: A Robust Grape Bunch Detection Model Based on Swin Transformer in Complex Vineyard Environment
publisher	MDPI AG
publishDate	2021
url	https://doaj.org/article/637f97f3c8cb44399e586660971d04f0

Similar Items