BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.

In recent years, text sentiment analysis has attracted wide attention, and promoted the rise and development of stance detection research. The purpose of stance detection is to determine the author's stance (favor or against) towards a specific target or proposition in the text. Pre-trained lan...

Bibliographic Details
Main Authors:	Yang Li, Yuqing Sun, Nana Zhu
Format:	article
Language:	EN
Published:	Public Library of Science (PLoS) 2021
Subjects:	Medicine R Science Q
Online Access:	https://doaj.org/article/20905e10c23048df9ce82dda4d80aa51

id	oai:doaj.org-article:20905e10c23048df9ce82dda4d80aa51
record_format	dspace
spelling	oai:doaj.org-article:20905e10c23048df9ce82dda4d80aa51; 2021-12-02T20:08:19Z; BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.; 1932-6203; 10.1371/journal.pone.0257130; https://doaj.org/article/20905e10c23048df9ce82dda4d80aa51; 2021-01-01T00:00:00Z; https://doi.org/10.1371/journal.pone.0257130; https://doaj.org/toc/1932-6203; Yang Li; Yuqing Sun; Nana Zhu; Public Library of Science (PLoS); article; Medicine; R; Science; Q; EN; PLoS ONE, Vol 16, Iss 9, p e0257130 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Medicine R Science Q
spellingShingle	Medicine R Science Q Yang Li Yuqing Sun Nana Zhu BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.
description	In recent years, text sentiment analysis has attracted wide attention and has promoted the rise and development of stance detection research. The purpose of stance detection is to determine the author's stance (favor or against) towards a specific target or proposition in the text. Pre-trained language models like BERT have been shown to perform well on this task. However, in many real-world scenarios they are computationally very expensive, and such heavy models are difficult to deploy with limited resources. To improve efficiency while preserving performance, we propose a knowledge distillation model, BERTtoCNN, which combines the classic distillation loss and a similarity-preserving loss in a joint knowledge distillation framework. On the one hand, BERTtoCNN provides an efficient distillation process to train a novel 'student' CNN structure from a much larger 'teacher' language model, BERT. On the other hand, based on the similarity-preserving loss function, BERTtoCNN guides the training of the student network so that input pairs with similar (dissimilar) activations in the teacher network have similar (dissimilar) activations in the student network. We conduct experiments and test the proposed model on open Chinese and English stance detection datasets. The experimental results show that our model clearly outperforms the competitive baseline methods.
format	article
author	Yang Li Yuqing Sun Nana Zhu
author_facet	Yang Li Yuqing Sun Nana Zhu
author_sort	Yang Li
title	BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.
title_short	BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.
title_full	BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.
title_fullStr	BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.
title_full_unstemmed	BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection.
title_sort	berttocnn: similarity-preserving enhanced knowledge distillation for stance detection.
publisher	Public Library of Science (PLoS)
publishDate	2021
url	https://doaj.org/article/20905e10c23048df9ce82dda4d80aa51
work_keys_str_mv	AT yangli berttocnnsimilaritypreservingenhancedknowledgedistillationforstancedetection AT yuqingsun berttocnnsimilaritypreservingenhancedknowledgedistillationforstancedetection AT nanazhu berttocnnsimilaritypreservingenhancedknowledgedistillationforstancedetection
_version_	1718375167713869824
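
Illustrative note: the description field above names two loss terms (a classic distillation loss and a similarity-preserving loss) but this record does not give their formulas. The following is only a minimal sketch of how such a joint objective is commonly written, assuming a temperature-softened KL divergence for the distillation term and a squared Frobenius distance between row-normalised batch similarity matrices for the similarity-preserving term. The function names, the temperature value, and the weight gamma are hypothetical and are not taken from the article.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) on softened outputs, scaled by T^2 (assumed form)."""
    total = 0.0
    for s_row, t_row in zip(student_logits, teacher_logits):
        p = softmax(t_row, temperature)  # softened teacher distribution
        q = softmax(s_row, temperature)  # softened student distribution
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * total / len(student_logits)


def similarity_matrix(activations):
    """b x b matrix of pairwise dot products between batch items, L2-normalised per row."""
    gram = [[sum(a * b for a, b in zip(x, y)) for y in activations] for x in activations]
    normed = []
    for row in gram:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0  # avoid division by zero
        normed.append([v / norm for v in row])
    return normed


def similarity_preserving_loss(student_acts, teacher_acts):
    """Squared Frobenius distance between the two similarity matrices, divided by b^2."""
    g_student = similarity_matrix(student_acts)
    g_teacher = similarity_matrix(teacher_acts)
    b = len(g_student)
    diff = sum(
        (g_student[i][j] - g_teacher[i][j]) ** 2 for i in range(b) for j in range(b)
    )
    return diff / (b * b)


def joint_loss(student_logits, teacher_logits, student_acts, teacher_acts,
               temperature=2.0, gamma=1.0):
    """Hypothetical joint objective: distillation term plus gamma times similarity term."""
    return (distillation_loss(student_logits, teacher_logits, temperature)
            + gamma * similarity_preserving_loss(student_acts, teacher_acts))


if __name__ == "__main__":
    # Toy batch of two inputs; teacher activations are 3-dimensional, student 2-dimensional.
    teacher_acts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    student_acts = [[0.9, 0.4], [0.1, 1.0]]
    teacher_logits = [[2.0, -1.0], [-0.5, 1.5]]
    student_logits = [[1.0, 0.0], [0.0, 1.0]]
    print(joint_loss(student_logits, teacher_logits, student_acts, teacher_acts))
```

Because both similarity matrices have shape batch by batch, the teacher and student activations may have different widths, which is what lets a small CNN be compared against a much larger model in this kind of setup.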

Similar Items