Looking for Razors and Needles in a Haystack: Multifaceted Analysis of Suicidal Declarations on Social Media—A Pragmalinguistic Approach

In this paper, we study the language used by suicidal users on the Reddit social media platform. To do so, we first collect a large-scale dataset of Reddit posts and have it annotated by highly trained expert annotators under a rigorous annotation scheme. Next, we perform a multifaceted analysis of the...

Bibliographic Details
Main authors: Michal Ptaszynski, Monika Zasko-Zielinska, Michal Marcinczuk, Gniewosz Leliwa, Marcin Fortuna, Kamil Soliwoda, Ida Dziublewska, Olimpia Hubert, Pawel Skrzek, Jan Piesiewicz, Paula Karbowska, Maria Dowgiallo, Juuso Eronen, Patrycja Tempska, Maciej Brochocki, Marek Godny, Michal Wroczynski
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
R
Online access: https://doaj.org/article/2d3a0300c6374946bd729c3366a673a3
id oai:doaj.org-article:2d3a0300c6374946bd729c3366a673a3
record_format dspace
spelling oai:doaj.org-article:2d3a0300c6374946bd729c3366a673a3
2021-11-25T17:48:02Z
10.3390/ijerph182211759
1660-4601
1661-7827
https://doaj.org/article/2d3a0300c6374946bd729c3366a673a3
2021-11-01T00:00:00Z
https://www.mdpi.com/1660-4601/18/22/11759
https://doaj.org/toc/1661-7827
https://doaj.org/toc/1660-4601
MDPI AG
article
EN
International Journal of Environmental Research and Public Health, Vol 18, Iss 11759, p 11759 (2021)
institution DOAJ
collection DOAJ
language EN
topic suicidal declarations
LIWC
social media
Medicine
R
description In this paper, we study the language used by suicidal users on the Reddit social media platform. To do so, we first collect a large-scale dataset of Reddit posts and have it annotated by highly trained expert annotators under a rigorous annotation scheme. Next, we perform a multifaceted analysis of the dataset, including (1) an analysis of user activity before and after posting a suicidal message, and (2) a pragmalinguistic study of the vocabulary used by suicidal users. In the second part of the analysis, we apply LIWC, a dictionary-based toolset widely used in psychological and linguistic research, which annotates text with a wide range of linguistic categories. However, since raw LIWC scores are not sufficiently reliable or informative, we propose a procedure that reduces the risk of unreliable LIWC scores leading to misleading conclusions by analyzing each category not separately but in pairs with other categories. The results supported the validity of the proposed approach by revealing valuable information about the vocabulary used by suicidal users and helped to pinpoint false predictors. For example, we were able to specify that death-related words, typically associated with suicidal posts in the majority of the literature, become false predictors when they co-occur with apostrophes, even in high-risk subreddits. On the other hand, the category-pair-based disambiguation helped to specify that death becomes a predictor only when co-occurring with future-focused language, informal language, discrepancy, or 1st person pronouns. We additionally analyzed the limitations of the approach and found that, although LIWC is a useful and easily applicable tool, its lack of any contextual processing makes it unsuitable for application in psychological and linguistic studies. We conclude that the disadvantages of LIWC can be easily overcome by creating a number of high-performance AI-based classifiers trained to annotate categories similar to those of LIWC, which we plan to pursue in future work.
format article
author Michal Ptaszynski
Monika Zasko-Zielinska
Michal Marcinczuk
Gniewosz Leliwa
Marcin Fortuna
Kamil Soliwoda
Ida Dziublewska
Olimpia Hubert
Pawel Skrzek
Jan Piesiewicz
Paula Karbowska
Maria Dowgiallo
Juuso Eronen
Patrycja Tempska
Maciej Brochocki
Marek Godny
Michal Wroczynski
title Looking for Razors and Needles in a Haystack: Multifaceted Analysis of Suicidal Declarations on Social Media—A Pragmalinguistic Approach
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/2d3a0300c6374946bd729c3366a673a3